All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem
@ 2014-09-24  8:27 Lukasz Majewski
  2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
                   ` (4 more replies)
  0 siblings, 5 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-09-24  8:27 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin
  Cc: Linux PM list, Lukasz Majewski, Bartlomiej Zolnierkiewicz,
	linux-kernel, Lukasz Majewski

This little patch series comprises fixes for step_wise governor and
thermal core.

Lukasz Majewski (3):
  thermal: step_wise: fix: Prevent from binary overflow when trend is
    dropping
  thermal: core: fix: Initialize the max_state variable to 0
  thermal: core: fix: Check return code of the ->get_max_state()
    callback

 drivers/thermal/step_wise.c    | 2 +-
 drivers/thermal/thermal_core.c | 8 +++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

-- 
2.0.0.rc2


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
@ 2014-09-24  8:27 ` Lukasz Majewski
  2014-10-02 21:13   ` Eduardo Valentin
  2014-10-09  3:14   ` Zhang Rui
  2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-09-24  8:27 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin
  Cc: Linux PM list, Lukasz Majewski, Bartlomiej Zolnierkiewicz,
	linux-kernel, Lukasz Majewski

It turns out that some boards can have instance->lower greater than 0 and
when thermal trend is dropping it results with next_target equal to -1.

Since the next_target is defined as unsigned long it is interpreted as
0xFFFFFFFF and larger than instance->upper.
As a result the next_target is set to instance->upper which ramps up to
maximal cooling device target when the temperature is steadily decreasing.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/step_wise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
index 3b54c2c..fdd1f52 100644
--- a/drivers/thermal/step_wise.c
+++ b/drivers/thermal/step_wise.c
@@ -77,7 +77,7 @@ static unsigned long get_target_state(struct thermal_instance *instance,
 			next_target = instance->upper;
 		break;
 	case THERMAL_TREND_DROPPING:
-		if (cur_state == instance->lower) {
+		if (cur_state <= instance->lower) {
 			if (!throttle)
 				next_target = THERMAL_NO_TARGET;
 		} else {
-- 
2.0.0.rc2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0
  2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
  2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
@ 2014-09-24  8:27 ` Lukasz Majewski
  2014-10-02 22:26   ` Eduardo Valentin
  2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-09-24  8:27 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin
  Cc: Linux PM list, Lukasz Majewski, Bartlomiej Zolnierkiewicz,
	linux-kernel, Lukasz Majewski

Pointer to the uninitialized max_state variable is passed to get the
maximal cooling state.
For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state() is
called, which even when error occurs will return the max_state variable
unchanged.
Since error for ->get_max_state() is not checked, the automatically
allocated value of max_state is used for (upper > max_state) comparison.
For any possible max_state value it is very unlikely that it will be less
than upper.
As a consequence, the cooling device is bind even without the backed
cpufreq table initialized.

This initialization will prevent from accidental binding trip points to
cpu freq cooling frequencies when cpufreq driver itself is not yet fully
initialized.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/thermal_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 454884a..747618a 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_instance *pos;
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
-	unsigned long max_state;
+	unsigned long max_state = 0;
 	int result;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
-- 
2.0.0.rc2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback
  2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
  2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
  2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
@ 2014-09-24  8:27 ` Lukasz Majewski
  2014-10-02 22:24   ` Eduardo Valentin
  2014-10-02 14:40 ` [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
       [not found] ` <1411547232-21493-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  4 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-09-24  8:27 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin
  Cc: Linux PM list, Lukasz Majewski, Bartlomiej Zolnierkiewicz,
	linux-kernel, Lukasz Majewski

Without this check it was possible to execute cooling device binding code
even when wrong max cooling state was read.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/thermal_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 747618a..8a4dc35 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
 	unsigned long max_state = 0;
-	int result;
+	int result, ret;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
 		return -EINVAL;
@@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	if (tz != pos1 || cdev != pos2)
 		return -EINVAL;
 
-	cdev->ops->get_max_state(cdev, &max_state);
+	ret = cdev->ops->get_max_state(cdev, &max_state);
+	if (ret)
+		return ret;
 
 	/* lower default 0, upper default max_state */
 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
-- 
2.0.0.rc2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem
  2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
                   ` (2 preceding siblings ...)
  2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
@ 2014-10-02 14:40 ` Lukasz Majewski
       [not found] ` <1411547232-21493-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  4 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-10-02 14:40 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin
  Cc: Lukasz Majewski, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Dear Eduardo, Zhang,

> This little patch series comprises fixes for step_wise governor and
> thermal core.
> 
> Lukasz Majewski (3):
>   thermal: step_wise: fix: Prevent from binary overflow when trend is
>     dropping
>   thermal: core: fix: Initialize the max_state variable to 0
>   thermal: core: fix: Check return code of the ->get_max_state()
>     callback
> 

ping.

Please note, that those are rather self explanatory fixes.

>  drivers/thermal/step_wise.c    | 2 +-
>  drivers/thermal/thermal_core.c | 8 +++++---
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
@ 2014-10-02 21:13   ` Eduardo Valentin
  2014-10-03  7:26     ` Lukasz Majewski
  2014-10-09  3:14   ` Zhang Rui
  1 sibling, 1 reply; 62+ messages in thread
From: Eduardo Valentin @ 2014-10-02 21:13 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Hello Lukasz,

On Wed, Sep 24, 2014 at 10:27:10AM +0200, Lukasz Majewski wrote:
> It turns out that some boards can have instance->lower greater than 0 and
> when thermal trend is dropping it results with next_target equal to -1.
> 
> Since the next_target is defined as unsigned long it is interpreted as
> 0xFFFFFFFF and larger than instance->upper.
> As a result the next_target is set to instance->upper which ramps up to
> maximal cooling device target when the temperature is steadily decreasing.
> 

Thanks for finding a problem and sending a fix.

Can you please explain a little more on how next_target reaches -1 when
lower is greater than 0?


> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
>  drivers/thermal/step_wise.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
> index 3b54c2c..fdd1f52 100644
> --- a/drivers/thermal/step_wise.c
> +++ b/drivers/thermal/step_wise.c
> @@ -77,7 +77,7 @@ static unsigned long get_target_state(struct thermal_instance *instance,
>  			next_target = instance->upper;
>  		break;
>  	case THERMAL_TREND_DROPPING:
> -		if (cur_state == instance->lower) {
> +		if (cur_state <= instance->lower) {
>  			if (!throttle)
>  				next_target = THERMAL_NO_TARGET;
>  		} else {
> -- 
> 2.0.0.rc2
> 


Eduardo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback
  2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
@ 2014-10-02 22:24   ` Eduardo Valentin
  2014-10-03 10:40     ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Eduardo Valentin @ 2014-10-02 22:24 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

On Wed, Sep 24, 2014 at 10:27:12AM +0200, Lukasz Majewski wrote:
> Without this check it was possible to execute cooling device binding code
> even when wrong max cooling state was read.
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>

Acked-by: Eduardo Valentin <edubezval@gmail.com>

Rui, are you taking these patches? Do you prefer me to collect them?

Cheers

Eduardo

> ---
>  drivers/thermal/thermal_core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 747618a..8a4dc35 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	struct thermal_zone_device *pos1;
>  	struct thermal_cooling_device *pos2;
>  	unsigned long max_state = 0;
> -	int result;
> +	int result, ret;
>  
>  	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
>  		return -EINVAL;
> @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	if (tz != pos1 || cdev != pos2)
>  		return -EINVAL;
>  
> -	cdev->ops->get_max_state(cdev, &max_state);
> +	ret = cdev->ops->get_max_state(cdev, &max_state);
> +	if (ret)
> +		return ret;
>  
>  	/* lower default 0, upper default max_state */
>  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> -- 
> 2.0.0.rc2
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0
  2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
@ 2014-10-02 22:26   ` Eduardo Valentin
  2014-10-03  8:26     ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Eduardo Valentin @ 2014-10-02 22:26 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Hello Lukasz,

On Wed, Sep 24, 2014 at 10:27:11AM +0200, Lukasz Majewski wrote:
> Pointer to the uninitialized max_state variable is passed to get the
> maximal cooling state.
> For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state() is
> called, which even when error occurs will return the max_state variable
> unchanged.
> Since error for ->get_max_state() is not checked, the automatically


Good that you added a fix in your series for this.


> allocated value of max_state is used for (upper > max_state) comparison.
> For any possible max_state value it is very unlikely that it will be less
> than upper.
> As a consequence, the cooling device is bind even without the backed
> cpufreq table initialized.
> 
> This initialization will prevent from accidental binding trip points to
> cpu freq cooling frequencies when cpufreq driver itself is not yet fully
> initialized.

Although I agree with the fix, as long as we also include a check for
the .get_max_state return value,  I believe the problem you are
describing is about initialization sequence.

In general, I believe we need a better
sequencing between thermal and cpufreq subsystems. One way out is to
include a check for cpufreq driver in the thermal driver, and return
-EPROBE_DEFER  when cpufreq is not ready.

> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
>  drivers/thermal/thermal_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 454884a..747618a 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	struct thermal_instance *pos;
>  	struct thermal_zone_device *pos1;
>  	struct thermal_cooling_device *pos2;
> -	unsigned long max_state;
> +	unsigned long max_state = 0;
>  	int result;
>  
>  	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
> -- 
> 2.0.0.rc2
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  2014-10-02 21:13   ` Eduardo Valentin
@ 2014-10-03  7:26     ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-10-03  7:26 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Hi Eduardo,

> Hello Lukasz,
> 
> On Wed, Sep 24, 2014 at 10:27:10AM +0200, Lukasz Majewski wrote:
> > It turns out that some boards can have instance->lower greater than
> > 0 and when thermal trend is dropping it results with next_target
> > equal to -1.
> > 
> > Since the next_target is defined as unsigned long it is interpreted
> > as 0xFFFFFFFF and larger than instance->upper.
> > As a result the next_target is set to instance->upper which ramps
> > up to maximal cooling device target when the temperature is
> > steadily decreasing.
> > 
> 
> Thanks for finding a problem and sending a fix.
> 
> Can you please explain a little more on how next_target reaches -1
> when lower is greater than 0?

During testing I've created a fan based cooling device with
"instance->lower" = 1 and "instance->upper" = 3.

On the system the ordinary cpu_cooling device is also present.

With step_wise.c it happens that at "THERMAL_TREND_DROPPING" the
cur_state = 0, so the first if condition is false (line 80) and
cur_state, which is defined as unsigned long, is decremented. This
means that next_target equals to 0xFFFFFFFF because it also is defined
as unsigned long.
Such value is apparently bigger than "instance->upper" (line 85) and
therefore the maximal cooling state is chosen. As a result, even when
the thermal trend is dropping, the cooling device increase the cooling
state to maximum.

This behavior has been discovered during my work on adapting exynos
thermal driver to reuse the of-thermal. I will send patches for this
work after the weekend.

> 
> 
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > ---
> >  drivers/thermal/step_wise.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/step_wise.c
> > b/drivers/thermal/step_wise.c index 3b54c2c..fdd1f52 100644
> > --- a/drivers/thermal/step_wise.c
> > +++ b/drivers/thermal/step_wise.c
> > @@ -77,7 +77,7 @@ static unsigned long get_target_state(struct
> > thermal_instance *instance, next_target = instance->upper;
> >  		break;
> >  	case THERMAL_TREND_DROPPING:
> > -		if (cur_state == instance->lower) {
> > +		if (cur_state <= instance->lower) {
> >  			if (!throttle)
> >  				next_target = THERMAL_NO_TARGET;
> >  		} else {
> > -- 
> > 2.0.0.rc2
> > 
> 
> 
> Eduardo



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0
  2014-10-02 22:26   ` Eduardo Valentin
@ 2014-10-03  8:26     ` Lukasz Majewski
  2014-10-06 18:01       ` Eduardo Valentin
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-10-03  8:26 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Hi Eduardo,

> Hello Lukasz,
> 
> On Wed, Sep 24, 2014 at 10:27:11AM +0200, Lukasz Majewski wrote:
> > Pointer to the uninitialized max_state variable is passed to get the
> > maximal cooling state.
> > For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state()
> > is called, which even when error occurs will return the max_state
> > variable unchanged.
> > Since error for ->get_max_state() is not checked, the automatically
> 
> 
> Good that you added a fix in your series for this.
> 
> 
> > allocated value of max_state is used for (upper > max_state)
> > comparison. For any possible max_state value it is very unlikely
> > that it will be less than upper.
> > As a consequence, the cooling device is bind even without the backed
> > cpufreq table initialized.
> > 
> > This initialization will prevent from accidental binding trip
> > points to cpu freq cooling frequencies when cpufreq driver itself
> > is not yet fully initialized.
> 
> Although I agree with the fix, as long as we also include a check for
> the .get_max_state return value,  I believe the problem you are
> describing is about initialization sequence.

As you pointed out - the problem here is with initialization sequence.
Thermal and cpufreq cores are initialized very early.

However, the get_max_state() for cpu_cooling.c device (as it is the
pervasive way of cooling things) accesses cpufreq policy to get the
freq_table and count available states.

The issue here is with late initialization of cpufreq policy.

Up till now the cpu_cooling device was bind even when the
get_max_state() returned -EINVAL and everything worked after late
cpufreq policy initialization. However, during this time window the
thermal driver is not configured.

> 
> In general, I believe we need a better
> sequencing between thermal and cpufreq subsystems. One way out is to
> include a check for cpufreq driver in the thermal driver, and return
> -EPROBE_DEFER  when cpufreq is not ready.

I think that we could return -EPROBE_DEFER when cpufreq's policy is not
yet available and subscribe to cpufreq notifier to call
bind_cooling_device then.

Let's wait for Zhang opinion since he looks after the thermal core code.

> 
> > 
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > ---
> >  drivers/thermal/thermal_core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/thermal_core.c
> > b/drivers/thermal/thermal_core.c index 454884a..747618a 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, struct thermal_instance *pos;
> >  	struct thermal_zone_device *pos1;
> >  	struct thermal_cooling_device *pos2;
> > -	unsigned long max_state;
> > +	unsigned long max_state = 0;
> >  	int result;
> >  
> >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > THERMAL_TRIPS_NONE)) -- 
> > 2.0.0.rc2
> > 
> 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback
  2014-10-02 22:24   ` Eduardo Valentin
@ 2014-10-03 10:40     ` Lukasz Majewski
  2014-10-09  3:15       ` Zhang Rui
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-10-03 10:40 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Linux PM list, Lukasz Majewski, Bartlomiej Zolnierkiewicz, linux-kernel

Hi Eduardo, Rui,

> On Wed, Sep 24, 2014 at 10:27:12AM +0200, Lukasz Majewski wrote:
> > Without this check it was possible to execute cooling device
> > binding code even when wrong max cooling state was read.
> > 
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> 
> Acked-by: Eduardo Valentin <edubezval@gmail.com>
> 
> Rui, are you taking these patches? Do you prefer me to collect them?

I'd like to ask you NOT to apply those patches.

In short: It will cause regression on all non Exynos boards.

Explanation:

Up till now the cpu_cooling device was bind even when the
get_max_state() returned -EINVAL and everything worked after late
cpufreq policy initialization.

However, during this time window the thermal driver is not properly
configured.

Applying PATCH2/3 and PATCH3/3 would cause bind error without any
further occasion for re-bind. As a result THERAL will not be present
on the target system.

It works on the Exynos boards, since at the report_trigger() function, 
when first trip point is reached, it is checked if cpu_cooling device
is bind. If it is not (due to "fixes" at PATCH2/3 and PATCH3/3) it
is rebind then.

Due to above, I think that it would be best to leave things as they are
now and prepare proper fix as suggested by Eduardo.

> 
> Cheers
> 
> Eduardo
> 
> > ---
> >  drivers/thermal/thermal_core.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/thermal/thermal_core.c
> > b/drivers/thermal/thermal_core.c index 747618a..8a4dc35 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> >  	struct thermal_cooling_device *pos2;
> >  	unsigned long max_state = 0;
> > -	int result;
> > +	int result, ret;
> >  
> >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > THERMAL_TRIPS_NONE)) return -EINVAL;
> > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> >  		return -EINVAL;
> >  
> > -	cdev->ops->get_max_state(cdev, &max_state);
> > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > +	if (ret)
> > +		return ret;
> >  
> >  	/* lower default 0, upper default max_state */
> >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > -- 
> > 2.0.0.rc2
> > 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0
  2014-10-03  8:26     ` Lukasz Majewski
@ 2014-10-06 18:01       ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-10-06 18:01 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

On Fri, Oct 03, 2014 at 10:26:28AM +0200, Lukasz Majewski wrote:
> Hi Eduardo,
> 
> > Hello Lukasz,
> > 
> > On Wed, Sep 24, 2014 at 10:27:11AM +0200, Lukasz Majewski wrote:
> > > Pointer to the uninitialized max_state variable is passed to get the
> > > maximal cooling state.
> > > For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state()
> > > is called, which even when error occurs will return the max_state
> > > variable unchanged.
> > > Since error for ->get_max_state() is not checked, the automatically
> > 
> > 
> > Good that you added a fix in your series for this.
> > 
> > 
> > > allocated value of max_state is used for (upper > max_state)
> > > comparison. For any possible max_state value it is very unlikely
> > > that it will be less than upper.
> > > As a consequence, the cooling device is bind even without the backed
> > > cpufreq table initialized.
> > > 
> > > This initialization will prevent from accidental binding trip
> > > points to cpu freq cooling frequencies when cpufreq driver itself
> > > is not yet fully initialized.
> > 
> > Although I agree with the fix, as long as we also include a check for
> > the .get_max_state return value,  I believe the problem you are
> > describing is about initialization sequence.
> 
> As you pointed out - the problem here is with initialization sequence.
> Thermal and cpufreq cores are initialized very early.
> 
> However, the get_max_state() for cpu_cooling.c device (as it is the
> pervasive way of cooling things) accesses cpufreq policy to get the
> freq_table and count available states.
> 
> The issue here is with late initialization of cpufreq policy.
> 
> Up till now the cpu_cooling device was bind even when the
> get_max_state() returned -EINVAL and everything worked after late
> cpufreq policy initialization. However, during this time window the
> thermal driver is not configured.
> 
> > 
> > In general, I believe we need a better
> > sequencing between thermal and cpufreq subsystems. One way out is to
> > include a check for cpufreq driver in the thermal driver, and return
> > -EPROBE_DEFER  when cpufreq is not ready.
> 
> I think that we could return -EPROBE_DEFER when cpufreq's policy is not
> yet available and subscribe to cpufreq notifier to call
> bind_cooling_device then.

I agree with this approach, as long as we keep the existing users of
cpufreq cooling in a working state.

Cheers

> 
> Let's wait for Zhang opinion since he looks after the thermal core code.
> 
> > 
> > > 
> > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > > ---
> > >  drivers/thermal/thermal_core.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/thermal/thermal_core.c
> > > b/drivers/thermal/thermal_core.c index 454884a..747618a 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, struct thermal_instance *pos;
> > >  	struct thermal_zone_device *pos1;
> > >  	struct thermal_cooling_device *pos2;
> > > -	unsigned long max_state;
> > > +	unsigned long max_state = 0;
> > >  	int result;
> > >  
> > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > THERMAL_TRIPS_NONE)) -- 
> > > 2.0.0.rc2
> > > 
> > 
> 
> 
> 
> -- 
> Best regards,
> 
> Lukasz Majewski
> 
> Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
  2014-10-02 21:13   ` Eduardo Valentin
@ 2014-10-09  3:14   ` Zhang Rui
  1 sibling, 0 replies; 62+ messages in thread
From: Zhang Rui @ 2014-10-09  3:14 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

On Wed, 2014-09-24 at 10:27 +0200, Lukasz Majewski wrote:
> It turns out that some boards can have instance->lower greater than 0 and
> when thermal trend is dropping it results with next_target equal to -1.
> 
> Since the next_target is defined as unsigned long it is interpreted as
> 0xFFFFFFFF and larger than instance->upper.
> As a result the next_target is set to instance->upper which ramps up to
> maximal cooling device target when the temperature is steadily decreasing.
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>

applied.

thanks,
rui
> ---
>  drivers/thermal/step_wise.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
> index 3b54c2c..fdd1f52 100644
> --- a/drivers/thermal/step_wise.c
> +++ b/drivers/thermal/step_wise.c
> @@ -77,7 +77,7 @@ static unsigned long get_target_state(struct thermal_instance *instance,
>  			next_target = instance->upper;
>  		break;
>  	case THERMAL_TREND_DROPPING:
> -		if (cur_state == instance->lower) {
> +		if (cur_state <= instance->lower) {
>  			if (!throttle)
>  				next_target = THERMAL_NO_TARGET;
>  		} else {



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback
  2014-10-03 10:40     ` Lukasz Majewski
@ 2014-10-09  3:15       ` Zhang Rui
  2014-10-09 16:47         ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Zhang Rui @ 2014-10-09  3:15 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

On Fri, 2014-10-03 at 12:40 +0200, Lukasz Majewski wrote:
> Hi Eduardo, Rui,
> 
> > On Wed, Sep 24, 2014 at 10:27:12AM +0200, Lukasz Majewski wrote:
> > > Without this check it was possible to execute cooling device
> > > binding code even when wrong max cooling state was read.
> > > 
> > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > 
> > Acked-by: Eduardo Valentin <edubezval@gmail.com>
> > 
> > Rui, are you taking these patches? Do you prefer me to collect them?
> 
> I'd like to ask you NOT to apply those patches.
> 
> In short: It will cause regression on all non Exynos boards.
> 
> Explanation:
> 
> Up till now the cpu_cooling device was bind even when the
> get_max_state() returned -EINVAL and everything worked after late
> cpufreq policy initialization.
> 
> However, during this time window the thermal driver is not properly
> configured.
> 
> Applying PATCH2/3 and PATCH3/3 would cause bind error without any
> further occasion for re-bind. As a result THERAL will not be present
> on the target system.
> 
> It works on the Exynos boards, since at the report_trigger() function, 
> when first trip point is reached, it is checked if cpu_cooling device
> is bind. If it is not (due to "fixes" at PATCH2/3 and PATCH3/3) it
> is rebind then.
> 
> Due to above, I think that it would be best to leave things as they are
> now and prepare proper fix as suggested by Eduardo.
> 
Agreed. Can anyone please propose such a patch?

thanks,
rui
> > 
> > Cheers
> > 
> > Eduardo
> > 
> > > ---
> > >  drivers/thermal/thermal_core.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/thermal/thermal_core.c
> > > b/drivers/thermal/thermal_core.c index 747618a..8a4dc35 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > >  	struct thermal_cooling_device *pos2;
> > >  	unsigned long max_state = 0;
> > > -	int result;
> > > +	int result, ret;
> > >  
> > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > >  		return -EINVAL;
> > >  
> > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	/* lower default 0, upper default max_state */
> > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > -- 
> > > 2.0.0.rc2
> > > 
> 
> 
> 



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback
  2014-10-09  3:15       ` Zhang Rui
@ 2014-10-09 16:47         ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-10-09 16:47 UTC (permalink / raw)
  To: Zhang Rui
  Cc: Eduardo Valentin, Linux PM list, Lukasz Majewski,
	Bartlomiej Zolnierkiewicz, linux-kernel

Hi Zhang,

> On Fri, 2014-10-03 at 12:40 +0200, Lukasz Majewski wrote:
> > Hi Eduardo, Rui,
> > 
> > > On Wed, Sep 24, 2014 at 10:27:12AM +0200, Lukasz Majewski wrote:
> > > > Without this check it was possible to execute cooling device
> > > > binding code even when wrong max cooling state was read.
> > > > 
> > > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > > 
> > > Acked-by: Eduardo Valentin <edubezval@gmail.com>
> > > 
> > > Rui, are you taking these patches? Do you prefer me to collect
> > > them?
> > 
> > I'd like to ask you NOT to apply those patches.
> > 
> > In short: It will cause regression on all non Exynos boards.
> > 
> > Explanation:
> > 
> > Up till now the cpu_cooling device was bind even when the
> > get_max_state() returned -EINVAL and everything worked after late
> > cpufreq policy initialization.
> > 
> > However, during this time window the thermal driver is not properly
> > configured.
> > 
> > Applying PATCH2/3 and PATCH3/3 would cause bind error without any
> > further occasion for re-bind. As a result THERAL will not be present
> > on the target system.
> > 
> > It works on the Exynos boards, since at the report_trigger()
> > function, when first trip point is reached, it is checked if
> > cpu_cooling device is bind. If it is not (due to "fixes" at
> > PATCH2/3 and PATCH3/3) it is rebind then.
> > 
> > Due to above, I think that it would be best to leave things as they
> > are now and prepare proper fix as suggested by Eduardo.
> > 
> Agreed. Can anyone please propose such a patch?

I can propose myself as a volunteer for this. Unfortunately, I won't be
able to provide any code earlier than in two weeks time. 

> 
> thanks,
> rui
> > > 
> > > Cheers
> > > 
> > > Eduardo
> > > 
> > > > ---
> > > >  drivers/thermal/thermal_core.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c
> > > > b/drivers/thermal/thermal_core.c index 747618a..8a4dc35 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > > >  	struct thermal_cooling_device *pos2;
> > > >  	unsigned long max_state = 0;
> > > > -	int result;
> > > > +	int result, ret;
> > > >  
> > > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > > >  		return -EINVAL;
> > > >  
> > > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > > +	if (ret)
> > > > +		return ret;
> > > >  
> > > >  	/* lower default 0, upper default max_state */
> > > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > > -- 
> > > > 2.0.0.rc2
> > > > 
> > 
> > 
> > 
> 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
  2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
@ 2014-11-13 17:02     ` Lukasz Majewski
  2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Lukasz Majewski

Presented fixes are a response for problem described below:
http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0

In short - it turned out that two trivial fixes (included in this patch set)
require support for deferred probe in thermal drivers.

This situation shows up when CPU frequency reduction is used as a thermal cooling
device for a thermal zone.
It happens that during initialization, the call to thermal probe will be executed
before cpufreq probe (it can be observed at ./drivers/Makefile).
In such a situation thermal will not be properly configured until cpufreq policy
is setup.

In the current code (without included fixes) there is a time window in which thermal
can try to use not configured cpufreq and possibly crash the system.


Proposed solution was based on the code already available in the imx_thermal.c file.

/db8500_thermal.c:                      -> NOT NEEDED
/intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
/intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
/ti-soc-thermal/ti-bandgap.c:           -> FIXED  [omap2plus_defconfig]
/dove_thermal.c:                        -> NOT NEEDED - CPU_COOLING NOT AVAILABLE
                                                        [dove_defconfig]
/spear_thermal.c:                       -> FIXED  [spear3xx_defconfig]
/samsung/exynos_tmu.c:                  -> NOT NEEDED (nasty hack - will be reworked in later patches)
/imx_thermal.c:                         -> OK (deferred probe already in place)
/int340x_thermal/int3402_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
/int340x_thermal/int3400_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
/tegra_soctherm.c:                      -> FIXED  [tegra_defconfig]
/kirkwood_thermal.c:                    -> FIXED  [multi_v5_defconfig]
/armada_thermal.c:                      -> FIXED  [multi_v7_defconfig]
/rcar_thermal.c:                        -> FIXED  [shmobile_defconfig]
/db8500_cpufreq_cooling.c:              -> OK (deferred probe already in place) [multi_v7_defconfig]
/st/st_thermal_syscfg.c:                -> NOT NEEDED (Those two are enabled by e.g. ARMADA)
/st/st_thermal_memmap.c:


I only possess Exynos boards and Beagle Bone Black, so I'd be grateful for
testing proposed solution on other boards. The posted code is compile tested.

This code applies on Eduardo's ti-soc-thermal-next tree:
SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3

Lukasz Majewski (8):
  thermal:cpu cooling:armada: Provide deferred probing for armada driver
  thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood
    driver
  thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
  thermal:cpu cooling:spear: Provide deferred probing for spear driver
  thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  thermal:cpu cooling:ti: Provide deferred probing for ti drivers
  thermal:core:fix: Initialize the max_state variable to 0
  thermal:core:fix: Check return code of the ->get_max_state() callback

 drivers/thermal/armada_thermal.c            | 7 +++++++
 drivers/thermal/kirkwood_thermal.c          | 7 +++++++
 drivers/thermal/rcar_thermal.c              | 7 +++++++
 drivers/thermal/spear_thermal.c             | 7 +++++++
 drivers/thermal/tegra_soctherm.c            | 7 +++++++
 drivers/thermal/thermal_core.c              | 8 +++++---
 drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
 7 files changed, 47 insertions(+), 3 deletions(-)

-- 
2.0.0.rc2

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
@ 2014-11-13 17:02     ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

Presented fixes are a response for problem described below:
http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0

In short - it turned out that two trivial fixes (included in this patch set)
require support for deferred probe in thermal drivers.

This situation shows up when CPU frequency reduction is used as a thermal cooling
device for a thermal zone.
It happens that during initialization, the call to thermal probe will be executed
before cpufreq probe (it can be observed at ./drivers/Makefile).
In such a situation thermal will not be properly configured until cpufreq policy
is setup.

In the current code (without included fixes) there is a time window in which thermal
can try to use not configured cpufreq and possibly crash the system.


Proposed solution was based on the code already available in the imx_thermal.c file.

/db8500_thermal.c:                      -> NOT NEEDED
/intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
/intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
/ti-soc-thermal/ti-bandgap.c:           -> FIXED  [omap2plus_defconfig]
/dove_thermal.c:                        -> NOT NEEDED - CPU_COOLING NOT AVAILABLE
                                                        [dove_defconfig]
/spear_thermal.c:                       -> FIXED  [spear3xx_defconfig]
/samsung/exynos_tmu.c:                  -> NOT NEEDED (nasty hack - will be reworked in later patches)
/imx_thermal.c:                         -> OK (deferred probe already in place)
/int340x_thermal/int3402_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
/int340x_thermal/int3400_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
/tegra_soctherm.c:                      -> FIXED  [tegra_defconfig]
/kirkwood_thermal.c:                    -> FIXED  [multi_v5_defconfig]
/armada_thermal.c:                      -> FIXED  [multi_v7_defconfig]
/rcar_thermal.c:                        -> FIXED  [shmobile_defconfig]
/db8500_cpufreq_cooling.c:              -> OK (deferred probe already in place) [multi_v7_defconfig]
/st/st_thermal_syscfg.c:                -> NOT NEEDED (Those two are enabled by e.g. ARMADA)
/st/st_thermal_memmap.c:


I only possess Exynos boards and Beagle Bone Black, so I'd be grateful for
testing proposed solution on other boards. The posted code is compile tested.

This code applies on Eduardo's ti-soc-thermal-next tree:
SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3

Lukasz Majewski (8):
  thermal:cpu cooling:armada: Provide deferred probing for armada driver
  thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood
    driver
  thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
  thermal:cpu cooling:spear: Provide deferred probing for spear driver
  thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  thermal:cpu cooling:ti: Provide deferred probing for ti drivers
  thermal:core:fix: Initialize the max_state variable to 0
  thermal:core:fix: Check return code of the ->get_max_state() callback

 drivers/thermal/armada_thermal.c            | 7 +++++++
 drivers/thermal/kirkwood_thermal.c          | 7 +++++++
 drivers/thermal/rcar_thermal.c              | 7 +++++++
 drivers/thermal/spear_thermal.c             | 7 +++++++
 drivers/thermal/tegra_soctherm.c            | 7 +++++++
 drivers/thermal/thermal_core.c              | 8 +++++---
 drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
 7 files changed, 47 insertions(+), 3 deletions(-)

-- 
2.0.0.rc2


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/8] thermal:cpu cooling:armada: Provide deferred probing for armada driver
  2014-11-13 17:02     ` Lukasz Majewski
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/armada_thermal.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/armada_thermal.c b/drivers/thermal/armada_thermal.c
index 9d1420a..1b5d15d 100644
--- a/drivers/thermal/armada_thermal.c
+++ b/drivers/thermal/armada_thermal.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/of_device.h>
 #include <linux/thermal.h>
+#include <linux/cpufreq.h>
 
 #define THERMAL_VALID_MASK		0x1
 
@@ -280,6 +281,12 @@ static int armada_thermal_probe(struct platform_device *pdev)
 	struct armada_thermal_priv *priv;
 	struct resource *res;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	match = of_match_device(armada_thermal_id_table, &pdev->dev);
 	if (!match)
 		return -ENODEV;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/8] thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood driver
  2014-11-13 17:02     ` Lukasz Majewski
  (?)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/kirkwood_thermal.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/kirkwood_thermal.c b/drivers/thermal/kirkwood_thermal.c
index 3b034a0..e938291 100644
--- a/drivers/thermal/kirkwood_thermal.c
+++ b/drivers/thermal/kirkwood_thermal.c
@@ -21,6 +21,7 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/thermal.h>
+#include <linux/cpufreq.h>
 
 #define KIRKWOOD_THERMAL_VALID_OFFSET	9
 #define KIRKWOOD_THERMAL_VALID_MASK	0x1
@@ -75,6 +76,12 @@ static int kirkwood_thermal_probe(struct platform_device *pdev)
 	struct kirkwood_thermal_priv *priv;
 	struct resource *res;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/8] thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (2 preceding siblings ...)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/rcar_thermal.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
index 8803e69..b268b4d 100644
--- a/drivers/thermal/rcar_thermal.c
+++ b/drivers/thermal/rcar_thermal.c
@@ -29,6 +29,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/thermal.h>
+#include <linux/cpufreq.h>
 
 #define IDLE_INTERVAL	5000
 
@@ -373,6 +374,12 @@ static int rcar_thermal_probe(struct platform_device *pdev)
 	int ret = -ENODEV;
 	int idle = IDLE_INTERVAL;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	common = devm_kzalloc(dev, sizeof(*common), GFP_KERNEL);
 	if (!common)
 		return -ENOMEM;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 4/8] thermal:cpu cooling:spear: Provide deferred probing for spear driver
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (3 preceding siblings ...)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/spear_thermal.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/spear_thermal.c b/drivers/thermal/spear_thermal.c
index 1e2193f..fd8fadf5 100644
--- a/drivers/thermal/spear_thermal.c
+++ b/drivers/thermal/spear_thermal.c
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/thermal.h>
+#include <linux/cpufreq.h>
 
 #define MD_FACTOR	1000
 
@@ -107,6 +108,12 @@ static int spear_thermal_probe(struct platform_device *pdev)
 	struct resource *res;
 	int ret = 0, val;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	if (!np || !of_property_read_u32(np, "st,thermal-flags", &val)) {
 		dev_err(&pdev->dev, "Failed: DT Pdata not passed\n");
 		return -EINVAL;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (4 preceding siblings ...)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  2014-11-14 10:47       ` Mikko Perttunen
       [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  -1 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/tegra_soctherm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/tegra_soctherm.c b/drivers/thermal/tegra_soctherm.c
index 70f7e9e..9c5aaa4 100644
--- a/drivers/thermal/tegra_soctherm.c
+++ b/drivers/thermal/tegra_soctherm.c
@@ -26,6 +26,7 @@
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 #include <linux/thermal.h>
+#include <linux/cpufreq.h>
 
 #include <soc/tegra/fuse.h>
 
@@ -346,6 +347,12 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 
 	const struct tegra_tsensor *tsensors = t124_tsensors;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
 	if (!tegra)
 		return -ENOMEM;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (5 preceding siblings ...)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  2014-11-20 19:00       ` Eduardo Valentin
  -1 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

When CPU freq is used as a thermal zone cooling device, one needs to wait
until cpufreq subsystem is properly initialized.

This code is similar to the one already available in imx_thermal.c file.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/thermal/ti-soc-thermal/ti-bandgap.c b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
index 634b6ce..0ac5ead 100644
--- a/drivers/thermal/ti-soc-thermal/ti-bandgap.c
+++ b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
@@ -40,6 +40,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_gpio.h>
 #include <linux/io.h>
+#include <linux/cpufreq.h>
 
 #include "ti-bandgap.h"
 
@@ -1196,6 +1197,12 @@ int ti_bandgap_probe(struct platform_device *pdev)
 	struct ti_bandgap *bgp;
 	int clk_rate, ret = 0, i;
 
+#ifdef CONFIG_CPU_THERMAL
+	if (!cpufreq_get_current_driver()) {
+		dev_dbg(&pdev->dev, "no cpufreq driver!");
+		return -EPROBE_DEFER;
+	}
+#endif
 	bgp = ti_bandgap_build(pdev);
 	if (IS_ERR(bgp)) {
 		dev_err(&pdev->dev, "failed to fetch platform data\n");
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 7/8] thermal:core:fix: Initialize the max_state variable to 0
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (6 preceding siblings ...)
  (?)
@ 2014-11-13 17:02     ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

Pointer to the uninitialized max_state variable is passed to get the
maximal cooling state.
For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state() is
called, which even when error occurs will return the max_state variable
unchanged.
Since error for ->get_max_state() is not checked, the automatically
allocated value of max_state is used for (upper > max_state) comparison.
For any possible max_state value it is very unlikely that it will be less
than upper.
As a consequence, the cooling device is bind even without the backed
cpufreq table initialized.

This initialization will prevent from accidental binding trip points to
cpu freq cooling frequencies when cpufreq driver itself is not yet fully
initialized.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/thermal_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 43b9070..413daa4 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_instance *pos;
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
-	unsigned long max_state;
+	unsigned long max_state = 0;
 	int result;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 8/8] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-13 17:02     ` Lukasz Majewski
@ 2014-11-13 17:02         ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Lukasz Majewski

Without this check it was possible to execute cooling device binding code
even when wrong max cooling state was read.

Signed-off-by: Lukasz Majewski <l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
---
 drivers/thermal/thermal_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 413daa4..c1ddef1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
 	unsigned long max_state = 0;
-	int result;
+	int result, ret;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
 		return -EINVAL;
@@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	if (tz != pos1 || cdev != pos2)
 		return -EINVAL;
 
-	cdev->ops->get_max_state(cdev, &max_state);
+	ret = cdev->ops->get_max_state(cdev, &max_state);
+	if (ret)
+		return ret;
 
 	/* lower default 0, upper default max_state */
 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
-- 
2.0.0.rc2

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 8/8] thermal:core:fix: Check return code of the ->get_max_state() callback
@ 2014-11-13 17:02         ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-13 17:02 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

Without this check it was possible to execute cooling device binding code
even when wrong max cooling state was read.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
 drivers/thermal/thermal_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 413daa4..c1ddef1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
 	unsigned long max_state = 0;
-	int result;
+	int result, ret;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
 		return -EINVAL;
@@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	if (tz != pos1 || cdev != pos2)
 		return -EINVAL;
 
-	cdev->ops->get_max_state(cdev, &max_state);
+	ret = cdev->ops->get_max_state(cdev, &max_state);
+	if (ret)
+		return ret;
 
 	/* lower default 0, upper default max_state */
 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
-- 
2.0.0.rc2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-13 17:02     ` [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver Lukasz Majewski
@ 2014-11-14 10:47       ` Mikko Perttunen
  2014-11-14 11:24         ` Lukasz Majewski
       [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
       [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  1 sibling, 2 replies; 62+ messages in thread
From: Mikko Perttunen @ 2014-11-14 10:47 UTC (permalink / raw)
  To: Lukasz Majewski, Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>

One potential issue I can see is that if the cpufreq driver fails to 
probe then you'll never get the thermal driver either. For example, 
Tegra124 currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was 
enabled, then the soctherm driver would never be able to probe. But I 
don't really have a solution for this either.

Cheers,
Mikko

On 11/13/2014 07:02 PM, Lukasz Majewski wrote:
> When CPU freq is used as a thermal zone cooling device, one needs to wait
> until cpufreq subsystem is properly initialized.
>
> This code is similar to the one already available in imx_thermal.c file.
>
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
>   drivers/thermal/tegra_soctherm.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/thermal/tegra_soctherm.c b/drivers/thermal/tegra_soctherm.c
> index 70f7e9e..9c5aaa4 100644
> --- a/drivers/thermal/tegra_soctherm.c
> +++ b/drivers/thermal/tegra_soctherm.c
> @@ -26,6 +26,7 @@
>   #include <linux/platform_device.h>
>   #include <linux/reset.h>
>   #include <linux/thermal.h>
> +#include <linux/cpufreq.h>
>
>   #include <soc/tegra/fuse.h>
>
> @@ -346,6 +347,12 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
>
>   	const struct tegra_tsensor *tsensors = t124_tsensors;
>
> +#ifdef CONFIG_CPU_THERMAL
> +	if (!cpufreq_get_current_driver()) {
> +		dev_dbg(&pdev->dev, "no cpufreq driver!");
> +		return -EPROBE_DEFER;
> +	}
> +#endif
>   	tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
>   	if (!tegra)
>   		return -ENOMEM;
>


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-14 10:47       ` Mikko Perttunen
@ 2014-11-14 11:24         ` Lukasz Majewski
  2014-11-17 11:57           ` Thierry Reding
       [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-14 11:24 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	linux-kernel

Hi Mikko,

> Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>

Thanks for testing.

> 
> One potential issue I can see is that if the cpufreq driver fails to 
> probe then you'll never get the thermal driver either. For example, 
> Tegra124 currently has no cpufreq driver, so if CONFIG_CPU_THERMAL
> was enabled, then the soctherm driver would never be able to probe.

Yes, this is a potential issue. However, this option in tegra_defconfig
is by default disabled when thermal is enabled.

> But I don't really have a solution for this either.

Ok. I see.

> 
> Cheers,
> Mikko
> 
> On 11/13/2014 07:02 PM, Lukasz Majewski wrote:
> > When CPU freq is used as a thermal zone cooling device, one needs
> > to wait until cpufreq subsystem is properly initialized.
> >
> > This code is similar to the one already available in imx_thermal.c
> > file.
> >
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > ---
> >   drivers/thermal/tegra_soctherm.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/thermal/tegra_soctherm.c
> > b/drivers/thermal/tegra_soctherm.c index 70f7e9e..9c5aaa4 100644
> > --- a/drivers/thermal/tegra_soctherm.c
> > +++ b/drivers/thermal/tegra_soctherm.c
> > @@ -26,6 +26,7 @@
> >   #include <linux/platform_device.h>
> >   #include <linux/reset.h>
> >   #include <linux/thermal.h>
> > +#include <linux/cpufreq.h>
> >
> >   #include <soc/tegra/fuse.h>
> >
> > @@ -346,6 +347,12 @@ static int tegra_soctherm_probe(struct
> > platform_device *pdev)
> >
> >   	const struct tegra_tsensor *tsensors = t124_tsensors;
> >
> > +#ifdef CONFIG_CPU_THERMAL
> > +	if (!cpufreq_get_current_driver()) {
> > +		dev_dbg(&pdev->dev, "no cpufreq driver!");
> > +		return -EPROBE_DEFER;
> > +	}
> > +#endif
> >   	tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra),
> > GFP_KERNEL); if (!tegra)
> >   		return -ENOMEM;
> >
> 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-13 17:02     ` [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver Lukasz Majewski
@ 2014-11-17 11:40           ` Thierry Reding
       [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  1 sibling, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 11:40 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1240 bytes --]

On Thu, Nov 13, 2014 at 06:02:42PM +0100, Lukasz Majewski wrote:
> When CPU freq is used as a thermal zone cooling device, one needs to wait
> until cpufreq subsystem is properly initialized.
> 
> This code is similar to the one already available in imx_thermal.c file.
> 
> Signed-off-by: Lukasz Majewski <l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
> ---
>  drivers/thermal/tegra_soctherm.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/thermal/tegra_soctherm.c b/drivers/thermal/tegra_soctherm.c
> index 70f7e9e..9c5aaa4 100644
> --- a/drivers/thermal/tegra_soctherm.c
> +++ b/drivers/thermal/tegra_soctherm.c
> @@ -26,6 +26,7 @@
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  #include <linux/thermal.h>
> +#include <linux/cpufreq.h>
>  
>  #include <soc/tegra/fuse.h>
>  
> @@ -346,6 +347,12 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
>  
>  	const struct tegra_tsensor *tsensors = t124_tsensors;
>  
> +#ifdef CONFIG_CPU_THERMAL
> +	if (!cpufreq_get_current_driver()) {
> +		dev_dbg(&pdev->dev, "no cpufreq driver!");

Shouldn't this rather be dev_err() or dev_warn() to give at least some
clue as to the cause?

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
@ 2014-11-17 11:40           ` Thierry Reding
  0 siblings, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 11:40 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Alexandre Courbot, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1212 bytes --]

On Thu, Nov 13, 2014 at 06:02:42PM +0100, Lukasz Majewski wrote:
> When CPU freq is used as a thermal zone cooling device, one needs to wait
> until cpufreq subsystem is properly initialized.
> 
> This code is similar to the one already available in imx_thermal.c file.
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
>  drivers/thermal/tegra_soctherm.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/thermal/tegra_soctherm.c b/drivers/thermal/tegra_soctherm.c
> index 70f7e9e..9c5aaa4 100644
> --- a/drivers/thermal/tegra_soctherm.c
> +++ b/drivers/thermal/tegra_soctherm.c
> @@ -26,6 +26,7 @@
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  #include <linux/thermal.h>
> +#include <linux/cpufreq.h>
>  
>  #include <soc/tegra/fuse.h>
>  
> @@ -346,6 +347,12 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
>  
>  	const struct tegra_tsensor *tsensors = t124_tsensors;
>  
> +#ifdef CONFIG_CPU_THERMAL
> +	if (!cpufreq_get_current_driver()) {
> +		dev_dbg(&pdev->dev, "no cpufreq driver!");

Shouldn't this rather be dev_err() or dev_warn() to give at least some
clue as to the cause?

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-14 10:47       ` Mikko Perttunen
@ 2014-11-17 11:43             ` Thierry Reding
       [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
  1 sibling, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 11:43 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> Tested-by: Mikko Perttunen <mikko.perttunen-/1wQRMveznE@public.gmane.org>
> 
> One potential issue I can see is that if the cpufreq driver fails to probe
> then you'll never get the thermal driver either. For example, Tegra124
> currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
> the soctherm driver would never be able to probe. But I don't really have a
> solution for this either.

It doesn't seem like there's any code whatsoever to deal with cpufreq
within the soctherm driver, so deferring probe based on something we're
not using anyway seems rather useless.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
@ 2014-11-17 11:43             ` Thierry Reding
  0 siblings, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 11:43 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 663 bytes --]

On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> 
> One potential issue I can see is that if the cpufreq driver fails to probe
> then you'll never get the thermal driver either. For example, Tegra124
> currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
> the soctherm driver would never be able to probe. But I don't really have a
> solution for this either.

It doesn't seem like there's any code whatsoever to deal with cpufreq
within the soctherm driver, so deferring probe based on something we're
not using anyway seems rather useless.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 11:43             ` Thierry Reding
  (?)
@ 2014-11-17 11:50             ` Lukasz Majewski
  2014-11-17 12:01                 ` Thierry Reding
  -1 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-17 11:50 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

Hi Thierry,

> On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> > 
> > One potential issue I can see is that if the cpufreq driver fails
> > to probe then you'll never get the thermal driver either. For
> > example, Tegra124 currently has no cpufreq driver, so if
> > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > never be able to probe. But I don't really have a solution for this
> > either.
> 
> It doesn't seem like there's any code whatsoever to deal with cpufreq
> within the soctherm driver, so deferring probe based on something
> we're not using anyway seems rather useless.

So, If I understood you correctly - this patch is not needed in the 
/tegra_soctherm.c:[tegra_defconfig] driver and can be safely omitted in
v2 of this driver.

Please note that I'm not aware of Tegra specific use cases and hence
I've prepared fixes for all potential candidates.

> 
> Thierry

-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-14 11:24         ` Lukasz Majewski
@ 2014-11-17 11:57           ` Thierry Reding
  2014-11-17 12:51             ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 11:57 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1220 bytes --]

On Fri, Nov 14, 2014 at 12:24:37PM +0100, Lukasz Majewski wrote:
> Hi Mikko,
> 
> > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> 
> Thanks for testing.
> 
> > 
> > One potential issue I can see is that if the cpufreq driver fails to 
> > probe then you'll never get the thermal driver either. For example, 
> > Tegra124 currently has no cpufreq driver, so if CONFIG_CPU_THERMAL
> > was enabled, then the soctherm driver would never be able to probe.
> 
> Yes, this is a potential issue. However, this option in tegra_defconfig
> is by default disabled when thermal is enabled.

Not everybody uses tegra_defconfig for their kernel builds. In fact I'd
imagine that the majority of kernels use a variant of multi_v7_defconfig
and therefore CPU_THERMAL might get enabled irrespective of any Tegra
support.

I think a better solution would be to add this check only when proper
support for CPU frequency based cooling is added. That is, when a call
to cpufreq_cooling_register() (or a variant thereof) is added.

But while at it, why not make it so that cpufreq_cooling_register()
detects if a cpufreq driver has been registered yet and propagate
-EPROBE_DEFER if necessary?

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 11:50             ` Lukasz Majewski
@ 2014-11-17 12:01                 ` Thierry Reding
  0 siblings, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 12:01 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1241 bytes --]

On Mon, Nov 17, 2014 at 12:50:13PM +0100, Lukasz Majewski wrote:
> Hi Thierry,
> 
> > On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> > > Tested-by: Mikko Perttunen <mikko.perttunen-/1wQRMveznE@public.gmane.org>
> > > 
> > > One potential issue I can see is that if the cpufreq driver fails
> > > to probe then you'll never get the thermal driver either. For
> > > example, Tegra124 currently has no cpufreq driver, so if
> > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > never be able to probe. But I don't really have a solution for this
> > > either.
> > 
> > It doesn't seem like there's any code whatsoever to deal with cpufreq
> > within the soctherm driver, so deferring probe based on something
> > we're not using anyway seems rather useless.
> 
> So, If I understood you correctly - this patch is not needed in the 
> /tegra_soctherm.c:[tegra_defconfig] driver and can be safely omitted in
> v2 of this driver.

What I'm saying is that I don't think doing this mass conversion
wholesale is useful since none of the drivers register a cooling device
based on cpufreq. In other words: if you're not going to use a feature
there's no use testing for it.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
@ 2014-11-17 12:01                 ` Thierry Reding
  0 siblings, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 12:01 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1221 bytes --]

On Mon, Nov 17, 2014 at 12:50:13PM +0100, Lukasz Majewski wrote:
> Hi Thierry,
> 
> > On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> > > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> > > 
> > > One potential issue I can see is that if the cpufreq driver fails
> > > to probe then you'll never get the thermal driver either. For
> > > example, Tegra124 currently has no cpufreq driver, so if
> > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > never be able to probe. But I don't really have a solution for this
> > > either.
> > 
> > It doesn't seem like there's any code whatsoever to deal with cpufreq
> > within the soctherm driver, so deferring probe based on something
> > we're not using anyway seems rather useless.
> 
> So, If I understood you correctly - this patch is not needed in the 
> /tegra_soctherm.c:[tegra_defconfig] driver and can be safely omitted in
> v2 of this driver.

What I'm saying is that I don't think doing this mass conversion
wholesale is useful since none of the drivers register a cooling device
based on cpufreq. In other words: if you're not going to use a feature
there's no use testing for it.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 11:43             ` Thierry Reding
@ 2014-11-17 12:51               ` Mikko Perttunen
  -1 siblings, 0 replies; 62+ messages in thread
From: Mikko Perttunen @ 2014-11-17 12:51 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 11/17/2014 01:43 PM, Thierry Reding wrote:
> On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
>> Tested-by: Mikko Perttunen <mikko.perttunen-/1wQRMveznE@public.gmane.org>
>>
>> One potential issue I can see is that if the cpufreq driver fails to probe
>> then you'll never get the thermal driver either. For example, Tegra124
>> currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
>> the soctherm driver would never be able to probe. But I don't really have a
>> solution for this either.
>
> It doesn't seem like there's any code whatsoever to deal with cpufreq
> within the soctherm driver, so deferring probe based on something we're
> not using anyway seems rather useless.
>
> Thierry
>

My understanding is that there needs to be no code inside soctherm to 
handle it, as the cpufreq driver (cpufreq-dt) will register a cooling 
device that will then be bound to the soctherm sensors using the 
of-thermal device tree properties. At this moment, however, we don't 
have that cpufreq driver so this patch is still useless for Tegra.

Mikko

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
@ 2014-11-17 12:51               ` Mikko Perttunen
  0 siblings, 0 replies; 62+ messages in thread
From: Mikko Perttunen @ 2014-11-17 12:51 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

On 11/17/2014 01:43 PM, Thierry Reding wrote:
> On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
>> Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
>>
>> One potential issue I can see is that if the cpufreq driver fails to probe
>> then you'll never get the thermal driver either. For example, Tegra124
>> currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
>> the soctherm driver would never be able to probe. But I don't really have a
>> solution for this either.
>
> It doesn't seem like there's any code whatsoever to deal with cpufreq
> within the soctherm driver, so deferring probe based on something we're
> not using anyway seems rather useless.
>
> Thierry
>

My understanding is that there needs to be no code inside soctherm to 
handle it, as the cpufreq driver (cpufreq-dt) will register a cooling 
device that will then be bound to the soctherm sensors using the 
of-thermal device tree properties. At this moment, however, we don't 
have that cpufreq driver so this patch is still useless for Tegra.

Mikko


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 11:57           ` Thierry Reding
@ 2014-11-17 12:51             ` Lukasz Majewski
  2014-11-17 13:18               ` Thierry Reding
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-17 12:51 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

Hi Thierry,

> On Fri, Nov 14, 2014 at 12:24:37PM +0100, Lukasz Majewski wrote:
> > Hi Mikko,
> > 
> > > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> > 
> > Thanks for testing.
> > 
> > > 
> > > One potential issue I can see is that if the cpufreq driver fails
> > > to probe then you'll never get the thermal driver either. For
> > > example, Tegra124 currently has no cpufreq driver, so if
> > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > never be able to probe.
> > 
> > Yes, this is a potential issue. However, this option in
> > tegra_defconfig is by default disabled when thermal is enabled.
> 
> Not everybody uses tegra_defconfig for their kernel builds. In fact
> I'd imagine that the majority of kernels use a variant of
> multi_v7_defconfig and therefore CPU_THERMAL might get enabled
> irrespective of any Tegra support.

I see your point. 

> 
> I think a better solution would be to add this check only when proper
> support for CPU frequency based cooling is added. That is, when a call
> to cpufreq_cooling_register() (or a variant thereof) is added.

For the above reason the code is only compiled in when user enable
CONFIG_CPU_THERMAL.

> 
> But while at it, why not make it so that cpufreq_cooling_register()
> detects if a cpufreq driver has been registered yet and propagate
> -EPROBE_DEFER if necessary?

cpufreq_cooling_register() may be a good place to add check for
deferred probe.

The problem with cpufreq_cooling_register() is that it may be called
late in the probe function (as it is done now in Exynos).


> 
> Thierry



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 12:01                 ` Thierry Reding
@ 2014-11-17 13:02                   ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-17 13:02 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi Thierry,

> On Mon, Nov 17, 2014 at 12:50:13PM +0100, Lukasz Majewski wrote:
> > Hi Thierry,
> > 
> > > On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> > > > Tested-by: Mikko Perttunen <mikko.perttunen-/1wQRMveznE@public.gmane.org>
> > > > 
> > > > One potential issue I can see is that if the cpufreq driver
> > > > fails to probe then you'll never get the thermal driver either.
> > > > For example, Tegra124 currently has no cpufreq driver, so if
> > > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > > never be able to probe. But I don't really have a solution for
> > > > this either.
> > > 
> > > It doesn't seem like there's any code whatsoever to deal with
> > > cpufreq within the soctherm driver, so deferring probe based on
> > > something we're not using anyway seems rather useless.
> > 
> > So, If I understood you correctly - this patch is not needed in the 
> > /tegra_soctherm.c:[tegra_defconfig] driver and can be safely
> > omitted in v2 of this driver.
> 
> What I'm saying is that I don't think doing this mass conversion
> wholesale is useful since none of the drivers register a cooling
> device based on cpufreq. In other words: if you're not going to use a
> feature there's no use testing for it.
> 

It seems, like one option here would be to add deferred proble to
cpufreq_cooling_register() or check which driver in its thermal probe
is calling cpufreq_cooling_register() function.

The latter option explains why in the imx_thermal.c file we check for
deferred probe without #ifdefs for CONFIG_CPU_THERMAL.

If no objections, I would like to stick to the code already available
in imx_thermal.c.

> Thierry

-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
@ 2014-11-17 13:02                   ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-17 13:02 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

Hi Thierry,

> On Mon, Nov 17, 2014 at 12:50:13PM +0100, Lukasz Majewski wrote:
> > Hi Thierry,
> > 
> > > On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> > > > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> > > > 
> > > > One potential issue I can see is that if the cpufreq driver
> > > > fails to probe then you'll never get the thermal driver either.
> > > > For example, Tegra124 currently has no cpufreq driver, so if
> > > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > > never be able to probe. But I don't really have a solution for
> > > > this either.
> > > 
> > > It doesn't seem like there's any code whatsoever to deal with
> > > cpufreq within the soctherm driver, so deferring probe based on
> > > something we're not using anyway seems rather useless.
> > 
> > So, If I understood you correctly - this patch is not needed in the 
> > /tegra_soctherm.c:[tegra_defconfig] driver and can be safely
> > omitted in v2 of this driver.
> 
> What I'm saying is that I don't think doing this mass conversion
> wholesale is useful since none of the drivers register a cooling
> device based on cpufreq. In other words: if you're not going to use a
> feature there's no use testing for it.
> 

It seems, like one option here would be to add deferred proble to
cpufreq_cooling_register() or check which driver in its thermal probe
is calling cpufreq_cooling_register() function.

The latter option explains why in the imx_thermal.c file we check for
deferred probe without #ifdefs for CONFIG_CPU_THERMAL.

If no objections, I would like to stick to the code already available
in imx_thermal.c.

> Thierry

-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 12:51               ` Mikko Perttunen
  (?)
@ 2014-11-17 13:08               ` Thierry Reding
  2014-11-17 13:24                 ` Mikko Perttunen
  -1 siblings, 1 reply; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 13:08 UTC (permalink / raw)
  To: Mikko Perttunen
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1315 bytes --]

On Mon, Nov 17, 2014 at 02:51:24PM +0200, Mikko Perttunen wrote:
> On 11/17/2014 01:43 PM, Thierry Reding wrote:
> >On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
> >>Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> >>
> >>One potential issue I can see is that if the cpufreq driver fails to probe
> >>then you'll never get the thermal driver either. For example, Tegra124
> >>currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
> >>the soctherm driver would never be able to probe. But I don't really have a
> >>solution for this either.
> >
> >It doesn't seem like there's any code whatsoever to deal with cpufreq
> >within the soctherm driver, so deferring probe based on something we're
> >not using anyway seems rather useless.
> >
> >Thierry
> >
> 
> My understanding is that there needs to be no code inside soctherm to handle
> it, as the cpufreq driver (cpufreq-dt) will register a cooling device that
> will then be bound to the soctherm sensors using the of-thermal device tree
> properties. At this moment, however, we don't have that cpufreq driver so
> this patch is still useless for Tegra.

But if the cpufreq driver will automatically do this already, why do we
even need to check for it in the soctherm driver?

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 12:51             ` Lukasz Majewski
@ 2014-11-17 13:18               ` Thierry Reding
  0 siblings, 0 replies; 62+ messages in thread
From: Thierry Reding @ 2014-11-17 13:18 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Mikko Perttunen, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2158 bytes --]

On Mon, Nov 17, 2014 at 01:51:42PM +0100, Lukasz Majewski wrote:
> Hi Thierry,
> 
> > On Fri, Nov 14, 2014 at 12:24:37PM +0100, Lukasz Majewski wrote:
> > > Hi Mikko,
> > > 
> > > > Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
> > > 
> > > Thanks for testing.
> > > 
> > > > 
> > > > One potential issue I can see is that if the cpufreq driver fails
> > > > to probe then you'll never get the thermal driver either. For
> > > > example, Tegra124 currently has no cpufreq driver, so if
> > > > CONFIG_CPU_THERMAL was enabled, then the soctherm driver would
> > > > never be able to probe.
> > > 
> > > Yes, this is a potential issue. However, this option in
> > > tegra_defconfig is by default disabled when thermal is enabled.
> > 
> > Not everybody uses tegra_defconfig for their kernel builds. In fact
> > I'd imagine that the majority of kernels use a variant of
> > multi_v7_defconfig and therefore CPU_THERMAL might get enabled
> > irrespective of any Tegra support.
> 
> I see your point. 
> 
> > 
> > I think a better solution would be to add this check only when proper
> > support for CPU frequency based cooling is added. That is, when a call
> > to cpufreq_cooling_register() (or a variant thereof) is added.
> 
> For the above reason the code is only compiled in when user enable
> CONFIG_CPU_THERMAL.

Like I said, CPU_THERMAL is a kernel-wide configuration option and not
tied to the specific SoC. Therefore if an SoC, say Exynos, gains full
support for cooling via cpufreq then somebody may decide to enable the
CPU_THERMAL option in multi_v7_defconfig. At that point every thermal
driver that you're patching in this way but for which no cpufreq driver
is registered will indefinitely defer probing.

So I see two options:

  1) make sure that we defer probing only on devices where a cpufreq
     driver is available
  2) not rely on deferred probing at all but allow a cooling device to
     be registered after the thermal driver has finished probing

1) seems impossible to do because cpufreq simply doesn't provide enough
infrastructure to deal with that situation.

Thierry

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
  2014-11-17 13:08               ` Thierry Reding
@ 2014-11-17 13:24                 ` Mikko Perttunen
  0 siblings, 0 replies; 62+ messages in thread
From: Mikko Perttunen @ 2014-11-17 13:24 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Lukasz Majewski, Eduardo Valentin, Zhang Rui, Ezequiel Garcia,
	Kuninori Morimoto, Linux PM list, Vincenzo Frascino,
	Bartlomiej Zolnierkiewicz, Lukasz Majewski, Nobuhiro Iwamatsu,
	Mikko Perttunen, Stephen Warren, Alexandre Courbot, linux-tegra,
	linux-kernel

On 11/17/2014 03:08 PM, Thierry Reding wrote:
> On Mon, Nov 17, 2014 at 02:51:24PM +0200, Mikko Perttunen wrote:
>> On 11/17/2014 01:43 PM, Thierry Reding wrote:
>>> On Fri, Nov 14, 2014 at 12:47:33PM +0200, Mikko Perttunen wrote:
>>>> Tested-by: Mikko Perttunen <mikko.perttunen@kapsi.fi>
>>>>
>>>> One potential issue I can see is that if the cpufreq driver fails to probe
>>>> then you'll never get the thermal driver either. For example, Tegra124
>>>> currently has no cpufreq driver, so if CONFIG_CPU_THERMAL was enabled, then
>>>> the soctherm driver would never be able to probe. But I don't really have a
>>>> solution for this either.
>>>
>>> It doesn't seem like there's any code whatsoever to deal with cpufreq
>>> within the soctherm driver, so deferring probe based on something we're
>>> not using anyway seems rather useless.
>>>
>>> Thierry
>>>
>>
>> My understanding is that there needs to be no code inside soctherm to handle
>> it, as the cpufreq driver (cpufreq-dt) will register a cooling device that
>> will then be bound to the soctherm sensors using the of-thermal device tree
>> properties. At this moment, however, we don't have that cpufreq driver so
>> this patch is still useless for Tegra.
>
> But if the cpufreq driver will automatically do this already, why do we
> even need to check for it in the soctherm driver?
>
> Thierry
>

Indeed, we shouldn't. Unless I am mistaken, the issue is then that the 
cpufreq cooling device calls thermal_cooling_device_register before 
being ready to handle callbacks, which clearly would be an issue in the 
cpufreq driver.

The thermal core seems to able to handle registrations of thermal zones 
and cooling devices in any order; AFAICT it defers binding the tz<->cdev 
mapping until both have registered themselves to the thermal core.

Mikko


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-13 17:02     ` Lukasz Majewski
                       ` (7 preceding siblings ...)
  (?)
@ 2014-11-18 10:16     ` Lukasz Majewski
       [not found]       ` <1416305790-27746-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
  -1 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-18 10:16 UTC (permalink / raw)
  To: Eduardo Valentin, Zhang Rui
  Cc: Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Lukasz Majewski

The return code from ->get_max_state() callback was not checked during
binding cooling device to thermal zone device.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
---
Changes for v2:
- It turned out that patches from 1 to 6 from v1 are not needed, since
  they either already solve the problem (like imx_thermal.c) or not use
  cpufreq as a thermal cooling device.
- The patch 7 from v1 is also not needed since this patch on error exits
  this function without using max_state variable.
- In thermal driver probe the cpufreq_cooling_register() method presence
  is crucial to evaluate if the thermal driver needs any actions with 
  -EPROBE_DEFER.
---
 drivers/thermal/thermal_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 43b9070..8567929 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	struct thermal_zone_device *pos1;
 	struct thermal_cooling_device *pos2;
 	unsigned long max_state;
-	int result;
+	int result, ret;
 
 	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
 		return -EINVAL;
@@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
 	if (tz != pos1 || cdev != pos2)
 		return -EINVAL;
 
-	cdev->ops->get_max_state(cdev, &max_state);
+	ret = cdev->ops->get_max_state(cdev, &max_state);
+	if (ret)
+		return ret;
 
 	/* lower default 0, upper default max_state */
 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
-- 
2.0.0.rc2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
  2014-11-13 17:02     ` Lukasz Majewski
@ 2014-11-20 18:54         ` Eduardo Valentin
  -1 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-20 18:54 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 8660 bytes --]

Lukasz,

Thanks for the keeping this up. And apologize for late answer.

On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> Presented fixes are a response for problem described below:
> http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> 
> In short - it turned out that two trivial fixes (included in this patch set)
> require support for deferred probe in thermal drivers.
> 
> This situation shows up when CPU frequency reduction is used as a thermal cooling
> device for a thermal zone.
> It happens that during initialization, the call to thermal probe will be executed
> before cpufreq probe (it can be observed at ./drivers/Makefile).
> In such a situation thermal will not be properly configured until cpufreq policy
> is setup.
> 
> In the current code (without included fixes) there is a time window in which thermal
> can try to use not configured cpufreq and possibly crash the system.
> 
> 
> Proposed solution was based on the code already available in the imx_thermal.c file.
> 
> /db8500_thermal.c:                      -> NOT NEEDED
> /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> /ti-soc-thermal/ti-bandgap.c:           -> FIXED  [omap2plus_defconfig]
> /dove_thermal.c:                        -> NOT NEEDED - CPU_COOLING NOT AVAILABLE
>                                                         [dove_defconfig]
> /spear_thermal.c:                       -> FIXED  [spear3xx_defconfig]
> /samsung/exynos_tmu.c:                  -> NOT NEEDED (nasty hack - will be reworked in later patches)
> /imx_thermal.c:                         -> OK (deferred probe already in place)
> /int340x_thermal/int3402_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
> /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
> /tegra_soctherm.c:                      -> FIXED  [tegra_defconfig]
> /kirkwood_thermal.c:                    -> FIXED  [multi_v5_defconfig]
> /armada_thermal.c:                      -> FIXED  [multi_v7_defconfig]
> /rcar_thermal.c:                        -> FIXED  [shmobile_defconfig]
> /db8500_cpufreq_cooling.c:              -> OK (deferred probe already in place) [multi_v7_defconfig]
> /st/st_thermal_syscfg.c:                -> NOT NEEDED (Those two are enabled by e.g. ARMADA)
> /st/st_thermal_memmap.c:
> 
> 


Instead of doing the same check on all drivers in the need for cpu
cooling looks like a promiscuous solution. What if we do this check in
cpu cooling itself and we propagate the error in callers code? 

From what I see, only exynos does not propagate the error. And we would
need a tweak in the cpufreq-dt code. Something like the following
(not tested):

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index f657c57..f139247 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 {
 	struct cpufreq_dt_platform_data *pd;
 	struct cpufreq_frequency_table *freq_table;
-	struct thermal_cooling_device *cdev;
 	struct device_node *np;
 	struct private_data *priv;
 	struct device *cpu_dev;
@@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 		goto out_free_priv;
 	}
 
-	/*
-	 * For now, just loading the cooling device;
-	 * thermal DT code takes care of matching them.
-	 */
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
-		if (IS_ERR(cdev))
-			dev_err(cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(cdev));
-		else
-			priv->cdev = cdev;
-	}
-
 	priv->cpu_dev = cpu_dev;
 	priv->cpu_reg = cpu_reg;
 	policy->driver_data = priv;
@@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	if (ret) {
 		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
 			ret);
-		goto out_cooling_unregister;
+		goto free_table;
 	}
 
 	policy->cpuinfo.transition_latency = transition_latency;
@@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 
 	return 0;
 
-out_cooling_unregister:
-	cpufreq_cooling_unregister(priv->cdev);
+free_table:
 	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
 out_free_priv:
 	kfree(priv);
@@ -342,11 +326,14 @@ static struct cpufreq_driver dt_cpufreq_driver = {
 
 static int dt_cpufreq_probe(struct platform_device *pdev)
 {
+	struct device_node *np;
 	struct device *cpu_dev;
 	struct regulator *cpu_reg;
 	struct clk *cpu_clk;
 	int ret;
 
+	/* at this point we checked the pointer already right? */
+	np = of_node_get(pdev->dev.of_node);
 	/*
 	 * All per-cluster (CPUs sharing clock/voltages) initialization is done
 	 * from ->init(). In probe(), we just need to make sure that clk and
@@ -368,6 +355,28 @@ static int dt_cpufreq_probe(struct platform_device *pdev)
 	if (ret)
 		dev_err(cpu_dev, "failed register driver: %d\n", ret);
 
+	/*
+	 * For now, just loading the cooling device;
+	 * thermal DT code takes care of matching them.
+	 */
+	if (of_find_property(np, "#cooling-cells", NULL)) {
+		struct cpufreq_policy policy;
+		struct private_data *priv;
+		struct thermal_cooling_device *cdev;
+
+		/* TODO: can cpu0 be always used ? */
+		cpufreq_get_policy(&policy, 0);
+		priv = policy.driver_data;
+		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
+		if (IS_ERR(cdev))
+			dev_err(cpu_dev,
+				"running cpufreq without cooling device: %ld\n",
+				PTR_ERR(cdev));
+		else
+			priv->cdev = cdev;
+	}
+	of_node_put(np);
+
 	return ret;
 }
 
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1ab0018..342eb9e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node *np,
 	int ret = 0, i;
 	struct cpufreq_policy policy;
 
+	if (!cpufreq_get_current_driver()) {
+		dev_warn(&pdev->dev, "no cpufreq driver, deferring.");
+		return -EPROBE_DEFER;
+	}
+
 	/* Verify that all the clip cpus have same freq_min, freq_max limit */
 	for_each_cpu(i, clip_cpus) {
 		/* continue if cpufreq policy not found and not return error */
diff --git a/drivers/thermal/samsung/exynos_thermal_common.c b/drivers/thermal/samsung/exynos_thermal_common.c
index 3f5ad25..f84975e 100644
--- a/drivers/thermal/samsung/exynos_thermal_common.c
+++ b/drivers/thermal/samsung/exynos_thermal_common.c
@@ -373,7 +373,7 @@ int exynos_register_thermal(struct thermal_sensor_conf *sensor_conf)
 		if (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size])) {
 			dev_err(sensor_conf->dev,
 				"Failed to register cpufreq cooling device\n");
-			ret = -EINVAL;
+			ret = PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]);
 			goto err_unregister;
 		}
 		th_zone->cool_dev_size++;


The above way, we avoid having same test in every driver that needs it.
Besides, it makes sense the cpu_cooling code takes care of this check,
as it is the very first part that has direct dependency with cpufreq.

> I only possess Exynos boards and Beagle Bone Black, so I'd be grateful for
> testing proposed solution on other boards. The posted code is compile tested.
> 
> This code applies on Eduardo's ti-soc-thermal-next tree:
> SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> 
> Lukasz Majewski (8):
>   thermal:cpu cooling:armada: Provide deferred probing for armada driver
>   thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood
>     driver
>   thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
>   thermal:cpu cooling:spear: Provide deferred probing for spear driver
>   thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
>   thermal:cpu cooling:ti: Provide deferred probing for ti drivers
>   thermal:core:fix: Initialize the max_state variable to 0
>   thermal:core:fix: Check return code of the ->get_max_state() callback
> 
>  drivers/thermal/armada_thermal.c            | 7 +++++++
>  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
>  drivers/thermal/rcar_thermal.c              | 7 +++++++
>  drivers/thermal/spear_thermal.c             | 7 +++++++
>  drivers/thermal/tegra_soctherm.c            | 7 +++++++
>  drivers/thermal/thermal_core.c              | 8 +++++---
>  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
>  7 files changed, 47 insertions(+), 3 deletions(-)
> 
> -- 
> 2.0.0.rc2
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
@ 2014-11-20 18:54         ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-20 18:54 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 8660 bytes --]

Lukasz,

Thanks for the keeping this up. And apologize for late answer.

On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> Presented fixes are a response for problem described below:
> http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> 
> In short - it turned out that two trivial fixes (included in this patch set)
> require support for deferred probe in thermal drivers.
> 
> This situation shows up when CPU frequency reduction is used as a thermal cooling
> device for a thermal zone.
> It happens that during initialization, the call to thermal probe will be executed
> before cpufreq probe (it can be observed at ./drivers/Makefile).
> In such a situation thermal will not be properly configured until cpufreq policy
> is setup.
> 
> In the current code (without included fixes) there is a time window in which thermal
> can try to use not configured cpufreq and possibly crash the system.
> 
> 
> Proposed solution was based on the code already available in the imx_thermal.c file.
> 
> /db8500_thermal.c:                      -> NOT NEEDED
> /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> /ti-soc-thermal/ti-bandgap.c:           -> FIXED  [omap2plus_defconfig]
> /dove_thermal.c:                        -> NOT NEEDED - CPU_COOLING NOT AVAILABLE
>                                                         [dove_defconfig]
> /spear_thermal.c:                       -> FIXED  [spear3xx_defconfig]
> /samsung/exynos_tmu.c:                  -> NOT NEEDED (nasty hack - will be reworked in later patches)
> /imx_thermal.c:                         -> OK (deferred probe already in place)
> /int340x_thermal/int3402_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
> /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED - ACPI x86 - Intel specific
> /tegra_soctherm.c:                      -> FIXED  [tegra_defconfig]
> /kirkwood_thermal.c:                    -> FIXED  [multi_v5_defconfig]
> /armada_thermal.c:                      -> FIXED  [multi_v7_defconfig]
> /rcar_thermal.c:                        -> FIXED  [shmobile_defconfig]
> /db8500_cpufreq_cooling.c:              -> OK (deferred probe already in place) [multi_v7_defconfig]
> /st/st_thermal_syscfg.c:                -> NOT NEEDED (Those two are enabled by e.g. ARMADA)
> /st/st_thermal_memmap.c:
> 
> 


Instead of doing the same check on all drivers in the need for cpu
cooling looks like a promiscuous solution. What if we do this check in
cpu cooling itself and we propagate the error in callers code? 

From what I see, only exynos does not propagate the error. And we would
need a tweak in the cpufreq-dt code. Something like the following
(not tested):

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index f657c57..f139247 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 {
 	struct cpufreq_dt_platform_data *pd;
 	struct cpufreq_frequency_table *freq_table;
-	struct thermal_cooling_device *cdev;
 	struct device_node *np;
 	struct private_data *priv;
 	struct device *cpu_dev;
@@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 		goto out_free_priv;
 	}
 
-	/*
-	 * For now, just loading the cooling device;
-	 * thermal DT code takes care of matching them.
-	 */
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
-		if (IS_ERR(cdev))
-			dev_err(cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(cdev));
-		else
-			priv->cdev = cdev;
-	}
-
 	priv->cpu_dev = cpu_dev;
 	priv->cpu_reg = cpu_reg;
 	policy->driver_data = priv;
@@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	if (ret) {
 		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
 			ret);
-		goto out_cooling_unregister;
+		goto free_table;
 	}
 
 	policy->cpuinfo.transition_latency = transition_latency;
@@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 
 	return 0;
 
-out_cooling_unregister:
-	cpufreq_cooling_unregister(priv->cdev);
+free_table:
 	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
 out_free_priv:
 	kfree(priv);
@@ -342,11 +326,14 @@ static struct cpufreq_driver dt_cpufreq_driver = {
 
 static int dt_cpufreq_probe(struct platform_device *pdev)
 {
+	struct device_node *np;
 	struct device *cpu_dev;
 	struct regulator *cpu_reg;
 	struct clk *cpu_clk;
 	int ret;
 
+	/* at this point we checked the pointer already right? */
+	np = of_node_get(pdev->dev.of_node);
 	/*
 	 * All per-cluster (CPUs sharing clock/voltages) initialization is done
 	 * from ->init(). In probe(), we just need to make sure that clk and
@@ -368,6 +355,28 @@ static int dt_cpufreq_probe(struct platform_device *pdev)
 	if (ret)
 		dev_err(cpu_dev, "failed register driver: %d\n", ret);
 
+	/*
+	 * For now, just loading the cooling device;
+	 * thermal DT code takes care of matching them.
+	 */
+	if (of_find_property(np, "#cooling-cells", NULL)) {
+		struct cpufreq_policy policy;
+		struct private_data *priv;
+		struct thermal_cooling_device *cdev;
+
+		/* TODO: can cpu0 be always used ? */
+		cpufreq_get_policy(&policy, 0);
+		priv = policy.driver_data;
+		cdev = of_cpufreq_cooling_register(np, cpu_present_mask);
+		if (IS_ERR(cdev))
+			dev_err(cpu_dev,
+				"running cpufreq without cooling device: %ld\n",
+				PTR_ERR(cdev));
+		else
+			priv->cdev = cdev;
+	}
+	of_node_put(np);
+
 	return ret;
 }
 
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1ab0018..342eb9e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node *np,
 	int ret = 0, i;
 	struct cpufreq_policy policy;
 
+	if (!cpufreq_get_current_driver()) {
+		dev_warn(&pdev->dev, "no cpufreq driver, deferring.");
+		return -EPROBE_DEFER;
+	}
+
 	/* Verify that all the clip cpus have same freq_min, freq_max limit */
 	for_each_cpu(i, clip_cpus) {
 		/* continue if cpufreq policy not found and not return error */
diff --git a/drivers/thermal/samsung/exynos_thermal_common.c b/drivers/thermal/samsung/exynos_thermal_common.c
index 3f5ad25..f84975e 100644
--- a/drivers/thermal/samsung/exynos_thermal_common.c
+++ b/drivers/thermal/samsung/exynos_thermal_common.c
@@ -373,7 +373,7 @@ int exynos_register_thermal(struct thermal_sensor_conf *sensor_conf)
 		if (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size])) {
 			dev_err(sensor_conf->dev,
 				"Failed to register cpufreq cooling device\n");
-			ret = -EINVAL;
+			ret = PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]);
 			goto err_unregister;
 		}
 		th_zone->cool_dev_size++;


The above way, we avoid having same test in every driver that needs it.
Besides, it makes sense the cpu_cooling code takes care of this check,
as it is the very first part that has direct dependency with cpufreq.

> I only possess Exynos boards and Beagle Bone Black, so I'd be grateful for
> testing proposed solution on other boards. The posted code is compile tested.
> 
> This code applies on Eduardo's ti-soc-thermal-next tree:
> SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> 
> Lukasz Majewski (8):
>   thermal:cpu cooling:armada: Provide deferred probing for armada driver
>   thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood
>     driver
>   thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
>   thermal:cpu cooling:spear: Provide deferred probing for spear driver
>   thermal:cpu cooling:tegra: Provide deferred probing for tegra driver
>   thermal:cpu cooling:ti: Provide deferred probing for ti drivers
>   thermal:core:fix: Initialize the max_state variable to 0
>   thermal:core:fix: Check return code of the ->get_max_state() callback
> 
>  drivers/thermal/armada_thermal.c            | 7 +++++++
>  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
>  drivers/thermal/rcar_thermal.c              | 7 +++++++
>  drivers/thermal/spear_thermal.c             | 7 +++++++
>  drivers/thermal/tegra_soctherm.c            | 7 +++++++
>  drivers/thermal/thermal_core.c              | 8 +++++---
>  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
>  7 files changed, 47 insertions(+), 3 deletions(-)
> 
> -- 
> 2.0.0.rc2
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers
  2014-11-13 17:02     ` [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers Lukasz Majewski
@ 2014-11-20 19:00       ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-20 19:00 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1442 bytes --]

Lukaz,

On Thu, Nov 13, 2014 at 06:02:43PM +0100, Lukasz Majewski wrote:
> When CPU freq is used as a thermal zone cooling device, one needs to wait
> until cpufreq subsystem is properly initialized.
> 
> This code is similar to the one already available in imx_thermal.c file.
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
>  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/thermal/ti-soc-thermal/ti-bandgap.c b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
> index 634b6ce..0ac5ead 100644
> --- a/drivers/thermal/ti-soc-thermal/ti-bandgap.c
> +++ b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
> @@ -40,6 +40,7 @@
>  #include <linux/of_irq.h>
>  #include <linux/of_gpio.h>
>  #include <linux/io.h>
> +#include <linux/cpufreq.h>
>  
>  #include "ti-bandgap.h"
>  
> @@ -1196,6 +1197,12 @@ int ti_bandgap_probe(struct platform_device *pdev)
>  	struct ti_bandgap *bgp;
>  	int clk_rate, ret = 0, i;
>  
> +#ifdef CONFIG_CPU_THERMAL
> +	if (!cpufreq_get_current_driver()) {
> +		dev_dbg(&pdev->dev, "no cpufreq driver!");
> +		return -EPROBE_DEFER;
> +	}
> +#endif

This is part of drivers/thermal/ti-soc-thermal/ti-thermal-common.c. So,
no need to duplicate it in here.

>  	bgp = ti_bandgap_build(pdev);
>  	if (IS_ERR(bgp)) {
>  		dev_err(&pdev->dev, "failed to fetch platform data\n");
> -- 
> 2.0.0.rc2
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
  2014-11-21  8:33         ` Lukasz Majewski
@ 2014-11-21  2:47           ` Eduardo Valentin
  2014-11-21 16:28               ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-21  2:47 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 11107 bytes --]

Lukasz,

On Fri, Nov 21, 2014 at 09:33:48AM +0100, Lukasz Majewski wrote:
> Hi Eduardo,
> 
> > Lukasz,
> > 
> > Thanks for the keeping this up. And apologize for late answer.
> 
> I've already posted v2 of this patch set (which consists of only one
> patch :-) ).
> 
> Thanks to Thierry Reding's hint, I've realized that I don't need to add
> code from patches 1-6 from v1.
> 
> Please instead review following patch:
> "thermal:core:fix: Check return code of the ->get_max_state() callback"
> https://patchwork.kernel.org/patch/5326991/

I see. If I got correctly, with the above patch, we still need to have
the check for cpufreq driver in the thermal driver right?
quoting:
"In thermal driver probe the cpufreq_cooling_register() method presence
  is crucial to evaluate if the thermal driver needs any actions with 
    -EPROBE_DEFER."

If yes, that means the proposal still leaves to drivers to deal with
the sequencing. For the patch above, I believe it is fine. However, a
better sequencing is still needed :-(. 

For the case of of-thermal based drivers, it should be dealt between
cpu_cooling and cpufreq, as I proposed, bellow. I really agree that
drivers should not care about this, and thus we should not spread the
check among drivers, specially if there is nothing regarding cpufreq in
the driver's code. I might send the proposal of having the check between
cpu_cooling and cpufreq as a formal patch, in a separated thread.

I will have a look in your v2. Briefly looking, looks reasonable.


Once again, thanks.

Cheers,

Eduardo Valentin

> 
> 
> > 
> > On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> > > Presented fixes are a response for problem described below:
> > > http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> > > 
> > > In short - it turned out that two trivial fixes (included in this
> > > patch set) require support for deferred probe in thermal drivers.
> > > 
> > > This situation shows up when CPU frequency reduction is used as a
> > > thermal cooling device for a thermal zone.
> > > It happens that during initialization, the call to thermal probe
> > > will be executed before cpufreq probe (it can be observed
> > > at ./drivers/Makefile). In such a situation thermal will not be
> > > properly configured until cpufreq policy is setup.
> > > 
> > > In the current code (without included fixes) there is a time window
> > > in which thermal can try to use not configured cpufreq and possibly
> > > crash the system.
> > > 
> > > 
> > > Proposed solution was based on the code already available in the
> > > imx_thermal.c file.
> > > 
> > > /db8500_thermal.c:                      -> NOT NEEDED
> > > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> > > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> > > /ti-soc-thermal/ti-bandgap.c:           -> FIXED
> > > [omap2plus_defconfig] /dove_thermal.c:                        ->
> > > NOT NEEDED - CPU_COOLING NOT AVAILABLE [dove_defconfig]
> > > /spear_thermal.c:                       -> FIXED
> > > [spear3xx_defconfig] /samsung/exynos_tmu.c:                  -> NOT
> > > NEEDED (nasty hack - will be reworked in later
> > > patches) /imx_thermal.c:                         -> OK (deferred
> > > probe already in place) /int340x_thermal/int3402_thermal.c:     ->
> > > NOT NEEDED - ACPI x86 - Intel
> > > specific /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED -
> > > ACPI x86 - Intel specific /tegra_soctherm.c:
> > > -> FIXED  [tegra_defconfig] /kirkwood_thermal.c:
> > > -> FIXED
> > > [multi_v5_defconfig] /armada_thermal.c:                      ->
> > > FIXED  [multi_v7_defconfig] /rcar_thermal.c:
> > > -> FIXED
> > > [shmobile_defconfig] /db8500_cpufreq_cooling.c:              -> OK
> > > (deferred probe already in place)
> > > [multi_v7_defconfig] /st/st_thermal_syscfg.c:                -> NOT
> > > NEEDED (Those two are enabled by e.g.
> > > ARMADA) /st/st_thermal_memmap.c:
> > > 
> > > 
> > 
> > 
> > Instead of doing the same check on all drivers in the need for cpu
> > cooling looks like a promiscuous solution. What if we do this check in
> > cpu cooling itself and we propagate the error in callers code? 
> > 
> > From what I see, only exynos does not propagate the error. And we
> > would need a tweak in the cpufreq-dt code. Something like the
> > following (not tested):
> > 
> > diff --git a/drivers/cpufreq/cpufreq-dt.c
> > b/drivers/cpufreq/cpufreq-dt.c index f657c57..f139247 100644
> > --- a/drivers/cpufreq/cpufreq-dt.c
> > +++ b/drivers/cpufreq/cpufreq-dt.c
> > @@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy
> > *policy) {
> >  	struct cpufreq_dt_platform_data *pd;
> >  	struct cpufreq_frequency_table *freq_table;
> > -	struct thermal_cooling_device *cdev;
> >  	struct device_node *np;
> >  	struct private_data *priv;
> >  	struct device *cpu_dev;
> > @@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy
> > *policy) goto out_free_priv;
> >  	}
> >  
> > -	/*
> > -	 * For now, just loading the cooling device;
> > -	 * thermal DT code takes care of matching them.
> > -	 */
> > -	if (of_find_property(np, "#cooling-cells", NULL)) {
> > -		cdev = of_cpufreq_cooling_register(np,
> > cpu_present_mask);
> > -		if (IS_ERR(cdev))
> > -			dev_err(cpu_dev,
> > -				"running cpufreq without cooling
> > device: %ld\n",
> > -				PTR_ERR(cdev));
> > -		else
> > -			priv->cdev = cdev;
> > -	}
> > -
> >  	priv->cpu_dev = cpu_dev;
> >  	priv->cpu_reg = cpu_reg;
> >  	policy->driver_data = priv;
> > @@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy
> > *policy) if (ret) {
> >  		dev_err(cpu_dev, "%s: invalid frequency table:
> > %d\n", __func__, ret);
> > -		goto out_cooling_unregister;
> > +		goto free_table;
> >  	}
> >  
> >  	policy->cpuinfo.transition_latency = transition_latency;
> > @@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy
> > *policy) 
> >  	return 0;
> >  
> > -out_cooling_unregister:
> > -	cpufreq_cooling_unregister(priv->cdev);
> > +free_table:
> >  	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
> >  out_free_priv:
> >  	kfree(priv);
> > @@ -342,11 +326,14 @@ static struct cpufreq_driver dt_cpufreq_driver
> > = { 
> >  static int dt_cpufreq_probe(struct platform_device *pdev)
> >  {
> > +	struct device_node *np;
> >  	struct device *cpu_dev;
> >  	struct regulator *cpu_reg;
> >  	struct clk *cpu_clk;
> >  	int ret;
> >  
> > +	/* at this point we checked the pointer already right? */
> > +	np = of_node_get(pdev->dev.of_node);
> >  	/*
> >  	 * All per-cluster (CPUs sharing clock/voltages)
> > initialization is done
> >  	 * from ->init(). In probe(), we just need to make sure that
> > clk and @@ -368,6 +355,28 @@ static int dt_cpufreq_probe(struct
> > platform_device *pdev) if (ret)
> >  		dev_err(cpu_dev, "failed register driver: %d\n",
> > ret); 
> > +	/*
> > +	 * For now, just loading the cooling device;
> > +	 * thermal DT code takes care of matching them.
> > +	 */
> > +	if (of_find_property(np, "#cooling-cells", NULL)) {
> > +		struct cpufreq_policy policy;
> > +		struct private_data *priv;
> > +		struct thermal_cooling_device *cdev;
> > +
> > +		/* TODO: can cpu0 be always used ? */
> > +		cpufreq_get_policy(&policy, 0);
> > +		priv = policy.driver_data;
> > +		cdev = of_cpufreq_cooling_register(np,
> > cpu_present_mask);
> > +		if (IS_ERR(cdev))
> > +			dev_err(cpu_dev,
> > +				"running cpufreq without cooling
> > device: %ld\n",
> > +				PTR_ERR(cdev));
> > +		else
> > +			priv->cdev = cdev;
> > +	}
> > +	of_node_put(np);
> > +
> >  	return ret;
> >  }
> >  
> > diff --git a/drivers/thermal/cpu_cooling.c
> > b/drivers/thermal/cpu_cooling.c index 1ab0018..342eb9e 100644
> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node
> > *np, int ret = 0, i;
> >  	struct cpufreq_policy policy;
> >  
> > +	if (!cpufreq_get_current_driver()) {
> > +		dev_warn(&pdev->dev, "no cpufreq driver,
> > deferring.");
> > +		return -EPROBE_DEFER;
> > +	}
> > +
> >  	/* Verify that all the clip cpus have same freq_min,
> > freq_max limit */ for_each_cpu(i, clip_cpus) {
> >  		/* continue if cpufreq policy not found and not
> > return error */ diff --git
> > a/drivers/thermal/samsung/exynos_thermal_common.c
> > b/drivers/thermal/samsung/exynos_thermal_common.c index
> > 3f5ad25..f84975e 100644 ---
> > a/drivers/thermal/samsung/exynos_thermal_common.c +++
> > b/drivers/thermal/samsung/exynos_thermal_common.c @@ -373,7 +373,7 @@
> > int exynos_register_thermal(struct thermal_sensor_conf *sensor_conf)
> > if (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size]))
> > { dev_err(sensor_conf->dev, "Failed to register cpufreq cooling
> > device\n");
> > -			ret = -EINVAL;
> > +			ret =
> > PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]); goto
> > err_unregister; }
> >  		th_zone->cool_dev_size++;
> > 
> > 
> > The above way, we avoid having same test in every driver that needs
> > it. Besides, it makes sense the cpu_cooling code takes care of this
> > check, as it is the very first part that has direct dependency with
> > cpufreq.
> > 
> > > I only possess Exynos boards and Beagle Bone Black, so I'd be
> > > grateful for testing proposed solution on other boards. The posted
> > > code is compile tested.
> > > 
> > > This code applies on Eduardo's ti-soc-thermal-next tree:
> > > SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> > > 
> > > Lukasz Majewski (8):
> > >   thermal:cpu cooling:armada: Provide deferred probing for armada
> > > driver thermal:cpu cooling:kirkwood: Provide deferred probing for
> > > kirkwood driver
> > >   thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
> > >   thermal:cpu cooling:spear: Provide deferred probing for spear
> > > driver thermal:cpu cooling:tegra: Provide deferred probing for
> > > tegra driver thermal:cpu cooling:ti: Provide deferred probing for
> > > ti drivers thermal:core:fix: Initialize the max_state variable to 0
> > >   thermal:core:fix: Check return code of the ->get_max_state()
> > > callback
> > > 
> > >  drivers/thermal/armada_thermal.c            | 7 +++++++
> > >  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
> > >  drivers/thermal/rcar_thermal.c              | 7 +++++++
> > >  drivers/thermal/spear_thermal.c             | 7 +++++++
> > >  drivers/thermal/tegra_soctherm.c            | 7 +++++++
> > >  drivers/thermal/thermal_core.c              | 8 +++++---
> > >  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
> > >  7 files changed, 47 insertions(+), 3 deletions(-)
> > > 
> > > -- 
> > > 2.0.0.rc2
> > > 
> 
> 
> 
> -- 
> Best regards,
> 
> Lukasz Majewski
> 
> Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
  2014-11-20 18:54         ` Eduardo Valentin
  (?)
@ 2014-11-21  8:33         ` Lukasz Majewski
  2014-11-21  2:47           ` Eduardo Valentin
  -1 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-21  8:33 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

Hi Eduardo,

> Lukasz,
> 
> Thanks for the keeping this up. And apologize for late answer.

I've already posted v2 of this patch set (which consists of only one
patch :-) ).

Thanks to Thierry Reding's hint, I've realized that I don't need to add
code from patches 1-6 from v1.

Please instead review following patch:
"thermal:core:fix: Check return code of the ->get_max_state() callback"
https://patchwork.kernel.org/patch/5326991/


> 
> On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> > Presented fixes are a response for problem described below:
> > http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> > 
> > In short - it turned out that two trivial fixes (included in this
> > patch set) require support for deferred probe in thermal drivers.
> > 
> > This situation shows up when CPU frequency reduction is used as a
> > thermal cooling device for a thermal zone.
> > It happens that during initialization, the call to thermal probe
> > will be executed before cpufreq probe (it can be observed
> > at ./drivers/Makefile). In such a situation thermal will not be
> > properly configured until cpufreq policy is setup.
> > 
> > In the current code (without included fixes) there is a time window
> > in which thermal can try to use not configured cpufreq and possibly
> > crash the system.
> > 
> > 
> > Proposed solution was based on the code already available in the
> > imx_thermal.c file.
> > 
> > /db8500_thermal.c:                      -> NOT NEEDED
> > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL (x86)
> > /ti-soc-thermal/ti-bandgap.c:           -> FIXED
> > [omap2plus_defconfig] /dove_thermal.c:                        ->
> > NOT NEEDED - CPU_COOLING NOT AVAILABLE [dove_defconfig]
> > /spear_thermal.c:                       -> FIXED
> > [spear3xx_defconfig] /samsung/exynos_tmu.c:                  -> NOT
> > NEEDED (nasty hack - will be reworked in later
> > patches) /imx_thermal.c:                         -> OK (deferred
> > probe already in place) /int340x_thermal/int3402_thermal.c:     ->
> > NOT NEEDED - ACPI x86 - Intel
> > specific /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED -
> > ACPI x86 - Intel specific /tegra_soctherm.c:
> > -> FIXED  [tegra_defconfig] /kirkwood_thermal.c:
> > -> FIXED
> > [multi_v5_defconfig] /armada_thermal.c:                      ->
> > FIXED  [multi_v7_defconfig] /rcar_thermal.c:
> > -> FIXED
> > [shmobile_defconfig] /db8500_cpufreq_cooling.c:              -> OK
> > (deferred probe already in place)
> > [multi_v7_defconfig] /st/st_thermal_syscfg.c:                -> NOT
> > NEEDED (Those two are enabled by e.g.
> > ARMADA) /st/st_thermal_memmap.c:
> > 
> > 
> 
> 
> Instead of doing the same check on all drivers in the need for cpu
> cooling looks like a promiscuous solution. What if we do this check in
> cpu cooling itself and we propagate the error in callers code? 
> 
> From what I see, only exynos does not propagate the error. And we
> would need a tweak in the cpufreq-dt code. Something like the
> following (not tested):
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c
> b/drivers/cpufreq/cpufreq-dt.c index f657c57..f139247 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy
> *policy) {
>  	struct cpufreq_dt_platform_data *pd;
>  	struct cpufreq_frequency_table *freq_table;
> -	struct thermal_cooling_device *cdev;
>  	struct device_node *np;
>  	struct private_data *priv;
>  	struct device *cpu_dev;
> @@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy
> *policy) goto out_free_priv;
>  	}
>  
> -	/*
> -	 * For now, just loading the cooling device;
> -	 * thermal DT code takes care of matching them.
> -	 */
> -	if (of_find_property(np, "#cooling-cells", NULL)) {
> -		cdev = of_cpufreq_cooling_register(np,
> cpu_present_mask);
> -		if (IS_ERR(cdev))
> -			dev_err(cpu_dev,
> -				"running cpufreq without cooling
> device: %ld\n",
> -				PTR_ERR(cdev));
> -		else
> -			priv->cdev = cdev;
> -	}
> -
>  	priv->cpu_dev = cpu_dev;
>  	priv->cpu_reg = cpu_reg;
>  	policy->driver_data = priv;
> @@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy
> *policy) if (ret) {
>  		dev_err(cpu_dev, "%s: invalid frequency table:
> %d\n", __func__, ret);
> -		goto out_cooling_unregister;
> +		goto free_table;
>  	}
>  
>  	policy->cpuinfo.transition_latency = transition_latency;
> @@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy
> *policy) 
>  	return 0;
>  
> -out_cooling_unregister:
> -	cpufreq_cooling_unregister(priv->cdev);
> +free_table:
>  	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
>  out_free_priv:
>  	kfree(priv);
> @@ -342,11 +326,14 @@ static struct cpufreq_driver dt_cpufreq_driver
> = { 
>  static int dt_cpufreq_probe(struct platform_device *pdev)
>  {
> +	struct device_node *np;
>  	struct device *cpu_dev;
>  	struct regulator *cpu_reg;
>  	struct clk *cpu_clk;
>  	int ret;
>  
> +	/* at this point we checked the pointer already right? */
> +	np = of_node_get(pdev->dev.of_node);
>  	/*
>  	 * All per-cluster (CPUs sharing clock/voltages)
> initialization is done
>  	 * from ->init(). In probe(), we just need to make sure that
> clk and @@ -368,6 +355,28 @@ static int dt_cpufreq_probe(struct
> platform_device *pdev) if (ret)
>  		dev_err(cpu_dev, "failed register driver: %d\n",
> ret); 
> +	/*
> +	 * For now, just loading the cooling device;
> +	 * thermal DT code takes care of matching them.
> +	 */
> +	if (of_find_property(np, "#cooling-cells", NULL)) {
> +		struct cpufreq_policy policy;
> +		struct private_data *priv;
> +		struct thermal_cooling_device *cdev;
> +
> +		/* TODO: can cpu0 be always used ? */
> +		cpufreq_get_policy(&policy, 0);
> +		priv = policy.driver_data;
> +		cdev = of_cpufreq_cooling_register(np,
> cpu_present_mask);
> +		if (IS_ERR(cdev))
> +			dev_err(cpu_dev,
> +				"running cpufreq without cooling
> device: %ld\n",
> +				PTR_ERR(cdev));
> +		else
> +			priv->cdev = cdev;
> +	}
> +	of_node_put(np);
> +
>  	return ret;
>  }
>  
> diff --git a/drivers/thermal/cpu_cooling.c
> b/drivers/thermal/cpu_cooling.c index 1ab0018..342eb9e 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node
> *np, int ret = 0, i;
>  	struct cpufreq_policy policy;
>  
> +	if (!cpufreq_get_current_driver()) {
> +		dev_warn(&pdev->dev, "no cpufreq driver,
> deferring.");
> +		return -EPROBE_DEFER;
> +	}
> +
>  	/* Verify that all the clip cpus have same freq_min,
> freq_max limit */ for_each_cpu(i, clip_cpus) {
>  		/* continue if cpufreq policy not found and not
> return error */ diff --git
> a/drivers/thermal/samsung/exynos_thermal_common.c
> b/drivers/thermal/samsung/exynos_thermal_common.c index
> 3f5ad25..f84975e 100644 ---
> a/drivers/thermal/samsung/exynos_thermal_common.c +++
> b/drivers/thermal/samsung/exynos_thermal_common.c @@ -373,7 +373,7 @@
> int exynos_register_thermal(struct thermal_sensor_conf *sensor_conf)
> if (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size]))
> { dev_err(sensor_conf->dev, "Failed to register cpufreq cooling
> device\n");
> -			ret = -EINVAL;
> +			ret =
> PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]); goto
> err_unregister; }
>  		th_zone->cool_dev_size++;
> 
> 
> The above way, we avoid having same test in every driver that needs
> it. Besides, it makes sense the cpu_cooling code takes care of this
> check, as it is the very first part that has direct dependency with
> cpufreq.
> 
> > I only possess Exynos boards and Beagle Bone Black, so I'd be
> > grateful for testing proposed solution on other boards. The posted
> > code is compile tested.
> > 
> > This code applies on Eduardo's ti-soc-thermal-next tree:
> > SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> > 
> > Lukasz Majewski (8):
> >   thermal:cpu cooling:armada: Provide deferred probing for armada
> > driver thermal:cpu cooling:kirkwood: Provide deferred probing for
> > kirkwood driver
> >   thermal:cpu cooling:rcar: Provide deferred probing for rcar driver
> >   thermal:cpu cooling:spear: Provide deferred probing for spear
> > driver thermal:cpu cooling:tegra: Provide deferred probing for
> > tegra driver thermal:cpu cooling:ti: Provide deferred probing for
> > ti drivers thermal:core:fix: Initialize the max_state variable to 0
> >   thermal:core:fix: Check return code of the ->get_max_state()
> > callback
> > 
> >  drivers/thermal/armada_thermal.c            | 7 +++++++
> >  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
> >  drivers/thermal/rcar_thermal.c              | 7 +++++++
> >  drivers/thermal/spear_thermal.c             | 7 +++++++
> >  drivers/thermal/tegra_soctherm.c            | 7 +++++++
> >  drivers/thermal/thermal_core.c              | 8 +++++---
> >  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
> >  7 files changed, 47 insertions(+), 3 deletions(-)
> > 
> > -- 
> > 2.0.0.rc2
> > 



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
  2014-11-21  2:47           ` Eduardo Valentin
@ 2014-11-21 16:28               ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-21 16:28 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi Eduardo,

> Lukasz,
> 
> On Fri, Nov 21, 2014 at 09:33:48AM +0100, Lukasz Majewski wrote:
> > Hi Eduardo,
> > 
> > > Lukasz,
> > > 
> > > Thanks for the keeping this up. And apologize for late answer.
> > 
> > I've already posted v2 of this patch set (which consists of only one
> > patch :-) ).
> > 
> > Thanks to Thierry Reding's hint, I've realized that I don't need to
> > add code from patches 1-6 from v1.
> > 
> > Please instead review following patch:
> > "thermal:core:fix: Check return code of the ->get_max_state()
> > callback" https://patchwork.kernel.org/patch/5326991/
> 
> I see. If I got correctly, with the above patch, we still need to have
> the check for cpufreq driver in the thermal driver right?
> quoting:
> "In thermal driver probe the cpufreq_cooling_register() method
> presence is crucial to evaluate if the thermal driver needs any
> actions with -EPROBE_DEFER."

Yes we need those checks in thermal drivers. However, proper checks are
already in place - please look into imx_thermal.c case.

> 
> If yes, that means the proposal still leaves to drivers to deal with
> the sequencing. For the patch above, I believe it is fine. However, a
> better sequencing is still needed :-(. 

There is always a trade off. What I've shown in v2 is that I'm pretty
confident that my fix won't introduce any regression for present code.

> 
> For the case of of-thermal based drivers, it should be dealt between
> cpu_cooling and cpufreq, as I proposed, bellow. I really agree that
> drivers should not care about this, and thus we should not spread the
> check among drivers,

Unfortunately several checks are already in place (imx_thermal.c,
db8500_cpufreq.c, ti-soc*.c, in some way the old Exynos thermal driver).

> specially if there is nothing regarding cpufreq
> in the driver's code. 

There is a call to cpufreq_cooling_register(), which sometimes happens
in the late part of probe function. In such a case it would be quite
challenging to release already allocated resources (e.g. ti, old
Exynos driver).

> I might send the proposal of having the check
> between cpu_cooling and cpufreq as a formal patch, in a separated
> thread.

It would be beneficial to hear Viresh's opinion.

> 
> I will have a look in your v2. Briefly looking, looks reasonable.

In short: The goal of this patch is to show that regressions should not
happen and that it can be safely applied for v3.19.

> 
> 
> Once again, thanks.
> 
> Cheers,
> 
> Eduardo Valentin
> 
> > 
> > 
> > > 
> > > On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> > > > Presented fixes are a response for problem described below:
> > > > http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> > > > 
> > > > In short - it turned out that two trivial fixes (included in
> > > > this patch set) require support for deferred probe in thermal
> > > > drivers.
> > > > 
> > > > This situation shows up when CPU frequency reduction is used as
> > > > a thermal cooling device for a thermal zone.
> > > > It happens that during initialization, the call to thermal probe
> > > > will be executed before cpufreq probe (it can be observed
> > > > at ./drivers/Makefile). In such a situation thermal will not be
> > > > properly configured until cpufreq policy is setup.
> > > > 
> > > > In the current code (without included fixes) there is a time
> > > > window in which thermal can try to use not configured cpufreq
> > > > and possibly crash the system.
> > > > 
> > > > 
> > > > Proposed solution was based on the code already available in the
> > > > imx_thermal.c file.
> > > > 
> > > > /db8500_thermal.c:                      -> NOT NEEDED
> > > > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL
> > > > (x86) /intel_powerclamp.c:                    -> NOT NEEDED -
> > > > INTEL (x86) /ti-soc-thermal/ti-bandgap.c:           -> FIXED
> > > > [omap2plus_defconfig] /dove_thermal.c:                        ->
> > > > NOT NEEDED - CPU_COOLING NOT AVAILABLE [dove_defconfig]
> > > > /spear_thermal.c:                       -> FIXED
> > > > [spear3xx_defconfig] /samsung/exynos_tmu.c:                  ->
> > > > NOT NEEDED (nasty hack - will be reworked in later
> > > > patches) /imx_thermal.c:                         -> OK (deferred
> > > > probe already in place) /int340x_thermal/int3402_thermal.c:
> > > > -> NOT NEEDED - ACPI x86 - Intel
> > > > specific /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED -
> > > > ACPI x86 - Intel specific /tegra_soctherm.c:
> > > > -> FIXED  [tegra_defconfig] /kirkwood_thermal.c:
> > > > -> FIXED
> > > > [multi_v5_defconfig] /armada_thermal.c:                      ->
> > > > FIXED  [multi_v7_defconfig] /rcar_thermal.c:
> > > > -> FIXED
> > > > [shmobile_defconfig] /db8500_cpufreq_cooling.c:              ->
> > > > OK (deferred probe already in place)
> > > > [multi_v7_defconfig] /st/st_thermal_syscfg.c:                ->
> > > > NOT NEEDED (Those two are enabled by e.g.
> > > > ARMADA) /st/st_thermal_memmap.c:
> > > > 
> > > > 
> > > 
> > > 
> > > Instead of doing the same check on all drivers in the need for cpu
> > > cooling looks like a promiscuous solution. What if we do this
> > > check in cpu cooling itself and we propagate the error in callers
> > > code? 
> > > 
> > > From what I see, only exynos does not propagate the error. And we
> > > would need a tweak in the cpufreq-dt code. Something like the
> > > following (not tested):
> > > 
> > > diff --git a/drivers/cpufreq/cpufreq-dt.c
> > > b/drivers/cpufreq/cpufreq-dt.c index f657c57..f139247 100644
> > > --- a/drivers/cpufreq/cpufreq-dt.c
> > > +++ b/drivers/cpufreq/cpufreq-dt.c
> > > @@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) {
> > >  	struct cpufreq_dt_platform_data *pd;
> > >  	struct cpufreq_frequency_table *freq_table;
> > > -	struct thermal_cooling_device *cdev;
> > >  	struct device_node *np;
> > >  	struct private_data *priv;
> > >  	struct device *cpu_dev;
> > > @@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) goto out_free_priv;
> > >  	}
> > >  
> > > -	/*
> > > -	 * For now, just loading the cooling device;
> > > -	 * thermal DT code takes care of matching them.
> > > -	 */
> > > -	if (of_find_property(np, "#cooling-cells", NULL)) {
> > > -		cdev = of_cpufreq_cooling_register(np,
> > > cpu_present_mask);
> > > -		if (IS_ERR(cdev))
> > > -			dev_err(cpu_dev,
> > > -				"running cpufreq without cooling
> > > device: %ld\n",
> > > -				PTR_ERR(cdev));
> > > -		else
> > > -			priv->cdev = cdev;
> > > -	}
> > > -
> > >  	priv->cpu_dev = cpu_dev;
> > >  	priv->cpu_reg = cpu_reg;
> > >  	policy->driver_data = priv;
> > > @@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) if (ret) {
> > >  		dev_err(cpu_dev, "%s: invalid frequency table:
> > > %d\n", __func__, ret);
> > > -		goto out_cooling_unregister;
> > > +		goto free_table;
> > >  	}
> > >  
> > >  	policy->cpuinfo.transition_latency = transition_latency;
> > > @@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) 
> > >  	return 0;
> > >  
> > > -out_cooling_unregister:
> > > -	cpufreq_cooling_unregister(priv->cdev);
> > > +free_table:
> > >  	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
> > >  out_free_priv:
> > >  	kfree(priv);
> > > @@ -342,11 +326,14 @@ static struct cpufreq_driver
> > > dt_cpufreq_driver = { 
> > >  static int dt_cpufreq_probe(struct platform_device *pdev)
> > >  {
> > > +	struct device_node *np;
> > >  	struct device *cpu_dev;
> > >  	struct regulator *cpu_reg;
> > >  	struct clk *cpu_clk;
> > >  	int ret;
> > >  
> > > +	/* at this point we checked the pointer already right? */
> > > +	np = of_node_get(pdev->dev.of_node);
> > >  	/*
> > >  	 * All per-cluster (CPUs sharing clock/voltages)
> > > initialization is done
> > >  	 * from ->init(). In probe(), we just need to make sure
> > > that clk and @@ -368,6 +355,28 @@ static int
> > > dt_cpufreq_probe(struct platform_device *pdev) if (ret)
> > >  		dev_err(cpu_dev, "failed register driver: %d\n",
> > > ret); 
> > > +	/*
> > > +	 * For now, just loading the cooling device;
> > > +	 * thermal DT code takes care of matching them.
> > > +	 */
> > > +	if (of_find_property(np, "#cooling-cells", NULL)) {
> > > +		struct cpufreq_policy policy;
> > > +		struct private_data *priv;
> > > +		struct thermal_cooling_device *cdev;
> > > +
> > > +		/* TODO: can cpu0 be always used ? */
> > > +		cpufreq_get_policy(&policy, 0);
> > > +		priv = policy.driver_data;
> > > +		cdev = of_cpufreq_cooling_register(np,
> > > cpu_present_mask);
> > > +		if (IS_ERR(cdev))
> > > +			dev_err(cpu_dev,
> > > +				"running cpufreq without cooling
> > > device: %ld\n",
> > > +				PTR_ERR(cdev));
> > > +		else
> > > +			priv->cdev = cdev;
> > > +	}
> > > +	of_node_put(np);
> > > +
> > >  	return ret;
> > >  }
> > >  
> > > diff --git a/drivers/thermal/cpu_cooling.c
> > > b/drivers/thermal/cpu_cooling.c index 1ab0018..342eb9e 100644
> > > --- a/drivers/thermal/cpu_cooling.c
> > > +++ b/drivers/thermal/cpu_cooling.c
> > > @@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node
> > > *np, int ret = 0, i;
> > >  	struct cpufreq_policy policy;
> > >  
> > > +	if (!cpufreq_get_current_driver()) {
> > > +		dev_warn(&pdev->dev, "no cpufreq driver,
> > > deferring.");
> > > +		return -EPROBE_DEFER;
> > > +	}
> > > +
> > >  	/* Verify that all the clip cpus have same freq_min,
> > > freq_max limit */ for_each_cpu(i, clip_cpus) {
> > >  		/* continue if cpufreq policy not found and not
> > > return error */ diff --git
> > > a/drivers/thermal/samsung/exynos_thermal_common.c
> > > b/drivers/thermal/samsung/exynos_thermal_common.c index
> > > 3f5ad25..f84975e 100644 ---
> > > a/drivers/thermal/samsung/exynos_thermal_common.c +++
> > > b/drivers/thermal/samsung/exynos_thermal_common.c @@ -373,7
> > > +373,7 @@ int exynos_register_thermal(struct thermal_sensor_conf
> > > *sensor_conf) if
> > > (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size]))
> > > { dev_err(sensor_conf->dev, "Failed to register cpufreq cooling
> > > device\n");
> > > -			ret = -EINVAL;
> > > +			ret =
> > > PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]); goto
> > > err_unregister; }
> > >  		th_zone->cool_dev_size++;
> > > 
> > > 
> > > The above way, we avoid having same test in every driver that
> > > needs it. Besides, it makes sense the cpu_cooling code takes care
> > > of this check, as it is the very first part that has direct
> > > dependency with cpufreq.
> > > 
> > > > I only possess Exynos boards and Beagle Bone Black, so I'd be
> > > > grateful for testing proposed solution on other boards. The
> > > > posted code is compile tested.
> > > > 
> > > > This code applies on Eduardo's ti-soc-thermal-next tree:
> > > > SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> > > > 
> > > > Lukasz Majewski (8):
> > > >   thermal:cpu cooling:armada: Provide deferred probing for
> > > > armada driver thermal:cpu cooling:kirkwood: Provide deferred
> > > > probing for kirkwood driver
> > > >   thermal:cpu cooling:rcar: Provide deferred probing for rcar
> > > > driver thermal:cpu cooling:spear: Provide deferred probing for
> > > > spear driver thermal:cpu cooling:tegra: Provide deferred
> > > > probing for tegra driver thermal:cpu cooling:ti: Provide
> > > > deferred probing for ti drivers thermal:core:fix: Initialize
> > > > the max_state variable to 0 thermal:core:fix: Check return code
> > > > of the ->get_max_state() callback
> > > > 
> > > >  drivers/thermal/armada_thermal.c            | 7 +++++++
> > > >  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
> > > >  drivers/thermal/rcar_thermal.c              | 7 +++++++
> > > >  drivers/thermal/spear_thermal.c             | 7 +++++++
> > > >  drivers/thermal/tegra_soctherm.c            | 7 +++++++
> > > >  drivers/thermal/thermal_core.c              | 8 +++++---
> > > >  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
> > > >  7 files changed, 47 insertions(+), 3 deletions(-)
> > > > 
> > > > -- 
> > > > 2.0.0.rc2
> > > > 
> > 
> > 
> > 
> > -- 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > Samsung R&D Institute Poland (SRPOL) | Linux Platform Group



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers
@ 2014-11-21 16:28               ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-21 16:28 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

Hi Eduardo,

> Lukasz,
> 
> On Fri, Nov 21, 2014 at 09:33:48AM +0100, Lukasz Majewski wrote:
> > Hi Eduardo,
> > 
> > > Lukasz,
> > > 
> > > Thanks for the keeping this up. And apologize for late answer.
> > 
> > I've already posted v2 of this patch set (which consists of only one
> > patch :-) ).
> > 
> > Thanks to Thierry Reding's hint, I've realized that I don't need to
> > add code from patches 1-6 from v1.
> > 
> > Please instead review following patch:
> > "thermal:core:fix: Check return code of the ->get_max_state()
> > callback" https://patchwork.kernel.org/patch/5326991/
> 
> I see. If I got correctly, with the above patch, we still need to have
> the check for cpufreq driver in the thermal driver right?
> quoting:
> "In thermal driver probe the cpufreq_cooling_register() method
> presence is crucial to evaluate if the thermal driver needs any
> actions with -EPROBE_DEFER."

Yes we need those checks in thermal drivers. However, proper checks are
already in place - please look into imx_thermal.c case.

> 
> If yes, that means the proposal still leaves to drivers to deal with
> the sequencing. For the patch above, I believe it is fine. However, a
> better sequencing is still needed :-(. 

There is always a trade off. What I've shown in v2 is that I'm pretty
confident that my fix won't introduce any regression for present code.

> 
> For the case of of-thermal based drivers, it should be dealt between
> cpu_cooling and cpufreq, as I proposed, bellow. I really agree that
> drivers should not care about this, and thus we should not spread the
> check among drivers,

Unfortunately several checks are already in place (imx_thermal.c,
db8500_cpufreq.c, ti-soc*.c, in some way the old Exynos thermal driver).

> specially if there is nothing regarding cpufreq
> in the driver's code. 

There is a call to cpufreq_cooling_register(), which sometimes happens
in the late part of probe function. In such a case it would be quite
challenging to release already allocated resources (e.g. ti, old
Exynos driver).

> I might send the proposal of having the check
> between cpu_cooling and cpufreq as a formal patch, in a separated
> thread.

It would be beneficial to hear Viresh's opinion.

> 
> I will have a look in your v2. Briefly looking, looks reasonable.

In short: The goal of this patch is to show that regressions should not
happen and that it can be safely applied for v3.19.

> 
> 
> Once again, thanks.
> 
> Cheers,
> 
> Eduardo Valentin
> 
> > 
> > 
> > > 
> > > On Thu, Nov 13, 2014 at 06:02:37PM +0100, Lukasz Majewski wrote:
> > > > Presented fixes are a response for problem described below:
> > > > http://thread.gmane.org/gmane.linux.kernel/1793821/match=thermal+core+fix+initialize+max_state+variable+0
> > > > 
> > > > In short - it turned out that two trivial fixes (included in
> > > > this patch set) require support for deferred probe in thermal
> > > > drivers.
> > > > 
> > > > This situation shows up when CPU frequency reduction is used as
> > > > a thermal cooling device for a thermal zone.
> > > > It happens that during initialization, the call to thermal probe
> > > > will be executed before cpufreq probe (it can be observed
> > > > at ./drivers/Makefile). In such a situation thermal will not be
> > > > properly configured until cpufreq policy is setup.
> > > > 
> > > > In the current code (without included fixes) there is a time
> > > > window in which thermal can try to use not configured cpufreq
> > > > and possibly crash the system.
> > > > 
> > > > 
> > > > Proposed solution was based on the code already available in the
> > > > imx_thermal.c file.
> > > > 
> > > > /db8500_thermal.c:                      -> NOT NEEDED
> > > > /intel_powerclamp.c:                    -> NOT NEEDED - INTEL
> > > > (x86) /intel_powerclamp.c:                    -> NOT NEEDED -
> > > > INTEL (x86) /ti-soc-thermal/ti-bandgap.c:           -> FIXED
> > > > [omap2plus_defconfig] /dove_thermal.c:                        ->
> > > > NOT NEEDED - CPU_COOLING NOT AVAILABLE [dove_defconfig]
> > > > /spear_thermal.c:                       -> FIXED
> > > > [spear3xx_defconfig] /samsung/exynos_tmu.c:                  ->
> > > > NOT NEEDED (nasty hack - will be reworked in later
> > > > patches) /imx_thermal.c:                         -> OK (deferred
> > > > probe already in place) /int340x_thermal/int3402_thermal.c:
> > > > -> NOT NEEDED - ACPI x86 - Intel
> > > > specific /int340x_thermal/int3400_thermal.c:     -> NOT NEEDED -
> > > > ACPI x86 - Intel specific /tegra_soctherm.c:
> > > > -> FIXED  [tegra_defconfig] /kirkwood_thermal.c:
> > > > -> FIXED
> > > > [multi_v5_defconfig] /armada_thermal.c:                      ->
> > > > FIXED  [multi_v7_defconfig] /rcar_thermal.c:
> > > > -> FIXED
> > > > [shmobile_defconfig] /db8500_cpufreq_cooling.c:              ->
> > > > OK (deferred probe already in place)
> > > > [multi_v7_defconfig] /st/st_thermal_syscfg.c:                ->
> > > > NOT NEEDED (Those two are enabled by e.g.
> > > > ARMADA) /st/st_thermal_memmap.c:
> > > > 
> > > > 
> > > 
> > > 
> > > Instead of doing the same check on all drivers in the need for cpu
> > > cooling looks like a promiscuous solution. What if we do this
> > > check in cpu cooling itself and we propagate the error in callers
> > > code? 
> > > 
> > > From what I see, only exynos does not propagate the error. And we
> > > would need a tweak in the cpufreq-dt code. Something like the
> > > following (not tested):
> > > 
> > > diff --git a/drivers/cpufreq/cpufreq-dt.c
> > > b/drivers/cpufreq/cpufreq-dt.c index f657c57..f139247 100644
> > > --- a/drivers/cpufreq/cpufreq-dt.c
> > > +++ b/drivers/cpufreq/cpufreq-dt.c
> > > @@ -181,7 +181,6 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) {
> > >  	struct cpufreq_dt_platform_data *pd;
> > >  	struct cpufreq_frequency_table *freq_table;
> > > -	struct thermal_cooling_device *cdev;
> > >  	struct device_node *np;
> > >  	struct private_data *priv;
> > >  	struct device *cpu_dev;
> > > @@ -264,20 +263,6 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) goto out_free_priv;
> > >  	}
> > >  
> > > -	/*
> > > -	 * For now, just loading the cooling device;
> > > -	 * thermal DT code takes care of matching them.
> > > -	 */
> > > -	if (of_find_property(np, "#cooling-cells", NULL)) {
> > > -		cdev = of_cpufreq_cooling_register(np,
> > > cpu_present_mask);
> > > -		if (IS_ERR(cdev))
> > > -			dev_err(cpu_dev,
> > > -				"running cpufreq without cooling
> > > device: %ld\n",
> > > -				PTR_ERR(cdev));
> > > -		else
> > > -			priv->cdev = cdev;
> > > -	}
> > > -
> > >  	priv->cpu_dev = cpu_dev;
> > >  	priv->cpu_reg = cpu_reg;
> > >  	policy->driver_data = priv;
> > > @@ -287,7 +272,7 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) if (ret) {
> > >  		dev_err(cpu_dev, "%s: invalid frequency table:
> > > %d\n", __func__, ret);
> > > -		goto out_cooling_unregister;
> > > +		goto free_table;
> > >  	}
> > >  
> > >  	policy->cpuinfo.transition_latency = transition_latency;
> > > @@ -300,8 +285,7 @@ static int cpufreq_init(struct cpufreq_policy
> > > *policy) 
> > >  	return 0;
> > >  
> > > -out_cooling_unregister:
> > > -	cpufreq_cooling_unregister(priv->cdev);
> > > +free_table:
> > >  	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
> > >  out_free_priv:
> > >  	kfree(priv);
> > > @@ -342,11 +326,14 @@ static struct cpufreq_driver
> > > dt_cpufreq_driver = { 
> > >  static int dt_cpufreq_probe(struct platform_device *pdev)
> > >  {
> > > +	struct device_node *np;
> > >  	struct device *cpu_dev;
> > >  	struct regulator *cpu_reg;
> > >  	struct clk *cpu_clk;
> > >  	int ret;
> > >  
> > > +	/* at this point we checked the pointer already right? */
> > > +	np = of_node_get(pdev->dev.of_node);
> > >  	/*
> > >  	 * All per-cluster (CPUs sharing clock/voltages)
> > > initialization is done
> > >  	 * from ->init(). In probe(), we just need to make sure
> > > that clk and @@ -368,6 +355,28 @@ static int
> > > dt_cpufreq_probe(struct platform_device *pdev) if (ret)
> > >  		dev_err(cpu_dev, "failed register driver: %d\n",
> > > ret); 
> > > +	/*
> > > +	 * For now, just loading the cooling device;
> > > +	 * thermal DT code takes care of matching them.
> > > +	 */
> > > +	if (of_find_property(np, "#cooling-cells", NULL)) {
> > > +		struct cpufreq_policy policy;
> > > +		struct private_data *priv;
> > > +		struct thermal_cooling_device *cdev;
> > > +
> > > +		/* TODO: can cpu0 be always used ? */
> > > +		cpufreq_get_policy(&policy, 0);
> > > +		priv = policy.driver_data;
> > > +		cdev = of_cpufreq_cooling_register(np,
> > > cpu_present_mask);
> > > +		if (IS_ERR(cdev))
> > > +			dev_err(cpu_dev,
> > > +				"running cpufreq without cooling
> > > device: %ld\n",
> > > +				PTR_ERR(cdev));
> > > +		else
> > > +			priv->cdev = cdev;
> > > +	}
> > > +	of_node_put(np);
> > > +
> > >  	return ret;
> > >  }
> > >  
> > > diff --git a/drivers/thermal/cpu_cooling.c
> > > b/drivers/thermal/cpu_cooling.c index 1ab0018..342eb9e 100644
> > > --- a/drivers/thermal/cpu_cooling.c
> > > +++ b/drivers/thermal/cpu_cooling.c
> > > @@ -440,6 +440,11 @@ __cpufreq_cooling_register(struct device_node
> > > *np, int ret = 0, i;
> > >  	struct cpufreq_policy policy;
> > >  
> > > +	if (!cpufreq_get_current_driver()) {
> > > +		dev_warn(&pdev->dev, "no cpufreq driver,
> > > deferring.");
> > > +		return -EPROBE_DEFER;
> > > +	}
> > > +
> > >  	/* Verify that all the clip cpus have same freq_min,
> > > freq_max limit */ for_each_cpu(i, clip_cpus) {
> > >  		/* continue if cpufreq policy not found and not
> > > return error */ diff --git
> > > a/drivers/thermal/samsung/exynos_thermal_common.c
> > > b/drivers/thermal/samsung/exynos_thermal_common.c index
> > > 3f5ad25..f84975e 100644 ---
> > > a/drivers/thermal/samsung/exynos_thermal_common.c +++
> > > b/drivers/thermal/samsung/exynos_thermal_common.c @@ -373,7
> > > +373,7 @@ int exynos_register_thermal(struct thermal_sensor_conf
> > > *sensor_conf) if
> > > (IS_ERR(th_zone->cool_dev[th_zone->cool_dev_size]))
> > > { dev_err(sensor_conf->dev, "Failed to register cpufreq cooling
> > > device\n");
> > > -			ret = -EINVAL;
> > > +			ret =
> > > PTR_ERR(th_zone->cool_dev[th_zone->cool_dev_size]); goto
> > > err_unregister; }
> > >  		th_zone->cool_dev_size++;
> > > 
> > > 
> > > The above way, we avoid having same test in every driver that
> > > needs it. Besides, it makes sense the cpu_cooling code takes care
> > > of this check, as it is the very first part that has direct
> > > dependency with cpufreq.
> > > 
> > > > I only possess Exynos boards and Beagle Bone Black, so I'd be
> > > > grateful for testing proposed solution on other boards. The
> > > > posted code is compile tested.
> > > > 
> > > > This code applies on Eduardo's ti-soc-thermal-next tree:
> > > > SHA1: 208a97042d66d9bfbcfab0d4a00c9fe317bb73d3
> > > > 
> > > > Lukasz Majewski (8):
> > > >   thermal:cpu cooling:armada: Provide deferred probing for
> > > > armada driver thermal:cpu cooling:kirkwood: Provide deferred
> > > > probing for kirkwood driver
> > > >   thermal:cpu cooling:rcar: Provide deferred probing for rcar
> > > > driver thermal:cpu cooling:spear: Provide deferred probing for
> > > > spear driver thermal:cpu cooling:tegra: Provide deferred
> > > > probing for tegra driver thermal:cpu cooling:ti: Provide
> > > > deferred probing for ti drivers thermal:core:fix: Initialize
> > > > the max_state variable to 0 thermal:core:fix: Check return code
> > > > of the ->get_max_state() callback
> > > > 
> > > >  drivers/thermal/armada_thermal.c            | 7 +++++++
> > > >  drivers/thermal/kirkwood_thermal.c          | 7 +++++++
> > > >  drivers/thermal/rcar_thermal.c              | 7 +++++++
> > > >  drivers/thermal/spear_thermal.c             | 7 +++++++
> > > >  drivers/thermal/tegra_soctherm.c            | 7 +++++++
> > > >  drivers/thermal/thermal_core.c              | 8 +++++---
> > > >  drivers/thermal/ti-soc-thermal/ti-bandgap.c | 7 +++++++
> > > >  7 files changed, 47 insertions(+), 3 deletions(-)
> > > > 
> > > > -- 
> > > > 2.0.0.rc2
> > > > 
> > 
> > 
> > 
> > -- 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > Samsung R&D Institute Poland (SRPOL) | Linux Platform Group



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-18 10:16     ` [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback Lukasz Majewski
@ 2014-11-21 18:08           ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-21 18:08 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 3497 bytes --]


Lukasz,

On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> The return code from ->get_max_state() callback was not checked during
> binding cooling device to thermal zone device.
> 
> Signed-off-by: Lukasz Majewski <l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
> ---
> Changes for v2:
> - It turned out that patches from 1 to 6 from v1 are not needed, since
>   they either already solve the problem (like imx_thermal.c) or not use
>   cpufreq as a thermal cooling device.
> - The patch 7 from v1 is also not needed since this patch on error exits
>   this function without using max_state variable.
> - In thermal driver probe the cpufreq_cooling_register() method presence
>   is crucial to evaluate if the thermal driver needs any actions with 
>   -EPROBE_DEFER.

Have you tried this patch with of-thermal based systems?

The above proposal works if the thermal driver is dealing with loading
cpu_cooling. But for of-thermal based drivers, the idea is to leave to
cpufreq code to load it. 

As an example, I am taking the ti-soc-thermal, but we already have other
of-thermal based drivers. Booting with this patch ti-soc-thermal
(of-based boot) loads fine, but the cpu_cooling never gets bound to the
thermal zone.

The thing is that the bind may happen before cpufreq-dt code loads the
cpufreq driver, and when cpu_cooling is checking what is the max freq,
by using cpufreq table, it won't be able to do it, as there is no table.

While, without the patch, it will use wrong in the binding, but after
it gets bound, and cpufreq loads, the max will be used correctly.

And in this case, the system still works besides this bug. The reasoning
is because the max state comes from DT (2) and lower and upper wont be
equal to THERMAL_NO_LIMIT. Then, the following check will use the
parameter, instead of max_state:

        cdev->ops->get_max_state(cdev, &max_state);

	/* lower default 0, upper default max_state */
	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
	upper = upper == THERMAL_NO_LIMIT ?
				max_state : upper;

In summary, introducing this patch, although it fix a problem, will
introduce regressions, in of-thermal based drivers.

I believe, to have this fix, you need to provide a way to have probing
deferring also in cpu_cooling. That needs also the change in the cpufreq
driver, as I mentioned in the other thread.

Cheers,

> ---
>  drivers/thermal/thermal_core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 43b9070..8567929 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	struct thermal_zone_device *pos1;
>  	struct thermal_cooling_device *pos2;
>  	unsigned long max_state;
> -	int result;
> +	int result, ret;
>  
>  	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
>  		return -EINVAL;
> @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	if (tz != pos1 || cdev != pos2)
>  		return -EINVAL;
>  
> -	cdev->ops->get_max_state(cdev, &max_state);
> +	ret = cdev->ops->get_max_state(cdev, &max_state);
> +	if (ret)
> +		return ret;
>  
>  	/* lower default 0, upper default max_state */
>  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> -- 
> 2.0.0.rc2
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
@ 2014-11-21 18:08           ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-21 18:08 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3469 bytes --]


Lukasz,

On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> The return code from ->get_max_state() callback was not checked during
> binding cooling device to thermal zone device.
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> ---
> Changes for v2:
> - It turned out that patches from 1 to 6 from v1 are not needed, since
>   they either already solve the problem (like imx_thermal.c) or not use
>   cpufreq as a thermal cooling device.
> - The patch 7 from v1 is also not needed since this patch on error exits
>   this function without using max_state variable.
> - In thermal driver probe the cpufreq_cooling_register() method presence
>   is crucial to evaluate if the thermal driver needs any actions with 
>   -EPROBE_DEFER.

Have you tried this patch with of-thermal based systems?

The above proposal works if the thermal driver is dealing with loading
cpu_cooling. But for of-thermal based drivers, the idea is to leave to
cpufreq code to load it. 

As an example, I am taking the ti-soc-thermal, but we already have other
of-thermal based drivers. Booting with this patch ti-soc-thermal
(of-based boot) loads fine, but the cpu_cooling never gets bound to the
thermal zone.

The thing is that the bind may happen before cpufreq-dt code loads the
cpufreq driver, and when cpu_cooling is checking what is the max freq,
by using cpufreq table, it won't be able to do it, as there is no table.

While, without the patch, it will use wrong in the binding, but after
it gets bound, and cpufreq loads, the max will be used correctly.

And in this case, the system still works besides this bug. The reasoning
is because the max state comes from DT (2) and lower and upper wont be
equal to THERMAL_NO_LIMIT. Then, the following check will use the
parameter, instead of max_state:

        cdev->ops->get_max_state(cdev, &max_state);

	/* lower default 0, upper default max_state */
	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
	upper = upper == THERMAL_NO_LIMIT ?
				max_state : upper;

In summary, introducing this patch, although it fix a problem, will
introduce regressions, in of-thermal based drivers.

I believe, to have this fix, you need to provide a way to have probing
deferring also in cpu_cooling. That needs also the change in the cpufreq
driver, as I mentioned in the other thread.

Cheers,

> ---
>  drivers/thermal/thermal_core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 43b9070..8567929 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	struct thermal_zone_device *pos1;
>  	struct thermal_cooling_device *pos2;
>  	unsigned long max_state;
> -	int result;
> +	int result, ret;
>  
>  	if (trip >= tz->trips || (trip < 0 && trip != THERMAL_TRIPS_NONE))
>  		return -EINVAL;
> @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
>  	if (tz != pos1 || cdev != pos2)
>  		return -EINVAL;
>  
> -	cdev->ops->get_max_state(cdev, &max_state);
> +	ret = cdev->ops->get_max_state(cdev, &max_state);
> +	if (ret)
> +		return ret;
>  
>  	/* lower default 0, upper default max_state */
>  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> -- 
> 2.0.0.rc2
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-21 18:08           ` Eduardo Valentin
  (?)
@ 2014-11-24 10:38           ` Lukasz Majewski
  2014-11-24 11:00               ` Viresh Kumar
  2014-11-24 18:02               ` Eduardo Valentin
  -1 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-24 10:38 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Viresh Kumar

Hi Eduardo,

> 
> Lukasz,
> 
> On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > The return code from ->get_max_state() callback was not checked
> > during binding cooling device to thermal zone device.
> > 
> > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > ---
> > Changes for v2:
> > - It turned out that patches from 1 to 6 from v1 are not needed,
> > since they either already solve the problem (like imx_thermal.c) or
> > not use cpufreq as a thermal cooling device.
> > - The patch 7 from v1 is also not needed since this patch on error
> > exits this function without using max_state variable.
> > - In thermal driver probe the cpufreq_cooling_register() method
> > presence is crucial to evaluate if the thermal driver needs any
> > actions with -EPROBE_DEFER.
> 
> Have you tried this patch with of-thermal based systems?

Yes. I did try it with Exynos (after the rework). And there weren't
any regressions.

To be precise - do you refer to of_cpufreq_cooling_register() [1] or
cpufreq_cooling_register() [2]?

For the latter [2] - drivers like imx_thermal.c are fully prepared for
-EDEFER_PROBE.

For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
the rework).

> 
> The above proposal works if the thermal driver is dealing with loading
> cpu_cooling. But for of-thermal based drivers, the idea is to leave to
> cpufreq code to load it. 

I assume, that you mean case [1]?

> 
> As an example, I am taking the ti-soc-thermal, but we already have
> other of-thermal based drivers. Booting with this patch ti-soc-thermal
> (of-based boot) loads fine, but the cpu_cooling never gets bound to
> the thermal zone.

Could you share the exact SoC/board/_defconfig setup to reproduce this
behavior? I possess Beagle Bone Black, but it doesn't have thermal
support (perhaps because its lack of accuracy).

With my Exynos setup I didn't experience any problems with this patch.

> 
> The thing is that the bind may happen before cpufreq-dt code loads the
> cpufreq driver, and when cpu_cooling is checking what is the max freq,
> by using cpufreq table, it won't be able to do it, as there is no
> table.

As I look into the cpufreq-dt.c driver - in the cpufreq_init()
function, the call to of_cpufreq_cooling_register() is performed just
before cpufreq_table_validate_and_show().
It looks like a mistake in the cpufreq-dt.c code.

> 
> While, without the patch, it will use wrong in the binding, but after
> it gets bound, and cpufreq loads, the max will be used correctly.

Correct. Such _wrong_ behavior was the original motivation to prepare
this patch.

> 
> And in this case, the system still works besides this bug. 

Unfortunately there is also a "window" in which the driver is not
properly configured and can cause system crash, although it is unlikely.


> The
> reasoning is because the max state comes from DT (2) and lower and
> upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> will use the parameter, instead of max_state:
> 
>         cdev->ops->get_max_state(cdev, &max_state);
> 
> 	/* lower default 0, upper default max_state */
> 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> 	upper = upper == THERMAL_NO_LIMIT ?
> 				max_state : upper;
> 
> In summary, introducing this patch, although it fix a problem, will
> introduce regressions, in of-thermal based drivers.

To be more precise - it will affect systems, which use of-thermal.c and
cpufreq-dt.c in the same time, due to wrong ordering in the latter file.

Could you give me a hint about the exact affected system? I've grep'ed
for CPUFREQ_DT in the ./arch/arm/configs with no success.

> 
> I believe, to have this fix, you need to provide a way to have probing
> deferring also in cpu_cooling. That needs also the change in the
> cpufreq driver, as I mentioned in the other thread.

I will think about possible solution and refer to previous discussion. 

> 
> Cheers,
> 
> > ---
> >  drivers/thermal/thermal_core.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/thermal/thermal_core.c
> > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> >  	struct thermal_cooling_device *pos2;
> >  	unsigned long max_state;
> > -	int result;
> > +	int result, ret;
> >  
> >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > THERMAL_TRIPS_NONE)) return -EINVAL;
> > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> >  		return -EINVAL;
> >  
> > -	cdev->ops->get_max_state(cdev, &max_state);
> > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > +	if (ret)
> > +		return ret;
> >  
> >  	/* lower default 0, upper default max_state */
> >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > -- 
> > 2.0.0.rc2
> > 


-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-24 10:38           ` Lukasz Majewski
@ 2014-11-24 11:00               ` Viresh Kumar
  2014-11-24 18:02               ` Eduardo Valentin
  1 sibling, 0 replies; 62+ messages in thread
From: Viresh Kumar @ 2014-11-24 11:00 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List

On 24 November 2014 at 16:08, Lukasz Majewski <l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org> wrote:
> As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> function, the call to of_cpufreq_cooling_register() is performed just
> before cpufreq_table_validate_and_show().
> It looks like a mistake in the cpufreq-dt.c code.

Yes. Just fixed it up and sent a patch. Please provide your
tested-by's..

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
@ 2014-11-24 11:00               ` Viresh Kumar
  0 siblings, 0 replies; 62+ messages in thread
From: Viresh Kumar @ 2014-11-24 11:00 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	Linux Kernel Mailing List

On 24 November 2014 at 16:08, Lukasz Majewski <l.majewski@samsung.com> wrote:
> As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> function, the call to of_cpufreq_cooling_register() is performed just
> before cpufreq_table_validate_and_show().
> It looks like a mistake in the cpufreq-dt.c code.

Yes. Just fixed it up and sent a patch. Please provide your
tested-by's..

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-24 11:00               ` Viresh Kumar
  (?)
@ 2014-11-24 14:24               ` Lukasz Majewski
  2014-11-25 10:46                 ` Viresh Kumar
  -1 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-24 14:24 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	Linux Kernel Mailing List

Hi Viresh,

> On 24 November 2014 at 16:08, Lukasz Majewski
> <l.majewski@samsung.com> wrote:
> > As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> > function, the call to of_cpufreq_cooling_register() is performed
> > just before cpufreq_table_validate_and_show().
> > It looks like a mistake in the cpufreq-dt.c code.
> 
> Yes. Just fixed it up and sent a patch. Please provide your
> tested-by's..

Thanks for your prompt response. I don't have board which uses both
of-thermal.c and cpufreq-dt.c (exynos uses old approach for cpufreq).

I think that Eduardo may have some boards for testing.

Regarding the patch:
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>

-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-24 10:38           ` Lukasz Majewski
@ 2014-11-24 18:02               ` Eduardo Valentin
  2014-11-24 18:02               ` Eduardo Valentin
  1 sibling, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-24 18:02 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Viresh Kumar

[-- Attachment #1: Type: text/plain, Size: 6282 bytes --]


Hello Lukasz,

On Mon, Nov 24, 2014 at 11:38:54AM +0100, Lukasz Majewski wrote:
> Hi Eduardo,
> 
> > 
> > Lukasz,
> > 
> > On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > > The return code from ->get_max_state() callback was not checked
> > > during binding cooling device to thermal zone device.
> > > 
> > > Signed-off-by: Lukasz Majewski <l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
> > > ---
> > > Changes for v2:
> > > - It turned out that patches from 1 to 6 from v1 are not needed,
> > > since they either already solve the problem (like imx_thermal.c) or
> > > not use cpufreq as a thermal cooling device.
> > > - The patch 7 from v1 is also not needed since this patch on error
> > > exits this function without using max_state variable.
> > > - In thermal driver probe the cpufreq_cooling_register() method
> > > presence is crucial to evaluate if the thermal driver needs any
> > > actions with -EPROBE_DEFER.
> > 
> > Have you tried this patch with of-thermal based systems?
> 
> Yes. I did try it with Exynos (after the rework). And there weren't
> any regressions.
> 
> To be precise - do you refer to of_cpufreq_cooling_register() [1] or
> cpufreq_cooling_register() [2]?
> 

[1]

> For the latter [2] - drivers like imx_thermal.c are fully prepared for
> -EDEFER_PROBE.
> 
> For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
> the rework).
> 
> > 
> > The above proposal works if the thermal driver is dealing with loading
> > cpu_cooling. But for of-thermal based drivers, the idea is to leave to
> > cpufreq code to load it. 
> 
> I assume, that you mean case [1]?
> 

yup

> > 
> > As an example, I am taking the ti-soc-thermal, but we already have
> > other of-thermal based drivers. Booting with this patch ti-soc-thermal
> > (of-based boot) loads fine, but the cpu_cooling never gets bound to
> > the thermal zone.
> 
> Could you share the exact SoC/board/_defconfig setup to reproduce this
> behavior? I possess Beagle Bone Black, but it doesn't have thermal
> support (perhaps because its lack of accuracy).
>

Well, it may happen any system a driver with of-thermal + cpufreq-dt.

One board that is easily available is OMAP4460 panda board (tried
myself, the problem is there).

> With my Exynos setup I didn't experience any problems with this patch.
> 
> > 
> > The thing is that the bind may happen before cpufreq-dt code loads the
> > cpufreq driver, and when cpu_cooling is checking what is the max freq,
> > by using cpufreq table, it won't be able to do it, as there is no
> > table.
> 
> As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> function, the call to of_cpufreq_cooling_register() is performed just
> before cpufreq_table_validate_and_show().
> It looks like a mistake in the cpufreq-dt.c code.
> 

Well, I believe for our case, better would be if the cpu_cooling could
be done after cpufreq driver registration call.


> > 
> > While, without the patch, it will use wrong in the binding, but after
> > it gets bound, and cpufreq loads, the max will be used correctly.
> 
> Correct. Such _wrong_ behavior was the original motivation to prepare
> this patch.
> 
> > 
> > And in this case, the system still works besides this bug. 
> 
> Unfortunately there is also a "window" in which the driver is not
> properly configured and can cause system crash, although it is unlikely.
> 

Agreed.

> 
> > The
> > reasoning is because the max state comes from DT (2) and lower and
> > upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> > will use the parameter, instead of max_state:
> > 
> >         cdev->ops->get_max_state(cdev, &max_state);
> > 
> > 	/* lower default 0, upper default max_state */
> > 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > 	upper = upper == THERMAL_NO_LIMIT ?
> > 				max_state : upper;
> > 
> > In summary, introducing this patch, although it fix a problem, will
> > introduce regressions, in of-thermal based drivers.
> 
> To be more precise - it will affect systems, which use of-thermal.c and
> cpufreq-dt.c in the same time, due to wrong ordering in the latter file.
> 

Exactly.

> Could you give me a hint about the exact affected system? I've grep'ed
> for CPUFREQ_DT in the ./arch/arm/configs with no success.
> 

Yeah, the grepping is correct. But well, just because it is not in
defconfigs does not mean it won't be used. 

> > 
> > I believe, to have this fix, you need to provide a way to have probing
> > deferring also in cpu_cooling. That needs also the change in the
> > cpufreq driver, as I mentioned in the other thread.
> 
> I will think about possible solution and refer to previous discussion. 
> 

Good. For your patch, it is still sane to have it. But needs to be taken
after fixing the ordering between cpufreq-dt and cpu_cooling.


> > 
> > Cheers,
> > 
> > > ---
> > >  drivers/thermal/thermal_core.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/thermal/thermal_core.c
> > > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > >  	struct thermal_cooling_device *pos2;
> > >  	unsigned long max_state;
> > > -	int result;
> > > +	int result, ret;
> > >  
> > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > >  		return -EINVAL;
> > >  
> > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	/* lower default 0, upper default max_state */
> > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > -- 
> > > 2.0.0.rc2
> > > 
> 
> 
> -- 
> Best regards,
> 
> Lukasz Majewski
> 
> Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
@ 2014-11-24 18:02               ` Eduardo Valentin
  0 siblings, 0 replies; 62+ messages in thread
From: Eduardo Valentin @ 2014-11-24 18:02 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Zhang Rui, Ezequiel Garcia, Kuninori Morimoto, Linux PM list,
	Vincenzo Frascino, Bartlomiej Zolnierkiewicz, Lukasz Majewski,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Viresh Kumar

[-- Attachment #1: Type: text/plain, Size: 6254 bytes --]


Hello Lukasz,

On Mon, Nov 24, 2014 at 11:38:54AM +0100, Lukasz Majewski wrote:
> Hi Eduardo,
> 
> > 
> > Lukasz,
> > 
> > On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > > The return code from ->get_max_state() callback was not checked
> > > during binding cooling device to thermal zone device.
> > > 
> > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > > ---
> > > Changes for v2:
> > > - It turned out that patches from 1 to 6 from v1 are not needed,
> > > since they either already solve the problem (like imx_thermal.c) or
> > > not use cpufreq as a thermal cooling device.
> > > - The patch 7 from v1 is also not needed since this patch on error
> > > exits this function without using max_state variable.
> > > - In thermal driver probe the cpufreq_cooling_register() method
> > > presence is crucial to evaluate if the thermal driver needs any
> > > actions with -EPROBE_DEFER.
> > 
> > Have you tried this patch with of-thermal based systems?
> 
> Yes. I did try it with Exynos (after the rework). And there weren't
> any regressions.
> 
> To be precise - do you refer to of_cpufreq_cooling_register() [1] or
> cpufreq_cooling_register() [2]?
> 

[1]

> For the latter [2] - drivers like imx_thermal.c are fully prepared for
> -EDEFER_PROBE.
> 
> For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
> the rework).
> 
> > 
> > The above proposal works if the thermal driver is dealing with loading
> > cpu_cooling. But for of-thermal based drivers, the idea is to leave to
> > cpufreq code to load it. 
> 
> I assume, that you mean case [1]?
> 

yup

> > 
> > As an example, I am taking the ti-soc-thermal, but we already have
> > other of-thermal based drivers. Booting with this patch ti-soc-thermal
> > (of-based boot) loads fine, but the cpu_cooling never gets bound to
> > the thermal zone.
> 
> Could you share the exact SoC/board/_defconfig setup to reproduce this
> behavior? I possess Beagle Bone Black, but it doesn't have thermal
> support (perhaps because its lack of accuracy).
>

Well, it may happen any system a driver with of-thermal + cpufreq-dt.

One board that is easily available is OMAP4460 panda board (tried
myself, the problem is there).

> With my Exynos setup I didn't experience any problems with this patch.
> 
> > 
> > The thing is that the bind may happen before cpufreq-dt code loads the
> > cpufreq driver, and when cpu_cooling is checking what is the max freq,
> > by using cpufreq table, it won't be able to do it, as there is no
> > table.
> 
> As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> function, the call to of_cpufreq_cooling_register() is performed just
> before cpufreq_table_validate_and_show().
> It looks like a mistake in the cpufreq-dt.c code.
> 

Well, I believe for our case, better would be if the cpu_cooling could
be done after cpufreq driver registration call.


> > 
> > While, without the patch, it will use wrong in the binding, but after
> > it gets bound, and cpufreq loads, the max will be used correctly.
> 
> Correct. Such _wrong_ behavior was the original motivation to prepare
> this patch.
> 
> > 
> > And in this case, the system still works besides this bug. 
> 
> Unfortunately there is also a "window" in which the driver is not
> properly configured and can cause system crash, although it is unlikely.
> 

Agreed.

> 
> > The
> > reasoning is because the max state comes from DT (2) and lower and
> > upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> > will use the parameter, instead of max_state:
> > 
> >         cdev->ops->get_max_state(cdev, &max_state);
> > 
> > 	/* lower default 0, upper default max_state */
> > 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > 	upper = upper == THERMAL_NO_LIMIT ?
> > 				max_state : upper;
> > 
> > In summary, introducing this patch, although it fix a problem, will
> > introduce regressions, in of-thermal based drivers.
> 
> To be more precise - it will affect systems, which use of-thermal.c and
> cpufreq-dt.c in the same time, due to wrong ordering in the latter file.
> 

Exactly.

> Could you give me a hint about the exact affected system? I've grep'ed
> for CPUFREQ_DT in the ./arch/arm/configs with no success.
> 

Yeah, the grepping is correct. But well, just because it is not in
defconfigs does not mean it won't be used. 

> > 
> > I believe, to have this fix, you need to provide a way to have probing
> > deferring also in cpu_cooling. That needs also the change in the
> > cpufreq driver, as I mentioned in the other thread.
> 
> I will think about possible solution and refer to previous discussion. 
> 

Good. For your patch, it is still sane to have it. But needs to be taken
after fixing the ordering between cpufreq-dt and cpu_cooling.


> > 
> > Cheers,
> > 
> > > ---
> > >  drivers/thermal/thermal_core.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/thermal/thermal_core.c
> > > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > >  	struct thermal_cooling_device *pos2;
> > >  	unsigned long max_state;
> > > -	int result;
> > > +	int result, ret;
> > >  
> > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > >  		return -EINVAL;
> > >  
> > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	/* lower default 0, upper default max_state */
> > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > -- 
> > > 2.0.0.rc2
> > > 
> 
> 
> -- 
> Best regards,
> 
> Lukasz Majewski
> 
> Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-24 18:02               ` Eduardo Valentin
  (?)
@ 2014-11-24 20:28               ` Lukasz Majewski
  -1 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2014-11-24 20:28 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Lukasz Majewski, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Nobuhiro Iwamatsu, Mikko Perttunen, Stephen Warren,
	Thierry Reding, Alexandre Courbot, linux-tegra, linux-kernel,
	Viresh Kumar

[-- Attachment #1: Type: text/plain, Size: 6901 bytes --]

On Mon, 24 Nov 2014 14:02:25 -0400
Eduardo Valentin <edubezval@gmail.com> wrote:

> 
> Hello Lukasz,
> 
> On Mon, Nov 24, 2014 at 11:38:54AM +0100, Lukasz Majewski wrote:
> > Hi Eduardo,
> > 
> > > 
> > > Lukasz,
> > > 
> > > On Tue, Nov 18, 2014 at 11:16:30AM +0100, Lukasz Majewski wrote:
> > > > The return code from ->get_max_state() callback was not checked
> > > > during binding cooling device to thermal zone device.
> > > > 
> > > > Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> > > > ---
> > > > Changes for v2:
> > > > - It turned out that patches from 1 to 6 from v1 are not needed,
> > > > since they either already solve the problem (like
> > > > imx_thermal.c) or not use cpufreq as a thermal cooling device.
> > > > - The patch 7 from v1 is also not needed since this patch on
> > > > error exits this function without using max_state variable.
> > > > - In thermal driver probe the cpufreq_cooling_register() method
> > > > presence is crucial to evaluate if the thermal driver needs any
> > > > actions with -EPROBE_DEFER.
> > > 
> > > Have you tried this patch with of-thermal based systems?
> > 
> > Yes. I did try it with Exynos (after the rework). And there weren't
> > any regressions.
> > 
> > To be precise - do you refer to of_cpufreq_cooling_register() [1] or
> > cpufreq_cooling_register() [2]?
> > 
> 
> [1]
> 
> > For the latter [2] - drivers like imx_thermal.c are fully prepared
> > for -EDEFER_PROBE.
> > 
> > For the former [1] - only cpufreq-dt.c uses it (and Exynos SoC after
> > the rework).
> > 
> > > 
> > > The above proposal works if the thermal driver is dealing with
> > > loading cpu_cooling. But for of-thermal based drivers, the idea
> > > is to leave to cpufreq code to load it. 
> > 
> > I assume, that you mean case [1]?
> > 
> 
> yup
> 
> > > 
> > > As an example, I am taking the ti-soc-thermal, but we already have
> > > other of-thermal based drivers. Booting with this patch
> > > ti-soc-thermal (of-based boot) loads fine, but the cpu_cooling
> > > never gets bound to the thermal zone.
> > 
> > Could you share the exact SoC/board/_defconfig setup to reproduce
> > this behavior? I possess Beagle Bone Black, but it doesn't have
> > thermal support (perhaps because its lack of accuracy).
> >
> 
> Well, it may happen any system a driver with of-thermal + cpufreq-dt.
> 
> One board that is easily available is OMAP4460 panda board (tried
> myself, the problem is there).
> 
> > With my Exynos setup I didn't experience any problems with this
> > patch.
> > 
> > > 
> > > The thing is that the bind may happen before cpufreq-dt code
> > > loads the cpufreq driver, and when cpu_cooling is checking what
> > > is the max freq, by using cpufreq table, it won't be able to do
> > > it, as there is no table.
> > 
> > As I look into the cpufreq-dt.c driver - in the cpufreq_init()
> > function, the call to of_cpufreq_cooling_register() is performed
> > just before cpufreq_table_validate_and_show().
> > It looks like a mistake in the cpufreq-dt.c code.
> > 
> 
> Well, I believe for our case, better would be if the cpu_cooling could
> be done after cpufreq driver registration call.
> 
> 
> > > 
> > > While, without the patch, it will use wrong in the binding, but
> > > after it gets bound, and cpufreq loads, the max will be used
> > > correctly.
> > 
> > Correct. Such _wrong_ behavior was the original motivation to
> > prepare this patch.
> > 
> > > 
> > > And in this case, the system still works besides this bug. 
> > 
> > Unfortunately there is also a "window" in which the driver is not
> > properly configured and can cause system crash, although it is
> > unlikely.
> > 
> 
> Agreed.
> 
> > 
> > > The
> > > reasoning is because the max state comes from DT (2) and lower and
> > > upper wont be equal to THERMAL_NO_LIMIT. Then, the following check
> > > will use the parameter, instead of max_state:
> > > 
> > >         cdev->ops->get_max_state(cdev, &max_state);
> > > 
> > > 	/* lower default 0, upper default max_state */
> > > 	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > 	upper = upper == THERMAL_NO_LIMIT ?
> > > 				max_state : upper;
> > > 
> > > In summary, introducing this patch, although it fix a problem,
> > > will introduce regressions, in of-thermal based drivers.
> > 
> > To be more precise - it will affect systems, which use of-thermal.c
> > and cpufreq-dt.c in the same time, due to wrong ordering in the
> > latter file.
> > 
> 
> Exactly.
> 
> > Could you give me a hint about the exact affected system? I've
> > grep'ed for CPUFREQ_DT in the ./arch/arm/configs with no success.
> > 
> 
> Yeah, the grepping is correct. But well, just because it is not in
> defconfigs does not mean it won't be used. 
> 
> > > 
> > > I believe, to have this fix, you need to provide a way to have
> > > probing deferring also in cpu_cooling. That needs also the change
> > > in the cpufreq driver, as I mentioned in the other thread.
> > 
> > I will think about possible solution and refer to previous
> > discussion. 
> > 
> 
> Good. For your patch, it is still sane to have it. But needs to be
> taken after fixing the ordering between cpufreq-dt and cpu_cooling.

Such fix has been already prepared by Viresh. Could you test if it
works on your Panda Board (unfortunately I don't have one)?
> 
> 
> > > 
> > > Cheers,
> > > 
> > > > ---
> > > >  drivers/thermal/thermal_core.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c
> > > > b/drivers/thermal/thermal_core.c index 43b9070..8567929 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -928,7 +928,7 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, struct thermal_zone_device *pos1;
> > > >  	struct thermal_cooling_device *pos2;
> > > >  	unsigned long max_state;
> > > > -	int result;
> > > > +	int result, ret;
> > > >  
> > > >  	if (trip >= tz->trips || (trip < 0 && trip !=
> > > > THERMAL_TRIPS_NONE)) return -EINVAL;
> > > > @@ -945,7 +945,9 @@ int thermal_zone_bind_cooling_device(struct
> > > > thermal_zone_device *tz, if (tz != pos1 || cdev != pos2)
> > > >  		return -EINVAL;
> > > >  
> > > > -	cdev->ops->get_max_state(cdev, &max_state);
> > > > +	ret = cdev->ops->get_max_state(cdev, &max_state);
> > > > +	if (ret)
> > > > +		return ret;
> > > >  
> > > >  	/* lower default 0, upper default max_state */
> > > >  	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
> > > > -- 
> > > > 2.0.0.rc2
> > > > 
> > 
> > 
> > -- 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > Samsung R&D Institute Poland (SRPOL) | Linux Platform Group

Best regards,
Lukasz Majewski

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback
  2014-11-24 14:24               ` Lukasz Majewski
@ 2014-11-25 10:46                 ` Viresh Kumar
  0 siblings, 0 replies; 62+ messages in thread
From: Viresh Kumar @ 2014-11-25 10:46 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Eduardo Valentin, Zhang Rui, Ezequiel Garcia, Kuninori Morimoto,
	Linux PM list, Vincenzo Frascino, Bartlomiej Zolnierkiewicz,
	Lukasz Majewski, Nobuhiro Iwamatsu, Mikko Perttunen,
	Stephen Warren, Thierry Reding, Alexandre Courbot, linux-tegra,
	Linux Kernel Mailing List

On 24 November 2014 at 19:54, Lukasz Majewski <l.majewski@samsung.com> wrote:
> Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>

Would be better if you send it as reply to that one.

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2014-11-25 10:46 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-24  8:27 [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
2014-09-24  8:27 ` [PATCH 1/3] thermal: step_wise: fix: Prevent from binary overflow when trend is dropping Lukasz Majewski
2014-10-02 21:13   ` Eduardo Valentin
2014-10-03  7:26     ` Lukasz Majewski
2014-10-09  3:14   ` Zhang Rui
2014-09-24  8:27 ` [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-10-02 22:26   ` Eduardo Valentin
2014-10-03  8:26     ` Lukasz Majewski
2014-10-06 18:01       ` Eduardo Valentin
2014-09-24  8:27 ` [PATCH 3/3] thermal: core: fix: Check return code of the ->get_max_state() callback Lukasz Majewski
2014-10-02 22:24   ` Eduardo Valentin
2014-10-03 10:40     ` Lukasz Majewski
2014-10-09  3:15       ` Zhang Rui
2014-10-09 16:47         ` Lukasz Majewski
2014-10-02 14:40 ` [PATCH 0/3] thermal: fix: Various fixes for thermal subsystem Lukasz Majewski
     [not found] ` <1411547232-21493-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02   ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Lukasz Majewski
2014-11-13 17:02     ` Lukasz Majewski
2014-11-13 17:02     ` [PATCH 1/8] thermal:cpu cooling:armada: Provide deferred probing for armada driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 2/8] thermal:cpu cooling:kirkwood: Provide deferred probing for kirkwood driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 3/8] thermal:cpu cooling:rcar: Provide deferred probing for rcar driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 4/8] thermal:cpu cooling:spear: Provide deferred probing for spear driver Lukasz Majewski
2014-11-13 17:02     ` [PATCH 5/8] thermal:cpu cooling:tegra: Provide deferred probing for tegra driver Lukasz Majewski
2014-11-14 10:47       ` Mikko Perttunen
2014-11-14 11:24         ` Lukasz Majewski
2014-11-17 11:57           ` Thierry Reding
2014-11-17 12:51             ` Lukasz Majewski
2014-11-17 13:18               ` Thierry Reding
     [not found]         ` <5465DDC5.6090301-/1wQRMveznE@public.gmane.org>
2014-11-17 11:43           ` Thierry Reding
2014-11-17 11:43             ` Thierry Reding
2014-11-17 11:50             ` Lukasz Majewski
2014-11-17 12:01               ` Thierry Reding
2014-11-17 12:01                 ` Thierry Reding
2014-11-17 13:02                 ` Lukasz Majewski
2014-11-17 13:02                   ` Lukasz Majewski
2014-11-17 12:51             ` Mikko Perttunen
2014-11-17 12:51               ` Mikko Perttunen
2014-11-17 13:08               ` Thierry Reding
2014-11-17 13:24                 ` Mikko Perttunen
     [not found]       ` <1415898165-27406-6-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-17 11:40         ` Thierry Reding
2014-11-17 11:40           ` Thierry Reding
2014-11-13 17:02     ` [PATCH 6/8] thermal:cpu cooling:ti: Provide deferred probing for ti drivers Lukasz Majewski
2014-11-20 19:00       ` Eduardo Valentin
2014-11-13 17:02     ` [PATCH 7/8] thermal:core:fix: Initialize the max_state variable to 0 Lukasz Majewski
2014-11-18 10:16     ` [PATCH v2] thermal:core:fix: Check return code of the ->get_max_state() callback Lukasz Majewski
     [not found]       ` <1416305790-27746-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-21 18:08         ` Eduardo Valentin
2014-11-21 18:08           ` Eduardo Valentin
2014-11-24 10:38           ` Lukasz Majewski
2014-11-24 11:00             ` Viresh Kumar
2014-11-24 11:00               ` Viresh Kumar
2014-11-24 14:24               ` Lukasz Majewski
2014-11-25 10:46                 ` Viresh Kumar
2014-11-24 18:02             ` Eduardo Valentin
2014-11-24 18:02               ` Eduardo Valentin
2014-11-24 20:28               ` Lukasz Majewski
     [not found]     ` <1415898165-27406-1-git-send-email-l.majewski-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2014-11-13 17:02       ` [PATCH 8/8] " Lukasz Majewski
2014-11-13 17:02         ` Lukasz Majewski
2014-11-20 18:54       ` [PATCH 0/8] thermal:cpu cooling:fix: Provide thermal core fixes with deferred probe for several drivers Eduardo Valentin
2014-11-20 18:54         ` Eduardo Valentin
2014-11-21  8:33         ` Lukasz Majewski
2014-11-21  2:47           ` Eduardo Valentin
2014-11-21 16:28             ` Lukasz Majewski
2014-11-21 16:28               ` Lukasz Majewski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.