linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] Fixes for Tegra soctherm
@ 2018-11-13 10:06 Wei Ni
  2018-11-13 10:06 ` [PATCH v2 1/3] thermal: tegra: continue if sensor register fails Wei Ni
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Wei Ni @ 2018-11-13 10:06 UTC (permalink / raw)
  To: thierry.reding, daniel.lezcano, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel, Wei Ni

This series fixes some issues in the Tegra soctherm driver.

Main changes from v1:
1. Added Thierry Reding's Acked-by <treding@nvidia.com> to the patch
"thermal: tegra: fix memory allocation".
2. Print the sensor name when registration fails.
3. Removed the patch "thermal: tegra: fix coverity defect".

Wei Ni (3):
  thermal: tegra: continue if sensor register fails
  thermal: tegra: remove unnecessary warnings
  thermal: tegra: fix memory allocation

 drivers/thermal/tegra/soctherm.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-13 10:06 [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
@ 2018-11-13 10:06 ` Wei Ni
  2018-11-21  8:55   ` Daniel Lezcano
  2018-11-13 10:06 ` [PATCH v2 2/3] thermal: tegra: remove unnecessary warnings Wei Ni
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Wei Ni @ 2018-11-13 10:06 UTC (permalink / raw)
  To: thierry.reding, daniel.lezcano, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel, Wei Ni

Don't bail out of probe when one sensor fails to register with
its thermal zone. Log a warning that names the sensor and keep
registering the remaining sensors, and skip zones that were
never registered when restoring the hardware trips on resume.

Signed-off-by: Wei Ni <wni@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index ed28110a3535..a824d2e63af3 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -1370,9 +1370,9 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 							 &tegra_of_thermal_ops);
 		if (IS_ERR(z)) {
 			err = PTR_ERR(z);
-			dev_err(&pdev->dev, "failed to register sensor: %d\n",
-				err);
-			goto disable_clocks;
+			dev_warn(&pdev->dev, "failed to register sensor %s: %d\n",
+				 soc->ttgs[i]->name, err);
+			continue;
 		}
 
 		zone->tz = z;
@@ -1434,6 +1434,8 @@ static int __maybe_unused soctherm_resume(struct device *dev)
 		struct thermal_zone_device *tz;
 
 		tz = tegra->thermctl_tzs[soc->ttgs[i]->id];
+		if (!tz)
+			continue;
 		err = tegra_soctherm_set_hwtrips(dev, soc->ttgs[i], tz);
 		if (err) {
 			dev_err(&pdev->dev,
-- 
2.7.4
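
The resume-side half of this patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: `struct zone`, `NUM_ZONES` and `resume_zones()` are hypothetical stand-ins, and setting a flag takes the place of tegra_soctherm_set_hwtrips(). It only shows why a table that may now contain NULL slots has to be checked before each entry is used.

```c
#include <stddef.h>

#define NUM_ZONES 4	/* hypothetical: cpu, gpu, mem, pll */

/* Hypothetical stand-in for struct thermal_zone_device. */
struct zone {
	int trips_programmed;
};

/*
 * Walk the zone table and "reprogram" every registered zone.
 * Slots left NULL by a failed registration are skipped.
 * Returns the number of zones that were handled.
 */
static int resume_zones(struct zone *table[NUM_ZONES])
{
	int i, handled = 0;

	for (i = 0; i < NUM_ZONES; i++) {
		struct zone *tz = table[i];

		if (!tz)
			continue;	/* sensor was never registered */

		tz->trips_programmed = 1;
		handled++;
	}

	return handled;
}
```

With a table holding two registered zones and two NULL slots, the function touches only the two registered zones and reports 2.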


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] thermal: tegra: remove unnecessary warnings
  2018-11-13 10:06 [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
  2018-11-13 10:06 ` [PATCH v2 1/3] thermal: tegra: continue if sensor register fails Wei Ni
@ 2018-11-13 10:06 ` Wei Ni
  2018-11-13 10:06 ` [PATCH v2 3/3] thermal: tegra: fix memory allocation Wei Ni
  2018-11-20  7:06 ` [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
  3 siblings, 0 replies; 13+ messages in thread
From: Wei Ni @ 2018-11-13 10:06 UTC (permalink / raw)
  To: thierry.reding, daniel.lezcano, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel, Wei Ni

Convert warnings to info as not all platforms may
have all the thresholds and sensors enabled.

Signed-off-by: Wei Ni <wni@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index a824d2e63af3..161ef242bcca 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -550,7 +550,7 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 
 	ret = tz->ops->get_crit_temp(tz, &temperature);
 	if (ret) {
-		dev_warn(dev, "thermtrip: %s: missing critical temperature\n",
+		dev_info(dev, "thermtrip: %s: missing critical temperature\n",
 			 sg->name);
 		goto set_throttle;
 	}
@@ -569,7 +569,7 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 set_throttle:
 	ret = get_hot_temp(tz, &trip, &temperature);
 	if (ret) {
-		dev_warn(dev, "throttrip: %s: missing hot temperature\n",
+		dev_info(dev, "throttrip: %s: missing hot temperature\n",
 			 sg->name);
 		return 0;
 	}
@@ -600,7 +600,7 @@ static int tegra_soctherm_set_hwtrips(struct device *dev,
 	}
 
 	if (i == THROTTLE_SIZE)
-		dev_warn(dev, "throttrip: %s: missing throttle cdev\n",
+		dev_info(dev, "throttrip: %s: missing throttle cdev\n",
 			 sg->name);
 
 	return 0;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] thermal: tegra: fix memory allocation
  2018-11-13 10:06 [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
  2018-11-13 10:06 ` [PATCH v2 1/3] thermal: tegra: continue if sensor register fails Wei Ni
  2018-11-13 10:06 ` [PATCH v2 2/3] thermal: tegra: remove unnecessary warnings Wei Ni
@ 2018-11-13 10:06 ` Wei Ni
  2018-11-20  7:06 ` [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
  3 siblings, 0 replies; 13+ messages in thread
From: Wei Ni @ 2018-11-13 10:06 UTC (permalink / raw)
  To: thierry.reding, daniel.lezcano, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel, Wei Ni

thermctl_tzs stores pointers to thermal_zone_device, so size
each array element as a pointer, sizeof(z), rather than as the
structure itself, sizeof(*z).

Signed-off-by: Wei Ni <wni@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
---
 drivers/thermal/tegra/soctherm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 161ef242bcca..96ca10789f17 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -1339,7 +1339,7 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
 	}
 
 	tegra->thermctl_tzs = devm_kcalloc(&pdev->dev,
-					   soc->num_ttgs, sizeof(*z),
+					   soc->num_ttgs, sizeof(z),
 					   GFP_KERNEL);
 	if (!tegra->thermctl_tzs)
 		return -ENOMEM;
-- 
2.7.4
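
The sizing rule behind this one-line change can be shown with a small userspace sketch. It assumes nothing about the real structure: `struct zone_device` and `alloc_zone_table()` are hypothetical names, and calloc() stands in for devm_kcalloc(). The point is only that a table of pointers needs one pointer per element, whatever the size of the structure pointed to.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct thermal_zone_device. */
struct zone_device {
	char type[20];
	int temperature;
	int polling_delay;
};

/*
 * Allocate a zeroed table of `count` pointers to zone devices.
 * Each element is a pointer, so the element size is sizeof(*table),
 * which equals sizeof(struct zone_device *), not the structure size.
 */
static struct zone_device **alloc_zone_table(size_t count)
{
	struct zone_device **table;

	table = calloc(count, sizeof(*table));
	return table;	/* NULL on allocation failure */
}
```

Writing the element size as sizeof(*table) keeps the allocation tied to the type of the table itself, so it stays correct if the element type changes later.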


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/3] Fixes for Tegra soctherm
  2018-11-13 10:06 [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
                   ` (2 preceding siblings ...)
  2018-11-13 10:06 ` [PATCH v2 3/3] thermal: tegra: fix memory allocation Wei Ni
@ 2018-11-20  7:06 ` Wei Ni
  3 siblings, 0 replies; 13+ messages in thread
From: Wei Ni @ 2018-11-20  7:06 UTC (permalink / raw)
  To: thierry.reding, daniel.lezcano, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel

Hi all,
Do you have any comments on this series?

Thanks.
Wei.

On 13/11/2018 6:06 PM, Wei Ni wrote:
> This series fixed some issues for Tegra soctherm
> 
> Main changes from v1:
> 1. Acked by Thierry Reding <treding@nvidia.com> for the patch
> "thermal: tegra: fix memory allocation".
> 2. Print out the sensor name when register failed.
> 2. Remove patch "thermal: tegra: fix coverity defect"
> 
> Wei Ni (3):
>   thermal: tegra: continue if sensor register fails
>   thermal: tegra: remove unnecessary warnings
>   thermal: tegra: fix memory allocation
> 
>  drivers/thermal/tegra/soctherm.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-13 10:06 ` [PATCH v2 1/3] thermal: tegra: continue if sensor register fails Wei Ni
@ 2018-11-21  8:55   ` Daniel Lezcano
  2018-11-21 10:23     ` Wei Ni
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Lezcano @ 2018-11-21  8:55 UTC (permalink / raw)
  To: Wei Ni, thierry.reding, linux-tegra; +Cc: rui.zhang, edubezval, linux-kernel

On 13/11/2018 11:06, Wei Ni wrote:
> Don't bail when a sensor fails to register with the
> thermal zone and allow other sensors to register.
> This allows other sensors to register with thermal
> framework even if one sensor fails registration.

I'm not sure ignoring the error is really safe. Can you describe the
real situation you want to overcome? How do you differentiate critical
sensors?



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-21  8:55   ` Daniel Lezcano
@ 2018-11-21 10:23     ` Wei Ni
  2018-11-21 12:51       ` Daniel Lezcano
  0 siblings, 1 reply; 13+ messages in thread
From: Wei Ni @ 2018-11-21 10:23 UTC (permalink / raw)
  To: Daniel Lezcano, thierry.reding, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel



On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
> On 13/11/2018 11:06, Wei Ni wrote:
>> Don't bail when a sensor fails to register with the
>> thermal zone and allow other sensors to register.
>> This allows other sensors to register with thermal
>> framework even if one sensor fails registration.
> 
> I'm not sure if ignoring the error is really safe. Can you describe the
> real situation you want to overcome ? How do you differentiate critical
> sensors ?

The driver always tries to register four thermal zones (cpu, gpu, mem
and pll), but if the dts file doesn't describe the corresponding
sensors, registration fails.
Normally the dts file describes all four sensors, but some platforms
may not support them all, hence this patch.

BTW, what do you mean by "critical sensors"? We set a critical trip
temperature for all sensors.

Wei.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-21 10:23     ` Wei Ni
@ 2018-11-21 12:51       ` Daniel Lezcano
  2018-11-22  7:10         ` Wei Ni
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Lezcano @ 2018-11-21 12:51 UTC (permalink / raw)
  To: Wei Ni, thierry.reding, linux-tegra; +Cc: rui.zhang, edubezval, linux-kernel

On 21/11/2018 11:23, Wei Ni wrote:
> 
> 
> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>> On 13/11/2018 11:06, Wei Ni wrote:
>>> Don't bail when a sensor fails to register with the
>>> thermal zone and allow other sensors to register.
>>> This allows other sensors to register with thermal
>>> framework even if one sensor fails registration.
>>
>> I'm not sure if ignoring the error is really safe. Can you describe the
>> real situation you want to overcome ? How do you differentiate critical
>> sensors ?
> 
> The driver will always try to register 4 thermal zones, including cpu,
> gpu, mem and pll, but if the dts file doesn't set the corresponding
> sensors, then the register will be failed.
> Normally, the dts file will set all 4 sensors, but there may have some
> platform doesn't support them all. So we post this patch.

Ignoring errors is not the way to go to support different platforms. Fix
the DT.


> BTW, what do you mean "critical sensors"? We will set critical trip temp
> for all sensors.

I meant a sensor for a thermal zone that reaches a really high
temperature.






^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-21 12:51       ` Daniel Lezcano
@ 2018-11-22  7:10         ` Wei Ni
  2018-11-22 13:07           ` Daniel Lezcano
  0 siblings, 1 reply; 13+ messages in thread
From: Wei Ni @ 2018-11-22  7:10 UTC (permalink / raw)
  To: Daniel Lezcano, thierry.reding, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel



On 21/11/2018 8:51 PM, Daniel Lezcano wrote:
> On 21/11/2018 11:23, Wei Ni wrote:
>>
>>
>> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>>> On 13/11/2018 11:06, Wei Ni wrote:
>>>> Don't bail when a sensor fails to register with the
>>>> thermal zone and allow other sensors to register.
>>>> This allows other sensors to register with thermal
>>>> framework even if one sensor fails registration.
>>>
>>> I'm not sure if ignoring the error is really safe. Can you describe the
>>> real situation you want to overcome ? How do you differentiate critical
>>> sensors ?
>>
>> The driver will always try to register 4 thermal zones, including cpu,
>> gpu, mem and pll, but if the dts file doesn't set the corresponding
>> sensors, then the register will be failed.
>> Normally, the dts file will set all 4 sensors, but there may have some
>> platform doesn't support them all. So we post this patch.
> 
> Ignoring errors is not the way to go to support different platforms. Fix
> the DT.

The issue isn't in the DT file. The Tegra soctherm includes four
sensors: cpu, gpu, mem and pll. On some platforms, however, we may only
need to support two of them, such as cpu and gpu, so we only configure
those two sensors in the DT file. The driver still tries to register
all four sensors, so registration of mem and pll fails. In this case we
need to ignore the failure and continue enabling the driver.

> 
> 
>> BTW, what do you mean "critical sensors"? We will set critical trip temp
>> for all sensors.
> 
> I meant sensor for thermal zone getting really high temperature.

We don't have critical sensors. We set the critical trip temperature
for all registered sensors, and these trip temperatures are programmed
into the Tegra hardware. So if the temperature reaches the critical
trip point, the system is shut down directly.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-22  7:10         ` Wei Ni
@ 2018-11-22 13:07           ` Daniel Lezcano
  2018-11-23  6:15             ` Wei Ni
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Lezcano @ 2018-11-22 13:07 UTC (permalink / raw)
  To: Wei Ni, thierry.reding, linux-tegra; +Cc: rui.zhang, edubezval, linux-kernel

On 22/11/2018 08:10, Wei Ni wrote:
> 
> 
> On 21/11/2018 8:51 PM, Daniel Lezcano wrote:
>> On 21/11/2018 11:23, Wei Ni wrote:
>>>
>>>
>>> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>>>> On 13/11/2018 11:06, Wei Ni wrote:
>>>>> Don't bail when a sensor fails to register with the
>>>>> thermal zone and allow other sensors to register.
>>>>> This allows other sensors to register with thermal
>>>>> framework even if one sensor fails registration.
>>>>
>>>> I'm not sure if ignoring the error is really safe. Can you describe the
>>>> real situation you want to overcome ? How do you differentiate critical
>>>> sensors ?
>>>
>>> The driver will always try to register 4 thermal zones, including cpu,
>>> gpu, mem and pll, but if the dts file doesn't set the corresponding
>>> sensors, then the register will be failed.
>>> Normally, the dts file will set all 4 sensors, but there may have some
>>> platform doesn't support them all. So we post this patch.
>>
>> Ignoring errors is not the way to go to support different platforms. Fix
>> the DT.
> 
> The issue isn't in DT file. The Tegra soc thermal include 4 sensors:
> cpu, gpu, mem, pll. But in some platforms, for example, we may only need
> to support 2 sensors, such as cpu and gpu, so we only configure these
> two senors in DT file. But the driver will always try to register 4
> sensors, cpu/gpu/mem/pll, so mem and pll will be registered failed. So
> in this case we need to ignoring the failure, and continue to enable the
> driver.

You can fix this by changing the driver to support the platform and
register the sensor you are interested in.

Ignoring errors is not a good idea.






^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-22 13:07           ` Daniel Lezcano
@ 2018-11-23  6:15             ` Wei Ni
  2018-11-23  6:51               ` Daniel Lezcano
  0 siblings, 1 reply; 13+ messages in thread
From: Wei Ni @ 2018-11-23  6:15 UTC (permalink / raw)
  To: Daniel Lezcano, thierry.reding, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel



On 22/11/2018 9:07 PM, Daniel Lezcano wrote:
> On 22/11/2018 08:10, Wei Ni wrote:
>>
>>
>> On 21/11/2018 8:51 PM, Daniel Lezcano wrote:
>>> On 21/11/2018 11:23, Wei Ni wrote:
>>>>
>>>>
>>>> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>>>>> On 13/11/2018 11:06, Wei Ni wrote:
>>>>>> Don't bail when a sensor fails to register with the
>>>>>> thermal zone and allow other sensors to register.
>>>>>> This allows other sensors to register with thermal
>>>>>> framework even if one sensor fails registration.
>>>>>
>>>>> I'm not sure if ignoring the error is really safe. Can you describe the
>>>>> real situation you want to overcome ? How do you differentiate critical
>>>>> sensors ?
>>>>
>>>> The driver will always try to register 4 thermal zones, including cpu,
>>>> gpu, mem and pll, but if the dts file doesn't set the corresponding
>>>> sensors, then the register will be failed.
>>>> Normally, the dts file will set all 4 sensors, but there may have some
>>>> platform doesn't support them all. So we post this patch.
>>>
>>> Ignoring errors is not the way to go to support different platforms. Fix
>>> the DT.
>>
>> The issue isn't in DT file. The Tegra soc thermal include 4 sensors:
>> cpu, gpu, mem, pll. But in some platforms, for example, we may only need
>> to support 2 sensors, such as cpu and gpu, so we only configure these
>> two senors in DT file. But the driver will always try to register 4
>> sensors, cpu/gpu/mem/pll, so mem and pll will be registered failed. So
>> in this case we need to ignoring the failure, and continue to enable the
>> driver.
> 
> You can fix this by changing the driver to support the platform and
> register the sensor you are interested in.
> 
> Ignoring errors is not a good idea.

If the driver hits such an error, it prints a warning. In the current
code the probe routine returns failure immediately; apart from printing
the message it doesn't do anything else either.
I think this error should not block the registration of the other
sensors. Consider this case: we have four sensors, and if one of them
fails to register, the driver returns a probe failure, so the driver is
not enabled and the other sensors can't work either. That means the
device may boot up without any thermal sensors.
Alternatively, if the error is -ENODEV, which means the corresponding
sensor id isn't set in the DT file, we can continue registering. If the
error is any other value, we can return directly.

Wei.
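
The -ENODEV policy proposed above can be sketched in userspace C. This is an illustration under stated assumptions, not the driver code: `register_sensors()` and the `results` array are hypothetical, with each entry standing in for the return value of one sensor registration (0 on success, a negative errno otherwise).

```c
#include <errno.h>
#include <stdio.h>

#define NUM_SENSORS 4

static const char *const sensor_names[NUM_SENSORS] = {
	"cpu", "gpu", "mem", "pll"
};

/*
 * results[i] simulates the outcome of registering sensor i.
 * -ENODEV means "not described in the DT" and is skipped;
 * any other error aborts and is returned to the caller.
 * registered[i] is set to 1 only for sensors that registered.
 */
static int register_sensors(const int results[NUM_SENSORS],
			    int registered[NUM_SENSORS])
{
	int i;

	for (i = 0; i < NUM_SENSORS; i++)
		registered[i] = 0;

	for (i = 0; i < NUM_SENSORS; i++) {
		if (results[i] == -ENODEV) {
			fprintf(stderr, "sensor %s not present, skipping\n",
				sensor_names[i]);
			continue;
		}
		if (results[i] < 0) {
			fprintf(stderr, "failed to register sensor %s: %d\n",
				sensor_names[i], results[i]);
			return results[i];
		}
		registered[i] = 1;
	}

	return 0;
}
```

Under this policy a board that only describes cpu and gpu still probes successfully, while a genuine failure such as -ENOMEM is still reported instead of being silently ignored.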


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-23  6:15             ` Wei Ni
@ 2018-11-23  6:51               ` Daniel Lezcano
  2018-11-23  8:28                 ` Wei Ni
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Lezcano @ 2018-11-23  6:51 UTC (permalink / raw)
  To: Wei Ni, thierry.reding, linux-tegra; +Cc: rui.zhang, edubezval, linux-kernel


Hi Wei,

On 23/11/2018 07:15, Wei Ni wrote:
> 
> 
> On 22/11/2018 9:07 PM, Daniel Lezcano wrote:
>> On 22/11/2018 08:10, Wei Ni wrote:
>>>
>>>
>>> On 21/11/2018 8:51 PM, Daniel Lezcano wrote:
>>>> On 21/11/2018 11:23, Wei Ni wrote:
>>>>>
>>>>>
>>>>> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>>>>>> On 13/11/2018 11:06, Wei Ni wrote:
>>>>>>> Don't bail when a sensor fails to register with the
>>>>>>> thermal zone and allow other sensors to register.
>>>>>>> This allows other sensors to register with thermal
>>>>>>> framework even if one sensor fails registration.
>>>>>>
>>>>>> I'm not sure if ignoring the error is really safe. Can you describe the
>>>>>> real situation you want to overcome ? How do you differentiate critical
>>>>>> sensors ?
>>>>>
>>>>> The driver will always try to register 4 thermal zones, including cpu,
>>>>> gpu, mem and pll, but if the dts file doesn't set the corresponding
>>>>> sensors, then the register will be failed.
>>>>> Normally, the dts file will set all 4 sensors, but there may have some
>>>>> platform doesn't support them all. So we post this patch.
>>>>
>>>> Ignoring errors is not the way to go to support different platforms. Fix
>>>> the DT.
>>>
>>> The issue isn't in DT file. The Tegra soc thermal include 4 sensors:
>>> cpu, gpu, mem, pll. But in some platforms, for example, we may only need
>>> to support 2 sensors, such as cpu and gpu, so we only configure these
>>> two senors in DT file. But the driver will always try to register 4
>>> sensors, cpu/gpu/mem/pll, so mem and pll will be registered failed. So
>>> in this case we need to ignoring the failure, and continue to enable the
>>> driver.
>>
>> You can fix this by changing the driver to support the platform and
>> register the sensor you are interested in.
>>
>> Ignoring errors is not a good idea.
> 
> If hit the errors, the driver will print out the warning. In current
> code, the driver probe routine will return failure directly, indeed it
> didn't do anything either except print out warnings.
> I think this error should not block other sensors' registration. Let's
> consider this case, we have four sensors, if the one sensor register
> failed, then the driver return probe failure, so the drive will not be
> enabled, and other sensor can't work either, it mean the device may boot
> up without any thermal sensors.
> Or if the error is the -ENODEV, that mean the we didn't set
> corresponding sensor id in the dt file, so we can continue to register.
> If the error is other value, then we can return directly.

That is a possibility, but there may be a couple of alternatives:

1. If there is a compatible string for the platform variant, use it to
probe the right sensors

or

2. Use the qoriq driver approach: reparse the DT to find the thermal
zones and their respective sensor ids.
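
The first alternative could look roughly like the following userspace sketch. Everything in it is an assumption made for illustration: the compatible strings are invented, and `struct variant`, `variants[]` and `sensors_for()` are hypothetical names, not part of the soctherm driver. It only shows the idea of a per-variant table that tells probe which sensors to register, so that no registration error has to be ignored.

```c
#include <stddef.h>
#include <string.h>

/* One bit per sensor the hardware block can expose. */
enum {
	SENSOR_CPU = 1 << 0,
	SENSOR_GPU = 1 << 1,
	SENSOR_MEM = 1 << 2,
	SENSOR_PLL = 1 << 3,
};

struct variant {
	const char *compatible;	/* hypothetical compatible string */
	unsigned int sensors;	/* bitmask of sensors to register */
};

static const struct variant variants[] = {
	{ "example,soctherm-full",
	  SENSOR_CPU | SENSOR_GPU | SENSOR_MEM | SENSOR_PLL },
	{ "example,soctherm-lite", SENSOR_CPU | SENSOR_GPU },
};

/* Return the sensor mask for a compatible string, or 0 if unknown. */
static unsigned int sensors_for(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(variants) / sizeof(variants[0]); i++) {
		if (strcmp(variants[i].compatible, compatible) == 0)
			return variants[i].sensors;
	}

	return 0;
}
```

With such a table the probe loop would register only the sensors whose bit is set, and any failure for one of those sensors could remain a hard error.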




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] thermal: tegra: continue if sensor register fails
  2018-11-23  6:51               ` Daniel Lezcano
@ 2018-11-23  8:28                 ` Wei Ni
  0 siblings, 0 replies; 13+ messages in thread
From: Wei Ni @ 2018-11-23  8:28 UTC (permalink / raw)
  To: Daniel Lezcano, thierry.reding, linux-tegra
  Cc: rui.zhang, edubezval, linux-kernel



On 23/11/2018 2:51 PM, Daniel Lezcano wrote:
> 
> Hi Wei,
> 
> On 23/11/2018 07:15, Wei Ni wrote:
>>
>>
>> On 22/11/2018 9:07 PM, Daniel Lezcano wrote:
>>> On 22/11/2018 08:10, Wei Ni wrote:
>>>>
>>>>
>>>> On 21/11/2018 8:51 PM, Daniel Lezcano wrote:
>>>>> On 21/11/2018 11:23, Wei Ni wrote:
>>>>>>
>>>>>>
>>>>>> On 21/11/2018 4:55 PM, Daniel Lezcano wrote:
>>>>>>> On 13/11/2018 11:06, Wei Ni wrote:
>>>>>>>> Don't bail when a sensor fails to register with the
>>>>>>>> thermal zone and allow other sensors to register.
>>>>>>>> This allows other sensors to register with thermal
>>>>>>>> framework even if one sensor fails registration.
>>>>>>>
>>>>>>> I'm not sure if ignoring the error is really safe. Can you describe the
>>>>>>> real situation you want to overcome? How do you differentiate critical
>>>>>>> sensors?
>>>>>>
>>>>>> The driver always tries to register 4 thermal zones (cpu, gpu, mem
>>>>>> and pll), but if the dts file doesn't define the corresponding
>>>>>> sensors, the registration fails.
>>>>>> Normally the dts file defines all 4 sensors, but some platforms
>>>>>> may not support them all. That is why we posted this patch.
>>>>>
>>>>> Ignoring errors is not the way to go to support different platforms. Fix
>>>>> the DT.
>>>>
>>>> The issue isn't in the DT file. The Tegra soc thermal includes 4 sensors:
>>>> cpu, gpu, mem, pll. But on some platforms we may only need to support
>>>> 2 sensors, such as cpu and gpu, so we only configure these two sensors
>>>> in the DT file. The driver, however, always tries to register all 4
>>>> sensors (cpu/gpu/mem/pll), so registering mem and pll fails. In this
>>>> case we need to ignore the failure and continue to enable the
>>>> driver.
>>>
>>> You can fix this by changing the driver to support the platform and
>>> register the sensor you are interested in.
>>>
>>> Ignoring errors is not a good idea.
>>
>> If it hits an error, the driver prints a warning. In the current
>> code, the driver probe routine returns failure directly; apart from
>> printing warnings it doesn't do anything else either.
>> I think this error should not block other sensors' registration.
>> Consider this case: we have four sensors, and if one sensor fails to
>> register, the driver returns probe failure, so the driver is not
>> enabled and the other sensors can't work either. That means the device
>> may boot up without any thermal sensors.
>> Alternatively, if the error is -ENODEV, that means we didn't set the
>> corresponding sensor id in the dt file, so we can continue to register.
>> If the error is any other value, we can return directly.
> 
> That is a possibility, but there may be a couple of alternatives:
> 
> 1. If there is a compatible string for the platform variant, use it to
> probe the right sensors
> 
> or
> 
> 2. Use the qoriq driver approach: reparse the DT to find the thermal
> zones and their respective sensor ids.

Daniel, thanks for your comments; I will consider them in my next version.

Wei.
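
For comparison, the -ENODEV variant mentioned in the quoted discussion
could be sketched in userspace roughly as follows. This is an assumption
about how such a loop might look, not the driver code: NUM_SENSORS,
register_sensor() and register_all() are hypothetical names, and the
results[] array merely simulates what the real registration call would
return for each sensor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_SENSORS 4	/* cpu, gpu, mem, pll */

/*
 * Hypothetical stand-in for the real registration call: results[i] is
 * 0 on success or a negative errno for sensor i.
 */
static int register_sensor(const int *results, size_t i)
{
	return results[i];
}

/*
 * Try every sensor. -ENODEV (sensor not described in the DT) is skipped,
 * any other error aborts. registered[i] ends up 1 only for sensors that
 * registered. Returns 0, or the first fatal error.
 */
static int register_all(const int *results, int *registered)
{
	size_t i;

	assert(results != NULL && registered != NULL);
	for (i = 0; i < NUM_SENSORS; i++)
		registered[i] = 0;

	for (i = 0; i < NUM_SENSORS; i++) {
		int err = register_sensor(results, i);

		if (err == -ENODEV)
			continue;	/* absent from DT: not fatal */
		if (err)
			return err;	/* real failure: bail out */
		registered[i] = 1;
	}
	return 0;
}
```

Later code paths (the resume handler, for instance) would then have to skip
every sensor whose registered[] entry is 0, much like the NULL check on tz
in the quoted patch.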
> 
> 
>>>>>> BTW, what do you mean by "critical sensors"? We will set a critical
>>>>>> trip temp for all sensors.
>>>>>
>>>>> I meant sensors for thermal zones that get really hot.
>>>>
>>>> We don't have critical sensors. We set the critical trip temp for
>>>> all registered sensors, and these trip temps are programmed into the
>>>> Tegra hardware. So if the temperature reaches the critical trip point,
>>>> the system will be shut down directly.
>>>>
>>>>>
>>>>>
>>>>>>>> Signed-off-by: Wei Ni <wni@nvidia.com>
>>>>>>>> ---
>>>>>>>>  drivers/thermal/tegra/soctherm.c | 8 +++++---
>>>>>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
>>>>>>>> index ed28110a3535..a824d2e63af3 100644
>>>>>>>> --- a/drivers/thermal/tegra/soctherm.c
>>>>>>>> +++ b/drivers/thermal/tegra/soctherm.c
>>>>>>>> @@ -1370,9 +1370,9 @@ static int tegra_soctherm_probe(struct platform_device *pdev)
>>>>>>>>  							 &tegra_of_thermal_ops);
>>>>>>>>  		if (IS_ERR(z)) {
>>>>>>>>  			err = PTR_ERR(z);
>>>>>>>> -			dev_err(&pdev->dev, "failed to register sensor: %d\n",
>>>>>>>> -				err);
>>>>>>>> -			goto disable_clocks;
>>>>>>>> +			dev_warn(&pdev->dev, "failed to register sensor %s: %d\n",
>>>>>>>> +				 soc->ttgs[i]->name, err);
>>>>>>>> +			continue;
>>>>>>>>  		}
>>>>>>>>  
>>>>>>>>  		zone->tz = z;
>>>>>>>> @@ -1434,6 +1434,8 @@ static int __maybe_unused soctherm_resume(struct device *dev)
>>>>>>>>  		struct thermal_zone_device *tz;
>>>>>>>>  
>>>>>>>>  		tz = tegra->thermctl_tzs[soc->ttgs[i]->id];
>>>>>>>> +		if (!tz)
>>>>>>>> +			continue;
>>>>>>>>  		err = tegra_soctherm_set_hwtrips(dev, soc->ttgs[i], tz);
>>>>>>>>  		if (err) {
>>>>>>>>  			dev_err(&pdev->dev,
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-11-23  8:28 UTC | newest]

Thread overview: 13+ messages
2018-11-13 10:06 [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni
2018-11-13 10:06 ` [PATCH v2 1/3] thermal: tegra: continue if sensor register fails Wei Ni
2018-11-21  8:55   ` Daniel Lezcano
2018-11-21 10:23     ` Wei Ni
2018-11-21 12:51       ` Daniel Lezcano
2018-11-22  7:10         ` Wei Ni
2018-11-22 13:07           ` Daniel Lezcano
2018-11-23  6:15             ` Wei Ni
2018-11-23  6:51               ` Daniel Lezcano
2018-11-23  8:28                 ` Wei Ni
2018-11-13 10:06 ` [PATCH v2 2/3] thermal: tegra: remove unnecessary warnings Wei Ni
2018-11-13 10:06 ` [PATCH v2 3/3] thermal: tegra: fix memory allocation Wei Ni
2018-11-20  7:06 ` [PATCH v2 0/3] Fixes for Tegra soctherm Wei Ni