linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* weird cooling_device/cur_state sysfs behaviour
@ 2020-03-20 15:10 Willy Wolff
  2020-03-20 17:43 ` Daniel Lezcano
  0 siblings, 1 reply; 4+ messages in thread
From: Willy Wolff @ 2020-03-20 15:10 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin, Daniel Lezcano, Amit Kucheria,
	linux-pm, linux-kernel

Hi all,
Hope you're doing well with the situation.

I fill that my board get a cold too...

When I write to a sysfs node, I have a weird behaviour about the function that is called behind.
This bug appears on an arm32 odroid-xu3, and only after v5.4, v5.3.18 behave correctly.


Here my modification to see what's going on:

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index aa99edb4dff7..a437ae3f4b9f 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -706,11 +706,22 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
        unsigned long state;
        int result;
 
-       if (sscanf(buf, "%ld\n", &state) != 1)
+       dump_stack();
+
+       printk(KERN_INFO "%s:%d:%s buf = \"%s\", count = %zu\n",
+              __FILE__, __LINE__, __func__, buf, count);
+
+       if (sscanf(buf, "%ld\n", &state) != 1) {
+               printk(KERN_INFO "%s:%d:%s\n",
+                      __FILE__, __LINE__, __func__);
                return -EINVAL;
+       }
 
-       if ((long)state < 0)
+       if ((long)state < 0) {
+               printk(KERN_INFO "%s:%d:%s\n",
+                      __FILE__, __LINE__, __func__);
                return -EINVAL;
+       }
 
        mutex_lock(&cdev->lock);
 
@@ -719,9 +730,14 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
                thermal_cooling_device_stats_update(cdev, state);
 
        mutex_unlock(&cdev->lock);
+
+       printk(KERN_INFO "%s:%d:%s result = %d, count = %zu\n",
+              __FILE__, __LINE__, __func__, result, count);
+
        return result ? result : count;
 }
 
+
 static struct device_attribute
 dev_attr_cdev_type = __ATTR(type, 0444, cdev_type_show, NULL);
 static DEVICE_ATTR_RO(max_state);



And here the prob:
Thursday 2019-02-14 10:26:46 root@odroid-xu3-0 ~
$ echo 11 > /sys/devices/virtual/thermal/cooling_device1/cur_state                                                                                    
[  908.109955] CPU: 5 PID: 500 Comm: bash Not tainted 5.4.0-dirty #6
[  908.114910] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  908.121014] [<c0112f08>] (unwind_backtrace) from [<c010df38>] (show_stack+0x10/0x14)
[  908.128796] [<c010df38>] (show_stack) from [<c0ae89cc>] (dump_stack+0xa4/0xd0)
[  908.135962] [<c0ae89cc>] (dump_stack) from [<c0759e0c>] (cur_state_store+0x2c/0x14c)
[  908.143686] [<c0759e0c>] (cur_state_store) from [<c033b360>] (kernfs_fop_write+0x100/0x1e0)
[  908.152035] [<c033b360>] (kernfs_fop_write) from [<c02a93f4>] (__vfs_write+0x2c/0x1c0)
[  908.159908] [<c02a93f4>] (__vfs_write) from [<c02ac25c>] (vfs_write+0xa4/0x188)
[  908.167163] [<c02ac25c>] (vfs_write) from [<c02ac4b0>] (ksys_write+0x58/0xd0)
[  908.174242] [<c02ac4b0>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
[  908.181874] Exception stack(0xeb915fa8 to 0xeb915ff0)
[  908.186800] 5fa0:                   00000074 0050ba08 00000001 0050ba08 00000003 00000000
[  908.195083] 5fc0: 00000074 0050ba08 b6eb3db0 00000004 00000003 00000003 00000000 00000000
[  908.203218] 5fe0: 00000004 bef3c830 b6e4b5b3 b6dd4746
[  908.210744] drivers/thermal/thermal_sysfs.c:712:cur_state_store buf = "11
[  908.210744] ", count = 3
[  908.221520] drivers/thermal/thermal_sysfs.c:735:cur_state_store result = 1, count = 3
[  908.228681] CPU: 5 PID: 500 Comm: bash Not tainted 5.4.0-dirty #6
[  908.234268] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  908.240372] [<c0112f08>] (unwind_backtrace) from [<c010df38>] (show_stack+0x10/0x14)
[  908.248159] [<c010df38>] (show_stack) from [<c0ae89cc>] (dump_stack+0xa4/0xd0)
[  908.255325] [<c0ae89cc>] (dump_stack) from [<c0759e0c>] (cur_state_store+0x2c/0x14c)
[  908.263048] [<c0759e0c>] (cur_state_store) from [<c033b360>] (kernfs_fop_write+0x100/0x1e0)
[  908.271398] [<c033b360>] (kernfs_fop_write) from [<c02a93f4>] (__vfs_write+0x2c/0x1c0)
[  908.279271] [<c02a93f4>] (__vfs_write) from [<c02ac25c>] (vfs_write+0xa4/0x188)
[  908.286527] [<c02ac25c>] (vfs_write) from [<c02ac4b0>] (ksys_write+0x58/0xd0)
[  908.293608] [<c02ac4b0>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
[  908.301241] Exception stack(0xeb915fa8 to 0xeb915ff0)
[  908.306168] 5fa0:                   00000074 0050ba09 00000001 0050ba09 00000002 00000000
[  908.314448] 5fc0: 00000074 0050ba09 b6eb3db0 00000004 00000003 00000003 00000000 00000000
[  908.322581] 5fe0: 00000004 bef3c830 b6e4b5b3 b6dd4746
[  908.328660] drivers/thermal/thermal_sysfs.c:712:cur_state_store buf = "1
[  908.328660] ", count = 2
[  908.338432] drivers/thermal/thermal_sysfs.c:735:cur_state_store result = 1, count = 2
[  908.345695] CPU: 5 PID: 500 Comm: bash Not tainted 5.4.0-dirty #6
[  908.351172] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[  908.357276] [<c0112f08>] (unwind_backtrace) from [<c010df38>] (show_stack+0x10/0x14)
[  908.365062] [<c010df38>] (show_stack) from [<c0ae89cc>] (dump_stack+0xa4/0xd0)
[  908.372227] [<c0ae89cc>] (dump_stack) from [<c0759e0c>] (cur_state_store+0x2c/0x14c)
[  908.379951] [<c0759e0c>] (cur_state_store) from [<c033b360>] (kernfs_fop_write+0x100/0x1e0)
[  908.388302] [<c033b360>] (kernfs_fop_write) from [<c02a93f4>] (__vfs_write+0x2c/0x1c0)
[  908.396176] [<c02a93f4>] (__vfs_write) from [<c02ac25c>] (vfs_write+0xa4/0x188)
[  908.403433] [<c02ac25c>] (vfs_write) from [<c02ac4b0>] (ksys_write+0x58/0xd0)
[  908.410511] [<c02ac4b0>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
[  908.418145] Exception stack(0xeb915fa8 to 0xeb915ff0)
[  908.423068] 5fa0:                   00000074 0050ba0a 00000001 0050ba0a 00000001 00000000
[  908.431352] 5fc0: 00000074 0050ba0a b6eb3db0 00000004 00000003 00000003 00000000 00000000
[  908.439487] 5fe0: 00000004 bef3c830 b6e4b5b3 b6dd4746
[  908.445365] drivers/thermal/thermal_sysfs.c:712:cur_state_store buf = "
[  908.445365] ", count = 1
[  908.454762] drivers/thermal/thermal_sysfs.c:716:cur_state_store
-bash: echo: write error: Invalid argument




As you can see, the function is recalled with one character less, until the "buffer is empty".
I don't understand why. Can anybody help me for this thing?
Many thanks in advance.

Best Regards,
Willy


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: weird cooling_device/cur_state sysfs behaviour
  2020-03-20 15:10 weird cooling_device/cur_state sysfs behaviour Willy Wolff
@ 2020-03-20 17:43 ` Daniel Lezcano
  2020-03-20 20:17   ` Daniel Lezcano
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Lezcano @ 2020-03-20 17:43 UTC (permalink / raw)
  To: Willy Wolff, Zhang Rui, Eduardo Valentin, Amit Kucheria,
	linux-pm, linux-kernel

On 20/03/2020 16:10, Willy Wolff wrote:
> Hi all,
> Hope you're doing well with the situation.
> 
> I fill that my board get a cold too...
> 
> When I write to a sysfs node, I have a weird behaviour about the function that is called behind.
> This bug appears on an arm32 odroid-xu3, and only after v5.4, v5.3.18 behave correctly.
> 
> 
> Here my modification to see what's going on:
> 
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index aa99edb4dff7..a437ae3f4b9f 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -706,11 +706,22 @@ cur_state_store(struct device *dev, struct device_attribute *attr,

[ ... ]

>         return result ? result : count;
>  }
>  
> +

[ ... ]

> 
> As you can see, the function is recalled with one character less, until the "buffer is empty".
> I don't understand why. Can anybody help me for this thing?
> Many thanks in advance.

[ ... ]

Very likely the problem is coming from:

        result = cdev->ops->set_cur_state(cdev, state);

which returns '1' as showed by the traces:

drivers/thermal/thermal_sysfs.c:735:cur_state_store result = 1, count = 3

And because of the return condition above:

	return result ? result : count;

 the function is returning result, so '1', which is interpreted by the
sysfs as "I wrote one character', so it recalls the function with the
two remaining characters, etc ...

The problem is from the governor AFAICT, which governor is it?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: weird cooling_device/cur_state sysfs behaviour
  2020-03-20 17:43 ` Daniel Lezcano
@ 2020-03-20 20:17   ` Daniel Lezcano
  2020-03-21  9:43     ` Willy Wolff
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Lezcano @ 2020-03-20 20:17 UTC (permalink / raw)
  To: Willy Wolff, Zhang Rui, Eduardo Valentin, Amit Kucheria,
	linux-pm, linux-kernel, Rafael J. Wysocki

Adding Rafael,

On 20/03/2020 18:43, Daniel Lezcano wrote:
> On 20/03/2020 16:10, Willy Wolff wrote:
>> Hi all,
>> Hope you're doing well with the situation.
>>
>> I fill that my board get a cold too...
>>
>> When I write to a sysfs node, I have a weird behaviour about the function that is called behind.
>> This bug appears on an arm32 odroid-xu3, and only after v5.4, v5.3.18 behave correctly.
>>
>>
>> Here my modification to see what's going on:
>>
>> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
>> index aa99edb4dff7..a437ae3f4b9f 100644
>> --- a/drivers/thermal/thermal_sysfs.c
>> +++ b/drivers/thermal/thermal_sysfs.c
>> @@ -706,11 +706,22 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
> 
> [ ... ]
> 
>>         return result ? result : count;
>>  }
>>  
>> +
> 
> [ ... ]
> 
>>
>> As you can see, the function is recalled with one character less, until the "buffer is empty".
>> I don't understand why. Can anybody help me for this thing?
>> Many thanks in advance.
> 
> [ ... ]
> 
> Very likely the problem is coming from:
> 
>         result = cdev->ops->set_cur_state(cdev, state);
> 
> which returns '1' as showed by the traces:
> 
> drivers/thermal/thermal_sysfs.c:735:cur_state_store result = 1, count = 3
> 
> And because of the return condition above:
> 
> 	return result ? result : count;
> 
>  the function is returning result, so '1', which is interpreted by the
> sysfs as "I wrote one character', so it recalls the function with the
> two remaining characters, etc ...
> 
> The problem is from the governor AFAICT, which governor is it?

I went through the code and I believe the problem is coming from:

cpufreq_set_cur_state
 -> freq_qos_update_request
   -> freq_qos_apply
       -> pm_qos_update_target =>

" * This function returns 1 if the aggregated constraint value has changed,"

	freq_qos_apply() does:

		ret = pm_qos_update_target()
	...
	return ret;


At the first glance, it is related to commit 77751a466ebd1 (Nov 2019).




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: weird cooling_device/cur_state sysfs behaviour
  2020-03-20 20:17   ` Daniel Lezcano
@ 2020-03-21  9:43     ` Willy Wolff
  0 siblings, 0 replies; 4+ messages in thread
From: Willy Wolff @ 2020-03-21  9:43 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Zhang Rui, Eduardo Valentin, Amit Kucheria, linux-pm,
	linux-kernel, Rafael J. Wysocki

Indeed, many thanks for you help.

On Fri, Mar 20, 2020 at 09:17:47pm +0100, Daniel Lezcano wrote:
> Adding Rafael,
> 
> On 20/03/2020 18:43, Daniel Lezcano wrote:
> > On 20/03/2020 16:10, Willy Wolff wrote:
> >> Hi all,
> >> Hope you're doing well with the situation.
> >>
> >> I fill that my board get a cold too...
> >>
> >> When I write to a sysfs node, I have a weird behaviour about the function that is called behind.
> >> This bug appears on an arm32 odroid-xu3, and only after v5.4, v5.3.18 behave correctly.
> >>
> >>
> >> Here my modification to see what's going on:
> >>
> >> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> >> index aa99edb4dff7..a437ae3f4b9f 100644
> >> --- a/drivers/thermal/thermal_sysfs.c
> >> +++ b/drivers/thermal/thermal_sysfs.c
> >> @@ -706,11 +706,22 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
> > 
> > [ ... ]
> > 
> >>         return result ? result : count;
> >>  }
> >>  
> >> +
> > 
> > [ ... ]
> > 
> >>
> >> As you can see, the function is recalled with one character less, until the "buffer is empty".
> >> I don't understand why. Can anybody help me for this thing?
> >> Many thanks in advance.
> > 
> > [ ... ]
> > 
> > Very likely the problem is coming from:
> > 
> >         result = cdev->ops->set_cur_state(cdev, state);
> > 
> > which returns '1' as showed by the traces:
> > 
> > drivers/thermal/thermal_sysfs.c:735:cur_state_store result = 1, count = 3
> > 
> > And because of the return condition above:
> > 
> > 	return result ? result : count;
> > 
> >  the function is returning result, so '1', which is interpreted by the
> > sysfs as "I wrote one character', so it recalls the function with the
> > two remaining characters, etc ...
> > 
> > The problem is from the governor AFAICT, which governor is it?
> 
> I went through the code and I believe the problem is coming from:
> 
> cpufreq_set_cur_state
>  -> freq_qos_update_request
>    -> freq_qos_apply
>        -> pm_qos_update_target =>
> 
> " * This function returns 1 if the aggregated constraint value has changed,"
> 
> 	freq_qos_apply() does:
> 
> 		ret = pm_qos_update_target()
> 	...
> 	return ret;
> 
> 
> At the first glance, it is related to commit 77751a466ebd1 (Nov 2019).
> 
> 
> 
> 
> -- 
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-03-21  9:43 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-20 15:10 weird cooling_device/cur_state sysfs behaviour Willy Wolff
2020-03-20 17:43 ` Daniel Lezcano
2020-03-20 20:17   ` Daniel Lezcano
2020-03-21  9:43     ` Willy Wolff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).