All of lore.kernel.org
 help / color / mirror / Atom feed
* cpufreq: Scaling driver sysfs attribute issues.
@ 2022-08-19 19:03 Chris Hyser
  2022-08-19 19:03 ` [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus Chris Hyser
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-19 19:03 UTC (permalink / raw)
  To: chris.hyser, linux-pm, Rafael J. Wysocki, Viresh Kumar

We found a NULL pointer dereference in cppc_cpufreq scaling driver, traced it
to root cause, and looked at the other scaling drivers to see what else may be
affected.

The simplest fix for cppc_cpufreq is basically identical to commit e25303676e18
("cpufreq: acpi_cpufreq: prevent crash on reading freqdomain_cpus") and is
included.

The problem is a scaling driver which provides a struct freq_attr sysfs
attributes array passed via struct cpufreq_driver during driver registration,
has .init() and .exit() functions and does _not_ provide .online()/.offline()
routines.  The freq_attr attributes are often backed by allocated memory from
.init() and freed on .exit(). The issue is that the cpufreq core does not remove
those driver freq_attr sysfs files in this scenario though the backing memory
has been freed and while some instances just show garbage values, others contain
pointers. Note: registering empty .online()/.offline() driver functions can also
solve this problem as .exit() is not called and the memory has not been freed.

Looking through the other scaling drivers, most that provide a .attr use
'cpufreq_generic_attr' for which the 'show' functions do check for a NULL data
pointer. Two others are worth mentioning: brcm_avs_cpufreq driver has it's own
attributes backed by it's own functions. These do not check to see if the data
pointer is NULL in the show() functions, however as it uses devm_kzalloc(), it
has no .exit() and the sysfs files remain backed by memory until driver unload.

The other is the 'amd_pstate' driver. I don't have a means of testing, but I
suspect that this driver does have a similar problem, ie if a cpu is removed
and the various driver sysfs attrs are accessed, it will crash.

Additionally, I spent some time looking at an alternative that removed the
driver .attr set of sysfs attributes prior to calling .exit() in
__cpufreq_offline() and puts them back following an .init() in cpufreq_online().
On the AMD and ARM systems I tested with, no policies are shared and thus
.init()/.exit() are called on every cpu online/offline operation.  A simple test
consisting of two tight loops one onlining/offlining a few cpus and the other
reading the sysfs cpumask file, ultimately results in deadlock between sysfs and
the cpu hotplug machinery.  This seems consistent with various commit comments
about the locking here and a good time to send an email.

-chrish


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-19 19:03 cpufreq: Scaling driver sysfs attribute issues Chris Hyser
@ 2022-08-19 19:03 ` Chris Hyser
  2022-08-22  5:39   ` Viresh Kumar
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-19 19:03 UTC (permalink / raw)
  To: chris.hyser, linux-pm, Rafael J. Wysocki, Viresh Kumar

From: chris hyser <chris.hyser@oracle.com>

While running stress-ng --sysfs on an ARM system following a cpu offline,
we encountered the following NULL pointer dereference in the cppc_cpufreq
scaling driver:

[ 1003.576816] Call trace:
[ 1003.579255]  _find_next_bit+0x20/0xc8
[ 1003.582909]  cpufreq_show_cpus+0x78/0xf4
[ 1003.586830]  show_freqdomain_cpus+0x20/0x30 [cppc_cpufreq]
[ 1003.592318]  show+0x4c/0x78
[ 1003.595104]  sysfs_kf_seq_show+0x9

This is the exact issue described in commit e25303676e18 ("cpufreq:
acpi_cpufreq: prevent crash on reading freqdomain_cpus") with the fix
described there also solving this issue. I tracked the root cause to the
following: a scaling driver which provides a struct freq_attr sysfs
attributes array passed via struct cpufreq_driver during driver
registration, has .init() and .exit() functions and does _not_ provide
.online()/.offline() routines. cpufreq core creates the attributes, but
does not remove them even though .exit() frees the underlying memory. The
core functions and most drivers have corresponding NULL data pointer
checks.

Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 24eaf0e..4210353 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -876,6 +876,9 @@ static ssize_t show_freqdomain_cpus(struct cpufreq_policy *policy, char *buf)
 {
 	struct cppc_cpudata *cpu_data = policy->driver_data;
 
+	if (unlikely(!cpu_data))
+		return -ENODEV;
+
 	return cpufreq_show_cpus(cpu_data->shared_cpu_map, buf);
 }
 cpufreq_freq_attr_ro(freqdomain_cpus);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-19 19:03 ` [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus Chris Hyser
@ 2022-08-22  5:39   ` Viresh Kumar
  2022-08-22 13:19     ` Chris Hyser
  0 siblings, 1 reply; 7+ messages in thread
From: Viresh Kumar @ 2022-08-22  5:39 UTC (permalink / raw)
  To: Chris Hyser; +Cc: linux-pm, Rafael J. Wysocki

On 19-08-22, 15:03, Chris Hyser wrote:
> From: chris hyser <chris.hyser@oracle.com>
> 
> While running stress-ng --sysfs on an ARM system following a cpu offline,
> we encountered the following NULL pointer dereference in the cppc_cpufreq
> scaling driver:
> 
> [ 1003.576816] Call trace:
> [ 1003.579255]  _find_next_bit+0x20/0xc8
> [ 1003.582909]  cpufreq_show_cpus+0x78/0xf4
> [ 1003.586830]  show_freqdomain_cpus+0x20/0x30 [cppc_cpufreq]
> [ 1003.592318]  show+0x4c/0x78
> [ 1003.595104]  sysfs_kf_seq_show+0x9
> 
> This is the exact issue described in commit e25303676e18 ("cpufreq:
> acpi_cpufreq: prevent crash on reading freqdomain_cpus") with the fix
> described there also solving this issue. I tracked the root cause to the
> following: a scaling driver which provides a struct freq_attr sysfs
> attributes array passed via struct cpufreq_driver during driver
> registration, has .init() and .exit() functions and does _not_ provide
> .online()/.offline() routines. cpufreq core creates the attributes, but
> does not remove them even though .exit() frees the underlying memory. The
> core functions and most drivers have corresponding NULL data pointer
> checks.
> 
> Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
> ---
>  drivers/cpufreq/cppc_cpufreq.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 24eaf0e..4210353 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -876,6 +876,9 @@ static ssize_t show_freqdomain_cpus(struct cpufreq_policy *policy, char *buf)
>  {
>  	struct cppc_cpudata *cpu_data = policy->driver_data;
>  
> +	if (unlikely(!cpu_data))
> +		return -ENODEV;
> +
>  	return cpufreq_show_cpus(cpu_data->shared_cpu_map, buf);
>  }
>  cpufreq_freq_attr_ro(freqdomain_cpus);

What kernel version are you testing this on ?

We merged a patch sometime back:

commit d4627a287e25 ("cpufreq: Abort show()/store() for half-initialized policies")

which I believe should have fixed this issue. I will be surprise if it
doesn't.

-- 
viresh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-22  5:39   ` Viresh Kumar
@ 2022-08-22 13:19     ` Chris Hyser
  2022-08-22 15:54       ` Chris Hyser
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-22 13:19 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-pm, Rafael J. Wysocki



On 8/22/22 1:39 AM, Viresh Kumar wrote:
> On 19-08-22, 15:03, Chris Hyser wrote:
>> From: chris hyser <chris.hyser@oracle.com>
>>
>> While running stress-ng --sysfs on an ARM system following a cpu offline,
>> we encountered the following NULL pointer dereference in the cppc_cpufreq
>> scaling driver:
>>
>> [ 1003.576816] Call trace:
>> [ 1003.579255]  _find_next_bit+0x20/0xc8
>> [ 1003.582909]  cpufreq_show_cpus+0x78/0xf4
>> [ 1003.586830]  show_freqdomain_cpus+0x20/0x30 [cppc_cpufreq]
>> [ 1003.592318]  show+0x4c/0x78
>> [ 1003.595104]  sysfs_kf_seq_show+0x9
>>
>> This is the exact issue described in commit e25303676e18 ("cpufreq:
>> acpi_cpufreq: prevent crash on reading freqdomain_cpus") with the fix
>> described there also solving this issue. I tracked the root cause to the
>> following: a scaling driver which provides a struct freq_attr sysfs
>> attributes array passed via struct cpufreq_driver during driver
>> registration, has .init() and .exit() functions and does _not_ provide
>> .online()/.offline() routines. cpufreq core creates the attributes, but
>> does not remove them even though .exit() frees the underlying memory. The
>> core functions and most drivers have corresponding NULL data pointer
>> checks.
>>
>> Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
>> ---
>>   drivers/cpufreq/cppc_cpufreq.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 24eaf0e..4210353 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -876,6 +876,9 @@ static ssize_t show_freqdomain_cpus(struct cpufreq_policy *policy, char *buf)
>>   {
>>   	struct cppc_cpudata *cpu_data = policy->driver_data;
>>   
>> +	if (unlikely(!cpu_data))
>> +		return -ENODEV;
>> +
>>   	return cpufreq_show_cpus(cpu_data->shared_cpu_map, buf);
>>   }
>>   cpufreq_freq_attr_ro(freqdomain_cpus);
> 
> What kernel version are you testing this on ?

5.19

> We merged a patch sometime back:
> 
> commit d4627a287e25 ("cpufreq: Abort show()/store() for half-initialized policies")
> 
> which I believe should have fixed this issue. I will be surprise if it
> doesn't.

This patch is present and apparently does not solve the problem.

-chrish

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-22 13:19     ` Chris Hyser
@ 2022-08-22 15:54       ` Chris Hyser
  2022-08-22 20:14         ` Chris Hyser
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-22 15:54 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-pm, Rafael J. Wysocki

On 8/22/22 9:19 AM, Chris Hyser wrote:
> 
> 
> On 8/22/22 1:39 AM, Viresh Kumar wrote:
>> On 19-08-22, 15:03, Chris Hyser wrote:
>>> From: chris hyser <chris.hyser@oracle.com>
>>>
>>> While running stress-ng --sysfs on an ARM system following a cpu offline,
>>> we encountered the following NULL pointer dereference in the cppc_cpufreq
>>> scaling driver:
>>>
>>> [ 1003.576816] Call trace:
>>> [ 1003.579255]  _find_next_bit+0x20/0xc8
>>> [ 1003.582909]  cpufreq_show_cpus+0x78/0xf4
>>> [ 1003.586830]  show_freqdomain_cpus+0x20/0x30 [cppc_cpufreq]
>>> [ 1003.592318]  show+0x4c/0x78
>>> [ 1003.595104]  sysfs_kf_seq_show+0x9
>>>
>>> This is the exact issue described in commit e25303676e18 ("cpufreq:
>>> acpi_cpufreq: prevent crash on reading freqdomain_cpus") with the fix
>>> described there also solving this issue. I tracked the root cause to the
>>> following: a scaling driver which provides a struct freq_attr sysfs
>>> attributes array passed via struct cpufreq_driver during driver
>>> registration, has .init() and .exit() functions and does _not_ provide
>>> .online()/.offline() routines. cpufreq core creates the attributes, but
>>> does not remove them even though .exit() frees the underlying memory. The
>>> core functions and most drivers have corresponding NULL data pointer
>>> checks.
>>>
>>> Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
>>> ---
>>>   drivers/cpufreq/cppc_cpufreq.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index 24eaf0e..4210353 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -876,6 +876,9 @@ static ssize_t show_freqdomain_cpus(struct cpufreq_policy *policy, char *buf)
>>>   {
>>>       struct cppc_cpudata *cpu_data = policy->driver_data;
>>> +    if (unlikely(!cpu_data))
>>> +        return -ENODEV;
>>> +
>>>       return cpufreq_show_cpus(cpu_data->shared_cpu_map, buf);
>>>   }
>>>   cpufreq_freq_attr_ro(freqdomain_cpus);
>>
>> What kernel version are you testing this on ?
> 
> 5.19
> 
>> We merged a patch sometime back:
>>
>> commit d4627a287e25 ("cpufreq: Abort show()/store() for half-initialized policies")
>>
>> which I believe should have fixed this issue. I will be surprise if it
>> doesn't.
> 
> This patch is present and apparently does not solve the problem.

On digging into it, that patch says it is about catching partially initialized policies and indicates that by clearing 
the policy->cpus mask. However, on a normal online/offline a completely initialized policy survives (ptr in the percpu 
space) and that mask is not cleared and the driver show() funcs get called.

-chrish

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-22 15:54       ` Chris Hyser
@ 2022-08-22 20:14         ` Chris Hyser
  2022-08-23  4:13           ` Viresh Kumar
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-22 20:14 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linux-pm, Rafael J. Wysocki

On 8/22/22 11:54 AM, Chris Hyser wrote:
> On 8/22/22 9:19 AM, Chris Hyser wrote:
>>
>>
>> On 8/22/22 1:39 AM, Viresh Kumar wrote:
>>> On 19-08-22, 15:03, Chris Hyser wrote:
>>>> From: chris hyser <chris.hyser@oracle.com>
>>>>
>>>> While running stress-ng --sysfs on an ARM system following a cpu offline,
>>>> we encountered the following NULL pointer dereference in the cppc_cpufreq
>>>> scaling driver:
>>>>
>>>> [ 1003.576816] Call trace:
>>>> [ 1003.579255]  _find_next_bit+0x20/0xc8
>>>> [ 1003.582909]  cpufreq_show_cpus+0x78/0xf4
>>>> [ 1003.586830]  show_freqdomain_cpus+0x20/0x30 [cppc_cpufreq]
>>>> [ 1003.592318]  show+0x4c/0x78
>>>> [ 1003.595104]  sysfs_kf_seq_show+0x9
>>>>
>>>> This is the exact issue described in commit e25303676e18 ("cpufreq:
>>>> acpi_cpufreq: prevent crash on reading freqdomain_cpus") with the fix
>>>> described there also solving this issue. I tracked the root cause to the
>>>> following: a scaling driver which provides a struct freq_attr sysfs
>>>> attributes array passed via struct cpufreq_driver during driver
>>>> registration, has .init() and .exit() functions and does _not_ provide
>>>> .online()/.offline() routines. cpufreq core creates the attributes, but
>>>> does not remove them even though .exit() frees the underlying memory. The
>>>> core functions and most drivers have corresponding NULL data pointer
>>>> checks.
>>>>
>>>> Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
>>>> ---
>>>>   drivers/cpufreq/cppc_cpufreq.c | 3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>>> index 24eaf0e..4210353 100644
>>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>>> @@ -876,6 +876,9 @@ static ssize_t show_freqdomain_cpus(struct cpufreq_policy *policy, char *buf)
>>>>   {
>>>>       struct cppc_cpudata *cpu_data = policy->driver_data;
>>>> +    if (unlikely(!cpu_data))
>>>> +        return -ENODEV;
>>>> +
>>>>       return cpufreq_show_cpus(cpu_data->shared_cpu_map, buf);
>>>>   }
>>>>   cpufreq_freq_attr_ro(freqdomain_cpus);
>>>
>>> What kernel version are you testing this on ?
>>
>> 5.19
>>
>>> We merged a patch sometime back:
>>>
>>> commit d4627a287e25 ("cpufreq: Abort show()/store() for half-initialized policies")
>>>
>>> which I believe should have fixed this issue. I will be surprise if it
>>> doesn't.
>>
>> This patch is present and apparently does not solve the problem.
> 
> On digging into it, that patch says it is about catching partially initialized policies and indicates that by clearing 
> the policy->cpus mask. However, on a normal online/offline a completely initialized policy survives (ptr in the percpu 
> space) and that mask is not cleared and the driver show() funcs get called.

On yet further digging, that patch does appear to work. There is code that removes the current cpu on offline and as 
that is the only cpu, it does leave the policy->cpus empty. I think this patch must have showed up between when I first 
started digging into this problem and when I started trying to finalize my stuff. Thanks for pointing out the fix.

-chrish

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus
  2022-08-22 20:14         ` Chris Hyser
@ 2022-08-23  4:13           ` Viresh Kumar
  0 siblings, 0 replies; 7+ messages in thread
From: Viresh Kumar @ 2022-08-23  4:13 UTC (permalink / raw)
  To: Chris Hyser; +Cc: linux-pm, Rafael J. Wysocki

On 22-08-22, 16:14, Chris Hyser wrote:
> On yet further digging, that patch does appear to work. There is code that
> removes the current cpu on offline and as that is the only cpu, it does
> leave the policy->cpus empty. I think this patch must have showed up between
> when I first started digging into this problem and when I started trying to
> finalize my stuff. Thanks for pointing out the fix.

Great.

-- 
viresh

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-08-23  4:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-19 19:03 cpufreq: Scaling driver sysfs attribute issues Chris Hyser
2022-08-19 19:03 ` [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus Chris Hyser
2022-08-22  5:39   ` Viresh Kumar
2022-08-22 13:19     ` Chris Hyser
2022-08-22 15:54       ` Chris Hyser
2022-08-22 20:14         ` Chris Hyser
2022-08-23  4:13           ` Viresh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.