All of lore.kernel.org
 help / color / mirror / Atom feed
* OOB access on ACPI processor thermal device via sysfs write
@ 2020-04-02  7:41 Takashi Iwai
  2020-04-02  7:47 ` Zhang, Rui
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2020-04-02  7:41 UTC (permalink / raw)
  To: linux-acpi; +Cc: Zhang Rui, Rafael J. Wysocki, Len Brown, linux-kernel

[ My post yesterday didn't seem go out properly, so I'm resending with
  a different MTA setup; sorry if you already received it ]

Hi,

after my recent patch [*], I started looking at the root cause more
closely, and found that it's a side-effect of the device
initialization order between acpi_processor_driver and cpufreq.
  [*] https://lore.kernel.org/linux-pm/20200330140859.12535-1-tiwai@suse.de/

In short:
  
- acpi_processor_driver_init() is called fairly early boot stage, and
  it registers a thermal device for each CPU. 

- A thermal device allocates a statistics table with the fixed size
  determined by get_max_state() ops call at its probe time.
  (The table entry is updated at each time a write to thermal state
  file happens.)

  For ACPI, processor_get_max_state() is called and it invokes
  acpi_processor_max_state() that calculates the max states depending
  on cpufreq, cpufreq_get_max_state().

  cpufreq_get_max_state() returns 0 at this stage because no cpufreq
  driver is available.  So the table size is set to 1.

- Later on, after cpufreq get initialized, the following sysfs write
  causes an OOB access and corrupts memory (eventually Oops):
    # echo 3 > /sys/class/thermal/cooling_device0/cur_state

  The reason is that now processor_get_max_state() returns 3 as
  cpufreq became available.  So the thermal driver believes as if it
  were a valid value, and updates the table accordingly although it
  overflows.


My patch above detects and avoids such an overflow, but it's no real
fix for the cause.  The proper fix needs either the shuffling of the
device creation order or some automatic resize of statistics table.

Do you guys have any suggestion how to solve it?

FWIW, I could reproduce the problem locally on my laptop with 5.6
kernel, and I believe this can be seen generically on most of
machines.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: OOB access on ACPI processor thermal device via sysfs write
  2020-04-02  7:41 OOB access on ACPI processor thermal device via sysfs write Takashi Iwai
@ 2020-04-02  7:47 ` Zhang, Rui
  2020-04-02  9:03   ` Takashi Iwai
  0 siblings, 1 reply; 6+ messages in thread
From: Zhang, Rui @ 2020-04-02  7:47 UTC (permalink / raw)
  To: Takashi Iwai, linux-acpi; +Cc: Rafael J. Wysocki, Len Brown, linux-kernel

CC Viresh.

Yes, I've received it.

To me, there is not a hard rule that the cooling device max_state must be static.
We should be able to detect the max_state change and reset the stats table when necessary.

I just finished a prototype patch to do so, and will paste it later.

Thanks,
rui

-----Original Message-----
From: Takashi Iwai <tiwai@suse.de> 
Sent: Thursday, April 02, 2020 3:42 PM
To: linux-acpi@vger.kernel.org
Cc: Zhang, Rui <rui.zhang@intel.com>; Rafael J. Wysocki <rjw@rjwysocki.net>; Len Brown <lenb@kernel.org>; linux-kernel@vger.kernel.org
Subject: OOB access on ACPI processor thermal device via sysfs write
Importance: High

[ My post yesterday didn't seem go out properly, so I'm resending with
  a different MTA setup; sorry if you already received it ]

Hi,

after my recent patch [*], I started looking at the root cause more closely, and found that it's a side-effect of the device initialization order between acpi_processor_driver and cpufreq.
  [*] https://lore.kernel.org/linux-pm/20200330140859.12535-1-tiwai@suse.de/

In short:
  
- acpi_processor_driver_init() is called fairly early boot stage, and
  it registers a thermal device for each CPU. 

- A thermal device allocates a statistics table with the fixed size
  determined by get_max_state() ops call at its probe time.
  (The table entry is updated at each time a write to thermal state
  file happens.)

  For ACPI, processor_get_max_state() is called and it invokes
  acpi_processor_max_state() that calculates the max states depending
  on cpufreq, cpufreq_get_max_state().

  cpufreq_get_max_state() returns 0 at this stage because no cpufreq
  driver is available.  So the table size is set to 1.

- Later on, after cpufreq get initialized, the following sysfs write
  causes an OOB access and corrupts memory (eventually Oops):
    # echo 3 > /sys/class/thermal/cooling_device0/cur_state

  The reason is that now processor_get_max_state() returns 3 as
  cpufreq became available.  So the thermal driver believes as if it
  were a valid value, and updates the table accordingly although it
  overflows.


My patch above detects and avoids such an overflow, but it's no real fix for the cause.  The proper fix needs either the shuffling of the device creation order or some automatic resize of statistics table.

Do you guys have any suggestion how to solve it?

FWIW, I could reproduce the problem locally on my laptop with 5.6 kernel, and I believe this can be seen generically on most of machines.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: OOB access on ACPI processor thermal device via sysfs write
  2020-04-02  7:47 ` Zhang, Rui
@ 2020-04-02  9:03   ` Takashi Iwai
  2020-04-02 10:03     ` Zhang Rui
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2020-04-02  9:03 UTC (permalink / raw)
  To: Zhang, Rui; +Cc: linux-acpi, Rafael J. Wysocki, Len Brown, linux-kernel

On Thu, 02 Apr 2020 09:47:50 +0200,
Zhang, Rui wrote:
> 
> CC Viresh.
> 
> Yes, I've received it.
> 
> To me, there is not a hard rule that the cooling device max_state must be static.
> We should be able to detect the max_state change and reset the stats table when necessary.
> 
> I just finished a prototype patch to do so, and will paste it later.

Great, that sounds like a feasible option, indeed.


thanks,

Takashi

> 
> Thanks,
> rui
> 
> -----Original Message-----
> From: Takashi Iwai <tiwai@suse.de> 
> Sent: Thursday, April 02, 2020 3:42 PM
> To: linux-acpi@vger.kernel.org
> Cc: Zhang, Rui <rui.zhang@intel.com>; Rafael J. Wysocki <rjw@rjwysocki.net>; Len Brown <lenb@kernel.org>; linux-kernel@vger.kernel.org
> Subject: OOB access on ACPI processor thermal device via sysfs write
> Importance: High
> 
> [ My post yesterday didn't seem go out properly, so I'm resending with
>   a different MTA setup; sorry if you already received it ]
> 
> Hi,
> 
> after my recent patch [*], I started looking at the root cause more closely, and found that it's a side-effect of the device initialization order between acpi_processor_driver and cpufreq.
>   [*] https://lore.kernel.org/linux-pm/20200330140859.12535-1-tiwai@suse.de/
> 
> In short:
>   
> - acpi_processor_driver_init() is called fairly early boot stage, and
>   it registers a thermal device for each CPU. 
> 
> - A thermal device allocates a statistics table with the fixed size
>   determined by get_max_state() ops call at its probe time.
>   (The table entry is updated at each time a write to thermal state
>   file happens.)
> 
>   For ACPI, processor_get_max_state() is called and it invokes
>   acpi_processor_max_state() that calculates the max states depending
>   on cpufreq, cpufreq_get_max_state().
> 
>   cpufreq_get_max_state() returns 0 at this stage because no cpufreq
>   driver is available.  So the table size is set to 1.
> 
> - Later on, after cpufreq get initialized, the following sysfs write
>   causes an OOB access and corrupts memory (eventually Oops):
>     # echo 3 > /sys/class/thermal/cooling_device0/cur_state
> 
>   The reason is that now processor_get_max_state() returns 3 as
>   cpufreq became available.  So the thermal driver believes as if it
>   were a valid value, and updates the table accordingly although it
>   overflows.
> 
> 
> My patch above detects and avoids such an overflow, but it's no real fix for the cause.  The proper fix needs either the shuffling of the device creation order or some automatic resize of statistics table.
> 
> Do you guys have any suggestion how to solve it?
> 
> FWIW, I could reproduce the problem locally on my laptop with 5.6 kernel, and I believe this can be seen generically on most of machines.
> 
> 
> thanks,
> 
> Takashi
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: OOB access on ACPI processor thermal device via sysfs write
  2020-04-02  9:03   ` Takashi Iwai
@ 2020-04-02 10:03     ` Zhang Rui
  2020-04-02 10:09       ` Takashi Iwai
  2020-04-02 18:07       ` Rafael J. Wysocki
  0 siblings, 2 replies; 6+ messages in thread
From: Zhang Rui @ 2020-04-02 10:03 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: linux-acpi, Rafael J. Wysocki, Len Brown, linux-kernel

On Thu, 2020-04-02 at 11:03 +0200, Takashi Iwai wrote:
> On Thu, 02 Apr 2020 09:47:50 +0200,
> Zhang, Rui wrote:
> > 
> > CC Viresh.
> > 
> > Yes, I've received it.
> > 
> > To me, there is not a hard rule that the cooling device max_state
> > must be static.
> > We should be able to detect the max_state change and reset the
> > stats table when necessary.
> > 
> > I just finished a prototype patch to do so, and will paste it
> > later.
> 
> Great, that sounds like a feasible option, indeed.
> 
> 
Please try the patch below and see if the problem goes away or not.

From 7b429674a0e1a6226734c8919b876bb57d946b1d Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Thu, 2 Apr 2020 11:18:44 +0800
Subject: [RFC PATCH] thermal: update thermal stats table when max cooling
 state changed

The maximum cooling state of a cooling device may be changed at
runtime. Thus the statistics table must be updated to handle the real
maximum cooling states supported.

This fixes an OOB issue when updating the statistics of the processor
cooling device, because it only supports 1 cooling state before cpufreq
driver loaded.

Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs")
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/thermal/thermal_sysfs.c | 38 +++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index aa99edb4dff7..c69173eb4b24 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -755,6 +755,9 @@ struct cooling_dev_stats {
 	unsigned int *trans_table;
 };
 
+static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev);
+static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev);
+
 static void update_time_in_state(struct cooling_dev_stats *stats)
 {
 	ktime_t now = ktime_get(), delta;
@@ -768,8 +771,12 @@ static void update_time_in_state(struct cooling_dev_stats *stats)
 void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev,
 					 unsigned long new_state)
 {
-	struct cooling_dev_stats *stats = cdev->stats;
+	struct cooling_dev_stats *stats;
 
+	if (cooling_device_stats_table_update(cdev))
+		return;
+
+	stats = cdev->stats;
 	spin_lock(&stats->lock);
 
 	if (stats->state == new_state)
@@ -904,24 +911,32 @@ static const struct attribute_group cooling_device_stats_attr_group = {
 	.name = "stats"
 };
 
-static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
+static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev)
 {
 	struct cooling_dev_stats *stats;
 	unsigned long states;
-	int var;
+	int var, ret;
 
-	if (cdev->ops->get_max_state(cdev, &states))
-		return;
+	ret = cdev->ops->get_max_state(cdev, &states);
+	if (ret)
+		return ret;
 
 	states++; /* Total number of states is highest state + 1 */
 
+	stats = cdev->stats;
+	if (stats) {
+		if (stats->max_states == states)
+			return 0;
+		else
+			cooling_device_stats_destroy(cdev);
+	}
+
 	var = sizeof(*stats);
 	var += sizeof(*stats->time_in_state) * states;
 	var += sizeof(*stats->trans_table) * states * states;
-
 	stats = kzalloc(var, GFP_KERNEL);
 	if (!stats)
-		return;
+		return -ENOMEM;
 
 	stats->time_in_state = (ktime_t *)(stats + 1);
 	stats->trans_table = (unsigned int *)(stats->time_in_state + states);
@@ -930,6 +945,15 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 	stats->max_states = states;
 
 	spin_lock_init(&stats->lock);
+	return 0;
+}
+
+static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
+{
+	int var;
+
+	if (cooling_device_stats_table_update(cdev))
+		return;
 
 	/* Fill the empty slot left in cooling_device_attr_groups */
 	var = ARRAY_SIZE(cooling_device_attr_groups) - 2;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: OOB access on ACPI processor thermal device via sysfs write
  2020-04-02 10:03     ` Zhang Rui
@ 2020-04-02 10:09       ` Takashi Iwai
  2020-04-02 18:07       ` Rafael J. Wysocki
  1 sibling, 0 replies; 6+ messages in thread
From: Takashi Iwai @ 2020-04-02 10:09 UTC (permalink / raw)
  To: Zhang Rui; +Cc: linux-acpi, Rafael J. Wysocki, Len Brown, linux-kernel

On Thu, 02 Apr 2020 12:03:30 +0200,
Zhang Rui wrote:
> 
> On Thu, 2020-04-02 at 11:03 +0200, Takashi Iwai wrote:
> > On Thu, 02 Apr 2020 09:47:50 +0200,
> > Zhang, Rui wrote:
> > > 
> > > CC Viresh.
> > > 
> > > Yes, I've received it.
> > > 
> > > To me, there is not a hard rule that the cooling device max_state
> > > must be static.
> > > We should be able to detect the max_state change and reset the
> > > stats table when necessary.
> > > 
> > > I just finished a prototype patch to do so, and will paste it
> > > later.
> > 
> > Great, that sounds like a feasible option, indeed.
> > 
> > 
> Please try the patch below and see if the problem goes away or not.
> 
> >From 7b429674a0e1a6226734c8919b876bb57d946b1d Mon Sep 17 00:00:00 2001
> From: Zhang Rui <rui.zhang@intel.com>
> Date: Thu, 2 Apr 2020 11:18:44 +0800
> Subject: [RFC PATCH] thermal: update thermal stats table when max cooling
>  state changed
> 
> The maximum cooling state of a cooling device may be changed at
> runtime. Thus the statistics table must be updated to handle the real
> maximum cooling states supported.
> 
> This fixes an OOB issue when updating the statistics of the processor
> cooling device, because it only supports 1 cooling state before cpufreq
> driver loaded.
> 
> Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs")
> Reported-by: Takashi Iwai <tiwai@suse.de>
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> ---
>  drivers/thermal/thermal_sysfs.c | 38 +++++++++++++++++++++++++++------
>  1 file changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index aa99edb4dff7..c69173eb4b24 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -755,6 +755,9 @@ struct cooling_dev_stats {
>  	unsigned int *trans_table;
>  };
>  
> +static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev);
> +static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev);
> +
>  static void update_time_in_state(struct cooling_dev_stats *stats)
>  {
>  	ktime_t now = ktime_get(), delta;
> @@ -768,8 +771,12 @@ static void update_time_in_state(struct cooling_dev_stats *stats)
>  void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev,
>  					 unsigned long new_state)
>  {
> -	struct cooling_dev_stats *stats = cdev->stats;
> +	struct cooling_dev_stats *stats;
>  
> +	if (cooling_device_stats_table_update(cdev))
> +		return;
> +
> +	stats = cdev->stats;
>  	spin_lock(&stats->lock);
>  
>  	if (stats->state == new_state)
> @@ -904,24 +911,32 @@ static const struct attribute_group cooling_device_stats_attr_group = {
>  	.name = "stats"
>  };
>  
> -static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
> +static int cooling_device_stats_table_update(struct thermal_cooling_device *cdev)
>  {
>  	struct cooling_dev_stats *stats;
>  	unsigned long states;
> -	int var;
> +	int var, ret;
>  
> -	if (cdev->ops->get_max_state(cdev, &states))
> -		return;
> +	ret = cdev->ops->get_max_state(cdev, &states);
> +	if (ret)
> +		return ret;
>  
>  	states++; /* Total number of states is highest state + 1 */
>  
> +	stats = cdev->stats;
> +	if (stats) {
> +		if (stats->max_states == states)
> +			return 0;
> +		else
> +			cooling_device_stats_destroy(cdev);
> +	}

This looks racy.  We may have concurrent accesses and it'll lead to
another access-after-free.

>  	var = sizeof(*stats);
>  	var += sizeof(*stats->time_in_state) * states;
>  	var += sizeof(*stats->trans_table) * states * states;
> -
>  	stats = kzalloc(var, GFP_KERNEL);
>  	if (!stats)
> -		return;
> +		return -ENOMEM;

... and this leaves NULL to cdev->stats.  Then a later access to sysfs
would lead to NULL derference.


>  	stats->time_in_state = (ktime_t *)(stats + 1);
>  	stats->trans_table = (unsigned int *)(stats->time_in_state + states);
> @@ -930,6 +945,15 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
>  	stats->max_states = states;
>  
>  	spin_lock_init(&stats->lock);

Also we must not re-initialize spinlock here at each resizing.
Rather use spinlock for switching to the new table.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: OOB access on ACPI processor thermal device via sysfs write
  2020-04-02 10:03     ` Zhang Rui
  2020-04-02 10:09       ` Takashi Iwai
@ 2020-04-02 18:07       ` Rafael J. Wysocki
  1 sibling, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-04-02 18:07 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Takashi Iwai, linux-acpi, Len Brown, linux-kernel

On Thursday, April 2, 2020 12:03:30 PM CEST Zhang Rui wrote:
> On Thu, 2020-04-02 at 11:03 +0200, Takashi Iwai wrote:
> > On Thu, 02 Apr 2020 09:47:50 +0200,
> > Zhang, Rui wrote:
> > > 
> > > CC Viresh.
> > > 
> > > Yes, I've received it.
> > > 
> > > To me, there is not a hard rule that the cooling device max_state
> > > must be static.
> > > We should be able to detect the max_state change and reset the
> > > stats table when necessary.
> > > 
> > > I just finished a prototype patch to do so, and will paste it
> > > later.
> > 
> > Great, that sounds like a feasible option, indeed.
> > 
> > 
> Please try the patch below and see if the problem goes away or not.
> 
> From 7b429674a0e1a6226734c8919b876bb57d946b1d Mon Sep 17 00:00:00 2001
> From: Zhang Rui <rui.zhang@intel.com>
> Date: Thu, 2 Apr 2020 11:18:44 +0800
> Subject: [RFC PATCH] thermal: update thermal stats table when max cooling
>  state changed
> 
> The maximum cooling state of a cooling device may be changed at
> runtime. Thus the statistics table must be updated to handle the real
> maximum cooling states supported.
> 
> This fixes an OOB issue when updating the statistics of the processor
> cooling device, because it only supports 1 cooling state before cpufreq
> driver loaded.

It might also be addressed by adding a ->get_state_count() callback to
struct thermal_cooling_device_ops (and fall back to ->get_max_state() if
that is NULL) and use that for the stats allocation.

If the new callback always returns CPUFREQ_THERMAL_MAX_STEP, the size of the
stats table will be sufficient in all cases and acpi_processor_notifier()
can update it as needed.




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-02 18:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-02  7:41 OOB access on ACPI processor thermal device via sysfs write Takashi Iwai
2020-04-02  7:47 ` Zhang, Rui
2020-04-02  9:03   ` Takashi Iwai
2020-04-02 10:03     ` Zhang Rui
2020-04-02 10:09       ` Takashi Iwai
2020-04-02 18:07       ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.