linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] thermal: fix race condition when updating cooling device
@ 2016-06-02 14:25 Michele Di Giorgio
  2016-06-06 12:51 ` Javi Merino
  0 siblings, 1 reply; 6+ messages in thread
From: Michele Di Giorgio @ 2016-06-02 14:25 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Javi Merino, Michele Di Giorgio, Zhang Rui,
	Eduardo Valentin, Peter Feuerer

When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

	    Thread k1				Thread k2
                                    ||
                                    ||  call thermal_cdev_update()
                                    ||      ...
                                    ||      set_cur_state(cdev, target);
    call power_actor_set_power()    ||
        ...                         ||
        instance->target = state;   ||
        cdev->updated = false;      ||
                                    ||      cdev->updated = true;
                                    ||      // completes execution
    call thermal_cdev_update()      ||
        // cdev->updated == true    ||
        return;                     ||
                                    \/
                                    time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
---
Protecting only the assignment of cdev->updated with mutexes may look
suspicious, but it is necessary to guarantee synchronization and avoiding the
situation described in the commit message.

There are other two possible solutions.

Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
assignment and the function. This would work, but will probably cause many
issues when updating all the modules that use thermal_cdev_update().

The other solution is to make cdev->updated an atomic_t, change the if
condition to an atomic_cmpxchg and extend the critical section to include the
call to cdev->ops->set_cur_state().

 drivers/thermal/fair_share.c      |  2 ++
 drivers/thermal/gov_bang_bang.c   |  2 ++
 drivers/thermal/power_allocator.c |  2 ++
 drivers/thermal/step_wise.c       |  2 ++
 drivers/thermal/thermal_core.c    | 10 +++++++---
 5 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
index 34fe365..68bd1b5 100644
--- a/drivers/thermal/fair_share.c
+++ b/drivers/thermal/fair_share.c
@@ -116,7 +116,9 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 		instance->target = get_target_state(tz, cdev, percentage,
 						    cur_trip_level);
 
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false;
+		mutex_unlock(&instance->cdev->lock);
 		thermal_cdev_update(cdev);
 	}
 	return 0;
diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index 70836c5..39d1519 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -65,7 +65,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		dev_dbg(&instance->cdev->device, "target=%d\n",
 					(int)instance->target);
 
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false; /* cdev needs update */
+		mutex_unlock(&instance->cdev->lock);
 	}
 
 	mutex_unlock(&tz->lock);
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 1246aa6..2cc5616 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -529,7 +529,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
 			continue;
 
 		instance->target = 0;
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false;
+		mutex_unlock(&instance->cdev->lock);
 		thermal_cdev_update(instance->cdev);
 	}
 }
diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
index ea9366a..bcef2e7 100644
--- a/drivers/thermal/step_wise.c
+++ b/drivers/thermal/step_wise.c
@@ -175,7 +175,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 			update_passive_instance(tz, trip_type, -1);
 
 		instance->initialized = true;
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false; /* cdev needs update */
+		mutex_unlock(&instance->cdev->lock);
 	}
 
 	mutex_unlock(&tz->lock);
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index a0a8fd1..9d23109 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1089,7 +1089,9 @@ int power_actor_set_power(struct thermal_cooling_device *cdev,
 		return ret;
 
 	instance->target = state;
+	mutex_lock(&cdev->lock);
 	cdev->updated = false;
+	mutex_unlock(&cdev->lock);
 	thermal_cdev_update(cdev);
 
 	return 0;
@@ -1619,11 +1621,13 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
 	struct thermal_instance *instance;
 	unsigned long target = 0;
 
+	mutex_lock(&cdev->lock);
 	/* cooling device is updated*/
-	if (cdev->updated)
+	if (cdev->updated) {
+		mutex_unlock(&cdev->lock);
 		return;
+	}
 
-	mutex_lock(&cdev->lock);
 	/* Make sure cdev enters the deepest cooling state */
 	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
 		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
@@ -1633,9 +1637,9 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
 		if (instance->target > target)
 			target = instance->target;
 	}
-	mutex_unlock(&cdev->lock);
 	cdev->ops->set_cur_state(cdev, target);
 	cdev->updated = true;
+	mutex_unlock(&cdev->lock);
 	trace_cdev_update(cdev, target);
 	dev_dbg(&cdev->device, "set to state %lu\n", target);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] thermal: fix race condition when updating cooling device
  2016-06-02 14:25 [PATCH] thermal: fix race condition when updating cooling device Michele Di Giorgio
@ 2016-06-06 12:51 ` Javi Merino
  2016-06-27  8:54   ` Michele DiGiorgio
  0 siblings, 1 reply; 6+ messages in thread
From: Javi Merino @ 2016-06-06 12:51 UTC (permalink / raw)
  To: Michele Di Giorgio
  Cc: linux-pm, linux-kernel, Zhang Rui, Eduardo Valentin, Peter Feuerer

On Thu, Jun 02, 2016 at 03:25:31PM +0100, Michele Di Giorgio wrote:
> When multiple thermal zones are bound to the same cooling device, multiple
> kernel threads may want to update the cooling device state by calling
> thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
> condition. Consider the following situation with two kernel threads k1 and k2:
> 
> 	    Thread k1				Thread k2
>                                     ||
>                                     ||  call thermal_cdev_update()
>                                     ||      ...
>                                     ||      set_cur_state(cdev, target);
>     call power_actor_set_power()    ||
>         ...                         ||
>         instance->target = state;   ||
>         cdev->updated = false;      ||
>                                     ||      cdev->updated = true;
>                                     ||      // completes execution
>     call thermal_cdev_update()      ||
>         // cdev->updated == true    ||
>         return;                     ||
>                                     \/
>                                     time
> 
> k2 has already looped through the thermal instances looking for the deepest
> cooling device state and is preempted right before setting cdev->updated to
> true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
> to false. Then, k1 is preempted and k2 continues the execution by setting
> cdev->updated to true, therefore preventing k1 from performing the update.
> Notice that this is not an issue if k2 looks at the instance->target modified by
> k1 "after" it is assigned by k1. In fact, in this case the update will happen
> anyway and k1 can safely return immediately from thermal_cdev_update().
> 
> This may lead to a situation where a thermal governor never updates the cooling
> device. For example, this is the case for the step_wise governor: when calling
> the function thermal_zone_trip_update(), the governor may always get a new state
> equal to the old one (which, however, wasn't notified to the cooling device) and
> will therefore skip the update.
> 
> CC: Zhang Rui <rui.zhang@intel.com>
> CC: Eduardo Valentin <edubezval@gmail.com>
> CC: Peter Feuerer <peter@piie.net>
> Reported-by: Toby Huang <toby.huang@arm.com>
> Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
> ---
> Protecting only the assignment of cdev->updated with mutexes may look
> suspicious, but it is necessary to guarantee synchronization and avoiding the
> situation described in the commit message.
> 
> There are other two possible solutions.
> 
> Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
> assignment and the function. This would work, but will probably cause many
> issues when updating all the modules that use thermal_cdev_update().
> 
> The other solution is to make cdev->updated an atomic_t, change the if
> condition to an atomic_cmpxchg and extend the critical section to include the
> call to cdev->ops->set_cur_state().

True.  In any case, the mutex needs to cover set_cur_state() in
thermal_cdev_update().  This fixes the race condition, so I'm happy
with it.

Reviewed-by: Javi Merino <javi.merino@arm.com>

>  drivers/thermal/fair_share.c      |  2 ++
>  drivers/thermal/gov_bang_bang.c   |  2 ++
>  drivers/thermal/power_allocator.c |  2 ++
>  drivers/thermal/step_wise.c       |  2 ++
>  drivers/thermal/thermal_core.c    | 10 +++++++---
>  5 files changed, 15 insertions(+), 3 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] thermal: fix race condition when updating cooling device
  2016-06-06 12:51 ` Javi Merino
@ 2016-06-27  8:54   ` Michele DiGiorgio
  0 siblings, 0 replies; 6+ messages in thread
From: Michele DiGiorgio @ 2016-06-27  8:54 UTC (permalink / raw)
  To: Eduardo Valentin, rui.zhang
  Cc: linux-pm, Javi Merino, linux-kernel, Zhang Rui, Eduardo Valentin,
	Peter Feuerer

Hi Eduardo, Rui,

On 06/06/16 13:51, Javi Merino wrote:
> On Thu, Jun 02, 2016 at 03:25:31PM +0100, Michele Di Giorgio wrote:
>> When multiple thermal zones are bound to the same cooling device, multiple
>> kernel threads may want to update the cooling device state by calling
>> thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
>> condition. Consider the following situation with two kernel threads k1 and k2:
>>
>>          Thread k1                           Thread k2
>>                                      ||
>>                                      ||  call thermal_cdev_update()
>>                                      ||      ...
>>                                      ||      set_cur_state(cdev, target);
>>      call power_actor_set_power()    ||
>>          ...                         ||
>>          instance->target = state;   ||
>>          cdev->updated = false;      ||
>>                                      ||      cdev->updated = true;
>>                                      ||      // completes execution
>>      call thermal_cdev_update()      ||
>>          // cdev->updated == true    ||
>>          return;                     ||
>>                                      \/
>>                                      time
>>
>> k2 has already looped through the thermal instances looking for the deepest
>> cooling device state and is preempted right before setting cdev->updated to
>> true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
>> to false. Then, k1 is preempted and k2 continues the execution by setting
>> cdev->updated to true, therefore preventing k1 from performing the update.
>> Notice that this is not an issue if k2 looks at the instance->target modified by
>> k1 "after" it is assigned by k1. In fact, in this case the update will happen
>> anyway and k1 can safely return immediately from thermal_cdev_update().
>>
>> This may lead to a situation where a thermal governor never updates the cooling
>> device. For example, this is the case for the step_wise governor: when calling
>> the function thermal_zone_trip_update(), the governor may always get a new state
>> equal to the old one (which, however, wasn't notified to the cooling device) and
>> will therefore skip the update.
>>
>> CC: Zhang Rui <rui.zhang@intel.com>
>> CC: Eduardo Valentin <edubezval@gmail.com>
>> CC: Peter Feuerer <peter@piie.net>
>> Reported-by: Toby Huang <toby.huang@arm.com>
>> Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
>> ---
>> Protecting only the assignment of cdev->updated with mutexes may look
>> suspicious, but it is necessary to guarantee synchronization and avoiding the
>> situation described in the commit message.
>>
>> There are other two possible solutions.
>>
>> Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
>> assignment and the function. This would work, but will probably cause many
>> issues when updating all the modules that use thermal_cdev_update().
>>
>> The other solution is to make cdev->updated an atomic_t, change the if
>> condition to an atomic_cmpxchg and extend the critical section to include the
>> call to cdev->ops->set_cur_state().
>
> True.  In any case, the mutex needs to cover set_cur_state() in
> thermal_cdev_update().  This fixes the race condition, so I'm happy
> with it.
>
> Reviewed-by: Javi Merino <javi.merino@arm.com>
>
>>   drivers/thermal/fair_share.c      |  2 ++
>>   drivers/thermal/gov_bang_bang.c   |  2 ++
>>   drivers/thermal/power_allocator.c |  2 ++
>>   drivers/thermal/step_wise.c       |  2 ++
>>   drivers/thermal/thermal_core.c    | 10 +++++++---
>>   5 files changed, 15 insertions(+), 3 deletions(-)
>

If there are no comments/issues, could you please consider this for next
merge window?

Thanks,
Michele
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] thermal: fix race condition when updating cooling device
  2016-07-20  9:46 Michele Di Giorgio
  2016-08-05 13:21 ` Michele DiGiorgio
@ 2016-08-11 21:50 ` Peter Feuerer
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Feuerer @ 2016-08-11 21:50 UTC (permalink / raw)
  To: Michele Di Giorgio, linux-pm
  Cc: linux-kernel, Javi Merino, Zhang Rui, Eduardo Valentin

20. Juli 2016 11:46 Uhr, "Michele Di Giorgio" <michele.digiorgio@arm.com> schrieb:
> When multiple thermal zones are bound to the same cooling device, multiple
> kernel threads may want to update the cooling device state by calling
> thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
> condition. Consider the following situation with two kernel threads k1 and k2:
> 
> Thread k1 Thread k2
> ||
> || call thermal_cdev_update()
> || ...
> || set_cur_state(cdev, target);
> call power_actor_set_power() ||
> ... ||
> instance->target = state; ||
> cdev->updated = false; ||
> || cdev->updated = true;
> || // completes execution
> call thermal_cdev_update() ||
> // cdev->updated == true ||
> return; ||
> \/
> time
> 
> k2 has already looped through the thermal instances looking for the deepest
> cooling device state and is preempted right before setting cdev->updated to
> true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
> to false. Then, k1 is preempted and k2 continues the execution by setting
> cdev->updated to true, therefore preventing k1 from performing the update.
> Notice that this is not an issue if k2 looks at the instance->target modified by
> k1 "after" it is assigned by k1. In fact, in this case the update will happen
> anyway and k1 can safely return immediately from thermal_cdev_update().
> 
> This may lead to a situation where a thermal governor never updates the cooling
> device. For example, this is the case for the step_wise governor: when calling
> the function thermal_zone_trip_update(), the governor may always get a new state
> equal to the old one (which, however, wasn't notified to the cooling device) and
> will therefore skip the update.
> 
> CC: Zhang Rui <rui.zhang@intel.com>
> CC: Eduardo Valentin <edubezval@gmail.com>
> CC: Peter Feuerer <peter@piie.net>

Acked-by: Peter Feuerer <peter@piie.net>


> Reported-by: Toby Huang <toby.huang@arm.com>
> Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
> ---
> Protecting only the assignment of cdev->updated with mutexes may look
> suspicious, but it is necessary to guarantee synchronization and avoiding the
> situation described in the commit message.
> 
> There are other two possible solutions.
> 
> Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
> assignment and the function. This would work, but will probably cause many
> issues when updating all the modules that use thermal_cdev_update().
> 
> The other solution is to make cdev->updated an atomic_t, change the if
> condition to an atomic_cmpxchg and extend the critical section to include the
> call to cdev->ops->set_cur_state().
> 
> drivers/thermal/fair_share.c | 2 ++
> drivers/thermal/gov_bang_bang.c | 2 ++
> drivers/thermal/power_allocator.c | 2 ++
> drivers/thermal/step_wise.c | 2 ++
> drivers/thermal/thermal_core.c | 10 +++++++---
> 5 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
> index 34fe365..68bd1b5 100644
> --- a/drivers/thermal/fair_share.c
> +++ b/drivers/thermal/fair_share.c
> @@ -116,7 +116,9 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
> instance->target = get_target_state(tz, cdev, percentage,
> cur_trip_level);
> 
> + mutex_lock(&instance->cdev->lock);
> instance->cdev->updated = false;
> + mutex_unlock(&instance->cdev->lock);
> thermal_cdev_update(cdev);
> }
> return 0;
> diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
> index 70836c5..39d1519 100644
> --- a/drivers/thermal/gov_bang_bang.c
> +++ b/drivers/thermal/gov_bang_bang.c
> @@ -65,7 +65,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
> dev_dbg(&instance->cdev->device, "target=%d\n",
> (int)instance->target);
> 
> + mutex_lock(&instance->cdev->lock);
> instance->cdev->updated = false; /* cdev needs update */
> + mutex_unlock(&instance->cdev->lock);
> }
> 
> mutex_unlock(&tz->lock);
> diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
> index 1246aa6..2cc5616 100644
> --- a/drivers/thermal/power_allocator.c
> +++ b/drivers/thermal/power_allocator.c
> @@ -529,7 +529,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
> continue;
> 
> instance->target = 0;
> + mutex_lock(&instance->cdev->lock);
> instance->cdev->updated = false;
> + mutex_unlock(&instance->cdev->lock);
> thermal_cdev_update(instance->cdev);
> }
> }
> diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
> index ea9366a..bcef2e7 100644
> --- a/drivers/thermal/step_wise.c
> +++ b/drivers/thermal/step_wise.c
> @@ -175,7 +175,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
> update_passive_instance(tz, trip_type, -1);
> 
> instance->initialized = true;
> + mutex_lock(&instance->cdev->lock);
> instance->cdev->updated = false; /* cdev needs update */
> + mutex_unlock(&instance->cdev->lock);
> }
> 
> mutex_unlock(&tz->lock);
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a0a8fd1..9d23109 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1089,7 +1089,9 @@ int power_actor_set_power(struct thermal_cooling_device *cdev,
> return ret;
> 
> instance->target = state;
> + mutex_lock(&cdev->lock);
> cdev->updated = false;
> + mutex_unlock(&cdev->lock);
> thermal_cdev_update(cdev);
> 
> return 0;
> @@ -1619,11 +1621,13 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
> struct thermal_instance *instance;
> unsigned long target = 0;
> 
> + mutex_lock(&cdev->lock);
> /* cooling device is updated*/
> - if (cdev->updated)
> + if (cdev->updated) {
> + mutex_unlock(&cdev->lock);
> return;
> + }
> 
> - mutex_lock(&cdev->lock);
> /* Make sure cdev enters the deepest cooling state */
> list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
> dev_dbg(&cdev->device, "zone%d->target=%lu\n",
> @@ -1633,9 +1637,9 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
> if (instance->target > target)
> target = instance->target;
> }
> - mutex_unlock(&cdev->lock);
> cdev->ops->set_cur_state(cdev, target);
> cdev->updated = true;
> + mutex_unlock(&cdev->lock);
> trace_cdev_update(cdev, target);
> dev_dbg(&cdev->device, "set to state %lu\n", target);
> }
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] thermal: fix race condition when updating cooling device
  2016-07-20  9:46 Michele Di Giorgio
@ 2016-08-05 13:21 ` Michele DiGiorgio
  2016-08-11 21:50 ` Peter Feuerer
  1 sibling, 0 replies; 6+ messages in thread
From: Michele DiGiorgio @ 2016-08-05 13:21 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Javi Merino, Zhang Rui, Eduardo Valentin, Peter Feuerer

Hi Eduardo, Rui,

On 20/07/16 10:46, Michele Di Giorgio wrote:
> When multiple thermal zones are bound to the same cooling device, multiple
> kernel threads may want to update the cooling device state by calling
> thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
> condition. Consider the following situation with two kernel threads k1 and k2:
>
>           Thread k1                           Thread k2
>                                      ||
>                                      ||  call thermal_cdev_update()
>                                      ||      ...
>                                      ||      set_cur_state(cdev, target);
>      call power_actor_set_power()    ||
>          ...                         ||
>          instance->target = state;   ||
>          cdev->updated = false;      ||
>                                      ||      cdev->updated = true;
>                                      ||      // completes execution
>      call thermal_cdev_update()      ||
>          // cdev->updated == true    ||
>          return;                     ||
>                                      \/
>                                      time
>
> k2 has already looped through the thermal instances looking for the deepest
> cooling device state and is preempted right before setting cdev->updated to
> true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
> to false. Then, k1 is preempted and k2 continues the execution by setting
> cdev->updated to true, therefore preventing k1 from performing the update.
> Notice that this is not an issue if k2 looks at the instance->target modified by
> k1 "after" it is assigned by k1. In fact, in this case the update will happen
> anyway and k1 can safely return immediately from thermal_cdev_update().
>
> This may lead to a situation where a thermal governor never updates the cooling
> device. For example, this is the case for the step_wise governor: when calling
> the function thermal_zone_trip_update(), the governor may always get a new state
> equal to the old one (which, however, wasn't notified to the cooling device) and
> will therefore skip the update.
>
> CC: Zhang Rui <rui.zhang@intel.com>
> CC: Eduardo Valentin <edubezval@gmail.com>
> CC: Peter Feuerer <peter@piie.net>
> Reported-by: Toby Huang <toby.huang@arm.com>
> Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
> ---
> Protecting only the assignment of cdev->updated with mutexes may look
> suspicious, but it is necessary to guarantee synchronization and avoiding the
> situation described in the commit message.
>
> There are other two possible solutions.
>
> Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
> assignment and the function. This would work, but will probably cause many
> issues when updating all the modules that use thermal_cdev_update().
>
> The other solution is to make cdev->updated an atomic_t, change the if
> condition to an atomic_cmpxchg and extend the critical section to include the
> call to cdev->ops->set_cur_state().
>
>   drivers/thermal/fair_share.c      |  2 ++
>   drivers/thermal/gov_bang_bang.c   |  2 ++
>   drivers/thermal/power_allocator.c |  2 ++
>   drivers/thermal/step_wise.c       |  2 ++
>   drivers/thermal/thermal_core.c    | 10 +++++++---
>   5 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
> index 34fe365..68bd1b5 100644
> --- a/drivers/thermal/fair_share.c
> +++ b/drivers/thermal/fair_share.c
> @@ -116,7 +116,9 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
>               instance->target = get_target_state(tz, cdev, percentage,
>                                                   cur_trip_level);
>
> +             mutex_lock(&instance->cdev->lock);
>               instance->cdev->updated = false;
> +             mutex_unlock(&instance->cdev->lock);
>               thermal_cdev_update(cdev);
>       }
>       return 0;
> diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
> index 70836c5..39d1519 100644
> --- a/drivers/thermal/gov_bang_bang.c
> +++ b/drivers/thermal/gov_bang_bang.c
> @@ -65,7 +65,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
>               dev_dbg(&instance->cdev->device, "target=%d\n",
>                                       (int)instance->target);
>
> +             mutex_lock(&instance->cdev->lock);
>               instance->cdev->updated = false; /* cdev needs update */
> +             mutex_unlock(&instance->cdev->lock);
>       }
>
>       mutex_unlock(&tz->lock);
> diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
> index 1246aa6..2cc5616 100644
> --- a/drivers/thermal/power_allocator.c
> +++ b/drivers/thermal/power_allocator.c
> @@ -529,7 +529,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
>                       continue;
>
>               instance->target = 0;
> +             mutex_lock(&instance->cdev->lock);
>               instance->cdev->updated = false;
> +             mutex_unlock(&instance->cdev->lock);
>               thermal_cdev_update(instance->cdev);
>       }
>   }
> diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
> index ea9366a..bcef2e7 100644
> --- a/drivers/thermal/step_wise.c
> +++ b/drivers/thermal/step_wise.c
> @@ -175,7 +175,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
>                       update_passive_instance(tz, trip_type, -1);
>
>               instance->initialized = true;
> +             mutex_lock(&instance->cdev->lock);
>               instance->cdev->updated = false; /* cdev needs update */
> +             mutex_unlock(&instance->cdev->lock);
>       }
>
>       mutex_unlock(&tz->lock);
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a0a8fd1..9d23109 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1089,7 +1089,9 @@ int power_actor_set_power(struct thermal_cooling_device *cdev,
>               return ret;
>
>       instance->target = state;
> +     mutex_lock(&cdev->lock);
>       cdev->updated = false;
> +     mutex_unlock(&cdev->lock);
>       thermal_cdev_update(cdev);
>
>       return 0;
> @@ -1619,11 +1621,13 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
>       struct thermal_instance *instance;
>       unsigned long target = 0;
>
> +     mutex_lock(&cdev->lock);
>       /* cooling device is updated*/
> -     if (cdev->updated)
> +     if (cdev->updated) {
> +             mutex_unlock(&cdev->lock);
>               return;
> +     }
>
> -     mutex_lock(&cdev->lock);
>       /* Make sure cdev enters the deepest cooling state */
>       list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
>               dev_dbg(&cdev->device, "zone%d->target=%lu\n",
> @@ -1633,9 +1637,9 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
>               if (instance->target > target)
>                       target = instance->target;
>       }
> -     mutex_unlock(&cdev->lock);
>       cdev->ops->set_cur_state(cdev, target);
>       cdev->updated = true;
> +     mutex_unlock(&cdev->lock);
>       trace_cdev_update(cdev, target);
>       dev_dbg(&cdev->device, "set to state %lu\n", target);
>   }
>

If there are no comments/issues, could you please consider this for next
merge window?

Thanks,
Michele
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] thermal: fix race condition when updating cooling device
@ 2016-07-20  9:46 Michele Di Giorgio
  2016-08-05 13:21 ` Michele DiGiorgio
  2016-08-11 21:50 ` Peter Feuerer
  0 siblings, 2 replies; 6+ messages in thread
From: Michele Di Giorgio @ 2016-07-20  9:46 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Javi Merino, Michele Di Giorgio, Zhang Rui,
	Eduardo Valentin, Peter Feuerer

When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

	    Thread k1				Thread k2
                                    ||
                                    ||  call thermal_cdev_update()
                                    ||      ...
                                    ||      set_cur_state(cdev, target);
    call power_actor_set_power()    ||
        ...                         ||
        instance->target = state;   ||
        cdev->updated = false;      ||
                                    ||      cdev->updated = true;
                                    ||      // completes execution
    call thermal_cdev_update()      ||
        // cdev->updated == true    ||
        return;                     ||
                                    \/
                                    time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
---
Protecting only the assignment of cdev->updated with mutexes may look
suspicious, but it is necessary to guarantee synchronization and avoiding the
situation described in the commit message.

There are other two possible solutions.

Moving the cdev->lock mutex outside thermal_cdev_update() and protect both the
assignment and the function. This would work, but will probably cause many
issues when updating all the modules that use thermal_cdev_update().

The other solution is to make cdev->updated an atomic_t, change the if
condition to an atomic_cmpxchg and extend the critical section to include the
call to cdev->ops->set_cur_state().

 drivers/thermal/fair_share.c      |  2 ++
 drivers/thermal/gov_bang_bang.c   |  2 ++
 drivers/thermal/power_allocator.c |  2 ++
 drivers/thermal/step_wise.c       |  2 ++
 drivers/thermal/thermal_core.c    | 10 +++++++---
 5 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/fair_share.c b/drivers/thermal/fair_share.c
index 34fe365..68bd1b5 100644
--- a/drivers/thermal/fair_share.c
+++ b/drivers/thermal/fair_share.c
@@ -116,7 +116,9 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 		instance->target = get_target_state(tz, cdev, percentage,
 						    cur_trip_level);
 
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false;
+		mutex_unlock(&instance->cdev->lock);
 		thermal_cdev_update(cdev);
 	}
 	return 0;
diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index 70836c5..39d1519 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -65,7 +65,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		dev_dbg(&instance->cdev->device, "target=%d\n",
 					(int)instance->target);
 
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false; /* cdev needs update */
+		mutex_unlock(&instance->cdev->lock);
 	}
 
 	mutex_unlock(&tz->lock);
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 1246aa6..2cc5616 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -529,7 +529,9 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
 			continue;
 
 		instance->target = 0;
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false;
+		mutex_unlock(&instance->cdev->lock);
 		thermal_cdev_update(instance->cdev);
 	}
 }
diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
index ea9366a..bcef2e7 100644
--- a/drivers/thermal/step_wise.c
+++ b/drivers/thermal/step_wise.c
@@ -175,7 +175,9 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 			update_passive_instance(tz, trip_type, -1);
 
 		instance->initialized = true;
+		mutex_lock(&instance->cdev->lock);
 		instance->cdev->updated = false; /* cdev needs update */
+		mutex_unlock(&instance->cdev->lock);
 	}
 
 	mutex_unlock(&tz->lock);
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index a0a8fd1..9d23109 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1089,7 +1089,9 @@ int power_actor_set_power(struct thermal_cooling_device *cdev,
 		return ret;
 
 	instance->target = state;
+	mutex_lock(&cdev->lock);
 	cdev->updated = false;
+	mutex_unlock(&cdev->lock);
 	thermal_cdev_update(cdev);
 
 	return 0;
@@ -1619,11 +1621,13 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
 	struct thermal_instance *instance;
 	unsigned long target = 0;
 
+	mutex_lock(&cdev->lock);
 	/* cooling device is updated*/
-	if (cdev->updated)
+	if (cdev->updated) {
+		mutex_unlock(&cdev->lock);
 		return;
+	}
 
-	mutex_lock(&cdev->lock);
 	/* Make sure cdev enters the deepest cooling state */
 	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
 		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
@@ -1633,9 +1637,9 @@ void thermal_cdev_update(struct thermal_cooling_device *cdev)
 		if (instance->target > target)
 			target = instance->target;
 	}
-	mutex_unlock(&cdev->lock);
 	cdev->ops->set_cur_state(cdev, target);
 	cdev->updated = true;
+	mutex_unlock(&cdev->lock);
 	trace_cdev_update(cdev, target);
 	dev_dbg(&cdev->device, "set to state %lu\n", target);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-08-11 21:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-02 14:25 [PATCH] thermal: fix race condition when updating cooling device Michele Di Giorgio
2016-06-06 12:51 ` Javi Merino
2016-06-27  8:54   ` Michele DiGiorgio
2016-07-20  9:46 Michele Di Giorgio
2016-08-05 13:21 ` Michele DiGiorgio
2016-08-11 21:50 ` Peter Feuerer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).