linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
@ 2022-11-28  1:44 Waiman Long
  2022-11-28 13:34 ` 答复: [External Mail][PATCH-tip] " David Wang 王标
  2022-12-01 13:44 ` [PATCH-tip] " Will Deacon
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2022-11-28  1:44 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, David Wang 王标,
	linux-kernel, Waiman Long, stable

Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
restricted on asymmetric systems"), the setting and clearing of
user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
protection. When racing with the clearing of user_cpus_ptr in
__set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
double-free in arm64 kernel.

Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never
be cleared in a task's lifetime. However, this bug was re-introduced
in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
do_set_cpus_allowed(). This time, it will affect all arches.

Fix this bug by always clearing the user_cpus_ptr of the newly
cloned/forked task before the copying process starts and check the
user_cpus_ptr state of the source task under pi_lock.

Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.

Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable@vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8df51b08bb38..f2b75faaf71a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 		      int node)
 {
+	cpumask_t *user_mask;
 	unsigned long flags;
 
+	/*
+	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+	 * may differ by now due to racing.
+	 */
+	dst->user_cpus_ptr = NULL;
+
+	/*
+	 * This check is racy and losing the race is a valid situation.
+	 * It is not worth the extra overhead of taking the pi_lock on
+	 * every fork/clone.
+	 */
 	if (!src->user_cpus_ptr)
 		return 0;
 
-	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-	if (!dst->user_cpus_ptr)
+	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+	if (!user_mask)
 		return -ENOMEM;
 
-	/* Use pi_lock to protect content of user_cpus_ptr */
+	/*
+	 * Use pi_lock to protect content of user_cpus_ptr
+	 *
+	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+	 * do_set_cpus_allowed().
+	 */
 	raw_spin_lock_irqsave(&src->pi_lock, flags);
-	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	if (src->user_cpus_ptr) {
+		swap(dst->user_cpus_ptr, user_mask);
+		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	}
 	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+	if (unlikely(user_mask))
+		kfree(user_mask);
+
 	return 0;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* 答复: [External Mail][PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-11-28  1:44 [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
@ 2022-11-28 13:34 ` David Wang 王标
  2022-11-28 15:43   ` Waiman Long
  2022-12-01 13:44 ` [PATCH-tip] " Will Deacon
  1 sibling, 1 reply; 10+ messages in thread
From: David Wang 王标 @ 2022-11-28 13:34 UTC (permalink / raw)
  To: Waiman Long, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, linux-kernel, stable

Hi, Waiman

We use 140 devices to test this patch 72 hours.  The issue can not be reproduced.  If no this patch,  the issue can be reproduced.
Could you help merge this patch to mailine?

https://lore.kernel.org/all/20221125023943.1118603-1-longman@redhat.com/

If this patch is applied to the maintainer's tree,  we can request google to help cherrypick to ACK to fix issue.

Thanks

-----邮件原件-----
发件人: Waiman Long <longman@redhat.com>
发送时间: 2022年11月28日 9:45
收件人: Ingo Molnar <mingo@redhat.com>; Peter Zijlstra <peterz@infradead.org>; Juri Lelli <juri.lelli@redhat.com>; Vincent Guittot <vincent.guittot@linaro.org>; Dietmar Eggemann <dietmar.eggemann@arm.com>; Steven Rostedt <rostedt@goodmis.org>; Ben Segall <bsegall@google.com>; Mel Gorman <mgorman@suse.de>; Daniel Bristot de Oliveira <bristot@redhat.com>
抄送: Phil Auld <pauld@redhat.com>; Wenjie Li <wenjieli@qti.qualcomm.com>; David Wang 王标 <wangbiao3@xiaomi.com>; linux-kernel@vger.kernel.org; Waiman Long <longman@redhat.com>; stable@vger.kernel.org
主题: [External Mail][PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()

[外部邮件] 此邮件来源于小米公司外部,请谨慎处理。若对邮件安全性存疑,请将邮件转发给misec@xiaomi.com进行反馈

Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems"), the setting and clearing of user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock protection. When racing with the clearing of user_cpus_ptr in __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and double-free in arm64 kernel.

Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never be cleared in a task's lifetime. However, this bug was re-introduced in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in do_set_cpus_allowed(). This time, it will affect all arches.

Fix this bug by always clearing the user_cpus_ptr of the newly cloned/forked task before the copying process starts and check the user_cpus_ptr state of the source task under pi_lock.

Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.

Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable@vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8df51b08bb38..f2b75faaf71a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)  int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
                      int node)
 {
+       cpumask_t *user_mask;
        unsigned long flags;

+       /*
+        * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+        * may differ by now due to racing.
+        */
+       dst->user_cpus_ptr = NULL;
+
+       /*
+        * This check is racy and losing the race is a valid situation.
+        * It is not worth the extra overhead of taking the pi_lock on
+        * every fork/clone.
+        */
        if (!src->user_cpus_ptr)
                return 0;

-       dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-       if (!dst->user_cpus_ptr)
+       user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+       if (!user_mask)
                return -ENOMEM;

-       /* Use pi_lock to protect content of user_cpus_ptr */
+       /*
+        * Use pi_lock to protect content of user_cpus_ptr
+        *
+        * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+        * do_set_cpus_allowed().
+        */
        raw_spin_lock_irqsave(&src->pi_lock, flags);
-       cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+       if (src->user_cpus_ptr) {
+               swap(dst->user_cpus_ptr, user_mask);
+               cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+       }
        raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+       if (unlikely(user_mask))
+               kfree(user_mask);
+
        return 0;
 }

--
2.31.1

#/******本邮件及其附件含有小米公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 答复: [External Mail][PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-11-28 13:34 ` 答复: [External Mail][PATCH-tip] " David Wang 王标
@ 2022-11-28 15:43   ` Waiman Long
  2022-11-29  3:11     ` 答复: " David Wang 王标
  0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-11-28 15:43 UTC (permalink / raw)
  To: David Wang 王标,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, linux-kernel, stable


On 11/28/22 08:34, David Wang 王标 wrote:
> Hi, Waiman
>
> We use 140 devices to test this patch 72 hours.  The issue can not be reproduced.  If no this patch,  the issue can be reproduced.
> Could you help merge this patch to mailine?
>
> https://lore.kernel.org/all/20221125023943.1118603-1-longman@redhat.com/
>
> If this patch is applied to the maintainer's tree,  we can request google to help cherrypick to ACK to fix issue.

Just want to clarify if you are testing the patch using the latest tip 
tree or on top of an existing linux version without the persistent user 
requested affinity patchset.

PeterZ is the scheduler maintainer who is responsible for merging 
scheduler related patch. It is up to him as to when that will happen.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* 答复: 答复: [External Mail][PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-11-28 15:43   ` Waiman Long
@ 2022-11-29  3:11     ` David Wang 王标
  0 siblings, 0 replies; 10+ messages in thread
From: David Wang 王标 @ 2022-11-29  3:11 UTC (permalink / raw)
  To: Waiman Long, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, linux-kernel, stable

Hi  Waiman,  Peterz,

We test new patch basing on user requested affinity patchset(the latest tip tree).

You use 7 patch from ACK code . You check following  link change.
https://android-review.googlesource.com/c/kernel/common/+/2266724
https://android-review.googlesource.com/c/kernel/common/+/2266744
https://android-review.googlesource.com/c/kernel/common/+/2266745
https://android-review.googlesource.com/c/kernel/common/+/2266804
https://android-review.googlesource.com/c/kernel/common/+/2266784
https://android-review.googlesource.com/c/kernel/common/+/2267468
https://android-review.googlesource.com/c/kernel/common/+/2267664

You can confirm this with google team and check
https://partnerissuetracker.corp.google.com/u/0/issues/256578302 .

Hi  Peterz ,

Could you help merge
https://lore.kernel.org/all/20221125023943.1118603-1-longman@redhat.com/

We want to provide better product for user.
Thanks

David.

-----邮件原件-----
发件人: Waiman Long <longman@redhat.com>
发送时间: 2022年11月28日 23:43
收件人: David Wang 王标 <wangbiao3@xiaomi.com>; Ingo Molnar <mingo@redhat.com>; Peter Zijlstra <peterz@infradead.org>; Juri Lelli <juri.lelli@redhat.com>; Vincent Guittot <vincent.guittot@linaro.org>; Dietmar Eggemann <dietmar.eggemann@arm.com>; Steven Rostedt <rostedt@goodmis.org>; Ben Segall <bsegall@google.com>; Mel Gorman <mgorman@suse.de>; Daniel Bristot de Oliveira <bristot@redhat.com>
抄送: Phil Auld <pauld@redhat.com>; Wenjie Li <wenjieli@qti.qualcomm.com>; linux-kernel@vger.kernel.org; stable@vger.kernel.org
主题: Re: 答复: [External Mail][PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()

[外部邮件] 此邮件来源于小米公司外部,请谨慎处理。若对邮件安全性存疑,请将邮件转发给misec@xiaomi.com进行反馈

On 11/28/22 08:34, David Wang 王标 wrote:
> Hi, Waiman
>
> We use 140 devices to test this patch 72 hours.  The issue can not be reproduced.  If no this patch,  the issue can be reproduced.
> Could you help merge this patch to mailine?
>
> https://lore.kernel.org/all/20221125023943.1118603-1-longman@redhat.co
> m/
>
> If this patch is applied to the maintainer's tree,  we can request google to help cherrypick to ACK to fix issue.

Just want to clarify if you are testing the patch using the latest tip tree or on top of an existing linux version without the persistent user requested affinity patchset.

PeterZ is the scheduler maintainer who is responsible for merging scheduler related patch. It is up to him as to when that will happen.

Cheers,
Longman

#/******本邮件及其附件含有小米公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-11-28  1:44 [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
  2022-11-28 13:34 ` 答复: [External Mail][PATCH-tip] " David Wang 王标
@ 2022-12-01 13:44 ` Will Deacon
  2022-12-01 17:03   ` Waiman Long
  1 sibling, 1 reply; 10+ messages in thread
From: Will Deacon @ 2022-12-01 13:44 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
> Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
> restricted on asymmetric systems"), the setting and clearing of
> user_cpus_ptr are done under pi_lock for arm64 architecture. However,
> dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
> protection. When racing with the clearing of user_cpus_ptr in
> __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
> double-free in arm64 kernel.
> 
> Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> cpumask") fixes this problem as user_cpus_ptr, once set, will never
> be cleared in a task's lifetime. However, this bug was re-introduced
> in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
> do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
> do_set_cpus_allowed(). This time, it will affect all arches.
> 
> Fix this bug by always clearing the user_cpus_ptr of the newly
> cloned/forked task before the copying process starts and check the
> user_cpus_ptr state of the source task under pi_lock.
> 
> Note to stable, this patch won't be applicable to stable releases.
> Just copy the new dup_user_cpus_ptr() function over.
> 
> Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> CC: stable@vger.kernel.org
> Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
>  1 file changed, 28 insertions(+), 4 deletions(-)

As per my comments on the previous version of this patch:

https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t

I think there are other issues to fix when racing affinity changes with
fork() too.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8df51b08bb38..f2b75faaf71a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
>  		      int node)
>  {
> +	cpumask_t *user_mask;
>  	unsigned long flags;
>  
> +	/*
> +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
> +	 * may differ by now due to racing.
> +	 */
> +	dst->user_cpus_ptr = NULL;
> +
> +	/*
> +	 * This check is racy and losing the race is a valid situation.
> +	 * It is not worth the extra overhead of taking the pi_lock on
> +	 * every fork/clone.
> +	 */
>  	if (!src->user_cpus_ptr)
>  		return 0;

data_race() ?

>  
> -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> -	if (!dst->user_cpus_ptr)
> +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> +	if (!user_mask)
>  		return -ENOMEM;
>  
> -	/* Use pi_lock to protect content of user_cpus_ptr */
> +	/*
> +	 * Use pi_lock to protect content of user_cpus_ptr
> +	 *
> +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
> +	 * do_set_cpus_allowed().
> +	 */
>  	raw_spin_lock_irqsave(&src->pi_lock, flags);
> -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> +	if (src->user_cpus_ptr) {
> +		swap(dst->user_cpus_ptr, user_mask);

Isn't 'dst->user_cpus_ptr' always NULL here? Why do we need the swap()
instead of just assigning the thing directly?

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-01 13:44 ` [PATCH-tip] " Will Deacon
@ 2022-12-01 17:03   ` Waiman Long
  2022-12-02 10:18     ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-12-01 17:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On 12/1/22 08:44, Will Deacon wrote:
> On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
>> Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
>> restricted on asymmetric systems"), the setting and clearing of
>> user_cpus_ptr are done under pi_lock for arm64 architecture. However,
>> dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
>> protection. When racing with the clearing of user_cpus_ptr in
>> __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
>> double-free in arm64 kernel.
>>
>> Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask") fixes this problem as user_cpus_ptr, once set, will never
>> be cleared in a task's lifetime. However, this bug was re-introduced
>> in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>> do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
>> do_set_cpus_allowed(). This time, it will affect all arches.
>>
>> Fix this bug by always clearing the user_cpus_ptr of the newly
>> cloned/forked task before the copying process starts and check the
>> user_cpus_ptr state of the source task under pi_lock.
>>
>> Note to stable, this patch won't be applicable to stable releases.
>> Just copy the new dup_user_cpus_ptr() function over.
>>
>> Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>> CC: stable@vger.kernel.org
>> Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
>>   1 file changed, 28 insertions(+), 4 deletions(-)
> As per my comments on the previous version of this patch:
>
> https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
>
> I think there are other issues to fix when racing affinity changes with
> fork() too.
It is certainly possible that there are other bugs hiding somewhere:-)
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 8df51b08bb38..f2b75faaf71a 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>>   int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
>>   		      int node)
>>   {
>> +	cpumask_t *user_mask;
>>   	unsigned long flags;
>>   
>> +	/*
>> +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
>> +	 * may differ by now due to racing.
>> +	 */
>> +	dst->user_cpus_ptr = NULL;
>> +
>> +	/*
>> +	 * This check is racy and losing the race is a valid situation.
>> +	 * It is not worth the extra overhead of taking the pi_lock on
>> +	 * every fork/clone.
>> +	 */
>>   	if (!src->user_cpus_ptr)
>>   		return 0;
> data_race() ?
Race is certainly possible, but the clearing of user_cpus_ptr before 
will mitigate any risk.
>
>>   
>> -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
>> -	if (!dst->user_cpus_ptr)
>> +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
>> +	if (!user_mask)
>>   		return -ENOMEM;
>>   
>> -	/* Use pi_lock to protect content of user_cpus_ptr */
>> +	/*
>> +	 * Use pi_lock to protect content of user_cpus_ptr
>> +	 *
>> +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
>> +	 * do_set_cpus_allowed().
>> +	 */
>>   	raw_spin_lock_irqsave(&src->pi_lock, flags);
>> -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
>> +	if (src->user_cpus_ptr) {
>> +		swap(dst->user_cpus_ptr, user_mask);
> Isn't 'dst->user_cpus_ptr' always NULL here? Why do we need the swap()
> instead of just assigning the thing directly?

True. We still need to clear user_mask. So I used swap() instead of 2 
assignment statements. I am fine to go with either way.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-01 17:03   ` Waiman Long
@ 2022-12-02 10:18     ` Will Deacon
  2022-12-02 14:30       ` Waiman Long
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2022-12-02 10:18 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On Thu, Dec 01, 2022 at 12:03:39PM -0500, Waiman Long wrote:
> On 12/1/22 08:44, Will Deacon wrote:
> > On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
> > > Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
> > > restricted on asymmetric systems"), the setting and clearing of
> > > user_cpus_ptr are done under pi_lock for arm64 architecture. However,
> > > dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
> > > protection. When racing with the clearing of user_cpus_ptr in
> > > __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
> > > double-free in arm64 kernel.
> > > 
> > > Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> > > cpumask") fixes this problem as user_cpus_ptr, once set, will never
> > > be cleared in a task's lifetime. However, this bug was re-introduced
> > > in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
> > > do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
> > > do_set_cpus_allowed(). This time, it will affect all arches.
> > > 
> > > Fix this bug by always clearing the user_cpus_ptr of the newly
> > > cloned/forked task before the copying process starts and check the
> > > user_cpus_ptr state of the source task under pi_lock.
> > > 
> > > Note to stable, this patch won't be applicable to stable releases.
> > > Just copy the new dup_user_cpus_ptr() function over.
> > > 
> > > Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
> > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > CC: stable@vger.kernel.org
> > > Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > >   kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
> > >   1 file changed, 28 insertions(+), 4 deletions(-)
> > As per my comments on the previous version of this patch:
> > 
> > https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
> > 
> > I think there are other issues to fix when racing affinity changes with
> > fork() too.
> It is certainly possible that there are other bugs hiding somewhere:-)

Right, but I actually took the time to hit the same race for the other
affinity mask field so it seems a bit narrow-minded for us just to fix the
one issue.

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 8df51b08bb38..f2b75faaf71a 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
> > >   int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
> > >   		      int node)
> > >   {
> > > +	cpumask_t *user_mask;
> > >   	unsigned long flags;
> > > +	/*
> > > +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
> > > +	 * may differ by now due to racing.
> > > +	 */
> > > +	dst->user_cpus_ptr = NULL;
> > > +
> > > +	/*
> > > +	 * This check is racy and losing the race is a valid situation.
> > > +	 * It is not worth the extra overhead of taking the pi_lock on
> > > +	 * every fork/clone.
> > > +	 */
> > >   	if (!src->user_cpus_ptr)
> > >   		return 0;
> > data_race() ?
> Race is certainly possible, but the clearing of user_cpus_ptr before will
> mitigate any risk.

Sorry, I meant let's wrap this access in the data_race() macro and add a
comment so that KCSAN won't report the false positive.

> > > -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > > -	if (!dst->user_cpus_ptr)
> > > +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > > +	if (!user_mask)
> > >   		return -ENOMEM;
> > > -	/* Use pi_lock to protect content of user_cpus_ptr */
> > > +	/*
> > > +	 * Use pi_lock to protect content of user_cpus_ptr
> > > +	 *
> > > +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
> > > +	 * do_set_cpus_allowed().
> > > +	 */
> > >   	raw_spin_lock_irqsave(&src->pi_lock, flags);
> > > -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> > > +	if (src->user_cpus_ptr) {
> > > +		swap(dst->user_cpus_ptr, user_mask);
> > Isn't 'dst->user_cpus_ptr' always NULL here? Why do we need the swap()
> > instead of just assigning the thing directly?
> 
> True. We still need to clear user_mask. So I used swap() instead of 2
> assignment statements. I am fine to go with either way.

I found it a bit bizarre at first, but on reflection it makes sense.

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-02 10:18     ` Will Deacon
@ 2022-12-02 14:30       ` Waiman Long
  2022-12-13 12:54         ` Will Deacon
  0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-12-02 14:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On 12/2/22 05:18, Will Deacon wrote:
> On Thu, Dec 01, 2022 at 12:03:39PM -0500, Waiman Long wrote:
>> On 12/1/22 08:44, Will Deacon wrote:
>>> On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
>>>> Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
>>>> restricted on asymmetric systems"), the setting and clearing of
>>>> user_cpus_ptr are done under pi_lock for arm64 architecture. However,
>>>> dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
>>>> protection. When racing with the clearing of user_cpus_ptr in
>>>> __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
>>>> double-free in arm64 kernel.
>>>>
>>>> Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>> cpumask") fixes this problem as user_cpus_ptr, once set, will never
>>>> be cleared in a task's lifetime. However, this bug was re-introduced
>>>> in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>>>> do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
>>>> do_set_cpus_allowed(). This time, it will affect all arches.
>>>>
>>>> Fix this bug by always clearing the user_cpus_ptr of the newly
>>>> cloned/forked task before the copying process starts and check the
>>>> user_cpus_ptr state of the source task under pi_lock.
>>>>
>>>> Note to stable, this patch won't be applicable to stable releases.
>>>> Just copy the new dup_user_cpus_ptr() function over.
>>>>
>>>> Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>>>> CC: stable@vger.kernel.org
>>>> Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>>    kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
>>>>    1 file changed, 28 insertions(+), 4 deletions(-)
>>> As per my comments on the previous version of this patch:
>>>
>>> https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
>>>
>>> I think there are other issues to fix when racing affinity changes with
>>> fork() too.
>> It is certainly possible that there are other bugs hiding somewhere:-)
> Right, but I actually took the time to hit the same race for the other
> affinity mask field so it seems a bit narrow-minded for us just to fix the
> one issue.

I focused on this particular one because of a double-free bug report 
from David. What other fields have you found to be subjected to data race?

>
>>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>>> index 8df51b08bb38..f2b75faaf71a 100644
>>>> --- a/kernel/sched/core.c
>>>> +++ b/kernel/sched/core.c
>>>> @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>>>>    int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
>>>>    		      int node)
>>>>    {
>>>> +	cpumask_t *user_mask;
>>>>    	unsigned long flags;
>>>> +	/*
>>>> +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
>>>> +	 * may differ by now due to racing.
>>>> +	 */
>>>> +	dst->user_cpus_ptr = NULL;
>>>> +
>>>> +	/*
>>>> +	 * This check is racy and losing the race is a valid situation.
>>>> +	 * It is not worth the extra overhead of taking the pi_lock on
>>>> +	 * every fork/clone.
>>>> +	 */
>>>>    	if (!src->user_cpus_ptr)
>>>>    		return 0;
>>> data_race() ?
>> Race is certainly possible, but the clearing of user_cpus_ptr before will
>> mitigate any risk.
> Sorry, I meant let's wrap this access in the data_race() macro and add a
> comment so that KCSAN won't report the false positive.

Good point. I should have done that.

Thanks,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-02 14:30       ` Waiman Long
@ 2022-12-13 12:54         ` Will Deacon
  2022-12-13 15:54           ` Waiman Long
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2022-12-13 12:54 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On Fri, Dec 02, 2022 at 09:30:56AM -0500, Waiman Long wrote:
> On 12/2/22 05:18, Will Deacon wrote:
> > On Thu, Dec 01, 2022 at 12:03:39PM -0500, Waiman Long wrote:
> > > On 12/1/22 08:44, Will Deacon wrote:
> > > > On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
> > > > > Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
> > > > > restricted on asymmetric systems"), the setting and clearing of
> > > > > user_cpus_ptr are done under pi_lock for arm64 architecture. However,
> > > > > dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
> > > > > protection. When racing with the clearing of user_cpus_ptr in
> > > > > __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
> > > > > double-free in arm64 kernel.
> > > > > 
> > > > > Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> > > > > cpumask") fixes this problem as user_cpus_ptr, once set, will never
> > > > > be cleared in a task's lifetime. However, this bug was re-introduced
> > > > > in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
> > > > > do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
> > > > > do_set_cpus_allowed(). This time, it will affect all arches.
> > > > > 
> > > > > Fix this bug by always clearing the user_cpus_ptr of the newly
> > > > > cloned/forked task before the copying process starts and check the
> > > > > user_cpus_ptr state of the source task under pi_lock.
> > > > > 
> > > > > Note to stable, this patch won't be applicable to stable releases.
> > > > > Just copy the new dup_user_cpus_ptr() function over.
> > > > > 
> > > > > Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
> > > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > > > CC: stable@vger.kernel.org
> > > > > Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
> > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > ---
> > > > >    kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
> > > > >    1 file changed, 28 insertions(+), 4 deletions(-)
> > > > As per my comments on the previous version of this patch:
> > > > 
> > > > https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
> > > > 
> > > > I think there are other issues to fix when racing affinity changes with
> > > > fork() too.
> > > It is certainly possible that there are other bugs hiding somewhere:-)
> > Right, but I actually took the time to hit the same race for the other
> > affinity mask field so it seems a bit narrow-minded for us just to fix the
> > one issue.
> 
> I focused on this particular one because of a double-free bug report from
> David. What other fields have you found to be subjected to data race?

See my other report linked above where we race on 'task_struct::cpus_mask'.

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-13 12:54         ` Will Deacon
@ 2022-12-13 15:54           ` Waiman Long
  0 siblings, 0 replies; 10+ messages in thread
From: Waiman Long @ 2022-12-13 15:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel, stable

On 12/13/22 07:54, Will Deacon wrote:
> On Fri, Dec 02, 2022 at 09:30:56AM -0500, Waiman Long wrote:
>> On 12/2/22 05:18, Will Deacon wrote:
>>> On Thu, Dec 01, 2022 at 12:03:39PM -0500, Waiman Long wrote:
>>>> On 12/1/22 08:44, Will Deacon wrote:
>>>>> On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
>>>>>> Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
>>>>>> restricted on asymmetric systems"), the setting and clearing of
>>>>>> user_cpus_ptr are done under pi_lock for arm64 architecture. However,
>>>>>> dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
>>>>>> protection. When racing with the clearing of user_cpus_ptr in
>>>>>> __set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
>>>>>> double-free in arm64 kernel.
>>>>>>
>>>>>> Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>>>> cpumask") fixes this problem as user_cpus_ptr, once set, will never
>>>>>> be cleared in a task's lifetime. However, this bug was re-introduced
>>>>>> in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>>>>>> do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
>>>>>> do_set_cpus_allowed(). This time, it will affect all arches.
>>>>>>
>>>>>> Fix this bug by always clearing the user_cpus_ptr of the newly
>>>>>> cloned/forked task before the copying process starts and check the
>>>>>> user_cpus_ptr state of the source task under pi_lock.
>>>>>>
>>>>>> Note to stable, this patch won't be applicable to stable releases.
>>>>>> Just copy the new dup_user_cpus_ptr() function over.
>>>>>>
>>>>>> Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
>>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>>>>>> CC: stable@vger.kernel.org
>>>>>> Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
>>>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>>>> ---
>>>>>>     kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
>>>>>>     1 file changed, 28 insertions(+), 4 deletions(-)
>>>>> As per my comments on the previous version of this patch:
>>>>>
>>>>> https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
>>>>>
>>>>> I think there are other issues to fix when racing affinity changes with
>>>>> fork() too.
>>>> It is certainly possible that there are other bugs hiding somewhere:-)
>>> Right, but I actually took the time to hit the same race for the other
>>> affinity mask field so it seems a bit narrow-minded for us just to fix the
>>> one issue.
>> I focused on this particular one because of a double-free bug report from
>> David. What other fields have you found to be subjected to data race?
> See my other report linked above where we race on 'task_struct::cpus_mask'.

So you are referring to the fact a task structure may be changed while 
it is being copied to a child process at the same time. I think it is a 
hard problem to fix as I am not aware of a way to freeze the content of 
the task structure while the copying is in progress. There are just too 
many fields in the task structures that can be changed in many different 
contexts from different CPUs.

Anyway, this dup_user_cpus_ptr() bug is not related to racing in this 
copying process, it is caused by a race after that. I think it may be 
worthwhile to put a note about possible race in the dup_task_struct() 
process but I can't think of a good way to fix it.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-12-13 15:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-28  1:44 [PATCH-tip] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
2022-11-28 13:34 ` 答复: [External Mail][PATCH-tip] " David Wang 王标
2022-11-28 15:43   ` Waiman Long
2022-11-29  3:11     ` 答复: " David Wang 王标
2022-12-01 13:44 ` [PATCH-tip] " Will Deacon
2022-12-01 17:03   ` Waiman Long
2022-12-02 10:18     ` Will Deacon
2022-12-02 14:30       ` Waiman Long
2022-12-13 12:54         ` Will Deacon
2022-12-13 15:54           ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).