linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
@ 2022-11-25  2:39 Waiman Long
  2022-11-28  1:43 ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-11-25  2:39 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, David Wang 王标,
	linux-kernel, Waiman Long

In general, a non-null user_cpus_ptr will remain set until the task dies.
A possible exception to this is the fact that do_set_cpus_allowed()
will clear a non-null user_cpus_ptr. To allow this possible racing
condition, we need to check for NULL user_cpus_ptr under the pi_lock
before duping the user mask.

Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

 [v4] Minor comment update

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8df51b08bb38..f2b75faaf71a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 		      int node)
 {
+	cpumask_t *user_mask;
 	unsigned long flags;
 
+	/*
+	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+	 * may differ by now due to racing.
+	 */
+	dst->user_cpus_ptr = NULL;
+
+	/*
+	 * This check is racy and losing the race is a valid situation.
+	 * It is not worth the extra overhead of taking the pi_lock on
+	 * every fork/clone.
+	 */
 	if (!src->user_cpus_ptr)
 		return 0;
 
-	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-	if (!dst->user_cpus_ptr)
+	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+	if (!user_mask)
 		return -ENOMEM;
 
-	/* Use pi_lock to protect content of user_cpus_ptr */
+	/*
+	 * Use pi_lock to protect content of user_cpus_ptr
+	 *
+	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+	 * do_set_cpus_allowed().
+	 */
 	raw_spin_lock_irqsave(&src->pi_lock, flags);
-	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	if (src->user_cpus_ptr) {
+		swap(dst->user_cpus_ptr, user_mask);
+		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	}
 	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+	if (unlikely(user_mask))
+		kfree(user_mask);
+
 	return 0;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-25  2:39 [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr() Waiman Long
@ 2022-11-28  1:43 ` Waiman Long
  2022-11-28 12:00   ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-11-28  1:43 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira
  Cc: Phil Auld, Wenjie Li, David Wang 王标, linux-kernel

On 11/24/22 21:39, Waiman Long wrote:
> In general, a non-null user_cpus_ptr will remain set until the task dies.
> A possible exception to this is the fact that do_set_cpus_allowed()
> will clear a non-null user_cpus_ptr. To allow this possible racing
> condition, we need to check for NULL user_cpus_ptr under the pi_lock
> before duping the user mask.
>
> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> Signed-off-by: Waiman Long <longman@redhat.com>

This is actually a pre-existing use-after-free bug since commit 
07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on 
asymmetric systems"). So it needs to be fixed in the stable release as 
well. Will resend the patch with an additional fixes tag and updated 
commit log.

Cheers,
Longman

> ---
>   kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
>   1 file changed, 28 insertions(+), 4 deletions(-)
>
>   [v4] Minor comment update
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8df51b08bb38..f2b75faaf71a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>   int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
>   		      int node)
>   {
> +	cpumask_t *user_mask;
>   	unsigned long flags;
>   
> +	/*
> +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
> +	 * may differ by now due to racing.
> +	 */
> +	dst->user_cpus_ptr = NULL;
> +
> +	/*
> +	 * This check is racy and losing the race is a valid situation.
> +	 * It is not worth the extra overhead of taking the pi_lock on
> +	 * every fork/clone.
> +	 */
>   	if (!src->user_cpus_ptr)
>   		return 0;
>   
> -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> -	if (!dst->user_cpus_ptr)
> +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> +	if (!user_mask)
>   		return -ENOMEM;
>   
> -	/* Use pi_lock to protect content of user_cpus_ptr */
> +	/*
> +	 * Use pi_lock to protect content of user_cpus_ptr
> +	 *
> +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
> +	 * do_set_cpus_allowed().
> +	 */
>   	raw_spin_lock_irqsave(&src->pi_lock, flags);
> -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> +	if (src->user_cpus_ptr) {
> +		swap(dst->user_cpus_ptr, user_mask);
> +		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> +	}
>   	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
> +
> +	if (unlikely(user_mask))
> +		kfree(user_mask);
> +
>   	return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-28  1:43 ` Waiman Long
@ 2022-11-28 12:00   ` Will Deacon
  2022-11-28 15:11     ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2022-11-28 12:00 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
> On 11/24/22 21:39, Waiman Long wrote:
> > In general, a non-null user_cpus_ptr will remain set until the task dies.
> > A possible exception to this is the fact that do_set_cpus_allowed()
> > will clear a non-null user_cpus_ptr. To allow this possible racing
> > condition, we need to check for NULL user_cpus_ptr under the pi_lock
> > before duping the user mask.
> > 
> > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > Signed-off-by: Waiman Long <longman@redhat.com>
> 
> This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
> So it needs to be fixed in the stable release as well. Will resend the patch
> with an additional fixes tag and updated commit log.

Please can you elaborate on the use-after-free here? Looking at
07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
has dropped to zero and I can't see how that can race with fork().

What am I missing?

Will

> >   kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
> >   1 file changed, 28 insertions(+), 4 deletions(-)
> > 
> >   [v4] Minor comment update
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8df51b08bb38..f2b75faaf71a 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
> >   int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
> >   		      int node)
> >   {
> > +	cpumask_t *user_mask;
> >   	unsigned long flags;
> > +	/*
> > +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
> > +	 * may differ by now due to racing.
> > +	 */
> > +	dst->user_cpus_ptr = NULL;
> > +
> > +	/*
> > +	 * This check is racy and losing the race is a valid situation.
> > +	 * It is not worth the extra overhead of taking the pi_lock on
> > +	 * every fork/clone.
> > +	 */
> >   	if (!src->user_cpus_ptr)
> >   		return 0;
> > -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > -	if (!dst->user_cpus_ptr)
> > +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > +	if (!user_mask)
> >   		return -ENOMEM;
> > -	/* Use pi_lock to protect content of user_cpus_ptr */
> > +	/*
> > +	 * Use pi_lock to protect content of user_cpus_ptr
> > +	 *
> > +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
> > +	 * do_set_cpus_allowed().
> > +	 */
> >   	raw_spin_lock_irqsave(&src->pi_lock, flags);
> > -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> > +	if (src->user_cpus_ptr) {
> > +		swap(dst->user_cpus_ptr, user_mask);
> > +		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> > +	}
> >   	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
> > +
> > +	if (unlikely(user_mask))
> > +		kfree(user_mask);
> > +
> >   	return 0;
> >   }
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-28 12:00   ` Will Deacon
@ 2022-11-28 15:11     ` Waiman Long
  2022-11-29 14:07       ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-11-28 15:11 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On 11/28/22 07:00, Will Deacon wrote:
> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>> On 11/24/22 21:39, Waiman Long wrote:
>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>> A possible exception to this is the fact that do_set_cpus_allowed()
>>> will clear a non-null user_cpus_ptr. To allow this possible racing
>>> condition, we need to check for NULL user_cpus_ptr under the pi_lock
>>> before duping the user mask.
>>>
>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>> This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>> So it needs to be fixed in the stable release as well. Will resend the patch
>> with an additional fixes tag and updated commit log.
> Please can you elaborate on the use-after-free here? Looking at
> 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
> has dropped to zero and I can't see how that can race with fork().
>
> What am I missing?

I missed that at first. The current task cloning process copies the 
content of the task structure over to the newly cloned/forked task. IOW, 
if user_cpus_ptr had been set up previously, it will be copied over to 
the cloned task. Now if user_cpus_ptr of the source task is cleared 
right after that and before dup_user_cpus_ptr() is called. The obsolete 
user_cpus_ptr value in the cloned task will remain and get used even if 
it has been freed. That is what I call as use-after-free and double-free.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-28 15:11     ` Waiman Long
@ 2022-11-29 14:07       ` Will Deacon
  2022-11-29 15:32         ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2022-11-29 14:07 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
> On 11/28/22 07:00, Will Deacon wrote:
> > On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
> > > On 11/24/22 21:39, Waiman Long wrote:
> > > > In general, a non-null user_cpus_ptr will remain set until the task dies.
> > > > A possible exception to this is the fact that do_set_cpus_allowed()
> > > > will clear a non-null user_cpus_ptr. To allow this possible racing
> > > > condition, we need to check for NULL user_cpus_ptr under the pi_lock
> > > > before duping the user mask.
> > > > 
> > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
> > > ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
> > > So it needs to be fixed in the stable release as well. Will resend the patch
> > > with an additional fixes tag and updated commit log.
> > Please can you elaborate on the use-after-free here? Looking at
> > 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
> > has dropped to zero and I can't see how that can race with fork().
> > 
> > What am I missing?
> 
> I missed that at first. The current task cloning process copies the content
> of the task structure over to the newly cloned/forked task. IOW, if
> user_cpus_ptr had been set up previously, it will be copied over to the
> cloned task. Now if user_cpus_ptr of the source task is cleared right after
> that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
> value in the cloned task will remain and get used even if it has been freed.
> That is what I call as use-after-free and double-free.

If the parent task can be modified concurrently with dup_task_struct() then
surely we'd have bigger issues because that's not going to be atomic? At the
very least we'd have a data race, but it also feels like we could end up
with inconsistent task state in the child. In fact, couldn't the normal
'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?

Or am I still failing to understand the race?

Will

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-29 14:07       ` Will Deacon
@ 2022-11-29 15:32         ` Waiman Long
  2022-11-29 15:57           ` Will Deacon
  0 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2022-11-29 15:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On 11/29/22 09:07, Will Deacon wrote:
> On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
>> On 11/28/22 07:00, Will Deacon wrote:
>>> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>>>> On 11/24/22 21:39, Waiman Long wrote:
>>>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>>>> A possible exception to this is the fact that do_set_cpus_allowed()
>>>>> will clear a non-null user_cpus_ptr. To allow this possible racing
>>>>> condition, we need to check for NULL user_cpus_ptr under the pi_lock
>>>>> before duping the user mask.
>>>>>
>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
>>>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>>>> So it needs to be fixed in the stable release as well. Will resend the patch
>>>> with an additional fixes tag and updated commit log.
>>> Please can you elaborate on the use-after-free here? Looking at
>>> 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
>>> has dropped to zero and I can't see how that can race with fork().
>>>
>>> What am I missing?
>> I missed that at first. The current task cloning process copies the content
>> of the task structure over to the newly cloned/forked task. IOW, if
>> user_cpus_ptr had been set up previously, it will be copied over to the
>> cloned task. Now if user_cpus_ptr of the source task is cleared right after
>> that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
>> value in the cloned task will remain and get used even if it has been freed.
>> That is what I call as use-after-free and double-free.
> If the parent task can be modified concurrently with dup_task_struct() then
> surely we'd have bigger issues because that's not going to be atomic? At the
> very least we'd have a data race, but it also feels like we could end up
> with inconsistent task state in the child. In fact, couldn't the normal
> 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
>
> Or am I still failing to understand the race?
>
> Will

A major difference between cpus_mask and user_cpus_ptr is that for 
cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr 
is a pointer to an external bitmap. So there is no issue of 
use-after-free wrt cpus_mask. That is not the case where the memory of 
the user_cpus_ptr of the parent task is freed, but then a reference to 
that memory is still available in the child's task struct and may be used.

Note that the problematic concurrence is not between the copying of task 
struct and changing of the task struct. It is what will happen after the 
task struct copying has already been done with an extra reference 
present in the child's task struct.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-29 15:32         ` Waiman Long
@ 2022-11-29 15:57           ` Will Deacon
  2022-11-29 16:03             ` Waiman Long
  0 siblings, 1 reply; 11+ messages in thread
From: Will Deacon @ 2022-11-29 15:57 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
> On 11/29/22 09:07, Will Deacon wrote:
> > On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
> > > On 11/28/22 07:00, Will Deacon wrote:
> > > > On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
> > > > > On 11/24/22 21:39, Waiman Long wrote:
> > > > > > In general, a non-null user_cpus_ptr will remain set until the task dies.
> > > > > > A possible exception to this is the fact that do_set_cpus_allowed()
> > > > > > will clear a non-null user_cpus_ptr. To allow this possible racing
> > > > > > condition, we need to check for NULL user_cpus_ptr under the pi_lock
> > > > > > before duping the user mask.
> > > > > > 
> > > > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
> > > > > ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
> > > > > So it needs to be fixed in the stable release as well. Will resend the patch
> > > > > with an additional fixes tag and updated commit log.
> > > > Please can you elaborate on the use-after-free here? Looking at
> > > > 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
> > > > has dropped to zero and I can't see how that can race with fork().
> > > > 
> > > > What am I missing?
> > > I missed that at first. The current task cloning process copies the content
> > > of the task structure over to the newly cloned/forked task. IOW, if
> > > user_cpus_ptr had been set up previously, it will be copied over to the
> > > cloned task. Now if user_cpus_ptr of the source task is cleared right after
> > > that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
> > > value in the cloned task will remain and get used even if it has been freed.
> > > That is what I call as use-after-free and double-free.
> > If the parent task can be modified concurrently with dup_task_struct() then
> > surely we'd have bigger issues because that's not going to be atomic? At the
> > very least we'd have a data race, but it also feels like we could end up
> > with inconsistent task state in the child. In fact, couldn't the normal
> > 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
> > 
> > Or am I still failing to understand the race?
> > 
> A major difference between cpus_mask and user_cpus_ptr is that for
> cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr is
> a pointer to an external bitmap. So there is no issue of use-after-free wrt
> cpus_mask. That is not the case where the memory of the user_cpus_ptr of the
> parent task is freed, but then a reference to that memory is still available
> in the child's task struct and may be used.

Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned that we
could corrupt the data and end up with an affinity mask that doesn't correspond
to anything meaningful. Do you agree that's possible?

> Note that the problematic concurrence is not between the copying of task
> struct and changing of the task struct. It is what will happen after the
> task struct copying has already been done with an extra reference present in
> the child's task struct.

Well, sort of, but the child only has the extra reference _because_ the parent
pointer was concurrently cleared to NULL, otherwise dup_user_cpus_ptr() would
have allocated a new copy and we'd be ok, no?

Overall, I'm just very wary that we seem to be saying that copy_process()
can run concurrently with changes to the parent. Maybe it's all been written
with that in mindi (including all the arch callbacks), but I'd be astonished
if this is the only problem in there.

Will

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-29 15:57           ` Will Deacon
@ 2022-11-29 16:03             ` Waiman Long
  2022-11-30 11:51               ` 答复: [External Mail]Re: " David Wang 王标
  2022-12-01 13:36               ` Will Deacon
  0 siblings, 2 replies; 11+ messages in thread
From: Waiman Long @ 2022-11-29 16:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On 11/29/22 10:57, Will Deacon wrote:
> On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
>> On 11/29/22 09:07, Will Deacon wrote:
>>> On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
>>>> On 11/28/22 07:00, Will Deacon wrote:
>>>>> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>>>>>> On 11/24/22 21:39, Waiman Long wrote:
>>>>>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>>>>>> A possible exception to this is the fact that do_set_cpus_allowed()
>>>>>>> will clear a non-null user_cpus_ptr. To allow this possible racing
>>>>>>> condition, we need to check for NULL user_cpus_ptr under the pi_lock
>>>>>>> before duping the user mask.
>>>>>>>
>>>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
>>>>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>>>> This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
>>>>>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>>>>>> So it needs to be fixed in the stable release as well. Will resend the patch
>>>>>> with an additional fixes tag and updated commit log.
>>>>> Please can you elaborate on the use-after-free here? Looking at
>>>>> 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
>>>>> has dropped to zero and I can't see how that can race with fork().
>>>>>
>>>>> What am I missing?
>>>> I missed that at first. The current task cloning process copies the content
>>>> of the task structure over to the newly cloned/forked task. IOW, if
>>>> user_cpus_ptr had been set up previously, it will be copied over to the
>>>> cloned task. Now if user_cpus_ptr of the source task is cleared right after
>>>> that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
>>>> value in the cloned task will remain and get used even if it has been freed.
>>>> That is what I call as use-after-free and double-free.
>>> If the parent task can be modified concurrently with dup_task_struct() then
>>> surely we'd have bigger issues because that's not going to be atomic? At the
>>> very least we'd have a data race, but it also feels like we could end up
>>> with inconsistent task state in the child. In fact, couldn't the normal
>>> 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
>>>
>>> Or am I still failing to understand the race?
>>>
>> A major difference between cpus_mask and user_cpus_ptr is that for
>> cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr is
>> a pointer to an external bitmap. So there is no issue of use-after-free wrt
>> cpus_mask. That is not the case where the memory of the user_cpus_ptr of the
>> parent task is freed, but then a reference to that memory is still available
>> in the child's task struct and may be used.
> Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned that we
> could corrupt the data and end up with an affinity mask that doesn't correspond
> to anything meaningful. Do you agree that's possible?
That is certainly possible. So we have to be careful about it.
>
>> Note that the problematic concurrence is not between the copying of task
>> struct and changing of the task struct. It is what will happen after the
>> task struct copying has already been done with an extra reference present in
>> the child's task struct.
> Well, sort of, but the child only has the extra reference _because_ the parent
> pointer was concurrently cleared to NULL, otherwise dup_user_cpus_ptr() would
> have allocated a new copy and we'd be ok, no?
Yes, that is exactly where the problem is and this is what my patch is 
trying to fix.
>
> Overall, I'm just very wary that we seem to be saying that copy_process()
> can run concurrently with changes to the parent. Maybe it's all been written
> with that in mindi (including all the arch callbacks), but I'd be astonished
> if this is the only problem in there.

It seems like that, at least in some cases, the clearing of a task's 
user_cpus_ptr can be done by another task. So the parent may be unaware 
of it and so is not its fault.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* 答复: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-29 16:03             ` Waiman Long
@ 2022-11-30 11:51               ` David Wang 王标
  2022-11-30 14:07                 ` Wenjie Li (Evan)
  2022-12-01 13:36               ` Will Deacon
  1 sibling, 1 reply; 11+ messages in thread
From: David Wang 王标 @ 2022-11-30 11:51 UTC (permalink / raw)
  To: Waiman Long, Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, linux-kernel,
	quic_wenjieli, 陈海华, 杨健龙,
	刘添威, 马振华,
	Ting11 Wang 王婷

Dear Will,  Longman,

Could we fix the issue first we met?  We can analyze other issue later.

Thanks


-----邮件原件-----
发件人: Waiman Long <longman@redhat.com>
发送时间: 2022年11月30日 0:04
收件人: Will Deacon <will@kernel.org>
抄送: Ingo Molnar <mingo@redhat.com>; Peter Zijlstra <peterz@infradead.org>; Juri Lelli <juri.lelli@redhat.com>; Vincent Guittot <vincent.guittot@linaro.org>; Dietmar Eggemann <dietmar.eggemann@arm.com>; Steven Rostedt <rostedt@goodmis.org>; Ben Segall <bsegall@google.com>; Mel Gorman <mgorman@suse.de>; Daniel Bristot de Oliveira <bristot@redhat.com>; Phil Auld <pauld@redhat.com>; Wenjie Li <wenjieli@qti.qualcomm.com>; David Wang 王标 <wangbiao3@xiaomi.com>; linux-kernel@vger.kernel.org
主题: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()

[外部邮件] 此邮件来源于小米公司外部,请谨慎处理。若对邮件安全性存疑,请将邮件转发给misec@xiaomi.com进行反馈

On 11/29/22 10:57, Will Deacon wrote:
> On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
>> On 11/29/22 09:07, Will Deacon wrote:
>>> On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
>>>> On 11/28/22 07:00, Will Deacon wrote:
>>>>> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>>>>>> On 11/24/22 21:39, Waiman Long wrote:
>>>>>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>>>>>> A possible exception to this is the fact that
>>>>>>> do_set_cpus_allowed() will clear a non-null user_cpus_ptr. To
>>>>>>> allow this possible racing condition, we need to check for NULL
>>>>>>> user_cpus_ptr under the pi_lock before duping the user mask.
>>>>>>>
>>>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>>>>>>> do_set_cpus_allowed()")
>>>>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>>>> This is actually a pre-existing use-after-free bug since commit
>>>>>> 07ec77a1d4e8
>>>>>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>>>>>> So it needs to be fixed in the stable release as well. Will
>>>>>> resend the patch with an additional fixes tag and updated commit log.
>>>>> Please can you elaborate on the use-after-free here? Looking at
>>>>> 07ec77a1d4e8, the mask is only freed in free_task() when the usage
>>>>> refcount has dropped to zero and I can't see how that can race with fork().
>>>>>
>>>>> What am I missing?
>>>> I missed that at first. The current task cloning process copies the
>>>> content of the task structure over to the newly cloned/forked task.
>>>> IOW, if user_cpus_ptr had been set up previously, it will be copied
>>>> over to the cloned task. Now if user_cpus_ptr of the source task is
>>>> cleared right after that and before dup_user_cpus_ptr() is called.
>>>> The obsolete user_cpus_ptr value in the cloned task will remain and get used even if it has been freed.
>>>> That is what I call as use-after-free and double-free.
>>> If the parent task can be modified concurrently with
>>> dup_task_struct() then surely we'd have bigger issues because that's
>>> not going to be atomic? At the very least we'd have a data race, but
>>> it also feels like we could end up with inconsistent task state in
>>> the child. In fact, couldn't the normal 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
>>>
>>> Or am I still failing to understand the race?
>>>
>> A major difference between cpus_mask and user_cpus_ptr is that for
>> cpus_mask, the bitmap is embedded into task_struct whereas
>> user_cpus_ptr is a pointer to an external bitmap. So there is no
>> issue of use-after-free wrt cpus_mask. That is not the case where the
>> memory of the user_cpus_ptr of the parent task is freed, but then a
>> reference to that memory is still available in the child's task struct and may be used.
> Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned
> that we could corrupt the data and end up with an affinity mask that
> doesn't correspond to anything meaningful. Do you agree that's possible?
That is certainly possible. So we have to be careful about it.
>
>> Note that the problematic concurrence is not between the copying of
>> task struct and changing of the task struct. It is what will happen
>> after the task struct copying has already been done with an extra
>> reference present in the child's task struct.
> Well, sort of, but the child only has the extra reference _because_
> the parent pointer was concurrently cleared to NULL, otherwise
> dup_user_cpus_ptr() would have allocated a new copy and we'd be ok, no?
Yes, that is exactly where the problem is and this is what my patch is trying to fix.
>
> Overall, I'm just very wary that we seem to be saying that
> copy_process() can run concurrently with changes to the parent. Maybe
> it's all been written with that in mindi (including all the arch
> callbacks), but I'd be astonished if this is the only problem in there.

It seems like that, at least in some cases, the clearing of a task's user_cpus_ptr can be done by another task. So the parent may be unaware of it and so is not its fault.

Cheers,
Longman

#/******本邮件及其附件含有小米公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-30 11:51               ` 答复: [External Mail]Re: " David Wang 王标
@ 2022-11-30 14:07                 ` Wenjie Li (Evan)
  0 siblings, 0 replies; 11+ messages in thread
From: Wenjie Li (Evan) @ 2022-11-30 14:07 UTC (permalink / raw)
  To: David Wang 王标, Waiman Long, Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, linux-kernel,
	Wenjie Li (QUIC),
	chenhaihua, yangjianlong, 刘添威,
	mazhenhua, Ting11 Wang 王婷

Dear Will and Longman.

Maybe we can solve this scenario firstly if it dose not effect other architecture.

Thanks.

-----Original Message-----
From: David Wang 王标 <wangbiao3@xiaomi.com> 
Sent: Wednesday, November 30, 2022 7:52 PM
To: Waiman Long <longman@redhat.com>; Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>; Peter Zijlstra <peterz@infradead.org>; Juri Lelli <juri.lelli@redhat.com>; Vincent Guittot <vincent.guittot@linaro.org>; Dietmar Eggemann <dietmar.eggemann@arm.com>; Steven Rostedt <rostedt@goodmis.org>; Ben Segall <bsegall@google.com>; Mel Gorman <mgorman@suse.de>; Daniel Bristot de Oliveira <bristot@redhat.com>; Phil Auld <pauld@redhat.com>; linux-kernel@vger.kernel.org; Wenjie Li (QUIC) <quic_wenjieli@quicinc.com>; chenhaihua@xiaomi.com; yangjianlong@xiaomi.com; 刘添威 <liutianwei@xiaomi.com>; mazhenhua@xiaomi.com; Ting11 Wang 王婷 <wangting11@xiaomi.com>
Subject: 答复: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()

WARNING: This email originated from outside of Qualcomm. Please be wary of any links or attachments, and do not enable macros.

Dear Will,  Longman,

Could we fix the issue first we met?  We can analyze other issue later.

Thanks


-----邮件原件-----
发件人: Waiman Long <longman@redhat.com>
发送时间: 2022年11月30日 0:04
收件人: Will Deacon <will@kernel.org>
抄送: Ingo Molnar <mingo@redhat.com>; Peter Zijlstra <peterz@infradead.org>; Juri Lelli <juri.lelli@redhat.com>; Vincent Guittot <vincent.guittot@linaro.org>; Dietmar Eggemann <dietmar.eggemann@arm.com>; Steven Rostedt <rostedt@goodmis.org>; Ben Segall <bsegall@google.com>; Mel Gorman <mgorman@suse.de>; Daniel Bristot de Oliveira <bristot@redhat.com>; Phil Auld <pauld@redhat.com>; Wenjie Li <wenjieli@qti.qualcomm.com>; David Wang 王标 <wangbiao3@xiaomi.com>; linux-kernel@vger.kernel.org
主题: [External Mail]Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()

[外部邮件] 此邮件来源于小米公司外部,请谨慎处理。若对邮件安全性存疑,请将邮件转发给misec@xiaomi.com进行反馈

On 11/29/22 10:57, Will Deacon wrote:
> On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
>> On 11/29/22 09:07, Will Deacon wrote:
>>> On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
>>>> On 11/28/22 07:00, Will Deacon wrote:
>>>>> On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
>>>>>> On 11/24/22 21:39, Waiman Long wrote:
>>>>>>> In general, a non-null user_cpus_ptr will remain set until the task dies.
>>>>>>> A possible exception to this is the fact that
>>>>>>> do_set_cpus_allowed() will clear a non-null user_cpus_ptr. To 
>>>>>>> allow this possible racing condition, we need to check for NULL 
>>>>>>> user_cpus_ptr under the pi_lock before duping the user mask.
>>>>>>>
>>>>>>> Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in
>>>>>>> do_set_cpus_allowed()")
>>>>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>>>> This is actually a pre-existing use-after-free bug since commit
>>>>>> 07ec77a1d4e8
>>>>>> ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
>>>>>> So it needs to be fixed in the stable release as well. Will 
>>>>>> resend the patch with an additional fixes tag and updated commit log.
>>>>> Please can you elaborate on the use-after-free here? Looking at 
>>>>> 07ec77a1d4e8, the mask is only freed in free_task() when the usage 
>>>>> refcount has dropped to zero and I can't see how that can race with fork().
>>>>>
>>>>> What am I missing?
>>>> I missed that at first. The current task cloning process copies the 
>>>> content of the task structure over to the newly cloned/forked task.
>>>> IOW, if user_cpus_ptr had been set up previously, it will be copied 
>>>> over to the cloned task. Now if user_cpus_ptr of the source task is 
>>>> cleared right after that and before dup_user_cpus_ptr() is called.
>>>> The obsolete user_cpus_ptr value in the cloned task will remain and get used even if it has been freed.
>>>> That is what I call as use-after-free and double-free.
>>> If the parent task can be modified concurrently with
>>> dup_task_struct() then surely we'd have bigger issues because that's 
>>> not going to be atomic? At the very least we'd have a data race, but 
>>> it also feels like we could end up with inconsistent task state in 
>>> the child. In fact, couldn't the normal 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
>>>
>>> Or am I still failing to understand the race?
>>>
>> A major difference between cpus_mask and user_cpus_ptr is that for 
>> cpus_mask, the bitmap is embedded into task_struct whereas 
>> user_cpus_ptr is a pointer to an external bitmap. So there is no 
>> issue of use-after-free wrt cpus_mask. That is not the case where the 
>> memory of the user_cpus_ptr of the parent task is freed, but then a 
>> reference to that memory is still available in the child's task struct and may be used.
> Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned 
> that we could corrupt the data and end up with an affinity mask that 
> doesn't correspond to anything meaningful. Do you agree that's possible?
That is certainly possible. So we have to be careful about it.
>
>> Note that the problematic concurrence is not between the copying of 
>> task struct and changing of the task struct. It is what will happen 
>> after the task struct copying has already been done with an extra 
>> reference present in the child's task struct.
> Well, sort of, but the child only has the extra reference _because_ 
> the parent pointer was concurrently cleared to NULL, otherwise
> dup_user_cpus_ptr() would have allocated a new copy and we'd be ok, no?
Yes, that is exactly where the problem is and this is what my patch is trying to fix.
>
> Overall, I'm just very wary that we seem to be saying that
> copy_process() can run concurrently with changes to the parent. Maybe 
> it's all been written with that in mindi (including all the arch 
> callbacks), but I'd be astonished if this is the only problem in there.

It seems like that, at least in some cases, the clearing of a task's user_cpus_ptr can be done by another task. So the parent may be unaware of it and so is not its fault.

Cheers,
Longman

#/******本邮件及其附件含有小米公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()
  2022-11-29 16:03             ` Waiman Long
  2022-11-30 11:51               ` 答复: [External Mail]Re: " David Wang 王标
@ 2022-12-01 13:36               ` Will Deacon
  1 sibling, 0 replies; 11+ messages in thread
From: Will Deacon @ 2022-12-01 13:36 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Phil Auld, Wenjie Li,
	David Wang 王标,
	linux-kernel

On Tue, Nov 29, 2022 at 11:03:35AM -0500, Waiman Long wrote:
> On 11/29/22 10:57, Will Deacon wrote:
> > On Tue, Nov 29, 2022 at 10:32:49AM -0500, Waiman Long wrote:
> > > On 11/29/22 09:07, Will Deacon wrote:
> > > > On Mon, Nov 28, 2022 at 10:11:52AM -0500, Waiman Long wrote:
> > > > > On 11/28/22 07:00, Will Deacon wrote:
> > > > > > On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
> > > > > > > On 11/24/22 21:39, Waiman Long wrote:
> > > > > > > > In general, a non-null user_cpus_ptr will remain set until the task dies.
> > > > > > > > A possible exception to this is the fact that do_set_cpus_allowed()
> > > > > > > > will clear a non-null user_cpus_ptr. To allow this possible racing
> > > > > > > > condition, we need to check for NULL user_cpus_ptr under the pi_lock
> > > > > > > > before duping the user mask.
> > > > > > > > 
> > > > > > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > > > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > > > This is actually a pre-existing use-after-free bug since commit 07ec77a1d4e8
> > > > > > > ("sched: Allow task CPU affinity to be restricted on asymmetric systems").
> > > > > > > So it needs to be fixed in the stable release as well. Will resend the patch
> > > > > > > with an additional fixes tag and updated commit log.
> > > > > > Please can you elaborate on the use-after-free here? Looking at
> > > > > > 07ec77a1d4e8, the mask is only freed in free_task() when the usage refcount
> > > > > > has dropped to zero and I can't see how that can race with fork().
> > > > > > 
> > > > > > What am I missing?
> > > > > I missed that at first. The current task cloning process copies the content
> > > > > of the task structure over to the newly cloned/forked task. IOW, if
> > > > > user_cpus_ptr had been set up previously, it will be copied over to the
> > > > > cloned task. Now if user_cpus_ptr of the source task is cleared right after
> > > > > that and before dup_user_cpus_ptr() is called. The obsolete user_cpus_ptr
> > > > > value in the cloned task will remain and get used even if it has been freed.
> > > > > That is what I call as use-after-free and double-free.
> > > > If the parent task can be modified concurrently with dup_task_struct() then
> > > > surely we'd have bigger issues because that's not going to be atomic? At the
> > > > very least we'd have a data race, but it also feels like we could end up
> > > > with inconsistent task state in the child. In fact, couldn't the normal
> > > > 'cpus_mask' be corrupted by a concurrent set_cpus_allowed_common()?
> > > > 
> > > > Or am I still failing to understand the race?
> > > > 
> > > A major difference between cpus_mask and user_cpus_ptr is that for
> > > cpus_mask, the bitmap is embedded into task_struct whereas user_cpus_ptr is
> > > a pointer to an external bitmap. So there is no issue of use-after-free wrt
> > > cpus_mask. That is not the case where the memory of the user_cpus_ptr of the
> > > parent task is freed, but then a reference to that memory is still available
> > > in the child's task struct and may be used.
> > Sure, I'm not saying there's a UAF on cpus_mask, but I'm concerned that we
> > could corrupt the data and end up with an affinity mask that doesn't correspond
> > to anything meaningful. Do you agree that's possible?
> That is certainly possible. So we have to be careful about it.

Hmm, but we're not being particularly careful, are we? I hacked memcpy()
to be byte-to-byte to make things a bit easier to reproduce, and sure enough
I can race a bog-standard sched_setaffinity() call w/ fork():

[ 1663.935258] BUG: KCSAN: data-race in arch_dup_task_struct+0x4c/0x224
[ 1663.936872]
[ 1663.937292] race at unknown origin, with read to 0xffff06a44b8880a9 of 1 bytes by task 351 on cpu 0:
[ 1663.938770]  arch_dup_task_struct+0x4c/0x224
[ 1663.939621]  dup_task_struct+0x68/0x2a8
[ 1663.940381]  copy_process+0x208/0x1404
[ 1663.941109]  kernel_clone+0xdc/0x2c8
[ 1663.941814]  __arm64_sys_clone+0x9c/0xd4
[ 1663.942909]  invoke_syscall+0x54/0x170
[ 1663.943816]  el0_svc_common+0x100/0x148
[ 1663.944607]  do_el0_svc+0x40/0x10c
[ 1663.945333]  el0_svc+0x2c/0x7c
[ 1663.946006]  el0t_64_sync_handler+0x84/0xf0
[ 1663.946804]  el0t_64_sync+0x18c/0x190

I then managed to get the child process to run with an affinity mask (i.e.
'task_struct::cpus_mask') of *zero*, which triggers the select_fallback_rq()
logic:

 | process 14622 (waiman) no longer affine to cpu0

So sure, it's not a UAF, but I still think it's an issue that should be
fixed.

Will

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-12-01 13:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-25  2:39 [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr() Waiman Long
2022-11-28  1:43 ` Waiman Long
2022-11-28 12:00   ` Will Deacon
2022-11-28 15:11     ` Waiman Long
2022-11-29 14:07       ` Will Deacon
2022-11-29 15:32         ` Waiman Long
2022-11-29 15:57           ` Will Deacon
2022-11-29 16:03             ` Waiman Long
2022-11-30 11:51               ` 答复: [External Mail]Re: " David Wang 王标
2022-11-30 14:07                 ` Wenjie Li (Evan)
2022-12-01 13:36               ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).