* [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-21  2:17 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Will Deacon
  Cc: Phil Auld, Linus Torvalds, linux-kernel, Waiman Long

The user_cpus_ptr field was originally added by commit b90ca8badbd1
("sched: Introduce task_struct::user_cpus_ptr to track requested
affinity"). It was used only by arm64 arch due to possible asymmetric
CPU setup.

Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask"), task_struct::user_cpus_ptr is repurposed to store user
requested cpu affinity specified in the sched_setaffinity().

This results in a performance regression on an arm64 system when booted
with "allow_mismatched_32bit_el0" on the command-line. The arch code will
(amongst other things) call force_compatible_cpus_allowed_ptr() and
relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
will always result in a __sched_setaffinity() call whether or not there
was a previous force_compatible_cpus_allowed_ptr() call.

In order to fix this regression, a new scheduler flag
task_struct::cpus_allowed_restricted is now added to track if
force_compatible_cpus_allowed_ptr() has been called before or not. This
patch also updates the comments in force_compatible_cpus_allowed_ptr()
and relax_compatible_cpus_allowed_ptr() and handles their interaction
with sched_setaffinity().

Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/sched.h |  3 +++
 kernel/sched/core.c   | 46 ++++++++++++++++++++++++++++++++++---------
 kernel/sched/sched.h  |  2 ++
 3 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 853d08f7562b..f93f62a1f858 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -886,6 +886,9 @@ struct task_struct {
 	unsigned			sched_contributes_to_load:1;
 	unsigned			sched_migrated:1;
 
+	/* restrict_cpus_allowed_ptr() bit, serialized by scheduler locks */
+	unsigned			cpus_allowed_restricted:1;
+
 	/* Force alignment to the next boundary: */
 	unsigned			:0;
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bb1ee6d7bdde..48234dc9005b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2999,15 +2999,40 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 	struct rq *rq;
 
 	rq = task_rq_lock(p, &rf);
+
+	if (ctx->flags & (SCA_CLR_RESTRICT | SCA_SET_RESTRICT)) {
+		p->cpus_allowed_restricted = 0;
+	} else if (p->cpus_allowed_restricted) {
+		/*
+		 * If force_compatible_cpus_allowed_ptr() has been called,
+		 * we can't extend cpumask to beyond what is in cpus_mask.
+		 */
+		if (!cpumask_and(rq->scratch_mask, ctx->new_mask,
+				 &p->cpus_mask)) {
+			task_rq_unlock(rq, p, &rf);
+			return -EINVAL;
+		}
+
+		/*
+		 * Note that we don't need to do further user_cpus_ptr
+		 * masking below as cpus_mask should be a subset of
+		 * user_cpus_ptr if set.
+		 */
+		ctx->new_mask = rq->scratch_mask;
+	}
+
 	/*
 	 * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
-	 * flags are set.
+	 * flags are set or when cpus_allowed_restricted flag has been set.
 	 */
-	if (p->user_cpus_ptr &&
+	if (p->user_cpus_ptr && !p->cpus_allowed_restricted &&
 	    !(ctx->flags & (SCA_USER | SCA_MIGRATE_ENABLE | SCA_MIGRATE_DISABLE)) &&
 	    cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
 		ctx->new_mask = rq->scratch_mask;
 
+	if (ctx->flags & SCA_SET_RESTRICT)
+		p->cpus_allowed_restricted = 1;
+
 	return __set_cpus_allowed_ptr_locked(p, ctx, rq, &rf);
 }
 
@@ -3025,8 +3050,8 @@ EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
 /*
  * Change a given task's CPU affinity to the intersection of its current
  * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
- * If user_cpus_ptr is defined, use it as the basis for restricting CPU
- * affinity or use cpu_online_mask instead.
+ * The cpus_allowed_restricted bit is set to indicate to a later
+ * relax_compatible_cpus_allowed_ptr() call to relax the cpumask.
  *
  * If the resulting mask is empty, leave the affinity unchanged and return
  * -EINVAL.
@@ -3037,7 +3062,7 @@ static int restrict_cpus_allowed_ptr(struct task_struct *p,
 {
 	struct affinity_context ac = {
 		.new_mask  = new_mask,
-		.flags     = 0,
+		.flags     = SCA_SET_RESTRICT,
 	};
 	struct rq_flags rf;
 	struct rq *rq;
@@ -3069,9 +3094,8 @@ static int restrict_cpus_allowed_ptr(struct task_struct *p,
 
 /*
  * Restrict the CPU affinity of task @p so that it is a subset of
- * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
- * old affinity mask. If the resulting mask is empty, we warn and walk
- * up the cpuset hierarchy until we find a suitable mask.
+ * task_cpu_possible_mask(). If the resulting mask is empty, we warn
+ * and walk up the cpuset hierarchy until we find a suitable mask.
  */
 void force_compatible_cpus_allowed_ptr(struct task_struct *p)
 {
@@ -3126,10 +3150,14 @@ void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
 {
 	struct affinity_context ac = {
 		.new_mask  = task_user_cpus(p),
-		.flags     = 0,
+		.flags     = SCA_CLR_RESTRICT,
 	};
 	int ret;
 
+	/* Return if no previous force_compatible_cpus_allowed_ptr() call */
+	if (!data_race(p->cpus_allowed_restricted))
+		return;
+
 	/*
 	 * Try to restore the old affinity mask with __sched_setaffinity().
 	 * Cpuset masking will be done there too.
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 771f8ddb7053..adcef29d5479 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2293,6 +2293,8 @@ extern struct task_struct *pick_next_task_idle(struct rq *rq);
 #define SCA_MIGRATE_DISABLE	0x02
 #define SCA_MIGRATE_ENABLE	0x04
 #define SCA_USER		0x08
+#define SCA_CLR_RESTRICT	0x10
+#define SCA_SET_RESTRICT	0x20
 
 #ifdef CONFIG_SMP
 
-- 
2.31.1



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Will Deacon @ 2023-01-24 19:48 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

Hi Waiman,

[+Thorsten given where we are in the release cycle]

On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
> The user_cpus_ptr field was originally added by commit b90ca8badbd1
> ("sched: Introduce task_struct::user_cpus_ptr to track requested
> affinity"). It was used only by arm64 arch due to possible asymmetric
> CPU setup.
> 
> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
> requested cpu affinity specified in the sched_setaffinity().
> 
> This results in a performance regression in an arm64 system when booted
> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
> will always result in a __sched_setaffinity() call whether there is a
> previous force_compatible_cpus_allowed_ptr() call or not.

I'd argue it's more than just a performance regression -- the affinity
masks are set incorrectly, which is a user visible thing
(i.e. sched_getaffinity() gives unexpected values).

> In order to fix this regression, a new scheduler flag
> task_struct::cpus_allowed_restricted is now added to track if
> force_compatible_cpus_allowed_ptr() has been called before or not. This
> patch also updates the comments in force_compatible_cpus_allowed_ptr()
> and relax_compatible_cpus_allowed_ptr() and handles their interaction
> with sched_setaffinity().
> 
> Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
> Reported-by: Will Deacon <will@kernel.org>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  include/linux/sched.h |  3 +++
>  kernel/sched/core.c   | 46 ++++++++++++++++++++++++++++++++++---------
>  kernel/sched/sched.h  |  2 ++
>  3 files changed, 42 insertions(+), 9 deletions(-)

I find this pretty invasive, but I guess it's up to Peter and Ingo.
It also doesn't solve the whole problem for me; see below.

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 853d08f7562b..f93f62a1f858 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -886,6 +886,9 @@ struct task_struct {
>  	unsigned			sched_contributes_to_load:1;
>  	unsigned			sched_migrated:1;
>  
> +	/* restrict_cpus_allowed_ptr() bit, serialized by scheduler locks */
> +	unsigned			cpus_allowed_restricted:1;
> +
>  	/* Force alignment to the next boundary: */
>  	unsigned			:0;
>  
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index bb1ee6d7bdde..48234dc9005b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2999,15 +2999,40 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>  	struct rq *rq;
>  
>  	rq = task_rq_lock(p, &rf);
> +
> +	if (ctx->flags & (SCA_CLR_RESTRICT | SCA_SET_RESTRICT)) {
> +		p->cpus_allowed_restricted = 0;

I don't think this is ever called on the SCA_SET_RESTRICT path, as
restrict_cpus_allowed_ptr() calls __set_cpus_allowed_ptr_locked() directly.
In my testing, we see a failure in the following sequence:

  1. A 64-bit task has an affinity of 0xf
  2. It exec()s a 32-bit program and is forcefully restricted to the set
     of 32-bit-capable cores. Let's say that new mask is 0xc
  3. The 32-bit task now exec()s a 64-bit program again

And now we're still stuck with 0xc after step 3 whereas we should restore
0xf.
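
For reference, a rough sketch of the call paths involved (simplified and
from memory, so treat the exact routing as an approximation rather than
verbatim kernel code), which shows why the flag handling added to
__set_cpus_allowed_ptr() never runs on the force path:

  force_compatible_cpus_allowed_ptr()
    -> restrict_cpus_allowed_ptr()
         -> __set_cpus_allowed_ptr_locked()    <- SCA_SET_RESTRICT never seen

  sched_setaffinity() / relax_compatible_cpus_allowed_ptr()
    -> __sched_setaffinity()
         -> __set_cpus_allowed_ptr()           <- v2 flag handling lives here
              -> __set_cpus_allowed_ptr_locked()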

> +	} else if (p->cpus_allowed_restricted) {
> +		/*
> +		 * If force_compatible_cpus_allowed_ptr() has been called,
> +		 * we can't extend cpumask to beyond what is in cpus_mask.
> +		 */
> +		if (!cpumask_and(rq->scratch_mask, ctx->new_mask,
> +				 &p->cpus_mask)) {
> +			task_rq_unlock(rq, p, &rf);
> +			return -EINVAL;
> +		}

Why is this masking actually needed? __sched_setaffinity() already
takes into account the task_cpu_possible_mask(), which is why I asked you
before [1] about cases where the saved affinity is not simply a superset
of the effective affinity.
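
For context, here is a minimal sketch of the masking I mean. It is
simplified and reconstructed from memory of the 6.1/6.2 code, so the
exact structure (and the retry/permission handling I have dropped) is an
assumption rather than the real __sched_setaffinity():

static int sketch_sched_setaffinity(struct task_struct *p,
				    struct affinity_context *ctx)
{
	cpumask_var_t cpus_allowed, new_mask;
	int ret;

	if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
		return -ENOMEM;
	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
		free_cpumask_var(cpus_allowed);
		return -ENOMEM;
	}

	/* Cpuset masking is already applied here... */
	cpuset_cpus_allowed(p, cpus_allowed);
	cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
	ctx->new_mask = new_mask;

	/*
	 * ...and the __set_cpus_allowed_ptr() path rejects any mask that
	 * is not a subset of task_cpu_possible_mask(p), so incompatible
	 * CPUs cannot end up in cpus_mask via this route anyway.
	 */
	ret = __set_cpus_allowed_ptr(p, ctx);

	free_cpumask_var(new_mask);
	free_cpumask_var(cpus_allowed);
	return ret;
}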

Thanks,

Will

[1] https://lore.kernel.org/r/20230120175931.GA22417@willie-the-truck


* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-24 20:08 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On 1/24/23 14:48, Will Deacon wrote:
> Hi Waiman,
>
> [+Thorsten given where we are in the release cycle]
>
> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>> affinity"). It was used only by arm64 arch due to possible asymmetric
>> CPU setup.
>>
>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>> requested cpu affinity specified in the sched_setaffinity().
>>
>> This results in a performance regression in an arm64 system when booted
>> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>> will always result in a __sched_setaffinity() call whether there is a
>> previous force_compatible_cpus_allowed_ptr() call or not.
> I'd argue it's more than just a performance regression -- the affinity
> masks are set incorrectly, which is a user visible thing
> (i.e. sched_getaffinity() gives unexpected values).
>
>> In order to fix this regression, a new scheduler flag
>> task_struct::cpus_allowed_restricted is now added to track if
>> force_compatible_cpus_allowed_ptr() has been called before or not. This
>> patch also updates the comments in force_compatible_cpus_allowed_ptr()
>> and relax_compatible_cpus_allowed_ptr() and handles their interaction
>> with sched_setaffinity().
>>
>> Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
>> Reported-by: Will Deacon <will@kernel.org>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   include/linux/sched.h |  3 +++
>>   kernel/sched/core.c   | 46 ++++++++++++++++++++++++++++++++++---------
>>   kernel/sched/sched.h  |  2 ++
>>   3 files changed, 42 insertions(+), 9 deletions(-)
> I find this pretty invasive, but I guess it's up to Peter and Ingo.
> It also doesn't the whole problem for me; see below.
>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 853d08f7562b..f93f62a1f858 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -886,6 +886,9 @@ struct task_struct {
>>   	unsigned			sched_contributes_to_load:1;
>>   	unsigned			sched_migrated:1;
>>   
>> +	/* restrict_cpus_allowed_ptr() bit, serialized by scheduler locks */
>> +	unsigned			cpus_allowed_restricted:1;
>> +
>>   	/* Force alignment to the next boundary: */
>>   	unsigned			:0;
>>   
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index bb1ee6d7bdde..48234dc9005b 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2999,15 +2999,40 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>   	struct rq *rq;
>>   
>>   	rq = task_rq_lock(p, &rf);
>> +
>> +	if (ctx->flags & (SCA_CLR_RESTRICT | SCA_SET_RESTRICT)) {
>> +		p->cpus_allowed_restricted = 0;
> I don't think this is ever called on the SCA_SET_RESTRICT path, as
> restrict_cpus_allowed_ptr() calls __set_cpus_allowed_ptr_locked() directly.
> In my testing, we see a failure in the following sequence:
>
>    1. A 64-bit task has an affinity of 0xf
>    2. It exec()s a 32-bit program and is forcefully restricted to the set
>       of 32-bit-capable cores. Let's say that new mask is 0xc
>    3. The 32-bit task now exec()s a 64-bit program again
>
> And now we're still stuck with 0xc after step 3 whereas we should restore
> 0xf.
I am sorry that I missed it. You are right. The
cpus_allowed_restricted bit should be set directly in
restrict_cpus_allowed_ptr().
>> +	} else if (p->cpus_allowed_restricted) {
>> +		/*
>> +		 * If force_compatible_cpus_allowed_ptr() has been called,
>> +		 * we can't extend cpumask to beyond what is in cpus_mask.
>> +		 */
>> +		if (!cpumask_and(rq->scratch_mask, ctx->new_mask,
>> +				 &p->cpus_mask)) {
>> +			task_rq_unlock(rq, p, &rf);
>> +			return -EINVAL;
>> +		}
> Why is this masking actually needed? __sched_setaffinity() already
> takes into account the task_cpu_possible_mask(), which is why I asked you
> before [1] about cases where the saved affinity is not simply a superset
> of the effective affinity.

I kind of overlooked the use of task_cpu_possible_mask() in
__set_cpus_allowed_ptr_locked(). So we don't really need that masking.
That makes the patch even simpler. I will send out a v3.
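
To make that concrete, something like the sketch below is what I have in
mind: set the bit directly under the rq lock already held by
restrict_cpus_allowed_ptr() and drop the extra masking in
__set_cpus_allowed_ptr(). This is illustrative only, not the actual v3
patch, and the deadline-task and other checks of the real function are
left out:

static int restrict_cpus_allowed_ptr(struct task_struct *p,
				     struct cpumask *new_mask,
				     const struct cpumask *subset_mask)
{
	struct affinity_context ac = {
		.new_mask  = new_mask,
		.flags     = 0,
	};
	struct rq_flags rf;
	struct rq *rq;

	rq = task_rq_lock(p, &rf);

	if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
		task_rq_unlock(rq, p, &rf);
		return -EINVAL;
	}

	/* Mark the restriction here instead of passing SCA_SET_RESTRICT */
	p->cpus_allowed_restricted = 1;

	return __set_cpus_allowed_ptr_locked(p, &ac, rq, &rf);
}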

Cheers,
Longman



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-24 20:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions


On 1/24/23 14:48, Will Deacon wrote:
> Hi Waiman,
>
> [+Thorsten given where we are in the release cycle]
>
> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>> affinity"). It was used only by arm64 arch due to possible asymmetric
>> CPU setup.
>>
>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>> requested cpu affinity specified in the sched_setaffinity().
>>
>> This results in a performance regression in an arm64 system when booted
>> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>> will always result in a __sched_setaffinity() call whether there is a
>> previous force_compatible_cpus_allowed_ptr() call or not.
> I'd argue it's more than just a performance regression -- the affinity
> masks are set incorrectly, which is a user visible thing
> (i.e. sched_getaffinity() gives unexpected values).

Can you elaborate a bit more on what you mean by getting unexpected
sched_getaffinity() results? You mean the result is wrong after a 
relax_compatible_cpus_allowed_ptr(). Right?

sched_getaffinity() just returns whatever is in cpus_mask. Normally,
that should be whatever cpus are allowed by the current cpuset unless
sched_setaffinity() has been called before. So after a call to
relax_compatible_cpus_allowed_ptr(), it should revert to the cpus
allowed by the cpuset. If sched_setaffinity() has been called, it
should revert to the intersection of the current cpuset and
user_cpus_ptr.
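
For reference, the core of sched_getaffinity() is basically just the
following. This is a simplified sketch from memory, with the pid lookup,
permission checks and error handling left out:

static long sketch_getaffinity(struct task_struct *p, struct cpumask *mask)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&p->pi_lock, flags);
	/* Report cpus_mask, clamped to the CPUs that are currently active */
	cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
	raw_spin_unlock_irqrestore(&p->pi_lock, flags);

	return 0;
}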

Cheers,
Longman



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Will Deacon @ 2023-01-26 15:55 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On Tue, Jan 24, 2023 at 03:08:09PM -0500, Waiman Long wrote:
> On 1/24/23 14:48, Will Deacon wrote:
> > On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 853d08f7562b..f93f62a1f858 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -886,6 +886,9 @@ struct task_struct {
> > >   	unsigned			sched_contributes_to_load:1;
> > >   	unsigned			sched_migrated:1;
> > > +	/* restrict_cpus_allowed_ptr() bit, serialized by scheduler locks */
> > > +	unsigned			cpus_allowed_restricted:1;
> > > +
> > >   	/* Force alignment to the next boundary: */
> > >   	unsigned			:0;
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index bb1ee6d7bdde..48234dc9005b 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -2999,15 +2999,40 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> > >   	struct rq *rq;
> > >   	rq = task_rq_lock(p, &rf);
> > > +
> > > +	if (ctx->flags & (SCA_CLR_RESTRICT | SCA_SET_RESTRICT)) {
> > > +		p->cpus_allowed_restricted = 0;
> > I don't think this is ever called on the SCA_SET_RESTRICT path, as
> > restrict_cpus_allowed_ptr() calls __set_cpus_allowed_ptr_locked() directly.
> > In my testing, we see a failure in the following sequence:
> > 
> >    1. A 64-bit task has an affinity of 0xf
> >    2. It exec()s a 32-bit program and is forcefully restricted to the set
> >       of 32-bit-capable cores. Let's say that new mask is 0xc
> >    3. The 32-bit task now exec()s a 64-bit program again
> > 
> > And now we're still stuck with 0xc after step 3 whereas we should restore
> > 0xf.
> I am sorry that missed it. You are right. For setting the
> cpus_allowed_restricted bit, it should be done directly in
> restrict_cpus_allowed_ptr().
> > > +	} else if (p->cpus_allowed_restricted) {
> > > +		/*
> > > +		 * If force_compatible_cpus_allowed_ptr() has been called,
> > > +		 * we can't extend cpumask to beyond what is in cpus_mask.
> > > +		 */
> > > +		if (!cpumask_and(rq->scratch_mask, ctx->new_mask,
> > > +				 &p->cpus_mask)) {
> > > +			task_rq_unlock(rq, p, &rf);
> > > +			return -EINVAL;
> > > +		}
> > Why is this masking actually needed? __sched_setaffinity() already
> > takes into account the task_cpu_possible_mask(), which is why I asked you
> > before [1] about cases where the saved affinity is not simply a superset
> > of the effective affinity.
> 
> I kind of overlook the use of task_cpu_possible_mask() in
> __set_cpus_allowed_ptr_locked. So we don't really need that masking. That
> make the patch even simpler then. I will send out a v3.

Thanks; I'll give it a spin when I see it.

Will


* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Will Deacon @ 2023-01-26 16:11 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
> On 1/24/23 14:48, Will Deacon wrote:
> > On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
> > > The user_cpus_ptr field was originally added by commit b90ca8badbd1
> > > ("sched: Introduce task_struct::user_cpus_ptr to track requested
> > > affinity"). It was used only by arm64 arch due to possible asymmetric
> > > CPU setup.
> > > 
> > > Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> > > cpumask"), task_struct::user_cpus_ptr is repurposed to store user
> > > requested cpu affinity specified in the sched_setaffinity().
> > > 
> > > This results in a performance regression in an arm64 system when booted
> > > with "allow_mismatched_32bit_el0" on the command-line. The arch code will
> > > (amongst other things) calls force_compatible_cpus_allowed_ptr() and
> > > relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
> > > task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
> > > will always result in a __sched_setaffinity() call whether there is a
> > > previous force_compatible_cpus_allowed_ptr() call or not.
> > I'd argue it's more than just a performance regression -- the affinity
> > masks are set incorrectly, which is a user visible thing
> > (i.e. sched_getaffinity() gives unexpected values).
> 
> Can your elaborate a bit more on what you mean by getting unexpected
> sched_getaffinity() results? You mean the result is wrong after a
> relax_compatible_cpus_allowed_ptr(). Right?

Yes, as in the original report. If, on a 4-CPU system, I do the following
with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:

# for c in `seq 1 3`; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
# yes > /dev/null &
[1] 334
# taskset -p 334
pid 334's current affinity mask: 1
# for c in `seq 1 3`; do echo 1 > /sys/devices/system/cpu/cpu$c/online; done
# taskset -p 334
pid 334's current affinity mask: f

but with v6.2-rc5 that last taskset invocation gives:

pid 334's current affinity mask: 1

so, yes, the performance definitely regresses, but that's because the
affinity mask is wrong!

Will


* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-26 20:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On 1/26/23 11:11, Will Deacon wrote:
> On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
>> On 1/24/23 14:48, Will Deacon wrote:
>>> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>>>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>>>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>>>> affinity"). It was used only by arm64 arch due to possible asymmetric
>>>> CPU setup.
>>>>
>>>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>>>> requested cpu affinity specified in the sched_setaffinity().
>>>>
>>>> This results in a performance regression in an arm64 system when booted
>>>> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
>>>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>>>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>>>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>>>> will always result in a __sched_setaffinity() call whether there is a
>>>> previous force_compatible_cpus_allowed_ptr() call or not.
>>> I'd argue it's more than just a performance regression -- the affinity
>>> masks are set incorrectly, which is a user visible thing
>>> (i.e. sched_getaffinity() gives unexpected values).
>> Can your elaborate a bit more on what you mean by getting unexpected
>> sched_getaffinity() results? You mean the result is wrong after a
>> relax_compatible_cpus_allowed_ptr(). Right?
> Yes, as in the original report. If, on a 4-CPU system, I do the following
> with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
>
> # for c in `seq 1 3`; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
> # yes > /dev/null &
> [1] 334
> # taskset -p 334
> pid 334's current affinity mask: 1
> # for c in `seq 1 3`; do echo 1 > /sys/devices/system/cpu/cpu$c/online; done
> # taskset -p 334
> pid 334's current affinity mask: f
>
> but with v6.2-rc5 that last taskset invocation gives:
>
> pid 334's current affinity mask: 1
>
> so, yes, the performance definitely regresses, but that's because the
> affinity mask is wrong!

I see what you mean now. Hotplug doesn't work quite well now because
user_cpus_ptr has been repurposed to store the value set by
sched_setaffinity(), but not the previous cpus_mask before
force_compatible_cpus_allowed_ptr().

One possible solution is to modify the hotplug-related code to check for
the cpus_allowed_restricted bit and, if it is set, check
task_cpu_possible_mask() to see if the cpu can be added back to its
cpus_mask. I will take a further look at that later.
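
Roughly, the check I have in mind would look something like this. The
helper below is hypothetical, purely to illustrate the idea; it is not
existing kernel code and the locking and call site are left out:

static bool can_restore_cpu(struct task_struct *p, int cpu)
{
	/* Only consider widening a mask that was forcefully restricted */
	if (!p->cpus_allowed_restricted)
		return false;

	/* ...and only with CPUs the task can actually run on */
	return cpumask_test_cpu(cpu, task_cpu_possible_mask(p));
}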

Cheers,
Longman



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-26 20:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On 1/26/23 15:49, Waiman Long wrote:
> On 1/26/23 11:11, Will Deacon wrote:
>> On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
>>> On 1/24/23 14:48, Will Deacon wrote:
>>>> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>>>>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>>>>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>>>>> affinity"). It was used only by arm64 arch due to possible asymmetric
>>>>> CPU setup.
>>>>>
>>>>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>>>>> requested cpu affinity specified in the sched_setaffinity().
>>>>>
>>>>> This results in a performance regression in an arm64 system when 
>>>>> booted
>>>>> with "allow_mismatched_32bit_el0" on the command-line. The arch 
>>>>> code will
>>>>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>>>>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 
>>>>> 64-bit
>>>>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>>>>> will always result in a __sched_setaffinity() call whether there is a
>>>>> previous force_compatible_cpus_allowed_ptr() call or not.
>>>> I'd argue it's more than just a performance regression -- the affinity
>>>> masks are set incorrectly, which is a user visible thing
>>>> (i.e. sched_getaffinity() gives unexpected values).
>>> Can your elaborate a bit more on what you mean by getting unexpected
>>> sched_getaffinity() results? You mean the result is wrong after a
>>> relax_compatible_cpus_allowed_ptr(). Right?
>> Yes, as in the original report. If, on a 4-CPU system, I do the 
>> following
>> with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
>>
>> # for c in `seq 1 3`; do echo 0 > 
>> /sys/devices/system/cpu/cpu$c/online; done
>> # yes > /dev/null &
>> [1] 334
>> # taskset -p 334
>> pid 334's current affinity mask: 1
>> # for c in `seq 1 3`; do echo 1 > 
>> /sys/devices/system/cpu/cpu$c/online; done
>> # taskset -p 334
>> pid 334's current affinity mask: f
>>
>> but with v6.2-rc5 that last taskset invocation gives:
>>
>> pid 334's current affinity mask: 1
>>
>> so, yes, the performance definitely regresses, but that's because the
>> affinity mask is wrong!
>
> I see what you mean now. Hotplug doesn't work quite well now because 
> user_cpus_ptr has been repurposed to store the value set of 
> sched_setaffinity() but not the previous cpus_mask before 
> force_compatible_cpus_allowed_ptr().
>
> One possible solution is to modify the hotplug related code to check 
> for the cpus_allowed_restricted, and if set, check 
> task_cpu_possible_mask() to see if the cpu can be added back to its 
> cpus_mask. I will take a further look at that later.

Wait, I think the cpuset hotplug code should be able to restore the
right cpumask since task_cpu_possible_mask() is used there. Is cpuset
enabled? Does the test work without allow_mismatched_32bit_el0?

I think there may be a bug somewhere.

Cheers,
Longman



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-27  1:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On 1/26/23 15:58, Waiman Long wrote:
> On 1/26/23 15:49, Waiman Long wrote:
>> On 1/26/23 11:11, Will Deacon wrote:
>>> On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
>>>> On 1/24/23 14:48, Will Deacon wrote:
>>>>> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>>>>>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>>>>>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>>>>>> affinity"). It was used only by arm64 arch due to possible 
>>>>>> asymmetric
>>>>>> CPU setup.
>>>>>>
>>>>>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user 
>>>>>> requested
>>>>>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>>>>>> requested cpu affinity specified in the sched_setaffinity().
>>>>>>
>>>>>> This results in a performance regression in an arm64 system when 
>>>>>> booted
>>>>>> with "allow_mismatched_32bit_el0" on the command-line. The arch 
>>>>>> code will
>>>>>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>>>>>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 
>>>>>> 64-bit
>>>>>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>>>>>> will always result in a __sched_setaffinity() call whether there 
>>>>>> is a
>>>>>> previous force_compatible_cpus_allowed_ptr() call or not.
>>>>> I'd argue it's more than just a performance regression -- the 
>>>>> affinity
>>>>> masks are set incorrectly, which is a user visible thing
>>>>> (i.e. sched_getaffinity() gives unexpected values).
>>>> Can your elaborate a bit more on what you mean by getting unexpected
>>>> sched_getaffinity() results? You mean the result is wrong after a
>>>> relax_compatible_cpus_allowed_ptr(). Right?
>>> Yes, as in the original report. If, on a 4-CPU system, I do the 
>>> following
>>> with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
>>>
>>> # for c in `seq 1 3`; do echo 0 > 
>>> /sys/devices/system/cpu/cpu$c/online; done
>>> # yes > /dev/null &
>>> [1] 334
>>> # taskset -p 334
>>> pid 334's current affinity mask: 1
>>> # for c in `seq 1 3`; do echo 1 > 
>>> /sys/devices/system/cpu/cpu$c/online; done
>>> # taskset -p 334
>>> pid 334's current affinity mask: f
>>>
>>> but with v6.2-rc5 that last taskset invocation gives:
>>>
>>> pid 334's current affinity mask: 1
>>>
>>> so, yes, the performance definitely regresses, but that's because the
>>> affinity mask is wrong!
>>
>> I see what you mean now. Hotplug doesn't work quite well now because 
>> user_cpus_ptr has been repurposed to store the value set of 
>> sched_setaffinity() but not the previous cpus_mask before 
>> force_compatible_cpus_allowed_ptr().
>>
>> One possible solution is to modify the hotplug related code to check 
>> for the cpus_allowed_restricted, and if set, check 
>> task_cpu_possible_mask() to see if the cpu can be added back to its 
>> cpus_mask. I will take a further look at that later.
>
> Wait, I think the cpuset hotplug code should be able to restore the 
> right cpumask since task_cpu_possible_mask() is used there. Is cpuset 
> enabled? Does the test works without allow_mismatched_32bit_el0?

BTW, if the test result is from running on a kernel built with the v2
patch, it is an unexpected result. That should be fixed in the v3 patch.

Cheers,
Longman



* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Will Deacon @ 2023-01-27 13:03 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On Thu, Jan 26, 2023 at 08:56:51PM -0500, Waiman Long wrote:
> On 1/26/23 15:58, Waiman Long wrote:
> > On 1/26/23 15:49, Waiman Long wrote:
> > > On 1/26/23 11:11, Will Deacon wrote:
> > > > On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
> > > > > On 1/24/23 14:48, Will Deacon wrote:
> > > > > > On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
> > > > > > > The user_cpus_ptr field was originally added by commit b90ca8badbd1
> > > > > > > ("sched: Introduce task_struct::user_cpus_ptr to track requested
> > > > > > > affinity"). It was used only by arm64 arch due to
> > > > > > > possible asymmetric
> > > > > > > CPU setup.
> > > > > > > 
> > > > > > > Since commit 8f9ea86fdf99 ("sched: Always preserve
> > > > > > > the user requested
> > > > > > > cpumask"), task_struct::user_cpus_ptr is repurposed to store user
> > > > > > > requested cpu affinity specified in the sched_setaffinity().
> > > > > > > 
> > > > > > > This results in a performance regression in an arm64
> > > > > > > system when booted
> > > > > > > with "allow_mismatched_32bit_el0" on the
> > > > > > > command-line. The arch code will
> > > > > > > (amongst other things) calls force_compatible_cpus_allowed_ptr() and
> > > > > > > relax_compatible_cpus_allowed_ptr() when exec()'ing
> > > > > > > a 32-bit or a 64-bit
> > > > > > > task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
> > > > > > > will always result in a __sched_setaffinity() call
> > > > > > > whether there is a
> > > > > > > previous force_compatible_cpus_allowed_ptr() call or not.
> > > > > > I'd argue it's more than just a performance regression
> > > > > > -- the affinity
> > > > > > masks are set incorrectly, which is a user visible thing
> > > > > > (i.e. sched_getaffinity() gives unexpected values).
> > > > > Can your elaborate a bit more on what you mean by getting unexpected
> > > > > sched_getaffinity() results? You mean the result is wrong after a
> > > > > relax_compatible_cpus_allowed_ptr(). Right?
> > > > Yes, as in the original report. If, on a 4-CPU system, I do the
> > > > following
> > > > with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
> > > > 
> > > > # for c in `seq 1 3`; do echo 0 >
> > > > /sys/devices/system/cpu/cpu$c/online; done
> > > > # yes > /dev/null &
> > > > [1] 334
> > > > # taskset -p 334
> > > > pid 334's current affinity mask: 1
> > > > # for c in `seq 1 3`; do echo 1 >
> > > > /sys/devices/system/cpu/cpu$c/online; done
> > > > # taskset -p 334
> > > > pid 334's current affinity mask: f
> > > > 
> > > > but with v6.2-rc5 that last taskset invocation gives:
> > > > 
> > > > pid 334's current affinity mask: 1
> > > > 
> > > > so, yes, the performance definitely regresses, but that's because the
> > > > affinity mask is wrong!
> > > 
> > > I see what you mean now. Hotplug doesn't work quite well now because
> > > user_cpus_ptr has been repurposed to store the value set of
> > > sched_setaffinity() but not the previous cpus_mask before
> > > force_compatible_cpus_allowed_ptr().
> > > 
> > > One possible solution is to modify the hotplug related code to check
> > > for the cpus_allowed_restricted, and if set, check
> > > task_cpu_possible_mask() to see if the cpu can be added back to its
> > > cpus_mask. I will take a further look at that later.
> > 
> > Wait, I think the cpuset hotplug code should be able to restore the
> > right cpumask since task_cpu_possible_mask() is used there. Is cpuset
> > enabled? Does the test works without allow_mismatched_32bit_el0?
> 
> BTW, if the test result is from running on a kernel built with the v2 patch,
> it is the unexpected result. That should be fixed in the v3 patch.

The failure listed above is on vanilla v6.2-rc5. Your v2 has other issues,
as described in:

https://lore.kernel.org/r/20230124194805.GA27257@willie-the-truck

Will


* Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
From: Waiman Long @ 2023-01-30 17:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Phil Auld,
	Linus Torvalds, linux-kernel, regressions, regressions

On 1/26/23 11:11, Will Deacon wrote:
> On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
>> On 1/24/23 14:48, Will Deacon wrote:
>>> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>>>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>>>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>>>> affinity"). It was used only by arm64 arch due to possible asymmetric
>>>> CPU setup.
>>>>
>>>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>>>> requested cpu affinity specified in the sched_setaffinity().
>>>>
>>>> This results in a performance regression in an arm64 system when booted
>>>> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
>>>> (amongst other things) calls force_compatible_cpus_allowed_ptr() and
>>>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>>>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>>>> will always result in a __sched_setaffinity() call whether there is a
>>>> previous force_compatible_cpus_allowed_ptr() call or not.
>>> I'd argue it's more than just a performance regression -- the affinity
>>> masks are set incorrectly, which is a user visible thing
>>> (i.e. sched_getaffinity() gives unexpected values).
>> Can your elaborate a bit more on what you mean by getting unexpected
>> sched_getaffinity() results? You mean the result is wrong after a
>> relax_compatible_cpus_allowed_ptr(). Right?
> Yes, as in the original report. If, on a 4-CPU system, I do the following
> with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
>
> # for c in `seq 1 3`; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
> # yes > /dev/null &
> [1] 334
> # taskset -p 334
> pid 334's current affinity mask: 1
> # for c in `seq 1 3`; do echo 1 > /sys/devices/system/cpu/cpu$c/online; done
> # taskset -p 334
> pid 334's current affinity mask: f
>
> but with v6.2-rc5 that last taskset invocation gives:
>
> pid 334's current affinity mask: 1
>
> so, yes, the performance definitely regresses, but that's because the
> affinity mask is wrong!

Are you using cgroup v1 or v2? Is your process in the root
cgroup/cpuset or in a child cgroup/cpuset under root?

If you are using cgroup v1 in a child cpuset, cpuset.cpus works more
like cpuset.cpus_effective. IOW, with an offline and then online hotplug
event, the cpu will be permanently lost from the cpuset. So the above is
an expected result. If you are using cgroup v2, the cpuset should be able
to recover the lost cpu after the online event. If your process is in the
root cpuset, the cpu will not be lost either. Alternatively, if you mount
the v1 cpuset with the "cpuset_v2_mode" flag, it will behave more like
v2 and recover the lost cpu.

I ran a similar cpu offline/online test with cgroup v1 and v2 and
confirmed that v1 has the above behavior and v2 is fine.

I believe what you have observed above may not be related to my sched
patch, as I can't see how it would affect cpu hotplug at all.

Cheers,
Longman


