* [PATCH v5.10-rt 0/7] add fixes for mainline migrate_disable commit
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

Mainline commit 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
first appeared in the v5.11 release, but was already present in earlier versions of -rt.

At that point, v5.10-rt was still in the -devel repo; the commit was actually
first added in commit/tag v5.9-rc8-rt14-patches as:

  patches/0010-sched-Fix-migrate_disable-vs-set_cpus_allowed_ptr.patch

...and it can then be found as commit 5d7baa3f554d on the v5.10-rt fast-forward
branch in the stable-rt repo, which regularly merges in v5.10.x-stable.

This matters because 6d337eab041d is named in several "Fixes:" lines, and
those fixes were backported to v5.11.7-stable; they were never considered for
v5.10, because v5.10 (and v5.10.x-stable) didn't have 6d337eab041d.

So it falls to -rt to ensure the related fixes make it back onto v5.10-rt,
since it was -rt (not mainline or stable) that deployed the commit named in
those Fixes lines on this particular version baseline.

Commits are applied here in upstream order.  Note also that the final commit
is not yet in mainline, but is in the tip tree's sched/core branch, and hence
I'm assuming it will land in mainline with the same SHA.

Active non-EOL shoulder releases:
v5.4-rt:  Does not have any variant of 6d337eab041d.
v5.12-rt: Has all fixes via mainline v5.12 except for tip commit 475ea6c60279e.

Testing:
Sanity boot test of v5.10.41-rt42 plus these 7 commits, with defconfig + RT enabled.

Credit goes to Kevin Hao <kexin.hao@windriver.com> for spotting and reporting
the missing fixes.

Paul.

---

Peter Zijlstra (6):
  sched: Fix migration_cpu_stop() requeueing
  sched: Simplify migration_cpu_stop()
  sched: Collate affine_move_task() stoppers
  sched: Optimize migration_cpu_stop()
  sched: Fix affine_move_task() self-concurrency
  sched: Simplify set_affinity_pending refcounts

Valentin Schneider (1):
  sched: Don't defer CPU pick to migration_cpu_stop()

 kernel/sched/core.c | 144 ++++++++++++++++++++++----------------------
 1 file changed, 73 insertions(+), 71 deletions(-)

-- 
2.25.1



* [PATCH 1/7] sched: Fix migration_cpu_stop() requeueing
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit 8a6edb5257e2a84720fe78cb179eca58ba76126f upstream.

When affine_move_task(p) is called on a running task @p, which is not
otherwise already changing affinity, we'll first set
p->migration_pending and then do:

	stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);

This then gets us to migration_cpu_stop() running on the CPU that was
previously running our victim task @p.

If we find that our task is no longer on that runqueue (this can
happen because of a concurrent migration due to load-balance etc.),
then we'll end up at the:

	} else if (dest_cpu < 0 || pending) {

branch. Which we'll take because we set pending earlier. Here we first
check if the task @p has already satisfied the affinity constraints,
if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop()
onto the CPU that is now hosting our task @p:

	stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
			    &pending->arg, &pending->stop_work);

Except, we've never initialized pending->arg, which will be all 0s.

This then results in running migration_cpu_stop() on the next CPU with
arg->p == NULL, which gives the by now obvious result of fireworks.

The cure is to change affine_move_task() to always use pending->arg,
furthermore we can use the exact same pattern as the
SCA_MIGRATE_ENABLE case, since we'll block on the pending->done
completion anyway, no point in adding yet another completion in
stop_one_cpu().

This then gives a clear distinction between the two
migration_cpu_stop() use cases:

  - sched_exec() / migrate_task_to() : arg->pending == NULL
  - affine_move_task() : arg->pending != NULL;

And we can have it ignore p->migration_pending when !arg->pending. Any
stop work from sched_exec() / migrate_task_to() is in addition to stop
works from affine_move_task(), which will be sufficient to issue the
completion.
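
To make the two cases concrete, here is a small userspace sketch of the
failure and of the fix (stand-in types, illustrative only; this is not
the kernel code).  The pre-fix requeue hands the stopper a
zero-initialized migration_arg, i.e. arg->task == NULL, while the fix
embeds an always-initialized arg in the pending itself:

	#include <stdio.h>

	struct set_affinity_pending;

	struct task_struct { int cpu; };

	struct migration_arg {
		struct task_struct *task;
		int dest_cpu;
		struct set_affinity_pending *pending;
	};

	struct set_affinity_pending {
		struct migration_arg arg;	/* embedded, always initialized */
	};

	static int migration_cpu_stop(void *data)
	{
		struct migration_arg *arg = data;

		if (!arg->task) {	/* what the uninitialized requeue hit */
			printf("arg->task == NULL -> fireworks\n");
			return -1;
		}
		printf("migrate task from CPU%d to CPU%d\n",
		       arg->task->cpu, arg->dest_cpu);
		return 0;
	}

	int main(void)
	{
		struct set_affinity_pending pending = { 0 };
		struct task_struct p = { .cpu = 0 };

		/* pre-fix requeue path: pending.arg was never initialized */
		migration_cpu_stop(&pending.arg);

		/* post-fix: affine_move_task() always fills in pending.arg */
		pending.arg = (struct migration_arg) {
			.task = &p, .dest_cpu = 1, .pending = &pending,
		};
		migration_cpu_stop(&pending.arg);
		return 0;
	}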

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3d3aa9db1548..a3dea38f410a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1958,6 +1958,24 @@ static int migration_cpu_stop(void *data)
 	rq_lock(rq, &rf);
 
 	pending = p->migration_pending;
+	if (pending && !arg->pending) {
+		/*
+		 * This happens from sched_exec() and migrate_task_to(),
+		 * neither of them care about pending and just want a task to
+		 * maybe move about.
+		 *
+		 * Even if there is a pending, we can ignore it, since
+		 * affine_move_task() will have it's own stop_work's in flight
+		 * which will manage the completion.
+		 *
+		 * Notably, pending doesn't need to match arg->pending. This can
+		 * happen when tripple concurrent affine_move_task() first sets
+		 * pending, then clears pending and eventually sets another
+		 * pending.
+		 */
+		pending = NULL;
+	}
+
 	/*
 	 * If task_rq(p) != rq, it cannot be migrated here, because we're
 	 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -2230,10 +2248,6 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			    int dest_cpu, unsigned int flags)
 {
 	struct set_affinity_pending my_pending = { }, *pending = NULL;
-	struct migration_arg arg = {
-		.task = p,
-		.dest_cpu = dest_cpu,
-	};
 	bool complete = false;
 
 	/* Can the task run on the task's current CPU? If so, we're done */
@@ -2271,6 +2285,12 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			/* Install the request */
 			refcount_set(&my_pending.refs, 1);
 			init_completion(&my_pending.done);
+			my_pending.arg = (struct migration_arg) {
+				.task = p,
+				.dest_cpu = -1,		/* any */
+				.pending = &my_pending,
+			};
+
 			p->migration_pending = &my_pending;
 		} else {
 			pending = p->migration_pending;
@@ -2301,12 +2321,6 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		p->migration_flags &= ~MDF_PUSH;
 		task_rq_unlock(rq, p, rf);
 
-		pending->arg = (struct migration_arg) {
-			.task = p,
-			.dest_cpu = -1,
-			.pending = pending,
-		};
-
 		stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
 				    &pending->arg, &pending->stop_work);
 
@@ -2319,8 +2333,11 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		 * is_migration_disabled(p) checks to the stopper, which will
 		 * run on the same CPU as said p.
 		 */
+		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
 		task_rq_unlock(rq, p, rf);
-		stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
+
+		stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
+				    &pending->arg, &pending->stop_work);
 
 	} else {
 
-- 
2.25.1



* [PATCH 2/7] sched: Simplify migration_cpu_stop()
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit c20cf065d4a619d394d23290093b1002e27dff86 upstream.

When affine_move_task() issues a migration_cpu_stop(), the purpose of
that function is to complete that @pending, not any random other
p->migration_pending that might have gotten installed since.

This realization much simplifies migration_cpu_stop() and allows
further necessary steps to fix all this as it provides the guarantee
that @pending's stopper will complete @pending (and not some random
other @pending).
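
A reduced, single-threaded sketch of that guarantee (illustrative
userspace code with stand-in types, not the kernel implementation):
the stopper resolves @pending from its own arg, and only clears
p->migration_pending when it still points at that same @pending:

	#include <stdio.h>

	struct set_affinity_pending { int id; };

	struct task { struct set_affinity_pending *migration_pending; };

	/* models migration_cpu_stop() with pending = arg->pending */
	static void stopper(struct task *p, struct set_affinity_pending *pending)
	{
		if (p->migration_pending == pending)	/* still ours? */
			p->migration_pending = NULL;
		printf("completed pending %d\n", pending->id);
	}

	int main(void)
	{
		struct set_affinity_pending first = { 1 }, second = { 2 };
		struct task p = { .migration_pending = &first };

		/* a later affine_move_task() installs a new pending... */
		p.migration_pending = &second;

		/* ...so the stopper for 'first' must leave it in place */
		stopper(&p, &first);
		printf("p->migration_pending -> %s\n",
		       p.migration_pending == &second ? "second (intact)" : "lost");
		return 0;
	}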

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.430014682@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 56 +++++++--------------------------------------
 1 file changed, 8 insertions(+), 48 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a3dea38f410a..d497b13efb53 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1934,8 +1934,8 @@ static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
  */
 static int migration_cpu_stop(void *data)
 {
-	struct set_affinity_pending *pending;
 	struct migration_arg *arg = data;
+	struct set_affinity_pending *pending = arg->pending;
 	struct task_struct *p = arg->task;
 	int dest_cpu = arg->dest_cpu;
 	struct rq *rq = this_rq();
@@ -1957,25 +1957,6 @@ static int migration_cpu_stop(void *data)
 	raw_spin_lock(&p->pi_lock);
 	rq_lock(rq, &rf);
 
-	pending = p->migration_pending;
-	if (pending && !arg->pending) {
-		/*
-		 * This happens from sched_exec() and migrate_task_to(),
-		 * neither of them care about pending and just want a task to
-		 * maybe move about.
-		 *
-		 * Even if there is a pending, we can ignore it, since
-		 * affine_move_task() will have it's own stop_work's in flight
-		 * which will manage the completion.
-		 *
-		 * Notably, pending doesn't need to match arg->pending. This can
-		 * happen when tripple concurrent affine_move_task() first sets
-		 * pending, then clears pending and eventually sets another
-		 * pending.
-		 */
-		pending = NULL;
-	}
-
 	/*
 	 * If task_rq(p) != rq, it cannot be migrated here, because we're
 	 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -1986,31 +1967,20 @@ static int migration_cpu_stop(void *data)
 			goto out;
 
 		if (pending) {
-			p->migration_pending = NULL;
+			if (p->migration_pending == pending)
+				p->migration_pending = NULL;
 			complete = true;
 		}
 
-		/* migrate_enable() --  we must not race against SCA */
-		if (dest_cpu < 0) {
-			/*
-			 * When this was migrate_enable() but we no longer
-			 * have a @pending, a concurrent SCA 'fixed' things
-			 * and we should be valid again. Nothing to do.
-			 */
-			if (!pending) {
-				WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
-				goto out;
-			}
-
+		if (dest_cpu < 0)
 			dest_cpu = cpumask_any_distribute(&p->cpus_mask);
-		}
 
 		if (task_on_rq_queued(p))
 			rq = __migrate_task(rq, &rf, p, dest_cpu);
 		else
 			p->wake_cpu = dest_cpu;
 
-	} else if (dest_cpu < 0 || pending) {
+	} else if (pending) {
 		/*
 		 * This happens when we get migrated between migrate_enable()'s
 		 * preempt_enable() and scheduling the stopper task. At that
@@ -2025,22 +1995,13 @@ static int migration_cpu_stop(void *data)
 		 * ->pi_lock, so the allowed mask is stable - if it got
 		 * somewhere allowed, we're done.
 		 */
-		if (pending && cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
-			p->migration_pending = NULL;
+		if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
+			if (p->migration_pending == pending)
+				p->migration_pending = NULL;
 			complete = true;
 			goto out;
 		}
 
-		/*
-		 * When this was migrate_enable() but we no longer have an
-		 * @pending, a concurrent SCA 'fixed' things and we should be
-		 * valid again. Nothing to do.
-		 */
-		if (!pending) {
-			WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
-			goto out;
-		}
-
 		/*
 		 * When migrate_enable() hits a rq mis-match we can't reliably
 		 * determine is_migration_disabled() and so have to chase after
@@ -2058,7 +2019,6 @@ static int migration_cpu_stop(void *data)
 		complete_all(&pending->done);
 
 	/* For pending->{arg,stop_work} */
-	pending = arg->pending;
 	if (pending && refcount_dec_and_test(&pending->refs))
 		wake_up_var(&pending->refs);
 
-- 
2.25.1



* [PATCH 3/7] sched: Collate affine_move_task() stoppers
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit 58b1a45086b5f80f2b2842aa7ed0da51a64a302b upstream.

The SCA_MIGRATE_ENABLE and task_running() cases are almost identical;
collapse them to avoid further duplication.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.500108964@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d497b13efb53..6880c300c624 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2275,30 +2275,23 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		return -EINVAL;
 	}
 
-	if (flags & SCA_MIGRATE_ENABLE) {
-
-		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
-		p->migration_flags &= ~MDF_PUSH;
-		task_rq_unlock(rq, p, rf);
-
-		stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
-				    &pending->arg, &pending->stop_work);
-
-		return 0;
-	}
-
 	if (task_running(rq, p) || p->state == TASK_WAKING) {
 		/*
-		 * Lessen races (and headaches) by delegating
-		 * is_migration_disabled(p) checks to the stopper, which will
-		 * run on the same CPU as said p.
+		 * MIGRATE_ENABLE gets here because 'p == current', but for
+		 * anything else we cannot do is_migration_disabled(), punt
+		 * and have the stopper function handle it all race-free.
 		 */
+
 		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
+		if (flags & SCA_MIGRATE_ENABLE)
+			p->migration_flags &= ~MDF_PUSH;
 		task_rq_unlock(rq, p, rf);
 
 		stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
 				    &pending->arg, &pending->stop_work);
 
+		if (flags & SCA_MIGRATE_ENABLE)
+			return 0;
 	} else {
 
 		if (!is_migration_disabled(p)) {
-- 
2.25.1



* [PATCH 4/7] sched: Optimize migration_cpu_stop()
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit 3f1bc119cd7fc987c8ed25ffb717f99403bb308c upstream.

When the purpose of migration_cpu_stop() is to migrate the task to
'any' valid CPU, don't migrate the task when it's already running on a
valid CPU.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.569238629@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6880c300c624..9cbe12d8c5bd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1972,14 +1972,25 @@ static int migration_cpu_stop(void *data)
 			complete = true;
 		}
 
-		if (dest_cpu < 0)
+		if (dest_cpu < 0) {
+			if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask))
+				goto out;
+
 			dest_cpu = cpumask_any_distribute(&p->cpus_mask);
+		}
 
 		if (task_on_rq_queued(p))
 			rq = __migrate_task(rq, &rf, p, dest_cpu);
 		else
 			p->wake_cpu = dest_cpu;
 
+		/*
+		 * XXX __migrate_task() can fail, at which point we might end
+		 * up running on a dodgy CPU, AFAICT this can only happen
+		 * during CPU hotplug, at which point we'll get pushed out
+		 * anyway, so it's probably not a big deal.
+		 */
+
 	} else if (pending) {
 		/*
 		 * This happens when we get migrated between migrate_enable()'s
-- 
2.25.1



* [PATCH 5/7] sched: Fix affine_move_task() self-concurrency
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit 9e81889c7648d48dd5fe13f41cbc99f3c362484a upstream.

Consider:

   sched_setaffinity(p, X);		sched_setaffinity(p, Y);

Then the first will install p->migration_pending = &my_pending; and
issue stop_one_cpu_nowait(pending); and the second one will read
p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending),
the _SAME_ @pending.

This causes stopper list corruption.

Add set_affinity_pending::stop_pending, to indicate if a stopper is in
progress.
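
The guard is easy to model in userspace (a mutex stands in for the rq
lock and a counter for the stopper enqueue; the names are illustrative,
not the kernel API).  Build with -pthread; exactly one of the two racing
callers performs the enqueue:

	#include <pthread.h>
	#include <stdbool.h>
	#include <stdio.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

	static struct set_affinity_pending {
		bool stop_pending;	/* is a stopper already in flight? */
		int enqueues;		/* >1 would corrupt the stopper list */
	} pending;

	static void *sca(void *unused)	/* one sched_setaffinity() caller */
	{
		(void)unused;

		pthread_mutex_lock(&lock);
		bool stop_pending = pending.stop_pending;
		if (!stop_pending)
			pending.stop_pending = true;
		pthread_mutex_unlock(&lock);

		if (!stop_pending)
			pending.enqueues++;	/* stop_one_cpu_nowait(...) */
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, sca, NULL);
		pthread_create(&b, NULL, sca, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		printf("stop_work enqueued %d time(s)\n", pending.enqueues);
		return 0;
	}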

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.649146419@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9cbe12d8c5bd..20588a59300d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1900,6 +1900,7 @@ struct migration_arg {
 
 struct set_affinity_pending {
 	refcount_t		refs;
+	unsigned int		stop_pending;
 	struct completion	done;
 	struct cpu_stop_work	stop_work;
 	struct migration_arg	arg;
@@ -2018,12 +2019,15 @@ static int migration_cpu_stop(void *data)
 		 * determine is_migration_disabled() and so have to chase after
 		 * it.
 		 */
+		WARN_ON_ONCE(!pending->stop_pending);
 		task_rq_unlock(rq, p, &rf);
 		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
 				    &pending->arg, &pending->stop_work);
 		return 0;
 	}
 out:
+	if (pending)
+		pending->stop_pending = false;
 	task_rq_unlock(rq, p, &rf);
 
 	if (complete)
@@ -2219,7 +2223,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			    int dest_cpu, unsigned int flags)
 {
 	struct set_affinity_pending my_pending = { }, *pending = NULL;
-	bool complete = false;
+	bool stop_pending, complete = false;
 
 	/* Can the task run on the task's current CPU? If so, we're done */
 	if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
@@ -2292,14 +2296,19 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		 * anything else we cannot do is_migration_disabled(), punt
 		 * and have the stopper function handle it all race-free.
 		 */
+		stop_pending = pending->stop_pending;
+		if (!stop_pending)
+			pending->stop_pending = true;
 
 		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
 		if (flags & SCA_MIGRATE_ENABLE)
 			p->migration_flags &= ~MDF_PUSH;
 		task_rq_unlock(rq, p, rf);
 
-		stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
-				    &pending->arg, &pending->stop_work);
+		if (!stop_pending) {
+			stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
+					    &pending->arg, &pending->stop_work);
+		}
 
 		if (flags & SCA_MIGRATE_ENABLE)
 			return 0;
-- 
2.25.1



* [PATCH 6/7] sched: Simplify set_affinity_pending refcounts
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Peter Zijlstra <peterz@infradead.org>

commit 50caf9c14b1498c90cf808dbba2ca29bd32ccba4 upstream.

Now that we have set_affinity_pending::stop_pending to indicate if a
stopper is in progress, and we have the guarantee that if that stopper
exists, it will (eventually) complete our @pending, we can simplify the
refcount scheme by no longer counting the stopper thread.
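
As a toy model of the resulting lifetime rule (plain counters stand in
for refcount_t and struct completion; illustrative only): @refs now
counts wait_for_completion() callers, that is the stack owner plus any
attached waiters, while the stopper takes no reference at all:

	#include <assert.h>
	#include <stdio.h>

	struct set_affinity_pending {
		int refs;		/* number of wait_for_completion() callers */
		int stop_pending;	/* is @stop_work in use */
		int done;		/* stands in for struct completion */
	};

	int main(void)
	{
		struct set_affinity_pending pending = { .refs = 1 };	/* owner */

		pending.refs++;			/* second SCA caller attaches */
		pending.stop_pending = 1;	/* stopper queued: NO extra ref */

		/* the stopper completes the work and never touches refs */
		pending.stop_pending = 0;
		pending.done = 1;

		/* each waiter returns from wait_for_completion(), drops once */
		for (int waiters = 2; waiters--; ) {
			assert(pending.done);
			if (--pending.refs == 0)
				printf("last ref gone; owner stack may unwind\n");
		}
		return 0;
	}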

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.724130207@infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 20588a59300d..35d8b80d7cb8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1898,6 +1898,10 @@ struct migration_arg {
 	struct set_affinity_pending	*pending;
 };
 
+/*
+ * @refs: number of wait_for_completion()
+ * @stop_pending: is @stop_work in use
+ */
 struct set_affinity_pending {
 	refcount_t		refs;
 	unsigned int		stop_pending;
@@ -2033,10 +2037,6 @@ static int migration_cpu_stop(void *data)
 	if (complete)
 		complete_all(&pending->done);
 
-	/* For pending->{arg,stop_work} */
-	if (pending && refcount_dec_and_test(&pending->refs))
-		wake_up_var(&pending->refs);
-
 	return 0;
 }
 
@@ -2235,12 +2235,16 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			push_task = get_task_struct(p);
 		}
 
+		/*
+		 * If there are pending waiters, but no pending stop_work,
+		 * then complete now.
+		 */
 		pending = p->migration_pending;
-		if (pending) {
-			refcount_inc(&pending->refs);
+		if (pending && !pending->stop_pending) {
 			p->migration_pending = NULL;
 			complete = true;
 		}
+
 		task_rq_unlock(rq, p, rf);
 
 		if (push_task) {
@@ -2249,7 +2253,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		}
 
 		if (complete)
-			goto do_complete;
+			complete_all(&pending->done);
 
 		return 0;
 	}
@@ -2300,9 +2304,9 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		if (!stop_pending)
 			pending->stop_pending = true;
 
-		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
 		if (flags & SCA_MIGRATE_ENABLE)
 			p->migration_flags &= ~MDF_PUSH;
+
 		task_rq_unlock(rq, p, rf);
 
 		if (!stop_pending) {
@@ -2318,12 +2322,13 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			if (task_on_rq_queued(p))
 				rq = move_queued_task(rq, rf, p, dest_cpu);
 
-			p->migration_pending = NULL;
-			complete = true;
+			if (!pending->stop_pending) {
+				p->migration_pending = NULL;
+				complete = true;
+			}
 		}
 		task_rq_unlock(rq, p, rf);
 
-do_complete:
 		if (complete)
 			complete_all(&pending->done);
 	}
@@ -2331,7 +2336,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 	wait_for_completion(&pending->done);
 
 	if (refcount_dec_and_test(&pending->refs))
-		wake_up_var(&pending->refs);
+		wake_up_var(&pending->refs); /* No UaF, just an address */
 
 	/*
 	 * Block the original owner of &pending until all subsequent callers
@@ -2339,6 +2344,9 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 	 */
 	wait_var_event(&my_pending.refs, !refcount_read(&my_pending.refs));
 
+	/* ARGH */
+	WARN_ON_ONCE(my_pending.stop_pending);
+
 	return 0;
 }
 
-- 
2.25.1



* [PATCH 7/7] sched: Don't defer CPU pick to migration_cpu_stop()
From: Paul Gortmaker @ 2021-06-08  4:37 UTC
  To: Steven Rostedt, linux-rt-users; +Cc: Peter Zijlstra, Valentin Schneider

From: Valentin Schneider <valentin.schneider@arm.com>

commit 475ea6c60279e9f2ddf7e4cf2648cd8ae0608361 upstream.

Will reported that the 'XXX __migrate_task() can fail' in migration_cpu_stop()
can happen, and it *is* sort of a big deal. Looking at it some more, one
will note there is a glaring hole in the deferred CPU selection:

  (w/ CONFIG_CPUSET=n, so that the affinity mask passed via taskset doesn't
  get AND'd with cpu_online_mask)

  $ taskset -pc 0-2 $PID
  # offline CPUs 3-4
  $ taskset -pc 3-5 $PID
    `\
      $PID may stay on 0-2 due to the cpumask_any_distribute() picking an
      offline CPU and __migrate_task() refusing to do anything due to
      cpu_is_allowed().

set_cpus_allowed_ptr() goes to some length to pick a dest_cpu that matches
the right constraints vs affinity and the online/active state of the
CPUs. Reuse that instead of discarding it in the affine_move_task() case.
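
The hole is easy to demonstrate with a userspace model (bitmasks stand
in for cpumasks, bit N meaning CPU N; any_set_bit() loosely plays the
role of cpumask_any_distribute(), and the names here are illustrative,
not kernel API):

	#include <stdio.h>

	static int any_set_bit(unsigned int mask)
	{
		for (int cpu = 0; cpu < 32; cpu++)
			if (mask & (1u << cpu))
				return cpu;
		return -1;
	}

	int main(void)
	{
		unsigned int online   = 0x27;	/* CPUs 0,1,2,5 (3-4 offline) */
		unsigned int new_mask = 0x38;	/* taskset -pc 3-5: CPUs 3,4,5 */

		/* deferred pick, pre-fix: ignores the online state */
		printf("deferred pick:   CPU%d (offline!)\n",
		       any_set_bit(new_mask));

		/* set_cpus_allowed_ptr()-style pick, reused post-fix */
		printf("proper dest_cpu: CPU%d\n",
		       any_set_bit(new_mask & online));
		return 0;
	}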

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210526205751.842360-2-valentin.schneider@arm.com
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched/core.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 35d8b80d7cb8..b1e87f304ade 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1942,7 +1942,6 @@ static int migration_cpu_stop(void *data)
 	struct migration_arg *arg = data;
 	struct set_affinity_pending *pending = arg->pending;
 	struct task_struct *p = arg->task;
-	int dest_cpu = arg->dest_cpu;
 	struct rq *rq = this_rq();
 	bool complete = false;
 	struct rq_flags rf;
@@ -1975,19 +1974,15 @@ static int migration_cpu_stop(void *data)
 			if (p->migration_pending == pending)
 				p->migration_pending = NULL;
 			complete = true;
-		}
 
-		if (dest_cpu < 0) {
 			if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask))
 				goto out;
-
-			dest_cpu = cpumask_any_distribute(&p->cpus_mask);
 		}
 
 		if (task_on_rq_queued(p))
-			rq = __migrate_task(rq, &rf, p, dest_cpu);
+			rq = __migrate_task(rq, &rf, p, arg->dest_cpu);
 		else
-			p->wake_cpu = dest_cpu;
+			p->wake_cpu = arg->dest_cpu;
 
 		/*
 		 * XXX __migrate_task() can fail, at which point we might end
@@ -2266,7 +2261,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			init_completion(&my_pending.done);
 			my_pending.arg = (struct migration_arg) {
 				.task = p,
-				.dest_cpu = -1,		/* any */
+				.dest_cpu = dest_cpu,
 				.pending = &my_pending,
 			};
 
@@ -2274,6 +2269,15 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		} else {
 			pending = p->migration_pending;
 			refcount_inc(&pending->refs);
+			/*
+			 * Affinity has changed, but we've already installed a
+			 * pending. migration_cpu_stop() *must* see this, else
+			 * we risk a completion of the pending despite having a
+			 * task on a disallowed CPU.
+			 *
+			 * Serialized by p->pi_lock, so this is safe.
+			 */
+			pending->arg.dest_cpu = dest_cpu;
 		}
 	}
 	pending = p->migration_pending;
-- 
2.25.1

