From: Valentin Schneider <valentin.schneider@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, mingo@kernel.org,
	linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	qais.yousef@arm.com, swood@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, vincent.donnefort@arm.com, tj@kernel.org,
	ouwen210@hotmail.com
Subject: Re: [PATCH v3 10/19] sched: Fix migrate_disable() vs set_cpus_allowed_ptr()
Date: Fri, 16 Oct 2020 13:48:17 +0100
Message-ID: <jhjlfg6qqum.mognet@arm.com>
In-Reply-To: <20201015110923.910090294@infradead.org>
On 15/10/20 12:05, Peter Zijlstra wrote:
> @@ -1862,15 +1875,27 @@ static int migration_cpu_stop(void *data
>  	 * we're holding p->pi_lock.
>  	 */
>  	if (task_rq(p) == rq) {
> +		if (is_migration_disabled(p))
> +			goto out;
> +
>  		if (task_on_rq_queued(p))
>  			rq = __migrate_task(rq, &rf, p, arg->dest_cpu);
>  		else
>  			p->wake_cpu = arg->dest_cpu;
> +
> +		if (arg->done) {
> +			p->migration_pending = NULL;
> +			complete = true;
Ok so nasty ahead:

  P0@CPU0             P1                 P2                   stopper

  migrate_disable();
                      sca(P0, {CPU1});
                        <installs pending>
  migrate_enable();
    <kicks stopper>
                                         sca(P0, {CPU0});
                                           <locks>
                                           <local, has pending:
                                            goto do_complete>
                                           <unlocks>
                                           complete_all();
                                           refcount_dec();
                      refcount_dec();
                      <done>
                                           <done>
                                                              <locks>
                                                              <fiddles with pending->arg->done>

First, P2 can clear p->migration_pending before the stopper gets to run.
Second, the complete_all() is done without the pi / rq locks held, so P2
might get to it before the stopper does. This may cause &pending to be
popped off P1's stack before the stopper gets to it, so mayhaps we would
need the below hunk.
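
To make the intended guard concrete, here is a minimal userspace sketch of the pointer-identity check, assuming heavily simplified stand-ins for the kernel structures. The struct layouts, the local container_of() macro and the helper name stopper_should_bail() are all illustrative assumptions, not kernel code. It only models the comparison against p->migration_pending; it does not model locking or the lifetime of the on-stack pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); same pointer arithmetic. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified models; these layouts are illustrative, not the kernel's. */
struct completion {
	int done;
};

struct set_affinity_pending {
	int refs;
	struct completion done;
};

struct task {
	struct set_affinity_pending *migration_pending;
	bool migration_disabled;
};

struct migration_arg {
	struct task *task;
	struct completion *done;	/* NULL when nobody waits on the stopper */
};

/*
 * Hypothetical helper: returns true when the stopper should bail out,
 * either because migration is disabled or because the pending it was
 * kicked for is no longer the one installed on the task.
 */
static bool stopper_should_bail(const struct migration_arg *arg)
{
	const struct task *p = arg->task;
	struct set_affinity_pending *pending = NULL;

	/* Recover the enclosing pending from the embedded completion. */
	if (arg->done)
		pending = container_of(arg->done, struct set_affinity_pending, done);

	return p->migration_disabled ||
	       (arg->done && p->migration_pending != pending);
}
```

In this model, a task whose migration_pending was cleared by a racing affinity update, or replaced by a different pending, makes the helper return true. A request without a done completion is never rejected on those grounds.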
The move_queued_task() from the stopper is "safe" in that we won't kick a
task outside of its allowed mask, although we may move it around for no
reason - I tried to prevent that.
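
The "safe" property above can be illustrated with a toy userspace model, assuming a 64-bit mask in place of a cpumask. The names toy_task, toy_cpu_allowed() and toy_migrate() are hypothetical and exist only for this sketch. It shows a stale request being refused when the destination is outside the allowed mask, and still being honoured (a needless move) when the destination happens to be allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy task: one bit per CPU in allowed_mask (at most 64 CPUs in this model). */
struct toy_task {
	uint64_t allowed_mask;
	int cpu;
};

/* True if @cpu is a valid index and its bit is set in the task's mask. */
static bool toy_cpu_allowed(const struct toy_task *p, int cpu)
{
	return cpu >= 0 && cpu < 64 && (p->allowed_mask & (UINT64_C(1) << cpu));
}

/*
 * Hypothetical helper: move the task only if the destination is allowed.
 * Returns the CPU the task ends up on.
 */
static int toy_migrate(struct toy_task *p, int dest_cpu)
{
	if (!toy_cpu_allowed(p, dest_cpu))
		return p->cpu;	/* stale request, task stays put */

	p->cpu = dest_cpu;	/* allowed, even if nobody wants it anymore */
	return p->cpu;
}
```

With the mask set to {CPU0}, a leftover request for CPU1 leaves the task on CPU0. With the mask set to {CPU0, CPU1}, the same leftover request still moves the task, which is the "for no reason" case.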
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a5b6eac07adb..1ebf653c2c2f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1859,6 +1859,13 @@ static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
 	return rq;
 }

+struct set_affinity_pending {
+	refcount_t		refs;
+	struct completion	done;
+	struct cpu_stop_work	stop_work;
+	struct migration_arg	arg;
+};
+
 /*
  * migration_cpu_stop - this will be executed by a highprio stopper thread
  * and performs thread migration by bumping thread off CPU then
@@ -1866,6 +1873,7 @@ static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
  */
 static int migration_cpu_stop(void *data)
 {
+	struct set_affinity_pending *pending;
 	struct migration_arg *arg = data;
 	struct task_struct *p = arg->task;
 	struct rq *rq = this_rq();
@@ -1886,13 +1894,22 @@ static int migration_cpu_stop(void *data)

 	raw_spin_lock(&p->pi_lock);
 	rq_lock(rq, &rf);
+
+	if (arg->done)
+		pending = container_of(arg->done, struct set_affinity_pending, done);
 	/*
 	 * If task_rq(p) != rq, it cannot be migrated here, because we're
 	 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
 	 * we're holding p->pi_lock.
 	 */
 	if (task_rq(p) == rq) {
-		if (is_migration_disabled(p))
+		/*
+		 * An affinity update may have raced with us.
+		 * p->migration_pending could now be NULL, or could be pointing
+		 * elsewhere entirely.
+		 */
+		if (is_migration_disabled(p) ||
+		    (arg->done && p->migration_pending != pending))
 			goto out;

 		if (task_on_rq_queued(p))
@@ -2024,13 +2041,6 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 	__do_set_cpus_allowed(p, new_mask, 0);
 }

-struct set_affinity_pending {
-	refcount_t		refs;
-	struct completion	done;
-	struct cpu_stop_work	stop_work;
-	struct migration_arg	arg;
-};
-
 /*
  * This function is wildly self concurrent; here be dragons.
  *