From: tip-bot for Peter Zijlstra <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: mingo@kernel.org, torvalds@linux-foundation.org,
peterz@infradead.org, riel@redhat.com, rostedt@goodmis.org,
akpm@linux-foundation.org, aarcange@redhat.com, oleg@redhat.com,
tglx@linutronix.de, hpa@zytor.com, linux-kernel@vger.kernel.org,
hannes@cmpxchg.org, srivatsa.bhat@linux.vnet.ibm.com,
paulmck@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com,
mgorman@suse.de
Subject: [tip:sched/core] sched: Remove get_online_cpus() usage
Date: Thu, 17 Oct 2013 09:49:39 -0700 [thread overview]
Message-ID: <tip-6acce3ef84520537f8a09a12c9ddbe814a584dd2@git.kernel.org> (raw)
In-Reply-To: <20131011123820.GV3081@twins.programming.kicks-ass.net>
Commit-ID: 6acce3ef84520537f8a09a12c9ddbe814a584dd2
Gitweb: http://git.kernel.org/tip/6acce3ef84520537f8a09a12c9ddbe814a584dd2
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 11 Oct 2013 14:38:20 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 16 Oct 2013 14:22:16 +0200
sched: Remove get_online_cpus() usage
Remove get_online_cpus() usage from the scheduler; there's 4 sites that
use it:
- sched_init_smp(); where its completely superfluous since we're in
'early' boot and there simply cannot be any hotplugging.
- sched_getaffinity(); we already take a raw spinlock to protect the
task cpus_allowed mask, this disables preemption and therefore
also stabilizes cpu_online_mask as that's modified using
stop_machine. However switch to active mask for symmetry with
sched_setaffinity()/set_cpus_allowed_ptr(). We guarantee active
mask stability by inserting sync_rcu/sched() into _cpu_down.
- sched_setaffinity(); we don't appear to need get_online_cpus()
either, there's two sites where hotplug appears relevant:
* cpuset_cpus_allowed(); for the !cpuset case we use possible_mask,
for the cpuset case we hold task_lock, which is a spinlock and
thus for mainline disables preemption (might cause pain on RT).
* set_cpus_allowed_ptr(); Holds all scheduler locks and thus has
preemption properly disabled; also it already deals with hotplug
races explicitly where it releases them.
- migrate_swap(); we can make stop_two_cpus() do the heavy lifting for
us with a little trickery. By adding a sync_sched/rcu() after the
CPU_DOWN_PREPARE notifier we can provide preempt/rcu guarantees for
cpu_active_mask. Use these to validate that both our cpus are active
when queueing the stop work before we queue the stop_machine works
for take_cpu_down().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131011123820.GV3081@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/cpu.c | 17 +++++++++++++++++
kernel/sched/core.c | 20 ++++++++++----------
kernel/stop_machine.c | 26 +++++++++++++++++++++-----
3 files changed, 48 insertions(+), 15 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d7f07a2..63aa50d 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -308,6 +308,23 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
}
smpboot_park_threads(cpu);
+ /*
+ * By now we've cleared cpu_active_mask, wait for all preempt-disabled
+ * and RCU users of this state to go away such that all new such users
+ * will observe it.
+ *
+ * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might
+ * not imply sync_sched(), so explicitly call both.
+ */
+#ifdef CONFIG_PREEMPT
+ synchronize_sched();
+#endif
+ synchronize_rcu();
+
+ /*
+ * So now all preempt/rcu users must observe !cpu_active().
+ */
+
err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
if (err) {
/* CPU didn't die: tell everyone. Can't complain. */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a972acd..c06b8d3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1085,8 +1085,6 @@ int migrate_swap(struct task_struct *cur, struct task_struct *p)
struct migration_swap_arg arg;
int ret = -EINVAL;
- get_online_cpus();
-
arg = (struct migration_swap_arg){
.src_task = cur,
.src_cpu = task_cpu(cur),
@@ -1097,6 +1095,10 @@ int migrate_swap(struct task_struct *cur, struct task_struct *p)
if (arg.src_cpu == arg.dst_cpu)
goto out;
+ /*
+ * These three tests are all lockless; this is OK since all of them
+ * will be re-checked with proper locks held further down the line.
+ */
if (!cpu_active(arg.src_cpu) || !cpu_active(arg.dst_cpu))
goto out;
@@ -1109,7 +1111,6 @@ int migrate_swap(struct task_struct *cur, struct task_struct *p)
ret = stop_two_cpus(arg.dst_cpu, arg.src_cpu, migrate_swap_stop, &arg);
out:
- put_online_cpus();
return ret;
}
@@ -3710,7 +3711,6 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
struct task_struct *p;
int retval;
- get_online_cpus();
rcu_read_lock();
p = find_process_by_pid(pid);
@@ -3773,7 +3773,6 @@ out_free_cpus_allowed:
free_cpumask_var(cpus_allowed);
out_put_task:
put_task_struct(p);
- put_online_cpus();
return retval;
}
@@ -3818,7 +3817,6 @@ long sched_getaffinity(pid_t pid, struct cpumask *mask)
unsigned long flags;
int retval;
- get_online_cpus();
rcu_read_lock();
retval = -ESRCH;
@@ -3831,12 +3829,11 @@ long sched_getaffinity(pid_t pid, struct cpumask *mask)
goto out_unlock;
raw_spin_lock_irqsave(&p->pi_lock, flags);
- cpumask_and(mask, &p->cpus_allowed, cpu_online_mask);
+ cpumask_and(mask, &p->cpus_allowed, cpu_active_mask);
raw_spin_unlock_irqrestore(&p->pi_lock, flags);
out_unlock:
rcu_read_unlock();
- put_online_cpus();
return retval;
}
@@ -6494,14 +6491,17 @@ void __init sched_init_smp(void)
sched_init_numa();
- get_online_cpus();
+ /*
+ * There's no userspace yet to cause hotplug operations; hence all the
+ * cpu masks are stable and all blatant races in the below code cannot
+ * happen.
+ */
mutex_lock(&sched_domains_mutex);
init_sched_domains(cpu_active_mask);
cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map);
if (cpumask_empty(non_isolated_cpus))
cpumask_set_cpu(smp_processor_id(), non_isolated_cpus);
mutex_unlock(&sched_domains_mutex);
- put_online_cpus();
hotcpu_notifier(sched_domains_numa_masks_update, CPU_PRI_SCHED_ACTIVE);
hotcpu_notifier(cpuset_cpu_active, CPU_PRI_CPUSET_ACTIVE);
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 32a6c44..c530bc5 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -234,11 +234,13 @@ static void irq_cpu_stop_queue_work(void *arg)
*/
int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
{
- int call_cpu;
struct cpu_stop_done done;
struct cpu_stop_work work1, work2;
struct irq_cpu_stop_queue_work_info call_args;
- struct multi_stop_data msdata = {
+ struct multi_stop_data msdata;
+
+ preempt_disable();
+ msdata = (struct multi_stop_data){
.fn = fn,
.data = arg,
.num_threads = 2,
@@ -262,16 +264,30 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
set_state(&msdata, MULTI_STOP_PREPARE);
/*
+ * If we observe both CPUs active we know _cpu_down() cannot yet have
+ * queued its stop_machine works and therefore ours will get executed
+ * first. Or its not either one of our CPUs that's getting unplugged,
+ * in which case we don't care.
+ *
+ * This relies on the stopper workqueues to be FIFO.
+ */
+ if (!cpu_active(cpu1) || !cpu_active(cpu2)) {
+ preempt_enable();
+ return -ENOENT;
+ }
+
+ /*
* Queuing needs to be done by the lowest numbered CPU, to ensure
* that works are always queued in the same order on every CPU.
* This prevents deadlocks.
*/
- call_cpu = min(cpu1, cpu2);
-
- smp_call_function_single(call_cpu, &irq_cpu_stop_queue_work,
+ smp_call_function_single(min(cpu1, cpu2),
+ &irq_cpu_stop_queue_work,
&call_args, 0);
+ preempt_enable();
wait_for_completion(&done.completion);
+
return done.executed ? done.ret : -ENOENT;
}
next prev parent reply other threads:[~2013-10-17 16:51 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-08 10:25 [PATCH 0/6] Optimize the cpu hotplug locking -v2 Peter Zijlstra
2013-10-08 10:25 ` [PATCH 1/6] hotplug: Optimize {get,put}_online_cpus() Peter Zijlstra
2013-10-08 15:08 ` Rik van Riel
2013-10-10 5:47 ` Andrew Morton
2013-10-10 11:06 ` Oleg Nesterov
2013-10-10 14:55 ` Paul E. McKenney
2013-10-08 10:25 ` [PATCH 2/6] rcu: Create rcu_sync infrastructure Peter Zijlstra
2013-10-08 20:40 ` Jonathan Corbet
2013-10-09 19:52 ` Peter Zijlstra
2013-10-17 2:56 ` Lai Jiangshan
2013-10-17 10:36 ` Srikar Dronamraju
2013-10-08 10:25 ` [PATCH 3/6] hotplug: Optimize cpu_hotplug_{begin,done}() using rcu_sync Peter Zijlstra
2013-10-08 16:28 ` Paul E. McKenney
2013-10-08 10:25 ` [PATCH 4/6] rcusync: Introduce struct rcu_sync_ops Peter Zijlstra
2013-10-08 16:30 ` Paul E. McKenney
2013-10-17 2:07 ` Lai Jiangshan
[not found] ` <20131017154228.GL4553@linux.vnet.ibm.com>
2013-10-18 1:23 ` Lai Jiangshan
2013-10-18 12:10 ` Oleg Nesterov
2013-10-20 16:58 ` Paul E. McKenney
2013-10-08 10:25 ` [PATCH 5/6] rcusync: Add the CONFIG_PROVE_RCU checks Peter Zijlstra
2013-10-08 16:30 ` Paul E. McKenney
2013-10-08 10:25 ` [PATCH 6/6] rcusync: Introduce rcu_sync_dtor() Peter Zijlstra
2013-10-08 16:32 ` Paul E. McKenney
2013-10-08 15:27 ` [PATCH 0/6] Optimize the cpu hotplug locking -v2 Oleg Nesterov
2013-10-08 15:38 ` Peter Zijlstra
2013-10-10 5:50 ` Andrew Morton
2013-10-10 6:27 ` Ingo Molnar
2013-10-10 6:34 ` Andrew Morton
2013-10-10 7:27 ` Ingo Molnar
2013-10-10 7:33 ` Andrew Morton
2013-10-10 7:45 ` Ingo Molnar
2013-10-10 12:19 ` Peter Zijlstra
2013-10-10 14:57 ` Ingo Molnar
2013-10-10 15:21 ` Peter Zijlstra
2013-10-10 15:36 ` Oleg Nesterov
2013-10-10 16:50 ` Ingo Molnar
2013-10-10 17:13 ` Paul E. McKenney
2013-10-10 17:35 ` Ingo Molnar
2013-10-10 18:35 ` Peter Zijlstra
2013-10-10 15:26 ` Oleg Nesterov
2013-10-10 16:00 ` Andrew Morton
2013-10-10 16:36 ` Steven Rostedt
2013-10-10 16:43 ` Andrew Morton
2013-10-10 16:53 ` Peter Zijlstra
2013-10-10 17:13 ` Steven Rostedt
2013-10-10 17:48 ` Andrew Morton
2013-10-10 18:10 ` Linus Torvalds
2013-10-10 18:43 ` Steven Rostedt
2013-10-10 18:50 ` Peter Zijlstra
2013-10-10 19:15 ` Paul E. McKenney
2013-10-10 19:00 ` Linus Torvalds
2013-10-10 18:46 ` Peter Zijlstra
2013-10-10 18:34 ` Peter Zijlstra
2013-10-10 18:49 ` Linus Torvalds
2013-10-10 19:04 ` Steven Rostedt
2013-10-10 19:16 ` Linus Torvalds
2013-10-10 19:34 ` Peter Zijlstra
2013-10-10 19:34 ` Steven Rostedt
2013-10-11 6:09 ` Ingo Molnar
2013-10-11 12:38 ` Peter Zijlstra
2013-10-11 18:25 ` Oleg Nesterov
2013-10-11 20:48 ` Peter Zijlstra
2013-10-12 17:06 ` Oleg Nesterov
2013-10-14 9:05 ` Peter Zijlstra
2013-10-14 9:23 ` Paul E. McKenney
2013-10-15 1:01 ` Paul E. McKenney
2013-10-17 16:49 ` tip-bot for Peter Zijlstra [this message]
2013-10-10 17:39 ` Oleg Nesterov
2013-10-10 16:52 ` Ingo Molnar
2013-10-10 17:44 ` Paul E. McKenney
2013-10-10 16:54 ` Oleg Nesterov
2013-10-10 19:04 ` Srivatsa S. Bhat
2013-10-10 18:52 ` Srivatsa S. Bhat
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-6acce3ef84520537f8a09a12c9ddbe814a584dd2@git.kernel.org \
--to=tipbot@zytor.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=rostedt@goodmis.org \
--cc=srikar@linux.vnet.ibm.com \
--cc=srivatsa.bhat@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).