* [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task()
@ 2014-09-22 18:36 Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

Combine two branches which do the same thing.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
---
 kernel/sched/fair.c |   16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2a1e6ac..420bc98 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5292,24 +5292,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (!tsk_cache_hot)
 		tsk_cache_hot = migrate_degrades_locality(p, env);
 
-	if (migrate_improves_locality(p, env)) {
-#ifdef CONFIG_SCHEDSTATS
+	if (migrate_improves_locality(p, env) || !tsk_cache_hot ||
+	    env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
 		if (tsk_cache_hot) {
 			schedstat_inc(env->sd, lb_hot_gained[env->idle]);
 			schedstat_inc(p, se.statistics.nr_forced_migrations);
 		}
-#endif
-		return 1;
-	}
-
-	if (!tsk_cache_hot ||
-		env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
-
-		if (tsk_cache_hot) {
-			schedstat_inc(env->sd, lb_hot_gained[env->idle]);
-			schedstat_inc(p, se.statistics.nr_forced_migrations);
-		}
-
 		return 1;
 	}
 


* [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
@ 2014-09-22 18:36 ` Kirill Tkhai
  2014-09-23 15:06     ` Peter Zijlstra
  2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

Architectures that define __ARCH_WANT_UNLOCKED_CTXSW may pull
a task while it is in the middle of schedule().

CPU1(task1 calls schedule)            CPU2
...                                   schedule()
...                                      idle_balance()
...                                         load_balance()
...                                            ...
schedule()                                     ...
   prepare_lock_switch()                       ...
      raw_spin_unlock(&rq1->lock)              ...
      ...                                      raw_spin_lock(&rq1->lock)
      ...                                         detach_tasks();
      ...                                            can_migrate_task(task1)
      ...                                         attach_tasks(); <--- move task1 to rq2
      ...                                      raw_spin_unlock(&rq1->lock)
      ...                                context_switch() <--- switch to task1's stack
      ...                                ...
   (using task1's stack)                 (using task1's stack)
   ...                                   ...
   context_switch()                      ...


Parallel use of a single stack is not a good idea.
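
In code terms, the fix below makes every pull/push site wait until the
remote CPU has dropped the task before migrating it. A minimal sketch
of the pattern (illustrative only, not additional code; the names come
from the hunks below):

	#ifdef __ARCH_WANT_UNLOCKED_CTXSW
		cpu_relax__while_on_cpu(p);	/* spin until p->on_cpu == 0 */
	#endif
		deactivate_task(src_rq, p, 0);	/* now it is safe to move p */
		set_task_cpu(p, this_cpu);
		activate_task(this_rq, p, 0);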

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Cc: <stable@vger.kernel.org> # Should this go to stable?
---
 kernel/sched/core.c     |   11 +++--------
 kernel/sched/deadline.c |    7 ++++++-
 kernel/sched/fair.c     |    3 +++
 kernel/sched/rt.c       |    7 ++++++-
 kernel/sched/sched.h    |   16 ++++++++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2a93b87..5b864e9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1700,15 +1700,10 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 
 #ifdef CONFIG_SMP
 	/*
-	 * If the owning (remote) cpu is still in the middle of schedule() with
-	 * this task as prev, wait until its done referencing the task.
+	 * Note that p is dequeued at the moment, but it still
+	 * may be "prev" in the middle of schedule() on another cpu.
 	 */
-	while (p->on_cpu)
-		cpu_relax();
-	/*
-	 * Pairs with the smp_wmb() in finish_lock_switch().
-	 */
-	smp_rmb();
+	cpu_relax__while_on_cpu(p);
 
 	p->sched_contributes_to_load = !!task_contributes_to_load(p);
 	p->state = TASK_WAKING;
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index aaa5abb..ea0ba33 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1364,7 +1364,9 @@ static int push_dl_task(struct rq *rq)
 		next_task = task;
 		goto retry;
 	}
-
+#ifdef __ARCH_WANT_UNLOCKED_CTXSW
+	cpu_relax__while_on_cpu(next_task);
+#endif
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, later_rq->cpu);
 	activate_task(later_rq, next_task, 0);
@@ -1451,6 +1453,9 @@ static int pull_dl_task(struct rq *this_rq)
 
 			ret = 1;
 
+#ifdef __ARCH_WANT_UNLOCKED_CTXSW
+			cpu_relax__while_on_cpu(p);
+#endif
 			deactivate_task(src_rq, p, 0);
 			set_task_cpu(p, this_cpu);
 			activate_task(this_rq, p, 0);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 420bc98..80c5064 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5298,6 +5298,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 			schedstat_inc(env->sd, lb_hot_gained[env->idle]);
 			schedstat_inc(p, se.statistics.nr_forced_migrations);
 		}
+#ifdef __ARCH_WANT_UNLOCKED_CTXSW
+		cpu_relax__while_on_cpu(p);
+#endif
 		return 1;
 	}
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2e6a774..de356b0 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1734,7 +1734,9 @@ static int push_rt_task(struct rq *rq)
 		next_task = task;
 		goto retry;
 	}
-
+#ifdef __ARCH_WANT_UNLOCKED_CTXSW
+	cpu_relax__while_on_cpu(next_task);
+#endif
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, lowest_rq->cpu);
 	activate_task(lowest_rq, next_task, 0);
@@ -1823,6 +1825,9 @@ static int pull_rt_task(struct rq *this_rq)
 
 			ret = 1;
 
+#ifdef __ARCH_WANT_UNLOCKED_CTXSW
+			cpu_relax__while_on_cpu(p);
+#endif
 			deactivate_task(src_rq, p, 0);
 			set_task_cpu(p, this_cpu);
 			activate_task(this_rq, p, 0);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 1bc6aad..9c07d72 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1034,6 +1034,22 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
 #endif /* __ARCH_WANT_UNLOCKED_CTXSW */
 
 /*
+ * If the owning (remote) cpu is still in the middle of schedule() with
+ * this task as prev, wait until it's done referencing the task.
+ */
+static inline void cpu_relax__while_on_cpu(struct task_struct *p)
+{
+#ifdef CONFIG_SMP
+	while (p->on_cpu)
+		cpu_relax();
+	/*
+	 * Pairs with the smp_wmb() in finish_lock_switch().
+	 */
+	smp_rmb();
+#endif
+}
+
+/*
  * wake flags
  */
 #define WF_SYNC		0x01		/* waker goes to sleep after wakeup */


* [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai
@ 2014-09-22 18:36 ` Kirill Tkhai
  2014-09-24 14:57   ` [tip:sched/core] " tip-bot for Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

dl_bw_of() dereferences rq->rd, which requires the RCU read lock to
be held; the probability of a use-after-free here isn't zero.

Also add a lockdep assert into dl_bw_cpus().
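
In caller terms, the rule these asserts enforce looks like this
(a sketch with illustrative variable names; compare the hunks below):

	rcu_read_lock();		/* pins cpu_rq(cpu)->rd */
	dl_b = dl_bw_of(cpu);
	cpus = dl_bw_cpus(cpu);
	/* ... use the bandwidth data and the root domain ... */
	rcu_read_unlock();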

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Cc: <stable@vger.kernel.org> # v3.14+
---
 kernel/sched/core.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5b864e9..a300fce 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1989,6 +1989,8 @@ unsigned long to_ratio(u64 period, u64 runtime)
 #ifdef CONFIG_SMP
 inline struct dl_bw *dl_bw_of(int i)
 {
+	rcu_lockdep_assert(rcu_read_lock_sched_held(),
+			   "sched RCU must be held");
 	return &cpu_rq(i)->rd->dl_bw;
 }
 
@@ -1997,6 +1999,8 @@ static inline int dl_bw_cpus(int i)
 	struct root_domain *rd = cpu_rq(i)->rd;
 	int cpus = 0;
 
+	rcu_lockdep_assert(rcu_read_lock_sched_held(),
+			   "sched RCU must be held");
 	for_each_cpu_and(i, rd->span, cpu_active_mask)
 		cpus++;
 
@@ -7623,6 +7627,8 @@ static int sched_dl_global_constraints(void)
 	int cpu, ret = 0;
 	unsigned long flags;
 
+	rcu_read_lock();
+
 	/*
 	 * Here we want to check the bandwidth not being set to some
 	 * value smaller than the currently allocated bandwidth in
@@ -7644,6 +7650,8 @@ static int sched_dl_global_constraints(void)
 			break;
 	}
 
+	rcu_read_unlock();
+
 	return ret;
 }
 
@@ -7659,6 +7667,7 @@ static void sched_dl_do_global(void)
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());
 
+	rcu_read_lock();
 	/*
 	 * FIXME: As above...
 	 */
@@ -7669,6 +7678,7 @@ static void sched_dl_do_global(void)
 		dl_b->bw = new_bw;
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 	}
+	rcu_read_unlock();
 }
 
 static int sched_rt_global_validate(void)


* [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai
@ 2014-09-22 18:36 ` Kirill Tkhai
  2014-09-24 14:57   ` [tip:sched/core] sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' tip-bot for Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

Nothing is locked there, so the label's name only confuses the reader.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
---
 kernel/sched/core.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a300fce..3b07710 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4003,14 +4003,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 		rcu_read_lock();
 		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
 			rcu_read_unlock();
-			goto out_unlock;
+			goto out_free_new_mask;
 		}
 		rcu_read_unlock();
 	}
 
 	retval = security_task_setscheduler(p);
 	if (retval)
-		goto out_unlock;
+		goto out_free_new_mask;
 
 
 	cpuset_cpus_allowed(p, cpus_allowed);
@@ -4028,7 +4028,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 
 		if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) {
 			retval = -EBUSY;
-			goto out_unlock;
+			goto out_free_new_mask;
 		}
 	}
 #endif
@@ -4047,7 +4047,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 			goto again;
 		}
 	}
-out_unlock:
+out_free_new_mask:
 	free_cpumask_var(new_mask);
 out_free_cpus_allowed:
 	free_cpumask_var(cpus_allowed);


* [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
                   ` (2 preceding siblings ...)
  2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai
@ 2014-09-22 18:36 ` Kirill Tkhai
  2014-09-24 14:57   ` [tip:sched/core] " tip-bot for Kirill Tkhai
  2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai
  2014-09-24 14:56 ` [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() tip-bot for Kirill Tkhai
  5 siblings, 1 reply; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

The probability of a use-after-free isn't zero here: task_rq(p)->rd
is dereferenced without the RCU read lock held.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Cc: <stable@vger.kernel.org> # v3.14+
---
 kernel/sched/core.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3b07710..643ee99 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4023,13 +4023,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	 * root_domain.
 	 */
 #ifdef CONFIG_SMP
-	if (task_has_dl_policy(p)) {
-		const struct cpumask *span = task_rq(p)->rd->span;
-
-		if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) {
+	if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
+		rcu_read_lock();
+		if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) {
 			retval = -EBUSY;
+			rcu_read_unlock();
 			goto out_free_new_mask;
 		}
+		rcu_read_unlock();
 	}
 #endif
 again:


* [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt()
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
                   ` (3 preceding siblings ...)
  2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai
@ 2014-09-22 18:36 ` Kirill Tkhai
  2014-09-24 14:57   ` [tip:sched/core] " tip-bot for Kirill Tkhai
  2014-09-24 14:56 ` [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task() tip-bot for Kirill Tkhai
  5 siblings, 1 reply; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-22 18:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Kirill Tkhai

From: Kirill Tkhai <ktkhai@parallels.com>

Some time ago PREEMPT_NEED_RESCHED was implemented, so rescheduling
is a little more involved now: merely setting TIF_NEED_RESCHED via
set_tsk_need_resched() is not enough anymore; resched_curr() also
updates the PREEMPT_NEED_RESCHED state.
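
A rough sketch of the difference (simplified and hypothetical, not the
verbatim kernel function; the real resched_curr() in kernel/sched/core.c
also deals with polling idle CPUs):

	static void resched_curr_sketch(struct rq *rq)
	{
		struct task_struct *curr = rq->curr;
		int cpu = cpu_of(rq);

		if (test_tsk_need_resched(curr))
			return;

		set_tsk_need_resched(curr);

		if (cpu == smp_processor_id())
			set_preempt_need_resched();	/* PREEMPT_NEED_RESCHED */
		else
			smp_send_reschedule(cpu);	/* kick the remote CPU */
	}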

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
---
 kernel/sched/rt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index de356b0..c322071 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2077,7 +2077,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
 	for_each_sched_rt_entity(rt_se) {
 		if (rt_se->run_list.prev != rt_se->run_list.next) {
 			requeue_task_rt(rq, p, 0);
-			set_tsk_need_resched(p);
+			resched_curr(rq);
 			return;
 		}
 	}


* [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  2014-09-22 18:36 ` [PATCH v3 2/6] sched: Do not pick a task which is switching on other cpu Kirill Tkhai
@ 2014-09-23 15:06     ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2014-09-23 15:06 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, linux, ralf,
	tony.luck, linux-mips, linux-ia64

On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote:
> From: Kirill Tkhai <ktkhai@parallels.com>
> 
> Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW,
> may pull a task when it's in the middle of schedule().
> 
> CPU1(task1 calls schedule)            CPU2
> ...                                   schedule()
> ...                                      idle_balance()
> ...                                         load_balance()
> ...                                            ...
> schedule()                                     ...
>    prepare_lock_switch()                       ...
>       raw_spin_unlock(&rq1->lock)              ...
>       ...                                      raw_spin_lock(&rq1->lock)
>       ...                                         detach_tasks();
>       ...                                            can_migrate_task(task1)
>       ...                                         attach_tasks(); <--- move task1 to rq2
>       ...                                      raw_spin_unlock(&rq1->lock)
>       ...                                context_switch() <--- switch to task1's stack
>       ...                                ...
>    (using task1's stack)                 (using task1's stack)
>    ...                                   ...
>    context_switch()                      ...
> 
> 
> Parallel use of a single stack is not a good idea.

Indeed it is, but how about we do this instead?

---
Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW

Kirill found that there's a subtle race in the
__ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
entire exception because neither arch that uses it seems to actually
still require it.

Boot tested on mips64el (qemu) only.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/ia64/include/asm/processor.h |  1 -
 arch/mips/include/asm/processor.h |  6 ------
 kernel/sched/core.c               |  6 ------
 kernel/sched/sched.h              | 30 ------------------------------
 4 files changed, 43 deletions(-)

diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index c736713..ce53c50 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -19,7 +19,6 @@
 #include <asm/ptrace.h>
 #include <asm/ustack.h>
 
-#define __ARCH_WANT_UNLOCKED_CTXSW
 #define ARCH_HAS_PREFETCH_SWITCH_STACK
 
 #define IA64_NUM_PHYS_STACK_REG	96
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 05f0843..f1df4cb 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define ARCH_HAS_PREFETCHW
 #define prefetchw(x) __builtin_prefetch((x), 1, 1)
 
-/*
- * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP
- * systems.
- */
-#define __ARCH_WANT_UNLOCKED_CTXSW
-
 #endif
 
 #endif /* _ASM_PROCESSOR_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2a93b87..ccbafb0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
 	 */
 	post_schedule(rq);
 
-#ifdef __ARCH_WANT_UNLOCKED_CTXSW
-	/* In this case, finish_task_switch does not reenable preemption */
-	preempt_enable();
-#endif
 	if (current->set_child_tid)
 		put_user(task_pid_vnr(current), current->set_child_tid);
 }
@@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 * of the scheduler it's an obvious special-case), so we
 	 * do an early lockdep release here:
 	 */
-#ifndef __ARCH_WANT_UNLOCKED_CTXSW
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
-#endif
 
 	context_tracking_task_switch(prev, next);
 	/* Here we just switch the register state and the stack. */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 1bc6aad..d87f122 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
 # define finish_arch_post_lock_switch()	do { } while (0)
 #endif
 
-#ifndef __ARCH_WANT_UNLOCKED_CTXSW
 static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
 {
 #ifdef CONFIG_SMP
@@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
 	raw_spin_unlock_irq(&rq->lock);
 }
 
-#else /* __ARCH_WANT_UNLOCKED_CTXSW */
-static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * We can optimise this out completely for !SMP, because the
-	 * SMP rebalancing from interrupt is the only thing that cares
-	 * here.
-	 */
-	next->on_cpu = 1;
-#endif
-	raw_spin_unlock(&rq->lock);
-}
-
-static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
-	 * We must ensure this doesn't happen until the switch is completely
-	 * finished.
-	 */
-	smp_wmb();
-	prev->on_cpu = 0;
-#endif
-	local_irq_enable();
-}
-#endif /* __ARCH_WANT_UNLOCKED_CTXSW */
-
 /*
  * wake flags
  */

* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  2014-09-23 15:06     ` Peter Zijlstra
@ 2014-09-23 15:16       ` Kirill Tkhai
  -1 siblings, 0 replies; 18+ messages in thread
From: Kirill Tkhai @ 2014-09-23 15:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Kirill Tkhai, oleg, linux, ralf,
	tony.luck, linux-mips, linux-ia64

23.09.2014, 19:06, "Peter Zijlstra" <peterz@infradead.org>:
> On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote:
>>  From: Kirill Tkhai <ktkhai@parallels.com>
>>
>>  Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW,
>>  may pull a task when it's in the middle of schedule().
>>
>>  CPU1(task1 calls schedule)            CPU2
>>  ...                                   schedule()
>>  ...                                      idle_balance()
>>  ...                                         load_balance()
>>  ...                                            ...
>>  schedule()                                     ...
>>     prepare_lock_switch()                       ...
>>        raw_spin_unlock(&rq1->lock)              ...
>>        ...                                      raw_spin_lock(&rq1->lock)
>>        ...                                         detach_tasks();
>>        ...                                            can_migrate_task(task1)
>>        ...                                         attach_tasks(); <--- move task1 to rq2
>>        ...                                      raw_spin_unlock(&rq1->lock)
>>        ...                                context_switch() <--- switch to task1's stack
>>        ...                                ...
>>     (using task1's stack)                 (using task1's stack)
>>     ...                                   ...
>>     context_switch()                      ...
>>
>>  Parallel use of a single stack is not a good idea.
>
> Indeed it is, but how about we do this instead?

Completely agree, looks good to me.

> ---
> Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
>
> Kirill found that there's a subtle race in the
> __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
> entire exception because neither arch that uses it seems to actually
> still require it.
>
> Boot tested on mips64el (qemu) only.
>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Reviewed-by: Kirill Tkhai <ktkhai@parallels.com> (one more review of ia64 part)

* Re: [RFC][PATCH] sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  2014-09-23 15:06     ` Peter Zijlstra
@ 2014-09-23 19:45       ` Guenter Roeck
  -1 siblings, 0 replies; 18+ messages in thread
From: Guenter Roeck @ 2014-09-23 19:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kirill Tkhai, linux-kernel, Ingo Molnar, Kirill Tkhai, oleg,
	ralf, tony.luck, linux-mips, linux-ia64

On Tue, Sep 23, 2014 at 05:06:41PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 22, 2014 at 10:36:18PM +0400, Kirill Tkhai wrote:
> > From: Kirill Tkhai <ktkhai@parallels.com>
> > 
> > Architectures, which define __ARCH_WANT_UNLOCKED_CTXSW,
> > may pull a task when it's in the middle of schedule().
> > 
> > CPU1(task1 calls schedule)            CPU2
> > ...                                   schedule()
> > ...                                      idle_balance()
> > ...                                         load_balance()
> > ...                                            ...
> > schedule()                                     ...
> >    prepare_lock_switch()                       ...
> >       raw_spin_unlock(&rq1->lock)              ...
> >       ...                                      raw_spin_lock(&rq1->lock)
> >       ...                                         detach_tasks();
> >       ...                                            can_migrate_task(task1)
> >       ...                                         attach_tasks(); <--- move task1 to rq2
> >       ...                                      raw_spin_unlock(&rq1->lock)
> >       ...                                context_switch() <--- switch to task1's stack
> >       ...                                ...
> >    (using task1's stack)                 (using task1's stack)
> >    ...                                   ...
> >    context_switch()                      ...
> > 
> > 
> > Parallel use of a single stack is not a good idea.
> 
> Indeed it is, but how about we do this instead?
> 
> ---
> Subject: sched,mips,ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
> 
> Kirill found that there's a subtle race in the
> __ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
> entire exception because neither arch that uses it seems to actually
> still require it.
> 
> Boot tested on mips64el (qemu) only.
> 
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/ia64/include/asm/processor.h |  1 -
>  arch/mips/include/asm/processor.h |  6 ------
>  kernel/sched/core.c               |  6 ------
>  kernel/sched/sched.h              | 30 ------------------------------

You might also want to clean up any references to the define if you remove it.

Documentation/scheduler/sched-arch.txt:you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file
arch/blackfin/kernel/entry.S:    * since Blackfin does not define __ARCH_WANT_UNLOCKED_CTXSW, so

Guenter

>  4 files changed, 43 deletions(-)
> 
> diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
> index c736713..ce53c50 100644
> --- a/arch/ia64/include/asm/processor.h
> +++ b/arch/ia64/include/asm/processor.h
> @@ -19,7 +19,6 @@
>  #include <asm/ptrace.h>
>  #include <asm/ustack.h>
>  
> -#define __ARCH_WANT_UNLOCKED_CTXSW
>  #define ARCH_HAS_PREFETCH_SWITCH_STACK
>  
>  #define IA64_NUM_PHYS_STACK_REG	96
> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> index 05f0843..f1df4cb 100644
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p);
>  #define ARCH_HAS_PREFETCHW
>  #define prefetchw(x) __builtin_prefetch((x), 1, 1)
>  
> -/*
> - * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP
> - * systems.
> - */
> -#define __ARCH_WANT_UNLOCKED_CTXSW
> -
>  #endif
>  
>  #endif /* _ASM_PROCESSOR_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2a93b87..ccbafb0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2304,10 +2304,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
>  	 */
>  	post_schedule(rq);
>  
> -#ifdef __ARCH_WANT_UNLOCKED_CTXSW
> -	/* In this case, finish_task_switch does not reenable preemption */
> -	preempt_enable();
> -#endif
>  	if (current->set_child_tid)
>  		put_user(task_pid_vnr(current), current->set_child_tid);
>  }
> @@ -2350,9 +2346,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
>  	 * of the scheduler it's an obvious special-case), so we
>  	 * do an early lockdep release here:
>  	 */
> -#ifndef __ARCH_WANT_UNLOCKED_CTXSW
>  	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
> -#endif
>  
>  	context_tracking_task_switch(prev, next);
>  	/* Here we just switch the register state and the stack. */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 1bc6aad..d87f122 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -966,7 +966,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
>  # define finish_arch_post_lock_switch()	do { } while (0)
>  #endif
>  
> -#ifndef __ARCH_WANT_UNLOCKED_CTXSW
>  static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
>  {
>  #ifdef CONFIG_SMP
> @@ -1004,35 +1003,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
>  	raw_spin_unlock_irq(&rq->lock);
>  }
>  
> -#else /* __ARCH_WANT_UNLOCKED_CTXSW */
> -static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
> -{
> -#ifdef CONFIG_SMP
> -	/*
> -	 * We can optimise this out completely for !SMP, because the
> -	 * SMP rebalancing from interrupt is the only thing that cares
> -	 * here.
> -	 */
> -	next->on_cpu = 1;
> -#endif
> -	raw_spin_unlock(&rq->lock);
> -}
> -
> -static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
> -{
> -#ifdef CONFIG_SMP
> -	/*
> -	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
> -	 * We must ensure this doesn't happen until the switch is completely
> -	 * finished.
> -	 */
> -	smp_wmb();
> -	prev->on_cpu = 0;
> -#endif
> -	local_irq_enable();
> -}
> -#endif /* __ARCH_WANT_UNLOCKED_CTXSW */
> -
>  /*
>   * wake flags
>   */

* [tip:sched/core] sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  2014-09-23 15:06     ` Peter Zijlstra
                       ` (2 preceding siblings ...)
@ 2014-09-24 14:56     ` tip-bot for Peter Zijlstra
  -1 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Peter Zijlstra @ 2014-09-24 14:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, torvalds, peterz, tony.luck, ralf, akpm, tglx, davidlohr,
	hpa, linux-kernel, james.hogan, qais.yousef, keescook,
	paul.burton, fenghua.yu, tkhai

Commit-ID:  c55f5158f5606f8a62e694b7e009f59b92ac6258
Gitweb:     http://git.kernel.org/tip/c55f5158f5606f8a62e694b7e009f59b92ac6258
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 23 Sep 2014 17:06:41 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:05 +0200

sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW

Kirill found that there's a subtle race in the
__ARCH_WANT_UNLOCKED_CTXSW code, and instead of fixing it, remove the
entire exception because neither arch that uses it seems to actually
still require it.

Boot tested on mips64el (qemu) only.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kirill Tkhai <tkhai@yandex.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: oleg@redhat.com
Cc: linux@roeck-us.net
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20140923150641.GH3312@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/ia64/include/asm/processor.h |  1 -
 arch/mips/include/asm/processor.h |  6 ------
 kernel/sched/core.c               |  6 ------
 kernel/sched/sched.h              | 30 ------------------------------
 4 files changed, 43 deletions(-)

diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index c736713..ce53c50 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -19,7 +19,6 @@
 #include <asm/ptrace.h>
 #include <asm/ustack.h>
 
-#define __ARCH_WANT_UNLOCKED_CTXSW
 #define ARCH_HAS_PREFETCH_SWITCH_STACK
 
 #define IA64_NUM_PHYS_STACK_REG	96
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 05f0843..f1df4cb 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -397,12 +397,6 @@ unsigned long get_wchan(struct task_struct *p);
 #define ARCH_HAS_PREFETCHW
 #define prefetchw(x) __builtin_prefetch((x), 1, 1)
 
-/*
- * See Documentation/scheduler/sched-arch.txt; prevents deadlock on SMP
- * systems.
- */
-#define __ARCH_WANT_UNLOCKED_CTXSW
-
 #endif
 
 #endif /* _ASM_PROCESSOR_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d65566d..5b0eac9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2331,10 +2331,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
 	 */
 	post_schedule(rq);
 
-#ifdef __ARCH_WANT_UNLOCKED_CTXSW
-	/* In this case, finish_task_switch does not reenable preemption */
-	preempt_enable();
-#endif
 	if (current->set_child_tid)
 		put_user(task_pid_vnr(current), current->set_child_tid);
 }
@@ -2377,9 +2373,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 * of the scheduler it's an obvious special-case), so we
 	 * do an early lockdep release here:
 	 */
-#ifndef __ARCH_WANT_UNLOCKED_CTXSW
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
-#endif
 
 	context_tracking_task_switch(prev, next);
 	/* Here we just switch the register state and the stack. */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 16e1ca9..6130251 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -975,7 +975,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
 # define finish_arch_post_lock_switch()	do { } while (0)
 #endif
 
-#ifndef __ARCH_WANT_UNLOCKED_CTXSW
 static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
 {
 #ifdef CONFIG_SMP
@@ -1013,35 +1012,6 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
 	raw_spin_unlock_irq(&rq->lock);
 }
 
-#else /* __ARCH_WANT_UNLOCKED_CTXSW */
-static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * We can optimise this out completely for !SMP, because the
-	 * SMP rebalancing from interrupt is the only thing that cares
-	 * here.
-	 */
-	next->on_cpu = 1;
-#endif
-	raw_spin_unlock(&rq->lock);
-}
-
-static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
-	 * We must ensure this doesn't happen until the switch is completely
-	 * finished.
-	 */
-	smp_wmb();
-	prev->on_cpu = 0;
-#endif
-	local_irq_enable();
-}
-#endif /* __ARCH_WANT_UNLOCKED_CTXSW */
-
 /*
  * wake flags
  */

* [tip:sched/core] sched/fair: Remove duplicate code from can_migrate_task()
  2014-09-22 18:36 [PATCH v3 1/6] sched/fair: Remove duplicate code from can_migrate_task() Kirill Tkhai
                   ` (4 preceding siblings ...)
  2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai
@ 2014-09-24 14:56 ` tip-bot for Kirill Tkhai
  5 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Kirill Tkhai @ 2014-09-24 14:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, tglx

Commit-ID:  7a96c231ca23f0f5622852307df4209afc502ec3
Gitweb:     http://git.kernel.org/tip/7a96c231ca23f0f5622852307df4209afc502ec3
Author:     Kirill Tkhai <ktkhai@parallels.com>
AuthorDate: Mon, 22 Sep 2014 22:36:12 +0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:07 +0200

sched/fair: Remove duplicate code from can_migrate_task()

Combine two branches which do the same thing.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183612.11015.64200.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8cb32f8..10a5a28 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5315,24 +5315,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (!tsk_cache_hot)
 		tsk_cache_hot = migrate_degrades_locality(p, env);
 
-	if (migrate_improves_locality(p, env)) {
-#ifdef CONFIG_SCHEDSTATS
+	if (migrate_improves_locality(p, env) || !tsk_cache_hot ||
+	    env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
 		if (tsk_cache_hot) {
 			schedstat_inc(env->sd, lb_hot_gained[env->idle]);
 			schedstat_inc(p, se.statistics.nr_forced_migrations);
 		}
-#endif
-		return 1;
-	}
-
-	if (!tsk_cache_hot ||
-		env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
-
-		if (tsk_cache_hot) {
-			schedstat_inc(env->sd, lb_hot_gained[env->idle]);
-			schedstat_inc(p, se.statistics.nr_forced_migrations);
-		}
-
 		return 1;
 	}
 

* [tip:sched/core] sched: Use dl_bw_of() under RCU read lock
  2014-09-22 18:36 ` [PATCH v3 3/6] sched: Use dl_bw_of() under RCU read lock Kirill Tkhai
@ 2014-09-24 14:57   ` tip-bot for Kirill Tkhai
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, paulmck, tglx

Commit-ID:  66339c31bc3978d5fff9c4b4cb590a861def4db2
Gitweb:     http://git.kernel.org/tip/66339c31bc3978d5fff9c4b4cb590a861def4db2
Author:     Kirill Tkhai <ktkhai@parallels.com>
AuthorDate: Mon, 22 Sep 2014 22:36:24 +0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:09 +0200

sched: Use dl_bw_of() under RCU read lock

dl_bw_of() dereferences rq->rd, which requires the RCU read lock to
be held; the probability of a use-after-free here isn't zero.

Also add a lockdep assert into dl_bw_cpus().

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.14+
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183624.11015.71558.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5b0eac9..f0adb03 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2021,6 +2021,8 @@ unsigned long to_ratio(u64 period, u64 runtime)
 #ifdef CONFIG_SMP
 inline struct dl_bw *dl_bw_of(int i)
 {
+	rcu_lockdep_assert(rcu_read_lock_sched_held(),
+			   "sched RCU must be held");
 	return &cpu_rq(i)->rd->dl_bw;
 }
 
@@ -2029,6 +2031,8 @@ static inline int dl_bw_cpus(int i)
 	struct root_domain *rd = cpu_rq(i)->rd;
 	int cpus = 0;
 
+	rcu_lockdep_assert(rcu_read_lock_sched_held(),
+			   "sched RCU must be held");
 	for_each_cpu_and(i, rd->span, cpu_active_mask)
 		cpus++;
 
@@ -7645,6 +7649,8 @@ static int sched_dl_global_constraints(void)
 	int cpu, ret = 0;
 	unsigned long flags;
 
+	rcu_read_lock();
+
 	/*
 	 * Here we want to check the bandwidth not being set to some
 	 * value smaller than the currently allocated bandwidth in
@@ -7666,6 +7672,8 @@ static int sched_dl_global_constraints(void)
 			break;
 	}
 
+	rcu_read_unlock();
+
 	return ret;
 }
 
@@ -7681,6 +7689,7 @@ static void sched_dl_do_global(void)
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());
 
+	rcu_read_lock();
 	/*
 	 * FIXME: As above...
 	 */
@@ -7691,6 +7700,7 @@ static void sched_dl_do_global(void)
 		dl_b->bw = new_bw;
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 	}
+	rcu_read_unlock();
 }
 
 static int sched_rt_global_validate(void)

* [tip:sched/core] sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
  2014-09-22 18:36 ` [PATCH v3 4/6] sched: cleanup: Rename out_unlock to out_free_new_mask Kirill Tkhai
@ 2014-09-24 14:57   ` tip-bot for Kirill Tkhai
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, ktkhai, hpa, mingo, peterz, tglx

Commit-ID:  16303ab2fe214635240a8f57cad2cd29792d4e3b
Gitweb:     http://git.kernel.org/tip/16303ab2fe214635240a8f57cad2cd29792d4e3b
Author:     Kirill Tkhai <ktkhai@parallels.com>
AuthorDate: Mon, 22 Sep 2014 22:36:30 +0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:10 +0200

sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'

Nothing is locked there, so the label's name only confuses the reader.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140922183630.11015.59500.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0adb03..316127a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4029,14 +4029,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 		rcu_read_lock();
 		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
 			rcu_read_unlock();
-			goto out_unlock;
+			goto out_free_new_mask;
 		}
 		rcu_read_unlock();
 	}
 
 	retval = security_task_setscheduler(p);
 	if (retval)
-		goto out_unlock;
+		goto out_free_new_mask;
 
 
 	cpuset_cpus_allowed(p, cpus_allowed);
@@ -4054,7 +4054,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 
 		if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) {
 			retval = -EBUSY;
-			goto out_unlock;
+			goto out_free_new_mask;
 		}
 	}
 #endif
@@ -4073,7 +4073,7 @@ again:
 			goto again;
 		}
 	}
-out_unlock:
+out_free_new_mask:
 	free_cpumask_var(new_mask);
 out_free_cpus_allowed:
 	free_cpumask_var(cpus_allowed);

* [tip:sched/core] sched: Use rq->rd in sched_setaffinity() under RCU read lock
  2014-09-22 18:36 ` [PATCH v3 5/6] sched: Use rq->rd in sched_setaffinity() under RCU read lock Kirill Tkhai
@ 2014-09-24 14:57   ` tip-bot for Kirill Tkhai
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, paulmck, tglx

Commit-ID:  f1e3a0932f3a9554371792a7daaf1e0eb19f66d5
Gitweb:     http://git.kernel.org/tip/f1e3a0932f3a9554371792a7daaf1e0eb19f66d5
Author:     Kirill Tkhai <ktkhai@parallels.com>
AuthorDate: Mon, 22 Sep 2014 22:36:36 +0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:11 +0200

sched: Use rq->rd in sched_setaffinity() under RCU read lock

The probability of a use-after-free isn't zero here: task_rq(p)->rd
is dereferenced without the RCU read lock held.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.14+
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183636.11015.83611.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 316127a..b5349fe 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4049,13 +4049,14 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	 * root_domain.
 	 */
 #ifdef CONFIG_SMP
-	if (task_has_dl_policy(p)) {
-		const struct cpumask *span = task_rq(p)->rd->span;
-
-		if (dl_bandwidth_enabled() && !cpumask_subset(span, new_mask)) {
+	if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
+		rcu_read_lock();
+		if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) {
 			retval = -EBUSY;
+			rcu_read_unlock();
 			goto out_free_new_mask;
 		}
+		rcu_read_unlock();
 	}
 #endif
 again:

* [tip:sched/core] sched/rt: Use resched_curr() in task_tick_rt()
  2014-09-22 18:36 ` [PATCH v3 6/6] sched/rt: Use resched_curr() in task_tick_rt() Kirill Tkhai
@ 2014-09-24 14:57   ` tip-bot for Kirill Tkhai
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Kirill Tkhai @ 2014-09-24 14:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, ktkhai, hpa, mingo, torvalds, peterz, tglx

Commit-ID:  8aa6f0ebf41b5fdd186276394bf07e7bd6884d94
Gitweb:     http://git.kernel.org/tip/8aa6f0ebf41b5fdd186276394bf07e7bd6884d94
Author:     Kirill Tkhai <ktkhai@parallels.com>
AuthorDate: Mon, 22 Sep 2014 22:36:43 +0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:47:12 +0200

sched/rt: Use resched_curr() in task_tick_rt()

Some time ago PREEMPT_NEED_RESCHED was implemented, so rescheduling
is a little more involved now: merely setting TIF_NEED_RESCHED via
set_tsk_need_resched() is not enough anymore; resched_curr() also
updates the PREEMPT_NEED_RESCHED state.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140922183642.11015.66039.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/rt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2e6a774..87ea5bf 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2072,7 +2072,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
 	for_each_sched_rt_entity(rt_se) {
 		if (rt_se->run_list.prev != rt_se->run_list.next) {
 			requeue_task_rt(rq, p, 0);
-			set_tsk_need_resched(p);
+			resched_curr(rq);
 			return;
 		}
 	}
