linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] nohz: A few improvements v2
@ 2015-04-21 12:23 Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
                   ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith,
	Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney,
	Ingo Molnar, Rik van Riel, Martin Schwidefsky

2nd version of the patchset "context_tracking: A few improvements" with
the nohz isolcpus patches from Chris.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	nohz/core

HEAD: b885e51f754a09bce2709c789a913735ee62bfd2

Thanks,
	Frederic
---

Chris Metcalf (2):
      nohz: Add tick_nohz_full_add_cpus_to() API
      nohz: Set isolcpus when nohz_full is set

Frederic Weisbecker (2):
      context_tracking: Protect against recursion
      context_tracking: Inherit TIF_NOHZ through forks instead of context switches


 include/linux/context_tracking.h       | 10 -----
 include/linux/context_tracking_state.h |  1 +
 include/linux/tick.h                   |  7 ++++
 kernel/context_tracking.c              | 75 ++++++++++++++++++++++++----------
 kernel/sched/core.c                    |  4 +-
 5 files changed, 64 insertions(+), 33 deletions(-)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/4] context_tracking: Protect against recursion
  2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
@ 2015-04-21 12:23 ` Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith,
	Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney,
	Ingo Molnar, Rik van Riel, Martin Schwidefsky

Context tracking recursion can happen when an exception triggers in the
middle of a call to a context tracking probe.

This special case can be caused by vmalloc faults. If an access to a
memory area allocated by vmalloc happens in the middle of
context_tracking_enter(), we may run into an endless fault loop because
the exception in turn calls context_tracking_enter() which faults on
the same vmalloc'ed memory, triggering an exception again, etc...

Some rare crashes have been reported so lets protect against this with
a recursion counter.

Reported-by: Dave Jones <davej@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/context_tracking_state.h |  1 +
 kernel/context_tracking.c              | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 6b7b96a..678ecdf 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -12,6 +12,7 @@ struct context_tracking {
 	 * may be further optimized using static keys.
 	 */
 	bool active;
+	int recursion;
 	enum ctx_state {
 		CONTEXT_KERNEL = 0,
 		CONTEXT_USER,
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 72d59a1..b9e0b4f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -38,6 +38,25 @@ void context_tracking_cpu_set(int cpu)
 	}
 }
 
+static bool context_tracking_recursion_enter(void)
+{
+	int recursion;
+
+	recursion = __this_cpu_inc_return(context_tracking.recursion);
+	if (recursion == 1)
+		return true;
+
+	WARN_ONCE((recursion < 1), "Invalid context tracking recursion value %d\n", recursion);
+	__this_cpu_dec(context_tracking.recursion);
+
+	return false;
+}
+
+static void context_tracking_recursion_exit(void)
+{
+	__this_cpu_dec(context_tracking.recursion);
+}
+
 /**
  * context_tracking_enter - Inform the context tracking that the CPU is going
  *                          enter user or guest space mode.
@@ -75,6 +94,11 @@ void context_tracking_enter(enum ctx_state state)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter()) {
+		local_irq_restore(flags);
+		return;
+	}
+
 	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -105,6 +129,7 @@ void context_tracking_enter(enum ctx_state state)
 		 */
 		__this_cpu_write(context_tracking.state, state);
 	}
+	context_tracking_recursion_exit();
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_enter);
@@ -139,6 +164,10 @@ void context_tracking_exit(enum ctx_state state)
 		return;
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter()) {
+		local_irq_restore(flags);
+		return;
+	}
 	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -153,6 +182,7 @@ void context_tracking_exit(enum ctx_state state)
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
 	}
+	context_tracking_recursion_exit();
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_exit);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
@ 2015-04-21 12:23 ` Frederic Weisbecker
  2015-04-21 15:26   ` Peter Zijlstra
  2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith,
	Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney,
	Ingo Molnar, Rik van Riel, Martin Schwidefsky

TIF_NOHZ is used by context_tracking to force syscall slow-path on every
task in order to track userspace roundtrips. As such, it must be set on
all running tasks.

It's currently explicitly inherited through context switches. There is
no need to do it on this fast-path though. The flag could be simply
set once for all on all tasks, whether they are running or not.

Lets do this by setting the flag to init task on early boot and let it
propagate through fork inheritance.

While at it mark context_tracking_cpu_set() as init code, we only need
it at early boot time.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/context_tracking.h | 10 --------
 kernel/context_tracking.c        | 51 ++++++++++++++++++++--------------------
 kernel/sched/core.c              |  1 -
 3 files changed, 26 insertions(+), 36 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 2821838..b96bd29 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -14,8 +14,6 @@ extern void context_tracking_enter(enum ctx_state state);
 extern void context_tracking_exit(enum ctx_state state);
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
-extern void __context_tracking_task_switch(struct task_struct *prev,
-					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
@@ -51,19 +49,11 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next)
-{
-	if (context_tracking_is_enabled())
-		__context_tracking_task_switch(prev, next);
-}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next) { }
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index b9e0b4f..70edc57 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -30,14 +30,6 @@ EXPORT_SYMBOL_GPL(context_tracking_enabled);
 DEFINE_PER_CPU(struct context_tracking, context_tracking);
 EXPORT_SYMBOL_GPL(context_tracking);
 
-void context_tracking_cpu_set(int cpu)
-{
-	if (!per_cpu(context_tracking.active, cpu)) {
-		per_cpu(context_tracking.active, cpu) = true;
-		static_key_slow_inc(&context_tracking_enabled);
-	}
-}
-
 static bool context_tracking_recursion_enter(void)
 {
 	int recursion;
@@ -194,24 +186,33 @@ void context_tracking_user_exit(void)
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
 
-/**
- * __context_tracking_task_switch - context switch the syscall callbacks
- * @prev: the task that is being switched out
- * @next: the task that is being switched in
- *
- * The context tracking uses the syscall slow path to implement its user-kernel
- * boundaries probes on syscalls. This way it doesn't impact the syscall fast
- * path on CPUs that don't do context tracking.
- *
- * But we need to clear the flag on the previous task because it may later
- * migrate to some CPU that doesn't do the context tracking. As such the TIF
- * flag may not be desired there.
- */
-void __context_tracking_task_switch(struct task_struct *prev,
-				    struct task_struct *next)
+void __init context_tracking_cpu_set(int cpu)
 {
-	clear_tsk_thread_flag(prev, TIF_NOHZ);
-	set_tsk_thread_flag(next, TIF_NOHZ);
+	static __initdata bool initialized = false;
+	struct task_struct *p, *t;
+
+	if (!per_cpu(context_tracking.active, cpu)) {
+		per_cpu(context_tracking.active, cpu) = true;
+		static_key_slow_inc(&context_tracking_enabled);
+	}
+
+	if (initialized)
+		return;
+
+	set_tsk_thread_flag(&init_task, TIF_NOHZ);
+
+	/*
+	 * There shouldn't be any thread at this early boot stage
+	 * but the scheduler is ready to host any. So lets walk
+	 * the tasklist just in case. tasklist_lock isn't necessary
+	 * either that early but take it for correctness checkers.
+	 */
+	read_lock(&tasklist_lock);
+	for_each_process_thread(p, t)
+		set_tsk_thread_flag(t, TIF_NOHZ);
+	read_unlock(&tasklist_lock);
+
+	initialized = true;
 }
 
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9123a8..8414a6a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2347,7 +2347,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 */
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 
-	context_tracking_task_switch(prev, next);
 	/* Here we just switch the register state and the stack. */
 	switch_to(prev, next, prev);
 	barrier();
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API
  2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
@ 2015-04-21 12:23 ` Frederic Weisbecker
  2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
  2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra
  4 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
  To: LKML
  Cc: Chris Metcalf, Peter Zijlstra, Mike Galbraith, Ingo Molnar,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Frederic Weisbecker,
	Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

This API is useful to modify a cpumask indicating some special nohz-type
functionality so that the nohz cores are automatically added to that set.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: http://lkml.kernel.org/r/1429024675-18938-1-git-send-email-cmetcalf@ezchip.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tick.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f8492da5..4191b56 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -134,6 +134,12 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -142,6 +148,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
@ 2015-04-21 12:23 ` Frederic Weisbecker
  2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra
  4 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
  To: LKML
  Cc: Chris Metcalf, Peter Zijlstra, Mike Galbraith, Ingo Molnar,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Frederic Weisbecker,
	Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8414a6a..ee4f735 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7051,6 +7051,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
@ 2015-04-21 15:26   ` Peter Zijlstra
  2015-04-21 16:51     ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2015-04-21 15:26 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote:
> +void __init context_tracking_cpu_set(int cpu)
>  {
> +	static __initdata bool initialized = false;
> +	struct task_struct *p, *t;
> +
> +	if (!per_cpu(context_tracking.active, cpu)) {
> +		per_cpu(context_tracking.active, cpu) = true;
> +		static_key_slow_inc(&context_tracking_enabled);
> +	}
> +
> +	if (initialized)
> +		return;
> +
> +	set_tsk_thread_flag(&init_task, TIF_NOHZ);
> +
> +	/*
> +	 * There shouldn't be any thread at this early boot stage
> +	 * but the scheduler is ready to host any. So lets walk
> +	 * the tasklist just in case. tasklist_lock isn't necessary
> +	 * either that early but take it for correctness checkers.
> +	 */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(p, t)
> +		set_tsk_thread_flag(t, TIF_NOHZ);

If there should not be any task, should there not be a WARN_ON_ONCE()
here?

> +	read_unlock(&tasklist_lock);
> +
> +	initialized = true;
>  }

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] nohz: A few improvements v2
  2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
@ 2015-04-21 15:26 ` Peter Zijlstra
  4 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2015-04-21 15:26 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 02:23:05PM +0200, Frederic Weisbecker wrote:
> 2nd version of the patchset "context_tracking: A few improvements" with
> the nohz isolcpus patches from Chris.
> 

Other than the one comment;

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 15:26   ` Peter Zijlstra
@ 2015-04-21 16:51     ` Frederic Weisbecker
  2015-04-21 20:52       ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 16:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote:
> > +void __init context_tracking_cpu_set(int cpu)
> >  {
> > +	static __initdata bool initialized = false;
> > +	struct task_struct *p, *t;
> > +
> > +	if (!per_cpu(context_tracking.active, cpu)) {
> > +		per_cpu(context_tracking.active, cpu) = true;
> > +		static_key_slow_inc(&context_tracking_enabled);
> > +	}
> > +
> > +	if (initialized)
> > +		return;
> > +
> > +	set_tsk_thread_flag(&init_task, TIF_NOHZ);
> > +
> > +	/*
> > +	 * There shouldn't be any thread at this early boot stage
> > +	 * but the scheduler is ready to host any. So lets walk
> > +	 * the tasklist just in case. tasklist_lock isn't necessary
> > +	 * either that early but take it for correctness checkers.
> > +	 */
> > +	read_lock(&tasklist_lock);
> > +	for_each_process_thread(p, t)
> > +		set_tsk_thread_flag(t, TIF_NOHZ);
> 
> If there should not be any task, should there not be a WARN_ON_ONCE()
> here?

Well, it's legal to have a task at that time because sched_init() was called.
I just haven't observed any task other than init/0. But future code (or alternate
configs than mine) might create a task between sched_init() and tick_init(). And
the above code takes care of such a possibility.

> 
> > +	read_unlock(&tasklist_lock);
> > +
> > +	initialized = true;
> >  }

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 16:51     ` Frederic Weisbecker
@ 2015-04-21 20:52       ` Peter Zijlstra
  2015-04-21 21:06         ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2015-04-21 20:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 06:51:54PM +0200, Frederic Weisbecker wrote:
> On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote:
> > > +void __init context_tracking_cpu_set(int cpu)
> > >  {
> > > +	static __initdata bool initialized = false;
> > > +	struct task_struct *p, *t;
> > > +
> > > +	if (!per_cpu(context_tracking.active, cpu)) {
> > > +		per_cpu(context_tracking.active, cpu) = true;
> > > +		static_key_slow_inc(&context_tracking_enabled);
> > > +	}
> > > +
> > > +	if (initialized)
> > > +		return;
> > > +
> > > +	set_tsk_thread_flag(&init_task, TIF_NOHZ);
> > > +
> > > +	/*
> > > +	 * There shouldn't be any thread at this early boot stage
> > > +	 * but the scheduler is ready to host any. So lets walk
> > > +	 * the tasklist just in case. tasklist_lock isn't necessary
> > > +	 * either that early but take it for correctness checkers.
> > > +	 */
> > > +	read_lock(&tasklist_lock);
> > > +	for_each_process_thread(p, t)
> > > +		set_tsk_thread_flag(t, TIF_NOHZ);
> > 
> > If there should not be any task, should there not be a WARN_ON_ONCE()
> > here?
> 
> Well, it's legal to have a task at that time because sched_init() was called.
> I just haven't observed any task other than init/0. But future code (or alternate
> configs than mine) might create a task between sched_init() and tick_init(). And
> the above code takes care of such a possibility.

So why not do it sooner?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 20:52       ` Peter Zijlstra
@ 2015-04-21 21:06         ` Frederic Weisbecker
  2015-04-22 15:35           ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 21:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 10:52:18PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 21, 2015 at 06:51:54PM +0200, Frederic Weisbecker wrote:
> > On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote:
> > > On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote:
> > > > +void __init context_tracking_cpu_set(int cpu)
> > > >  {
> > > > +	static __initdata bool initialized = false;
> > > > +	struct task_struct *p, *t;
> > > > +
> > > > +	if (!per_cpu(context_tracking.active, cpu)) {
> > > > +		per_cpu(context_tracking.active, cpu) = true;
> > > > +		static_key_slow_inc(&context_tracking_enabled);
> > > > +	}
> > > > +
> > > > +	if (initialized)
> > > > +		return;
> > > > +
> > > > +	set_tsk_thread_flag(&init_task, TIF_NOHZ);
> > > > +
> > > > +	/*
> > > > +	 * There shouldn't be any thread at this early boot stage
> > > > +	 * but the scheduler is ready to host any. So lets walk
> > > > +	 * the tasklist just in case. tasklist_lock isn't necessary
> > > > +	 * either that early but take it for correctness checkers.
> > > > +	 */
> > > > +	read_lock(&tasklist_lock);
> > > > +	for_each_process_thread(p, t)
> > > > +		set_tsk_thread_flag(t, TIF_NOHZ);
> > > 
> > > If there should not be any task, should there not be a WARN_ON_ONCE()
> > > here?
> > 
> > Well, it's legal to have a task at that time because sched_init() was called.
> > I just haven't observed any task other than init/0. But future code (or alternate
> > configs than mine) might create a task between sched_init() and tick_init(). And
> > the above code takes care of such a possibility.
> 
> So why not do it sooner?

Because I need the tick nohz intialization to be after irq initialization (init_IRQ()),
which it depends on.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-21 21:06         ` Frederic Weisbecker
@ 2015-04-22 15:35           ` Peter Zijlstra
  2015-04-22 15:51             ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2015-04-22 15:35 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Tue, Apr 21, 2015 at 11:06:04PM +0200, Frederic Weisbecker wrote:
> 
> Because I need the tick nohz intialization to be after irq initialization (init_IRQ()),
> which it depends on.

OK, that's all _way_ before rest_init() which does the very first
fork().

I think we can safely put a WARN in there.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-04-22 15:35           ` Peter Zijlstra
@ 2015-04-22 15:51             ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-22 15:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Wed, Apr 22, 2015 at 05:35:53PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 21, 2015 at 11:06:04PM +0200, Frederic Weisbecker wrote:
> > 
> > Because I need the tick nohz intialization to be after irq initialization (init_IRQ()),
> > which it depends on.
> 
> OK, that's all _way_ before rest_init() which does the very first
> fork().
> 
> I think we can safely put a WARN in there.

Ah good, I'm doing that!

thanks!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-19  2:30                 ` Mike Galbraith
@ 2015-06-12 19:12                   ` Rik van Riel
  0 siblings, 0 replies; 38+ messages in thread
From: Rik van Riel @ 2015-06-12 19:12 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/18/2015 10:30 PM, Mike Galbraith wrote:
> On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote:
> 
>> For real time KVM, it is desirable to run the VCPU threads on
>> isolated CPUs as real time tasks.
>>
>> Meanwhile, the emulator threads can run as normal tasks anywhere
>> on the system.
>>
>> This means the /machine cpuset, which all guests live under,
>> needs to contain all the system's CPUs, both isolated and
>> non-isolated, otherwise it will be unable to have the VCPU
>> threads on isolated CPUs and the emulator threads on other
>> CPUs.
> 
> Ok, I know next to nothing about virtualization only because I failed at
> maintaining the desired absolute zero, but...
> 
> Why do you use isolcpus for this?  This KVM setup sounds very similar to
> what I do for real realtime.  I use a shield script to set up an
> exclusive isolated set and a system set and move the mundane world into
> the system set.  Inject realtime proggy into its exclusive home, it
> creates/pins worker bees, done.  Why the cpuset isolcpus hybrid?

Because libvirt doesn't know any better currently.

Ideally in the long run, we would use only cpusets, and not
boot with isolcpus at all.

Of course, things like irqbalance also key off isolcpus, to
avoid assigning irqs to the isolated cpus, so there are quite
a few puzzle pieces, and the best I can hope for in the short
term is to simply resolve the incompatibilities where one
piece of software undoes the settings required by another...

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-22 14:39                   ` Frederic Weisbecker
@ 2015-05-22 15:20                     ` Mike Galbraith
  0 siblings, 0 replies; 38+ messages in thread
From: Mike Galbraith @ 2015-05-22 15:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Fri, 2015-05-22 at 16:39 +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote:
> > I think it's a mistake to make any rash assumption and/or mandate that
> > the user WILL use nohz_full CPUs immediately, or even at all for that
> > matter.  nohz_full currently is nothing but a CPU attribute, period,
> > nothing more, nothing less.
> 
> That's what the nohz_full parameter is for. Now if you're referring to
> change the nohz_full toggle to a runtime interface such as sysfs or such,
> I don't see that's a pressing need, especially considering the added
> complexity. At least I haven't heard much requests for it.

I'm not advocating anything like that, I'm just standing on my soap box
advocating NOT restricting user choices.  Cpusets works fine for me, and
I'd like them to keep on working.  If I need to add any hooks, buttons,
or rubber tubing I'll do it there ;-)

	-Mike


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 18:59                 ` Mike Galbraith
@ 2015-05-22 14:39                   ` Frederic Weisbecker
  2015-05-22 15:20                     ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-05-22 14:39 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote:
> I think it's a mistake to make any rash assumption and/or mandate that
> the user WILL use nohz_full CPUs immediately, or even at all for that
> matter.  nohz_full currently is nothing but a CPU attribute, period,
> nothing more, nothing less.

That's what the nohz_full parameter is for. Now if you're referring to
change the nohz_full toggle to a runtime interface such as sysfs or such,
I don't see that's a pressing need, especially considering the added
complexity. At least I haven't heard much requests for it.

> The overhead that comes with the it is the
> only thing that suggests that the set really really should be used
> exclusively for isolated compute loads, and even then, they had better
> be pure userspace due to the hefty border crossing overhead.

Yep.

> 
> Let that overhear be seriously reduced, as per the Ingo/Andy discussion,
> and presto, the things become a very interesting to dynamic sets.  You
> can really really use them as you see fit, and on the fly, it doesn't
> cost you an arm and a leg just to turn it on.  The only restriction
> being the static set declaration, that you have to manage until that too
> goes away.
> 
> I see no reason to turn nohz_full into a rigid construct.

I'm all for making nohz_full just about the tick and drive the rest of the
isolation features through other settings.

Ideally the isolation should be tuned by userspace, we have all the interface
available for that: process affinity, irq affinity, workqueue affinity (soon),
watchdog, etc... Then we'll also add syscall or prctl to loop on hard isolation
(dataplane).

Unfortunately people seem to prefer adding that complexity to the kernel and let
it assume bold and unflexible pre-setting on its own.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
  2015-05-21 13:29                 ` Afzal Mohammed
@ 2015-05-21 18:59                 ` Mike Galbraith
  2015-05-22 14:39                   ` Frederic Weisbecker
  2 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2015-05-21 18:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, 2015-05-21 at 15:06 +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > > Hi,
> > > 
> > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > > 
> > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > > time as compared to w/o this patch, except boot cpu every one else
> > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > > it was working fine, but not after this - it is now like a single core
> > > > > system.
> > > > 
> > > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > > to do for you?
> > > 
> > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > > 
> > > Thought that shutting down ticks as much as possible would be
> > > beneficial to normal loads too, though it has been mentioned to be used
> > > for specialized loads. Seems like drawbacks due to it weigh against
> > > normal loads, but haven't so far observed any (on a laptop with normal
> > > activities) before this change.
> > 
> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?
> 
> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> Anyway, I need to look at the patchset.

I think it's a mistake to make any rash assumption and/or mandate that
the user WILL use nohz_full CPUs immediately, or even at all for that
matter.  nohz_full currently is nothing but a CPU attribute, period,
nothing more, nothing less.  The overhead that comes with the it is the
only thing that suggests that the set really really should be used
exclusively for isolated compute loads, and even then, they had better
be pure userspace due to the hefty border crossing overhead.

Let that overhear be seriously reduced, as per the Ingo/Andy discussion,
and presto, the things become a very interesting to dynamic sets.  You
can really really use them as you see fit, and on the fly, it doesn't
cost you an arm and a leg just to turn it on.  The only restriction
being the static set declaration, that you have to manage until that too
goes away.

I see no reason to turn nohz_full into a rigid construct.

	-Mike


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 14:14                   ` Paul E. McKenney
@ 2015-05-21 14:46                     ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-05-21 14:46 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 07:14:09AM -0700, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote:
> > Hi,
> > 
> > On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> > > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > 
> > > > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > > > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > > > normal-workload distractions or for some real-time systems.  For mixed
> > > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > > > use the boot parameters to select which CPUs are to be running the
> > > > specialized portion of the workload.
> > > > 
> > > > And you would of course need to lead enough CPUs running normally to
> > > > handle the non-specialized portion of the workload.
> > > > 
> > > > This sort of thing has traditionally required specialized kernels,
> > > > so the cool thing here is that we can make Linux do it.  Though, as
> > > > you noticed, careful configuration is still required.
> > > > 
> > > > Seem reasonable?
> > 
> > Yes, thanks, some dots got connected :)
> > 
> > > That said if he saw a big performance regression after applying these patches,
> > > then there is likely a problem in the patchset. Well it could be due to that mode
> > > which loops on full dynticks before resuming to userspace. Indeed when that is
> > > enabled, I expect real throughput issues on workloads doing lots of kernel <->
> > > userspace roundtrips. We just need to make sure this thing only works when requested.
> > 
> > With this change (& having NO_HZ_FULL_ALL), hackbench was being served
> > only by the boot cpu, while w/o this change, all 8 (this is a quad
> > core HT processor) was being used - observation based on 'top'.
> 
> Good to know!  And it will be interesting to see what Frederic decides
> based on his review of the patchset.

Ah wait I was confusing this patchset with the dataplane mode.

No the performance loss seen by Afzal is actually expected. Now that we set
all nohz full CPUs as isolcpus, when you start a process such as hackbench,
it will be affine to CPU 0 unless you explicitly tweak the affinity of
hackbench.

So yeah this is a desired effect :-)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:29                 ` Afzal Mohammed
@ 2015-05-21 14:14                   ` Paul E. McKenney
  2015-05-21 14:46                     ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Paul E. McKenney @ 2015-05-21 14:14 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Frederic Weisbecker, Mike Galbraith, Sasha Levin, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> 
> > > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > > normal-workload distractions or for some real-time systems.  For mixed
> > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > > use the boot parameters to select which CPUs are to be running the
> > > specialized portion of the workload.
> > > 
> > > And you would of course need to lead enough CPUs running normally to
> > > handle the non-specialized portion of the workload.
> > > 
> > > This sort of thing has traditionally required specialized kernels,
> > > so the cool thing here is that we can make Linux do it.  Though, as
> > > you noticed, careful configuration is still required.
> > > 
> > > Seem reasonable?
> 
> Yes, thanks, some dots got connected :)
> 
> > That said if he saw a big performance regression after applying these patches,
> > then there is likely a problem in the patchset. Well it could be due to that mode
> > which loops on full dynticks before resuming to userspace. Indeed when that is
> > enabled, I expect real throughput issues on workloads doing lots of kernel <->
> > userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> With this change (& having NO_HZ_FULL_ALL), hackbench was being served
> only by the boot cpu, while w/o this change, all 8 (this is a quad
> core HT processor) was being used - observation based on 'top'.

Good to know!  And it will be interesting to see what Frederic decides
based on his review of the patchset.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
@ 2015-05-21 13:29                 ` Afzal Mohammed
  2015-05-21 14:14                   ` Paul E. McKenney
  2015-05-21 18:59                 ` Mike Galbraith
  2 siblings, 1 reply; 38+ messages in thread
From: Afzal Mohammed @ 2015-05-21 13:29 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

Hi,

On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:

> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?

Yes, thanks, some dots got connected :)

> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.

With this change (& having NO_HZ_FULL_ALL), hackbench was being served
only by the boot cpu, while w/o this change, all 8 (this is a quad
core HT processor) was being used - observation based on 'top'.

Regards
Afzal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
@ 2015-05-21 13:27                 ` Paul E. McKenney
  2015-05-21 13:29                 ` Afzal Mohammed
  2015-05-21 18:59                 ` Mike Galbraith
  2 siblings, 0 replies; 38+ messages in thread
From: Paul E. McKenney @ 2015-05-21 13:27 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > > Hi,
> > > 
> > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > > 
> > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > > time as compared to w/o this patch, except boot cpu every one else
> > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > > it was working fine, but not after this - it is now like a single core
> > > > > system.
> > > > 
> > > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > > to do for you?
> > > 
> > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > > 
> > > Thought that shutting down ticks as much as possible would be
> > > beneficial to normal loads too, though it has been mentioned to be used
> > > for specialized loads. Seems like drawbacks due to it weigh against
> > > normal loads, but haven't so far observed any (on a laptop with normal
> > > activities) before this change.
> > 
> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?
> 
> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> Anyway, I need to look at the patchset.

Fair enough!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 12:57             ` Paul E. McKenney
@ 2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-05-21 13:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > Hi,
> > 
> > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > 
> > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > 
> > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > time as compared to w/o this patch, except boot cpu every one else
> > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > it was working fine, but not after this - it is now like a single core
> > > > system.
> > > 
> > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > to do for you?
> > 
> > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > 
> > Thought that shutting down ticks as much as possible would be
> > beneficial to normal loads too, though it has been mentioned to be used
> > for specialized loads. Seems like drawbacks due to it weigh against
> > normal loads, but haven't so far observed any (on a laptop with normal
> > activities) before this change.
> 
> Indeed, NO_HZ_FULL is special purpose.  You normally would select
> NO_HZ_FULL_ALL only on a system intended for heavy compute without
> normal-workload distractions or for some real-time systems.  For mixed
> workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> use the boot parameters to select which CPUs are to be running the
> specialized portion of the workload.
> 
> And you would of course need to lead enough CPUs running normally to
> handle the non-specialized portion of the workload.
> 
> This sort of thing has traditionally required specialized kernels,
> so the cool thing here is that we can make Linux do it.  Though, as
> you noticed, careful configuration is still required.
> 
> Seem reasonable?

That said if he saw a big performance regression after applying these patches,
then there is likely a problem in the patchset. Well it could be due to that mode
which loops on full dynticks before resuming to userspace. Indeed when that is
enabled, I expect real throughput issues on workloads doing lots of kernel <->
userspace roundtrips. We just need to make sure this thing only works when requested.

Anyway, I need to look at the patchset.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 12:12           ` Afzal Mohammed
@ 2015-05-21 12:57             ` Paul E. McKenney
  2015-05-21 13:06               ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Paul E. McKenney @ 2015-05-21 12:57 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> 
> > > > Given that kernel initiated association to isolcpus, a user turning
> > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > 
> > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > time as compared to w/o this patch, except boot cpu every one else
> > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > it was working fine, but not after this - it is now like a single core
> > > system.
> > 
> > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > to do for you?
> 
> I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> 
> Thought that shutting down ticks as much as possible would be
> beneficial to normal loads too, though it has been mentioned to be used
> for specialized loads. Seems like drawbacks due to it weigh against
> normal loads, but haven't so far observed any (on a laptop with normal
> activities) before this change.

Indeed, NO_HZ_FULL is special purpose.  You normally would select
NO_HZ_FULL_ALL only on a system intended for heavy compute without
normal-workload distractions or for some real-time systems.  For mixed
workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
use the boot parameters to select which CPUs are to be running the
specialized portion of the workload.

And you would of course need to lead enough CPUs running normally to
handle the non-specialized portion of the workload.

This sort of thing has traditionally required specialized kernels,
so the cool thing here is that we can make Linux do it.  Though, as
you noticed, careful configuration is still required.

Seem reasonable?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-20 21:00         ` Paul E. McKenney
@ 2015-05-21 12:12           ` Afzal Mohammed
  2015-05-21 12:57             ` Paul E. McKenney
  0 siblings, 1 reply; 38+ messages in thread
From: Afzal Mohammed @ 2015-05-21 12:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

Hi,

On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:

> > > Given that kernel initiated association to isolcpus, a user turning
> > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > 
> > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > time as compared to w/o this patch, except boot cpu every one else
> > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > it was working fine, but not after this - it is now like a single core
> > system.
> 
> I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> to do for you?

I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.

Thought that shutting down ticks as much as possible would be
beneficial to normal loads too, though it has been mentioned to be used
for specialized loads. Seems like drawbacks due to it weigh against
normal loads, but haven't so far observed any (on a laptop with normal
activities) before this change.

Regards
Afzal

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-20 20:38       ` Afzal Mohammed
@ 2015-05-20 21:00         ` Paul E. McKenney
  2015-05-21 12:12           ` Afzal Mohammed
  0 siblings, 1 reply; 38+ messages in thread
From: Paul E. McKenney @ 2015-05-20 21:00 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 02:08:09AM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote:
> 
> > Yeah, tying nohz_full set to isolcpus set up an initial condition that
> > you have to tear down with cpusets if you want those cpus returned to
> > the general purpose pool.  I had considered the kernel setting initial
> > state to be an optimization, but have reconsidered.
> 
> > Given that kernel initiated association to isolcpus, a user turning
> > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> 
> On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> time as compared to w/o this patch, except boot cpu every one else
> jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> it was working fine, but not after this - it is now like a single core
> system.

I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
to do for you?

							Thanx, Paul

> Regards
> Afzal
> 
> > he/she does not have CPUSETS enabled, or should Rik's patch rendering
> > isolcpus immutable be merged, he/she would quickly discover that the
> > generic box has been transformed into an utterly inflexible specialist
> > incapable of performing mundane tasks ever again.
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-17  5:30     ` Mike Galbraith
  2015-05-18  2:17       ` Rik van Riel
@ 2015-05-20 20:38       ` Afzal Mohammed
  2015-05-20 21:00         ` Paul E. McKenney
  1 sibling, 1 reply; 38+ messages in thread
From: Afzal Mohammed @ 2015-05-20 20:38 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

Hi,

On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote:

> Yeah, tying nohz_full set to isolcpus set up an initial condition that
> you have to tear down with cpusets if you want those cpus returned to
> the general purpose pool.  I had considered the kernel setting initial
> state to be an optimization, but have reconsidered.

> Given that kernel initiated association to isolcpus, a user turning
> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If

On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
time as compared to w/o this patch, except boot cpu every one else
jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
it was working fine, but not after this - it is now like a single core
system.

Regards
Afzal

> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> isolcpus immutable be merged, he/she would quickly discover that the
> generic box has been transformed into an utterly inflexible specialist
> incapable of performing mundane tasks ever again.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:52               ` Rik van Riel
@ 2015-05-19  2:30                 ` Mike Galbraith
  2015-06-12 19:12                   ` Rik van Riel
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2015-05-19  2:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote:

> For real time KVM, it is desirable to run the VCPU threads on
> isolated CPUs as real time tasks.
> 
> Meanwhile, the emulator threads can run as normal tasks anywhere
> on the system.
> 
> This means the /machine cpuset, which all guests live under,
> needs to contain all the system's CPUs, both isolated and
> non-isolated, otherwise it will be unable to have the VCPU
> threads on isolated CPUs and the emulator threads on other
> CPUs.

Ok, I know next to nothing about virtualization only because I failed at
maintaining the desired absolute zero, but...

Why do you use isolcpus for this?  This KVM setup sounds very similar to
what I do for real realtime.  I use a shield script to set up an
exclusive isolated set and a system set and move the mundane world into
the system set.  Inject realtime proggy into its exclusive home, it
creates/pins worker bees, done.  Why the cpuset isolcpus hybrid?

When you say emulator threads can run anywhere, it sounds like that
includes the isolated set, but even so, nothing prevents injecting
whatever (the kernel permits) into whichever set.

	-Mike



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:22             ` Mike Galbraith
@ 2015-05-18 14:52               ` Rik van Riel
  2015-05-19  2:30                 ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2015-05-18 14:52 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/18/2015 10:22 AM, Mike Galbraith wrote:
> On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote:
>> On 05/17/2015 11:29 PM, Mike Galbraith wrote:
>>> On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
>>>> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
>>>>
>>>>> Given that kernel initiated association to isolcpus, a user turning
>>>>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
>>>>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
>>>>> isolcpus immutable be merged, 
>>>>
>>>> My patch does not aim to make isolcpus immutable, it aims to make
>>>> isolcpus resistent to system management tools (like libvirt)
>>>> automatically undoing isolcpus the instant a cpuset with the default
>>>> cpus (inherited from the root group) is created.
>>>
>>> Aim or not, if cpusets is the sole modifier, it'll render isolcpus
>>> immutable, no?  Cpusets could grow an override to the override I
>>> suppose, to regain control of the resource it thinks it manages.
>>
>> The other way would be to make /sys/devices/system/cpu/isolcpus
>> (which Greg KH promised he would queue up for 4.2) writable.
> 
> Anything is better than override the override.  That's easy, but the
> changelog would have to be farmed out to a talented used car salesman.

For real time KVM, it is desirable to run the VCPU threads on
isolated CPUs as real time tasks.

Meanwhile, the emulator threads can run as normal tasks anywhere
on the system.

This means the /machine cpuset, which all guests live under,
needs to contain all the system's CPUs, both isolated and
non-isolated, otherwise it will be unable to have the VCPU
threads on isolated CPUs and the emulator threads on other
CPUs.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:07           ` Rik van Riel
@ 2015-05-18 14:22             ` Mike Galbraith
  2015-05-18 14:52               ` Rik van Riel
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2015-05-18 14:22 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote:
> On 05/17/2015 11:29 PM, Mike Galbraith wrote:
> > On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
> >> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
> >>
> >>> Given that kernel initiated association to isolcpus, a user turning
> >>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> >>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> >>> isolcpus immutable be merged, 
> >>
> >> My patch does not aim to make isolcpus immutable, it aims to make
> >> isolcpus resistent to system management tools (like libvirt)
> >> automatically undoing isolcpus the instant a cpuset with the default
> >> cpus (inherited from the root group) is created.
> > 
> > Aim or not, if cpusets is the sole modifier, it'll render isolcpus
> > immutable, no?  Cpusets could grow an override to the override I
> > suppose, to regain control of the resource it thinks it manages.
> 
> The other way would be to make /sys/devices/system/cpu/isolcpus
> (which Greg KH promised he would queue up for 4.2) writable.

Anything is better than override the override.  That's easy, but the
changelog would have to be farmed out to a talented used car salesman.

	-Mike


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18  3:29         ` Mike Galbraith
@ 2015-05-18 14:07           ` Rik van Riel
  2015-05-18 14:22             ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2015-05-18 14:07 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/17/2015 11:29 PM, Mike Galbraith wrote:
> On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
>> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
>>
>>> Given that kernel initiated association to isolcpus, a user turning
>>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
>>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
>>> isolcpus immutable be merged, 
>>
>> My patch does not aim to make isolcpus immutable, it aims to make
>> isolcpus resistent to system management tools (like libvirt)
>> automatically undoing isolcpus the instant a cpuset with the default
>> cpus (inherited from the root group) is created.
> 
> Aim or not, if cpusets is the sole modifier, it'll render isolcpus
> immutable, no?  Cpusets could grow an override to the override I
> suppose, to regain control of the resource it thinks it manages.

The other way would be to make /sys/devices/system/cpu/isolcpus
(which Greg KH promised he would queue up for 4.2) writable.

I am all for making isolcpus changeable at run time. I just want
to prevent it being changed accidentally at run time, by system
tools that want to place their workloads in cpusets.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18  2:17       ` Rik van Riel
@ 2015-05-18  3:29         ` Mike Galbraith
  2015-05-18 14:07           ` Rik van Riel
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2015-05-18  3:29 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
> 
> > Given that kernel initiated association to isolcpus, a user turning
> > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > he/she does not have CPUSETS enabled, or should Rik's patch rendering
> > isolcpus immutable be merged, 
> 
> My patch does not aim to make isolcpus immutable, it aims to make
> isolcpus resistent to system management tools (like libvirt)
> automatically undoing isolcpus the instant a cpuset with the default
> cpus (inherited from the root group) is created.

Aim or not, if cpusets is the sole modifier, it'll render isolcpus
immutable, no?  Cpusets could grow an override to the override I
suppose, to regain control of the resource it thinks it manages.

	-Mike


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-17  5:30     ` Mike Galbraith
@ 2015-05-18  2:17       ` Rik van Riel
  2015-05-18  3:29         ` Mike Galbraith
  2015-05-20 20:38       ` Afzal Mohammed
  1 sibling, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2015-05-18  2:17 UTC (permalink / raw)
  To: Mike Galbraith, Sasha Levin
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf,
	Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/17/2015 01:30 AM, Mike Galbraith wrote:

> Given that kernel initiated association to isolcpus, a user turning
> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> isolcpus immutable be merged, 

My patch does not aim to make isolcpus immutable, it aims to make
isolcpus resistent to system management tools (like libvirt)
automatically undoing isolcpus the instant a cpuset with the default
cpus (inherited from the root group) is created.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-16 19:39   ` Sasha Levin
@ 2015-05-17  5:30     ` Mike Galbraith
  2015-05-18  2:17       ` Rik van Riel
  2015-05-20 20:38       ` Afzal Mohammed
  0 siblings, 2 replies; 38+ messages in thread
From: Mike Galbraith @ 2015-05-17  5:30 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf,
	Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Sat, 2015-05-16 at 15:39 -0400, Sasha Levin wrote:
> On 05/06/2015 12:04 PM, Frederic Weisbecker wrote:
> > From: Chris Metcalf <cmetcalf@ezchip.com>
> > 
> > nohz_full is only useful with isolcpus also set, since otherwise the
> > scheduler has to run periodically to try to determine whether to steal
> > work from other cores.
> > 
> > Accordingly, when booting with nohz_full=xxx on the command line, we
> > should act as if isolcpus=xxx was also set, and set (or extend) the
> > isolcpus set to include the nohz_full cpus.
> > 
> > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> > Acked-by: Rik van Riel <riel@redhat.com>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> 
> Hi folks,
> 
> I've noticed a regression in my testing a few days ago and bisected it down to
> this patch. I was seeing frequent soft lockups/RCU lockups and the load of the
> testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them
> with nohz_full=1-27.
> 
> This patch sort of explains the behaviour I was seeing now: most of the cores
> are no longer being used by the scheduler, and the remaining cores can't deal
> with the load imposed on them which results in "lockups" which are really just
> the CPUs being unable to keep up.
> 
> I always thought that nohz_full without isolcpus meant that the cores would
> be available to the scheduler, but it won't interfere if there is one task
> running on them. It seems that this patch changed that behaviour.
> 
> Did I misunderstand that?

Yeah, tying nohz_full set to isolcpus set up an initial condition that
you have to tear down with cpusets if you want those cpus returned to
the general purpose pool.  I had considered the kernel setting initial
state to be an optimization, but have reconsidered.

You may have misunderstood somewhat though, if you do not explicitly
isolate the nohz_full set from the scheduler via either isolcpus or
cpusets, there is no exclusion from load balancing, the scheduler may
place additional tasks on a nohz_full cpu to resolve load imbalance.

Given that kernel initiated association to isolcpus, a user turning
NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
he/she does not have CPUSETS enabled, or should Rik's patch rendering
isolcpus immutable be merged, he/she would quickly discover that the
generic box has been transformed into an utterly inflexible specialist
incapable of performing mundane tasks ever again.

	-Mike


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
@ 2015-05-16 19:39   ` Sasha Levin
  2015-05-17  5:30     ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Sasha Levin @ 2015-05-16 19:39 UTC (permalink / raw)
  To: Frederic Weisbecker, Ingo Molnar
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On 05/06/2015 12:04 PM, Frederic Weisbecker wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
> 
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
> 
> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

Hi folks,

I've noticed a regression in my testing a few days ago and bisected it down to
this patch. I was seeing frequent soft lockups/RCU lockups and the load of the
testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them
with nohz_full=1-27.

This patch sort of explains the behaviour I was seeing now: most of the cores
are no longer being used by the scheduler, and the remaining cores can't deal
with the load imposed on them which results in "lockups" which are really just
the CPUs being unable to keep up.

I always thought that nohz_full without isolcpus meant that the cores would
be available to the scheduler, but it won't interfere if there is one task
running on them. It seems that this patch changed that behaviour.

Did I misunderstand that?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
@ 2015-05-06 16:04 ` Frederic Weisbecker
  2015-05-16 19:39   ` Sasha Levin
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Frederic Weisbecker,
	Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6f149f8..e95b4d8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7042,6 +7042,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-04-25 23:13     ` Frederic Weisbecker
@ 2015-04-25 23:50       ` Gene Heskett
  0 siblings, 0 replies; 38+ messages in thread
From: Gene Heskett @ 2015-04-25 23:50 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Saturday 25 April 2015 19:13:10 Frederic Weisbecker wrote:
> On Fri, Apr 24, 2015 at 04:07:52PM -0400, Gene Heskett wrote:
> > On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote:
> > > From: Chris Metcalf <cmetcalf@ezchip.com>
> > >
> > > nohz_full is only useful with isolcpus also set, since otherwise
> > > the scheduler has to run periodically to try to determine whether
> > > to steal work from other cores.
> > >
> > > Accordingly, when booting with nohz_full=xxx on the command line,
> > > we should act as if isolcpus=xxx was also set, and set (or extend)
> > > the isolcpus set to include the nohz_full cpus.
> > >
> > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> > > Acked-by: Rik van Riel <riel@redhat.com>
> > > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Rik van Riel <riel@redhat.com>
> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> >
> > As a user of LinuxCNC, we expect the core(s) so isolated by the
> > isolcpus argument at bootup time to remain undisturbed in order to
> > preserve the I/O updates heartbeat latency at the absolute minimum
> > that board and cpu combo can accomplish.
> >
> > If this patch changes that behaviour such that the isolated core is
> > grabbed for another job while the RTAI bits and pieces are loaded
> > and running machinery, causing the machinery to lose this steady,
> > possibly as little as a 20 microsecond period repeating operation
> > heartbeat, this will quite effectively destroy our ability to run
> > this software on linux without farming that whole operation out to
> > intelligent I/O cards.
> >
> > Please keep this in mind.  I don't read the patch well enough to
> > determine this myself.
>
> I think it's not a problem. This patch isn't adding any work or noise
> to isolcpus, it's actually setting CPUs that run tickless in userspace
> to be part of the isolcpus set, because we need userspace-tickless
> CPUs to not be disturbed at all. You shouldn't be concerned if you
> don't use full dynticks.
>
> If you happen to use full dynticks in your usecase one day, you'll use
> it on your isolcpus anyway.

TBT I am probably too paranoid for my own good, but I did want to be 
heard from the far corner of the cheap seats in the bleachers.  I write 
my stuff in bash, but there are, I expect, some guys in our group who 
could given enough time, run down anything that breaks it for us and 
patch it back out.  We are the guys who generally run an RTAI patched 
kernel anyway.

Some of the machinery driven by LinuxCNC is pretty good sized stuff, like 
a 5 axis milling machine in Cincinnati that may be the biggest one 
Cincinnati ever made, bed is 26 feet long, nearly 6 feet wide. The whole 
thing is a bit over 200k lbs.  Bad controller, controller company gone, 
sold at scrap cheap.  Adapted it to be run by LinuxCNC. After 
commissioning it, he turned his shop machinists loose to use it.  First 
job was a locomotive axle bearing seat in a multi-ton truck casting, 
needed .001" tolerance for a shrink fitted bearing.  Measured when done, 
it was off .0002" worst case.  It has been detected by the seismographs 
at the UNI across town. :)  My biggest machine is less than 200 lbs.  
Toy's IOW.

Thank you for taking the time to reply, Frederick.  Its appreciated.

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-04-24 20:07   ` Gene Heskett
@ 2015-04-25 23:13     ` Frederic Weisbecker
  2015-04-25 23:50       ` Gene Heskett
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-25 23:13 UTC (permalink / raw)
  To: Gene Heskett
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Fri, Apr 24, 2015 at 04:07:52PM -0400, Gene Heskett wrote:
> On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote:
> > From: Chris Metcalf <cmetcalf@ezchip.com>
> >
> > nohz_full is only useful with isolcpus also set, since otherwise the
> > scheduler has to run periodically to try to determine whether to steal
> > work from other cores.
> >
> > Accordingly, when booting with nohz_full=xxx on the command line, we
> > should act as if isolcpus=xxx was also set, and set (or extend) the
> > isolcpus set to include the nohz_full cpus.
> >
> > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> > Acked-by: Rik van Riel <riel@redhat.com>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> 
> As a user of LinuxCNC, we expect the core(s) so isolated by the isolcpus 
> argument at bootup time to remain undisturbed in order to preserve the 
> I/O updates heartbeat latency at the absolute minimum that board and cpu 
> combo can accomplish.
> 
> If this patch changes that behaviour such that the isolated core is 
> grabbed for another job while the RTAI bits and pieces are loaded and 
> running machinery, causing the machinery to lose this steady, possibly 
> as little as a 20 microsecond period repeating operation heartbeat, this 
> will quite effectively destroy our ability to run this software on linux 
> without farming that whole operation out to intelligent I/O cards.
> 
> Please keep this in mind.  I don't read the patch well enough to 
> determine this myself.

I think it's not a problem. This patch isn't adding any work or noise to
isolcpus, it's actually setting CPUs that run tickless in userspace to be
part of the isolcpus set, because we need userspace-tickless CPUs to not
be disturbed at all. You shouldn't be concerned if you don't use full
dynticks.

If you happen to use full dynticks in your usecase one day, you'll use it
on your isolcpus anyway.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-04-24 15:58 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
@ 2015-04-24 20:07   ` Gene Heskett
  2015-04-25 23:13     ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Gene Heskett @ 2015-04-24 20:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
>
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
>
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
>
> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

As a user of LinuxCNC, we expect the core(s) so isolated by the isolcpus 
argument at bootup time to remain undisturbed in order to preserve the 
I/O updates heartbeat latency at the absolute minimum that board and cpu 
combo can accomplish.

If this patch changes that behaviour such that the isolated core is 
grabbed for another job while the RTAI bits and pieces are loaded and 
running machinery, causing the machinery to lose this steady, possibly 
as little as a 20 microsecond period repeating operation heartbeat, this 
will quite effectively destroy our ability to run this software on linux 
without farming that whole operation out to intelligent I/O cards.

Please keep this in mind.  I don't read the patch well enough to 
determine this myself.

> ---
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 6f149f8..e95b4d8 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7042,6 +7042,9 @@ void __init sched_init_smp(void)
>  	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
>  	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
>
> +	/* nohz_full won't take effect without isolating the cpus. */
> +	tick_nohz_full_add_cpus_to(cpu_isolated_map);
> +
>  	sched_init_numa();
>
>  	/*

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-04-24 15:58 [PATCH 0/4] nohz: A few improvements v3 Frederic Weisbecker
@ 2015-04-24 15:58 ` Frederic Weisbecker
  2015-04-24 20:07   ` Gene Heskett
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-24 15:58 UTC (permalink / raw)
  To: LKML
  Cc: Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Frederic Weisbecker, Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6f149f8..e95b4d8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7042,6 +7042,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2015-06-12 19:12 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker
2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
2015-04-21 15:26   ` Peter Zijlstra
2015-04-21 16:51     ` Frederic Weisbecker
2015-04-21 20:52       ` Peter Zijlstra
2015-04-21 21:06         ` Frederic Weisbecker
2015-04-22 15:35           ` Peter Zijlstra
2015-04-22 15:51             ` Frederic Weisbecker
2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra
2015-04-24 15:58 [PATCH 0/4] nohz: A few improvements v3 Frederic Weisbecker
2015-04-24 15:58 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
2015-04-24 20:07   ` Gene Heskett
2015-04-25 23:13     ` Frederic Weisbecker
2015-04-25 23:50       ` Gene Heskett
2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
2015-05-16 19:39   ` Sasha Levin
2015-05-17  5:30     ` Mike Galbraith
2015-05-18  2:17       ` Rik van Riel
2015-05-18  3:29         ` Mike Galbraith
2015-05-18 14:07           ` Rik van Riel
2015-05-18 14:22             ` Mike Galbraith
2015-05-18 14:52               ` Rik van Riel
2015-05-19  2:30                 ` Mike Galbraith
2015-06-12 19:12                   ` Rik van Riel
2015-05-20 20:38       ` Afzal Mohammed
2015-05-20 21:00         ` Paul E. McKenney
2015-05-21 12:12           ` Afzal Mohammed
2015-05-21 12:57             ` Paul E. McKenney
2015-05-21 13:06               ` Frederic Weisbecker
2015-05-21 13:27                 ` Paul E. McKenney
2015-05-21 13:29                 ` Afzal Mohammed
2015-05-21 14:14                   ` Paul E. McKenney
2015-05-21 14:46                     ` Frederic Weisbecker
2015-05-21 18:59                 ` Mike Galbraith
2015-05-22 14:39                   ` Frederic Weisbecker
2015-05-22 15:20                     ` Mike Galbraith

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).