All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] sched/nohz: Affine unpinned timers to housekeepers
@ 2015-09-01 14:50 Frederic Weisbecker
  2015-09-01 14:50 ` [PATCH 1/2] nohz: " Frederic Weisbecker
  2015-09-01 14:51 ` [PATCH 2/2] nohz: Assert existing housekeepers when nohz full enabled Frederic Weisbecker
  0 siblings, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2015-09-01 14:50 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Vatika Harlalka,
	Paul E . McKenney

Ingo,

Please pull the sched/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	sched/core

HEAD: f0673808f014ae18ae4cd8cdc7daf7b3b6c0a144

---
* Move unpinned timers out of full dynticks CPUs that don't want to be
  disturbed (testing showed great results).

* Enforce assertion to make sure that we have housekeepers to handle
  unbound timers, workqueues, timekeepers. Also improve the related
  comments.
---

Thanks,
	Frederic
---

Frederic Weisbecker (1):
      nohz: Assert existing housekeepers when nohz full enabled

Vatika Harlalka (1):
      nohz: Affine unpinned timers to housekeepers


 include/linux/tick.h     |  9 +++++++++
 kernel/sched/core.c      |  7 +++++--
 kernel/time/tick-sched.c | 15 +++++++++++----
 3 files changed, 25 insertions(+), 6 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-01 14:50 [GIT PULL] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
@ 2015-09-01 14:50 ` Frederic Weisbecker
  2015-09-01 19:14   ` Jiang, Yunhong
  2015-09-02 15:57   ` [tip:sched/urgent] " tip-bot for Vatika Harlalka
  2015-09-01 14:51 ` [PATCH 2/2] nohz: Assert existing housekeepers when nohz full enabled Frederic Weisbecker
  1 sibling, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2015-09-01 14:50 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: LKML, Vatika Harlalka, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Frederic Weisbecker,
	Paul E . McKenney

From: Vatika Harlalka <vatikaharlalka@gmail.com>

The problem addressed in this patch is about affining unpinned timers.
Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full dynticks
in NOHZ_FULL configured systems. It should not introduce overhead in
nohz full off case due to static keys.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tick.h | 9 +++++++++
 kernel/sched/core.c  | 7 +++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 48d901f..e312219 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -147,11 +147,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
 		cpumask_or(mask, mask, tick_nohz_full_mask);
 }
 
+static inline int housekeeping_any_cpu(void)
+{
+	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
+
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(void);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+	return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8b864ec..0902e4d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
 	int i, cpu = smp_processor_id();
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu))
+	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
 		return cpu;
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd)) {
-			if (!idle_cpu(i)) {
+			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
+
+	if (!is_housekeeping_cpu(cpu))
+		cpu = housekeeping_any_cpu();
 unlock:
 	rcu_read_unlock();
 	return cpu;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] nohz: Assert existing housekeepers when nohz full enabled
  2015-09-01 14:50 [GIT PULL] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
  2015-09-01 14:50 ` [PATCH 1/2] nohz: " Frederic Weisbecker
@ 2015-09-01 14:51 ` Frederic Weisbecker
  2015-09-02 15:57   ` [tip:sched/urgent] " tip-bot for Frederic Weisbecker
  1 sibling, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2015-09-01 14:51 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Vatika Harlalka,
	Paul E . McKenney

The code ensures that when nohz full is running, at least the boot CPU
serves as a housekeeper and it can't be later offlined.

Let's assert this assumption to make sure that we have CPUs to handle
unbound jobs like workqueues and timers while nohz full CPUs run
undisturbed.

Also improve the comments on housekeeper offlining prevention.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vatika Harlalka <vatikaharlalka@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/time/tick-sched.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..7c7ec45 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -290,16 +290,17 @@ static int __init tick_nohz_full_setup(char *str)
 __setup("nohz_full=", tick_nohz_full_setup);
 
 static int tick_nohz_cpu_down_callback(struct notifier_block *nfb,
-						 unsigned long action,
-						 void *hcpu)
+				       unsigned long action,
+				       void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
 
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_DOWN_PREPARE:
 		/*
-		 * If we handle the timekeeping duty for full dynticks CPUs,
-		 * we can't safely shutdown that CPU.
+		 * The boot CPU handles housekeeping duty (unbound timers,
+		 * workqueues, timekeeping, ...) on behalf of full dynticks
+		 * CPUs. It must remain online when nohz full is enabled.
 		 */
 		if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
 			return NOTIFY_BAD;
@@ -370,6 +371,12 @@ void __init tick_nohz_init(void)
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
+
+	/*
+	 * We need at least one CPU to handle housekeeping work such
+	 * as timekeeping, unbound timers, workqueues, ...
+	 */
+	WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* RE: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-01 14:50 ` [PATCH 1/2] nohz: " Frederic Weisbecker
@ 2015-09-01 19:14   ` Jiang, Yunhong
  2015-09-01 20:47     ` Frederic Weisbecker
  2015-09-02 15:57   ` [tip:sched/urgent] " tip-bot for Vatika Harlalka
  1 sibling, 1 reply; 14+ messages in thread
From: Jiang, Yunhong @ 2015-09-01 19:14 UTC (permalink / raw)
  To: Frederic Weisbecker, Ingo Molnar, Peter Zijlstra
  Cc: LKML, Vatika Harlalka, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Paul E . McKenney

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Frederic Weisbecker
> Sent: Tuesday, September 1, 2015 7:51 AM
> To: Ingo Molnar; Peter Zijlstra
> Cc: LKML; Vatika Harlalka; Chris Metcalf; Thomas Gleixner; Preeti U Murthy;
> Christoph Lameter; Frederic Weisbecker; Paul E . McKenney
> Subject: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
> 
> From: Vatika Harlalka <vatikaharlalka@gmail.com>
> 
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
> 
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
> 
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Chris Metcalf <cmetcalf@ezchip.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  include/linux/tick.h | 9 +++++++++
>  kernel/sched/core.c  | 7 +++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 48d901f..e312219 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -147,11 +147,20 @@ static inline void tick_nohz_full_add_cpus_to(struct
> cpumask *mask)
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> 
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
> +
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(void);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8b864ec..0902e4d 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
> 
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
> 
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {

Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?

Thanks
--jyh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-01 19:14   ` Jiang, Yunhong
@ 2015-09-01 20:47     ` Frederic Weisbecker
  2015-09-02  9:38       ` Mike Galbraith
  2015-09-04 15:56       ` Luiz Capitulino
  0 siblings, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2015-09-01 20:47 UTC (permalink / raw)
  To: Jiang, Yunhong
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Tue, Sep 01, 2015 at 07:14:13PM +0000, Jiang, Yunhong wrote:
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8b864ec..0902e4d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> >  	int i, cpu = smp_processor_id();
> >  	struct sched_domain *sd;
> > 
> > -	if (!idle_cpu(cpu))
> > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> >  		return cpu;
> > 
> >  	rcu_read_lock();
> >  	for_each_domain(cpu, sd) {
> >  		for_each_cpu(i, sched_domain_span(sd)) {
> > -			if (!idle_cpu(i)) {
> > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> 
> Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in
> sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?

Very good observation! But it's better to keep this check in the domain loop in
case things change in the future such as removing that cpu_isolated_map inclusion
or other suprises.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-01 20:47     ` Frederic Weisbecker
@ 2015-09-02  9:38       ` Mike Galbraith
  2015-09-02 12:33         ` Mike Galbraith
  2015-09-02 16:16         ` Chris Metcalf
  2015-09-04 15:56       ` Luiz Capitulino
  1 sibling, 2 replies; 14+ messages in thread
From: Mike Galbraith @ 2015-09-02  9:38 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Jiang, Yunhong, Ingo Molnar, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Tue, 2015-09-01 at 22:47 +0200, Frederic Weisbecker wrote:
> On Tue, Sep 01, 2015 at 07:14:13PM +0000, Jiang, Yunhong wrote:
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 8b864ec..0902e4d 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> > >  	int i, cpu = smp_processor_id();
> > >  	struct sched_domain *sd;
> > > 
> > > -	if (!idle_cpu(cpu))
> > > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> > >  		return cpu;
> > > 
> > >  	rcu_read_lock();
> > >  	for_each_domain(cpu, sd) {
> > >  		for_each_cpu(i, sched_domain_span(sd)) {
> > > -			if (!idle_cpu(i)) {
> > > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> > 
> > Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in
> > sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?
> 
> Very good observation! But it's better to keep this check in the domain loop in
> case things change in the future such as removing that cpu_isolated_map inclusion
> or other suprises.

IMHO, nohz_full -> cpu_isolated_map removal really wants to happen.
NO_HZ_FULL_ALL currently means "Woohoo, next stop NR_CPUS=0".

	-Mike


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-02  9:38       ` Mike Galbraith
@ 2015-09-02 12:33         ` Mike Galbraith
  2015-09-02 16:16         ` Chris Metcalf
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Galbraith @ 2015-09-02 12:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Jiang, Yunhong, Ingo Molnar, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Wed, 2015-09-02 at 11:38 +0200, Mike Galbraith wrote:
> On Tue, 2015-09-01 at 22:47 +0200, Frederic Weisbecker wrote:
> > On Tue, Sep 01, 2015 at 07:14:13PM +0000, Jiang, Yunhong wrote:
> > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > > index 8b864ec..0902e4d 100644
> > > > --- a/kernel/sched/core.c
> > > > +++ b/kernel/sched/core.c
> > > > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> > > >  	int i, cpu = smp_processor_id();
> > > >  	struct sched_domain *sd;
> > > > 
> > > > -	if (!idle_cpu(cpu))
> > > > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> > > >  		return cpu;
> > > > 
> > > >  	rcu_read_lock();
> > > >  	for_each_domain(cpu, sd) {
> > > >  		for_each_cpu(i, sched_domain_span(sd)) {
> > > > -			if (!idle_cpu(i)) {
> > > > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> > > 
> > > Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in
> > > sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?
> > 
> > Very good observation! But it's better to keep this check in the domain loop in
> > case things change in the future such as removing that cpu_isolated_map inclusion
> > or other suprises.
> 
> IMHO, nohz_full -> cpu_isolated_map removal really wants to happen.
> NO_HZ_FULL_ALL currently means "Woohoo, next stop NR_CPUS=0".

(which surprises folks who have [had] it enabled [and may even have been
using and/or testing it])

https://lkml.org/lkml/2015/9/2/145


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:sched/urgent] nohz: Affine unpinned timers to housekeepers
  2015-09-01 14:50 ` [PATCH 1/2] nohz: " Frederic Weisbecker
  2015-09-01 19:14   ` Jiang, Yunhong
@ 2015-09-02 15:57   ` tip-bot for Vatika Harlalka
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Vatika Harlalka @ 2015-09-02 15:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, tglx, torvalds, cmetcalf, cl, paulmck, fweisbec,
	vatikaharlalka, preeti, peterz, hpa, linux-kernel

Commit-ID:  9642d18eee2cd169b60c6ac0f20bda745b5a3d1e
Gitweb:     http://git.kernel.org/tip/9642d18eee2cd169b60c6ac0f20bda745b5a3d1e
Author:     Vatika Harlalka <vatikaharlalka@gmail.com>
AuthorDate: Tue, 1 Sep 2015 16:50:59 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 2 Sep 2015 10:33:22 +0200

nohz: Affine unpinned timers to housekeepers

The problem addressed in this patch is about affining unpinned
timers. Adaptive or Full Dynticks CPUs are currently disturbed
by unnecessary jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full
dynticks in NOHZ_FULL configured systems. It should not
introduce overhead in nohz full off case due to static keys.

Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441119060-2230-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/tick.h | 9 +++++++++
 kernel/sched/core.c  | 7 +++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 48d901f..e312219 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -147,11 +147,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
 		cpumask_or(mask, mask, tick_nohz_full_mask);
 }
 
+static inline int housekeeping_any_cpu(void)
+{
+	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
+
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(void);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+	return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8b864ec..0902e4d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
 	int i, cpu = smp_processor_id();
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu))
+	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
 		return cpu;
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd)) {
-			if (!idle_cpu(i)) {
+			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
+
+	if (!is_housekeeping_cpu(cpu))
+		cpu = housekeeping_any_cpu();
 unlock:
 	rcu_read_unlock();
 	return cpu;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [tip:sched/urgent] nohz: Assert existing housekeepers when nohz full enabled
  2015-09-01 14:51 ` [PATCH 2/2] nohz: Assert existing housekeepers when nohz full enabled Frederic Weisbecker
@ 2015-09-02 15:57   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2015-09-02 15:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, preeti, linux-kernel, peterz, torvalds, hpa, mingo,
	cmetcalf, fweisbec, cl, vatikaharlalka, paulmck

Commit-ID:  7c8bb6cb95061b3143759459ed6c6b0c73bcfecb
Gitweb:     http://git.kernel.org/tip/7c8bb6cb95061b3143759459ed6c6b0c73bcfecb
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Tue, 1 Sep 2015 16:51:00 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 2 Sep 2015 10:33:22 +0200

nohz: Assert existing housekeepers when nohz full enabled

The code ensures that when nohz full is running, at least the
boot CPU serves as a housekeeper and it can't be later offlined.

Let's assert this assumption to make sure that we have CPUs to
handle unbound jobs like workqueues and timers while nohz full
CPUs run undisturbed.

Also improve the comments on housekeeper offlining prevention.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Vatika Harlalka <vatikaharlalka@gmail.com>
Link: http://lkml.kernel.org/r/1441119060-2230-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/tick-sched.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..7c7ec45 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -290,16 +290,17 @@ static int __init tick_nohz_full_setup(char *str)
 __setup("nohz_full=", tick_nohz_full_setup);
 
 static int tick_nohz_cpu_down_callback(struct notifier_block *nfb,
-						 unsigned long action,
-						 void *hcpu)
+				       unsigned long action,
+				       void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
 
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_DOWN_PREPARE:
 		/*
-		 * If we handle the timekeeping duty for full dynticks CPUs,
-		 * we can't safely shutdown that CPU.
+		 * The boot CPU handles housekeeping duty (unbound timers,
+		 * workqueues, timekeeping, ...) on behalf of full dynticks
+		 * CPUs. It must remain online when nohz full is enabled.
 		 */
 		if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
 			return NOTIFY_BAD;
@@ -370,6 +371,12 @@ void __init tick_nohz_init(void)
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
+
+	/*
+	 * We need at least one CPU to handle housekeeping work such
+	 * as timekeeping, unbound timers, workqueues, ...
+	 */
+	WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-02  9:38       ` Mike Galbraith
  2015-09-02 12:33         ` Mike Galbraith
@ 2015-09-02 16:16         ` Chris Metcalf
  2015-09-02 19:03           ` Jiang, Yunhong
  2015-09-03  1:29           ` Mike Galbraith
  1 sibling, 2 replies; 14+ messages in thread
From: Chris Metcalf @ 2015-09-02 16:16 UTC (permalink / raw)
  To: Mike Galbraith, Frederic Weisbecker
  Cc: Jiang, Yunhong, Ingo Molnar, Peter Zijlstra, LKML,
	Vatika Harlalka, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On 09/02/2015 05:38 AM, Mike Galbraith wrote:
> IMHO, nohz_full -> cpu_isolated_map removal really wants to happen.
> NO_HZ_FULL_ALL currently means "Woohoo, next stop NR_CPUS=0".

Yeah, the problem seems to be folks who use it as a kind of
"hey, maybe this gives me some optimization boost somewhere"
kind of setting.  Did we ever hear actual use cases for people who
benefited from running nohz_full on cpus with an active scheduler,
i.e. no isolcpus for that core?  I find it hard to imagine, but, maybe...?

If we don't have such use cases, what should we do here?  I'm
slightly sympathetic to these folks who are going "Gee, my machine
suddenly got way slower", but only in the same sense as people
who shoot themselves in the foot and then say "Gee, my foot is
bleeding".  But maybe I'm being too hard core :-)

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-02 16:16         ` Chris Metcalf
@ 2015-09-02 19:03           ` Jiang, Yunhong
  2015-09-03  1:29           ` Mike Galbraith
  1 sibling, 0 replies; 14+ messages in thread
From: Jiang, Yunhong @ 2015-09-02 19:03 UTC (permalink / raw)
  To: Chris Metcalf, Mike Galbraith, Frederic Weisbecker
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Vatika Harlalka,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter,
	Paul E . McKenney

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1267 bytes --]

> -----Original Message-----
> From: Chris Metcalf [mailto:cmetcalf@ezchip.com]
> Sent: Wednesday, September 2, 2015 9:17 AM
> To: Mike Galbraith; Frederic Weisbecker
> Cc: Jiang, Yunhong; Ingo Molnar; Peter Zijlstra; LKML; Vatika Harlalka; Thomas
> Gleixner; Preeti U Murthy; Christoph Lameter; Paul E . McKenney
> Subject: Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
> 
> On 09/02/2015 05:38 AM, Mike Galbraith wrote:
> > IMHO, nohz_full -> cpu_isolated_map removal really wants to happen.
> > NO_HZ_FULL_ALL currently means "Woohoo, next stop NR_CPUS=0".
> 
> Yeah, the problem seems to be folks who use it as a kind of
> "hey, maybe this gives me some optimization boost somewhere"
> kind of setting.  Did we ever hear actual use cases for people who
> benefited from running nohz_full on cpus with an active scheduler,
> i.e. no isolcpus for that core?  I find it hard to imagine, but, maybe...?
> 

I think they can use cpuset instead of isolcpus, like Viresh stated https://lkml.org/lkml/2014/4/14/199  .
This patch in fact removes one gap between cpuset and isolcpus.

Thanks
--jyh
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-02 16:16         ` Chris Metcalf
  2015-09-02 19:03           ` Jiang, Yunhong
@ 2015-09-03  1:29           ` Mike Galbraith
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Galbraith @ 2015-09-03  1:29 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Jiang, Yunhong, Ingo Molnar, Peter Zijlstra,
	LKML, Vatika Harlalka, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Wed, 2015-09-02 at 12:16 -0400, Chris Metcalf wrote:
> On 09/02/2015 05:38 AM, Mike Galbraith wrote:
> > IMHO, nohz_full -> cpu_isolated_map removal really wants to happen.
> > NO_HZ_FULL_ALL currently means "Woohoo, next stop NR_CPUS=0".
> 
> Yeah, the problem seems to be folks who use it as a kind of
> "hey, maybe this gives me some optimization boost somewhere"
> kind of setting.  Did we ever hear actual use cases for people who
> benefited from running nohz_full on cpus with an active scheduler,
> i.e. no isolcpus for that core?  I find it hard to imagine, but, maybe...?

The only sane usage atm is my entire box (small, big likely won't boot)
is a dedicated specialist.  Previously, you could also have had the
feature is valuable enough that I'm willing to pay the quite high price
to have this feature available for whenever I feel like using it.

> If we don't have such use cases, what should we do here?  I'm
> slightly sympathetic to these folks who are going "Gee, my machine
> suddenly got way slower", but only in the same sense as people
> who shoot themselves in the foot and then say "Gee, my foot is
> bleeding".  But maybe I'm being too hard core :-)

I think it's bloody stupid to declare nohz_full cpus dead when they can
in fact do generic work.  There's no reason to do so... unless maybe we
really do need to hold the hand of poor ole Aunt Tilly... the hapless
HPC administrator who can't figure out how to use cpusets or such.  

When the overhead goes away, dynamic usage will become a lot more
_practical_, but you can do it in the here and now modulo the cost and
that bogus death certificate.  They ain't dead, two patches combined to
bury the poor things alive.

	-Mike


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-01 20:47     ` Frederic Weisbecker
  2015-09-02  9:38       ` Mike Galbraith
@ 2015-09-04 15:56       ` Luiz Capitulino
  2015-09-04 17:19         ` Frederic Weisbecker
  1 sibling, 1 reply; 14+ messages in thread
From: Luiz Capitulino @ 2015-09-04 15:56 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Jiang, Yunhong, Ingo Molnar, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Tue, 1 Sep 2015 22:47:24 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Sep 01, 2015 at 07:14:13PM +0000, Jiang, Yunhong wrote:
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 8b864ec..0902e4d 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> > >  	int i, cpu = smp_processor_id();
> > >  	struct sched_domain *sd;
> > > 
> > > -	if (!idle_cpu(cpu))
> > > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> > >  		return cpu;
> > > 
> > >  	rcu_read_lock();
> > >  	for_each_domain(cpu, sd) {
> > >  		for_each_cpu(i, sched_domain_span(sd)) {
> > > -			if (!idle_cpu(i)) {
> > > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> > 
> > Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in
> > sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?
> 
> Very good observation! But it's better to keep this check in the domain loop in
> case things change in the future such as removing that cpu_isolated_map inclusion
> or other suprises.

I have another observation.

As nohz_full cores are already excluded from the sched domains, this patch
boils down to migrating timers which are queued by the nohz_full core
itself. However, it only fully works for lowres timers. hrtimers may
still fire on nohz_full cores, because hrtimers are only migrated if
the core returned by get_nohz_timer_target() has a queued timer expiring
before than the timer being migrated.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nohz: Affine unpinned timers to housekeepers
  2015-09-04 15:56       ` Luiz Capitulino
@ 2015-09-04 17:19         ` Frederic Weisbecker
  0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2015-09-04 17:19 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Jiang, Yunhong, Ingo Molnar, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter, Paul E . McKenney

On Fri, Sep 04, 2015 at 11:56:33AM -0400, Luiz Capitulino wrote:
> On Tue, 1 Sep 2015 22:47:24 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Tue, Sep 01, 2015 at 07:14:13PM +0000, Jiang, Yunhong wrote:
> > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > > index 8b864ec..0902e4d 100644
> > > > --- a/kernel/sched/core.c
> > > > +++ b/kernel/sched/core.c
> > > > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> > > >  	int i, cpu = smp_processor_id();
> > > >  	struct sched_domain *sd;
> > > > 
> > > > -	if (!idle_cpu(cpu))
> > > > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> > > >  		return cpu;
> > > > 
> > > >  	rcu_read_lock();
> > > >  	for_each_domain(cpu, sd) {
> > > >  		for_each_cpu(i, sched_domain_span(sd)) {
> > > > -			if (!idle_cpu(i)) {
> > > > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> > > 
> > > Hi, Frederic, sorry for a naive question. Per my understanding, the tick_nohz_full_mask is added to cpu_isolated_map in
> > > sched_init_smp(), and the cpu_isolated_map is excluded from sched_domain in init_sched_domains(), so why check here?
> > 
> > Very good observation! But it's better to keep this check in the domain loop in
> > case things change in the future such as removing that cpu_isolated_map inclusion
> > or other suprises.
> 
> I have another observation.
> 
> As nohz_full cores are already excluded from the sched domains, this patch
> boils down to migrating timers which are queued by the nohz_full core
> itself. However, it only fully works for lowres timers. hrtimers may
> still fire on nohz_full cores, because hrtimers are only migrated if
> the core returned by get_nohz_timer_target() has a queued timer expiring
> before than the timer being migrated.

That's right. Probably some of them may never move to another CPU if they have
very small delays (smaller than those of hrtimer enqueued on other housekeepers).
And if they reach a housekeeper they can still move back to nohz full CPUs later
if they are modified from them.

The only safe solution is to always drive unbound hrtimers from housekeepers.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-09-04 17:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-01 14:50 [GIT PULL] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
2015-09-01 14:50 ` [PATCH 1/2] nohz: " Frederic Weisbecker
2015-09-01 19:14   ` Jiang, Yunhong
2015-09-01 20:47     ` Frederic Weisbecker
2015-09-02  9:38       ` Mike Galbraith
2015-09-02 12:33         ` Mike Galbraith
2015-09-02 16:16         ` Chris Metcalf
2015-09-02 19:03           ` Jiang, Yunhong
2015-09-03  1:29           ` Mike Galbraith
2015-09-04 15:56       ` Luiz Capitulino
2015-09-04 17:19         ` Frederic Weisbecker
2015-09-02 15:57   ` [tip:sched/urgent] " tip-bot for Vatika Harlalka
2015-09-01 14:51 ` [PATCH 2/2] nohz: Assert existing housekeepers when nohz full enabled Frederic Weisbecker
2015-09-02 15:57   ` [tip:sched/urgent] " tip-bot for Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.