linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* How to turn scheduler tick on for current nohz_full CPU?
@ 2019-07-24 11:53 Paul E. McKenney
  2019-07-24 13:22 ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2019-07-24 11:53 UTC (permalink / raw)
  To: fweisbec, tglx, mingo; +Cc: linux-kernel, rcu, peterz

Hello!

One of the callback-invocation forward-progress issues turns out to
be nohz_full CPUs not turning their scheduling-clock interrupt back on
when running in kernel mode.  Given that callback floods can cause RCU's
callback-invocation loop to run for some time, it would be good for this
loop to re-enable this interrupt.  Of course, this problem applies to
pretty much any kernel code that might loop for an extended time period,
not just RCU.

I took a quick look at kernel/time/tick-sched.c and the closest thing
I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
that this does much when invoked on the current CPU and (2) it doesn't
help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
RCU_NOCB_CPU instead works quite well.

So what should I be calling instead of tick_nohz_full_kick_cpu() to
re-enable the current CPU's scheduling-clock interrupt?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-24 11:53 How to turn scheduler tick on for current nohz_full CPU? Paul E. McKenney
@ 2019-07-24 13:22 ` Frederic Weisbecker
  2019-07-24 13:52   ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2019-07-24 13:22 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz

On Wed, Jul 24, 2019 at 04:53:31AM -0700, Paul E. McKenney wrote:
> Hello!
> 
> One of the callback-invocation forward-progress issues turns out to
> be nohz_full CPUs not turning their scheduling-clock interrupt back on
> when running in kernel mode.  Given that callback floods can cause RCU's
> callback-invocation loop to run for some time, it would be good for this
> loop to re-enable this interrupt.  Of course, this problem applies to
> pretty much any kernel code that might loop for an extended time period,
> not just RCU.
> 
> I took a quick look at kernel/time/tick-sched.c and the closest thing
> I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
> that this does much when invoked on the current CPU and (2) it doesn't
> help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
> RCU_NOCB_CPU instead works quite well.
> 
> So what should I be calling instead of tick_nohz_full_kick_cpu() to
> re-enable the current CPU's scheduling-clock interrupt?

Indeed, kernel code is assumed to be quick enough (between two extended grace
periods) to avoid running the tick for RCU. But some long lasting kernel code
may require to tick temporarily.

You can use tick_nohz_dep_set_cpu(cpu, TICK_DEP_MASK_RCU) with the
following:

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f92a10b5e112..3f476e2a4bf7 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -108,7 +108,8 @@ enum tick_dep_bits {
 	TICK_DEP_BIT_POSIX_TIMER	= 0,
 	TICK_DEP_BIT_PERF_EVENTS	= 1,
 	TICK_DEP_BIT_SCHED		= 2,
-	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
+	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
+	TICK_DEP_BIT_RCU		= 4
 };
 
 #define TICK_DEP_MASK_NONE		0
@@ -116,6 +117,7 @@ enum tick_dep_bits {
 #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
 #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
 #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
+#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-24 13:22 ` Frederic Weisbecker
@ 2019-07-24 13:52   ` Paul E. McKenney
  2019-07-24 14:30     ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2019-07-24 13:52 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz

On Wed, Jul 24, 2019 at 03:22:59PM +0200, Frederic Weisbecker wrote:
> On Wed, Jul 24, 2019 at 04:53:31AM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > One of the callback-invocation forward-progress issues turns out to
> > be nohz_full CPUs not turning their scheduling-clock interrupt back on
> > when running in kernel mode.  Given that callback floods can cause RCU's
> > callback-invocation loop to run for some time, it would be good for this
> > loop to re-enable this interrupt.  Of course, this problem applies to
> > pretty much any kernel code that might loop for an extended time period,
> > not just RCU.
> > 
> > I took a quick look at kernel/time/tick-sched.c and the closest thing
> > I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
> > that this does much when invoked on the current CPU and (2) it doesn't
> > help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
> > RCU_NOCB_CPU instead works quite well.
> > 
> > So what should I be calling instead of tick_nohz_full_kick_cpu() to
> > re-enable the current CPU's scheduling-clock interrupt?
> 
> Indeed, kernel code is assumed to be quick enough (between two extended grace
> periods) to avoid running the tick for RCU. But some long lasting kernel code
> may require to tick temporarily.
> 
> You can use tick_nohz_dep_set_cpu(cpu, TICK_DEP_MASK_RCU) with the
> following:
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index f92a10b5e112..3f476e2a4bf7 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -108,7 +108,8 @@ enum tick_dep_bits {
>  	TICK_DEP_BIT_POSIX_TIMER	= 0,
>  	TICK_DEP_BIT_PERF_EVENTS	= 1,
>  	TICK_DEP_BIT_SCHED		= 2,
> -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> +	TICK_DEP_BIT_RCU		= 4
>  };
>  
>  #define TICK_DEP_MASK_NONE		0
> @@ -116,6 +117,7 @@ enum tick_dep_bits {
>  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
>  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
>  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
>  
>  #ifdef CONFIG_NO_HZ_COMMON
>  extern bool tick_nohz_enabled;

I will give this a try, thank you!  (Testing will take a few days.)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-24 13:52   ` Paul E. McKenney
@ 2019-07-24 14:30     ` Frederic Weisbecker
  2019-07-25  1:12       ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2019-07-24 14:30 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz

On Wed, Jul 24, 2019 at 06:52:19AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 24, 2019 at 03:22:59PM +0200, Frederic Weisbecker wrote:
> > On Wed, Jul 24, 2019 at 04:53:31AM -0700, Paul E. McKenney wrote:
> > > Hello!
> > > 
> > > One of the callback-invocation forward-progress issues turns out to
> > > be nohz_full CPUs not turning their scheduling-clock interrupt back on
> > > when running in kernel mode.  Given that callback floods can cause RCU's
> > > callback-invocation loop to run for some time, it would be good for this
> > > loop to re-enable this interrupt.  Of course, this problem applies to
> > > pretty much any kernel code that might loop for an extended time period,
> > > not just RCU.
> > > 
> > > I took a quick look at kernel/time/tick-sched.c and the closest thing
> > > I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
> > > that this does much when invoked on the current CPU and (2) it doesn't
> > > help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
> > > RCU_NOCB_CPU instead works quite well.
> > > 
> > > So what should I be calling instead of tick_nohz_full_kick_cpu() to
> > > re-enable the current CPU's scheduling-clock interrupt?
> > 
> > Indeed, kernel code is assumed to be quick enough (between two extended grace
> > periods) to avoid running the tick for RCU. But some long lasting kernel code
> > may require to tick temporarily.
> > 
> > You can use tick_nohz_dep_set_cpu(cpu, TICK_DEP_MASK_RCU) with the
> > following:
> > 
> > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > index f92a10b5e112..3f476e2a4bf7 100644
> > --- a/include/linux/tick.h
> > +++ b/include/linux/tick.h
> > @@ -108,7 +108,8 @@ enum tick_dep_bits {
> >  	TICK_DEP_BIT_POSIX_TIMER	= 0,
> >  	TICK_DEP_BIT_PERF_EVENTS	= 1,
> >  	TICK_DEP_BIT_SCHED		= 2,
> > -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> > +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> > +	TICK_DEP_BIT_RCU		= 4
> >  };
> >  
> >  #define TICK_DEP_MASK_NONE		0
> > @@ -116,6 +117,7 @@ enum tick_dep_bits {
> >  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
> >  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
> >  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> > +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
> >  
> >  #ifdef CONFIG_NO_HZ_COMMON
> >  extern bool tick_nohz_enabled;
> 
> I will give this a try, thank you!  (Testing will take a few days.)

Sure!

For the background: expect a self-IPI to fire which then restart the tick on IRQ exit.
Once you're later done with the work, don't forget to remove the tick dependency:

     tick_nohz_dep_clear_cpu(cpu, TICK_DEP_MASK_RCU);


Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-24 14:30     ` Frederic Weisbecker
@ 2019-07-25  1:12       ` Paul E. McKenney
  2019-07-29 22:32         ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2019-07-25  1:12 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz

On Wed, Jul 24, 2019 at 04:30:13PM +0200, Frederic Weisbecker wrote:
> On Wed, Jul 24, 2019 at 06:52:19AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 24, 2019 at 03:22:59PM +0200, Frederic Weisbecker wrote:
> > > On Wed, Jul 24, 2019 at 04:53:31AM -0700, Paul E. McKenney wrote:
> > > > Hello!
> > > > 
> > > > One of the callback-invocation forward-progress issues turns out to
> > > > be nohz_full CPUs not turning their scheduling-clock interrupt back on
> > > > when running in kernel mode.  Given that callback floods can cause RCU's
> > > > callback-invocation loop to run for some time, it would be good for this
> > > > loop to re-enable this interrupt.  Of course, this problem applies to
> > > > pretty much any kernel code that might loop for an extended time period,
> > > > not just RCU.
> > > > 
> > > > I took a quick look at kernel/time/tick-sched.c and the closest thing
> > > > I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
> > > > that this does much when invoked on the current CPU and (2) it doesn't
> > > > help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
> > > > RCU_NOCB_CPU instead works quite well.
> > > > 
> > > > So what should I be calling instead of tick_nohz_full_kick_cpu() to
> > > > re-enable the current CPU's scheduling-clock interrupt?
> > > 
> > > Indeed, kernel code is assumed to be quick enough (between two extended grace
> > > periods) to avoid running the tick for RCU. But some long lasting kernel code
> > > may require to tick temporarily.
> > > 
> > > You can use tick_nohz_dep_set_cpu(cpu, TICK_DEP_MASK_RCU) with the
> > > following:
> > > 
> > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > index f92a10b5e112..3f476e2a4bf7 100644
> > > --- a/include/linux/tick.h
> > > +++ b/include/linux/tick.h
> > > @@ -108,7 +108,8 @@ enum tick_dep_bits {
> > >  	TICK_DEP_BIT_POSIX_TIMER	= 0,
> > >  	TICK_DEP_BIT_PERF_EVENTS	= 1,
> > >  	TICK_DEP_BIT_SCHED		= 2,
> > > -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> > > +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> > > +	TICK_DEP_BIT_RCU		= 4
> > >  };
> > >  
> > >  #define TICK_DEP_MASK_NONE		0
> > > @@ -116,6 +117,7 @@ enum tick_dep_bits {
> > >  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
> > >  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
> > >  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> > > +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
> > >  
> > >  #ifdef CONFIG_NO_HZ_COMMON
> > >  extern bool tick_nohz_enabled;
> > 
> > I will give this a try, thank you!  (Testing will take a few days.)
> 
> Sure!
> 
> For the background: expect a self-IPI to fire which then restart the tick on IRQ exit.
> Once you're later done with the work, don't forget to remove the tick dependency:
> 
>      tick_nohz_dep_clear_cpu(cpu, TICK_DEP_MASK_RCU);

Does this get cleared automatically if preempted, or should I clear
this is one of RCU's scheduler hooks?  (My guess is the latter, but
I figured I should ask.)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-25  1:12       ` Paul E. McKenney
@ 2019-07-29 22:32         ` Paul E. McKenney
  2019-07-30 16:43           ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2019-07-29 22:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz, paulmckrcu

On Wed, Jul 24, 2019 at 06:12:43PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 24, 2019 at 04:30:13PM +0200, Frederic Weisbecker wrote:
> > On Wed, Jul 24, 2019 at 06:52:19AM -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 24, 2019 at 03:22:59PM +0200, Frederic Weisbecker wrote:
> > > > On Wed, Jul 24, 2019 at 04:53:31AM -0700, Paul E. McKenney wrote:
> > > > > Hello!
> > > > > 
> > > > > One of the callback-invocation forward-progress issues turns out to
> > > > > be nohz_full CPUs not turning their scheduling-clock interrupt back on
> > > > > when running in kernel mode.  Given that callback floods can cause RCU's
> > > > > callback-invocation loop to run for some time, it would be good for this
> > > > > loop to re-enable this interrupt.  Of course, this problem applies to
> > > > > pretty much any kernel code that might loop for an extended time period,
> > > > > not just RCU.
> > > > > 
> > > > > I took a quick look at kernel/time/tick-sched.c and the closest thing
> > > > > I found was tick_nohz_full_kick_cpu(), except that (1) it isn't clear
> > > > > that this does much when invoked on the current CPU and (2) it doesn't
> > > > > help in rcutorture TREE04.  In contrast, disabling NO_HZ_FULL and using
> > > > > RCU_NOCB_CPU instead works quite well.
> > > > > 
> > > > > So what should I be calling instead of tick_nohz_full_kick_cpu() to
> > > > > re-enable the current CPU's scheduling-clock interrupt?
> > > > 
> > > > Indeed, kernel code is assumed to be quick enough (between two extended grace
> > > > periods) to avoid running the tick for RCU. But some long lasting kernel code
> > > > may require to tick temporarily.
> > > > 
> > > > You can use tick_nohz_dep_set_cpu(cpu, TICK_DEP_MASK_RCU) with the
> > > > following:
> > > > 
> > > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > > index f92a10b5e112..3f476e2a4bf7 100644
> > > > --- a/include/linux/tick.h
> > > > +++ b/include/linux/tick.h
> > > > @@ -108,7 +108,8 @@ enum tick_dep_bits {
> > > >  	TICK_DEP_BIT_POSIX_TIMER	= 0,
> > > >  	TICK_DEP_BIT_PERF_EVENTS	= 1,
> > > >  	TICK_DEP_BIT_SCHED		= 2,
> > > > -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> > > > +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> > > > +	TICK_DEP_BIT_RCU		= 4
> > > >  };
> > > >  
> > > >  #define TICK_DEP_MASK_NONE		0
> > > > @@ -116,6 +117,7 @@ enum tick_dep_bits {
> > > >  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
> > > >  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
> > > >  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> > > > +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
> > > >  
> > > >  #ifdef CONFIG_NO_HZ_COMMON
> > > >  extern bool tick_nohz_enabled;
> > > 
> > > I will give this a try, thank you!  (Testing will take a few days.)
> > 
> > Sure!
> > 
> > For the background: expect a self-IPI to fire which then restart the tick on IRQ exit.
> > Once you're later done with the work, don't forget to remove the tick dependency:
> > 
> >      tick_nohz_dep_clear_cpu(cpu, TICK_DEP_MASK_RCU);
> 
> Does this get cleared automatically if preempted, or should I clear
> this is one of RCU's scheduler hooks?  (My guess is the latter, but
> I figured I should ask.)

The patch below (which includes your patch) does help considerably.
However, it does have some shortcomings:

1.	Adds an atomic operation (albeit a cache-local one) to
	the scheduler fastpath.  One approach would be to have
	a way of testing this bit and clearing it only if set.

	Another approach would be to instead clear it on the
	transition to nohz_full userspace or to idle.

2.	There are a lot of other places in the kernel that are in
	need of this bit being set.  I am therefore considering making
	multi_cpu_stop() or its callers set this bit on all CPUs upon
	entry and clear it upon exit.  While in this state, it is
	likely necessary to disable clearing this bit.  Or it would
	be necessary to make multi_cpu_stop() repeat clearing the bit
	every so often.

	As it stands, I have CPU hotplug removal operations taking
	more than 400 seconds.

3.	It was tempting to ask for this bit to be tracked on a per-task
	basis, but from what I can see that adds at least as much
	complexity as it removes.

Thoughts?

							Thanx, Paul

PS.  Outage on @linux.ibm.com, hence the CC of my gmail address.

------------------------------------------------------------------------

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 196a0a7bfc4f..0dea6fb33a11 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -108,7 +108,8 @@ enum tick_dep_bits {
 	TICK_DEP_BIT_POSIX_TIMER	= 0,
 	TICK_DEP_BIT_PERF_EVENTS	= 1,
 	TICK_DEP_BIT_SCHED		= 2,
-	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
+	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
+	TICK_DEP_BIT_RCU		= 4
 };
 
 #define TICK_DEP_MASK_NONE		0
@@ -116,6 +117,7 @@ enum tick_dep_bits {
 #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
 #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
 #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
+#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;
@@ -258,6 +260,9 @@ static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 
+static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
+static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }
+
 static inline void tick_dep_set(enum tick_dep_bits bit) { }
 static inline void tick_dep_clear(enum tick_dep_bits bit) { }
 static inline void tick_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 3c9feca1eab1..e6a74b9fb0b8 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -44,6 +44,7 @@
 #include <linux/sched/debug.h>
 #include <linux/sched/sysctl.h>
 #include <linux/oom.h>
+#include <linux/tick.h>
 
 #include "rcu.h"
 
@@ -1353,6 +1354,7 @@ static void rcu_torture_timer(struct timer_list *unused)
 static int
 rcu_torture_reader(void *arg)
 {
+	unsigned long i = 0;
 	unsigned long lastsleep = jiffies;
 	long myid = (long)arg;
 	int mynumonline = myid;
@@ -1363,26 +1365,31 @@ rcu_torture_reader(void *arg)
 	set_user_nice(current, MAX_NICE);
 	if (irqreader && cur_ops->irq_capable)
 		timer_setup_on_stack(&t, rcu_torture_timer, 0);
-
 	do {
 		if (irqreader && cur_ops->irq_capable) {
 			if (!timer_pending(&t))
 				mod_timer(&t, jiffies + 1);
 		}
-		if (!rcu_torture_one_read(&rand))
+		if (!rcu_torture_one_read(&rand) && !torture_must_stop())
 			schedule_timeout_interruptible(HZ);
-		if (time_after(jiffies, lastsleep)) {
+		if (time_after(jiffies, lastsleep) && !torture_must_stop()) {
 			schedule_timeout_interruptible(1);
 			lastsleep = jiffies + 10;
 		}
 		while (num_online_cpus() < mynumonline && !torture_must_stop())
 			schedule_timeout_interruptible(HZ / 5);
+		if (IS_ENABLED(CONFIG_NO_HZ_FULL) && i++ & 0x1f)
+			tick_nohz_dep_set_cpu(raw_smp_processor_id(),
+					      TICK_DEP_MASK_RCU);
 		stutter_wait("rcu_torture_reader");
 	} while (!torture_must_stop());
 	if (irqreader && cur_ops->irq_capable) {
 		del_timer_sync(&t);
 		destroy_timer_on_stack(&t);
 	}
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(raw_smp_processor_id(),
+					TICK_DEP_MASK_RCU);
 	torture_kthread_stopping("rcu_torture_reader");
 	return 0;
 }
@@ -1729,10 +1736,12 @@ static void rcu_torture_fwd_prog_cond_resched(unsigned long iter)
 		// Real call_rcu() floods hit userspace, so emulate that.
 		if (need_resched() || (iter & 0xfff))
 			schedule();
-	} else {
-		// No userspace emulation: CB invocation throttles call_rcu()
-		cond_resched();
+		return;
 	}
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_set_cpu(smp_processor_id(), TICK_DEP_MASK_RCU);
+	// No userspace emulation: CB invocation throttles call_rcu()
+	cond_resched();
 }
 
 /*
@@ -1760,6 +1769,9 @@ static unsigned long rcu_torture_fwd_prog_cbfree(void)
 		freed++;
 		rcu_torture_fwd_prog_cond_resched(freed);
 	}
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(raw_smp_processor_id(),
+					TICK_DEP_MASK_RCU);
 	return freed;
 }
 
@@ -1826,6 +1838,9 @@ static void rcu_torture_fwd_prog_nr(int *tested, int *tested_tries)
 		destroy_rcu_head_on_stack(&fcs.rh);
 	}
 	schedule_timeout_uninterruptible(HZ / 10); /* Let kthreads recover. */
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(raw_smp_processor_id(),
+					TICK_DEP_MASK_RCU);
 	WRITE_ONCE(rcu_fwd_cb_nodelay, false);
 }
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 256622168356..4f0eb24d986e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2178,6 +2178,10 @@ static void rcu_do_batch(struct rcu_data *rdp)
 			cond_resched_tasks_rcu_qs();
 			lockdep_assert_irqs_enabled();
 			local_bh_disable();
+			if (IS_ENABLED(CONFIG_NO_HZ_FULL) &&
+			    !likely(-rcl.len & 31))
+				tick_nohz_dep_set_cpu(smp_processor_id(),
+						      TICK_DEP_MASK_RCU);
 		}
 	}
 
@@ -2217,6 +2221,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Re-invoke RCU core processing if there are callbacks remaining. */
 	if (!offloaded && rcu_segcblist_ready_cbs(&rdp->cblist))
 		invoke_rcu_core();
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(raw_smp_processor_id(),
+					TICK_DEP_MASK_RCU);
 }
 
 /*
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 379cb7e50a62..7d285adc34c6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -329,6 +329,8 @@ void rcu_note_context_switch(bool preempt)
 	rcu_qs();
 	if (rdp->exp_deferred_qs)
 		rcu_report_exp_rdp(rdp);
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(smp_processor_id(), TICK_DEP_MASK_RCU);
 	trace_rcu_utilization(TPS("End context switch"));
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
@@ -853,6 +855,8 @@ void rcu_note_context_switch(bool preempt)
 	if (!preempt)
 		rcu_tasks_qs(current);
 out:
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_nohz_dep_clear_cpu(smp_processor_id(), TICK_DEP_MASK_RCU);
 	trace_rcu_utilization(TPS("End context switch"));
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f4ee1a3428ae..e617421a2c97 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -324,6 +324,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit)
 		preempt_enable();
 	}
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_set_cpu);
 
 void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit)
 {
@@ -331,6 +332,7 @@ void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit)
 
 	atomic_andnot(BIT(bit), &ts->tick_dep_mask);
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu);
 
 /*
  * Set a per-task tick dependency. Posix CPU timers need this in order to elapse

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-29 22:32         ` Paul E. McKenney
@ 2019-07-30 16:43           ` Frederic Weisbecker
  2019-07-30 17:36             ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2019-07-30 16:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz, paulmckrcu

On Mon, Jul 29, 2019 at 03:32:38PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 24, 2019 at 06:12:43PM -0700, Paul E. McKenney wrote:
> 
> The patch below (which includes your patch) does help considerably.
> However, it does have some shortcomings:
> 
> 1.	Adds an atomic operation (albeit a cache-local one) to
> 	the scheduler fastpath.  One approach would be to have
> 	a way of testing this bit and clearing it only if set.
> 
> 	Another approach would be to instead clear it on the
> 	transition to nohz_full userspace or to idle.

Well, the latter would be costly as it is going to restart the tick on every
user -> kernel transitions.

> 
> 2.	There are a lot of other places in the kernel that are in
> 	need of this bit being set.  I am therefore considering making
> 	multi_cpu_stop() or its callers set this bit on all CPUs upon
> 	entry and clear it upon exit.  While in this state, it is
> 	likely necessary to disable clearing this bit.  Or it would
> 	be necessary to make multi_cpu_stop() repeat clearing the bit
> 	every so often.
> 
> 	As it stands, I have CPU hotplug removal operations taking
> 	more than 400 seconds.
> 
> 3.	It was tempting to ask for this bit to be tracked on a per-task
> 	basis, but from what I can see that adds at least as much
> 	complexity as it removes.

Yeah I forgot to answer, you can use tick_dep_set_task() for that.

> 
> Thoughts?
> 
> 							Thanx, Paul
> 
> PS.  Outage on @linux.ibm.com, hence the CC of my gmail address.
> 
> ------------------------------------------------------------------------
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 196a0a7bfc4f..0dea6fb33a11 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -108,7 +108,8 @@ enum tick_dep_bits {
>  	TICK_DEP_BIT_POSIX_TIMER	= 0,
>  	TICK_DEP_BIT_PERF_EVENTS	= 1,
>  	TICK_DEP_BIT_SCHED		= 2,
> -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> +	TICK_DEP_BIT_RCU		= 4
>  };
>  
>  #define TICK_DEP_MASK_NONE		0
> @@ -116,6 +117,7 @@ enum tick_dep_bits {
>  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
>  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
>  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
>  
>  #ifdef CONFIG_NO_HZ_COMMON
>  extern bool tick_nohz_enabled;
> @@ -258,6 +260,9 @@ static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
>  
> +static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
> +static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }

And I gave you the wrong APIs. Please consider_using tick_dep_set_cpu()
and tick_dep_clear_cpu() that first check if the CPU uses nohz_full.

Those should have the !CONFIG_NO_HZ_FULL stub implemented as well.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-30 16:43           ` Frederic Weisbecker
@ 2019-07-30 17:36             ` Paul E. McKenney
  2019-08-19  2:14               ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2019-07-30 17:36 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz

On Tue, Jul 30, 2019 at 06:43:10PM +0200, Frederic Weisbecker wrote:
> On Mon, Jul 29, 2019 at 03:32:38PM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 24, 2019 at 06:12:43PM -0700, Paul E. McKenney wrote:
> > 
> > The patch below (which includes your patch) does help considerably.
> > However, it does have some shortcomings:
> > 
> > 1.	Adds an atomic operation (albeit a cache-local one) to
> > 	the scheduler fastpath.  One approach would be to have
> > 	a way of testing this bit and clearing it only if set.
> > 
> > 	Another approach would be to instead clear it on the
> > 	transition to nohz_full userspace or to idle.
> 
> Well, the latter would be costly as it is going to restart the tick on every
> user -> kernel transitions.

You lost me on this one.  I would be turning off RCU's request to
maintain the tick on transition to nohz_full userspace or to idle.
Why would the tick get turned on by a later user->kernel transition?

> > 2.	There are a lot of other places in the kernel that are in
> > 	need of this bit being set.  I am therefore considering making
> > 	multi_cpu_stop() or its callers set this bit on all CPUs upon
> > 	entry and clear it upon exit.  While in this state, it is
> > 	likely necessary to disable clearing this bit.  Or it would
> > 	be necessary to make multi_cpu_stop() repeat clearing the bit
> > 	every so often.
> > 
> > 	As it stands, I have CPU hotplug removal operations taking
> > 	more than 400 seconds.
> > 
> > 3.	It was tempting to ask for this bit to be tracked on a per-task
> > 	basis, but from what I can see that adds at least as much
> > 	complexity as it removes.
> 
> Yeah I forgot to answer, you can use tick_dep_set_task() for that.

Ah, good, that would remove my need to clear things on the scheduler
fastpaths.  My guess is that I use both the per-CPU and the per-task
variant in different places, but testing will tell!  ;-)

Thank you!

							Thanx, Paul

> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > PS.  Outage on @linux.ibm.com, hence the CC of my gmail address.
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > index 196a0a7bfc4f..0dea6fb33a11 100644
> > --- a/include/linux/tick.h
> > +++ b/include/linux/tick.h
> > @@ -108,7 +108,8 @@ enum tick_dep_bits {
> >  	TICK_DEP_BIT_POSIX_TIMER	= 0,
> >  	TICK_DEP_BIT_PERF_EVENTS	= 1,
> >  	TICK_DEP_BIT_SCHED		= 2,
> > -	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
> > +	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
> > +	TICK_DEP_BIT_RCU		= 4
> >  };
> >  
> >  #define TICK_DEP_MASK_NONE		0
> > @@ -116,6 +117,7 @@ enum tick_dep_bits {
> >  #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
> >  #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
> >  #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
> > +#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
> >  
> >  #ifdef CONFIG_NO_HZ_COMMON
> >  extern bool tick_nohz_enabled;
> > @@ -258,6 +260,9 @@ static inline bool tick_nohz_full_enabled(void) { return false; }
> >  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> >  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> >  
> > +static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
> > +static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }
> 
> And I gave you the wrong APIs. Please consider_using tick_dep_set_cpu()
> and tick_dep_clear_cpu() that first check if the CPU uses nohz_full.
> 
> Those should have the !CONFIG_NO_HZ_FULL stub implemented as well.
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How to turn scheduler tick on for current nohz_full CPU?
  2019-07-30 17:36             ` Paul E. McKenney
@ 2019-08-19  2:14               ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2019-08-19  2:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: fweisbec, tglx, mingo, linux-kernel, rcu, peterz, joel

On Tue, Jul 30, 2019 at 10:36:37AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 30, 2019 at 06:43:10PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jul 29, 2019 at 03:32:38PM -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 24, 2019 at 06:12:43PM -0700, Paul E. McKenney wrote:
> > > 
> > > The patch below (which includes your patch) does help considerably.
> > > However, it does have some shortcomings:
> > > 
> > > 1.	Adds an atomic operation (albeit a cache-local one) to
> > > 	the scheduler fastpath.  One approach would be to have
> > > 	a way of testing this bit and clearing it only if set.
> > > 
> > > 	Another approach would be to instead clear it on the
> > > 	transition to nohz_full userspace or to idle.
> > 
> > Well, the latter would be costly as it is going to restart the tick on every
> > user -> kernel transitions.
> 
> You lost me on this one.  I would be turning off RCU's request to
> maintain the tick on transition to nohz_full userspace or to idle.
> Why would the tick get turned on by a later user->kernel transition?
> 
> > > 2.	There are a lot of other places in the kernel that are in
> > > 	need of this bit being set.  I am therefore considering making
> > > 	multi_cpu_stop() or its callers set this bit on all CPUs upon
> > > 	entry and clear it upon exit.  While in this state, it is
> > > 	likely necessary to disable clearing this bit.  Or it would
> > > 	be necessary to make multi_cpu_stop() repeat clearing the bit
> > > 	every so often.
> > > 
> > > 	As it stands, I have CPU hotplug removal operations taking
> > > 	more than 400 seconds.
> > > 
> > > 3.	It was tempting to ask for this bit to be tracked on a per-task
> > > 	basis, but from what I can see that adds at least as much
> > > 	complexity as it removes.
> > 
> > Yeah I forgot to answer, you can use tick_dep_set_task() for that.
> 
> Ah, good, that would remove my need to clear things on the scheduler
> fastpaths.  My guess is that I use both the per-CPU and the per-task
> variant in different places, but testing will tell!  ;-)

And after a few bug fixes (one key fix for an embarrassing bug from Joel),
this now passes 14-hour rcutorture on the scenario that used to get RCU
CPU stall warnings once or twice per hour.  Though still not material for
the upcoming merge window.

Thank you, everyone!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-08-19  2:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-24 11:53 How to turn scheduler tick on for current nohz_full CPU? Paul E. McKenney
2019-07-24 13:22 ` Frederic Weisbecker
2019-07-24 13:52   ` Paul E. McKenney
2019-07-24 14:30     ` Frederic Weisbecker
2019-07-25  1:12       ` Paul E. McKenney
2019-07-29 22:32         ` Paul E. McKenney
2019-07-30 16:43           ` Frederic Weisbecker
2019-07-30 17:36             ` Paul E. McKenney
2019-08-19  2:14               ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).