* [PATCH RT 0/4] Address rcutorture issues
@ 2019-06-19  1:19 Scott Wood
  2019-06-19  1:19 ` [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs Scott Wood
                   ` (4 more replies)
  0 siblings, 5 replies; 31+ messages in thread
From: Scott Wood @ 2019-06-19  1:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

With these patches, rcutorture mostly works on PREEMPT_RT_FULL.  I still
once in a while get forward progress complaints (particularly,
rcu_torture_fwd_prog_cr) when a grace period is held up for a few seconds
after which point so many callbacks have been enqueued that even making
reasonable progress isn't going to beat the timeout.  I believe I've only
seen this when running heavy loads in addition to rcutorture (though I've
done more testing under load than without); I don't know whether the
forward progress tests are expected to work under such load.

Scott Wood (4):
  rcu: Acquire RCU lock when disabling BHs
  sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
  rcu: unlock special: Treat irq and preempt disabled the same
  rcutorture: Avoid problematic critical section nesting

 include/linux/rcupdate.h |  4 +++
 include/linux/sched.h    |  4 +--
 kernel/rcu/rcutorture.c  | 92 ++++++++++++++++++++++++++++++++++++++++--------
 kernel/rcu/tree_plugin.h | 12 ++-----
 kernel/rcu/update.c      |  4 +++
 kernel/sched/core.c      |  2 ++
 kernel/softirq.c         | 12 +++++--
 7 files changed, 102 insertions(+), 28 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
@ 2019-06-19  1:19 ` Scott Wood
  2019-06-20 20:53   ` Paul E. McKenney
  2019-06-19  1:19 ` [PATCH RT 2/4] sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep Scott Wood
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-19  1:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel, Scott Wood

A plain local_bh_disable() is documented as creating an RCU critical
section, and (at least) rcutorture expects this to be the case.  However,
in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
preempt_count() directly.  Even if RCU were changed to check
in_softirq(), that wouldn't allow blocked BH disablers to be boosted.

Fix this by calling rcu_read_lock() from local_bh_disable(), and update
rcu_read_lock_bh_held() accordingly.

Signed-off-by: Scott Wood <swood@redhat.com>
---
 include/linux/rcupdate.h |  4 ++++
 kernel/rcu/update.c      |  4 ++++
 kernel/softirq.c         | 12 +++++++++---
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index fb267bc04fdf..aca4e5e25ace 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -637,10 +637,12 @@ static inline void rcu_read_unlock(void)
 static inline void rcu_read_lock_bh(void)
 {
 	local_bh_disable();
+#ifndef CONFIG_PREEMPT_RT_FULL
 	__acquire(RCU_BH);
 	rcu_lock_acquire(&rcu_bh_lock_map);
 	RCU_LOCKDEP_WARN(!rcu_is_watching(),
 			 "rcu_read_lock_bh() used illegally while idle");
+#endif
 }
 
 /*
@@ -650,10 +652,12 @@ static inline void rcu_read_lock_bh(void)
  */
 static inline void rcu_read_unlock_bh(void)
 {
+#ifndef CONFIG_PREEMPT_RT_FULL
 	RCU_LOCKDEP_WARN(!rcu_is_watching(),
 			 "rcu_read_unlock_bh() used illegally while idle");
 	rcu_lock_release(&rcu_bh_lock_map);
 	__release(RCU_BH);
+#endif
 	local_bh_enable();
 }
 
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 3700b730ea55..eb653a329e0b 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -307,7 +307,11 @@ int rcu_read_lock_bh_held(void)
 		return 0;
 	if (!rcu_lockdep_current_cpu_online())
 		return 0;
+#ifdef CONFIG_PREEMPT_RT_FULL
+	return lock_is_held(&rcu_lock_map) || irqs_disabled();
+#else
 	return in_softirq() || irqs_disabled();
+#endif
 }
 EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
 
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 473369122ddd..eb46dd3ff92d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -121,8 +121,10 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
 	long soft_cnt;
 
 	WARN_ON_ONCE(in_irq());
-	if (!in_atomic())
+	if (!in_atomic()) {
 		local_lock(bh_lock);
+		rcu_read_lock();
+	}
 	soft_cnt = this_cpu_inc_return(softirq_counter);
 	WARN_ON_ONCE(soft_cnt == 0);
 
@@ -155,8 +157,10 @@ void _local_bh_enable(void)
 	local_irq_restore(flags);
 #endif
 
-	if (!in_atomic())
+	if (!in_atomic()) {
+		rcu_read_unlock();
 		local_unlock(bh_lock);
+	}
 }
 
 void _local_bh_enable_rt(void)
@@ -189,8 +193,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
 	WARN_ON_ONCE(count < 0);
 	local_irq_enable();
 
-	if (!in_atomic())
+	if (!in_atomic()) {
+		rcu_read_unlock();
 		local_unlock(bh_lock);
+	}
 
 	preempt_check_resched();
 }
-- 
1.8.3.1



* [PATCH RT 2/4] sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
  2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
  2019-06-19  1:19 ` [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs Scott Wood
@ 2019-06-19  1:19 ` Scott Wood
  2019-06-19  1:19 ` [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same Scott Wood
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 31+ messages in thread
From: Scott Wood @ 2019-06-19  1:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel, Scott Wood

Without this, rcu_note_context_switch() will complain if an RCU read
lock is held when migrate_enable() calls stop_one_cpu().

Signed-off-by: Scott Wood <swood@redhat.com>
---
 include/linux/sched.h    | 4 ++--
 kernel/rcu/tree_plugin.h | 2 +-
 kernel/sched/core.c      | 2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e1ea2ea52feb..9b8334c24dad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -681,7 +681,7 @@ struct task_struct {
 	int				migrate_disable_atomic;
 # endif
 #endif
-#ifdef CONFIG_PREEMPT_RT_FULL
+#ifdef CONFIG_PREEMPT_RT_BASE
 	int				sleeping_lock;
 #endif
 
@@ -1870,7 +1870,7 @@ static __always_inline bool need_resched(void)
 	return unlikely(tif_need_resched());
 }
 
-#ifdef CONFIG_PREEMPT_RT_FULL
+#ifdef CONFIG_PREEMPT_RT_BASE
 static inline void sleeping_lock_inc(void)
 {
 	current->sleeping_lock++;
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 4bac3e5ee1ab..5d63914b3687 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -311,7 +311,7 @@ void rcu_note_context_switch(bool preempt)
 	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	lockdep_assert_irqs_disabled();
-#if defined(CONFIG_PREEMPT_RT_FULL)
+#if defined(CONFIG_PREEMPT_RT_BASE)
 	sleeping_l = t->sleeping_lock;
 #endif
 	WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0 && !sleeping_l);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a8493ff60b67..fce7d574ab5b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7351,7 +7351,9 @@ void migrate_enable(void)
 			unpin_current_cpu();
 			preempt_lazy_enable();
 			preempt_enable();
+			sleeping_lock_inc();
 			stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
+			sleeping_lock_dec();
 			tlb_migrate_finish(p->mm);
 
 			return;
-- 
1.8.3.1



* [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
  2019-06-19  1:19 ` [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs Scott Wood
  2019-06-19  1:19 ` [PATCH RT 2/4] sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep Scott Wood
@ 2019-06-19  1:19 ` Scott Wood
  2019-06-20 21:10   ` Paul E. McKenney
  2019-06-19  1:19 ` [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting Scott Wood
  2019-06-20 19:12 ` [PATCH RT 0/4] Address rcutorture issues Paul E. McKenney
  4 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-19  1:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel, Scott Wood

[Note: Just before posting this I noticed that the invoke_rcu_core stuff
 is part of the latest RCU pull request, and it has a patch that
 addresses this in a more complicated way that appears to deal with the
 bare irq-disabled sequence as well.

 Assuming we need/want to support such sequences, is the
 invoke_rcu_core() call actually going to result in scheduling any
 sooner?  resched_curr() just does the same setting of need_resched
 when it's the same cpu.
]

Since special should never be getting set inside an irqs-disabled
critical section, this is safe as long as there are no sequences of
rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/local_irq_enable()
(without preempt_disable() wrapped around the IRQ disabling, as spinlocks
do).  If there are such sequences, then the grace period may be delayed
until the next time need_resched is checked.

This is needed because otherwise, in a sequence such as:
1. rcu_read_lock()
2. *preempt*, set rcu_read_unlock_special.b.blocked
3. preempt_disable()
4. rcu_read_unlock()
5. preempt_enable()

...rcu_read_unlock_special.b.blocked will not be cleared during
step 4, because of the disabled preemption.  If an interrupt is then
taken between steps 4 and 5, and that interrupt enters scheduler code
that takes pi/rq locks, and an rcu read lock inside that, then when
dropping that rcu read lock we will end up in rcu_read_unlock_special()
again -- but this time, since irqs are disabled, it will call
invoke_rcu_core() in the RT tree (regardless of PREEMPT_RT_FULL), which
calls wake_up_process().  This can cause a pi/rq lock deadlock.  An
example of interrupt code that does this is scheduler_tick().

The above sequence can be found in (at least) __lock_task_sighand() (for
!PREEMPT_RT_FULL) and d_alloc_parallel().

It's potentially an issue on non-RT as well.  While
raise_softirq_irqoff() doesn't call wake_up_process() when in_interrupt()
is true, if code between steps 4 and 5 directly calls into scheduler
code, and that code uses RCU with pi/rq lock held, wake_up_process() can
still be called.

On RT, migrate_enable() is such a codepath, so an in_interrupt() check
alone would not work on RT.  Instead, keep track of whether we've already
had an rcu_read_unlock_special() with preemption disabled but haven't yet
scheduled, and rely on the preempt_enable() yet to come instead of
calling invoke_rcu_core().

Signed-off-by: Scott Wood <swood@redhat.com>
---
 kernel/rcu/tree_plugin.h | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 5d63914b3687..d7ddbcc7231c 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -630,14 +630,8 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	if (preempt_bh_were_disabled || irqs_were_disabled) {
 		WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, false);
 		/* Need to defer quiescent state until everything is enabled. */
-		if (irqs_were_disabled) {
-			/* Enabling irqs does not reschedule, so... */
-			invoke_rcu_core();
-		} else {
-			/* Enabling BH or preempt does reschedule, so... */
-			set_tsk_need_resched(current);
-			set_preempt_need_resched();
-		}
+		set_tsk_need_resched(current);
+		set_preempt_need_resched();
 		local_irq_restore(flags);
 		return;
 	}
-- 
1.8.3.1



* [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
                   ` (2 preceding siblings ...)
  2019-06-19  1:19 ` [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same Scott Wood
@ 2019-06-19  1:19 ` Scott Wood
  2019-06-20 21:18   ` Paul E. McKenney
  2019-06-20 19:12 ` [PATCH RT 0/4] Address rcutorture issues Paul E. McKenney
  4 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-19  1:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel, Scott Wood

rcutorture was generating some nesting scenarios that are not
reasonable.  Constrain the state selection to avoid them.

Example #1:

1. preempt_disable()
2. local_bh_disable()
3. preempt_enable()
4. local_bh_enable()

On PREEMPT_RT, BH disabling takes a local lock only when called in
non-atomic context.  Thus, atomic context must be retained until after BH
is re-enabled.  Likewise, if BH is initially disabled in non-atomic
context, it cannot be re-enabled in atomic context.

Example #2:

1. rcu_read_lock()
2. local_irq_disable()
3. rcu_read_unlock()
4. local_irq_enable()

If the thread is preempted between steps 1 and 2,
rcu_read_unlock_special.b.blocked will be set, but it won't be
acted on in step 3 because IRQs are disabled.  Thus, reporting of the
quiescent state will be delayed beyond the local_irq_enable().

Example #3:

1. preempt_disable()
2. local_irq_disable()
3. preempt_enable()
4. local_irq_enable()

If need_resched is set between steps 1 and 2, then the reschedule
in step 3 will not happen.

Signed-off-by: Scott Wood <swood@redhat.com>
---
TODO: Document restrictions and add debug checks for invalid sequences.

I had been planning to resolve #1 (only as shown, not the case of
disabling preemption while non-atomic and enabling while atomic) by
changing how migrate_disable() works to avoid the split behavior, but
recently BH disabling was changed to do the same thing.  I still plan to
send the migrate disable changes as a separate patchset, for the sake of
the significant performance improvement I saw.
---
 kernel/rcu/rcutorture.c | 92 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 78 insertions(+), 14 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 584b0d1da0a3..0523d9e78246 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -73,10 +73,13 @@
 #define RCUTORTURE_RDR_RBH	 0x08	/*  ... rcu_read_lock_bh(). */
 #define RCUTORTURE_RDR_SCHED	 0x10	/*  ... rcu_read_lock_sched(). */
 #define RCUTORTURE_RDR_RCU	 0x20	/*  ... entering another RCU reader. */
-#define RCUTORTURE_RDR_NBITS	 6	/* Number of bits defined above. */
+#define RCUTORTURE_RDR_ATOM_BH	 0x40	/*  ... disabling bh while atomic */
+#define RCUTORTURE_RDR_ATOM_RBH	 0x80	/*  ... RBH while atomic */
+#define RCUTORTURE_RDR_NBITS	 8	/* Number of bits defined above. */
 #define RCUTORTURE_MAX_EXTEND	 \
 	(RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \
-	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED)
+	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \
+	 RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)
 #define RCUTORTURE_RDR_MAX_LOOPS 0x7	/* Maximum reader extensions. */
 					/* Must be power of two minus one. */
 #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3)
@@ -1111,31 +1114,52 @@ static void rcutorture_one_extend(int *readstate, int newstate,
 	WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
 	rtrsp->rt_readstate = newstate;
 
-	/* First, put new protection in place to avoid critical-section gap. */
+	/*
+	 * First, put new protection in place to avoid critical-section gap.
+	 * Disable preemption around the ATOM disables to ensure that
+	 * in_atomic() is true.
+	 */
 	if (statesnew & RCUTORTURE_RDR_BH)
 		local_bh_disable();
+	if (statesnew & RCUTORTURE_RDR_RBH)
+		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_IRQ)
 		local_irq_disable();
 	if (statesnew & RCUTORTURE_RDR_PREEMPT)
 		preempt_disable();
-	if (statesnew & RCUTORTURE_RDR_RBH)
-		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_SCHED)
 		rcu_read_lock_sched();
+	preempt_disable();
+	if (statesnew & RCUTORTURE_RDR_ATOM_BH)
+		local_bh_disable();
+	if (statesnew & RCUTORTURE_RDR_ATOM_RBH)
+		rcu_read_lock_bh();
+	preempt_enable();
 	if (statesnew & RCUTORTURE_RDR_RCU)
 		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
 
-	/* Next, remove old protection, irq first due to bh conflict. */
+	/*
+	 * Next, remove old protection, in decreasing order of strength
+	 * to avoid unlock paths that aren't safe in the stronger
+	 * context.  Disable preemption around the ATOM enables in
+	 * case the context was only atomic due to IRQ disabling.
+	 */
+	preempt_disable();
 	if (statesold & RCUTORTURE_RDR_IRQ)
 		local_irq_enable();
-	if (statesold & RCUTORTURE_RDR_BH)
+	if (statesold & RCUTORTURE_RDR_ATOM_BH)
 		local_bh_enable();
+	if (statesold & RCUTORTURE_RDR_ATOM_RBH)
+		rcu_read_unlock_bh();
+	preempt_enable();
 	if (statesold & RCUTORTURE_RDR_PREEMPT)
 		preempt_enable();
-	if (statesold & RCUTORTURE_RDR_RBH)
-		rcu_read_unlock_bh();
 	if (statesold & RCUTORTURE_RDR_SCHED)
 		rcu_read_unlock_sched();
+	if (statesold & RCUTORTURE_RDR_BH)
+		local_bh_enable();
+	if (statesold & RCUTORTURE_RDR_RBH)
+		rcu_read_unlock_bh();
 	if (statesold & RCUTORTURE_RDR_RCU)
 		cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT);
 
@@ -1171,6 +1195,12 @@ static int rcutorture_extend_mask_max(void)
 	int mask = rcutorture_extend_mask_max();
 	unsigned long randmask1 = torture_random(trsp) >> 8;
 	unsigned long randmask2 = randmask1 >> 3;
+	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
+	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
+	unsigned long nonatomic_bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
+	unsigned long atomic_bhs = RCUTORTURE_RDR_ATOM_BH |
+				   RCUTORTURE_RDR_ATOM_RBH;
+	unsigned long tmp;
 
 	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
 	/* Most of the time lots of bits, half the time only one bit. */
@@ -1178,11 +1208,45 @@ static int rcutorture_extend_mask_max(void)
 		mask = mask & randmask2;
 	else
 		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
-	/* Can't enable bh w/irq disabled. */
-	if ((mask & RCUTORTURE_RDR_IRQ) &&
-	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
-	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
-		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
+
+	/*
+	 * Can't enable bh w/irq disabled.
+	 *
+	 * Can't enable preemption with irqs disabled, if irqs had ever
+	 * been enabled during this preempt critical section (could miss
+	 * a reschedule).
+	 */
+	tmp = atomic_bhs | nonatomic_bhs | preempts;
+	if (mask & RCUTORTURE_RDR_IRQ)
+		mask |= oldmask & tmp;
+
+	/*
+	 * Can't release the outermost rcu lock in an irq disabled
+	 * section without preemption also being disabled, if irqs had
+	 * ever been enabled during this RCU critical section (could leak
+	 * a special flag and delay reporting the qs).
+	 */
+	if ((oldmask & RCUTORTURE_RDR_RCU) && (mask & RCUTORTURE_RDR_IRQ) &&
+	    !(mask & preempts))
+		mask |= RCUTORTURE_RDR_RCU;
+
+	/* Can't modify atomic bh in non-atomic context */
+	if ((oldmask & atomic_bhs) && (mask & atomic_bhs) &&
+	    !(mask & preempts_irq)) {
+		mask |= oldmask & preempts_irq;
+		if (mask & RCUTORTURE_RDR_IRQ)
+			mask |= oldmask & tmp;
+	}
+	if ((mask & atomic_bhs) && !(mask & preempts_irq))
+		mask |= RCUTORTURE_RDR_PREEMPT;
+
+	/* Can't modify non-atomic bh in atomic context */
+	tmp = nonatomic_bhs;
+	if (oldmask & preempts_irq)
+		mask &= ~tmp;
+	if ((oldmask | mask) & preempts_irq)
+		mask |= oldmask & tmp;
+
 	if ((mask & RCUTORTURE_RDR_IRQ) &&
 	    !(mask & cur_ops->ext_irq_conflict) &&
 	    (oldmask & cur_ops->ext_irq_conflict))
-- 
1.8.3.1



* Re: [PATCH RT 0/4] Address rcutorture issues
  2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
                   ` (3 preceding siblings ...)
  2019-06-19  1:19 ` [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting Scott Wood
@ 2019-06-20 19:12 ` Paul E. McKenney
  4 siblings, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 19:12 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Tue, Jun 18, 2019 at 08:19:04PM -0500, Scott Wood wrote:
> With these patches, rcutorture mostly works on PREEMPT_RT_FULL.  I still
> once in a while get forward progress complaints (particularly,
> rcu_torture_fwd_prog_cr) when a grace period is held up for a few seconds
> after which point so many callbacks have been enqueued that even making
> reasonable progress isn't going to beat the timeout.  I believe I've only
> seen this when running heavy loads in addition to rcutorture (though I've
> done more testing under load than without); I don't know whether the
> forward progress tests are expected to work under such load.

With current -rcu, the torture tests are ahead of RCU's forward-progress
code, so rcu_torture_fwd_prog_cr() failures are expected behavior,
particularly in the TREE04 and TREE07 scenarios.  This is more of a
problem for real-time because of callback offloading, which removes the
backpressure that normally exists from callback processing to whatever
is running on that same CPU generating so many callbacks.  So this issue
has been providing me some entertainment.  ;-)

If you put lots of load on the system while running rcutorture, but
leave the CPU running rcu_torture_fwd_prog_cr() otherwise idle, then yes,
you can eventually force rcu_torture_fwd_prog_cr() failures pretty much no matter
what RCU does otherwise.  Particularly given that rcutorture is running
within a guest OS.  There has been some discussion of RCU asking for help
from the hypervisor, but it hasn't yet gotten further than discussion.

For another example, if all but one of the CPUs is a no-CBs CPU
(or, equivalently, a nohz_full CPU), and all of the rcuo kthreads
are constrained to run on the remaining CPU, it is not hard to create
workloads that produce more callbacks than that remaining CPU can possibly
keep up with.  The traditional position has of course been the Spiderman
principle "With great power comes great responsibility".  ;-)

							Thanx, Paul

> Scott Wood (4):
>   rcu: Acquire RCU lock when disabling BHs
>   sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
>   rcu: unlock special: Treat irq and preempt disabled the same
>   rcutorture: Avoid problematic critical section nesting
> 
>  include/linux/rcupdate.h |  4 +++
>  include/linux/sched.h    |  4 +--
>  kernel/rcu/rcutorture.c  | 92 ++++++++++++++++++++++++++++++++++++++++--------
>  kernel/rcu/tree_plugin.h | 12 ++-----
>  kernel/rcu/update.c      |  4 +++
>  kernel/sched/core.c      |  2 ++
>  kernel/softirq.c         | 12 +++++--
>  7 files changed, 102 insertions(+), 28 deletions(-)
> 
> -- 
> 1.8.3.1
> 



* Re: [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-19  1:19 ` [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs Scott Wood
@ 2019-06-20 20:53   ` Paul E. McKenney
  2019-06-20 21:06     ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 20:53 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Tue, Jun 18, 2019 at 08:19:05PM -0500, Scott Wood wrote:
> A plain local_bh_disable() is documented as creating an RCU critical
> section, and (at least) rcutorture expects this to be the case.  However,
> in_softirq() doesn't block a grace period on PREEMPT_RT, since RCU checks
> preempt_count() directly.  Even if RCU were changed to check
> in_softirq(), that wouldn't allow blocked BH disablers to be boosted.
> 
> Fix this by calling rcu_read_lock() from local_bh_disable(), and update
> rcu_read_lock_bh_held() accordingly.
> 
> Signed-off-by: Scott Wood <swood@redhat.com>
> ---
>  include/linux/rcupdate.h |  4 ++++
>  kernel/rcu/update.c      |  4 ++++
>  kernel/softirq.c         | 12 +++++++++---
>  3 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index fb267bc04fdf..aca4e5e25ace 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -637,10 +637,12 @@ static inline void rcu_read_unlock(void)
>  static inline void rcu_read_lock_bh(void)
>  {
>  	local_bh_disable();
> +#ifndef CONFIG_PREEMPT_RT_FULL

How about this instead?

	if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
		return;

And similarly below.

>  	__acquire(RCU_BH);
>  	rcu_lock_acquire(&rcu_bh_lock_map);
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(),
>  			 "rcu_read_lock_bh() used illegally while idle");
> +#endif
>  }
>  
>  /*
> @@ -650,10 +652,12 @@ static inline void rcu_read_lock_bh(void)
>   */
>  static inline void rcu_read_unlock_bh(void)
>  {
> +#ifndef CONFIG_PREEMPT_RT_FULL
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(),
>  			 "rcu_read_unlock_bh() used illegally while idle");
>  	rcu_lock_release(&rcu_bh_lock_map);
>  	__release(RCU_BH);
> +#endif
>  	local_bh_enable();
>  }
>  
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 3700b730ea55..eb653a329e0b 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -307,7 +307,11 @@ int rcu_read_lock_bh_held(void)
>  		return 0;
>  	if (!rcu_lockdep_current_cpu_online())
>  		return 0;
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +	return lock_is_held(&rcu_lock_map) || irqs_disabled();
> +#else
>  	return in_softirq() || irqs_disabled();
> +#endif
>  }
>  EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
>  
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 473369122ddd..eb46dd3ff92d 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -121,8 +121,10 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
>  	long soft_cnt;
>  
>  	WARN_ON_ONCE(in_irq());
> -	if (!in_atomic())
> +	if (!in_atomic()) {
>  		local_lock(bh_lock);
> +		rcu_read_lock();
> +	}
>  	soft_cnt = this_cpu_inc_return(softirq_counter);
>  	WARN_ON_ONCE(soft_cnt == 0);
>  
> @@ -155,8 +157,10 @@ void _local_bh_enable(void)
>  	local_irq_restore(flags);
>  #endif
>  
> -	if (!in_atomic())
> +	if (!in_atomic()) {
> +		rcu_read_unlock();
>  		local_unlock(bh_lock);
> +	}
>  }
>  
>  void _local_bh_enable_rt(void)
> @@ -189,8 +193,10 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>  	WARN_ON_ONCE(count < 0);
>  	local_irq_enable();
>  
> -	if (!in_atomic())
> +	if (!in_atomic()) {
> +		rcu_read_unlock();
>  		local_unlock(bh_lock);
> +	}
>  
>  	preempt_check_resched();
>  }

And I have to ask...

What did you do to test this change to kernel/softirq.c?  My past attempts
to do this sort of thing have always run afoul of open-coded BH transitions.

							Thanx, Paul


* Re: [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-20 20:53   ` Paul E. McKenney
@ 2019-06-20 21:06     ` Scott Wood
  2019-06-20 21:20       ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-20 21:06 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-20 at 13:53 -0700, Paul E. McKenney wrote:
> On Tue, Jun 18, 2019 at 08:19:05PM -0500, Scott Wood wrote:
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index fb267bc04fdf..aca4e5e25ace 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -637,10 +637,12 @@ static inline void rcu_read_unlock(void)
> >  static inline void rcu_read_lock_bh(void)
> >  {
> >  	local_bh_disable();
> > +#ifndef CONFIG_PREEMPT_RT_FULL
> 
> How about this instead?
> 
> 	if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
> 		return;

OK.

> > @@ -189,8 +193,10 @@ void __local_bh_enable_ip(unsigned long ip,
> > unsigned int cnt)
> >  	WARN_ON_ONCE(count < 0);
> >  	local_irq_enable();
> >  
> > -	if (!in_atomic())
> > +	if (!in_atomic()) {
> > +		rcu_read_unlock();
> >  		local_unlock(bh_lock);
> > +	}
> >  
> >  	preempt_check_resched();
> >  }
> 
> And I have to ask...
> 
> What did you do to test this change to kernel/softirq.c?  My past attempts
> to do this sort of thing have always run afoul of open-coded BH
> transitions.

Mostly rcutorture and loads such as kernel builds, on a debug kernel.  By
"open-coded BH transition" do you mean directly manipulating the preempt
count?  That would already be broken on RT.

-Scott




* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-19  1:19 ` [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same Scott Wood
@ 2019-06-20 21:10   ` Paul E. McKenney
  2019-06-20 21:59     ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 21:10 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> [Note: Just before posting this I noticed that the invoke_rcu_core stuff
>  is part of the latest RCU pull request, and it has a patch that
>  addresses this in a more complicated way that appears to deal with the
>  bare irq-disabled sequence as well.

Far easier to deal with it than to debug the lack of it.  ;-)

>  Assuming we need/want to support such sequences, is the
>  invoke_rcu_core() call actually going to result in scheduling any
>  sooner?  resched_curr() just does the same setting of need_resched
>  when it's the same cpu.
> ]

Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
Setting the CPU-local bits might not have effect until the next interrupt.

So if -rt wants the simpler and slower approach, the change needs to
use IS_ENABLED(CONFIG_PREEMPT_RT_FULL) or similar.  Not that this is
an issue until CONFIG_PREEMPT_RT_FULL hits mainline.

							Thanx, Paul

> Since special should never be getting set inside an irqs-disabled
> critical section, this is safe as long as there are no sequences of
> rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/local_irq_enable()
> (without preempt_disable() wrapped around the IRQ disabling, as spinlocks
> do).  If there are such sequences, then the grace period may be delayed
> until the next time need_resched is checked.
> 
> This is needed because otherwise, in a sequence such as:
> 1. rcu_read_lock()
> 2. *preempt*, set rcu_read_unlock_special.b.blocked
> 3. preempt_disable()
> 4. rcu_read_unlock()
> 5. preempt_enable()
> 
> ...rcu_read_unlock_special.b.blocked will not be cleared during
> step 4, because of the disabled preemption.  If an interrupt is then
> taken between steps 4 and 5, and that interrupt enters scheduler code
> that takes pi/rq locks, and an rcu read lock inside that, then when
> dropping that rcu read lock we will end up in rcu_read_unlock_special()
> again -- but this time, since irqs are disabled, it will call
> invoke_rcu_core() in the RT tree (regardless of PREEMPT_RT_FULL), which
> calls wake_up_process().  This can cause a pi/rq lock deadlock.  An
> example of interrupt code that does this is scheduler_tick().
> 
> The above sequence can be found in (at least) __lock_task_sighand() (for
> !PREEMPT_RT_FULL) and d_alloc_parallel().
> 
> It's potentially an issue on non-RT as well.  While
> raise_softirq_irqoff() doesn't call wake_up_process() when in_interrupt()
> is true, if code between steps 4 and 5 directly calls into scheduler
> code, and that code uses RCU with pi/rq lock held, wake_up_process() can
> still be called.
> 
> On RT, migrate_enable() is such a codepath, so an in_interrupt() check
> alone would not work on RT.  Instead, keep track of whether we've already
> had an rcu_read_unlock_special() with preemption disabled but haven't yet
> scheduled, and rely on the preempt_enable() yet to come instead of
> calling invoke_rcu_core().
> 
> Signed-off-by: Scott Wood <swood@redhat.com>
> ---
>  kernel/rcu/tree_plugin.h | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 5d63914b3687..d7ddbcc7231c 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -630,14 +630,8 @@ static void rcu_read_unlock_special(struct task_struct *t)
>  	if (preempt_bh_were_disabled || irqs_were_disabled) {
>  		WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, false);
>  		/* Need to defer quiescent state until everything is enabled. */
> -		if (irqs_were_disabled) {
> -			/* Enabling irqs does not reschedule, so... */
> -			invoke_rcu_core();
> -		} else {
> -			/* Enabling BH or preempt does reschedule, so... */
> -			set_tsk_need_resched(current);
> -			set_preempt_need_resched();
> -		}
> +		set_tsk_need_resched(current);
> +		set_preempt_need_resched();
>  		local_irq_restore(flags);
>  		return;
>  	}
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread
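[Editorial note: the failure sequence in the quoted commit message can be sketched as a hypothetical, self-contained C model. Nothing below is kernel code: plain counters stand in for the preempt count and for rcu_read_unlock_special.b.blocked, and the model only illustrates why the quiescent state must be deferred to the later preempt_enable() rather than reported via invoke_rcu_core().]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the deferred quiescent-state reporting discussed
 * above.  The real logic lives in rcu_read_unlock_special() in
 * kernel/rcu/tree_plugin.h; this is only an illustration. */
struct task_model {
	int rcu_nesting;	/* rcu_read_lock() depth */
	int preempt_count;
	bool blocked;		/* rcu_read_unlock_special.b.blocked */
	bool need_resched;	/* TIF_NEED_RESCHED */
};

static void m_rcu_read_lock(struct task_model *t)
{
	t->rcu_nesting++;
}

/* Step 2: preemption inside the reader marks the task as blocked. */
static void m_preempt_task(struct task_model *t)
{
	if (t->rcu_nesting)
		t->blocked = true;
}

static void m_preempt_disable(struct task_model *t)
{
	t->preempt_count++;
}

/* Step 4: with preemption disabled, the unlock cannot report the QS;
 * per this patch it only sets need_resched instead of calling
 * invoke_rcu_core() (which could wake_up_process() under pi/rq locks). */
static void m_rcu_read_unlock(struct task_model *t)
{
	t->rcu_nesting--;
	if (t->blocked) {
		if (t->preempt_count)
			t->need_resched = true;
		else
			t->blocked = false;	/* QS reported immediately */
	}
}

/* Step 5: the deferred QS is finally reported via schedule(). */
static void m_preempt_enable(struct task_model *t)
{
	if (!--t->preempt_count && t->need_resched) {
		t->need_resched = false;
		t->blocked = false;
	}
}
```

Running the five-step sequence against this model shows blocked surviving step 4 and clearing only at step 5 -- exactly the window in which an interrupt that takes pi/rq locks must not end up calling wake_up_process().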

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-19  1:19 ` [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting Scott Wood
@ 2019-06-20 21:18   ` Paul E. McKenney
  2019-06-20 21:43     ` Scott Wood
  2019-06-21 16:38     ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 21:18 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Tue, Jun 18, 2019 at 08:19:08PM -0500, Scott Wood wrote:
> rcutorture was generating some nesting scenarios that are not
> reasonable.  Constrain the state selection to avoid them.
> 
> Example #1:
> 
> 1. preempt_disable()
> 2. local_bh_disable()
> 3. preempt_enable()
> 4. local_bh_enable()
> 
> On PREEMPT_RT, BH disabling takes a local lock only when called in
> non-atomic context.  Thus, atomic context must be retained until after BH
> is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> context, it cannot be re-enabled in atomic context.
> 
> Example #2:
> 
> 1. rcu_read_lock()
> 2. local_irq_disable()
> 3. rcu_read_unlock()
> 4. local_irq_enable()
> 
> If the thread is preempted between steps 1 and 2,
> rcu_read_unlock_special.b.blocked will be set, but it won't be
> acted on in step 3 because IRQs are disabled.  Thus, reporting of the
> quiescent state will be delayed beyond the local_irq_enable().
> 
> Example #3:
> 
> 1. preempt_disable()
> 2. local_irq_disable()
> 3. preempt_enable()
> 4. local_irq_enable()
> 
> If need_resched is set between steps 1 and 2, then the reschedule
> in step 3 will not happen.
> 
> Signed-off-by: Scott Wood <swood@redhat.com>

OK for -rt, but as long as people can code those sequences without getting
their wrists slapped, RCU needs to deal with it.  So I cannot accept
this in mainline at the current time.  Yes, I will know when it is safe
to accept it when rcutorture's virtual wrist gets slapped in mainline.
Why did you ask?  ;-)

But I have to ask...  With this elaboration, is it time to make this a
data-driven state machine?  Or is the complexity not yet to the point
where that would constitute a simplification?

							Thanx, Paul

> ---
> TODO: Document restrictions and add debug checks for invalid sequences.
> 
> I had been planning to resolve #1 (only as shown, not the case of
> disabling preemption while non-atomic and enabling while atomic) by
> changing how migrate_disable() works to avoid the split behavior, but
> recently BH disabling was changed to do the same thing.  I still plan to
> send the migrate disable changes as a separate patchset, for the sake of
> the significant performance improvement I saw.
> ---
>  kernel/rcu/rcutorture.c | 92 +++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 78 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 584b0d1da0a3..0523d9e78246 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -73,10 +73,13 @@
>  #define RCUTORTURE_RDR_RBH	 0x08	/*  ... rcu_read_lock_bh(). */
>  #define RCUTORTURE_RDR_SCHED	 0x10	/*  ... rcu_read_lock_sched(). */
>  #define RCUTORTURE_RDR_RCU	 0x20	/*  ... entering another RCU reader. */
> -#define RCUTORTURE_RDR_NBITS	 6	/* Number of bits defined above. */
> +#define RCUTORTURE_RDR_ATOM_BH	 0x40	/*  ... disabling bh while atomic */
> +#define RCUTORTURE_RDR_ATOM_RBH	 0x80	/*  ... RBH while atomic */
> +#define RCUTORTURE_RDR_NBITS	 8	/* Number of bits defined above. */
>  #define RCUTORTURE_MAX_EXTEND	 \
>  	(RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \
> -	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED)
> +	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \
> +	 RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)
>  #define RCUTORTURE_RDR_MAX_LOOPS 0x7	/* Maximum reader extensions. */
>  					/* Must be power of two minus one. */
>  #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3)
> @@ -1111,31 +1114,52 @@ static void rcutorture_one_extend(int *readstate, int newstate,
>  	WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
>  	rtrsp->rt_readstate = newstate;
>  
> -	/* First, put new protection in place to avoid critical-section gap. */
> +	/*
> +	 * First, put new protection in place to avoid critical-section gap.
> +	 * Disable preemption around the ATOM disables to ensure that
> +	 * in_atomic() is true.
> +	 */
>  	if (statesnew & RCUTORTURE_RDR_BH)
>  		local_bh_disable();
> +	if (statesnew & RCUTORTURE_RDR_RBH)
> +		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_IRQ)
>  		local_irq_disable();
>  	if (statesnew & RCUTORTURE_RDR_PREEMPT)
>  		preempt_disable();
> -	if (statesnew & RCUTORTURE_RDR_RBH)
> -		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_SCHED)
>  		rcu_read_lock_sched();
> +	preempt_disable();
> +	if (statesnew & RCUTORTURE_RDR_ATOM_BH)
> +		local_bh_disable();
> +	if (statesnew & RCUTORTURE_RDR_ATOM_RBH)
> +		rcu_read_lock_bh();
> +	preempt_enable();
>  	if (statesnew & RCUTORTURE_RDR_RCU)
>  		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
>  
> -	/* Next, remove old protection, irq first due to bh conflict. */
> +	/*
> +	 * Next, remove old protection, in decreasing order of strength
> +	 * to avoid unlock paths that aren't safe in the stronger
> +	 * context.  Disable preemption around the ATOM enables in
> +	 * case the context was only atomic due to IRQ disabling.
> +	 */
> +	preempt_disable();
>  	if (statesold & RCUTORTURE_RDR_IRQ)
>  		local_irq_enable();
> -	if (statesold & RCUTORTURE_RDR_BH)
> +	if (statesold & RCUTORTURE_RDR_ATOM_BH)
>  		local_bh_enable();
> +	if (statesold & RCUTORTURE_RDR_ATOM_RBH)
> +		rcu_read_unlock_bh();
> +	preempt_enable();
>  	if (statesold & RCUTORTURE_RDR_PREEMPT)
>  		preempt_enable();
> -	if (statesold & RCUTORTURE_RDR_RBH)
> -		rcu_read_unlock_bh();
>  	if (statesold & RCUTORTURE_RDR_SCHED)
>  		rcu_read_unlock_sched();
> +	if (statesold & RCUTORTURE_RDR_BH)
> +		local_bh_enable();
> +	if (statesold & RCUTORTURE_RDR_RBH)
> +		rcu_read_unlock_bh();
>  	if (statesold & RCUTORTURE_RDR_RCU)
>  		cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT);
>  
> @@ -1171,6 +1195,12 @@ static int rcutorture_extend_mask_max(void)
>  	int mask = rcutorture_extend_mask_max();
>  	unsigned long randmask1 = torture_random(trsp) >> 8;
>  	unsigned long randmask2 = randmask1 >> 3;
> +	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> +	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> +	unsigned long nonatomic_bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +	unsigned long atomic_bhs = RCUTORTURE_RDR_ATOM_BH |
> +				   RCUTORTURE_RDR_ATOM_RBH;
> +	unsigned long tmp;
>  
>  	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
>  	/* Most of the time lots of bits, half the time only one bit. */
> @@ -1178,11 +1208,45 @@ static int rcutorture_extend_mask_max(void)
>  		mask = mask & randmask2;
>  	else
>  		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> -	/* Can't enable bh w/irq disabled. */
> -	if ((mask & RCUTORTURE_RDR_IRQ) &&
> -	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> -	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> -		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +
> +	/*
> +	 * Can't enable bh w/irq disabled.
> +	 *
> +	 * Can't enable preemption with irqs disabled, if irqs had ever
> +	 * been enabled during this preempt critical section (could miss
> +	 * a reschedule).
> +	 */
> +	tmp = atomic_bhs | nonatomic_bhs | preempts;
> +	if (mask & RCUTORTURE_RDR_IRQ)
> +		mask |= oldmask & tmp;
> +
> +	/*
> +	 * Can't release the outermost rcu lock in an irq disabled
> +	 * section without preemption also being disabled, if irqs had
> +	 * ever been enabled during this RCU critical section (could leak
> +	 * a special flag and delay reporting the qs).
> +	 */
> +	if ((oldmask & RCUTORTURE_RDR_RCU) && (mask & RCUTORTURE_RDR_IRQ) &&
> +	    !(mask & preempts))
> +		mask |= RCUTORTURE_RDR_RCU;
> +
> +	/* Can't modify atomic bh in non-atomic context */
> +	if ((oldmask & atomic_bhs) && (mask & atomic_bhs) &&
> +	    !(mask & preempts_irq)) {
> +		mask |= oldmask & preempts_irq;
> +		if (mask & RCUTORTURE_RDR_IRQ)
> +			mask |= oldmask & tmp;
> +	}
> +	if ((mask & atomic_bhs) && !(mask & preempts_irq))
> +		mask |= RCUTORTURE_RDR_PREEMPT;
> +
> +	/* Can't modify non-atomic bh in atomic context */
> +	tmp = nonatomic_bhs;
> +	if (oldmask & preempts_irq)
> +		mask &= ~tmp;
> +	if ((oldmask | mask) & preempts_irq)
> +		mask |= oldmask & tmp;
> +
>  	if ((mask & RCUTORTURE_RDR_IRQ) &&
>  	    !(mask & cur_ops->ext_irq_conflict) &&
>  	    (oldmask & cur_ops->ext_irq_conflict))
> -- 
> 1.8.3.1
> 

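[Editorial note: the new constraints in the hunk above are pure bit manipulation, so they can be exercised outside the kernel. Below is a hypothetical, self-contained C sketch: the bit values are copied from the patch, but constrain() paraphrases and simplifies only the first two new rules; it is not the rcutorture code itself.]

```c
#include <assert.h>

/* Reader-state bits, values as in the patch above. */
#define RDR_BH       0x01
#define RDR_IRQ      0x02
#define RDR_PREEMPT  0x04
#define RDR_RBH      0x08
#define RDR_SCHED    0x10
#define RDR_RCU      0x20
#define RDR_ATOM_BH  0x40
#define RDR_ATOM_RBH 0x80

/* Simplified paraphrase of the first constraints in the patch. */
static unsigned long constrain(unsigned long oldmask, unsigned long mask)
{
	unsigned long preempts = RDR_PREEMPT | RDR_SCHED;
	unsigned long preempts_irq = preempts | RDR_IRQ;
	unsigned long bhs = RDR_BH | RDR_RBH | RDR_ATOM_BH | RDR_ATOM_RBH;

	/* If the new state disables IRQs, carry over every old BH and
	 * preempt bit so nothing taken with IRQs on is released with
	 * IRQs off (generalizing the old "can't enable bh w/irq
	 * disabled" check). */
	if (mask & RDR_IRQ)
		mask |= oldmask & (bhs | preempts);

	/* Don't drop the outermost rcu_read_lock() with IRQs off unless
	 * preemption is also disabled (Example #2 above). */
	if ((oldmask & RDR_RCU) && (mask & RDR_IRQ) && !(mask & preempts))
		mask |= RDR_RCU;

	/* The atomic-context BH bits need some atomic context to
	 * change under. */
	if ((mask & (RDR_ATOM_BH | RDR_ATOM_RBH)) && !(mask & preempts_irq))
		mask |= RDR_PREEMPT;

	return mask;
}
```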

* Re: [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-20 21:06     ` Scott Wood
@ 2019-06-20 21:20       ` Paul E. McKenney
  2019-06-20 21:38         ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 21:20 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 20, 2019 at 04:06:02PM -0500, Scott Wood wrote:
> On Thu, 2019-06-20 at 13:53 -0700, Paul E. McKenney wrote:
> > On Tue, Jun 18, 2019 at 08:19:05PM -0500, Scott Wood wrote:
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index fb267bc04fdf..aca4e5e25ace 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -637,10 +637,12 @@ static inline void rcu_read_unlock(void)
> > >  static inline void rcu_read_lock_bh(void)
> > >  {
> > >  	local_bh_disable();
> > > +#ifndef CONFIG_PREEMPT_RT_FULL
> > 
> > How about this instead?
> > 
> > 	if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
> > 		return;
> 
> OK.
> 
> > > @@ -189,8 +193,10 @@ void __local_bh_enable_ip(unsigned long ip,
> > > unsigned int cnt)
> > >  	WARN_ON_ONCE(count < 0);
> > >  	local_irq_enable();
> > >  
> > > -	if (!in_atomic())
> > > +	if (!in_atomic()) {
> > > +		rcu_read_unlock();
> > >  		local_unlock(bh_lock);
> > > +	}
> > >  
> > >  	preempt_check_resched();
> > >  }
> > 
> > And I have to ask...
> > 
> > What did you do to test this change to kernel/softirq.c?  My past attempts
> > to do this sort of thing have always run afoul of open-coded BH
> > transitions.
> 
> Mostly rcutorture and loads such as kernel builds, on a debug kernel.  By
> "open-coded BH transition" do you mean directly manipulating the preempt
> count?  That would already be broken on RT.

OK, then maybe you guys have already done the needed cleanup work.  Cool!

But don't the additions of rcu_read_lock() and rcu_read_unlock() want
to be protected by "!IS_ENABLED(CONFIG_PREEMPT_RT_FULL)" or similar?

							Thanx, Paul


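[Editorial note: Paul's suggestion relies on IS_ENABLED() being a compile-time constant, so the early return keeps both configurations in one function without an #ifdef block. A hypothetical, self-contained C illustration of the pattern follows, with a stand-in IS_ENABLED() macro; the real one lives in include/linux/kconfig.h, and the function body here is only a placeholder for the non-RT bookkeeping.]

```c
#include <assert.h>

/* Stand-in for the kernel's IS_ENABLED(); the real macro expands
 * CONFIG_* symbols to 0 or 1 at compile time, so the compiler
 * eliminates the dead branch entirely. */
#define IS_ENABLED(option) (option)
#define CONFIG_PREEMPT_RT_FULL 1	/* pretend we are on an RT config */

static int rcu_bh_annotations;

/* Shape of the discussed rcu_read_lock_bh(): on RT, local_bh_disable()
 * itself enters an RCU read-side critical section, so the extra non-RT
 * bookkeeping is skipped via an early return instead of an #ifndef. */
static void rcu_read_lock_bh_model(void)
{
	/* local_bh_disable() would go here */
	if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
		return;
	rcu_bh_annotations++;	/* non-RT-only lockdep/sparse bookkeeping */
}
```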

* Re: [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-20 21:20       ` Paul E. McKenney
@ 2019-06-20 21:38         ` Scott Wood
  2019-06-20 22:16           ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-20 21:38 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-20 at 14:20 -0700, Paul E. McKenney wrote:
> On Thu, Jun 20, 2019 at 04:06:02PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-20 at 13:53 -0700, Paul E. McKenney wrote:
> > > And I have to ask...
> > > 
> > > What did you do to test this change to kernel/softirq.c?  My past
> > > attempts
> > > to do this sort of thing have always run afoul of open-coded BH
> > > transitions.
> > 
> > Mostly rcutorture and loads such as kernel builds, on a debug
> > kernel.  By
> > "open-coded BH transition" do you mean directly manipulating the preempt
> > count?  That would already be broken on RT.
> 
> OK, then maybe you guys have already done the needed cleanup work.  Cool!

Do you remember what code was doing such things?  Grepping for the obvious
things doesn't turn up anything outside the softirq code, even in the
earlier non-RT kernels I checked.

> But don't the additions of rcu_read_lock() and rcu_read_unlock() want
> to be protected by "!IS_ENABLED(CONFIG_PREEMPT_RT_FULL)" or similar?

This is already a separate PREEMPT_RT_FULL-specific implementation.

-Scott




* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-20 21:18   ` Paul E. McKenney
@ 2019-06-20 21:43     ` Scott Wood
  2019-06-21 16:38     ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 31+ messages in thread
From: Scott Wood @ 2019-06-20 21:43 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-20 at 14:18 -0700, Paul E. McKenney wrote:
> On Tue, Jun 18, 2019 at 08:19:08PM -0500, Scott Wood wrote:
> > rcutorture was generating some nesting scenarios that are not
> > reasonable.  Constrain the state selection to avoid them.
> > 
> > Example #1:
> > 
> > 1. preempt_disable()
> > 2. local_bh_disable()
> > 3. preempt_enable()
> > 4. local_bh_enable()
> > 
> > On PREEMPT_RT, BH disabling takes a local lock only when called in
> > non-atomic context.  Thus, atomic context must be retained until after
> > BH
> > is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> > context, it cannot be re-enabled in atomic context.
> > 
> > Example #2:
> > 
> > 1. rcu_read_lock()
> > 2. local_irq_disable()
> > 3. rcu_read_unlock()
> > 4. local_irq_enable()
> > 
> > If the thread is preempted between steps 1 and 2,
> > rcu_read_unlock_special.b.blocked will be set, but it won't be
> > acted on in step 3 because IRQs are disabled.  Thus, reporting of the
> > quiescent state will be delayed beyond the local_irq_enable().
> > 
> > Example #3:
> > 
> > 1. preempt_disable()
> > 2. local_irq_disable()
> > 3. preempt_enable()
> > 4. local_irq_enable()
> > 
> > If need_resched is set between steps 1 and 2, then the reschedule
> > in step 3 will not happen.
> > 
> > Signed-off-by: Scott Wood <swood@redhat.com>
> 
> OK for -rt, but as long as people can code those sequences without getting
> their wrists slapped, RCU needs to deal with it.  So I cannot accept
> this in mainline at the current time.  Yes, I will know when it is safe
> to accept it when rcutorture's virtual wrist gets slapped in mainline.
> Why did you ask?  ;-)

Hence the "TODO: Document restrictions and add debug checks for invalid
sequences". :-)

> But I have to ask...  With this elaboration, is it time to make this a
> data-driven state machine?  Or is the complexity not yet to the point
> where that would constitute a simplification?

Perhaps... Making the entire sequence visible to the constraint enforcement
would also allow relaxing some of the constraints that currently have to
assume the worst regarding what happened before the most recent state.

-Scott



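[Editorial note: Example #1's asymmetry on PREEMPT_RT can likewise be sketched as a hypothetical, self-contained C model. The real code is the local_lock handling in kernel/softirq.c on RT; here the sleeping bh lock is just a depth counter, and the point is only that the lock-taken decision at disable time must match the state at enable time.]

```c
#include <assert.h>

/* Hypothetical model of local_bh_disable()/local_bh_enable() on
 * PREEMPT_RT as described in Example #1: the sleeping bh local lock is
 * taken only from non-atomic context. */
struct rt_cpu {
	int preempt_count;
	int bh_lock_depth;	/* local_lock(bh_lock) recursion depth */
	int softirq_count;
};

static int in_atomic(const struct rt_cpu *c)
{
	return c->preempt_count > 0;
}

static void rt_local_bh_disable(struct rt_cpu *c)
{
	if (!in_atomic(c))
		c->bh_lock_depth++;	/* would take the sleeping lock */
	c->softirq_count++;
}

/* Returns -1 on a lock imbalance, which is what Example #1 produces. */
static int rt_local_bh_enable(struct rt_cpu *c)
{
	c->softirq_count--;
	if (!in_atomic(c)) {
		if (c->bh_lock_depth == 0)
			return -1;	/* unlocking a lock never taken */
		c->bh_lock_depth--;
	}
	return 0;
}
```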

* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-20 21:10   ` Paul E. McKenney
@ 2019-06-20 21:59     ` Scott Wood
  2019-06-20 22:25       ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-20 21:59 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > [Note: Just before posting this I noticed that the invoke_rcu_core stuff
> >  is part of the latest RCU pull request, and it has a patch that
> >  addresses this in a more complicated way that appears to deal with the
> >  bare irq-disabled sequence as well.
> 
> Far easier to deal with it than to debug the lack of it.  ;-)
> 
> >  Assuming we need/want to support such sequences, is the
> >  invoke_rcu_core() call actually going to result in scheduling any
> >  sooner?  resched_curr() just does the same setting of need_resched
> >  when it's the same cpu.
> > ]
> 
> Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
> Setting the CPU-local bits might not have effect until the next interrupt.

Maybe I'm missing something, but I don't see how (in the non-use_softirq
case).  It just calls wake_up_process(), which in resched_curr() will set
need_resched but not do an IPI-to-self.

-Scott




* Re: [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs
  2019-06-20 21:38         ` Scott Wood
@ 2019-06-20 22:16           ` Paul E. McKenney
  0 siblings, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 22:16 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 20, 2019 at 04:38:47PM -0500, Scott Wood wrote:
> On Thu, 2019-06-20 at 14:20 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 20, 2019 at 04:06:02PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-20 at 13:53 -0700, Paul E. McKenney wrote:
> > > > And I have to ask...
> > > > 
> > > > What did you do to test this change to kernel/softirq.c?  My past
> > > > attempts
> > > > to do this sort of thing have always run afoul of open-coded BH
> > > > transitions.
> > > 
> > > Mostly rcutorture and loads such as kernel builds, on a debug
> > > kernel.  By
> > > "open-coded BH transition" do you mean directly manipulating the preempt
> > > count?  That would already be broken on RT.
> > 
> > OK, then maybe you guys have already done the needed cleanup work.  Cool!
> 
> Do you remember what code was doing such things?  Grepping for the obvious
> things doesn't turn up anything outside the softirq code, even in the
> earlier non-RT kernels I checked.

It was many years ago, and it is quite possible that I am conflating
irqs with bh or some such.  If it now works, it now works.

> > But don't the additions of rcu_read_lock() and rcu_read_unlock() want
> > to be protected by "!IS_ENABLED(CONFIG_PREEMPT_RT_FULL)" or similar?
> 
> This is already a separate PREEMPT_RT_FULL-specific implementation.

Ah, sorry for the noise, then!

							Thanx, Paul



* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-20 21:59     ` Scott Wood
@ 2019-06-20 22:25       ` Paul E. McKenney
  2019-06-20 23:08         ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-20 22:25 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 20, 2019 at 04:59:30PM -0500, Scott Wood wrote:
> On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> > On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > > [Note: Just before posting this I noticed that the invoke_rcu_core stuff
> > >  is part of the latest RCU pull request, and it has a patch that
> > >  addresses this in a more complicated way that appears to deal with the
> > >  bare irq-disabled sequence as well.
> > 
> > Far easier to deal with it than to debug the lack of it.  ;-)
> > 
> > >  Assuming we need/want to support such sequences, is the
> > >  invoke_rcu_core() call actually going to result in scheduling any
> > >  sooner?  resched_curr() just does the same setting of need_resched
> > >  when it's the same cpu.
> > > ]
> > 
> > Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
> > Setting the CPU-local bits might not have effect until the next interrupt.
> 
> Maybe I'm missing something, but I don't see how (in the non-use_softirq
> case).  It just calls wake_up_process(), which in resched_curr() will set
> need_resched but not do an IPI-to-self.

The common non-rt case will be use_softirq.  Or are you referring
specifically to this block of code in current -rcu?

		} else if (exp && irqs_were_disabled && !use_softirq &&
			   !t->rcu_read_unlock_special.b.deferred_qs) {
			// Safe to awaken and we get no help from enabling
			// irqs, unlike bh/preempt.
			invoke_rcu_core();

								Thanx, Paul



* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-20 22:25       ` Paul E. McKenney
@ 2019-06-20 23:08         ` Scott Wood
  2019-06-22  0:26           ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-20 23:08 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-20 at 15:25 -0700, Paul E. McKenney wrote:
> On Thu, Jun 20, 2019 at 04:59:30PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > > > [Note: Just before posting this I noticed that the invoke_rcu_core
> > > > stuff
> > > >  is part of the latest RCU pull request, and it has a patch that
> > > >  addresses this in a more complicated way that appears to deal with
> > > > the
> > > >  bare irq-disabled sequence as well.
> > > 
> > > Far easier to deal with it than to debug the lack of it.  ;-)
> > > 
> > > >  Assuming we need/want to support such sequences, is the
> > > >  invoke_rcu_core() call actually going to result in scheduling any
> > > >  sooner?  resched_curr() just does the same setting of need_resched
> > > >  when it's the same cpu.
> > > > ]
> > > 
> > > Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
> > > Setting the CPU-local bits might not have effect until the next
> > > interrupt.
> > 
> > Maybe I'm missing something, but I don't see how (in the non-use_softirq
> > case).  It just calls wake_up_process(), which in resched_curr() will
> > set
> > need_resched but not do an IPI-to-self.
> 
> The common non-rt case will be use_softirq.  Or are you referring
> specifically to this block of code in current -rcu?
> 
> 		} else if (exp && irqs_were_disabled && !use_softirq &&
> 			   !t->rcu_read_unlock_special.b.deferred_qs) {
> 			// Safe to awaken and we get no help from enabling
> 			// irqs, unlike bh/preempt.
> 			invoke_rcu_core();

Yes, that one.  If that block is removed the else path should be sufficient,
now that an IPI-to-self has been added.

Also, shouldn't the IPI-to-self be conditioned on irqs_were_disabled? 
Besides that being the problem the IPI was meant to address, if irqs are
enabled the IPI is likely to happen before preempt is re-enabled and thus it
won't accomplish anything.

-Scott




* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-20 21:18   ` Paul E. McKenney
  2019-06-20 21:43     ` Scott Wood
@ 2019-06-21 16:38     ` Sebastian Andrzej Siewior
  2019-06-21 23:59       ` Paul E. McKenney
  1 sibling, 1 reply; 31+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-06-21 16:38 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Scott Wood, Peter Zijlstra, Steven Rostedt, Thomas Gleixner,
	Juri Lelli, Clark Williams, linux-rt-users, linux-kernel

On 2019-06-20 14:18:26 [-0700], Paul E. McKenney wrote:
> > Example #1:
> > 
> > 1. preempt_disable()
> > 2. local_bh_disable()
> > 3. preempt_enable()
> > 4. local_bh_enable()
> > 
> > Example #2:
> > 
> > 1. rcu_read_lock()
> > 2. local_irq_disable()
> > 3. rcu_read_unlock()
> > 4. local_irq_enable()
> > 
> > Example #3:
> > 
> > 1. preempt_disable()
> > 2. local_irq_disable()
> > 3. preempt_enable()
> > 4. local_irq_enable()
> 
> OK for -rt, but as long as people can code those sequences without getting
> their wrists slapped, RCU needs to deal with it.  So I cannot accept
> this in mainline at the current time.  Yes, I will know when it is safe
> to accept it when rcutorture's virtual wrist gets slapped in mainline.

None of the three examples is symmetrical, so if people use such sequences
in mainline then they should get their wrists slapped. Since RT trips over
each one of those, I try to get rid of them whenever I notice something
like that.

In example #3 you would lose a scheduling event if TIF_NEED_RESCHED gets
set between steps 1 and 2 (as a local reschedule request), because the
preempt_enable() in step 3, which would normally trigger schedule(),
cannot do so while IRQs are off.

Sebastian

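[Editorial note: Sebastian's lost-reschedule point can be illustrated with a hypothetical, self-contained C model of the preempt/IRQ state. Greatly simplified, but the two behaviors it encodes are real: a plain local_irq_enable() never checks TIF_NEED_RESCHED, and preempt_schedule() bails out while IRQs are disabled.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of Example #3's lost scheduling event. */
struct cpu_model {
	int preempt_count;
	bool irqs_disabled;
	bool need_resched;	/* TIF_NEED_RESCHED */
	int schedules;		/* how many times schedule() ran */
};

static void m_try_schedule(struct cpu_model *c)
{
	if (c->need_resched && !c->preempt_count && !c->irqs_disabled) {
		c->need_resched = false;
		c->schedules++;
	}
}

static void m_preempt_disable(struct cpu_model *c)
{
	c->preempt_count++;
}

static void m_local_irq_disable(struct cpu_model *c)
{
	c->irqs_disabled = true;
}

static void m_preempt_enable(struct cpu_model *c)
{
	c->preempt_count--;
	m_try_schedule(c);	/* preempt_schedule() bails with IRQs off */
}

static void m_local_irq_enable(struct cpu_model *c)
{
	c->irqs_disabled = false;
	/* note: no m_try_schedule() here, matching the real behavior */
}
```

With the 1-2-3-4 ordering the model leaves need_resched pending and never schedules; nesting the sections properly (enabling IRQs before preemption) lets the preempt_enable() observe the request.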

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-21 16:38     ` Sebastian Andrzej Siewior
@ 2019-06-21 23:59       ` Paul E. McKenney
  2019-06-26 15:08         ` Steven Rostedt
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-21 23:59 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Scott Wood, Peter Zijlstra, Steven Rostedt, Thomas Gleixner,
	Juri Lelli, Clark Williams, linux-rt-users, linux-kernel

On Fri, Jun 21, 2019 at 06:38:21PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-06-20 14:18:26 [-0700], Paul E. McKenney wrote:
> > > Example #1:
> > > 
> > > 1. preempt_disable()
> > > 2. local_bh_disable()
> > > 3. preempt_enable()
> > > 4. local_bh_enable()
> > > 
> > > Example #2:
> > > 
> > > 1. rcu_read_lock()
> > > 2. local_irq_disable()
> > > 3. rcu_read_unlock()
> > > 4. local_irq_enable()
> > > 
> > > Example #3:
> > > 
> > > 1. preempt_disable()
> > > 2. local_irq_disable()
> > > 3. preempt_enable()
> > > 4. local_irq_enable()
> > 
> > OK for -rt, but as long as people can code those sequences without getting
> > their wrists slapped, RCU needs to deal with it.  So I cannot accept
> > this in mainline at the current time.  Yes, I will know when it is safe
> > to accept it when rcutorture's virtual wrist gets slapped in mainline.
> 
> None of the three examples is symmetrical, so if people use such sequences
> in mainline then they should get their wrists slapped. Since RT trips over
> each one of those, I try to get rid of them whenever I notice something
> like that.
> 
> In example #3 you would lose a scheduling event if TIF_NEED_RESCHED gets
> set between steps 1 and 2 (as a local reschedule request), because the
> preempt_enable() in step 3, which would normally trigger schedule(),
> cannot do so while IRQs are off.

I have no objection to the outlawing of a number of these sequences in
mainline, but am rather pointing out that until they really are outlawed
and eliminated, rcutorture must continue to test them in mainline.
Of course, an rcutorture running in -rt should avoid testing things that
break -rt, including these sequences.

							Thanx, Paul


* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-20 23:08         ` Scott Wood
@ 2019-06-22  0:26           ` Paul E. McKenney
  2019-06-22 19:13             ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-22  0:26 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 20, 2019 at 06:08:19PM -0500, Scott Wood wrote:
> On Thu, 2019-06-20 at 15:25 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 20, 2019 at 04:59:30PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> > > > On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > > > > [Note: Just before posting this I noticed that the invoke_rcu_core
> > > > > stuff
> > > > >  is part of the latest RCU pull request, and it has a patch that
> > > > >  addresses this in a more complicated way that appears to deal with
> > > > > the
> > > > >  bare irq-disabled sequence as well.
> > > > 
> > > > Far easier to deal with it than to debug the lack of it.  ;-)
> > > > 
> > > > >  Assuming we need/want to support such sequences, is the
> > > > >  invoke_rcu_core() call actually going to result in scheduling any
> > > > >  sooner?  resched_curr() just does the same setting of need_resched
> > > > >  when it's the same cpu.
> > > > > ]
> > > > 
> > > > Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
> > > > Setting the CPU-local bits might not have effect until the next
> > > > interrupt.
> > > 
> > > Maybe I'm missing something, but I don't see how (in the non-use_softirq
> > > case).  It just calls wake_up_process(), which in resched_curr() will
> > > set
> > > need_resched but not do an IPI-to-self.
> > 
> > The common non-rt case will be use_softirq.  Or are you referring
> > specifically to this block of code in current -rcu?
> > 
> > 		} else if (exp && irqs_were_disabled && !use_softirq &&
> > 			   !t->rcu_read_unlock_special.b.deferred_qs) {
> > 			// Safe to awaken and we get no help from enabling
> > 			// irqs, unlike bh/preempt.
> > 			invoke_rcu_core();
> 
> Yes, that one.  If that block is removed the else path should be sufficient,
> now that an IPI-to-self has been added.

I will give it a try and let you know what happens.

> Also, shouldn't the IPI-to-self be conditioned on irqs_were_disabled? 
> Besides that being the problem the IPI was meant to address, if irqs are
> enabled the IPI is likely to happen before preempt is re-enabled and thus it
> won't accomplish anything.

Plus if preempt is disabled, the later preempt_enable() will check
need_resched (ditto for local_bh_enable()).  Unless the preempt_enable()
is instead a preempt_enable_no_resched(), of course.  :-/

							Thanx, Paul


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-22  0:26           ` Paul E. McKenney
@ 2019-06-22 19:13             ` Paul E. McKenney
  2019-06-24 17:40               ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-22 19:13 UTC (permalink / raw)
  To: Scott Wood
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Fri, Jun 21, 2019 at 05:26:06PM -0700, Paul E. McKenney wrote:
> On Thu, Jun 20, 2019 at 06:08:19PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-20 at 15:25 -0700, Paul E. McKenney wrote:
> > > On Thu, Jun 20, 2019 at 04:59:30PM -0500, Scott Wood wrote:
> > > > On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> > > > > On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > > > > > [Note: Just before posting this I noticed that the invoke_rcu_core
> > > > > > stuff
> > > > > >  is part of the latest RCU pull request, and it has a patch that
> > > > > >  addresses this in a more complicated way that appears to deal with
> > > > > > the
> > > > > >  bare irq-disabled sequence as well.
> > > > > 
> > > > > Far easier to deal with it than to debug the lack of it.  ;-)
> > > > > 
> > > > > >  Assuming we need/want to support such sequences, is the
> > > > > >  invoke_rcu_core() call actually going to result in scheduling any
> > > > > >  sooner?  resched_curr() just does the same setting of need_resched
> > > > > >  when it's the same cpu.
> > > > > > ]
> > > > > 
> > > > > Yes, invoke_rcu_core() can in some cases invoke the scheduler sooner.
> > > > > Setting the CPU-local bits might not have effect until the next
> > > > > interrupt.
> > > > 
> > > > Maybe I'm missing something, but I don't see how (in the non-use_softirq
> > > > case).  It just calls wake_up_process(), which in resched_curr() will
> > > > set
> > > > need_resched but not do an IPI-to-self.
> > > 
> > > The common non-rt case will be use_softirq.  Or are you referring
> > > specifically to this block of code in current -rcu?
> > > 
> > > 		} else if (exp && irqs_were_disabled && !use_softirq &&
> > > 			   !t->rcu_read_unlock_special.b.deferred_qs) {
> > > 			// Safe to awaken and we get no help from enabling
> > > 			// irqs, unlike bh/preempt.
> > > 			invoke_rcu_core();
> > 
> > Yes, that one.  If that block is removed the else path should be sufficient,
> > now that an IPI-to-self has been added.
> 
> I will give it a try and let you know what happens.

How about the following?

								Thanx, Paul

------------------------------------------------------------------------

commit 2fd23b1b649bf7e0754fa1dfce01e945bc62f4af
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date:   Sat Jun 22 12:05:54 2019 -0700

    rcu: Simplify rcu_read_unlock_special() deferred wakeups
    
    In !use_softirq runs, we clearly cannot rely on raise_softirq() and
    its lightweight bit setting, so we must instead do some form of wakeup.
    In the absence of a self-IPI when interrupts are disabled, these wakeups
    can be delayed until the next interrupt occurs.  This means that calling
    invoke_rcu_core() doesn't actually do any expediting.
    
    In this case, it is better to take the "else" clause, which sets the
    current CPU's resched bits and, if there is an expedited grace period
    in flight, uses IRQ-work to force the needed self-IPI.  This commit
    therefore removes the "else if" clause that calls invoke_rcu_core().
    
    Reported-by: Scott Wood <swood@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 841060fce33c..c631413f457f 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -629,17 +629,12 @@ static void rcu_read_unlock_special(struct task_struct *t)
 			// Using softirq, safe to awaken, and we get
 			// no help from enabling irqs, unlike bh/preempt.
 			raise_softirq_irqoff(RCU_SOFTIRQ);
-		} else if (exp && irqs_were_disabled && !use_softirq &&
-			   !t->rcu_read_unlock_special.b.deferred_qs) {
-			// Safe to awaken and we get no help from enabling
-			// irqs, unlike bh/preempt.
-			invoke_rcu_core();
 		} else {
 			// Enabling BH or preempt does reschedule, so...
 			// Also if no expediting or NO_HZ_FULL, slow is OK.
 			set_tsk_need_resched(current);
 			set_preempt_need_resched();
-			if (IS_ENABLED(CONFIG_IRQ_WORK) &&
+			if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled &&
 			    !rdp->defer_qs_iw_pending && exp) {
 				// Get scheduler to re-evaluate and call hooks.
 				// If !IRQ_WORK, FQS scan will eventually IPI.


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same
  2019-06-22 19:13             ` Paul E. McKenney
@ 2019-06-24 17:40               ` Scott Wood
  0 siblings, 0 replies; 31+ messages in thread
From: Scott Wood @ 2019-06-24 17:40 UTC (permalink / raw)
  To: paulmck
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Sat, 2019-06-22 at 12:13 -0700, Paul E. McKenney wrote:
> On Fri, Jun 21, 2019 at 05:26:06PM -0700, Paul E. McKenney wrote:
> > On Thu, Jun 20, 2019 at 06:08:19PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-20 at 15:25 -0700, Paul E. McKenney wrote:
> > > > On Thu, Jun 20, 2019 at 04:59:30PM -0500, Scott Wood wrote:
> > > > > On Thu, 2019-06-20 at 14:10 -0700, Paul E. McKenney wrote:
> > > > > > On Tue, Jun 18, 2019 at 08:19:07PM -0500, Scott Wood wrote:
> > > > > > > [Note: Just before posting this I noticed that the
> > > > > > > invoke_rcu_core
> > > > > > > stuff
> > > > > > >  is part of the latest RCU pull request, and it has a patch
> > > > > > > that
> > > > > > >  addresses this in a more complicated way that appears to deal
> > > > > > > with
> > > > > > > the
> > > > > > >  bare irq-disabled sequence as well.
> > > > > > 
> > > > > > Far easier to deal with it than to debug the lack of it.  ;-)
> > > > > > 
> > > > > > >  Assuming we need/want to support such sequences, is the
> > > > > > >  invoke_rcu_core() call actually going to result in scheduling
> > > > > > > any
> > > > > > >  sooner?  resched_curr() just does the same setting of
> > > > > > > need_resched
> > > > > > >  when it's the same cpu.
> > > > > > > ]
> > > > > > 
> > > > > > Yes, invoke_rcu_core() can in some cases invoke the scheduler
> > > > > > sooner.
> > > > > > Setting the CPU-local bits might not have effect until the next
> > > > > > interrupt.
> > > > > 
> > > > > Maybe I'm missing something, but I don't see how (in the non-
> > > > > use_softirq
> > > > > case).  It just calls wake_up_process(), which in resched_curr()
> > > > > will
> > > > > set
> > > > > need_resched but not do an IPI-to-self.
> > > > 
> > > > The common non-rt case will be use_softirq.  Or are you referring
> > > > specifically to this block of code in current -rcu?
> > > > 
> > > > 		} else if (exp && irqs_were_disabled && !use_softirq
> > > > &&
> > > > 			   !t-
> > > > >rcu_read_unlock_special.b.deferred_qs) {
> > > > 			// Safe to awaken and we get no help from
> > > > enabling
> > > > 			// irqs, unlike bh/preempt.
> > > > 			invoke_rcu_core();
> > > 
> > > Yes, that one.  If that block is removed the else path should be
> > > sufficient,
> > > now that an IPI-to-self has been added.
> > 
> > I will give it a try and let you know what happens.
> 
> How about the following?

Looks good, thanks.

-Scott



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-21 23:59       ` Paul E. McKenney
@ 2019-06-26 15:08         ` Steven Rostedt
  2019-06-26 16:49           ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2019-06-26 15:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Sebastian Andrzej Siewior, Scott Wood, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Fri, 21 Jun 2019 16:59:55 -0700
"Paul E. McKenney" <paulmck@linux.ibm.com> wrote:

> I have no objection to the outlawing of a number of these sequences in
> mainline, but am rather pointing out that until they really are outlawed
> and eliminated, rcutorture must continue to test them in mainline.
> Of course, an rcutorture running in -rt should avoid testing things that
> break -rt, including these sequences.

We should update lockdep to complain about these sequences. That would
"outlaw" them in mainline. That is, after we clean up all the current
sequences in the code. And we also need to get Linus's approval of this
as I believe he was against enforcing this in the past.

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-26 15:08         ` Steven Rostedt
@ 2019-06-26 16:49           ` Scott Wood
  2019-06-27 18:00             ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-26 16:49 UTC (permalink / raw)
  To: Steven Rostedt, Paul E. McKenney
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Thomas Gleixner,
	Juri Lelli, Clark Williams, linux-rt-users, linux-kernel

On Wed, 2019-06-26 at 11:08 -0400, Steven Rostedt wrote:
> On Fri, 21 Jun 2019 16:59:55 -0700
> "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> 
> > I have no objection to the outlawing of a number of these sequences in
> > mainline, but am rather pointing out that until they really are outlawed
> > and eliminated, rcutorture must continue to test them in mainline.
> > Of course, an rcutorture running in -rt should avoid testing things that
> > break -rt, including these sequences.
> 
> We should update lockdep to complain about these sequences. That would
> "outlaw" them in mainline. That is, after we clean up all the current
> sequences in the code. And we also need to get Linus's approval of this
> as I believe he was against enforcing this in the past.

Was the opposition to prohibiting some specific sequence?  It's only certain
misnesting scenarios that are problematic.  The rcu_read_lock/
local_irq_disable restriction can be dropped with the IPI-to-self added in
Paul's tree.  Are there any known instances of the other two (besides
rcutorture)?

-Scott



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-26 16:49           ` Scott Wood
@ 2019-06-27 18:00             ` Paul E. McKenney
  2019-06-27 20:16               ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-27 18:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> On Wed, 2019-06-26 at 11:08 -0400, Steven Rostedt wrote:
> > On Fri, 21 Jun 2019 16:59:55 -0700
> > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > 
> > > I have no objection to the outlawing of a number of these sequences in
> > > mainline, but am rather pointing out that until they really are outlawed
> > > and eliminated, rcutorture must continue to test them in mainline.
> > > Of course, an rcutorture running in -rt should avoid testing things that
> > > break -rt, including these sequences.
> > 
> > We should update lockdep to complain about these sequences. That would
> > "outlaw" them in mainline. That is, after we clean up all the current
> > sequences in the code. And we also need to get Linus's approval of this
> > as I believe he was against enforcing this in the past.
> 
> Was the opposition to prohibiting some specific sequence?  It's only certain
> misnesting scenarios that are problematic.  The rcu_read_lock/
> local_irq_disable restriction can be dropped with the IPI-to-self added in
> Paul's tree.  Are there any known instances of the other two (besides
> rcutorture)?

Given the failure scenario Sebastian Siewior reported today, there
apparently are some, at least when running threaded interrupt handlers.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-27 18:00             ` Paul E. McKenney
@ 2019-06-27 20:16               ` Scott Wood
  2019-06-27 20:50                 ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-27 20:16 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > On Wed, 2019-06-26 at 11:08 -0400, Steven Rostedt wrote:
> > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > 
> > > > I have no objection to the outlawing of a number of these sequences
> > > > in
> > > > mainline, but am rather pointing out that until they really are
> > > > outlawed
> > > > and eliminated, rcutorture must continue to test them in mainline.
> > > > Of course, an rcutorture running in -rt should avoid testing things
> > > > that
> > > > break -rt, including these sequences.
> > > 
> > > We should update lockdep to complain about these sequences. That would
> > > "outlaw" them in mainline. That is, after we clean up all the current
> > > sequences in the code. And we also need to get Linus's approval of
> > > this
> > > as I believe he was against enforcing this in the past.
> > 
> > Was the opposition to prohibiting some specific sequence?  It's only
> > certain
> > misnesting scenarios that are problematic.  The rcu_read_lock/
> > local_irq_disable restriction can be dropped with the IPI-to-self added
> > in
> > Paul's tree.  Are there any known instances of the other two (besides
> > rcutorture)?
> 
> Given the failure scenario Sebastian Siewior reported today, there
> apparently are some, at least when running threaded interrupt handlers.

That's the rcu misnesting, which it looks like we can allow with the IPI-to-
self; I was asking about the other two.  I suppose if we really need to, we
could work around preempt_disable()/local_irq_disable()/preempt_enable()/
local_irq_enable() by having preempt_enable() do an IPI-to-self if
need_resched is set and IRQs are disabled.  The RT local_bh_disable()
atomic/non-atomic misnesting would be more difficult, but I don't think
impossible.  I've got lazy migrate disable working (initially as an attempt
to deal with misnesting but it turned out to give a huge speedup as well;
will send as soon as I take care of a loose end in the deadline scheduler);
it's possible that something similar could be done with the softirq lock
(and given that I saw a slowdown when that lock was introduced, it may also
be worth doing just for performance).

BTW, it's not clear to me whether the failure Sebastian saw was due to the
bare irq disabled version, which was what I was talking about prohibiting
(he didn't show the context that was interrupted).  The version where
preempt is disabled (with or without irqs being disabled inside the preempt
disabled region) definitely happens and is what I was trying to address with
patch 3/4.

-Scott



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-27 20:16               ` Scott Wood
@ 2019-06-27 20:50                 ` Paul E. McKenney
  2019-06-27 22:46                   ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-27 20:50 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 27, 2019 at 03:16:09PM -0500, Scott Wood wrote:
> On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> > On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > > On Wed, 2019-06-26 at 11:08 -0400, Steven Rostedt wrote:
> > > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > > 
> > > > > I have no objection to the outlawing of a number of these sequences
> > > > > in
> > > > > mainline, but am rather pointing out that until they really are
> > > > > outlawed
> > > > > and eliminated, rcutorture must continue to test them in mainline.
> > > > > Of course, an rcutorture running in -rt should avoid testing things
> > > > > that
> > > > > break -rt, including these sequences.
> > > > 
> > > > We should update lockdep to complain about these sequences. That would
> > > > "outlaw" them in mainline. That is, after we clean up all the current
> > > > sequences in the code. And we also need to get Linus's approval of
> > > > this
> > > > as I believe he was against enforcing this in the past.
> > > 
> > > Was the opposition to prohibiting some specific sequence?  It's only
> > > certain
> > > misnesting scenarios that are problematic.  The rcu_read_lock/
> > > local_irq_disable restriction can be dropped with the IPI-to-self added
> > > in
> > > Paul's tree.  Are there any known instances of the other two (besides
> > > rcutorture)?

If by IPI-to-self you mean the IRQ work trick, that isn't implemented
across all architectures yet, is it?

> > Given the failure scenario Sebastian Siewior reported today, there
> > apparently are some, at least when running threaded interrupt handlers.
> 
> That's the rcu misnesting, which it looks like we can allow with the IPI-to-
> self; I was asking about the other two.  I suppose if we really need to, we
> could work around preempt_disable()/local_irq_disable()/preempt_enable()/
> local_irq_enable() by having preempt_enable() do an IPI-to-self if
> need_resched is set and IRQs are disabled.  The RT local_bh_disable()
> atomic/non-atomic misnesting would be more difficult, but I don't think
> impossible.  I've got lazy migrate disable working (initially as an attempt
> to deal with misnesting but it turned out to give a huge speedup as well;
> will send as soon as I take care of a loose end in the deadline scheduler);
> it's possible that something similar could be done with the softirq lock
> (and given that I saw a slowdown when that lock was introduced, it may also
> be worth doing just for performance).
> 
> BTW, it's not clear to me whether the failure Sebastian saw was due to the
> bare irq disabled version, which was what I was talking about prohibiting
> (he didn't show the context that was interrupted).  The version where
> preempt is disabled (with or without irqs being disabled inside the preempt
> disabled region) definitely happens and is what I was trying to address with
> patch 3/4.

I don't claim to yet fully understand what Sebastian was seeing, though
I am obviously hoping that my local experiments showing it to be fixed
in current -rcu hold true.

Why not simply make rcutorture check whether it is running in a
PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
testing only in that case?  This could be done compatibly with mainline by
adding another rcutorture module parameter that suppressed the problematic
testing, disabled by default.  Such a patch could be accepted into
mainline, and then -rt could have a very small patch that changed the
default to enabled for CONFIG_PREEMPT_RT_FULL=y kernels.

And should we later get to a place where the PREEMPT_RT_FULL-unfriendly
scenarios are prohibited across all kernel configurations, then the module
parameter can be removed.  Again, until we know (as opposed to suspect)
that these scenarios really don't happen, mainline rcutorture must
continue testing them.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-27 20:50                 ` Paul E. McKenney
@ 2019-06-27 22:46                   ` Scott Wood
  2019-06-28  0:52                     ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-27 22:46 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> On Thu, Jun 27, 2019 at 03:16:09PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > > > > 
> > > > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > > > 
> > > > > > I have no objection to the outlawing of a number of these
> > > > > > sequences
> > > > > > in
> > > > > > mainline, but am rather pointing out that until they really are
> > > > > > outlawed
> > > > > > and eliminated, rcutorture must continue to test them in
> > > > > > mainline.
> > > > > > Of course, an rcutorture running in -rt should avoid testing
> > > > > > things
> > > > > > that
> > > > > > break -rt, including these sequences.
> > > > > 
> > > > > sequences in the code. And we also need to get Linus's approval of
> > > > > this
> > > > > as I believe he was against enforcing this in the past.
> > > > 
> > > > Was the opposition to prohibiting some specific sequence?  It's only
> > > > certain
> > > > misnesting scenarios that are problematic.  The rcu_read_lock/
> > > > local_irq_disable restriction can be dropped with the IPI-to-self
> > > > added
> > > > in
> > > > Paul's tree.  Are there any known instances of the other two
> > > > (besides
> > > > rcutorture)?
> 
> If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> across all architectures yet, is it?

Right... smp_send_reschedule() has wider coverage, but even then there's
some hardware that just can't do it reasonably (e.g. pre-APIC x86).  So I
guess the options are:

1. Accept that such hardware might experience delayed grace period
   completion in certain configurations,
2. Have such hardware check for need_resched in local_irq_enable() (not
   nice if sharing a kernel build with hardware that doesn't need it), or
3. Forbid the sequence (enforced by debug checks).  Again, this would only
   prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
   local_irq_enable() *without* preempt disabling around the IRQ-disabled
   region.

> Why not simply make rcutorture check whether it is running in a
> PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> testing only in that case?
>
> And should we later get to a place where the PREEMPT_RT_FULL-unfriendly
> scenarios are prohibited across all kernel configurations, then the module
> parameter can be removed.  Again, until we know (as opposed to suspect)
> that these scenarios really don't happen, mainline rcutorture must
> continue testing them.

Yes, I already acknowledged that debug checks detecting the sequences should
come before the test removal (including this patch as an RFC at this point
was mainly meant as a demonstration of what's needed to get rcutorture to
pass), but it'd be nice to have some idea of whether there would be
opposition to the concept before coding up the checks.  I'd rather not
continue the state of "these sequences can blow up on RT and we don't know
if they exist or not" any longer than necessary.  Plus, only one of the
sequences is exclusively an RT issue (though it's the one with the worst
consequences).

-Scott



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-27 22:46                   ` Scott Wood
@ 2019-06-28  0:52                     ` Paul E. McKenney
  2019-06-28 19:37                       ` Scott Wood
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-28  0:52 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, Jun 27, 2019 at 05:46:27PM -0500, Scott Wood wrote:
> On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 27, 2019 at 03:16:09PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> > > > On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > > > > > 
> > > > > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > > > > 
> > > > > > > I have no objection to the outlawing of a number of these
> > > > > > > sequences
> > > > > > > in
> > > > > > > mainline, but am rather pointing out that until they really are
> > > > > > > outlawed
> > > > > > > and eliminated, rcutorture must continue to test them in
> > > > > > > mainline.
> > > > > > > Of course, an rcutorture running in -rt should avoid testing
> > > > > > > things
> > > > > > > that
> > > > > > > break -rt, including these sequences.
> > > > > > 
> > > > > > sequences in the code. And we also need to get Linus's approval of
> > > > > > this
> > > > > > as I believe he was against enforcing this in the past.
> > > > > 
> > > > > Was the opposition to prohibiting some specific sequence?  It's only
> > > > > certain
> > > > > misnesting scenarios that are problematic.  The rcu_read_lock/
> > > > > local_irq_disable restriction can be dropped with the IPI-to-self
> > > > > added
> > > > > in
> > > > > Paul's tree.  Are there any known instances of the other two
> > > > > (besides
> > > > > rcutorture)?
> > 
> > If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> > across all architectures yet, is it?
> 
> Right... smp_send_reschedule() has wider coverage, but even then there's
> some hardware that just can't do it reasonably (e.g. pre-APIC x86).

Except that smp_send_reschedule() won't do anything unless the scheduler
thinks something needs to be done, as in its wake list is non-empty.
Which might explain why Peter Zijlstra didn't suggest it.

>                                                                      So I
> guess the options are:
> 
> 1. Accept that such hardware might experience delayed grace period
> completion in certain configurations,
> 2. Have such hardware check for need_resched in local_irq_enable() (not nice
> if sharing a kernel build with hardware that doesn't need it), or
> 3. Forbid the sequence (enforced by debug checks).  Again, this would only
> prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
> local_irq_enable() *without* preempt disabling around the IRQ-disabled
> region.

4. If further testing continues to show it to be reliable, continue
   using the scheme in -rcu.
5. Use a short-duration hrtimer to get a clean environment in short
   order.  Yes, the timer might fire while preemption and/or softirqs
   are disabled, but then the code can rely on the following
   preempt_enable(), local_bh_enable(), or whatever.  This condition
   should be sufficiently rare to avoid issues with hrtimer overhead.
6. Use smp_call_function_single() to IPI some other poor slob of a
   CPU, which then does the same back.  Non-waiting version in both
   cases, of course.

Probably others as well.

> > Why not simply make rcutorture check whether it is running in a
> > PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> > testing only in that case?
> >
> > And should we later get to a place where the PREEMPT_RT_FULL-unfriendly
> > scenarios are prohibited across all kernel configurations, then the module
> > parameter can be removed.  Again, until we know (as opposed to suspect)
> > that these scenarios really don't happen, mainline rcutorture must
> > continue testing them.
> 
> Yes, I already acknowledged that debug checks detecting the sequences should
> come before the test removal

OK, good to hear.  As you may have noticed, I was getting the impression
that you might have changed your mind on this point.  ;-)

>                              (including this patch as an RFC at this point
> was mainly meant as a demonstration of what's needed to get rcutorture to
> pass), but it'd be nice to have some idea of whether there would be
> opposition to the concept before coding up the checks.  I'd rather not
> continue the state of "these sequences can blow up on RT and we don't know
> if they exist or not" any longer than necessary.  Plus, only one of the
> sequences is exclusively an RT issue (though it's the one with the worst
> consequences).

Steve Rostedt's point about enlisting the aid of lockdep seems worth
looking into.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-28  0:52                     ` Paul E. McKenney
@ 2019-06-28 19:37                       ` Scott Wood
  2019-06-28 20:24                         ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Scott Wood @ 2019-06-28 19:37 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Thu, 2019-06-27 at 17:52 -0700, Paul E. McKenney wrote:
> On Thu, Jun 27, 2019 at 05:46:27PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> > > If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> > > across all architectures yet, is it?
> > 
> > Right... smp_send_reschedule() has wider coverage, but even then there's
> > some hardware that just can't do it reasonably (e.g. pre-APIC x86).
> 
> Except that smp_send_reschedule() won't do anything unless the scheduler
> thinks something needs to be done, as in its wake list is non-empty.
> Which might explain why Peter Zijlstra didn't suggest it.

The wake list stuff is separate from the original purpose of the IPI, which
is to hit the need_resched check on IRQ exit.  When that happens, the
scheduler will call into RCU, even if it doesn't change threads.  

> > So I guess the options are:
> > 
> > 1. Accept that such hardware might experience delayed grace period
> > completion in certain configurations,
> > 2. Have such hardware check for need_resched in local_irq_enable() (not
> > nice
> > if sharing a kernel build with hardware that doesn't need it), or
> > 3. Forbid the sequence (enforced by debug checks).  Again, this would
> > only
> > prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
> > local_irq_enable() *without* preempt disabling around the IRQ-disabled
> > region.
> 
> 4. If further testing continues to show it to be reliable, continue
> using the scheme in -rcu.

If the testing isn't done on machines that can't do the IPI then it's
basically option #1.  FWIW I don't think option #1 is unreasonable given
that we're talking about very old and/or specialized hardware, and we're
only talking about delays, not a crash (maybe limit the ability to use
nohz_full on such hardware?).  Of course if it turns out people are actually
trying to run (modern versions of) RT on such hardware, that might be
different. :-)

> 5. Use a short-duration hrtimer to get a clean environment in short
> order.  Yes, the timer might fire while preemption and/or softirqs
> are disabled, but then the code can rely on the following
> preempt_enable(), local_bh_enable(), or whatever.  This condition
> should be sufficiently rare to avoid issues with hrtimer overhead.

Yeah, I considered that but was hesitant due to overhead -- at least in the
case of the example I gave (pre-APIC x86), arming a one-shot timer is pretty
slow.  Plus, some hardware might entirely lack one-shot timer capability.

> 6. Use smp_call_function_single() to IPI some other poor slob of a
> CPU, which then does the same back.  Non-waiting version in both
> cases, of course.

I was assuming any hardware that can't do smp_send_reschedule() is not SMP.

> 
> Probably others as well.
> 
> > > Why not simply make rcutorture check whether it is running in a
> > > PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> > > testing only in that case?
> > > 
> > > And should we later get to a place where the
> > > PREEMPT_RT_FULL-unfriendly scenarios are prohibited across all
> > > kernel configurations, then the module parameter can be removed.
> > > Again, until we know (as opposed to suspect) that these scenarios
> > > really don't happen, mainline rcutorture must continue testing them.
> > 
> > Yes, I already acknowledged that debug checks detecting the sequences
> > should come before the test removal
> 
> OK, good to hear.  As you may have noticed, I was getting the impression
> that you might have changed your mind on this point.  ;-)
> 
> >                              (including this patch as an RFC at this
> > point was mainly meant as a demonstration of what's needed to get
> > rcutorture to pass), but it'd be nice to have some idea of whether
> > there would be opposition to the concept before coding up the checks.
> > I'd rather not continue the state of "these sequences can blow up on
> > RT and we don't know if they exist or not" any longer than necessary.
> > Plus, only one of the sequences is exclusively an RT issue (though
> > it's the one with the worst consequences).
> 
> Steve Rostedt's point about enlisting the aid of lockdep seems worth
> looking into.

Sure.  I was just concerned by the "Linus was against enforcing this in the
past" comment and was hoping for more details.

-Scott




* Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting
  2019-06-28 19:37                       ` Scott Wood
@ 2019-06-28 20:24                         ` Paul E. McKenney
  0 siblings, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2019-06-28 20:24 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Sebastian Andrzej Siewior, Peter Zijlstra,
	Thomas Gleixner, Juri Lelli, Clark Williams, linux-rt-users,
	linux-kernel

On Fri, Jun 28, 2019 at 02:37:24PM -0500, Scott Wood wrote:
> On Thu, 2019-06-27 at 17:52 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 27, 2019 at 05:46:27PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> > > > If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> > > > across all architectures yet, is it?
> > > 
> > > Right... smp_send_reschedule() has wider coverage, but even then there's
> > > some hardware that just can't do it reasonably (e.g. pre-APIC x86).
> > 
> > Except that smp_send_reschedule() won't do anything unless the scheduler
> > thinks something needs to be done, as in its wake list is non-empty.
> > Which might explain why Peter Zijlstra didn't suggest it.
> 
> The wake list stuff is separate from the original purpose of the IPI, which
> is to hit the need_resched check on IRQ exit.  When that happens, the
> scheduler will call into RCU, even if it doesn't change threads.  

Got it, thank you!

> > > So I guess the options are:
> > > 
> > > 1. Accept that such hardware might experience delayed grace period
> > > completion in certain configurations,
> > > 2. Have such hardware check for need_resched in local_irq_enable()
> > > (not nice if sharing a kernel build with hardware that doesn't need
> > > it), or
> > > 3. Forbid the sequence (enforced by debug checks).  Again, this would
> > > only prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
> > > local_irq_enable() *without* preempt disabling around the IRQ-disabled
> > > region.
> > 
> > 4. If further testing continues to show it to be reliable, continue
> > using the scheme in -rcu.
> 
> If the testing isn't done on machines that can't do the IPI then it's
> basically option #1.  FWIW I don't think option #1 is unreasonable given
> that we're talking about very old and/or specialized hardware, and we're
> only talking about delays, not a crash (maybe limit the ability to use
> nohz_full on such hardware?).  Of course if it turns out people are actually
> trying to run (modern versions of) RT on such hardware, that might be
> different. :-)

Having tried and failed to remove DEC Alpha support several times, I
know which way to bet.  Though DEC Alpha support is no longer much of a
burden on the non-Alpha portions of Linux, so no longer much motivation
for removing its support.

> > 5. Use a short-duration hrtimer to get a clean environment in short
> > order.  Yes, the timer might fire while preemption and/or softirqs
> > are disabled, but then the code can rely on the following
> > preempt_enable(), local_bh_enable(), or whatever.  This condition
> > should be sufficiently rare to avoid issues with hrtimer overhead.
> 
> Yeah, I considered that but was hesitant due to overhead -- at least in the
> case of the example I gave (pre-APIC x86), arming a one-shot timer is pretty
> slow.  Plus, some hardware might entirely lack one-shot timer capability.

The overhead is incurred in a rare case, and on systems lacking oneshot
timer it is always possible to fall back on normal timers, albeit with
fixed-time added delays.  But yes, this does add a bit of complexity.

Alternatively, assuming this case is rare, normal timers might suffice
without the need for hrtimers.
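The deferral logic behind option 5 can be modeled in userspace (illustrative names only, not the kernel's timer or preempt APIs): if the timer fires in a clean environment the quiescent state is reported immediately; if it fires with preemption disabled, the report is deferred to the matching preempt_enable().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only -- not the kernel's timer or preempt APIs. */
static int  preempt_depth;
static bool qs_deferred, qs_reported;

static void model_preempt_disable(void)
{
	preempt_depth++;
}

static void model_preempt_enable(void)
{
	/* The "rely on the following preempt_enable()" part: report the
	 * deferred quiescent state once the environment is clean again. */
	if (--preempt_depth == 0 && qs_deferred) {
		qs_deferred = false;
		qs_reported = true;
	}
}

/* Timer callback: report immediately if possible, otherwise defer. */
static void model_timer_fires(void)
{
	if (preempt_depth == 0)
		qs_reported = true;
	else
		qs_deferred = true;
}
```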

> > 6. Use smp_call_function_single() to IPI some other poor slob of a
> > CPU, which then does the same back.  Non-waiting version in both
> > cases, of course.
> 
> I was assuming any hardware that can't do smp_send_reschedule() is not SMP.

I have no idea either way.
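Option 6's round trip could look roughly like this model (illustrative; the real code would use the non-waiting form of smp_call_function_single()): CPU A asks CPU B to IPI it back, and the IRQ entry/exit of the returning IPI on CPU A is what provides the clean environment.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative two-CPU model; not the kernel API. */
static bool cpu_a_got_ipi;

/* Handler run "on CPU A" when the return IPI arrives; the IRQ
 * entry/exit around it is what gives RCU a clean environment. */
static void cpu_a_handler(void)
{
	cpu_a_got_ipi = true;
}

/* Handler run "on CPU B": immediately IPI the requester back. */
static void cpu_b_handler(void (*ipi_back)(void))
{
	ipi_back();
}

/* CPU A kicks off the non-waiting round trip. */
static void model_ping_pong(void)
{
	cpu_b_handler(cpu_a_handler);
}
```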

> > Probably others as well.
> > 
> > > > Why not simply make rcutorture check whether it is running in a
> > > > PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> > > > testing only in that case?
> > > > 
> > > > And should we later get to a place where the
> > > > PREEMPT_RT_FULL-unfriendly scenarios are prohibited across all
> > > > kernel configurations, then the module parameter can be removed.
> > > > Again, until we know (as opposed to suspect) that these scenarios
> > > > really don't happen, mainline rcutorture must continue testing them.
> > > 
> > > Yes, I already acknowledged that debug checks detecting the sequences
> > > should come before the test removal
> > 
> > OK, good to hear.  As you may have noticed, I was getting the impression
> > that you might have changed your mind on this point.  ;-)
> > 
> > >                              (including this patch as an RFC at this
> > > point was mainly meant as a demonstration of what's needed to get
> > > rcutorture to pass), but it'd be nice to have some idea of whether
> > > there would be opposition to the concept before coding up the checks.
> > > I'd rather not continue the state of "these sequences can blow up on
> > > RT and we don't know if they exist or not" any longer than necessary.
> > > Plus, only one of the sequences is exclusively an RT issue (though
> > > it's the one with the worst consequences).
> > 
> > Steve Rostedt's point about enlisting the aid of lockdep seems worth
> > looking into.
> 
> Sure.  I was just concerned by the "Linus was against enforcing this in the
> past" comment and was hoping for more details.

It is sometimes the case that showing that something never happens makes
people more comfortable with enforcing that something.

							Thanx, Paul


end of thread, other threads:[~2019-06-28 20:25 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-19  1:19 [PATCH RT 0/4] Address rcutorture issues Scott Wood
2019-06-19  1:19 ` [PATCH RT 1/4] rcu: Acquire RCU lock when disabling BHs Scott Wood
2019-06-20 20:53   ` Paul E. McKenney
2019-06-20 21:06     ` Scott Wood
2019-06-20 21:20       ` Paul E. McKenney
2019-06-20 21:38         ` Scott Wood
2019-06-20 22:16           ` Paul E. McKenney
2019-06-19  1:19 ` [PATCH RT 2/4] sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep Scott Wood
2019-06-19  1:19 ` [RFC PATCH RT 3/4] rcu: unlock special: Treat irq and preempt disabled the same Scott Wood
2019-06-20 21:10   ` Paul E. McKenney
2019-06-20 21:59     ` Scott Wood
2019-06-20 22:25       ` Paul E. McKenney
2019-06-20 23:08         ` Scott Wood
2019-06-22  0:26           ` Paul E. McKenney
2019-06-22 19:13             ` Paul E. McKenney
2019-06-24 17:40               ` Scott Wood
2019-06-19  1:19 ` [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting Scott Wood
2019-06-20 21:18   ` Paul E. McKenney
2019-06-20 21:43     ` Scott Wood
2019-06-21 16:38     ` Sebastian Andrzej Siewior
2019-06-21 23:59       ` Paul E. McKenney
2019-06-26 15:08         ` Steven Rostedt
2019-06-26 16:49           ` Scott Wood
2019-06-27 18:00             ` Paul E. McKenney
2019-06-27 20:16               ` Scott Wood
2019-06-27 20:50                 ` Paul E. McKenney
2019-06-27 22:46                   ` Scott Wood
2019-06-28  0:52                     ` Paul E. McKenney
2019-06-28 19:37                       ` Scott Wood
2019-06-28 20:24                         ` Paul E. McKenney
2019-06-20 19:12 ` [PATCH RT 0/4] Address rcutorture issues Paul E. McKenney
