* [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
@ 2015-06-30 21:48 Paul E. McKenney
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
  2015-06-30 22:00 ` [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones josh
  0 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

Hello!

This series contains some highly experimental patches that allow normal
grace periods to take advantage of the work done by concurrent expedited
grace periods.  This can reduce the overhead incurred by normal grace
periods by eliminating the need for force-quiescent-state scans that
would otherwise have happened after the expedited grace period completed.
It is not clear whether this is a useful tradeoff.  Nevertheless, this
series contains the following patches:

1.	Pure code-movement commit that allows the RCU grace-period kthread
	to access the expedited grace period's state.

2.	Use snapshot technique to allow a normal grace period to complete
	after an expedited grace period has completed.

3.	Changes to rcutorture to fully test normal grace periods.  Current
	tests would fail to notice some types of forward-progress bugs due
	to the presence of expedited grace periods in the common case.

4.	Make expedited grace periods awaken any concurrent normal grace
	periods.

5.	Throttle the awakening to avoid excessive CPU consumption by the
	normal grace-period kthread.

A separate series will cover changes to the expedited grace-period
machinery itself.
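
To give a rough feel for the approach, here is a small self-contained
sketch of the sequence-snapshot idea (a plain userspace C model only,
with made-up helper names, memory barriers omitted, and no counter-wrap
handling; the real rcu_seq_*() helpers appear in patch 1):

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for ->expedited_sequence: odd while an expedited GP runs. */
static unsigned long expedited_sequence;

static void exp_gp_start(void) { expedited_sequence++; }	/* now odd */
static void exp_gp_end(void)   { expedited_sequence++; }	/* now even */

/*
 * Snapshot taken at normal-GP start: the smallest even counter value
 * whose arrival guarantees that a full expedited GP both started and
 * ended after the snapshot was taken.
 */
static unsigned long exp_gp_snap(void)
{
	return (expedited_sequence + 3) & ~0x1UL;
}

static bool exp_gp_done(unsigned long snap)
{
	return expedited_sequence >= snap;	/* ULONG_CMP_GE() in the kernel */
}

int main(void)
{
	unsigned long snap;

	exp_gp_start();			/* expedited GP already in flight */
	snap = exp_gp_snap();		/* seq == 1, so snap == 4 */
	exp_gp_end();			/* seq == 2: not sufficient, as that */
	assert(!exp_gp_done(snap));	/*  GP might predate our readers */

	exp_gp_start();
	exp_gp_end();			/* seq == 4: a full GP after the snap */
	assert(exp_gp_done(snap));
	printf("normal GP may now skip further FQS scans\n");
	return 0;
}

Patch 2 wires this check into the grace-period kthread's FQS wait loop,
and patches 4 and 5 make the expedited path poke that kthread.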

							Thanx, Paul

------------------------------------------------------------------------

 b/kernel/rcu/rcutorture.c |   22 +++++-
 b/kernel/rcu/tree.c       |  165 +++++++++++++++++++++++++++-------------------
 b/kernel/rcu/tree.h       |    6 +
 3 files changed, 127 insertions(+), 66 deletions(-)



* [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP
  2015-06-30 21:48 [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones Paul E. McKenney
@ 2015-06-30 21:48 ` Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
                     ` (3 more replies)
  2015-06-30 22:00 ` [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones josh
  1 sibling, 4 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This is a pure code-movement commit in preparation for changes that
allow short-circuiting of normal grace-period processing due to
concurrent execution of an expedited grace period.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 108 +++++++++++++++++++++++++++---------------------------
 1 file changed, 54 insertions(+), 54 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 041968e3b7ce..bea9b9c80d91 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1393,6 +1393,60 @@ void rcu_cpu_stall_reset(void)
 		WRITE_ONCE(rsp->jiffies_stall, jiffies + ULONG_MAX / 2);
 }
 
+/* Adjust sequence number for start of update-side operation. */
+static void rcu_seq_start(unsigned long *sp)
+{
+	WRITE_ONCE(*sp, *sp + 1);
+	smp_mb(); /* Ensure update-side operation after counter increment. */
+	WARN_ON_ONCE(!(*sp & 0x1));
+}
+
+/* Adjust sequence number for end of update-side operation. */
+static void rcu_seq_end(unsigned long *sp)
+{
+	smp_mb(); /* Ensure update-side operation before counter increment. */
+	WRITE_ONCE(*sp, *sp + 1);
+	WARN_ON_ONCE(*sp & 0x1);
+}
+
+/* Take a snapshot of the update side's sequence number. */
+static unsigned long rcu_seq_snap(unsigned long *sp)
+{
+	unsigned long s;
+
+	smp_mb(); /* Caller's modifications seen first by other CPUs. */
+	s = (READ_ONCE(*sp) + 3) & ~0x1;
+	smp_mb(); /* Above access must not bleed into critical section. */
+	return s;
+}
+
+/*
+ * Given a snapshot from rcu_seq_snap(), determine whether or not a
+ * full update-side operation has occurred.
+ */
+static bool rcu_seq_done(unsigned long *sp, unsigned long s)
+{
+	return ULONG_CMP_GE(READ_ONCE(*sp), s);
+}
+
+/* Wrapper functions for expedited grace periods.  */
+static void rcu_exp_gp_seq_start(struct rcu_state *rsp)
+{
+	rcu_seq_start(&rsp->expedited_sequence);
+}
+static void rcu_exp_gp_seq_end(struct rcu_state *rsp)
+{
+	rcu_seq_end(&rsp->expedited_sequence);
+}
+static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp)
+{
+	return rcu_seq_snap(&rsp->expedited_sequence);
+}
+static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s)
+{
+	return rcu_seq_done(&rsp->expedited_sequence, s);
+}
+
 /*
  * Initialize the specified rcu_data structure's default callback list
  * to empty.  The default callback list is the one that is not used by
@@ -3307,60 +3361,6 @@ void cond_synchronize_sched(unsigned long oldstate)
 }
 EXPORT_SYMBOL_GPL(cond_synchronize_sched);
 
-/* Adjust sequence number for start of update-side operation. */
-static void rcu_seq_start(unsigned long *sp)
-{
-	WRITE_ONCE(*sp, *sp + 1);
-	smp_mb(); /* Ensure update-side operation after counter increment. */
-	WARN_ON_ONCE(!(*sp & 0x1));
-}
-
-/* Adjust sequence number for end of update-side operation. */
-static void rcu_seq_end(unsigned long *sp)
-{
-	smp_mb(); /* Ensure update-side operation before counter increment. */
-	WRITE_ONCE(*sp, *sp + 1);
-	WARN_ON_ONCE(*sp & 0x1);
-}
-
-/* Take a snapshot of the update side's sequence number. */
-static unsigned long rcu_seq_snap(unsigned long *sp)
-{
-	unsigned long s;
-
-	smp_mb(); /* Caller's modifications seen first by other CPUs. */
-	s = (READ_ONCE(*sp) + 3) & ~0x1;
-	smp_mb(); /* Above access must not bleed into critical section. */
-	return s;
-}
-
-/*
- * Given a snapshot from rcu_seq_snap(), determine whether or not a
- * full update-side operation has occurred.
- */
-static bool rcu_seq_done(unsigned long *sp, unsigned long s)
-{
-	return ULONG_CMP_GE(READ_ONCE(*sp), s);
-}
-
-/* Wrapper functions for expedited grace periods.  */
-static void rcu_exp_gp_seq_start(struct rcu_state *rsp)
-{
-	rcu_seq_start(&rsp->expedited_sequence);
-}
-static void rcu_exp_gp_seq_end(struct rcu_state *rsp)
-{
-	rcu_seq_end(&rsp->expedited_sequence);
-}
-static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp)
-{
-	return rcu_seq_snap(&rsp->expedited_sequence);
-}
-static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s)
-{
-	return rcu_seq_done(&rsp->expedited_sequence, s);
-}
-
 /* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */
 static bool sync_exp_work_done(struct rcu_state *rsp, struct rcu_node *rnp,
 			       atomic_long_t *stat, unsigned long s)
-- 
1.8.1.5
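
As an aside on the arithmetic above: the "+ 3" and "& ~0x1" in
rcu_seq_snap() select the earliest even counter value that implies a
full update-side operation after the snapshot.  A tiny standalone check
of that property (snap_of() is a hypothetical helper mirroring only the
arithmetic, with barriers and wraparound ignored):

#include <assert.h>

static unsigned long snap_of(unsigned long seq)
{
	return (seq + 3) & ~0x1UL;	/* rcu_seq_snap()'s arithmetic */
}

int main(void)
{
	/* Counter even (no operation in flight): wait for the next full one. */
	assert(snap_of(100) == 102);
	/*
	 * Counter odd (operation in flight): its completion at 102 is not
	 * enough, since it may have begun too early; wait for the following
	 * full operation, which ends at 104.
	 */
	assert(snap_of(101) == 104);
	return 0;
}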



* [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
@ 2015-06-30 21:48   ` Paul E. McKenney
  2015-07-01 10:03     ` Peter Zijlstra
  2015-07-01 10:05     ` Peter Zijlstra
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 3/5] rcutorture: Ensure that normal GPs advance without " Paul E. McKenney
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

If an expedited grace period executes after the start of a normal
grace period, there is no point in the normal grace period doing any
more work, as the expedited grace period has supplied all the needed
quiescent states.  This commit therefore makes the normal grace-period
initialization process take a snapshot of the expedited grace-period
state, and then recheck the state just before each force-quiescent-state
scan.  If the recheck determines that a full expedited grace period
executed since the beginning of the normal grace period, the grace-period
kthread proceeds immediately to grace-period cleanup.

Because the expedited grace period does not awaken the grace-period
kthread, this change should provide only a minimal reduction in
grace-period latency.  However, it does reduce the overhead of detecting
the end of the grace period.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 40 ++++++++++++++++++++++++++++++++--------
 kernel/rcu/tree.h |  6 ++++++
 2 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bea9b9c80d91..5365f6332a60 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -104,6 +104,7 @@ struct rcu_state sname##_state = { \
 	.orphan_nxttail = &sname##_state.orphan_nxtlist, \
 	.orphan_donetail = &sname##_state.orphan_donelist, \
 	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
+	.exp_rsp = &sname##_state, \
 	.name = RCU_STATE_NAME(sname), \
 	.abbr = sabbr, \
 }
@@ -1954,6 +1955,15 @@ static int rcu_gp_init(struct rcu_state *rsp)
 		WRITE_ONCE(rsp->gp_activity, jiffies);
 	}
 
+	/*
+	 * Record associated expedited-grace-period snapshot.  This cannot
+	 * be done before the root rcu_node structure has been initialized
+	 * due to the fact that callbacks can still be registered for the
+	 * current grace period until that initialization is complete.
+	 */
+	rsp->gp_exp_snap = rcu_exp_gp_seq_snap(rsp->exp_rsp);
+	rsp->gp_exp_help = false;
+
 	return 1;
 }
 
@@ -2035,8 +2045,14 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	rcu_for_each_node_breadth_first(rsp, rnp) {
 		raw_spin_lock_irq(&rnp->lock);
 		smp_mb__after_unlock_lock();
-		WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
-		WARN_ON_ONCE(rnp->qsmask);
+		if (!rsp->gp_exp_help) {
+			WARN_ON_ONCE(rnp->qsmask);
+			WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
+		} else {
+			/* Clear out state from truncated grace period. */
+			rnp->qsmask = 0;
+			rnp->gp_tasks = NULL;
+		}
 		WRITE_ONCE(rnp->completed, rsp->gpnum);
 		rdp = this_cpu_ptr(rsp->rda);
 		if (rnp == rdp->mynode)
@@ -2121,17 +2137,24 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
 			ret = wait_event_interruptible_timeout(rsp->gp_wq,
-					((gf = READ_ONCE(rsp->gp_flags)) &
-					 RCU_GP_FLAG_FQS) ||
-					(!READ_ONCE(rnp->qsmask) &&
-					 !rcu_preempt_blocked_readers_cgp(rnp)),
-					j);
+				((gf = READ_ONCE(rsp->gp_flags)) &
+				 RCU_GP_FLAG_FQS) ||
+				(!READ_ONCE(rnp->qsmask) &&
+				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
+				rcu_exp_gp_seq_done(rsp->exp_rsp,
+						    rsp->gp_exp_snap),
+				j);
 			rsp->gp_state = RCU_GP_DONE_FQS;
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */
 			if (!READ_ONCE(rnp->qsmask) &&
-			    !rcu_preempt_blocked_readers_cgp(rnp))
+			    !rcu_preempt_blocked_readers_cgp(rnp)) {
 				break;
+			} else if (rcu_exp_gp_seq_done(rsp->exp_rsp,
+						       rsp->gp_exp_snap)) {
+				rsp->gp_exp_help = true;
+				break;
+			}
 			/* If time for quiescent-state forcing, do it. */
 			if (ULONG_CMP_GE(jiffies, rsp->jiffies_force_qs) ||
 			    (gf & RCU_GP_FLAG_FQS)) {
@@ -4190,6 +4213,7 @@ void __init rcu_init(void)
 	rcu_init_geometry();
 	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
 	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
+	rcu_bh_state.exp_rsp = &rcu_sched_state;
 	if (dump_tree)
 		rcu_dump_rcu_node_tree(&rcu_sched_state);
 	__rcu_init_preempt();
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index e40b65d45495..eed6e84cb182 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -466,6 +466,10 @@ struct rcu_state {
 	wait_queue_head_t gp_wq;		/* Where GP task waits. */
 	short gp_flags;				/* Commands for GP task. */
 	short gp_state;				/* GP kthread sleep state. */
+	bool gp_exp_help;			/* Expedited GP helped the */
+						/*  just-completed normal GP. */
+	unsigned long gp_exp_snap;		/* Expedited snapshot for */
+						/*  FQS short-circuiting. */
 
 	/* End of fields guarded by root rcu_node's lock. */
 
@@ -495,6 +499,8 @@ struct rcu_state {
 	atomic_long_t expedited_normal;		/* # fallbacks to normal. */
 	atomic_t expedited_need_qs;		/* # CPUs left to check in. */
 	wait_queue_head_t expedited_wq;		/* Wait for check-ins. */
+	struct rcu_state *exp_rsp;		/* RCU flavor that expedites */
+						/*  for this flavor. */
 
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
 						/*  force_quiescent_state(). */
-- 
1.8.1.5
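
One non-obvious detail above is the new ->exp_rsp pointer: rcu_bh has no
expedited machinery of its own, since synchronize_rcu_bh_expedited() in
kernels of this era simply maps onto synchronize_sched_expedited(), so
rcu_bh's normal grace periods watch rcu_sched's expedited counter.  The
mapping looks roughly like this (paraphrased from rcutree.h, not a
verbatim quote):

static inline void synchronize_rcu_bh_expedited(void)
{
	synchronize_sched_expedited();
}

Hence the rcu_bh_state.exp_rsp = &rcu_sched_state assignment in
rcu_init() above; the other flavors default to pointing at themselves
via the per-flavor initializer change at the top of the patch.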



* [PATCH RFC tip/core/rcu 3/5] rcutorture: Ensure that normal GPs advance without expedited GPs
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
@ 2015-06-30 21:48   ` Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 4/5] rcu: Wake grace-period kthread at end of expedited grace period Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP Paul E. McKenney
  3 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Currently, rcutorture runs between 25% and 50% of grace periods as
expedited grace periods.  This means that on large configurations,
there are expected to be a number of expedited grace periods executing
concurrently with each normal grace period.  Now that normal grace periods
can be assisted by expedited grace periods, rcutorture would be unable to
detect a bug that prevented normal grace periods from completing on their
own, that is, without help from a concurrent expedited grace period.

This commit therefore alternates 30-second phases that run expedited
grace periods and 30-second phases that do not, so that this sort of
bug will provoke an RCU CPU stall warning.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcutorture.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 59aa76b4460e..c62c1c34d73c 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -54,6 +54,11 @@
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com> and Josh Triplett <josh@joshtriplett.org>");
 
+#ifdef CONFIG_RCU_CPU_STALL_TIMEOUT
+#define NOEXP_SECS_DEFAULT (CONFIG_RCU_CPU_STALL_TIMEOUT + 10)
+#else /* #ifdef CONFIG_RCU_CPU_STALL_TIMEOUT */
+#define NOEXP_SECS_DEFAULT 30
+#endif /* #else #ifdef CONFIG_RCU_CPU_STALL_TIMEOUT */
 
 torture_param(int, cbflood_inter_holdoff, HZ,
 	      "Holdoff between floods (jiffies)");
@@ -75,6 +80,8 @@ torture_param(int, irqreader, 1, "Allow RCU readers from irq handlers");
 torture_param(int, n_barrier_cbs, 0,
 	     "# of callbacks/kthreads for barrier testing");
 torture_param(int, nfakewriters, 4, "Number of RCU fake writer threads");
+torture_param(int, noexp_secs, NOEXP_SECS_DEFAULT,
+	      "Time to suppress/enable expedited GPs (s), 0 to disable");
 torture_param(int, nreaders, -1, "Number of RCU reader threads");
 torture_param(int, object_debug, 0,
 	     "Enable debug-object double call_rcu() testing");
@@ -192,6 +199,8 @@ static u64 notrace rcu_trace_clock_local(void)
 }
 #endif /* #else #ifdef CONFIG_RCU_TRACE */
 
+static unsigned long rcutorture_noexp;
+
 static unsigned long boost_starttime;	/* jiffies of next boost test start. */
 static DEFINE_MUTEX(boost_mutex);	/* protect setting boost_starttime */
 					/*  and boost task create/destroy. */
@@ -894,6 +903,8 @@ rcu_torture_writer(void *arg)
 	bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal;
 	bool gp_sync1 = gp_sync;
 	int i;
+	int noexp_jiffies = noexp_secs * HZ;
+	unsigned long noexp_jiffies_next = jiffies - 1;
 	struct rcu_torture *rp;
 	struct rcu_torture *old_rp;
 	static DEFINE_TORTURE_RANDOM(rand);
@@ -942,6 +953,10 @@ rcu_torture_writer(void *arg)
 	do {
 		rcu_torture_writer_state = RTWS_FIXED_DELAY;
 		schedule_timeout_uninterruptible(1);
+		if (time_after(jiffies, noexp_jiffies_next)) {
+			rcutorture_noexp = jiffies + noexp_jiffies;
+			noexp_jiffies_next = rcutorture_noexp + noexp_jiffies;
+		}
 		rp = rcu_torture_alloc();
 		if (rp == NULL)
 			continue;
@@ -966,6 +981,9 @@ rcu_torture_writer(void *arg)
 				cur_ops->deferred_free(old_rp);
 				break;
 			case RTWS_EXP_SYNC:
+				if (time_before(jiffies, rcutorture_noexp) &&
+				    cur_ops->sync && gp_sync1)
+					goto noexp;
 				rcu_torture_writer_state = RTWS_EXP_SYNC;
 				cur_ops->exp_sync();
 				rcu_torture_pipe_update(old_rp);
@@ -982,6 +1000,7 @@ rcu_torture_writer(void *arg)
 				rcu_torture_pipe_update(old_rp);
 				break;
 			case RTWS_SYNC:
+noexp:
 				rcu_torture_writer_state = RTWS_SYNC;
 				cur_ops->sync();
 				rcu_torture_pipe_update(old_rp);
@@ -1036,7 +1055,8 @@ rcu_torture_fakewriter(void *arg)
 		    torture_random(&rand) % (nfakewriters * 8) == 0) {
 			cur_ops->cb_barrier();
 		} else if (gp_normal == gp_exp) {
-			if (torture_random(&rand) & 0x80)
+			if (time_before(jiffies, rcutorture_noexp) ||
+			    torture_random(&rand) & 0x80)
 				cur_ops->sync();
 			else
 				cur_ops->exp_sync();
-- 
1.8.1.5
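
To make the window bookkeeping above concrete: rcutorture_noexp marks
the end of the current phase during which expedited grace periods are
redirected to cur_ops->sync(), and noexp_jiffies_next marks when the
next such phase begins.  A small standalone check of that arithmetic
(the HZ value and starting jiffies are made up for the example, and
time_before()/time_after() wraparound handling is ignored):

#include <assert.h>

#define HZ 1000				/* assumed tick rate for the example */

int main(void)
{
	unsigned long noexp_jiffies = 30 * HZ;	/* 30-second phases */
	unsigned long jiffies = 1000 * HZ;	/* arbitrary starting point */
	unsigned long rcutorture_noexp, noexp_jiffies_next;

	/* The writer notices the old window expired and opens a new one. */
	rcutorture_noexp = jiffies + noexp_jiffies;
	noexp_jiffies_next = rcutorture_noexp + noexp_jiffies;

	/* First 30 seconds: expedited GPs are suppressed. */
	assert(jiffies + 10 * HZ < rcutorture_noexp);

	/* Next 30 seconds: expedited GPs run again. */
	assert(!(jiffies + 40 * HZ < rcutorture_noexp));

	/* Only after 60 seconds is a new "noexp" window set up. */
	assert(jiffies + 70 * HZ > noexp_jiffies_next);
	return 0;
}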



* [PATCH RFC tip/core/rcu 4/5] rcu: Wake grace-period kthread at end of expedited grace period
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 3/5] rcutorture: Ensure that normal GPs advance without " Paul E. McKenney
@ 2015-06-30 21:48   ` Paul E. McKenney
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP Paul E. McKenney
  3 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Let the expedited grace period confer its lower latency onto any
normal grace period that is in progress.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5365f6332a60..308b6acb4260 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3507,6 +3507,8 @@ void synchronize_sched_expedited(void)
 	rcu_exp_gp_seq_end(rsp);
 	mutex_unlock(&rnp->exp_funnel_mutex);
 	smp_mb(); /* ensure subsequent action seen after grace period. */
+	if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
+		wake_up(&rsp->gp_wq);
 
 	put_online_cpus();
 }
-- 
1.8.1.5



* [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
                     ` (2 preceding siblings ...)
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 4/5] rcu: Wake grace-period kthread at end of expedited grace period Paul E. McKenney
@ 2015-06-30 21:48   ` Paul E. McKenney
  2015-06-30 21:56     ` Eric Dumazet
  2015-07-01 10:07     ` Peter Zijlstra
  3 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 21:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
	peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 308b6acb4260..247aa1120c4c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
 			   !atomic_read(&rsp->expedited_need_qs));
 
 	rcu_exp_gp_seq_end(rsp);
-	mutex_unlock(&rnp->exp_funnel_mutex);
 	smp_mb(); /* ensure subsequent action seen after grace period. */
-	if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
-		wake_up(&rsp->gp_wq);
+	if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
+		static unsigned long nextgp;
+		static unsigned long nextjiffy;
+
+		if (time_after_eq(jiffies, nextgp) ||
+		    ULONG_CMP_GE(rsp->gpnum, nextgp)) {
+			nextgp = rsp->gpnum + 4;
+			nextjiffy = jiffies + 10;
+			wake_up(&rsp->gp_wq);
+		}
+	}
+	mutex_unlock(&rnp->exp_funnel_mutex);
 
 	put_online_cpus();
 }
-- 
1.8.1.5



* Re: [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP Paul E. McKenney
@ 2015-06-30 21:56     ` Eric Dumazet
  2015-06-30 22:10       ` Paul E. McKenney
  2015-07-01 10:07     ` Peter Zijlstra
  1 sibling, 1 reply; 57+ messages in thread
From: Eric Dumazet @ 2015-06-30 21:56 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Ingo Molnar, laijs, dipankar, Andrew Morton,
	mathieu.desnoyers, josh, Thomas Gleixner, peterz, rostedt,
	dhowells, dvhart, fweisbec, oleg, bobby.prani

On Tue, Jun 30, 2015 at 11:48 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcu/tree.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 308b6acb4260..247aa1120c4c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
>                            !atomic_read(&rsp->expedited_need_qs));
>
>         rcu_exp_gp_seq_end(rsp);
> -       mutex_unlock(&rnp->exp_funnel_mutex);
>         smp_mb(); /* ensure subsequent action seen after grace period. */
> -       if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
> -               wake_up(&rsp->gp_wq);
> +       if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
> +               static unsigned long nextgp;
> +               static unsigned long nextjiffy;
> +
> +               if (time_after_eq(jiffies, nextgp) ||
> +                   ULONG_CMP_GE(rsp->gpnum, nextgp)) {
> +                       nextgp = rsp->gpnum + 4;
> +                       nextjiffy = jiffies + 10;

Do you want 10 ticks or 10 ms (as stated in title) ?

> +                       wake_up(&rsp->gp_wq);
> +               }
> +       }
> +       mutex_unlock(&rnp->exp_funnel_mutex);
>
>         put_online_cpus();
>  }
> --
> 1.8.1.5
>


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-06-30 21:48 [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones Paul E. McKenney
  2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
@ 2015-06-30 22:00 ` josh
  2015-06-30 22:12   ` Paul E. McKenney
  1 sibling, 1 reply; 57+ messages in thread
From: josh @ 2015-06-30 22:00 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> This series contains some highly experimental patches that allow normal
> grace periods to take advantage of the work done by concurrent expedited
> grace periods.  This can reduce the overhead incurred by normal grace
> periods by eliminating the need for force-quiescent-state scans that
> would otherwise have happened after the expedited grace period completed.
> It is not clear whether this is a useful tradeoff.  Nevertheless, this
> series contains the following patches:

While it makes sense to avoid unnecessarily delaying a normal grace
period if the expedited machinery has provided the necessary delay, I'm
also *deeply* concerned that this will create a new class of
nondeterministic performance issues.  Something that uses RCU may
perform badly due to grace period latency, but then suddenly start
performing well because an unrelated task starts hammering expedited
grace periods.  This seems particularly likely during boot, for
instance, where RCU grace periods can be a significant component of boot
time (when you're trying to boot to userspace in small fractions of a
second).

- Josh Triplett


* Re: [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-06-30 21:56     ` Eric Dumazet
@ 2015-06-30 22:10       ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 22:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: LKML, Ingo Molnar, laijs, dipankar, Andrew Morton,
	mathieu.desnoyers, josh, Thomas Gleixner, peterz, rostedt,
	dhowells, dvhart, fweisbec, oleg, bobby.prani

On Tue, Jun 30, 2015 at 11:56:37PM +0200, Eric Dumazet wrote:
> On Tue, Jun 30, 2015 at 11:48 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> >
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 308b6acb4260..247aa1120c4c 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
> >                            !atomic_read(&rsp->expedited_need_qs));
> >
> >         rcu_exp_gp_seq_end(rsp);
> > -       mutex_unlock(&rnp->exp_funnel_mutex);
> >         smp_mb(); /* ensure subsequent action seen after grace period. */
> > -       if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
> > -               wake_up(&rsp->gp_wq);
> > +       if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
> > +               static unsigned long nextgp;
> > +               static unsigned long nextjiffy;
> > +
> > +               if (time_after_eq(jiffies, nextgp) ||
> > +                   ULONG_CMP_GE(rsp->gpnum, nextgp)) {
> > +                       nextgp = rsp->gpnum + 4;
> > +                       nextjiffy = jiffies + 10;
> 
> Do you want 10 ticks or 10 ms (as stated in title) ?

Ten ticks, good catch!
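
Incidentally, the hunk above sets nextjiffy but never reads it, so the
time check presumably wants to compare against nextjiffy rather than
nextgp.  A sketch of the apparent intent (in ticks, per the above; not a
tested fix):

	if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
		static unsigned long nextgp;
		static unsigned long nextjiffy;

		/* Wake only if 10 ticks or 4 GPs have elapsed since last time. */
		if (time_after_eq(jiffies, nextjiffy) ||
		    ULONG_CMP_GE(rsp->gpnum, nextgp)) {
			nextgp = rsp->gpnum + 4;
			nextjiffy = jiffies + 10;
			wake_up(&rsp->gp_wq);
		}
	}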

							Thanx, Paul

> > +                       wake_up(&rsp->gp_wq);
> > +               }
> > +       }
> > +       mutex_unlock(&rnp->exp_funnel_mutex);
> >
> >         put_online_cpus();
> >  }
> > --
> > 1.8.1.5
> >
> 



* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-06-30 22:00 ` [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones josh
@ 2015-06-30 22:12   ` Paul E. McKenney
  2015-06-30 23:46     ` josh
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-06-30 22:12 UTC (permalink / raw)
  To: josh
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > This series contains some highly experimental patches that allow normal
> > grace periods to take advantage of the work done by concurrent expedited
> > grace periods.  This can reduce the overhead incurred by normal grace
> > periods by eliminating the need for force-quiescent-state scans that
> > would otherwise have happened after the expedited grace period completed.
> > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > series contains the following patches:
> 
> While it makes sense to avoid unnecessarily delaying a normal grace
> period if the expedited machinery has provided the necessary delay, I'm
> also *deeply* concerned that this will create a new class of
> nondeterministic performance issues.  Something that uses RCU may
> perform badly due to grace period latency, but then suddenly start
> performing well because an unrelated task starts hammering expedited
> grace periods.  This seems particularly likely during boot, for
> instance, where RCU grace periods can be a significant component of boot
> time (when you're trying to boot to userspace in small fractions of a
> second).

I will take that as another vote against.  And for a reason that I had
not yet come up with, so good show!  ;-)

							Thanx, Paul



* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-06-30 22:12   ` Paul E. McKenney
@ 2015-06-30 23:46     ` josh
  2015-07-01  0:15       ` Paul E. McKenney
  2015-07-01 10:09       ` Peter Zijlstra
  0 siblings, 2 replies; 57+ messages in thread
From: josh @ 2015-06-30 23:46 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > Hello!
> > > 
> > > This series contains some highly experimental patches that allow normal
> > > grace periods to take advantage of the work done by concurrent expedited
> > > grace periods.  This can reduce the overhead incurred by normal grace
> > > periods by eliminating the need for force-quiescent-state scans that
> > > would otherwise have happened after the expedited grace period completed.
> > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > series contains the following patches:
> > 
> > While it makes sense to avoid unnecessarily delaying a normal grace
> > period if the expedited machinery has provided the necessary delay, I'm
> > also *deeply* concerned that this will create a new class of
> > nondeterministic performance issues.  Something that uses RCU may
> > perform badly due to grace period latency, but then suddenly start
> > performing well because an unrelated task starts hammering expedited
> > grace periods.  This seems particularly likely during boot, for
> > instance, where RCU grace periods can be a significant component of boot
> > time (when you're trying to boot to userspace in small fractions of a
> > second).
> 
> I will take that as another vote against.  And for a reason that I had
> not yet come up with, so good show!  ;-)

Consider it a fairly weak concern against.  Increasing performance seems
like a good thing in general; I just don't relish the future "feels less
responsive" bug reports that take a long time to track down and turn out
to be "this completely unrelated driver was loaded and started using
expedited grace periods".

Then again, perhaps the more relevant concern would be why drivers use
expedited grace periods in the first place.

- Josh Triplett


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-06-30 23:46     ` josh
@ 2015-07-01  0:15       ` Paul E. McKenney
  2015-07-01  0:42         ` Josh Triplett
  2015-07-01 10:09       ` Peter Zijlstra
  1 sibling, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01  0:15 UTC (permalink / raw)
  To: josh
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > > Hello!
> > > > 
> > > > This series contains some highly experimental patches that allow normal
> > > > grace periods to take advantage of the work done by concurrent expedited
> > > > grace periods.  This can reduce the overhead incurred by normal grace
> > > > periods by eliminating the need for force-quiescent-state scans that
> > > > would otherwise have happened after the expedited grace period completed.
> > > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > > series contains the following patches:
> > > 
> > > While it makes sense to avoid unnecessarily delaying a normal grace
> > > period if the expedited machinery has provided the necessary delay, I'm
> > > also *deeply* concerned that this will create a new class of
> > > nondeterministic performance issues.  Something that uses RCU may
> > > perform badly due to grace period latency, but then suddenly start
> > > performing well because an unrelated task starts hammering expedited
> > > grace periods.  This seems particularly likely during boot, for
> > > instance, where RCU grace periods can be a significant component of boot
> > > time (when you're trying to boot to userspace in small fractions of a
> > > second).
> > 
> > I will take that as another vote against.  And for a reason that I had
> > not yet come up with, so good show!  ;-)
> 
> Consider it a fairly weak concern against.  Increasing performance seems
> like a good thing in general; I just don't relish the future "feels less
> responsive" bug reports that take a long time to track down and turn out
> to be "this completely unrelated driver was loaded and started using
> expedited grace periods".

From what I can see, this one needs a good reason to go in, as opposed
to a good reason to stay out.

> Then again, perhaps the more relevant concern would be why drivers use
> expedited grace periods in the first place.

Networking uses expedited grace periods when RTNL is held to reduce
contention on that lock.  Several other places have used it to minimize
user-visible grace-period slowdown.  But there are probably places that
would be better served doing something different.  That is after all
the common case for most synchronization primitives.  ;-)

							Thanx, Paul



* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01  0:15       ` Paul E. McKenney
@ 2015-07-01  0:42         ` Josh Triplett
  2015-07-01  3:37           ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Josh Triplett @ 2015-07-01  0:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 05:15:58PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > > > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > > > Hello!
> > > > > 
> > > > > This series contains some highly experimental patches that allow normal
> > > > > grace periods to take advantage of the work done by concurrent expedited
> > > > > grace periods.  This can reduce the overhead incurred by normal grace
> > > > > periods by eliminating the need for force-quiescent-state scans that
> > > > > would otherwise have happened after the expedited grace period completed.
> > > > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > > > series contains the following patches:
> > > > 
> > > > While it makes sense to avoid unnecessarily delaying a normal grace
> > > > period if the expedited machinery has provided the necessary delay, I'm
> > > > also *deeply* concerned that this will create a new class of
> > > > nondeterministic performance issues.  Something that uses RCU may
> > > > perform badly due to grace period latency, but then suddenly start
> > > > performing well because an unrelated task starts hammering expedited
> > > > grace periods.  This seems particularly likely during boot, for
> > > > instance, where RCU grace periods can be a significant component of boot
> > > > time (when you're trying to boot to userspace in small fractions of a
> > > > second).
> > > 
> > > I will take that as another vote against.  And for a reason that I had
> > > not yet come up with, so good show!  ;-)
> > 
> > Consider it a fairly weak concern against.  Increasing performance seems
> > like a good thing in general; I just don't relish the future "feels less
> > responsive" bug reports that take a long time to track down and turn out
> > to be "this completely unrelated driver was loaded and started using
> > expedited grace periods".
> 
> From what I can see, this one needs a good reason to go in, as opposed
> to a good reason to stay out.
> 
> > Then again, perhaps the more relevant concern would be why drivers use
> > expedited grace periods in the first place.
> 
> Networking uses expedited grace periods when RTNL is held to reduce
> contention on that lock.

Wait, what?  Why is anything using traditional (non-S) RCU while *any*
lock is held?

> Several other places have used it to minimize
> user-visible grace-period slowdown.  But there are probably places that
> would be better served doing something different.  That is after all
> the common case for most synchronization primitives.  ;-)

Sounds likely. :)

- Josh Triplett


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01  0:42         ` Josh Triplett
@ 2015-07-01  3:37           ` Paul E. McKenney
  2015-07-01 10:12             ` Peter Zijlstra
  2015-07-01 15:43             ` Josh Triplett
  0 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01  3:37 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 05:42:14PM -0700, Josh Triplett wrote:
> On Tue, Jun 30, 2015 at 05:15:58PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > > On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> > > > On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > > > > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > > > > Hello!
> > > > > > 
> > > > > > This series contains some highly experimental patches that allow normal
> > > > > > grace periods to take advantage of the work done by concurrent expedited
> > > > > > grace periods.  This can reduce the overhead incurred by normal grace
> > > > > > periods by eliminating the need for force-quiescent-state scans that
> > > > > > would otherwise have happened after the expedited grace period completed.
> > > > > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > > > > series contains the following patches:
> > > > > 
> > > > > While it makes sense to avoid unnecessarily delaying a normal grace
> > > > > period if the expedited machinery has provided the necessary delay, I'm
> > > > > also *deeply* concerned that this will create a new class of
> > > > > nondeterministic performance issues.  Something that uses RCU may
> > > > > perform badly due to grace period latency, but then suddenly start
> > > > > performing well because an unrelated task starts hammering expedited
> > > > > grace periods.  This seems particularly likely during boot, for
> > > > > instance, where RCU grace periods can be a significant component of boot
> > > > > time (when you're trying to boot to userspace in small fractions of a
> > > > > second).
> > > > 
> > > > I will take that as another vote against.  And for a reason that I had
> > > > not yet come up with, so good show!  ;-)
> > > 
> > > Consider it a fairly weak concern against.  Increasing performance seems
> > > like a good thing in general; I just don't relish the future "feels less
> > > responsive" bug reports that take a long time to track down and turn out
> > > to be "this completely unrelated driver was loaded and started using
> > > expedited grace periods".
> > 
> > From what I can see, this one needs a good reason to go in, as opposed
> > to a good reason to stay out.
> > 
> > > Then again, perhaps the more relevant concern would be why drivers use
> > > expedited grace periods in the first place.
> > 
> > Networking uses expedited grace periods when RTNL is held to reduce
> > contention on that lock.
> 
> Wait, what?  Why is anything using traditional (non-S) RCU while *any*
> lock is held?

In their defense, it is a sleeplock that is never taken except when
rearranging networking configuration.  Sometimes they need a grace period
under the lock.  So synchronize_net() checks to see if RTNL is held, and
does a synchronize_rcu_expedited() if so and a synchronize_rcu() if not.
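
For reference, the pattern in question looks roughly like this
(paraphrasing net/core/dev.c of the time, not a verbatim quote):

void synchronize_net(void)
{
	might_sleep();
	if (rtnl_is_locked())
		synchronize_rcu_expedited();	/* don't stall RTNL holders */
	else
		synchronize_rcu();
}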

But maybe I am misunderstanding your question?

> > Several other places have used it to minimize
> > user-visible grace-period slowdown.  But there are probably places that
> > would be better served doing something different.  That is after all
> > the common case for most synchronization primitives.  ;-)
> 
> Sounds likely. :)

;-)

							Thanx, Paul



* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
@ 2015-07-01 10:03     ` Peter Zijlstra
  2015-07-01 13:42       ` Paul E. McKenney
  2015-07-01 20:59       ` Paul E. McKenney
  2015-07-01 10:05     ` Peter Zijlstra
  1 sibling, 2 replies; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> @@ -2121,17 +2137,24 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  					       TPS("fqswait"));
>  			rsp->gp_state = RCU_GP_WAIT_FQS;
>  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> -					((gf = READ_ONCE(rsp->gp_flags)) &
> -					 RCU_GP_FLAG_FQS) ||
> -					(!READ_ONCE(rnp->qsmask) &&
> -					 !rcu_preempt_blocked_readers_cgp(rnp)),
> -					j);
> +				((gf = READ_ONCE(rsp->gp_flags)) &
> +				 RCU_GP_FLAG_FQS) ||
> +				(!READ_ONCE(rnp->qsmask) &&
> +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> +						    rsp->gp_exp_snap),
> +				j);

How about using a helper function there?

static inline bool rcu_gp_done(rsp, rnp)
{
	/* Forced Quiescent State complete */
	if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS)
		return true;

	/* QS not masked and not blocked by preempted readers */
	if (!READ_ONCE(rnp->qsmask) &&
	    !rcu_preempt_blocked_readers_cgp(rnp))
		return true;

	/* Expedited Grace Period completed */
	if (rcu_exp_gp_seq_done(rsp))
		return true;

	return false;
}

	ret = wait_event_interruptible_timeout(rsp->gp_wq,
		rcu_gp_done(rsp, rnp), j);




* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
  2015-07-01 10:03     ` Peter Zijlstra
@ 2015-07-01 10:05     ` Peter Zijlstra
  2015-07-01 13:41       ` Paul E. McKenney
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:05 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
>  			rsp->gp_state = RCU_GP_WAIT_FQS;
>  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> +				((gf = READ_ONCE(rsp->gp_flags)) &
> +				 RCU_GP_FLAG_FQS) ||
> +				(!READ_ONCE(rnp->qsmask) &&
> +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> +						    rsp->gp_exp_snap),
> +				j);
>  			rsp->gp_state = RCU_GP_DONE_FQS;

How can the GP be done if we timed out or got interrupted?


* Re: [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP Paul E. McKenney
  2015-06-30 21:56     ` Eric Dumazet
@ 2015-07-01 10:07     ` Peter Zijlstra
  2015-07-01 13:45       ` Paul E. McKenney
  2015-07-01 19:30       ` Paul E. McKenney
  1 sibling, 2 replies; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:07 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Tue, Jun 30, 2015 at 02:48:30PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This seems like a good place to explain why this is a desirable thing,
no?

Why would you want to limit this?

> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcu/tree.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 308b6acb4260..247aa1120c4c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
>  			   !atomic_read(&rsp->expedited_need_qs));
>  
>  	rcu_exp_gp_seq_end(rsp);
> -	mutex_unlock(&rnp->exp_funnel_mutex);
>  	smp_mb(); /* ensure subsequent action seen after grace period. */
> -	if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
> -		wake_up(&rsp->gp_wq);
> +	if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
> +		static unsigned long nextgp;
> +		static unsigned long nextjiffy;
> +
> +		if (time_after_eq(jiffies, nextgp) ||
> +		    ULONG_CMP_GE(rsp->gpnum, nextgp)) {
> +			nextgp = rsp->gpnum + 4;
> +			nextjiffy = jiffies + 10;
> +			wake_up(&rsp->gp_wq);
> +		}
> +	}
> +	mutex_unlock(&rnp->exp_funnel_mutex);
>  
>  	put_online_cpus();
>  }
> -- 
> 1.8.1.5
> 


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-06-30 23:46     ` josh
  2015-07-01  0:15       ` Paul E. McKenney
@ 2015-07-01 10:09       ` Peter Zijlstra
  2015-07-01 10:55         ` Peter Zijlstra
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:09 UTC (permalink / raw)
  To: josh
  Cc: Paul E. McKenney, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> Consider it a fairly weak concern against.  Increasing performance seems
> like a good thing in general; I just don't relish the future "feels less
> responsive" bug reports that take a long time to track down and turn out
> to be "this completely unrelated driver was loaded and started using
> expedited grace periods".

random drivers, or for that matter, new-code of any sort. Should _NOT_
be using expedited grace periods.

They're a horrid hack only suitable for unfixable ABI.

> Then again, perhaps the more relevant concern would be why drivers use
> expedited grace periods in the first place.

Right.


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01  3:37           ` Paul E. McKenney
@ 2015-07-01 10:12             ` Peter Zijlstra
  2015-07-01 14:01               ` Paul E. McKenney
  2015-07-01 15:43             ` Josh Triplett
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Tue, Jun 30, 2015 at 08:37:01PM -0700, Paul E. McKenney wrote:
> > Wait, what?  Why is anything using traditional (non-S) RCU while *any*
> > lock is held?
> 
> In their defense, it is a sleeplock that is never taken except when
> rearranging networking configuration.  Sometimes they need a grace period
> under the lock.  So synchronize_net() checks to see if RTNL is held, and
> does a synchronize_rcu_expedited() if so and a synchronize_rcu() if not.

Sounds vile.


* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 10:09       ` Peter Zijlstra
@ 2015-07-01 10:55         ` Peter Zijlstra
  2015-07-01 14:00           ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 10:55 UTC (permalink / raw)
  To: josh
  Cc: Paul E. McKenney, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 12:09:39PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > Consider it a fairly weak concern against.  Increasing performance seems
> > like a good thing in general; I just don't relish the future "feels less
> > responsive" bug reports that take a long time to track down and turn out
> > to be "this completely unrelated driver was loaded and started using
> > expedited grace periods".
> 
> random drivers, or for that matter, new-code of any sort. Should _NOT_
> be using expedited grace periods.
> 
> They're a horrid hack only suitable for unfixable ABI.

Let me repeat, just in case I've not been clear. Expedited grace periods
are _BAD_ and should be avoided at all costs.

They perturb the _entire_ machine.

Yes we can polish the turd, but in the end its still a turd.

Sadly people seem to have taken a liking to them, ooh a make RCU go
faster button. And there's not been much if any pushback on people using
it.




* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 10:05     ` Peter Zijlstra
@ 2015-07-01 13:41       ` Paul E. McKenney
  2015-07-01 13:48         ` Peter Zijlstra
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 13:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 12:05:21PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > +				 RCU_GP_FLAG_FQS) ||
> > +				(!READ_ONCE(rnp->qsmask) &&
> > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > +						    rsp->gp_exp_snap),
> > +				j);
> >  			rsp->gp_state = RCU_GP_DONE_FQS;
> 
> How can the GP be done if we timed out or got interrupted?

If all the CPUs still blocking the grace period went idle, or in a
NO_HZ_FULL kernel, entered nohz_full userspace execution.  Or, if
certain low-probability races happen, went offline.

							Thanx, Paul



* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 10:03     ` Peter Zijlstra
@ 2015-07-01 13:42       ` Paul E. McKenney
  2015-07-01 20:59       ` Paul E. McKenney
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 13:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 12:03:52PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > @@ -2121,17 +2137,24 @@ static int __noreturn rcu_gp_kthread(void *arg)
> >  					       TPS("fqswait"));
> >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > -					((gf = READ_ONCE(rsp->gp_flags)) &
> > -					 RCU_GP_FLAG_FQS) ||
> > -					(!READ_ONCE(rnp->qsmask) &&
> > -					 !rcu_preempt_blocked_readers_cgp(rnp)),
> > -					j);
> > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > +				 RCU_GP_FLAG_FQS) ||
> > +				(!READ_ONCE(rnp->qsmask) &&
> > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > +						    rsp->gp_exp_snap),
> > +				j);
> 
> How about using a helper function there?

Good point, will do.

							Thanx, Paul

> static inline bool rcu_gp_done(rsp, rnp)
> {
> 	/* Forced Quiescent State complete */
> 	if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS)
> 		return true;
> 
> 	/* QS not masked and not blocked by preempted readers */
> 	if (!READ_ONCE(rnp->qsmask) &&
> 	    !rcu_preempt_blocked_readers_cgp(rnp))
> 		return true;
> 
> 	/* Expedited Grace Period completed */
> 	if (rcu_exp_gp_seq_done(rsp))
> 		return true;
> 
> 	return false;
> }
> 
> 	ret = wait_event_interruptible_timeout(rsp->gp_wq,
> 		rcu_gp_done(rsp, rnp), j);
> 
> 



* Re: [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-07-01 10:07     ` Peter Zijlstra
@ 2015-07-01 13:45       ` Paul E. McKenney
  2015-07-01 19:30       ` Paul E. McKenney
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 13:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 12:07:30PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 02:48:30PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> This seems like a good place to explain why this is a desirable thing,
> no?

Good point.

> Why would you want to limit this?

Because the unconditional wakeup is a two-edged sword.  It reduces
the latency of normal RCU grace periods on the one hand, but it makes
rcu_sched consume even more CPU on the other.

							Thanx, Paul

> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 308b6acb4260..247aa1120c4c 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
> >  			   !atomic_read(&rsp->expedited_need_qs));
> >  
> >  	rcu_exp_gp_seq_end(rsp);
> > -	mutex_unlock(&rnp->exp_funnel_mutex);
> >  	smp_mb(); /* ensure subsequent action seen after grace period. */
> > -	if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
> > -		wake_up(&rsp->gp_wq);
> > +	if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
> > +		static unsigned long nextgp;
> > +		static unsigned long nextjiffy;
> > +
> > +		if (time_after_eq(jiffies, nextjiffy) ||
> > +		    ULONG_CMP_GE(rsp->gpnum, nextgp)) {
> > +			nextgp = rsp->gpnum + 4;
> > +			nextjiffy = jiffies + 10;
> > +			wake_up(&rsp->gp_wq);
> > +		}
> > +	}
> > +	mutex_unlock(&rnp->exp_funnel_mutex);
> >  
> >  	put_online_cpus();
> >  }
> > -- 
> > 1.8.1.5
> > 
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 13:41       ` Paul E. McKenney
@ 2015-07-01 13:48         ` Peter Zijlstra
  2015-07-01 14:03           ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 13:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 06:41:44AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 12:05:21PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> > >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > > +				 RCU_GP_FLAG_FQS) ||
> > > +				(!READ_ONCE(rnp->qsmask) &&
> > > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > > +						    rsp->gp_exp_snap),
> > > +				j);
> > >  			rsp->gp_state = RCU_GP_DONE_FQS;
> > 
> > How can the GP be done if we timed out or got interrupted?
> 
> If all the CPUs still blocking the grace period went idle, or in a
> NO_HZ_FULL kernel, entered nohz_full userspace execution.  Or, if
> certain low-probability races happen, went offline.

But what if none of those are true and we still timed out? You
unconditionally grant the GP.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 10:55         ` Peter Zijlstra
@ 2015-07-01 14:00           ` Paul E. McKenney
  2015-07-01 14:17             ` Peter Zijlstra
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 14:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 12:55:11PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 12:09:39PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > > Consider it a fairly weak concern against.  Increasing performance seems
> > > like a good thing in general; I just don't relish the future "feels less
> > > responsive" bug reports that take a long time to track down and turn out
> > > to be "this completely unrelated driver was loaded and started using
> > > expedited grace periods".
> > 
> > random drivers, or for that matter, new-code of any sort. Should _NOT_
> > be using expedited grace periods.
> > 
> > They're a horrid hack only suitable for unfixable ABI.
> 
> Let me repeat, just in case I've not been clear. Expedited grace periods
> are _BAD_ and should be avoided at all costs.

That is a bit extreme, Peter.

There should not be a problem using them for operations that are not done
while the system is running the main workload.  Which is why I am OK with
synchronize_net() using synchronize_rcu_expedited() when RTNL is held.
The operations that do that are setup/configuration operations that you
won't normally be doing while your HPC or realtime workload is running.

I believe that many of the other uses are similar.
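
For reference, synchronize_net() is roughly the following (quoting from
memory, so please treat it as a sketch rather than a verbatim copy of
net/core/dev.c):

	void synchronize_net(void)
	{
		might_sleep();
		if (rtnl_is_locked())
			synchronize_rcu_expedited();
		else
			synchronize_rcu();
	}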

> They perturb the _entire_ machine.

The portion of the machine that is non-idle and not executing in nohz_full
userspace, that is.  So nohz_full usermode execution is unperturbed by
expedited grace periods.

In addition, synchronize_srcu_expedited() does -not- perturb anything
other than the task actually executing synchronize_srcu_expedited().
So those read-side memory barriers are gaining you something, and there
should not be much need to push back on synchronize_srcu_expedited().

> Yes we can polish the turd, but in the end its still a turd.

And I intend to polish it even more.  ;-)

> Sadly people seem to have taken a liking to them, ooh a make RCU go
> faster button. And there's not been much if any pushback on people using
> it.

There aren't all that many uses, so I don't believe that
people are abusing it that much.  There are only four non-RCU
uses of synchronize_rcu_expedited() and only two non-RCU uses of
synchronize_sched_expedited().  In contrast, there are a couple hundred
uses of synchronize_rcu() and about 40 uses of synchronize_sched().

So I am not seeing much evidence of wanton use of either
synchronize_rcu_expedited() or synchronize_sched_expedited().

Are a huge pile of them coming in this merge window or something?
What raised your concerns on this issue?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 10:12             ` Peter Zijlstra
@ 2015-07-01 14:01               ` Paul E. McKenney
  2015-07-01 14:08                 ` Eric Dumazet
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 14:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Triplett, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 12:12:21PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 08:37:01PM -0700, Paul E. McKenney wrote:
> > > Wait, what?  Why is anything using traditional (non-S) RCU while *any*
> > > lock is held?
> > 
> > In their defense, it is a sleeplock that is never taken except when
> > rearranging networking configuration.  Sometimes they need a grace period
> > under the lock.  So synchronize_net() checks to see if RTNL is held, and
> > does a synchronize_rcu_expedited() if so and a synchronize_rcu() if not.
> 
> Sounds vile.

OK, I'll bite.  Exactly what seems especially vile about it?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 13:48         ` Peter Zijlstra
@ 2015-07-01 14:03           ` Paul E. McKenney
  2015-07-02 12:03             ` Peter Zijlstra
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 14:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 03:48:06PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 06:41:44AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 12:05:21PM +0200, Peter Zijlstra wrote:
> > > On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > > >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> > > >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > > > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > > > +				 RCU_GP_FLAG_FQS) ||
> > > > +				(!READ_ONCE(rnp->qsmask) &&
> > > > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > > > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > > > +						    rsp->gp_exp_snap),
> > > > +				j);
> > > >  			rsp->gp_state = RCU_GP_DONE_FQS;
> > > 
> > > How can the GP be done if we timed out or got interrupted?
> > 
> > If all the CPUs still blocking the grace period went idle, or in a
> > NO_HZ_FULL kernel, entered nohz_full userspace execution.  Or, if
> > certain low-probability races happen, went offline.
> 
> But what if none of those are true and we still timed out? You
> unconditionally grant the GP.

Say what???

I recheck the conditions and break out of the loop only if one or more
of the grace-period-end conditions is satisfied.  If not, I invoke
rcu_gp_fqs() to do the scan to see if all remaining CPUs are idle,
in nohz_full userspace execution, or offline.
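
In other words, the loop is roughly structured as follows (heavily
simplified sketch, with the wait condition factored into a hypothetical
helper, not the literal tree.c code):

	for (;;) {
		rsp->gp_state = RCU_GP_WAIT_FQS;
		ret = wait_event_interruptible_timeout(rsp->gp_wq,
				fqs_requested_or_gp_done(rsp, rnp), j);
		rsp->gp_state = RCU_GP_DONE_FQS;
		/* Recheck the conditions; locking supplies the barriers. */
		if (!READ_ONCE(rnp->qsmask) &&
		    !rcu_preempt_blocked_readers_cgp(rnp))
			break;	/* Grace period really is done. */
		/* Not done, so scan for idle/nohz_full/offline CPUs. */
		fqs_state = rcu_gp_fqs(rsp, fqs_state);
	}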

What am I missing here?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 14:01               ` Paul E. McKenney
@ 2015-07-01 14:08                 ` Eric Dumazet
  2015-07-01 15:58                   ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Eric Dumazet @ 2015-07-01 14:08 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Peter Zijlstra, Josh Triplett, LKML, Ingo Molnar, laijs,
	dipankar, Andrew Morton, mathieu.desnoyers, Thomas Gleixner,
	rostedt, dhowells, dvhart, fweisbec, oleg, bobby.prani

> OK, I'll bite.  Exactly what seems especially vile about it?

Presumably we would need to convert everything to async model
(call_rcu() and friends),
but that is a long way to go...

For example, synchronize_srcu_expedited() in kvm_set_irq_routing()
really looks overkill.
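
The shape of such a conversion would be roughly the following
(illustrative sketch only, with made-up names, not a patch against the
actual kvm code):

	/* Sketch only; struct foo, gp, and gp_lock are made-up names. */
	struct foo {
		int data;
		struct rcu_head rcu;
	};

	/* Synchronous style: publish the new version, wait, then free. */
	old = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
	rcu_assign_pointer(gp, new);
	synchronize_rcu_expedited();
	kfree(old);

	/* Asynchronous style: publish and defer the free to a callback. */
	static void foo_free_cb(struct rcu_head *head)
	{
		kfree(container_of(head, struct foo, rcu));
	}

	old = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
	rcu_assign_pointer(gp, new);
	call_rcu(&old->rcu, foo_free_cb);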

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 14:00           ` Paul E. McKenney
@ 2015-07-01 14:17             ` Peter Zijlstra
  2015-07-01 16:17               ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 14:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:

> That is a bit extreme, Peter.

Of course; but I'm really not seeing people taking due care with them

> Are a huge pile of them coming in this merge window or something?
> What raised your concerns on this issue?

This is complete horse manure (breaking the nvidiot binary blob is a
good thing):

74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")

Also, I'm not entirely convinced things like:

fd2ed4d25270 ("dm: add statistics support")
83d5e5b0af90 ("dm: optimize use SRCU and RCU")
ef3230880abd ("backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu")

Are in the 'never' happens category. Esp. the backing-dev one, it
triggers every time you unplug a USB stick or similar.

Rejigging a DM might indeed be rare enough; but then again, people use
DM explicitly so they can rejig while in operation.

Also, they really do not explain how expedited really is the only option
available. Why things can't be batched etc..

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01  3:37           ` Paul E. McKenney
  2015-07-01 10:12             ` Peter Zijlstra
@ 2015-07-01 15:43             ` Josh Triplett
  2015-07-01 15:59               ` Paul E. McKenney
  1 sibling, 1 reply; 57+ messages in thread
From: Josh Triplett @ 2015-07-01 15:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Tue, Jun 30, 2015 at 08:37:01PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 30, 2015 at 05:42:14PM -0700, Josh Triplett wrote:
> > On Tue, Jun 30, 2015 at 05:15:58PM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > > > On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> > > > > On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > > > > > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > > > > > Hello!
> > > > > > > 
> > > > > > > This series contains some highly experimental patches that allow normal
> > > > > > > grace periods to take advantage of the work done by concurrent expedited
> > > > > > > grace periods.  This can reduce the overhead incurred by normal grace
> > > > > > > periods by eliminating the need for force-quiescent-state scans that
> > > > > > > would otherwise have happened after the expedited grace period completed.
> > > > > > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > > > > > series contains the following patches:
> > > > > > 
> > > > > > While it makes sense to avoid unnecessarily delaying a normal grace
> > > > > > period if the expedited machinery has provided the necessary delay, I'm
> > > > > > also *deeply* concerned that this will create a new class of
> > > > > > nondeterministic performance issues.  Something that uses RCU may
> > > > > > perform badly due to grace period latency, but then suddenly start
> > > > > > performing well because an unrelated task starts hammering expedited
> > > > > > grace periods.  This seems particularly likely during boot, for
> > > > > > instance, where RCU grace periods can be a significant component of boot
> > > > > > time (when you're trying to boot to userspace in small fractions of a
> > > > > > second).
> > > > > 
> > > > > I will take that as another vote against.  And for a reason that I had
> > > > > not yet come up with, so good show!  ;-)
> > > > 
> > > > Consider it a fairly weak concern against.  Increasing performance seems
> > > > like a good thing in general; I just don't relish the future "feels less
> > > > responsive" bug reports that take a long time to track down and turn out
> > > > to be "this completely unrelated driver was loaded and started using
> > > > expedited grace periods".
> > > 
> > > From what I can see, this one needs a good reason to go in, as opposed
> > > to a good reason to stay out.
> > > 
> > > > Then again, perhaps the more relevant concern would be why drivers use
> > > > expedited grace periods in the first place.
> > > 
> > > Networking uses expedited grace periods when RTNL is held to reduce
> > > contention on that lock.
> > 
> > Wait, what?  Why is anything using traditional (non-S) RCU while *any*
> > lock is held?
> 
> In their defense, it is a sleeplock that is never taken except when
> rearranging networking configuration.  Sometimes they need a grace period
> under the lock.  So synchronize_net() checks to see if RTNL is held, and
> does a synchronize_rcu_expedited() if so and a synchronize_rcu() if not.
> 
> But maybe I am misunderstanding your question?

No, you understood my question.  It seems wrong that they would need a
grace period *under* the lock, rather than the usual case of making a
change under the lock, dropping the lock, and *then* synchronizing.

- Josh Triplett

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 14:08                 ` Eric Dumazet
@ 2015-07-01 15:58                   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 15:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter Zijlstra, Josh Triplett, LKML, Ingo Molnar, laijs,
	dipankar, Andrew Morton, mathieu.desnoyers, Thomas Gleixner,
	rostedt, dhowells, dvhart, fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 04:08:27PM +0200, Eric Dumazet wrote:
> > OK, I'll bite.  Exactly what seems especially vile about it?
> 
> Presumably we would need to convert everything to async model
> (call_rcu() and friends),
> but that is a long way to go...
> 
> For example, synchronize_srcu_expedited() in kvm_set_irq_routing()
> really looks overkill.

But synchronize_srcu_expedited() is SRCU, and does not disturb the
rest of the system.

Don't get me wrong, if converting it to call_srcu() helps, why not?
But synchronize_srcu_expedited() isn't messing up HPC or real-time
workloads.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 15:43             ` Josh Triplett
@ 2015-07-01 15:59               ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 15:59 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

On Wed, Jul 01, 2015 at 08:43:54AM -0700, Josh Triplett wrote:
> On Tue, Jun 30, 2015 at 08:37:01PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 30, 2015 at 05:42:14PM -0700, Josh Triplett wrote:
> > > On Tue, Jun 30, 2015 at 05:15:58PM -0700, Paul E. McKenney wrote:
> > > > On Tue, Jun 30, 2015 at 04:46:33PM -0700, josh@joshtriplett.org wrote:
> > > > > On Tue, Jun 30, 2015 at 03:12:24PM -0700, Paul E. McKenney wrote:
> > > > > > On Tue, Jun 30, 2015 at 03:00:15PM -0700, josh@joshtriplett.org wrote:
> > > > > > > On Tue, Jun 30, 2015 at 02:48:05PM -0700, Paul E. McKenney wrote:
> > > > > > > > Hello!
> > > > > > > > 
> > > > > > > > This series contains some highly experimental patches that allow normal
> > > > > > > > grace periods to take advantage of the work done by concurrent expedited
> > > > > > > > grace periods.  This can reduce the overhead incurred by normal grace
> > > > > > > > periods by eliminating the need for force-quiescent-state scans that
> > > > > > > > would otherwise have happened after the expedited grace period completed.
> > > > > > > > It is not clear whether this is a useful tradeoff.  Nevertheless, this
> > > > > > > > series contains the following patches:
> > > > > > > 
> > > > > > > While it makes sense to avoid unnecessarily delaying a normal grace
> > > > > > > period if the expedited machinery has provided the necessary delay, I'm
> > > > > > > also *deeply* concerned that this will create a new class of
> > > > > > > nondeterministic performance issues.  Something that uses RCU may
> > > > > > > perform badly due to grace period latency, but then suddenly start
> > > > > > > performing well because an unrelated task starts hammering expedited
> > > > > > > grace periods.  This seems particularly likely during boot, for
> > > > > > > instance, where RCU grace periods can be a significant component of boot
> > > > > > > time (when you're trying to boot to userspace in small fractions of a
> > > > > > > second).
> > > > > > 
> > > > > > I will take that as another vote against.  And for a reason that I had
> > > > > > not yet come up with, so good show!  ;-)
> > > > > 
> > > > > Consider it a fairly weak concern against.  Increasing performance seems
> > > > > like a good thing in general; I just don't relish the future "feels less
> > > > > responsive" bug reports that take a long time to track down and turn out
> > > > > to be "this completely unrelated driver was loaded and started using
> > > > > expedited grace periods".
> > > > 
> > > > From what I can see, this one needs a good reason to go in, as opposed
> > > > to a good reason to stay out.
> > > > 
> > > > > Then again, perhaps the more relevant concern would be why drivers use
> > > > > expedited grace periods in the first place.
> > > > 
> > > > Networking uses expedited grace periods when RTNL is held to reduce
> > > > contention on that lock.
> > > 
> > > Wait, what?  Why is anything using traditional (non-S) RCU while *any*
> > > lock is held?
> > 
> > In their defense, it is a sleeplock that is never taken except when
> > rearranging networking configuration.  Sometimes they need a grace period
> > under the lock.  So synchronize_net() checks to see if RTNL is held, and
> > does a synchronize_rcu_expedited() if so and a synchronize_rcu() if not.
> > 
> > But maybe I am misunderstanding your question?
> 
> No, you understood my question.  It seems wrong that they would need a
> grace period *under* the lock, rather than the usual case of making a
> change under the lock, dropping the lock, and *then* synchronizing.

On that I must defer to the networking folks.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 14:17             ` Peter Zijlstra
@ 2015-07-01 16:17               ` Paul E. McKenney
  2015-07-01 17:02                 ` Peter Zijlstra
  2015-07-02  1:11                 ` Mike Galbraith
  0 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 16:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:
> 
> > That is a bit extreme, Peter.
> 
> Of course; but I'm really not seeing people taking due care with them

;-)

> > Are a huge pile of them coming in this merge window or something?
> > What raised your concerns on this issue?
> 
> This is complete horse manure (breaking the nvidiot binary blob is a
> good thing):
> 
> 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")

Really???

I am not concerned about this one.  After all, one of the first things
that people do for OS-jitter-sensitive workloads is to get rid of
binary blobs.  And any runtime use of ACPI as well.  And let's face it,
if your latency-sensitive workload is using either binary blobs or ACPI,
you have already completely lost.  Therefore, an additional expedited
grace period cannot possibly cause you to lose any more.

> Also, I'm not entirely convinced things like:
> 
> fd2ed4d25270 ("dm: add statistics support")
> 83d5e5b0af90 ("dm: optimize use SRCU and RCU")
> ef3230880abd ("backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu")
> 
> Are in the 'never' happens category. Esp. the backing-dev one, it
> triggers every time you unplug a USB stick or similar.

Which people should be assiduously avoiding for any sort of
industrial-control system, especially given things like STUXNET.

> Rejigging a DM might indeed be rare enough; but then again, people use
> DM explicitly so they can rejig while in operation.

They rejig DM when running OS-jitter-sensitive workloads?

> Also, they really do not explain how expedited really is the only option
> available. Why things can't be batched etc..

Fair question!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 16:17               ` Paul E. McKenney
@ 2015-07-01 17:02                 ` Peter Zijlstra
  2015-07-01 20:09                   ` Paul E. McKenney
  2015-07-02  1:11                 ` Mike Galbraith
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-01 17:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 09:17:05AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:

> > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> 
> Really???
> 
> I am not concerned about this one.  After all, one of the first things
> that people do for OS-jitter-sensitive workloads is to get rid of
> binary blobs.  And any runtime use of ACPI as well.  And let's face it,
> if your latency-sensitive workload is using either binary blobs or ACPI,
> you have already completely lost.  Therefore, an additional expedited
> grace period cannot possibly cause you to lose any more.

This isn't solely about rt etc.. this call is a generic facility used by
however many consumers. A normal workstation/server could run into it at
relatively high frequency depending on its workload.

Even on not latency sensitive workloads I think hammering all active
CPUs is bad behaviour. Remember that a typical server class machine
easily has more than 32 CPUs these days.

> > Also, I'm not entirely convinced things like:
> > 
> > fd2ed4d25270 ("dm: add statistics support")
> > 83d5e5b0af90 ("dm: optimize use SRCU and RCU")
> > ef3230880abd ("backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu")
> > 
> > Are in the 'never' happens category. Esp. the backing-dev one, it
> > triggers every time you unplug a USB stick or similar.
> 
> Which people should be assiduously avoiding for any sort of
> industrial-control system, especially given things like STUXNET.

USB sure, but a backing dev is involved in nfs clients, loopback and all
sorts of block/filesystem like setups.

unmount an NFS mount and voila expedited rcu, unmount a loopback, tada.

All you need is a regular server workload triggering any of that on a
semi regular basis and even !rt people might start to notice something
is up.

> > Rejigging a DM might indeed be rare enough; but then again, people use
> > DM explicitly so they can rejig while in operation.
> 
> They rejig DM when running OS-jitter-sensitive workloads?

Unlikely but who knows, I don't really know DM, so I can't even tell
what would trigger these.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP
  2015-07-01 10:07     ` Peter Zijlstra
  2015-07-01 13:45       ` Paul E. McKenney
@ 2015-07-01 19:30       ` Paul E. McKenney
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 19:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 12:07:30PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 02:48:30PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> This seems like a good place to explain why this is a desirable thing,
> no?
> 
> Why would you want to limit this?

Good point!  I added the following to the commit log:

	Currently, the expedited grace period unconditionally awakens the
	normal grace period anytime there is a grace period in progress.
	However, this can be too much of a good thing for the following
	reasons: (1) It can result in excessive CPU consumption on
	the part of the RCU grace-period kthread, (2) The resulting
	variations in normal grace-period latency may cause confusion
	(as Josh Triplett points out), and (3) In many cases, reducing
	normal grace-period latency will be of little or no value.
	This commit therefore does the wakeups only once per ten jiffies
	or every fourth grace period, whichever is more frequent.

Does that help?

							Thanx, Paul

> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 308b6acb4260..247aa1120c4c 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3505,10 +3505,19 @@ void synchronize_sched_expedited(void)
> >  			   !atomic_read(&rsp->expedited_need_qs));
> >  
> >  	rcu_exp_gp_seq_end(rsp);
> > -	mutex_unlock(&rnp->exp_funnel_mutex);
> >  	smp_mb(); /* ensure subsequent action seen after grace period. */
> > -	if (rsp->gp_kthread && rcu_gp_in_progress(rsp))
> > -		wake_up(&rsp->gp_wq);
> > +	if (rsp->gp_kthread && rcu_gp_in_progress(rsp)) {
> > +		static unsigned long nextgp;
> > +		static unsigned long nextjiffy;
> > +
> > +		if (time_after_eq(jiffies, nextjiffy) ||
> > +		    ULONG_CMP_GE(rsp->gpnum, nextgp)) {
> > +			nextgp = rsp->gpnum + 4;
> > +			nextjiffy = jiffies + 10;
> > +			wake_up(&rsp->gp_wq);
> > +		}
> > +	}
> > +	mutex_unlock(&rnp->exp_funnel_mutex);
> >  
> >  	put_online_cpus();
> >  }
> > -- 
> > 1.8.1.5
> > 
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 17:02                 ` Peter Zijlstra
@ 2015-07-01 20:09                   ` Paul E. McKenney
  2015-07-01 21:20                     ` josh
  2015-07-02  7:47                     ` Ingo Molnar
  0 siblings, 2 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 20:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 09:17:05AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> 
> > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > 
> > Really???
> > 
> > I am not concerned about this one.  After all, one of the first things
> > that people do for OS-jitter-sensitive workloads is to get rid of
> > binary blobs.  And any runtime use of ACPI as well.  And let's face it,
> > if your latency-sensitive workload is using either binary blobs or ACPI,
> > you have already completely lost.  Therefore, an additional expedited
> > grace period cannot possibly cause you to lose any more.
> 
> This isn't solely about rt etc.. this call is a generic facility used by
> however many consumers. A normal workstation/server could run into it at
> relatively high frequency depending on its workload.
> 
> Even on not latency sensitive workloads I think hammering all active
> CPUs is bad behaviour. Remember that a typical server class machine
> easily has more than 32 CPUs these days.

Well, that certainly is one reason for the funnel locking, sequence
counters, etc., keeping the overhead bounded despite large numbers
of CPUs.  So I don't believe that a non-RT/non-HPC workload is going
to notice.
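
The sequence-counter half of that batching is conceptually simple
(simplified sketch below, not the actual tree.c code): the counter is
even when no expedited grace period is in flight and odd while one is
running, each requester snapshots the first counter value that implies
a full grace period after the snapshot, and any requester whose
snapshot has already been passed simply returns rather than driving
yet another grace period.

	/* Simplified sketch of the snapshot/done pair. */
	static unsigned long exp_seq;	/* even: idle, odd: GP in flight */

	static unsigned long exp_seq_snap(void)
	{
		/* First value implying a full GP after this call. */
		return (READ_ONCE(exp_seq) + 3) & ~0x1UL;
	}

	static bool exp_seq_done(unsigned long snap)
	{
		return ULONG_CMP_GE(READ_ONCE(exp_seq), snap);
	}

So no matter how many CPUs pile in, only a bounded number of expedited
grace periods actually run, and the rest of the requesters just wait
for the counter to pass their snapshots.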

> > > Also, I'm not entirely convinced things like:
> > > 
> > > fd2ed4d25270 ("dm: add statistics support")
> > > 83d5e5b0af90 ("dm: optimize use SRCU and RCU")
> > > ef3230880abd ("backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu")
> > > 
> > > Are in the 'never' happens category. Esp. the backing-dev one, it
> > > triggers every time you unplug a USB stick or similar.
> > 
> > Which people should be assiduously avoiding for any sort of
> > industrial-control system, especially given things like STUXNET.
> 
> USB sure, but a backing dev is involved in nfs clients, loopback and all
> sorts of block/filesystem like setups.
> 
> unmount an NFS mount and voila expedited rcu, unmount a loopback, tada.
> 
> All you need is a regular server workload triggering any of that on a
> semi regular basis and even !rt people might start to notice something
> is up.

I don't believe that latency-sensitive systems are going to be messing
with remapping their storage at runtime, let alone on a regular basis.
If they are not latency sensitive, and assuming that the rate of
storage remapping is at all sane, I bet that they won't notice the
synchronize_rcu_expedited() overhead.  The overhead of the actual
remapping will very likely leave the synchronize_rcu_expedited() overhead
way down in the noise.

And if they are doing completely insane rates of storage remapping,
I suspect that the batching in the synchronize_rcu_expedited()
implementation will reduce the expedited-grace-period overhead still
further as a fraction of the total.

> > > Rejigging a DM might indeed be rare enough; but then again, people use
> > > DM explicitly so they can rejig while in operation.
> > 
> > They rejig DM when running OS-jitter-sensitive workloads?
> 
> Unlikely but who knows, I don't really know DM, so I can't even tell
> what would trigger these.

In my experience, the hard-core low-latency guys avoid doing pretty
much anything that isn't completely essential to the job at hand.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 10:03     ` Peter Zijlstra
  2015-07-01 13:42       ` Paul E. McKenney
@ 2015-07-01 20:59       ` Paul E. McKenney
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 20:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 12:03:52PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > @@ -2121,17 +2137,24 @@ static int __noreturn rcu_gp_kthread(void *arg)
> >  					       TPS("fqswait"));
> >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > -					((gf = READ_ONCE(rsp->gp_flags)) &
> > -					 RCU_GP_FLAG_FQS) ||
> > -					(!READ_ONCE(rnp->qsmask) &&
> > -					 !rcu_preempt_blocked_readers_cgp(rnp)),
> > -					j);
> > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > +				 RCU_GP_FLAG_FQS) ||
> > +				(!READ_ONCE(rnp->qsmask) &&
> > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > +						    rsp->gp_exp_snap),
> > +				j);
> 
> How about using a helper function there?
> 
> static inline bool rcu_gp_done(rsp, rnp)
> {
> 	/* Forced Quiescent State complete */
> 	if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS)
> 		return true;
> 
> 	/* QS not masked and not blocked by preempted readers */
> 	if (!READ_ONCE(rnp->qsmask) &&
> 	    !rcu_preempt_blocked_readers_cgp(rnp))
> 		return true;
> 
> 	/* Expedited Grace Period completed */
> 	if (rcu_exp_gp_seq_done(rsp))
> 		return true;
> 
> 	return false;
> }
> 
> 	ret = wait_event_interruptible_timeout(rsp->gp_wq,
> 		rcu_gp_done(rsp, rnp), j);

Fair point, and applies to the original as well, give or take comments
and naming.  Please see below for the corresponding patch.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit a8bb9abbf3690e507d61efe8144cec091ce5fb02
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Jul 1 13:50:28 2015 -0700

    rcu: Pull out wait_event*() condition into helper function
    
    The condition for the wait_event_interruptible_timeout() that waits
    to do the next force-quiescent-state scan is a bit ornate:
    
    	((gf = READ_ONCE(rsp->gp_flags)) &
    	 RCU_GP_FLAG_FQS) ||
    	(!READ_ONCE(rnp->qsmask) &&
    	 !rcu_preempt_blocked_readers_cgp(rnp))
    
    This commit therefore pulls this condition out into a helper function
    and comments its component conditions.
    
    Reported-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4992bfd360b6..2afb8e8c5134 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1904,6 +1904,26 @@ static int rcu_gp_init(struct rcu_state *rsp)
 }
 
 /*
+ * Helper function for wait_event_interruptible_timeout() wakeup
+ * at force-quiescent-state time.
+ */
+static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
+{
+	struct rcu_node *rnp = rcu_get_root(rsp);
+
+	/* Someone like call_rcu() requested a force-quiescent-state scan. */
+	*gfp = READ_ONCE(rsp->gp_flags);
+	if (*gfp & RCU_GP_FLAG_FQS)
+		return true;
+
+	/* The current grace period has completed. */
+	if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp))
+		return true;
+
+	return false;
+}
+
+/*
  * Do one round of quiescent-state forcing.
  */
 static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
@@ -2067,11 +2087,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
 			ret = wait_event_interruptible_timeout(rsp->gp_wq,
-					((gf = READ_ONCE(rsp->gp_flags)) &
-					 RCU_GP_FLAG_FQS) ||
-					(!READ_ONCE(rnp->qsmask) &&
-					 !rcu_preempt_blocked_readers_cgp(rnp)),
-					j);
+					rcu_gp_fqs_check_wake(rsp, &gf), j);
 			rsp->gp_state = RCU_GP_DONE_FQS;
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 20:09                   ` Paul E. McKenney
@ 2015-07-01 21:20                     ` josh
  2015-07-01 21:49                       ` Paul E. McKenney
  2015-07-02  7:47                     ` Ingo Molnar
  1 sibling, 1 reply; 57+ messages in thread
From: josh @ 2015-07-01 21:20 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 01:09:36PM -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> > USB sure, but a backing dev is involved in nfs clients, loopback and all
> > sorts of block/filesystem like setups.
> > 
> > unmount an NFS mount and voila expedited rcu, unmount a loopback, tada.
> > 
> > All you need is a regular server workload triggering any of that on a
> > semi regular basis and even !rt people might start to notice something
> > is up.
> 
> I don't believe that latency-sensitive systems are going to be messing
> with remapping their storage at runtime, let alone on a regular basis.
> If they are not latency sensitive, and assuming that the rate of
> storage remapping is at all sane, I bet that they won't notice the
> synchronize_rcu_expedited() overhead.  The overhead of the actual
> remapping will very likely leave the synchronize_rcu_expedited() overhead
> way down in the noise.
> 
> And if they are doing completely insane rates of storage remapping,
> I suspect that the batching in the synchronize_rcu_expedited()
> implementation will reduce the expedited-grace-period overhead still
> further as a fraction of the total.

Consider the case of container-based systems, calling mount as part of
container setup and umount as part of container teardown.

And those workloads are often sensitive to latency, not throughput.

- Josh Triplett

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 21:20                     ` josh
@ 2015-07-01 21:49                       ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-01 21:49 UTC (permalink / raw)
  To: josh
  Cc: Peter Zijlstra, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, Jul 01, 2015 at 02:20:01PM -0700, josh@joshtriplett.org wrote:
> On Wed, Jul 01, 2015 at 01:09:36PM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> > > USB sure, but a backing dev is involved in nfs clients, loopback and all
> > > sorts of block/filesystem like setups.
> > > 
> > > unmount an NFS mount and voila expedited rcu, unmount a loopback, tada.
> > > 
> > > All you need is a regular server workload triggering any of that on a
> > > semi regular basis and even !rt people might start to notice something
> > > is up.
> > 
> > I don't believe that latency-sensitive systems are going to be messing
> > with remapping their storage at runtime, let alone on a regular basis.
> > If they are not latency sensitive, and assuming that the rate of
> > storage remapping is at all sane, I bet that they won't notice the
> > synchronize_rcu_expedited() overhead.  The overhead of the actual
> > remapping will very likely leave the synchronize_rcu_expedited() overhead
> > way down in the noise.
> > 
> > And if they are doing completely insane rates of storage remapping,
> > I suspect that the batching in the synchronize_rcu_expedited()
> > implementation will reduce the expedited-grace-period overhead still
> > further as a fraction of the total.
> 
> Consider the case of container-based systems, calling mount as part of
> container setup and umount as part of container teardown.
> 
> And those workloads are often sensitive to latency, not throughput.

So people are really seeing a synchronize_rcu_expedited() on each
container setup/teardown right now?  Or is this something that could
happen if they were mounting block devices rather than rebind mounts?

And when you say that these workloads are sensitive to latency, I am
guessing that you mean to the millisecond-level latencies seen from
synchronize_rcu() as opposed to the microsecond-level OS jitter from
synchronize_rcu_expedited().  Or are there really containers workloads
that care about the few microseconds of OS jitter that would be incurred
due to expedited grace periods?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 16:17               ` Paul E. McKenney
  2015-07-01 17:02                 ` Peter Zijlstra
@ 2015-07-02  1:11                 ` Mike Galbraith
  2015-07-02  1:34                   ` josh
  1 sibling, 1 reply; 57+ messages in thread
From: Mike Galbraith @ 2015-07-02  1:11 UTC (permalink / raw)
  To: paulmck
  Cc: Peter Zijlstra, josh, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, 2015-07-01 at 09:17 -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:
> > 
> > > That is a bit extreme, Peter.
> > 
> > Of course; but I'm really not seeing people taking due care with them
> 
> ;-)
> 
> > > Are a huge pile of them coming in this merge window or something?
> > > What raised your concerns on this issue?
> > 
> > This is complete horse manure (breaking the nvidiot binary blob is a
> > good thing):
> > 
> > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> 
> Really???
> 
> I am not concerned about this one.  After all, one of the first things
> that people do for OS-jitter-sensitive workloads is to get rid of
> binary blobs.

I know two users who have no choice but to use the nvidia driver with
their realtime applications, as nouveau is not up to the task.

	-Mike


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  1:11                 ` Mike Galbraith
@ 2015-07-02  1:34                   ` josh
  2015-07-02  1:59                     ` Mike Galbraith
  0 siblings, 1 reply; 57+ messages in thread
From: josh @ 2015-07-02  1:34 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: paulmck, Peter Zijlstra, linux-kernel, mingo, laijs, dipankar,
	akpm, mathieu.desnoyers, tglx, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Thu, Jul 02, 2015 at 03:11:24AM +0200, Mike Galbraith wrote:
> On Wed, 2015-07-01 at 09:17 -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > > On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:
> > > 
> > > > That is a bit extreme, Peter.
> > > 
> > > Of course; but I'm really not seeing people taking due care with them
> > 
> > ;-)
> > 
> > > > Are a huge pile of them coming in this merge window or something?
> > > > What raised your concerns on this issue?
> > > 
> > > This is complete horse manure (breaking the nvidiot binary blob is a
> > > good thing):
> > > 
> > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > 
> > Really???
> > 
> > I am not concerned about this one.  After all, one of the first things
> > that people do for OS-jitter-sensitive workloads is to get rid of
> > binary blobs.
> 
> I know two users who have no choice but to use the nvidia driver with
> their realtime applications, as nouveau is not up to the task.

Sounds like they have a relatively loose definition of "realtime", then.

- Josh Triplett

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  1:34                   ` josh
@ 2015-07-02  1:59                     ` Mike Galbraith
  2015-07-02  2:18                       ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Mike Galbraith @ 2015-07-02  1:59 UTC (permalink / raw)
  To: josh
  Cc: paulmck, Peter Zijlstra, linux-kernel, mingo, laijs, dipankar,
	akpm, mathieu.desnoyers, tglx, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Wed, 2015-07-01 at 18:34 -0700, josh@joshtriplett.org wrote:
> On Thu, Jul 02, 2015 at 03:11:24AM +0200, Mike Galbraith wrote:
> > On Wed, 2015-07-01 at 09:17 -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > > > On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > > That is a bit extreme, Peter.
> > > > 
> > > > Of course; but I'm really not seeing people taking due care with them
> > > 
> > > ;-)
> > > 
> > > > > Are a huge pile of them coming in this merge window or something?
> > > > > What raised your concerns on this issue?
> > > > 
> > > > This is complete horse manure (breaking the nvidiot binary blob is a
> > > > good thing):
> > > > 
> > > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > > 
> > > Really???
> > > 
> > > I am not concerned about this one.  After all, one of the first things
> > > that people do for OS-jitter-sensitive workloads is to get rid of
> > > binary blobs.
> > 
> > I know two users who have no choice but to use the nvidia driver with
> > their realtime applications, as nouveau is not up to the task.
> 
> Sounds like they have a relatively loose definition of "realtime", then.

It would be better if they broke their beasts up into a bunch of small
synchronized boxen, but they use big boxen here and now, with realtime
rendering being a non-disposable portion of the load.

	-Mike


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  1:59                     ` Mike Galbraith
@ 2015-07-02  2:18                       ` Paul E. McKenney
  2015-07-02  2:50                         ` Mike Galbraith
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02  2:18 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: josh, Peter Zijlstra, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Thu, Jul 02, 2015 at 03:59:55AM +0200, Mike Galbraith wrote:
> On Wed, 2015-07-01 at 18:34 -0700, josh@joshtriplett.org wrote:
> > On Thu, Jul 02, 2015 at 03:11:24AM +0200, Mike Galbraith wrote:
> > > On Wed, 2015-07-01 at 09:17 -0700, Paul E. McKenney wrote:
> > > > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > > > > On Wed, Jul 01, 2015 at 07:00:31AM -0700, Paul E. McKenney wrote:
> > > > > 
> > > > > > That is a bit extreme, Peter.
> > > > > 
> > > > > Of course; but I'm really not seeing people taking due care with them
> > > > 
> > > > ;-)
> > > > 
> > > > > > Are a huge pile of them coming in this merge window or something?
> > > > > > What raised your concerns on this issue?
> > > > > 
> > > > > This is complete horse manure (breaking the nvidiot binary blob is a
> > > > > good thing):
> > > > > 
> > > > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > > > 
> > > > Really???
> > > > 
> > > > I am not concerned about this one.  After all, one of the first things
> > > > that people do for OS-jitter-sensitive workloads is to get rid of
> > > > binary blobs.
> > > 
> > > I know two users who have no choice but to use the nvidia driver with
> > > their realtime applications, as nouveau is not up to the task.
> > 
> > Sounds like they have a relatively loose definition of "realtime", then.
> 
> It would be better if they broke their beasts up into a bunch of small
> synchronized boxen, but they use big boxen here and now, with realtime
> rendering being a non-disposable portion of the load.

Does their normal workload trigger the condition that results in the
expedited grace period?  If so, do they use NO_HZ_FULL?  (I am guessing
"no".)  If not, is the resulting double-context-switch on their worker
CPUs a real problem for them?  (I am guessing "no" given that they have
not complained, at least not that I know of.)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  2:18                       ` Paul E. McKenney
@ 2015-07-02  2:50                         ` Mike Galbraith
  2015-07-02  3:15                           ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Mike Galbraith @ 2015-07-02  2:50 UTC (permalink / raw)
  To: paulmck
  Cc: josh, Peter Zijlstra, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Wed, 2015-07-01 at 19:18 -0700, Paul E. McKenney wrote:

> Does their normal workload trigger the condition that results in the
> expedited grace period?  If so, do they use NO_HZ_FULL?

They're having no trouble that I'm aware of, and I definitely would be
made aware.  They're not currently using a kernel with NO_HZ_FULL even
built in, as that stuff is far far too raw in the current kernel.

They may meet a NO_HZ_FULL enabled kernel soonish, but will be told to
not turn it on for anything other than pure compute, as NO_HZ_FULL is
currently not usable for mixed load in partitioned box, much less rt. 

	-Mike


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  2:50                         ` Mike Galbraith
@ 2015-07-02  3:15                           ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02  3:15 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: josh, Peter Zijlstra, linux-kernel, mingo, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Thu, Jul 02, 2015 at 04:50:45AM +0200, Mike Galbraith wrote:
> On Wed, 2015-07-01 at 19:18 -0700, Paul E. McKenney wrote:
> 
> > Does their normal workload trigger the condition that results in the
> > expedited grace period?  If so, do they use NO_HZ_FULL?
> 
> They're having no trouble that I'm aware of, and I definitely would be
> made aware.  They're not currently using a kernel with NO_HZ_FULL even
> built in, as that stuff is far far too raw in the current kernel.
> 
> They may meet a NO_HZ_FULL enabled kernel soonish, but will be told to
> not turn it on for anything other than pure compute, as NO_HZ_FULL is
> currently not usable for mixed load in partitioned box, much less rt. 

OK, sounds like synchronize_rcu_expedited() is a complete non-problem
for them, then.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-01 20:09                   ` Paul E. McKenney
  2015-07-01 21:20                     ` josh
@ 2015-07-02  7:47                     ` Ingo Molnar
  2015-07-02 13:58                       ` Paul E. McKenney
  1 sibling, 1 reply; 57+ messages in thread
From: Ingo Molnar @ 2015-07-02  7:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, josh, linux-kernel, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 01, 2015 at 09:17:05AM -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > 
> > > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > > 
> > > Really???
> > > 
> > > I am not concerned about this one.  After all, one of the first things that 
> > > people do for OS-jitter-sensitive workloads is to get rid of binary blobs.  
> > > And any runtime use of ACPI as well.  And let's face it, if your 
> > > latency-sensitive workload is using either binary blobs or ACPI, you have 
> > > already completely lost.  Therefore, an additional expedited grace period 
> > > cannot possibly cause you to lose any more.
> > 
> > This isn't solely about rt etc.. this call is a generic facility used by 
> > however many consumers. A normal workstation/server could run into it at 
> > relatively high frequency depending on its workload.
> > 
> > Even on not latency sensitive workloads I think hammering all active CPUs is 
> > bad behaviour. Remember that a typical server class machine easily has more 
> > than 32 CPUs these days.
> 
> Well, that certainly is one reason for the funnel locking, sequence counters, 
> etc., keeping the overhead bounded despite large numbers of CPUs.  So I don't 
> believe that a non-RT/non-HPC workload is going to notice.

So I think Peter's concern is that we should not be offering/promoting APIs that 
are easy to add, hard to remove/convert - especially if we _know_ they eventually 
have to be converted. That model does not scale, it piles up increasing amounts of 
crud.

Also, there will be a threshold over which it will be increasingly harder to make 
hard-rt promises, because so much seemingly mundane functionality will be using 
these APIs. The big plus of -rt is that it's out of the box hard RT - if people 
are able to control their environment carefully they can use RTAI or so. I.e. it 
directly cuts into the usability of Linux in certain segments.

Death by a thousand cuts and such.

And it's not like it's that hard to stem the flow of algorithmic sloppiness at the 
source, right?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-01 14:03           ` Paul E. McKenney
@ 2015-07-02 12:03             ` Peter Zijlstra
  2015-07-02 14:06               ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-02 12:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Wed, Jul 01, 2015 at 07:03:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 03:48:06PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 01, 2015 at 06:41:44AM -0700, Paul E. McKenney wrote:
> > > On Wed, Jul 01, 2015 at 12:05:21PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > > > >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> > > > >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > > > > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > > > > +				 RCU_GP_FLAG_FQS) ||
> > > > > +				(!READ_ONCE(rnp->qsmask) &&
> > > > > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > > > > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > > > > +						    rsp->gp_exp_snap),
> > > > > +				j);
> > > > >  			rsp->gp_state = RCU_GP_DONE_FQS;
> > > > 
> > > > How can the GP be done if we timed out or got interrupted?
> > > 
> > > If all the CPUs still blocking the grace period went idle, or in a
> > > NO_HZ_FULL kernel, entered nohz_full userspace execution.  Or, if
> > > certain low-probability races happen, went offline.
> > 
> > But what if none of those are true and we still timed out? You
> > unconditionally grant the GP.
> 
> Say what???
> 
> I recheck the conditions and break out of the loop only if one or more
> of the grace-period-end conditions is satisfied.  If not, I invoke
> rcu_gp_fqs() to do the scan to see if all remaining CPUs are idle,
> in nohz_full userspace execution, or offline.
> 
> What am I missing here?

The whole wait_event_interruptible_timeout() thing can end without @cond
being true, after which you unconditionally set ->gp_state =
RCU_GP_DONE_FQS.
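
For reference, wait_event_interruptible_timeout() is documented to return 0
when the timeout elapses with the condition still false, -ERESTARTSYS when a
signal arrives, and otherwise 1 or the remaining jiffies when the condition
is true.  A minimal caller-side sketch (wq, cond, and timeout are placeholders,
not the code being patched):

	long ret;

	ret = wait_event_interruptible_timeout(wq, cond, timeout);
	if (ret == 0) {
		/* Timeout elapsed and cond was still false. */
	} else if (ret == -ERESTARTSYS) {
		/* Interrupted by a signal; cond not necessarily true. */
	} else {
		/* cond was true; ret is 1 or the remaining jiffies. */
	}

So the macro itself can indeed complete with @cond false; the question is
what the surrounding loop does in that case.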

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02  7:47                     ` Ingo Molnar
@ 2015-07-02 13:58                       ` Paul E. McKenney
  2015-07-02 18:35                         ` Ingo Molnar
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02 13:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, josh, linux-kernel, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Thu, Jul 02, 2015 at 09:47:19AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> > > On Wed, Jul 01, 2015 at 09:17:05AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
> > > 
> > > > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> > > > 
> > > > Really???
> > > > 
> > > > I am not concerned about this one.  After all, one of the first things that 
> > > > people do for OS-jitter-sensitive workloads is to get rid of binary blobs.  
> > > > And any runtime use of ACPI as well.  And let's face it, if your 
> > > > latency-sensitive workload is using either binary blobs or ACPI, you have 
> > > > already completely lost.  Therefore, an additional expedited grace period 
> > > > cannot possibly cause you to lose any more.
> > > 
> > > This isn't solely about rt etc.; this call is a generic facility used by 
> > > however many consumers. A normal workstation/server could run into it at 
> > > relatively high frequency depending on its workload.
> > > 
> > > Even on non-latency-sensitive workloads I think hammering all active CPUs is 
> > > bad behaviour. Remember that a typical server class machine easily has more 
> > > than 32 CPUs these days.
> > 
> > Well, that certainly is one reason for the funnel locking, sequence counters, 
> > etc., keeping the overhead bounded despite large numbers of CPUs.  So I don't 
> > believe that a non-RT/non-HPC workload is going to notice.
> 
> So I think Peter's concern is that we should not be offering/promoting APIs that 
> are easy to add, hard to remove/convert - especially if we _know_ they eventually 
> have to be converted. That model does not scale, it piles up increasing amounts of 
> crud.
> 
> Also, there will be a threshold over which it will be increasingly harder to make 
> hard-rt promises, because so much seemingly mundane functionality will be using 
> these APIs. The big plus of -rt is that it's out of the box hard RT - if people 
> are able to control their environment carefully they can use RTAI or so. I.e. it 
> directly cuts into the usability of Linux in certain segments.
> 
> Death by a thousand cuts and such.
> 
> And it's not like it's that hard to stem the flow of algorithmic sloppiness at the 
> source, right?

OK, first let me make sure that I understand what you are asking for:

1.	Completely eliminate synchronize_rcu_expedited() and
	synchronize_sched_expedited(), replacing all uses with their
	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
	does not wake up CPUs, courtesy of its read-side memory barriers.)
	The fast-boot guys are probably going to complain, along with
	the networking guys.

2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
	but push back hard on any new uses and question any existing uses.

3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
	acpi_os_map_cleanup").

4.	Something else?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-02 12:03             ` Peter Zijlstra
@ 2015-07-02 14:06               ` Paul E. McKenney
  2015-07-02 16:48                 ` Peter Zijlstra
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02 14:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Thu, Jul 02, 2015 at 02:03:29PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 07:03:13AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 03:48:06PM +0200, Peter Zijlstra wrote:
> > > On Wed, Jul 01, 2015 at 06:41:44AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Jul 01, 2015 at 12:05:21PM +0200, Peter Zijlstra wrote:
> > > > > On Tue, Jun 30, 2015 at 02:48:27PM -0700, Paul E. McKenney wrote:
> > > > > >  			rsp->gp_state = RCU_GP_WAIT_FQS;
> > > > > >  			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> > > > > > +				((gf = READ_ONCE(rsp->gp_flags)) &
> > > > > > +				 RCU_GP_FLAG_FQS) ||
> > > > > > +				(!READ_ONCE(rnp->qsmask) &&
> > > > > > +				 !rcu_preempt_blocked_readers_cgp(rnp)) ||
> > > > > > +				rcu_exp_gp_seq_done(rsp->exp_rsp,
> > > > > > +						    rsp->gp_exp_snap),
> > > > > > +				j);
> > > > > >  			rsp->gp_state = RCU_GP_DONE_FQS;
> > > > > 
> > > > > How can the GP be done if we timed out or got interrupted?
> > > > 
> > > > If all the CPUs still blocking the grace period went idle, or in a
> > > > NO_HZ_FULL kernel, entered nohz_full userspace execution.  Or, if
> > > > certain low-probability races happen, went offline.
> > > 
> > > But what if none of those are true and we still timed out? You
> > > unconditionally grant the GP.
> > 
> > Say what???
> > 
> > I recheck the conditions and break out of the loop only if one or more
> > of the grace-period-end conditions is satisfied.  If not, I invoke
> > rcu_gp_fqs() to do the scan to see if all remaining CPUs are idle,
> > in nohz_full userspace execution, or offline.
> > 
> > What am I missing here?
> 
> The whole wait_event_interruptible_timeout() thing can end without @cond
> being true, after which you unconditionally set ->gp_state =
> RCU_GP_DONE_FQS.

Ah, but then if the grace period is not done, even after a scan forcing
quiescent states, we go back through the loop and set it back to
RCU_GP_WAIT_FQS.

Or is your point that RCU_GP_DONE_FQS is a bad name?  Perhaps I should
change it to something like RCU_GP_DOING_FQS.  Or am I still missing
something here?
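
In loop form, the flow being described is roughly the following (helper names
and argument lists are simplified for illustration; this is not the actual
tree.c code):

	for (;;) {
		rsp->gp_state = RCU_GP_WAIT_FQS;
		ret = wait_event_interruptible_timeout(rsp->gp_wq,
					fqs_wake_cond(rsp, rnp), j);

		/* Reached on wake-up, timeout, and signal alike. */
		rsp->gp_state = RCU_GP_DONE_FQS; /* renamed RCU_GP_DOING_FQS below */

		if (grace_period_done(rsp, rnp))
			break;		/* GP really is over. */

		rcu_gp_fqs(rsp);	/* scan for idle/nohz/offline CPUs */
		/* Not done yet: loop back and re-enter RCU_GP_WAIT_FQS. */
	}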

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-02 14:06               ` Paul E. McKenney
@ 2015-07-02 16:48                 ` Peter Zijlstra
  2015-07-02 19:35                   ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-02 16:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Thu, Jul 02, 2015 at 07:06:17AM -0700, Paul E. McKenney wrote:
> Or is your point that RCU_GP_DONE_FQS is a bad name?  Perhaps I should
> change it to something like RCU_GP_DOING_FQS.  Or am I still missing
> something here?

Yes, that might have been the root of my confusion.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02 13:58                       ` Paul E. McKenney
@ 2015-07-02 18:35                         ` Ingo Molnar
  2015-07-02 18:47                           ` Mathieu Desnoyers
  2015-07-02 19:22                           ` Paul E. McKenney
  0 siblings, 2 replies; 57+ messages in thread
From: Ingo Molnar @ 2015-07-02 18:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, josh, linux-kernel, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > And it's not like it's that hard to stem the flow of algorithmic sloppiness at 
> > the source, right?
> 
> OK, first let me make sure that I understand what you are asking for:
> 
> 1.	Completely eliminate synchronize_rcu_expedited() and
> 	synchronize_sched_expedited(), replacing all uses with their
> 	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
> 	does not wake up CPUs, courtesy of its read-side memory barriers.)
> 	The fast-boot guys are probably going to complain, along with
> 	the networking guys.
> 
> 2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
> 	but push back hard on any new uses and question any existing uses.
> 
> 3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
> 	acpi_os_map_cleanup").
> 
> 4.	Something else?

I'd love to have 1) but 2) would be a realistic second best option? ;-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02 18:35                         ` Ingo Molnar
@ 2015-07-02 18:47                           ` Mathieu Desnoyers
  2015-07-02 19:23                             ` Paul E. McKenney
  2015-07-02 19:22                           ` Paul E. McKenney
  1 sibling, 1 reply; 57+ messages in thread
From: Mathieu Desnoyers @ 2015-07-02 18:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Peter Zijlstra, josh, linux-kernel,
	Lai Jiangshan, dipankar, Andrew Morton, tglx, rostedt, dhowells,
	edumazet, dvhart, fweisbec, oleg, bobby prani

----- On Jul 2, 2015, at 2:35 PM, Ingo Molnar mingo@kernel.org wrote:

> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
>> > And it's not like it's that hard to stem the flow of algorithmic sloppiness at
>> > the source, right?
>> 
>> OK, first let me make sure that I understand what you are asking for:
>> 
>> 1.	Completely eliminate synchronize_rcu_expedited() and
>> 	synchronize_sched_expedited(), replacing all uses with their
>> 	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
>> 	does not wake up CPUs, courtesy of its read-side memory barriers.)
>> 	The fast-boot guys are probably going to complain, along with
>> 	the networking guys.
>> 
>> 2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
>> 	but push back hard on any new uses and question any existing uses.
>> 
>> 3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
>> 	acpi_os_map_cleanup").
>> 
>> 4.	Something else?
> 
> I'd love to have 1) but 2) would be a realistic second best option? ;-)

Perhaps triggering a printk warning if use of
synchronize_{rcu,sched}_expedited() goes beyond a certain rate might be
another option? If we detect that a caller calls it too often, we could
emit a printk warning with a stack trace. This should ensure everyone
is very careful about where they use it.
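
A rough sketch of one way such a check could look (the helper name, window,
and threshold below are made up for illustration, and this is not an actual
proposed patch):

	#define EXP_WARN_WINDOW	(10 * HZ)	/* arbitrary window */
	#define EXP_WARN_LIMIT	100		/* arbitrary per-window budget */

	static void expedited_rate_check(void)
	{
		static unsigned long window_start;
		static atomic_t calls = ATOMIC_INIT(0);

		if (time_after(jiffies, window_start + EXP_WARN_WINDOW)) {
			window_start = jiffies;	/* racy, but only diagnostic */
			atomic_set(&calls, 0);
		}
		if (atomic_inc_return(&calls) == EXP_WARN_LIMIT)
			WARN(1, "synchronize_*_expedited() called too frequently\n");
	}

WARN() already dumps a backtrace, which would give the stack-trace part of
the suggestion for free.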

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02 18:35                         ` Ingo Molnar
  2015-07-02 18:47                           ` Mathieu Desnoyers
@ 2015-07-02 19:22                           ` Paul E. McKenney
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02 19:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, josh, linux-kernel, laijs, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Thu, Jul 02, 2015 at 08:35:57PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > And it's not like it's that hard to stem the flow of algorithmic sloppiness at 
> > > the source, right?
> > 
> > OK, first let me make sure that I understand what you are asking for:
> > 
> > 1.	Completely eliminate synchronize_rcu_expedited() and
> > 	synchronize_sched_expedited(), replacing all uses with their
> > 	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
> > 	does not wake up CPUs, courtesy of its read-side memory barriers.)
> > 	The fast-boot guys are probably going to complain, along with
> > 	the networking guys.
> > 
> > 2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
> > 	but push back hard on any new uses and question any existing uses.
> > 
> > 3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
> > 	acpi_os_map_cleanup").
> > 
> > 4.	Something else?
> 
> I'd love to have 1) but 2) would be a realistic second best option? ;-)

OK, how about the following checkpatch.pl patch?

And here are some other actions I have taken and will take to improve
the situation, both for OS jitter and for scalability:

o	Reduce OS jitter by switching from try_stop_cpus() to
	stop_one_cpu_nowait(), courtesy of Peter Zijlstra.

	I expect to send this in v4.3 or v4.4, depending on how
	testing and review goes.

o	Eliminate expedited-grace-period-induced OS jitter on idle CPUs.
	This went into v3.19.  Note that this also avoids IPIing
	nohz_full CPUs.

o	I believe that I can reduce OS jitter by a further factor of two
	(worst case) or factor of five (average case), but I am still
	thinking about exactly how to do this.	(This also has the
	benefit of shutting up a lockdep false positive.)

o	There is a global counter that synchronize_sched_expedited()
	uses to determine when all the CPUs have passed through a
	quiescent state.  This is a scalability bottleneck on modest
	systems under heavy load, so I will be switching this to
	instead use the combining tree.

o	Because both synchronize_sched_expedited() and
	synchronize_rcu_expedited() can potentially wake up each and
	every CPU, on sufficiently large systems, they are quite slow.
	If this scalability problem ever becomes real, I intend to use
	multiple kthreads to do the wakeups on large systems.

Seem reasonable?

							Thanx, Paul

------------------------------------------------------------------------

scripts: Make checkpatch.pl warn on expedited RCU grace periods

The synchronize_rcu_expedited() and synchronize_sched_expedited()
expedited-grace-period primitives induce OS jitter, which can degrade
real-time response.  This commit therefore adds a checkpatch.pl warning
on any patch adding them.

Note that this patch does not warn on synchronize_srcu_expedited()
because it does not induce OS jitter, courtesy of its otherwise
much-maligned read-side memory barriers.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 89b1df4e72ab..ddd82d743bba 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -4898,6 +4898,12 @@ sub process {
 				     "memory barrier without comment\n" . $herecurr);
 			}
 		}
+# check for expedited grace periods that interrupt CPUs.
+# note that synchronize_srcu_expedited() does -not- do this, so no complaints.
+		if ($line =~ /\b(synchronize_rcu_expedited|synchronize_sched_expedited)\(/) {
+			WARN("EXPEDITED_RCU_GRACE_PERIOD",
+			     "expedited RCU grace periods should be avoided\n" . $herecurr);
+		}
 # check of hardware specific defines
 		if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) {
 			CHK("ARCH_DEFINES",


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02 18:47                           ` Mathieu Desnoyers
@ 2015-07-02 19:23                             ` Paul E. McKenney
  2015-07-02 21:07                               ` Mathieu Desnoyers
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02 19:23 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, josh, linux-kernel, Lai Jiangshan,
	dipankar, Andrew Morton, tglx, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby prani

On Thu, Jul 02, 2015 at 06:47:47PM +0000, Mathieu Desnoyers wrote:
> ----- On Jul 2, 2015, at 2:35 PM, Ingo Molnar mingo@kernel.org wrote:
> 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> >> > And it's not like it's that hard to stem the flow of algorithmic sloppiness at
> >> > the source, right?
> >> 
> >> OK, first let me make sure that I understand what you are asking for:
> >> 
> >> 1.	Completely eliminate synchronize_rcu_expedited() and
> >> 	synchronize_sched_expedited(), replacing all uses with their
> >> 	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
> >> 	does not wake up CPUs, courtesy of its read-side memory barriers.)
> >> 	The fast-boot guys are probably going to complain, along with
> >> 	the networking guys.
> >> 
> >> 2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
> >> 	but push back hard on any new uses and question any existing uses.
> >> 
> >> 3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
> >> 	acpi_os_map_cleanup").
> >> 
> >> 4.	Something else?
> > 
> > I'd love to have 1) but 2) would be a realistic second best option? ;-)
> 
> Perhaps triggering a printk warning if use of
> synchronize_{rcu,sched}_expedited() goes beyond a certain rate might be
> another option? If we detect that a caller calls it too often, we could
> emit a printk warning with a stack trace. This should ensure everyone
> is very careful about where they use it.

My first thought is that a storm of expedited grace periods would be
most likely to show up in some error condition, and having them
splat might obscure the splats identifying the real problem.  Or did
you have something else in mind here?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-02 16:48                 ` Peter Zijlstra
@ 2015-07-02 19:35                   ` Paul E. McKenney
  2015-07-06 14:52                     ` Peter Zijlstra
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2015-07-02 19:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Thu, Jul 02, 2015 at 06:48:16PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 02, 2015 at 07:06:17AM -0700, Paul E. McKenney wrote:
> > Or is your point that RCU_GP_DONE_FQS is a bad name?  Perhaps I should
> > change it to something like RCU_GP_DOING_FQS.  Or am I still missing
> > something here?
> 
> Yes, that might have been the root of my confusion.

OK, how about this, then?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS

The grace-period kthread sleeps waiting to do a force-quiescent-state
scan, and when awakened sets rsp->gp_state to RCU_GP_DONE_FQS.
However, this is confusing because the kthread has not done the
force-quiescent-state, but is instead just starting to do it.  This commit
therefore renames RCU_GP_DONE_FQS to RCU_GP_DOING_FQS in order to make
things a bit easier on reviewers.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2afb8e8c5134..e1d9909fd59d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2088,7 +2088,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 			rsp->gp_state = RCU_GP_WAIT_FQS;
 			ret = wait_event_interruptible_timeout(rsp->gp_wq,
 					rcu_gp_fqs_check_wake(rsp, &gf), j);
-			rsp->gp_state = RCU_GP_DONE_FQS;
+			rsp->gp_state = RCU_GP_DOING_FQS;
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */
 			if (!READ_ONCE(rnp->qsmask) &&
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index cb402b5c2e71..852b810df38e 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -533,7 +533,7 @@ struct rcu_state {
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
 #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
 #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
-#define RCU_GP_DONE_FQS  4	/* Wait done for force-quiescent-state time. */
+#define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
 #define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
 #define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
 


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones
  2015-07-02 19:23                             ` Paul E. McKenney
@ 2015-07-02 21:07                               ` Mathieu Desnoyers
  0 siblings, 0 replies; 57+ messages in thread
From: Mathieu Desnoyers @ 2015-07-02 21:07 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Peter Zijlstra, josh, linux-kernel, Lai Jiangshan,
	dipankar, Andrew Morton, tglx, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby prani

----- On Jul 2, 2015, at 3:23 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:

> On Thu, Jul 02, 2015 at 06:47:47PM +0000, Mathieu Desnoyers wrote:
>> ----- On Jul 2, 2015, at 2:35 PM, Ingo Molnar mingo@kernel.org wrote:
>> 
>> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>> > 
>> >> > And it's not like it's that hard to stem the flow of algorithmic sloppiness at
>> >> > the source, right?
>> >> 
>> >> OK, first let me make sure that I understand what you are asking for:
>> >> 
>> >> 1.	Completely eliminate synchronize_rcu_expedited() and
>> >> 	synchronize_sched_expedited(), replacing all uses with their
>> >> 	unexpedited counterparts.  (Note that synchronize_srcu_expedited()
>> >> 	does not wake up CPUs, courtesy of its read-side memory barriers.)
>> >> 	The fast-boot guys are probably going to complain, along with
>> >> 	the networking guys.
>> >> 
>> >> 2.	Keep synchronize_rcu_expedited() and synchronize_sched_expedited(),
>> >> 	but push back hard on any new uses and question any existing uses.
>> >> 
>> >> 3.	Revert 74b51ee152b6 ("ACPI / osl: speedup grace period in
>> >> 	acpi_os_map_cleanup").
>> >> 
>> >> 4.	Something else?
>> > 
>> > I'd love to have 1) but 2) would be a realistic second best option? ;-)
>> 
>> Perhaps triggering a printk warning if use of
>> synchronize_{rcu,sched}_expedited() goes beyond a certain rate might be
>> another option? If we detect that a caller calls it too often, we could
>> emit a printk warning with a stack trace. This should ensure everyone
>> is very careful about where they use it.
> 
> My first thought is that a storm of expedited grace periods would be
> most likely to show up in some error condition, and having them
> splat might obscure the splats identifying the real problem.  Or did
> you have something else in mind here?

Fair point! So I guess your checkpatch approach is more appropriate.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs
  2015-07-02 19:35                   ` Paul E. McKenney
@ 2015-07-06 14:52                     ` Peter Zijlstra
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Zijlstra @ 2015-07-06 14:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, tglx, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
	bobby.prani

On Thu, Jul 02, 2015 at 12:35:45PM -0700, Paul E. McKenney wrote:
> OK, how about this, then?

> rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS
> 
> The grace-period kthread sleeps waiting to do a force-quiescent-state
> scan, and when awakened sets rsp->gp_state to RCU_GP_DONE_FQS.
> However, this is confusing because the kthread has not done the
> force-quiescent-state, but is instead just starting to do it.  This commit
> therefore renames RCU_GP_DONE_FQS to RCU_GP_DOING_FQS in order to make
> things a bit easier on reviewers.
> 
> Reported-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Much better, thanks! :-)

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2015-07-06 14:53 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-30 21:48 [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones Paul E. McKenney
2015-06-30 21:48 ` [PATCH RFC tip/core/rcu 1/5] rcu: Prepare for expedited GP driving normal GP Paul E. McKenney
2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 2/5] rcu: Short-circuit normal GPs via expedited GPs Paul E. McKenney
2015-07-01 10:03     ` Peter Zijlstra
2015-07-01 13:42       ` Paul E. McKenney
2015-07-01 20:59       ` Paul E. McKenney
2015-07-01 10:05     ` Peter Zijlstra
2015-07-01 13:41       ` Paul E. McKenney
2015-07-01 13:48         ` Peter Zijlstra
2015-07-01 14:03           ` Paul E. McKenney
2015-07-02 12:03             ` Peter Zijlstra
2015-07-02 14:06               ` Paul E. McKenney
2015-07-02 16:48                 ` Peter Zijlstra
2015-07-02 19:35                   ` Paul E. McKenney
2015-07-06 14:52                     ` Peter Zijlstra
2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 3/5] rcutorture: Ensure that normal GPs advance without " Paul E. McKenney
2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 4/5] rcu: Wake grace-period kthread at end of expedited grace period Paul E. McKenney
2015-06-30 21:48   ` [PATCH RFC tip/core/rcu 5/5] rcu: Limit expedited helping to every 10 ms or every 4th GP Paul E. McKenney
2015-06-30 21:56     ` Eric Dumazet
2015-06-30 22:10       ` Paul E. McKenney
2015-07-01 10:07     ` Peter Zijlstra
2015-07-01 13:45       ` Paul E. McKenney
2015-07-01 19:30       ` Paul E. McKenney
2015-06-30 22:00 ` [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones josh
2015-06-30 22:12   ` Paul E. McKenney
2015-06-30 23:46     ` josh
2015-07-01  0:15       ` Paul E. McKenney
2015-07-01  0:42         ` Josh Triplett
2015-07-01  3:37           ` Paul E. McKenney
2015-07-01 10:12             ` Peter Zijlstra
2015-07-01 14:01               ` Paul E. McKenney
2015-07-01 14:08                 ` Eric Dumazet
2015-07-01 15:58                   ` Paul E. McKenney
2015-07-01 15:43             ` Josh Triplett
2015-07-01 15:59               ` Paul E. McKenney
2015-07-01 10:09       ` Peter Zijlstra
2015-07-01 10:55         ` Peter Zijlstra
2015-07-01 14:00           ` Paul E. McKenney
2015-07-01 14:17             ` Peter Zijlstra
2015-07-01 16:17               ` Paul E. McKenney
2015-07-01 17:02                 ` Peter Zijlstra
2015-07-01 20:09                   ` Paul E. McKenney
2015-07-01 21:20                     ` josh
2015-07-01 21:49                       ` Paul E. McKenney
2015-07-02  7:47                     ` Ingo Molnar
2015-07-02 13:58                       ` Paul E. McKenney
2015-07-02 18:35                         ` Ingo Molnar
2015-07-02 18:47                           ` Mathieu Desnoyers
2015-07-02 19:23                             ` Paul E. McKenney
2015-07-02 21:07                               ` Mathieu Desnoyers
2015-07-02 19:22                           ` Paul E. McKenney
2015-07-02  1:11                 ` Mike Galbraith
2015-07-02  1:34                   ` josh
2015-07-02  1:59                     ` Mike Galbraith
2015-07-02  2:18                       ` Paul E. McKenney
2015-07-02  2:50                         ` Mike Galbraith
2015-07-02  3:15                           ` Paul E. McKenney
