* [PATCH tip/core/rcu 01/27] rcu: Make rcu_read_unlock_special() static
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 02/27] rcu: Improve rcu_note_voluntary_context_switch() reporting Paul E. McKenney
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

Because rcu_read_unlock_special() is no longer used outside of
kernel/rcu/tree_plugin.h, this commit makes it static.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h | 1 -
 kernel/rcu/tree_plugin.h | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 65163aa0bb04..67ec077c7ee5 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -64,7 +64,6 @@ void rcu_barrier_tasks(void);
 
 void __rcu_read_lock(void);
 void __rcu_read_unlock(void);
-void rcu_read_unlock_special(struct task_struct *t);
 void synchronize_rcu(void);
 
 /*
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 613372246a07..54a251640f53 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -127,6 +127,7 @@ static struct rcu_data __percpu *const rcu_data_p = &rcu_preempt_data;
 
 static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
 			       bool wake);
+static void rcu_read_unlock_special(struct task_struct *t);
 
 /*
  * Tell them what RCU they are running.
@@ -461,7 +462,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp)
  * notify RCU core processing or task having blocked during the RCU
  * read-side critical section.
  */
-void rcu_read_unlock_special(struct task_struct *t)
+static void rcu_read_unlock_special(struct task_struct *t)
 {
 	bool empty_exp;
 	bool empty_norm;
-- 
2.17.1


* [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19
@ 2018-06-26  0:34 Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 01/27] rcu: Make rcu_read_unlock_special() static Paul E. McKenney
                   ` (26 more replies)
  0 siblings, 27 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel

Hello!

This series contains miscellaneous fixes that are sequenced after the
infamous ->gp_seq conversion:

1.	Make rcu_read_unlock_special() static.

2.	Improve rcu_note_voluntary_context_switch() reporting, courtesy of
	Byungchul Park.

3.	Get rid of Sphinx warnings at rcu_pointer_handoff(), courtesy of
	Mauro Carvalho Chehab.

4.	Use pr_fmt to prefix "rcu: " to logging output, courtesy of
	Joe Perches.

5.	Improve RCU-tasks naming and comments.

6.	Mark task as .need_qs less aggressively.

7-14.	Inline, remove "inline", remove __maybe_unused, remove unused
	variables, remove unused functions, and fix comments.

15.	Use RCU CPU stall timeout for rcu_check_gp_start_stall().

16.	Add comment documenting how rcu_seq_snap() works, courtesy of
	Joel Fernandes.

17.	Speed up calling of RCU tasks callbacks, courtesy of Steven Rostedt.

18.	Add comment to the last sleep in the rcu tasks loop, also courtesy
	of Steven Rostedt.

19.	Add diagnostics for rcutorture writer stall warning.

20.	Check the range of jiffies_till_{first,next}_fqs when setting them,
	courtesy of Byungchul Park.

21.	Update synchronize_rcu() definition in whatisRCU.txt, courtesy
	of Andrea Parri.

22.	Make rcu_seq_diff() more exact, which in turn allows more exact
	rcutorture diagnostics of busted RCU implementations.

23.	Update RCU, SRCU, and TORTURE-TEST entries in MAINTAINERS file.

24.	Print stall-warning NMI dyntick state in hexadecimal.

25.	Add grace-period number to rcutorture statistics printout.

26.	Improve documentation for list_for_each_entry_from_rcu(), courtesy
	of NeilBrown.

27.	Assign higher prio to RCU threads if rcutorture is built-in,
	courtesy of Joel Fernandes.

							Thanx, Paul

------------------------------------------------------------------------

 Documentation/RCU/whatisRCU.txt |   18 +++--
 MAINTAINERS                     |    9 +-
 include/linux/rculist.h         |   19 +++++
 include/linux/rcupdate.h        |   22 +++---
 include/linux/rcutiny.h         |    2 
 kernel/rcu/rcu.h                |   49 ++++++++++++++
 kernel/rcu/rcuperf.c            |    9 +-
 kernel/rcu/rcutorture.c         |    6 -
 kernel/rcu/srcutree.c           |    8 +-
 kernel/rcu/tiny.c               |    4 -
 kernel/rcu/tree.c               |  132 +++++++++++++++++++++++++---------------
 kernel/rcu/tree.h               |    1 
 kernel/rcu/tree_plugin.h        |   55 ++++++----------
 kernel/rcu/update.c             |   45 +++++++++----
 14 files changed, 245 insertions(+), 134 deletions(-)


* [PATCH tip/core/rcu 02/27] rcu: Improve rcu_note_voluntary_context_switch() reporting
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 01/27] rcu: Make rcu_read_unlock_special() static Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 03/27] rcu: rcupdate.h: Get rid of Sphinx warnings at rcu_pointer_handoff() Paul E. McKenney
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Byungchul Park, Paul E. McKenney

From: Byungchul Park <byungchul.park@lge.com>

We expect a TASKS_RCU quiescent state whenever cond_resched_tasks_rcu_qs()
is called, regardless of whether a context switch actually takes place.
However, the quiescent state is currently not reported when the task
enters __schedule(), because that path is invoked with preempt = true.
This commit therefore reports the quiescent state unconditionally
whenever cond_resched_tasks_rcu_qs() is called.

In addition, in TINY_RCU, an rcu_bh quiescent state should also be
reported when the tick interrupt arrives from userspace, but currently
is not.  This commit makes it be reported.

Lastly, in TREE_RCU, rcu_note_voluntary_context_switch() should be
invoked when the tick interrupt arrives not only from userspace but
also from idle, as both are extended quiescent states.
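
To illustrate the hole being closed, here is a sketch (not part of the
patch) of the old behavior:

	/*
	 * Old definition, for reference: the quiescent state was
	 * reported only when cond_resched() did NOT reschedule.
	 */
	#define cond_resched_tasks_rcu_qs() \
	do { \
		if (!cond_resched()) \
			rcu_note_voluntary_context_switch_lite(current); \
	} while (0)

	/*
	 * When cond_resched() does reschedule, __schedule() is entered
	 * with preempt == true, so the TASKS_RCU quiescent state goes
	 * unreported precisely when a context switch occurs.  The new
	 * definition reports it before calling cond_resched().
	 */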

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Simplify rcutiny portion given no RCU-tasks for !PREEMPT. ]
---
 include/linux/rcupdate.h | 4 ++--
 kernel/rcu/tiny.c        | 4 +---
 kernel/rcu/tree.c        | 4 ++--
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 67ec077c7ee5..5cab15e7ec83 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -194,8 +194,8 @@ static inline void exit_tasks_rcu_finish(void) { }
  */
 #define cond_resched_tasks_rcu_qs() \
 do { \
-	if (!cond_resched()) \
-		rcu_note_voluntary_context_switch_lite(current); \
+	rcu_note_voluntary_context_switch_lite(current); \
+	cond_resched(); \
 } while (0)
 
 /*
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index a64eee0db39e..befc9321a89c 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -122,10 +122,8 @@ void rcu_check_callbacks(int user)
 {
 	if (user)
 		rcu_sched_qs();
-	else if (!in_softirq())
+	if (user || !in_softirq())
 		rcu_bh_qs();
-	if (user)
-		rcu_note_voluntary_context_switch(current);
 }
 
 /*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4b0aee4f9318..50a8f49e1763 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2645,6 +2645,7 @@ void rcu_check_callbacks(int user)
 
 		rcu_sched_qs();
 		rcu_bh_qs();
+		rcu_note_voluntary_context_switch(current);
 
 	} else if (!in_softirq()) {
 
@@ -2660,8 +2661,7 @@ void rcu_check_callbacks(int user)
 	rcu_preempt_check_callbacks();
 	if (rcu_pending())
 		invoke_rcu_core();
-	if (user)
-		rcu_note_voluntary_context_switch(current);
+
 	trace_rcu_utilization(TPS("End scheduler-tick"));
 }
 
-- 
2.17.1


* [PATCH tip/core/rcu 03/27] rcu: rcupdate.h: Get rid of Sphinx warnings at rcu_pointer_handoff()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 01/27] rcu: Make rcu_read_unlock_special() static Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 02/27] rcu: Improve rcu_note_voluntary_context_switch() reporting Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 04/27] rcu: Use pr_fmt to prefix "rcu: " to logging output Paul E. McKenney
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Mauro Carvalho Chehab, Paul E. McKenney

From: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

The code example at rcupdate.h currently produces lots of warnings:

	./include/linux/rcupdate.h:572: WARNING: Unexpected indentation.
	./include/linux/rcupdate.h:576: WARNING: Unexpected indentation.
	./include/linux/rcupdate.h:580: WARNING: Block quote ends without a blank line; unexpected unindent.
	./include/linux/rcupdate.h:582: WARNING: Block quote ends without a blank line; unexpected unindent.
	./include/linux/rcupdate.h:582: WARNING: Inline literal start-string without end-string.

This commit therefore changes it to a code-block.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 5cab15e7ec83..dacc90358b33 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -566,8 +566,8 @@ static inline void rcu_preempt_sleep_check(void) { }
  * This is simply an identity function, but it documents where a pointer
  * is handed off from RCU to some other synchronization mechanism, for
  * example, reference counting or locking.  In C11, it would map to
- * kill_dependency().  It could be used as follows:
- * ``
+ * kill_dependency().  It could be used as follows::
+ *
  *	rcu_read_lock();
  *	p = rcu_dereference(gp);
  *	long_lived = is_long_lived(p);
@@ -578,7 +578,6 @@ static inline void rcu_preempt_sleep_check(void) { }
  *			p = rcu_pointer_handoff(p);
  *	}
  *	rcu_read_unlock();
- *``
  */
 #define rcu_pointer_handoff(p) (p)
 
-- 
2.17.1


* [PATCH tip/core/rcu 04/27] rcu: Use pr_fmt to prefix "rcu: " to logging output
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 03/27] rcu: rcupdate.h: Get rid of Sphinx warnings at rcu_pointer_handoff() Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 05/27] rcu: Improve RCU-tasks naming and comments Paul E. McKenney
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Joe Perches, Paul E. McKenney

From: Joe Perches <joe@perches.com>

This commit uses pr_fmt() to prefix "rcu: " to logging output, easing
identification of RCU console messages.  It also adjusts some whitespace
while in the area.
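
For readers unfamiliar with the mechanism, here is a minimal sketch (not
part of the patch; the function is invented for illustration) of how
pr_fmt() takes effect:

	/* Define before the includes; pr_info() and friends expand
	 * pr_fmt(fmt) at each call site, so every message in the file
	 * gains the prefix. */
	#define pr_fmt(fmt) "rcu: " fmt

	#include <linux/printk.h>

	static void example(void)
	{
		pr_info("Adjusting geometry\n");
		/* console output: "rcu: Adjusting geometry" */
	}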

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Revert string-breaking %s as requested by Andy Shevchenko. ]
---
 kernel/rcu/rcuperf.c     |  7 +++----
 kernel/rcu/rcutorture.c  |  4 ++--
 kernel/rcu/srcutree.c    |  5 ++++-
 kernel/rcu/tree.c        |  8 +++++---
 kernel/rcu/tree_plugin.h | 10 ++++++----
 5 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index b080bc4a4f45..00e395c0d7d0 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -680,12 +680,11 @@ rcu_perf_init(void)
 			break;
 	}
 	if (i == ARRAY_SIZE(perf_ops)) {
-		pr_alert("rcu-perf: invalid perf type: \"%s\"\n",
-			 perf_type);
+		pr_alert("rcu-perf: invalid perf type: \"%s\"\n", perf_type);
 		pr_alert("rcu-perf types:");
 		for (i = 0; i < ARRAY_SIZE(perf_ops); i++)
-			pr_alert(" %s", perf_ops[i]->name);
-		pr_alert("\n");
+			pr_cont(" %s", perf_ops[i]->name);
+		pr_cont("\n");
 		firsterr = -EINVAL;
 		goto unwind;
 	}
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index eb6d4915b4e6..773ce12f9582 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1755,8 +1755,8 @@ rcu_torture_init(void)
 			 torture_type);
 		pr_alert("rcu-torture types:");
 		for (i = 0; i < ARRAY_SIZE(torture_ops); i++)
-			pr_alert(" %s", torture_ops[i]->name);
-		pr_alert("\n");
+			pr_cont(" %s", torture_ops[i]->name);
+		pr_cont("\n");
 		firsterr = -EINVAL;
 		goto unwind;
 	}
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d6d6ea9738c0..e526b56998af 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -26,6 +26,8 @@
  *
  */
 
+#define pr_fmt(fmt) "rcu: " fmt
+
 #include <linux/export.h>
 #include <linux/mutex.h>
 #include <linux/percpu.h>
@@ -390,7 +392,8 @@ void _cleanup_srcu_struct(struct srcu_struct *sp, bool quiesced)
 		}
 	if (WARN_ON(rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
 	    WARN_ON(srcu_readers_active(sp))) {
-		pr_info("%s: Active srcu_struct %p state: %d\n", __func__, sp, rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)));
+		pr_info("%s: Active srcu_struct %p state: %d\n",
+			__func__, sp, rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)));
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
 	free_percpu(sp->sda);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50a8f49e1763..fcb0355a8084 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -27,6 +27,9 @@
  * For detailed explanation of Read-Copy Update mechanism see -
  *	Documentation/RCU
  */
+
+#define pr_fmt(fmt) "rcu: " fmt
+
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
@@ -1374,8 +1377,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq)
 	 * See Documentation/RCU/stallwarn.txt for info on how to debug
 	 * RCU CPU stall warnings.
 	 */
-	pr_err("INFO: %s detected stalls on CPUs/tasks:",
-	       rsp->name);
+	pr_err("INFO: %s detected stalls on CPUs/tasks:", rsp->name);
 	print_cpu_stall_info_begin();
 	rcu_for_each_leaf_node(rsp, rnp) {
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -4046,7 +4048,7 @@ static void __init rcu_init_geometry(void)
 	if (rcu_fanout_leaf == RCU_FANOUT_LEAF &&
 	    nr_cpu_ids == NR_CPUS)
 		return;
-	pr_info("RCU: Adjusting geometry for rcu_fanout_leaf=%d, nr_cpu_ids=%u\n",
+	pr_info("Adjusting geometry for rcu_fanout_leaf=%d, nr_cpu_ids=%u\n",
 		rcu_fanout_leaf, nr_cpu_ids);
 
 	/*
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 54a251640f53..dbfe90191e19 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -74,8 +74,8 @@ static void __init rcu_bootup_announce_oddness(void)
 		pr_info("\tRCU event tracing is enabled.\n");
 	if ((IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 64) ||
 	    (!IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 32))
-		pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
-		       RCU_FANOUT);
+		pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d.\n",
+			RCU_FANOUT);
 	if (rcu_fanout_exact)
 		pr_info("\tHierarchical RCU autobalancing is disabled.\n");
 	if (IS_ENABLED(CONFIG_RCU_FAST_NO_HZ))
@@ -88,11 +88,13 @@ static void __init rcu_bootup_announce_oddness(void)
 		pr_info("\tBuild-time adjustment of leaf fanout to %d.\n",
 			RCU_FANOUT_LEAF);
 	if (rcu_fanout_leaf != RCU_FANOUT_LEAF)
-		pr_info("\tBoot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
+		pr_info("\tBoot-time adjustment of leaf fanout to %d.\n",
+			rcu_fanout_leaf);
 	if (nr_cpu_ids != NR_CPUS)
 		pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%u.\n", NR_CPUS, nr_cpu_ids);
 #ifdef CONFIG_RCU_BOOST
-	pr_info("\tRCU priority boosting: priority %d delay %d ms.\n", kthread_prio, CONFIG_RCU_BOOST_DELAY);
+	pr_info("\tRCU priority boosting: priority %d delay %d ms.\n",
+		kthread_prio, CONFIG_RCU_BOOST_DELAY);
 #endif
 	if (blimit != DEFAULT_RCU_BLIMIT)
 		pr_info("\tBoot-time adjustment of callback invocation limit to %ld.\n", blimit);
-- 
2.17.1


* [PATCH tip/core/rcu 05/27] rcu: Improve RCU-tasks naming and comments
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 04/27] rcu: Use pr_fmt to prefix "rcu: " to logging output Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively Paul E. McKenney
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The naming and comments associated with some RCU-tasks code make
the faulty assumption that context switches due to cond_resched()
are voluntary.  As several people pointed out, this is not the case.
This commit therefore updates function names and comments to better
reflect current reality.

Reported-by: Byungchul Park <byungchul.park@lge.com>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h | 12 ++++++------
 include/linux/rcutiny.h  |  2 +-
 kernel/rcu/tree.c        |  2 +-
 kernel/rcu/update.c      | 27 ++++++++++++++-------------
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index dacc90358b33..75e5b393cf44 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -158,11 +158,11 @@ static inline void rcu_init_nohz(void) { }
 	} while (0)
 
 /*
- * Note a voluntary context switch for RCU-tasks benefit.  This is a
- * macro rather than an inline function to avoid #include hell.
+ * Note a quasi-voluntary context switch for RCU-tasks's benefit.
+ * This is a macro rather than an inline function to avoid #include hell.
  */
 #ifdef CONFIG_TASKS_RCU
-#define rcu_note_voluntary_context_switch_lite(t) \
+#define rcu_tasks_qs(t) \
 	do { \
 		if (READ_ONCE((t)->rcu_tasks_holdout)) \
 			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
@@ -170,14 +170,14 @@ static inline void rcu_init_nohz(void) { }
 #define rcu_note_voluntary_context_switch(t) \
 	do { \
 		rcu_all_qs(); \
-		rcu_note_voluntary_context_switch_lite(t); \
+		rcu_tasks_qs(t); \
 	} while (0)
 void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
 void synchronize_rcu_tasks(void);
 void exit_tasks_rcu_start(void);
 void exit_tasks_rcu_finish(void);
 #else /* #ifdef CONFIG_TASKS_RCU */
-#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
+#define rcu_tasks_qs(t)	do { } while (0)
 #define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
 #define call_rcu_tasks call_rcu_sched
 #define synchronize_rcu_tasks synchronize_sched
@@ -194,7 +194,7 @@ static inline void exit_tasks_rcu_finish(void) { }
  */
 #define cond_resched_tasks_rcu_qs() \
 do { \
-	rcu_note_voluntary_context_switch_lite(current); \
+	rcu_tasks_qs(current); \
 	cond_resched(); \
 } while (0)
 
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7b3c82e8a625..8d9a0ea8f0b5 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -93,7 +93,7 @@ static inline void kfree_call_rcu(struct rcu_head *head,
 #define rcu_note_context_switch(preempt) \
 	do { \
 		rcu_sched_qs(); \
-		rcu_note_voluntary_context_switch_lite(current); \
+		rcu_tasks_qs(current); \
 	} while (0)
 
 static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index fcb0355a8084..1ea971244512 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -457,7 +457,7 @@ void rcu_note_context_switch(bool preempt)
 		rcu_momentary_dyntick_idle();
 	this_cpu_inc(rcu_dynticks.rcu_qs_ctr);
 	if (!preempt)
-		rcu_note_voluntary_context_switch_lite(current);
+		rcu_tasks_qs(current);
 out:
 	trace_rcu_utilization(TPS("End context switch"));
 	barrier(); /* Avoid RCU read-side critical sections leaking up. */
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c230a60ece4..5783bdf86e5a 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -507,14 +507,15 @@ early_initcall(check_cpu_stall_init);
 #ifdef CONFIG_TASKS_RCU
 
 /*
- * Simple variant of RCU whose quiescent states are voluntary context switch,
- * user-space execution, and idle.  As such, grace periods can take one good
- * long time.  There are no read-side primitives similar to rcu_read_lock()
- * and rcu_read_unlock() because this implementation is intended to get
- * the system into a safe state for some of the manipulations involved in
- * tracing and the like.  Finally, this implementation does not support
- * high call_rcu_tasks() rates from multiple CPUs.  If this is required,
- * per-CPU callback lists will be needed.
+ * Simple variant of RCU whose quiescent states are voluntary context
+ * switch, cond_resched_rcu_qs(), user-space execution, and idle.
+ * As such, grace periods can take one good long time.  There are no
+ * read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
+ * because this implementation is intended to get the system into a safe
+ * state for some of the manipulations involved in tracing and the like.
+ * Finally, this implementation does not support high call_rcu_tasks()
+ * rates from multiple CPUs.  If this is required, per-CPU callback lists
+ * will be needed.
  */
 
 /* Global list of callbacks and associated lock. */
@@ -542,11 +543,11 @@ static struct task_struct *rcu_tasks_kthread_ptr;
  * period elapses, in other words after all currently executing RCU
  * read-side critical sections have completed. call_rcu_tasks() assumes
  * that the read-side critical sections end at a voluntary context
- * switch (not a preemption!), entry into idle, or transition to usermode
- * execution.  As such, there are no read-side primitives analogous to
- * rcu_read_lock() and rcu_read_unlock() because this primitive is intended
- * to determine that all tasks have passed through a safe state, not so
- * much for data-strcuture synchronization.
+ * switch (not a preemption!), cond_resched_rcu_qs(), entry into idle,
+ * or transition to usermode execution.  As such, there are no read-side
+ * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
+ * this primitive is intended to determine that all tasks have passed
+ * through a safe state, not so much for data-strcuture synchronization.
  *
  * See the description of call_rcu() for more detailed information on
  * memory ordering guarantees.
-- 
2.17.1


* [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 05/27] rcu: Improve RCU-tasks naming and comments Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26 17:08   ` Peter Zijlstra
  2018-06-26  0:34 ` [PATCH tip/core/rcu 07/27] rcu: Inline rcu_dynticks_momentary_idle() into its sole caller Paul E. McKenney
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

If any scheduling-clock interrupt interrupts an RCU-preempt read-side
critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
field is set.  This causes the outermost rcu_read_unlock() to incur the
extra overhead of calling into rcu_read_unlock_special().  This commit
reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
if the grace period has been in effect for more than one second.
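
As background (a sketch, not part of the patch), the one-second test in
the diff below relies on the kernel's wrap-safe jiffies comparison:

	/*
	 * time_after(a, b) is true iff time a is after time b; it is
	 * implemented roughly as ((long)((b) - (a)) < 0), which stays
	 * correct across jiffies wraparound.  With HZ jiffies per
	 * second, the new condition
	 *
	 *	time_after(jiffies, rsp->gp_start + HZ)
	 *
	 * holds only once the current grace period has been running
	 * for more than one second.
	 */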

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree_plugin.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index dbfe90191e19..0239cf8a4be6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -730,6 +730,7 @@ rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp)
  */
 static void rcu_preempt_check_callbacks(void)
 {
+	struct rcu_state *rsp = &rcu_preempt_state;
 	struct task_struct *t = current;
 
 	if (t->rcu_read_lock_nesting == 0) {
@@ -738,7 +739,9 @@ static void rcu_preempt_check_callbacks(void)
 	}
 	if (t->rcu_read_lock_nesting > 0 &&
 	    __this_cpu_read(rcu_data_p->core_needs_qs) &&
-	    __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm))
+	    __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) &&
+	    !t->rcu_read_unlock_special.b.need_qs &&
+	    time_after(jiffies, rsp->gp_start + HZ))
 		t->rcu_read_unlock_special.b.need_qs = true;
 }
 
-- 
2.17.1


* [PATCH tip/core/rcu 07/27] rcu: Inline rcu_dynticks_momentary_idle() into its sole caller
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 08/27] rcu: Clarify and correct the rcu_preempt_qs() header comment Paul E. McKenney
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The rcu_dynticks_momentary_idle() function is invoked only from
rcu_momentary_dyntick_idle(), and neither function is particularly
large.  This commit therefore saves a few lines by inlining
rcu_dynticks_momentary_idle() into rcu_momentary_dyntick_idle().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1ea971244512..98f51157f735 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -385,20 +385,6 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_dynticks *rdtp, int snap)
 	return snap != rcu_dynticks_snap(rdtp);
 }
 
-/*
- * Do a double-increment of the ->dynticks counter to emulate a
- * momentary idle-CPU quiescent state.
- */
-static void rcu_dynticks_momentary_idle(void)
-{
-	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
-	int special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR,
-					&rdtp->dynticks);
-
-	/* It is illegal to call this from idle state. */
-	WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR));
-}
-
 /*
  * Set the special (bottom) bit of the specified CPU so that it
  * will take special action (such as flushing its TLB) on the
@@ -430,12 +416,17 @@ bool rcu_eqs_special_set(int cpu)
  *
  * We inform the RCU core by emulating a zero-duration dyntick-idle period.
  *
- * The caller must have disabled interrupts.
+ * The caller must have disabled interrupts and must not be idle.
  */
 static void rcu_momentary_dyntick_idle(void)
 {
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+	int special;
+
 	raw_cpu_write(rcu_dynticks.rcu_need_heavy_qs, false);
-	rcu_dynticks_momentary_idle();
+	special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks);
+	/* It is illegal to call this from idle state. */
+	WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR));
 }
 
 /*
-- 
2.17.1


* [PATCH tip/core/rcu 08/27] rcu: Clarify and correct the rcu_preempt_qs() header comment
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 07/27] rcu: Inline rcu_dynticks_momentary_idle() into its sole caller Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 09/27] rcu: Remove unused rcu_kick_nohz_cpu() function Paul E. McKenney
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The rcu_preempt_qs() function only applies to the CPU, not the task.
A task really is allowed to invoke this function while in an RCU-preempt
read-side critical section, but only if it has first added itself to
some leaf rcu_node structure's ->blkd_tasks list.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree_plugin.h | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 0239cf8a4be6..07d1ad175994 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -294,13 +294,17 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
 }
 
 /*
- * Record a preemptible-RCU quiescent state for the specified CPU.  Note
- * that this just means that the task currently running on the CPU is
- * not in a quiescent state.  There might be any number of tasks blocked
- * while in an RCU read-side critical section.
+ * Record a preemptible-RCU quiescent state for the specified CPU.
+ * Note that this does not necessarily mean that the task currently running
+ * on the CPU is in a quiescent state:  Instead, it means that the current
+ * grace period need not wait on any RCU read-side critical section that
+ * starts later on this CPU.  It also means that if the current task is
+ * in an RCU read-side critical section, it has already added itself to
+ * some leaf rcu_node structure's ->blkd_tasks list.  In addition to the
+ * current task, there might be any number of other tasks blocked while
+ * in an RCU read-side critical section.
  *
- * As with the other rcu_*_qs() functions, callers to this function
- * must disable preemption.
+ * Callers to this function must disable preemption.
  */
 static void rcu_preempt_qs(void)
 {
-- 
2.17.1


* [PATCH tip/core/rcu 09/27] rcu: Remove unused rcu_kick_nohz_cpu() function
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 08/27] rcu: Clarify and correct the rcu_preempt_qs() header comment Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 10/27] rcu: Remove unused local variable "cpu" Paul E. McKenney
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The rcu_kick_nohz_cpu() function is no longer used, and the functionality
it used to provide is now provided by a call to resched_cpu() in the
force-quiescent-state function rcu_implicit_dynticks_qs().  This commit
therefore removes rcu_kick_nohz_cpu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.h        |  1 -
 kernel/rcu/tree_plugin.h | 17 -----------------
 2 files changed, 18 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d51e6edc8e83..4e74df768c57 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -483,7 +483,6 @@ static void __init rcu_spawn_nocb_kthreads(void);
 #ifdef CONFIG_RCU_NOCB_CPU
 static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp);
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
-static void __maybe_unused rcu_kick_nohz_cpu(int cpu);
 static bool init_nocb_callback_list(struct rcu_data *rdp);
 static void rcu_bind_gp_kthread(void);
 static bool rcu_nohz_full_cpu(struct rcu_state *rsp);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 07d1ad175994..75a91d58b8f7 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2645,23 +2645,6 @@ static bool init_nocb_callback_list(struct rcu_data *rdp)
 
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
 
-/*
- * An adaptive-ticks CPU can potentially execute in kernel mode for an
- * arbitrarily long period of time with the scheduling-clock tick turned
- * off.  RCU will be paying attention to this CPU because it is in the
- * kernel, but the CPU cannot be guaranteed to be executing the RCU state
- * machine because the scheduling-clock tick has been disabled.  Therefore,
- * if an adaptive-ticks CPU is failing to respond to the current grace
- * period and has not be idle from an RCU perspective, kick it.
- */
-static void __maybe_unused rcu_kick_nohz_cpu(int cpu)
-{
-#ifdef CONFIG_NO_HZ_FULL
-	if (tick_nohz_full_cpu(cpu))
-		smp_send_reschedule(cpu);
-#endif /* #ifdef CONFIG_NO_HZ_FULL */
-}
-
 /*
  * Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
  * grace-period kthread will do force_quiescent_state() processing?
-- 
2.17.1


* [PATCH tip/core/rcu 10/27] rcu: Remove unused local variable "cpu"
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 09/27] rcu: Remove unused rcu_kick_nohz_cpu() function Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 11/27] rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp() Paul E. McKenney
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

One danger of using __maybe_unused is that the compiler doesn't yell
at you when you remove the last reference, witness rcu_bind_gp_kthread()
and its local variable "cpu".  This commit removes this local variable.
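
A minimal sketch (not from the kernel tree; the function is invented for
illustration) of the failure mode:

	/* __maybe_unused expands to __attribute__((__unused__)), which
	 * suppresses -Wunused-variable.  Once the last real use of "cpu"
	 * was deleted, the compiler stayed silent instead of flagging
	 * the now-dead declaration. */
	static void sketch(void)
	{
		int __maybe_unused cpu;	/* no warning, despite never being used */
	}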

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree_plugin.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 75a91d58b8f7..2cc9bf0d363a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2670,8 +2670,6 @@ static bool rcu_nohz_full_cpu(struct rcu_state *rsp)
  */
 static void rcu_bind_gp_kthread(void)
 {
-	int __maybe_unused cpu;
-
 	if (!tick_nohz_full_enabled())
 		return;
 	housekeeping_affine(current, HK_FLAG_RCU);
-- 
2.17.1


* [PATCH tip/core/rcu 11/27] rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 10/27] rcu: Remove unused local variable "cpu" Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 12/27] rcu: Remove "inline" from rcu_torture_print_module_parms() Paul E. McKenney
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

These functions are in kernel/rcu/tree.c, which is not an include file,
so there is no problem dropping the "inline", especially given that these
functions are nowhere near a fastpath.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 98f51157f735..11fb6542b9d4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1342,7 +1342,7 @@ static void rcu_stall_kick_kthreads(struct rcu_state *rsp)
 	}
 }
 
-static inline void panic_on_rcu_stall(void)
+static void panic_on_rcu_stall(void)
 {
 	if (sysctl_panic_on_rcu_stall)
 		panic("RCU Stall\n");
@@ -3078,7 +3078,7 @@ EXPORT_SYMBOL_GPL(kfree_call_rcu);
  * when there was in fact only one the whole time, as this just adds
  * some overhead: RCU still operates correctly.
  */
-static inline int rcu_blocking_is_gp(void)
+static int rcu_blocking_is_gp(void)
 {
 	int ret;
 
-- 
2.17.1


* [PATCH tip/core/rcu 12/27] rcu: Remove "inline" from rcu_torture_print_module_parms()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 11/27] rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp() Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:34 ` [PATCH tip/core/rcu 13/27] rcu: Remove "inline" from rcu_perf_print_module_parms() Paul E. McKenney
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

This function is in rcutorture.c, which is not an include file, so there
is no problem dropping the "inline", especially given that this function
is invoked only twice per rcutorture run.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 773ce12f9582..2cce2ca2bc2b 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1359,7 +1359,7 @@ rcu_torture_stats(void *arg)
 	return 0;
 }
 
-static inline void
+static void
 rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" TORTURE_FLAG
-- 
2.17.1


* [PATCH tip/core/rcu 13/27] rcu: Remove "inline" from rcu_perf_print_module_parms()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (11 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 12/27] rcu: Remove "inline" from rcu_torture_print_module_parms() Paul E. McKenney
@ 2018-06-26  0:34 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 14/27] rcu: Remove __maybe_unused from rcu_cpu_has_callbacks() Paul E. McKenney
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

This function is in rcuperf.c, which is not an include file, so there
is no problem dropping the "inline", especially given that this function
is invoked only twice per rcuperf run.  This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcuperf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 00e395c0d7d0..3e86940245d9 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -536,7 +536,7 @@ rcu_perf_writer(void *arg)
 	return 0;
 }
 
-static inline void
+static void
 rcu_perf_print_module_parms(struct rcu_perf_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" PERF_FLAG
-- 
2.17.1


* [PATCH tip/core/rcu 14/27] rcu: Remove __maybe_unused from rcu_cpu_has_callbacks()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (12 preceding siblings ...)
  2018-06-26  0:34 ` [PATCH tip/core/rcu 13/27] rcu: Remove "inline" from rcu_perf_print_module_parms() Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 15/27] rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall() Paul E. McKenney
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The rcu_cpu_has_callbacks() function is now used in all configurations,
so this commit removes the __maybe_unused.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 11fb6542b9d4..a95d49730cfc 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3313,7 +3313,7 @@ static int rcu_pending(void)
  * non-NULL, store an indication of whether all callbacks are lazy.
  * (If there are no callbacks, all of them are deemed to be lazy.)
  */
-static bool __maybe_unused rcu_cpu_has_callbacks(bool *all_lazy)
+static bool rcu_cpu_has_callbacks(bool *all_lazy)
 {
 	bool al = true;
 	bool hc = false;
-- 
2.17.1


* [PATCH tip/core/rcu 15/27] rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (13 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 14/27] rcu: Remove __maybe_unused from rcu_cpu_has_callbacks() Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works Paul E. McKenney
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

Currently, rcu_check_gp_start_stall() waits for one second after the first
request before complaining that a grace period has not yet started.  This
was desirable while testing the conversion from ->future_gp_needed[] to
->gp_seq_needed, but it is a bit on the hair-trigger side for production
use under heavy load.  This commit therefore sets this wait time to
exactly that of the RCU CPU stall warning, allowing easy adjustment of
both timeouts to suit the distribution or installation at hand.
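
As a usage note (illustrative values, not part of the patch), both
timeouts can now be tuned in one place via the existing
rcu_cpu_stall_timeout module parameter, for example:

	# At boot, on the kernel command line:
	rcupdate.rcu_cpu_stall_timeout=60

	# Or at run time:
	echo 60 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout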

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a95d49730cfc..c5fec335bc31 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2753,6 +2753,7 @@ static void
 rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp,
 			 struct rcu_data *rdp)
 {
+	const unsigned long gpssdelay = rcu_jiffies_till_stall_check() * HZ;
 	unsigned long flags;
 	unsigned long j;
 	struct rcu_node *rnp_root = rcu_get_root(rsp);
@@ -2762,8 +2763,8 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp,
 	    ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
 		return;
 	j = jiffies; /* Expensive access, and in common case don't get here. */
-	if (time_before(j, READ_ONCE(rsp->gp_req_activity) + HZ) ||
-	    time_before(j, READ_ONCE(rsp->gp_activity) + HZ) ||
+	if (time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) ||
+	    time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) ||
 	    atomic_read(&warned))
 		return;
 
@@ -2771,8 +2772,8 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp,
 	j = jiffies;
 	if (rcu_gp_in_progress(rsp) ||
 	    ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
-	    time_before(j, READ_ONCE(rsp->gp_req_activity) + HZ) ||
-	    time_before(j, READ_ONCE(rsp->gp_activity) + HZ) ||
+	    time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) ||
+	    time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) ||
 	    atomic_read(&warned)) {
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		return;
@@ -2783,18 +2784,18 @@ rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp,
 	j = jiffies;
 	if (rcu_gp_in_progress(rsp) ||
 	    ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
-	    time_before(j, rsp->gp_req_activity + HZ) ||
-	    time_before(j, rsp->gp_activity + HZ) ||
+	    time_before(j, rsp->gp_req_activity + gpssdelay) ||
+	    time_before(j, rsp->gp_activity + gpssdelay) ||
 	    atomic_xchg(&warned, 1)) {
 		raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		return;
 	}
-	pr_alert("%s: g%ld->%ld gar:%lu ga:%lu f%#x %s->state:%#lx\n",
+	pr_alert("%s: g%ld->%ld gar:%lu ga:%lu f%#x gs:%d %s->state:%#lx\n",
 		 __func__, (long)READ_ONCE(rsp->gp_seq),
 		 (long)READ_ONCE(rnp_root->gp_seq_needed),
 		 j - rsp->gp_req_activity, j - rsp->gp_activity,
-		 rsp->gp_flags, rsp->name,
+		 rsp->gp_flags, rsp->gp_state, rsp->name,
 		 rsp->gp_kthread ? rsp->gp_kthread->state : 0x1ffffL);
 	WARN_ON(1);
 	raw_spin_unlock_rcu_node(rnp_root);
-- 
2.17.1


* [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (14 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 15/27] rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall() Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26 17:14   ` Peter Zijlstra
  2018-06-26 17:30   ` Peter Zijlstra
  2018-06-26  0:35 ` [PATCH tip/core/rcu 17/27] rcu: Speed up calling of RCU tasks callbacks Paul E. McKenney
                   ` (10 subsequent siblings)
  26 siblings, 2 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

rcu_seq_snap() may be tricky to decipher. Let's document how it works
with an example to make it easier.
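
A worked sketch (assuming two low-order state bits, so
RCU_SEQ_STATE_MASK == 0b11) of the computation the new comment
documents:

	/*
	 * snap = (seq + 2 * RCU_SEQ_STATE_MASK + 1) & ~RCU_SEQ_STATE_MASK
	 *      = (seq + 0b111) & ~0b11
	 *
	 * seq = 12 (0b1100, GP 3, idle):        (12 + 7) & ~3 = 16 -> GP 4
	 * seq = 13 (0b1101, GP 3, in progress): (13 + 7) & ~3 = 20 -> GP 5
	 */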

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcu.h | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index aa215d6355f8..60f089d08c47 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,39 @@ static inline void rcu_seq_end(unsigned long *sp)
 	WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * rcu_seq_snap - Take a snapshot of the update side's sequence number.
+ *
+ * This function returns the earliest value of the grace-period sequence number
+ * that will indicate that a full grace period has elapsed since the current
+ * time.  Once the grace-period sequence number has reached this value, it will
+ * be safe to invoke all callbacks that have been registered prior to the
+ * current time. This value is the current grace-period number plus two to the
+ * power of the number of low-order bits reserved for state, then rounded up to
+ * the next value in which the state bits are all zero.
+ *
+ * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
+ * the seq is used to track if a GP is in progress or not. Given this, it is
+ * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
+ * why with an example:
+ *
+ * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
+ * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
+ * to account for the shift due to 2 state bits. Now, if the current seq is
+ * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
+ * is already in progress so the next GP that a future call back will be queued
+ * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
+ * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
+ * will cause the extra +1 to the GP, along with the usual +1 explained before.
+ * This gives us GP+2. Finally we mask the lower to bits by ~0x3 in case the
+ * overflow didn't occur. This masking is needed because in case RCU was idle
+ * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
+ * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
+ *
+ * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
+ * which can be generalized to:
+ * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
 	unsigned long s;
-- 
2.17.1


* [PATCH tip/core/rcu 17/27] rcu: Speed up calling of RCU tasks callbacks
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (15 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 18/27] rcu: Add comment to the last sleep in the rcu tasks loop Paul E. McKenney
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Joel Fernandes found that synchronize_rcu_tasks() was taking a
significant amount of time. He demonstrated it with the following test:

 # cd /sys/kernel/tracing
 # while [ 1 ]; do x=1; done &
 # echo '__schedule_bug:traceon' > set_ftrace_filter
 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real	0m1.064s
user	0m0.000s
sys	0m0.004s

That is, it takes a little over a second to perform the synchronization
because there's a loop that waits one second at a time for tasks to get
through their quiescent points whenever there's a task that must be
waited for.

After discussion, we came up with a simple approach: wait for holdouts
with a short initial delay and increase the delay on each iteration of
the loop, up to a maximum of one full second.

With the new patch we have:

 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real	0m0.131s
user	0m0.000s
sys	0m0.004s

Which drops it down to 13% of what the original wait time was.
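
A sketch (simplified from the diff below) of the resulting backoff
pattern:

	/*
	 * With HZ == 1000, successive waits are HZ/10, HZ/9, ..., HZ/2,
	 * HZ/1: roughly 100ms, 111ms, 125ms, ..., 500ms, then a steady
	 * 1000ms once fract reaches 1.
	 */
	int fract = 10;

	while (!list_empty(&rcu_tasks_holdouts)) {
		schedule_timeout_interruptible(HZ / fract);
		if (fract > 1)
			fract--;
		/* ... stall reporting and holdout checks as before ... */
	}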

Link: http://lkml.kernel.org/r/20180523063815.198302-2-joel@joelfernandes.org
Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Suggested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/update.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5783bdf86e5a..4c7c49c106ee 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -668,6 +668,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 	struct rcu_head *list;
 	struct rcu_head *next;
 	LIST_HEAD(rcu_tasks_holdouts);
+	int fract;
 
 	/* Run on housekeeping CPUs by default.  Sysadm can move if desired. */
 	housekeeping_affine(current, HK_FLAG_RCU);
@@ -749,13 +750,25 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 		 * holdouts.  When the list is empty, we are done.
 		 */
 		lastreport = jiffies;
-		while (!list_empty(&rcu_tasks_holdouts)) {
+
+		/* Start off with HZ/10 wait and slowly back off to 1 HZ wait*/
+		fract = 10;
+
+		for (;;) {
 			bool firstreport;
 			bool needreport;
 			int rtst;
 			struct task_struct *t1;
 
-			schedule_timeout_interruptible(HZ);
+			if (list_empty(&rcu_tasks_holdouts))
+				break;
+
+			/* Slowly back off waiting for holdouts */
+			schedule_timeout_interruptible(HZ/fract);
+
+			if (fract > 1)
+				fract--;
+
 			rtst = READ_ONCE(rcu_task_stall_timeout);
 			needreport = rtst > 0 &&
 				     time_after(jiffies, lastreport + rtst);
-- 
2.17.1


* [PATCH tip/core/rcu 18/27] rcu: Add comment to the last sleep in the rcu tasks loop
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (16 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 17/27] rcu: Speed up calling of RCU tasks callbacks Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 19/27] rcu: Add diagnostics for rcutorture writer stall warning Paul E. McKenney
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

At the end of rcu_tasks_kthread() there's a lonely
schedule_timeout_uninterruptible() call with no apparent rationale for
its existence. But there is. It is to keep the thread from going into
a tight loop if there's some anomaly. That really needs a comment.

Link: http://lkml.kernel.org/r/20180524223839.GU3803@linux.vnet.ibm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/update.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c7c49c106ee..39cb23d22109 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -814,6 +814,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 			list = next;
 			cond_resched();
 		}
+		/* Paranoid sleep to keep this from entering a tight loop */
 		schedule_timeout_uninterruptible(HZ/10);
 	}
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 19/27] rcu: Add diagnostics for rcutorture writer stall warning
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (17 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 18/27] rcu: Add comment to the last sleep in the rcu tasks loop Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 20/27] rcu: Check the range of jiffies_till_{first,next}_fqs when setting them Paul E. McKenney
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

This commit adds to the diagnostics for the rcutorture writer stall
warning message any rcu_node or rcu_data ->gp_seq_needed fields whose
values are ahead of the current grace-period sequence number.
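
With this change, a stalled flavor's entry in the printout gains lines
like the following (an illustrative sketch built from the pr_info()
format strings in the diff below; all numbers are hypothetical):

	rcu_preempt: wait state: 1 ->state: 0x402
		rcu_node 0:15 ->gp_seq 7240 ->gp_seq_needed 7244
		cpu 3 ->gp_seq_needed 7244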

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c5fec335bc31..bc1ed77a3df9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -606,11 +606,32 @@ EXPORT_SYMBOL_GPL(rcu_sched_force_quiescent_state);
  */
 void show_rcu_gp_kthreads(void)
 {
+	int cpu;
+	struct rcu_data *rdp;
+	struct rcu_node *rnp;
 	struct rcu_state *rsp;
 
 	for_each_rcu_flavor(rsp) {
 		pr_info("%s: wait state: %d ->state: %#lx\n",
 			rsp->name, rsp->gp_state, rsp->gp_kthread->state);
+		rcu_for_each_node_breadth_first(rsp, rnp) {
+			if (ULONG_CMP_GE(rsp->gp_seq, rnp->gp_seq_needed))
+				continue;
+			pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n",
+				rnp->grplo, rnp->grphi, rnp->gp_seq,
+				rnp->gp_seq_needed);
+			if (!rcu_is_leaf_node(rnp))
+				continue;
+			for_each_leaf_node_possible_cpu(rnp, cpu) {
+				rdp = per_cpu_ptr(rsp->rda, cpu);
+				if (rdp->gpwrap ||
+				    ULONG_CMP_GE(rsp->gp_seq,
+						 rdp->gp_seq_needed))
+					continue;
+				pr_info("\tcpu %d ->gp_seq_needed %lu\n",
+					cpu, rdp->gp_seq_needed);
+			}
+		}
 		/* sched_show_task(rsp->gp_kthread); */
 	}
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 20/27] rcu: Check the range of jiffies_till_{first,next}_fqs when setting them
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (18 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 19/27] rcu: Add diagnostics for rcutorture writer stall warning Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 21/27] doc: Update synchronize_rcu() definition in whatisRCU.txt Paul E. McKenney
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Byungchul Park, Paul E . McKenney

From: Byungchul Park <byungchul.park@lge.com>

Currently, the ranges of jiffies_till_{first,next}_fqs are checked and
adjusted again and again in the loop of rcu_gp_kthread() at runtime.

However, it is enough to check them only when they are being set, not
on every pass through the loop.  So move the range checks into the
module parameters' set functions, making them run once at setting time
via sysfs (for example, on a write under /sys/module/rcutree/parameters/).
For illustration, the set functions clamp a write larger than HZ down
to HZ for either parameter, and store a write of 0 as 0 for
jiffies_till_first_fqs but as 1 for jiffies_till_next_fqs.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 45 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bc1ed77a3df9..c59cca40ea92 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -510,8 +510,38 @@ static ulong jiffies_till_first_fqs = ULONG_MAX;
 static ulong jiffies_till_next_fqs = ULONG_MAX;
 static bool rcu_kick_kthreads;
 
-module_param(jiffies_till_first_fqs, ulong, 0644);
-module_param(jiffies_till_next_fqs, ulong, 0644);
+static int param_set_first_fqs_jiffies(const char *val, const struct kernel_param *kp)
+{
+	ulong j;
+	int ret = kstrtoul(val, 0, &j);
+
+	if (!ret)
+		WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : j);
+	return ret;
+}
+
+static int param_set_next_fqs_jiffies(const char *val, const struct kernel_param *kp)
+{
+	ulong j;
+	int ret = kstrtoul(val, 0, &j);
+
+	if (!ret)
+		WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : (j ?: 1));
+	return ret;
+}
+
+static struct kernel_param_ops first_fqs_jiffies_ops = {
+	.set = param_set_first_fqs_jiffies,
+	.get = param_get_ulong,
+};
+
+static struct kernel_param_ops next_fqs_jiffies_ops = {
+	.set = param_set_next_fqs_jiffies,
+	.get = param_get_ulong,
+};
+
+module_param_cb(jiffies_till_first_fqs, &first_fqs_jiffies_ops, &jiffies_till_first_fqs, 0644);
+module_param_cb(jiffies_till_next_fqs, &next_fqs_jiffies_ops, &jiffies_till_next_fqs, 0644);
 module_param(rcu_kick_kthreads, bool, 0644);
 
 /*
@@ -2180,10 +2210,6 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		/* Handle quiescent-state forcing. */
 		first_gp_fqs = true;
 		j = jiffies_till_first_fqs;
-		if (j > HZ) {
-			j = HZ;
-			jiffies_till_first_fqs = HZ;
-		}
 		ret = 0;
 		for (;;) {
 			if (!ret) {
@@ -2218,13 +2244,6 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				WRITE_ONCE(rsp->gp_activity, jiffies);
 				ret = 0; /* Force full wait till next FQS. */
 				j = jiffies_till_next_fqs;
-				if (j > HZ) {
-					j = HZ;
-					jiffies_till_next_fqs = HZ;
-				} else if (j < 1) {
-					j = 1;
-					jiffies_till_next_fqs = 1;
-				}
 			} else {
 				/* Deal with stray signal. */
 				cond_resched_tasks_rcu_qs();
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 21/27] doc: Update synchronize_rcu() definition in whatisRCU.txt
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (19 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 20/27] rcu: Check the range of jiffies_till_{first,next}_fqs when setting them Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 22/27] rcu: Make rcu_seq_diff() more exact Paul E. McKenney
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Andrea Parri, Paul E . McKenney, Jonathan Corbet

From: Andrea Parri <andrea.parri@amarulasolutions.com>

The synchronize_rcu() definition based on RW-locks in whatisRCU.txt
does not meet the "Memory-Barrier Guarantees" in Requirements.html;
for example, the following SB-like test:

    P0:                      P1:

    WRITE_ONCE(x, 1);        WRITE_ONCE(y, 1);
    synchronize_rcu();       smp_mb();
    r0 = READ_ONCE(y);       r1 = READ_ONCE(x);

should not be allowed to reach the state "r0 = 0 AND r1 = 0", but the
current write_lock()+write_unlock() definition cannot ensure this: an
uncontended lock/unlock pair need only provide acquire/release ordering,
so P0's store to x can sink into the critical section and its load from
y can hoist into it, where they may then be reordered with each other.
This commit therefore inserts an smp_mb__after_spinlock() in order to
cause this synchronize_rcu() implementation to provide this
memory-barrier guarantee.

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/whatisRCU.txt | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 65eb856526b7..94288f1b8759 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -588,6 +588,7 @@ It is extremely simple:
 	void synchronize_rcu(void)
 	{
 		write_lock(&rcu_gp_mutex);
+		smp_mb__after_spinlock();
 		write_unlock(&rcu_gp_mutex);
 	}
 
@@ -609,12 +610,15 @@ don't forget about them when submitting patches making use of RCU!]
 
 The rcu_read_lock() and rcu_read_unlock() primitive read-acquire
 and release a global reader-writer lock.  The synchronize_rcu()
-primitive write-acquires this same lock, then immediately releases
-it.  This means that once synchronize_rcu() exits, all RCU read-side
-critical sections that were in progress before synchronize_rcu() was
-called are guaranteed to have completed -- there is no way that
-synchronize_rcu() would have been able to write-acquire the lock
-otherwise.
+primitive write-acquires this same lock, then releases it.  This means
+that once synchronize_rcu() exits, all RCU read-side critical sections
+that were in progress before synchronize_rcu() was called are guaranteed
+to have completed -- there is no way that synchronize_rcu() would have
+been able to write-acquire the lock otherwise.  The smp_mb__after_spinlock()
+promotes synchronize_rcu() to a full memory barrier in compliance with
+the "Memory-Barrier Guarantees" listed in:
+
+	Documentation/RCU/Design/Requirements/Requirements.html.
 
 It is possible to nest rcu_read_lock(), since reader-writer locks may
 be recursively acquired.  Note also that rcu_read_lock() is immune
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 22/27] rcu: Make rcu_seq_diff() more exact
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (20 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 21/27] doc: Update synchronize_rcu() definition in whatisRCU.txt Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 23/27] MAINTAINERS: Update RCU, SRCU, and TORTURE-TEST entries Paul E. McKenney
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The current implementation of rcu_seq_diff() follows tradition in
providing a rough-and-ready approximation of the number of elapsed grace
periods between the two rcu_seq values.  However, this difference is
used to flag RCU-failure "near misses", which can be a valuable debugging
aid, so more exactitude would be an improvement.  This commit therefore
improves the accuracy of rcu_seq_diff().
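
To see the intended improvement, consider a worked example (an
illustrative sketch, not part of the patch; it assumes
RCU_SEQ_CTR_SHIFT == 2 and RCU_SEQ_STATE_MASK == 3, matching
kernel/rcu/rcu.h).  Take old = 13 (grace period 3 in progress) and
new = 20 (grace period 5 complete, RCU idle):

	#include <stdio.h>

	#define RCU_SEQ_CTR_SHIFT	2
	#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

	int main(void)
	{
		unsigned long old = 13, new = 20, rnd_diff;

		/* Old rough-and-ready version: (20 - 13) >> 2 == 1. */
		printf("rough: %lu\n", (new - old) >> RCU_SEQ_CTR_SHIFT);

		/* New exact version, as in the patch below. */
		rnd_diff = (new & ~RCU_SEQ_STATE_MASK) -
			   ((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK) +
			   ((new & RCU_SEQ_STATE_MASK) || (old & RCU_SEQ_STATE_MASK));
		/* rnd_diff == 20 - 16 + 1 == 5, which exceeds the state mask, so: */
		printf("exact: %lu\n",
		       ((rnd_diff - RCU_SEQ_STATE_MASK - 1) >> RCU_SEQ_CTR_SHIFT) + 2);
		/* Prints 2: the GP underway at old plus the full GP seen by new. */
		return 0;
	}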

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcu.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 60f089d08c47..46f12d0fe319 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -180,7 +180,20 @@ static inline bool rcu_seq_new_gp(unsigned long old, unsigned long new)
  */
 static inline unsigned long rcu_seq_diff(unsigned long new, unsigned long old)
 {
-	return (new - old) >> RCU_SEQ_CTR_SHIFT;
+	unsigned long rnd_diff;
+
+	if (old == new)
+		return 0;
+	/*
+	 * Compute the number of grace periods (still shifted up), plus
+	 * one if either of new and old is not an exact grace period.
+	 */
+	rnd_diff = (new & ~RCU_SEQ_STATE_MASK) -
+		   ((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK) +
+		   ((new & RCU_SEQ_STATE_MASK) || (old & RCU_SEQ_STATE_MASK));
+	if (ULONG_CMP_GE(RCU_SEQ_STATE_MASK, rnd_diff))
+		return 1; /* Definitely no grace period has elapsed. */
+	return ((rnd_diff - RCU_SEQ_STATE_MASK - 1) >> RCU_SEQ_CTR_SHIFT) + 2;
 }
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 23/27] MAINTAINERS: Update RCU, SRCU, and TORTURE-TEST entries
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (21 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 22/27] rcu: Make rcu_seq_diff() more exact Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 24/27] rcu: Print stall-warning NMI dyntick state in hexadecimal Paul E. McKenney
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney, Davidlohr Bueso

The RCU, SRCU, and TORTURE-TEST entries are missing some recent
changes, so this commit brings them up to date.

Reported-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
---
 MAINTAINERS | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9d5eeff51b5f..27bbc72c6191 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12021,9 +12021,9 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
 F:	Documentation/RCU/
 X:	Documentation/RCU/torture.txt
 F:	include/linux/rcu*
-X:	include/linux/srcu.h
+X:	include/linux/srcu*.h
 F:	kernel/rcu/
-X:	kernel/torture.c
+X:	kernel/rcu/srcu*.c
 
 REAL TIME CLOCK (RTC) SUBSYSTEM
 M:	Alessandro Zummo <a.zummo@towertech.it>
@@ -13060,8 +13060,8 @@ L:	linux-kernel@vger.kernel.org
 W:	http://www.rdrop.com/users/paulmck/RCU/
 S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
-F:	include/linux/srcu.h
-F:	kernel/rcu/srcu.c
+F:	include/linux/srcu*.h
+F:	kernel/rcu/srcu*.c
 
 SERIAL LOW-POWER INTER-CHIP MEDIA BUS (SLIMbus)
 M:	Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
@@ -14420,6 +14420,7 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
 F:	Documentation/RCU/torture.txt
 F:	kernel/torture.c
 F:	kernel/rcu/rcutorture.c
+F:	kernel/rcu/rcuperf.c
 F:	kernel/locking/locktorture.c
 
 TOSHIBA ACPI EXTRAS DRIVER
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 24/27] rcu: Print stall-warning NMI dyntick state in hexadecimal
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (22 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 23/27] MAINTAINERS: Update RCU, SRCU, and TORTURE-TEST entries Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 25/27] srcu: Add grace-period number to rcutorture statistics printout Paul E. McKenney
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

The ->dynticks_nmi_nesting field records the nesting depth of both
interrupt and NMI handlers.  Because the kernel can enter interrupts
and never leave them (and vice versa) and because NMIs can interrupt
manipulation of the ->dynticks_nmi_nesting field, the values in this
field must be both chosen and maniupated very carefully.  As a result,
although the value is zero when the corresponding CPU is executing
neither an interrupt nor an NMI handler, it is 4,611,686,018,427,387,906
on 64-bit systems when there is a single level of interrupt/NMI handling
in progress.

This number is difficult to remember and interpret, so this commit
switches the output to hexadecimal, resulting in the much nicer
0x4000000000000002.
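
A quick sanity check of that value (an illustrative sketch; it assumes
the base offset is DYNTICK_IRQ_NONIDLE == LONG_MAX / 2 + 1 == 2^62 as
in kernel/rcu/tree.h, plus the usual increment of 2 for interrupt
entry, on a 64-bit system):

	#include <stdio.h>
	#include <limits.h>

	int main(void)
	{
		long v = LONG_MAX / 2 + 1 + 2;	/* DYNTICK_IRQ_NONIDLE + 2 */

		/* Prints: 4611686018427387906 = 0x4000000000000002 */
		printf("%ld = %#lx\n", v, v);
		return 0;
	}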

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 2cc9bf0d363a..c1b17f5b9361 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1801,7 +1801,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
 	}
 	print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
 	delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
-	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%ld softirq=%u/%u fqs=%ld %s\n",
+	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
 	       cpu,
 	       "O."[!!cpu_online(cpu)],
 	       "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 25/27] srcu: Add grace-period number to rcutorture statistics printout
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (23 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 24/27] rcu: Print stall-warning NMI dyntick state in hexadecimal Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 26/27] rculist: Improve documentation for list_for_each_entry_from_rcu() Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 27/27] rcu: Assign higher prio to RCU threads if rcutorture is built-in Paul E. McKenney
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E. McKenney

This commit adds the SRCU grace-period number to the rcutorture statistics
printout, which allows it to be compared to the rcutorture "Writer stall
state" message.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/srcutree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index e526b56998af..6c9866a854b1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1268,7 +1268,8 @@ void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf)
 	unsigned long s0 = 0, s1 = 0;
 
 	idx = sp->srcu_idx & 0x1;
-	pr_alert("%s%s Tree SRCU per-CPU(idx=%d):", tt, tf, idx);
+	pr_alert("%s%s Tree SRCU g%ld per-CPU(idx=%d):",
+		 tt, tf, rcu_seq_current(&sp->srcu_gp_seq), idx);
 	for_each_possible_cpu(cpu) {
 		unsigned long l0, l1;
 		unsigned long u0, u1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 26/27] rculist: Improve documentation for list_for_each_entry_from_rcu()
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (24 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 25/27] srcu: Add grace-period number to rcutorture statistics printout Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  2018-06-26  0:35 ` [PATCH tip/core/rcu 27/27] rcu: Assign higher prio to RCU threads if rcutorture is built-in Paul E. McKenney
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	NeilBrown, Paul E . McKenney

From: NeilBrown <neilb@suse.com>

Unfortunately, the patch that added list_for_each_entry_from_rcu() was
not the final version produced by review.  It is functionally correct,
but its documentation was incomplete.

This patch adds this missing documentation which includes an update to
the documentation for list_for_each_entry_continue_rcu() to match the
documentation for the new list_for_each_entry_from_rcu(), and adds
list_for_each_entry_from_rcu() and the already existing
hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.
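
As a usage illustration, here is a sketch (hypothetical "struct item"
and helper; not from the patch) of resuming a walk at a previously
found node:

	#include <linux/rculist.h>

	struct item {
		struct list_head link;
		int key;
	};

	/* Caller holds a reference keeping "pos" alive and on the list. */
	static int count_from(struct list_head *head, struct item *pos)
	{
		int n = 0;

		rcu_read_lock();
		/*
		 * The walk starts *at* pos itself; with
		 * list_for_each_entry_continue_rcu() it would instead
		 * start at the entry after pos.  Legal only because the
		 * caller's reference keeps pos alive and on the list.
		 */
		list_for_each_entry_from_rcu(pos, head, link)
			n++;
		rcu_read_unlock();
		return n;
	}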

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/whatisRCU.txt |  2 ++
 include/linux/rculist.h         | 19 ++++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 94288f1b8759..c2a7facf7ff9 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -820,11 +820,13 @@ RCU list traversal:
 	list_next_rcu
 	list_for_each_entry_rcu
 	list_for_each_entry_continue_rcu
+	list_for_each_entry_from_rcu
 	hlist_first_rcu
 	hlist_next_rcu
 	hlist_pprev_rcu
 	hlist_for_each_entry_rcu
 	hlist_for_each_entry_rcu_bh
+	hlist_for_each_entry_from_rcu
 	hlist_for_each_entry_continue_rcu
 	hlist_for_each_entry_continue_rcu_bh
 	hlist_nulls_first_rcu
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 36df6ccbc874..4786c2235b98 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -396,7 +396,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
  * @member:	the name of the list_head within the struct.
  *
  * Continue to iterate over list of given type, continuing after
- * the current position.
+ * the current position which must have been in the list when the RCU read
+ * lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_from_rcu() except
+ * this starts after the given position and that one starts at the given
+ * position.
  */
 #define list_for_each_entry_continue_rcu(pos, head, member) 		\
 	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
@@ -411,6 +420,14 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
  *
  * Iterate over the tail of a list starting from a given position,
  * which must have been in the list when the RCU read lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_continue_rcu() except
+ * this starts from the given position and that one starts from the position
+ * after the given position.
  */
 #define list_for_each_entry_from_rcu(pos, head, member)			\
 	for (; &(pos)->member != (head);					\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH tip/core/rcu 27/27] rcu: Assign higher prio to RCU threads if rcutorture is built-in
  2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
                   ` (25 preceding siblings ...)
  2018-06-26  0:35 ` [PATCH tip/core/rcu 26/27] rculist: Improve documentation for list_for_each_entry_from_rcu() Paul E. McKenney
@ 2018-06-26  0:35 ` Paul E. McKenney
  26 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26  0:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel,
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

The rcutorture RCU priority boosting tests fail even with CONFIG_RCU_BOOST
set because rcutorture's threads run at the same priority as the default
RCU kthreads (RT class with priority of 1).

This patch checks if rcutorture is built into the kernel and, if so,
assigns RT priority 2 to the RCU kthreads, placing them above the
priority-1 rcutorture threads and thereby allowing the rcutorture boost
tests to pass.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c59cca40ea92..988683089a4c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3921,12 +3921,16 @@ static int __init rcu_spawn_gp_kthread(void)
 	struct task_struct *t;
 
 	/* Force priority into range. */
-	if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 1)
+	if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 2
+	    && IS_BUILTIN(CONFIG_RCU_TORTURE_TEST))
+		kthread_prio = 2;
+	else if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 1)
 		kthread_prio = 1;
 	else if (kthread_prio < 0)
 		kthread_prio = 0;
 	else if (kthread_prio > 99)
 		kthread_prio = 99;
+
 	if (kthread_prio != kthread_prio_in)
 		pr_alert("rcu_spawn_gp_kthread(): Limited prio to %d from %d\n",
 			 kthread_prio, kthread_prio_in);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively
  2018-06-26  0:34 ` [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively Paul E. McKenney
@ 2018-06-26 17:08   ` Peter Zijlstra
  2018-06-26 18:03     ` Paul E. McKenney
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 17:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Mon, Jun 25, 2018 at 05:34:52PM -0700, Paul E. McKenney wrote:
> If any scheduling-clock interrupt interrupts an RCU-preempt read-side
> critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
> field is set.  This causes the outermost rcu_read_unlock() to incur the
> extra overhead of calling into rcu_read_unlock_special().  This commit
> reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
> if the grace period has been in effect for more than one second.

Even less aggressive is never setting it at all.

Changelog fails to explain why not setting it every tick is correct, nor
why 1s is a 'safe' value to use.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26  0:35 ` [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works Paul E. McKenney
@ 2018-06-26 17:14   ` Peter Zijlstra
  2018-06-26 18:08     ` Paul E. McKenney
  2018-06-26 17:30   ` Peter Zijlstra
  1 sibling, 1 reply; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 17:14 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> 
> rcu_seq_snap may be tricky to decipher. Let's document how it works with
> an example to make it easier.

Since you had me looking at them functions; why isn't rcu_seq_snap()
using smp_load_acquire() and rcu_seq_end() using smp_store_release() ?
> Their respective comments seem to suggest that would be sufficient.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26  0:35 ` [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works Paul E. McKenney
  2018-06-26 17:14   ` Peter Zijlstra
@ 2018-06-26 17:30   ` Peter Zijlstra
  2018-06-26 18:10     ` Paul E. McKenney
  2018-06-27  4:39     ` Joel Fernandes
  1 sibling, 2 replies; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 17:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> +/*
> + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> + *
> + * This function returns the earliest value of the grace-period sequence number
> + * that will indicate that a full grace period has elapsed since the current
> + * time.  Once the grace-period sequence number has reached this value, it will
> + * be safe to invoke all callbacks that have been registered prior to the
> + * current time. This value is the current grace-period number plus two to the
> + * power of the number of low-order bits reserved for state, then rounded up to
> + * the next value in which the state bits are all zero.

If you complete that by saying _why_ you need to round up there, then
the below verbiage is completely redundant.

> + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> + * the seq is used to track if a GP is in progress or not. Given this, it is
> + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> + * why with an example:
> + *
> + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> + * to account for the shift due to 2 state bits. Now, if the current seq is
> + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> + * is already in progress so the next GP that a future call back will be queued
> + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> + * overflow didn't occur. This masking is needed because in case RCU was idle
> + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> + *
> + * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
> + * which can be generalized to:
> + * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> + */

Is the below not much simpler:

>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>  	unsigned long s;

	s = smp_load_acquire(sp);

	/* Add one GP */
	s += 1 << RCU_SEQ_CTR_SHIFT;

	/* Complete any pending state by rounding up */
	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);

	return s;
}
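
For what it's worth, the two formulations can be checked against each
other in userspace (a sketch; this assumes __ALIGN_MASK(x, mask)
expands to (((x) + (mask)) & ~(mask)) as in include/uapi/linux/kernel.h,
and that the existing open-coded form is
(s + 2 * RCU_SEQ_STATE_MASK + 1) & ~RCU_SEQ_STATE_MASK):

	#include <stdio.h>

	#define RCU_SEQ_CTR_SHIFT	2
	#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)
	#define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))

	int main(void)
	{
		unsigned long s;

		for (s = 0; s < 64; s++) {
			/* Peter's form: add one GP, round up pending state. */
			unsigned long a = __ALIGN_MASK(s + (1UL << RCU_SEQ_CTR_SHIFT),
						       RCU_SEQ_STATE_MASK);
			/* Open-coded form from the current rcu_seq_snap(). */
			unsigned long b = (s + 2 * RCU_SEQ_STATE_MASK + 1) &
					  ~RCU_SEQ_STATE_MASK;

			if (a != b)
				printf("mismatch at %lu\n", s);	/* Never fires. */
		}
		/* snap(12) == 16 (idle -> GP 4); snap(13) == 20 (GP underway -> GP 5). */
		printf("%lu %lu\n", __ALIGN_MASK(12UL + 4, RCU_SEQ_STATE_MASK),
		       __ALIGN_MASK(13UL + 4, RCU_SEQ_STATE_MASK));
		return 0;
	}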



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively
  2018-06-26 17:08   ` Peter Zijlstra
@ 2018-06-26 18:03     ` Paul E. McKenney
  2018-06-26 21:38       ` Paul E. McKenney
  0 siblings, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 18:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 07:08:12PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 25, 2018 at 05:34:52PM -0700, Paul E. McKenney wrote:
> > If any scheduling-clock interrupt interrupts an RCU-preempt read-side
> > critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
> > field is set.  This causes the outermost rcu_read_unlock() to incur the
> > extra overhead of calling into rcu_read_unlock_special().  This commit
> > reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
> > if the grace period has been in effect for more than one second.
> 
> Even less aggressive is never setting it at all.

True, but if the CPU has been in an RCU read-side critical section for
a full second (which is the case with high probability when .b.need_qs
is set after this change), we might want to respond to the end of that
critical section sooner rather than later.

> Changelog fails to explain why not setting it every tick is correct, nor
> why 1s is a 'safe' value to use.

The RCU CPU stall warning cannot be set to less than 3s, so 1s is
reasonable.  It is a tradeoff -- setting it lower causes a greater
fraction of RCU read-side critical sections to incur extra overhead at
rcu_read_unlock() time, while setting it higher keeps a lazy approach
to reporting the quiescent state to core RCU for longer critical sections.

The upcoming RCU-bh/RCU-preempt/RCU-sched consolidation will raise
contention and overhead, so this is one of several things done to
lower overhead and contention to compensate for that.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 17:14   ` Peter Zijlstra
@ 2018-06-26 18:08     ` Paul E. McKenney
  2018-06-26 19:21       ` Peter Zijlstra
  0 siblings, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 18:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 07:14:54PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > 
> > rcu_seq_snap may be tricky to decipher. Let's document how it works with
> > an example to make it easier.
> 
> Since you had me looking at them functions; why isn't rcu_seq_snap()
> using smp_load_acquire() and rcu_seq_end() using smp_store_release() ?
> Their respective comments seem to suggest that would be sufficient.

I do not believe that this would suffice.  Would it make sense to refer
to Documentation/RCU/Design/Memory-Ordering in the comment header?

Except that this would invite sprinkling this pathname far and wide...

The key point is that these functions are part of the any-to-any
memory-ordering guarantee that RCU grace periods provide.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 17:30   ` Peter Zijlstra
@ 2018-06-26 18:10     ` Paul E. McKenney
  2018-06-26 19:27       ` Peter Zijlstra
  2018-06-27  4:39     ` Joel Fernandes
  1 sibling, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 18:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 07:30:55PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > +/*
> > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > + *
> > + * This function returns the earliest value of the grace-period sequence number
> > + * that will indicate that a full grace period has elapsed since the current
> > + * time.  Once the grace-period sequence number has reached this value, it will
> > + * be safe to invoke all callbacks that have been registered prior to the
> > + * current time. This value is the current grace-period number plus two to the
> > + * power of the number of low-order bits reserved for state, then rounded up to
> > + * the next value in which the state bits are all zero.
> 
> If you complete that by saying _why_ you need to round up there, then
> the below verbiage is completely redundant.

I will leave this between you, Joel, and whoever else is interested.
The initial state worked for me.  ;-)

> > + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > + * the seq is used to track if a GP is in progress or not. Given this, it is
> > + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> > + * why with an example:
> > + *
> > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > + * is already in progress so the next GP that a future call back will be queued
> > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> > + * overflow didn't occur. This masking is needed because in case RCU was idle
> > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > + *
> > + * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
> > + * which can be generalized to:
> > + * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > + */
> 
> Is the below not much simpler:
> 
> >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> >  {
> >  	unsigned long s;
> 
> 	s = smp_load_acquire(sp);
> 
> 	/* Add one GP */
> 	s += 1 << RCU_SEQ_CTR_SHIFT;
> 
> 	/* Complete any pending state by rounding up */
> 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> 
> 	return s;
> }

Seems equivalent to me, but with more lines.  To say nothing of more
levels of lookup of macro definitions.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 18:08     ` Paul E. McKenney
@ 2018-06-26 19:21       ` Peter Zijlstra
  2018-06-26 19:31         ` Paul E. McKenney
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 19:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 11:08:55AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 26, 2018 at 07:14:54PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > > 
> > > rcu_seq_snap may be tricky to decipher. Let's document how it works with
> > > an example to make it easier.
> > 
> > Since you had me looking at them functions; why isn't rcu_seq_snap()
> > using smp_load_acquire() and rcu_seq_end() using smp_store_release() ?
> > Their respective comments seem to suggest that would be sufficient.
> 
> I do not believe that this would suffice.  Would it make sense to refer
> to Documentation/RCU/Design/Memory-Ordering in the comment header?

No, because I can't read that thing in an editor.

> Except that this would invite sprinkling this pathname far and wide...
> 
> The key point is that these functions are part of the any-to-any
> memory-ordering guarantee that RCU grace periods provide.

Then the existing comment is misleading and really needs change.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 18:10     ` Paul E. McKenney
@ 2018-06-26 19:27       ` Peter Zijlstra
  0 siblings, 0 replies; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 19:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 11:10:25AM -0700, Paul E. McKenney wrote:
> > Is the below not much simpler:
> > 
> > >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> > >  {
> > >  	unsigned long s;
> > 
> > 	s = smp_load_acquire(sp);
> > 
> > 	/* Add one GP */
> > 	s += 1 << RCU_SEQ_CTR_SHIFT;
> > 
> > 	/* Complete any pending state by rounding up */
> > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > 
> > 	return s;
> > }
> 
> Seems equivalent to me, but with more lines.  To say nothing of more
> levels of lookup of macro definitions.  ;-)

But it does explain the various parts in the equation, or at least gives
a fair clue as to how the thing composes. And I find the alignment thing
far easier to read than an open-coded variant, but whatever.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 19:21       ` Peter Zijlstra
@ 2018-06-26 19:31         ` Paul E. McKenney
  2018-06-26 20:15           ` Peter Zijlstra
  0 siblings, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 19:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 09:21:13PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 26, 2018 at 11:08:55AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 26, 2018 at 07:14:54PM +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > > > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > > > 
> > > > rcu_seq_snap may be tricky to decipher. Let's document how it works with
> > > > an example to make it easier.
> > > 
> > > Since you had me looking at them functions; why isn't rcu_seq_snap()
> > > using smp_load_acquire() and rcu_seq_end() using smp_store_release() ?
> > > Their respective comments seem to suggest that would be sufficient.
> > 
> > I do not believe that this would suffice.  Would it make sense to refer
> > to Documentation/RCU/Design/Memory-Ordering in the comment header?
> 
> No, because I can't read that thing in an editor.
> 
> > Except that this would invite sprinkling this pathname far and wide...
> > 
> > The key point is that these functions are part of the any-to-any
> > memory-ordering guarantee that RCU grace periods provide.
> 
> Then the existing comment is misleading and really needs change.

Would it be sufficient to add something like "The memory barrier is
required to support the many-to-many ordering guaranteed by RCU grace
periods"?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 19:31         ` Paul E. McKenney
@ 2018-06-26 20:15           ` Peter Zijlstra
  2018-06-26 22:27             ` Paul E. McKenney
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Zijlstra @ 2018-06-26 20:15 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 12:31:46PM -0700, Paul E. McKenney wrote:
> > Then the existing comment is misleading and really needs change.
> 
> Would it be sufficient to add something like "The memory barrier is
> required to support the many-to-many ordering guaranteed by RCU grace
> periods"?

That would lead me to wonder why the barrier is after the load and not
(also) before.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively
  2018-06-26 18:03     ` Paul E. McKenney
@ 2018-06-26 21:38       ` Paul E. McKenney
  0 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 21:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 11:03:03AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 26, 2018 at 07:08:12PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 25, 2018 at 05:34:52PM -0700, Paul E. McKenney wrote:
> > > If any scheduling-clock interrupt interrupts an RCU-preempt read-side
> > > critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
> > > field is set.  This causes the outermost rcu_read_unlock() to incur the
> > > extra overhead of calling into rcu_read_unlock_special().  This commit
> > > reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
> > > if the grace period has been in effect for more than one second.
> > 
> > Even less aggressive is never setting it at all.
> 
> True, but if the CPU has been in an RCU read-side critical section for
> a full second (which is the case with high probability when .b.need_qs
> is set after this change), we might want to respond to the end of that
> critical section sooner rather than later.
> 
> > Changelog fails to explain why not setting it every tick is correct, nor
> > why 1s is a 'safe' value to use.
> 
> The RCU CPU stall warning cannot be set to less than 3s, so 1s is
> reasonable.  It is a tradeoff -- setting it lower causes a greater
> fraction of RCU read-side critical sections to incur extra overhead at
> rcu_read_unlock() time, while setting it higher keeps a lazy approach
> to reporting the quiescent state to core RCU for longer critical sections.
> 
> The upcoming RCU-bh/RCU-preempt/RCU-sched consolidation will raise
> contention and overhead, so this is one of several things done to
> lower overhead and contention to compensate for that.

And does the following updated commit log help?

							Thanx, Paul

------------------------------------------------------------------------

commit aaf8af680740afc363583a6ed9549b08b613dd3d
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed May 16 14:41:41 2018 -0700

    rcu: Mark task as .need_qs less aggressively
    
    If any scheduling-clock interrupt interrupts an RCU-preempt read-side
    critical section, the interrupted task's ->rcu_read_unlock_special.b.need_qs
    field is set.  This causes the outermost rcu_read_unlock() to incur the
    extra overhead of calling into rcu_read_unlock_special().  This commit
    reduces that overhead by setting ->rcu_read_unlock_special.b.need_qs only
    if the grace period has been in effect for more than one second.
    
    Why one second?  Because this is comfortably smaller than the minimum
    RCU CPU stall-warning timeout of three seconds, but long enough that the
    .need_qs marking should happen quite rarely.  And if your RCU read-side
    critical section has run on-CPU for a full second, it is not unreasonable
    to invest some CPU time in ending the grace period quickly.
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index dbfe90191e19..0239cf8a4be6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -730,6 +730,7 @@ rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp)
  */
 static void rcu_preempt_check_callbacks(void)
 {
+	struct rcu_state *rsp = &rcu_preempt_state;
 	struct task_struct *t = current;
 
 	if (t->rcu_read_lock_nesting == 0) {
@@ -738,7 +739,9 @@ static void rcu_preempt_check_callbacks(void)
 	}
 	if (t->rcu_read_lock_nesting > 0 &&
 	    __this_cpu_read(rcu_data_p->core_needs_qs) &&
-	    __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm))
+	    __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) &&
+	    !t->rcu_read_unlock_special.b.need_qs &&
+	    time_after(jiffies, rsp->gp_start + HZ))
 		t->rcu_read_unlock_special.b.need_qs = true;
 }
 


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 20:15           ` Peter Zijlstra
@ 2018-06-26 22:27             ` Paul E. McKenney
  0 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-26 22:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, joel

On Tue, Jun 26, 2018 at 10:15:03PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 26, 2018 at 12:31:46PM -0700, Paul E. McKenney wrote:
> > > Then the existing comment is misleading and really needs change.
> > 
> > Would it be sufficient to add something like "The memory barrier is
> > required to support the many-to-many ordering guaranteed by RCU grace
> > periods"?
> 
> That would lead me to wonder why the barrier is after the load and not
> (also) before.

The memory barriers on the other sides of these primitives are supplied
by the caller.  For example, in srcu_gp_end(), there is a set of
spin_unlock_irq_rcu_node()/spin_lock_irq_rcu_node() pairs between the
rcu_seq_end() of the prior SRCU grace period and the rcu_seq_start()
of the next one.

However, these things aren't anywhere near a fastpath, so I could add
the smp_mb() calls on the other sides for readability, if that would
be useful.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-26 17:30   ` Peter Zijlstra
  2018-06-26 18:10     ` Paul E. McKenney
@ 2018-06-27  4:39     ` Joel Fernandes
  2018-06-27 17:54       ` Paul E. McKenney
  1 sibling, 1 reply; 46+ messages in thread
From: Joel Fernandes @ 2018-06-27  4:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul E. McKenney, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Tue, Jun 26, 2018 at 07:30:55PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > +/*
> > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > + *
> > + * This function returns the earliest value of the grace-period sequence number
> > + * that will indicate that a full grace period has elapsed since the current
> > + * time.  Once the grace-period sequence number has reached this value, it will
> > + * be safe to invoke all callbacks that have been registered prior to the
> > + * current time. This value is the current grace-period number plus two to the
> > + * power of the number of low-order bits reserved for state, then rounded up to
> > + * the next value in which the state bits are all zero.
> 
> If you complete that by saying _why_ you need to round up there, then
> the below verbiage is completely redundant.
> 
> > + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > + * the seq is used to track if a GP is in progress or not. Given this, it is
> > + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> > + * why with an example:
> > + *
> > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > + * is already in progress so the next GP that a future call back will be queued
> > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> > + * overflow didn't occur. This masking is needed because in case RCU was idle
> > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > + *
> > + * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
> > + * which can be generalized to:
> > + * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > + */
> 
> Is the below not much simpler:
> 
> >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> >  {
> >  	unsigned long s;
> 
> 	s = smp_load_acquire(sp);
> 
> 	/* Add one GP */
> 	s += 1 << RCU_SEQ_CTR_SHIFT;
> 
> 	/* Complete any pending state by rounding up */

I would suggest this comment be changed to "Add another GP if there was a
pending state".

> 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> 

I agree with Peter's suggestions, both the verbiage reduction in the
header comments and the new code, since what he is proposing is more
self-documenting. I believe I proposed the big comment in the first
place because the code wasn't self-documenting or obvious, so it needed
an explanation.

How would you like to proceed? Let me know what you guys decide; I am
really OK with anything. If you agree, should I write a follow-up patch
with Peter's suggestion that applies on top of this one, or do we want
to drop this one in favor of Peter's version?

I guess we also have to settle the other question about using memory
barriers, but I think that should be a separate patch.

thanks!

 - Joel


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-27  4:39     ` Joel Fernandes
@ 2018-06-27 17:54       ` Paul E. McKenney
  2018-06-27 18:27         ` Joel Fernandes
  0 siblings, 1 reply; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-27 17:54 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Peter Zijlstra, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Tue, Jun 26, 2018 at 09:39:13PM -0700, Joel Fernandes wrote:
> On Tue, Jun 26, 2018 at 07:30:55PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > > +/*
> > > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > > + *
> > > + * This function returns the earliest value of the grace-period sequence number
> > > + * that will indicate that a full grace period has elapsed since the current
> > > + * time.  Once the grace-period sequence number has reached this value, it will
> > > + * be safe to invoke all callbacks that have been registered prior to the
> > > + * current time. This value is the current grace-period number plus two to the
> > > + * power of the number of low-order bits reserved for state, then rounded up to
> > > + * the next value in which the state bits are all zero.
> > 
> > If you complete that by saying _why_ you need to round up there, then
> > the below verbiage is completely redundant.
> > 
> > > + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > > + * the seq is used to track if a GP is in progress or not. Given this, it is
> > > + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> > > + * why with an example:
> > > + *
> > > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > > + * is already in progress so the next GP that a future call back will be queued
> > > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > > + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> > > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> > > + * overflow didn't occur. This masking is needed because in case RCU was idle
> > > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > > + *
> > > + * In other words, the next seq can be obtained by (seq + 0b11 + 0b100) & (~0b11),
> > > + * which can be generalized to:
> > > + * (seq + RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > > + */
> > 
> > Is the below not much simpler:
> > 
> > >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> > >  {
> > >  	unsigned long s;
> > 
> > 	s = smp_load_acquire(sp);
> > 
> > 	/* Add one GP */
> > 	s += 1 << RCU_SEQ_CTR_SHIFT;
> > 
> > 	/* Complete any pending state by rounding up */
> 
> I would suggest this comment be changed to "Add another GP if there was a
> pending state".
> 
> > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > 
> 
> > I agree with Peter's suggestions, both the verbiage reduction in the
> > header comments and the new code he is proposing, which is more
> > self-documenting. I believe I proposed a big comment only because the
> > code previously wasn't self-documenting or obvious and so needed an
> > explanation.
> 
> > How would you like to proceed? Let me know what you guys decide; I am really
> > OK with anything. If you guys agree, should I write a follow-up patch with
> Peter's suggestion that applies on top of this one?  Or do we want to drop
> this one in favor of Peter's suggestion?

Shortening the comment would be good, so please do that.

I cannot say that I am much of a fan of the suggested change to the
computation, but I don't feel all that strongly about it.  If the two
of you agree on a formulation and get at least one other RCU maintainer
or reviewer to agree as well, I will take the change.

> I guess we also have to conclude the other part about using memory barriers,
> but I think that should be a separate patch.

It definitely should not be part of this patch.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-27 17:54       ` Paul E. McKenney
@ 2018-06-27 18:27         ` Joel Fernandes
  2018-06-27 19:11           ` Paul E. McKenney
  2018-06-28  5:10           ` Joel Fernandes
  0 siblings, 2 replies; 46+ messages in thread
From: Joel Fernandes @ 2018-06-27 18:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Wed, Jun 27, 2018 at 10:54:36AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 26, 2018 at 09:39:13PM -0700, Joel Fernandes wrote:
> > On Tue, Jun 26, 2018 at 07:30:55PM +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > > > +/*
> > > > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > > > + *
> > > > + * This function returns the earliest value of the grace-period sequence number
> > > > + * that will indicate that a full grace period has elapsed since the current
> > > > + * time.  Once the grace-period sequence number has reached this value, it will
> > > > + * be safe to invoke all callbacks that have been registered prior to the
> > > > + * current time. This value is the current grace-period number plus two to the
> > > > + * power of the number of low-order bits reserved for state, then rounded up to
> > > > + * the next value in which the state bits are all zero.
> > > 
> > > If you complete that by saying _why_ you need to round up there, then
> > > the below verbiage is completely redundant.
> > > 
> > > > + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > > > + * the seq is used to track if a GP is in progress or not. Given this, it is
> > > > + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> > > > + * why with an example:
> > > > + *
> > > > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > > > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > > > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > > > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > > > + * is already in progress so the next GP that a future callback will be queued
> > > > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > > > + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> > > > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > > > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> > > > + * overflow didn't occur. This masking is needed because in case RCU was idle
> > > > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > > > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > > > + *
> > > > + * In other words, the next seq can be obtained by (seq + 0b11 + 0b100) & (~0b11),
> > > > + * which can be generalized to:
> > > > + * (seq + RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > > > + */
> > > 
> > > Is the below not much simpler:
> > > 
> > > >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> > > >  {
> > > >  	unsigned long s;
> > > 
> > > 	s = smp_load_acquire(sp);
> > > 
> > > 	/* Add one GP */
> > > 	s += 1 << RCU_SEQ_CTR_SHIFT;
> > > 
> > > 	/* Complete any pending state by rounding up */
> > 
> > I would suggest this comment be changed to "Add another GP if there was a
> > pending state".
> > 
> > > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > > 
> > 
> > > I agree with Peter's suggestions, both the verbiage reduction in the
> > > header comments and the new code he is proposing, which is more
> > > self-documenting. I believe I proposed a big comment only because the
> > > code previously wasn't self-documenting or obvious and so needed an
> > > explanation.
> > 
> > > How would you like to proceed? Let me know what you guys decide; I am really
> > > OK with anything. If you guys agree, should I write a follow-up patch with
> > Peter's suggestion that applies on top of this one?  Or do we want to drop
> > this one in favor of Peter's suggestion?
> 
> Shortening the comment would be good, so please do that.
> 
> I cannot say that I am much of a fan of the suggested change to the
> computation, but I don't feel all that strongly about it.  If the two

Did you mean from a code generation standpoint or from a higher level coding standpoint?

From a code generation perspective, the code is identical; I did a quick
test to confirm that:

0000000000000000 <rcu_seq_snap_old>:
       0:       e8 00 00 00 00          callq  5 <rcu_seq_snap_old+0x5>
       5:       48 8b 07                mov    (%rdi),%rax
       8:       f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
       e:       48 83 c0 07             add    $0x7,%rax
      12:       48 83 e0 fc             and    $0xfffffffffffffffc,%rax
      16:       c3                      retq
      17:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
      1e:       00 00

0000000000000020 <rcu_seq_snap_new>:
      20:       e8 00 00 00 00          callq  25 <rcu_seq_snap_new+0x5>
      25:       48 8b 07                mov    (%rdi),%rax
      28:       f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
      2e:       48 83 c0 07             add    $0x7,%rax
      32:       48 83 e0 fc             and    $0xfffffffffffffffc,%rax
      36:       c3                      retq
      37:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
      3e:       00 00

> of you agree on a formulation and get at least one other RCU maintainer
> or reviewer to agree as well, I will take the change.
> 

Cool, sounds good.

thanks!

- Joel


^ permalink raw reply	[flat|nested] 46+ messages in thread
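
Joel's object-code comparison can be approximated in user space. The
sketch below is an assumption-laden stand-in: __ALIGN_MASK is written
out to match the kernel macro's usual definition, and __sync_synchronize()
approximates the kernel's barrier and acquire primitives, so the barrier
instruction may differ from the "lock addl" shown above, but the add/and
pair should come out identical for both variants:

#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)
#define __ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))

/* The existing single-expression formulation. */
unsigned long rcu_seq_snap_old(unsigned long *sp)
{
	unsigned long s;

	s = (*sp + 2 * RCU_SEQ_STATE_MASK + 1) & ~RCU_SEQ_STATE_MASK;
	__sync_synchronize();	/* stand-in for smp_mb() */
	return s;
}

/* Peter's two-step formulation. */
unsigned long rcu_seq_snap_new(unsigned long *sp)
{
	unsigned long s;

	s = *sp;
	__sync_synchronize();	/* stand-in for smp_load_acquire() ordering */
	s += 1UL << RCU_SEQ_CTR_SHIFT;			/* Add one GP. */
	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);	/* Round up any pending state. */
	return s;
}

Compiling with "gcc -O2 -c" and disassembling with "objdump -d" should
show the same add/and sequence for both functions, as in the listing
above.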

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-27 18:27         ` Joel Fernandes
@ 2018-06-27 19:11           ` Paul E. McKenney
  2018-06-28  5:10           ` Joel Fernandes
  1 sibling, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-27 19:11 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Peter Zijlstra, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Wed, Jun 27, 2018 at 11:27:26AM -0700, Joel Fernandes wrote:
> On Wed, Jun 27, 2018 at 10:54:36AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 26, 2018 at 09:39:13PM -0700, Joel Fernandes wrote:
> > > On Tue, Jun 26, 2018 at 07:30:55PM +0200, Peter Zijlstra wrote:
> > > > On Mon, Jun 25, 2018 at 05:35:02PM -0700, Paul E. McKenney wrote:
> > > > > +/*
> > > > > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > > > > + *
> > > > > + * This function returns the earliest value of the grace-period sequence number
> > > > > + * that will indicate that a full grace period has elapsed since the current
> > > > > + * time.  Once the grace-period sequence number has reached this value, it will
> > > > > + * be safe to invoke all callbacks that have been registered prior to the
> > > > > + * current time. This value is the current grace-period number plus two to the
> > > > > + * power of the number of low-order bits reserved for state, then rounded up to
> > > > > + * the next value in which the state bits are all zero.
> > > > 
> > > > If you complete that by saying _why_ you need to round up there, then
> > > > the below verbiage is completely redundant.
> > > > 
> > > > > + * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > > > > + * the seq is used to track if a GP is in progress or not. Given this, it is
> > > > > + * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
> > > > > + * why with an example:
> > > > > + *
> > > > > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > > > > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > > > > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > > > > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > > > > + * is already in progress so the next GP that a future callback will be queued
> > > > > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > > > > + * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
> > > > > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > > > > + * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
> > > > > + * overflow didn't occur. This masking is needed because in case RCU was idle
> > > > > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > > > > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > > > > + *
> > > > > + * In other words, the next seq can be obtained by (seq + 0b11 + 0b100) & (~0b11),
> > > > > + * which can be generalized to:
> > > > > + * (seq + RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > > > > + */
> > > > 
> > > > Is the below not much simpler:
> > > > 
> > > > >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> > > > >  {
> > > > >  	unsigned long s;
> > > > 
> > > > 	s = smp_load_acquire(sp);
> > > > 
> > > > 	/* Add one GP */
> > > > 	s += 1 << RCU_SEQ_CTR_SHIFT;
> > > > 
> > > > 	/* Complete any pending state by rounding up */
> > > 
> > > I would suggest this comment be changed to "Add another GP if there was a
> > > pending state".
> > > 
> > > > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > > > 
> > > 
> > > I agree with Peter's suggestions, both the verbiage reduction in the
> > > header comments and the new code he is proposing, which is more
> > > self-documenting. I believe I proposed a big comment only because the
> > > code previously wasn't self-documenting or obvious and so needed an
> > > explanation.
> > > 
> > > How would you like to proceed? Let me know what you guys decide; I am really
> > > OK with anything. If you guys agree, should I write a follow-up patch with
> > > Peter's suggestion that applies on top of this one?  Or do we want to drop
> > > this one in favor of Peter's suggestion?
> > 
> > Shortening the comment would be good, so please do that.
> > 
> > I cannot say that I am much of a fan of the suggested change to the
> > computation, but I don't feel all that strongly about it.  If the two
> 
> Did you mean from a code generation standpoint or from a higher level coding standpoint?

I mean from the viewpoint that this changes the source code from a
straightforward single-line computation that can be seen at a glance
into something using a macro with a "__" prefix that invokes yet another
macro that eventually does something.

And yes, once I looked up the macro definitions, I did see that it is
functionally equivalent to the original code.  ;-)

> From a code generation perspective, the code is identical; I did a quick
> test to confirm that:
> 
> 0000000000000000 <rcu_seq_snap_old>:
>        0:       e8 00 00 00 00          callq  5 <rcu_seq_snap_old+0x5>
>        5:       48 8b 07                mov    (%rdi),%rax
>        8:       f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
>        e:       48 83 c0 07             add    $0x7,%rax
>       12:       48 83 e0 fc             and    $0xfffffffffffffffc,%rax
>       16:       c3                      retq
>       17:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>       1e:       00 00
> 
> 0000000000000020 <rcu_seq_snap_new>:
>       20:       e8 00 00 00 00          callq  25 <rcu_seq_snap_new+0x5>
>       25:       48 8b 07                mov    (%rdi),%rax
>       28:       f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
>       2e:       48 83 c0 07             add    $0x7,%rax
>       32:       48 83 e0 fc             and    $0xfffffffffffffffc,%rax
>       36:       c3                      retq
>       37:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>       3e:       00 00

That is reassuring.

							Thanx, Paul

> > of you agree on a formulation and get at least one other RCU maintainer
> > or reviewer to agree as well, I will take the change.
> > 
> 
> Cool, sounds good.
> 
> thanks!
> 
> - Joel
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-27 18:27         ` Joel Fernandes
  2018-06-27 19:11           ` Paul E. McKenney
@ 2018-06-28  5:10           ` Joel Fernandes
  2018-06-28 17:42             ` Paul E. McKenney
  1 sibling, 1 reply; 46+ messages in thread
From: Joel Fernandes @ 2018-06-28  5:10 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Wed, Jun 27, 2018 at 11:27:26AM -0700, Joel Fernandes wrote:
[..]
> > > > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > > > 
> > > 
> > > I agree with Peter's suggestions, both the verbiage reduction in the
> > > header comments and the new code he is proposing, which is more
> > > self-documenting. I believe I proposed a big comment only because the
> > > code previously wasn't self-documenting or obvious and so needed an
> > > explanation.
> > > 
> > > How would you like to proceed? Let me know what you guys decide; I am really
> > > OK with anything. If you guys agree, should I write a follow-up patch with
> > > Peter's suggestion that applies on top of this one?  Or do we want to drop
> > > this one in favor of Peter's suggestion?
> > 
> > Shortening the comment would be good, so please do that.

Paul,

Do you want to fold the below patch into the original one? Or do you prefer I
resend the original patch fixed up?

Following is the patch on top of the current 'dev' branch in your tree, with the
excessive comments removed.

Thanks to Peter for suggesting!

---8<-----------------------

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Subject: [PATCH] rcu: Remove excessive commentary on rcu_seq_snap

There isn't a strong need to explain rcu_seq_snap in excessive detail
with an example. Remove the unnecessary and redundant comments.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Fixes: 9701945dd79e ("rcu: Add comment documenting how rcu_seq_snap works")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/rcu.h | 22 ----------------------
 1 file changed, 22 deletions(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 0af6ce6d8b66..4d04683c31b2 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -101,28 +101,6 @@ static inline void rcu_seq_end(unsigned long *sp)
  * current time. This value is the current grace-period number plus two to the
  * power of the number of low-order bits reserved for state, then rounded up to
  * the next value in which the state bits are all zero.
- *
- * In the current design, RCU_SEQ_STATE_MASK=3 and the least significant bit of
- * the seq is used to track if a GP is in progress or not. Given this, it is
- * sufficient if we add (6+1) and mask with ~3 to get the next GP. Let's see
- * why with an example:
- *
- * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
- * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
- * to account for the shift due to 2 state bits. Now, if the current seq is
- * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
- * is already in progress so the next GP that a future callback will be queued
- * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
- * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
- * will cause the extra +1 to the GP, along with the usual +1 explained before.
- * This gives us GP+2. Finally we mask the lower two bits by ~0x3 in case the
- * overflow didn't occur. This masking is needed because in case RCU was idle
- * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
- * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
- *
- * In other words, the next seq can be obtained by (seq + 0b11 + 0b100) & (~0b11),
- * which can be generalized to:
- * (seq + RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
  */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
-- 
2.18.0.rc2.346.g013aa6912e-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread
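
As a sanity check on the shortened comment that survives this patch, the
sequence lifecycle can be modeled in user space. gp_start() and gp_end()
below are modeled on the kernel's rcu_seq_start() and rcu_seq_end() (an
illustrative assumption, not the kernel code itself):

#include <assert.h>
#include <stdio.h>

#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

static unsigned long seq;

static void gp_start(void) { seq++; }	/* state bits become nonzero */
static void gp_end(void) { seq = (seq | RCU_SEQ_STATE_MASK) + 1; }

static unsigned long snap(void)
{
	return (seq + 2 * RCU_SEQ_STATE_MASK + 1) & ~RCU_SEQ_STATE_MASK;
}

int main(void)
{
	unsigned long s;

	gp_start();	/* A grace period is in progress at snapshot time... */
	s = snap();
	gp_end();	/* ...so its completion is not enough... */
	assert(seq < s);
	gp_start();
	gp_end();	/* ...but the completion of the next one is. */
	assert(seq >= s);
	printf("snapshot satisfied after the second gp_end()\n");
	return 0;
}

This is exactly the "rounded up to the next value in which the state bits
are all zero" behavior that the remaining header comment describes.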

* Re: [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works
  2018-06-28  5:10           ` Joel Fernandes
@ 2018-06-28 17:42             ` Paul E. McKenney
  0 siblings, 0 replies; 46+ messages in thread
From: Paul E. McKenney @ 2018-06-28 17:42 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Peter Zijlstra, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet,
	fweisbec, oleg, kernel-team

On Wed, Jun 27, 2018 at 10:10:28PM -0700, Joel Fernandes wrote:
> On Wed, Jun 27, 2018 at 11:27:26AM -0700, Joel Fernandes wrote:
> [..]
> > > > > 	s = __ALIGN_MASK(s, RCU_SEQ_STATE_MASK);
> > > > > 
> > > > 
> > > > I agree with Peter's suggestions, both the verbiage reduction in the
> > > > header comments and the new code he is proposing, which is more
> > > > self-documenting. I believe I proposed a big comment only because the
> > > > code previously wasn't self-documenting or obvious and so needed an
> > > > explanation.
> > > > 
> > > > How would you like to proceed? Let me know what you guys decide; I am really
> > > > OK with anything. If you guys agree, should I write a follow-up patch with
> > > > Peter's suggestion that applies on top of this one?  Or do we want to drop
> > > > this one in favor of Peter's suggestion?
> > > 
> > > Shortening the comment would be good, so please do that.
> 
> Paul,
> 
> Do you want to fold the below patch into the original one? Or do you prefer I
> resend the original patch fixed up?
> 
> Following is the patch on top of the current 'dev' branch in your tree, with the
> excessive comments removed.
> 
> Thanks to Peter for suggesting!

I merged this into the current commit, with the result shown below.

Thank you both!

							Thanx, Paul

-----------------------------------------------------------------------

commit c9a6ed70aad9f4571afba3e12e869f5fccc26a40
Author: Joel Fernandes (Google) <joel@joelfernandes.org>
Date:   Tue May 22 23:38:13 2018 -0700

    rcu: Add comment documenting how rcu_seq_snap works
    
    rcu_seq_snap may be tricky to decipher. Let's document how it works with
    an example to make it easier.
    
    Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Shrink comment as suggested by Peter Zijlstra. ]

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index aa215d6355f8..89f13fffac73 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,17 @@ static inline void rcu_seq_end(unsigned long *sp)
 	WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * rcu_seq_snap - Take a snapshot of the update side's sequence number.
+ *
+ * This function returns the earliest value of the grace-period sequence number
+ * that will indicate that a full grace period has elapsed since the current
+ * time.  Once the grace-period sequence number has reached this value, it will
+ * be safe to invoke all callbacks that have been registered prior to the
+ * current time. This value is the current grace-period number plus two to the
+ * power of the number of low-order bits reserved for state, then rounded up to
+ * the next value in which the state bits are all zero.
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
 	unsigned long s;


^ permalink raw reply related	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2018-06-28 17:40 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-26  0:34 [PATCH tip/core/rcu 0/27] Post-gp_seq miscellaneous fixes for v4.19 Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 01/27] rcu: Make rcu_read_unlock_special() static Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 02/27] rcu: Improve rcu_note_voluntary_context_switch() reporting Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 03/27] rcu: rcupdate.h: Get rid of Sphinx warnings at rcu_pointer_handoff() Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 04/27] rcu: Use pr_fmt to prefix "rcu: " to logging output Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 05/27] rcu: Improve RCU-tasks naming and comments Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 06/27] rcu: Mark task as .need_qs less aggressively Paul E. McKenney
2018-06-26 17:08   ` Peter Zijlstra
2018-06-26 18:03     ` Paul E. McKenney
2018-06-26 21:38       ` Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 07/27] rcu: Inline rcu_dynticks_momentary_idle() into its sole caller Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 08/27] rcu: Clarify and correct the rcu_preempt_qs() header comment Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 09/27] rcu: Remove unused rcu_kick_nohz_cpu() function Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 10/27] rcu: Remove unused local variable "cpu" Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 11/27] rcu: Remove "inline" from panic_on_rcu_stall() and rcu_blocking_is_gp() Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 12/27] rcu: Remove "inline" from rcu_torture_print_module_parms() Paul E. McKenney
2018-06-26  0:34 ` [PATCH tip/core/rcu 13/27] rcu: Remove "inline" from rcu_perf_print_module_parms() Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 14/27] rcu: Remove __maybe_unused from rcu_cpu_has_callbacks() Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 15/27] rcu: Use RCU CPU stall timeout for rcu_check_gp_start_stall() Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 16/27] rcu: Add comment documenting how rcu_seq_snap works Paul E. McKenney
2018-06-26 17:14   ` Peter Zijlstra
2018-06-26 18:08     ` Paul E. McKenney
2018-06-26 19:21       ` Peter Zijlstra
2018-06-26 19:31         ` Paul E. McKenney
2018-06-26 20:15           ` Peter Zijlstra
2018-06-26 22:27             ` Paul E. McKenney
2018-06-26 17:30   ` Peter Zijlstra
2018-06-26 18:10     ` Paul E. McKenney
2018-06-26 19:27       ` Peter Zijlstra
2018-06-27  4:39     ` Joel Fernandes
2018-06-27 17:54       ` Paul E. McKenney
2018-06-27 18:27         ` Joel Fernandes
2018-06-27 19:11           ` Paul E. McKenney
2018-06-28  5:10           ` Joel Fernandes
2018-06-28 17:42             ` Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 17/27] rcu: Speed up calling of RCU tasks callbacks Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 18/27] rcu: Add comment to the last sleep in the rcu tasks loop Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 19/27] rcu: Add diagnostics for rcutorture writer stall warning Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 20/27] rcu: Check the range of jiffies_till_{first,next}_fqs when setting them Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 21/27] doc: Update synchronize_rcu() definition in whatisRCU.txt Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 22/27] rcu: Make rcu_seq_diff() more exact Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 23/27] MAINTAINERS: Update RCU, SRCU, and TORTURE-TEST entries Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 24/27] rcu: Print stall-warning NMI dyntick state in hexadecimal Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 25/27] srcu: Add grace-period number to rcutorture statistics printout Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 26/27] rculist: Improve documentation for list_for_each_entry_from_rcu() Paul E. McKenney
2018-06-26  0:35 ` [PATCH tip/core/rcu 27/27] rcu: Assign higher prio to RCU threads if rcutorture is built-in Paul E. McKenney
