linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10
@ 2016-11-14 16:56 Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 1/7] rcu: Tighten up __call_rcu() rcu_head alignment check Paul E. McKenney
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

Hello!

This series contains miscellaneous fixes to RCU:

1.	Tighten up __call_rcu's alignment check for the rcu_head structure
	now that no arch does two-byte alignment of pointers.

2.	Remove obsolete comment about when rcu_check_callbacks() was invoked.

3.	Remove obsolete comment about which function opportunistically
	notes grace periods beginnings and endings.

4.	Update the RCU_TRACE Kconfig option's help text to note that
	it enables event tracing in addition to debugfs, courtesy of
	Nikolay Borisov.

5.	Add event tracing for long read-side delays in rcutorture
	in order to better debug too-short grace periods.

6.	Make expedited grace periods recheck dyntick-idle state to avoid
	sending needless IPIs.

7.	Don't kick CPUs unless there is grace-period activity currently
	in progress.

							Thanx, Paul

------------------------------------------------------------------------

 include/trace/events/rcu.h |    5 ++++-
 kernel/rcu/rcutorture.c    |   11 ++++++++++-
 kernel/rcu/tree.c          |   17 ++++++-----------
 kernel/rcu/tree.h          |    1 +
 kernel/rcu/tree_exp.h      |   12 +++++++++++-
 lib/Kconfig.debug          |    3 ++-
 6 files changed, 34 insertions(+), 15 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 1/7] rcu: Tighten up __call_rcu() rcu_head alignment check
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 2/7] rcu: Remove obsolete rcu_check_callbacks() header comment Paul E. McKenney
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

Commit 720abae3d68ae ("rcu: force alignment on struct
callback_head/rcu_head") forced the rcu_head (AKA callback_head)
structure's alignment to pointer size, that is, to 4-byte boundaries on
32-bit systems and to 8-byte boundaries on 64-bit systems.  This
commit therefore checks for this same alignment in __call_rcu(),
which used to contain a looser check for two-byte alignment.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 kernel/rcu/tree.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 69a5611a7e7c..37e4f7d2be0c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3121,7 +3121,9 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func,
 	unsigned long flags;
 	struct rcu_data *rdp;
 
-	WARN_ON_ONCE((unsigned long)head & 0x1); /* Misaligned rcu_head! */
+	/* Misaligned rcu_head! */
+	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
+
 	if (debug_rcu_head_queue(head)) {
 		/* Probable double call_rcu(), so leak the callback. */
 		WRITE_ONCE(head->func, rcu_leak_callback);
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 2/7] rcu: Remove obsolete rcu_check_callbacks() header comment
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 1/7] rcu: Tighten up __call_rcu() rcu_head alignment check Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 3/7] rcu: Remove obsolete comment from __call_rcu() Paul E. McKenney
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

In the deep past, rcu_check_callbacks() was only invoked if rcu_pending()
returned true.  Which was fine, but these days rcu_check_callbacks()
is invoked unconditionally.  This commit therefore removes the obsolete
sentence from the header comment.

Reported-by: Michalis Kokologiannakis <mixaskok@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 37e4f7d2be0c..d52780024f9f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2828,8 +2828,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
  * Also schedule RCU core processing.
  *
  * This function must be called from hardirq context.  It is normally
- * invoked from the scheduling-clock interrupt.  If rcu_pending returns
- * false, there is no point in invoking rcu_check_callbacks().
+ * invoked from the scheduling-clock interrupt.
  */
 void rcu_check_callbacks(int user)
 {
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 3/7] rcu: Remove obsolete comment from __call_rcu()
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 1/7] rcu: Tighten up __call_rcu() rcu_head alignment check Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 2/7] rcu: Remove obsolete rcu_check_callbacks() header comment Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 4/7] rcu: RCU_TRACE enables event tracing as well as debugfs Paul E. McKenney
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

The __call_rcu() comment about opportunistically noting grace period
beginnings and endings is obsolete.  RCU still does such opportunistic
noting, but in __call_rcu_core() rather than __call_rcu(), and there
already is an appropriate comment in __call_rcu_core().  This commit
therefore removes the obsolete comment.

Reported-by: Michalis Kokologiannakis <mixaskok@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d52780024f9f..865af187498e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3131,13 +3131,6 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func,
 	}
 	head->func = func;
 	head->next = NULL;
-
-	/*
-	 * Opportunistically note grace-period endings and beginnings.
-	 * Note that we might see a beginning right after we see an
-	 * end, but never vice versa, since this CPU has to pass through
-	 * a quiescent state betweentimes.
-	 */
 	local_irq_save(flags);
 	rdp = this_cpu_ptr(rsp->rda);
 
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 4/7] rcu: RCU_TRACE enables event tracing as well as debugfs
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2016-11-14 16:57 ` [PATCH tip/core/rcu 3/7] rcu: Remove obsolete comment from __call_rcu() Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 16:57 ` [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays Paul E. McKenney
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Nikolay Borisov, Paul E. McKenney

From: Nikolay Borisov <kernel@kyup.com>

The commit brings the RCU_TRACE Kconfig option's help text up to date
by noting that it enables additional event tracing as well as debugfs.

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
[ paulmck:  Do some wordsmithing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 lib/Kconfig.debug | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 33bc56cf60d7..4a431f18f218 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1430,7 +1430,8 @@ config RCU_TRACE
 	select TRACE_CLOCK
 	help
 	  This option provides tracing in RCU which presents stats
-	  in debugfs for debugging RCU implementation.
+	  in debugfs for debugging RCU implementation.  It also enables
+	  additional tracepoints for ftrace-style event tracing.
 
 	  Say Y here if you want to enable RCU tracing
 	  Say N if you are unsure.
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2016-11-14 16:57 ` [PATCH tip/core/rcu 4/7] rcu: RCU_TRACE enables event tracing as well as debugfs Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 17:21   ` Josh Triplett
  2016-11-14 16:57 ` [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state Paul E. McKenney
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

Although rcutorture will occasionally do a 50-millisecond grace-period
delay, these delays are quite rare.  And rightly so, because otherwise
the read rate would be quite low.  Thie means that it can be important
to identify whether or not a given run contained a long-delay read.
This commit therefore inserts a trace_rcu_torture_read() event to flag
runs containing long delays.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/trace/events/rcu.h |  5 ++++-
 kernel/rcu/rcutorture.c    | 11 ++++++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index d3e756539d44..b31e05bc8e26 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -698,7 +698,10 @@ TRACE_EVENT(rcu_batch_end,
 /*
  * Tracepoint for rcutorture readers.  The first argument is the name
  * of the RCU flavor from rcutorture's viewpoint and the second argument
- * is the callback address.
+ * is the callback address.  The third callback is the start time in
+ * seconds, and the last two arguments are the grace period numbers
+ * and the beginning and end of the read, respectively.  Note that the
+ * callback address can be NULL.
  */
 TRACE_EVENT(rcu_torture_read,
 
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index bf08fee53dc7..87c51225ceec 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -289,15 +289,24 @@ static int rcu_torture_read_lock(void) __acquires(RCU)
 
 static void rcu_read_delay(struct torture_random_state *rrsp)
 {
+	unsigned long started;
+	unsigned long completed;
 	const unsigned long shortdelay_us = 200;
 	const unsigned long longdelay_ms = 50;
+	unsigned long long ts;
 
 	/* We want a short delay sometimes to make a reader delay the grace
 	 * period, and we want a long delay occasionally to trigger
 	 * force_quiescent_state. */
 
-	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms)))
+	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) {
+		started = cur_ops->completed();
+		ts = rcu_trace_clock_local();
 		mdelay(longdelay_ms);
+		completed = cur_ops->completed();
+		do_trace_rcu_torture_read(cur_ops->name, NULL, ts,
+					  started, completed);
+	}
 	if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us)))
 		udelay(shortdelay_us);
 #ifdef CONFIG_PREEMPT
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2016-11-14 16:57 ` [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 17:25   ` Josh Triplett
  2016-11-14 16:57 ` [PATCH tip/core/rcu 7/7] rcu: Don't kick unless grace period or request Paul E. McKenney
  2016-11-14 17:26 ` [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Josh Triplett
  7 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

Expedited grace periods check dyntick-idle state, and avoid sending
IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
kernels, nohz_full CPUs.  However, the kernel has been observed checking
a CPU while it was non-idle, but sending the IPI after it has gone
idle.  This commit therefore rechecks idle state immediately before
sending the IPI, refraining from IPIing CPUs that have since gone idle.

Reported-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.h     |  1 +
 kernel/rcu/tree_exp.h | 12 +++++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index e99a5234d9ed..fe98dd24adf8 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -404,6 +404,7 @@ struct rcu_data {
 	atomic_long_t exp_workdone1;	/* # done by others #1. */
 	atomic_long_t exp_workdone2;	/* # done by others #2. */
 	atomic_long_t exp_workdone3;	/* # done by others #3. */
+	int exp_dynticks_snap;		/* Double-check need for IPI. */
 
 	/* 7) Callback offloading. */
 #ifdef CONFIG_RCU_NOCB_CPU
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 24343eb87b58..d3053e99fdb6 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -358,8 +358,10 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
 			struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
 			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
 
+			rdp->exp_dynticks_snap =
+				atomic_add_return(0, &rdtp->dynticks);
 			if (raw_smp_processor_id() == cpu ||
-			    !(atomic_add_return(0, &rdtp->dynticks) & 0x1) ||
+			    !(rdp->exp_dynticks_snap & 0x1) ||
 			    !(rnp->qsmaskinitnext & rdp->grpmask))
 				mask_ofl_test |= rdp->grpmask;
 		}
@@ -377,9 +379,17 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
 		/* IPI the remaining CPUs for expedited quiescent state. */
 		for_each_leaf_node_possible_cpu(rnp, cpu) {
 			unsigned long mask = leaf_node_cpu_bit(rnp, cpu);
+			struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
+			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
+
 			if (!(mask_ofl_ipi & mask))
 				continue;
 retry_ipi:
+			if (atomic_add_return(0, &rdtp->dynticks) !=
+			    rdp->exp_dynticks_snap) {
+				mask_ofl_test |= mask;
+				continue;
+			}
 			ret = smp_call_function_single(cpu, func, rsp, 0);
 			if (!ret) {
 				mask_ofl_ipi &= ~mask;
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH tip/core/rcu 7/7] rcu: Don't kick unless grace period or request
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2016-11-14 16:57 ` [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state Paul E. McKenney
@ 2016-11-14 16:57 ` Paul E. McKenney
  2016-11-14 17:26 ` [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Josh Triplett
  7 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 16:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

The current code can result in spurious kicks when there are no grace
periods in progress and no grace-period-related requests.  This is
sort of OK for a diagnostic aid, but the resulting ftrace-dump messages
in dmesg are annoying.  This commit therefore avoids spurious kicks
in the common case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 865af187498e..96c52e43f7ca 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1304,7 +1304,8 @@ static void rcu_stall_kick_kthreads(struct rcu_state *rsp)
 	if (!rcu_kick_kthreads)
 		return;
 	j = READ_ONCE(rsp->jiffies_kick_kthreads);
-	if (time_after(jiffies, j) && rsp->gp_kthread) {
+	if (time_after(jiffies, j) && rsp->gp_kthread &&
+	    (rcu_gp_in_progress(rsp) || READ_ONCE(rsp->gp_flags))) {
 		WARN_ONCE(1, "Kicking %s grace-period kthread\n", rsp->name);
 		rcu_ftrace_dump(DUMP_ALL);
 		wake_up_process(rsp->gp_kthread);
-- 
2.5.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays
  2016-11-14 16:57 ` [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays Paul E. McKenney
@ 2016-11-14 17:21   ` Josh Triplett
  2016-11-14 18:44     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Triplett @ 2016-11-14 17:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 08:57:11AM -0800, Paul E. McKenney wrote:
> Although rcutorture will occasionally do a 50-millisecond grace-period
> delay, these delays are quite rare.  And rightly so, because otherwise
> the read rate would be quite low.  Thie means that it can be important
> to identify whether or not a given run contained a long-delay read.
> This commit therefore inserts a trace_rcu_torture_read() event to flag
> runs containing long delays.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

A couple of apparent typos below.  With those fixed:
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

>  include/trace/events/rcu.h |  5 ++++-
>  kernel/rcu/rcutorture.c    | 11 ++++++++++-
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index d3e756539d44..b31e05bc8e26 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -698,7 +698,10 @@ TRACE_EVENT(rcu_batch_end,
>  /*
>   * Tracepoint for rcutorture readers.  The first argument is the name
>   * of the RCU flavor from rcutorture's viewpoint and the second argument
> - * is the callback address.
> + * is the callback address.  The third callback is the start time in
> + * seconds, and the last two arguments are the grace period numbers
> + * and the beginning and end of the read, respectively.  Note that the
> + * callback address can be NULL.

s/third callback/third argument/?

Also, s/and the beginning/of the beginning/?

>  TRACE_EVENT(rcu_torture_read,
>  
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index bf08fee53dc7..87c51225ceec 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -289,15 +289,24 @@ static int rcu_torture_read_lock(void) __acquires(RCU)
>  
>  static void rcu_read_delay(struct torture_random_state *rrsp)
>  {
> +	unsigned long started;
> +	unsigned long completed;
>  	const unsigned long shortdelay_us = 200;
>  	const unsigned long longdelay_ms = 50;
> +	unsigned long long ts;
>  
>  	/* We want a short delay sometimes to make a reader delay the grace
>  	 * period, and we want a long delay occasionally to trigger
>  	 * force_quiescent_state. */
>  
> -	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms)))
> +	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) {
> +		started = cur_ops->completed();
> +		ts = rcu_trace_clock_local();
>  		mdelay(longdelay_ms);
> +		completed = cur_ops->completed();
> +		do_trace_rcu_torture_read(cur_ops->name, NULL, ts,
> +					  started, completed);
> +	}
>  	if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us)))
>  		udelay(shortdelay_us);
>  #ifdef CONFIG_PREEMPT
> -- 
> 2.5.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-14 16:57 ` [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state Paul E. McKenney
@ 2016-11-14 17:25   ` Josh Triplett
  2016-11-14 17:37     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Triplett @ 2016-11-14 17:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> Expedited grace periods check dyntick-idle state, and avoid sending
> IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> kernels, nohz_full CPUs.  However, the kernel has been observed checking
> a CPU while it was non-idle, but sending the IPI after it has gone
> idle.  This commit therefore rechecks idle state immediately before
> sending the IPI, refraining from IPIing CPUs that have since gone idle.
> 
> Reported-by: Rik van Riel <riel@redhat.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

atomic_add_return(0, ...) seems odd.  Do you actually want that, rather
than atomic_read(...)?  If so, can you please document exactly why?

>  kernel/rcu/tree.h     |  1 +
>  kernel/rcu/tree_exp.h | 12 +++++++++++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index e99a5234d9ed..fe98dd24adf8 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -404,6 +404,7 @@ struct rcu_data {
>  	atomic_long_t exp_workdone1;	/* # done by others #1. */
>  	atomic_long_t exp_workdone2;	/* # done by others #2. */
>  	atomic_long_t exp_workdone3;	/* # done by others #3. */
> +	int exp_dynticks_snap;		/* Double-check need for IPI. */
>  
>  	/* 7) Callback offloading. */
>  #ifdef CONFIG_RCU_NOCB_CPU
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 24343eb87b58..d3053e99fdb6 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -358,8 +358,10 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
>  			struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
>  			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
>  
> +			rdp->exp_dynticks_snap =
> +				atomic_add_return(0, &rdtp->dynticks);
>  			if (raw_smp_processor_id() == cpu ||
> -			    !(atomic_add_return(0, &rdtp->dynticks) & 0x1) ||
> +			    !(rdp->exp_dynticks_snap & 0x1) ||
>  			    !(rnp->qsmaskinitnext & rdp->grpmask))
>  				mask_ofl_test |= rdp->grpmask;
>  		}
> @@ -377,9 +379,17 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
>  		/* IPI the remaining CPUs for expedited quiescent state. */
>  		for_each_leaf_node_possible_cpu(rnp, cpu) {
>  			unsigned long mask = leaf_node_cpu_bit(rnp, cpu);
> +			struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
> +			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> +
>  			if (!(mask_ofl_ipi & mask))
>  				continue;
>  retry_ipi:
> +			if (atomic_add_return(0, &rdtp->dynticks) !=
> +			    rdp->exp_dynticks_snap) {
> +				mask_ofl_test |= mask;
> +				continue;
> +			}
>  			ret = smp_call_function_single(cpu, func, rsp, 0);
>  			if (!ret) {
>  				mask_ofl_ipi &= ~mask;
> -- 
> 2.5.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10
  2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2016-11-14 16:57 ` [PATCH tip/core/rcu 7/7] rcu: Don't kick unless grace period or request Paul E. McKenney
@ 2016-11-14 17:26 ` Josh Triplett
  7 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2016-11-14 17:26 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 08:56:48AM -0800, Paul E. McKenney wrote:
> Hello!
> 
> This series contains miscellaneous fixes to RCU:
> 
> 1.	Tighten up __call_rcu's alignment check for the rcu_head structure
> 	now that no arch does two-byte alignment of pointers.
> 
> 2.	Remove obsolete comment about when rcu_check_callbacks() was invoked.
> 
> 3.	Remove obsolete comment about which function opportunistically
> 	notes grace periods beginnings and endings.
> 
> 4.	Update the RCU_TRACE Kconfig option's help text to note that
> 	it enables event tracing in addition to debugfs, courtesy of
> 	Nikolay Borisov.
> 
> 5.	Add event tracing for long read-side delays in rcutorture
> 	in order to better debug too-short grace periods.
> 
> 6.	Make expedited grace periods recheck dyntick-idle state to avoid
> 	sending needless IPIs.
> 
> 7.	Don't kick CPUs unless there is grace-period activity currently
> 	in progress.

I replied to patches 5 and 6 with comments.  For 1-4 and 7:
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-14 17:25   ` Josh Triplett
@ 2016-11-14 17:37     ` Peter Zijlstra
  2016-11-14 18:12       ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2016-11-14 17:37 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Paul E. McKenney, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, tglx, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > Expedited grace periods check dyntick-idle state, and avoid sending
> > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > kernels, nohz_full CPUs.  However, the kernel has been observed checking
> > a CPU while it was non-idle, but sending the IPI after it has gone
> > idle.  This commit therefore rechecks idle state immediately before
> > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > 
> > Reported-by: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> atomic_add_return(0, ...) seems odd.  Do you actually want that, rather
> than atomic_read(...)?  If so, can you please document exactly why?

Yes that is weird. The only effective difference is that it would do a
load-exclusive instead of a regular load.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-14 17:37     ` Peter Zijlstra
@ 2016-11-14 18:12       ` Paul E. McKenney
  2016-11-15  8:16         ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 18:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Triplett, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 06:37:33PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> > On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > > Expedited grace periods check dyntick-idle state, and avoid sending
> > > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > > kernels, nohz_full CPUs.  However, the kernel has been observed checking
> > > a CPU while it was non-idle, but sending the IPI after it has gone
> > > idle.  This commit therefore rechecks idle state immediately before
> > > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > > 
> > > Reported-by: Rik van Riel <riel@redhat.com>
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > atomic_add_return(0, ...) seems odd.  Do you actually want that, rather
> > than atomic_read(...)?  If so, can you please document exactly why?
> 
> Yes that is weird. The only effective difference is that it would do a
> load-exclusive instead of a regular load.

It is weird, and checking to see if it is safe to convert it and its
friends to something with less overhead is on my list.   This starts
with a patch series I will post soon that consolidates all these
atomic_add_return() calls into a single function, which will ease testing
and other verification.

All that aside, please keep in mind that much is required from this load.
It is part of a network of ordered operations that guarantee that any
operation from any CPU preceding a given grace period is seen to precede
any other operation from any CPU following that same grace period.
And each and every CPU must agree on the order of those two operations,
otherwise, RCU is broken.

In addition, please note also that these operations are nowhere near
any fastpaths.

In fact, the specific operation you are concerned about is in an expedited
grace period, which has significant overhead.  This this added IPI can
substitute for an IPI, so, believe it or not, is an optimization.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays
  2016-11-14 17:21   ` Josh Triplett
@ 2016-11-14 18:44     ` Paul E. McKenney
  2016-11-14 18:54       ` Josh Triplett
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-14 18:44 UTC (permalink / raw)
  To: Josh Triplett
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 09:21:10AM -0800, Josh Triplett wrote:
> On Mon, Nov 14, 2016 at 08:57:11AM -0800, Paul E. McKenney wrote:
> > Although rcutorture will occasionally do a 50-millisecond grace-period
> > delay, these delays are quite rare.  And rightly so, because otherwise
> > the read rate would be quite low.  Thie means that it can be important
> > to identify whether or not a given run contained a long-delay read.
> > This commit therefore inserts a trace_rcu_torture_read() event to flag
> > runs containing long delays.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> A couple of apparent typos below.  With those fixed:
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> 
> >  include/trace/events/rcu.h |  5 ++++-
> >  kernel/rcu/rcutorture.c    | 11 ++++++++++-
> >  2 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> > index d3e756539d44..b31e05bc8e26 100644
> > --- a/include/trace/events/rcu.h
> > +++ b/include/trace/events/rcu.h
> > @@ -698,7 +698,10 @@ TRACE_EVENT(rcu_batch_end,
> >  /*
> >   * Tracepoint for rcutorture readers.  The first argument is the name
> >   * of the RCU flavor from rcutorture's viewpoint and the second argument
> > - * is the callback address.
> > + * is the callback address.  The third callback is the start time in
> > + * seconds, and the last two arguments are the grace period numbers
> > + * and the beginning and end of the read, respectively.  Note that the
> > + * callback address can be NULL.
> 
> s/third callback/third argument/?

Good catch, fixed!

> Also, s/and the beginning/of the beginning/?

Let's see...  "the last two arguments are the grace period numbers and
the beginning and end of the read, respectively."  -ENONSENSE for sure.

I believe that the "and" needs to become "at" as follows:

"the last two arguments are the grace period numbers at the beginning
and end of the read, respectively."

Does that help?

							Thanx, Paul

> >  TRACE_EVENT(rcu_torture_read,
> >  
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index bf08fee53dc7..87c51225ceec 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -289,15 +289,24 @@ static int rcu_torture_read_lock(void) __acquires(RCU)
> >  
> >  static void rcu_read_delay(struct torture_random_state *rrsp)
> >  {
> > +	unsigned long started;
> > +	unsigned long completed;
> >  	const unsigned long shortdelay_us = 200;
> >  	const unsigned long longdelay_ms = 50;
> > +	unsigned long long ts;
> >  
> >  	/* We want a short delay sometimes to make a reader delay the grace
> >  	 * period, and we want a long delay occasionally to trigger
> >  	 * force_quiescent_state. */
> >  
> > -	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms)))
> > +	if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) {
> > +		started = cur_ops->completed();
> > +		ts = rcu_trace_clock_local();
> >  		mdelay(longdelay_ms);
> > +		completed = cur_ops->completed();
> > +		do_trace_rcu_torture_read(cur_ops->name, NULL, ts,
> > +					  started, completed);
> > +	}
> >  	if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us)))
> >  		udelay(shortdelay_us);
> >  #ifdef CONFIG_PREEMPT
> > -- 
> > 2.5.2
> > 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays
  2016-11-14 18:44     ` Paul E. McKenney
@ 2016-11-14 18:54       ` Josh Triplett
  0 siblings, 0 replies; 17+ messages in thread
From: Josh Triplett @ 2016-11-14 18:54 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 10:44:21AM -0800, Paul E. McKenney wrote:
> On Mon, Nov 14, 2016 at 09:21:10AM -0800, Josh Triplett wrote:
> > On Mon, Nov 14, 2016 at 08:57:11AM -0800, Paul E. McKenney wrote:
> > > Although rcutorture will occasionally do a 50-millisecond grace-period
> > > delay, these delays are quite rare.  And rightly so, because otherwise
> > > the read rate would be quite low.  Thie means that it can be important
> > > to identify whether or not a given run contained a long-delay read.
> > > This commit therefore inserts a trace_rcu_torture_read() event to flag
> > > runs containing long delays.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > A couple of apparent typos below.  With those fixed:
> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> > 
> > >  include/trace/events/rcu.h |  5 ++++-
> > >  kernel/rcu/rcutorture.c    | 11 ++++++++++-
> > >  2 files changed, 14 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> > > index d3e756539d44..b31e05bc8e26 100644
> > > --- a/include/trace/events/rcu.h
> > > +++ b/include/trace/events/rcu.h
> > > @@ -698,7 +698,10 @@ TRACE_EVENT(rcu_batch_end,
> > >  /*
> > >   * Tracepoint for rcutorture readers.  The first argument is the name
> > >   * of the RCU flavor from rcutorture's viewpoint and the second argument
> > > - * is the callback address.
> > > + * is the callback address.  The third callback is the start time in
> > > + * seconds, and the last two arguments are the grace period numbers
> > > + * and the beginning and end of the read, respectively.  Note that the
> > > + * callback address can be NULL.
> > 
> > s/third callback/third argument/?
> 
> Good catch, fixed!
> 
> > Also, s/and the beginning/of the beginning/?
> 
> Let's see...  "the last two arguments are the grace period numbers and
> the beginning and end of the read, respectively."  -ENONSENSE for sure.
> 
> I believe that the "and" needs to become "at" as follows:
> 
> "the last two arguments are the grace period numbers at the beginning
> and end of the read, respectively."
> 
> Does that help?

That works, yes.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-14 18:12       ` Paul E. McKenney
@ 2016-11-15  8:16         ` Peter Zijlstra
  2016-11-15 14:36           ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2016-11-15  8:16 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Mon, Nov 14, 2016 at 10:12:37AM -0800, Paul E. McKenney wrote:
> On Mon, Nov 14, 2016 at 06:37:33PM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> > > On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > > > Expedited grace periods check dyntick-idle state, and avoid sending
> > > > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > > > kernels, nohz_full CPUs.  However, the kernel has been observed checking
> > > > a CPU while it was non-idle, but sending the IPI after it has gone
> > > > idle.  This commit therefore rechecks idle state immediately before
> > > > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > > > 
> > > > Reported-by: Rik van Riel <riel@redhat.com>
> > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > 
> > > atomic_add_return(0, ...) seems odd.  Do you actually want that, rather
> > > than atomic_read(...)?  If so, can you please document exactly why?
> > 
> > Yes that is weird. The only effective difference is that it would do a
> > load-exclusive instead of a regular load.
> 
> It is weird, and checking to see if it is safe to convert it and its
> friends to something with less overhead is on my list.   This starts
> with a patch series I will post soon that consolidates all these
> atomic_add_return() calls into a single function, which will ease testing
> and other verification.
> 
> All that aside, please keep in mind that much is required from this load.
> It is part of a network of ordered operations that guarantee that any
> operation from any CPU preceding a given grace period is seen to precede
> any other operation from any CPU following that same grace period.
> And each and every CPU must agree on the order of those two operations,
> otherwise, RCU is broken.

OK, so something similar to:

	smp_mb();
	atomic_read();

then? That would order, with global transitivity, against prior
operations.

> In addition, please note also that these operations are nowhere near
> any fastpaths.

My concern is mostly that it reads very weird. I appreciate this not
being fast path code, but confusing code is bad in any form.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
  2016-11-15  8:16         ` Peter Zijlstra
@ 2016-11-15 14:36           ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2016-11-15 14:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Triplett, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, rostedt, dhowells, edumazet, dvhart,
	fweisbec, oleg, bobby.prani

On Tue, Nov 15, 2016 at 09:16:55AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2016 at 10:12:37AM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 14, 2016 at 06:37:33PM +0100, Peter Zijlstra wrote:
> > > On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> > > > On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > > > > Expedited grace periods check dyntick-idle state, and avoid sending
> > > > > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > > > > kernels, nohz_full CPUs.  However, the kernel has been observed checking
> > > > > a CPU while it was non-idle, but sending the IPI after it has gone
> > > > > idle.  This commit therefore rechecks idle state immediately before
> > > > > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > > > > 
> > > > > Reported-by: Rik van Riel <riel@redhat.com>
> > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > 
> > > > atomic_add_return(0, ...) seems odd.  Do you actually want that, rather
> > > > than atomic_read(...)?  If so, can you please document exactly why?
> > > 
> > > Yes that is weird. The only effective difference is that it would do a
> > > load-exclusive instead of a regular load.
> > 
> > It is weird, and checking to see if it is safe to convert it and its
> > friends to something with less overhead is on my list.   This starts
> > with a patch series I will post soon that consolidates all these
> > atomic_add_return() calls into a single function, which will ease testing
> > and other verification.
> > 
> > All that aside, please keep in mind that much is required from this load.
> > It is part of a network of ordered operations that guarantee that any
> > operation from any CPU preceding a given grace period is seen to precede
> > any other operation from any CPU following that same grace period.
> > And each and every CPU must agree on the order of those two operations,
> > otherwise, RCU is broken.
> 
> OK, so something similar to:
> 
> 	smp_mb();
> 	atomic_read();
> 
> then? That would order, with global transitivity, against prior
> operations.

Maybe.  The consolidation in the later patch series is a first step
towards potential weakening.

> > In addition, please note also that these operations are nowhere near
> > any fastpaths.
> 
> My concern is mostly that it reads very weird. I appreciate this not
> being fast path code, but confusing code is bad in any form.

It is the long-standing code that has been checking dyntick-idle counters
for quite some time.  Just applying that same code to a new use case
in within the expedited grace periods, as you can see by looking a bit
earlier in that same function.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-11-15 14:36 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-14 16:56 [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 1/7] rcu: Tighten up __call_rcu() rcu_head alignment check Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 2/7] rcu: Remove obsolete rcu_check_callbacks() header comment Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 3/7] rcu: Remove obsolete comment from __call_rcu() Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 4/7] rcu: RCU_TRACE enables event tracing as well as debugfs Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 5/7] torture: Trace long read-side delays Paul E. McKenney
2016-11-14 17:21   ` Josh Triplett
2016-11-14 18:44     ` Paul E. McKenney
2016-11-14 18:54       ` Josh Triplett
2016-11-14 16:57 ` [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state Paul E. McKenney
2016-11-14 17:25   ` Josh Triplett
2016-11-14 17:37     ` Peter Zijlstra
2016-11-14 18:12       ` Paul E. McKenney
2016-11-15  8:16         ` Peter Zijlstra
2016-11-15 14:36           ` Paul E. McKenney
2016-11-14 16:57 ` [PATCH tip/core/rcu 7/7] rcu: Don't kick unless grace period or request Paul E. McKenney
2016-11-14 17:26 ` [PATCH tip/core/rcu 0/7] Miscellaneous fixes for 4.10 Josh Triplett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).