All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: rcu <rcu@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@fb.com>, Ingo Molnar <mingo@kernel.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	rostedt <rostedt@goodmis.org>,
	David Howells <dhowells@redhat.com>,
	Eric Dumazet <edumazet@google.com>, fweisbec <fweisbec@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 rcu 04/18] rcu: Weaken ->dynticks accesses and updates
Date: Wed, 28 Jul 2021 13:28:02 -0700	[thread overview]
Message-ID: <20210728202802.GL4397@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <874308613.9545.1627502582005.JavaMail.zimbra@efficios.com>

On Wed, Jul 28, 2021 at 04:03:02PM -0400, Mathieu Desnoyers wrote:
> ----- On Jul 28, 2021, at 3:45 PM, paulmck paulmck@kernel.org wrote:
> [...]
> > 
> > And how about like this?
> > 
> >						Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit cb8914dcc6443cca15ce48d937a93c0dfdb114d3
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Wed Jul 28 12:38:42 2021 -0700
> > 
> >    rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
> >    
> >    The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
> >    counter of an incoming CPU if required.  It is currently is invoked
> 
> "is currently is" -> "is currently"

Good catch, fixed!

> >    from rcutree_prepare_cpu(), which runs before the incoming CPU is
> >    running, and thus on some other CPU.  This makes the per-CPU accesses in
> >    rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
> >    the running CPU cannot possibly be in dyntick-idle mode, which means
> >    that rcu_dynticks_eqs_online() never has any effect.  One could argue
> >    that this means that rcu_dynticks_eqs_online() is unnecessary, however,
> >    removing it makes the CPU-online process vulnerable to slight changes
> >    in the CPU-offline process.
> 
> Why favor moving this from the prepare_cpu to the cpu_starting hotplug step,
> rather than using the target cpu's rdp from rcutree_prepare_cpu ? Maybe there
> was a good reason for having this very early in the prepare_cpu step ?

Some years back, there was a good reason. This reason was that
rcutree_prepare_cpu() marked the CPU as being online from an RCU
viewpoint.  But now rcu_cpu_starting() is the one that marks the CPU as
being online, so the ->dynticks check can be deferred to this function.

> Also, the commit message refers to this bug as having no effect because the
> running CPU cannot possibly be in dyntick-idle mode. I understand that calling
> this function was indeed effect-less, but then why is it OK for the CPU coming
> online to skip this call in the first place ? This commit message hints at
> "slight changes in the CPU-offline process" which could break it, but therer is
> no explanation of what makes this not an actual bug fix.

Because rcutorture would not have suffered in silence had this
situation ever arisen.

I have updated the commit log to answer these questions as shown
below.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit 516c8c4cc6fce62539f7e0182739812db4591c1d
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed Jul 28 12:38:42 2021 -0700

    rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
    
    The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
    counter of an incoming CPU when required.  It is currently invoked
    from rcutree_prepare_cpu(), which runs before the incoming CPU is
    running, and thus on some other CPU.  This makes the per-CPU accesses in
    rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
    the running CPU cannot possibly be in dyntick-idle mode, which means
    that rcu_dynticks_eqs_online() never has any effect.
    
    It is currently OK for rcu_dynticks_eqs_online() to have no effect, but
    only because the CPU-offline process just happens to leave ->dynticks in
    the correct state.  After all, if ->dynticks were in the wrong state on a
    just-onlined CPU, rcutorture would complain bitterly the next time that
    CPU went idle, at least in kernels built with CONFIG_RCU_EQS_DEBUG=y,
    for example, those built by rcutorture scenario TREE04.  One could
    argue that this means that rcu_dynticks_eqs_online() is unnecessary,
    however, removing it would make the CPU-online process vulnerable to
    slight changes in the CPU-offline process.
    
    One could also ask why it is safe to move the rcu_dynticks_eqs_online()
    call so late in the CPU-online process.  Indeed, there was a time when it
    would not have been safe, which does much to explain its current location.
    However, the marking of a CPU as online from an RCU perspective has long
    since moved from rcutree_prepare_cpu() to rcu_cpu_starting(), and all
    that is required is that ->dynticks be set correctly by the time that
    the CPU is marked as online from an RCU perspective.  After all, the RCU
    grace-period kthread does not check to see if offline CPUs are also idle.
    (In case you were curious, this is one reason why there is quiescent-state
    reporting as part of the offlining process.)
    
    This commit therefore moves the call to rcu_dynticks_eqs_online() from
    rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
    to be running on the incoming CPU.  The call to this function must of
    course be placed before this rcu_cpu_starting() announces this CPU's
    presence to RCU.
    
    Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0172a5fd6d8de..aa00babdaf544 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4129,7 +4129,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
 	rdp->blimit = blimit;
 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
-	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 
 	/*
@@ -4249,6 +4248,7 @@ void rcu_cpu_starting(unsigned int cpu)
 	mask = rdp->grpmask;
 	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
 	WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
+	rcu_dynticks_eqs_online();
 	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);

  reply	other threads:[~2021-07-28 20:28 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-21 20:20 [PATCH rcu 0/18] Miscellaneous fixes for v5.15 Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 01/18] rcu: Fix to include first blocked task in stall warning Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 02/18] rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock Paul E. McKenney
2021-08-03 14:24   ` Qais Yousef
2021-08-03 15:52     ` Paul E. McKenney
2021-08-03 16:12       ` Qais Yousef
2021-08-03 16:28         ` Paul E. McKenney
2021-08-03 16:33           ` Qais Yousef
2021-08-04 13:50           ` Qais Yousef
2021-08-04 22:33             ` Paul E. McKenney
2021-08-06  9:56               ` Qais Yousef
2021-08-06  9:57   ` Qais Yousef
2021-08-06 11:43     ` Paul E. McKenney
2021-08-06 12:33       ` Qais Yousef
2021-07-21 20:21 ` [PATCH rcu 03/18] rcu: Remove special bit at the bottom of the ->dynticks counter Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates Paul E. McKenney
2021-07-21 20:41   ` Linus Torvalds
2021-07-21 21:25     ` Paul E. McKenney
2021-07-28 17:37   ` [PATCH v2 " Paul E. McKenney
2021-07-28 17:58     ` Linus Torvalds
2021-07-28 18:12       ` Mathieu Desnoyers
2021-07-28 18:32         ` Linus Torvalds
2021-07-28 18:39           ` Mathieu Desnoyers
2021-07-28 18:46         ` Paul E. McKenney
2021-07-28 18:46       ` Paul E. McKenney
2021-07-28 18:57         ` Linus Torvalds
2021-07-28 18:23     ` Mathieu Desnoyers
2021-07-28 18:58       ` Paul E. McKenney
2021-07-28 19:45         ` Paul E. McKenney
2021-07-28 20:03           ` Mathieu Desnoyers
2021-07-28 20:28             ` Paul E. McKenney [this message]
2021-07-29 14:41               ` Mathieu Desnoyers
2021-07-29 15:57                 ` Paul E. McKenney
2021-07-29 17:41                   ` Mathieu Desnoyers
2021-07-29 18:05                     ` Paul E. McKenney
2021-07-29 18:42                       ` Mathieu Desnoyers
2021-07-28 20:37     ` Josh Triplett
2021-07-28 20:47       ` Paul E. McKenney
2021-07-28 22:23         ` Frederic Weisbecker
2021-07-29  1:07           ` Paul E. McKenney
2021-07-29  7:58   ` [PATCH " Boqun Feng
2021-07-29 10:53     ` Frederic Weisbecker
2021-07-30  5:56       ` Boqun Feng
2021-07-30 17:18         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 05/18] rcu: Mark accesses to ->rcu_read_lock_nesting Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 06/18] rculist: Unify documentation about missing list_empty_rcu() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 07/18] rcu/tree: Handle VM stoppage in stall detection Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 08/18] rcu: Do not disable GP stall detection in rcu_cpu_stall_reset() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 09/18] rcu: Start timing stall repetitions after warning complete Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 10/18] srcutiny: Mark read-side data races Paul E. McKenney
2021-07-29  8:23   ` Boqun Feng
2021-07-29 13:36     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 11/18] rcu: Mark lockless ->qsmask read in rcu_check_boost_fail() Paul E. McKenney
2021-07-29  8:54   ` Boqun Feng
2021-07-29 14:03     ` Paul E. McKenney
2021-07-30  2:28       ` Boqun Feng
2021-07-30  3:26         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 12/18] rcu: Make rcu_gp_init() and rcu_gp_fqs_loop noinline to conserve stack Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 13/18] rcu: Remove trailing spaces and tabs Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 14/18] rcu: Mark accesses in tree_stall.h Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 15/18] rcu: Remove useless "ret" update in rcu_gp_fqs_loop() Paul E. McKenney
2021-08-03 16:48   ` Joe Perches
2021-08-03 17:10     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 16/18] rcu: Use per_cpu_ptr to get the pointer of per_cpu variable Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 17/18] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 18/18] rcu: Print human-readable message for schedule() in RCU reader Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210728202802.GL4397@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.