rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: paulmck <paulmck@kernel.org>
Cc: rcu <rcu@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@fb.com>, Ingo Molnar <mingo@kernel.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	rostedt <rostedt@goodmis.org>,
	David Howells <dhowells@redhat.com>,
	Eric Dumazet <edumazet@google.com>, fweisbec <fweisbec@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 rcu 04/18] rcu: Weaken ->dynticks accesses and updates
Date: Thu, 29 Jul 2021 14:42:17 -0400 (EDT)	[thread overview]
Message-ID: <2059529299.18.1627584137760.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20210729180517.GZ4397@paulmck-ThinkPad-P17-Gen-1>

----- On Jul 29, 2021, at 2:05 PM, paulmck paulmck@kernel.org wrote:

> On Thu, Jul 29, 2021 at 01:41:41PM -0400, Mathieu Desnoyers wrote:
>> ----- On Jul 29, 2021, at 11:57 AM, paulmck paulmck@kernel.org wrote:
>> 
>> > On Thu, Jul 29, 2021 at 10:41:18AM -0400, Mathieu Desnoyers wrote:
>> >> ----- On Jul 28, 2021, at 4:28 PM, paulmck paulmck@kernel.org wrote:
>> >> 
>> >> > On Wed, Jul 28, 2021 at 04:03:02PM -0400, Mathieu Desnoyers wrote:
>> >> >> ----- On Jul 28, 2021, at 3:45 PM, paulmck paulmck@kernel.org wrote:
>> >> >> [...]
>> >> >> > 
>> >> >> > And how about like this?
>> >> >> > 
>> >> >> >						Thanx, Paul
>> >> >> > 
>> >> >> > ------------------------------------------------------------------------
>> >> >> > 
>> >> >> > commit cb8914dcc6443cca15ce48d937a93c0dfdb114d3
>> >> >> > Author: Paul E. McKenney <paulmck@kernel.org>
>> >> >> > Date:   Wed Jul 28 12:38:42 2021 -0700
>> >> >> > 
>> >> >> >    rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
>> >> >> >    
>> >> >> >    The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
>> >> >> >    counter of an incoming CPU if required.  It is currently is invoked
>> >> >> 
>> >> >> "is currently is" -> "is currently"
>> >> > 
>> >> > Good catch, fixed!
>> >> > 
>> >> >> >    from rcutree_prepare_cpu(), which runs before the incoming CPU is
>> >> >> >    running, and thus on some other CPU.  This makes the per-CPU accesses in
>> >> >> >    rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
>> >> >> >    the running CPU cannot possibly be in dyntick-idle mode, which means
>> >> >> >    that rcu_dynticks_eqs_online() never has any effect.  One could argue
>> >> >> >    that this means that rcu_dynticks_eqs_online() is unnecessary, however,
>> >> >> >    removing it makes the CPU-online process vulnerable to slight changes
>> >> >> >    in the CPU-offline process.
>> >> >> 
>> >> >> Why favor moving this from the prepare_cpu to the cpu_starting hotplug step,
>> >> >> rather than using the target cpu's rdp from rcutree_prepare_cpu ? Maybe there
>> >> >> was a good reason for having this very early in the prepare_cpu step ?
>> >> > 
>> >> > Some years back, there was a good reason. This reason was that
>> >> > rcutree_prepare_cpu() marked the CPU as being online from an RCU
>> >> > viewpoint.  But now rcu_cpu_starting() is the one that marks the CPU as
>> >> > being online, so the ->dynticks check can be deferred to this function.
>> >> > 
>> >> >> Also, the commit message refers to this bug as having no effect because the
>> >> >> running CPU cannot possibly be in dyntick-idle mode. I understand that calling
>> >> >> this function was indeed effect-less, but then why is it OK for the CPU coming
>> >> >> online to skip this call in the first place ? This commit message hints at
>> >> >> "slight changes in the CPU-offline process" which could break it, but therer is
>> >> >> no explanation of what makes this not an actual bug fix.
>> >> > 
>> >> > Because rcutorture would not have suffered in silence had this
>> >> > situation ever arisen.
>> >> 
>> >> Testing can usually prove the presence of a bug, but it's rather tricky to prove
>> >> the absence of bug.
>> > 
>> > In general, true enough.
>> > 
>> > But in this particular case, a WARN would have deterministically triggered
>> > the very next time that this CPU found its way either to the idle loop
>> > or to nohz_full usermode execution.
>> > 
>> >> > I have updated the commit log to answer these questions as shown
>> >> > below.  Thoughts?
>> >> 
>> >> I'm still concerned about one scenario wrt moving rcu_dynticks_eqs_online()
>> >> from rcutree_prepare_cpu to rcu_cpu_starting. What happens if an interrupt
>> >> handler, or a NMI handler, nests early over the CPU-online startup code ?
>> >> AFAIU, this interrupt handler could contain RCU read-side critical sections,
>> >> but if the eqs state does not show the CPU as "online", I wonder whether it
>> >> will work as expected.
>> > 
>> > Interrupts are still disabled at this point in the onlining process,
>> > my _irqsave() locks notwithstanding.
>> > 
>> > You are right about NMI handlers, but there would be much more damage
>> > from an early NMI handler than mere RCU issues.  And this can be handled
>> > as described in the next paragraph.
>> > 
>> > Now, there are architectures (including x86) that need RCU readers
>> > before notify_cpu_starting() time (which is where rcu_cpu_starting()
>> > is invoked by default, and those architectures can (and do) simply
>> > place a call to rcu_cpu_starting() at an earlier appropriate point in
>> > the architecture-specific CPU-bringup code.  And this is in fact the
>> > reason for the ->cpu_started check at the beginning of rcu_cpu_starting().
>> > So an architecture using NMIs early in the CPU-bringup code should
>> > invoke rcu_cpu_starting() before enabling NMIs.
>> > 
>> > So why not just be safe and mark the CPU online early in the process?
>> > 
>> > Because that could result in unbounded grace periods and strange
>> > deadlocks.  These deadlocks were broken earlier by code that assumed that
>> > a CPU could not possibly take more than one jiffy to come online, but that
>> > assumption is clearly broken on todays systems, even the bare-metal ones.
>> > 
>> > In theory, I could change the raw_spin_lock_irqsave_rcu_node() to
>> > raw_spin_lock_rcu_node(), rely on the lockdep_assert_irqs_disabled()
>> > in the matching raw_spin_unlock_rcu_node(), and ditch the "flags"
>> > local variable, but rcu_report_qs_rnp() needs that "flags" argument.
>> > 
>> > OK, I guess one approach is to get the "flags" value from local_save_flags()
>> > and get rid of the _irqsave and _irq restore.  Assuming lockdep is
>> > functional that early in CPU bringup.
>> > 
>> > But would that really be better than status quo?
>> 
>> I'm OK with your explanation about the fact that interrupts and NMIs scenarios
>> are correctly handled, so moving this call from prepare_cpu to cpu_starting
>> is fine with me.
> 
> I will add a "Link:" to this conversation.
> 
> May I also add your "Acked-by" or similar?

Of course, feel free to add my Acked-by.

Thanks,

Mathieu

> 
>							Thanx, Paul
> 
>> Thanks,
>> 
>> Mathieu
>> 
>> > 
>> >							Thanx, Paul
>> > 
>> >> Thanks,
>> >> 
>> >> Mathieu
>> >> 
>> >> > 
>> >> >							Thanx, Paul
>> >> > 
>> >> > ------------------------------------------------------------------------
>> >> > 
>> >> > commit 516c8c4cc6fce62539f7e0182739812db4591c1d
>> >> > Author: Paul E. McKenney <paulmck@kernel.org>
>> >> > Date:   Wed Jul 28 12:38:42 2021 -0700
>> >> > 
>> >> >    rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
>> >> >    
>> >> >    The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
>> >> >    counter of an incoming CPU when required.  It is currently invoked
>> >> >    from rcutree_prepare_cpu(), which runs before the incoming CPU is
>> >> >    running, and thus on some other CPU.  This makes the per-CPU accesses in
>> >> >    rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
>> >> >    the running CPU cannot possibly be in dyntick-idle mode, which means
>> >> >    that rcu_dynticks_eqs_online() never has any effect.
>> >> >    
>> >> >    It is currently OK for rcu_dynticks_eqs_online() to have no effect, but
>> >> >    only because the CPU-offline process just happens to leave ->dynticks in
>> >> >    the correct state.  After all, if ->dynticks were in the wrong state on a
>> >> >    just-onlined CPU, rcutorture would complain bitterly the next time that
>> >> >    CPU went idle, at least in kernels built with CONFIG_RCU_EQS_DEBUG=y,
>> >> >    for example, those built by rcutorture scenario TREE04.  One could
>> >> >    argue that this means that rcu_dynticks_eqs_online() is unnecessary,
>> >> >    however, removing it would make the CPU-online process vulnerable to
>> >> >    slight changes in the CPU-offline process.
>> >> >    
>> >> >    One could also ask why it is safe to move the rcu_dynticks_eqs_online()
>> >> >    call so late in the CPU-online process.  Indeed, there was a time when it
>> >> >    would not have been safe, which does much to explain its current location.
>> >> >    However, the marking of a CPU as online from an RCU perspective has long
>> >> >    since moved from rcutree_prepare_cpu() to rcu_cpu_starting(), and all
>> >> >    that is required is that ->dynticks be set correctly by the time that
>> >> >    the CPU is marked as online from an RCU perspective.  After all, the RCU
>> >> >    grace-period kthread does not check to see if offline CPUs are also idle.
>> >> >    (In case you were curious, this is one reason why there is quiescent-state
>> >> >    reporting as part of the offlining process.)
>> >> >    
>> >> >    This commit therefore moves the call to rcu_dynticks_eqs_online() from
>> >> >    rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
>> >> >    to be running on the incoming CPU.  The call to this function must of
>> >> >    course be placed before this rcu_cpu_starting() announces this CPU's
>> >> >    presence to RCU.
>> >> >    
>> >> >    Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> >> >    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>> >> > 
>> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> >> > index 0172a5fd6d8de..aa00babdaf544 100644
>> >> > --- a/kernel/rcu/tree.c
>> >> > +++ b/kernel/rcu/tree.c
>> >> > @@ -4129,7 +4129,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
>> >> > 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
>> >> > 	rdp->blimit = blimit;
>> >> > 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
>> >> > -	rcu_dynticks_eqs_online();
>> >> > 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
>> >> > 
>> >> > 	/*
>> >> > @@ -4249,6 +4248,7 @@ void rcu_cpu_starting(unsigned int cpu)
>> >> > 	mask = rdp->grpmask;
>> >> > 	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
>> >> > 	WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
>> >> > +	rcu_dynticks_eqs_online();
>> >> > 	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
>> >> > 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
>> >> >  	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
>> >> 
>> >> --
>> >> Mathieu Desnoyers
>> >> EfficiOS Inc.
>> > > http://www.efficios.com
>> 
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
> > http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2021-07-29 18:42 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-21 20:20 [PATCH rcu 0/18] Miscellaneous fixes for v5.15 Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 01/18] rcu: Fix to include first blocked task in stall warning Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 02/18] rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock Paul E. McKenney
2021-08-03 14:24   ` Qais Yousef
2021-08-03 15:52     ` Paul E. McKenney
2021-08-03 16:12       ` Qais Yousef
2021-08-03 16:28         ` Paul E. McKenney
2021-08-03 16:33           ` Qais Yousef
2021-08-04 13:50           ` Qais Yousef
2021-08-04 22:33             ` Paul E. McKenney
2021-08-06  9:56               ` Qais Yousef
2021-08-06  9:57   ` Qais Yousef
2021-08-06 11:43     ` Paul E. McKenney
2021-08-06 12:33       ` Qais Yousef
2021-07-21 20:21 ` [PATCH rcu 03/18] rcu: Remove special bit at the bottom of the ->dynticks counter Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates Paul E. McKenney
2021-07-21 20:41   ` Linus Torvalds
2021-07-21 21:25     ` Paul E. McKenney
2021-07-28 17:37   ` [PATCH v2 " Paul E. McKenney
2021-07-28 17:58     ` Linus Torvalds
2021-07-28 18:12       ` Mathieu Desnoyers
2021-07-28 18:32         ` Linus Torvalds
2021-07-28 18:39           ` Mathieu Desnoyers
2021-07-28 18:46         ` Paul E. McKenney
2021-07-28 18:46       ` Paul E. McKenney
2021-07-28 18:57         ` Linus Torvalds
2021-07-28 18:23     ` Mathieu Desnoyers
2021-07-28 18:58       ` Paul E. McKenney
2021-07-28 19:45         ` Paul E. McKenney
2021-07-28 20:03           ` Mathieu Desnoyers
2021-07-28 20:28             ` Paul E. McKenney
2021-07-29 14:41               ` Mathieu Desnoyers
2021-07-29 15:57                 ` Paul E. McKenney
2021-07-29 17:41                   ` Mathieu Desnoyers
2021-07-29 18:05                     ` Paul E. McKenney
2021-07-29 18:42                       ` Mathieu Desnoyers [this message]
2021-07-28 20:37     ` Josh Triplett
2021-07-28 20:47       ` Paul E. McKenney
2021-07-28 22:23         ` Frederic Weisbecker
2021-07-29  1:07           ` Paul E. McKenney
2021-07-29  7:58   ` [PATCH " Boqun Feng
2021-07-29 10:53     ` Frederic Weisbecker
2021-07-30  5:56       ` Boqun Feng
2021-07-30 17:18         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 05/18] rcu: Mark accesses to ->rcu_read_lock_nesting Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 06/18] rculist: Unify documentation about missing list_empty_rcu() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 07/18] rcu/tree: Handle VM stoppage in stall detection Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 08/18] rcu: Do not disable GP stall detection in rcu_cpu_stall_reset() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 09/18] rcu: Start timing stall repetitions after warning complete Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 10/18] srcutiny: Mark read-side data races Paul E. McKenney
2021-07-29  8:23   ` Boqun Feng
2021-07-29 13:36     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 11/18] rcu: Mark lockless ->qsmask read in rcu_check_boost_fail() Paul E. McKenney
2021-07-29  8:54   ` Boqun Feng
2021-07-29 14:03     ` Paul E. McKenney
2021-07-30  2:28       ` Boqun Feng
2021-07-30  3:26         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 12/18] rcu: Make rcu_gp_init() and rcu_gp_fqs_loop noinline to conserve stack Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 13/18] rcu: Remove trailing spaces and tabs Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 14/18] rcu: Mark accesses in tree_stall.h Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 15/18] rcu: Remove useless "ret" update in rcu_gp_fqs_loop() Paul E. McKenney
2021-08-03 16:48   ` Joe Perches
2021-08-03 17:10     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 16/18] rcu: Use per_cpu_ptr to get the pointer of per_cpu variable Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 17/18] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 18/18] rcu: Print human-readable message for schedule() in RCU reader Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2059529299.18.1627584137760.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).