All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
Date: Sun, 24 Sep 2017 17:34:56 -0700	[thread overview]
Message-ID: <20170925003456.GA13412@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170925002653.GL3521@linux.vnet.ibm.com>

On Sun, Sep 24, 2017 at 05:26:53PM -0700, Paul E. McKenney wrote:
> On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
> > On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > Mostly just paranoia on my part.  I would be happy to remove it if
> > > you prefer.  Or you or Steve can do so if that is more convenient.
> > 
> > I really don't think it's warranted. The values are *stable*. There's
> > no subtle lack of locking, or some optimistic access to a value that
> > can change.
> > 
> > The compiler can generate code to read the value fifteen billion
> > times, and it will always get the same value.
> > 
> > Yes, maybe in between the different accesses, an NMI will happen, and
> > the value will be incremented, but then as the NMI exits, it will
> > decrement again, so the code that got interrupted will not actually
> > see the change.
> > 
> > So the READ_ONCE() isn't "paranoia". It's just confusing.
> > 
> > > And yes, consistency would dictate that the uses in rcu_nmi_enter()
> > > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> > > ->dynticks_nmi_nesting.
> > 
> > NO.
> > 
> > That would be just more of that confusion.
> > 
> > That value is STABLE. It's stable even within an NMI handler. The NMI
> > code can read it, modify it, write it back, do a little dance, all
> > without having to care. There's no "_ONCE()" about it - not for the
> > readers, not for the writers, not for _anybody_.
> > 
> > So adding even more READ/WRITE_ONCE() accesses wouldn't be
> > "consistent", it would just be insanity.
> > 
> > Now, if an NMI happens and the value would be different on entry than
> > it is on exit, that would be something else. Then it really wouldn't
> > be stable wrt random users. But that would also be a major bug in the
> > NMI handler, as far as I can tell.
> > 
> > So the reason I'm objecting to that READ_ONCE() is that it isn't
> > "paranoia", it's "voodoo programming". And we don't do voodoo
> > programming.
> 
> I already agreed that the READ_ONCE() can be removed.

And for whatever it is worth, here is the updated patch.

							Thanx, Paul

------------------------------------------------------------------------

commit 3e2baa988b9c13095995c46c51e0e32c0b6a7d43
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Fri Sep 22 13:14:42 2017 -0700

    rcu: Allow for page faults in NMI handlers
    
    A number of architecture invoke rcu_irq_enter() on exception entry in
    order to allow RCU read-side critical sections in the exception handler
    when the exception is from an idle or nohz_full CPU.  This works, at
    least unless the exception happens in an NMI handler.  In that case,
    rcu_nmi_enter() would already have exited the extended quiescent state,
    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
    to think that it is again in an extended quiescent state.  This will
    in turn result in lockdep splats in response to later RCU read-side
    critical sections.
    
    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
    take no action if there is an rcu_nmi_enter() in effect, thus avoiding
    the unscheduled return to RCU quiescent state.  This in turn should
    make the kernel safe for on-demand RCU voyeurism.
    
    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Remove READ_ONCE() per Linux Torvalds feedback. ]

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index db5eb8c3f7af..e4fe06d42385 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -891,6 +891,11 @@ void rcu_irq_exit(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     rdtp->dynticks_nesting < 1);
 	if (rdtp->dynticks_nesting <= 1) {
@@ -1036,6 +1041,11 @@ void rcu_irq_enter(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&

  reply	other threads:[~2017-09-25  0:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
2017-09-24 19:42   ` Linus Torvalds
2017-09-25  0:03     ` Paul E. McKenney
2017-09-25  0:12       ` Linus Torvalds
2017-09-25  0:26         ` Paul E. McKenney
2017-09-25  0:34           ` Paul E. McKenney [this message]
2017-09-25  4:41             ` Steven Rostedt
2017-09-25  4:56               ` Paul E. McKenney
2017-09-26  3:19                 ` Paul E. McKenney
2017-09-23 20:56 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-23 20:56 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  -- strict thread matches above, loose matches on Subject: below --
2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170925003456.GA13412@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.