linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Tony Luck" <tony.luck@intel.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Date: Mon, 24 Nov 2014 15:50:58 -0800	[thread overview]
Message-ID: <20141124235058.GZ5050@linux.vnet.ibm.com> (raw)
In-Reply-To: <CALCETrV37kqC658AzBHZ5z_vH2b_1T821q5EfbMcS9Ecpj8EEw@mail.gmail.com>

On Mon, Nov 24, 2014 at 03:35:55PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 24, 2014 at 3:31 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Mon, Nov 24, 2014 at 02:57:54PM -0800, Paul E. McKenney wrote:
> >> On Mon, Nov 24, 2014 at 02:36:18PM -0800, Andy Lutomirski wrote:
> >> > On Mon, Nov 24, 2014 at 2:34 PM, Paul E. McKenney
> >> > <paulmck@linux.vnet.ibm.com> wrote:
> >> > > On Mon, Nov 24, 2014 at 01:35:01PM -0800, Paul E. McKenney wrote:
> >>
> >> [ . . . ]
> >>
> >> > > And the following Promela model claims that your approach works.
> >> > > Should I trust it?  ;-)
> >> > >
> >> >
> >> > I think so.
> >> >
> >> > Want to write a patch?  If so, whose tree should it go in?  I can add
> >> > it to my IST series, but that seems a bit odd.
> >>
> >> Working on it.  ;-)
> >
> > And here is a sneak preview of the patch.  Thoughts?
> >
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Make rcu_nmi_enter() handle nesting
> >
> > Andy Lutomirski is introducing ISTs into x86, which from RCU's
> > viewpoint are NMIs.  Because ISTs and NMIs can nest, rcu_nmi_enter() and
> > rcu_nmi_exit() must now correctly handle nesting.
> 
> You must not be a frequent reader of entry_64.S and the Intel SDM :)
> IOW, IST is just a stack switching mechanism, and these interrupts
> have been around forever -- they're just buggy right now.
> 
> How about:
> 
> x86 has multiple types of NMI-like interrupts: real NMIs, machine
> checks, and, for some values of NMI-like, debugging and breakpoint
> interrupts.  These interrupts can nest inside each other.  Andy
> Lutomirski is adding RCU support to these interrupts, so
> rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting.
> 
> Other than that, I like it.

And here is the updated version.  Left to my normal workflow, this would
go into 3.20.  Please let me know if you need it earlier, but in that
case, I will need you to test it fairly soon.  (I don't have any way
to test nested NMI-like things.)

							Thanx, Paul

------------------------------------------------------------------------

rcu: Make rcu_nmi_enter() handle nesting

The x86 architecture has multiple types of NMI-like interrupts: real
NMIs, machine checks, and, for some values of NMI-like, debugging
and breakpoint interrupts.  These interrupts can nest inside each
other.  Andy Lutomirski is adding RCU support to these interrupts,
so rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting.

This commit therefore introduces nesting, using a clever NMI-coordination
algorithm suggested by Andy.  The trick is to atomically increment
->dynticks (if needed) before manipulating ->dynticks_nmi_nesting on entry
(and, accordingly, after on exit).  In addition, ->dynticks_nmi_nesting
is incremented by one if ->dynticks was incremented and by two otherwise.
This means that when rcu_nmi_exit() sees ->dynticks_nmi_nesting equal
to one, it knows that ->dynticks must be atomically incremented.

This NMI-coordination algorithms has been validated by the following
Promela model, for whatever that might be worth:

/*
 * Promela model for Andy Lutomirski's suggested change to rcu_nmi_enter()
 * that allows nesting.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
 * Copyright IBM Corporation, 2014
 *
 * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 */

byte dynticks_nesting = 0;
byte dynticks_nmi_nesting = 0;
byte dynticks = 0;

/*
 * Promela verision of rcu_nmi_enter().
 */
inline rcu_nmi_enter()
{
	byte incby;

	incby = 2;
	assert(dynticks_nmi_nesting >= 0);
	if
	:: (dynticks & 1) == 0 ->
		atomic {
			dynticks = dynticks + 1;
		}
		assert((dynticks & 1) == 1);
		incby = 1;
	:: else ->
		skip;
	fi;
	dynticks_nmi_nesting = dynticks_nmi_nesting + incby;
	assert(dynticks_nmi_nesting >= 1);
}

/*
 * Promela verision of rcu_nmi_exit().
 */
inline rcu_nmi_exit()
{
	assert(dynticks_nmi_nesting > 0);
	assert((dynticks & 1) != 0);
	if
	:: dynticks_nmi_nesting != 1 ->
		dynticks_nmi_nesting = dynticks_nmi_nesting - 2;
	:: else ->
		dynticks_nmi_nesting = 0;
		atomic {
			dynticks = dynticks + 1;
		}
		assert((dynticks & 1) == 0);
	fi;
}

/*
 * Base-level NMI runs non-atomically.  Crudely emulates process-level
 * dynticks-idle entry/exit.
 */
proctype base_NMI()
{
	byte busy;

	busy = 0;
	do
	::	if
		:: 1 ->	atomic {
				dynticks = dynticks + 1;
			}
			busy = 0;
		:: 1 ->	skip;
		fi;
		rcu_nmi_enter();
		assert((dynticks & 1) == 1);
		rcu_nmi_exit();
		if
		:: busy -> skip;
		:: !busy ->
			atomic {
				dynticks = dynticks + 1;
			}
			busy = 1;
		fi;
	od;
}

/*
 * Nested NMI runs atomically to emulate interrupting base_level().
 */
proctype nested_NMI()
{
	do
	::	atomic {
			rcu_nmi_enter();
			assert((dynticks & 1) == 1);
			rcu_nmi_exit();
		}
	od;
}

init {
	run base_NMI();
	run nested_NMI();
}

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8749f43f3f05..fc0236992655 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -759,39 +759,71 @@ void rcu_irq_enter(void)
 /**
  * rcu_nmi_enter - inform RCU of entry to NMI context
  *
- * If the CPU was idle with dynamic ticks active, and there is no
- * irq handler running, this updates rdtp->dynticks_nmi to let the
- * RCU grace-period handling know that the CPU is active.
+ * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and
+ * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know
+ * that the CPU is active.  This implementation permits nested NMIs, as
+ * long as the nesting level does not overflow an int.  (You will probably
+ * run out of stack space first.)
  */
 void rcu_nmi_enter(void)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+	int incby = 2;
 
-	if (rdtp->dynticks_nmi_nesting == 0 &&
-	    (atomic_read(&rdtp->dynticks) & 0x1))
-		return;
-	rdtp->dynticks_nmi_nesting++;
-	smp_mb__before_atomic();  /* Force delay from prior write. */
-	atomic_inc(&rdtp->dynticks);
-	/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
-	smp_mb__after_atomic();  /* See above. */
-	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+	/* Complain about underflow. */
+	WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0);
+
+	/*
+	 * If idle from RCU viewpoint, atomically increment ->dynticks
+	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
+	 * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
+	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
+	 * to be in the outermost NMI handler that interrupted an RCU-idle
+	 * period (observation due to Andy Lutomirski).
+	 */
+	if (!(atomic_read(&rdtp->dynticks) & 0x1)) {
+		smp_mb__before_atomic();  /* Force delay from prior write. */
+		atomic_inc(&rdtp->dynticks);
+		/* atomic_inc() before later RCU read-side crit sects */
+		smp_mb__after_atomic();  /* See above. */
+		WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+		incby = 1;
+	}
+	rdtp->dynticks_nmi_nesting += incby;
+	barrier();
 }
 
 /**
  * rcu_nmi_exit - inform RCU of exit from NMI context
  *
- * If the CPU was idle with dynamic ticks active, and there is no
- * irq handler running, this updates rdtp->dynticks_nmi to let the
- * RCU grace-period handling know that the CPU is no longer active.
+ * If we are returning from the outermost NMI handler that interrupted an
+ * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
+ * to let the RCU grace-period handling know that the CPU is back to
+ * being RCU-idle.
  */
 void rcu_nmi_exit(void)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
-	if (rdtp->dynticks_nmi_nesting == 0 ||
-	    --rdtp->dynticks_nmi_nesting != 0)
+	/*
+	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
+	 * (We are exiting an NMI handler, so RCU better be paying attention
+	 * to us!)
+	 */
+	WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
+	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
+
+	/*
+	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
+	 * leave it in non-RCU-idle state.
+	 */
+	if (rdtp->dynticks_nmi_nesting != 1) {
+		rdtp->dynticks_nmi_nesting -= 2;
 		return;
+	}
+
+	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
+	rdtp->dynticks_nmi_nesting = 0;
 	/* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
 	smp_mb__before_atomic();  /* See above. */
 	atomic_inc(&rdtp->dynticks);


  reply	other threads:[~2014-11-24 23:51 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-21 21:26 [PATCH v4 0/5] x86: Rework IST interrupts Andy Lutomirski
2014-11-21 21:26 ` [PATCH v4 1/5] uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME Andy Lutomirski
2014-11-22 16:55   ` Borislav Petkov
2014-11-24 17:58     ` Andy Lutomirski
2014-11-21 21:26 ` [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context Andy Lutomirski
2014-11-21 21:32   ` Andy Lutomirski
2014-11-21 22:07     ` Paul E. McKenney
2014-11-21 22:19       ` Andy Lutomirski
2014-11-21 22:55         ` Paul E. McKenney
2014-11-21 23:06           ` Andy Lutomirski
2014-11-21 23:38             ` Paul E. McKenney
2014-11-22  2:00               ` Andy Lutomirski
2014-11-22  4:20                 ` Paul E. McKenney
2014-11-22  5:53                   ` Andy Lutomirski
2014-11-22 23:41                     ` Paul E. McKenney
2014-11-24 20:22                       ` Andy Lutomirski
2014-11-24 20:54                         ` Paul E. McKenney
2014-11-24 21:02                           ` Andy Lutomirski
2014-11-24 21:35                             ` Paul E. McKenney
2014-11-24 22:34                               ` Paul E. McKenney
2014-11-24 22:36                                 ` Andy Lutomirski
2014-11-24 22:57                                   ` Paul E. McKenney
2014-11-24 23:31                                     ` Paul E. McKenney
2014-11-24 23:35                                       ` Andy Lutomirski
2014-11-24 23:50                                         ` Paul E. McKenney [this message]
2014-11-24 23:52                                           ` Andy Lutomirski
2014-11-25 18:58                                             ` Borislav Petkov
2014-11-25 19:16                                               ` Paul E. McKenney
2014-12-11  0:22                                               ` Tony Luck
2014-12-11  0:24                                                 ` Andy Lutomirski
2015-01-05 21:46                                                   ` Tony Luck
2015-01-05 21:54                                                     ` Andy Lutomirski
2015-01-06  0:44                                                       ` [PATCH] x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks Luck, Tony
2015-01-06  1:01                                                         ` Andy Lutomirski
2015-01-06 18:00                                                           ` Luck, Tony
2015-01-07 12:13                                                             ` Borislav Petkov
2015-01-07 15:51                                                               ` Andy Lutomirski
2015-01-07 15:58                                                                 ` Borislav Petkov
2015-01-07 16:12                                                                 ` Paul E. McKenney
2014-11-25 17:13                                           ` [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context Paul E. McKenney
2014-11-27  7:03                                           ` Lai Jiangshan
2014-11-27 16:46                                             ` Paul E. McKenney
2014-11-24 21:27                           ` Paul E. McKenney
2014-11-21 22:20       ` Frederic Weisbecker
2014-11-21 22:00   ` Paul E. McKenney
2014-11-22 17:20   ` Borislav Petkov
2014-11-24 19:48     ` Andy Lutomirski
2015-01-22 21:52   ` Sasha Levin
2015-01-23 17:58     ` Andy Lutomirski
2015-01-23 18:04       ` Borislav Petkov
2015-01-23 18:34         ` Andy Lutomirski
2015-01-23 20:48           ` Sasha Levin
2015-01-24  1:25             ` Andy Lutomirski
2015-01-28 16:33               ` Andy Lutomirski
2015-01-28 17:48                 ` Paul E. McKenney
2015-01-28 21:02                   ` Andy Lutomirski
2015-01-30 19:57                     ` Sasha Levin
2015-01-31  1:28                       ` Sasha Levin
2015-01-31  3:12                         ` Andy Lutomirski
2015-01-31 12:50                           ` Andy Lutomirski
2015-01-31 13:01                         ` [PATCH] x86, traps: Fix ist_enter from userspace Andy Lutomirski
2015-01-31 15:09                           ` Sasha Levin
2015-01-31 16:18                           ` Paul E. McKenney
2015-02-01  2:17                             ` Andy Lutomirski
2015-02-04  6:01                           ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2014-11-21 21:26 ` [PATCH v4 3/5] x86, entry: Switch stacks on a paranoid entry " Andy Lutomirski
2014-11-24 15:55   ` Borislav Petkov
2014-11-21 21:26 ` [PATCH v4 4/5] x86: Clean up current_stack_pointer Andy Lutomirski
2014-11-24 11:39   ` Borislav Petkov
2014-11-21 21:26 ` [PATCH v4 5/5] x86, traps: Add ist_begin_non_atomic and ist_end_non_atomic Andy Lutomirski
2014-11-24 15:54   ` Borislav Petkov
2014-11-24 19:52     ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141124235058.GZ5050@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=andi@firstfloor.org \
    --cc=bp@alien8.de \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).