From: Andy Lutomirski <luto@amacapital.net>
To: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: "Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Oleg Nesterov" <oleg@redhat.com>,
"Tony Luck" <tony.luck@intel.com>,
"Andi Kleen" <andi@firstfloor.org>,
"Josh Triplett" <josh@joshtriplett.org>,
"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Date: Mon, 24 Nov 2014 15:52:10 -0800
Message-ID: <CALCETrUFwyBYFWZi4JC1-HftmUumLr+bB1kh8EW6LViN+G+j+g@mail.gmail.com>
In-Reply-To: <20141124235058.GZ5050@linux.vnet.ibm.com>
On Mon, Nov 24, 2014 at 3:50 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Nov 24, 2014 at 03:35:55PM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 24, 2014 at 3:31 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Mon, Nov 24, 2014 at 02:57:54PM -0800, Paul E. McKenney wrote:
>> >> On Mon, Nov 24, 2014 at 02:36:18PM -0800, Andy Lutomirski wrote:
>> >> > On Mon, Nov 24, 2014 at 2:34 PM, Paul E. McKenney
>> >> > <paulmck@linux.vnet.ibm.com> wrote:
>> >> > > On Mon, Nov 24, 2014 at 01:35:01PM -0800, Paul E. McKenney wrote:
>> >>
>> >> [ . . . ]
>> >>
>> >> > > And the following Promela model claims that your approach works.
>> >> > > Should I trust it? ;-)
>> >> > >
>> >> >
>> >> > I think so.
>> >> >
>> >> > Want to write a patch? If so, whose tree should it go in? I can add
>> >> > it to my IST series, but that seems a bit odd.
>> >>
>> >> Working on it. ;-)
>> >
>> > And here is a sneak preview of the patch. Thoughts?
>> >
>> > Thanx, Paul
>> >
>> > ------------------------------------------------------------------------
>> >
>> > rcu: Make rcu_nmi_enter() handle nesting
>> >
>> > Andy Lutomirski is introducing ISTs into x86, which from RCU's
>> > viewpoint are NMIs. Because ISTs and NMIs can nest, rcu_nmi_enter() and
>> > rcu_nmi_exit() must now correctly handle nesting.
>>
>> You must not be a frequent reader of entry_64.S and the Intel SDM :)
>> IOW, IST is just a stack switching mechanism, and these interrupts
>> have been around forever -- they're just buggy right now.
>>
>> How about:
>>
>> x86 has multiple types of NMI-like interrupts: real NMIs, machine
>> checks, and, for some values of NMI-like, debugging and breakpoint
>> interrupts. These interrupts can nest inside each other. Andy
>> Lutomirski is adding RCU support to these interrupts, so
>> rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting.
>>
>> Other than that, I like it.
>
> And here is the updated version. Left to my normal workflow, this would
> go into 3.20. Please let me know if you need it earlier, but in that
> case, I will need you to test it fairly soon. (I don't have any way
> to test nested NMI-like things.)

Dunno. Tony and Borislav -- when do you want the IST stack switching stuff?

--Andy

>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Make rcu_nmi_enter() handle nesting
>
> The x86 architecture has multiple types of NMI-like interrupts: real
> NMIs, machine checks, and, for some values of NMI-like, debugging
> and breakpoint interrupts. These interrupts can nest inside each
> other. Andy Lutomirski is adding RCU support to these interrupts,
> so rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting.
>
> This commit therefore introduces nesting, using a clever NMI-coordination
> algorithm suggested by Andy. The trick is to atomically increment
> ->dynticks (if needed) before manipulating ->dynticks_nmi_nesting on entry
> (and, accordingly, after on exit). In addition, ->dynticks_nmi_nesting
> is incremented by one if ->dynticks was incremented and by two otherwise.
> This means that when rcu_nmi_exit() sees ->dynticks_nmi_nesting equal
> to one, it knows that ->dynticks must be atomically incremented.
>
> This NMI-coordination algorithm has been validated by the following
> Promela model, for whatever that might be worth:
>
> /*
>  * Promela model for Andy Lutomirski's suggested change to rcu_nmi_enter()
>  * that allows nesting.
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of the GNU General Public License as published by
>  * the Free Software Foundation; either version 2 of the License, or
>  * (at your option) any later version.
>  *
>  * This program is distributed in the hope that it will be useful,
>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>  * GNU General Public License for more details.
>  *
>  * You should have received a copy of the GNU General Public License
>  * along with this program; if not, you can access it online at
>  * http://www.gnu.org/licenses/gpl-2.0.html.
>  *
>  * Copyright IBM Corporation, 2014
>  *
>  * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>  */
>
> byte dynticks_nesting = 0;
> byte dynticks_nmi_nesting = 0;
> byte dynticks = 0;
>
> /*
>  * Promela version of rcu_nmi_enter().
>  */
> inline rcu_nmi_enter()
> {
>         byte incby;
>
>         incby = 2;
>         assert(dynticks_nmi_nesting >= 0);
>         if
>         :: (dynticks & 1) == 0 ->
>                 atomic {
>                         dynticks = dynticks + 1;
>                 }
>                 assert((dynticks & 1) == 1);
>                 incby = 1;
>         :: else ->
>                 skip;
>         fi;
>         dynticks_nmi_nesting = dynticks_nmi_nesting + incby;
>         assert(dynticks_nmi_nesting >= 1);
> }
>
> /*
>  * Promela version of rcu_nmi_exit().
>  */
> inline rcu_nmi_exit()
> {
>         assert(dynticks_nmi_nesting > 0);
>         assert((dynticks & 1) != 0);
>         if
>         :: dynticks_nmi_nesting != 1 ->
>                 dynticks_nmi_nesting = dynticks_nmi_nesting - 2;
>         :: else ->
>                 dynticks_nmi_nesting = 0;
>                 atomic {
>                         dynticks = dynticks + 1;
>                 }
>                 assert((dynticks & 1) == 0);
>         fi;
> }
>
> /*
>  * Base-level NMI runs non-atomically. Crudely emulates process-level
>  * dynticks-idle entry/exit.
>  */
> proctype base_NMI()
> {
>         byte busy;
>
>         busy = 0;
>         do
>         ::      if
>                 :: 1 -> atomic {
>                                 dynticks = dynticks + 1;
>                         }
>                         busy = 0;
>                 :: 1 -> skip;
>                 fi;
>                 rcu_nmi_enter();
>                 assert((dynticks & 1) == 1);
>                 rcu_nmi_exit();
>                 if
>                 :: busy -> skip;
>                 :: !busy ->
>                         atomic {
>                                 dynticks = dynticks + 1;
>                         }
>                         busy = 1;
>                 fi;
>         od;
> }
>
> /*
>  * Nested NMI runs atomically to emulate interrupting base_NMI().
>  */
> proctype nested_NMI()
> {
>         do
>         ::      atomic {
>                         rcu_nmi_enter();
>                         assert((dynticks & 1) == 1);
>                         rcu_nmi_exit();
>                 }
>         od;
> }
>
> init {
>         run base_NMI();
>         run nested_NMI();
> }
>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8749f43f3f05..fc0236992655 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -759,39 +759,71 @@ void rcu_irq_enter(void)
>  /**
>   * rcu_nmi_enter - inform RCU of entry to NMI context
>   *
> - * If the CPU was idle with dynamic ticks active, and there is no
> - * irq handler running, this updates rdtp->dynticks_nmi to let the
> - * RCU grace-period handling know that the CPU is active.
> + * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and
> + * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know
> + * that the CPU is active. This implementation permits nested NMIs, as
> + * long as the nesting level does not overflow an int. (You will probably
> + * run out of stack space first.)
>   */
>  void rcu_nmi_enter(void)
>  {
>          struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> +        int incby = 2;
>
> -        if (rdtp->dynticks_nmi_nesting == 0 &&
> -            (atomic_read(&rdtp->dynticks) & 0x1))
> -                return;
> -        rdtp->dynticks_nmi_nesting++;
> -        smp_mb__before_atomic(); /* Force delay from prior write. */
> -        atomic_inc(&rdtp->dynticks);
> -        /* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
> -        smp_mb__after_atomic(); /* See above. */
> -        WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +        /* Complain about underflow. */
> +        WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0);
> +
> +        /*
> +         * If idle from RCU viewpoint, atomically increment ->dynticks
> +         * to mark non-idle and increment ->dynticks_nmi_nesting by one.
> +         * Otherwise, increment ->dynticks_nmi_nesting by two. This means
> +         * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
> +         * to be in the outermost NMI handler that interrupted an RCU-idle
> +         * period (observation due to Andy Lutomirski).
> +         */
> +        if (!(atomic_read(&rdtp->dynticks) & 0x1)) {
> +                smp_mb__before_atomic(); /* Force delay from prior write. */
> +                atomic_inc(&rdtp->dynticks);
> +                /* atomic_inc() before later RCU read-side crit sects */
> +                smp_mb__after_atomic(); /* See above. */
> +                WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +                incby = 1;
> +        }
> +        rdtp->dynticks_nmi_nesting += incby;
> +        barrier();
>  }
>
>  /**
>   * rcu_nmi_exit - inform RCU of exit from NMI context
>   *
> - * If the CPU was idle with dynamic ticks active, and there is no
> - * irq handler running, this updates rdtp->dynticks_nmi to let the
> - * RCU grace-period handling know that the CPU is no longer active.
> + * If we are returning from the outermost NMI handler that interrupted an
> + * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
> + * to let the RCU grace-period handling know that the CPU is back to
> + * being RCU-idle.
>   */
>  void rcu_nmi_exit(void)
>  {
>          struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
>
> -        if (rdtp->dynticks_nmi_nesting == 0 ||
> -            --rdtp->dynticks_nmi_nesting != 0)
> +        /*
> +         * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
> +         * (We are exiting an NMI handler, so RCU better be paying attention
> +         * to us!)
> +         */
> +        WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
> +        WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +
> +        /*
> +         * If the nesting level is not 1, the CPU wasn't RCU-idle, so
> +         * leave it in non-RCU-idle state.
> +         */
> +        if (rdtp->dynticks_nmi_nesting != 1) {
> +                rdtp->dynticks_nmi_nesting -= 2;
>                  return;
> +        }
> +
> +        /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> +        rdtp->dynticks_nmi_nesting = 0;
>          /* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
>          smp_mb__before_atomic(); /* See above. */
>          atomic_inc(&rdtp->dynticks);
>
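
For anyone who wants to poke at the counting rule in the quoted patch
without building a kernel, here is a minimal single-threaded user-space
sketch in C. It is only an illustration under stated assumptions: the
struct dynticks_model type and the model_*() helpers are hypothetical
names that do not exist in the kernel, plain ints stand in for the
atomic_t ->dynticks counter, and the memory barriers are left out
entirely, so it exercises the arithmetic and nothing else.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the per-CPU rcu_dynticks fields. */
struct dynticks_model {
	int dynticks;             /* odd: RCU is watching; even: RCU-idle */
	int dynticks_nmi_nesting; /* 0 when no NMI-like handler is active */
};

/* Mirrors the entry rule: add 1 if we had to leave RCU-idle, else add 2. */
static void model_nmi_enter(struct dynticks_model *m)
{
	int incby = 2;

	assert(m->dynticks_nmi_nesting >= 0);
	if (!(m->dynticks & 0x1)) {
		m->dynticks++;    /* the kernel patch uses atomic_inc() plus barriers here */
		incby = 1;
	}
	m->dynticks_nmi_nesting += incby;
}

/* Mirrors the exit rule: only a nesting value of exactly 1 restores idleness. */
static void model_nmi_exit(struct dynticks_model *m)
{
	assert(m->dynticks_nmi_nesting > 0);
	assert(m->dynticks & 0x1);
	if (m->dynticks_nmi_nesting != 1) {
		m->dynticks_nmi_nesting -= 2;
		return;
	}
	m->dynticks_nmi_nesting = 0;
	m->dynticks++;            /* back to an even value, i.e. RCU-idle */
}

/* Enter 'depth' nested handlers, then unwind them all again. */
static void model_nest(struct dynticks_model *m, int depth)
{
	int i;

	for (i = 0; i < depth; i++) {
		model_nmi_enter(m);
		assert(m->dynticks & 0x1); /* RCU must be watching inside a handler */
	}
	for (i = 0; i < depth; i++)
		model_nmi_exit(m);
}
```

Starting from an RCU-idle state (even ->dynticks), the nesting counter
takes the odd values 1, 3, 5, ... and only the outermost exit sees 1.
Starting from a non-idle state it takes the even values 2, 4, 6, ... and
never hits 1, so ->dynticks is left alone.
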
--
Andy Lutomirski
AMA Capital Management, LLC