linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@amacapital.net>
To: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: "Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Tony Luck" <tony.luck@intel.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Date: Mon, 24 Nov 2014 15:35:55 -0800
Message-ID: <CALCETrV37kqC658AzBHZ5z_vH2b_1T821q5EfbMcS9Ecpj8EEw@mail.gmail.com>
In-Reply-To: <20141124233101.GA2819@linux.vnet.ibm.com>

On Mon, Nov 24, 2014 at 3:31 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Nov 24, 2014 at 02:57:54PM -0800, Paul E. McKenney wrote:
>> On Mon, Nov 24, 2014 at 02:36:18PM -0800, Andy Lutomirski wrote:
>> > On Mon, Nov 24, 2014 at 2:34 PM, Paul E. McKenney
>> > <paulmck@linux.vnet.ibm.com> wrote:
>> > > On Mon, Nov 24, 2014 at 01:35:01PM -0800, Paul E. McKenney wrote:
>>
>> [ . . . ]
>>
>> > > And the following Promela model claims that your approach works.
>> > > Should I trust it?  ;-)
>> > >
>> >
>> > I think so.
>> >
>> > Want to write a patch?  If so, whose tree should it go in?  I can add
>> > it to my IST series, but that seems a bit odd.
>>
>> Working on it.  ;-)
>
> And here is a sneak preview of the patch.  Thoughts?
>
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Make rcu_nmi_enter() handle nesting
>
> Andy Lutomirski is introducing ISTs into x86, which from RCU's
> viewpoint are NMIs.  Because ISTs and NMIs can nest, rcu_nmi_enter() and
> rcu_nmi_exit() must now correctly handle nesting.

You must not be a frequent reader of entry_64.S and the Intel SDM :)
IOW, IST is just a stack switching mechanism, and these interrupts
have been around forever -- they're just buggy right now.

How about:

x86 has multiple types of NMI-like interrupts: real NMIs, machine
checks, and, for some values of NMI-like, debugging and breakpoint
interrupts.  These interrupts can nest inside each other.  Andy
Lutomirski is adding RCU support to these interrupts, so
rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting.

Other than that, I like it.

>  This commit therefore
> introduces nesting, using a clever NMI-coordination algorithm suggested
> by Andy.  The trick is to atomically increment ->dynticks (if needed)
> before manipulating ->dynticks_nmi_nesting on entry (and, accordingly,
> after on exit).  In addition, ->dynticks_nmi_nesting is incremented by
> one if ->dynticks was incremented and by two otherwise.  This means that
> when rcu_nmi_exit() sees ->dynticks_nmi_nesting equal to one, it knows
> that ->dynticks must be atomically incremented.
>
> This NMI-coordination algorithm has been validated by the following
> Promela model, for whatever that might be worth:
>
> /*
>  * Promela model for Andy Lutomirski's suggested change to rcu_nmi_enter()
>  * that allows nesting.
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of the GNU General Public License as published by
>  * the Free Software Foundation; either version 2 of the License, or
>  * (at your option) any later version.
>  *
>  * This program is distributed in the hope that it will be useful,
>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>  * GNU General Public License for more details.
>  *
>  * You should have received a copy of the GNU General Public License
>  * along with this program; if not, you can access it online at
>  * http://www.gnu.org/licenses/gpl-2.0.html.
>  *
>  * Copyright IBM Corporation, 2014
>  *
>  * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>  */
>
> byte dynticks_nesting = 0;
> byte dynticks_nmi_nesting = 0;
> byte dynticks = 0;
>
> /*
>  * Promela version of rcu_nmi_enter().
>  */
> inline rcu_nmi_enter()
> {
>         byte incby;
>
>         incby = 2;
>         assert(dynticks_nmi_nesting >= 0);
>         if
>         :: (dynticks & 1) == 0 ->
>                 atomic {
>                         dynticks = dynticks + 1;
>                 }
>                 assert((dynticks & 1) == 1);
>                 incby = 1;
>         :: else ->
>                 skip;
>         fi;
>         dynticks_nmi_nesting = dynticks_nmi_nesting + incby;
>         assert(dynticks_nmi_nesting >= 1);
> }
>
> /*
>  * Promela version of rcu_nmi_exit().
>  */
> inline rcu_nmi_exit()
> {
>         assert(dynticks_nmi_nesting > 0);
>         assert((dynticks & 1) != 0);
>         if
>         :: dynticks_nmi_nesting != 1 ->
>                 dynticks_nmi_nesting = dynticks_nmi_nesting - 2;
>         :: else ->
>                 dynticks_nmi_nesting = 0;
>                 atomic {
>                         dynticks = dynticks + 1;
>                 }
>                 assert((dynticks & 1) == 0);
>         fi;
> }
>
> /*
>  * Base-level NMI runs non-atomically.  Crudely emulates process-level
>  * dynticks-idle entry/exit.
>  */
> proctype base_NMI()
> {
>         byte busy;
>
>         busy = 0;
>         do
>         ::      if
>                 :: 1 -> atomic {
>                                 dynticks = dynticks + 1;
>                         }
>                         busy = 1;
>                 :: 1 -> skip;
>                 fi;
>                 rcu_nmi_enter();
>                 assert((dynticks & 1) == 1);
>                 rcu_nmi_exit();
>                 if
>                 :: busy ->
>                         atomic {
>                                 dynticks = dynticks + 1;
>                         }
>                         busy = 0;
>                 :: !busy -> skip;
>                 fi;
>         od;
> }
>
> /*
>  * Nested NMI runs atomically to emulate interrupting base_NMI().
>  */
> proctype nested_NMI()
> {
>         do
>         ::      atomic {
>                         rcu_nmi_enter();
>                         assert((dynticks & 1) == 1);
>                         rcu_nmi_exit();
>                 }
>         od;
> }
>
> init {
>         run base_NMI();
>         run nested_NMI();
> }
>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8749f43f3f05..fc0236992655 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -759,39 +759,71 @@ void rcu_irq_enter(void)
>  /**
>   * rcu_nmi_enter - inform RCU of entry to NMI context
>   *
> - * If the CPU was idle with dynamic ticks active, and there is no
> - * irq handler running, this updates rdtp->dynticks_nmi to let the
> - * RCU grace-period handling know that the CPU is active.
> + * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and
> + * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know
> + * that the CPU is active.  This implementation permits nested NMIs, as
> + * long as the nesting level does not overflow an int.  (You will probably
> + * run out of stack space first.)
>   */
>  void rcu_nmi_enter(void)
>  {
>         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> +       int incby = 2;
>
> -       if (rdtp->dynticks_nmi_nesting == 0 &&
> -           (atomic_read(&rdtp->dynticks) & 0x1))
> -               return;
> -       rdtp->dynticks_nmi_nesting++;
> -       smp_mb__before_atomic();  /* Force delay from prior write. */
> -       atomic_inc(&rdtp->dynticks);
> -       /* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
> -       smp_mb__after_atomic();  /* See above. */
> -       WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +       /* Complain about underflow. */
> +       WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0);
> +
> +       /*
> +        * If idle from RCU viewpoint, atomically increment ->dynticks
> +        * to mark non-idle and increment ->dynticks_nmi_nesting by one.
> +        * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
> +        * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
> +        * to be in the outermost NMI handler that interrupted an RCU-idle
> +        * period (observation due to Andy Lutomirski).
> +        */
> +       if (!(atomic_read(&rdtp->dynticks) & 0x1)) {
> +               smp_mb__before_atomic();  /* Force delay from prior write. */
> +               atomic_inc(&rdtp->dynticks);
> +               /* atomic_inc() before later RCU read-side crit sects */
> +               smp_mb__after_atomic();  /* See above. */
> +               WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +               incby = 1;
> +       }
> +       rdtp->dynticks_nmi_nesting += incby;
> +       barrier();
>  }
>
>  /**
>   * rcu_nmi_exit - inform RCU of exit from NMI context
>   *
> - * If the CPU was idle with dynamic ticks active, and there is no
> - * irq handler running, this updates rdtp->dynticks_nmi to let the
> - * RCU grace-period handling know that the CPU is no longer active.
> + * If we are returning from the outermost NMI handler that interrupted an
> + * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
> + * to let the RCU grace-period handling know that the CPU is back to
> + * being RCU-idle.
>   */
>  void rcu_nmi_exit(void)
>  {
>         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
>
> -       if (rdtp->dynticks_nmi_nesting == 0 ||
> -           --rdtp->dynticks_nmi_nesting != 0)
> +       /*
> +        * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
> +        * (We are exiting an NMI handler, so RCU better be paying attention
> +        * to us!)
> +        */
> +       WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
> +       WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> +
> +       /*
> +        * If the nesting level is not 1, the CPU wasn't RCU-idle, so
> +        * leave it in non-RCU-idle state.
> +        */
> +       if (rdtp->dynticks_nmi_nesting != 1) {
> +               rdtp->dynticks_nmi_nesting -= 2;
>                 return;
> +       }
> +
> +       /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> +       rdtp->dynticks_nmi_nesting = 0;
>         /* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
>         smp_mb__before_atomic();  /* See above. */
>         atomic_inc(&rdtp->dynticks);
>



-- 
Andy Lutomirski
AMA Capital Management, LLC
