From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Tony Luck" <tony.luck@intel.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Date: Mon, 24 Nov 2014 12:54:41 -0800
Message-ID: <20141124205441.GW5050@linux.vnet.ibm.com>
In-Reply-To: <CALCETrVzeK=VWnAhpkPhxj9C012ETgArEPmLMFfhKJ7k1pHsXA@mail.gmail.com>

On Mon, Nov 24, 2014 at 12:22:13PM -0800, Andy Lutomirski wrote:
> On Sat, Nov 22, 2014 at 3:41 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Fri, Nov 21, 2014 at 09:53:29PM -0800, Andy Lutomirski wrote:
> >> On Fri, Nov 21, 2014 at 8:20 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Fri, Nov 21, 2014 at 06:00:14PM -0800, Andy Lutomirski wrote:
> >> >> On Fri, Nov 21, 2014 at 3:38 PM, Paul E. McKenney
> >> >> <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Returning state sounds like a bad idea, if we can reasonably avoid it.
> 
> I agree, except that we already do it for exception_enter(), etc.  But
> yes, changing fewer things is nice.
> 
> >
> > And I think I finally see what you are pointing out about my code: If
> > another NMI comes in between the time I increment ->dynticks_nmi_nesting
> > and the time I atomically increment ->dynticks, the nested NMI handler
> > will incorrectly believe that RCU is already paying attention to this CPU.
> > Which would indeed not be at all good, so good catch!!!
> >
> >> Otherwise, I think that there may need to be enough state somewhere so
> >> that the outermost nested rcu_nmi_enter knows whether to increment
> >> dynticks.  For example, dynticks_nmi_nesting could store the nesting
> >> count * 2 - (1 if the outermost nested user needs to increment
> >> dynticks).  Something like:
> >>
> >> void rcu_nmi_enter(void)
> >> {
> >>   /* Be very careful -- this function may be called reentrantly on the
> >> same CPU. */
> >>   atomically: increment dynticks if it's even.
> >>
> >>   /* If an rcu_nmi_enter/rcu_nmi_exit pair happens here, then it will not change
> >>    * the state. */
> >>
> >>   local_inc(&dynticks_nmi_nesting, (we incremented dynticks ? 1 : 2));
> >>
> >>   WARN_ON(we incremented dynticks and dynticks_nmi_nesting was nonzero);
> >> }
> >>
> >> void rcu_nmi_exit(void)
> >> {
> >>   WARN_ON(!(dynticks & 1));
> >>   locally atomically: dynticks_nmi_nesting -= 2, unless
> >> dynticks_nmi_nesting == 1, in which case set it to zero
> >>
> >>   if (dynticks_nmi_nesting was 1)
> >>     atomic_inc(&dynticks);
> >> }
> >>
> >> The invariant here is that, for a single unnested enter/exit, if
> >> dynticks_nmi_nesting != 0, then dynticks is odd.  As a result, an
> >> rcu_nmi_enter/rcu_nmi_exit pair at any time when dynticks_nmi_nesting
> >> != 0 *or* dynticks is odd will have no net effect, so the invariant,
> >> in fact, holds for all invocations, nested or otherwise.
> >>
> >> At least one of those conditions is true at all times during the
> >> execution of outermost pair, starting with the first atomic operation
> >> and ending with the final atomic_inc.  So they nest properly no matter
> >> what else happens (unless, of course, someone else pokes dynticks in
> >> the middle).
> >>
> >> Thoughts?
> >
> > Let's see...  The evenness of ->dynticks should be preserved by nested NMI
> > handlers, so the check and increment need not be atomic.  We don't have
> > any way (other than atomic operations) to do local atomic modifications
> > on all architectures, because we cannot mask NMIs.  (Yes, it can work
> > on x86, but this is common code that needs to work everywhere.)  On the
> > other hand, presumably NMIs are rare, so atomic modification of the NMI
> > nesting counter should be OK, at least if it proves absolutely necessary.
> > And I am thinking that a mechanical proof will be needed here.  :-/
> >
> > But first, let me try generating the code and informally evaluating it:
> >
> >          1   struct rcu_dynticks {
> >          2     long long dynticks_nesting;
> >          3     int dynticks_nmi_nesting;
> >          4     atomic_t dynticks;
> >          5   };
> >          6
> >          7   void rcu_nmi_enter(void)
> >          8   {
> >          9     struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >         10     int incby = 2;
> >         11
> >         12     if (!(atomic_read(&rdtp->dynticks) & 0x1)) {
> >         13       smp_mb__before_atomic();
> >         14       atomic_inc(&rdtp->dynticks);
> >         15       smp_mb__after_atomic();
> >         16       WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> >         17       incby = 1;
> 
> WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 1) here, perhaps?

That would make sense.

> >         18     }
> >         19     rdtp->dynticks_nmi_nesting += incby;
> 
> Oh, I see why you don't need local_add -- it's because an nmi in the
> middle of this increment won't have any effect on the interrupted
> code, so even a software RMW will be okay.

Yep!  ;-)

> >         20     barrier();
> >         21   }
> >         22
> >         23   void rcu_nmi_exit(void)
> >         24   {
> >         25     struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >         26
> >         27     WARN_ON_ONCE(!rdtp->dynticks_nmi_nesting);
> >         28     WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> >         29     if (rdtp->dynticks_nmi_nesting != 1) {
> 
> WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 2), perhaps?

This is already implied by the WARN_ON_ONCE() on line 27 and the check
on line 29.

> >         30       rdtp->dynticks_nmi_nesting -= 2;
> >         31       return;
> >         32     }
> >         33     rdtp->dynticks_nmi_nesting = 0;
> >         34     smp_mb__before_atomic();
> 
> This implies barrier(), right?

Yep!

> >         35     atomic_inc(&rdtp->dynticks);
> >         36     smp_mb__after_atomic();
> >         37     WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
> >         38   }
> >
> > Line 9 picks up a pointer to this CPU's rcu_dynticks structure and line 10
> > assumes that we don't need to increment ->dynticks.
> >
> > Line 12 checks to see if ->dynticks is even.  Note that this check is
> > stable: If there are nested NMIs, they will increment ->dynticks twice
> > or not at all, and either way preserves the evenness (to be proven, of
> > course, but that is the plan).  If ->dynticks is even, lines 13-15
> > atomically increment it, line 16 complains if still even, and line 17
> > says we will increment ->dynticks_nmi_nesting by only 1.
> >
> > Either way, line 19 increments ->dynticks_nmi_nesting as needed and
> > line 20 keeps the compiler from getting too cute.
> >
> > For rcu_nmi_exit(), line 25 again picks up this CPU's rcu_dynticks
> > structure.  Lines 27 and 28 complain bitterly if invariants are violated.
> > If line 29 finds that the value of ->dynticks_nmi_nesting is not 1,
> > then line 30 subtracts 2 from ->dynticks_nmi_nesting and line 31 returns.
> >
> > Otherwise, line 33 sets ->dynticks_nmi_nesting to zero, lines 34-36
> > atomically increment ->dynticks with full ordering, and line 37
> > complains bitterly if ->dynticks is not even.
> >
> > So, if an NMI occurs before rcu_nmi_enter's atomic increment, then the
> > nested NMI's rcu_nmi_enter() and rcu_nmi_exit() will think that they are
> > not nested, which is the correct thing for them to think in that case.
> > They will increment ->dynticks twice and restore ->dynticks_nmi_nesting
> > to zero (adding and then subtracting 1).  If the NMI happens after the
> > atomic increment, then the nested rcu_nmi_enter() and rcu_nmi_exit()
> > will leave ->dynticks alone, and will restore ->dynticks_nmi_nesting
> > to zero (adding and subtracting two again).  If the NMI happens after
> > the increment of ->dynticks_nmi_nesting, the nested NMI's rcu_nmi_enter()
> > and rcu_nmi_exit() will again restore ->dynticks_nmi_nesting, but this
> > time to one (again adding and subtracting two).
> >
> > In rcu_nmi_exit(), ->dynticks_nmi_nesting of zero had better not happen,
> > one means we need to atomically increment ->dynticks, and other values
> > mean that we are partially or fully nested.  Reasoning proceeds as for
> > rcu_nmi_enter(), but in the opposite direction.
> >
> > Whew!  That might even work.
> 
> I think I like this, with the warnings above.

OK with dropping the one that I called out as redundant?

> > But how about taking a different approach?  Assuming that there can
> > never be more than (say) 14 nesting NMI-like things, use the lower
> > four bits of ->dynticks to represent the NMI nesting and the upper
> > 28 bits as the counter.  This of course requires modifying lots of
> > places in RCU that check the counter, but it is probably time to
> > abstract the check anyway.
> >
> > This would allow my earlier attempted logic to work and (maybe) simplify
> > the reasoning a bit (and yes, the "magic" constants need macros):
> >
> >         void rcu_nmi_enter(void)
> >         {
> >                 struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >                 int nesting = atomic_read(&rdtp->dynticks) & 0xf;
> >                 int incby = 0x01;
> >
> >                 WARN_ON_ONCE(nesting == 0xf);
> >                 if (nesting == 0) {
> >                         if (atomic_read(&rdtp->dynticks) & 0x10)
> >                                 return;
> >                         incby = 0x11;
> >                 }
> >                 smp_mb__before_atomic();
> >                 atomic_add(incby, &rdtp->dynticks);
> >                 smp_mb__after_atomic();
> >                 WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> >         }
> >
> >         void rcu_nmi_exit(void)
> >         {
> >                 struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >                 int nesting = atomic_read(&rdtp->dynticks) & 0xf;
> >                 int incby = 0x0f;
> >
> >                 if (nesting == 0)
> >                         return;
> >                 if (nesting > 1)
> >                         incby = -1;
> >                 smp_mb__before_atomic();
> >                 atomic_add(incby, &rdtp->dynticks);
> >                 smp_mb__after_atomic();
> >                 WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
> >         }
> >
> > Over to you!  ;-)
> 
> This latter one is all you :)

Well, let's see how I feel about it after trying a Promela model of
the first code sequence.  ;-)

							Thanx, Paul

> --Andy
> 
> >
> >                                                         Thanx, Paul
> >
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC
> 

