linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Ingo Molnar <mingo@kernel.org>
Cc: fweisbec@redhat.com, Paolo Bonzini <pbonzini@redhat.com>,
	X86 ML <x86@kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	Rik van Riel <riel@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	williams@redhat.com
Subject: Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry
Date: Thu, 7 May 2015 10:47:10 -0700	[thread overview]
Message-ID: <CALCETrX8zEcAXBAsut8MGU7CwEMd_dFYDsRyysY0qR18qAC0rg@mail.gmail.com> (raw)
In-Reply-To: <20150507150845.GA20608@gmail.com>

On May 7, 2015 8:38 PM, "Ingo Molnar" <mingo@kernel.org> wrote:
>
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
> > I think one or both of us is missing something or we're just talking
> > about different things.
>
> That's very much possible!
>
> I think part of the problem is that I called the 'remote CPU' the RT
> CPU, while you seem to be calling it the CPU that does the
> synchronize_rcu().
>
> So lets start again, with calling the synchronize_rcu() the 'remote
> CPU', and the one doing the RT workload the 'RT CPU':
>
> > If the nohz/RT cpu is about to enter user mode and stay there for a
> > long time, it does:
> >
> >   this_cpu_inc(rcu_qs_ctr);
> >
> > or whatever.  Maybe we add:
> >
> >   this_cpu_set(rcu_state) = IN_USER;
> >
> > or however it's spelled.
> >
> > The remote CPU wants to check our state.  If this happens just
> > before the IN_USER write or rcu_qs_ctr increment becomes visible,
> > then it'll think we're in kernel mode.  Now it either has to poll
> > (which is fine) or try to get us to tell the RCU core when we become
> > quiescent by setting TIF_RCU_THINGY.
>
> So do you mean:
>
>    this_cpu_set(rcu_state) = IN_KERNEL;
>    ...
>    this_cpu_inc(rcu_qs_ctr);
>    this_cpu_set(rcu_state) = IN_USER;
>
> ?
>
> So in your proposal we'd have an INC and two MOVs. I think we can make
> it just two simple stores into a byte flag, one on entry and one on
> exit:
>
>    this_cpu_set(rcu_state) = IN_KERNEL;
>    ...
>    this_cpu_set(rcu_state) = IN_USER;
>

I was thinking that either a counter or a state flag could make sense.
Doing both would be pointless.  The counter could use the low bits to
indicate the state.  The benefit of the counter would be that the
RCU-waiting CPU could observe that the counter has incremented and
that therefore a grace period has elapsed.  Getting it right would
require lots of care.

> plus the rare but magic TIF_RCU_THINGY that tells a waiting
> synchronize_rcu() about the next quiescent state.
>
> > The problem is that I don't see how TIF_RCU_THINGY can work
> > reliably. If the remote CPU sets it, it'll be too late and we'll
> > still enter user mode without seeing it.  If it's just an
> > optimization, though, then it should be fine.
>
> Well, after setting it, the remote CPU has to re-check whether the RT
> CPU has entered user-mode - before it goes to wait.

How?

Suppose the exit path looked like:

this_cpu_write(rcu_state, IN_USER);

if (ti->flags & _TIF_RCU_NOTIFY) {
    if (test_and_clear_bit(TIF_RCU_NOTIFY, &ti->flags))
        slow_notify_rcu_that_we_are_exiting();
}

iret or sysret;

The RCU-waiting CPU sees that rcu_state == IN_KERNEL and sets
_TIF_RCU_NOTIFY.  This could happen arbitrarily late before IRET
because stores can be delayed.  (It could even happen after sysret,
IIRC, but IRET is serializing.)

If we put an mfence after this_cpu_set or did an unconditional
test_and_clear_bit on ti->flags then this problem goes away, but that
would probably be slower than we'd like.

--Andy

  reply	other threads:[~2015-05-07 17:47 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-30 21:23 [PATCH 0/3] reduce nohz_full syscall overhead by 10% riel
2015-04-30 21:23 ` [PATCH 1/3] reduce indentation in __acct_update_integrals riel
2015-04-30 21:23 ` [PATCH 2/3] remove local_irq_save from __acct_update_integrals riel
2015-04-30 21:23 ` [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry riel
2015-04-30 21:56   ` Andy Lutomirski
2015-05-01  6:40   ` Ingo Molnar
2015-05-01 15:20     ` Rik van Riel
2015-05-01 15:59       ` Ingo Molnar
2015-05-01 16:03         ` Andy Lutomirski
2015-05-01 16:21           ` Ingo Molnar
2015-05-01 16:26             ` Rik van Riel
2015-05-01 16:34               ` Ingo Molnar
2015-05-01 18:05                 ` Rik van Riel
2015-05-01 18:40                   ` Ingo Molnar
2015-05-01 19:11                     ` Rik van Riel
2015-05-01 19:37                       ` Andy Lutomirski
2015-05-02  5:27                         ` Ingo Molnar
2015-05-02 18:27                           ` Rik van Riel
2015-05-03 18:41                           ` Andy Lutomirski
2015-05-07 10:35                             ` Ingo Molnar
2015-05-04  9:26                           ` Paolo Bonzini
2015-05-04 13:30                             ` Rik van Riel
2015-05-04 14:06                             ` Rik van Riel
2015-05-04 14:19                             ` Rik van Riel
2015-05-04 15:59                             ` question about RCU dynticks_nesting Rik van Riel
2015-05-04 18:39                               ` Paul E. McKenney
2015-05-04 19:39                                 ` Rik van Riel
2015-05-04 20:02                                   ` Paul E. McKenney
2015-05-04 20:13                                     ` Rik van Riel
2015-05-04 20:38                                       ` Paul E. McKenney
2015-05-04 20:53                                         ` Rik van Riel
2015-05-05  5:54                                           ` Paul E. McKenney
2015-05-06  1:49                                             ` Mike Galbraith
2015-05-06  3:44                                               ` Mike Galbraith
2015-05-06  6:06                                                 ` Paul E. McKenney
2015-05-06  6:52                                                   ` Mike Galbraith
2015-05-06  7:01                                                     ` Mike Galbraith
2015-05-07  0:59                                           ` Frederic Weisbecker
2015-05-07 15:44                                             ` Rik van Riel
2015-05-04 19:00                               ` Rik van Riel
2015-05-04 19:39                                 ` Paul E. McKenney
2015-05-04 19:59                                   ` Rik van Riel
2015-05-04 20:40                                     ` Paul E. McKenney
2015-05-05 10:53                                   ` Peter Zijlstra
2015-05-05 12:34                                     ` Paul E. McKenney
2015-05-05 13:00                                       ` Peter Zijlstra
2015-05-05 18:35                                         ` Paul E. McKenney
2015-05-05 21:09                                           ` Rik van Riel
2015-05-06  5:41                                             ` Paul E. McKenney
2015-05-05 10:48                                 ` Peter Zijlstra
2015-05-05 10:51                                   ` Peter Zijlstra
2015-05-05 12:30                                     ` Paul E. McKenney
2015-05-02  4:06                   ` [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry Mike Galbraith
2015-05-01 16:37             ` Ingo Molnar
2015-05-01 16:40               ` Rik van Riel
2015-05-01 16:45                 ` Ingo Molnar
2015-05-01 16:54                   ` Rik van Riel
2015-05-01 17:12                     ` Ingo Molnar
2015-05-01 17:22                       ` Rik van Riel
2015-05-01 17:59                         ` Ingo Molnar
2015-05-01 16:22           ` Rik van Riel
2015-05-01 16:27             ` Ingo Molnar
2015-05-03 13:23       ` Mike Galbraith
2015-05-03 17:30         ` Rik van Riel
2015-05-03 18:24           ` Andy Lutomirski
2015-05-03 18:52             ` Rik van Riel
2015-05-07 10:48               ` Ingo Molnar
2015-05-07 12:18                 ` Frederic Weisbecker
2015-05-07 12:29                   ` Ingo Molnar
2015-05-07 15:47                     ` Rik van Riel
2015-05-08  7:58                       ` Ingo Molnar
2015-05-07 12:22                 ` Andy Lutomirski
2015-05-07 12:44                   ` Ingo Molnar
2015-05-07 12:49                     ` Ingo Molnar
2015-05-08  6:17                       ` Paul E. McKenney
2015-05-07 12:52                     ` Andy Lutomirski
2015-05-07 15:08                       ` Ingo Molnar
2015-05-07 17:47                         ` Andy Lutomirski [this message]
2015-05-08  6:37                           ` Ingo Molnar
2015-05-08 10:59                             ` Andy Lutomirski
2015-05-08 11:27                               ` Ingo Molnar
2015-05-08 12:56                                 ` Andy Lutomirski
2015-05-08 13:27                                   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrX8zEcAXBAsut8MGU7CwEMd_dFYDsRyysY0qR18qAC0rg@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=fweisbec@redhat.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=umgwanakikbuti@gmail.com \
    --cc=williams@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).