From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753598AbbHRWeQ (ORCPT <rfc822;w@1wt.eu>);
	Tue, 18 Aug 2015 18:34:16 -0400
Received: from mail-wi0-f180.google.com ([209.85.212.180]:35686 "EHLO
	mail-wi0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752175AbbHRWeN (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 18 Aug 2015 18:34:13 -0400
Date: Wed, 19 Aug 2015 00:34:10 +0200
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>, Rik van Riel <riel@redhat.com>,
        Borislav Petkov <bp@alien8.de>, Peter Zijlstra <peterz@infradead.org>,
        Brian Gerst <brgerst@gmail.com>,
        Denys Vlasenko <vda.linux@googlemail.com>,
        Kees Cook <keescook@chromium.org>,
        Thomas Gleixner <tglx@linutronix.de>, Oleg Nesterov <oleg@redhat.com>,
        Andrew Lutomirski <luto@kernel.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linux-tip-commits@vger.kernel.org" 
	<linux-tip-commits@vger.kernel.org>
Subject: Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work
 to C and remove old assembly code
Message-ID: <20150818223409.GB12858@lerouge>
References: <60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org>
 <tip-02bc7768fe447ae305e924b931fa629073a4a1b9@git.kernel.org>
 <20150811223827.GB15639@lerouge>
 <CALCETrVo+FoiWoRJXAtcONm9itV1FsogCDx8U1SussBpr_K9NA@mail.gmail.com>
 <20150811232234.GD15639@lerouge>
 <CALCETrVN00n0MJ-F1idkE7i4NA6H_kN69ztsTqH6sOpJHJ0NMQ@mail.gmail.com>
 <20150812133217.GB21542@lerouge>
 <CALCETrVMtAuckNBctBbXUPpTNRV1PoPufJUQuSuZ8V5vQgFW6A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CALCETrVMtAuckNBctBbXUPpTNRV1PoPufJUQuSuZ8V5vQgFW6A@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 12, 2015 at 07:59:44AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 12, 2015 at 6:32 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Right, and doing it the way we did previously was safe wrt. that.
> >
> > Can't we have exceptions slow path just like the way we do it in syscalls?
> >
> > Then the exception slow path would just do:
> >
> >     if TIF_NOHZ
> >        ctx = exception_enter()
> >     exception_handler()
> >     if TIF_NOHZ
> >        exception_exit(ctx)
> 
> What's the purpose of TIF_NOHZ right now?  For syscalls, it makes
> sense, but is there any case in which TIF_NOHZ is set on one CPU but
> not on another CPU?  It might make sense to get the performance back
> using static keys instead of TIF_NOHZ.

Sure if we can manage to do that. The nice thing about TIF flags is that
they are a single check that is always there.

> 
> If we switched back to exception_enter, we'd have to remember the
> previous state, and, with a single exception right now, I think that's
> unnecessary.
> 
> I think there are only three states we can be in at exception entry:
> user (and user_mode(regs)), kernel (and kernel_mode(regs)), or
> NMI-like.

But we can have user && (!user_mode(regs)) if exception happens on exception
entry code.

> In the user case, the new code is correct.  In the kernel
> case, the new code is also correct.  In the NMI case (if we're nested
> in an NMI or similar entry)) then it is and was the responsibility of
> the NMI-like entry to call rcu_nmi_enter(), and things that nest
> inside that shouldn't touch context tracking (with the possible
> exception of calling rcu_nmi_enter() again).
> 
> In current -tip, there's a slight hole in this due to syscalls, and I'll fix it.

There must be a check for context tracking enabled anyway. So why can't
we just just do in exception entry code:

       if (exception_slow_path()) {
           exception_enter()
           exception_handler()
           exception_exit()
       } else {
           normal stuff
       }

Especially if we can manage to implement static keys in ASM, this will sum up to
a single one.

> >> The latter is annoying, but the entry code needs to deal with it
> >> anyway.  For example, any exception early in NMI is currently really
> >> bad.  Non-IST exceptions very early in SYSCALL are fatal.
> >> Non-paranoid exceptions outside swapgs are fatal.  Etc.
> >
> > Sure but that doesn't mean I'm happy with introducing new fragile path
> > like those. Especially as we have a way to fix without more overhead.
> 
> I think my approach can work with even less overhead: there are fewer
> branches due to checking the previous state.
> 
> >> > Also as long as there is at least one instruction between entry to the kernel
> >> > and context tracking noting it, there is a risk for an exception. Hence entry
> >> > code will never be atomic enough to avoid this kind of bugs.
> >>
> >> By that argument, we're doomed.  Non-IST exceptions outside swapgs are fatal.
> >
> > Does that concern only error_entry() exceptions?
> 
> Yes, but the set of paranoid_entry exceptions is shrinking.  In -tip, there are:
> 
> NMI: NMI is special and will call rcu_nmi_enter().  Nothing's changing here.
> 
> MCE: Once upon a time, MCE was simply buggy.  As of 4.0 (IIRC) MCE
> from kernel mode calls rcu_nmi_enter().
> 
> BP: This is going away, I think.  #BP should stop being special by 4.4.
> 
> DB: That's the only weird case.  Patches to prevent instruction
> breakpoints in entry code are already in -tip.  The only thing left is
> kernel watchpoints, and we need to do something about that.

So now we can't set a breakpoint on syscall entry anymore?

I'm still nervous with all that.