From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753598AbbHRWeQ (ORCPT ); Tue, 18 Aug 2015 18:34:16 -0400 Received: from mail-wi0-f180.google.com ([209.85.212.180]:35686 "EHLO mail-wi0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752175AbbHRWeN (ORCPT ); Tue, 18 Aug 2015 18:34:13 -0400 Date: Wed, 19 Aug 2015 00:34:10 +0200 From: Frederic Weisbecker To: Andy Lutomirski Cc: Denys Vlasenko , Rik van Riel , Borislav Petkov , Peter Zijlstra , Brian Gerst , Denys Vlasenko , Kees Cook , Thomas Gleixner , Oleg Nesterov , Andrew Lutomirski , Linus Torvalds , Ingo Molnar , "H. Peter Anvin" , "linux-kernel@vger.kernel.org" , "linux-tip-commits@vger.kernel.org" Subject: Re: [tip:x86/asm] x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old assembly code Message-ID: <20150818223409.GB12858@lerouge> References: <60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org> <20150811223827.GB15639@lerouge> <20150811232234.GD15639@lerouge> <20150812133217.GB21542@lerouge> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 12, 2015 at 07:59:44AM -0700, Andy Lutomirski wrote: > On Wed, Aug 12, 2015 at 6:32 AM, Frederic Weisbecker wrote: > > Right, and doing it the way we did previously was safe wrt. that. > > > > Can't we have exceptions slow path just like the way we do it in syscalls? > > > > Then the exception slow path would just do: > > > > if TIF_NOHZ > > ctx = exception_enter() > > exception_handler() > > if TIF_NOHZ > > exception_exit(ctx) > > What's the purpose of TIF_NOHZ right now? For syscalls, it makes > sense, but is there any case in which TIF_NOHZ is set on one CPU but > not on another CPU? It might make sense to get the performance back > using static keys instead of TIF_NOHZ. Sure if we can manage to do that. The nice thing about TIF flags is that they are a single check that is always there. > > If we switched back to exception_enter, we'd have to remember the > previous state, and, with a single exception right now, I think that's > unnecessary. > > I think there are only three states we can be in at exception entry: > user (and user_mode(regs)), kernel (and kernel_mode(regs)), or > NMI-like. But we can have user && (!user_mode(regs)) if exception happens on exception entry code. > In the user case, the new code is correct. In the kernel > case, the new code is also correct. In the NMI case (if we're nested > in an NMI or similar entry)) then it is and was the responsibility of > the NMI-like entry to call rcu_nmi_enter(), and things that nest > inside that shouldn't touch context tracking (with the possible > exception of calling rcu_nmi_enter() again). > > In current -tip, there's a slight hole in this due to syscalls, and I'll fix it. There must be a check for context tracking enabled anyway. So why can't we just just do in exception entry code: if (exception_slow_path()) { exception_enter() exception_handler() exception_exit() } else { normal stuff } Especially if we can manage to implement static keys in ASM, this will sum up to a single one. > >> The latter is annoying, but the entry code needs to deal with it > >> anyway. For example, any exception early in NMI is currently really > >> bad. Non-IST exceptions very early in SYSCALL are fatal. > >> Non-paranoid exceptions outside swapgs are fatal. Etc. > > > > Sure but that doesn't mean I'm happy with introducing new fragile path > > like those. Especially as we have a way to fix without more overhead. > > I think my approach can work with even less overhead: there are fewer > branches due to checking the previous state. > > >> > Also as long as there is at least one instruction between entry to the kernel > >> > and context tracking noting it, there is a risk for an exception. Hence entry > >> > code will never be atomic enough to avoid this kind of bugs. > >> > >> By that argument, we're doomed. Non-IST exceptions outside swapgs are fatal. > > > > Does that concern only error_entry() exceptions? > > Yes, but the set of paranoid_entry exceptions is shrinking. In -tip, there are: > > NMI: NMI is special and will call rcu_nmi_enter(). Nothing's changing here. > > MCE: Once upon a time, MCE was simply buggy. As of 4.0 (IIRC) MCE > from kernel mode calls rcu_nmi_enter(). > > BP: This is going away, I think. #BP should stop being special by 4.4. > > DB: That's the only weird case. Patches to prevent instruction > breakpoints in entry code are already in -tip. The only thing left is > kernel watchpoints, and we need to do something about that. So now we can't set a breakpoint on syscall entry anymore? I'm still nervous with all that.