From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752159AbbGRNYE (ORCPT ); Sat, 18 Jul 2015 09:24:04 -0400 Received: from mail-wi0-f175.google.com ([209.85.212.175]:34470 "EHLO mail-wi0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750727AbbGRNYC (ORCPT ); Sat, 18 Jul 2015 09:24:02 -0400 Date: Sat, 18 Jul 2015 15:23:57 +0200 From: Frederic Weisbecker To: Andy Lutomirski Cc: Andrew Lutomirski , Ingo Molnar , Kees Cook , Linus Torvalds , Peter Zijlstra , Paul McKenney , Oleg Nesterov , "H. Peter Anvin" , "linux-kernel@vger.kernel.org" , Denys Vlasenko , Rik van Riel , Borislav Petkov , Thomas Gleixner , Denys Vlasenko , Brian Gerst , "linux-tip-commits@vger.kernel.org" Subject: Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion Message-ID: <20150718132357.GC1747@lerouge> References: <20150714232620.GE29441@lerouge> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jul 14, 2015 at 04:33:39PM -0700, Andy Lutomirski wrote: > On Tue, Jul 14, 2015 at 4:26 PM, Frederic Weisbecker wrote: > > On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote: > >> Commit-ID: 0333a209cbf600e980fc55c24878a56f25f48b65 > >> Gitweb: http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65 > >> Author: Andy Lutomirski > >> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700 > >> Committer: Ingo Molnar > >> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200 > >> > >> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion > >> > >> Signed-off-by: Andy Lutomirski > >> Cc: Andy Lutomirski > >> Cc: Borislav Petkov > >> Cc: Brian Gerst > >> Cc: Denys Vlasenko > >> Cc: Denys Vlasenko > >> Cc: Frederic Weisbecker > >> Cc: H. Peter Anvin > >> Cc: Kees Cook > >> Cc: Linus Torvalds > >> Cc: Oleg Nesterov > >> Cc: Paul E. McKenney > >> Cc: Peter Zijlstra > >> Cc: Rik van Riel > >> Cc: Thomas Gleixner > >> Cc: paulmck@linux.vnet.ibm.com > >> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org > >> Signed-off-by: Ingo Molnar > >> --- > >> arch/x86/kernel/irq.c | 15 +++++++++++++++ > >> 1 file changed, 15 insertions(+) > >> > >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > >> index 88b36648..6233de0 100644 > >> --- a/arch/x86/kernel/irq.c > >> +++ b/arch/x86/kernel/irq.c > >> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs) > >> unsigned vector = ~regs->orig_ax; > >> unsigned irq; > >> > >> + /* > >> + * NB: Unlike exception entries, IRQ entries do not reliably > >> + * handle context tracking in the low-level entry code. This is > >> + * because syscall entries execute briefly with IRQs on before > >> + * updating context tracking state, so we can take an IRQ from > >> + * kernel mode with CONTEXT_USER. The low-level entry code only > >> + * updates the context if we came from user mode, so we won't > >> + * switch to CONTEXT_KERNEL. We'll fix that once the syscall > >> + * code is cleaned up enough that we can cleanly defer enabling > >> + * IRQs. > >> + */ > >> + > > > > Now is it a problem to take interrupts in kernel mode with CONTEXT_USER? > > I'm not sure it's worth trying to make it not happen. > > It's not currently a problem, but it would be nice if we could do the > equivalent of: > > if (user_mode(regs)) { > user_exit(); (or enter_from_user_mode or whatever) > } else { > // don't bother -- already in CONTEXT_KERNEL > } This was the initial implementation of context tracking but it was terribly buggy. What if we enter the kernel, we haven't yet got a change to call context_tracking_user_exit() and we get an exception in the kernel entry path? user_mode(regs) will return the wrong value and bad things happen. This is why context tracking needs its own tracking state, because we are always out of sync with the real processor context anyway. > > i.e. the same thing that do_general_protection, etc do in -tip. That > would get rid of any need to store the previous context. > > Currently we can't because of syscalls and maybe because of KVM. KVM > has a weird fake interrupt thing. > > > > >> entering_irq(); > >> > >> + /* entering_irq() tells RCU that we're not quiescent. Check it. */ > >> + rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU"); > > > > Why do we need to check that? > > Sanity check. If we're changing a bunch of context tracking details, > I want to assert that it actually works. But we call rcu_irq_enter() right before. It's more or less like doing: local_irq_disable(); WARN_ON(!irqs_disabled());