From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752200AbbGROKX (ORCPT ); Sat, 18 Jul 2015 10:10:23 -0400 Received: from e33.co.us.ibm.com ([32.97.110.151]:53363 "EHLO e33.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750809AbbGROKV (ORCPT ); Sat, 18 Jul 2015 10:10:21 -0400 X-Helo: d03dlp03.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-tip-commits@vger.kernel.org Date: Sat, 18 Jul 2015 07:10:13 -0700 From: "Paul E. McKenney" To: Frederic Weisbecker Cc: Andy Lutomirski , Andrew Lutomirski , Ingo Molnar , Kees Cook , Linus Torvalds , Peter Zijlstra , Oleg Nesterov , "H. Peter Anvin" , "linux-kernel@vger.kernel.org" , Denys Vlasenko , Rik van Riel , Borislav Petkov , Thomas Gleixner , Denys Vlasenko , Brian Gerst , "linux-tip-commits@vger.kernel.org" Subject: Re: [tip:x86/asm] x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion Message-ID: <20150718141013.GS3717@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20150714232620.GE29441@lerouge> <20150718132357.GC1747@lerouge> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150718132357.GC1747@lerouge> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15071814-0009-0000-0000-00000C949CC1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 18, 2015 at 03:23:57PM +0200, Frederic Weisbecker wrote: > On Tue, Jul 14, 2015 at 04:33:39PM -0700, Andy Lutomirski wrote: > > On Tue, Jul 14, 2015 at 4:26 PM, Frederic Weisbecker wrote: > > > On Tue, Jul 07, 2015 at 03:54:32AM -0700, tip-bot for Andy Lutomirski wrote: > > >> Commit-ID: 0333a209cbf600e980fc55c24878a56f25f48b65 > > >> Gitweb: http://git.kernel.org/tip/0333a209cbf600e980fc55c24878a56f25f48b65 > > >> Author: Andy Lutomirski > > >> AuthorDate: Fri, 3 Jul 2015 12:44:34 -0700 > > >> Committer: Ingo Molnar > > >> CommitDate: Tue, 7 Jul 2015 10:59:10 +0200 > > >> > > >> x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion > > >> > > >> Signed-off-by: Andy Lutomirski > > >> Cc: Andy Lutomirski > > >> Cc: Borislav Petkov > > >> Cc: Brian Gerst > > >> Cc: Denys Vlasenko > > >> Cc: Denys Vlasenko > > >> Cc: Frederic Weisbecker > > >> Cc: H. Peter Anvin > > >> Cc: Kees Cook > > >> Cc: Linus Torvalds > > >> Cc: Oleg Nesterov > > >> Cc: Paul E. McKenney > > >> Cc: Peter Zijlstra > > >> Cc: Rik van Riel > > >> Cc: Thomas Gleixner > > >> Cc: paulmck@linux.vnet.ibm.com > > >> Link: http://lkml.kernel.org/r/e8bdc4ed0193fb2fd130f3d6b7b8023e2ec1ab62.1435952415.git.luto@kernel.org > > >> Signed-off-by: Ingo Molnar > > >> --- > > >> arch/x86/kernel/irq.c | 15 +++++++++++++++ > > >> 1 file changed, 15 insertions(+) > > >> > > >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > > >> index 88b36648..6233de0 100644 > > >> --- a/arch/x86/kernel/irq.c > > >> +++ b/arch/x86/kernel/irq.c > > >> @@ -216,8 +216,23 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs) > > >> unsigned vector = ~regs->orig_ax; > > >> unsigned irq; > > >> > > >> + /* > > >> + * NB: Unlike exception entries, IRQ entries do not reliably > > >> + * handle context tracking in the low-level entry code. This is > > >> + * because syscall entries execute briefly with IRQs on before > > >> + * updating context tracking state, so we can take an IRQ from > > >> + * kernel mode with CONTEXT_USER. The low-level entry code only > > >> + * updates the context if we came from user mode, so we won't > > >> + * switch to CONTEXT_KERNEL. We'll fix that once the syscall > > >> + * code is cleaned up enough that we can cleanly defer enabling > > >> + * IRQs. > > >> + */ > > >> + > > > > > > Now is it a problem to take interrupts in kernel mode with CONTEXT_USER? > > > I'm not sure it's worth trying to make it not happen. > > > > It's not currently a problem, but it would be nice if we could do the > > equivalent of: > > > > if (user_mode(regs)) { > > user_exit(); (or enter_from_user_mode or whatever) > > } else { > > // don't bother -- already in CONTEXT_KERNEL > > } > > This was the initial implementation of context tracking but it was terribly > buggy. What if we enter the kernel, we haven't yet got a change to call > context_tracking_user_exit() and we get an exception in the kernel entry > path? user_mode(regs) will return the wrong value and bad things happen. > > This is why context tracking needs its own tracking state, because we are always > out of sync with the real processor context anyway. > > > > > i.e. the same thing that do_general_protection, etc do in -tip. That > > would get rid of any need to store the previous context. > > > > Currently we can't because of syscalls and maybe because of KVM. KVM > > has a weird fake interrupt thing. > > > > > > > >> entering_irq(); > > >> > > >> + /* entering_irq() tells RCU that we're not quiescent. Check it. */ > > >> + rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU"); > > > > > > Why do we need to check that? > > > > Sanity check. If we're changing a bunch of context tracking details, > > I want to assert that it actually works. > > But we call rcu_irq_enter() right before. > > It's more or less like doing: > > local_irq_disable(); > WARN_ON(!irqs_disabled()); If we end up in a world where RCU sometimes uses context-tracking state and sometimes uses its own state (for example, for architecture that do not support context tracking), such a check might make more sense. It would be all too easy for someone to accidentailly manage to disable both somehow, and things would sort of work but have strange undebuggable failure cases. Sometimes. Thanx, Paul