From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751499AbaKVRUb (ORCPT );
	Sat, 22 Nov 2014 12:20:31 -0500
Received: from mail.skyhub.de ([78.46.96.112]:38124 "EHLO mail.skyhub.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750735AbaKVRUa (ORCPT );
	Sat, 22 Nov 2014 12:20:30 -0500
Date: Sat, 22 Nov 2014 18:20:25 +0100
From: Borislav Petkov
To: Andy Lutomirski
Cc: x86@kernel.org, Linus Torvalds, linux-kernel@vger.kernel.org,
	Peter Zijlstra, Oleg Nesterov, Tony Luck, Andi Kleen,
	"Paul E. McKenney", Josh Triplett, Frédéric Weisbecker
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Message-ID: <20141122172025.GC4395@pd.tnic>
References: <7665538633a500255d7da9ca5985547f6a2aa191.1416604491.git.luto@amacapital.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <7665538633a500255d7da9ca5985547f6a2aa191.1416604491.git.luto@amacapital.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 21, 2014 at 01:26:08PM -0800, Andy Lutomirski wrote:
> We currently pretend that IST context is like standard exception
> context, but this is incorrect.  IST entries from userspace are like
> standard exceptions except that they use per-cpu stacks, so they are
> atomic.  IST entries from kernel space are like NMIs from RCU's
> perspective -- they are not quiescent states even if they
> interrupted the kernel during a quiescent state.
>
> Add and use ist_enter and ist_exit to track IST context.  Even
> though x86_32 has no IST stacks, we track these interrupts the same
> way.
>
> This fixes two issues:
>
>  - Scheduling from an IST interrupt handler will now warn.
>    It would previously appear to work as long as we got lucky and
>    nothing overwrote the stack frame.  (I don't know of any bugs in
>    this that would trigger the warning, but it's good to be on the
>    safe side.)
>
>  - RCU handling in IST context was dangerous.  As far as I know,
>    only machine checks were likely to trigger this, but it's good to
>    be on the safe side.
>
> Note that the machine check handlers appears to have been missing
> any context tracking at all before this patch.
>
> Cc: "Paul E. McKenney"
> Cc: Josh Triplett
> Cc: Frédéric Weisbecker
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/include/asm/traps.h         |  4 +++
>  arch/x86/kernel/cpu/mcheck/mce.c     |  5 ++++
>  arch/x86/kernel/cpu/mcheck/p5.c      |  6 +++++
>  arch/x86/kernel/cpu/mcheck/winchip.c |  5 ++++
>  arch/x86/kernel/traps.c              | 49 ++++++++++++++++++++++++++++++------
>  5 files changed, 61 insertions(+), 8 deletions(-)

...

> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 0d0e922fafc1..f5c4b8813774 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -107,6 +107,39 @@ static inline void preempt_conditional_cli(struct pt_regs *regs)
>  	preempt_count_dec();
>  }
>  
> +enum ctx_state ist_enter(struct pt_regs *regs)
> +{
> +	/*
> +	 * We are atomic because we're on the IST stack (or we're on x86_32,
> +	 * in which case we still shouldn't schedule.
> +	 */
> +	preempt_count_add(HARDIRQ_OFFSET);
> +
> +	if (user_mode_vm(regs)) {
> +		/* Other than that, we're just an exception. */
> +		return exception_enter();
> +	} else {
> +		/*
> +		 * We might have interrupted pretty much anything.  In
> +		 * fact, if we're a machine check, we can even interrupt
> +		 * NMI processing.  We don't want in_nmi() to return true,
> +		 * but we need to notify RCU.
> +		 */
> +		rcu_nmi_enter();
> +		return IN_KERNEL;  /* the value is irrelevant. */
> +	}

I guess dropping the explicit else-branch could make it look a bit nicer
with the curly braces gone and all...

enum ctx_state ist_enter(struct pt_regs *regs)
{
	/*
	 * We are atomic because we're on the IST stack (or we're on x86_32,
	 * in which case we still shouldn't schedule.
	 */
	preempt_count_add(HARDIRQ_OFFSET);

	if (user_mode_vm(regs))
		/* Other than that, we're just an exception. */
		return exception_enter();

	/*
	 * We might have interrupted pretty much anything.  In fact, if we're a
	 * machine check, we can even interrupt NMI processing.  We don't want
	 * in_nmi() to return true, but we need to notify RCU.
	 */
	rcu_nmi_enter();
	return IN_KERNEL;  /* the value is irrelevant. */
}

> +}
> +
> +void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
> +{
> +	preempt_count_sub(HARDIRQ_OFFSET);
> +
> +	if (user_mode_vm(regs))
> +		return exception_exit(prev_state);
> +	else
> +		rcu_nmi_exit();
> +}

Ditto here.

> +
>  static nokprobe_inline int
>  do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
>  		  struct pt_regs *regs, long error_code)

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--