From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752455AbcHNIiR (ORCPT ); Sun, 14 Aug 2016 04:38:17 -0400 Received: from mail-ua0-f177.google.com ([209.85.217.177]:32930 "EHLO mail-ua0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751282AbcHNIiP (ORCPT ); Sun, 14 Aug 2016 04:38:15 -0400 MIME-Version: 1.0 In-Reply-To: <4b376a205e6852574fe3051b8f85b2a57779b6e6.1471011425.git.jpoimboe@redhat.com> References: <4b376a205e6852574fe3051b8f85b2a57779b6e6.1471011425.git.jpoimboe@redhat.com> From: Andy Lutomirski Date: Sun, 14 Aug 2016 01:10:42 -0700 Message-ID: Subject: Re: [PATCH v3 41/51] x86/entry/unwind: create stack frames for saved interrupt registers To: Josh Poimboeuf Cc: Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , X86 ML , "linux-kernel@vger.kernel.org" , Linus Torvalds , Steven Rostedt , Brian Gerst , Kees Cook , Peter Zijlstra , Frederic Weisbecker , Byungchul Park , Nilay Vaish Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 12, 2016 at 7:29 AM, Josh Poimboeuf wrote: > With frame pointers, when a task is interrupted, its stack is no longer > completely reliable because the function could have been interrupted > before it had a chance to save the previous frame pointer on the stack. > So the caller of the interrupted function could get skipped by a stack > trace. > > This is problematic for live patching, which needs to know whether a > stack trace of a sleeping task can be relied upon. There's currently no > way to detect if a sleeping task was interrupted by a page fault > exception or preemption before it went to sleep. > > Another issue is that when dumping the stack of an interrupted task, the > unwinder has no way of knowing where the saved pt_regs registers are, so > it can't print them. > > This solves those issues by encoding the pt_regs pointer in the frame > pointer on entry from an interrupt or an exception. > > This patch also updates the unwinder to be able to decode it, because > otherwise the unwinder would be broken by this change. > > Note that this causes a change in the behavior of the unwinder: each > instance of a pt_regs on the stack is now considered a "frame". So > callers of unwind_get_return_address() will now get an occasional > 'regs->ip' address that would have previously been skipped over. Acked-by: Andy Lutomirski with minor optional nitpicks below. > > Suggested-by: Andy Lutomirski > Signed-off-by: Josh Poimboeuf > --- > arch/x86/entry/calling.h | 21 +++++++++++ > arch/x86/entry/entry_32.S | 40 ++++++++++++++++++--- > arch/x86/entry/entry_64.S | 10 ++++-- > arch/x86/include/asm/unwind.h | 18 ++++++++-- > arch/x86/kernel/unwind_frame.c | 82 +++++++++++++++++++++++++++++++++++++----- > 5 files changed, 153 insertions(+), 18 deletions(-) > > diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h > index 9a9e588..ab799a3 100644 > --- a/arch/x86/entry/calling.h > +++ b/arch/x86/entry/calling.h > @@ -201,6 +201,27 @@ For 32-bit we have the following conventions - kernel is built with > .byte 0xf1 > .endm > > + /* > + * This is a sneaky trick to help the unwinder find pt_regs on the > + * stack. The frame pointer is replaced with an encoded pointer to > + * pt_regs. The encoding is just a clearing of the highest-order bit, > + * which makes it an invalid address and is also a signal to the > + * unwinder that it's a pt_regs pointer in disguise. > + * > + * NOTE: This macro must be used *after* SAVE_EXTRA_REGS because it > + * corrupts the original rbp. > + */ > +.macro ENCODE_FRAME_POINTER ptregs_offset=0 > +#ifdef CONFIG_FRAME_POINTER > + .if \ptregs_offset > + leaq \ptregs_offset(%rsp), %rbp > + .else > + mov %rsp, %rbp > + .endif > + btr $63, %rbp > +#endif > +.endm > + > #endif /* CONFIG_X86_64 */ > > /* > diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S > index 4396278..4006fa3 100644 > --- a/arch/x86/entry/entry_32.S > +++ b/arch/x86/entry/entry_32.S > @@ -174,6 +174,23 @@ > SET_KERNEL_GS %edx > .endm > > +/* > + * This is a sneaky trick to help the unwinder find pt_regs on the > + * stack. The frame pointer is replaced with an encoded pointer to > + * pt_regs. The encoding is just a clearing of the highest-order bit, > + * which makes it an invalid address and is also a signal to the > + * unwinder that it's a pt_regs pointer in disguise. > + * > + * NOTE: This macro must be used *after* SAVE_ALL because it corrupts the > + * original rbp. > + */ > +.macro ENCODE_FRAME_POINTER > +#ifdef CONFIG_FRAME_POINTER > + mov %esp, %ebp > + btr $31, %ebp > +#endif > +.endm > + > .macro RESTORE_INT_REGS > popl %ebx > popl %ecx > @@ -205,10 +222,16 @@ > .endm > > ENTRY(ret_from_fork) > + call 1f pushl $ret_from_fork is the same length and slightly less strange. OTOH it forces a relocation, and this function doesn't return, so there shouldn't be any performance issue, so this may save a byte or two in the compressed image. > +1: push $0 This could maybe use a comment. --Andy