From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751207AbcGWNJ1 (ORCPT ); Sat, 23 Jul 2016 09:09:27 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50320 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750877AbcGWNJ0 (ORCPT ); Sat, 23 Jul 2016 09:09:26 -0400 Date: Sat, 23 Jul 2016 08:09:22 -0500 From: Josh Poimboeuf To: Andy Lutomirski Cc: Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , X86 ML , "linux-kernel@vger.kernel.org" , Linus Torvalds , Steven Rostedt , Brian Gerst , Kees Cook , Peter Zijlstra , Frederic Weisbecker , Byungchul Park Subject: Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface Message-ID: <20160723130922.ybhi5dt42astqny3@treble> References: <88568c51f1a253210897e368262d0f5fa1d7e97a.1469136008.git.jpoimboe@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Sat, 23 Jul 2016 13:09:25 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jul 22, 2016 at 04:52:10PM -0700, Andy Lutomirski wrote: > On Fri, Jul 22, 2016 at 4:26 PM, Andy Lutomirski wrote: > > On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf wrote: > >> valid_stack_ptr() is buggy: it assumes that all stacks are of size > >> THREAD_SIZE, which is not true for exception stacks. So the > >> walk_stack() callbacks will need to know the location of the beginning > >> of the stack as well as the end. > >> > >> Another issue is that in general the various features of a stack (type, > >> size, next stack pointer, description string) are scattered around in > >> various places throughout the stack dump code. > > > > I finally figured out what visit_info is. But would it make more > > sense to track it in the unwind state so that the unwinder can > > directly make sure it doesn't start looping? > > > > I just realized that it *is* in the unwind state. But maybe this code > in update_stack_state: > > sp = info->next; > if (!sp). > goto unknown; > > if (get_stack_info(sp, state->task, info, &state->stack_mask)) > goto unknown; > > if (!on_stack(info, addr, len)) > goto unknown; > > should do something like: > > if (get_stack_info(addr, ...)) > goto unknown. > > sp = info->end; > > instead. Alternatively, maybe it would make sense to keep sp as is > (have update_stack_state return bool instead of returning a pointer) > so that a frame that switches stacks still shows the actual sp at the > time that the frame called whatever the it called. > > I'm really quite confused by what state->sp means in your current > code. In the non-stack-switching case (everything is on the thread > stack), it appears to always match state->bp. Am I missing something? > If I'm understanding this correctly, when you're pointing at a call > frame, state->bp is that frame's base address (the top of the stack > frame), unwind_get_return_address() returns the address to which that > frame would return, and, in the future, unwind_get_gpr(UNWIND_DI) or > whatever it ends up looking like will return RDI at the time that the > frame called whatever function it called, if known. By that logic, > shouldn't state->sp be sp on entry to the call instruction? (Or could > sp just be removed? Does it do anything?) Yeah, I think sp has no purpose and can actually just be removed. (It was leftover from a previous iteration of the code where it did have a purpose and I forgot to remove it.) > I guess the reason I'm still not 100% comfortable with the idea that > pt_regs frames don't exist a real frames is that I keep waffling as to > how I should think about the regs associated with a frame in the > future DWARF world. I think I imagine them being the regs at the time > that the frame did it's call to the next frame, which, by an > admittedly weak analogy, means that the pt_regs state would be the > regs at the time that the exception or interrupt happened. That > offers a third silly option for dealing with the annoying '?': emit > two frames for a pt_regs, but have the frame in the entry code show > NULL for its return address because it's not a normal return. Well, I'd say let's not get ahead of ourselves. I think the current regs-aren't-a-frame design works fine for now, and the code is fairly simple. If/when we get a DWARF unwinder, we can revisit that decision. -- Josh