From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753704AbcG3AvR (ORCPT ); Fri, 29 Jul 2016 20:51:17 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60776 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753561AbcG3AvD (ORCPT ); Fri, 29 Jul 2016 20:51:03 -0400 Date: Fri, 29 Jul 2016 19:50:59 -0500 From: Josh Poimboeuf To: Steven Rostedt Cc: Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , x86@kernel.org, linux-kernel@vger.kernel.org, Andy Lutomirski , Linus Torvalds , Brian Gerst , Kees Cook , Peter Zijlstra , Frederic Weisbecker , Byungchul Park Subject: Re: [PATCH 05/19] x86/dumpstack: fix function graph tracing stack dump reliability issues Message-ID: <20160730005059.5krpl2xsvqfbnntj@treble> References: <20160729185521.62a5cd2a@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160729185521.62a5cd2a@gandalf.local.home> User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Sat, 30 Jul 2016 00:51:02 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jul 29, 2016 at 06:55:21PM -0400, Steven Rostedt wrote: > On Thu, 21 Jul 2016 16:21:42 -0500 > Josh Poimboeuf wrote: > > > When function graph tracing is enabled for a function, its return > > address on the stack is replaced with the address of an ftrace handler > > (return_to_handler). When dumping the stack of a task with graph > > tracing enabled, there are some subtle bugs: > > > > - The fake return_to_handler() address can be reported as reliable. > > Instead, because it's not the real caller, it should be considered > > unreliable. > > I have some mixed emotions about this. First, it's not "fake", the > function *is* going to return to it, but you are right, that's not the > function that was called. > > I do like to see these in the trace, because sometimes these functions > are an issue. But I guess I can live with them being marked as > "unreliable". Yeah, this is a little iffy. Calling return_to_handler() "fake" isn't 100% accurate. It wasn't involved in the *call* path, but it will be involved in the *return* path. My thinking was that when either saving or dumping the stack, you normally only care about what led up to that point (the call path), rather than what will happen in the future (the return path). That's especially true in the non-oops stack trace case, which isn't used for debugging. For example, reporting return_to_handler() in the reliable trace of a perf profiling operation would just be confusing. And in the oops case, where debugging is important, I think "unreliable" is more appropriate because it serves as a hint that graph tracing was involved, instead of trying to assert that it was the real caller, which could create some confusion. > > - In print_context_stack(), the real caller's return address is always > > reported as reliable, even if the return_to_handler() address wasn't > > referred to by a frame pointer. > > Hmm, if CONFIG_FRAME_POINTER is enabled, perhaps we should only call > the look up of ftrace_graph_ret_addr(). Hmm, playing with this, yeah, > we definitely should. It can report the wrong reliability. > > Without doing the reliability check we can get out of sync with the > ret_stack. I have a patch to go on top of this patch below (hmm, it may > not apply fully, because I was using a different base tree than you). Yeah, your patch makes it better. Thanks! BTW, it would be really nice if ftrace_graph_ret_addr() were idempotent so we could get the "real" return address without having to pass in a state variable. For example we could add an "unsigned long *retp" pointer to ftrace_ret_stack, which points to the return address on the stack. Then we could get rid of the index state variable in ftrace_graph_ret_addr, and also then there would never be a chance of the stack dump getting out of sync with the ret_stack. What do you think? -- Josh