From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753413AbbEUK1S (ORCPT <rfc822;w@1wt.eu>);
	Thu, 21 May 2015 06:27:18 -0400
Received: from mail-wg0-f44.google.com ([74.125.82.44]:32894 "EHLO
	mail-wg0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751667AbbEUK1N (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 21 May 2015 06:27:13 -0400
Date: Thu, 21 May 2015 12:27:08 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Andy Lutomirski <luto@amacapital.net>,
        Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
        "H. Peter Anvin" <hpa@zytor.com>, Michal Marek <mmarek@suse.cz>,
        Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>,
        live-patching@vger.kernel.org,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Andy Lutomirski <luto@kernel.org>,
        Denys Vlasenko <dvlasenk@redhat.com>, Brian Gerst <brgerst@gmail.com>,
        Peter Zijlstra <a.p.zijlstra@chello.nl>,
        Borislav Petkov <bp@alien8.de>,
        Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v4 0/3] Compile-time stack frame pointer validation
Message-ID: <20150521102708.GA17009@gmail.com>
References: <cover.1431966710.git.jpoimboe@redhat.com>
 <20150520103339.GA22205@gmail.com>
 <20150520141331.GA16995@treble.redhat.com>
 <20150520144810.GA10374@gmail.com>
 <CALCETrWAP=7dN8P3dn0JwYTn2SHYtyi-3z7KqGcjV0X6d=s50g@mail.gmail.com>
 <20150520162537.GD16995@treble.redhat.com>
 <CA+55aFzj6VmJQQynhrqywn43t1iep5Qf750HS=e7QJeEpYz1OQ@mail.gmail.com>
 <20150520172052.GE16995@treble.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150520172052.GE16995@treble.redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Wed, May 20, 2015 at 09:59:18AM -0700, Linus Torvalds wrote:
> > On Wed, May 20, 2015 at 9:25 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > On Wed, May 20, 2015 at 09:03:37AM -0700, Andy Lutomirski wrote:
> > >>
> > >> I've never quite understood what the '?' means.
> > >
> > > It basically means "here's a function address we found on the 
> > > stack, which may or may not have been called."  It's needed 
> > > because stack walking isn't currently 100% reliable.
> > 
> > It is often quite interesting and helpful, because it shows stale 
> > data on the stack, giving clues about what happened just before.
> > 
> > Now, I'd like gcc to generally be better about not wasting so much 
> > stack frame, so in that sense I'd like to see fewer '?' entries 
> > just from a code quality standpoint, but when debugging those 
> > things, the downside of "noise" is often cancelled by the upside 
> > of "ahh, it happens after calling X".
> > 
> > So the "perfect stack frames" is actually not as great a thing as 
> > some people want to make it seem.
> 
> Ok, I can see how looking at stale stack data could be useful for 
> some of the really tough problems.

And note that the tough problems are actually the ones where we need 
that information the most. So any stack backtrace printing method must 
be biased towards helping the difficult scenarios - not the trivial 
crashes. That is one of the reasons why we always print the '?' 
entries.

> But right now, the meaning of '?' is ambiguous.  It could be stale 
> data, or it could be part of a frame for the current stack which was 
> skipped due to missing frame pointers or an exception.

Yes, of course. That's not a big problem as the actual symbolic 
information will tell us a lot, which allows us to reconstruct the 
real call chain, plus allows us to see any 'recent execution activity' 
that might be on the stack as stale entries.
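
To make the distinction concrete, here is a rough userspace sketch of 
how a frame-pointer-guided stack scan can tag its entries. It is a toy 
model under stated assumptions, not the kernel's actual unwinder: the 
stack is an array of words, frame pointers are array indices, and 
TEXT_START/TEXT_END, struct trace_entry, scan_stack() and print_trace() 
are all made-up names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bounds of "kernel text" in this toy model. */
#define TEXT_START ((uintptr_t)0x1000)
#define TEXT_END   ((uintptr_t)0x2000)

struct trace_entry {
	uintptr_t addr;
	bool reliable;		/* false => printed with a '?' */
};

static bool is_text_address(uintptr_t addr)
{
	return addr >= TEXT_START && addr < TEXT_END;
}

/*
 * Toy layout: a frame record at index bp is two words,
 *   stack[bp]     = index of the caller's frame record
 *   stack[bp + 1] = return address
 * Every word that looks like a text address is reported. Only the ones
 * sitting in the return-address slot of the frame chain are marked
 * reliable, everything else is a '?' candidate.
 */
static size_t scan_stack(const uintptr_t *stack, size_t nwords, size_t bp,
			 struct trace_entry *out, size_t max)
{
	size_t n = 0;

	assert(stack && out);

	for (size_t i = 0; i < nwords && n < max; i++) {
		uintptr_t word = stack[i];

		if (!is_text_address(word))
			continue;

		out[n].addr = word;
		if (bp < nwords - 1 && i == bp + 1) {
			uintptr_t next = stack[bp];

			out[n].reliable = true;
			/* Follow the chain only towards older frames. */
			bp = (next > bp && next < nwords) ? (size_t)next : nwords;
		} else {
			out[n].reliable = false;
		}
		n++;
	}
	return n;
}

static void print_trace(const struct trace_entry *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		printf(" [<%#lx>] %s\n", (unsigned long)t[i].addr,
		       t[i].reliable ? "" : "?");
}
```

On the toy stack { 0x1234, 4, 0x1100, 0x1800, 99, 0x1200 } with bp = 1, 
this reports 0x1100 and 0x1200 as the call chain, and 0x1234 and 0x1800 
as '?' entries. Whether a given '?' entry is stale data or a skipped 
frame is exactly what the symbolic information has to tell us.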

> If we can somehow make the stack unwinder reliable, then it would at 
> least allow us to remove the ambiguity of the '?' entries.  And it 
> would reduce the "noise" for the majority of issues where we don't 
> care about stale stack data, and can simply ignore it.

Yes, but note the above consideration - the probability distribution 
of kernel bugs tends to have a _very_ long tail, with bugs that 
sometimes take years to trigger and fix. Kernel developers upstream 
and at distros tend to spend a disproportionately large amount of time 
staring at difficult-to-decode bugs.

For that reason it is far more important to remain able to debug 
those kinds of difficult bugs than to make the resolution of 
trivial, unambiguous crashes a tiny bit easier by printing fewer 
'distractions'...

Also, note that the '?' entries have another role: they cross-check 
the unwinder.

If you think we'll be able to do a perfect unwinder then think again: 
debug info _will_ be messed up periodically, either by us or by 
tooling, because right now no kernel code or other functionality 
relies on perfect unwinding.
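
As a hedged illustration of what such a cross-check could look like if 
done mechanically, here is another userspace toy. It assumes the 
unwinder hands back a list of return addresses and compares that list 
against a raw scan of the stack words. The names (struct crosscheck, 
crosscheck_unwinder(), TEXT_START/TEXT_END) are hypothetical and do not 
correspond to any existing kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of "kernel text" in this toy model. */
#define TEXT_START ((uintptr_t)0x1000)
#define TEXT_END   ((uintptr_t)0x2000)

struct crosscheck {
	size_t unclaimed;	/* text-looking stack words the unwinder skipped */
	size_t missing;		/* unwinder output found nowhere on the stack */
};

static bool is_text_address(uintptr_t addr)
{
	return addr >= TEXT_START && addr < TEXT_END;
}

static bool contains(const uintptr_t *v, size_t n, uintptr_t x)
{
	for (size_t i = 0; i < n; i++)
		if (v[i] == x)
			return true;
	return false;
}

/*
 * A return address reported by the unwinder must have been read from
 * some stack slot, so "missing" > 0 means the unwinder went wrong.
 * "unclaimed" counts the '?' candidates: a few are normal stale data,
 * many of them forming a plausible call chain hints at skipped frames.
 */
static struct crosscheck crosscheck_unwinder(const uintptr_t *stack,
					     size_t nwords,
					     const uintptr_t *unwound,
					     size_t nunwound)
{
	struct crosscheck r = { 0, 0 };

	for (size_t i = 0; i < nwords; i++)
		if (is_text_address(stack[i]) &&
		    !contains(unwound, nunwound, stack[i]))
			r.unclaimed++;

	for (size_t i = 0; i < nunwound; i++)
		if (!contains(stack, nwords, unwound[i]))
			r.missing++;

	return r;
}
```

In practice this comparison is done by a human reading the log, which 
is the point: the raw '?' entries are the independent data the 
comparison needs.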

So this is not like C++ exception handling where broken unwinding will 
break the code. This is something that is literally only visible in 
kernel logs currently, as a slight anomaly.

So any x86 stack unwinder code must be fundamentally based on the idea 
and expectation that stack unwinding is always going to be somewhat 
imperfect and somewhat statistical.

Thanks,

	Ingo