From: Andy Lutomirski <luto@amacapital.net>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, "H . Peter Anvin" <hpa@zytor.com>,
	X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Brian Gerst <brgerst@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Byungchul Park <byungchul.park@lge.com>
Subject: Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface
Date: Tue, 26 Jul 2016 15:37:59 -0700
Message-ID: <CALCETrWyCthrO9YxhkqoO8V5tq0xZVReH7OuUmbTKauxQQj5gg@mail.gmail.com>
In-Reply-To: <20160726222454.xqfr2l2qkc6xkavw@treble>

On Tue, Jul 26, 2016 at 3:24 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> On Tue, Jul 26, 2016 at 01:59:20PM -0700, Andy Lutomirski wrote:
>> On Tue, Jul 26, 2016 at 9:26 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>> > On Mon, Jul 25, 2016 at 05:09:44PM -0700, Andy Lutomirski wrote:
>> >> On Sat, Jul 23, 2016 at 7:04 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>> >> >> > Unless I'm missing something, I think it should be fine for nested NMIs,
>> >> >> > since they're all on the same stack.  I can try to test it.  What in
>> >> >> > particular are you worried about?
>> >> >> >
>> >> >>
>> >> >> The top of the NMI frame contains no less than *three* (SS, SP, FLAGS,
>> >> >> CS, IP) records.  Off the top of my head, the record that matters is
>> >> >> the third one, so it should be regs[-15].  If an MCE hits while this
>> >> >> mess is being set up, good luck unwinding *that*.  If you really want
>> >> >> to know, take a deep breath, read the long comment in entry_64.S after
>> >> >> .Lnmi_from_kernel, then give up on x86 and start hacking on some other
>> >> >> architecture.
>> >> >
>> >> > I did read that comment.  Fortunately there's a big difference between
>> >> > reading and understanding so I can go on being an ignorant x86 hacker!
>> >> >
>> >> > For nested NMIs, it does look like the stack of the exception which
>> >> > interrupted the first NMI would get skipped by the stack dump.  (But
>> >> > that's a general problem, not specific to my patch set.)
>> >>
>> >> If we end up with task -> IST -> NMI -> same IST, we're doomed and
>> >> we're going to crash, so it doesn't matter much whether the unwinder
>> >> works.  Is that what you mean?
>> >
>> > I read the NMI entry code again, and now I realize my comment was
>> > completely misinformed, so never mind.
>> >
>> > Is "IST -> NMI -> same IST" even possible, since the other IST's are
>> > higher priority than NMI?
>>
>> Priority only matters for events that happen concurrently.
>> Synchronous things like #DB will always fire if the conditions that
>> trigger them are hit,
>
> So just to clarify, are you saying a lower priority exception like NMI
> can interrupt a higher priority exception handler like #DB?  I'm getting
> a different conclusion from reading section 6.9 of the Intel System
> Programming Guide.

Yes, effectively.  From the CPU's perspective, it's done with the #DB
as soon as it finishes pushing the stack frame and starts running
instructions again.  So the chain of events looks like:


<-- CPU is delivering #DB.  NMI can't be delivered.
debug:
<-- Oh boy, done with delivering #DB.  NMIs can be delivered again!
  pushq $whatever
  ...
  iretq  <-- CPU has no idea that this is related to the #DB
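
The same timeline can be sketched as a toy userspace C model.  Every
name below (toy_cpu, toy_set_phase, and so on) is made up for
illustration and none of it is kernel code.  The only rule it encodes
is the one described above: a pending NMI is held off while another
event is being delivered and can be taken as soon as the CPU is back
to running instructions.  It deliberately ignores everything else,
including the fact that NMIs stay masked after an NMI is delivered
until the next iret.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, this is not kernel code. */
enum toy_phase {
	TOY_DELIVERING,		/* CPU is pushing the exception frame */
	TOY_RUNNING,		/* CPU is executing ordinary instructions */
};

struct toy_cpu {
	enum toy_phase phase;
	bool nmi_pending;
	int nmis_taken;
};

/* In this model an NMI is held off only while another event is being delivered. */
static bool toy_nmi_deliverable(const struct toy_cpu *cpu)
{
	return cpu->phase == TOY_RUNNING;
}

/* Move to a new phase and take a pending NMI if the model allows it. */
static void toy_set_phase(struct toy_cpu *cpu, enum toy_phase phase)
{
	cpu->phase = phase;
	if (cpu->nmi_pending && toy_nmi_deliverable(cpu)) {
		cpu->nmi_pending = false;
		cpu->nmis_taken++;
	}
}

/* Returns how many NMIs were taken by the time the #DB handler starts running. */
static int toy_db_then_nmi(void)
{
	struct toy_cpu cpu = { .phase = TOY_RUNNING };

	toy_set_phase(&cpu, TOY_DELIVERING);	/* #DB delivery starts */
	cpu.nmi_pending = true;			/* NMI arrives mid-delivery */
	assert(cpu.nmis_taken == 0);		/* still held off */

	toy_set_phase(&cpu, TOY_RUNNING);	/* at "debug:", delivery is over */
	return cpu.nmis_taken;
}
```

In this model toy_db_then_nmi() reports one NMI taken at the point
where the handler starts executing, which is the "NMIs can be
delivered again" step in the timeline.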

>
>> >> > Am I correct in understanding that there can only be one level of NMI
>> >> > nesting at any given time?  If so, could we make it easier on the
>> >> > unwinder by putting the nested NMI on a separate software stack, so the
>> >> > "next stack" pointers are always in the same place?  Or am I just being
>> >> > naive?
>> >>
>> >> I think you're being naive :)
>> >>
>> >> But we don't really need the unwinder to be 100% faithful here.  If we have:
>> >>
>> >> task stack
>> >> NMI
>> >> nested NMI
>> >>
>> >> then the nested NMI code won't call into C and thus it should be
>> >> impossible to ever invoke your unwinder on that state.  Instead the
>> >> nested NMI code will fiddle with the saved regs and return in such a
>> >> way that the outer NMI will be forced to loop through again.  So it
>> >> *should* (assuming I'm remembering all this crap correctly) be
>> >> sufficient to find the "outermost" pt_regs, which is sitting at
>> >> (struct pt_regs *)(end - 12) - 1 or thereabouts and look at its ->sp
>> >> value.  This ought to be the same thing that the frame-based unwinder
>> >> would naturally try to do.  But if you make this change, ISTM you
>> >> should make it separately because it does change behavior and Linus is
>> >> understandably leery about that.
>> >
>> > Ok, I think that makes sense to me now.  As I understand it, the
>> > "outermost" RIP is the authoritative one, because it was written by the
>> > original NMI.  Any nested NMIs will update the original and/or iret
>> > RIPs, which will only ever point to NMI entry code, and so they should
>> > be ignored.
>> >
>> > But I think there's a case where this wouldn't work:
>> >
>> > task stack
>> > NMI
>> > IST
>> > stack dump
>> >
>> > If the IST interrupt hits before the NMI has a chance to update the
>> > outermost regs, the authoritative RIP would be the original one written
>> > by HW, right?
>>
>> This should be impossible unless that last entry is MCE.  If we
>> actually fire an event that isn't MCE early in NMI entry, something
>> already went very wrong.
>
> So we don't need to support breakpoints in the early NMI entry code?

No.  Instead we try not to let it happen.  See, for example:

commit e5779e8e12299f77c2421a707855d8d124171d85
Author: Andy Lutomirski <luto@kernel.org>
Date:   Thu Jul 30 20:32:40 2015 -0700

    perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe


>>
>> Be careful, though: kernel threads might not have a "user" pt_regs in
>> the "user_mode" returns true sense.  Checking that it's either
>> user_mode() or at task_pt_regs() might be a good condition to check.
>
> Yeah.  I guess there are two distinct cases of "going off the rails":
>
> 1) The unwinder doesn't get to the end of the stack (user regs for user
>    tasks, or whatever the end is for kthreads).
>
> 2) The unwinder strays away from the current stack's "previous stack"
>    pointer.
>
> We could warn on either case (though there's probably overlap between
> the two).

I'm in favor of both.  But I think it's best to do them at the end of
the series so that they're easy to revert in the event that Linus
complains and neither of us can convince him that it's okay.
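
For concreteness, the two checks could look roughly like the
userspace sketch below.  This is only an illustration under stated
assumptions: toy_pt_regs, toy_stack_info and the helper names are
hypothetical stand-ins rather than the kernel's types or the
interfaces from this series, and toy_user_mode() just tests the low
two bits of the saved CS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct toy_pt_regs {
	unsigned long cs;
	unsigned long sp;
};

struct toy_stack_info {
	unsigned long begin;	/* lowest address of the current stack */
	unsigned long end;	/* one past the highest address */
	unsigned long next_sp;	/* the "previous stack" pointer */
};

/* Assumption: a nonzero privilege level in CS means user mode. */
static bool toy_user_mode(const struct toy_pt_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/*
 * Case 1: the unwinder finished on an acceptable end of stack, i.e.
 * either user-mode regs or the regs at the task's fixed pt_regs slot
 * (which covers kernel threads).
 */
static bool toy_reached_end(const struct toy_pt_regs *last,
			    const struct toy_pt_regs *task_regs)
{
	return last != NULL && (toy_user_mode(last) || last == task_regs);
}

/*
 * Case 2: each step either stays on the current stack or leaves it
 * exactly through the "previous stack" pointer.
 */
static bool toy_step_on_rails(const struct toy_stack_info *info,
			      unsigned long sp)
{
	if (sp >= info->begin && sp < info->end)
		return true;
	return sp == info->next_sp;
}
```

A warning would then fire when toy_reached_end() is false once the
unwind stops, or when toy_step_on_rails() is false for any step along
the way.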

>
> --
> Josh



-- 
Andy Lutomirski
AMA Capital Management, LLC
