linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Daniel Xu <dxu@dxuuu.xyz>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>, X86 ML <x86@kernel.org>,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	kuba@kernel.org, mingo@redhat.com, ast@kernel.org,
	tglx@linutronix.de, kernel-team@fb.com, yhs@fb.com,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
Date: Wed, 10 Mar 2021 19:50:36 +0900	[thread overview]
Message-ID: <20210310195036.9aefe44bda0418484886c3a9@kernel.org> (raw)
In-Reply-To: <20210309213442.fyhxozdcyxfjljih@dlxu-fedora-R90QNFJV>

On Tue, 9 Mar 2021 13:34:42 -0800
Daniel Xu <dxu@dxuuu.xyz> wrote:

> Hi Masami,
> 
> Just want to clarify a few points:
> 
> On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote:
> > On Sun, 7 Mar 2021 13:23:33 -0800
> > Daniel Xu <dxu@dxuuu.xyz> wrote:
> > To help your understanding, let me explain.
> > 
> > If we have a code here
> > 
> > caller_func:
> > 0x00 add sp, 0x20	/* 0x20 bytes stack frame allocated */
> > ...
> > 0x10 call target_func
> > 0x15 ... /* return address */
> > 
> > On the stack in the entry of target_func, we have
> > 
> > [stack]
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries  are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> > 
> > And when we put a kretprobe on the target_func, the stack will be
> > 
> > [stack]
> > 0x0e0 kretprobe_trampoline
> > ... /* 0x20 bytes = 4 entries  are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> > 
> > * "caller_func+0x15" is saved in current->kretprobe_instances.first.
> > 
> > When returning from the target_func, call consumed the 0x0e0 and
> > jump to kretprobe_trampoline. Let's see the assembler code.
> > 
> >         ".text\n"
> >         ".global kretprobe_trampoline\n"
> >         ".type kretprobe_trampoline, @function\n"
> >         "kretprobe_trampoline:\n"
> >         /* We don't bother saving the ss register */
> >         "       pushq %rsp\n"
> >         "       pushfq\n"
> >         SAVE_REGS_STRING
> >         "       movq %rsp, %rdi\n"
> >         "       call trampoline_handler\n"
> >         /* Replace saved sp with true return address. */
> >         "       movq %rax, 19*8(%rsp)\n"
> >         RESTORE_REGS_STRING
> >         "       popfq\n"
> >         "       ret\n"
> > 
> > When the entry of trampoline_handler, stack is like this;
> > 
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ...     /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 rsp (=0x0e0)
> > ... /* 0x20 bytes = 4 entries  are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> > 
> > And after returned from trampoline_handler, "movq" changes the
> > stack like this.
> > 
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ...     /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries  are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> 
> Thanks for the detailed explanation. I think I understand kretprobe
> mechanics from a somewhat high level (kprobe saves real return address
> on entry, overwrites return address to trampoline, then trampoline
> runs handler and finally resets return address to real return address).
> 
> I don't usually write much assembly so the details escape me somewhat.
> 
> > So at the kretprobe handler, we have 2 issues.
> > 1) the return address (caller_func+0x15) is not on the stack.
> >    this can be solved by searching from current->kretprobe_instances.
> 
> Yes, agreed.
> 
> > 2) the stack frame size of kretprobe_trampoline is unknown
> >    Since the stackframe is fixed, the fixed number (0x98) can be used.
> 
> I'm confused why this is relevant. Is it so ORC knows where to find
> saved return address in the frame?

No, because the kretprobe_trampoline is somewhat special. Usually, at the
function entry, there is a return address on the top of stack, but
kretprobe_trampoline doesn't have it.
So we have to put a hint at the function entry to mark there should be
a next return address. (and ORC unwinder must find it)

> > However, those solutions are only for the kretprobe handler. The stacktrace
> > from interrupt handler hit in the kretprobe_trampoline still doesn't work.
> > 
> > So, here is my idea;
> > 
> > 1) Change the trampline code to prepare stack frame at first and save
> >    registers on it, instead of "push". This will makes ORC easy to setup
> >    stackframe information for this code.
> 
> I'm confused on the details here. But this is what Josh solves in his
> patch, right?

I'm not so sure how objtool makes the ORC information. If it can trace the
push/pop correctly, yes, it is solved.

> > 2) change the return addres fixup timing. Instead of using return value
> >    of trampoline handler, before removing the real return address from
> >    current->kretprobe_instances.
> 
> Is the idea to have `kretprobe_trampoline` place the real return address
> on the stack (while telling ORC where to find it) _before_ running `call
> trampoline_handler` ? So that an unwind from inside the user defined
> kretprobe handler simply unwinds correctly?

No, unless calling the trampoline_handler, we can not get the real return
address. Thus, the __kretprobe_trampoline_handler() will call return address
fixup function right before unlink the current->kretprobe_instances.

> And to be extra clear, this would only work for stack_trace_save() and
> not stack_trace_save_regs()?

Yes, for the stack_trace_save_regs() and the stack-tracing inside the
kretprobe'd target function, we still need a hack as same as orc_ftrace_find().

> 
> > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it
> >    checks the contents of the end of stackframe (at the place of regs->sp)
> >    is same as the address of it. If it is, it can find the correct address
> >    from current->kretprobe_instances. If not, there is a correct address.
> 
> What do you mean by "it" w.r.t. "is the same address of it"? I'm
> confused on this point.

Oh I meant,

3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, orc_find()
    checks the contents of the end of stackframe (at the place of regs->sp)
   is same as the address of the stackframe (Note that kretprobe_trampoline
   does "push %sp" at first). If so, orc_find() can find the correct address
   from current->kretprobe_instances. If not, there is a correct address.

I need to see the orc unwinder carefully, orc_find() only gets the ip but
to find stackframe, I think this should be fixed in the caller of orc_find().

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

      reply	other threads:[~2021-03-10 10:51 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-05 15:38 [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 1/5] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 2/5] kprobes: treewide: Replace arch_deref_entry_point() with dereference_function_descriptor() Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 3/5] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
2021-03-10 14:21   ` Miroslav Benes
2021-03-10 15:42     ` Masami Hiramatsu
2021-03-11 13:37       ` Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 4/5] kprobes: stacktrace: Recover the address changed by kretprobe Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 5/5] tracing: Remove kretprobe unknown indicator from stacktrace Masami Hiramatsu
2021-03-05 15:44   ` Steven Rostedt
2021-03-05 19:16 ` [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Daniel Xu
2021-03-06  1:13   ` Masami Hiramatsu
2021-03-07 21:23     ` Daniel Xu
2021-03-08  2:52       ` Masami Hiramatsu
2021-03-08 13:05         ` Masami Hiramatsu
2021-03-09  1:19         ` Josh Poimboeuf
2021-03-10  9:57           ` Masami Hiramatsu
2021-03-10 15:08             ` Josh Poimboeuf
2021-03-10 15:55               ` Masami Hiramatsu
2021-03-10 18:31                 ` Josh Poimboeuf
2021-03-11  0:20                   ` Masami Hiramatsu
2021-03-11  1:06                     ` Josh Poimboeuf
2021-03-11  1:54                       ` Masami Hiramatsu
2021-03-11 16:51                         ` Josh Poimboeuf
2021-03-12  3:22                           ` Masami Hiramatsu
2021-03-10 22:46                 ` Daniel Xu
2021-03-09 21:34         ` Daniel Xu
2021-03-10 10:50           ` Masami Hiramatsu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310195036.9aefe44bda0418484886c3a9@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dxu@dxuuu.xyz \
    --cc=jpoimboe@redhat.com \
    --cc=kernel-team@fb.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).