From: Masami Hiramatsu <mhiramat@kernel.org>
To: Daniel Xu <dxu@dxuuu.xyz>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Ingo Molnar <mingo@kernel.org>, X86 ML <x86@kernel.org>,
linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
kuba@kernel.org, mingo@redhat.com, ast@kernel.org,
tglx@linutronix.de, kernel-team@fb.com, yhs@fb.com,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes
Date: Wed, 10 Mar 2021 19:50:36 +0900 [thread overview]
Message-ID: <20210310195036.9aefe44bda0418484886c3a9@kernel.org> (raw)
In-Reply-To: <20210309213442.fyhxozdcyxfjljih@dlxu-fedora-R90QNFJV>
On Tue, 9 Mar 2021 13:34:42 -0800
Daniel Xu <dxu@dxuuu.xyz> wrote:
> Hi Masami,
>
> Just want to clarify a few points:
>
> On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote:
> > On Sun, 7 Mar 2021 13:23:33 -0800
> > Daniel Xu <dxu@dxuuu.xyz> wrote:
> > To help your understanding, let me explain.
> >
> > If we have a code here
> >
> > caller_func:
> > 0x00 add sp, 0x20 /* 0x20 bytes stack frame allocated */
> > ...
> > 0x10 call target_func
> > 0x15 ... /* return address */
> >
> > On the stack in the entry of target_func, we have
> >
> > [stack]
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And when we put a kretprobe on the target_func, the stack will be
> >
> > [stack]
> > 0x0e0 kretprobe_trampoline
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > * "caller_func+0x15" is saved in current->kretprobe_instances.first.
> >
> > When returning from the target_func, call consumed the 0x0e0 and
> > jump to kretprobe_trampoline. Let's see the assembler code.
> >
> > ".text\n"
> > ".global kretprobe_trampoline\n"
> > ".type kretprobe_trampoline, @function\n"
> > "kretprobe_trampoline:\n"
> > /* We don't bother saving the ss register */
> > " pushq %rsp\n"
> > " pushfq\n"
> > SAVE_REGS_STRING
> > " movq %rsp, %rdi\n"
> > " call trampoline_handler\n"
> > /* Replace saved sp with true return address. */
> > " movq %rax, 19*8(%rsp)\n"
> > RESTORE_REGS_STRING
> > " popfq\n"
> > " ret\n"
> >
> > When the entry of trampoline_handler, stack is like this;
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 rsp (=0x0e0)
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And after returned from trampoline_handler, "movq" changes the
> > stack like this.
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
>
> Thanks for the detailed explanation. I think I understand kretprobe
> mechanics from a somewhat high level (kprobe saves real return address
> on entry, overwrites return address to trampoline, then trampoline
> runs handler and finally resets return address to real return address).
>
> I don't usually write much assembly so the details escape me somewhat.
>
> > So at the kretprobe handler, we have 2 issues.
> > 1) the return address (caller_func+0x15) is not on the stack.
> > this can be solved by searching from current->kretprobe_instances.
>
> Yes, agreed.
>
> > 2) the stack frame size of kretprobe_trampoline is unknown
> > Since the stackframe is fixed, the fixed number (0x98) can be used.
>
> I'm confused why this is relevant. Is it so ORC knows where to find
> saved return address in the frame?
No, because the kretprobe_trampoline is somewhat special. Usually, at the
function entry, there is a return address on the top of stack, but
kretprobe_trampoline doesn't have it.
So we have to put a hint at the function entry to mark there should be
a next return address. (and ORC unwinder must find it)
> > However, those solutions are only for the kretprobe handler. The stacktrace
> > from interrupt handler hit in the kretprobe_trampoline still doesn't work.
> >
> > So, here is my idea;
> >
> > 1) Change the trampline code to prepare stack frame at first and save
> > registers on it, instead of "push". This will makes ORC easy to setup
> > stackframe information for this code.
>
> I'm confused on the details here. But this is what Josh solves in his
> patch, right?
I'm not so sure how objtool makes the ORC information. If it can trace the
push/pop correctly, yes, it is solved.
> > 2) change the return addres fixup timing. Instead of using return value
> > of trampoline handler, before removing the real return address from
> > current->kretprobe_instances.
>
> Is the idea to have `kretprobe_trampoline` place the real return address
> on the stack (while telling ORC where to find it) _before_ running `call
> trampoline_handler` ? So that an unwind from inside the user defined
> kretprobe handler simply unwinds correctly?
No, unless calling the trampoline_handler, we can not get the real return
address. Thus, the __kretprobe_trampoline_handler() will call return address
fixup function right before unlink the current->kretprobe_instances.
> And to be extra clear, this would only work for stack_trace_save() and
> not stack_trace_save_regs()?
Yes, for the stack_trace_save_regs() and the stack-tracing inside the
kretprobe'd target function, we still need a hack as same as orc_ftrace_find().
>
> > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it
> > checks the contents of the end of stackframe (at the place of regs->sp)
> > is same as the address of it. If it is, it can find the correct address
> > from current->kretprobe_instances. If not, there is a correct address.
>
> What do you mean by "it" w.r.t. "is the same address of it"? I'm
> confused on this point.
Oh I meant,
3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, orc_find()
checks the contents of the end of stackframe (at the place of regs->sp)
is same as the address of the stackframe (Note that kretprobe_trampoline
does "push %sp" at first). If so, orc_find() can find the correct address
from current->kretprobe_instances. If not, there is a correct address.
I need to see the orc unwinder carefully, orc_find() only gets the ip but
to find stackframe, I think this should be fixed in the caller of orc_find().
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
prev parent reply other threads:[~2021-03-10 10:51 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-05 15:38 [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 1/5] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 2/5] kprobes: treewide: Replace arch_deref_entry_point() with dereference_function_descriptor() Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 3/5] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
2021-03-10 14:21 ` Miroslav Benes
2021-03-10 15:42 ` Masami Hiramatsu
2021-03-11 13:37 ` Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 4/5] kprobes: stacktrace: Recover the address changed by kretprobe Masami Hiramatsu
2021-03-05 15:39 ` [PATCH -tip 5/5] tracing: Remove kretprobe unknown indicator from stacktrace Masami Hiramatsu
2021-03-05 15:44 ` Steven Rostedt
2021-03-05 19:16 ` [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes Daniel Xu
2021-03-06 1:13 ` Masami Hiramatsu
2021-03-07 21:23 ` Daniel Xu
2021-03-08 2:52 ` Masami Hiramatsu
2021-03-08 13:05 ` Masami Hiramatsu
2021-03-09 1:19 ` Josh Poimboeuf
2021-03-10 9:57 ` Masami Hiramatsu
2021-03-10 15:08 ` Josh Poimboeuf
2021-03-10 15:55 ` Masami Hiramatsu
2021-03-10 18:31 ` Josh Poimboeuf
2021-03-11 0:20 ` Masami Hiramatsu
2021-03-11 1:06 ` Josh Poimboeuf
2021-03-11 1:54 ` Masami Hiramatsu
2021-03-11 16:51 ` Josh Poimboeuf
2021-03-12 3:22 ` Masami Hiramatsu
2021-03-10 22:46 ` Daniel Xu
2021-03-09 21:34 ` Daniel Xu
2021-03-10 10:50 ` Masami Hiramatsu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210310195036.9aefe44bda0418484886c3a9@kernel.org \
--to=mhiramat@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=dxu@dxuuu.xyz \
--cc=jpoimboe@redhat.com \
--cc=kernel-team@fb.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mingo@redhat.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).