LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	peterz@infradead.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrea Righi <righi.andrea@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: [PATCH 0/2] kprobes: Fix kretprobe incorrect stacking order problem
Date: Mon,  7 Jan 2019 22:31:34 +0900
Message-ID: <154686789378.15479.2886543882215785247.stgit@devbox> (raw)

Hello,

On recent talk with Andrea, I started more precise investigation on
the kernel panic with kretprobes on notrace functions, which Francis
had been reported last year ( https://lkml.org/lkml/2017/7/14/466 ).

At first, I tried to reproduce the issue. I picked up __fdget and
ftrace_ops_assist_func as probed functions.
With CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, I could reproduce the kernel
panic as below.

=====
/sys/kernel/debug/tracing # echo "r:event_1 __fdget" >> kprobe_events 
/sys/kernel/debug/tracing # echo "r:event_2 ftrace_ops_assist_func" >> kprobe_events
/sys/kernel/debug/tracing # echo 1 > events/kprobes/enable 
[   70.491856] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   70.493203] PGD 800000001c62e067 P4D 800000001c62e067 PUD 1b5bf067 PMD 0 
[   70.494247] Oops: 0010 [#1] PREEMPT SMP PTI
[   70.494918] CPU: 6 PID: 1210 Comm: sh Not tainted 4.20.0-rc3+ #58
[   70.495931] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
[   70.497906] RIP: 0010:0x10
[   70.498465] Code: Bad RIP value.
[   70.499077] RSP: 0018:ffffb1d4c0347e78 EFLAGS: 00010246
[   70.499959] RAX: 00000000fffffff7 RBX: 0000000000000000 RCX: 0000000000000000
[   70.501383] RDX: ffff88f19f9c4f80 RSI: ffffffffb7d75e12 RDI: ffffffffb7d0ede7
[   70.502501] RBP: 00007ffc7061af20 R08: 0000000080000002 R09: ffff88f19f9c4f80
[   70.503698] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000005401
[   70.504810] R13: 0000000000000000 R14: ffffb1d4c0347f58 R15: 0000000000000000
[   70.506028] FS:  0000000000922880(0000) GS:ffff88f19d380000(0000) knlGS:0000000000000000
[   70.507354] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.508271] CR2: ffffffffffffffe6 CR3: 000000001f916000 CR4: 00000000000006e0
[   70.509419] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.510803] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   70.511748] Call Trace:
[   70.512225]  ? ksys_ioctl+0x70/0x70
[   70.512884]  ? __fdget+0x5/0x10
[   70.513454]  ? __fdget+0x5/0x10
[   70.513980]  ? copy_oldmem_page_encrypted+0x20/0x20
[   70.514815]  ? __x64_sys_ioctl+0x16/0x20
[   70.515596]  ? do_syscall_64+0x50/0x100
[   70.516229]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   70.517143] Modules linked in:
[   70.517806] CR2: 0000000000000010
[   70.518527] ---[ end trace ece844ac05189f10 ]---
[   70.519417] RIP: 0010:0x10
[   70.520026] Code: Bad RIP value.
[   70.520800] RSP: 0018:ffffb1d4c0347e78 EFLAGS: 00010246
[   70.521948] RAX: 00000000fffffff7 RBX: 0000000000000000 RCX: 0000000000000000
[   70.523315] RDX: ffff88f19f9c4f80 RSI: ffffffffb7d75e12 RDI: ffffffffb7d0ede7
[   70.524515] RBP: 00007ffc7061af20 R08: 0000000080000002 R09: ffff88f19f9c4f80
[   70.525702] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000005401
[   70.526715] R13: 0000000000000000 R14: ffffb1d4c0347f58 R15: 0000000000000000
[   70.527673] FS:  0000000000922880(0000) GS:ffff88f19d380000(0000) knlGS:0000000000000000
[   70.528896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.529851] CR2: ffffffffffffffe6 CR3: 000000001f916000 CR4: 00000000000006e0
[   70.530922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.531907] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Killed
=====

This seems the kernel is trying to execute incorrect address.

Next, I focused on the combination of probes. From Francis's report,
this issue caused by the combination of kretprobes, not kprobes.
I ensured that was true.

r __fdget & r ftrace_ops_assist_func => NG
p __fdget & p ftrace_ops_assist_func => OK
p __fdget & r ftrace_ops_assist_func => OK
r __fdget & p ftrace_ops_assist_func => OK

r: kretprobe, p: kprobe

This gave me a hint of what happened. I can explain the cause of this
issue as below;
 
Correct processing of kretprobe on probed-function.

<caller>
 -><probed-function>
   ->fentry
    ->ftrace_ops_assist_func()
     ->kprobe_ftrace_handler()
      ...->pre_handler_kretprobe()
           push the return address (caller) of probed-function to top of the
           kretprobe list and replace it with kretprobe_trampoline.
    <-(ftrace_ops_assist_func())
   <-(fentry)
  <-(probed-function)
[kretprobe_trampoline]
 ->tampoline_handler()
   pop the return address (caller) from top of the kretprobe list
 <-(trampoline_handler())
<caller>

When we put a kretprobe on ftrace_ops_assist_func(), below happens

<caller>
 -><probed-function>
   ->fentry
    ->ftrace_ops_assist_func()
     ->int3
      ->kprobe_int3_handler()
      ...->pre_handler_kretprobe()
           push the return address (*fentry*) of ftrace_ops_assist_func() to
           top of the kretprobe list and replace it with kretprobe_trampoline.
      <-kprobe_int3_handler()
     <-(int3)
     ->kprobe_ftrace_handler()
      ...->pre_handler_kretprobe()
           push the return address (caller) of probed-function to top of the
           kretprobe list and replace it with kretprobe_trampoline.
     <-(kprobe_ftrace_handler())
    <-(ftrace_ops_assist_func())
    [kretprobe_trampoline]
     ->tampoline_handler()
       pop the return address (caller) from top of the kretprobe list
     <-(trampoline_handler())
    <caller>
    [run caller with incorrect stack information]
   <-(<caller>)
  !!KERNEL PANIC!!

Therefore, this kernel panic happens only when we put 2 k*ret*probes on
ftrace_ops_assist_func() and other functions. If we put kprobes, it
doesn't cause any issue, since it doesn't change the return address.

To fix (or just avoid) this issue, we can introduce a frame pointer
verification to skip wrong order entries. And I also would like to
blacklist those functions because those are part of ftrace-based 
kprobe handling routine.

BTW, this is not all of issues. To remove CONFIG_KPROBE_EVENTS_ON_NOTRACE
I'm trying to find out other notrace functions which can cause
kernel crash by probing. Mostly done on x86, so I'll post it
after this series.

Thank you,

---

Masami Hiramatsu (2):
      x86/kprobes: Verify stack frame on kretprobe
      kprobes: Mark ftrace mcount handler functions nokprobe


 arch/x86/kernel/kprobes/core.c |   26 ++++++++++++++++++++++++++
 include/linux/kprobes.h        |    1 +
 kernel/trace/ftrace.c          |    5 ++++-
 3 files changed, 31 insertions(+), 1 deletion(-)

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

             reply index

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 13:31 Masami Hiramatsu [this message]
2019-01-07 13:32 ` [PATCH 1/2] x86/kprobes: Verify stack frame on kretprobe Masami Hiramatsu
2019-01-07 13:32 ` [PATCH 2/2] kprobes: Mark ftrace mcount handler functions nokprobe Masami Hiramatsu
2019-01-07 14:55   ` Andrea Righi
2019-01-07 17:29     ` Steven Rostedt
2019-01-08  2:41       ` Masami Hiramatsu
2019-01-08  2:40     ` Masami Hiramatsu
2019-01-07 17:23   ` kbuild test robot
2019-01-07 17:38   ` kbuild test robot
2019-01-07 17:28 ` [PATCH 0/2] kprobes: Fix kretprobe incorrect stacking order problem Andrea Righi
2019-01-07 18:34 ` Andrea Righi
2019-01-07 19:27   ` Steven Rostedt
2019-01-07 19:52     ` Andrea Righi
2019-01-07 19:59       ` Steven Rostedt
2019-01-07 21:19         ` Andrea Righi
2019-01-07 21:28           ` Steven Rostedt
2019-01-07 21:34             ` Andrea Righi
2019-01-08  2:56               ` Masami Hiramatsu

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=154686789378.15479.2886543882215785247.stgit@devbox \
    --to=mhiramat@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=righi.andrea@gmail.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git