linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/unwind/orc: Fix unwind ip when kprobes hits push/pop
@ 2022-12-01  8:53 Chen Zhongjin
  2023-02-10 21:30 ` Josh Poimboeuf
  0 siblings, 1 reply; 2+ messages in thread
From: Chen Zhongjin @ 2022-12-01  8:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: chenzhongjin, jpoimboe, peterz, tglx, mingo, bp, dave.hansen, x86, hpa

When unwind stack at asm_exc_int3, the orc type is UNWIND_HINT_TYPE_REGS
and the unwinder will use pt_regs->ip to find next orc, which point to
the probed insn + INT3_INSN_SIZE.

If the probed insn is push/pop, it will point to the next insn which has
different orc state. Before the probed insn has been really excuted in
single step, using next insn ip to find orc will get a wrong unwinding
result.

So, when there is kprobe running and the previous op code is int3,
state->signal should be false so that ip - 1 will be used to find next
orc, until the probed push/pop has been single steped and kprobe set
kprobe_status as KPROBE_HIT_SSDONE.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
dump_stack() in pre_handler probing push:

 Call Trace:
  <TASK>
  dump_stack_lvl+0x79/0x9b
  push_handler_pre+0x1b/0x2e [kp_unwind]
  aggr_pre_handler+0xd8/0x180
  ? opt_pre_handler+0x160/0x160
  ? run_push+0xd/0x61 [kp_unwind]
  kprobe_int3_handler+0x3f3/0x530
  do_int3+0x3b/0x80
  exc_int3+0x2b/0x80
  asm_exc_int3+0x35/0x40
 RIP: 0010:run_push+0xd/0x61 [kp_unwind]
 RSP: 0018:ffff88800650f998 EFLAGS: 00000293
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0708060
 RDX: ffff8880124b0000 RSI: 0000000000000000 RDI: ffffed1000ca1f28
 RBP: 0000000000000000 R08: 0000000000000020 R09: ffffed1000ca1ef7
 R10: ffff88800650f7b7 R11: ffffed1000ca1ef6 R12: ffffffffc0668000
 R13: 0000000000000000 R14: fffffbfff80e14f8 R15: 1ffffffff80e14f9
  ? 0xffffffffc0668000
  ? run_push+0xb/0x61 [kp_unwind]
  ? run_push+0xd/0x61 [kp_unwind]
  ? kprobe_init+0x8a/0x1000 [kp_unwind]
  ? do_one_initcall+0xd0/0x4e0
  ? trace_event_raw_event_initcall_level+0x1c0/0x1c0
  ? rcu_read_lock_sched_held+0xa5/0xd0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? __kmem_cache_alloc_node+0x1da/0x780
  ? kasan_unpoison+0x23/0x50
  ? 0xffffffffc0668000
  ? do_init_module+0x1cc/0x6a0
  ? load_module+0x5eee/0x7210
  ? ext4_file_read_iter+0x161/0x3a0
  ? module_frob_arch_sections+0x40/0x40
  ? security_file_permission+0x408/0x600
  ? security_kernel_post_read_file+0x93/0xc0
  ? __do_sys_finit_module+0x13c/0x200
  ? __do_sys_finit_module+0x13c/0x200
  ? __ia32_sys_init_module+0xb0/0xb0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? do_syscall_64+0x38/0x90
  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
  </TASK>

After apply this patch, stack trace is correct: 

 Call Trace:
  <TASK>
  dump_stack_lvl+0x79/0x9b
  push_handler_post+0x1b/0x27 [kp_unwind]
  aggr_post_handler+0xdc/0x160
  ? aggr_pre_handler+0x180/0x180
  ? run_push+0xc/0x61 [kp_unwind]
  kprobe_post_process+0x7b/0x210
  kprobe_int3_handler+0x307/0x530
  ? 0xffffffffc0770000
  ? 0xffffffffc0770002
  do_int3+0x3b/0x80
  exc_int3+0x2b/0x80
  asm_exc_int3+0x35/0x40
 RIP: 0010:run_push+0xd/0x61 [kp_unwind]
 RSP: 0018:ffff88801014f9a0 EFLAGS: 00000293
 RAX: 0000000000000293 RBX: 0000000000000000 RCX: ffffffffc0760060
 RDX: ffff88800d1f8000 RSI: 0000000000000000 RDI: ffffed1002029f28
 RBP: 0000000000000000 R08: 0000000000000020 R09: ffffed1002029ef7
 R10: ffff88801014f7b7 R11: ffffed1002029ef6 R12: ffffffffc0768000
 R13: 0000000000000003 R14: fffffbfff80ec4f8 R15: 1ffffffff80ec4f9
  ? 0xffffffffc0768000
  ? run_push+0xb/0x61 [kp_unwind]
  ? 0xffffffffc0770002
  kprobe_init+0x8a/0x1000 [kp_unwind]
  do_one_initcall+0xd0/0x4e0
  ? trace_event_raw_event_initcall_level+0x1c0/0x1c0
  ? rcu_read_lock_sched_held+0xa5/0xd0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? __kmem_cache_alloc_node+0x1da/0x780
  ? kasan_unpoison+0x23/0x50
  ? 0xffffffffc0768000
  do_init_module+0x1cc/0x6a0
  load_module+0x5eee/0x7210
  ? ext4_file_read_iter+0x161/0x3a0
  ? module_frob_arch_sections+0x40/0x40
  ? security_file_permission+0x408/0x600
  ? security_kernel_post_read_file+0x93/0xc0
  ? __do_sys_finit_module+0x13c/0x200
  __do_sys_finit_module+0x13c/0x200
  ? __ia32_sys_init_module+0xb0/0xb0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x38/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7fe46ad1b839
---
 arch/x86/kernel/unwind_orc.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index c059820dfaea..81e74b7e7fda 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -2,6 +2,7 @@
 #include <linux/objtool.h>
 #include <linux/module.h>
 #include <linux/sort.h>
+#include <linux/kprobes.h>
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 #include <asm/unwind.h>
@@ -425,6 +426,7 @@ bool unwind_next_frame(struct unwind_state *state)
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
+	u8 op;
 
 	if (unwind_done(state))
 		return false;
@@ -568,7 +570,21 @@ bool unwind_next_frame(struct unwind_state *state)
 		state->regs = (struct pt_regs *)sp;
 		state->prev_regs = NULL;
 		state->full_regs = true;
-		state->signal = true;
+#ifdef CONFIG_KPROBES
+		/*
+		 * When kprobe replaces push/pop to int3, pt_regs->ip points to
+		 * the next insn which has different orc state.
+		 * Before push/pop is really excuted in single step, signal should
+		 * be set to false so we will use ip - 1 to find correct next orc.
+		 */
+		if (kprobe_running() &&
+			get_kprobe_ctlblk()->kprobe_status != KPROBE_HIT_SSDONE &&
+			!get_kernel_nofault(op, (u8 *)state->ip - 1) &&
+			op == INT3_INSN_OPCODE)
+			state->signal = false;
+		else
+#endif
+			state->signal = true;
 		break;
 
 	case UNWIND_HINT_TYPE_REGS_PARTIAL:
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] x86/unwind/orc: Fix unwind ip when kprobes hits push/pop
  2022-12-01  8:53 [PATCH] x86/unwind/orc: Fix unwind ip when kprobes hits push/pop Chen Zhongjin
@ 2023-02-10 21:30 ` Josh Poimboeuf
  0 siblings, 0 replies; 2+ messages in thread
From: Josh Poimboeuf @ 2023-02-10 21:30 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, peterz, tglx, mingo, bp, dave.hansen, x86, hpa

On Thu, Dec 01, 2022 at 04:53:11PM +0800, Chen Zhongjin wrote:
> When unwind stack at asm_exc_int3, the orc type is UNWIND_HINT_TYPE_REGS
> and the unwinder will use pt_regs->ip to find next orc, which point to
> the probed insn + INT3_INSN_SIZE.
> 
> If the probed insn is push/pop, it will point to the next insn which has
> different orc state. Before the probed insn has been really excuted in
> single step, using next insn ip to find orc will get a wrong unwinding
> result.
> 
> So, when there is kprobe running and the previous op code is int3,
> state->signal should be false so that ip - 1 will be used to find next
> orc, until the probed push/pop has been single steped and kprobe set
> kprobe_status as KPROBE_HIT_SSDONE.
> 
> Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
> Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>

Hi Chen,

Thanks for reporting this, and sorry for the delay.  I would rather fix
this in a more general way, by specifying 'signal' (or lack thereof) in
the ORC data.  I will work up a patch.

-- 
Josh

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-02-10 21:31 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-01  8:53 [PATCH] x86/unwind/orc: Fix unwind ip when kprobes hits push/pop Chen Zhongjin
2023-02-10 21:30 ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).