[v2,1/4] arm64: kprobes: Recover pstate.D in single-step exception handler

Message ID 156378171555.12011.2511666394591527888.stgit@devnote2
State Superseded
Series
  • arm64: kprobes: Fix some bugs in arm64 kprobes

Commit Message

Masami Hiramatsu July 22, 2019, 7:48 a.m. UTC
On arm64, if a nested kprobe is hit, it can crash the kernel with the
error message below.

[  152.118921] Unexpected kernel single-step exception at EL1

This is because commit 7419333fa15e ("arm64: kprobe: Always clear
pstate.D in breakpoint exception handler") unmasks pstate.D to allow
single-stepping but does not restore it after the single step in
the nested kprobe. That is correct *unless* a nested kprobe
(single-stepping) runs inside another kprobe's user handler.

When the 1st kprobe hits, do_debug_exception() is called. At this
point, debug exceptions must be masked (pstate.D = 1). When the 2nd
(nested) kprobe is hit before the single-step of the first kprobe, it
unmasks debug exceptions (pstate.D = 0) and returns.
Then, when the 1st kprobe sets up single-step, it saves the current
DAIF, masks DAIF, enables single-step, and restores DAIF.
However, since the "D" flag in DAIF was cleared by the 2nd kprobe, the
single-step exception happens soon after DAIF is restored.

To solve this issue, store all DAIF bits and restore them
after single-stepping.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 7419333fa15e ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
  Changes in v2:
   - Save and restore all DAIF flags.
   - Operate on pstate directly and remove spsr_set_debug_flag().
---
 arch/arm64/kernel/probes/kprobes.c |   41 ++++++------------------------------
 1 file changed, 7 insertions(+), 34 deletions(-)

Comments

James Morse July 23, 2019, 4:03 p.m. UTC | #1
Hi!

On 22/07/2019 08:48, Masami Hiramatsu wrote:
> On arm64, if a nested kprobes hit, it can crash the kernel with below
> error message.
> 
> [  152.118921] Unexpected kernel single-step exception at EL1
> 
> This is because commit 7419333fa15e ("arm64: kprobe: Always clear
> pstate.D in breakpoint exception handler") unmask pstate.D for
> doing single step but does not recover it after single step in
> the nested kprobes.

> That is correct *unless* any nested kprobes
> (single-stepping) runs inside other kprobes user handler.

(I don't think this is correct, it's just usually invisible as PSTATE.D is normally clear)


> When the 1st kprobe hits, do_debug_exception() will be called. At this
> point, debug exception (= pstate.D) must be masked (=1). When the 2nd
>  (nested) kprobe is hit before single-step of the first kprobe, it
> unmask debug exception (pstate.D = 0) and return.
> Then, when the 1st kprobe setting up single-step, it saves current
> DAIF, mask DAIF, enable single-step, and restore DAIF.
> However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
> single-step exception happens soon after restoring DAIF.

This is pretty complicated. Just to check I've understood this properly:
Stepping on a kprobe in a kprobe-user's pre_handler will cause the remainder of the
handler (the first one) to run with PSTATE.D clear. Once we enable single-step, we start
stepping the debug handler, and will never step the original kprobe'd instruction.

This is describing the most complicated way that this problem shows up! (I agree it's also
the worst)

I can get this to show up with just one kprobe. (function/file names here are meaningless):

| static int wibble(struct seq_file *m, void *discard)
| {
|        unsigned long d, flags;
|
|        flags = local_daif_save();
|
|        kprobe_me();
|        d = read_sysreg(daif);
|        local_daif_restore(flags);
|
|        seq_printf(m, "%lx\n", d);
|
|        return 0;
| }

plumbed into debugfs, then kicked using the kprobe_example module:
| root@adam:/sys/kernel/debug# cat wibble
| 3c0

| root@adam:/sys/kernel/debug# insmod ~morse/kprobe_example.ko symbol=kprobe_me
| [   69.478098] Planted kprobe at [..]
| root@adam:/sys/kernel/debug# cat wibble
| [   71.478935] <kprobe_me> pre_handler: p->addr = [..], pc = [..], pstate = 0x600003c5
| [   71.488942] <kprobe_me> post_handler: p->addr = [..], pstate = 0x600001c5
| 1c0

| root@adam:/sys/kernel/debug#

This is a problem for any code that had debug masked, not just kprobes.
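
To make the two DAIF values in the output above easier to read, here is a minimal user-space sketch (not kernel code). The bit values follow the PSR_*_BIT definitions in arch/arm64/include/uapi/asm/ptrace.h; the helper name daif_to_string() is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit positions as arch/arm64/include/uapi/asm/ptrace.h. */
#define PSR_F_BIT	UINT64_C(0x00000040)
#define PSR_I_BIT	UINT64_C(0x00000080)
#define PSR_A_BIT	UINT64_C(0x00000100)
#define PSR_D_BIT	UINT64_C(0x00000200)

/*
 * Hypothetical helper: write one letter per masked exception class
 * (D, A, I, F) and '-' for each unmasked one. "out" must have room
 * for four characters plus the terminating NUL.
 */
static void daif_to_string(uint64_t pstate, char out[5])
{
	assert(out);

	out[0] = (pstate & PSR_D_BIT) ? 'D' : '-';
	out[1] = (pstate & PSR_A_BIT) ? 'A' : '-';
	out[2] = (pstate & PSR_I_BIT) ? 'I' : '-';
	out[3] = (pstate & PSR_F_BIT) ? 'F' : '-';
	out[4] = '\0';
}
```

With this helper, 0x3c0 decodes to "DAIF" (everything masked) and 0x1c0 decodes to "-AIF": only the debug mask was lost across the kprobe.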

Can we start the commit-message with the simplest description of the problem: kprobes
manipulates the interrupted PSTATE for single step, and doesn't restore it.

(trying to understand this bug through kprobe's interaction with itself is hard!)


> To solve this issue, this stores all DAIF bits and restore it
> after single stepping.


> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index bd5dfffca272..348e02b799a2 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -29,6 +29,8 @@
>  
>  #include "decode-insn.h"
>  
> +#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

We should probably move this to daifflags.h. It's going to be useful to other series too.


Patch looks good!
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>

(I haven't tried to test the nested kprobes case...)


Thanks,

James

Masami Hiramatsu July 24, 2019, 1:09 p.m. UTC | #2
On Tue, 23 Jul 2019 17:03:47 +0100
James Morse <james.morse@arm.com> wrote:

> Hi!
> 
> On 22/07/2019 08:48, Masami Hiramatsu wrote:
> > On arm64, if a nested kprobes hit, it can crash the kernel with below
> > error message.
> > 
> > [  152.118921] Unexpected kernel single-step exception at EL1
> > 
> > This is because commit 7419333fa15e ("arm64: kprobe: Always clear
> > pstate.D in breakpoint exception handler") unmask pstate.D for
> > doing single step but does not recover it after single step in
> > the nested kprobes.
> 
> > That is correct *unless* any nested kprobes
> > (single-stepping) runs inside other kprobes user handler.
> 
> (I don't think this is correct, its just usually invisible as PSTATE.D is normally clear)

Ah, right.

> 
> 
> > When the 1st kprobe hits, do_debug_exception() will be called. At this
> > point, debug exception (= pstate.D) must be masked (=1). When the 2nd
> >  (nested) kprobe is hit before single-step of the first kprobe, it
> > unmask debug exception (pstate.D = 0) and return.
> > Then, when the 1st kprobe setting up single-step, it saves current
> > DAIF, mask DAIF, enable single-step, and restore DAIF.
> > However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
> > single-step exception happens soon after restoring DAIF.
> 
> This is pretty complicated. Just to check I've understood this properly:
> Stepping on a kprobe in a kprobe-user's pre_handler will cause the remainder of the
> handler (the first one) to run with PSTATE.D clear. Once we enable single-step, we start
> stepping the debug handler, and will never step the original kprobe'd instruction.

Yes, that's correct. I saw the single-step exception happen right after
the saved DAIF was restored.

> 
> This is describing the most complicated way that this problem shows up! (I agree its also
> the worst)
> 
> I can get this to show up with just one kprobe. (function/file names here are meaningless):
> 
> | static int wibble(struct seq_file *m, void *discard)
> | {
> |        unsigned long d, flags;
> |
> |        flags = local_daif_save();
> |
> |        kprobe_me();
> |        d = read_sysreg(daif);
> |        local_daif_restore(flags);
> |
> |        seq_printf(m, "%lx\n", d);
> |
> |        return 0;
> | }
> 
> plumbed into debugfs, then kicked using the kprobe_example module:
> | root@adam:/sys/kernel/debug# cat wibble
> | 3c0
> 
> | root@adam:/sys/kernel/debug# insmod ~morse/kprobe_example.ko symbol=kprobe_me
> | [   69.478098] Planted kprobe at [..]
> | root@adam:/sys/kernel/debug# cat wibble
> | [   71.478935] <kprobe_me> pre_handler: p->addr = [..], pc = [..], pstate = 0x600003c5
> | [   71.488942] <kprobe_me> post_handler: p->addr = [..], pstate = 0x600001c5
> | 1c0
> | root@adam:/sys/kernel/debug#
> 
> This is problem for any code that had debug masked, not just kprobes.

Agreed.

> 
> Can we start the commit-message with the simplest description of the problem: kprobes
> manipulates the interrupted PSTATE for single step, and doesn't restore it.

Thanks for making it clearer :)

> 
> (trying to understand this bug through kprobe's interaction with itself is hard!)
> 
> 
> > To solve this issue, this stores all DAIF bits and restore it
> > after single stepping.
> 
> 
> > diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> > index bd5dfffca272..348e02b799a2 100644
> > --- a/arch/arm64/kernel/probes/kprobes.c
> > +++ b/arch/arm64/kernel/probes/kprobes.c
> > @@ -29,6 +29,8 @@
> >  
> >  #include "decode-insn.h"
> >  
> > +#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
> 
> We should probably move this to daifflags.h. Its going to be useful to other series too.

OK.

> 
> 
> Patch looks good!
> Reviewed-by: James Morse <james.morse@arm.com>
> Tested-by: James Morse <james.morse@arm.com>
> 
> (I haven't tried to test the nested kprobes case...)

OK, I'll update and resend it.

Thank you!

> 
> 
> Thanks,
> 
> James

Patch

diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index bd5dfffca272..348e02b799a2 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -29,6 +29,8 @@ 
 
 #include "decode-insn.h"
 
+#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 
@@ -167,33 +169,6 @@  static void __kprobes set_current_kprobe(struct kprobe *p)
 	__this_cpu_write(current_kprobe, p);
 }
 
-/*
- * When PSTATE.D is set (masked), then software step exceptions can not be
- * generated.
- * SPSR's D bit shows the value of PSTATE.D immediately before the
- * exception was taken. PSTATE.D is set while entering into any exception
- * mode, however software clears it for any normal (none-debug-exception)
- * mode in the exception entry. Therefore, when we are entering into kprobe
- * breakpoint handler from any normal mode then SPSR.D bit is already
- * cleared, however it is set when we are entering from any debug exception
- * mode.
- * Since we always need to generate single step exception after a kprobe
- * breakpoint exception therefore we need to clear it unconditionally, when
- * we become sure that the current breakpoint exception is for kprobe.
- */
-static void __kprobes
-spsr_set_debug_flag(struct pt_regs *regs, int mask)
-{
-	unsigned long spsr = regs->pstate;
-
-	if (mask)
-		spsr |= PSR_D_BIT;
-	else
-		spsr &= ~PSR_D_BIT;
-
-	regs->pstate = spsr;
-}
-
 /*
  * Interrupts need to be disabled before single-step mode is set, and not
  * reenabled until after single-step mode ends.
@@ -205,17 +180,17 @@  spsr_set_debug_flag(struct pt_regs *regs, int mask)
 static void __kprobes kprobes_save_local_irqflag(struct kprobe_ctlblk *kcb,
 						struct pt_regs *regs)
 {
-	kcb->saved_irqflag = regs->pstate;
+	kcb->saved_irqflag = regs->pstate & PSR_DAIF_MASK;
 	regs->pstate |= PSR_I_BIT;
+	/* Unmask PSTATE.D for enabling software step exceptions. */
+	regs->pstate &= ~PSR_D_BIT;
 }
 
 static void __kprobes kprobes_restore_local_irqflag(struct kprobe_ctlblk *kcb,
 						struct pt_regs *regs)
 {
-	if (kcb->saved_irqflag & PSR_I_BIT)
-		regs->pstate |= PSR_I_BIT;
-	else
-		regs->pstate &= ~PSR_I_BIT;
+	regs->pstate &= ~PSR_DAIF_MASK;
+	regs->pstate |= kcb->saved_irqflag;
 }
 
 static void __kprobes
@@ -252,8 +227,6 @@  static void __kprobes setup_singlestep(struct kprobe *p,
 
 		set_ss_context(kcb, slot);	/* mark pending ss */
 
-		spsr_set_debug_flag(regs, 0);
-
 		/* IRQs and single stepping do not mix well. */
 		kprobes_save_local_irqflag(kcb, regs);
 		kernel_enable_single_step(regs);
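
The effect of the change can be modelled outside the kernel. The following is a rough user-space sketch, not the kernel implementation: struct model_regs and struct model_kcb are simplified, hypothetical stand-ins for pt_regs and kprobe_ctlblk, and the old_*/new_* function names are invented for the comparison. The bodies mirror the removed and added lines of the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit positions as arch/arm64/include/uapi/asm/ptrace.h. */
#define PSR_F_BIT	UINT64_C(0x00000040)
#define PSR_I_BIT	UINT64_C(0x00000080)
#define PSR_A_BIT	UINT64_C(0x00000100)
#define PSR_D_BIT	UINT64_C(0x00000200)
#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

/* Hypothetical stand-ins for struct pt_regs and struct kprobe_ctlblk. */
struct model_regs { uint64_t pstate; };
struct model_kcb  { uint64_t saved_irqflag; };

/* Before the patch: D is cleared first, then the whole pstate is saved. */
static void old_save(struct model_kcb *kcb, struct model_regs *regs)
{
	regs->pstate &= ~PSR_D_BIT;	/* spsr_set_debug_flag(regs, 0) */
	kcb->saved_irqflag = regs->pstate;
	regs->pstate |= PSR_I_BIT;
}

/* Before the patch: only the I bit is restored, so D stays cleared. */
static void old_restore(struct model_kcb *kcb, struct model_regs *regs)
{
	if (kcb->saved_irqflag & PSR_I_BIT)
		regs->pstate |= PSR_I_BIT;
	else
		regs->pstate &= ~PSR_I_BIT;
}

/* After the patch: save all four DAIF bits, then mask I and unmask D. */
static void new_save(struct model_kcb *kcb, struct model_regs *regs)
{
	kcb->saved_irqflag = regs->pstate & PSR_DAIF_MASK;
	regs->pstate |= PSR_I_BIT;
	regs->pstate &= ~PSR_D_BIT;
}

/* After the patch: put back exactly the DAIF bits that were saved. */
static void new_restore(struct model_kcb *kcb, struct model_regs *regs)
{
	/* Only DAIF bits may have been stored by new_save(). */
	assert((kcb->saved_irqflag & ~PSR_DAIF_MASK) == 0);

	regs->pstate &= ~PSR_DAIF_MASK;
	regs->pstate |= kcb->saved_irqflag;
}
```

Starting from pstate = 0x600003c5 (the pre_handler value in the test output earlier in the thread), the old pair leaves 0x600001c5 after restore, matching the reported post_handler value, while the new pair returns to 0x600003c5.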