linux-kernel.vger.kernel.org archive mirror
* [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
@ 2018-08-23 17:16 Masami Hiramatsu
  2018-08-24  1:41 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2018-08-23 17:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Masami Hiramatsu, Ravi Bangoria,
	Arnaldo Carvalho de Melo, Michael Rodin, linux-kernel

Since copy_optimized_instructions() fails to update the real RIP
address while copying several instructions to the working buffer,
it adjusts RIP-relative instructions with the wrong RIP address for
the 2nd and subsequent instructions.

This can break the kernel (e.g. cause a kernel freeze) because
a probed instruction may then refer to the wrong data. For example,
putting a kprobe on cpumask_next() hits this bug.

cpumask_next() normally looks like below if CONFIG_CPUMASK_OFFSTACK=y
(in this case nr_cpumask_bits is an alias of nr_cpu_ids):

 <cpumask_next>:
       48 89 f0                mov    %rsi,%rax
       8b 35 7b fb e2 00       mov    0xe2fb7b(%rip),%esi
        # ffffffff82db9e64 <nr_cpu_ids>
       55                      push   %rbp
...

If we put a kprobe on it and it is optimized with a jump, it becomes
like this:

	e9 95 7d 07 1e	jmpq   0xffffffffa000207a
	7b fb	jnp    0xffffffff81f8a2e2 <cpumask_next+2>
	e2 00	loop   0xffffffff81f8a2e9 <cpumask_next+9>
	55	push   %rbp

This shows that the first two "mov" instructions are copied to the
trampoline buffer at 0xffffffffa000207a. Here is the disassembled
result (optprobe template instructions skipped):

Dump of assembler code from 0xffffffffa000207a to 0xffffffffa00020ea:
	54	push   %rsp
...
	48 83 c4 08	add    $0x8,%rsp
	9d	popfq
	48 89 f0	mov    %rsi,%rax
	8b 35 82 7d db e2	mov    -0x1d24827e(%rip),%esi
        # 0xffffffff82db9e67 <nr_cpu_ids+3>

As this shows, the 2nd mov accesses *(nr_cpu_ids+3) instead of
*nr_cpu_ids. This leads to a kernel freeze because cpumask_next()
always returns 0 and for_each_cpu() never ends.

Fix this by correctly adding len to the real RIP address while
copying.

Fixes: 63fef14fc98a ("kprobes/x86: Make insn buffer always ROX and use text_poke()")
Reported-by: Michael Rodin <michael@rodin.online>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/kprobes/opt.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index eaf02f2e7300..e92672b8b490 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -189,7 +189,8 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
 	int len = 0, ret;
 
 	while (len < RELATIVEJUMP_SIZE) {
-		ret = __copy_instruction(dest + len, src + len, real, &insn);
+		ret = __copy_instruction(dest + len, src + len, real + len,
+			&insn);
 		if (!ret || !can_boost(&insn, src + len))
 			return -EINVAL;
 		len += ret;



* Re: [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
  2018-08-23 17:16 [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly Masami Hiramatsu
@ 2018-08-24  1:41 ` Steven Rostedt
  2018-08-24  7:57   ` Masami Hiramatsu
  2018-12-04  8:13 ` Ingo Molnar
  2018-12-04 23:05 ` [tip:perf/urgent] kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction tip-bot for Masami Hiramatsu
  2 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2018-08-24  1:41 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Ravi Bangoria, Arnaldo Carvalho de Melo,
	Michael Rodin, linux-kernel

On Fri, 24 Aug 2018 02:16:12 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Dump of assembler code from 0xffffffffa000207a to 0xffffffffa00020ea:
> 	54	push   %rsp
> ...
> 	48 83 c4 08	add    $0x8,%rsp
> 	9d	popfq
> 	48 89 f0	mov    %rsi,%rax
> 	8b 35 82 7d db e2	mov    -0x1d24827e(%rip),%esi
>         # 0xffffffff82db9e67 <nr_cpu_ids+3>
> 
> As it shows, the 2nd mov accesses *(nr_cpu_ids+3) instead of
> *nr_cpu_ids. This leads a kernel freeze because cpumask_next()
> always returns 0 and for_each_cpu() never ended.

Ouch! Nice catch.

> 
> Fixing this by adding len correctly to real RIP address while
> copying.
> 
> Fixes: 63fef14fc98a ("kprobes/x86: Make insn buffer always ROX and use text_poke()")
> Reported-by: Michael Rodin <michael@rodin.online>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/kprobes/opt.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
> index eaf02f2e7300..e92672b8b490 100644
> --- a/arch/x86/kernel/kprobes/opt.c
> +++ b/arch/x86/kernel/kprobes/opt.c
> @@ -189,7 +189,8 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
>  	int len = 0, ret;
>  
>  	while (len < RELATIVEJUMP_SIZE) {
> -		ret = __copy_instruction(dest + len, src + len, real, &insn);
> +		ret = __copy_instruction(dest + len, src + len, real + len,
> +			&insn);
>  		if (!ret || !can_boost(&insn, src + len))
>  			return -EINVAL;
>  		len += ret;

Looking at the change that broke this we have:

> -static int copy_optimized_instructions(u8 *dest, u8 *src)
> +static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
>  {
>         struct insn insn;
>         int len = 0, ret;
>  
>         while (len < RELATIVEJUMP_SIZE) {
> -               ret = __copy_instruction(dest + len, src + len, &insn);
> +               ret = __copy_instruction(dest + len, src + len, real, &insn);

Where "real" was added as a parameter to __copy_instruction(). Note that
we pass in "dest + len" but not "real + len", which your patch fixes.
__copy_instruction() was changed by the bad commit with:

> -int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
> +int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
>  {
>         kprobe_opcode_t buf[MAX_INSN_SIZE];
>         unsigned long recovered_insn =
> @@ -387,11 +388,11 @@ int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
>                  * have given.
>                  */
>                 newdisp = (u8 *) src + (s64) insn->displacement.value
> -                         - (u8 *) dest;
> +                         - (u8 *) real;

"real" replaces "dest", which was the first parameter to __copy_instruction.

>                         return 0;

And:

>  int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
>                                   struct kprobe *__unused)
>  {
> -       u8 *buf;
> -       int ret;
> +       u8 *buf = NULL, *slot;
> +       int ret, len;
>         long rel;
>  
>         if (!can_optimize((unsigned long)op->kp.addr))
>                 return -EILSEQ;
>  
> -       op->optinsn.insn = get_optinsn_slot();
> -       if (!op->optinsn.insn)
> +       buf = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL);
> +       if (!buf)
>                 return -ENOMEM;
>  
> +       op->optinsn.insn = slot = get_optinsn_slot();
> +       if (!slot) {
> +               ret = -ENOMEM;
> +               goto out;
> +       }
> +
>         /*
>          * Verify if the address gap is in 2GB range, because this uses
>          * a relative jump.
>          */
> -       rel = (long)op->optinsn.insn - (long)op->kp.addr + RELATIVEJUMP_SIZE;
> +       rel = (long)slot - (long)op->kp.addr + RELATIVEJUMP_SIZE;
>         if (abs(rel) > 0x7fffffff) {
> -               __arch_remove_optimized_kprobe(op, 0);
> -               return -ERANGE;
> +               ret = -ERANGE;
> +               goto err;
>         }
>  
> -       buf = (u8 *)op->optinsn.insn;

"slot" is equivalent to the old "buf".

> -       set_memory_rw((unsigned long)buf & PAGE_MASK, 1);
> +       /* Copy arch-dep-instance from template */
> +       memcpy(buf, &optprobe_template_entry, TMPL_END_IDX);
>  
>         /* Copy instructions into the out-of-line buffer */
> -       ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr);
> -       if (ret < 0) {
> -               __arch_remove_optimized_kprobe(op, 0);
> -               return ret;
> -       }
> +       ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr,
> +                                         slot + TMPL_END_IDX);

We pass in "real" as "slot + TMPL_END_IDX" and "dest" as "buf +
TMPL_END_IDX"; thus, to match the behavior of the code before this
commit, "real" needs to have "+ len" added to it.

That said...

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


> +       if (ret < 0)
> +               goto err;


* Re: [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
  2018-08-24  1:41 ` Steven Rostedt
@ 2018-08-24  7:57   ` Masami Hiramatsu
  2018-09-03  3:38     ` Masami Hiramatsu
  0 siblings, 1 reply; 7+ messages in thread
From: Masami Hiramatsu @ 2018-08-24  7:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Ravi Bangoria, Arnaldo Carvalho de Melo,
	Michael Rodin, linux-kernel

On Thu, 23 Aug 2018 21:41:09 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> We pass in "real" as "slot + TMPL_END_IDX" and "dest" as "buf +
> TMPL_END_IDX", thus to make it be equivalent to the code before this
> commit, "real" should have "+ len" added to it in order to be
> equivalent to what was there before.

Right! The broken commit split the trampoline buffer into a
"temporary" destination buffer and the "real" trampoline buffer,
and used the "real" address for the RIP-relative adjustment.
However, I forgot to update the "real" address in the copying
loop.

> 
> That said...
> 
> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Thanks!


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
  2018-08-24  7:57   ` Masami Hiramatsu
@ 2018-09-03  3:38     ` Masami Hiramatsu
  0 siblings, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2018-09-03  3:38 UTC (permalink / raw)
  To: Masami Hiramatsu, Ingo Molnar
  Cc: Steven Rostedt, Ingo Molnar, Ravi Bangoria,
	Arnaldo Carvalho de Melo, Michael Rodin, linux-kernel

Hi Ingo,

Could you pick this fix up for the urgent branch?

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
  2018-08-23 17:16 [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly Masami Hiramatsu
  2018-08-24  1:41 ` Steven Rostedt
@ 2018-12-04  8:13 ` Ingo Molnar
  2018-12-04 13:49   ` Masami Hiramatsu
  2018-12-04 23:05 ` [tip:perf/urgent] kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction tip-bot for Masami Hiramatsu
  2 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2018-12-04  8:13 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Steven Rostedt, Ravi Bangoria,
	Arnaldo Carvalho de Melo, Michael Rodin, linux-kernel


* Masami Hiramatsu <mhiramat@kernel.org> wrote:

> diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
> index eaf02f2e7300..e92672b8b490 100644
> --- a/arch/x86/kernel/kprobes/opt.c
> +++ b/arch/x86/kernel/kprobes/opt.c
> @@ -189,7 +189,8 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
>  	int len = 0, ret;
>  
>  	while (len < RELATIVEJUMP_SIZE) {
> -		ret = __copy_instruction(dest + len, src + len, real, &insn);
> +		ret = __copy_instruction(dest + len, src + len, real + len,
> +			&insn);

I applied this, except that I unbroke the line: please ignore checkpatch 
in such cases where the cure is worse than the disease ...

I.e. even if slightly over 80 cols, this is more readable:

		ret = __copy_instruction(dest + len, src + len, real + len, &insn);

Thanks,

	Ingo


* Re: [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly
  2018-12-04  8:13 ` Ingo Molnar
@ 2018-12-04 13:49   ` Masami Hiramatsu
  0 siblings, 0 replies; 7+ messages in thread
From: Masami Hiramatsu @ 2018-12-04 13:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ingo Molnar, Steven Rostedt, Ravi Bangoria,
	Arnaldo Carvalho de Melo, Michael Rodin, linux-kernel

On Tue, 4 Dec 2018 09:13:35 +0100
Ingo Molnar <mingo@kernel.org> wrote:

> I applied this, except that I unbroke the line: please ignore checkpatch 
> in such cases where the cure is worse than the disease ...

Thanks Ingo!

> 
> I.e. even if slightly over 80 cols, this is more readable:
> 
> 		ret = __copy_instruction(dest + len, src + len, real + len, &insn);

Got it.

BTW, this ugly "+ len" repetition could be avoided. I had better make
a new inline function to wrap it.


Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* [tip:perf/urgent] kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction
  2018-08-23 17:16 [BUGFIX PATCH -tip] kprobes/x86: Fix to copy RIP relative instruction correctly Masami Hiramatsu
  2018-08-24  1:41 ` Steven Rostedt
  2018-12-04  8:13 ` Ingo Molnar
@ 2018-12-04 23:05 ` tip-bot for Masami Hiramatsu
  2 siblings, 0 replies; 7+ messages in thread
From: tip-bot for Masami Hiramatsu @ 2018-12-04 23:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, tglx, ravi.bangoria, linux-kernel, rostedt, acme,
	mhiramat, hpa, peterz, michael, mingo

Commit-ID:  43a1b0cb4cd6dbfd3cd9c10da663368394d299d8
Gitweb:     https://git.kernel.org/tip/43a1b0cb4cd6dbfd3cd9c10da663368394d299d8
Author:     Masami Hiramatsu <mhiramat@kernel.org>
AuthorDate: Fri, 24 Aug 2018 02:16:12 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 4 Dec 2018 09:35:20 +0100

kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction

After copy_optimized_instructions() copies several instructions
to the working buffer it tries to fix up the real RIP address, but it
adjusts the RIP-relative instruction with an incorrect RIP address
for the 2nd and subsequent instructions due to a bug in the logic.

This will break the kernel pretty badly (with likely outcomes such as
a kernel freeze, a crash, or worse) because probed instructions can refer
to the wrong data.

For example putting kprobes on cpumask_next() typically hits this bug.

cpumask_next() is normally like below if CONFIG_CPUMASK_OFFSTACK=y
(in this case nr_cpumask_bits is an alias of nr_cpu_ids):

 <cpumask_next>:
	48 89 f0		mov    %rsi,%rax
	8b 35 7b fb e2 00	mov    0xe2fb7b(%rip),%esi # ffffffff82db9e64 <nr_cpu_ids>
	55			push   %rbp
...

If we put a kprobe on it and it gets jump-optimized, it gets
patched by the kprobes code like this:

 <cpumask_next>:
	e9 95 7d 07 1e		jmpq   0xffffffffa000207a
	7b fb			jnp    0xffffffff81f8a2e2 <cpumask_next+2>
	e2 00			loop   0xffffffff81f8a2e9 <cpumask_next+9>
	55			push   %rbp

This shows that the first two MOV instructions were copied to a
trampoline buffer at 0xffffffffa000207a.

Here is the disassembled result of the trampoline, skipping
the optprobe template instructions:

	# Dump of assembly code from 0xffffffffa000207a to 0xffffffffa00020ea:

	54			push   %rsp
	...
	48 83 c4 08		add    $0x8,%rsp
	9d			popfq
	48 89 f0		mov    %rsi,%rax
	8b 35 82 7d db e2	mov    -0x1d24827e(%rip),%esi # 0xffffffff82db9e67 <nr_cpu_ids+3>

This dump shows that the second MOV accesses *(nr_cpu_ids+3) instead of
the original *nr_cpu_ids. This leads to a kernel freeze because
cpumask_next() always returns 0 and for_each_cpu() never ends.

Fix this by adding 'len' correctly to the real RIP address while
copying.

[ mingo: Improved the changelog. ]

Reported-by: Michael Rodin <michael@rodin.online>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v4.15+
Fixes: 63fef14fc98a ("kprobes/x86: Make insn buffer always ROX and use text_poke()")
Link: http://lkml.kernel.org/r/153504457253.22602.1314289671019919596.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/kprobes/opt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 40b16b270656..6adf6e6c2933 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -189,7 +189,7 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
 	int len = 0, ret;
 
 	while (len < RELATIVEJUMP_SIZE) {
-		ret = __copy_instruction(dest + len, src + len, real, &insn);
+		ret = __copy_instruction(dest + len, src + len, real + len, &insn);
 		if (!ret || !can_boost(&insn, src + len))
 			return -EINVAL;
 		len += ret;

