bpf.vger.kernel.org archive mirror
* [PATCH] bpf: Fix recursion check in trampoline
@ 2021-04-27 22:41 Jiri Olsa
From: Jiri Olsa @ 2021-04-27 22:41 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

The recursion check in __bpf_prog_enter and __bpf_prog_exit leaves
some non-inlined functions unprotected:

In __bpf_prog_enter:
  - migrate_disable is called before prog->active is checked

In __bpf_prog_exit:
  - migrate_enable and rcu_read_unlock_strict are called after
    prog->active is decremented

Attaching a trampoline to any of them causes a panic like:

  traps: PANIC: double fault, error_code: 0x0
  double fault: 0000 [#1] SMP PTI
  RIP: 0010:__bpf_prog_enter+0x4/0x50
  ...
  Call Trace:
   <IRQ>
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   ...

Do the recursion check before the rest of the calls in
__bpf_prog_enter, and decrement prog->active as the last
call in __bpf_prog_exit.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/bpf/trampoline.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 4aa8b52adf25..301735f7e88e 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -558,12 +558,12 @@ static void notrace inc_misses_counter(struct bpf_prog *prog)
 u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
 	__acquires(RCU)
 {
-	rcu_read_lock();
-	migrate_disable();
 	if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
 		inc_misses_counter(prog);
 		return 0;
 	}
+	rcu_read_lock();
+	migrate_disable();
 	return bpf_prog_start_time();
 }
 
@@ -590,10 +590,12 @@ static void notrace update_prog_stats(struct bpf_prog *prog,
 void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
 	__releases(RCU)
 {
-	update_prog_stats(prog, start);
+	if (start) {
+		update_prog_stats(prog, start);
+		migrate_enable();
+		rcu_read_unlock();
+	}
 	__this_cpu_dec(*(prog->active));
-	migrate_enable();
-	rcu_read_unlock();
 }
 
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog)
-- 
2.30.2



* Re: [PATCH] bpf: Fix recursion check in trampoline
@ 2021-04-28  1:10 ` Alexei Starovoitov
From: Alexei Starovoitov @ 2021-04-28  1:10 UTC
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Tue, Apr 27, 2021 at 3:42 PM Jiri Olsa <jolsa@kernel.org> wrote:
>  u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
>         __acquires(RCU)
>  {
> -       rcu_read_lock();
> -       migrate_disable();
>         if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
>                 inc_misses_counter(prog);
>                 return 0;
>         }
> +       rcu_read_lock();
> +       migrate_disable();

That obviously doesn't work.
After the cpu_inc the task can migrate, so the cpu_dec
will happen on a different cpu, likely underflowing
that cpu's counter into negative values.
We can either mark migrate_disable as nokprobe/notrace or have a bpf
trampoline specific denylist.


* Re: [PATCH] bpf: Fix recursion check in trampoline
@ 2021-04-28  6:44   ` Jiri Olsa
From: Jiri Olsa @ 2021-04-28  6:44 UTC
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Tue, Apr 27, 2021 at 06:10:32PM -0700, Alexei Starovoitov wrote:
> On Tue, Apr 27, 2021 at 3:42 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >  u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
> >         __acquires(RCU)
> >  {
> > -       rcu_read_lock();
> > -       migrate_disable();
> >         if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
> >                 inc_misses_counter(prog);
> >                 return 0;
> >         }
> > +       rcu_read_lock();
> > +       migrate_disable();
> 
> That obviously doesn't work.
> After cpu_inc the task can migrate and cpu_dec
> will happen on a different cpu likely underflowing
> the counter into negative.

ugh right

> We can either mark migrate_disable as nokprobe/notrace or have bpf
> trampoline specific denylist.
> 

I was using notrace to disable that, but that would limit
other tracers. I'll add a bpf denylist.

jirka

