bpf.vger.kernel.org archive mirror
* [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Jakub Sitnicki @ 2020-05-20 17:22 UTC
  To: bpf; +Cc: netdev, kernel-team, Petar Penkov, Stanislav Fomichev

When attaching a flow dissector program to a network namespace with
bpf(BPF_PROG_ATTACH, ...) we grab a reference to bpf_prog.

If the netns gets destroyed while a flow dissector is still attached, and
there are no other references to the prog, we leak the reference and the
program remains loaded.

The leak can be reproduced by running the flow dissector tests from
selftests/bpf:

  # bpftool prog list
  # ./test_flow_dissector.sh
  ...
  selftests: test_flow_dissector [PASS]
  # bpftool prog list
  4: flow_dissector  name _dissect  tag e314084d332a5338  gpl
          loaded_at 2020-05-20T18:50:53+0200  uid 0
          xlated 552B  jited 355B  memlock 4096B  map_ids 3,4
          btf_id 4
  #

Fix it by detaching the flow dissector program when the netns is going away.

Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---

Discovered while working on bpf_link support for netns-attached progs.
Looks like bpf tree material so pushing it out separately.

-jkbs
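
For illustration only, and not part of the patch: the reference-count
imbalance described above can be modeled in plain userspace C. Every name
below (model_prog, model_netns, and so on) is a hypothetical stand-in, not a
kernel type or API, and the model ignores RCU and locking entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_prog {
	int refcnt;	/* models the bpf_prog reference count */
};

struct model_netns {
	struct model_prog *flow_dissector_prog;	/* attached prog, or NULL */
};

/* Attach takes a reference on behalf of the netns. */
static void model_attach(struct model_netns *ns, struct model_prog *prog)
{
	assert(ns->flow_dissector_prog == NULL);
	prog->refcnt++;
	ns->flow_dissector_prog = prog;
}

/* Drop one reference; reaching zero means the prog would be freed. */
static void model_prog_put(struct model_prog *prog)
{
	assert(prog->refcnt > 0);
	prog->refcnt--;
}

/* Teardown without the fix: the netns goes away, the reference stays. */
static void model_netns_exit_leaky(struct model_netns *ns)
{
	ns->flow_dissector_prog = NULL;
}

/* Teardown with the fix: clear the pointer and drop the reference. */
static void model_netns_pre_exit(struct model_netns *ns)
{
	struct model_prog *attached = ns->flow_dissector_prog;

	if (attached) {
		ns->flow_dissector_prog = NULL;
		model_prog_put(attached);
	}
}

/* Refcount left once the loader closed its fd and the netns is gone. */
static int model_refs_after_teardown(int with_fix)
{
	struct model_prog prog = { .refcnt = 1 };	/* loader's fd reference */
	struct model_netns ns = { .flow_dissector_prog = NULL };

	model_attach(&ns, &prog);
	model_prog_put(&prog);		/* loader closes its fd */
	if (with_fix)
		model_netns_pre_exit(&ns);
	else
		model_netns_exit_leaky(&ns);
	return prog.refcnt;
}
```

In this model, model_refs_after_teardown(0) leaves one reference behind (the
leaked program), while model_refs_after_teardown(1) drops the count to zero.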

 net/core/flow_dissector.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 3eff84824c8b..b6179cd20158 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -179,6 +179,27 @@ int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
 	return 0;
 }
 
+static void __net_exit flow_dissector_pernet_pre_exit(struct net *net)
+{
+	struct bpf_prog *attached;
+
+	/* We don't lock the update-side because there are no
+	 * references left to this netns when we get called. Hence
+	 * there can be no attach/detach in progress.
+	 */
+	rcu_read_lock();
+	attached = rcu_dereference(net->flow_dissector_prog);
+	if (attached) {
+		RCU_INIT_POINTER(net->flow_dissector_prog, NULL);
+		bpf_prog_put(attached);
+	}
+	rcu_read_unlock();
+}
+
+static struct pernet_operations flow_dissector_pernet_ops __net_initdata = {
+	.pre_exit = flow_dissector_pernet_pre_exit,
+};
+
 /**
  * __skb_flow_get_ports - extract the upper layer ports and return them
  * @skb: sk_buff to extract the ports from
@@ -1827,6 +1848,8 @@ EXPORT_SYMBOL(flow_keys_basic_dissector);
 
 static int __init init_default_flow_dissectors(void)
 {
+	int err;
+
 	skb_flow_dissector_init(&flow_keys_dissector,
 				flow_keys_dissector_keys,
 				ARRAY_SIZE(flow_keys_dissector_keys));
@@ -1836,7 +1859,11 @@ static int __init init_default_flow_dissectors(void)
 	skb_flow_dissector_init(&flow_keys_basic_dissector,
 				flow_keys_basic_dissector_keys,
 				ARRAY_SIZE(flow_keys_basic_dissector_keys));
-	return 0;
+
+	err = register_pernet_subsys(&flow_dissector_pernet_ops);
+
+	WARN_ON(err);
+	return err;
 }
 
 core_initcall(init_default_flow_dissectors);
-- 
2.25.4



* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: sdf @ 2020-05-20 17:40 UTC
  To: Jakub Sitnicki; +Cc: bpf, netdev, kernel-team, Petar Penkov

On 05/20, Jakub Sitnicki wrote:
[...]

> Discovered while working on bpf_link support for netns-attached progs.
> Looks like bpf tree material so pushing it out separately.
Oh, good catch!

> -jkbs

>   net/core/flow_dissector.c | 29 ++++++++++++++++++++++++++++-
>   1 file changed, 28 insertions(+), 1 deletion(-)

> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 3eff84824c8b..b6179cd20158 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -179,6 +179,27 @@ int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
>   	return 0;
>   }

> +static void __net_exit flow_dissector_pernet_pre_exit(struct net *net)
> +{
> +	struct bpf_prog *attached;
> +
> +	/* We don't lock the update-side because there are no
> +	 * references left to this netns when we get called. Hence
> +	 * there can be no attach/detach in progress.
> +	 */
> +	rcu_read_lock();
> +	attached = rcu_dereference(net->flow_dissector_prog);
> +	if (attached) {
> +		RCU_INIT_POINTER(net->flow_dissector_prog, NULL);
> +		bpf_prog_put(attached);
> +	}
> +	rcu_read_unlock();
> +}
I wonder, should we instead refactor the existing
skb_flow_dissector_bpf_prog_detach to accept a netns (instead of attr)
and call that here, instead of reimplementing it? (I don't think we
care about mutex lock/unlock efficiency here.) Thoughts?
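
A rough userspace sketch of that refactoring idea, under stated assumptions:
a single helper takes the netns directly and holds the mutex, and both the
detach path and the netns teardown path call it. All model_* names are
hypothetical, a pthread mutex stands in for the kernel mutex, and returning
-ENOENT when nothing is attached is an assumption of this model rather than
something stated in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_prog { int refcnt; };
struct model_netns { struct model_prog *flow_dissector_prog; };

/* Stands in for the global mutex serializing attach/detach. */
static pthread_mutex_t model_flow_dissector_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Shared detach logic, keyed by netns instead of by attr. */
static int model_detach(struct model_netns *ns)
{
	struct model_prog *attached;
	int err = 0;

	pthread_mutex_lock(&model_flow_dissector_mutex);
	attached = ns->flow_dissector_prog;
	if (!attached) {
		err = -ENOENT;	/* assumption: nothing attached is an error */
	} else {
		assert(attached->refcnt > 0);
		ns->flow_dissector_prog = NULL;
		attached->refcnt--;	/* models dropping the prog reference */
	}
	pthread_mutex_unlock(&model_flow_dissector_mutex);
	return err;
}

/* Detach-request side: the caller's netns is passed in, then delegated. */
static int model_prog_detach_current(struct model_netns *current_ns)
{
	return model_detach(current_ns);
}

/* Teardown side: "nothing attached" is not an error here, so ignore it. */
static void model_pernet_pre_exit(struct model_netns *ns)
{
	(void)model_detach(ns);
}
```

The trade-off is the one raised above: the teardown path now takes the
global mutex, in exchange for having only one copy of the detach logic.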


* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Alexei Starovoitov @ 2020-05-21  0:56 UTC
  To: sdf; +Cc: Jakub Sitnicki, bpf, netdev, kernel-team, Petar Penkov

On Wed, May 20, 2020 at 10:40:00AM -0700, sdf@google.com wrote:
> On 05/20, Jakub Sitnicki wrote:
> > [...]
> 
> > Discovered while working on bpf_link support for netns-attached progs.
> > Looks like bpf tree material so pushing it out separately.
> Oh, good catch!

Good catch indeed!

> 
> > -jkbs
> 
> >   net/core/flow_dissector.c | 29 ++++++++++++++++++++++++++++-
> >   1 file changed, 28 insertions(+), 1 deletion(-)
> 
> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > index 3eff84824c8b..b6179cd20158 100644
> > --- a/net/core/flow_dissector.c
> > +++ b/net/core/flow_dissector.c
> > @@ -179,6 +179,27 @@ int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
> >   	return 0;
> >   }
> 
> > +static void __net_exit flow_dissector_pernet_pre_exit(struct net *net)
> > +{
> > +	struct bpf_prog *attached;
> > +
> > +	/* We don't lock the update-side because there are no
> > +	 * references left to this netns when we get called. Hence
> > +	 * there can be no attach/detach in progress.
> > +	 */
> > +	rcu_read_lock();
> > +	attached = rcu_dereference(net->flow_dissector_prog);
> > +	if (attached) {
> > +		RCU_INIT_POINTER(net->flow_dissector_prog, NULL);
> > +		bpf_prog_put(attached);
> > +	}
> > +	rcu_read_unlock();
> > +}
> I wonder, should we instead refactor the existing
> skb_flow_dissector_bpf_prog_detach to accept a netns (instead of attr)
> and call that here, instead of reimplementing it? (I don't think we
> care about mutex lock/unlock efficiency here.) Thoughts?

Agree. Would be good to share that bit of code.


* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Jakub Sitnicki @ 2020-05-21  8:42 UTC
  To: sdf; +Cc: bpf, netdev, kernel-team, Petar Penkov, Alexei Starovoitov

On Wed, 20 May 2020 10:40:00 -0700
sdf@google.com wrote:

> > +static void __net_exit flow_dissector_pernet_pre_exit(struct net *net)
> > +{
> > +	struct bpf_prog *attached;
> > +
> > +	/* We don't lock the update-side because there are no
> > +	 * references left to this netns when we get called. Hence
> > +	 * there can be no attach/detach in progress.
> > +	 */
> > +	rcu_read_lock();
> > +	attached = rcu_dereference(net->flow_dissector_prog);
> > +	if (attached) {
> > +		RCU_INIT_POINTER(net->flow_dissector_prog, NULL);
> > +		bpf_prog_put(attached);
> > +	}
> > +	rcu_read_unlock();
> > +}  
> I wonder, should we instead refactor the existing
> skb_flow_dissector_bpf_prog_detach to accept a netns (instead of attr)
> and call that here, instead of reimplementing it? (I don't think we
> care about mutex lock/unlock efficiency here.) Thoughts?

I wanted to be nice to container-heavy workloads, where network
namespaces get torn down frequently and in parallel, and avoid
taking a global mutex. OTOH we already do that today, for instance in
the devlink pre_exit callback.

In our case I think there is a way to have our cake and eat it too:

https://lore.kernel.org/bpf/20200521083435.560256-1-jakub@cloudflare.com/

Thanks for reviewing it,
-jkbs


* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Andrii Nakryiko @ 2020-05-21 19:08 UTC
  To: Jakub Sitnicki
  Cc: bpf, Networking, kernel-team, Petar Penkov, Stanislav Fomichev

On Wed, May 20, 2020 at 10:24 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:

[...]

>  /**
>   * __skb_flow_get_ports - extract the upper layer ports and return them
>   * @skb: sk_buff to extract the ports from
> @@ -1827,6 +1848,8 @@ EXPORT_SYMBOL(flow_keys_basic_dissector);
>
>  static int __init init_default_flow_dissectors(void)
>  {
> +       int err;
> +
>         skb_flow_dissector_init(&flow_keys_dissector,
>                                 flow_keys_dissector_keys,
>                                 ARRAY_SIZE(flow_keys_dissector_keys));
> @@ -1836,7 +1859,11 @@ static int __init init_default_flow_dissectors(void)
>         skb_flow_dissector_init(&flow_keys_basic_dissector,
>                                 flow_keys_basic_dissector_keys,
>                                 ARRAY_SIZE(flow_keys_basic_dissector_keys));
> -       return 0;
> +
> +       err = register_pernet_subsys(&flow_dissector_pernet_ops);
> +
> +       WARN_ON(err);

syzbot simulates memory allocation failures, which can bubble up here,
so this WARN_ON will probably trigger. I wonder if this could be
rewritten so that init simply fails when registration fails? What are
the consequences?

> +       return err;
>  }
>
>  core_initcall(init_default_flow_dissectors);
> --
> 2.25.4
>


* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Alexei Starovoitov @ 2020-05-22  0:53 UTC
  To: Andrii Nakryiko
  Cc: Jakub Sitnicki, bpf, Networking, kernel-team, Petar Penkov,
	Stanislav Fomichev

On Thu, May 21, 2020 at 12:09 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, May 20, 2020 at 10:24 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> [...]
>
> >  /**
> >   * __skb_flow_get_ports - extract the upper layer ports and return them
> >   * @skb: sk_buff to extract the ports from
> > @@ -1827,6 +1848,8 @@ EXPORT_SYMBOL(flow_keys_basic_dissector);
> >
> >  static int __init init_default_flow_dissectors(void)
> >  {
> > +       int err;
> > +
> >         skb_flow_dissector_init(&flow_keys_dissector,
> >                                 flow_keys_dissector_keys,
> >                                 ARRAY_SIZE(flow_keys_dissector_keys));
> > @@ -1836,7 +1859,11 @@ static int __init init_default_flow_dissectors(void)
> >         skb_flow_dissector_init(&flow_keys_basic_dissector,
> >                                 flow_keys_basic_dissector_keys,
> >                                 ARRAY_SIZE(flow_keys_basic_dissector_keys));
> > -       return 0;
> > +
> > +       err = register_pernet_subsys(&flow_dissector_pernet_ops);
> > +
> > +       WARN_ON(err);
>
> syzbot simulates memory allocation failures, which can bubble up here,
> so this WARN_ON will probably trigger. I wonder if this could be
> rewritten so that init fails, when registration fails? What are the
> consequences?

good catch. that warn is pointless.
I removed it and force pushed the bpf tree.
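
The thread does not show the amended code, so the following is only a guess
at its shape, modeled in userspace C: the init function hands the
registration result back to its caller without warning.
model_register_pernet_subsys() and its inject_failure flag are hypothetical
stand-ins for register_pernet_subsys() and for fault injection.

```c
#include <errno.h>

/* Hypothetical stand-in for register_pernet_subsys(); the flag simulates
 * an injected memory allocation failure.
 */
static int model_register_pernet_subsys(int inject_failure)
{
	return inject_failure ? -ENOMEM : 0;
}

/* Models the init function with the WARN_ON dropped. */
static int model_init_default_flow_dissectors(int inject_failure)
{
	/* Dissector setup is elided in this model. */

	/* Pass the result to the caller; an expected, injectable failure
	 * is reported through the return value only.
	 */
	return model_register_pernet_subsys(inject_failure);
}
```

In this model a failed registration surfaces as a negative errno from init,
with no warning emitted along the way.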


* Re: [PATCH bpf] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup
From: Jakub Sitnicki @ 2020-05-22  8:22 UTC
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, kernel-team, Petar Penkov,
	Stanislav Fomichev

On Thu, 21 May 2020 17:53:14 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, May 21, 2020 at 12:09 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, May 20, 2020 at 10:24 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:  
> >
> > [...]
> >  
> > >  /**
> > >   * __skb_flow_get_ports - extract the upper layer ports and return them
> > >   * @skb: sk_buff to extract the ports from
> > > @@ -1827,6 +1848,8 @@ EXPORT_SYMBOL(flow_keys_basic_dissector);
> > >
> > >  static int __init init_default_flow_dissectors(void)
> > >  {
> > > +       int err;
> > > +
> > >         skb_flow_dissector_init(&flow_keys_dissector,
> > >                                 flow_keys_dissector_keys,
> > >                                 ARRAY_SIZE(flow_keys_dissector_keys));
> > > @@ -1836,7 +1859,11 @@ static int __init init_default_flow_dissectors(void)
> > >         skb_flow_dissector_init(&flow_keys_basic_dissector,
> > >                                 flow_keys_basic_dissector_keys,
> > >                                 ARRAY_SIZE(flow_keys_basic_dissector_keys));
> > > -       return 0;
> > > +
> > > +       err = register_pernet_subsys(&flow_dissector_pernet_ops);
> > > +
> > > +       WARN_ON(err);  
> >
> > syzbot simulates memory allocation failures, which can bubble up here,
> > so this WARN_ON will probably trigger. I wonder if this could be
> > rewritten so that init fails, when registration fails? What are the
> > consequences?  
> 
> good catch. that warn is pointless.
> I removed it and force pushed the bpf tree.

Thanks for patching it up. I'll keep it in mind next time.
