[PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
@ 2019-10-02 17:33 Stanislav Fomichev
  2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
  2019-10-02 17:33 ` [PATCH bpf-next 2/2] selftests/bpf: add test for BPF flow dissector in the root namespace Stanislav Fomichev
  0 siblings, 2 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-02 17:33 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Petar Penkov

While having a per-net-ns flow dissector programs is convenient for
testing, security-wise it's better to have only one vetted global
flow dissector implementation.

Let's have a convention that when BPF flow dissector is installed
in the root namespace, child namespaces can't override it.

Note, that it's totally possible to attach flow_dissector programs
to several namespaces and then switch to a global one. In this case,
only the root one will trigger; users are still able to detach
flow_dissector programs from non-root namespaces.

Alternative solution might be something like a sysctl to enable
the global mode.

Cc: Petar Penkov <ppenkov@google.com>

Stanislav Fomichev (2):
  bpf/flow_dissector: add mode to enforce global BPF flow dissector
  selftests/bpf: add test for BPF flow dissector in the root namespace

 Documentation/bpf/prog_flow_dissector.rst     |  3 ++
 net/core/flow_dissector.c                     | 11 ++++-
 .../selftests/bpf/test_flow_dissector.sh      | 48 ++++++++++++++++---
 3 files changed, 55 insertions(+), 7 deletions(-)

-- 
2.23.0.444.g18eeb5a265-goog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-02 17:33 [PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector Stanislav Fomichev
@ 2019-10-02 17:33 ` Stanislav Fomichev
  2019-10-02 20:57   ` Song Liu
  2019-10-02 23:29   ` Andrii Nakryiko
  2019-10-02 17:33 ` [PATCH bpf-next 2/2] selftests/bpf: add test for BPF flow dissector in the root namespace Stanislav Fomichev
  1 sibling, 2 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-02 17:33 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Petar Penkov

Always use init_net flow dissector BPF program if it's attached and fall
back to the per-net namespace one. Also, deny installing new programs if
there is already one attached to the root namespace.
Users can still detach their BPF programs, but can't attach any
new ones (-EPERM).

Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 Documentation/bpf/prog_flow_dissector.rst |  3 +++
 net/core/flow_dissector.c                 | 11 ++++++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
index a78bf036cadd..4d86780ab0f1 100644
--- a/Documentation/bpf/prog_flow_dissector.rst
+++ b/Documentation/bpf/prog_flow_dissector.rst
@@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
 C-based implementation can export. Notable example is single VLAN (802.1Q)
 and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
 for a set of information that's currently can be exported from the BPF context.
+
+When BPF flow dissector is attached to the root network namespace (machine-wide
+policy), users can't override it in their child network namespaces.
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 7c09d87d3269..494e2016fe84 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
 	struct bpf_prog *attached;
 	struct net *net;
 
+	if (rcu_access_pointer(init_net.flow_dissector_prog)) {
+		/* Can't override root flow dissector program */
+		return -EPERM;
+	}
+
 	net = current->nsproxy->net_ns;
 	mutex_lock(&flow_dissector_mutex);
 	attached = rcu_dereference_protected(net->flow_dissector_prog,
@@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
 	WARN_ON_ONCE(!net);
 	if (net) {
 		rcu_read_lock();
-		attached = rcu_dereference(net->flow_dissector_prog);
+		attached =
+			rcu_dereference(init_net.flow_dissector_prog);
+
+		if (!attached)
+			attached = rcu_dereference(net->flow_dissector_prog);
 
 		if (attached) {
 			struct bpf_flow_keys flow_keys;
-- 
2.23.0.444.g18eeb5a265-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next 2/2] selftests/bpf: add test for BPF flow dissector in the root namespace
  2019-10-02 17:33 [PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector Stanislav Fomichev
  2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
@ 2019-10-02 17:33 ` Stanislav Fomichev
  1 sibling, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-02 17:33 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, Petar Penkov

Make sure sub-namespaces get EPERM if root flow dissector is
attached.

Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/test_flow_dissector.sh      | 48 ++++++++++++++++---
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_flow_dissector.sh b/tools/testing/selftests/bpf/test_flow_dissector.sh
index d23d4da66b83..2c3a25d64faf 100755
--- a/tools/testing/selftests/bpf/test_flow_dissector.sh
+++ b/tools/testing/selftests/bpf/test_flow_dissector.sh
@@ -18,19 +18,55 @@ fi
 # this is the case and run it with in_netns.sh if it is being run in the root
 # namespace.
 if [[ -z $(ip netns identify $$) ]]; then
+	err=0
+	if bpftool="$(which bpftool)"; then
+		echo "Testing global flow dissector..."
+
+		$bpftool prog loadall ./bpf_flow.o /sys/fs/bpf/flow \
+			type flow_dissector
+
+		if ! unshare --net $bpftool prog attach pinned \
+			/sys/fs/bpf/flow/flow_dissector flow_dissector; then
+			echo "Unexpected unsuccessful attach in namespace" >&2
+			err=1
+		fi
+
+		$bpftool prog attach pinned /sys/fs/bpf/flow/flow_dissector \
+			flow_dissector
+
+		if unshare --net $bpftool prog attach pinned \
+			/sys/fs/bpf/flow/flow_dissector flow_dissector; then
+			echo "Unexpected successful attach in namespace" >&2
+			err=1
+		fi
+
+		if ! $bpftool prog detach pinned \
+			/sys/fs/bpf/flow/flow_dissector flow_dissector; then
+			echo "Failed to detach flow dissector" >&2
+			err=1
+		fi
+
+		rm -rf /sys/fs/bpf/flow
+	else
+		echo "Skipping root flow dissector test, bpftool not found" >&2
+	fi
+
+	# Run the rest of the tests in a net namespace.
 	../net/in_netns.sh "$0" "$@"
-	exit $?
-fi
+	err=$(( $err + $? ))
 
-# Determine selftest success via shell exit code
-exit_handler()
-{
-	if (( $? == 0 )); then
+	if (( $err == 0 )); then
 		echo "selftests: $TESTNAME [PASS]";
 	else
 		echo "selftests: $TESTNAME [FAILED]";
 	fi
 
+	exit $err
+fi
+
+# Determine selftest success via shell exit code
+exit_handler()
+{
 	set +e
 
 	# Cleanup
-- 
2.23.0.444.g18eeb5a265-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
@ 2019-10-02 20:57   ` Song Liu
  2019-10-02 21:31     ` Stanislav Fomichev
  2019-10-02 23:29   ` Andrii Nakryiko
  1 sibling, 1 reply; 12+ messages in thread
From: Song Liu @ 2019-10-02 20:57 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Petar Penkov

On Wed, Oct 2, 2019 at 10:36 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Always use init_net flow dissector BPF program if it's attached and fall
> back to the per-net namespace one. Also, deny installing new programs if
> there is already one attached to the root namespace.
> Users can still detach their BPF programs, but can't attach any
> new ones (-EPERM).
>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  Documentation/bpf/prog_flow_dissector.rst |  3 +++
>  net/core/flow_dissector.c                 | 11 ++++++++++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> index a78bf036cadd..4d86780ab0f1 100644
> --- a/Documentation/bpf/prog_flow_dissector.rst
> +++ b/Documentation/bpf/prog_flow_dissector.rst
> @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
>  C-based implementation can export. Notable example is single VLAN (802.1Q)
>  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
>  for a set of information that's currently can be exported from the BPF context.
> +
> +When BPF flow dissector is attached to the root network namespace (machine-wide
> +policy), users can't override it in their child network namespaces.
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 7c09d87d3269..494e2016fe84 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
>         struct bpf_prog *attached;
>         struct net *net;
>
> +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> +               /* Can't override root flow dissector program */
> +               return -EPERM;

Maybe -EBUSY is more accurate?

Thanks,
Song

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-02 20:57   ` Song Liu
@ 2019-10-02 21:31     ` Stanislav Fomichev
  0 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-02 21:31 UTC (permalink / raw)
  To: Song Liu
  Cc: Stanislav Fomichev, Networking, bpf, David S . Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

On 10/02, Song Liu wrote:
> On Wed, Oct 2, 2019 at 10:36 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Always use init_net flow dissector BPF program if it's attached and fall
> > back to the per-net namespace one. Also, deny installing new programs if
> > there is already one attached to the root namespace.
> > Users can still detach their BPF programs, but can't attach any
> > new ones (-EPERM).
> >
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> >  net/core/flow_dissector.c                 | 11 ++++++++++-
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > index a78bf036cadd..4d86780ab0f1 100644
> > --- a/Documentation/bpf/prog_flow_dissector.rst
> > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> >  for a set of information that's currently can be exported from the BPF context.
> > +
> > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > +policy), users can't override it in their child network namespaces.
> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > index 7c09d87d3269..494e2016fe84 100644
> > --- a/net/core/flow_dissector.c
> > +++ b/net/core/flow_dissector.c
> > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> >         struct bpf_prog *attached;
> >         struct net *net;
> >
> > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > +               /* Can't override root flow dissector program */
> > +               return -EPERM;
> 
> Maybe -EBUSY is more accurate?
I'm not sure, -EBUSY to me means that I can retry and (maybe) eventually
will succeed. Maybe return -EEXIST? At least it gives a hint that BPF
flow dissector is already there and retrying won't help. Thoughts?

> Thanks,
> Song

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
  2019-10-02 20:57   ` Song Liu
@ 2019-10-02 23:29   ` Andrii Nakryiko
  2019-10-03  1:43     ` Stanislav Fomichev
  1 sibling, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2019-10-02 23:29 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, Petar Penkov

On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> Always use init_net flow dissector BPF program if it's attached and fall
> back to the per-net namespace one. Also, deny installing new programs if
> there is already one attached to the root namespace.
> Users can still detach their BPF programs, but can't attach any
> new ones (-EPERM).
>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  Documentation/bpf/prog_flow_dissector.rst |  3 +++
>  net/core/flow_dissector.c                 | 11 ++++++++++-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> index a78bf036cadd..4d86780ab0f1 100644
> --- a/Documentation/bpf/prog_flow_dissector.rst
> +++ b/Documentation/bpf/prog_flow_dissector.rst
> @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
>  C-based implementation can export. Notable example is single VLAN (802.1Q)
>  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
>  for a set of information that's currently can be exported from the BPF context.
> +
> +When BPF flow dissector is attached to the root network namespace (machine-wide
> +policy), users can't override it in their child network namespaces.
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 7c09d87d3269..494e2016fe84 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
>         struct bpf_prog *attached;
>         struct net *net;
>
> +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> +               /* Can't override root flow dissector program */
> +               return -EPERM;
> +       }

This is racy, shouldn't this be checked after grabbing a lock below?

> +
>         net = current->nsproxy->net_ns;
>         mutex_lock(&flow_dissector_mutex);
>         attached = rcu_dereference_protected(net->flow_dissector_prog,
> @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
>         WARN_ON_ONCE(!net);
>         if (net) {
>                 rcu_read_lock();
> -               attached = rcu_dereference(net->flow_dissector_prog);
> +               attached =
> +                       rcu_dereference(init_net.flow_dissector_prog);
> +
> +               if (!attached)
> +                       attached = rcu_dereference(net->flow_dissector_prog);
>
>                 if (attached) {
>                         struct bpf_flow_keys flow_keys;
> --
> 2.23.0.444.g18eeb5a265-goog
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-02 23:29   ` Andrii Nakryiko
@ 2019-10-03  1:43     ` Stanislav Fomichev
  2019-10-03  2:47       ` Andrii Nakryiko
  0 siblings, 1 reply; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-03  1:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

On 10/02, Andrii Nakryiko wrote:
> On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Always use init_net flow dissector BPF program if it's attached and fall
> > back to the per-net namespace one. Also, deny installing new programs if
> > there is already one attached to the root namespace.
> > Users can still detach their BPF programs, but can't attach any
> > new ones (-EPERM).
> >
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> >  net/core/flow_dissector.c                 | 11 ++++++++++-
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > index a78bf036cadd..4d86780ab0f1 100644
> > --- a/Documentation/bpf/prog_flow_dissector.rst
> > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> >  for a set of information that's currently can be exported from the BPF context.
> > +
> > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > +policy), users can't override it in their child network namespaces.
> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > index 7c09d87d3269..494e2016fe84 100644
> > --- a/net/core/flow_dissector.c
> > +++ b/net/core/flow_dissector.c
> > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> >         struct bpf_prog *attached;
> >         struct net *net;
> >
> > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > +               /* Can't override root flow dissector program */
> > +               return -EPERM;
> > +       }
> 
> This is racy, shouldn't this be checked after grabbing a lock below?
What kind of race do you have in mind?

Even if I put this check under the mutex, it's still possible that if
two cpus concurrently start attaching flow dissector programs (i.e. call
sys_bpf(BPF_PROG_ATTACH)) at the same time (one to root ns, the other
to non-root ns), the cpu that is attaching to non-root can grab mutex first,
pass all the checks and attach the prog (higher frequency, tubo boost, etc).

The mutex is there to protect only against concurrent attaches to the
_same_ netns. For the sake of simplicity we have a global one instead
of a mutex per net-ns.

So I'd rather not grab the mutex and keep it simple. Even in there is a
race, in __skb_flow_dissect we always check init_net first.

> > +
> >         net = current->nsproxy->net_ns;
> >         mutex_lock(&flow_dissector_mutex);
> >         attached = rcu_dereference_protected(net->flow_dissector_prog,
> > @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
> >         WARN_ON_ONCE(!net);
> >         if (net) {
> >                 rcu_read_lock();
> > -               attached = rcu_dereference(net->flow_dissector_prog);
> > +               attached =
> > +                       rcu_dereference(init_net.flow_dissector_prog);
> > +
> > +               if (!attached)
> > +                       attached = rcu_dereference(net->flow_dissector_prog);
> >
> >                 if (attached) {
> >                         struct bpf_flow_keys flow_keys;
> > --
> > 2.23.0.444.g18eeb5a265-goog
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-03  1:43     ` Stanislav Fomichev
@ 2019-10-03  2:47       ` Andrii Nakryiko
  2019-10-03 16:01         ` Stanislav Fomichev
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2019-10-03  2:47 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 10/02, Andrii Nakryiko wrote:
> > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > Always use init_net flow dissector BPF program if it's attached and fall
> > > back to the per-net namespace one. Also, deny installing new programs if
> > > there is already one attached to the root namespace.
> > > Users can still detach their BPF programs, but can't attach any
> > > new ones (-EPERM).

I find this quite confusing for users, honestly. If there is no root
namespace dissector we'll successfully attach per-net ones and they
will be working fine. That some process will attach root one and all
the previously successfully working ones will suddenly "break" without
users potentially not realizing why. I bet this will be hair-pulling
investigation for someone. Furthermore, if root net dissector is
already attached, all subsequent attachment will now start failing.

I'm not sure what's the better behavior here is, but maybe at least
forcibly detach already attached ones, so when someone goes and tries
to investigate, they will see that their BPF program is not attached
anymore. Printing dmesg warning would be hugely useful here as well.

Alternatively, if there is any per-net dissector attached, we might
disallow root net dissector to be installed. Sort of "too late to the
party" way, but at least not surprising to successfully installed
dissectors.

Thoughts?

> > >
> > > Cc: Petar Penkov <ppenkov@google.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> > >  net/core/flow_dissector.c                 | 11 ++++++++++-
> > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > > index a78bf036cadd..4d86780ab0f1 100644
> > > --- a/Documentation/bpf/prog_flow_dissector.rst
> > > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> > >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> > >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> > >  for a set of information that's currently can be exported from the BPF context.
> > > +
> > > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > > +policy), users can't override it in their child network namespaces.
> > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > > index 7c09d87d3269..494e2016fe84 100644
> > > --- a/net/core/flow_dissector.c
> > > +++ b/net/core/flow_dissector.c
> > > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> > >         struct bpf_prog *attached;
> > >         struct net *net;
> > >
> > > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > > +               /* Can't override root flow dissector program */
> > > +               return -EPERM;
> > > +       }
> >
> > This is racy, shouldn't this be checked after grabbing a lock below?
> What kind of race do you have in mind?

I was thinking about the case of two competing attaches for root
init_net, but it seems like we will double-check again under lock, so
this is fine as is.

>
> Even if I put this check under the mutex, it's still possible that if
> two cpus concurrently start attaching flow dissector programs (i.e. call
> sys_bpf(BPF_PROG_ATTACH)) at the same time (one to root ns, the other
> to non-root ns), the cpu that is attaching to non-root can grab mutex first,
> pass all the checks and attach the prog (higher frequency, tubo boost, etc).
>
> The mutex is there to protect only against concurrent attaches to the
> _same_ netns. For the sake of simplicity we have a global one instead
> of a mutex per net-ns.
>
> So I'd rather not grab the mutex and keep it simple. Even in there is a
> race, in __skb_flow_dissect we always check init_net first.
>
> > > +
> > >         net = current->nsproxy->net_ns;
> > >         mutex_lock(&flow_dissector_mutex);
> > >         attached = rcu_dereference_protected(net->flow_dissector_prog,
> > > @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
> > >         WARN_ON_ONCE(!net);
> > >         if (net) {
> > >                 rcu_read_lock();
> > > -               attached = rcu_dereference(net->flow_dissector_prog);
> > > +               attached =
> > > +                       rcu_dereference(init_net.flow_dissector_prog);
> > > +
> > > +               if (!attached)
> > > +                       attached = rcu_dereference(net->flow_dissector_prog);
> > >
> > >                 if (attached) {
> > >                         struct bpf_flow_keys flow_keys;
> > > --
> > > 2.23.0.444.g18eeb5a265-goog
> > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-03  2:47       ` Andrii Nakryiko
@ 2019-10-03 16:01         ` Stanislav Fomichev
  2019-10-03 16:26           ` Andrii Nakryiko
  0 siblings, 1 reply; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-03 16:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

On 10/02, Andrii Nakryiko wrote:
> On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 10/02, Andrii Nakryiko wrote:
> > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > there is already one attached to the root namespace.
> > > > Users can still detach their BPF programs, but can't attach any
> > > > new ones (-EPERM).
> 
> I find this quite confusing for users, honestly. If there is no root
> namespace dissector we'll successfully attach per-net ones and they
> will be working fine. That some process will attach root one and all
> the previously successfully working ones will suddenly "break" without
> users potentially not realizing why. I bet this will be hair-pulling
> investigation for someone. Furthermore, if root net dissector is
> already attached, all subsequent attachment will now start failing.
The idea is that if sysadmin decides to use system-wide dissector it would
be attached from the init scripts/systemd early in the boot process.
So the users in your example would always get EPERM/EBUSY/EXIST.
I don't really see a realistic use-case where root and non-root
namespaces attach/detach flow dissector programs at non-boot
time (or why non-root containers could have BPF dissector and root
could have C dissector; multi-nic machine?).

But I totally see your point about confusion. See below.

> I'm not sure what's the better behavior here is, but maybe at least
> forcibly detach already attached ones, so when someone goes and tries
> to investigate, they will see that their BPF program is not attached
> anymore. Printing dmesg warning would be hugely useful here as well.
We can do for_each_net and detach non-root ones; that sounds
feasible and may avoid the confusion (at least when you query
non-root ns to see if the prog is still there, you get a valid
indication that it's not).

> Alternatively, if there is any per-net dissector attached, we might
> disallow root net dissector to be installed. Sort of "too late to the
> party" way, but at least not surprising to successfully installed
> dissectors.
We can do this as well.

> Thoughts?
Let me try to implement both of your suggestions and see which one makes
more sense. I'm leaning towards the later (simple check to see if
any non-root ns has the prog attached).

I'll follow up with a v2 if all goes well.

> > > > Cc: Petar Penkov <ppenkov@google.com>
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> > > >  net/core/flow_dissector.c                 | 11 ++++++++++-
> > > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > > > index a78bf036cadd..4d86780ab0f1 100644
> > > > --- a/Documentation/bpf/prog_flow_dissector.rst
> > > > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > > > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> > > >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> > > >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> > > >  for a set of information that's currently can be exported from the BPF context.
> > > > +
> > > > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > > > +policy), users can't override it in their child network namespaces.
> > > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > > > index 7c09d87d3269..494e2016fe84 100644
> > > > --- a/net/core/flow_dissector.c
> > > > +++ b/net/core/flow_dissector.c
> > > > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> > > >         struct bpf_prog *attached;
> > > >         struct net *net;
> > > >
> > > > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > > > +               /* Can't override root flow dissector program */
> > > > +               return -EPERM;
> > > > +       }
> > >
> > > This is racy, shouldn't this be checked after grabbing a lock below?
> > What kind of race do you have in mind?
> 
> I was thinking about the case of two competing attaches for root
> init_net, but it seems like we will double-check again under lock, so
> this is fine as is.
> 
> >
> > Even if I put this check under the mutex, it's still possible that if
> > two cpus concurrently start attaching flow dissector programs (i.e. call
> > sys_bpf(BPF_PROG_ATTACH)) at the same time (one to root ns, the other
> > to non-root ns), the cpu that is attaching to non-root can grab mutex first,
> > pass all the checks and attach the prog (higher frequency, tubo boost, etc).
> >
> > The mutex is there to protect only against concurrent attaches to the
> > _same_ netns. For the sake of simplicity we have a global one instead
> > of a mutex per net-ns.
> >
> > So I'd rather not grab the mutex and keep it simple. Even in there is a
> > race, in __skb_flow_dissect we always check init_net first.
> >
> > > > +
> > > >         net = current->nsproxy->net_ns;
> > > >         mutex_lock(&flow_dissector_mutex);
> > > >         attached = rcu_dereference_protected(net->flow_dissector_prog,
> > > > @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
> > > >         WARN_ON_ONCE(!net);
> > > >         if (net) {
> > > >                 rcu_read_lock();
> > > > -               attached = rcu_dereference(net->flow_dissector_prog);
> > > > +               attached =
> > > > +                       rcu_dereference(init_net.flow_dissector_prog);
> > > > +
> > > > +               if (!attached)
> > > > +                       attached = rcu_dereference(net->flow_dissector_prog);
> > > >
> > > >                 if (attached) {
> > > >                         struct bpf_flow_keys flow_keys;
> > > > --
> > > > 2.23.0.444.g18eeb5a265-goog
> > > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-03 16:01         ` Stanislav Fomichev
@ 2019-10-03 16:26           ` Andrii Nakryiko
  2019-10-03 17:45             ` John Fastabend
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2019-10-03 16:26 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 10/02, Andrii Nakryiko wrote:
> > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > >
> > > On 10/02, Andrii Nakryiko wrote:
> > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > >
> > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > there is already one attached to the root namespace.
> > > > > Users can still detach their BPF programs, but can't attach any
> > > > > new ones (-EPERM).
> >
> > I find this quite confusing for users, honestly. If there is no root
> > namespace dissector we'll successfully attach per-net ones and they
> > will be working fine. That some process will attach root one and all
> > the previously successfully working ones will suddenly "break" without
> > users potentially not realizing why. I bet this will be hair-pulling
> > investigation for someone. Furthermore, if root net dissector is
> > already attached, all subsequent attachment will now start failing.
> The idea is that if sysadmin decides to use system-wide dissector it would
> be attached from the init scripts/systemd early in the boot process.
> So the users in your example would always get EPERM/EBUSY/EXIST.
> I don't really see a realistic use-case where root and non-root
> namespaces attach/detach flow dissector programs at non-boot
> time (or why non-root containers could have BPF dissector and root
> could have C dissector; multi-nic machine?).
>
> But I totally see your point about confusion. See below.
>
> > I'm not sure what's the better behavior here is, but maybe at least
> > forcibly detach already attached ones, so when someone goes and tries
> > to investigate, they will see that their BPF program is not attached
> > anymore. Printing dmesg warning would be hugely useful here as well.
> We can do for_each_net and detach non-root ones; that sounds
> feasible and may avoid the confusion (at least when you query
> non-root ns to see if the prog is still there, you get a valid
> indication that it's not).
>
> > Alternatively, if there is any per-net dissector attached, we might
> > disallow root net dissector to be installed. Sort of "too late to the
> > party" way, but at least not surprising to successfully installed
> > dissectors.
> We can do this as well.
>
> > Thoughts?
> Let me try to implement both of your suggestions and see which one makes
> more sense. I'm leaning towards the later (simple check to see if
> any non-root ns has the prog attached).
>
> I'll follow up with a v2 if all goes well.

Thanks! I don't have strong opinion on either, see what makes most
sense from an actual user perspective.

>
> > > > > Cc: Petar Penkov <ppenkov@google.com>
> > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > ---
> > > > >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> > > > >  net/core/flow_dissector.c                 | 11 ++++++++++-
> > > > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > > > > index a78bf036cadd..4d86780ab0f1 100644
> > > > > --- a/Documentation/bpf/prog_flow_dissector.rst
> > > > > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > > > > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> > > > >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> > > > >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> > > > >  for a set of information that's currently can be exported from the BPF context.
> > > > > +
> > > > > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > > > > +policy), users can't override it in their child network namespaces.
> > > > > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > > > > index 7c09d87d3269..494e2016fe84 100644
> > > > > --- a/net/core/flow_dissector.c
> > > > > +++ b/net/core/flow_dissector.c
> > > > > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> > > > >         struct bpf_prog *attached;
> > > > >         struct net *net;
> > > > >
> > > > > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > > > > +               /* Can't override root flow dissector program */
> > > > > +               return -EPERM;
> > > > > +       }
> > > >
> > > > This is racy, shouldn't this be checked after grabbing a lock below?
> > > What kind of race do you have in mind?
> >
> > I was thinking about the case of two competing attaches for root
> > init_net, but it seems like we will double-check again under lock, so
> > this is fine as is.
> >
> > >
> > > Even if I put this check under the mutex, it's still possible that if
> > > two cpus concurrently start attaching flow dissector programs (i.e. call
> > > sys_bpf(BPF_PROG_ATTACH)) at the same time (one to root ns, the other
> > > to non-root ns), the cpu that is attaching to non-root can grab mutex first,
> > > pass all the checks and attach the prog (higher frequency, tubo boost, etc).
> > >
> > > The mutex is there to protect only against concurrent attaches to the
> > > _same_ netns. For the sake of simplicity we have a global one instead
> > > of a mutex per net-ns.
> > >
> > > So I'd rather not grab the mutex and keep it simple. Even in there is a
> > > race, in __skb_flow_dissect we always check init_net first.
> > >
> > > > > +
> > > > >         net = current->nsproxy->net_ns;
> > > > >         mutex_lock(&flow_dissector_mutex);
> > > > >         attached = rcu_dereference_protected(net->flow_dissector_prog,
> > > > > @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
> > > > >         WARN_ON_ONCE(!net);
> > > > >         if (net) {
> > > > >                 rcu_read_lock();
> > > > > -               attached = rcu_dereference(net->flow_dissector_prog);
> > > > > +               attached =
> > > > > +                       rcu_dereference(init_net.flow_dissector_prog);
> > > > > +
> > > > > +               if (!attached)
> > > > > +                       attached = rcu_dereference(net->flow_dissector_prog);
> > > > >
> > > > >                 if (attached) {
> > > > >                         struct bpf_flow_keys flow_keys;
> > > > > --
> > > > > 2.23.0.444.g18eeb5a265-goog
> > > > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-03 16:26           ` Andrii Nakryiko
@ 2019-10-03 17:45             ` John Fastabend
  2019-10-03 17:58               ` Stanislav Fomichev
  0 siblings, 1 reply; 12+ messages in thread
From: John Fastabend @ 2019-10-03 17:45 UTC (permalink / raw)
  To: Andrii Nakryiko, Stanislav Fomichev
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Petar Penkov

Andrii Nakryiko wrote:
> On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 10/02, Andrii Nakryiko wrote:
> > > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > >
> > > > On 10/02, Andrii Nakryiko wrote:
> > > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > >
> > > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > > there is already one attached to the root namespace.
> > > > > > Users can still detach their BPF programs, but can't attach any
> > > > > > new ones (-EPERM).
> > >
> > > I find this quite confusing for users, honestly. If there is no root
> > > namespace dissector we'll successfully attach per-net ones and they
> > > will be working fine. That some process will attach root one and all
> > > the previously successfully working ones will suddenly "break" without
> > > users potentially not realizing why. I bet this will be hair-pulling
> > > investigation for someone. Furthermore, if root net dissector is
> > > already attached, all subsequent attachment will now start failing.
> > The idea is that if sysadmin decides to use system-wide dissector it would
> > be attached from the init scripts/systemd early in the boot process.
> > So the users in your example would always get EPERM/EBUSY/EXIST.
> > I don't really see a realistic use-case where root and non-root
> > namespaces attach/detach flow dissector programs at non-boot
> > time (or why non-root containers could have BPF dissector and root
> > could have C dissector; multi-nic machine?).
> >
> > But I totally see your point about confusion. See below.
> >
> > > I'm not sure what's the better behavior here is, but maybe at least
> > > forcibly detach already attached ones, so when someone goes and tries
> > > to investigate, they will see that their BPF program is not attached
> > > anymore. Printing dmesg warning would be hugely useful here as well.
> > We can do for_each_net and detach non-root ones; that sounds
> > feasible and may avoid the confusion (at least when you query
> > non-root ns to see if the prog is still there, you get a valid
> > indication that it's not).
> >
> > > Alternatively, if there is any per-net dissector attached, we might
> > > disallow root net dissector to be installed. Sort of "too late to the
> > > party" way, but at least not surprising to successfully installed
> > > dissectors.
> > We can do this as well.
> >
> > > Thoughts?
> > Let me try to implement both of your suggestions and see which one makes
> > more sense. I'm leaning towards the later (simple check to see if
> > any non-root ns has the prog attached).
> >
> > I'll follow up with a v2 if all goes well.
> 
> Thanks! I don't have strong opinion on either, see what makes most
> sense from an actual user perspective.


From my point of view the second option is better. The root namespace flow
dissector attach should always happen first before any other namespaces are
created. If any namespaces have already attached then just fail the root
namespace. 

Otherwise if you detach existing dissectors from a container these were
probably attached by the init container which might not be running anymore
and I have no easy way to learn/find out about this without creating another
container specifically to watch for this. If I'm relying on the dissector
for something now I can seemingly random errors. So its a bit ugly and I'll
probably just tell users to always attach the root namespace first to avoid
this headache. On the other side if the root namespace already has a
flow dissector attached and my init container fails its attach cmd I
can handle the error gracefully or even fail to launch the container with
a nice error message and the administrator can figure something out.
I'm always in favor of hard errors vs trying to guess what the right
choice is for any particular setup.

Also it seems to me just checking if anything is attached is going to make
the code simpler vs trying to detach things in all namespaces.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
  2019-10-03 17:45             ` John Fastabend
@ 2019-10-03 17:58               ` Stanislav Fomichev
  0 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-10-03 17:58 UTC (permalink / raw)
  To: John Fastabend
  Cc: Andrii Nakryiko, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann,
	Petar Penkov

On 10/03, John Fastabend wrote:
> Andrii Nakryiko wrote:
> > On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > >
> > > On 10/02, Andrii Nakryiko wrote:
> > > > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > > >
> > > > > On 10/02, Andrii Nakryiko wrote:
> > > > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > > >
> > > > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > > > there is already one attached to the root namespace.
> > > > > > > Users can still detach their BPF programs, but can't attach any
> > > > > > > new ones (-EPERM).
> > > >
> > > > I find this quite confusing for users, honestly. If there is no root
> > > > namespace dissector we'll successfully attach per-net ones and they
> > > > will be working fine. That some process will attach root one and all
> > > > the previously successfully working ones will suddenly "break" without
> > > > users potentially not realizing why. I bet this will be hair-pulling
> > > > investigation for someone. Furthermore, if root net dissector is
> > > > already attached, all subsequent attachment will now start failing.
> > > The idea is that if sysadmin decides to use system-wide dissector it would
> > > be attached from the init scripts/systemd early in the boot process.
> > > So the users in your example would always get EPERM/EBUSY/EXIST.
> > > I don't really see a realistic use-case where root and non-root
> > > namespaces attach/detach flow dissector programs at non-boot
> > > time (or why non-root containers could have BPF dissector and root
> > > could have C dissector; multi-nic machine?).
> > >
> > > But I totally see your point about confusion. See below.
> > >
> > > > I'm not sure what's the better behavior here is, but maybe at least
> > > > forcibly detach already attached ones, so when someone goes and tries
> > > > to investigate, they will see that their BPF program is not attached
> > > > anymore. Printing dmesg warning would be hugely useful here as well.
> > > We can do for_each_net and detach non-root ones; that sounds
> > > feasible and may avoid the confusion (at least when you query
> > > non-root ns to see if the prog is still there, you get a valid
> > > indication that it's not).
> > >
> > > > Alternatively, if there is any per-net dissector attached, we might
> > > > disallow root net dissector to be installed. Sort of "too late to the
> > > > party" way, but at least not surprising to successfully installed
> > > > dissectors.
> > > We can do this as well.
> > >
> > > > Thoughts?
> > > Let me try to implement both of your suggestions and see which one makes
> > > more sense. I'm leaning towards the later (simple check to see if
> > > any non-root ns has the prog attached).
> > >
> > > I'll follow up with a v2 if all goes well.
> > 
> > Thanks! I don't have strong opinion on either, see what makes most
> > sense from an actual user perspective.
> 
> 
> From my point of view the second option is better. The root namespace flow
> dissector attach should always happen first before any other namespaces are
> created. If any namespaces have already attached then just fail the root
> namespace. 
> 
> Otherwise if you detach existing dissectors from a container these were
> probably attached by the init container which might not be running anymore
> and I have no easy way to learn/find out about this without creating another
> container specifically to watch for this. If I'm relying on the dissector
> for something now I can seemingly random errors. So its a bit ugly and I'll
> probably just tell users to always attach the root namespace first to avoid
> this headache. On the other side if the root namespace already has a
> flow dissector attached and my init container fails its attach cmd I
> can handle the error gracefully or even fail to launch the container with
> a nice error message and the administrator can figure something out.
> I'm always in favor of hard errors vs trying to guess what the right
> choice is for any particular setup.
> 
> Also it seems to me just checking if anything is attached is going to make
> the code simpler vs trying to detach things in all namespaces.
Agreed, I was also leaning towards this option. Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-10-03 17:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-02 17:33 [PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector Stanislav Fomichev
2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
2019-10-02 20:57   ` Song Liu
2019-10-02 21:31     ` Stanislav Fomichev
2019-10-02 23:29   ` Andrii Nakryiko
2019-10-03  1:43     ` Stanislav Fomichev
2019-10-03  2:47       ` Andrii Nakryiko
2019-10-03 16:01         ` Stanislav Fomichev
2019-10-03 16:26           ` Andrii Nakryiko
2019-10-03 17:45             ` John Fastabend
2019-10-03 17:58               ` Stanislav Fomichev
2019-10-02 17:33 ` [PATCH bpf-next 2/2] selftests/bpf: add test for BPF flow dissector in the root namespace Stanislav Fomichev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).