All of lore.kernel.org
 help / color / mirror / Atom feed
From: sdf@google.com
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, kernel-team@cloudflare.com
Subject: Re: [PATCH bpf-next 3/8] net: Introduce netns_bpf for BPF programs attached to netns
Date: Wed, 27 May 2020 13:36:07 -0700	[thread overview]
Message-ID: <20200527203607.GA57268@google.com> (raw)
In-Reply-To: <87v9kh2mzg.fsf@cloudflare.com>

On 05/27, Jakub Sitnicki wrote:
> On Wed, May 27, 2020 at 07:40 PM CEST, sdf@google.com wrote:
> > On 05/27, Jakub Sitnicki wrote:
> >> In order to:
> >
> >>   (1) attach more than one BPF program type to netns, or
> >>   (2) support attaching BPF programs to netns with bpf_link, or
> >>   (3) support multi-prog attach points for netns
> >
> >> we will need to keep more state per netns than a single pointer like we
> >> have now for BPF flow dissector program.
> >
> >> Prepare for the above by extracting netns_bpf that is part of struct  
> net,
> >> for storing all state related to BPF programs attached to netns.
> >
> >> Turn flow dissector callbacks for querying/attaching/detaching a  
> program
> >> into generic ones that operate on netns_bpf. Next patch will move the
> >> generic callbacks into their own module.
> >
> >> This is similar to how it is organized for cgroup with cgroup_bpf.
> >
> >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >> ---
> >>   include/linux/bpf-netns.h   | 56 ++++++++++++++++++++++
> >>   include/linux/skbuff.h      | 26 ----------
> >>   include/net/net_namespace.h |  4 +-
> >>   include/net/netns/bpf.h     | 17 +++++++
> >>   kernel/bpf/syscall.c        |  7 +--
> >>   net/core/flow_dissector.c   | 96  
> ++++++++++++++++++++++++-------------
> >>   6 files changed, 143 insertions(+), 63 deletions(-)
> >>   create mode 100644 include/linux/bpf-netns.h
> >>   create mode 100644 include/net/netns/bpf.h
> >
> >> diff --git a/include/linux/bpf-netns.h b/include/linux/bpf-netns.h
> >> new file mode 100644
> >> index 000000000000..f3aec3d79824
> >> --- /dev/null
> >> +++ b/include/linux/bpf-netns.h
> >> @@ -0,0 +1,56 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +#ifndef _BPF_NETNS_H
> >> +#define _BPF_NETNS_H
> >> +
> >> +#include <linux/mutex.h>
> >> +#include <uapi/linux/bpf.h>
> >> +
> >> +enum netns_bpf_attach_type {
> >> +	NETNS_BPF_INVALID = -1,
> >> +	NETNS_BPF_FLOW_DISSECTOR = 0,
> >> +	MAX_NETNS_BPF_ATTACH_TYPE
> >> +};
> >> +
> >> +static inline enum netns_bpf_attach_type
> >> +to_netns_bpf_attach_type(enum bpf_attach_type attach_type)
> >> +{
> >> +	switch (attach_type) {
> >> +	case BPF_FLOW_DISSECTOR:
> >> +		return NETNS_BPF_FLOW_DISSECTOR;
> >> +	default:
> >> +		return NETNS_BPF_INVALID;
> >> +	}
> >> +}
> >> +
> >> +/* Protects updates to netns_bpf */
> >> +extern struct mutex netns_bpf_mutex;
> > I wonder whether it's a good time to make this mutex per-netns, WDYT?
> >
> > The only problem I see is that it might complicate the global
> > mode of flow dissector where we go over every ns to make sure no
> > progs are attached. That will be racy with per-ns mutex unless
> > we do something about it...

> It crossed my mind. I stuck with a global mutex for a couple of
> reasons. Different that one you bring up, which I forgot about.

> 1. Don't know if it has potential to be a bottleneck.

> cgroup BPF uses a global mutex too. Even one that serializes access to
> more data than just BPF programs attached to a cgroup.

> Also, we grab the netns_bpf_mutex only on prog attach/detach, and link
> create/update/release. Netns teardown is not grabbing it. So if you're
> not using netns BPF you're not going to "feel" contention.

> 2. Makes locking on nets bpf_link release trickier

> In bpf_netns_link_release (patch 5), we deref pointer from link to
> struct net under RCU read lock, in case the net is being destroyed
> simulatneously.

> However, we're also grabbing the netns_bpf_mutex, in case of another
> possible scenario, when struct net is alive and well (refcnt > 0), but
> we're racing with a prog attach/detach to access net->bpf.{links,progs}.

> Making the mutex part of net->bpf means I first need to somehow ensure
> netns stays alive if I go to sleep waiting for the lock. Or it would
> have to be a spinlock, or some better (simpler?) locking scheme.


> The above two convinced me that I should start with a global mutex, and
> go for more pain if there's contention.

> Thanks for giving the series a review.
Yeah, everything makes sense, agreed, feel free to slap:
Reviewed-by: Stanislav Fomichev <sdf@google.com>

  reply	other threads:[~2020-05-27 20:36 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-27 17:08 [PATCH bpf-next 0/8] Link-based program attachment to network namespaces Jakub Sitnicki
2020-05-27 17:08 ` [PATCH bpf-next 1/8] flow_dissector: Don't grab update-side lock on prog detach from pre_exit Jakub Sitnicki
2020-05-27 17:35   ` sdf
2020-05-27 17:08 ` [PATCH bpf-next 2/8] flow_dissector: Pull locking up from prog attach callback Jakub Sitnicki
2020-05-27 17:35   ` sdf
2020-05-27 17:08 ` [PATCH bpf-next 3/8] net: Introduce netns_bpf for BPF programs attached to netns Jakub Sitnicki
2020-05-27 17:40   ` sdf
2020-05-27 19:31     ` Jakub Sitnicki
2020-05-27 20:36       ` sdf [this message]
2020-05-27 17:08 ` [PATCH bpf-next 4/8] flow_dissector: Move out netns_bpf prog callbacks Jakub Sitnicki
2020-05-27 17:08 ` [PATCH bpf-next 5/8] bpf: Add link-based BPF program attachment to network namespace Jakub Sitnicki
2020-05-27 17:48   ` sdf
2020-05-27 19:54     ` Jakub Sitnicki
2020-05-27 20:38       ` sdf
2020-05-28 10:34         ` Jakub Sitnicki
2020-05-27 20:53   ` sdf
2020-05-28 11:03     ` Jakub Sitnicki
2020-05-28 16:09       ` sdf
2020-05-28  2:54   ` kbuild test robot
2020-05-28  2:54     ` kbuild test robot
2020-06-04 23:38     ` Nick Desaulniers
2020-06-04 23:38       ` Nick Desaulniers
2020-06-05 14:41       ` Jakub Sitnicki
2020-05-28  5:56   ` Andrii Nakryiko
2020-05-28 12:26     ` Jakub Sitnicki
2020-05-28 18:09       ` Andrii Nakryiko
2020-05-28 18:48         ` Alexei Starovoitov
2020-05-28 13:30   ` Jakub Sitnicki
2020-05-27 17:08 ` [PATCH bpf-next 6/8] libbpf: Add support for bpf_link-based netns attachment Jakub Sitnicki
2020-05-28  5:59   ` Andrii Nakryiko
2020-05-28 13:05     ` Jakub Sitnicki
2020-05-27 17:08 ` [PATCH bpf-next 7/8] bpftool: Support link show for netns-attached links Jakub Sitnicki
2020-05-28  6:02   ` Andrii Nakryiko
2020-05-28 13:10     ` Jakub Sitnicki
2020-05-27 17:08 ` [PATCH bpf-next 8/8] selftests/bpf: Add tests for attaching bpf_link to netns Jakub Sitnicki
2020-05-28  6:08   ` Andrii Nakryiko
2020-05-28 13:29     ` Jakub Sitnicki
2020-05-28 18:31       ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200527203607.GA57268@google.com \
    --to=sdf@google.com \
    --cc=bpf@vger.kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.