From: Stanislav Fomichev <sdf@fomichev.me>
To: John Fastabend <john.fastabend@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Petar Penkov <ppenkov@google.com>
Subject: Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
Date: Thu, 3 Oct 2019 10:58:48 -0700
Message-ID: <20191003175848.GE3223377@mini-arch>
In-Reply-To: <5d9633a2de69c_55732aec43fe05c41@john-XPS-13-9370.notmuch>

On 10/03, John Fastabend wrote:
> Andrii Nakryiko wrote:
> > On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > >
> > > On 10/02, Andrii Nakryiko wrote:
> > > > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > > >
> > > > > On 10/02, Andrii Nakryiko wrote:
> > > > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > > >
> > > > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > > > there is already one attached to the root namespace.
> > > > > > > Users can still detach their BPF programs, but can't attach any
> > > > > > > new ones (-EPERM).
> > > >
> > > > I find this quite confusing for users, honestly. If there is no root
> > > > namespace dissector we'll successfully attach per-net ones and they
> > > > will be working fine. Then some process will attach a root one and all
> > > > the previously working ones will suddenly "break", with
> > > > users potentially not realizing why. I bet this will be a hair-pulling
> > > > investigation for someone. Furthermore, if the root net dissector is
> > > > already attached, all subsequent attachments will now start failing.
> > > The idea is that if sysadmin decides to use system-wide dissector it would
> > > be attached from the init scripts/systemd early in the boot process.
> > > So the users in your example would always get EPERM/EBUSY/EEXIST.
> > > I don't really see a realistic use-case where root and non-root
> > > namespaces attach/detach flow dissector programs at non-boot
> > > time (or why non-root containers could have BPF dissector and root
> > > could have C dissector; multi-nic machine?).
> > >
> > > But I totally see your point about confusion. See below.
> > >
> > > > I'm not sure what the better behavior here is, but maybe at least
> > > > forcibly detach already attached ones, so when someone goes and tries
> > > > to investigate, they will see that their BPF program is not attached
> > > > anymore. Printing a dmesg warning would be hugely useful here as well.
> > > We can do for_each_net and detach non-root ones; that sounds
> > > feasible and may avoid the confusion (at least when you query
> > > non-root ns to see if the prog is still there, you get a valid
> > > indication that it's not).
> > >
> > > > Alternatively, if there is any per-net dissector attached, we might
> > > > disallow the root net dissector from being installed. Sort of a "too late
> > > > to the party" way, but at least not surprising to already successfully
> > > > installed dissectors.
> > > We can do this as well.
> > >
> > > > Thoughts?
> > > Let me try to implement both of your suggestions and see which one makes
> > > more sense. I'm leaning towards the latter (simple check to see if
> > > any non-root ns has the prog attached).
> > >
> > > I'll follow up with a v2 if all goes well.
> > 
> > Thanks! I don't have a strong opinion on either; see what makes the most
> > sense from an actual user perspective.
> 
> 
> From my point of view the second option is better. The root namespace flow
> dissector attach should always happen first before any other namespaces are
> created. If any namespaces have already attached then just fail the root
> namespace. 
> 
> Otherwise if you detach existing dissectors from a container these were
> probably attached by the init container which might not be running anymore
> and I have no easy way to learn/find out about this without creating another
> container specifically to watch for this. If I'm relying on the dissector
> for something, now I can get seemingly random errors. So it's a bit ugly and
> I'll probably just tell users to always attach the root namespace first to avoid
> this headache. On the other side if the root namespace already has a
> flow dissector attached and my init container fails its attach cmd I
> can handle the error gracefully or even fail to launch the container with
> a nice error message and the administrator can figure something out.
> I'm always in favor of hard errors vs trying to guess what the right
> choice is for any particular setup.
> 
> Also it seems to me just checking if anything is attached is going to make
> the code simpler vs trying to detach things in all namespaces.
Agreed, I was also leaning towards this option. Thanks!
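
For readers following the thread, here is a rough userspace sketch of the
option agreed on above: refuse the root attach if any non-root namespace
already has a program, and refuse non-root attaches while the root one is
attached. This is only a model, not the kernel patch. The table of
namespaces, the model_* names, MAX_NETNS and POLICY_ERR are all hypothetical,
and the errno values are assumptions (the v1 description mentions -EPERM,
while the discussion also floats EBUSY/EEXIST).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: slot 0 stands in for init_net (the root namespace). */
#define MAX_NETNS 8
/* Assumed errno for a policy refusal; the thread does not settle on one. */
#define POLICY_ERR (-EPERM)

struct netns_model {
	bool has_prog; /* true while a flow dissector program is attached */
};

static struct netns_model netns[MAX_NETNS];

/* Simple check: does any non-root namespace have a program attached? */
static bool model_any_non_root_attached(void)
{
	for (size_t i = 1; i < MAX_NETNS; i++)
		if (netns[i].has_prog)
			return true;
	return false;
}

/* Returns 0 on success or a negative errno, without touching other slots. */
static int model_attach(size_t ns)
{
	if (ns >= MAX_NETNS)
		return -EINVAL;
	if (netns[ns].has_prog)
		return -EEXIST; /* assumption: one program per namespace */

	if (ns == 0) {
		/* Root is "too late to the party": fail instead of detaching. */
		if (model_any_non_root_attached())
			return POLICY_ERR;
	} else if (netns[0].has_prog) {
		/* A root program is attached, so per-net attaches are denied. */
		return POLICY_ERR;
	}

	netns[ns].has_prog = true;
	return 0;
}

/* Detaching is always allowed for the namespace that owns the program. */
static int model_detach(size_t ns)
{
	if (ns >= MAX_NETNS)
		return -EINVAL;
	if (!netns[ns].has_prog)
		return -ENOENT;
	netns[ns].has_prog = false;
	return 0;
}

/* Usage example: walks through both refusal cases, leaves the table empty. */
static void model_selfcheck(void)
{
	assert(model_attach(1) == 0);          /* per-net attach, no root prog */
	assert(model_attach(0) == POLICY_ERR); /* root refused, per-net exists */
	assert(model_detach(1) == 0);
	assert(model_attach(0) == 0);          /* root attach now succeeds */
	assert(model_attach(2) == POLICY_ERR); /* per-net refused under root */
	assert(model_detach(0) == 0);
}
```

The point of the model is that both refusals are hard errors at attach time
and nothing is ever detached behind a user's back, which is the property
John argues for above.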
