From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [RFC PATCH bpf-next 4/8] bpf: support GET_FD_BY_ID and GET_NEXT_ID for bpf_link
Date: Thu, 9 Apr 2020 11:49:33 -0700
Message-ID: <CAEf4BzbXCsHCJ6Tet0i5g=pKB_uYqvgiaBNuY-NMdZm8rdZN5g@mail.gmail.com>
In-Reply-To: <87tv1t65cr.fsf@toke.dk>

On Wed, Apr 8, 2020 at 2:21 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Wed, Apr 8, 2020 at 8:14 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >> > On Mon, Apr 6, 2020 at 4:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> Andrii Nakryiko <andriin@fb.com> writes:
> >> >>
> >> >> > Add support to look up bpf_link by ID and iterate over all existing bpf_links
> >> >> > in the system. GET_FD_BY_ID code handles not-yet-ready bpf_link by checking
> >> >> > that its ID hasn't been set to non-zero value yet. Setting bpf_link's ID is
> >> >> > done as the very last step in finalizing bpf_link, together with installing
> >> >> > FD. This approach allows users of bpf_link in kernel code to not worry about
> >> >> > races between user-space and kernel code that hasn't finished attaching and
> >> >> > initializing bpf_link.
> >> >> >
> >> >> > Further, it's critical that BPF_LINK_GET_FD_BY_ID only ever allows to create
> >> >> > bpf_link FD that's O_RDONLY. This is to protect processes owning bpf_link and
> >> >> > thus allowed to perform modifications on them (like LINK_UPDATE), from other
> >> >> > processes that got bpf_link ID from GET_NEXT_ID API. In the latter case, only
> >> >> > querying bpf_link information (implemented later in the series) will be
> >> >> > allowed.
> >> >>
> >> >> I must admit I remain sceptical about this model of restricting access
> >> >> without any of the regular override mechanisms (for instance, enforcing
> >> >> read-only mode regardless of CAP_DAC_OVERRIDE in this series). Since you
> >> >> keep saying there would be 'some' override mechanism, I think it would
> >> >> be helpful if you could just include that so we can see the full
> >> >> mechanism in context.
> >> >
> >> > I wasn't aware of CAP_DAC_OVERRIDE, thanks for bringing this up.
> >> >
> >> > One way to go about this is to allow creating writable bpf_link for
> >> > GET_FD_BY_ID if CAP_DAC_OVERRIDE is set. Then we can allow LINK_DETACH
> >> > operation on writable links, same as we do with LINK_UPDATE here.
> >> > LINK_DETACH will do the same as cgroup bpf_link auto-detachment on
> >> > cgroup dying: it will detach bpf_link, but will leave it alive until
> >> > last FD is closed.
> >>
> >> Yup, I think this would be a reasonable way to implement the override
> >> mechanism - it would ensure 'full root' users (like a root shell) can
> >> remove attachments, while still preventing applications from doing so by
> >> limiting their capabilities.
> >
> > So I did some experiments and I think I want to keep GET_FD_BY_ID for
> > bpf_link to return only read-only bpf_links.
>
> Why, exactly? (also, see below)

For the reasons I explained below: because you can turn read-only
bpf_link into writable one through pinning + chmod, if you have
CAP_DAC_OVERRIDE.
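
As a rough illustration of the pin + chmod + re-open sequence, here is a
minimal sketch that uses a regular temp file as a stand-in for a pinned
bpf_link. The stand-in file, its path, and the function name
reopen_writable_demo() are assumptions made for the example; a real pinned
link would live under bpffs and be opened with the BPF object APIs. It only
shows that a read-only FD stays read-only, while a fresh open by path goes
through the usual file permission check:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo: a regular temp file stands in for a pinned bpf_link.
 * Returns 0 if the read-only FD rejected the write and the re-opened
 * FD accepted it, -1 otherwise.
 */
int reopen_writable_demo(void)
{
	char path[] = "/tmp/link_pin_demo_XXXXXX";
	int ro_fd, rw_fd, ret = -1;
	int tmp_fd = mkstemp(path);	/* created with mode 0600 */

	if (tmp_fd < 0)
		return -1;
	close(tmp_fd);

	/* Start out "pinned read-only": owner may read, nobody may write. */
	if (chmod(path, S_IRUSR) < 0)
		goto out_unlink;

	ro_fd = open(path, O_RDONLY);
	if (ro_fd < 0)
		goto out_unlink;

	/* Writing through an FD opened O_RDONLY fails with EBADF. */
	if (write(ro_fd, "x", 1) != -1 || errno != EBADF)
		goto out_close_ro;

	/* Flip the mode bits on the path... */
	if (chmod(path, S_IRUSR | S_IWUSR) < 0)
		goto out_close_ro;

	/* ...and a fresh open() goes through the normal permission check. */
	rw_fd = open(path, O_RDWR);
	if (rw_fd < 0)
		goto out_close_ro;

	if (write(rw_fd, "x", 1) == 1)
		ret = 0;

	close(rw_fd);
out_close_ro:
	close(ro_fd);
out_unlink:
	unlink(path);
	return ret;
}
```

The original read-only FD is never upgraded here; the writable access comes
from a second FD obtained by path.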

>
> > After that, one can pin bpf_link temporarily and re-open it as
> > writable one, provided CAP_DAC_OVERRIDE capability is present. All
> > that works already, because pinned bpf_link is just a file, so one can
> > do fchmod on it and all that will go through normal file access
> > permission check code path.
>
> Ah, I did not know that was possible - I was assuming that bpffs was
> doing something special to prevent that. But if not, great!
>
> > Unfortunately, just re-opening same FD as writable (which would
> > be possible if fcntl(fd, F_SETFL, S_IRUSR | S_IWUSR) was supported on
> > Linux) without pinning is not possible.
> > Opening link from /proc/<pid>/fd/<link-fd> doesn't seem to work
> > either, because backing inode is not BPF FS inode. I'm not sure, but
> > maybe we can support the latter eventually. But either way, I think
> > given this is to be used for manual troubleshooting, going through few
> > extra hoops to force-detach bpf_link is actually a good thing.
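
The fcntl() limitation mentioned above can be checked with a small sketch.
It assumes that O_RDWR is the flag one would actually want to pass (S_IRUSR
and S_IWUSR are permission bits rather than open flags); the temp file and
the function name access_mode_after_setfl() are made up for the example.
On Linux, F_SETFL ignores access-mode bits, so the FD is expected to stay
read-only:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Opens a temp file O_RDONLY, asks fcntl() to make it O_RDWR, and returns
 * the access mode the kernel reports afterwards (O_RDONLY, O_WRONLY or
 * O_RDWR), or -1 on error.
 */
int access_mode_after_setfl(void)
{
	char path[] = "/tmp/setfl_demo_XXXXXX";
	int fd, flags = -1;
	int tmp_fd = mkstemp(path);

	if (tmp_fd < 0)
		return -1;
	close(tmp_fd);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		goto out;

	/* Linux accepts the call but silently ignores access-mode bits. */
	if (fcntl(fd, F_SETFL, O_RDWR) == 0)
		flags = fcntl(fd, F_GETFL);
	close(fd);
out:
	unlink(path);
	return flags < 0 ? -1 : (flags & O_ACCMODE);
}
```

Comparing the return value against O_RDONLY shows whether the access mode
changed.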
>
> Hmm, I disagree that deliberately making users jump through hoops is a
> good thing. Smells an awful lot like security through obscurity to me;
> and we all know how well that works anyway...

Depends on who the users are. bpftool can implement this as one of the
`bpftool link` sub-commands and allow human operators to force-detach a
bpf_link, if necessary. I think applications shouldn't do this
(programmatically) at all, which is why I think it's actually good
that it's harder and not obvious: hopefully, this will make developers
think again before implementing it. For me it's about discouraging
bad practice.

>
> >> Extending on the concept of RO/RW bpf_link attachments, maybe it should
> >> even be possible for an application to choose which mode it wants to pin
> >> its fd in? With the same capability being able to override it of
> >> course...
> >
> > Isn't that what patch #2 is doing?...
>
> Ah yes, so it is! I guess I skipped over that a bit too fast ;)
>
> > There are few bugs in the implementation currently, but it will work
> > in the final version.
>
> Cool.
>
> >> > We need to consider, though, if CAP_DAC_OVERRIDE is something that can
> >> > be disabled for majority of real-life applications to prevent them
> >> > from doing this. If every realistic application has/needs
> >> > CAP_DAC_OVERRIDE, then that's essentially just saying that anyone can
> >> > get writable bpf_link and do anything with it.
> >>
> >> I poked around a bit, and looking at the sandboxing configurations
> >> shipped with various daemons in their systemd unit files, it appears
> >> that the main case where daemons are granted CAP_DAC_OVERRIDE is if they
> >> have to be able to read /etc/shadow (which is installed as chmod 0). If
> >> this is really the case, that would indicate it's not a widely needed
> >> capability; but I wouldn't exactly say that I've done a comprehensive
> >> survey, so probably a good idea for you to check your users as well :)
> >
> > Right, it might not be possible to drop it for all applications right
> > away, but at least CAP_DAC_OVERRIDE is not CAP_SYS_ADMIN, which is
> > absolutely necessary to work with BPF.
>
> Yeah, I do hope that we'll eventually get CAP_BPF...
>
> -Toke
>
