From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
Networking <netdev@vger.kernel.org>,
Alexei Starovoitov <ast@fb.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [RFC PATCH bpf-next 4/8] bpf: support GET_FD_BY_ID and GET_NEXT_ID for bpf_link
Date: Thu, 9 Apr 2020 11:49:33 -0700
Message-ID: <CAEf4BzbXCsHCJ6Tet0i5g=pKB_uYqvgiaBNuY-NMdZm8rdZN5g@mail.gmail.com>
In-Reply-To: <87tv1t65cr.fsf@toke.dk>
On Wed, Apr 8, 2020 at 2:21 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Wed, Apr 8, 2020 at 8:14 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >> > On Mon, Apr 6, 2020 at 4:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> Andrii Nakryiko <andriin@fb.com> writes:
> >> >>
> >> >> > Add support to look up bpf_link by ID and iterate over all existing bpf_links
> >> >> > in the system. GET_FD_BY_ID code handles not-yet-ready bpf_link by checking
> >> >> > that its ID hasn't been set to non-zero value yet. Setting bpf_link's ID is
> >> >> > done as the very last step in finalizing bpf_link, together with installing
> >> >> > FD. This approach allows users of bpf_link in kernel code to not worry about
> >> >> > races between user-space and kernel code that hasn't finished attaching and
> >> >> > initializing bpf_link.
> >> >> >
> >> >> > Further, it's critical that BPF_LINK_GET_FD_BY_ID only ever allows to create
> >> >> > bpf_link FD that's O_RDONLY. This is to protect processes owning bpf_link and
> >> >> > thus allowed to perform modifications on them (like LINK_UPDATE), from other
> >> >> > processes that got bpf_link ID from GET_NEXT_ID API. In the latter case, only
> >> >> > querying bpf_link information (implemented later in the series) will be
> >> >> > allowed.
> >> >>
> >> >> I must admit I remain sceptical about this model of restricting access
> >> >> without any of the regular override mechanisms (for instance, enforcing
> >> >> read-only mode regardless of CAP_DAC_OVERRIDE in this series). Since you
> >> >> keep saying there would be 'some' override mechanism, I think it would
> >> >> be helpful if you could just include that so we can see the full
> >> >> mechanism in context.
> >> >
> >> > I wasn't aware of CAP_DAC_OVERRIDE, thanks for bringing this up.
> >> >
> >> > One way to go about this is to allow creating writable bpf_link for
> >> > GET_FD_BY_ID if CAP_DAC_OVERRIDE is set. Then we can allow LINK_DETACH
> >> > operation on writable links, same as we do with LINK_UPDATE here.
> >> > LINK_DETACH will do the same as cgroup bpf_link auto-detachment on
> >> > cgroup dying: it will detach bpf_link, but will leave it alive until
> >> > last FD is closed.
> >>
> >> Yup, I think this would be a reasonable way to implement the override
> >> mechanism - it would ensure 'full root' users (like a root shell) can
> >> remove attachments, while still preventing applications from doing so by
> >> limiting their capabilities.
> >
> > So I did some experiments and I think I want to keep GET_FD_BY_ID for
> > bpf_link to return only read-only bpf_links.
>
> Why, exactly? (also, see below)
For the reasons explained below: if you have CAP_DAC_OVERRIDE, you can
already turn a read-only bpf_link into a writable one through
pinning + chmod.
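
To make the permission side of this concrete, here is a minimal sketch. It
uses a regular file as a stand-in for the pinned bpf_link, since pinning in
BPF FS needs privileges; reopen_writable() is a hypothetical helper name, not
part of this series. For a real pinned link the re-open step would go through
BPF_OBJ_GET (bpf_obj_get() in libbpf) rather than open(), but the idea is the
same: access is decided by the ordinary file permission checks.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to get a writable FD for `path`. If a plain
 * O_RDWR open is denied, relax the file mode with chmod() and retry.
 * Returns the new FD, or -1 with errno set.
 */
int reopen_writable(const char *path)
{
	int fd;

	assert(path != NULL);

	/* Succeeds right away if the mode (or CAP_DAC_OVERRIDE) allows it. */
	fd = open(path, O_RDWR);
	if (fd >= 0 || errno != EACCES)
		return fd;

	/* chmod() itself requires owning the file or CAP_FOWNER. */
	if (chmod(path, S_IRUSR | S_IWUSR) < 0)
		return -1;

	return open(path, O_RDWR);
}
```

A caller would pin the link first, run something like this on the pin path,
and unlink the pin again once it has the writable FD.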
>
> > After that, one can pin bpf_link temporarily and re-open it as
> > writable one, provided CAP_DAC_OVERRIDE capability is present. All
> > that works already, because pinned bpf_link is just a file, so one can
> > do fchmod on it and all that will go through normal file access
> > permission check code path.
>
> Ah, I did not know that was possible - I was assuming that bpffs was
> doing something special to prevent that. But if not, great!
>
> > Unfortunately, just re-opening same FD as writable (which would
> > be possible if fcntl(fd, F_SETFL, S_IRUSR | S_IWUSR) was supported
> > on Linux) without pinning is not possible.
> > Opening link from /proc/<pid>/fd/<link-fd> doesn't seem to work
> > either, because backing inode is not BPF FS inode. I'm not sure, but
> > maybe we can support the latter eventually. But either way, I think
> > given this is to be used for manual troubleshooting, going through few
> > extra hoops to force-detach bpf_link is actually a good thing.
>
> Hmm, I disagree that deliberately making users jump through hoops is a
> good thing. Smells an awful lot like security through obscurity to me;
> and we all know how well that works anyway...
Depends on who the users are. bpftool can implement this as one of the
`bpftool link` sub-commands and allow human operators to force-detach
a bpf_link, if necessary. I think applications shouldn't do this
(programmatically) at all, which is why I think it's actually good
that it's harder and not obvious: hopefully this will make a developer
think again before implementing it. For me it's about discouraging
bad practice.
>
> >> Extending on the concept of RO/RW bpf_link attachments, maybe it should
> >> even be possible for an application to choose which mode it wants to pin
> >> its fd in? With the same capability being able to override it of
> >> course...
> >
> > Isn't that what patch #2 is doing?...
>
> Ah yes, so it is! I guess I skipped over that a bit too fast ;)
>
> > There are few bugs in the implementation currently, but it will work
> > in the final version.
>
> Cool.
>
> >> > We need to consider, though, if CAP_DAC_OVERRIDE is something that can
> >> > be disabled for majority of real-life applications to prevent them
> >> > from doing this. If every realistic application has/needs
> >> > CAP_DAC_OVERRIDE, then that's essentially just saying that anyone can
> >> > get writable bpf_link and do anything with it.
> >>
> >> I poked around a bit, and looking at the sandboxing configurations
> >> shipped with various daemons in their systemd unit files, it appears
> >> that the main case where daemons are granted CAP_DAC_OVERRIDE is if they
> >> have to be able to read /etc/shadow (which is installed as chmod 0). If
> >> this is really the case, that would indicate it's not a widely needed
> >> capability; but I wouldn't exactly say that I've done a comprehensive
> >> survey, so probably a good idea for you to check your users as well :)
> >
> > Right, it might not be possible to drop it for all applications right
> > away, but at least CAP_DAC_OVERRIDE is not CAP_SYS_ADMIN, which is
> > absolutely necessary to work with BPF.
>
> Yeah, I do hope that we'll eventually get CAP_BPF...
>
> -Toke
>
Thread overview: 24+ messages
2020-04-04 0:09 [RFC PATCH bpf-next 0/8] bpf_link observability APIs Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 1/8] bpf: refactor bpf_link update handling Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 2/8] bpf: allow bpf_link pinning as read-only and enforce LINK_UPDATE Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 3/8] bpf: allocate ID for bpf_link Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 4/8] bpf: support GET_FD_BY_ID and GET_NEXT_ID " Andrii Nakryiko
2020-04-06 11:34 ` Toke Høiland-Jørgensen
2020-04-06 19:06 ` Andrii Nakryiko
2020-04-08 15:14 ` Toke Høiland-Jørgensen
2020-04-08 20:23 ` Andrii Nakryiko
2020-04-08 21:21 ` Toke Høiland-Jørgensen
2020-04-09 18:49 ` Andrii Nakryiko [this message]
2020-04-14 10:32 ` Toke Høiland-Jørgensen
2020-04-14 18:47 ` Andrii Nakryiko
2020-04-15 9:26 ` Toke Høiland-Jørgensen
2020-04-04 0:09 ` [RFC PATCH bpf-next 5/8] bpf: add support for BPF_OBJ_GET_INFO_BY_FD " Andrii Nakryiko
2020-04-06 11:34 ` Toke Høiland-Jørgensen
2020-04-06 18:58 ` Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 6/8] libbpf: add low-level APIs for new bpf_link commands Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 7/8] bpftool: expose attach_type-to-string array to non-cgroup code Andrii Nakryiko
2020-04-04 0:09 ` [RFC PATCH bpf-next 8/8] bpftool: add bpf_link show and pin support Andrii Nakryiko
2020-04-08 23:44 ` David Ahern
2020-04-09 18:50 ` Andrii Nakryiko
2020-04-05 16:26 ` [RFC PATCH bpf-next 0/8] bpf_link observability APIs David Ahern
2020-04-05 18:31 ` Andrii Nakryiko