Linux-Security-Module Archive on lore.kernel.org
 help / color / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Song Liu <songliubraving@fb.com>,
	Kees Cook <keescook@chromium.org>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	Lorenz Bauer <lmb@cloudflare.com>, Jann Horn <jannh@google.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linux API <linux-api@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>
Subject: Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
Date: Tue, 6 Aug 2019 22:24:25 -0700
Message-ID: <CALCETrXEHL3+NAY6P6vUj7Pvd9ZpZsYC6VCLXOaNxb90a_POGw@mail.gmail.com> (raw)
In-Reply-To: <20190806011134.p5baub5l3t5fkmou@ast-mbp>

On Mon, Aug 5, 2019 at 6:11 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Aug 05, 2019 at 02:25:35PM -0700, Andy Lutomirski wrote:
> > It tries to make the kernel respect the access modes for fds.  Without
> > this patch, there seem to be some holes: nothing looked at program fds
> > and, unless I missed something, you could take a readonly fd for a
> > program, pin the program, and reopen it RW.
>
> I think it's by design. iirc Daniel had a use case for something like this.

That seems odd.  Daniel, can you elaborate?

> Hence unprivileged bpf is actually something that can be deprecated.

I hope not.  There are a couple setsockopt uses right now, and and
seccomp will surely want it someday.  And the bpf-inside-container use
case really is unprivileged bpf -- containers are, in many (most?)
cases, explicitly not trusted by the host.  But regardless, see below.

> > The ability to obtain an fd to a cgroup does *not* imply any right to
> > modify that cgroup.  The ability to write to a cgroup directory
> > already means something else -- it's the ability to create cgroups
> > under the group in question.  I'm suggesting that a new API be added
> > that allows attaching a bpf program to a cgroup without capabilities
> > and that instead requires write access to a new file in the cgroup
> > directory.  (It could be a single file for all bpf types or one file
> > per type.  I prefer the latter -- it gives the admin finer-grained
> > control.)
>
> This is something to discuss. I don't mind something like this,
> but in general bpf is not for untrusted users.
> Hence I don't want to overdesign.

I think the code would end up being extremely simple.  It would be an
ABI change, but I think that it could easily be arranged so that new
and old libbpf would be fully functional on new and old kernels except
that only new libbpf with a new kernel would be able to attach cgroup
programs without privilege.  I see no reason to remove the current
cgroup attach API.

> See https://github.com/systemd/systemd/blob/01234e1fe777602265ffd25ab6e73823c62b0efe/src/core/bpf-firewall.c#L671-L674
> bpf based IP sandboxing doesn't work in 'systemd --user'.
> That is just one of the problems that people complained about.

This isn't privilege *dropping* -- this is not having privilege in the
first place.  Giving systemd --user unrestricted use of bpf() is
tantamount to making the user root.  But again, see below.

>
> Inside containers and inside nested containers we need to start processes
> that will use bpf. All of the processes are trusted.

Trusted by whom?  In a non-nested container, the container manager
*might* be trusted by the outside world.  In a *nested* container,
unless the inner container management is controlled from outside the
outer container, it's not trusted.  I don't know much about how
Facebook's containers work, but the LXC/LXD/Podman world is moving
very strongly toward user namespaces and maximally-untrusted
containers, and I think bpf() should work in that context.

> To solve your concern of bypassing all capable checks...
> How about we do /dev/bpf/full_verifier first?
> It will replace capable() checks in the verifier only.

I'm not convinced that "in the verifier" is the right distinction.
Telling administrators that some setting lets certain users bypass
bpf() verifier checks doesn't have a clear enough meaning.  I propose,
instead, that the current capable() checks be divided into three
categories:

a) Those that, by design, control privileged operations.  This
includes most attach calls, but it also includes allow_ptr_leaks,
bpf_probe_read(), and quite a few other things.  It also includes all
of the by_id calls, I think, unless some clever modification to the
way they worked would isolate different users' objects.  I think that
persistent objects can do pretty much everything that by_id users
would need, so this isn't a big deal.

b) Those that protect code that is considered to be at higher risk of
exposing security bugs.  In other words, these are things that would
have no need for privilege if we could prove that the implementation
was perfect.  This includes bpf2bpf, LPM, etc.

c) Those that serve to prevent excessive resource usage.  This
includes the bounded loop code (I think -- this might also be category
(b)) and several explicit checks for program size.

I suppose that the speculative execution mitigation stuff could be
split out from (b) into its own category.

Based on this, how about a different division: /dev/bpf_dangerous for
(b) and /dev/bpf_resources for (c)?  And (a) is dealt with by
gradually reducing the privilege needed for the important operations.
/dev/bpf_dangerous isn't ideal in the long run though, since a lot of
the operations in that category will probably become less dangerous as
the code is better vetted, and people granting "dangerous" permission
might want to restrict it to only the specific parts that they need.
/dev/bpf_lpm, /dev/bpf2bpf, etc could work better.

This type of thing actually fits quite nicely into an idea I've been
thinking about for a while called "implicit rights". In very brief
summary, there would be objects called /dev/rights/xyz, where xyz is
the same of a "right".  If there is a readable object of the right
type at the literal path "/dev/rights/xyz", then you have right xyz.
There's a bit more flexibility on top of this.  BPF could use
/dev/rights/bpf/maptypes/lpm and
/dev/rights/bpf/verifier/bounded_loops, for example.  Other non-BPF
use cases include a biggie:
/dev/rights/namespace/create_unprivileged_userns.
/dev/rights/bind_port/80 would be nice, too.

I may have a bit of time over the next few days to actually prototype
this.  Would you be interested in using this for BPF?

  reply index

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20190627201923.2589391-1-songliubraving@fb.com>
     [not found] ` <20190627201923.2589391-2-songliubraving@fb.com>
     [not found]   ` <21894f45-70d8-dfca-8c02-044f776c5e05@kernel.org>
     [not found]     ` <3C595328-3ABE-4421-9772-8D41094A4F57@fb.com>
     [not found]       ` <CALCETrWBnH4Q43POU8cQ7YMjb9LioK28FDEQf7aHZbdf1eBZWg@mail.gmail.com>
     [not found]         ` <0DE7F23E-9CD2-4F03-82B5-835506B59056@fb.com>
     [not found]           ` <CALCETrWBWbNFJvsTCeUchu3BZJ3SH3dvtXLUB2EhnPrzFfsLNA@mail.gmail.com>
     [not found]             ` <201907021115.DCD56BBABB@keescook>
     [not found]               ` <CALCETrXTta26CTtEDnzvtd03-WOGdXcnsAogP8JjLkcj4-mHvg@mail.gmail.com>
     [not found]                 ` <4A7A225A-6C23-4C0F-9A95-7C6C56B281ED@fb.com>
     [not found]                   ` <CALCETrX2bMnwC6_t4b_G-hzJSfMPrkK4YKs5ebcecv2LJ0rt3w@mail.gmail.com>
     [not found]                     ` <514D5453-0AEE-420F-AEB6-3F4F58C62E7E@fb.com>
     [not found]                       ` <1DE886F3-3982-45DE-B545-67AD6A4871AB@amacapital.net>
     [not found]                         ` <7F51F8B8-CF4C-4D82-AAE1-F0F28951DB7F@fb.com>
     [not found]                           ` <77354A95-4107-41A7-8936-D144F01C3CA4@fb.com>
     [not found]                             ` <369476A8-4CE1-43DA-9239-06437C0384C7@fb.com>
2019-07-30 20:24                               ` Andy Lutomirski
2019-07-31  8:10                                 ` Song Liu
2019-07-31 19:09                                   ` Andy Lutomirski
2019-08-02  7:21                                     ` Song Liu
2019-08-04 22:16                                       ` Andy Lutomirski
2019-08-05  0:08                                         ` Andy Lutomirski
2019-08-05  5:47                                           ` Andy Lutomirski
2019-08-05  7:36                                             ` Song Liu
2019-08-05 17:23                                               ` Andy Lutomirski
2019-08-05 19:21                                                 ` Alexei Starovoitov
2019-08-05 21:25                                                   ` Andy Lutomirski
2019-08-05 22:21                                                     ` Andy Lutomirski
2019-08-06  1:11                                                     ` Alexei Starovoitov
2019-08-07  5:24                                                       ` Andy Lutomirski [this message]
2019-08-07  9:03                                                         ` Lorenz Bauer
2019-08-07 13:52                                                           ` Andy Lutomirski
2019-08-13 21:58                                                         ` Alexei Starovoitov
2019-08-13 22:26                                                           ` Daniel Colascione
2019-08-13 23:24                                                             ` Andy Lutomirski
2019-08-13 23:06                                                           ` Andy Lutomirski
2019-08-14  0:57                                                             ` Alexei Starovoitov
2019-08-14 17:51                                                               ` Andy Lutomirski
2019-08-14 22:05                                                                 ` Alexei Starovoitov
2019-08-14 22:30                                                                   ` Andy Lutomirski
2019-08-14 23:33                                                                     ` Alexei Starovoitov
2019-08-14 23:59                                                                       ` Andy Lutomirski
2019-08-15  0:36                                                                         ` Alexei Starovoitov
2019-08-15 11:24                                                                   ` Jordan Glover
2019-08-15 17:28                                                                     ` Alexei Starovoitov
2019-08-15 18:36                                                                       ` Andy Lutomirski
2019-08-15 23:08                                                                         ` Alexei Starovoitov
2019-08-16  9:34                                                                           ` Jordan Glover
2019-08-16  9:59                                                                             ` Thomas Gleixner
2019-08-16 11:33                                                                               ` Jordan Glover
2019-08-16 19:52                                                                                 ` Alexei Starovoitov
2019-08-16 20:28                                                                                   ` Thomas Gleixner
2019-08-17 15:02                                                                                     ` Alexei Starovoitov
2019-08-17 15:44                                                                                       ` Andy Lutomirski
2019-08-19  9:15                                                                                       ` Thomas Gleixner
2019-08-19 17:27                                                                                         ` Alexei Starovoitov
2019-08-19 17:38                                                                                           ` Andy Lutomirski
2019-08-15 18:43                                                                       ` Jordan Glover
2019-08-15 19:46                                                           ` Kees Cook
2019-08-15 23:46                                                             ` Alexei Starovoitov
2019-08-16  0:54                                                               ` Andy Lutomirski
2019-08-16  5:56                                                                 ` Song Liu
2019-08-16 21:45                                                                 ` Alexei Starovoitov
2019-08-16 22:22                                                                   ` Christian Brauner
2019-08-17 15:08                                                                     ` Alexei Starovoitov
2019-08-17 15:16                                                                       ` Christian Brauner
2019-08-17 15:36                                                                         ` Alexei Starovoitov
2019-08-17 15:42                                                                           ` Christian Brauner
2019-08-22 14:17                                                         ` Daniel Borkmann
2019-08-22 15:16                                                           ` Andy Lutomirski
2019-08-22 15:17                                                             ` RFC: very rough draft of a bpf permission model Andy Lutomirski
2019-08-22 23:26                                                               ` Alexei Starovoitov
2019-08-23 23:09                                                                 ` Andy Lutomirski
2019-08-26 22:36                                                                   ` Alexei Starovoitov
2019-08-27  0:05                                                                     ` Andy Lutomirski
2019-08-27  0:34                                                                       ` Alexei Starovoitov
2019-08-22 22:48                                                           ` [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf Alexei Starovoitov

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrXEHL3+NAY6P6vUj7Pvd9ZpZsYC6VCLXOaNxb90a_POGw@mail.gmail.com \
    --to=luto@kernel.org \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=songliubraving@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Security-Module Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-security-module/0 linux-security-module/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-security-module linux-security-module/ https://lore.kernel.org/linux-security-module \
		linux-security-module@vger.kernel.org linux-security-module@archiver.kernel.org
	public-inbox-index linux-security-module


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-security-module


AGPL code for this site: git clone https://public-inbox.org/ public-inbox