BPF Archive on lore.kernel.org
 help / color / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@kernel.org>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>,
	Lorenz Bauer <lmb@cloudflare.com>, Jann Horn <jannh@google.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linux API <linux-api@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>
Subject: Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
Date: Fri, 16 Aug 2019 05:56:20 +0000
Message-ID: <4F52274A-CD70-4261-A255-2C4A7E818141@fb.com> (raw)
In-Reply-To: <B0364660-AD6A-4E5C-B04F-3B6DA78B4BBE@amacapital.net>



> On Aug 15, 2019, at 5:54 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> 
> 
> 
>> On Aug 15, 2019, at 4:46 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> 
>>> 
>>> I'm not sure why you draw the line for VMs -- they're just as buggy
>>> as anything else. Regardless, I reject this line of thinking: yes,
>>> all software is buggy, but that isn't a reason to give up.
>> 
>> hmm. are you saying you want kernel community to work towards
>> making containers (namespaces) being able to run arbitrary code
>> downloaded from the internet?
> 
> Yes.
> 
> As an example, Sandstorm uses a combination of namespaces (user, network, mount, ipc) and a moderately permissive seccomp policy to run arbitrary code. Not just little snippets, either — node.js, Mongo, MySQL, Meteor, and other fairly heavyweight stacks can all run under Sandstorm, with the whole stack (database engine binaries, etc) supplied by entirely untrusted customers.  During the time Sandstorm was under active development, I can recall *one* bug that would have allowed a sandbox escape. That’s a pretty good track record.  (Also, Meltdown and Spectre, sigh.)
> 
> To be clear, Sandstorm did not allow creation of a userns by the untrusted code, and Sandstorm would have heavily restricted bpf(), but that should only be necessary because of the possibility of kernel bugs, not because of the overall design.
> 
> Alexei, I’m trying to encourage you to aim for something even better than you have now. Right now, if you grant a user various very strong capabilities, that user’s systemd can use bpf network filters.  Your proposal would allow this with a different, but still very strong, set of capabilities. There’s nothing wrong with this per se, but I think you can aim much higher:
> 
> CAP_NET_ADMIN and your CAP_BPF both effectively allow the holder to take over the system, *by design*.  I’m suggesting that you engage the security community (Kees, myself, Aleksa, Jann, Serge, Christian, etc) to aim for something better: make it so that a normal Linux distro would be willing to relax its settings enough so that normal users can use bpf filtering in the systemd units and maybe eventually use even more bpf() capabilities. And let’s make is to that mainstream container managers (that use userns!) will be willing (as an option) to delegate bpf() to their containers. We’re happy to help design, review, and even write code, but we need you to be willing to work with us to make a design that seems like it will work and then to wait long enough to merge it for us to think about it, try to poke holes in it, and convince ourselves and each other that it has a good chance of being sound.
> 
> Obviously there will be many cases where an unprivileged program should *not* be able to use bpf() IP filtering, but let’s make it so that enabling these advanced features does not automatically give away the keys to the kingdom.
> 
> (Sandstorm still exists but is no longer as actively developed, sadly.)

I am trying to understand different perspectives here. 

Disclaimer: Alexei and I both work for Facebook. But he may disagree 
with everything I am about to say below, because we haven't sync'ed 
about this for a while. :)

I think there are two types of use cases here: 

    1. CAP_BPF_ADMIN: one big key to all sys_bpf(). 
    2. CAP_BPF: subset of sys_bpf() that is safe for containers.

IIUC, currently, CAP_BPF_ADMIN is (almost) same as CAP_SYS_ADMIN. 
And there aren't many real world use cases for CAP_BPF. 

The /dev/bpf patch tries to separate CAP_BPF_ADMIN from CAP_SYS_ADMIN.
On the other hand, Andy would like to introduce CAP_BPF and build
amazing use cases around it (chicken-egg problem). 

Did I misunderstand anything?

If not, I think these two use cases do not really conflict with each
other, and we probably need both of them. Then, the next question is 
do we really need both/either of them. Maybe having two separate 
discussions would make it easier?


The following are some questions I am trying to understand for 
the two cases. 

For CAP_BPF_ADMIN (or /dev/bpf):
Can we just use CAP_NET_ADMIN? It is safer than CAP_SYS_ADMIN, and
reuse existing CAP_ should be easier than introducing a new one? 

For CAP_BPF: 
Do we really need it for the containers? Is it possible to implement 
all container use cases with SUID? At this moment, I think SUID is 
the right way to go for this use case, because this is likely to 
start with a small set of functionalities. We can introduce CAP_BPF
when the container use case is too complicated for SUID. 


I hope some of these questions/thoughts would make some sense?

Thanks,
Song

  reply index

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-27 20:19 [PATCH v2 bpf-next 0/4] sys_bpf() access control " Song Liu
2019-06-27 20:19 ` [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access " Song Liu
2019-06-27 23:40   ` Andy Lutomirski
2019-06-27 23:42     ` Andy Lutomirski
2019-06-28 10:28       ` Christian Brauner
2019-06-28  9:05     ` Lorenz Bauer
2019-06-28 19:04     ` Song Liu
2019-06-30  0:12       ` Andy Lutomirski
2019-07-01  9:03         ` Song Liu
2019-07-02  1:59           ` Andy Lutomirski
2019-07-02 18:24             ` Kees Cook
2019-07-02 21:32               ` Andy Lutomirski
2019-07-02 23:48                 ` Song Liu
2019-07-22 20:53                 ` Song Liu
2019-07-23 10:45                   ` Lorenz Bauer
2019-07-23 15:11                   ` Andy Lutomirski
2019-07-23 22:56                     ` Song Liu
2019-07-24  1:40                       ` Andy Lutomirski
2019-07-24  6:30                         ` Song Liu
2019-07-27 18:20                           ` Song Liu
2019-07-30  5:07                             ` Song Liu
2019-07-30 20:24                               ` Andy Lutomirski
2019-07-31  8:10                                 ` Song Liu
2019-07-31 19:09                                   ` Andy Lutomirski
2019-08-02  7:21                                     ` Song Liu
2019-08-04 22:16                                       ` Andy Lutomirski
2019-08-05  0:08                                         ` Andy Lutomirski
2019-08-05  5:47                                           ` Andy Lutomirski
2019-08-05  7:36                                             ` Song Liu
2019-08-05 17:23                                               ` Andy Lutomirski
2019-08-05 19:21                                                 ` Alexei Starovoitov
2019-08-05 21:25                                                   ` Andy Lutomirski
2019-08-05 22:21                                                     ` Andy Lutomirski
2019-08-06  1:11                                                     ` Alexei Starovoitov
2019-08-07  5:24                                                       ` Andy Lutomirski
2019-08-07  9:03                                                         ` Lorenz Bauer
2019-08-07 13:52                                                           ` Andy Lutomirski
2019-08-13 21:58                                                         ` Alexei Starovoitov
2019-08-13 22:26                                                           ` Daniel Colascione
2019-08-13 23:24                                                             ` Andy Lutomirski
2019-08-13 23:06                                                           ` Andy Lutomirski
2019-08-14  0:57                                                             ` Alexei Starovoitov
2019-08-14 17:51                                                               ` Andy Lutomirski
2019-08-14 22:05                                                                 ` Alexei Starovoitov
2019-08-14 22:30                                                                   ` Andy Lutomirski
2019-08-14 23:33                                                                     ` Alexei Starovoitov
2019-08-14 23:59                                                                       ` Andy Lutomirski
2019-08-15  0:36                                                                         ` Alexei Starovoitov
2019-08-15 11:24                                                                   ` Jordan Glover
2019-08-15 17:28                                                                     ` Alexei Starovoitov
2019-08-15 18:36                                                                       ` Andy Lutomirski
2019-08-15 23:08                                                                         ` Alexei Starovoitov
2019-08-16  9:34                                                                           ` Jordan Glover
2019-08-16  9:59                                                                             ` Thomas Gleixner
2019-08-16 11:33                                                                               ` Jordan Glover
2019-08-16 19:52                                                                                 ` Alexei Starovoitov
2019-08-16 20:28                                                                                   ` Thomas Gleixner
2019-08-17 15:02                                                                                     ` Alexei Starovoitov
2019-08-17 15:44                                                                                       ` Andy Lutomirski
2019-08-19  9:15                                                                                       ` Thomas Gleixner
2019-08-19 17:27                                                                                         ` Alexei Starovoitov
2019-08-19 17:38                                                                                           ` Andy Lutomirski
2019-08-15 18:43                                                                       ` Jordan Glover
2019-08-15 19:46                                                           ` Kees Cook
2019-08-15 23:46                                                             ` Alexei Starovoitov
2019-08-16  0:54                                                               ` Andy Lutomirski
2019-08-16  5:56                                                                 ` Song Liu [this message]
2019-08-16 21:45                                                                 ` Alexei Starovoitov
2019-08-16 22:22                                                                   ` Christian Brauner
2019-08-17 15:08                                                                     ` Alexei Starovoitov
2019-08-17 15:16                                                                       ` Christian Brauner
2019-08-17 15:36                                                                         ` Alexei Starovoitov
2019-08-17 15:42                                                                           ` Christian Brauner
2019-08-22 14:17                                                         ` Daniel Borkmann
2019-08-22 15:16                                                           ` Andy Lutomirski
2019-08-22 15:17                                                             ` RFC: very rough draft of a bpf permission model Andy Lutomirski
2019-08-22 23:26                                                               ` Alexei Starovoitov
2019-08-23 23:09                                                                 ` Andy Lutomirski
2019-08-26 22:36                                                                   ` Alexei Starovoitov
2019-08-27  0:05                                                                     ` Andy Lutomirski
2019-08-27  0:34                                                                       ` Alexei Starovoitov
2019-08-22 22:48                                                           ` [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf Alexei Starovoitov
2019-07-30 20:20                             ` Andy Lutomirski
2019-07-31  7:44                               ` Song Liu
2019-06-28  9:01   ` Lorenz Bauer
2019-06-28 19:10     ` Song Liu
2019-07-01  9:34       ` Lorenz Bauer
2019-07-02 19:22   ` Andrii Nakryiko
2019-07-03  7:28     ` Greg KH
2019-06-27 20:19 ` [PATCH v2 bpf-next 2/4] bpf: sync tools/include/uapi/linux/bpf.h Song Liu
2019-06-27 20:19 ` [PATCH v2 bpf-next 3/4] libbpf: add libbpf_[enable|disable]_sys_bpf() Song Liu
2019-06-27 20:19 ` [PATCH v2 bpf-next 4/4] bpftool: use libbpf_[enable|disable]_sys_bpf() Song Liu

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F52274A-CD70-4261-A255-2C4A7E818141@fb.com \
    --to=songliubraving@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org bpf@archiver.kernel.org
	public-inbox-index bpf


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/ public-inbox