bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <christian.brauner@ubuntu.com>
To: Tianyin Xu <tyxu@illinois.edu>
Cc: Tycho Andersen <tycho@tycho.pizza>,
	Andy Lutomirski <luto@kernel.org>,
	YiFei Zhu <zhuyifei1999@gmail.com>,
	"containers@lists.linux.dev" <containers@lists.linux.dev>,
	bpf <bpf@vger.kernel.org>, "Zhu, YiFei" <yifeifz2@illinois.edu>,
	LSM List <linux-security-module@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kuo, Hsuan-Chi" <hckuo2@illinois.edu>,
	Claudio Canella <claudio.canella@iaik.tugraz.at>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Daniel Gruss <daniel.gruss@iaik.tugraz.at>,
	Dimitrios Skarlatos <dskarlat@cs.cmu.edu>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Hubertus Franke <frankeh@us.ibm.com>,
	Jann Horn <jannh@google.com>,
	"Jia, Jinghao" <jinghao7@illinois.edu>,
	"Torrellas, Josep" <torrella@illinois.edu>,
	Kees Cook <keescook@chromium.org>,
	Sargun Dhillon <sargun@sargun.me>,
	Tobin Feldman-Fitzthum <tobin@ibm.com>,
	Tom Hromatka <tom.hromatka@oracle.com>,
	Will Drewry <wad@chromium.org>
Subject: Re: [RFC PATCH bpf-next seccomp 00/12] eBPF seccomp filters
Date: Thu, 20 May 2021 11:37:59 +0200	[thread overview]
Message-ID: <20210520093759.mj5diqjdmj2dekdr@wittgenstein> (raw)
In-Reply-To: <20210520085613.gvshk4jffmzggvsm@wittgenstein>

On Thu, May 20, 2021 at 10:56:13AM +0200, Christian Brauner wrote:
> On Thu, May 20, 2021 at 03:16:10AM -0500, Tianyin Xu wrote:
> > On Mon, May 17, 2021 at 10:40 AM Tycho Andersen <tycho@tycho.pizza> wrote:
> > >
> > > On Sun, May 16, 2021 at 03:38:00AM -0500, Tianyin Xu wrote:
> > > > On Sat, May 15, 2021 at 10:49 AM Andy Lutomirski <luto@kernel.org> wrote:
> > > > >
> > > > > On 5/10/21 10:21 PM, YiFei Zhu wrote:
> > > > > > On Mon, May 10, 2021 at 12:47 PM Andy Lutomirski <luto@kernel.org> wrote:
> > > > > >> On Mon, May 10, 2021 at 10:22 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
> > > > > >>>
> > > > > >>> From: YiFei Zhu <yifeifz2@illinois.edu>
> > > > > >>>
> > > > > >>> Based on: https://urldefense.com/v3/__https://lists.linux-foundation.org/pipermail/containers/2018-February/038571.html__;!!DZ3fjg!thbAoRgmCeWjlv0qPDndNZW1j6Y2Kl_huVyUffr4wVbISf-aUiULaWHwkKJrNJyo$
> > > > > >>>
> > > > > >>> This patchset enables seccomp filters to be written in eBPF.
> > > > > >>> Supporting eBPF filters has been proposed a few times in the past.
> > > > > >>> The main concerns were (1) use cases and (2) security. We have
> > > > > >>> identified many use cases that can benefit from advanced eBPF
> > > > > >>> filters, such as:
> > > > > >>
> > > > > >> I haven't reviewed this carefully, but I think we need to distinguish
> > > > > >> a few things:
> > > > > >>
> > > > > >> 1. Using the eBPF *language*.
> > > > > >>
> > > > > >> 2. Allowing the use of stateful / non-pure eBPF features.
> > > > > >>
> > > > > >> 3. Allowing the eBPF programs to read the target process' memory.
> > > > > >>
> > > > > >> I'm generally in favor of (1).  I'm not at all sure about (2), and I'm
> > > > > >> even less convinced by (3).
> > > > > >>
> > > > > >>>
> > > > > >>>   * exec-only-once filter / apply filter after exec
> > > > > >>
> > > > > >> This is (2).  I'm not sure it's a good idea.
> > > > > >
> > > > > > The basic idea is that for a container runtime it may wait to execute
> > > > > > a program in a container without that program being able to execve
> > > > > > another program, stopping any attack that involves loading another
> > > > > > binary. The container runtime can block any syscall but execve in the
> > > > > > exec-ed process by using only cBPF.
> > > > > >
> > > > > > The use case is suggested by Andrea Arcangeli and Giuseppe Scrivano.
> > > > > > @Andrea and @Giuseppe, could you clarify more in case I missed
> > > > > > something?
> > > > >
> > > > > We've discussed having a notifier-using filter be able to replace its
> > > > > filter.  This would allow this and other use cases without any
> > > > > additional eBPF or cBPF code.
> > > > >
> > > >
> > > > A notifier is not always a solution (even ignoring its perf overhead).
> > > >
> > > > One problem, pointed out by Andrea Arcangeli, is that notifiers need
> > > > userspace daemons. So, it can hardly be used by daemonless container
> > > > engines like Podman.
> > >
> > > I'm not sure I buy this argument. Podman already has a conmon instance
> > > for each container, this could be a child of that conmon process, or
> > > live inside conmon itself.
> > >
> > > Tycho
> > 
> > I checked with Andrea Arcangeli and Giuseppe Scrivano who are working on Podman.
> > 
> > You are right that Podman is not completely daemonless. However, “the
> > fact it's no entirely daemonless doesn't imply it's a good idea to
> > make it worse and to add complexity to the background conmon daemon or
> > to add more daemons.”
> > 
> > TL;DR. User notifiers are surely more flexible, but are also more
> > expensive and complex to implement, compared with ebpf filters. /*
> > I’ll reply to Sargun’s performance argument in a separate email */
> > 
> > I'm sure you know Podman well, but let me still move some jade from
> > Andrea and Giuseppe (all credits on podmon/crun are theirs) to
> > elaborate the point, for folks cced on the list who are not very
> > familiar with Podman.
> > 
> > Basically, the current order goes as follows:
> > 
> >          podman -> conmon -> crun -> container_binary
> >                                \
> >                                 - seccomp done at crun level, not conmon
> > 
> > At runtime, what's left is:
> > 
> >          conmon -> container_binary  /* podman disappears; crun disappears */
> > 
> > So, to go through and use seccomp notify to block `exec`, we can
> > either start the container_binary with a seccomp agent wrapper, or
> > bloat the common binary (as pointed out by Tycho).
> > 
> > If we go with the first approach, we will have:
> > 
> >          podman -> conmon -> crun -> seccomp_agent -> container_binary
> > 
> > So, at runtime we'd be left with one more daemon:
> > 
> >         conmon -> seccomp_agent -> container_binary
> 
> That seems like a strawman. I don't see why this has to be out of
> process or a separate daemon. Conmon uses a regular event loop. Adding
> support for processing notifier syscall notifications is
> straightforward. Moving it to a plugin as you mentioned below is a
> design decision not a necessity.
> 
> > 
> > Apparently, nobody likes one more daemon. So, the proposal from
> 
> I'm not sure such a blanket statements about an indeterminate group of
> people's alleged preferences constitutes a technical argument wny we
> need ebpf in seccomp.

I was missing a :) here.

> 
> > Giuseppe was/is to use user notifiers as plugins (.so) loaded by
> > conmon:
> > https://github.com/containers/conmon/pull/190
> > https://github.com/containers/crun/pull/438
> > 
> > Now, with the ebpf filter support, one can implement the same thing
> > using an embarrassingly simple ebpf filter and, thanks to Giuseppe,
> > this is well supported by crun.
> 
> So I think this is trying to jump the gun by saying "Look, the result
> might be simpler.". That may even be the case - though I'm not yet
> convinced - but Andy's point stands that this brings a slew of issues on
> the table that need clear answers. Bringing stateful ebpf features into
> seccomp is a pretty big step and especially around the
> privilege/security model it looks pretty handwavy right now.
> 
> Christian
> 

  reply	other threads:[~2021-05-20  9:59 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-10 17:22 [RFC PATCH bpf-next seccomp 00/12] eBPF seccomp filters YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 01/12] seccomp: Move no_new_privs check to after prepare_filter YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 02/12] bpf, seccomp: Add eBPF filter capabilities YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 03/12] seccomp, ptrace: Add a mechanism to retrieve attached eBPF seccomp filters YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 04/12] libbpf: recognize section "seccomp" YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 05/12] samples/bpf: Add eBPF seccomp sample programs YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 06/12] lsm: New hook seccomp_extended YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 07/12] bpf/verifier: allow restricting direct map access YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 08/12] seccomp-ebpf: restrict filter to almost cBPF if LSM request such YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 09/12] yama: (concept) restrict seccomp-eBPF with ptrace_scope YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 10/12] seccomp-ebpf: Add ability to read user memory YiFei Zhu
2021-05-11  2:04   ` Alexei Starovoitov
2021-05-11  7:14     ` YiFei Zhu
2021-05-12 22:36       ` Alexei Starovoitov
2021-05-13  5:26         ` YiFei Zhu
2021-05-13 14:53           ` Andy Lutomirski
2021-05-13 17:12             ` YiFei Zhu
2021-05-13 17:15               ` Andy Lutomirski
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 11/12] bpf/verifier: support NULL-able ptr to BTF ID as helper argument YiFei Zhu
2021-05-10 17:22 ` [RFC PATCH bpf-next seccomp 12/12] seccomp-ebpf: support task storage from BPF-LSM, defaulting to group leader YiFei Zhu
2021-05-11  1:58   ` Alexei Starovoitov
2021-05-11  5:44     ` YiFei Zhu
2021-05-12 21:56       ` Alexei Starovoitov
2021-05-10 17:47 ` [RFC PATCH bpf-next seccomp 00/12] eBPF seccomp filters Andy Lutomirski
2021-05-11  5:21   ` YiFei Zhu
2021-05-15 15:49     ` Andy Lutomirski
2021-05-20  9:05       ` Christian Brauner
     [not found]     ` <fffbea8189794a8da539f6082af3de8e@DM5PR11MB1692.namprd11.prod.outlook.com>
2021-05-16  8:38       ` Tianyin Xu
2021-05-17 15:40         ` Tycho Andersen
2021-05-17 17:07         ` Sargun Dhillon
     [not found]         ` <108b4b9c2daa4123805d2b92cf51374b@DM5PR11MB1692.namprd11.prod.outlook.com>
2021-05-20  8:16           ` Tianyin Xu
2021-05-20  8:56             ` Christian Brauner
2021-05-20  9:37               ` Christian Brauner [this message]
2021-06-01 19:55               ` Kees Cook
2021-06-09  6:32                 ` Jinghao Jia
2021-06-09  6:27               ` Jinghao Jia
     [not found]             ` <00fe481c572d486289bc88780f48e88f@DM5PR11MB1692.namprd11.prod.outlook.com>
2021-05-20 22:13               ` Tianyin Xu
     [not found]         ` <eae2a0e5038b41c4af87edcb3d4cdc13@DM5PR11MB1692.namprd11.prod.outlook.com>
2021-05-20  8:22           ` Tianyin Xu
2021-05-24 18:55             ` Sargun Dhillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210520093759.mj5diqjdmj2dekdr@wittgenstein \
    --to=christian.brauner@ubuntu.com \
    --cc=aarcange@redhat.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=claudio.canella@iaik.tugraz.at \
    --cc=containers@lists.linux.dev \
    --cc=daniel.gruss@iaik.tugraz.at \
    --cc=daniel@iogearbox.net \
    --cc=dskarlat@cs.cmu.edu \
    --cc=frankeh@us.ibm.com \
    --cc=gscrivan@redhat.com \
    --cc=hckuo2@illinois.edu \
    --cc=jannh@google.com \
    --cc=jinghao7@illinois.edu \
    --cc=keescook@chromium.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=sargun@sargun.me \
    --cc=tobin@ibm.com \
    --cc=tom.hromatka@oracle.com \
    --cc=torrella@illinois.edu \
    --cc=tycho@tycho.pizza \
    --cc=tyxu@illinois.edu \
    --cc=wad@chromium.org \
    --cc=yifeifz2@illinois.edu \
    --cc=zhuyifei1999@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).