bpf.vger.kernel.org archive mirror
From: Sargun Dhillon <sargun@sargun.me>
To: Tianyin Xu <tyxu@illinois.edu>
Cc: Andy Lutomirski <luto@kernel.org>,
	YiFei Zhu <zhuyifei1999@gmail.com>,
	"containers@lists.linux.dev" <containers@lists.linux.dev>,
	bpf <bpf@vger.kernel.org>, "Zhu, YiFei" <yifeifz2@illinois.edu>,
	LSM List <linux-security-module@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kuo, Hsuan-Chi" <hckuo2@illinois.edu>,
	Claudio Canella <claudio.canella@iaik.tugraz.at>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Daniel Gruss <daniel.gruss@iaik.tugraz.at>,
	Dimitrios Skarlatos <dskarlat@cs.cmu.edu>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Hubertus Franke <frankeh@us.ibm.com>,
	Jann Horn <jannh@google.com>,
	"Jia, Jinghao" <jinghao7@illinois.edu>,
	"Torrellas, Josep" <torrella@illinois.edu>,
	Kees Cook <keescook@chromium.org>,
	Tobin Feldman-Fitzthum <tobin@ibm.com>,
	Tom Hromatka <tom.hromatka@oracle.com>,
	Will Drewry <wad@chromium.org>
Subject: Re: [RFC PATCH bpf-next seccomp 00/12] eBPF seccomp filters
Date: Mon, 24 May 2021 11:55:29 -0700
Message-ID: <CAMp4zn9gAA4csoM=p75_hU_EfxMaw25yrjy0bFnn3gGhrksFhg@mail.gmail.com>
In-Reply-To: <CAGMVDEFE8g5XKyQbB1xaK3ve58cENN2hZm3u=ktpGFgmBdQkeQ@mail.gmail.com>

On Thu, May 20, 2021 at 1:22 AM Tianyin Xu <tyxu@illinois.edu> wrote:
>
> On Mon, May 17, 2021 at 12:08 PM Sargun Dhillon <sargun@sargun.me> wrote:
> >
> > While I agree with you that this is the case right now, there's no reason it
> > has to be the case. There's a variety of mechanisms that can be employed
> > to significantly speed up the performance of the notifier. For example, right
> > now the notifier is behind one large per-filter lock. That could be removed
> > allowing for better concurrency. There are a large number of mechanisms
> > that scale O(n) with the outstanding notifications -- again, something
> > that could be improved.
>
> Thanks for the pointer! But, I don’t think this can fundamentally
> eliminate the performance gap between the notifiers and the ebpf
> filters. IMHO, the additional context switches of user notifiers make
> the difference.
>
I mean, I still think the gap can be closed, or at least narrowed. I've
thought about working on performance improvements, but they're lower on
the list than functionality changes.

> >
> > The other big improvement that could be made is being able to use something
> > like io_uring with the notifier interface, but it would require a fairly
> > significant user API change -- and a move away from ioctl. I'm not sure if
> > people are excited about that idea at the moment.
> >
>
> Apologize that I don’t fully understand your proposal. My
> understanding about io_uring is that it allows you to amortize the
> cost of context switch but not eliminate it, unless you are willing to
> dedicate a core for it. I still believe that, even with io_uring, user
> notifiers are going to be much slower than eBPF filters.

The notifier gets significantly slower as the number of notifications
grows, whether they are in flight or being handled concurrently. This
is where something like io_uring is super useful in terms of reducing
wakeups.

Also, the original futex2 patches had a mechanism to better handle
scheduling in notifier-like cases [1]. If the seccomp notifier did
something similar, we could see better performance.

>
> Btw, our patches are based on your patch set (thank you!). Are you
> using user notifiers (with your improved version?) these days? It will
> be nice to hear your opinions on ebpf filters.
>
I'm so glad that someone is picking up the work on this.

> > >
> > >
> > > > >> eBPF doesn't really have a privilege model yet.  There was a long and
> > > > >> disappointing thread about this awhile back.
> > > > >
> > > > > The idea is that “seccomp-eBPF does not make life easier for an
> > > > > adversary”. Any attack an adversary could potentially utilize
> > > > > seccomp-eBPF, they can do the same with other eBPF features, i.e. it
> > > > > would be an issue with eBPF in general rather than specifically
> > > > > seccomp’s use of eBPF.
> > > > >
> > > > > Here it is referring to the helpers goes to the base
> > > > > bpf_base_func_proto if the caller is unprivileged (!bpf_capable ||
> > > > > !perfmon_capable). In this case, if the adversary would utilize eBPF
> > > > > helpers to perform an attack, they could do it via another
> > > > > unprivileged prog type.
> > > > >
> > > > > That said, there are a few additional helpers this patchset is adding:
> > > > > * get_current_uid_gid
> > > > > * get_current_pid_tgid
> > > > >   These two provide public information (are namespaces a concern?). I
> > > > > have no idea what kind of exploit it could add unless the adversary
> > > > > somehow side-channels the task_struct? But in that case, how is the
> > > > > reading of task_struct different from how the rest of the kernel is
> > > > > reading task_struct?
> > > >
> > > > Yes, namespaces are a concern.  This idea got mostly shot down for kdbus
> > > > (what ever happened to that?), and it likely has the same problems for
> > > > seccomp.
> > > >
So, we actually have a case where we want to inspect an argument: we want
to look at the FD number that's passed to the sendmsg syscall and see
whether it refers to an AF_INET socket. If it does, pass the syscall back
to the notifier; otherwise, allow it to continue through. This is an area
where I can see eBPF being very useful.

> > > > >>
> > > > >> What is this for?
> > > > >
> > > > > Memory reading opens up lots of use cases. For example, logging what
> > > > > files are being opened without imposing too much performance penalty
> > > > > from strace. Or as an accelerator for user notify emulation, where
> > > > > syscalls can be rejected on a fast path if we know the memory contents
> > > > > does not satisfy certain conditions that user notify will check.
> > > > >
> > > >
> > > > This has all kinds of race conditions.
> > > >
> > > >
> > > > I hate to be a party pooper, but this patchset is going to very high bar
> > > > to acceptance.  Right now, seccomp has a couple of excellent properties:
> > > >
> > > > First, while it has limited expressiveness, it is simple enough that the
> > > > implementation can be easily understood and the scope for
> > > > vulnerabilities that fall through the cracks of the seccomp sandbox
> > > > model is low.  Compare this to Windows' low-integrity/high-integrity
> > > > sandbox system: there is a never ending string of sandbox escapes due to
> > > > token misuse, unexpected things at various integrity levels, etc.
> > > > Seccomp doesn't have tokens or integrity levels, and these bugs don't
> > > > happen.
> > > >
> > > > Second, seccomp works, almost unchanged, in a completely unprivileged
> > > > context.  The last time making eBPF work sensibly in a less- or
> > > > -unprivileged context, the maintainers mostly rejected the idea of
> > > > developing/debugging a permission model for maps, cleaning up the bpf
> > > > object id system, etc.  You are going to have a very hard time
> > > > convincing the seccomp maintainers to let any of these mechanism
> > > > interact with seccomp until the underlying permission model is in place.
> > > >
> > > > --Andy
> > >
> > > Thanks for pointing out the tradeoff between expressiveness vs. simplicity.
> > >
> > > Note that we are _not_ proposing to replace cbpf, but propose to also
> > > support ebpf filters. There certainly are use cases where cbpf is
> > > sufficient, but there are also important use cases ebpf could make
> > > life much easier.
> > >
> > > Most importantly, we strongly believe that ebpf filters can be
> > > supported without reducing security.
> > >
> > > No worries about “party pooping” and we appreciate the feedback. We’d
> > > love to hear concerns and collect feedback so we can address them to
> > > hit that very high bar.
> > >
> > >
> > > ~t
> > >
> > > --
> > > Tianyin Xu
> > > University of Illinois at Urbana-Champaign
> > > https://tianyin.github.io/
>

[1]: https://lore.kernel.org/lkml/20210215152404.250281-1-andrealmeid@collabora.com/T/
