* [net-next v3 0/2] eBPF seccomp filters
@ 2018-02-26  7:26 Sargun Dhillon
       [not found] ` <20180226072651.GA27045-du9IEJ8oIxHXYT48pCVpJ3c7ZZ+wIVaZYkHkVr5ML8kVGlcevz2xqA@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Sargun Dhillon @ 2018-02-26  7:26 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: wad-F7+t8E8rja9g9hUCZPvPmw, keescook-F7+t8E8rja9g9hUCZPvPmw,
	daniel-FeC+5ew28dpmcu3hnIyYJQ, ast-DgEjT+Ai2ygdnm+yROfE0A,
	luto-kltTT9wpgjJwATOyAt5JVQ

This patchset enables seccomp filters to be written in eBPF. Although this
patchset doesn't introduce much of the functionality enabled by eBPF, it lays
the groundwork for it. Currently, you have to disable CHECKPOINT_RESTORE
support in order to use eBPF seccomp filters, as eBPF filters cannot be
retrieved via the ptrace GET_FILTER API.

Any user can load a bpf seccomp filter program, and it can be pinned and
reused without requiring access to the bpf syscalls. A user only requires
the traditional permissions of either having CAP_SYS_ADMIN or having
no_new_privs set in order to install their rule.
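
For reference, the permission requirement above is the same one that
applies to classic seccomp filters today. The following is a minimal
sketch of that existing classic-BPF path, not of the eBPF flag this
patchset adds; the function name install_allow_all_filter is made up
for illustration, and the filter simply allows every syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/*
 * Install a classic-BPF seccomp filter that allows every syscall.
 * Illustrative only: the function name is hypothetical.
 * Returns 0 on success, -1 on failure (errno is set by prctl).
 */
int install_allow_all_filter(void)
{
	struct sock_filter insns[] = {
		/* Single instruction: return SECCOMP_RET_ALLOW. */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	assert(prog.len == 1);

	/* A task without CAP_SYS_ADMIN must set no_new_privs first. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == 0 ? 0 : -1;
}
```

Note that no_new_privs cannot be cleared once set, and an installed
filter cannot be removed, so this is best tried in a throwaway process.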

The primary reason for not adding maps support in this patchset is
to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
If we have a map that the BPF program can read, it can potentially
"change" privileges after running. It seems like doing writes only
is safe, because it can be pure and side-effect free, and therefore
not negatively affect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
to an agreement, this can be in a follow-up patchset.

A benchmark of this patchset is as follows for a very standard eBPF filter:

Given this test program:
for (i = 10; i < 99999999; i++) syscall(__NR_getpid);
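
A self-contained way to time that loop might look like the sketch
below. It is an assumption about how such a measurement could be set
up, not the harness actually used for the numbers in this mail; the
function name time_getpid_loop is hypothetical, and the loop starts at
0 rather than 10.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for syscall() */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Issue `iterations` raw getpid syscalls and return the elapsed
 * wall-clock time in nanoseconds (CLOCK_MONOTONIC).
 */
int64_t time_getpid_loop(long iterations)
{
	struct timespec start, end;
	long i;

	assert(iterations >= 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++)
		syscall(__NR_getpid);	/* bypass any libc caching of the pid */
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	       (end.tv_nsec - start.tv_nsec);
}
```

Running it once with no filter installed and once per filter type, then
comparing the totals, gives "slower than native" percentages of the
kind listed here.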

If I implement an eBPF filter with PROG_ARRAYs with a program per syscall,
and tail call, the numbers are such:
ebpf JIT 12.3% slower than native
ebpf no JIT 13.6% slower than native
seccomp JIT 17.6% slower than native
seccomp no JIT 37% slower than native

The run time of the traditional seccomp filter grows as O(n) with the
number of syscalls that have discrete rulesets, whereas eBPF is O(1) for
any number of syscall filters.
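
The difference can be pictured with a toy userspace model, given here
only as a sketch: linear_lookup stands in for a filter that tests the
syscall number against one rule after another, and table_lookup stands
in for a dispatch indexed by syscall number, as a PROG_ARRAY plus tail
call would provide. Both function names are invented for illustration,
and neither is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Linear model: compare nr against each rule in order and record how
 * many comparisons were needed. The cost grows with the rule count.
 */
int linear_lookup(const int *rules, size_t n, int nr, size_t *steps)
{
	size_t i;

	assert(steps != NULL);
	*steps = 0;
	for (i = 0; i < n; i++) {
		(*steps)++;
		if (rules[i] == nr)
			return 1;
	}
	return 0;
}

/*
 * Table model: one bounds check and one indexed load, independent of
 * how many syscalls have an entry.
 */
int table_lookup(const unsigned char *table, size_t n, int nr)
{
	if (nr < 0 || (size_t)nr >= n)
		return 0;
	return table[nr];
}
```

With four rules, a match on the last one costs four comparisons in the
linear model, while the table model does the same amount of work for
every syscall number.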

Changes since v2:
  * Rename sample
  * Code cleanup
Changes since v1:
  * Use a flag to indicate loading an eBPF filter, not a separate command
  * Remove printk helper
  * Remove ptrace patch / restore filter / sample
  * Add some safe helpers

Sargun Dhillon (2):
  bpf, seccomp: Add eBPF filter capabilities
  bpf: Add eBPF seccomp sample programs

 arch/Kconfig                    |   8 ++
 include/linux/bpf_types.h       |   3 +
 include/linux/seccomp.h         |   3 +-
 include/uapi/linux/bpf.h        |   2 +
 include/uapi/linux/seccomp.h    |   7 +-
 kernel/bpf/syscall.c            |   1 +
 kernel/seccomp.c                | 159 ++++++++++++++++++++++++++++++++++------
 samples/bpf/Makefile            |   5 ++
 samples/bpf/bpf_load.c          |   9 ++-
 samples/bpf/test_seccomp_kern.c |  41 +++++++++++
 samples/bpf/test_seccomp_user.c |  46 ++++++++++++
 11 files changed, 255 insertions(+), 29 deletions(-)
 create mode 100644 samples/bpf/test_seccomp_kern.c
 create mode 100644 samples/bpf/test_seccomp_user.c

-- 
2.14.1


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found] ` <20180226072651.GA27045-du9IEJ8oIxHXYT48pCVpJ3c7ZZ+wIVaZYkHkVr5ML8kVGlcevz2xqA@public.gmane.org>
@ 2018-02-26 23:04   ` Alexei Starovoitov
  2018-02-26 23:20     ` Kees Cook
  2018-02-27  0:01     ` Sargun Dhillon
  0 siblings, 2 replies; 29+ messages in thread
From: Alexei Starovoitov @ 2018-02-26 23:04 UTC (permalink / raw)
  To: Sargun Dhillon
  Cc: wad-F7+t8E8rja9g9hUCZPvPmw, keescook-F7+t8E8rja9g9hUCZPvPmw,
	daniel-FeC+5ew28dpmcu3hnIyYJQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	ast-DgEjT+Ai2ygdnm+yROfE0A, luto-kltTT9wpgjJwATOyAt5JVQ

On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
> This patchset enables seccomp filters to be written in eBPF. Although, this
> patchset doesn't introduce much of the functionality enabled by eBPF, it lays
> the ground work for it. Currently, you have to disable CHECKPOINT_RESTORE
> support in order to utilize eBPF seccomp filters, as eBPF filters cannot be
> retrieved via the ptrace GET_FILTER API.

this was discussed multiple times in the past.
In eBPF land it's practically impossible to do checkpoint/restore
of the whole bpf program/map graph.

> Any user can load a bpf seccomp filter program, and it can be pinned and
> reused without requiring access to the bpf syscalls. A user only requires
> the traditional permissions of either being cap_sys_admin, or have
> no_new_privs set in order to install their rule.
> 
> The primary reason for not adding maps support in this patchset is
> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
> If we have a map that the BPF program can read, it can potentially
> "change" privileges after running. It seems like doing writes only
> is safe, because it can be pure, and side effect free, and therefore
> not negatively effect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
> to an agreement, this can be in a follow-up patchset.

readonly maps already exist. See BPF_F_RDONLY.
Is that not enough?

> A benchmark of this patchset is as follows for a very standard eBPF filter:
> 
> Given this test program:
> for (i = 10; i < 99999999; i++) syscall(__NR_getpid);
> 
> If I implement an eBPF filter with PROG_ARRAYs with a program per syscall,
> and tail call, the numbers are such:
> ebpf JIT 12.3% slower than native
> ebpf no JIT 13.6% slower than native
> seccomp JIT 17.6% slower than native
> seccomp no JIT 37% slower than native

the perf gains are misleading, since patches don't enable bpf_tail_call.

The main statement I want to hear from seccomp maintainers before
proceeding any further on this is that enabling eBPF in seccomp won't lead
to seccomp folks arguing against changes in bpf core (like verifier)
just because it's used by seccomp.
It must be spelled out in the commit log with explicit Ack.


* Re: [net-next v3 0/2] eBPF seccomp filters
  2018-02-26 23:04   ` Alexei Starovoitov
@ 2018-02-26 23:20     ` Kees Cook
       [not found]       ` <CAGXu5jLdOcrn16q9pQ7JwTf88AVsL0o5LMJ=4P6vRN36u-_k_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-02-27  0:01     ` Sargun Dhillon
  1 sibling, 1 reply; 29+ messages in thread
From: Kees Cook @ 2018-02-26 23:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Andy Lutomirski,
	Sargun Dhillon

On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
<alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>> This patchset enables seccomp filters to be written in eBPF. Although, this
>> [...]
> The main statement I want to hear from seccomp maintainers before
> proceeding any further on this that enabling eBPF in seccomp won't lead
> to seccomp folks arguing against changes in bpf core (like verifier)
> just because it's used by seccomp.
> It must be spelled out in the commit log with explicit Ack.

The primary thing I'm concerned about with eBPF and seccomp is
side-effects from eBPF programs running at syscall time. This is an
extremely sensitive area, and I want to be sure there won't be
feature-creep here that leads to seccomp getting into a bad state.

As long as seccomp can continue to have its own verifier, I *think* this
will be fine, though, again, I remain concerned about maps, etc. I'm
still reviewing these patches and how they might provide overlap with
Tycho's needs too, etc.

-Kees

-- 
Kees Cook
Pixel Security


* Re: [net-next v3 0/2] eBPF seccomp filters
  2018-02-26 23:04   ` Alexei Starovoitov
  2018-02-26 23:20     ` Kees Cook
@ 2018-02-27  0:01     ` Sargun Dhillon
       [not found]       ` <CAMp4zn_Qe0aXhxNzpETBABAhKWF2WkZXnpzrJczbD=6k42OydA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Sargun Dhillon @ 2018-02-27  0:01 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, netdev,
	Linux Containers, Alexei Starovoitov, Andy Lutomirski

On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
<alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>> This patchset enables seccomp filters to be written in eBPF. Although, this
>> patchset doesn't introduce much of the functionality enabled by eBPF, it lays
>> the ground work for it. Currently, you have to disable CHECKPOINT_RESTORE
>> support in order to utilize eBPF seccomp filters, as eBPF filters cannot be
>> retrieved via the ptrace GET_FILTER API.
>
> this was discussed multiple times in the past.
> In eBPF land it's practically impossible to do checkpoint/restore
> of the whole bpf program/map graph.
>
>> Any user can load a bpf seccomp filter program, and it can be pinned and
>> reused without requiring access to the bpf syscalls. A user only requires
>> the traditional permissions of either being cap_sys_admin, or have
>> no_new_privs set in order to install their rule.
>>
>> The primary reason for not adding maps support in this patchset is
>> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
>> If we have a map that the BPF program can read, it can potentially
>> "change" privileges after running. It seems like doing writes only
>> is safe, because it can be pure, and side effect free, and therefore
>> not negatively effect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
>> to an agreement, this can be in a follow-up patchset.
>
> readonly maps already exist. See BPF_F_RDONLY.
> Is that not enough?
>
With BPF_F_RDONLY, is there a mechanism to populate a prog_array, and
then mark it rd_only?

>> A benchmark of this patchset is as follows for a very standard eBPF filter:
>>
>> Given this test program:
>> for (i = 10; i < 99999999; i++) syscall(__NR_getpid);
>>
>> If I implement an eBPF filter with PROG_ARRAYs with a program per syscall,
>> and tail call, the numbers are such:
>> ebpf JIT 12.3% slower than native
>> ebpf no JIT 13.6% slower than native
>> seccomp JIT 17.6% slower than native
>> seccomp no JIT 37% slower than native
>
> the perf gains are misleading, since patches don't enable bpf_tail_call.
>
> The main statement I want to hear from seccomp maintainers before
> proceeding any further on this that enabling eBPF in seccomp won't lead
> to seccomp folks arguing against changes in bpf core (like verifier)
> just because it's used by seccomp.
> It must be spelled out in the commit log with explicit Ack.
>


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]       ` <CAGXu5jLdOcrn16q9pQ7JwTf88AVsL0o5LMJ=4P6vRN36u-_k_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27  1:01         ` Tycho Andersen
  2018-02-27  3:46           ` Sargun Dhillon
  2018-02-27  4:19         ` Andy Lutomirski
  1 sibling, 1 reply; 29+ messages in thread
From: Tycho Andersen @ 2018-02-27  1:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Andy Lutomirski,
	Sargun Dhillon, Alexei Starovoitov

On Mon, Feb 26, 2018 at 03:20:15PM -0800, Kees Cook wrote:
> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
> >> This patchset enables seccomp filters to be written in eBPF. Although, this
> >> [...]
> > The main statement I want to hear from seccomp maintainers before
> > proceeding any further on this that enabling eBPF in seccomp won't lead
> > to seccomp folks arguing against changes in bpf core (like verifier)
> > just because it's used by seccomp.
> > It must be spelled out in the commit log with explicit Ack.
> 
> The primary thing I'm concerned about with eBPF and seccomp is
> side-effects from eBPF programs running at syscall time. This is an
> extremely sensitive area, and I want to be sure there won't be
> feature-creep here that leads to seccomp getting into a bad state.
> 
> As long as seccomp can continue have its own verifier,

I guess these patches should introduce some additional restrictions in
kernel/seccomp.c then? Based on my reading now, it's whatever the eBPF
verifier allows.

> I *think* this will be fine, though, again I remain concerned about
> maps, etc. I'm still reviewing these patches and how they might
> provide overlap with Tycho's needs too, etc.

Yes, it's on my TODO list to take a look at how to do it as suggested
by Alexei on top of this set before posting a v2. Haven't had time
recently, though.

Cheers,

Tycho


* Re: [net-next v3 0/2] eBPF seccomp filters
  2018-02-27  1:01         ` Tycho Andersen
@ 2018-02-27  3:46           ` Sargun Dhillon
       [not found]             ` <CAMp4zn9BAxv40q56PPsmvXcD000N4ZuAN3g=OF=od18_gT8UEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Sargun Dhillon @ 2018-02-27  3:46 UTC (permalink / raw)
  To: Tycho Andersen
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Andy Lutomirski,
	Alexei Starovoitov

On Mon, Feb 26, 2018 at 5:01 PM, Tycho Andersen <tycho-E0fblnxP3wo@public.gmane.org> wrote:
> On Mon, Feb 26, 2018 at 03:20:15PM -0800, Kees Cook wrote:
>> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
>> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>> >> This patchset enables seccomp filters to be written in eBPF. Although, this
>> >> [...]
>> > The main statement I want to hear from seccomp maintainers before
>> > proceeding any further on this that enabling eBPF in seccomp won't lead
>> > to seccomp folks arguing against changes in bpf core (like verifier)
>> > just because it's used by seccomp.
>> > It must be spelled out in the commit log with explicit Ack.
>>
>> The primary thing I'm concerned about with eBPF and seccomp is
>> side-effects from eBPF programs running at syscall time. This is an
>> extremely sensitive area, and I want to be sure there won't be
>> feature-creep here that leads to seccomp getting into a bad state.
>>
>> As long as seccomp can continue have its own verifier,
>
> I guess these patches should introduce some additional restrictions in
> kernel/seccomp.c then? Based on my reading now, it's whatever the eBPF
> verifier allows.
>
Like what? The helpers allowed are listed in seccomp.c. You have the
same restrictions as the traditional eBPF verifier (no unsafe memory
access, no backward jumps, etc.). I'm not sure which built-in eBPF
functionality presents risk.

>> I *think* this will be fine, though, again I remain concerned about
>> maps, etc. I'm still reviewing these patches and how they might
>> provide overlap with Tycho's needs too, etc.
>
> Yes, it's on my TODO list to take a look at how to do it as suggested
> by Alexi on top of this set before posting a v2. Haven't had time
> recently, though.
>
> Cheers,
>
> Tycho

There's a lot of interest (in general) in having a mechanism to send
notifications to userspace processes from eBPF for a variety of use
cases. I think that this would be valuable for more than just seccomp,
if it's implemented in a general-purpose manner.


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]             ` <CAMp4zn9BAxv40q56PPsmvXcD000N4ZuAN3g=OF=od18_gT8UEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27  4:01               ` Tycho Andersen
  0 siblings, 0 replies; 29+ messages in thread
From: Tycho Andersen @ 2018-02-27  4:01 UTC (permalink / raw)
  To: Sargun Dhillon
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Andy Lutomirski,
	Alexei Starovoitov

On Mon, Feb 26, 2018 at 07:46:19PM -0800, Sargun Dhillon wrote:
> On Mon, Feb 26, 2018 at 5:01 PM, Tycho Andersen <tycho-E0fblnxP3wo@public.gmane.org> wrote:
> > On Mon, Feb 26, 2018 at 03:20:15PM -0800, Kees Cook wrote:
> >> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
> >> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> > On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
> >> >> This patchset enables seccomp filters to be written in eBPF. Although, this
> >> >> [...]
> >> > The main statement I want to hear from seccomp maintainers before
> >> > proceeding any further on this that enabling eBPF in seccomp won't lead
> >> > to seccomp folks arguing against changes in bpf core (like verifier)
> >> > just because it's used by seccomp.
> >> > It must be spelled out in the commit log with explicit Ack.
> >>
> >> The primary thing I'm concerned about with eBPF and seccomp is
> >> side-effects from eBPF programs running at syscall time. This is an
> >> extremely sensitive area, and I want to be sure there won't be
> >> feature-creep here that leads to seccomp getting into a bad state.
> >>
> >> As long as seccomp can continue have its own verifier,
> >
> > I guess these patches should introduce some additional restrictions in
> > kernel/seccomp.c then? Based on my reading now, it's whatever the eBPF
> > verifier allows.
> >
> Like what? The helpers allowed are listed in seccomp.c. You have the
> same restrictions as the traditional eBPF verifier (no unsafe memory
> access, jumps backwards, etc..). I'm not sure which built-in eBPF
> functionality presents risk.

I think that's the $64,000 question that Kees is trying to answer
regarding maps, etc.

There's also the possibility that eBPF grows something new
that's unsafe for seccomp.

Cheers,

Tycho


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]       ` <CAGXu5jLdOcrn16q9pQ7JwTf88AVsL0o5LMJ=4P6vRN36u-_k_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-02-27  1:01         ` Tycho Andersen
@ 2018-02-27  4:19         ` Andy Lutomirski
       [not found]           ` <CALCETrXNODxWkcwF-LbXBn+Ju7QJEyi3JR+spsRX4ecg8d1iMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2018-02-27  4:19 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Sargun Dhillon,
	Alexei Starovoitov

> On Feb 26, 2018, at 3:20 PM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>
> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>> [...]
>> The main statement I want to hear from seccomp maintainers before
>> proceeding any further on this that enabling eBPF in seccomp won't lead
>> to seccomp folks arguing against changes in bpf core (like verifier)
>> just because it's used by seccomp.
>> It must be spelled out in the commit log with explicit Ack.
>
> The primary thing I'm concerned about with eBPF and seccomp is
> side-effects from eBPF programs running at syscall time. This is an
> extremely sensitive area, and I want to be sure there won't be
> feature-creep here that leads to seccomp getting into a bad state.
>
> As long as seccomp can continue have its own verifier, I *think* this
> will be fine, though, again I remain concerned about maps, etc. I'm
> still reviewing these patches and how they might provide overlap with
> Tycho's needs too, etc.

I'm not sure I see this as a huge problem.  As far as I can see, there
are three ways that a verifier change could be problematic:

1. Addition of a new type of map.  But seccomp would just not allow
new map types by default, right?

2. Addition of a new BPF_CALLable helper.  Seccomp wants a way to
whitelist BPF_CALL targets.  That should be straightforward.

3. Straight-up bugs.  Those are exactly as problematic as verifier
bugs in any other unprivileged eBPF program type, right?  I don't see
why seccomp is special here.
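
Points 1 and 2 amount to a deny-by-default list. A rough userspace
sketch of that idea follows; the enum values and the function name
demo_seccomp_helper_allowed are hypothetical and do not correspond to
the kernel's helper IDs or to anything in this patchset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper IDs, for illustration only. */
enum demo_helper_id {
	DEMO_HELPER_GET_UID,
	DEMO_HELPER_KTIME,
	DEMO_HELPER_MAP_UPDATE,
	DEMO_HELPER_NEW_FEATURE,	/* stands in for a helper added later */
	DEMO_HELPER_MAX,
};

/*
 * Only helpers named explicitly are usable from a seccomp program.
 * Anything else, including helpers added to the core later, falls
 * through to the default case and is rejected.
 */
bool demo_seccomp_helper_allowed(enum demo_helper_id id)
{
	assert(id < DEMO_HELPER_MAX);

	switch (id) {
	case DEMO_HELPER_GET_UID:
	case DEMO_HELPER_KTIME:
		return true;
	default:
		return false;
	}
}
```

The same shape would apply to map types: a new type is unavailable to
seccomp until someone adds it to the list on purpose.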


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]           ` <CALCETrXNODxWkcwF-LbXBn+Ju7QJEyi3JR+spsRX4ecg8d1iMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27  4:38             ` Kees Cook
       [not found]               ` <CAGXu5j+64WzxjBnpQxYCU50ak+VqVw1y0W+MWygFodxsDqEZRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Kees Cook @ 2018-02-27  4:38 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Sargun Dhillon,
	Alexei Starovoitov

On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> On Feb 26, 2018, at 3:20 PM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>>
>> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
>> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>>> [...]
>>> The main statement I want to hear from seccomp maintainers before
>>> proceeding any further on this that enabling eBPF in seccomp won't lead
>>> to seccomp folks arguing against changes in bpf core (like verifier)
>>> just because it's used by seccomp.
>>> It must be spelled out in the commit log with explicit Ack.
>>
>> The primary thing I'm concerned about with eBPF and seccomp is
>> side-effects from eBPF programs running at syscall time. This is an
>> extremely sensitive area, and I want to be sure there won't be
>> feature-creep here that leads to seccomp getting into a bad state.
>>
>> As long as seccomp can continue have its own verifier, I *think* this
>> will be fine, though, again I remain concerned about maps, etc. I'm
>> still reviewing these patches and how they might provide overlap with
>> Tycho's needs too, etc.
>
> I'm not sure I see this as a huge problem.  As far as I can see, there
> are three ways that a verifier change could be problematic:
>
> 1. Addition of a new type of map.  But seccomp would just not allow
> new map types by default, right?
>
> 2. Addition of a new BPF_CALLable helper.  Seccomp wants a way to
> whitelist BPF_CALL targets.  That should be straightforward.

Yup, agreed on 1 and 2.

> 3. Straight-up bugs.  Those are exactly as problematic as verifier
> bugs in any other unprivileged eBPF program type, right?  I don't see
> why seccomp is special here.

My concern is more about unintended design mistakes or other feature
creep with side-effects, especially when it comes to privileges and
synchronization. Getting no-new-privs done correctly, for example,
took some careful thought and discussion, and I'm wary given how painful
TSYNC was on the process locking side, and eBPF has had some rather
ugly flaws in the past (and recently: it was nice to be able to say
for Spectre that seccomp filters couldn't be constructed to make
attacks but eBPF could). Adding the complexity needs to be worth the
gain. I'm on board for doing it, I just want to be careful. :)

-Kees

-- 
Kees Cook
Pixel Security


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]               ` <CAGXu5j+64WzxjBnpQxYCU50ak+VqVw1y0W+MWygFodxsDqEZRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27  4:54                 ` Andy Lutomirski
       [not found]                   ` <A20EA7DD-94E9-488A-B9FF-D8E2C9F26611-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
  2018-02-27 14:53                   ` chris hyser
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2018-02-27  4:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, mic-WFhQfpSGs3bR7s880joybQ,
	Sargun Dhillon, Alexei Starovoitov



> On Feb 26, 2018, at 8:38 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Feb 26, 2018, at 3:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>> 
>>> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>>>> [...]
>>>> The main statement I want to hear from seccomp maintainers before
>>>> proceeding any further on this that enabling eBPF in seccomp won't lead
>>>> to seccomp folks arguing against changes in bpf core (like verifier)
>>>> just because it's used by seccomp.
>>>> It must be spelled out in the commit log with explicit Ack.
>>> 
>>> The primary thing I'm concerned about with eBPF and seccomp is
>>> side-effects from eBPF programs running at syscall time. This is an
>>> extremely sensitive area, and I want to be sure there won't be
>>> feature-creep here that leads to seccomp getting into a bad state.
>>> 
>>> As long as seccomp can continue have its own verifier, I *think* this
>>> will be fine, though, again I remain concerned about maps, etc. I'm
>>> still reviewing these patches and how they might provide overlap with
>>> Tycho's needs too, etc.
>> 
>> I'm not sure I see this as a huge problem.  As far as I can see, there
>> are three ways that a verifier change could be problematic:
>> 
>> 1. Addition of a new type of map.  But seccomp would just not allow
>> new map types by default, right?
>> 
>> 2. Addition of a new BPF_CALLable helper.  Seccomp wants a way to
>> whitelist BPF_CALL targets.  That should be straightforward.
> 
> Yup, agreed on 1 and 2.
> 
>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>> bugs in any other unprivileged eBPF program type, right?  I don't see
>> why seccomp is special here.
> 
> My concern is more about unintended design mistakes or other feature
> creep with side-effects, especially when it comes to privileges and
> synchronization. Getting no-new-privs done correctly, for example,
> took some careful thought and discussion, and I'm shy from how painful
> TSYNC was on the process locking side, and eBPF has had some rather
> ugly flaws in the past (and recently: it was nice to be able to say
> for Spectre that seccomp filters couldn't be constructed to make
> attacks but eBPF could). Adding the complexity needs to be worth the
> gain. I'm on board for doing it, I just want to be careful. :)
> 

I agree.  I think that, if we do this right, we get a clean version of
Tycho's notifiers.  We can also very easily build on that to send a
non-blocking message to the notifier fd, which gets us a version of
seccomp logging that works for things like Chromium and even strace.
I think this is worth it.

I also think this sort of argument is why Mickaël's privileged-first
Landlock is the wrong approach.  By getting the unprivileged parts right
from day one, we can carefully extend the mechanism and keep it usable
by unprivileged apps.  But, if we'd started as root-only, fixing up
everything needed to make it safe for unprivileged users after the fact
would have been quite messy.

And the considerations for making eBPF safe for use by unprivileged
tasks to filter their descendants are more or less the same for seccomp
and Landlock.  Can we please arrange things so we solve this problem
only once?


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]       ` <CAMp4zn_Qe0aXhxNzpETBABAhKWF2WkZXnpzrJczbD=6k42OydA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27  9:28         ` Daniel Borkmann
  0 siblings, 0 replies; 29+ messages in thread
From: Daniel Borkmann @ 2018-02-27  9:28 UTC (permalink / raw)
  To: Sargun Dhillon, Alexei Starovoitov
  Cc: Will Drewry, Kees Cook, netdev, Linux Containers,
	Alexei Starovoitov, Andy Lutomirski

On 02/27/2018 01:01 AM, Sargun Dhillon wrote:
> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>> patchset doesn't introduce much of the functionality enabled by eBPF, it lays
>>> the ground work for it. Currently, you have to disable CHECKPOINT_RESTORE
>>> support in order to utilize eBPF seccomp filters, as eBPF filters cannot be
>>> retrieved via the ptrace GET_FILTER API.
>>
>> this was discussed multiple times in the past.
>> In eBPF land it's practically impossible to do checkpoint/restore
>> of the whole bpf program/map graph.
>>
>>> Any user can load a bpf seccomp filter program, and it can be pinned and
>>> reused without requiring access to the bpf syscalls. A user only requires
>>> the traditional permissions of either being cap_sys_admin, or have
>>> no_new_privs set in order to install their rule.
>>>
>>> The primary reason for not adding maps support in this patchset is
>>> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
>>> If we have a map that the BPF program can read, it can potentially
>>> "change" privileges after running. It seems like doing writes only
>>> is safe, because it can be pure, and side effect free, and therefore
>>> not negatively effect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
>>> to an agreement, this can be in a follow-up patchset.
>>
>> readonly maps already exist. See BPF_F_RDONLY.
>> Is that not enough?
>>
> With BPF_F_RDONLY, is there a mechanism to populate a prog_array, and
> then mark it rd_only?

This would still need to be extended for this purpose. Right now this is
either set on map creation (e.g. such that only prog itself can update the
entries) or obj_get. So you'd need a mechanism that sets flags into rdonly
mode where once set it cannot be undone anymore for the remaining lifetime
of the map.
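
As a sketch of the semantics only, such a one-way switch could behave
like the userspace model below. The struct and function names
(demo_map, demo_map_update, demo_map_freeze) are invented for
illustration and are not an existing bpf interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAP_SLOTS 8

/* Toy stand-in for a map that can be populated and then sealed. */
struct demo_map {
	int slots[DEMO_MAP_SLOTS];
	bool frozen;	/* one-way: set once, never cleared */
};

/* Update a slot; refused with -EPERM once the map is frozen. */
int demo_map_update(struct demo_map *m, size_t idx, int value)
{
	assert(m != NULL);

	if (m->frozen)
		return -EPERM;
	if (idx >= DEMO_MAP_SLOTS)
		return -E2BIG;
	m->slots[idx] = value;
	return 0;
}

/* Seal the map. There is deliberately no matching "unfreeze". */
void demo_map_freeze(struct demo_map *m)
{
	assert(m != NULL);
	m->frozen = true;
}
```

The intended order would be: populate the prog_array, freeze it, and
only then attach the filter, so the contents cannot change under a
running filter.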

>>> A benchmark of this patchset is as follows for a very standard eBPF filter:
>>>
>>> Given this test program:
>>> for (i = 10; i < 99999999; i++) syscall(__NR_getpid);
>>>
>>> If I implement an eBPF filter with PROG_ARRAYs with a program per syscall,
>>> and tail call, the numbers are such:
>>> ebpf JIT 12.3% slower than native
>>> ebpf no JIT 13.6% slower than native
>>> seccomp JIT 17.6% slower than native
>>> seccomp no JIT 37% slower than native
>>
>> the perf gains are misleading, since patches don't enable bpf_tail_call.
>>
>> The main statement I want to hear from seccomp maintainers before
>> proceeding any further on this that enabling eBPF in seccomp won't lead
>> to seccomp folks arguing against changes in bpf core (like verifier)
>> just because it's used by seccomp.
>> It must be spelled out in the commit log with explicit Ack.

Fully agree.


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]               ` <CAGXu5j+64WzxjBnpQxYCU50ak+VqVw1y0W+MWygFodxsDqEZRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27 14:53                   ` chris hyser
  2018-02-27 14:53                   ` chris hyser
  1 sibling, 0 replies; 29+ messages in thread
From: chris hyser @ 2018-02-27 14:53 UTC (permalink / raw)
  To: Kees Cook, Andy Lutomirski
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Sargun Dhillon,
	Alexei Starovoitov

On 02/26/2018 11:38 PM, Kees Cook wrote:
> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>> bugs in any other unprivileged eBPF program type, right?  I don't see
>> why seccomp is special here.
> 
> My concern is more about unintended design mistakes or other feature
> creep with side-effects, especially when it comes to privileges and
> synchronization. Getting no-new-privs done correctly, for example,
> took some careful thought and discussion, and I'm shy from how painful
> TSYNC was on the process locking side, and eBPF has had some rather
> ugly flaws in the past (and recently: it was nice to be able to say
> for Spectre that seccomp filters couldn't be constructed to make
> attacks but eBPF could). Adding the complexity needs to be worth the
> gain. I'm on board for doing it, I just want to be careful. :)


Another option might be to remove c/eBPF from the equation altogether. c/eBPF
allows flexibility, and that almost always comes at the cost of additional
security risk. Seccomp is for enhanced security, yes? How about a new seccomp
mode that passes in something like a bit vector or hashmap for "simple"
white/black list checks validated by kernel code, versus user-provided
interpreted code? Of course, this removes a fair number of things you can
currently do or would be able to do with eBPF. Restated from a security point
of view, though, it removes a fair number of things an _attacker_ can do.
Presumably the performance improvement would also be significant.
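
A minimal sketch of the bit-vector variant, written as plain user-space C
rather than kernel code. Everything here is an assumption made for
illustration: the `toy_` names, the 512-entry bound on syscall numbers, and
the choice of a default-deny allowlist.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_SYSCALLS 512	/* hypothetical upper bound on syscall numbers */
#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS ((TOY_NR_SYSCALLS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

/* One bit per syscall number: 1 = allowed, 0 = denied. */
struct toy_allowlist {
	unsigned long bits[TOY_WORDS];
};

/* Start from "deny everything". */
static void toy_allowlist_init(struct toy_allowlist *al)
{
	assert(al != NULL);
	memset(al, 0, sizeof(*al));
}

/* Mark one syscall number as allowed; out-of-range numbers are rejected. */
static bool toy_allowlist_add(struct toy_allowlist *al, long nr)
{
	size_t i;

	if (nr < 0 || nr >= TOY_NR_SYSCALLS)
		return false;
	i = (size_t)nr;
	al->bits[i / TOY_BITS_PER_WORD] |= 1UL << (i % TOY_BITS_PER_WORD);
	return true;
}

/* The whole "filter": a bounds check plus a single bit test. */
static bool toy_syscall_allowed(const struct toy_allowlist *al, long nr)
{
	size_t i;

	if (nr < 0 || nr >= TOY_NR_SYSCALLS)
		return false;
	i = (size_t)nr;
	return (al->bits[i / TOY_BITS_PER_WORD] >> (i % TOY_BITS_PER_WORD)) & 1UL;
}
```

The check is constant-time and runs no user-supplied code, which is the
attraction; the cost is that nothing beyond the syscall number (arguments,
for instance) can be inspected.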

Is this an idea worth prototyping?

-chrish

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                   ` <db759dd2-31dc-d094-251d-d4c1e8af8704-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2018-02-27 16:00                     ` Kees Cook
       [not found]                       ` <CAGXu5j+idW9AjZHVdeedqLOFXriObUJLvcw8-9k5WxyQF8EWrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Kees Cook @ 2018-02-27 16:00 UTC (permalink / raw)
  To: chris hyser
  Cc: Will Drewry, Daniel Borkmann, Netdev, Linux Containers,
	Alexei Starovoitov, Andy Lutomirski, Sargun Dhillon,
	Alexei Starovoitov

On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>
>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>> wrote:
>>>
>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>> why seccomp is special here.
>>
>>
>> My concern is more about unintended design mistakes or other feature
>> creep with side-effects, especially when it comes to privileges and
>> synchronization. Getting no-new-privs done correctly, for example,
>> took some careful thought and discussion, and I'm shy from how painful
>> TSYNC was on the process locking side, and eBPF has had some rather
>> ugly flaws in the past (and recently: it was nice to be able to say
>> for Spectre that seccomp filters couldn't be constructed to make
>> attacks but eBPF could). Adding the complexity needs to be worth the
>> gain. I'm on board for doing it, I just want to be careful. :)
>
>
>
> Another option might be to remove c/eBPF from the equation all together.
> c/eBPF allows flexibility and that almost always comes at the cost of
> additional security risk. Seccomp is for enhanced security yes? How about a
> new seccomp mode that passes in something like a bit vector or hashmap for
> "simple" white/black list checks validated by kernel code, versus user
> provided interpreted code? Of course this removes a fair number of things
> you can currently do or would be able to do with eBPF. Of course, restated
> from a security point of view, this removes a fair number of things an
> _attacker_ can do. Presumably the performance improvement would also be
> significant.
>
> Is this an idea worth prototyping?

That was the original prototype for seccomp-filter. :) The discussion
around that from years ago basically boiled down to it being
inflexible. Given all the things people want to do at syscall time,
that continues to be true. So true, in fact, that here we are now,
trying to move to eBPF from cBPF. ;)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                       ` <CAGXu5j+idW9AjZHVdeedqLOFXriObUJLvcw8-9k5WxyQF8EWrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27 16:59                         ` chris hyser
       [not found]                           ` <ddbefdda-f3b8-3956-fa0f-dcba8cf8e7d9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: chris hyser @ 2018-02-27 16:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Netdev, Linux Containers,
	Alexei Starovoitov, Andy Lutomirski, Sargun Dhillon,
	Alexei Starovoitov

On 02/27/2018 11:00 AM, Kees Cook wrote:
> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>
>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>> wrote:
>>>>
>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>> why seccomp is special here.
>>>
>>>
>>> My concern is more about unintended design mistakes or other feature
>>> creep with side-effects, especially when it comes to privileges and
>>> synchronization. Getting no-new-privs done correctly, for example,
>>> took some careful thought and discussion, and I'm shy from how painful
>>> TSYNC was on the process locking side, and eBPF has had some rather
>>> ugly flaws in the past (and recently: it was nice to be able to say
>>> for Spectre that seccomp filters couldn't be constructed to make
>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>> gain. I'm on board for doing it, I just want to be careful. :)
>>
>>
>>
>> Another option might be to remove c/eBPF from the equation all together.
>> c/eBPF allows flexibility and that almost always comes at the cost of
>> additional security risk. Seccomp is for enhanced security yes? How about a
>> new seccomp mode that passes in something like a bit vector or hashmap for
>> "simple" white/black list checks validated by kernel code, versus user
>> provided interpreted code? Of course this removes a fair number of things
>> you can currently do or would be able to do with eBPF. Of course, restated
>> from a security point of view, this removes a fair number of things an
>> _attacker_ can do. Presumably the performance improvement would also be
>> significant.
>>
>> Is this an idea worth prototyping?
> 
> That was the original prototype for seccomp-filter. :) The discussion
> around that from years ago basically boiled down to it being
> inflexible. Given all the things people want to do at syscall time,
> that continues to be true. So true, in fact, that here we are now,
> trying to move to eBPF from cBPF. ;)

I will try to find that discussion. As someone pointed out here though, eBPF
is being used by more and more people in areas where security is not the
primary concern. Differing objectives will make this a long-term, continuing
issue. We ourselves were looking at eBPF simply as a means to use a hashmap
for a white/blacklist, i.e. for performance, not flexibility.

-chrish

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                           ` <ddbefdda-f3b8-3956-fa0f-dcba8cf8e7d9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2018-02-27 19:19                             ` Kees Cook
       [not found]                               ` <CAGXu5jKnk90Yruhx_=t8yW2ziLaubqW80pxB95g5W_XnMuT1mA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-02-27 21:58                             ` Daniel Borkmann
  1 sibling, 1 reply; 29+ messages in thread
From: Kees Cook @ 2018-02-27 19:19 UTC (permalink / raw)
  To: chris hyser
  Cc: Will Drewry, Daniel Borkmann, Netdev, Linux Containers,
	Alexei Starovoitov, Andy Lutomirski, Sargun Dhillon,
	Alexei Starovoitov

On Tue, Feb 27, 2018 at 8:59 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>
>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> wrote:
>>>
>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>
>>>>
>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>> wrote:
>>>>>
>>>>>
>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>> why seccomp is special here.
>>>>
>>>>
>>>>
>>>> My concern is more about unintended design mistakes or other feature
>>>> creep with side-effects, especially when it comes to privileges and
>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>> took some careful thought and discussion, and I'm shy from how painful
>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>
>>>
>>>
>>>
>>> Another option might be to remove c/eBPF from the equation all together.
>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>> additional security risk. Seccomp is for enhanced security yes? How about
>>> a
>>> new seccomp mode that passes in something like a bit vector or hashmap
>>> for
>>> "simple" white/black list checks validated by kernel code, versus user
>>> provided interpreted code? Of course this removes a fair number of things
>>> you can currently do or would be able to do with eBPF. Of course,
>>> restated
>>> from a security point of view, this removes a fair number of things an
>>> _attacker_ can do. Presumably the performance improvement would also be
>>> significant.
>>>
>>> Is this an idea worth prototyping?
>>
>>
>> That was the original prototype for seccomp-filter. :) The discussion
>> around that from years ago basically boiled down to it being
>> inflexible. Given all the things people want to do at syscall time,
>> that continues to be true. So true, in fact, that here we are now,
>> trying to move to eBPF from cBPF. ;)
>
>
> I will try to find that discussion. As someone pointed out here though, eBPF

A good starting point might be this:
https://lwn.net/Articles/441232/

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                               ` <CAGXu5jKnk90Yruhx_=t8yW2ziLaubqW80pxB95g5W_XnMuT1mA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-27 21:22                                 ` chris hyser
  0 siblings, 0 replies; 29+ messages in thread
From: chris hyser @ 2018-02-27 21:22 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Netdev, Linux Containers,
	Alexei Starovoitov, Andy Lutomirski, Sargun Dhillon,
	Alexei Starovoitov

On 02/27/2018 02:19 PM, Kees Cook wrote:
> On Tue, Feb 27, 2018 at 8:59 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>> I will try to find that discussion. As someone pointed out here though, eBPF
> 
> A good starting point might be this:
> https://lwn.net/Articles/441232/

Thanks. A fair amount of reading referenced there :-). In particular I'll be curious to find out what happened to this idea:

"Essentially, that would make for three choices for each system call: enabled, disabled, or filtered."

Something like that might address some of the security concerns, in that a
simple go/no-go on syscall number need incur neither the performance hit nor
the increased attack surface of running c/eBPF code, but it is there for
argument checking, etc. if you need it. Basically, instead of the kernel
making the flexibility/performance/security trade-off in advance, you leave
it to user code/policy.
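
As a rough illustration of the "enabled, disabled, or filtered" split, the
sketch below keeps a per-syscall action table and only calls into a filter
for entries marked as filtered. It is user-space C with made-up names
(`toy_policy`, `toy_policy_check()`, `toy_first_arg_is_zero()`); the filter
callback merely stands in for whatever c/eBPF program would run in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SYSCALLS 512	/* hypothetical upper bound on syscall numbers */

/* Per-syscall decision; zero-initialised tables therefore deny by default. */
enum toy_action {
	TOY_DENY = 0,
	TOY_ALLOW,
	TOY_FILTER,
};

/* Stand-in for the expensive path (argument inspection by a filter). */
typedef bool (*toy_filter_fn)(long nr, const unsigned long *args);

struct toy_policy {
	unsigned char action[TOY_NR_SYSCALLS];
	toy_filter_fn filter;	/* consulted only for TOY_FILTER entries */
};

/* Example filter, purely illustrative: accept only a zero first argument. */
static bool toy_first_arg_is_zero(long nr, const unsigned long *args)
{
	(void)nr;
	return args != NULL && args[0] == 0;
}

/* Cheap table lookup first; the filter runs only when the policy asks. */
static bool toy_policy_check(const struct toy_policy *p, long nr,
			     const unsigned long *args)
{
	assert(p != NULL);
	if (nr < 0 || nr >= TOY_NR_SYSCALLS)
		return false;

	switch (p->action[nr]) {
	case TOY_ALLOW:
		return true;
	case TOY_FILTER:
		return p->filter != NULL && p->filter(nr, args);
	default:
		return false;
	}
}
```

With a table like this, a policy that never marks anything as filtered never
executes filter code at all, so the trade-off is made per syscall by whoever
writes the policy.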

Anyway, in case it is not clear :-), I think your instincts on security and
eBPF are dead on. At the same time, eBPF is powerful and useful. So, how to
make it optional?

-chrish

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                           ` <ddbefdda-f3b8-3956-fa0f-dcba8cf8e7d9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  2018-02-27 19:19                             ` Kees Cook
@ 2018-02-27 21:58                             ` Daniel Borkmann
       [not found]                               ` <f712a383-8e84-da64-a454-51fdebf28741-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Daniel Borkmann @ 2018-02-27 21:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Will Drewry, Netdev, Linux Containers, Alexei Starovoitov,
	Andy Lutomirski, Sargun Dhillon, Alexei Starovoitov

On 02/27/2018 05:59 PM, chris hyser wrote:
> On 02/27/2018 11:00 AM, Kees Cook wrote:
>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser@oracle.com> wrote:
>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net>
>>>> wrote:
>>>>>
>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>> why seccomp is special here.
>>>>
>>>> My concern is more about unintended design mistakes or other feature
>>>> creep with side-effects, especially when it comes to privileges and
>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>> took some careful thought and discussion, and I'm shy from how painful
>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>> attacks but eBPF could). Adding the complexity needs to be worth the

Well, not really. One part of the Spectre mitigations that went upstream
from the BPF side was an option to remove the interpreter entirely, and that
also relates to seccomp eventually. But other than that, an attacker might
just as well find useful gadgets inside seccomp or any other code inside the
kernel, so it's not a strict necessity either.

>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>
>>> Another option might be to remove c/eBPF from the equation all together.
>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>> "simple" white/black list checks validated by kernel code, versus user
>>> provided interpreted code? Of course this removes a fair number of things
>>> you can currently do or would be able to do with eBPF. Of course, restated
>>> from a security point of view, this removes a fair number of things an
>>> _attacker_ can do. Presumably the performance improvement would also be
>>> significant.

Good luck with not breaking existing applications relying on seccomp out
there.

>>> Is this an idea worth prototyping?
>>
>> That was the original prototype for seccomp-filter. :) The discussion
>> around that from years ago basically boiled down to it being
>> inflexible. Given all the things people want to do at syscall time,
>> that continues to be true. So true, in fact, that here we are now,
>> trying to move to eBPF from cBPF. ;)

Right, agree. cBPF is also pretty much frozen these days and, aside from
that, seccomp/BPF only uses a proper subset of it. I wouldn't mind doing
something similar on the eBPF side as long as it is reasonably maintainable
and does not make the BPF core more complex, but most of it can already be
set in the verifier anyway based on prog type. Note that performance of
seccomp/BPF is definitely in demand as well, which is why people still extend
the old remaining cBPF JITs today so that seccomp filters can be JITed from
there too.
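
To illustrate what "set in the verifier based on prog type" could look like,
here is a small user-space model of a per-program-type helper allowlist. It
does not reproduce the kernel's verifier interfaces: the enums, the helper
names and the particular set granted to the seccomp-like type are all
assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical program types; not the kernel's enum bpf_prog_type. */
enum toy_prog_type {
	TOY_PROG_TRACING,
	TOY_PROG_SECCOMP,
};

/* Hypothetical helper IDs; not the kernel's BPF_FUNC_* values. */
enum toy_helper {
	TOY_HELPER_MAP_LOOKUP,
	TOY_HELPER_MAP_UPDATE,
	TOY_HELPER_TAIL_CALL,
	TOY_HELPER_KTIME_GET_NS,
	TOY_HELPER_MAX,
};

/*
 * Decide at load time whether a program of the given type may call the
 * given helper. The restricted type gets a deliberately tiny set, so a
 * helper added later is rejected for it until someone opts it in.
 */
static bool toy_helper_allowed(enum toy_prog_type type, enum toy_helper h)
{
	switch (type) {
	case TOY_PROG_TRACING:
		return h < TOY_HELPER_MAX;
	case TOY_PROG_SECCOMP:
		/* Assumed policy for the sketch: side-effect-free helpers only. */
		return h == TOY_HELPER_KTIME_GET_NS;
	}
	return false;
}
```

Because the decision is keyed on program type, widening the helper set for
one type leaves the restricted type untouched.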

> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.

Not really, security of the verifier and BPF infra in general is at the top
of the list; it's fundamental to the underlying concept. That it is also
heavily used in tracing and networking only shows that the concept is
flexible enough to be applied in multiple areas.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                               ` <f712a383-8e84-da64-a454-51fdebf28741-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
@ 2018-02-27 22:20                                 ` chris hyser
       [not found]                                   ` <7fc0fab8-c1bc-bc76-a892-b3faab7d16ad-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: chris hyser @ 2018-02-27 22:20 UTC (permalink / raw)
  To: Daniel Borkmann, Kees Cook
  Cc: Will Drewry, Netdev, Linux Containers, Alexei Starovoitov,
	Andy Lutomirski, Sargun Dhillon, Alexei Starovoitov

On 02/27/2018 04:58 PM, Daniel Borkmann wrote:
> On 02/27/2018 05:59 PM, chris hyser wrote:
>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser@oracle.com> wrote:
>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net>
>>>>> wrote:
>>>>>>
>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>> why seccomp is special here.
>>>>>
>>>>> My concern is more about unintended design mistakes or other feature
>>>>> creep with side-effects, especially when it comes to privileges and
>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
> 
> Well, not really. One part of all the Spectre mitigations that went upstream
> from BPF side was to have an option to remove interpreter entirely and that
> also relates to seccomp eventually. But other than that an attacker might
> potentially find as well useful gadgets inside seccomp or any other code
> that is inside the kernel, so it's not a strict necessity either.
> 
>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>
>>>> Another option might be to remove c/eBPF from the equation all together.
>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>> "simple" white/black list checks validated by kernel code, versus user
>>>> provided interpreted code? Of course this removes a fair number of things
>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>> from a security point of view, this removes a fair number of things an
>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>> significant.
> 
> Good luck with not breaking existing applications relying on seccomp out
> there.

This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old 
way. Now, does that make sense to do? That is the discussion.

> 
>>>> Is this an idea worth prototyping?
>>>
>>> That was the original prototype for seccomp-filter. :) The discussion
>>> around that from years ago basically boiled down to it being
>>> inflexible. Given all the things people want to do at syscall time,
>>> that continues to be true. So true, in fact, that here we are now,
>>> trying to move to eBPF from cBPF. ;)
> 
> Right, agree. cBPF is also pretty much frozen these days and aside from
> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
> doing something similar for eBPF side as long as this is reasonably
> maintainable and not making BPF core more complex, but most of it can
> already be set in the verifier anyway based on prog type. Note, that
> performance of seccomp/BPF is definitely a demand as well which is why
> people still extend the old remaining cBPF JITs today such that it can
> be JITed also from there.
> 
>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
> 
> Not really, security of verifier and BPF infra in general is on the top
> of the list, it's fundamental to the underlying concept and just because
> it is heavily used also in tracing and networking, it only shows that the
> concept is highly flexible that it can be applied in multiple areas.

Ok. Let me look into this a bit because this is the heart of the matter.

-chrish

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                   ` <A20EA7DD-94E9-488A-B9FF-D8E2C9F26611-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
@ 2018-02-27 23:10                     ` Mickaël Salaün
       [not found]                       ` <5323e010-09df-26d9-15f5-c723faa13224-WFhQfpSGs3bR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Mickaël Salaün @ 2018-02-27 23:10 UTC (permalink / raw)
  To: Andy Lutomirski, Kees Cook
  Cc: Will Drewry, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Sargun Dhillon,
	Alexei Starovoitov


On 27/02/2018 05:54, Andy Lutomirski wrote:
> 
> 
>> On Feb 26, 2018, at 8:38 PM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>>
>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>>> On Feb 26, 2018, at 3:20 PM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>>>>
>>>> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
>>>> <alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>>>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>>>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>>>>> [...]
>>>>> The main statement I want to hear from seccomp maintainers before
>>>>> proceeding any further on this that enabling eBPF in seccomp won't lead
>>>>> to seccomp folks arguing against changes in bpf core (like verifier)
>>>>> just because it's used by seccomp.
>>>>> It must be spelled out in the commit log with explicit Ack.
>>>>
>>>> The primary thing I'm concerned about with eBPF and seccomp is
>>>> side-effects from eBPF programs running at syscall time. This is an
>>>> extremely sensitive area, and I want to be sure there won't be
>>>> feature-creep here that leads to seccomp getting into a bad state.
>>>>
>>>> As long as seccomp can continue have its own verifier, I *think* this
>>>> will be fine, though, again I remain concerned about maps, etc. I'm
>>>> still reviewing these patches and how they might provide overlap with
>>>> Tycho's needs too, etc.
>>>
>>> I'm not sure I see this as a huge problem.  As far as I can see, there
>>> are three ways that a verifier change could be problematic:
>>>
>>> 1. Addition of a new type of map.  But seccomp would just not allow
>>> new map types by default, right?
>>>
>>> 2. Addition of a new BPF_CALLable helper.  Seccomp wants a way to
>>> whitelist BPF_CALL targets.  That should be straightforward.
>>
>> Yup, agreed on 1 and 2.
>>
>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>> why seccomp is special here.
>>
>> My concern is more about unintended design mistakes or other feature
>> creep with side-effects, especially when it comes to privileges and
>> synchronization. Getting no-new-privs done correctly, for example,
>> took some careful thought and discussion, and I'm shy from how painful
>> TSYNC was on the process locking side, and eBPF has had some rather
>> ugly flaws in the past (and recently: it was nice to be able to say
>> for Spectre that seccomp filters couldn't be constructed to make
>> attacks but eBPF could). Adding the complexity needs to be worth the
>> gain. I'm on board for doing it, I just want to be careful. :)
>>
> 
> I agree.  I think that, if we do this right, we get a clean version of Tycho's notifiers.  We can also very easily build on that to send a non-blocking message to the notifier fd, which gets us a version of seccomp logging that works for things like Chromium and even strace.  I think this is worth it.
> 
> I also think this sort of argument is why Mickaël's privileged-first Landlock is the wrong approach.  By getting the unprivileged parts right from day one, we can carefully extend the mechanism and keep it usable by unprivileged apps.  But, if we'd started as root-only, fixing up everything needed to make it safe for unprivileged users after the fact would have been quite messy.

We agreed (including Kees and you, at the Santa Fe LPC) to limit the use
of Landlock to CAP_SYS_ADMIN at first. It is an artificial limitation
that can be lifted by removing three explicit checks/lines. Landlock
was designed for unprivileged use from day one, and that is still the goal.

> 
> And the considerations for making eBPF safe for use by unprivileged tasks to filter their descendents are more or less the same for seccomp and Landlock.  Can we please arrange things so we solve this problem only once?
> 

Landlock is definitely focused on eBPF. It should not be hard to add a
new Landlock program type to mimic the seccomp filter checks (to use
eBPF features like maps), but I'm not sure I understand the use case here.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                       ` <5323e010-09df-26d9-15f5-c723faa13224-WFhQfpSGs3bR7s880joybQ@public.gmane.org>
@ 2018-02-27 23:11                         ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2018-02-27 23:11 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Network Development,
	Linux Containers, Alexei Starovoitov, Sargun Dhillon,
	Alexei Starovoitov

On Tue, Feb 27, 2018 at 11:10 PM, Mickaël Salaün <mic@digikod.net> wrote:
>
> On 27/02/2018 05:54, Andy Lutomirski wrote:
>>
>>
>>> On Feb 26, 2018, at 8:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>>
>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>> On Feb 26, 2018, at 3:20 PM, Kees Cook <keescook@chromium.org> wrote:
>>>>>
>>>>> On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>>>>>>> This patchset enables seccomp filters to be written in eBPF. Although, this
>>>>>>> [...]
>>>>>> The main statement I want to hear from seccomp maintainers before
>>>>>> proceeding any further on this that enabling eBPF in seccomp won't lead
>>>>>> to seccomp folks arguing against changes in bpf core (like verifier)
>>>>>> just because it's used by seccomp.
>>>>>> It must be spelled out in the commit log with explicit Ack.
>>>>>
>>>>> The primary thing I'm concerned about with eBPF and seccomp is
>>>>> side-effects from eBPF programs running at syscall time. This is an
>>>>> extremely sensitive area, and I want to be sure there won't be
>>>>> feature-creep here that leads to seccomp getting into a bad state.
>>>>>
>>>>> As long as seccomp can continue have its own verifier, I *think* this
>>>>> will be fine, though, again I remain concerned about maps, etc. I'm
>>>>> still reviewing these patches and how they might provide overlap with
>>>>> Tycho's needs too, etc.
>>>>
>>>> I'm not sure I see this as a huge problem.  As far as I can see, there
>>>> are three ways that a verifier change could be problematic:
>>>>
>>>> 1. Addition of a new type of map.  But seccomp would just not allow
>>>> new map types by default, right?
>>>>
>>>> 2. Addition of a new BPF_CALLable helper.  Seccomp wants a way to
>>>> whitelist BPF_CALL targets.  That should be straightforward.
>>>
>>> Yup, agreed on 1 and 2.
>>>
>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>> why seccomp is special here.
>>>
>>> My concern is more about unintended design mistakes or other feature
>>> creep with side-effects, especially when it comes to privileges and
>>> synchronization. Getting no-new-privs done correctly, for example,
>>> took some careful thought and discussion, and I'm shy from how painful
>>> TSYNC was on the process locking side, and eBPF has had some rather
>>> ugly flaws in the past (and recently: it was nice to be able to say
>>> for Spectre that seccomp filters couldn't be constructed to make
>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>
>>
>> I agree.  I think that, if we do this right, we get a clean version of Tycho's notifiers.  We can also very easily build on that to send a non-blocking message to the notifier fd, which gets us a version of seccomp logging that works for things like Chromium and even strace.  I think this is worth it.
>>
>> I also think this sort of argument is why Mickaël's privileged-first Landlock is the wrong approach.  By getting the unprivileged parts right from day one, we can carefully extend the mechanism and keep it usable by unprivileged apps.  But, if we'd started as root-only, fixing up everything needed to make it safe for unprivileged users after the fact would have been quite messy.
>
> We agreed (including Kees and you, at the Santa Fe LPC) to limit the use
> of Landlock to CAP_SYS_ADMIN at first. It is an artificial limitation
> that can be re-enabled by removing three explicit checks/lines. Landlock
> was designed for unprivileged use from day one and it is still the goal.

Indeed.  I was obviously too tired to read your email intelligently
last night.  Sorry.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                   ` <7fc0fab8-c1bc-bc76-a892-b3faab7d16ad-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2018-02-27 23:55                                     ` chris hyser
       [not found]                                       ` <4fbef77e-92ad-b896-a259-492412ad4c55-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: chris hyser @ 2018-02-27 23:55 UTC (permalink / raw)
  To: Daniel Borkmann, Kees Cook
  Cc: Will Drewry, Netdev, Linux Containers, Alexei Starovoitov,
	Andy Lutomirski, Sargun Dhillon, Alexei Starovoitov

> On 02/27/2018 04:58 PM, Daniel Borkmann wrote:
>> On 02/27/2018 05:59 PM, chris hyser wrote:
>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser@oracle.com> wrote:
>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net>
>>>>>> wrote:
>>>>>>>
>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>> why seccomp is special here.
>>>>>>
>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>
>> Well, not really. One part of all the Spectre mitigations that went upstream
>> from BPF side was to have an option to remove interpreter entirely and that
>> also relates to seccomp eventually. But other than that an attacker might
>> potentially find as well useful gadgets inside seccomp or any other code
>> that is inside the kernel, so it's not a strict necessity either.
>>
>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>
>>>>> Another option might be to remove c/eBPF from the equation all together.
>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>> from a security point of view, this removes a fair number of things an
>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>> significant.
>>
>> Good luck with not breaking existing applications relying on seccomp out
>> there.
> 
> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old 
> way. Now, does that make sense to do? That is the discussion.
> 
>>
>>>>> Is this an idea worth prototyping?
>>>>
>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>> around that from years ago basically boiled down to it being
>>>> inflexible. Given all the things people want to do at syscall time,
>>>> that continues to be true. So true, in fact, that here we are now,
>>>> trying to move to eBPF from cBPF. ;)
>>
>> Right, agree. cBPF is also pretty much frozen these days and aside from
>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>> doing something similar for eBPF side as long as this is reasonably
>> maintainable and not making BPF core more complex, but most of it can
>> already be set in the verifier anyway based on prog type. Note, that
>> performance of seccomp/BPF is definitely a demand as well which is why
>> people still extend the old remaining cBPF JITs today such that it can
>> be JITed also from there.
>>
>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in 
>>> areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We 
>>> ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not 
>>> flexibility.
>>
>> Not really, security of verifier and BPF infra in general is on the top
>> of the list, it's fundamental to the underlying concept and just because
>> it is heavily used also in tracing and networking, it only shows that the
>> concept is highly flexible that it can be applied in multiple areas.

If you're implying that because seccomp would have its own verifier and could therefore restrict itself to a subset of
eBPF, any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is
that the argument?

-chrish




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                       ` <4fbef77e-92ad-b896-a259-492412ad4c55-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2018-02-28 19:56                                         ` Daniel Borkmann
       [not found]                                           ` <19cd2e07-5702-1713-6903-e5667250b09d-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Daniel Borkmann @ 2018-02-28 19:56 UTC (permalink / raw)
  To: chris hyser, Kees Cook
  Cc: Will Drewry, Netdev, Linux Containers, Alexei Starovoitov,
	Andy Lutomirski, Sargun Dhillon, Alexei Starovoitov

On 02/28/2018 12:55 AM, chris hyser wrote:
>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote:
>>> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser@oracle.com> wrote:
>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto@amacapital.net>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>> why seccomp is special here.
>>>>>>>
>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>
>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>> from BPF side was to have an option to remove interpreter entirely and that
>>> also relates to seccomp eventually. But other than that an attacker might
>>> potentially find as well useful gadgets inside seccomp or any other code
>>> that is inside the kernel, so it's not a strict necessity either.
>>>
>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>
>>>>>> Another option might be to remove c/eBPF from the equation all together.
>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>> from a security point of view, this removes a fair number of things an
>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>> significant.
>>>
>>> Good luck with not breaking existing applications relying on seccomp out
>>> there.
>>
>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.

I see; didn't read that out from the above when you also mentioned removing
cBPF, but fair enough.

>>>>>> Is this an idea worth prototyping?
>>>>>
>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>> around that from years ago basically boiled down to it being
>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>> trying to move to eBPF from cBPF. ;)
>>>
>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>> doing something similar for eBPF side as long as this is reasonably
>>> maintainable and not making BPF core more complex, but most of it can
>>> already be set in the verifier anyway based on prog type. Note, that
>>> performance of seccomp/BPF is definitely a demand as well which is why
>>> people still extend the old remaining cBPF JITs today such that it can
>>> be JITed also from there.
>>>
>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>
>>> Not really, security of verifier and BPF infra in general is on the top
>>> of the list, it's fundamental to the underlying concept and just because
>>> it is heavily used also in tracing and networking, it only shows that the
>>> concept is highly flexible that it can be applied in multiple areas.
> 
> If you're implying that because seccomp would have it's own verifier and could therefore restrict itself to a subset of eBPF, therefore any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?

Ok, in addition to the current unpriv restrictions imposed by the verifier,
what additional requirements would you have from your side in order to get
to semantics that make sense for you wrt seccomp/eBPF? Just trying to
understand how far we are away from that. Note that not every new feature,
map or helper is enabled for every program type of course.
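
The point that a helper is only available to the program types that opt in to it can be sketched in plain C. This is a user-space illustration under stated assumptions, not the verifier's actual code: the enum names, the `allowed` table, `helper_allowed()`, and the specific helper sets are all hypothetical and chosen only to show the default-deny shape.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical program types and helper IDs, for illustration only. */
enum prog_type { PROG_SOCKET_FILTER, PROG_TRACING, PROG_SECCOMP };

enum helper_id {
    HELPER_MAP_LOOKUP,
    HELPER_KTIME_GET_NS,
    HELPER_TRACE_PRINTK,
    HELPER_GET_CURRENT_UID_GID,
    HELPER_COUNT /* any helper added before this line starts out denied */
};

/* One explicit allowlist row per program type; unlisted entries are false. */
static const bool allowed[][HELPER_COUNT] = {
    [PROG_SOCKET_FILTER] = {
        [HELPER_MAP_LOOKUP] = true,
        [HELPER_KTIME_GET_NS] = true,
    },
    [PROG_TRACING] = {
        [HELPER_MAP_LOOKUP] = true,
        [HELPER_KTIME_GET_NS] = true,
        [HELPER_TRACE_PRINTK] = true,
        [HELPER_GET_CURRENT_UID_GID] = true,
    },
    [PROG_SECCOMP] = {
        [HELPER_GET_CURRENT_UID_GID] = true,
    },
};

/* Returns true only if this program type explicitly opted in to the helper. */
bool helper_allowed(enum prog_type type, enum helper_id id)
{
    if ((unsigned int)type >= sizeof(allowed) / sizeof(allowed[0]))
        return false;
    if ((unsigned int)id >= (unsigned int)HELPER_COUNT)
        return false;
    return allowed[type][id];
}
```

With a table like this, adding a new helper for tracing changes nothing for the seccomp row unless someone edits that row on purpose.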

Thanks,
Daniel


> -chrish
> 
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                           ` <19cd2e07-5702-1713-6903-e5667250b09d-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
@ 2018-03-01  6:46                                             ` chris hyser
  2018-03-01 17:44                                             ` Andy Lutomirski
  1 sibling, 0 replies; 29+ messages in thread
From: chris hyser @ 2018-03-01  6:46 UTC (permalink / raw)
  To: Daniel Borkmann, Kees Cook
  Cc: Will Drewry, Netdev, Linux Containers, Alexei Starovoitov,
	Andy Lutomirski, Sargun Dhillon, Alexei Starovoitov

On 02/28/2018 02:56 PM, Daniel Borkmann wrote:
> On 02/28/2018 12:55 AM, chris hyser wrote:

>> If you're implying that because seccomp would have its own verifier and could therefore restrict itself to a subset of eBPF,
>> therefore any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the
>> argument?
> 
> Ok, in addition to the current unpriv restrictions imposed by the verifier,
> what additional requirements would you have from your side in order to get
> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
> understand how far we are away from that. Note that not every new feature,
> map or helper is enabled for every program type of course.

Let me try to clarify my argument by laying out my thoughts here (I apologize if it seems pedantic):

The intent of seccomp is to reduce the kernel attack surfaces available to a compromised user program; if you can't make 
a syscall, you can't exploit some lurking security bug.

Now, if I order various possible seccomp implementation choices in terms of minimizing that implementation's kernel 
attack surfaces it might reasonably be:

1) simple bit vector go/no go check (or similar)
2) cBPF like today
3) some restricted subset of eBPF
4) some other or less restricted subset of eBPF
5) all current eBPF
6) eBPF plus future features

The trade-off is that more features mean less security (i.e., more security risk). The question is whether we let user land
decide where it wants to be on that scale, or whether we pick, knowing it will be too much for some and too little for
others. Answering your question therefore requires knowing either where we choose to be on that scale or what options,
and at what granularity, we allow user land to choose. In terms of user land options, just choosing between 2 and 5 might
be enough. I'd favor, say, 1 and 4 or 1 and 5, as I don't think it unreasonable for the security paranoid to choose 1.
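
To make option 1 concrete, here is a minimal user-space sketch of a bit vector go/no-go check. Nothing here is an existing kernel interface: `struct syscall_allowlist`, `allowlist_permit()`, `allowlist_check()`, and the `MAX_SYSCALLS` bound are hypothetical names and values assumed for the example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYSCALLS  512 /* assumed upper bound for this sketch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define ALLOW_WORDS   ((MAX_SYSCALLS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per syscall number: 1 = allowed, 0 = denied. */
struct syscall_allowlist {
    unsigned long bits[ALLOW_WORDS];
};

/* Mark syscall nr as allowed; out-of-range numbers are ignored. */
void allowlist_permit(struct syscall_allowlist *al, unsigned int nr)
{
    if (nr < MAX_SYSCALLS)
        al->bits[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* The whole "filter": a bounds check and a single bit test. */
bool allowlist_check(const struct syscall_allowlist *al, unsigned int nr)
{
    if (nr >= MAX_SYSCALLS)
        return false; /* unknown syscall numbers are denied */
    return (al->bits[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}
```

The check has no loops and no user-provided code, which is the security argument for it; it also cannot look at syscall arguments, which is the flexibility it gives up.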

-chrish

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                           ` <19cd2e07-5702-1713-6903-e5667250b09d-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
  2018-03-01  6:46                                             ` chris hyser
@ 2018-03-01 17:44                                             ` Andy Lutomirski
       [not found]                                               ` <CALCETrWugC-M-b2hhKu+Zq6W4w6vDn+bDCURLw48Loa+_SQaqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2018-03-01 17:44 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Will Drewry, Kees Cook, Netdev, Linux Containers,
	Alexei Starovoitov, Sargun Dhillon, Alexei Starovoitov

On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org> wrote:
> On 02/28/2018 12:55 AM, chris hyser wrote:
>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote:
>>>> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>>> why seccomp is special here.
>>>>>>>>
>>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>>
>>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>>> from BPF side was to have an option to remove interpreter entirely and that
>>>> also relates to seccomp eventually. But other than that an attacker might
>>>> potentially find as well useful gadgets inside seccomp or any other code
>>>> that is inside the kernel, so it's not a strict necessity either.
>>>>
>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>>
>>>>>>> Another option might be to remove c/eBPF from the equation all together.
>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>>> from a security point of view, this removes a fair number of things an
>>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>>> significant.
>>>>
>>>> Good luck with not breaking existing applications relying on seccomp out
>>>> there.
>>>
>>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.
>
> I see; didn't read that out from the above when you also mentioned removing
> cBPF, but fair enough.
>
>>>>>>> Is this an idea worth prototyping?
>>>>>>
>>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>>> around that from years ago basically boiled down to it being
>>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>>> trying to move to eBPF from cBPF. ;)
>>>>
>>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>>> doing something similar for eBPF side as long as this is reasonably
>>>> maintainable and not making BPF core more complex, but most of it can
>>>> already be set in the verifier anyway based on prog type. Note, that
>>>> performance of seccomp/BPF is definitely a demand as well which is why
>>>> people still extend the old remaining cBPF JITs today such that it can
>>>> be JITed also from there.
>>>>
>>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>>
>>>> Not really, security of verifier and BPF infra in general is on the top
>>>> of the list, it's fundamental to the underlying concept and just because
>>>> it is heavily used also in tracing and networking, it only shows that the
>>>> concept is highly flexible that it can be applied in multiple areas.
>>
>> If you're implying that because seccomp would have it's own verifier and could therefore restrict itself to a subset of eBPF, therefore any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?
>
> Ok, in addition to the current unpriv restrictions imposed by the verifier,
> what additional requirements would you have from your side in order to get
> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
> understand how far we are away from that. Note that not every new feature,
> map or helper is enabled for every program type of course.
>

I haven't looked at the exact unpriv restrictions lately, but I think
I remember them.  Regardless, from my perspective, here's what I would
care about as a seccomp reviewer:

1. No extra information should become available to a seccomp filter
program without seccomp's explicit opt-in.  In other words, a seccomp
filter program should be able to access the fields in struct
seccomp_data, the return values of BPF_CALL helpers explicitly
authorized by seccomp, and the values in maps authorized by seccomp
(if any!), and that's it.  They should not be able to learn the
current time, any kernel pointers, user register state (except that
which is contained in seccomp_data), etc.  I believe that this is
already the case except insofar as core BPF_CALL helpers may violate
this.  (I'm not sure exactly what the policy is on the use of BPF_CALL
helpers.)

2. Filter evaluation should have no side effects except as explicitly
authorized by seccomp or by systemwide tracing.  So perf observing
that a seccomp filter ran is fine, but seccomp filters should not be
able to write to the system log, to perf ring buffers, to maps, etc.

3. Stability.  If a filter passes verification on two different
kernels, it should behave the same on both kernels, even if the filter
is buggy in some theoretical sense.

And that's it.  Obviously the attack surface provided by the ability
to load and run a filter should be minimized, but that's true for eBPF
in general and has little to do with seccomp in particular.

#1 and #2 are probably fairly straightforward using existing
mechanisms, unless the BPF_CALL hooks or map authorization hooks for
program types need to be extended a bit to get it right.  #3 is maybe
more interesting, but I imagine that XDP and any upcoming bpf-based
iptables replacement have the same requirement.  In contrast, bpf
*tracing* doesn't really require #3 to the same extent -- it's not really
such an awful thing if a buggy or otherwise naughty bpf tracing
program behaves differently after a kernel upgrade.

I suppose that another way of saying this is that an eBPF seccomp
program should behave like a pure function except to the extent that
the seccomp core makes an explicit exception.
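
As a rough illustration of the "pure function" framing, the sketch below models a filter in user-space C whose verdict depends only on its input structure. The struct mirrors the fields of struct seccomp_data but is declared locally under a different name; `sketch_filter()`, the verdict enum, `SKETCH_ARCH`, and the syscall numbers and flag mask are hypothetical and do not correspond to any real ABI.

```c
#include <stdint.h>

/* Local stand-in with the same fields as struct seccomp_data. */
struct sketch_seccomp_data {
    int nr;
    uint32_t arch;
    uint64_t instruction_pointer;
    uint64_t args[6];
};

enum sketch_verdict { SKETCH_ALLOW, SKETCH_DENY };

#define SKETCH_ARCH 1u /* hypothetical architecture token */

/*
 * Pure in the sense discussed above: reads only *sd, touches no globals,
 * no clock, no maps, and writes nothing. The same input always yields
 * the same verdict.
 */
enum sketch_verdict sketch_filter(const struct sketch_seccomp_data *sd)
{
    if (sd->arch != SKETCH_ARCH)
        return SKETCH_DENY;

    switch (sd->nr) {
    case 0: /* hypothetical syscalls 0 and 1: always allowed */
    case 1:
        return SKETCH_ALLOW;
    case 2: /* hypothetical syscall 2: allowed only if the low two flag bits are clear */
        return (sd->args[1] & 0x3u) == 0 ? SKETCH_ALLOW : SKETCH_DENY;
    default:
        return SKETCH_DENY;
    }
}
```

Any helper or map that seccomp later authorizes would show up in this model as an extra explicit input, which is one way to read the "explicit exception" wording.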

I'm not terribly concerned about the additional attack surface exposed
by eBPF itself.  Sure, it's more dangerous to allow a sandboxed
program to load its own eBPF programs than to allow a sandboxed
program to load its own cBPF programs, but such is the price of
progress.  If I'm writing a restrictive sandbox a la chromium's, I'm
not going to allow it to load eBPF programs, but I can still use eBPF
to enforce the sandbox policy.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                               ` <CALCETrWugC-M-b2hhKu+Zq6W4w6vDn+bDCURLw48Loa+_SQaqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-03-01 21:51                                                 ` Sargun Dhillon
       [not found]                                                   ` <CAMp4zn9g06jTAAycw6hNXF+KsfOM2SXvr1aYywnXyXkEiSO0rA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-03-01 21:54                                                 ` Daniel Borkmann
  1 sibling, 1 reply; 29+ messages in thread
From: Sargun Dhillon @ 2018-03-01 21:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Netdev,
	Linux Containers, Alexei Starovoitov, Alexei Starovoitov

On Thu, Mar 1, 2018 at 9:44 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org> wrote:
>> On 02/28/2018 12:55 AM, chris hyser wrote:
>>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote:
>>>>> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>>>> why seccomp is special here.
>>>>>>>>>
>>>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>>>
>>>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>>>> from BPF side was to have an option to remove interpreter entirely and that
>>>>> also relates to seccomp eventually. But other than that an attacker might
>>>>> potentially find as well useful gadgets inside seccomp or any other code
>>>>> that is inside the kernel, so it's not a strict necessity either.
>>>>>
>>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>>>
>>>>>>>> Another option might be to remove c/eBPF from the equation all together.
>>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>>>> from a security point of view, this removes a fair number of things an
>>>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>>>> significant.
>>>>>
>>>>> Good luck with not breaking existing applications relying on seccomp out
>>>>> there.
>>>>
>>>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.
>>
>> I see; didn't read that out from the above when you also mentioned removing
>> cBPF, but fair enough.
>>
>>>>>>>> Is this an idea worth prototyping?
>>>>>>>
>>>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>>>> around that from years ago basically boiled down to it being
>>>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>>>> trying to move to eBPF from cBPF. ;)
>>>>>
>>>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>>>> doing something similar for eBPF side as long as this is reasonably
>>>>> maintainable and not making BPF core more complex, but most of it can
>>>>> already be set in the verifier anyway based on prog type. Note, that
>>>>> performance of seccomp/BPF is definitely a demand as well which is why
>>>>> people still extend the old remaining cBPF JITs today such that it can
>>>>> be JITed also from there.
>>>>>
>>>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>>>
>>>>> Not really, security of verifier and BPF infra in general is on the top
>>>>> of the list, it's fundamental to the underlying concept and just because
>>>>> it is heavily used also in tracing and networking, it only shows that the
>>>>> concept is highly flexible that it can be applied in multiple areas.
>>>
>>> If you're implying that because seccomp would have it's own verifier and could therefore restrict itself to a subset of eBPF, therefore any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?
>>
>> Ok, in addition to the current unpriv restrictions imposed by the verifier,
>> what additional requirements would you have from your side in order to get
>> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
>> understand how far we are away from that. Note that not every new feature,
>> map or helper is enabled for every program type of course.
>>
>
> I haven't looked at the exact unpriv restrictions lately, but I think
> I remember them.  Regardless, from my perspective, here's what I would
> care about as a seccomp reviewer:
>
> 1. No extra information should become available to a seccomp filter
> program without seccomp's explicit opt-in.  In other words, a seccomp
> filter program should be able to access the fields in struct
> seccomp_data, the return values of BPF_CALL helpers explicitly
> authorized by seccomp, and the values in maps authorized by seccomp
> (if any!), and that's it.  They should not be able to learn the
> current time, any kernel pointers, user register state (except that
> which is contained in seccomp_data), etc.  I believe that this is
> already the case except insofar as core BPF_CALL helpers may violate
> this.  (I'm not sure exactly what the policy is on the use of BPF_CALL
> helpers.)
>
The only calls that were whitelisted were uid/gid, pid/tid, ktime, and
prandom. There was no printk, nor perf_events. It didn't give access
to maps, or anything else. I have to ask though, why isn't it okay for
eBPF filters to have access to time? My use case is that I launch a
job, and it has 5 minutes to do some privileged operations, and then I
want to deny any privileged operations. Maybe when writeable maps are
a thing, we can do something more advanced, but trying to narrow the
window of attack has a lot of benefit.

> 2. Filter evaluation should have no side effects except as explicitly
> authorized by seccomp or by systemwide tracing.  So perf observing
> that a seccomp filter ran is fine, but seccomp filters should not be
> able to write to the system log, to perf ring buffers, to maps, etc.
>
See above.

> 3. Stability.  If a filter passes verification on two different
> kernels, it should behave the same on both kernels, even if the filter
> is buggy in some theoretical sense.
In the sense that BPF is part of the uapi, and uapi is supposed to be
stable? I think that already makes sense, because for all the places
where eBPF is used for networking, it has to follow this property.

>
> And that's it.  Obviously the attack surface provided by the ability
> to load and run a filter should be minimized, but that's true for eBPF
> in general and has little to do with seccomp in particular.
>
> #1 and #2 are probably fairly straightforward using existing
> mechanisms, unless the BPF_CALL hooks or map authorization hooks for
> program types need to be extended a bit to get it right.  #3 is maybe
> more interesting, but I imagine that XDP and any upcoming bpf-based
> iptables replacement have the same requirement.  In contrast, bpf
> *tracing* doesn't really require #3 to the same extent -- it's not really
> such an awful thing if a buggy or otherwise naughty bpf tracing
> program behaves differently after a kernel upgrade.
>
> I suppose that another way of saying this is that an eBPF seccomp
> program should behave like a pure function except to the extent that
> the seccomp core makes an explicit exception.
>
> I'm not terribly concerned about the additional attack surface exposed
> by eBPF itself.  Sure, it's more dangerous to allow a sandboxed
> program to load its own eBPF programs than to allow a sandboxed
> program to load its own cBPF programs, but such is the price of
> progress.  If I'm writing a restrictive sandbox a la chromium's, I'm
> not going to allow it to load eBPF programs, but I can still use eBPF
> to enforce the sandbox policy.
>
> --Andy


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                               ` <CALCETrWugC-M-b2hhKu+Zq6W4w6vDn+bDCURLw48Loa+_SQaqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-03-01 21:51                                                 ` Sargun Dhillon
@ 2018-03-01 21:54                                                 ` Daniel Borkmann
  1 sibling, 0 replies; 29+ messages in thread
From: Daniel Borkmann @ 2018-03-01 21:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Will Drewry, Kees Cook, Netdev, Linux Containers,
	Alexei Starovoitov, Sargun Dhillon, Alexei Starovoitov

On 03/01/2018 06:44 PM, Andy Lutomirski wrote:
> On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org> wrote:
>> On 02/28/2018 12:55 AM, chris hyser wrote:
>>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote: >> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>>>> why seccomp is special here.
>>>>>>>>>
>>>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>>>
>>>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>>>> from BPF side was to have an option to remove interpreter entirely and that
>>>>> also relates to seccomp eventually. But other than that an attacker might
>>>>> potentially find as well useful gadgets inside seccomp or any other code
>>>>> that is inside the kernel, so it's not a strict necessity either.
>>>>>
>>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>>>
>>>>>>>> Another option might be to remove c/eBPF from the equation altogether.
>>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>>>> from a security point of view, this removes a fair number of things an
>>>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>>>> significant.
>>>>>
>>>>> Good luck with not breaking existing applications relying on seccomp out
>>>>> there.
>>>>
>>>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.
>>
>> I see; didn't read that out from the above when you also mentioned removing
>> cBPF, but fair enough.
>>
>>>>>>>> Is this an idea worth prototyping?
>>>>>>>
>>>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>>>> around that from years ago basically boiled down to it being
>>>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>>>> trying to move to eBPF from cBPF. ;)
>>>>>
>>>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>>>> doing something similar for eBPF side as long as this is reasonably
>>>>> maintainable and not making BPF core more complex, but most of it can
>>>>> already be set in the verifier anyway based on prog type. Note, that
>>>>> performance of seccomp/BPF is definitely a demand as well which is why
>>>>> people still extend the old remaining cBPF JITs today such that it can
>>>>> be JITed also from there.
>>>>>
>>>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>>>
>>>>> Not really, security of verifier and BPF infra in general is on the top
>>>>> of the list, it's fundamental to the underlying concept and just because
>>>>> it is heavily used also in tracing and networking, it only shows that the
>>>>> concept is highly flexible that it can be applied in multiple areas.
>>>
>>> If you're implying that because seccomp would have its own verifier and could therefore restrict itself to a subset of eBPF, any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?
>>
>> Ok, in addition to the current unpriv restrictions imposed by the verifier,
>> what additional requirements would you have from your side in order to get
>> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
>> understand how far we are away from that. Note that not every new feature,
>> map or helper is enabled for every program type of course.
> 
> I haven't looked at the exact unpriv restrictions lately, but I think
> I remember them.  Regardless, from my perspective, here's what I would
> care about as a seccomp reviewer:
> 
> 1. No extra information should become available to a seccomp filter
> program without seccomp's explicit opt-in.  In other words, a seccomp
> filter program should be able to access the fields in struct
> seccomp_data, the return values of BPF_CALL helpers explicitly
> authorized by seccomp, and the values in maps authorized by seccomp
> (if any!), and that's it.  They should not be able to learn the
> current time, any kernel pointers, user register state (except that
> which is contained in seccomp_data), etc.  I believe that this is
> already the case except insofar as core BPF_CALL helpers may violate
> this.  (I'm not sure exactly what the policy is on the use of BPF_CALL
> helpers.)

Yes, that is the case already. You also have full control over e.g.
which helpers may be called when you add a new program type for
seccomp: in the extreme case you could allow no helpers at all (which
brings you more or less back to cBPF), or you could add helpers that
are explicitly and only usable by the seccomp program type. So there's
full control over this.

> 2. Filter evaluation should have no side effects except as explicitly
> authorized by seccomp or by systemwide tracing.  So perf observing
> that a seccomp filter ran is fine, but seccomp filters should not be
> able to write to the system log, to perf ring buffers, to maps, etc.

That's fine as well. The latter would need a trivial adjustment so
that programs may only read, but not write, map values for maps with
data. Right now we have an option for the opposite case, where user
space has read-only access while the program itself can still write,
but adding the reverse is minor.

> 3. Stability.  If a filter passes verification on two different
> kernels, it should behave the same on both kernels, even if the filter
> is buggy in some theoretical sense.

We don't allow for breaking existing BPF programs. The only exception
is tracing, of course, where kernel internal data structures that might
get inspected/walked may change, so the program needs to be tied to a
kernel version. For all the other program types that is not the case,
and the ABI is stable in the same way as the guarantee we provide for
syscalls towards user space. This is a hard requirement for networking
programs and other types just as well.

> And that's it.  Obviously the attack surface provided by the ability
> to load and run a filter should be minimized, but that's true for eBPF
> in general and has little to do with seccomp in particular.

Agree.

> #1 and #2 are probably fairly straightforward using existing
> mechanisms, unless the BPF_CALL hooks or map authorization hooks for
> program types need to be extended a bit to get it right.  #3 is maybe
> more interesting, but I imagine that XDP and any upcoming bpf-based
> iptables replacement have the same requirement.  In contrast, bpf
> *tracing* doesn't really require #3 to the same extent -- it's not really
> such an awful thing if a buggy or otherwise naughty bpf tracing
> program behaves differently after a kernel upgrade.

(Yeah, see my comment above.)

> I suppose that another way of saying this is that an eBPF seccomp
> program should behave like a pure function except to the extent that
> the seccomp core makes an explicit exception.

Makes sense.
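
For illustration only (not part of the patchset): a userspace C sketch of
what "behaves like a pure function" means for a filter. The demo_ names
are hypothetical; the struct merely mirrors the fields of struct
seccomp_data, and the two return constants copy the values of
SECCOMP_RET_ALLOW and SECCOMP_RET_ERRNO. The policy itself (allow
getpid, allow write to fds 0-2, fail everything else with EPERM) is an
arbitrary assumption chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>

/* Local mirror of the fields of struct seccomp_data; the demo_ names are
 * hypothetical and exist only to keep this sketch self-contained. */
struct demo_seccomp_data {
	int nr;                        /* syscall number */
	uint32_t arch;                 /* AUDIT_ARCH_* value */
	uint64_t instruction_pointer;
	uint64_t args[6];              /* syscall arguments */
};

#define DEMO_RET_ALLOW 0x7fff0000U     /* same value as SECCOMP_RET_ALLOW */
#define DEMO_RET_ERRNO 0x00050000U     /* same value as SECCOMP_RET_ERRNO */

/* Pure decision function: it reads only *data, keeps no state, and calls
 * no kernel helpers, so the same input always yields the same verdict. */
static uint32_t demo_filter(const struct demo_seccomp_data *data)
{
	assert(data != NULL);

	if (data->nr == SYS_getpid)
		return DEMO_RET_ALLOW;
	/* Allow write() only to file descriptors 0, 1 and 2. */
	if (data->nr == SYS_write && data->args[0] <= 2)
		return DEMO_RET_ALLOW;
	/* Everything else fails with EPERM in the low 16 bits. */
	return DEMO_RET_ERRNO | (uint32_t)EPERM;
}
```

Anything that would make two calls with identical input disagree (time,
random numbers, writable maps) is exactly what requirements #1 and #2
ask seccomp to opt in to explicitly.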

> I'm not terribly concerned about the additional attack surface exposed
> by eBPF itself.  Sure, it's more dangerous to allow a sandboxed
> program to load its own eBPF programs than to allow a sandboxed
> program to load its own cBPF programs, but such is the price of
> progress.  If I'm writing a restrictive sandbox a la chromium's, I'm
> not going to allow it to load eBPF programs, but I can still use eBPF
> to enforce the sandbox policy.
> 
> --Andy
> 


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                                   ` <CAMp4zn9g06jTAAycw6hNXF+KsfOM2SXvr1aYywnXyXkEiSO0rA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-03-01 21:59                                                     ` Andy Lutomirski
       [not found]                                                       ` <CALCETrVQ-V1b58aHxudQNTSn0J8yirsnUghyzjkP-M_Dqptqjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2018-03-01 21:59 UTC (permalink / raw)
  To: Sargun Dhillon
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Netdev,
	Linux Containers, Alexei Starovoitov, Alexei Starovoitov

On Thu, Mar 1, 2018 at 9:51 PM, Sargun Dhillon <sargun-GaZTRHToo+CzQB+pC5nmwQ@public.gmane.org> wrote:
> On Thu, Mar 1, 2018 at 9:44 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org> wrote:
>>> On 02/28/2018 12:55 AM, chris hyser wrote:
>>>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote: >> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>>>>> why seccomp is special here.
>>>>>>>>>>
>>>>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>>>>
>>>>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>>>>> from BPF side was to have an option to remove interpreter entirely and that
>>>>>> also relates to seccomp eventually. But other than that an attacker might
>>>>>> potentially find as well useful gadgets inside seccomp or any other code
>>>>>> that is inside the kernel, so it's not a strict necessity either.
>>>>>>
>>>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>>>>
>>>>>>>>> Another option might be to remove c/eBPF from the equation altogether.
>>>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>>>>> from a security point of view, this removes a fair number of things an
>>>>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>>>>> significant.
>>>>>>
>>>>>> Good luck with not breaking existing applications relying on seccomp out
>>>>>> there.
>>>>>
>>>>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.
>>>
>>> I see; didn't read that out from the above when you also mentioned removing
>>> cBPF, but fair enough.
>>>
>>>>>>>>> Is this an idea worth prototyping?
>>>>>>>>
>>>>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>>>>> around that from years ago basically boiled down to it being
>>>>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>>>>> trying to move to eBPF from cBPF. ;)
>>>>>>
>>>>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>>>>> doing something similar for eBPF side as long as this is reasonably
>>>>>> maintainable and not making BPF core more complex, but most of it can
>>>>>> already be set in the verifier anyway based on prog type. Note, that
>>>>>> performance of seccomp/BPF is definitely a demand as well which is why
>>>>>> people still extend the old remaining cBPF JITs today such that it can
>>>>>> be JITed also from there.
>>>>>>
>>>>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>>>>
>>>>>> Not really, security of verifier and BPF infra in general is on the top
>>>>>> of the list, it's fundamental to the underlying concept and just because
>>>>>> it is heavily used also in tracing and networking, it only shows that the
>>>>>> concept is highly flexible that it can be applied in multiple areas.
>>>>
>>>> If you're implying that because seccomp would have its own verifier and could therefore restrict itself to a subset of eBPF, any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?
>>>
>>> Ok, in addition to the current unpriv restrictions imposed by the verifier,
>>> what additional requirements would you have from your side in order to get
>>> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
>>> understand how far we are away from that. Note that not every new feature,
>>> map or helper is enabled for every program type of course.
>>>
>>
>> I haven't looked at the exact unpriv restrictions lately, but I think
>> I remember them.  Regardless, from my perspective, here's what I would
>> care about as a seccomp reviewer:
>>
>> 1. No extra information should become available to a seccomp filter
>> program without seccomp's explicit opt-in.  In other words, a seccomp
>> filter program should be able to access the fields in struct
>> seccomp_data, the return values of BPF_CALL helpers explicitly
>> authorized by seccomp, and the values in maps authorized by seccomp
>> (if any!), and that's it.  They should not be able to learn the
>> current time, any kernel pointers, user register state (except that
>> which is contained in seccomp_data), etc.  I believe that this is
>> already the case except insofar as core BPF_CALL helpers may violate
>> this.  (I'm not sure exactly what the policy is on the use of BPF_CALL
>> helpers.)
>>
> The only calls that were whitelisted were uid/gid, pid/tid, ktime, and
> prandom. There was no printk, nor perf_events. It didn't give access
> to maps, or anything else. I have to ask though, why isn't it okay for
> eBPF filters to have access to time? My use case is that I launch a
> job, and it has 5 minutes to do some privileged operations, and then I
> want to deny any privileged operations. Maybe when writeable maps are
> a thing, we can do something more advanced, but trying to narrow the
> window of attack has a lot of benefit.

To avoid derailing this discussion: I think that the first incarnation
of eBPF seccomp in the kernel should allow no BPF_CALL helpers
whatsoever.  Addition of helpers needs to be reviewed carefully.  As
for the "five minutes" use: just kill the process after 5 minutes or
use user notifiers.  That use case is just too weird to be worthy of
kernel support IMO.


* Re: [net-next v3 0/2] eBPF seccomp filters
       [not found]                                                       ` <CALCETrVQ-V1b58aHxudQNTSn0J8yirsnUghyzjkP-M_Dqptqjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-03-01 22:46                                                         ` Sargun Dhillon
  0 siblings, 0 replies; 29+ messages in thread
From: Sargun Dhillon @ 2018-03-01 22:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Will Drewry, Kees Cook, Daniel Borkmann, Netdev,
	Linux Containers, Alexei Starovoitov, Alexei Starovoitov

On Thu, Mar 1, 2018 at 1:59 PM, Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu, Mar 1, 2018 at 9:51 PM, Sargun Dhillon <sargun-GaZTRHToo+CzQB+pC5nmwQ@public.gmane.org> wrote:
>> On Thu, Mar 1, 2018 at 9:44 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>> On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org> wrote:
>>>> On 02/28/2018 12:55 AM, chris hyser wrote:
>>>>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote: >> On 02/27/2018 05:59 PM, chris hyser wrote:
>>>>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote:
>>>>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hyser-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>>>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote:
>>>>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Straight-up bugs.  Those are exactly as problematic as verifier
>>>>>>>>>>>> bugs in any other unprivileged eBPF program type, right?  I don't see
>>>>>>>>>>>> why seccomp is special here.
>>>>>>>>>>>
>>>>>>>>>>> My concern is more about unintended design mistakes or other feature
>>>>>>>>>>> creep with side-effects, especially when it comes to privileges and
>>>>>>>>>>> synchronization. Getting no-new-privs done correctly, for example,
>>>>>>>>>>> took some careful thought and discussion, and I'm shy from how painful
>>>>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather
>>>>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say
>>>>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make
>>>>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the
>>>>>>>
>>>>>>> Well, not really. One part of all the Spectre mitigations that went upstream
>>>>>>> from BPF side was to have an option to remove interpreter entirely and that
>>>>>>> also relates to seccomp eventually. But other than that an attacker might
>>>>>>> potentially find as well useful gadgets inside seccomp or any other code
>>>>>>> that is inside the kernel, so it's not a strict necessity either.
>>>>>>>
>>>>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :)
>>>>>>>>>>
>>>>>>>>>> Another option might be to remove c/eBPF from the equation altogether.
>>>>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of
>>>>>>>>>> additional security risk. Seccomp is for enhanced security yes? How about a
>>>>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap for
>>>>>>>>>> "simple" white/black list checks validated by kernel code, versus user
>>>>>>>>>> provided interpreted code? Of course this removes a fair number of things
>>>>>>>>>> you can currently do or would be able to do with eBPF. Of course, restated
>>>>>>>>>> from a security point of view, this removes a fair number of things an
>>>>>>>>>> _attacker_ can do. Presumably the performance improvement would also be
>>>>>>>>>> significant.
>>>>>>>
>>>>>>> Good luck with not breaking existing applications relying on seccomp out
>>>>>>> there.
>>>>>>
>>>>>> This wasn't in the context of an implementation proposal, but the assumption would be to add this in addition to the old way. Now, does that make sense to do? That is the discussion.
>>>>
>>>> I see; didn't read that out from the above when you also mentioned removing
>>>> cBPF, but fair enough.
>>>>
>>>>>>>>>> Is this an idea worth prototyping?
>>>>>>>>>
>>>>>>>>> That was the original prototype for seccomp-filter. :) The discussion
>>>>>>>>> around that from years ago basically boiled down to it being
>>>>>>>>> inflexible. Given all the things people want to do at syscall time,
>>>>>>>>> that continues to be true. So true, in fact, that here we are now,
>>>>>>>>> trying to move to eBPF from cBPF. ;)
>>>>>>>
>>>>>>> Right, agree. cBPF is also pretty much frozen these days and aside from
>>>>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind
>>>>>>> doing something similar for eBPF side as long as this is reasonably
>>>>>>> maintainable and not making BPF core more complex, but most of it can
>>>>>>> already be set in the verifier anyway based on prog type. Note, that
>>>>>>> performance of seccomp/BPF is definitely a demand as well which is why
>>>>>>> people still extend the old remaining cBPF JITs today such that it can
>>>>>>> be JITed also from there.
>>>>>>>
>>>>>>>> I will try to find that discussion. As someone pointed out here though, eBPF is being used by more and more people in areas where security is not the primary concern. Differing objectives will make this a long term continuing issue. We ourselves were looking at eBPF simply as a means to use a hashmap for a white/blacklist, i.e. performance not flexibility.
>>>>>>>
>>>>>>> Not really, security of verifier and BPF infra in general is on the top
>>>>>>> of the list, it's fundamental to the underlying concept and just because
>>>>>>> it is heavily used also in tracing and networking, it only shows that the
>>>>>>> concept is highly flexible that it can be applied in multiple areas.
>>>>>
>>>>> If you're implying that because seccomp would have its own verifier and could therefore restrict itself to a subset of eBPF, any future additions/features to eBPF would not necessarily make seccomp less secure, I mainly agree. Is that the argument?
>>>>
>>>> Ok, in addition to the current unpriv restrictions imposed by the verifier,
>>>> what additional requirements would you have from your side in order to get
>>>> to semantics that make sense for you wrt seccomp/eBPF? Just trying to
>>>> understand how far we are away from that. Note that not every new feature,
>>>> map or helper is enabled for every program type of course.
>>>>
>>>
>>> I haven't looked at the exact unpriv restrictions lately, but I think
>>> I remember them.  Regardless, from my perspective, here's what I would
>>> care about as a seccomp reviewer:
>>>
>>> 1. No extra information should become available to a seccomp filter
>>> program without seccomp's explicit opt-in.  In other words, a seccomp
>>> filter program should be able to access the fields in struct
>>> seccomp_data, the return values of BPF_CALL helpers explicitly
>>> authorized by seccomp, and the values in maps authorized by seccomp
>>> (if any!), and that's it.  They should not be able to learn the
>>> current time, any kernel pointers, user register state (except that
>>> which is contained in seccomp_data), etc.  I believe that this is
>>> already the case except insofar as core BPF_CALL helpers may violate
>>> this.  (I'm not sure exactly what the policy is on the use of BPF_CALL
>>> helpers.)
>>>
>> The only calls that were whitelisted were uid/gid, pid/tid, ktime, and
>> prandom. There was no printk, nor perf_events. It didn't give access
>> to maps, or anything else. I have to ask though, why isn't it okay for
>> eBPF filters to have access to time? My use case is that I launch a
>> job, and it has 5 minutes to do some privileged operations, and then I
>> want to deny any privileged operations. Maybe when writeable maps are
>> a thing, we can do something more advanced, but trying to narrow the
>> window of attack has a lot of benefit.
>
> To avoid derailing this discussion: I think that the first incarnation
> of eBPF seccomp in the kernel should allow no BPF_CALL helpers
> whatsoever.  Addition of helpers needs to be reviewed carefully.  As
> for the "five minutes" use: just kill the process after 5 minutes or
> use user notifiers.  That use case is just too weird to be worthy of
> kernel support IMO.
I think that's an okay place to start.


end of thread, other threads:[~2018-03-01 22:46 UTC | newest]

Thread overview: 29+ messages
2018-02-26  7:26 [net-next v3 0/2] eBPF seccomp filters Sargun Dhillon
     [not found] ` <20180226072651.GA27045-du9IEJ8oIxHXYT48pCVpJ3c7ZZ+wIVaZYkHkVr5ML8kVGlcevz2xqA@public.gmane.org>
2018-02-26 23:04   ` Alexei Starovoitov
2018-02-26 23:20     ` Kees Cook
     [not found]       ` <CAGXu5jLdOcrn16q9pQ7JwTf88AVsL0o5LMJ=4P6vRN36u-_k_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27  1:01         ` Tycho Andersen
2018-02-27  3:46           ` Sargun Dhillon
     [not found]             ` <CAMp4zn9BAxv40q56PPsmvXcD000N4ZuAN3g=OF=od18_gT8UEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27  4:01               ` Tycho Andersen
2018-02-27  4:19         ` Andy Lutomirski
     [not found]           ` <CALCETrXNODxWkcwF-LbXBn+Ju7QJEyi3JR+spsRX4ecg8d1iMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27  4:38             ` Kees Cook
     [not found]               ` <CAGXu5j+64WzxjBnpQxYCU50ak+VqVw1y0W+MWygFodxsDqEZRw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27  4:54                 ` Andy Lutomirski
     [not found]                   ` <A20EA7DD-94E9-488A-B9FF-D8E2C9F26611-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
2018-02-27 23:10                     ` Mickaël Salaün
     [not found]                       ` <5323e010-09df-26d9-15f5-c723faa13224-WFhQfpSGs3bR7s880joybQ@public.gmane.org>
2018-02-27 23:11                         ` Andy Lutomirski
2018-02-27 14:53                 ` chris hyser
2018-02-27 14:53                   ` chris hyser
     [not found]                   ` <db759dd2-31dc-d094-251d-d4c1e8af8704-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2018-02-27 16:00                     ` Kees Cook
     [not found]                       ` <CAGXu5j+idW9AjZHVdeedqLOFXriObUJLvcw8-9k5WxyQF8EWrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27 16:59                         ` chris hyser
     [not found]                           ` <ddbefdda-f3b8-3956-fa0f-dcba8cf8e7d9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2018-02-27 19:19                             ` Kees Cook
     [not found]                               ` <CAGXu5jKnk90Yruhx_=t8yW2ziLaubqW80pxB95g5W_XnMuT1mA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27 21:22                                 ` chris hyser
2018-02-27 21:58                             ` Daniel Borkmann
     [not found]                               ` <f712a383-8e84-da64-a454-51fdebf28741-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
2018-02-27 22:20                                 ` chris hyser
     [not found]                                   ` <7fc0fab8-c1bc-bc76-a892-b3faab7d16ad-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2018-02-27 23:55                                     ` chris hyser
     [not found]                                       ` <4fbef77e-92ad-b896-a259-492412ad4c55-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2018-02-28 19:56                                         ` Daniel Borkmann
     [not found]                                           ` <19cd2e07-5702-1713-6903-e5667250b09d-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
2018-03-01  6:46                                             ` chris hyser
2018-03-01 17:44                                             ` Andy Lutomirski
     [not found]                                               ` <CALCETrWugC-M-b2hhKu+Zq6W4w6vDn+bDCURLw48Loa+_SQaqA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-03-01 21:51                                                 ` Sargun Dhillon
     [not found]                                                   ` <CAMp4zn9g06jTAAycw6hNXF+KsfOM2SXvr1aYywnXyXkEiSO0rA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-03-01 21:59                                                     ` Andy Lutomirski
     [not found]                                                       ` <CALCETrVQ-V1b58aHxudQNTSn0J8yirsnUghyzjkP-M_Dqptqjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-03-01 22:46                                                         ` Sargun Dhillon
2018-03-01 21:54                                                 ` Daniel Borkmann
2018-02-27  0:01     ` Sargun Dhillon
     [not found]       ` <CAMp4zn_Qe0aXhxNzpETBABAhKWF2WkZXnpzrJczbD=6k42OydA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-27  9:28         ` Daniel Borkmann
