bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>, Shuah Khan <shuah@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	kafai@fb.com, Song Liu <songliubraving@fb.com>,
	yhs@fb.com, kernel list <linux-kernel@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf@vger.kernel.org, Tycho Andersen <tycho@tycho.ws>,
	Tyler Hicks <tyhicks@canonical.com>
Subject: Re: [PATCH v1 1/3] seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE
Date: Thu, 19 Sep 2019 21:37:06 +0200	[thread overview]
Message-ID: <CAG48ez1QkJAMgTpqv4EqbDmYPPpxuB8cR=XhUAr1fHZOBY_DHg@mail.gmail.com> (raw)
In-Reply-To: <20190919095903.19370-2-christian.brauner@ubuntu.com>

On Thu, Sep 19, 2019 at 11:59 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> This allows the seccomp notifier to continue a syscall.
> Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
> which enables a process (watchee) to retrieve an fd for its seccomp
> filter. This fd can then be handed to another (usually more privileged)
> process (watcher). The watcher will then be able to receive seccomp
> messages about the syscalls having been performed by the watchee.
> This can be solved by
> telling seccomp to resume the syscall.
> @@ -780,8 +783,14 @@ static void seccomp_do_user_notification(int this_syscall,
>                 list_del(&n.list);
>  out:
>         mutex_unlock(&match->notify_lock);
> +
> +       /* Userspace requests to continue the syscall. */
> +               return 0;
> +
>         syscall_set_return_value(current, task_pt_regs(current),
>                                  err, ret);
> +       return -1;
>  }

Seccomp currently expects the various seccomp return values to be
fully ordered based on how much action the kernel should take against
the requested syscall. Currently, the range of return values is
basically divided into three regions: "block syscall in some way"
decide" (SECCOMP_RET_TRACE) and "allow" (SECCOMP_RET_LOG and
syscalls, it will be able to override a negative decision from

In practice, that's probably not a big deal, since I'm not aware of
anyone actually using SECCOMP_RET_TRACE for security purposes, and on
top of that, you'd have to allow ioctl(..., SECCOMP_IOCTL_NOTIF_SEND,
...) and seccomp() with SECCOMP_FILTER_FLAG_NEW_LISTENER in your
seccomp policy for this to work.

More interestingly, what about the case where two
SECCOMP_RET_USER_NOTIF filters are installed? The most recently
installed filter takes precedence if the return values's action parts
are the same (and this is also documented in the manpage); so if a
container engine installs a filter that always intercepts sys_foobar()
(and never uses SECCOMP_USER_NOTIF_FLAG_CONTINUE), and then something
inside the container also installs a filter that always intercepts
sys_foobar() (and always uses SECCOMP_USER_NOTIF_FLAG_CONTINUE), the
container engine's filter will become ineffective.

With my tendency to overcomplicate things, I'm thinking that maybe it
might be a good idea to:
 - collect a list of all filters that returned SECCOMP_RET_USER_NOTIF,
as well as the highest-precedence return value that was less strict
 - sequentially send notifications to all of the
SECCOMP_RET_USER_NOTIF filters until one doesn't return
 - if all returned SECCOMP_USER_NOTIF_FLAG_CONTINUE, go with the
highest-precedence return value that was less strict than
SECCOMP_RET_USER_NOTIF, or allow if no such return value was

But perhaps, for now, it would also be enough to just expand the big
fat warning note and tell people that if they allow the use of
filter, SECCOMP_RET_USER_NOTIF is bypassable. And if someone actually
has a usecase where SECCOMP_RET_USER_NOTIF should be secure and nested
SECCOMP_RET_USER_NOTIF support is needed, that more complicated logic
could be added later?

  reply	other threads:[~2019-09-19 19:37 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-19  9:59 [PATCH v1 0/3] seccomp: continue syscall from notifier Christian Brauner
2019-09-19  9:59 ` [PATCH v1 1/3] seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE Christian Brauner
2019-09-19 19:37   ` Jann Horn [this message]
2019-09-19 22:22     ` Christian Brauner
2019-09-19  9:59 ` [PATCH v1 2/3] seccomp: avoid overflow in implicit constant conversion Christian Brauner
2019-09-19  9:59 ` [PATCH v1 3/3] seccomp: test SECCOMP_USER_NOTIF_FLAG_CONTINUE Christian Brauner
2019-09-19 17:13   ` shuah
2019-09-19 21:50     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG48ez1QkJAMgTpqv4EqbDmYPPpxuB8cR=XhUAr1fHZOBY_DHg@mail.gmail.com' \
    --to=jannh@google.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=christian.brauner@ubuntu.com \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=netdev@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=tycho@tycho.ws \
    --cc=tyhicks@canonical.com \
    --cc=wad@chromium.org \
    --cc=yhs@fb.com \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).