linux-man.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Tycho Andersen <tycho@tycho.pizza>
Cc: Christian Brauner <christian.brauner@canonical.com>,
	linux-man <linux-man@vger.kernel.org>,
	Song Liu <songliubraving@fb.com>, Will Drewry <wad@chromium.org>,
	Kees Cook <keescook@chromium.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Robert Sesek <rsesek@google.com>,
	Linux Containers <containers@lists.linux-foundation.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	bpf <bpf@vger.kernel.org>, Andy Lutomirski <luto@amacapital.net>,
	Christian Brauner <christian@brauner.io>
Subject: Re: For review: seccomp_user_notif(2) manual page
Date: Thu, 1 Oct 2020 20:18:49 +0200	[thread overview]
Message-ID: <CAG48ez1W+Ym5=-PdUhyei_UCJov0agEF4YVyARL=pooWYmdEAg@mail.gmail.com> (raw)
In-Reply-To: <20201001165850.GC1260245@cisco>

On Thu, Oct 1, 2020 at 6:58 PM Tycho Andersen <tycho@tycho.pizza> wrote:
> On Thu, Oct 01, 2020 at 05:47:54PM +0200, Jann Horn via Containers wrote:
> > On Thu, Oct 1, 2020 at 2:54 PM Christian Brauner
> > <christian.brauner@canonical.com> wrote:
> > > On Wed, Sep 30, 2020 at 05:53:46PM +0200, Jann Horn via Containers wrote:
> > > > On Wed, Sep 30, 2020 at 1:07 PM Michael Kerrisk (man-pages)
> > > > <mtk.manpages@gmail.com> wrote:
> > > > > NOTES
> > > > >        The file descriptor returned when seccomp(2) is employed with the
> > > > >        SECCOMP_FILTER_FLAG_NEW_LISTENER  flag  can  be  monitored  using
> > > > >        poll(2), epoll(7), and select(2).  When a notification  is  pend‐
> > > > >        ing,  these interfaces indicate that the file descriptor is read‐
> > > > >        able.
> > > >
> > > > We should probably also point out somewhere that, as
> > > > include/uapi/linux/seccomp.h says:
> > > >
> > > >  * Similar precautions should be applied when stacking SECCOMP_RET_USER_NOTIF
> > > >  * or SECCOMP_RET_TRACE. For SECCOMP_RET_USER_NOTIF filters acting on the
> > > >  * same syscall, the most recently added filter takes precedence. This means
> > > >  * that the new SECCOMP_RET_USER_NOTIF filter can override any
> > > >  * SECCOMP_IOCTL_NOTIF_SEND from earlier filters, essentially allowing all
> > > >  * such filtered syscalls to be executed by sending the response
> > > >  * SECCOMP_USER_NOTIF_FLAG_CONTINUE. Note that SECCOMP_RET_TRACE can equally
> > > >  * be overriden by SECCOMP_USER_NOTIF_FLAG_CONTINUE.
> > > >
> > > > In other words, from a security perspective, you must assume that the
> > > > target process can bypass any SECCOMP_RET_USER_NOTIF (or
> > > > SECCOMP_RET_TRACE) filters unless it is completely prohibited from
> > > > calling seccomp(). This should also be noted over in the main
> > > > seccomp(2) manpage, especially the SECCOMP_RET_TRACE part.
> > >
> > > So I was actually wondering about this when I skimmed this and a while
> > > ago but forgot about this again... Afaict, you can only ever load a
> > > single filter with SECCOMP_FILTER_FLAG_NEW_LISTENER set. If there
> > > already is a filter with the SECCOMP_FILTER_FLAG_NEW_LISTENER property
> > > in the tasks filter hierarchy then the kernel will refuse to load a new
> > > one?
> > >
> > > static struct file *init_listener(struct seccomp_filter *filter)
> > > {
> > >         struct file *ret = ERR_PTR(-EBUSY);
> > >         struct seccomp_filter *cur;
> > >
> > >         for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> > >                 if (cur->notif)
> > >                         goto out;
> > >         }
> > >
> > > shouldn't that be sufficient to guarantee that USER_NOTIF filters can't
> > > override each other for the same task simply because there can only ever
> > > be a single one?
> >
> > Good point. Exceeeept that that check seems ineffective because this
> > happens before we take the locks that guard against TSYNC, and also
> > before we decide to which existing filter we want to chain the new
> > filter. So if two threads race with TSYNC, I think they'll be able to
> > chain two filters with listeners together.
>
> Yep, seems the check needs to also be in seccomp_can_sync_threads() to
> be totally effective,
>
> > I don't know whether we want to eternalize this "only one listener
> > across all the filters" restriction in the manpage though, or whether
> > the man page should just say that the kernel currently doesn't support
> > it but that security-wise you should assume that it might at some
> > point.
>
> This requirement originally came from Andy, arguing that the semantics
> of this were/are confusing, which still makes sense to me. Perhaps we
> should do something like the below?
[...]
> +static bool has_listener_parent(struct seccomp_filter *child)
> +{
> +       struct seccomp_filter *cur;
> +
> +       for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> +               if (cur->notif)
> +                       return true;
> +       }
> +
> +       return false;
> +}
[...]
> @@ -407,6 +419,11 @@ static inline pid_t seccomp_can_sync_threads(void)
[...]
> +               /* don't allow TSYNC to install multiple listeners */
> +               if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER &&
> +                   !has_listener_parent(thread->seccomp.filter))
> +                       continue;
[...]
> @@ -1462,12 +1479,9 @@ static const struct file_operations seccomp_notify_ops = {
>  static struct file *init_listener(struct seccomp_filter *filter)
[...]
> -       for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> -               if (cur->notif)
> -                       goto out;
> -       }
> +       if (has_listener_parent(current->seccomp.filter))
> +               goto out;

I dislike this because it combines a non-locked check and a locked
check. And I don't think this will work in the case where TSYNC and
non-TSYNC race - if the non-TSYNC call nests around the TSYNC filter
installation, the thread that called seccomp in non-TSYNC mode will
still end up with two notifying filters. How about the following?


diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 676d4af62103..c49ad8ba0bc1 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1475,11 +1475,6 @@ static struct file *init_listener(struct
seccomp_filter *filter)
        struct file *ret = ERR_PTR(-EBUSY);
        struct seccomp_filter *cur;

-       for (cur = current->seccomp.filter; cur; cur = cur->prev) {
-               if (cur->notif)
-                       goto out;
-       }
-
        ret = ERR_PTR(-ENOMEM);
        filter->notif = kzalloc(sizeof(*(filter->notif)), GFP_KERNEL);
        if (!filter->notif)
@@ -1504,6 +1499,31 @@ static struct file *init_listener(struct
seccomp_filter *filter)
        return ret;
 }

+/*
+ * Does @new_child have a listener while an ancestor also has a listener?
+ * If so, we'll want to reject this filter.
+ * This only has to be tested for the current process, even in the TSYNC case,
+ * because TSYNC installs @child with the same parent on all threads.
+ * Note that @new_child is not hooked up to its parent at this point yet, so
+ * we use current->seccomp.filter.
+ */
+static bool has_duplicate_listener(struct seccomp_filter *new_child)
+{
+       struct seccomp_filter *cur;
+
+       /* must be protected against concurrent TSYNC */
+       lockdep_assert_held(&current->sighand->siglock);
+
+       if (!new_child->notif)
+               return false;
+       for (cur = current->seccomp.filter; cur; cur = cur->prev) {
+               if (cur->notif)
+                       return true;
+       }
+
+       return false;
+}
+
 /**
  * seccomp_set_mode_filter: internal function for setting seccomp filter
  * @flags:  flags to change filter behavior
@@ -1575,6 +1595,9 @@ static long seccomp_set_mode_filter(unsigned int flags,
        if (!seccomp_may_assign_mode(seccomp_mode))
                goto out;

+       if (has_duplicate_listener(prepared))
+               goto out;
+
        ret = seccomp_attach_filter(flags, prepared);
        if (ret)
                goto out;

  parent reply	other threads:[~2020-10-01 18:21 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-30 11:07 Michael Kerrisk (man-pages)
2020-09-30 15:03 ` Tycho Andersen
2020-09-30 15:11   ` Tycho Andersen
2020-09-30 20:34   ` Michael Kerrisk (man-pages)
2020-09-30 23:03     ` Tycho Andersen
2020-09-30 23:11       ` Jann Horn
2020-09-30 23:24         ` Tycho Andersen
2020-10-01  1:52           ` Jann Horn
2020-10-01  2:14             ` Jann Horn
2020-10-25 16:31               ` Michael Kerrisk (man-pages)
2020-10-26 15:54                 ` Jann Horn
2020-10-27  6:14                   ` Michael Kerrisk (man-pages)
2020-10-27 10:28                     ` Jann Horn
2020-10-28  6:31                       ` Sargun Dhillon
2020-10-28  9:43                         ` Jann Horn
2020-10-28 17:43                           ` Sargun Dhillon
2020-10-28 18:20                             ` Jann Horn
2020-10-01  7:49             ` Michael Kerrisk (man-pages)
2020-10-26  0:32             ` Kees Cook
2020-10-26  9:51               ` Jann Horn
2020-10-26 10:31                 ` Jann Horn
2020-10-28 22:56                   ` Kees Cook
2020-10-29  1:11                     ` Jann Horn
     [not found]                   ` <20201029021348.GB25673@cisco>
2020-10-29  4:26                     ` Jann Horn
2020-10-28 22:53                 ` Kees Cook
2020-10-29  1:25                   ` Jann Horn
2020-10-01  7:45       ` Michael Kerrisk (man-pages)
2020-10-14  4:40         ` Michael Kerrisk (man-pages)
2020-09-30 15:53 ` Jann Horn
2020-10-01 12:54   ` Christian Brauner
2020-10-01 15:47     ` Jann Horn
2020-10-01 16:58       ` Tycho Andersen
2020-10-01 17:12         ` Christian Brauner
2020-10-14  5:41           ` Michael Kerrisk (man-pages)
2020-10-01 18:18         ` Jann Horn [this message]
2020-10-01 18:56           ` Tycho Andersen
2020-10-01 17:05       ` Christian Brauner
2020-10-15 11:24   ` Michael Kerrisk (man-pages)
2020-10-15 20:32     ` Jann Horn
2020-10-16 18:29       ` Michael Kerrisk (man-pages)
2020-10-17  0:25         ` Jann Horn
2020-10-24 12:52           ` Michael Kerrisk (man-pages)
2020-10-26  9:32             ` Jann Horn
2020-10-26  9:47               ` Michael Kerrisk (man-pages)
2020-09-30 23:39 ` Kees Cook
2020-10-15 11:24   ` Michael Kerrisk (man-pages)
2020-10-26  0:19     ` Kees Cook
2020-10-26  9:39       ` Michael Kerrisk (man-pages)
2020-10-01 12:36 ` Christian Brauner
2020-10-15 11:23   ` Michael Kerrisk (man-pages)
2020-10-01 21:06 ` Sargun Dhillon
2020-10-01 23:19   ` Tycho Andersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG48ez1W+Ym5=-PdUhyei_UCJov0agEF4YVyARL=pooWYmdEAg@mail.gmail.com' \
    --to=jannh@google.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=christian.brauner@canonical.com \
    --cc=christian@brauner.io \
    --cc=containers@lists.linux-foundation.org \
    --cc=daniel@iogearbox.net \
    --cc=gscrivan@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mtk.manpages@gmail.com \
    --cc=rsesek@google.com \
    --cc=songliubraving@fb.com \
    --cc=tycho@tycho.pizza \
    --cc=wad@chromium.org \
    --subject='Re: For review: seccomp_user_notif(2) manual page' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).