From: Jann Horn <jannh@google.com>
To: Tycho Andersen <tycho@tycho.pizza>
Cc: Christian Brauner <christian.brauner@canonical.com>,
	linux-man <linux-man@vger.kernel.org>,
	Song Liu <songliubraving@fb.com>,
	Will Drewry <wad@chromium.org>,
	Kees Cook <keescook@chromium.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Robert Sesek <rsesek@google.com>,
	Linux Containers <containers@lists.linux-foundation.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	bpf <bpf@vger.kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Christian Brauner <christian@brauner.io>
Subject: Re: For review: seccomp_user_notif(2) manual page
Date: Thu, 1 Oct 2020 20:18:49 +0200
Message-ID: <CAG48ez1W+Ym5=-PdUhyei_UCJov0agEF4YVyARL=pooWYmdEAg@mail.gmail.com>
In-Reply-To: <20201001165850.GC1260245@cisco>

On Thu, Oct 1, 2020 at 6:58 PM Tycho Andersen <tycho@tycho.pizza> wrote:
> On Thu, Oct 01, 2020 at 05:47:54PM +0200, Jann Horn via Containers wrote:
> > On Thu, Oct 1, 2020 at 2:54 PM Christian Brauner
> > <christian.brauner@canonical.com> wrote:
> > > On Wed, Sep 30, 2020 at 05:53:46PM +0200, Jann Horn via Containers wrote:
> > > > On Wed, Sep 30, 2020 at 1:07 PM Michael Kerrisk (man-pages)
> > > > <mtk.manpages@gmail.com> wrote:
> > > > > NOTES
> > > > >        The file descriptor returned when seccomp(2) is employed with the
> > > > >        SECCOMP_FILTER_FLAG_NEW_LISTENER flag can be monitored using
> > > > >        poll(2), epoll(7), and select(2).  When a notification is
> > > > >        pending, these interfaces indicate that the file descriptor is
> > > > >        readable.
> > > >
> > > > We should probably also point out somewhere that, as
> > > > include/uapi/linux/seccomp.h says:
> > > >
> > > >  * Similar precautions should be applied when stacking SECCOMP_RET_USER_NOTIF
> > > >  * or SECCOMP_RET_TRACE. For SECCOMP_RET_USER_NOTIF filters acting on the
> > > >  * same syscall, the most recently added filter takes precedence. This means
> > > >  * that the new SECCOMP_RET_USER_NOTIF filter can override any
> > > >  * SECCOMP_IOCTL_NOTIF_SEND from earlier filters, essentially allowing all
> > > >  * such filtered syscalls to be executed by sending the response
> > > >  * SECCOMP_USER_NOTIF_FLAG_CONTINUE. Note that SECCOMP_RET_TRACE can equally
> > > >  * be overriden by SECCOMP_USER_NOTIF_FLAG_CONTINUE.
> > > >
> > > > In other words, from a security perspective, you must assume that the
> > > > target process can bypass any SECCOMP_RET_USER_NOTIF (or
> > > > SECCOMP_RET_TRACE) filters unless it is completely prohibited from
> > > > calling seccomp(). This should also be noted over in the main
> > > > seccomp(2) manpage, especially the SECCOMP_RET_TRACE part.
> > >
> > > So I was actually wondering about this when I skimmed this and a while
> > > ago but forgot about this again... Afaict, you can only ever load a
> > > single filter with SECCOMP_FILTER_FLAG_NEW_LISTENER set. If there
> > > already is a filter with the SECCOMP_FILTER_FLAG_NEW_LISTENER property
> > > in the tasks filter hierarchy then the kernel will refuse to load a new
> > > one?
> > >
> > > static struct file *init_listener(struct seccomp_filter *filter)
> > > {
> > >	struct file *ret = ERR_PTR(-EBUSY);
> > >	struct seccomp_filter *cur;
> > >
> > >	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> > >		if (cur->notif)
> > >			goto out;
> > >	}
> > >
> > > shouldn't that be sufficient to guarantee that USER_NOTIF filters can't
> > > override each other for the same task simply because there can only ever
> > > be a single one?
> >
> > Good point. Exceeeept that that check seems ineffective because this
> > happens before we take the locks that guard against TSYNC, and also
> > before we decide to which existing filter we want to chain the new
> > filter. So if two threads race with TSYNC, I think they'll be able to
> > chain two filters with listeners together.
>
> Yep, seems the check needs to also be in seccomp_can_sync_threads() to
> be totally effective,
>
> > I don't know whether we want to eternalize this "only one listener
> > across all the filters" restriction in the manpage though, or whether
> > the man page should just say that the kernel currently doesn't support
> > it but that security-wise you should assume that it might at some
> > point.
>
> This requirement originally came from Andy, arguing that the semantics
> of this were/are confusing, which still makes sense to me. Perhaps we
> should do something like the below?
[...]
> +static bool has_listener_parent(struct seccomp_filter *child)
> +{
> +	struct seccomp_filter *cur;
> +
> +	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> +		if (cur->notif)
> +			return true;
> +	}
> +
> +	return false;
> +}
[...]
> @@ -407,6 +419,11 @@ static inline pid_t seccomp_can_sync_threads(void)
[...]
> +		/* don't allow TSYNC to install multiple listeners */
> +		if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER &&
> +		    !has_listener_parent(thread->seccomp.filter))
> +			continue;
[...]
> @@ -1462,12 +1479,9 @@ static const struct file_operations seccomp_notify_ops = {
>  static struct file *init_listener(struct seccomp_filter *filter)
[...]
> -	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> -		if (cur->notif)
> -			goto out;
> -	}
> +	if (has_listener_parent(current->seccomp.filter))
> +		goto out;

I dislike this because it combines a non-locked check and a locked
check. And I don't think this will work in the case where TSYNC and
non-TSYNC race - if the non-TSYNC call nests around the TSYNC filter
installation, the thread that called seccomp in non-TSYNC mode will
still end up with two notifying filters. How about the following?

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 676d4af62103..c49ad8ba0bc1 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1475,11 +1475,6 @@ static struct file *init_listener(struct seccomp_filter *filter)
 	struct file *ret = ERR_PTR(-EBUSY);
 	struct seccomp_filter *cur;
 
-	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
-		if (cur->notif)
-			goto out;
-	}
-
 	ret = ERR_PTR(-ENOMEM);
 	filter->notif = kzalloc(sizeof(*(filter->notif)), GFP_KERNEL);
 	if (!filter->notif)
@@ -1504,6 +1499,31 @@ static struct file *init_listener(struct seccomp_filter *filter)
 	return ret;
 }
 
+/*
+ * Does @new_child have a listener while an ancestor also has a listener?
+ * If so, we'll want to reject this filter.
+ * This only has to be tested for the current process, even in the TSYNC case,
+ * because TSYNC installs @child with the same parent on all threads.
+ * Note that @new_child is not hooked up to its parent at this point yet, so
+ * we use current->seccomp.filter.
+ */
+static bool has_duplicate_listener(struct seccomp_filter *new_child)
+{
+	struct seccomp_filter *cur;
+
+	/* must be protected against concurrent TSYNC */
+	lockdep_assert_held(&current->sighand->siglock);
+
+	if (!new_child->notif)
+		return false;
+	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
+		if (cur->notif)
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * seccomp_set_mode_filter: internal function for setting seccomp filter
  * @flags: flags to change filter behavior
@@ -1575,6 +1595,9 @@ static long seccomp_set_mode_filter(unsigned int flags,
 	if (!seccomp_may_assign_mode(seccomp_mode))
 		goto out;
 
+	if (has_duplicate_listener(prepared))
+		goto out;
+
 	ret = seccomp_attach_filter(flags, prepared);
 	if (ret)
 		goto out;
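
For anyone who wants to poke at the chain walk outside the kernel, below is
a minimal userspace sketch of the check that has_duplicate_listener()
performs in the patch above. It is only a model and not kernel code: struct
model_filter, its has_notif field, and both model_* functions are
hypothetical stand-ins invented for illustration, the currently installed
chain is passed in explicitly instead of being read from
current->seccomp.filter, and there is no locking, so it says nothing about
the TSYNC race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct seccomp_filter (illustration only). */
struct model_filter {
	struct model_filter *prev;	/* parent filter, NULL at the root */
	bool has_notif;			/* models "filter->notif != NULL" */
};

/*
 * Model of the check: reject @new_child only if it carries a listener
 * and some filter in the already-installed chain carries one too.
 * @installed plays the role of current->seccomp.filter (an assumption
 * of this sketch; the kernel reads it under siglock).
 */
bool model_has_duplicate_listener(const struct model_filter *installed,
				  const struct model_filter *new_child)
{
	const struct model_filter *cur;

	/* A filter without a listener can never be a duplicate. */
	if (!new_child->has_notif)
		return false;

	/* Walk from the newest installed filter up to the root. */
	for (cur = installed; cur; cur = cur->prev) {
		if (cur->has_notif)
			return true;
	}

	return false;
}

/* Small self-check covering the three interesting cases. */
void model_selfcheck(void)
{
	struct model_filter root = { .prev = NULL, .has_notif = true };
	struct model_filter leaf = { .prev = &root, .has_notif = false };
	struct model_filter listener = { .prev = NULL, .has_notif = true };
	struct model_filter plain = { .prev = NULL, .has_notif = false };

	/* An ancestor already has a listener: second listener is rejected. */
	assert(model_has_duplicate_listener(&leaf, &listener));
	/* New filter has no listener: always accepted. */
	assert(!model_has_duplicate_listener(&leaf, &plain));
	/* Empty chain: first listener is accepted. */
	assert(!model_has_duplicate_listener(NULL, &listener));
}
```

The model keeps the same ordering as the patch: the new filter is examined
before it is linked into the chain, which is why the walk starts from the
installed chain and not from the new filter's own prev pointer.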