From: Christian Brauner <christian.brauner@ubuntu.com>
To: Kees Cook <keescook@chromium.org>
Cc: linux-kernel@vger.kernel.org, Andy Lutomirski <luto@kernel.org>,
	Tycho Andersen <tycho@tycho.ws>,
	Matt Denton <mpdenton@google.com>,
	Sargun Dhillon <sargun@sargun.me>, Jann Horn <jannh@google.com>,
	Chris Palmer <palmer@google.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Robert Sesek <rsesek@google.com>,
	Jeffrey Vander Stoep <jeffv@google.com>,
	Linux Containers <containers@lists.linux-foundation.org>
Subject: Re: [PATCH 1/2] seccomp: notify user trap about unused filter
Date: Thu, 28 May 2020 01:16:46 +0200
Message-ID: <20200527231646.4v743erjpzh6qe5f@wittgenstein>
In-Reply-To: <20200527224501.jddwcmvtvjtjsmsx@wittgenstein>

On Thu, May 28, 2020 at 12:45:02AM +0200, Christian Brauner wrote:
> On Wed, May 27, 2020 at 03:37:58PM -0700, Kees Cook wrote:
> > On Thu, May 28, 2020 at 12:05:32AM +0200, Christian Brauner wrote:
> > > The main question also is, is there precedent where the kernel just
> > > closes the file descriptor for userspace behind its back? I'm not sure
> > > I've heard of this before. That's not how that works afaict; it's also
> > > not how we do pidfds. We don't just close the fd when the task
> > > associated with it goes away, we notify and then userspace can close.
> > 
> > But there's a mapping between pidfd and task struct that is separate
> > from task struct itself, yes? I.e. keeping a pidfd open doesn't pin
> > struct task in memory forever, right?
> 
> No, but that's an implementation detail and we discussed that. It pins
> struct pid instead of task_struct. Once the process is fully gone you
> just get ESRCH.
> For example, fds to /proc/<pid>/<tid>/ aren't just closed once the
> task has gone away, userspace will just get ESRCH when it tries to open
> files under there but the fd remains valid until close() is called.
> 
> In addition, of all the anon inode fds, none of them have the "close the
> file behind userspace's back" behavior: io_uring, signalfd, timerfd, btf,
> perf_event, bpf-prog, bpf-link, bpf-map, pidfd, userfaultfd, fanotify,
> inotify, eventpoll, fscontext, eventfd. These are just core kernel ones.
> I'm pretty sure that it'd be very odd behavior if we did that. I'd
> rather just notify userspace and leave the close to them. But maybe I'm
> missing something.

I'm also starting to think this isn't even possible, or at least not
doable safely right now.
If the kernel dropped the file on its own, the fdtable would end up with
a dangling pointer, I would think. The only way around that would be to
track down every fd in every fdtable that still refers to that file and
close them all from within the kernel, which I don't think is possible
and which also sounds very dodgy. It also really seems like we would be
breaking a major contract, namely that fds stay valid until userspace
calls close() or execve(), or exits.
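
To make the pidfd comparison concrete, here is a rough userspace sketch
of the "notify, then let userspace close" behavior described above. It
is only an illustration under some assumptions: Python 3.9+ on Linux 5.3+
(for os.pidfd_open() and signal.pidfd_send_signal()), and the function
name pidfd_after_exit() is made up for this example. It returns None
where pidfds are not available.

```python
import errno
import os
import select
import signal


def pidfd_after_exit():
    """Report what a pidfd looks like once its process is fully gone.

    Hypothetical helper for illustration. Returns None when pidfd
    support is missing (older Python or kernel, or a sandbox).
    """
    if not hasattr(os, "pidfd_open") or not hasattr(signal, "pidfd_send_signal"):
        return None

    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    try:
        pidfd = os.pidfd_open(pid)
    except OSError:
        os.waitpid(pid, 0)
        return None

    try:
        # Notification: the pidfd becomes readable once the process exits.
        poller = select.poll()
        poller.register(pidfd, select.POLLIN)
        notified = bool(poller.poll(5000))

        os.waitpid(pid, 0)  # reap the child so the process is fully gone

        # The kernel did not close the fd behind our back: fstat() works.
        os.fstat(pidfd)
        fd_valid_after_exit = True

        # Operations that need the process itself now fail with ESRCH.
        try:
            signal.pidfd_send_signal(pidfd, 0)
            errno_after_exit = None
        except OSError as exc:
            errno_after_exit = exc.errno
    finally:
        os.close(pidfd)  # userspace decides when the fd goes away

    # Only after our own close() is the descriptor invalid (EBADF).
    try:
        os.fstat(pidfd)
        ebadf_after_close = False
    except OSError as exc:
        ebadf_after_close = exc.errno == errno.EBADF

    return {
        "notified": notified,
        "fd_valid_after_exit": fd_valid_after_exit,
        "errno_after_exit": errno_after_exit,
        "ebadf_after_close": ebadf_after_close,
    }


if __name__ == "__main__":
    print(pidfd_after_exit())
```

The point of the sketch is the ordering: the exit is signalled through
poll(), the fd keeps working as an fd afterwards, and it only becomes
invalid once userspace itself calls close().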

Christian
