All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Jann Horn <jannh@google.com>
Cc: Daniel Colascione <dancol@google.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Andrew Lutomirski <luto@kernel.org>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	Jonathan Kowalski <bl0pbl33p@gmail.com>,
	"Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Mon, 1 Apr 2019 21:42:15 +0200	[thread overview]
Message-ID: <20190401194214.4rbvmwogpke3cjcx@brauner.io> (raw)
In-Reply-To: <CAHk-=wgKOJTdN9TqyjW6FngEZvgUTmCDzJPRrtekW_1SD0gfhw@mail.gmail.com>

On Mon, Apr 01, 2019 at 09:01:29AM -0700, Linus Torvalds wrote:
> On Mon, Apr 1, 2019 at 8:55 AM Daniel Colascione <dancol@google.com> wrote:
> >
> >
> > > I wonder if we really want a fill procfs2, or maybe we could just make
> > > the pidfd readable (yes, it's a directory file descriptor, but we
> > > could allow reading).
> >
> > What would read(2) read?
> 
> We could make it read anything, but it would have to be something
> people agree is sufficient (and not so expensive to create that rare
> users of that data would find the overhead excessive).
> 
> Eg we could make it return the same thing that /proc/<pid>/status
> reads right now.
> 
> But it sounds like you need pretty much all of /proc/<pid>/xyz:

From what I gather from this thread we are still best of with using fds
to /proc/<pid> as pidfds. Linus, do you agree or have I misunderstood?
Yes, we can have an internal mount option to restrict access to various
parts of procfs from such pidfds or do the parent-less bind-mount trick
but I think this beats having a stunted dummy dirfd that we implement a
read method on.
One thing is that we also need something to disable access to the
"/proc/<pid>/net". One option could be to give the files in "net/" an
->open-handler which checks that our file->f_path.mnt is not one of our
special clone() mounts and if it is refuse the open.

To clarify the way forward:
Jann and I were discussing whether pidfd_open() still makes sense and
whether I shouldn't just jump straight to a first version of
CLONE_PIDFD.
Basically, if you have a system without CONFIG_PROC_FS it makes sense
that clone gives back an anon inode file descriptor as pidfds because
you can still signal threads in a race-free way. But it doesn't make a
lot of sense to have pidfd_open() in this scenario because you can't
really do anything with that pidfd apart from sending signals. And on a
system like that sending a signal is still racy. Since the process can
be recycled between learning the pid number and calling pidfd_open()
[1]. So it only makes sense to have _clone()_ give back anon_inode() fds
on a system without CONFIG_PROC_FS but it doesn't make sense for
pidfd_open() In other news, I think it makes more sense if I jump to the
implementation of CLONE_PIDFD instead of working on pidfd_open().

[1]: The only case - that seems rather far-fetched - where it makes
     sense is when the parent wants to create that pidfd and hand it to
     someone else.

Christian

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Jann Horn <jannh@google.com>
Cc: Daniel Colascione <dancol@google.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Andrew Lutomirski <luto@kernel.org>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	Jonathan Kowalski <bl0pbl33p@gmail.com>,
	"Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Mon, 1 Apr 2019 21:42:15 +0200	[thread overview]
Message-ID: <20190401194214.4rbvmwogpke3cjcx@brauner.io> (raw)
In-Reply-To: <CAHk-=wgKOJTdN9TqyjW6FngEZvgUTmCDzJPRrtekW_1SD0gfhw@mail.gmail.com>

On Mon, Apr 01, 2019 at 09:01:29AM -0700, Linus Torvalds wrote:
> On Mon, Apr 1, 2019 at 8:55 AM Daniel Colascione <dancol@google.com> wrote:
> >
> >
> > > I wonder if we really want a fill procfs2, or maybe we could just make
> > > the pidfd readable (yes, it's a directory file descriptor, but we
> > > could allow reading).
> >
> > What would read(2) read?
> 
> We could make it read anything, but it would have to be something
> people agree is sufficient (and not so expensive to create that rare
> users of that data would find the overhead excessive).
> 
> Eg we could make it return the same thing that /proc/<pid>/status
> reads right now.
> 
> But it sounds like you need pretty much all of /proc/<pid>/xyz:

>From what I gather from this thread we are still best of with using fds
to /proc/<pid> as pidfds. Linus, do you agree or have I misunderstood?
Yes, we can have an internal mount option to restrict access to various
parts of procfs from such pidfds or do the parent-less bind-mount trick
but I think this beats having a stunted dummy dirfd that we implement a
read method on.
One thing is that we also need something to disable access to the
"/proc/<pid>/net". One option could be to give the files in "net/" an
->open-handler which checks that our file->f_path.mnt is not one of our
special clone() mounts and if it is refuse the open.

To clarify the way forward:
Jann and I were discussing whether pidfd_open() still makes sense and
whether I shouldn't just jump straight to a first version of
CLONE_PIDFD.
Basically, if you have a system without CONFIG_PROC_FS it makes sense
that clone gives back an anon inode file descriptor as pidfds because
you can still signal threads in a race-free way. But it doesn't make a
lot of sense to have pidfd_open() in this scenario because you can't
really do anything with that pidfd apart from sending signals. And on a
system like that sending a signal is still racy. Since the process can
be recycled between learning the pid number and calling pidfd_open()
[1]. So it only makes sense to have _clone()_ give back anon_inode() fds
on a system without CONFIG_PROC_FS but it doesn't make sense for
pidfd_open() In other news, I think it makes more sense if I jump to the
implementation of CLONE_PIDFD instead of working on pidfd_open().

[1]: The only case - that seems rather far-fetched - where it makes
     sense is when the parent wants to create that pidfd and hand it to
     someone else.

Christian

  parent reply	other threads:[~2019-04-01 19:42 UTC|newest]

Thread overview: 158+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-29 15:54 [PATCH v2 0/5] pid: add pidfd_open() Christian Brauner
2019-03-29 15:54 ` [PATCH v2 1/5] Make anon_inodes unconditional Christian Brauner
2019-03-29 15:54 ` [PATCH v2 2/5] pid: add pidfd_open() Christian Brauner
2019-03-29 23:45   ` Jann Horn
2019-03-29 23:45     ` Jann Horn
2019-03-29 23:55     ` Christian Brauner
2019-03-29 23:55       ` Christian Brauner
2019-03-30 11:53   ` Jürg Billeter
2019-03-30 14:37     ` Christian Brauner
2019-03-30 14:51       ` Jonathan Kowalski
2019-03-30 14:51         ` Jonathan Kowalski
2019-03-29 15:54 ` [PATCH v2 3/5] signal: support pidfd_open() with pidfd_send_signal() Christian Brauner
2019-03-29 15:54 ` [PATCH v2 4/5] signal: PIDFD_SIGNAL_TID threads via pidfds Christian Brauner
2019-03-30  1:06   ` Jann Horn
2019-03-30  1:06     ` Jann Horn
2019-03-30  1:22     ` Christian Brauner
2019-03-30  1:22       ` Christian Brauner
2019-03-30  1:34       ` Christian Brauner
2019-03-30  1:34         ` Christian Brauner
2019-03-30  1:42         ` Christian Brauner
2019-03-30  1:42           ` Christian Brauner
2019-03-29 15:54 ` [PATCH v2 5/5] tests: add pidfd_open() tests Christian Brauner
2019-03-30 16:09 ` [PATCH v2 0/5] pid: add pidfd_open() Linus Torvalds
2019-03-30 16:09   ` Linus Torvalds
2019-03-30 16:11   ` Daniel Colascione
2019-03-30 16:11     ` Daniel Colascione
2019-03-30 16:16     ` Linus Torvalds
2019-03-30 16:16       ` Linus Torvalds
2019-03-30 16:18       ` Linus Torvalds
2019-03-30 16:18         ` Linus Torvalds
2019-03-31  1:07         ` Joel Fernandes
2019-03-31  1:07           ` Joel Fernandes
2019-03-31  2:34           ` Jann Horn
2019-03-31  2:34             ` Jann Horn
2019-03-31  4:08             ` Joel Fernandes
2019-03-31  4:08               ` Joel Fernandes
2019-03-31  4:46               ` Jann Horn
2019-03-31  4:46                 ` Jann Horn
2019-03-31 14:52                 ` Linus Torvalds
2019-03-31 14:52                   ` Linus Torvalds
2019-03-31 15:05                   ` Christian Brauner
2019-03-31 15:05                     ` Christian Brauner
2019-03-31 15:21                     ` Daniel Colascione
2019-03-31 15:21                       ` Daniel Colascione
2019-03-31 15:33                   ` Jonathan Kowalski
2019-03-31 15:33                     ` Jonathan Kowalski
2019-03-30 16:19   ` Christian Brauner
2019-03-30 16:19     ` Christian Brauner
2019-03-30 16:24     ` Linus Torvalds
2019-03-30 16:24       ` Linus Torvalds
2019-03-30 16:34       ` Daniel Colascione
2019-03-30 16:34         ` Daniel Colascione
2019-03-30 16:38         ` Christian Brauner
2019-03-30 16:38           ` Christian Brauner
2019-03-30 17:04         ` Linus Torvalds
2019-03-30 17:04           ` Linus Torvalds
2019-03-30 17:12           ` Christian Brauner
2019-03-30 17:12             ` Christian Brauner
2019-03-30 17:24             ` Linus Torvalds
2019-03-30 17:24               ` Linus Torvalds
2019-03-30 17:37               ` Christian Brauner
2019-03-30 17:37                 ` Christian Brauner
2019-03-30 17:50               ` Jonathan Kowalski
2019-03-30 17:50                 ` Jonathan Kowalski
2019-03-30 17:52                 ` Christian Brauner
2019-03-30 17:52                   ` Christian Brauner
2019-03-30 17:59                   ` Jonathan Kowalski
2019-03-30 17:59                     ` Jonathan Kowalski
2019-03-30 18:02                     ` Christian Brauner
2019-03-30 18:02                       ` Christian Brauner
2019-03-30 18:00               ` Jann Horn
2019-03-30 18:00                 ` Jann Horn
2019-03-31 20:09               ` Andy Lutomirski
2019-03-31 20:09                 ` Andy Lutomirski
2019-03-31 21:03                 ` Linus Torvalds
2019-03-31 21:03                   ` Linus Torvalds
2019-03-31 21:10                   ` Christian Brauner
2019-03-31 21:10                     ` Christian Brauner
2019-03-31 21:17                     ` Linus Torvalds
2019-03-31 21:17                       ` Linus Torvalds
2019-03-31 22:03                       ` Christian Brauner
2019-03-31 22:03                         ` Christian Brauner
2019-03-31 22:16                         ` Linus Torvalds
2019-03-31 22:16                           ` Linus Torvalds
2019-03-31 22:33                           ` Christian Brauner
2019-03-31 22:33                             ` Christian Brauner
2019-04-01  0:52                             ` Jann Horn
2019-04-01  0:52                               ` Jann Horn
2019-04-01  8:47                               ` Yann Droneaud
2019-04-01  8:47                                 ` Yann Droneaud
2019-04-01 10:03                               ` Jonathan Kowalski
2019-04-01 10:03                                 ` Jonathan Kowalski
2019-03-31 23:40                           ` Linus Torvalds
2019-03-31 23:40                             ` Linus Torvalds
2019-04-01  0:09                             ` Al Viro
2019-04-01  0:09                               ` Al Viro
2019-04-01  0:18                               ` Linus Torvalds
2019-04-01  0:18                                 ` Linus Torvalds
2019-04-01  0:21                                 ` Christian Brauner
2019-04-01  0:21                                   ` Christian Brauner
2019-04-01  6:37                                 ` Al Viro
2019-04-01  6:37                                   ` Al Viro
2019-04-01  6:41                                   ` Al Viro
2019-04-01  6:41                                     ` Al Viro
2019-03-31 22:03                       ` Jonathan Kowalski
2019-03-31 22:03                         ` Jonathan Kowalski
2019-04-01  2:13                       ` Andy Lutomirski
2019-04-01  2:13                         ` Andy Lutomirski
2019-04-01 11:40                         ` Aleksa Sarai
2019-04-01 11:40                           ` Aleksa Sarai
2019-04-01 15:36                           ` Linus Torvalds
2019-04-01 15:36                             ` Linus Torvalds
2019-04-01 15:47                             ` Christian Brauner
2019-04-01 15:47                               ` Christian Brauner
2019-04-01 15:55                             ` Daniel Colascione
2019-04-01 15:55                               ` Daniel Colascione
2019-04-01 16:01                               ` Linus Torvalds
2019-04-01 16:01                                 ` Linus Torvalds
2019-04-01 16:13                                 ` Daniel Colascione
2019-04-01 16:13                                   ` Daniel Colascione
2019-04-01 19:42                                 ` Christian Brauner [this message]
2019-04-01 19:42                                   ` Christian Brauner
2019-04-01 21:30                                   ` Linus Torvalds
2019-04-01 21:30                                     ` Linus Torvalds
2019-04-01 21:58                                     ` Jonathan Kowalski
2019-04-01 21:58                                       ` Jonathan Kowalski
2019-04-01 22:13                                       ` Linus Torvalds
2019-04-01 22:13                                         ` Linus Torvalds
2019-04-01 22:34                                         ` Daniel Colascione
2019-04-01 22:34                                           ` Daniel Colascione
2019-04-01 16:07                               ` Jonathan Kowalski
2019-04-01 16:07                                 ` Jonathan Kowalski
2019-04-01 16:15                                 ` Linus Torvalds
2019-04-01 16:15                                   ` Linus Torvalds
2019-04-01 16:27                                   ` Jonathan Kowalski
2019-04-01 16:27                                     ` Jonathan Kowalski
2019-04-01 16:21                                 ` Daniel Colascione
2019-04-01 16:21                                   ` Daniel Colascione
2019-04-01 16:29                                   ` Linus Torvalds
2019-04-01 16:29                                     ` Linus Torvalds
2019-04-01 16:45                                     ` Daniel Colascione
2019-04-01 16:45                                       ` Daniel Colascione
2019-04-01 17:00                                       ` David Laight
2019-04-01 17:00                                         ` David Laight
2019-04-01 17:32                                       ` Linus Torvalds
2019-04-01 17:32                                         ` Linus Torvalds
2019-04-02 11:03                                       ` Florian Weimer
2019-04-02 11:03                                         ` Florian Weimer
2019-04-01 16:10                             ` Andy Lutomirski
2019-04-01 16:10                               ` Andy Lutomirski
2019-04-01 12:04                         ` Christian Brauner
2019-04-01 12:04                           ` Christian Brauner
2019-04-01 13:43                           ` Jann Horn
2019-04-01 13:43                             ` Jann Horn
2019-03-31 21:19                 ` Christian Brauner
2019-03-31 21:19                   ` Christian Brauner
2019-03-30 16:37       ` Christian Brauner
2019-03-30 16:37         ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190401194214.4rbvmwogpke3cjcx@brauner.io \
    --to=christian@brauner.io \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=bl0pbl33p@gmail.com \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ldv@altlinux.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=nagarathnam.muthusamy@oracle.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.