From: Jonathan Kowalski <bl0pbl33p@gmail.com>
To: Christian Brauner <christian@brauner.io>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Daniel Colascione <dancol@google.com>,
	Jann Horn <jannh@google.com>, Andrew Lutomirski <luto@kernel.org>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	"Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Sat, 30 Mar 2019 17:59:34 +0000
Message-ID: <CAGLj2rEtNNs0BXHuGkjpT4seHuh=Lj79iVYP-n117+Dv+ThcJA@mail.gmail.com>
In-Reply-To: <20190330175241.4itdnx3tl5upzjxd@brauner.io>

On Sat, Mar 30, 2019 at 5:52 PM Christian Brauner <christian@brauner.io> wrote:
>
> On Sat, Mar 30, 2019 at 05:50:20PM +0000, Jonathan Kowalski wrote:
> > On Sat, Mar 30, 2019 at 5:24 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > On Sat, Mar 30, 2019 at 10:12 AM Christian Brauner <christian@brauner.io> wrote:
> > > >
> > > >
> > > > To clarify, what the Android guys really wanted to be part of the api is
> > > > a way to get race-free access to metadata associated with a given pidfd.
> > > > And the idea was that *if and only if procfs is mounted* you could do:
> > > >
> > > > int pidfd = pidfd_open(1234, 0);
> > > >
> > > > int procfd = open("/proc", O_RDONLY | O_CLOEXEC);
> > > > int procpidfd = ioctl(pidfd, PIDFD_TO_PROCFD, procfd);
> > >
> > > And my claim is that this is three system calls - one of them very
> > > hacky - to just do
> > >
> > >     int pidfd = open("/proc/%d", O_PATH);
> > >
> > > and you're done. It acts as the pidfd _and_ the way to get the
> > > associated status files etc.
> > >
> > > So there is absolutely zero advantage to going through pidfd_open().
> > >
> > > No. No. No.
> > >
> > > So the *only* reason for "pidfd_open()" is if you don't have /proc in
> > > the first place. In which case the whole PIDFD_TO_PROCFD is bogus.
> > >
> > > Yeah, yeah, if you want to avoid going through the pathname
> > > translation, that's one thing, but if that's your aim, then you again
> > > should also just admit that PIDFD_TO_PROCFD is disgusting and wrong,
> > > and you're basically saying "ok, I'm not going to do /proc at all".
> > >
> > > So I'm ok with the whole "simpler, faster, no-proc pidfd", but then it
> > > really has to be *SIMPLER* and *NO PROCFS*.
> > >
> >
> > (Resending because it accidentally wasn't a reply-all)
> >
> > If you go with pidfd_open, that should also mean you remove the
> > ability to be able to use /proc/<PID> dir fds in pidfd_send_signal.
> >
> > Otherwise the semantics are hairy: I can only pidfd_open a task
> > reachable from my active namespace, but somehow also be able to open a
>
> You can easily setns() to another pid namespace and get a pidfd there.
> That's how most namespace interactions work right now. We already had
> that discussion.

Only if it is a child namespace, or you have the relevant capabilities to setns.

Currently, if I just put a task in a PID namespace, it can still see the
/proc of an ancestor PID namespace and opendir /proc/<PID> there, and the
resulting fd is accepted by pidfd_send_signal.
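
A minimal sketch of the pattern described above, assuming a 5.1+ kernel
with procfs mounted at /proc. The helper names open_proc_pid_dir() and
signal_via_dirfd() are hypothetical, and the fallback syscall number is an
assumption for toolchains whose headers predate pidfd_send_signal:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 424 is the number used on most architectures; it is only
 * a fallback for headers that do not define SYS_pidfd_send_signal. */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif

/* Hypothetical helper: open /proc/<pid> as a directory fd.
 * Returns the fd, or -1 with errno set. */
int open_proc_pid_dir(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d", (int)pid);
	return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/* Hypothetical helper: pass the directory fd to pidfd_send_signal.
 * Returns 0 on success, or a negative errno value on failure. */
int signal_via_dirfd(int dirfd, int sig)
{
	/* No siginfo (NULL) and no flags (0). */
	if (syscall(SYS_pidfd_send_signal, dirfd, sig, NULL, 0) < 0)
		return -errno;
	return 0;
}
```

Calling signal_via_dirfd(open_proc_pid_dir(pid), 0) only performs the
existence and permission check (signal 0). Which procfs instance the fd
came from is decided by the path that was opened, not by the caller's
active PID namespace.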

If you ever allow signalling across PID namespaces (because file
descriptors should be able to do that, they are not namespaced, see
files, sockets, etc), it will become a problem. Getting pidfds from
outside my active namespace should require userspace cooperation.

So, opening a pidfd should be limited to what *I* can see in my
namespace, like every other namespace. That is what a namespace is,
and PIDs have their own namespace, they're not exposed in the
filesystem namespace.
