All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Colascione <dancol@google.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Christian Brauner <christian@brauner.io>,
	Jann Horn <jannh@google.com>, Andrew Lutomirski <luto@kernel.org>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	"Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Mon, 1 Apr 2019 09:45:22 -0700	[thread overview]
Message-ID: <CAKOZuet2QCoW1XMpTAzXS868PhF-gh14czFn4gwE4A=oX=Yr8w@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjgENV0pT8=MjB1UVBeWCrECCNXv15AKKX7Kfg7Eak7qw@mail.gmail.com>

On Mon, Apr 1, 2019 at 9:30 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Apr 1, 2019 at 9:22 AM Daniel Colascione <dancol@google.com> wrote:
> >
> > There's a subtlety: suppose I'm a library and I want to create a
> > private subprocess. I use the new clone facility, whatever it is, and
> > get a pidfd back. I need to be able to read the child's exit status
> > even if the child exits before clone returns in the parent. Doesn't
> > this requirement imply that the pidfd, kernel-side, contain something
> > a bit more than a struct pid?
>
> Note that clone() has always supported this, but it has basically
> never been used.
>
> One of the early thinkings behind clone() was that it could be used
> for basically "AIO in user space", by having exactly these kinds of
> library-private internal subthreads that are hidden as real threads.
>
> It's why we have that special CSIGNAL mask for setting the exit
> signal. It doesn't *just* set the signal to be sent when the thread
> exits - setting the signal to something else than SIGCHLD actually
> changes behavior in other ways too.
>
> In particular a non-SIGCHLD thread won't be found by a regular
> "waitpid()". You have to explicitly wait for it with __WCLONE (which
> in turn is supposed to be used with the explicit pid to be waited
> for).

But doesn't the CSIGNAL approach still require that libraries somehow
coordinate which non-SIGCHLD signal they use? (Signal coordination a
separate problem, unfortunately.) If you don't set a signal, you don't
get any notification at all, and in that case, you have to burn a
thread on a blocking wait, right? And you don't *have* to wait for a
specific PID with __WCLONE. If two libraries each did a separate
__WCLONE or __WALL wait on all subprocesses, they'd still interfere,
AIUI, even if the interface isn't supposed to be used that way.

I like the pidfd way of doing private subprocesses a bit more than the
CSIGNAL approach because it's more "natural" (in that a pidfd is just
a handle like we have handles to tons of other things), integrates
with existing event and wait facilities, doesn't require any global
coordination at all, and works for more than this one use case.
(CSIGNAL is limited to immediate parents. You could, in principle,
SCM_RIGHTS a pidfd to someone else.)

> Now, none of this was ever really used. The people who wanted AIO
> wanted the brain-damaged POSIX kind of AIO, not something cool and
> clever. So I suspect the whole exit-signal thing was just used for
> random small per-project things.

AIO is a whole other ball of wax. :-)

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Colascione <dancol@google.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Christian Brauner <christian@brauner.io>,
	Jann Horn <jannh@google.com>, Andrew Lutomirski <luto@kernel.org>,
	David Howells <dhowells@redhat.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Linux API <linux-api@vger.kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Kees Cook <keescook@chromium.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk-manpages <mtk.manpages@gmail.com>,
	"Dmitry V. Levin" <ldv@altlinux.org>,
	Andrew Morton <akpm@linux-fou>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Mon, 1 Apr 2019 09:45:22 -0700	[thread overview]
Message-ID: <CAKOZuet2QCoW1XMpTAzXS868PhF-gh14czFn4gwE4A=oX=Yr8w@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjgENV0pT8=MjB1UVBeWCrECCNXv15AKKX7Kfg7Eak7qw@mail.gmail.com>

On Mon, Apr 1, 2019 at 9:30 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Apr 1, 2019 at 9:22 AM Daniel Colascione <dancol@google.com> wrote:
> >
> > There's a subtlety: suppose I'm a library and I want to create a
> > private subprocess. I use the new clone facility, whatever it is, and
> > get a pidfd back. I need to be able to read the child's exit status
> > even if the child exits before clone returns in the parent. Doesn't
> > this requirement imply that the pidfd, kernel-side, contain something
> > a bit more than a struct pid?
>
> Note that clone() has always supported this, but it has basically
> never been used.
>
> One of the early thinkings behind clone() was that it could be used
> for basically "AIO in user space", by having exactly these kinds of
> library-private internal subthreads that are hidden as real threads.
>
> It's why we have that special CSIGNAL mask for setting the exit
> signal. It doesn't *just* set the signal to be sent when the thread
> exits - setting the signal to something else than SIGCHLD actually
> changes behavior in other ways too.
>
> In particular a non-SIGCHLD thread won't be found by a regular
> "waitpid()". You have to explicitly wait for it with __WCLONE (which
> in turn is supposed to be used with the explicit pid to be waited
> for).

But doesn't the CSIGNAL approach still require that libraries somehow
coordinate which non-SIGCHLD signal they use? (Signal coordination a
separate problem, unfortunately.) If you don't set a signal, you don't
get any notification at all, and in that case, you have to burn a
thread on a blocking wait, right? And you don't *have* to wait for a
specific PID with __WCLONE. If two libraries each did a separate
__WCLONE or __WALL wait on all subprocesses, they'd still interfere,
AIUI, even if the interface isn't supposed to be used that way.

I like the pidfd way of doing private subprocesses a bit more than the
CSIGNAL approach because it's more "natural" (in that a pidfd is just
a handle like we have handles to tons of other things), integrates
with existing event and wait facilities, doesn't require any global
coordination at all, and works for more than this one use case.
(CSIGNAL is limited to immediate parents. You could, in principle,
SCM_RIGHTS a pidfd to someone else.)

> Now, none of this was ever really used. The people who wanted AIO
> wanted the brain-damaged POSIX kind of AIO, not something cool and
> clever. So I suspect the whole exit-signal thing was just used for
> random small per-project things.

AIO is a whole other ball of wax. :-)

  reply	other threads:[~2019-04-01 16:45 UTC|newest]

Thread overview: 158+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-29 15:54 [PATCH v2 0/5] pid: add pidfd_open() Christian Brauner
2019-03-29 15:54 ` [PATCH v2 1/5] Make anon_inodes unconditional Christian Brauner
2019-03-29 15:54 ` [PATCH v2 2/5] pid: add pidfd_open() Christian Brauner
2019-03-29 23:45   ` Jann Horn
2019-03-29 23:45     ` Jann Horn
2019-03-29 23:55     ` Christian Brauner
2019-03-29 23:55       ` Christian Brauner
2019-03-30 11:53   ` Jürg Billeter
2019-03-30 14:37     ` Christian Brauner
2019-03-30 14:51       ` Jonathan Kowalski
2019-03-30 14:51         ` Jonathan Kowalski
2019-03-29 15:54 ` [PATCH v2 3/5] signal: support pidfd_open() with pidfd_send_signal() Christian Brauner
2019-03-29 15:54 ` [PATCH v2 4/5] signal: PIDFD_SIGNAL_TID threads via pidfds Christian Brauner
2019-03-30  1:06   ` Jann Horn
2019-03-30  1:06     ` Jann Horn
2019-03-30  1:22     ` Christian Brauner
2019-03-30  1:22       ` Christian Brauner
2019-03-30  1:34       ` Christian Brauner
2019-03-30  1:34         ` Christian Brauner
2019-03-30  1:42         ` Christian Brauner
2019-03-30  1:42           ` Christian Brauner
2019-03-29 15:54 ` [PATCH v2 5/5] tests: add pidfd_open() tests Christian Brauner
2019-03-30 16:09 ` [PATCH v2 0/5] pid: add pidfd_open() Linus Torvalds
2019-03-30 16:09   ` Linus Torvalds
2019-03-30 16:11   ` Daniel Colascione
2019-03-30 16:11     ` Daniel Colascione
2019-03-30 16:16     ` Linus Torvalds
2019-03-30 16:16       ` Linus Torvalds
2019-03-30 16:18       ` Linus Torvalds
2019-03-30 16:18         ` Linus Torvalds
2019-03-31  1:07         ` Joel Fernandes
2019-03-31  1:07           ` Joel Fernandes
2019-03-31  2:34           ` Jann Horn
2019-03-31  2:34             ` Jann Horn
2019-03-31  4:08             ` Joel Fernandes
2019-03-31  4:08               ` Joel Fernandes
2019-03-31  4:46               ` Jann Horn
2019-03-31  4:46                 ` Jann Horn
2019-03-31 14:52                 ` Linus Torvalds
2019-03-31 14:52                   ` Linus Torvalds
2019-03-31 15:05                   ` Christian Brauner
2019-03-31 15:05                     ` Christian Brauner
2019-03-31 15:21                     ` Daniel Colascione
2019-03-31 15:21                       ` Daniel Colascione
2019-03-31 15:33                   ` Jonathan Kowalski
2019-03-31 15:33                     ` Jonathan Kowalski
2019-03-30 16:19   ` Christian Brauner
2019-03-30 16:19     ` Christian Brauner
2019-03-30 16:24     ` Linus Torvalds
2019-03-30 16:24       ` Linus Torvalds
2019-03-30 16:34       ` Daniel Colascione
2019-03-30 16:34         ` Daniel Colascione
2019-03-30 16:38         ` Christian Brauner
2019-03-30 16:38           ` Christian Brauner
2019-03-30 17:04         ` Linus Torvalds
2019-03-30 17:04           ` Linus Torvalds
2019-03-30 17:12           ` Christian Brauner
2019-03-30 17:12             ` Christian Brauner
2019-03-30 17:24             ` Linus Torvalds
2019-03-30 17:24               ` Linus Torvalds
2019-03-30 17:37               ` Christian Brauner
2019-03-30 17:37                 ` Christian Brauner
2019-03-30 17:50               ` Jonathan Kowalski
2019-03-30 17:50                 ` Jonathan Kowalski
2019-03-30 17:52                 ` Christian Brauner
2019-03-30 17:52                   ` Christian Brauner
2019-03-30 17:59                   ` Jonathan Kowalski
2019-03-30 17:59                     ` Jonathan Kowalski
2019-03-30 18:02                     ` Christian Brauner
2019-03-30 18:02                       ` Christian Brauner
2019-03-30 18:00               ` Jann Horn
2019-03-30 18:00                 ` Jann Horn
2019-03-31 20:09               ` Andy Lutomirski
2019-03-31 20:09                 ` Andy Lutomirski
2019-03-31 21:03                 ` Linus Torvalds
2019-03-31 21:03                   ` Linus Torvalds
2019-03-31 21:10                   ` Christian Brauner
2019-03-31 21:10                     ` Christian Brauner
2019-03-31 21:17                     ` Linus Torvalds
2019-03-31 21:17                       ` Linus Torvalds
2019-03-31 22:03                       ` Christian Brauner
2019-03-31 22:03                         ` Christian Brauner
2019-03-31 22:16                         ` Linus Torvalds
2019-03-31 22:16                           ` Linus Torvalds
2019-03-31 22:33                           ` Christian Brauner
2019-03-31 22:33                             ` Christian Brauner
2019-04-01  0:52                             ` Jann Horn
2019-04-01  0:52                               ` Jann Horn
2019-04-01  8:47                               ` Yann Droneaud
2019-04-01  8:47                                 ` Yann Droneaud
2019-04-01 10:03                               ` Jonathan Kowalski
2019-04-01 10:03                                 ` Jonathan Kowalski
2019-03-31 23:40                           ` Linus Torvalds
2019-03-31 23:40                             ` Linus Torvalds
2019-04-01  0:09                             ` Al Viro
2019-04-01  0:09                               ` Al Viro
2019-04-01  0:18                               ` Linus Torvalds
2019-04-01  0:18                                 ` Linus Torvalds
2019-04-01  0:21                                 ` Christian Brauner
2019-04-01  0:21                                   ` Christian Brauner
2019-04-01  6:37                                 ` Al Viro
2019-04-01  6:37                                   ` Al Viro
2019-04-01  6:41                                   ` Al Viro
2019-04-01  6:41                                     ` Al Viro
2019-03-31 22:03                       ` Jonathan Kowalski
2019-03-31 22:03                         ` Jonathan Kowalski
2019-04-01  2:13                       ` Andy Lutomirski
2019-04-01  2:13                         ` Andy Lutomirski
2019-04-01 11:40                         ` Aleksa Sarai
2019-04-01 11:40                           ` Aleksa Sarai
2019-04-01 15:36                           ` Linus Torvalds
2019-04-01 15:36                             ` Linus Torvalds
2019-04-01 15:47                             ` Christian Brauner
2019-04-01 15:47                               ` Christian Brauner
2019-04-01 15:55                             ` Daniel Colascione
2019-04-01 15:55                               ` Daniel Colascione
2019-04-01 16:01                               ` Linus Torvalds
2019-04-01 16:01                                 ` Linus Torvalds
2019-04-01 16:13                                 ` Daniel Colascione
2019-04-01 16:13                                   ` Daniel Colascione
2019-04-01 19:42                                 ` Christian Brauner
2019-04-01 19:42                                   ` Christian Brauner
2019-04-01 21:30                                   ` Linus Torvalds
2019-04-01 21:30                                     ` Linus Torvalds
2019-04-01 21:58                                     ` Jonathan Kowalski
2019-04-01 21:58                                       ` Jonathan Kowalski
2019-04-01 22:13                                       ` Linus Torvalds
2019-04-01 22:13                                         ` Linus Torvalds
2019-04-01 22:34                                         ` Daniel Colascione
2019-04-01 22:34                                           ` Daniel Colascione
2019-04-01 16:07                               ` Jonathan Kowalski
2019-04-01 16:07                                 ` Jonathan Kowalski
2019-04-01 16:15                                 ` Linus Torvalds
2019-04-01 16:15                                   ` Linus Torvalds
2019-04-01 16:27                                   ` Jonathan Kowalski
2019-04-01 16:27                                     ` Jonathan Kowalski
2019-04-01 16:21                                 ` Daniel Colascione
2019-04-01 16:21                                   ` Daniel Colascione
2019-04-01 16:29                                   ` Linus Torvalds
2019-04-01 16:29                                     ` Linus Torvalds
2019-04-01 16:45                                     ` Daniel Colascione [this message]
2019-04-01 16:45                                       ` Daniel Colascione
2019-04-01 17:00                                       ` David Laight
2019-04-01 17:00                                         ` David Laight
2019-04-01 17:32                                       ` Linus Torvalds
2019-04-01 17:32                                         ` Linus Torvalds
2019-04-02 11:03                                       ` Florian Weimer
2019-04-02 11:03                                         ` Florian Weimer
2019-04-01 16:10                             ` Andy Lutomirski
2019-04-01 16:10                               ` Andy Lutomirski
2019-04-01 12:04                         ` Christian Brauner
2019-04-01 12:04                           ` Christian Brauner
2019-04-01 13:43                           ` Jann Horn
2019-04-01 13:43                             ` Jann Horn
2019-03-31 21:19                 ` Christian Brauner
2019-03-31 21:19                   ` Christian Brauner
2019-03-30 16:37       ` Christian Brauner
2019-03-30 16:37         ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKOZuet2QCoW1XMpTAzXS868PhF-gh14czFn4gwE4A=oX=Yr8w@mail.gmail.com' \
    --to=dancol@google.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=bl0pbl33p@gmail.com \
    --cc=christian@brauner.io \
    --cc=cyphar@cyphar.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ldv@altlinux.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=nagarathnam.muthusamy@oracle.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.