From: Jonathan Kowalski <bl0pbl33p@gmail.com> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jann Horn <jannh@google.com>, Joel Fernandes <joel@joelfernandes.org>, Daniel Colascione <dancol@google.com>, Christian Brauner <christian@brauner.io>, Andrew Lutomirski <luto@kernel.org>, David Howells <dhowells@redhat.com>, "Serge E. Hallyn" <serge@hallyn.com>, Linux API <linux-api@vger.kernel.org>, Linux List Kernel Mailing <linux-kernel@vger.kernel.org>, Arnd Bergmann <arnd@arndb.de>, "Eric W. Biederman" <ebiederm@xmission.com>, Konstantin Khlebnikov <khlebnikov@yandex-team.ru>, Kees Cook <keescook@chromium.org>, Alexey Dobriyan <adobriyan@gmail.com>, Thomas Gleixner <tglx@linutronix.de>, Michael Kerrisk-manpages <mtk.manpages@gmail.com>, "Dmitry V. Levin" <ldv@altlinux.org>, Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg@redhat.com>, Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>, Aleksa Sarai <cyphar@cyphar.com>, Al Viro <viro@zeniv.linux.org.uk> Subject: Re: [PATCH v2 0/5] pid: add pidfd_open() Date: Sun, 31 Mar 2019 16:33:48 +0100 [thread overview] Message-ID: <CAGLj2rELekFugPJ8OGH8wxFabmZ1UJBORwywpZrStKjohTF63A@mail.gmail.com> (raw) In-Reply-To: <CAHk-=wiZ40LVjnXSi9iHLE_-ZBsWFGCgdmNiYZUXn1-V5YBg2g@mail.gmail.com> On Sun, Mar 31, 2019 at 3:59 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Sat, Mar 30, 2019 at 9:47 PM Jann Horn <jannh@google.com> wrote: > > > > Sure, given a pidfd_clone() syscall, as long as the parent of the > > process is giving you a pidfd for it and you don't have to deal with > > grandchildren created by fork() calls outside your control, that > > works. > > Don't do pidfd_clone() and pidfd_wait(). > > Both of those existing system calls already get a "flags" argument. > Just make a WPIDFD (for waitid) and CLONE_PIDFD (for clone) bit, and > make the existing system calls just take/return a pidfd. clone is out of flags, so there will have to be a new system call. I am not sure about the waitid bit. Are you suggesting it takes a pidfd and waits using it? I was thinking if we could make the pidfd itself pollable and readable for exit status. At pidfd_open time, you pass the flag and only if you're a parent you get a readable instance, if not, a pollable one for everyone (eg. for an indirect child as a reaper), and it fails for threads. Then, the pidfd clone2 returns can also be polled and read from. The main pain point is, currently when I ptrace from a thread a process, I need to use waitpid (waitid throws away ptrace critical information), and since ptrace works on a thread by thread basis, only the attached thread can do the waitpid. This means I cannot do anything else from the attached thread concurrently. waitfd was supposed to solve this (back in 2009) but it never made it in, and clone4 from Josh Triplett did something similar (returned exit status over the clonefd). FreeBSD's process descriptors are also pollable (which is where all this work was originally inspired from) and it would help with adoption if semantics were similar. Besides that, it would help libraries to be able to host their own set of children without affecting the entire process's waiting logic oe mucking with the SIGCHLD handler (you wouldn't need signals). > > Side note: we could (should?) also make the default maxpid just be > larger. It needs to fit in an 'int', but MAXINT instead of 65535 would > likely alreadt make a lot of these attacks harder. > > There was some really old legacy reason why we actually limited it to > 65535 originally. It was old and crufty even back when.. > > Linus > > Linus
WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Kowalski <bl0pbl33p@gmail.com> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jann Horn <jannh@google.com>, Joel Fernandes <joel@joelfernandes.org>, Daniel Colascione <dancol@google.com>, Christian Brauner <christian@brauner.io>, Andrew Lutomirski <luto@kernel.org>, David Howells <dhowells@redhat.com>, "Serge E. Hallyn" <serge@hallyn.com>, Linux API <linux-api@vger.kernel.org>, Linux List Kernel Mailing <linux-kernel@vger.kernel.org>, Arnd Bergmann <arnd@arndb.de>, "Eric W. Biederman" <ebiederm@xmission.com>, Konstantin Khlebnikov <khlebnikov@yandex-team.ru>, Kees Cook <keescook@chromium.org>, Alexey Dobriyan <adobriyan@gmail.com>, Thomas Gleixner <tglx@linutronix.de>, Michael Kerrisk-manpages <mtk.manpages@gmail.com>, "Dmitry V. Levin" <ldv@altlinux.org>, Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg@> Subject: Re: [PATCH v2 0/5] pid: add pidfd_open() Date: Sun, 31 Mar 2019 16:33:48 +0100 [thread overview] Message-ID: <CAGLj2rELekFugPJ8OGH8wxFabmZ1UJBORwywpZrStKjohTF63A@mail.gmail.com> (raw) In-Reply-To: <CAHk-=wiZ40LVjnXSi9iHLE_-ZBsWFGCgdmNiYZUXn1-V5YBg2g@mail.gmail.com> On Sun, Mar 31, 2019 at 3:59 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Sat, Mar 30, 2019 at 9:47 PM Jann Horn <jannh@google.com> wrote: > > > > Sure, given a pidfd_clone() syscall, as long as the parent of the > > process is giving you a pidfd for it and you don't have to deal with > > grandchildren created by fork() calls outside your control, that > > works. > > Don't do pidfd_clone() and pidfd_wait(). > > Both of those existing system calls already get a "flags" argument. > Just make a WPIDFD (for waitid) and CLONE_PIDFD (for clone) bit, and > make the existing system calls just take/return a pidfd. clone is out of flags, so there will have to be a new system call. I am not sure about the waitid bit. Are you suggesting it takes a pidfd and waits using it? I was thinking if we could make the pidfd itself pollable and readable for exit status. At pidfd_open time, you pass the flag and only if you're a parent you get a readable instance, if not, a pollable one for everyone (eg. for an indirect child as a reaper), and it fails for threads. Then, the pidfd clone2 returns can also be polled and read from. The main pain point is, currently when I ptrace from a thread a process, I need to use waitpid (waitid throws away ptrace critical information), and since ptrace works on a thread by thread basis, only the attached thread can do the waitpid. This means I cannot do anything else from the attached thread concurrently. waitfd was supposed to solve this (back in 2009) but it never made it in, and clone4 from Josh Triplett did something similar (returned exit status over the clonefd). FreeBSD's process descriptors are also pollable (which is where all this work was originally inspired from) and it would help with adoption if semantics were similar. Besides that, it would help libraries to be able to host their own set of children without affecting the entire process's waiting logic oe mucking with the SIGCHLD handler (you wouldn't need signals). > > Side note: we could (should?) also make the default maxpid just be > larger. It needs to fit in an 'int', but MAXINT instead of 65535 would > likely alreadt make a lot of these attacks harder. > > There was some really old legacy reason why we actually limited it to > 65535 originally. It was old and crufty even back when.. > > Linus > > Linus
next prev parent reply other threads:[~2019-03-31 15:33 UTC|newest] Thread overview: 158+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-03-29 15:54 [PATCH v2 0/5] pid: add pidfd_open() Christian Brauner 2019-03-29 15:54 ` [PATCH v2 1/5] Make anon_inodes unconditional Christian Brauner 2019-03-29 15:54 ` [PATCH v2 2/5] pid: add pidfd_open() Christian Brauner 2019-03-29 23:45 ` Jann Horn 2019-03-29 23:45 ` Jann Horn 2019-03-29 23:55 ` Christian Brauner 2019-03-29 23:55 ` Christian Brauner 2019-03-30 11:53 ` Jürg Billeter 2019-03-30 14:37 ` Christian Brauner 2019-03-30 14:51 ` Jonathan Kowalski 2019-03-30 14:51 ` Jonathan Kowalski 2019-03-29 15:54 ` [PATCH v2 3/5] signal: support pidfd_open() with pidfd_send_signal() Christian Brauner 2019-03-29 15:54 ` [PATCH v2 4/5] signal: PIDFD_SIGNAL_TID threads via pidfds Christian Brauner 2019-03-30 1:06 ` Jann Horn 2019-03-30 1:06 ` Jann Horn 2019-03-30 1:22 ` Christian Brauner 2019-03-30 1:22 ` Christian Brauner 2019-03-30 1:34 ` Christian Brauner 2019-03-30 1:34 ` Christian Brauner 2019-03-30 1:42 ` Christian Brauner 2019-03-30 1:42 ` Christian Brauner 2019-03-29 15:54 ` [PATCH v2 5/5] tests: add pidfd_open() tests Christian Brauner 2019-03-30 16:09 ` [PATCH v2 0/5] pid: add pidfd_open() Linus Torvalds 2019-03-30 16:09 ` Linus Torvalds 2019-03-30 16:11 ` Daniel Colascione 2019-03-30 16:11 ` Daniel Colascione 2019-03-30 16:16 ` Linus Torvalds 2019-03-30 16:16 ` Linus Torvalds 2019-03-30 16:18 ` Linus Torvalds 2019-03-30 16:18 ` Linus Torvalds 2019-03-31 1:07 ` Joel Fernandes 2019-03-31 1:07 ` Joel Fernandes 2019-03-31 2:34 ` Jann Horn 2019-03-31 2:34 ` Jann Horn 2019-03-31 4:08 ` Joel Fernandes 2019-03-31 4:08 ` Joel Fernandes 2019-03-31 4:46 ` Jann Horn 2019-03-31 4:46 ` Jann Horn 2019-03-31 14:52 ` Linus Torvalds 2019-03-31 14:52 ` Linus Torvalds 2019-03-31 15:05 ` Christian Brauner 2019-03-31 15:05 ` Christian Brauner 2019-03-31 15:21 ` Daniel Colascione 2019-03-31 15:21 ` Daniel Colascione 2019-03-31 15:33 ` Jonathan Kowalski [this message] 2019-03-31 15:33 ` Jonathan Kowalski 2019-03-30 16:19 ` Christian Brauner 2019-03-30 16:19 ` Christian Brauner 2019-03-30 16:24 ` Linus Torvalds 2019-03-30 16:24 ` Linus Torvalds 2019-03-30 16:34 ` Daniel Colascione 2019-03-30 16:34 ` Daniel Colascione 2019-03-30 16:38 ` Christian Brauner 2019-03-30 16:38 ` Christian Brauner 2019-03-30 17:04 ` Linus Torvalds 2019-03-30 17:04 ` Linus Torvalds 2019-03-30 17:12 ` Christian Brauner 2019-03-30 17:12 ` Christian Brauner 2019-03-30 17:24 ` Linus Torvalds 2019-03-30 17:24 ` Linus Torvalds 2019-03-30 17:37 ` Christian Brauner 2019-03-30 17:37 ` Christian Brauner 2019-03-30 17:50 ` Jonathan Kowalski 2019-03-30 17:50 ` Jonathan Kowalski 2019-03-30 17:52 ` Christian Brauner 2019-03-30 17:52 ` Christian Brauner 2019-03-30 17:59 ` Jonathan Kowalski 2019-03-30 17:59 ` Jonathan Kowalski 2019-03-30 18:02 ` Christian Brauner 2019-03-30 18:02 ` Christian Brauner 2019-03-30 18:00 ` Jann Horn 2019-03-30 18:00 ` Jann Horn 2019-03-31 20:09 ` Andy Lutomirski 2019-03-31 20:09 ` Andy Lutomirski 2019-03-31 21:03 ` Linus Torvalds 2019-03-31 21:03 ` Linus Torvalds 2019-03-31 21:10 ` Christian Brauner 2019-03-31 21:10 ` Christian Brauner 2019-03-31 21:17 ` Linus Torvalds 2019-03-31 21:17 ` Linus Torvalds 2019-03-31 22:03 ` Christian Brauner 2019-03-31 22:03 ` Christian Brauner 2019-03-31 22:16 ` Linus Torvalds 2019-03-31 22:16 ` Linus Torvalds 2019-03-31 22:33 ` Christian Brauner 2019-03-31 22:33 ` Christian Brauner 2019-04-01 0:52 ` Jann Horn 2019-04-01 0:52 ` Jann Horn 2019-04-01 8:47 ` Yann Droneaud 2019-04-01 8:47 ` Yann Droneaud 2019-04-01 10:03 ` Jonathan Kowalski 2019-04-01 10:03 ` Jonathan Kowalski 2019-03-31 23:40 ` Linus Torvalds 2019-03-31 23:40 ` Linus Torvalds 2019-04-01 0:09 ` Al Viro 2019-04-01 0:09 ` Al Viro 2019-04-01 0:18 ` Linus Torvalds 2019-04-01 0:18 ` Linus Torvalds 2019-04-01 0:21 ` Christian Brauner 2019-04-01 0:21 ` Christian Brauner 2019-04-01 6:37 ` Al Viro 2019-04-01 6:37 ` Al Viro 2019-04-01 6:41 ` Al Viro 2019-04-01 6:41 ` Al Viro 2019-03-31 22:03 ` Jonathan Kowalski 2019-03-31 22:03 ` Jonathan Kowalski 2019-04-01 2:13 ` Andy Lutomirski 2019-04-01 2:13 ` Andy Lutomirski 2019-04-01 11:40 ` Aleksa Sarai 2019-04-01 11:40 ` Aleksa Sarai 2019-04-01 15:36 ` Linus Torvalds 2019-04-01 15:36 ` Linus Torvalds 2019-04-01 15:47 ` Christian Brauner 2019-04-01 15:47 ` Christian Brauner 2019-04-01 15:55 ` Daniel Colascione 2019-04-01 15:55 ` Daniel Colascione 2019-04-01 16:01 ` Linus Torvalds 2019-04-01 16:01 ` Linus Torvalds 2019-04-01 16:13 ` Daniel Colascione 2019-04-01 16:13 ` Daniel Colascione 2019-04-01 19:42 ` Christian Brauner 2019-04-01 19:42 ` Christian Brauner 2019-04-01 21:30 ` Linus Torvalds 2019-04-01 21:30 ` Linus Torvalds 2019-04-01 21:58 ` Jonathan Kowalski 2019-04-01 21:58 ` Jonathan Kowalski 2019-04-01 22:13 ` Linus Torvalds 2019-04-01 22:13 ` Linus Torvalds 2019-04-01 22:34 ` Daniel Colascione 2019-04-01 22:34 ` Daniel Colascione 2019-04-01 16:07 ` Jonathan Kowalski 2019-04-01 16:07 ` Jonathan Kowalski 2019-04-01 16:15 ` Linus Torvalds 2019-04-01 16:15 ` Linus Torvalds 2019-04-01 16:27 ` Jonathan Kowalski 2019-04-01 16:27 ` Jonathan Kowalski 2019-04-01 16:21 ` Daniel Colascione 2019-04-01 16:21 ` Daniel Colascione 2019-04-01 16:29 ` Linus Torvalds 2019-04-01 16:29 ` Linus Torvalds 2019-04-01 16:45 ` Daniel Colascione 2019-04-01 16:45 ` Daniel Colascione 2019-04-01 17:00 ` David Laight 2019-04-01 17:00 ` David Laight 2019-04-01 17:32 ` Linus Torvalds 2019-04-01 17:32 ` Linus Torvalds 2019-04-02 11:03 ` Florian Weimer 2019-04-02 11:03 ` Florian Weimer 2019-04-01 16:10 ` Andy Lutomirski 2019-04-01 16:10 ` Andy Lutomirski 2019-04-01 12:04 ` Christian Brauner 2019-04-01 12:04 ` Christian Brauner 2019-04-01 13:43 ` Jann Horn 2019-04-01 13:43 ` Jann Horn 2019-03-31 21:19 ` Christian Brauner 2019-03-31 21:19 ` Christian Brauner 2019-03-30 16:37 ` Christian Brauner 2019-03-30 16:37 ` Christian Brauner
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAGLj2rELekFugPJ8OGH8wxFabmZ1UJBORwywpZrStKjohTF63A@mail.gmail.com \ --to=bl0pbl33p@gmail.com \ --cc=adobriyan@gmail.com \ --cc=akpm@linux-foundation.org \ --cc=arnd@arndb.de \ --cc=christian@brauner.io \ --cc=cyphar@cyphar.com \ --cc=dancol@google.com \ --cc=dhowells@redhat.com \ --cc=ebiederm@xmission.com \ --cc=jannh@google.com \ --cc=joel@joelfernandes.org \ --cc=keescook@chromium.org \ --cc=khlebnikov@yandex-team.ru \ --cc=ldv@altlinux.org \ --cc=linux-api@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=luto@kernel.org \ --cc=mtk.manpages@gmail.com \ --cc=nagarathnam.muthusamy@oracle.com \ --cc=oleg@redhat.com \ --cc=serge@hallyn.com \ --cc=tglx@linutronix.de \ --cc=torvalds@linux-foundation.org \ --cc=viro@zeniv.linux.org.uk \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.