From: Daniel Colascione <dancol@google.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>, Andy Lutomirski <luto@amacapital.net>, Christian Brauner <christian@brauner.io>, Jann Horn <jannh@google.com>, Andrew Lutomirski <luto@kernel.org>, David Howells <dhowells@redhat.com>, "Serge E. Hallyn" <serge@hallyn.com>, Linux API <linux-api@vger.kernel.org>, Linux List Kernel Mailing <linux-kernel@vger.kernel.org>, Arnd Bergmann <arnd@arndb.de>, "Eric W. Biederman" <ebiederm@xmission.com>, Konstantin Khlebnikov <khlebnikov@yandex-team.ru>, Kees Cook <keescook@chromium.org>, Alexey Dobriyan <adobriyan@gmail.com>, Thomas Gleixner <tglx@linutronix.de>, Michael Kerrisk-manpages <mtk.manpages@gmail.com>, Jonathan Kowalski <bl0pbl33p@gmail.com>, "Dmitry V. Levin" <ldv@altlinux.org>, Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg@redhat.com>, Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>, Al Viro <viro@zeniv.linux.org.uk>, Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
Date: Mon, 1 Apr 2019 08:55:00 -0700
Message-ID: <CAKOZuev4Q4CY0-2rUpTujSKMVJ9L9Exv=_divFC0G0_OaQHaGw@mail.gmail.com>
In-Reply-To: <CAHk-=wgKqBQznZdTQaM6yQ+_5dcz-+q8=2sbQsAoDh55hQTLMA@mail.gmail.com>

On Mon, Apr 1, 2019 at 8:36 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Mon, Apr 1, 2019 at 4:41 AM Aleksa Sarai <cyphar@cyphar.com> wrote:
> >
> > Eric pitched a procfs2 which would *just* be the PIDs some time ago (in
> > an attempt to make it possible one day to mount /proc inside a container
> > without adding a bunch of masked paths), though it was just an idea and
> > I don't know if he ever had a patch for it.

Couldn't this mode just be a relatively simple procfs mount option
instead of a whole new filesystem? It'd be a bit like hidepid, right?
The internal bind mount option and the no-dotdot-traversal options
also look good to me.

> I wonder if we really want a fill procfs2, or maybe we could just make
> the pidfd readable (yes, it's a directory file descriptor, but we
> could allow reading).

What would read(2) read?

> What are the *actual* use cases for opening /proc files through it? If
> it's really just for a small subset that android wants to do this
> (getting basic process state like "running" etc), rather than anything
> else, then we could skip the whole /proc linking entirely and go the
> other way instead (ie open_pidfd() would get that limited IO model,
> and we could make the /proc directory node get the same limited IO
> model).

We do a lot of process state inspection and manipulation, including
reading and writing the oom killer adjustment score, reading smaps,
and the occasional cgroup manipulation. More generally, I'd also like
to be able to write a race-free pkill(1). Doing this work via pidfd
would be convenient. More generally, we can't enumerate the specific
use cases, because what we want to do with processes isn't bounded in
advance, and we regularly find new things in /proc/pid that we want to
read and write. I'd rather not prematurely limit the applicability of
the pidfd interface, especially when there's a simple option (the
procfs directory file descriptor approach) that doesn't require
in-advance enumeration of supported process inspection and
manipulation actions or a separate per-option pidfd equivalent. I very
much want a general-purpose API that reuses the metadata interfaces
the kernel already exposes. It's not clear to me how this rich
interface could be matched by read(2) on a pidfd.
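
To make the "procfs directory file descriptor approach" concrete, here is a minimal sketch, assuming a mounted /proc and an ordinary userspace caller. The helper names open_proc_pid_dir() and read_proc_file() are hypothetical and only for illustration; they are not part of the patch series. The point is that once the /proc/<pid> directory is held open, any file under it (oom_score_adj, smaps, cgroup, or whatever turns up later) can be reached with openat(2) without re-resolving the numeric PID.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open /proc/<pid> as a directory file descriptor.
 * Returns the descriptor, or -1 on error with errno set by open(2).
 */
static int open_proc_pid_dir(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%ld", (long)pid);
	return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/*
 * Hypothetical helper: read the file `name` relative to `dirfd` into
 * `buf` and NUL-terminate it. Performs a single read(2), which is
 * enough for small files such as oom_score_adj. Returns the number of
 * bytes read, or -1 on error.
 */
static ssize_t read_proc_file(int dirfd, const char *name,
			      char *buf, size_t len)
{
	ssize_t n;
	int fd;

	if (len == 0)
		return -1;

	/* Resolved relative to the held directory, not via the PID number. */
	fd = openat(dirfd, name, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;

	buf[n] = '\0';
	return n;
}
```

A caller would do something like `int d = open_proc_pid_dir(pid);` followed by `read_proc_file(d, "oom_score_adj", buf, sizeof(buf));`. If the process has exited in the meantime, the lookup through the held directory is expected to fail rather than land on an unrelated process that reused the PID number. The sketch does not close the window between learning the PID and opening the directory; that gap is what the pidfd work in this thread is meant to address.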