linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Daniel Colascione <dancol@google.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Jann Horn <jannh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, Aleksa Sarai <cyphar@cyphar.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	Kees Cook <keescook@chromium.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH] proc: allow killing processes via file descriptors
Date: Sun, 18 Nov 2018 20:24:45 +0100	[thread overview]
Message-ID: <20181118192443.7ylyl24svrn6jvjd@brauner.io> (raw)
In-Reply-To: <CAKOZueto1CmUSWFKjOkazcDcZH6CiXXCr4U=ghewLh8GjE2-=A@mail.gmail.com>

On Sun, Nov 18, 2018 at 10:07:31AM -0800, Daniel Colascione wrote:
> On Sun, Nov 18, 2018 at 9:41 AM, Christian Brauner <christian@brauner.io> wrote:
> > On Sun, Nov 18, 2018 at 07:38:09AM -0800, Andy Lutomirski wrote:
> >> On Sun, Nov 18, 2018 at 5:59 AM Daniel Colascione <dancol@google.com> wrote:
> >> >
> >> > I had been led to believe that the proposal would be a comprehensive
> >> > process API, not an ioctl basically equivalent to my previous patch.
> >> > If you had a more comprehensive proposal, please just share it on LKML
> >> > instead of limiting the discussion to those able to attend these
> >> > various conferences. If there's some determined opposition to a
> >> > general new process API, this opposition needs a fair and full airing,
> >> > as not everyone can attend these conferences.
> >> >
> >> > On Sun, Nov 18, 2018 at 3:17 AM, Christian Brauner <christian@brauner.io> wrote:
> >> > > With this patch an open() call on /proc/<pid> will give userspace a handle
> >> > > to struct pid of the process associated with /proc/<pid>. This allows to
> >> > > maintain a stable handle on a process.
> >> > > I have been discussing various approaches extensively during technical
> >> > > conferences this year culminating in a long argument with Eric at Linux
> >> > > Plumbers. The general consensus was that having a handle on a process
> >> > > will be something that is very simple and easy to maintain
> >> >
> >> > ioctls are the opposite of "easy to maintain". Their
> >> > file-descriptor-specific behavior makes it difficult to use the things
> >> > safely. If you want to take this approach, please make a new system
> >> > call. An ioctl is just a system call with a very strange spelling and
> >> > unfortunate collision semantics.
> >> >
> >> > > with the
> >> > > option of being extensible via a more advanced api if the need arises.
> >> >
> >> > The need *has* arisen; see my exithand patch.
> >> >
> >> > > I
> >> > > believe that this patch is the most simple, dumb, and therefore
> >> > > maintainable solution.
> >> > >
> >> > > The need for this has arisen in order to reliably kill a process without
> >> > > running into issues of the pid being recycled as has been described in the
> >> > > rejected patch [1].
> >> >
> >> > That patch was not "rejected". It was tabled pending the more
> >> > comprehensive process API proposal that was supposed to have emerged.
> >> > This patch is just another variant of the sort of approach we
> >> > discussed on that patch's thread here. As I mentioned on that thread,
> >> > the right approach option is a new system call, not an ioctl.
> >> >
> >> >  To fulfill the need described in that patchset a new
> >> > > ioctl() PROC_FD_SIGNAL is added. It can be used to send signals to a
> >> > > process via a file descriptor:
> >> > >
> >> > > int fd = open("/proc/1234", O_DIRECTORY | O_CLOEXEC);
> >> > > ioctl(fd, PROC_FD_SIGNAL, SIGKILL);
> >> > > close(fd);
> >> > >
> >> > > Note, the stable handle will allow us to carefully extend this feature in
> >> > > the future.
> >> >
> >> > We still need the ability to synchronously wait on a process's death,
> >> > as in my patch set. I will be refreshing that patch set.
> >>
> >> I fully agree that a more comprehensive, less expensive API for
> >> managing processes would be nice.  But I also think that this patch
> >> (using the directory fd and ioctl) is better from a security
> >> perspective than using a new file in /proc.
> >>
> >> I have an old patch to make proc directory fds pollable:
> >>
> >> https://lore.kernel.org/patchwork/patch/345098/
> >>
> >> That patch plus the one in this thread might make a nice addition to
> >> the kernel even if we expect something much better to come along
> >> later.
> >
> > I agree. Eric's point was to make the first implementation of this as
> > simple as possible that's why this patch is intentionally almost
> > trivial. And I like it for its simplicity.
> >
> > I had a more comprehensive API proposal of which open(/proc/<pid>) was a
> > part. I didn't send out alongside this patch as Eric clearly prefered to
> > only have the /proc/<pid> part. Here is the full proposal as I intended
> > to originally send it out:
> 
> Thanks.
> 
> > The gist is to have file descriptors for processes which is obviously not a new
> > idea. This has been done before in other OSes and it has been tried before in
> > Linux [2], [3] (Thanks to Kees for pointing out these patches.). So I want to
> > make it very clear that I'm not laying claim to this being my or even a novel
> > idea in any way. However, I want to diverge from previous approaches with my
> > suggestion. (Though I can't be sure that there's not something similar in other
> > OSes already.)
> 
> Windows works basically as you describe. You can create a process is
> suspended state, configure it however you want, then let it run.
> CreateProcess (and even moreso, NtCreateProcess) also provide a rich
> (and *extensible*) interface for pre-creation process configuration.
> 
> >> One of the main motivations for having procfds is to have a race-free way of
> > configuring, starting, polling, and killing a process. Basically, a process
> > lifecycle api if you want to think about it that way. The api should also be
> > easily extendable in the future to avoid running into the limitations we
> > currently see with the clone*() syscall(s) again.
> >
> > One of the crucial points of the api is to *separate the configuration
> > of a process through a procfd from actually creating the process*.
> > This is a crucial property expressed in the open*() system calls. First, get a
> > stable handle on an object then allow for ways to configure it. As such the
> > procfd api shares the same insight with Al's and David's new mount api.
> > (Fwiw, Andy also pointed out similarities with posix_spawn().)
> > What I envisioned was to have the following syscalls (multiple name suggestions):
> >
> > 1. int process_open / proc_open / procopen
> > 2. int process_config / proc_config / procconfig or ioctl()-based
> > 3. int process_info / proc_info / procinfo or ioctl()-based
> > 4. int process_manage / proc_manage / procmanage or ioctl()-based
> 
> The API you've proposed seems fine to me, although I'd either 1)
> consolidate operations further into one system call, or 2) separate
> the different management operations into more and different system
> calls that can be audited independently. The grouping you've proposed
> seems to have the worst aspects of API splitting and API multiplexing.
> But I have no objection to it in spirit.
> 
> That said, while I do want to fix process configuration and startup
> generally, I want to fix specific holes in the existing API surface
> first. The two patches I've already sent do that, and this work
> shouldn't wait on an ideal larger process-API overhaul that may or may
> not arrive. Based on previous history, I suspect that an API of the
> scope you're proposing would take years to overcome all LKML
> objections and land. I don't want to wait that long when we can make
> smaller fixes that would not conflict with the general architecture.
> 
> The original patch on this thread is half of the right fix. While I
> think we should use a system call instead of an ioctl, and while I
> have some specific implementation critiques (which I described in a
> different message), it's the right general sort of thing. We should
> merge it.

Thanks. I agree. Note, I don't care if it's an ioctl() or not. I'm happy
to instead add a syscall process_signal() alongside this patchset. What
do people prefer?

> 
> Next, I want to merge my exithand proposal, or something like it. It's
> likewise a simple change that, in a minimal way, addresses a
> longstanding API deficiency. I'm very strongly against the
> POLLERR-on-directory variant of the idea.

  parent reply	other threads:[~2018-11-18 19:24 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-18 11:17 [PATCH] proc: allow killing processes via file descriptors Christian Brauner
2018-11-18 13:59 ` Daniel Colascione
2018-11-18 15:38   ` Andy Lutomirski
2018-11-18 15:53     ` Daniel Colascione
2018-11-18 16:17       ` Andy Lutomirski
2018-11-18 16:29         ` Daniel Colascione
2018-11-18 17:13           ` Andy Lutomirski
2018-11-18 17:17             ` Daniel Colascione
2018-11-18 17:43               ` Eric W. Biederman
2018-11-18 17:45                 ` Andy Lutomirski
2018-11-18 17:56                 ` Daniel Colascione
2018-11-18 16:33         ` Randy Dunlap
2018-11-18 16:48           ` Daniel Colascione
2018-11-18 17:09             ` Andy Lutomirski
2018-11-18 17:24               ` Daniel Colascione
2018-11-18 17:42                 ` Andy Lutomirski
2018-11-18 17:51                   ` Daniel Colascione
2018-11-18 18:28                     ` Andy Lutomirski
2018-11-18 18:43                       ` Daniel Colascione
2018-11-18 19:05                         ` Aleksa Sarai
2018-11-18 19:44                           ` Daniel Colascione
2018-11-18 20:15                             ` Christian Brauner
2018-11-18 20:21                               ` Daniel Colascione
2018-11-18 20:28                             ` Andy Lutomirski
2018-11-18 20:32                               ` Daniel Colascione
2018-11-19  1:43                                 ` Andy Lutomirski
2018-11-18 20:43                               ` Christian Brauner
2018-11-18 20:54                                 ` Daniel Colascione
2018-11-18 21:23                                   ` Christian Brauner
2018-11-18 21:30                                     ` Christian Brauner
2018-11-19  0:31                                       ` Daniel Colascione
2018-11-19  0:40                                         ` Christian Brauner
2018-11-19  0:09                             ` Aleksa Sarai
2018-11-19  0:53                               ` Daniel Colascione
2018-11-19  1:16                                 ` Daniel Colascione
2018-11-19 16:13                       ` Dmitry Safonov
2018-11-19 16:26                         ` [PATCH] proc: allow killing processes via file descriptors (Larger pids) Eric W. Biederman
2018-11-19 16:27                         ` [PATCH] proc: allow killing processes via file descriptors Daniel Colascione
2018-11-19 20:21                           ` Aleksa Sarai
2018-11-19  2:47                   ` Al Viro
2018-11-19  3:01                     ` Andy Lutomirski
2018-11-18 17:41     ` Christian Brauner
2018-11-18 17:44       ` Andy Lutomirski
2018-11-18 18:07       ` Daniel Colascione
2018-11-18 18:15         ` Andy Lutomirski
2018-11-18 18:31           ` Daniel Colascione
2018-11-18 19:24         ` Christian Brauner [this message]
2018-11-19  0:08         ` Aleksa Sarai
2018-11-19  1:14           ` Daniel Colascione
2018-11-18 16:03 ` Daniel Colascione
2018-11-19 10:56 ` kbuild test robot
2018-11-19 14:15 ` David Laight
2018-11-19 15:49 ` Dave Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181118192443.7ylyl24svrn6jvjd@brauner.io \
    --to=christian@brauner.io \
    --cc=akpm@linux-foundation.org \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=timmurray@google.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).