linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Colascione <dancol@google.com>
To: Christian Brauner <christian.brauner@canonical.com>
Cc: Joel Fernandes <joelaf@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	Suren Baghdasaryan <surenb@google.com>
Subject: Re: [RFC PATCH] Implement /proc/pid/kill
Date: Tue, 30 Oct 2018 10:48:51 +0000	[thread overview]
Message-ID: <CAKOZuesb4ns1dM7itX9uzvYBGp_aSgQXV1ONuV4eQRq0FiehYQ@mail.gmail.com> (raw)
In-Reply-To: <20181030104037.73t5uz3piywxwmye@gmail.com>

On Tue, Oct 30, 2018 at 10:40 AM, Christian Brauner
<christian.brauner@canonical.com> wrote:
> On Tue, Oct 30, 2018 at 11:39:11AM +0100, Christian Brauner wrote:
>> On Tue, Oct 30, 2018 at 08:50:22AM +0000, Daniel Colascione wrote:
>> > On Tue, Oct 30, 2018 at 3:21 AM, Joel Fernandes <joelaf@google.com> wrote:
>> > > On Mon, Oct 29, 2018 at 3:11 PM Daniel Colascione <dancol@google.com> wrote:
>> > >>
>> > >> Add a simple proc-based kill interface. To use /proc/pid/kill, just
>> > >> write the signal number in base-10 ASCII to the kill file of the
>> > >> process to be killed: for example, 'echo 9 > /proc/$$/kill'.
>> > >>
>> > >> Semantically, /proc/pid/kill works like kill(2), except that the
>> > >> process ID comes from the proc filesystem context instead of from an
>> > >> explicit system call parameter. This way, it's possible to avoid races
>> > >> between inspecting some aspect of a process and that process's PID
>> > >> being reused for some other process.
>> > >>
>> > >> With /proc/pid/kill, it's possible to write a proper race-free and
>> > >> safe pkill(1). An approximation follows. A real program might use
>> > >> openat(2), having opened a process's /proc/pid directory explicitly,
>> > >> with the directory file descriptor serving as a sort of "process
>> > >> handle".
>> > >
>> > > How long does the 'inspection' procedure take? If its a short
>> > > duration, then is PID reuse really an issue, I mean the PIDs are not
>> > > reused until wrap around and the only reason this can be a problem is
>> > > if you have the wrap around while the 'inspecting some aspect'
>> > > procedure takes really long.
>> >
>> > It's a race. Would you make similar statements about a similar fix for
>> > a race condition involving a mutex and a double-free just because the
>> > race didn't crash most of the time? The issue I'm trying to fix here
>> > is the same problem, one level higher up in the abstraction hierarchy.
>> >
>> > > Also the proc fs is typically not the right place for this. Some
>> > > entries in proc are writeable, but those are for changing values of
>> > > kernel data structures. The title of man proc(5) is "proc - process
>> > > information pseudo-filesystem". So its "information" right?
>> >
>> > Why should userspace care whether a particular operation is "changing
>> > [a] value[] of [a] kernel data structure" or something else? That
>> > something in /proc is a struct field is an implementation detail. It's
>> > the interface semantics that matters, and whether a particular
>> > operation is achieved by changing a struct field or by making a
>> > function call is irrelevant to userspace. Proc is a filesystem about
>> > processes. Why shouldn't you be able to send a signal to a process via
>> > proc? It's an operation involving processes.
>> >
>> > It's already possible to do things *to* processes via proc, e.g.,
>> > adjust OOM killer scores. Proc filesystem file descriptors are
>> > userspace references to kernel-side struct pid instances, and as such,
>> > make good process handles. There are already "verb" files in procfs,
>> > such as /proc/sys/vm/drop_caches and /proc/sysrq-trigger. Why not add
>> > a kill "verb", especially if it closes a race that can't be closed
>> > some other way?
>> >
>> > You could implement this interface as a system call that took a procfs
>> > directory file descriptor, but relative to this proposal, it would be
>> > all downside. Such a thing would act just the same way as
>> > /pric/pid/kill, and wouldn't be usable from the shell or from programs
>> > that didn't want to use syscall(2). (Since glibc isn't adding new
>> > system call wrappers.) AFAIK, the only downside of having a "kill"
>> > file is the need for a string-to-integer conversion, but compared to
>> > process killing, integer parsing is insignificant.
>> >
>> > > IMO without a really good reason for this, it could really be a hard
>> > > sell but the RFC was worth it anyway to discuss it ;-)
>> >
>> > The traditional unix process API is down there at level -10 of Rusty
>> > Russel's old bad API scale: "It's impossible to get right". The races
>> > in the current API are unavoidable. That most programs don't hit these
>> > races most of the time doesn't mean that the race isn't present.
>> >
>> > We've moved to a model where we identify other system resources, like
>> > DRM fences, locks, sockets, and everything else via file descriptors.
>> > This change is a step toward using procfs file descriptors to work
>> > with processes, which makes the system more regular and easier to
>> > reason about. A clean API that's possible to use correctly is a
>> > worthwhile project.
>>
>> So I have been disucssing a new process API With David Howells, Kees
>> Cook and a few others and I am working on an RFC/proposal for this. It
>> is partially inspired by the new mount API. So I would like to block
>> this patch until then. I would like to get this right very much and

It's good to hear that others are thinking about this problem.

>> I
>> don't think this is the way to go.

Why not?

Does your proposed API allow for a race-free pkill, with arbitrary
selection criteria? This capability is a good litmus test for fixing
the long-standing Unix process API issues.

  reply	other threads:[~2018-10-30 10:48 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 22:10 [RFC PATCH] Implement /proc/pid/kill Daniel Colascione
2018-10-30  3:21 ` Joel Fernandes
2018-10-30  8:50   ` Daniel Colascione
2018-10-30 10:39     ` Christian Brauner
2018-10-30 10:40       ` Christian Brauner
2018-10-30 10:48         ` Daniel Colascione [this message]
2018-10-30 11:04           ` Christian Brauner
2018-10-30 11:12             ` Daniel Colascione
2018-10-30 11:19               ` Christian Brauner
2018-10-31  5:00                 ` Eric W. Biederman
2018-10-30 17:01     ` Joel Fernandes
2018-10-30  5:00 ` Aleksa Sarai
2018-10-30  9:05   ` Daniel Colascione
2018-10-30 20:45     ` Aleksa Sarai
2018-10-30 21:42       ` Joel Fernandes
2018-10-30 22:23         ` Aleksa Sarai
2018-10-30 22:33           ` Joel Fernandes
2018-10-30 22:49             ` Aleksa Sarai
2018-10-31  0:42               ` Joel Fernandes
2018-10-31  1:59                 ` Daniel Colascione
2018-10-30 23:10             ` Daniel Colascione
2018-10-30 23:23               ` Christian Brauner
2018-10-30 23:55                 ` Daniel Colascione
2018-10-31  2:56                 ` Aleksa Sarai
2018-10-31  4:24                   ` Joel Fernandes
2018-11-01 20:40                     ` Joel Fernandes
2018-11-02  9:46                       ` Christian Brauner
2018-11-02 14:34                         ` Serge E. Hallyn
2018-10-31  0:57               ` Joel Fernandes
2018-10-31  1:56                 ` Daniel Colascione
2018-10-31  4:47   ` Eric W. Biederman
2018-10-31  4:44 ` Eric W. Biederman
2018-10-31 12:44   ` Oleg Nesterov
2018-10-31 13:27     ` Daniel Colascione
2018-10-31 15:10       ` Oleg Nesterov
2018-10-31 15:16         ` Daniel Colascione
2018-10-31 15:49           ` Oleg Nesterov
2018-11-01 11:53       ` David Laight
2018-11-01 15:50         ` Daniel Colascione
2018-10-31 14:37 ` [PATCH v2] " Daniel Colascione
2018-10-31 15:05   ` Joel Fernandes
2018-10-31 17:33     ` Aleksa Sarai
2018-10-31 21:47       ` Joel Fernandes
2018-10-31 15:59 ` [PATCH v3] " Daniel Colascione
2018-10-31 17:54   ` Tycho Andersen
2018-10-31 18:00     ` Daniel Colascione
2018-10-31 18:17       ` Tycho Andersen
2018-10-31 19:33         ` Daniel Colascione
2018-10-31 20:06           ` Tycho Andersen
2018-11-01 11:33           ` David Laight
2018-11-12  1:19             ` Eric W. Biederman
2018-10-31 16:22 ` [RFC PATCH] " Jann Horn
2018-11-01  4:53   ` Andy Lutomirski
2018-11-12 23:13 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKOZuesb4ns1dM7itX9uzvYBGp_aSgQXV1ONuV4eQRq0FiehYQ@mail.gmail.com \
    --to=dancol@google.com \
    --cc=christian.brauner@canonical.com \
    --cc=joelaf@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=surenb@google.com \
    --cc=timmurray@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).