linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Oleg Nesterov <oleg@redhat.com>
Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk,
	jannh@google.com, dhowells@redhat.com, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org, serge@hallyn.com, luto@kernel.org,
	arnd@arndb.de, ebiederm@xmission.com, keescook@chromium.org,
	tglx@linutronix.de, mtk.manpages@gmail.com,
	akpm@linux-foundation.org, cyphar@cyphar.com,
	joel@joelfernandes.org, dancol@google.com
Subject: Re: [PATCH 2/4] clone: add CLONE_PIDFD
Date: Mon, 15 Apr 2019 15:52:48 +0200	[thread overview]
Message-ID: <20190415135246.d6pvyf3pkt3sbh6t@brauner.io> (raw)
In-Reply-To: <20190415132416.GB22204@redhat.com>

On Mon, Apr 15, 2019 at 03:24:16PM +0200, Oleg Nesterov wrote:
> On 04/15, Christian Brauner wrote:
> >
> > > CLONE_PARENT_SETTID doesn't look very usefule, so what if we add
> > >
> > > 	if ((clone_flags & (CLONE_PIDFD|CLONE_PARENT_SETTID)) ==
> > > 	                   (CLONE_PIDFD|CLONE_PARENT_SETTID))
> > > 		return ERR_PTR(-EINVAL);
> > >
> > > at the start of copy_process() ?
> > >
> > > Then it can do
> > >
> > > 	if (clone_flags & CLONE_PIDFD) {
> > > 		retval = pidfd_create(pid, &pidfdf);
> > > 		if (retval < 0)
> > > 			goto bad_fork_free_pid;
> > > 		retval = put_user(retval, parent_tidptr)
> > > 		if (retval < 0)
> > > 			goto bad_fork_free_pid;
> > > 	}
> >
> > Uhhh Oleg, that is nifty. I have to say I like that a lot. This would
> > let us return the pid and the pidfd in one go and we can also start
> > pidfd numbering at 0.
> 
> Christian, sorry if it was already discussed, but I can't force myself to
> read all the previous discussions ;)
> 
> If we forget about CONFIG_PROC_FS, why do we really want to create a file?
> 
> 
> Suppose we add a global u64 counter incremented by copy_process and reported
> in /proc/$pid/status. Suppose that clone(CLONE_PIDFD) writes this counter to
> *parent_tidptr. Let's denote this counter as UNIQ_PID.
> 
> Now, if you want to (say) safely kill a task and you have its UNIQ_PID, you
> can do
> 
> 	kill_by_pid_uniq(int pid, u64 uniq_pid)
> 	{
> 		pidfd = open("/proc/$pid", O_DIRECTORY);
> 
> 		status = openat(pidfd, "status");
> 		u64 this_uniq_pid = ... read UNIQ_PID from status ...;
> 
> 		if (uniq_pid != this_uniq_pid)
> 			return;
> 
> 		pidfd_send_signal(pidfd);
> 	}
> 
> Why else do we want pidfd?

I think this was thrown around at one point but this is rather
inelegant imho. It basically makes a process unique by using a
combination of two identifiers. You end up with a similar concept but
you make it way less flexible and extensible imho. With pidfds you can
not care about pids at all if you don't want to. The UNIQ_PID thing
would require you to always juggle two identifiers.

Your example would also only work if CONFIG_PROC_FS is set (Not sure if
that's what you meant by "forget about CONFIG_PROC_FS")? Say, you get
a pid from clone() and your UNIQ_PID thing. Then you still can't
reliably kill a process because pidfd_send_signal() is not useable since
you can't get pidfds. And if you go the kill way you end up with the same
problem. Yes, you could solve this by probably extending syscalls to
take a UNIQ_PID argument but that seems very inelegant.

The UNIQ_PID implementation would also require being tracked in the
kernel either in task_struct or struct pid potentially and thus would
probably add more infrastructure in the kernel. We don't need any of
that if we simply rely on pidfds.

Most of all, the pidfd concept allows one way more flexibility in
extending it. For example, Joel is working on a patchset to make pidfds
pollable so you can get information about a process death by polling
them. We also want to be able to potentially wait on a process with
waitid(W_PIDFD) or similar as suggested by Linus in earlier threads. At
that point you end up in a similar situation as tgkill() where you pass
a tgid and a pid already to make sure that the pid you pass has the tgid
as thread-group leader. That is all way simpler with pidfds.

  reply	other threads:[~2019-04-15 13:52 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-14 20:14 [PATCH 0/4] clone: add CLONE_PIDFD Christian Brauner
2019-04-14 20:14 ` [PATCH 1/4] Make anon_inodes unconditional Christian Brauner
2019-04-14 20:14 ` [PATCH 2/4] clone: add CLONE_PIDFD Christian Brauner
2019-04-15 10:52   ` Oleg Nesterov
2019-04-15 11:42     ` Christian Brauner
2019-04-15 13:24       ` Oleg Nesterov
2019-04-15 13:52         ` Christian Brauner [this message]
2019-04-15 16:25           ` Joel Fernandes
2019-04-15 17:15         ` Jonathan Kowalski
2019-04-15 19:39           ` Daniel Colascione
2019-04-14 20:14 ` [PATCH 3/4] signal: support CLONE_PIDFD with pidfd_send_signal Christian Brauner
2019-04-14 20:14 ` [PATCH 4/4] samples: show race-free pidfd metadata access Christian Brauner
2019-04-15 10:08 ` RFC: on adding new CLONE_* flags [WAS Re: [PATCH 0/4] clone: add CLONE_PIDFD] Enrico Weigelt, metux IT consult
2019-04-15 15:50   ` Serge E. Hallyn
2019-04-16 18:32     ` Enrico Weigelt, metux IT consult
2019-04-29 15:49       ` Serge E. Hallyn
2019-04-29 17:31         ` Enrico Weigelt, metux IT consult
2019-05-05  2:32           ` Serge E. Hallyn
2019-04-15 19:59   ` Aleksa Sarai
2019-04-15 20:29     ` Andy Lutomirski
2019-04-15 21:27       ` Jonathan Kowalski
2019-04-15 23:58         ` Andy Lutomirski
2019-04-16 18:45       ` Enrico Weigelt, metux IT consult
2019-04-16 21:31         ` Andy Lutomirski
2019-04-17 12:03           ` Enrico Weigelt, metux IT consult
2019-04-17 12:54             ` Christian Brauner
2019-04-18 15:46               ` Enrico Weigelt, metux IT consult
2019-04-17 12:19       ` Florian Weimer
2019-04-17 16:46         ` Andy Lutomirski
2019-04-20  7:14       ` Kevin Easton
2019-04-20 11:15         ` Christian Brauner
2019-04-20 15:06         ` Daniel Colascione
2019-04-29 19:30         ` Jann Horn
2019-04-29 19:55           ` Jann Horn
2019-04-29 20:21             ` Linus Torvalds
2019-04-29 20:38               ` Florian Weimer
2019-04-29 20:51                 ` Christian Brauner
2019-04-29 21:31                 ` Linus Torvalds
2019-04-30  7:01                   ` Florian Weimer
2019-04-30  0:38               ` Jann Horn
2019-04-30  2:16                 ` Linus Torvalds
2019-04-30  8:21                   ` Florian Weimer
2019-04-30 16:19                     ` Linus Torvalds
2019-04-30 16:26                       ` Linus Torvalds
2019-04-30 17:07                         ` Florian Weimer
2019-04-30 12:39               ` Oleg Nesterov
2019-04-30 16:24                 ` Linus Torvalds
2019-04-29 20:49             ` Florian Weimer
2019-04-29 20:52               ` Christian Brauner
2019-04-20 15:28       ` Al Viro
2019-04-16 18:37     ` Enrico Weigelt, metux IT consult
2019-04-15 10:16 ` [PATCH 0/4] clone: add CLONE_PIDFD Enrico Weigelt, metux IT consult

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190415135246.d6pvyf3pkt3sbh6t@brauner.io \
    --to=christian@brauner.io \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).