linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Christian Brauner <christian@brauner.io>
Cc: Oleg Nesterov <oleg@redhat.com>,
	torvalds@linux-foundation.org, viro@zeniv.linux.org.uk,
	jannh@google.com, dhowells@redhat.com, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org, serge@hallyn.com, luto@kernel.org,
	arnd@arndb.de, ebiederm@xmission.com, keescook@chromium.org,
	tglx@linutronix.de, mtk.manpages@gmail.com,
	akpm@linux-foundation.org, cyphar@cyphar.com, dancol@google.com
Subject: Re: [PATCH 2/4] clone: add CLONE_PIDFD
Date: Mon, 15 Apr 2019 12:25:42 -0400	[thread overview]
Message-ID: <20190415162542.GA246478@google.com> (raw)
In-Reply-To: <20190415135246.d6pvyf3pkt3sbh6t@brauner.io>

On Mon, Apr 15, 2019 at 03:52:48PM +0200, Christian Brauner wrote:
> On Mon, Apr 15, 2019 at 03:24:16PM +0200, Oleg Nesterov wrote:
> > On 04/15, Christian Brauner wrote:
> > >
> > > > CLONE_PARENT_SETTID doesn't look very usefule, so what if we add
> > > >
> > > > 	if ((clone_flags & (CLONE_PIDFD|CLONE_PARENT_SETTID)) ==
> > > > 	                   (CLONE_PIDFD|CLONE_PARENT_SETTID))
> > > > 		return ERR_PTR(-EINVAL);
> > > >
> > > > at the start of copy_process() ?
> > > >
> > > > Then it can do
> > > >
> > > > 	if (clone_flags & CLONE_PIDFD) {
> > > > 		retval = pidfd_create(pid, &pidfdf);
> > > > 		if (retval < 0)
> > > > 			goto bad_fork_free_pid;
> > > > 		retval = put_user(retval, parent_tidptr)
> > > > 		if (retval < 0)
> > > > 			goto bad_fork_free_pid;
> > > > 	}
> > >
> > > Uhhh Oleg, that is nifty. I have to say I like that a lot. This would
> > > let us return the pid and the pidfd in one go and we can also start
> > > pidfd numbering at 0.
> > 
> > Christian, sorry if it was already discussed, but I can't force myself to
> > read all the previous discussions ;)
> > 
> > If we forget about CONFIG_PROC_FS, why do we really want to create a file?
> > 
> > 
> > Suppose we add a global u64 counter incremented by copy_process and reported
> > in /proc/$pid/status. Suppose that clone(CLONE_PIDFD) writes this counter to
> > *parent_tidptr. Let's denote this counter as UNIQ_PID.
> > 
> > Now, if you want to (say) safely kill a task and you have its UNIQ_PID, you
> > can do
> > 
> > 	kill_by_pid_uniq(int pid, u64 uniq_pid)
> > 	{
> > 		pidfd = open("/proc/$pid", O_DIRECTORY);
> > 
> > 		status = openat(pidfd, "status");
> > 		u64 this_uniq_pid = ... read UNIQ_PID from status ...;
> > 
> > 		if (uniq_pid != this_uniq_pid)
> > 			return;
> > 
> > 		pidfd_send_signal(pidfd);
> > 	}
> > 
> > Why else do we want pidfd?
> 
> I think this was thrown around at one point but this is rather
> inelegant imho. It basically makes a process unique by using a
> combination of two identifiers. You end up with a similar concept but
> you make it way less flexible and extensible imho. With pidfds you can
> not care about pids at all if you don't want to. The UNIQ_PID thing
> would require you to always juggle two identifiers.
> 
> Your example would also only work if CONFIG_PROC_FS is set (Not sure if
> that's what you meant by "forget about CONFIG_PROC_FS")? Say, you get
> a pid from clone() and your UNIQ_PID thing. Then you still can't
> reliably kill a process because pidfd_send_signal() is not useable since
> you can't get pidfds. And if you go the kill way you end up with the same
> problem. Yes, you could solve this by probably extending syscalls to
> take a UNIQ_PID argument but that seems very inelegant.
> 
> The UNIQ_PID implementation would also require being tracked in the
> kernel either in task_struct or struct pid potentially and thus would
> probably add more infrastructure in the kernel. We don't need any of
> that if we simply rely on pidfds.
> 
> Most of all, the pidfd concept allows one way more flexibility in
> extending it. For example, Joel is working on a patchset to make pidfds
> pollable so you can get information about a process death by polling
> them. We also want to be able to potentially wait on a process with
> waitid(W_PIDFD) or similar as suggested by Linus in earlier threads. At
> that point you end up in a similar situation as tgkill() where you pass
> a tgid and a pid already to make sure that the pid you pass has the tgid
> as thread-group leader. That is all way simpler with pidfds.

I agree the pidfd file descriptor approach is simpler than dealing with 2
pids and is needed for the poll notification support I posted.

Also in the future it allows for a pidfd to sent over IPC to another
process using binder or unix domain sockets.

thanks,

 - Joel


  reply	other threads:[~2019-04-15 16:25 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-14 20:14 [PATCH 0/4] clone: add CLONE_PIDFD Christian Brauner
2019-04-14 20:14 ` [PATCH 1/4] Make anon_inodes unconditional Christian Brauner
2019-04-14 20:14 ` [PATCH 2/4] clone: add CLONE_PIDFD Christian Brauner
2019-04-15 10:52   ` Oleg Nesterov
2019-04-15 11:42     ` Christian Brauner
2019-04-15 13:24       ` Oleg Nesterov
2019-04-15 13:52         ` Christian Brauner
2019-04-15 16:25           ` Joel Fernandes [this message]
2019-04-15 17:15         ` Jonathan Kowalski
2019-04-15 19:39           ` Daniel Colascione
2019-04-14 20:14 ` [PATCH 3/4] signal: support CLONE_PIDFD with pidfd_send_signal Christian Brauner
2019-04-14 20:14 ` [PATCH 4/4] samples: show race-free pidfd metadata access Christian Brauner
2019-04-15 10:08 ` RFC: on adding new CLONE_* flags [WAS Re: [PATCH 0/4] clone: add CLONE_PIDFD] Enrico Weigelt, metux IT consult
2019-04-15 15:50   ` Serge E. Hallyn
2019-04-16 18:32     ` Enrico Weigelt, metux IT consult
2019-04-29 15:49       ` Serge E. Hallyn
2019-04-29 17:31         ` Enrico Weigelt, metux IT consult
2019-05-05  2:32           ` Serge E. Hallyn
2019-04-15 19:59   ` Aleksa Sarai
2019-04-15 20:29     ` Andy Lutomirski
2019-04-15 21:27       ` Jonathan Kowalski
2019-04-15 23:58         ` Andy Lutomirski
2019-04-16 18:45       ` Enrico Weigelt, metux IT consult
2019-04-16 21:31         ` Andy Lutomirski
2019-04-17 12:03           ` Enrico Weigelt, metux IT consult
2019-04-17 12:54             ` Christian Brauner
2019-04-18 15:46               ` Enrico Weigelt, metux IT consult
2019-04-17 12:19       ` Florian Weimer
2019-04-17 16:46         ` Andy Lutomirski
2019-04-20  7:14       ` Kevin Easton
2019-04-20 11:15         ` Christian Brauner
2019-04-20 15:06         ` Daniel Colascione
2019-04-29 19:30         ` Jann Horn
2019-04-29 19:55           ` Jann Horn
2019-04-29 20:21             ` Linus Torvalds
2019-04-29 20:38               ` Florian Weimer
2019-04-29 20:51                 ` Christian Brauner
2019-04-29 21:31                 ` Linus Torvalds
2019-04-30  7:01                   ` Florian Weimer
2019-04-30  0:38               ` Jann Horn
2019-04-30  2:16                 ` Linus Torvalds
2019-04-30  8:21                   ` Florian Weimer
2019-04-30 16:19                     ` Linus Torvalds
2019-04-30 16:26                       ` Linus Torvalds
2019-04-30 17:07                         ` Florian Weimer
2019-04-30 12:39               ` Oleg Nesterov
2019-04-30 16:24                 ` Linus Torvalds
2019-04-29 20:49             ` Florian Weimer
2019-04-29 20:52               ` Christian Brauner
2019-04-20 15:28       ` Al Viro
2019-04-16 18:37     ` Enrico Weigelt, metux IT consult
2019-04-15 10:16 ` [PATCH 0/4] clone: add CLONE_PIDFD Enrico Weigelt, metux IT consult

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190415162542.GA246478@google.com \
    --to=joel@joelfernandes.org \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=christian@brauner.io \
    --cc=cyphar@cyphar.com \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=oleg@redhat.com \
    --cc=serge@hallyn.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).