All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Jann Horn <jannh@google.com>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Thomas Gleixner <tglx@linutronix.de>, Tejun Heo <tj@kernel.org>,
	David Howells <dhowells@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Aleksa Sarai <cyphar@cyphar.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	kernel-team <kernel-team@android.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH 4/5] pidfd: add CLONE_WAIT_PID
Date: Thu, 25 Jul 2019 12:11:24 +0200	[thread overview]
Message-ID: <20190725101123.zp7y2weotyqkfsv3@brauner.io> (raw)
In-Reply-To: <DB572B9F-4D21-402F-A68B-CD193BBB166C@brauner.io>

On Wed, Jul 24, 2019 at 09:10:20PM +0200, Christian Brauner wrote:
> On July 24, 2019 9:07:54 PM GMT+02:00, Jann Horn <jannh@google.com> wrote:
> >On Wed, Jul 24, 2019 at 8:27 PM Christian Brauner
> ><christian@brauner.io> wrote:
> >> On July 24, 2019 8:14:26 PM GMT+02:00, Jann Horn <jannh@google.com>
> >wrote:
> >> >On Wed, Jul 24, 2019 at 4:48 PM Christian Brauner
> >> ><christian@brauner.io> wrote:
> >> >> If CLONE_WAIT_PID is set the newly created process will not be
> >> >> considered by process wait requests that wait generically on
> >children
> >> >> such as:
> >> >>
> >> >>         syscall(__NR_wait4, -1, wstatus, options, rusage)
> >> >>         syscall(__NR_waitpid, -1, wstatus, options)
> >> >>         syscall(__NR_waitid, P_ALL, -1, siginfo, options, rusage)
> >> >>         syscall(__NR_waitid, P_PGID, -1, siginfo, options, rusage)
> >> >>         syscall(__NR_waitpid, -pid, wstatus, options)
> >> >>         syscall(__NR_wait4, -pid, wstatus, options, rusage)
> >> >>
> >> >> A process created with CLONE_WAIT_PID can only be waited upon with
> >a
> >> >> focussed wait call. This ensures that processes can be reaped even
> >if
> >> >> all file descriptors referring to it are closed.
> >> >[...]
> >> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> >> index baaff6570517..a067f3876e2e 100644
> >> >> --- a/kernel/fork.c
> >> >> +++ b/kernel/fork.c
> >> >> @@ -1910,6 +1910,8 @@ static __latent_entropy struct task_struct
> >> >*copy_process(
> >> >>         delayacct_tsk_init(p);  /* Must remain after
> >> >dup_task_struct() */
> >> >>         p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
> >> >>         p->flags |= PF_FORKNOEXEC;
> >> >> +       if (clone_flags & CLONE_WAIT_PID)
> >> >> +               p->flags |= PF_WAIT_PID;
> >> >>         INIT_LIST_HEAD(&p->children);
> >> >>         INIT_LIST_HEAD(&p->sibling);
> >> >>         rcu_copy_process(p);
> >> >
> >> >This means that if a process with PF_WAIT_PID forks, the child
> >> >inherits the flag, right? That seems unintended? You might have to
> >add
> >> >something like "if (clone_flags & CLONE_THREAD == 0) p->flags &=
> >> >~PF_WAIT_PID;" before this. (I think threads do have to inherit the
> >> >flag so that the case where a non-leader thread of the child goes
> >> >through execve and steals the leader's identity is handled
> >properly.)
> >> >Or you could cram it somewhere into signal_struct instead of on the
> >> >task - that might be a more logical place for it?
> >>
> >> Hm, CLONE_WAIT_PID is only useable with CLONE_PIDFD which in turn is
> >> not useable with CLONE_THREAD.
> >> But we should probably make that explicit for CLONE_WAIT_PID too.
> >
> >To clarify:
> >
> >This code looks buggy to me because p->flags is inherited from the
> >parent, with the exception of flags that are explicitly stripped out.
> >Since PF_WAIT_PID is not stripped out, this means that if task A
> >creates a child B with clone(CLONE_WAIT_PID), and then task B uses
> >fork() to create a child C, then B will not be able to use
> >wait(&status) to wait for C since C inherited PF_WAIT_PID from B.
> >
> >The obvious way to fix that would be to always strip out PF_WAIT_PID;
> >but that would also be wrong, because if task B creates a thread C,
> >and then C calls execve(), the task_struct of B goes away and B's TGID
> >is taken over by C. When C eventually exits, it should still obey the
> >CLONE_WAIT_PID (since to A, it's all the same process). Therefore, if
> >p->flags is used to track whether the task was created with
> >CLONE_WAIT_PID, PF_WAIT_PID must be inherited if CLONE_THREAD is set.
> >So:
> >
> >diff --git a/kernel/fork.c b/kernel/fork.c
> >index d8ae0f1b4148..b32e1e9a6c9c 100644
> >--- a/kernel/fork.c
> >+++ b/kernel/fork.c
> >@@ -1902,6 +1902,10 @@ static __latent_entropy struct task_struct
> >*copy_process(
> >      delayacct_tsk_init(p);  /* Must remain after dup_task_struct() */
> >        p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
> >        p->flags |= PF_FORKNOEXEC;
> >+       if (!(clone_flags & CLONE_THREAD))
> >+               p->flags &= ~PF_PF_WAIT_PID;
> >+       if (clone_flags & CLONE_WAIT_PID)
> >+               p->flags |= PF_PF_WAIT_PID;
> >        INIT_LIST_HEAD(&p->children);
> >        INIT_LIST_HEAD(&p->sibling);
> >        rcu_copy_process(p);
> >
> >An alternative would be to not use p->flags at all, but instead make
> >this a property of the signal_struct - since the property is shared by
> >all threads, that might make more sense?
> 
> Yeah, thanks for clarifying.
> Now it's more obvious.
> I need to take a look at the signal struct before I can say anything about this.

I've been looking at this a bit late last night.
Putting this in the flags argument of signal_struct would indeed be
possible. But it feels misplaced to me there. I think the implied
semantics by having this part of task_struct are nicer, i.e. the intent
is clearer especially when the task is filtered later on in exit.c.
So unless anyone sees a clear problem or otherwise objects I would keep
it as a property of task_struct for now and fix it up.

Christian

  reply	other threads:[~2019-07-25 10:11 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-24 14:46 [PATCH 0/5] pidfd: waiting on processes through pidfds Christian Brauner
2019-07-24 14:46 ` [RFC][PATCH 1/5] exit: kill struct waitid_info Christian Brauner
2019-07-24 17:37   ` Linus Torvalds
2019-07-24 22:01     ` Christian Brauner
2019-07-25 12:46     ` Eric W. Biederman
2019-07-26  8:01       ` Christian Brauner
2019-07-26 11:59         ` Eric W. Biederman
2019-07-26 12:37           ` Christian Brauner
2019-07-25  9:40   ` Oleg Nesterov
2019-07-25 10:07     ` Christian Brauner
2019-07-24 14:46 ` [PATCH 2/5] pidfd: add pidfd_wait() Christian Brauner
2019-07-24 17:45   ` Linus Torvalds
2019-07-24 17:50     ` Christian Brauner
2019-07-24 17:52       ` Christian Brauner
2019-07-25 14:34     ` Eric W. Biederman
2019-07-25 10:16   ` Oleg Nesterov
2019-07-25 10:21     ` Christian Brauner
2019-07-26  8:19   ` Arnd Bergmann
2019-07-26  8:24     ` Christian Brauner
2019-07-26  9:57       ` Arnd Bergmann
2019-07-24 14:46 ` [PATCH 3/5] arch: wire-up pidfd_wait() Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46   ` Christian Brauner
2019-07-24 14:46 ` [PATCH 4/5] pidfd: add CLONE_WAIT_PID Christian Brauner
2019-07-24 18:14   ` Jann Horn
2019-07-24 18:27     ` Christian Brauner
2019-07-24 19:07       ` Jann Horn
2019-07-24 19:10         ` Christian Brauner
2019-07-25 10:11           ` Christian Brauner [this message]
2019-07-25 10:30         ` Oleg Nesterov
2019-07-25 10:36           ` Christian Brauner
2019-07-25 10:35   ` Oleg Nesterov
2019-07-25 10:40     ` Christian Brauner
2019-07-25 11:25       ` Oleg Nesterov
2019-07-25 11:41         ` Christian Brauner
2019-07-25 11:43         ` Oleg Nesterov
2019-07-25 12:26           ` Christian Brauner
2019-07-25 16:13             ` Oleg Nesterov
2019-07-25 16:56               ` Christian Brauner
2019-07-25 16:57             ` Eric W. Biederman
2019-07-24 14:46 ` [PATCH 5/5] pidfd: add pidfd_wait tests Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190725101123.zp7y2weotyqkfsv3@brauner.io \
    --to=christian@brauner.io \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=cyphar@cyphar.com \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.