All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josh Triplett <josh@joshtriplett.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Thiago Macieira <thiago.macieira@intel.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Ingo Molnar <mingo@redhat.com>, Kees Cook <keescook@chromium.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Rik van Riel <riel@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH 6/6] clone4: Introduce new CLONE_FD flag to get task exit notification via fd
Date: Sat, 14 Mar 2015 15:03:08 -0700	[thread overview]
Message-ID: <20150314220307.GI22130@thin> (raw)
In-Reply-To: <20150314185424.GA6813@redhat.com>

On Sat, Mar 14, 2015 at 07:54:24PM +0100, Oleg Nesterov wrote:
> On 03/14, Thiago Macieira wrote:
> > On Saturday 14 March 2015 15:32:35 Oleg Nesterov wrote:
> > > It is not clear to me what do_wait() should do with ->autoreap child, even
> > > ignoring ptrace.
> > >
> > > Just suppose that real_parent has a single "autoreap" child. Should
> > > wait(NULL) hanf then?
> >
> > It should ignore the child that is set to autoreap. wait(NULL) should return -
> > ECHILD, indicating there are no children waiting to be reaped.
> 
> I disagree. I won't really argue now, because I think that this needs
> a separate discussion.

We should certainly discuss it further, but why a "separate" discussion
rather than just discussing the semantics of autoreap and wait here?

> And imo "autoreap" should come as a separate feature.

Thinking about this further, I originally thought that CLONE_FD would
*have* to imply autoreap, because otherwise the calling process still
has to call a wait function on the process after getting the exit
notification via the file descriptor.  However, with the current version
(which holds a reference to the task via the task_struct and generates
the data in ->read), it could potentially make sense to have a file
descriptor for a process that still gets zombified until the parent
waits on it.

Autoreap would still be a potentially useful addition to simplify
process management; it would effectively become "always treat this child
as though the parent had the signal ignored or SA_NOCLDWAIT set", which
would just be a simple change to do_notify_parent, rather than a complex
one to exit_notify that potentially interacts with ptrace.  Matching the
semantics of SA_NOCLDWAIT seems reasonable.

Thiago, see below for a question about switching to the semantics of
SA_NOCLDWAIT.

> I think that wait(NULL) should hang like it hangs even if the parent ignores
> SIGCHLD. But in this case the parent should be woken up when the "autoreap"
> child exits.

I had to think about this for a while, but I think it makes sense now.
wait should *not* ever return the PID of an autoreaped process, because
that would introduce a race condition (the caller cannot safely do
*anything* with the PID of an autoreaped process, since by the time it
does, the process may be gone and the PID may be reused).  However, that
doesn't mean wait cannot block on the process, and then subsequently
wake up and return -ECHILD (or keep waiting on some other child process
if there is one).  That's apparently the semantic used with SA_NOCLDWAIT
or if you have SIGCHLD set to SIG_IGN, and matching that seems
appropriate.

Thiago, could your QProcess implementation handle that modified autoreap
semantic?  The downside there is that if your calling process has a
process-wide loop that waits for all processes (and explicitly passes
the Linux-specific __WCLONE or __WALL flag, since your processes
launched with a 0 signal would count as "clone" children), they'd get
back the processes you launch, too.  (That would happen with your
userspace-emulated version too for calls *without* __WCLONE or __WALL.)
You'd still get the exit status you need via the clonefd, without a
race, and you wouldn't need to touch process-wide signal handling, so I
think this should still work and avoid any races.

I'm going to try implementing that semantic, which should significantly
simplify the last patch of this series.

> If nothing else. Suppose that the parent does waitid(WEXITED|WSTOPPED).
> Should WSTOPPED work? I think it should.

Yeah, I guess it should.  Arguably there ought to be a clone flag that
lets you receive stop/continue notifications for that process via the
file descriptor instead (to allow a library to handle job control for a
process without touching process-wide signal handling), but that can
come later.

> At the same time, if we add autoreap then probably it also makes sense to add
> WEXITIED_UNLESS_AUTOREAP.

Potentially, though for many applications you could also just pass a
signal of 0 and avoid passing __WALL or __WCLONE.

- Josh Triplett

WARNING: multiple messages have this Message-ID (diff)
From: Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>
To: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Thiago Macieira
	<thiago.macieira-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
	Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
Subject: Re: [PATCH 6/6] clone4: Introduce new CLONE_FD flag to get task exit notification via fd
Date: Sat, 14 Mar 2015 15:03:08 -0700	[thread overview]
Message-ID: <20150314220307.GI22130@thin> (raw)
In-Reply-To: <20150314185424.GA6813-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Sat, Mar 14, 2015 at 07:54:24PM +0100, Oleg Nesterov wrote:
> On 03/14, Thiago Macieira wrote:
> > On Saturday 14 March 2015 15:32:35 Oleg Nesterov wrote:
> > > It is not clear to me what do_wait() should do with ->autoreap child, even
> > > ignoring ptrace.
> > >
> > > Just suppose that real_parent has a single "autoreap" child. Should
> > > wait(NULL) hanf then?
> >
> > It should ignore the child that is set to autoreap. wait(NULL) should return -
> > ECHILD, indicating there are no children waiting to be reaped.
> 
> I disagree. I won't really argue now, because I think that this needs
> a separate discussion.

We should certainly discuss it further, but why a "separate" discussion
rather than just discussing the semantics of autoreap and wait here?

> And imo "autoreap" should come as a separate feature.

Thinking about this further, I originally thought that CLONE_FD would
*have* to imply autoreap, because otherwise the calling process still
has to call a wait function on the process after getting the exit
notification via the file descriptor.  However, with the current version
(which holds a reference to the task via the task_struct and generates
the data in ->read), it could potentially make sense to have a file
descriptor for a process that still gets zombified until the parent
waits on it.

Autoreap would still be a potentially useful addition to simplify
process management; it would effectively become "always treat this child
as though the parent had the signal ignored or SA_NOCLDWAIT set", which
would just be a simple change to do_notify_parent, rather than a complex
one to exit_notify that potentially interacts with ptrace.  Matching the
semantics of SA_NOCLDWAIT seems reasonable.

Thiago, see below for a question about switching to the semantics of
SA_NOCLDWAIT.

> I think that wait(NULL) should hang like it hangs even if the parent ignores
> SIGCHLD. But in this case the parent should be woken up when the "autoreap"
> child exits.

I had to think about this for a while, but I think it makes sense now.
wait should *not* ever return the PID of an autoreaped process, because
that would introduce a race condition (the caller cannot safely do
*anything* with the PID of an autoreaped process, since by the time it
does, the process may be gone and the PID may be reused).  However, that
doesn't mean wait cannot block on the process, and then subsequently
wake up and return -ECHILD (or keep waiting on some other child process
if there is one).  That's apparently the semantic used with SA_NOCLDWAIT
or if you have SIGCHLD set to SIG_IGN, and matching that seems
appropriate.

Thiago, could your QProcess implementation handle that modified autoreap
semantic?  The downside there is that if your calling process has a
process-wide loop that waits for all processes (and explicitly passes
the Linux-specific __WCLONE or __WALL flag, since your processes
launched with a 0 signal would count as "clone" children), they'd get
back the processes you launch, too.  (That would happen with your
userspace-emulated version too for calls *without* __WCLONE or __WALL.)
You'd still get the exit status you need via the clonefd, without a
race, and you wouldn't need to touch process-wide signal handling, so I
think this should still work and avoid any races.

I'm going to try implementing that semantic, which should significantly
simplify the last patch of this series.

> If nothing else. Suppose that the parent does waitid(WEXITED|WSTOPPED).
> Should WSTOPPED work? I think it should.

Yeah, I guess it should.  Arguably there ought to be a clone flag that
lets you receive stop/continue notifications for that process via the
file descriptor instead (to allow a library to handle job control for a
process without touching process-wide signal handling), but that can
come later.

> At the same time, if we add autoreap then probably it also makes sense to add
> WEXITIED_UNLESS_AUTOREAP.

Potentially, though for many applications you could also just pass a
signal of 0 and avoid passing __WALL or __WCLONE.

- Josh Triplett

  reply	other threads:[~2015-03-14 22:03 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-13  1:40 [PATCH 0/6] CLONE_FD: Task exit notification via file descriptor Josh Triplett
2015-03-13  1:40 ` Josh Triplett
2015-03-13  1:40 ` [PATCH 1/6] clone: Support passing tls argument via C rather than pt_regs magic Josh Triplett
2015-03-13  1:40 ` [PATCH 2/6] x86: Opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit Josh Triplett
2015-03-13  1:40   ` Josh Triplett
2015-03-13 22:01   ` Andy Lutomirski
2015-03-13 22:01     ` Andy Lutomirski
2015-03-13 22:31     ` josh
2015-03-13 22:38       ` Andy Lutomirski
2015-03-13 22:43         ` josh
2015-03-13 22:43           ` josh-iaAMLnmF4UmaiuxdJuQwMA
2015-03-13 22:45           ` Andy Lutomirski
2015-03-13 22:45             ` Andy Lutomirski
2015-03-13 23:01             ` josh
2015-03-13 23:01               ` josh-iaAMLnmF4UmaiuxdJuQwMA
2015-03-13  1:40 ` [PATCH 3/6] Introduce a new clone4 syscall with more flag bits and extensible arguments Josh Triplett
2015-03-13  1:40 ` [PATCH 4/6] signal: Factor out a helper function to process task_struct exit_code Josh Triplett
2015-03-13  1:40 ` [PATCH 5/6] fs: Make alloc_fd non-private Josh Triplett
2015-03-13  1:40   ` Josh Triplett
2015-03-13  1:41 ` [PATCH 6/6] clone4: Introduce new CLONE_FD flag to get task exit notification via fd Josh Triplett
2015-03-13 16:21   ` Oleg Nesterov
2015-03-13 19:57     ` josh
2015-03-13 21:34       ` Andy Lutomirski
2015-03-13 21:34         ` Andy Lutomirski
2015-03-13 22:20         ` josh
2015-03-13 22:28           ` Andy Lutomirski
2015-03-13 22:28             ` Andy Lutomirski
2015-03-13 22:34             ` josh
2015-03-13 22:34               ` josh-iaAMLnmF4UmaiuxdJuQwMA
2015-03-13 22:38               ` Andy Lutomirski
2015-03-14 14:14       ` Oleg Nesterov
2015-03-14 14:14         ` Oleg Nesterov
2015-03-14 14:32         ` Oleg Nesterov
2015-03-14 14:32           ` Oleg Nesterov
2015-03-14 18:38           ` Thiago Macieira
2015-03-14 18:54             ` Oleg Nesterov
2015-03-14 22:03               ` Josh Triplett [this message]
2015-03-14 22:03                 ` Josh Triplett
2015-03-14 22:26                 ` Thiago Macieira
2015-03-14 19:01             ` Josh Triplett
2015-03-14 19:18               ` Oleg Nesterov
2015-03-14 19:18                 ` Oleg Nesterov
2015-03-14 19:47                 ` Oleg Nesterov
2015-03-14 19:47                   ` Oleg Nesterov
2015-03-14 20:14                   ` Josh Triplett
2015-03-14 20:14                     ` Josh Triplett
2015-03-14 20:30                     ` Oleg Nesterov
2015-03-14 22:14                       ` Josh Triplett
2015-03-14 22:14                         ` Josh Triplett
2015-03-14 20:03                 ` Josh Triplett
2015-03-14 20:03                   ` Josh Triplett
2015-03-14 20:20                   ` Oleg Nesterov
2015-03-14 22:09         ` Josh Triplett
2015-03-14 14:35   ` Oleg Nesterov
2015-03-14 14:35     ` Oleg Nesterov
2015-03-14 19:15     ` Josh Triplett
2015-03-14 19:15       ` Josh Triplett
2015-03-14 19:24       ` Oleg Nesterov
2015-03-14 19:48         ` Josh Triplett
2015-03-14 19:48           ` Josh Triplett
2015-03-13  1:41 ` [PATCH] clone4.2: New manpage documenting clone4(2) Josh Triplett
2015-03-13  2:07 ` [PATCH 0/6] CLONE_FD: Task exit notification via file descriptor Thiago Macieira
2015-03-13  2:07   ` Thiago Macieira
2015-03-13 16:05 ` David Drysdale
2015-03-13 16:05   ` David Drysdale
2015-03-13 19:42   ` Josh Triplett
2015-03-13 21:16     ` Thiago Macieira
2015-03-13 21:44       ` josh
2015-03-13 21:33     ` Andy Lutomirski
2015-03-13 21:45       ` josh
2015-03-13 21:45         ` josh-iaAMLnmF4UmaiuxdJuQwMA
2015-03-13 21:51         ` Andy Lutomirski
2015-03-13 21:51           ` Andy Lutomirski
2015-03-14  1:11           ` Thiago Macieira
2015-03-14  1:11             ` Thiago Macieira
2015-03-14 19:03             ` Thiago Macieira
2015-03-14 19:29               ` Josh Triplett
2015-03-14 19:29                 ` Josh Triplett
2015-03-15 10:18                 ` David Drysdale
2015-03-15 10:18                   ` David Drysdale
2015-03-15 10:59                   ` Josh Triplett
2015-03-15  8:55     ` David Drysdale
2015-03-15  8:55       ` David Drysdale

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150314220307.GI22130@thin \
    --to=josh@joshtriplett.org \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@redhat.com \
    --cc=mtk.manpages@gmail.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=thiago.macieira@intel.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.