From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752543AbbCNWDW (ORCPT ); Sat, 14 Mar 2015 18:03:22 -0400
Received: from relay5-d.mail.gandi.net ([217.70.183.197]:39642 "EHLO
        relay5-d.mail.gandi.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751239AbbCNWDT (ORCPT );
        Sat, 14 Mar 2015 18:03:19 -0400
X-Originating-IP: 50.43.43.179
Date: Sat, 14 Mar 2015 15:03:08 -0700
From: Josh Triplett
To: Oleg Nesterov
Cc: Thiago Macieira, Al Viro, Andrew Morton, Andy Lutomirski, Ingo Molnar,
        Kees Cook, "Paul E. McKenney", "H. Peter Anvin", Rik van Riel,
        Thomas Gleixner, Michael Kerrisk, linux-kernel@vger.kernel.org,
        linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
        x86@kernel.org
Subject: Re: [PATCH 6/6] clone4: Introduce new CLONE_FD flag to get task
        exit notification via fd
Message-ID: <20150314220307.GI22130@thin>
References: <20150314141414.GA11062@redhat.com>
        <20150314143235.GA12086@redhat.com>
        <28025621.k7WkrfHd4d@tjmaciei-mobl4>
        <20150314185424.GA6813@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150314185424.GA6813@redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Mar 14, 2015 at 07:54:24PM +0100, Oleg Nesterov wrote:
> On 03/14, Thiago Macieira wrote:
> > On Saturday 14 March 2015 15:32:35 Oleg Nesterov wrote:
> > > It is not clear to me what do_wait() should do with ->autoreap child, even
> > > ignoring ptrace.
> > >
> > > Just suppose that real_parent has a single "autoreap" child. Should
> > > wait(NULL) hang then?
> >
> > It should ignore the child that is set to autoreap. wait(NULL) should return
> > -ECHILD, indicating there are no children waiting to be reaped.
>
> I disagree. I won't really argue now, because I think that this needs
> a separate discussion.

We should certainly discuss it further, but why a "separate" discussion
rather than just discussing the semantics of autoreap and wait here?

> And imo "autoreap" should come as a separate feature.

Thinking about this further, I originally thought that CLONE_FD would
*have* to imply autoreap, because otherwise the calling process still
has to call a wait function on the process after getting the exit
notification via the file descriptor.  However, with the current version
(which holds a reference to the task via the task_struct and generates
the data in ->read), it could potentially make sense to have a file
descriptor for a process that still gets zombified until the parent
waits on it.

Autoreap would still be a potentially useful addition to simplify
process management; it would effectively become "always treat this child
as though the parent had the signal ignored or SA_NOCLDWAIT set", which
would just be a simple change to do_notify_parent, rather than a complex
one to exit_notify that potentially interacts with ptrace.  Matching the
semantics of SA_NOCLDWAIT seems reasonable.

Thiago, see below for a question about switching to the semantics of
SA_NOCLDWAIT.

> I think that wait(NULL) should hang like it hangs even if the parent ignores
> SIGCHLD. But in this case the parent should be woken up when the "autoreap"
> child exits.

I had to think about this for a while, but I think it makes sense now.
wait should *not* ever return the PID of an autoreaped process, because
that would introduce a race condition (the caller cannot safely do
*anything* with the PID of an autoreaped process, since by the time it
does, the process may be gone and the PID may be reused).  However, that
doesn't mean wait cannot block on the process, and then subsequently
wake up and return -ECHILD (or keep waiting on some other child process
if there is one).  That's apparently the semantic used with SA_NOCLDWAIT
or if you have SIGCHLD set to SIG_IGN, and matching that seems
appropriate.

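The existing SA_NOCLDWAIT behaviour referred to above (wait blocks while a child is still running, then fails with ECHILD once it exits) can be observed from userspace.  The following is only a minimal sketch: the helper name wait_errno_with_nocldwait and the millisecond timing are illustrative assumptions, and it exercises today's SA_NOCLDWAIT semantics, not the proposed autoreap flag.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): fork one child that
 * lives for child_ms milliseconds while the parent has SA_NOCLDWAIT set
 * on SIGCHLD, then call wait(NULL).  Stores how long wait() blocked in
 * *blocked_ms and returns the errno reported by wait() (0 if wait()
 * succeeded), or -1 if setup failed.
 */
int wait_errno_with_nocldwait(long child_ms, long *blocked_ms)
{
    struct sigaction sa, old;
    struct timespec nap, t0, t1;

    assert(blocked_ms != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = SA_NOCLDWAIT;   /* children are reaped automatically */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: stay alive for a while, then exit. */
        nap.tv_sec = child_ms / 1000;
        nap.tv_nsec = (child_ms % 1000) * 1000000L;
        nanosleep(&nap, NULL);
        _exit(0);
    }

    /* Parent: wait() should block until the child is gone, then fail. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pid_t ret = wait(NULL);
    int err = (ret == -1) ? errno : 0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *blocked_ms = (long)(t1.tv_sec - t0.tv_sec) * 1000 +
                  (t1.tv_nsec - t0.tv_nsec) / 1000000;

    sigaction(SIGCHLD, &old, NULL);   /* restore previous disposition */
    return err;
}
```

Called as wait_errno_with_nocldwait(200, &ms), the helper should return ECHILD with ms close to 200: wait never hands back the child's PID, yet it still blocks until the child has exited.
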
Thiago, could your QProcess implementation handle that modified autoreap
semantic?  The downside there is that if your calling process has a
process-wide loop that waits for all processes (and explicitly passes
the Linux-specific __WCLONE or __WALL flag, since your processes
launched with a 0 signal would count as "clone" children), that loop
would get back the processes you launch, too.  (That would happen with
your userspace-emulated version too for calls *without* __WCLONE or
__WALL.)  You'd still get the exit status you need via the clonefd,
without a race, and you wouldn't need to touch process-wide signal
handling, so I think this should still work.

I'm going to try implementing that semantic, which should significantly
simplify the last patch of this series.

> If nothing else. Suppose that the parent does waitid(WEXITED|WSTOPPED).
> Should WSTOPPED work? I think it should.

Yeah, I guess it should.  Arguably there ought to be a clone flag that
lets you receive stop/continue notifications for that process via the
file descriptor instead (to allow a library to handle job control for a
process without touching process-wide signal handling), but that can
come later.

> At the same time, if we add autoreap then probably it also makes sense to add
> WEXITIED_UNLESS_AUTOREAP.

Potentially, though for many applications you could also just pass a
signal of 0 and avoid passing __WALL or __WCLONE.

- Josh Triplett
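
The remark above about children launched with a 0 exit signal counting as "clone" children can be checked with existing kernels.  This is a rough sketch under stated assumptions: the function and struct names are made up for illustration, the raw clone syscall is called with all-zero arguments (fork-like, exit signal 0), and no other children of the caller are assumed to exist.  It does not use clone4 or CLONE_FD.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative result record; not part of any kernel or libc API. */
struct clone_wait_result {
    int plain_errno;   /* errno from waitpid(-1, ..., 0), 0 on success */
    int wall_matched;  /* 1 if waitpid(-1, ..., __WALL) returned the child */
    int exit_code;     /* child's exit code as seen via __WALL, else -1 */
};

/*
 * Hypothetical helper: create a child whose exit signal is 0, then wait
 * for it first without and then with __WALL.  Returns 0 on success and
 * -1 if the child could not be created.
 */
int wait_for_zero_signal_child(struct clone_wait_result *res)
{
    assert(res != NULL);

    /* All arguments zero, so the per-architecture argument order of the
     * raw clone syscall does not matter here. */
    long pid = syscall(SYS_clone, 0UL, 0UL, 0UL, 0UL, 0UL);
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(7);   /* child: exit immediately with a recognizable code */

    /* A plain wait-for-anything call skips "clone" children entirely. */
    int status = 0;
    pid_t got = waitpid(-1, &status, 0);
    res->plain_errno = (got == -1) ? errno : 0;

    /* With __WALL the same child is visible and gets reaped. */
    status = 0;
    got = waitpid(-1, &status, __WALL);
    res->wall_matched = (got == (pid_t)pid);
    res->exit_code = (res->wall_matched && WIFEXITED(status))
                         ? WEXITSTATUS(status) : -1;
    return 0;
}
```

On a current kernel the first waitpid should fail with ECHILD, and only the __WALL call should return the child and its exit code of 7, which is why a process-wide wait loop sees such children only if it explicitly passes __WCLONE or __WALL.
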