From: Oleg Nesterov <oleg@redhat.com>
To: Kay Sievers <kay.sievers@vrfy.org>
Cc: Lennart Poettering <mzxreary@0pointer.de>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-man@vger.kernel.org, roland@hack.frob.com,
	torvalds@linux-foundation.org
Subject: Re: + prctl-add-pr_setget_child_reaper-to-allow-simple-process-supervision .patch added to -mm tree
Date: Sat, 20 Aug 2011 17:33:31 +0200
Message-ID: <20110820153331.GA22577@redhat.com>
In-Reply-To: <20110819145815.GA15420@redhat.com>

On 08/19, Oleg Nesterov wrote:
>
> On 08/19, Kay Sievers wrote:
> >
> > version 4:
>
> Looks correct... But I don't trust myself. Especially after I missed
> the problem with klthreads ;)

And I missed another problem. So. We do need to check that
father != init_task even if we check has_child_subreaper.



Suppose that a kernel thread execs the usermode task T which
does prctl(REAPER). Suppose that its grandchild C exits, it
should be reparented to T.

If T is alive - everything fine, the lookup finds T with
->is_child_subreaper set.

If T has exited - everything fine again, C->parent was already
reparented to pid_ns->child_reaper (or another sub-reaper).

But! If T exits, there is a window between setting PF_EXITING
and forget_original_parent() which should re-parent C->parent.
If C exits in this window, it will see PF_EXITING and continue
the lookup, but it will never reach pid_ns->child_reaper: T's
ancestors are kernel threads, so the walk ends up at init_task.

(if we could check ->exit_state instead of PF_EXITING, everything
 would be fine).





And cough... there is another, not that subtle problem ;) That
task T can _clear_ ->is_child_subreaper after forking the child.
But since this obviously can't clear C->has_child_subreaper, we
can't trust that flag.




So, please add this check back. I insisted you should remove it,
but I was wrong. Otherwise looks correct.
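
To make the two cases above concrete, here is a minimal userspace model
of the ancestor walk. It is only a sketch, not the code from the patch:
struct task, find_reaper(), run_scenario() and MAX_STEPS are hypothetical
names invented for the illustration, thread groups are reduced to a
single task, and the step limit exists only so that the runaway walk can
be observed instead of spinning forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for task_struct; only the fields the walk looks at. */
struct task {
	struct task *real_parent;
	bool is_child_subreaper;
	bool exiting;			/* models PF_EXITING */
};

enum walk_result { FOUND_T, FOUND_CHILD_REAPER, WALK_RUNAWAY };

#define MAX_STEPS 64	/* assumption: the real loop has no such bound */

/*
 * Walk C's ancestors looking for a live sub-reaper. Returns NULL if the
 * walk did not stop within MAX_STEPS, i.e. it would never terminate.
 */
struct task *find_reaper(struct task *c, struct task *child_reaper,
			 struct task *init_task, bool check_init_task)
{
	struct task *reaper = c->real_parent;
	int steps;

	for (steps = 0; steps < MAX_STEPS; steps++) {
		/* The check under discussion: stop at init_task. */
		if (check_init_task && reaper == init_task)
			return child_reaper;
		if (reaper == child_reaper)
			return child_reaper;
		if (reaper->is_child_subreaper && !reaper->exiting)
			return reaper;
		reaper = reaper->real_parent;
	}
	return NULL;
}

/*
 * Build  init_task -> kthread -> T -> P -> C  (pid 1 hangs off init_task
 * on a separate branch) and run the walk on behalf of C.
 */
enum walk_result run_scenario(bool t_is_subreaper, bool t_exiting,
			      bool check_init_task)
{
	struct task init_task = { 0 }, pid1 = { 0 }, kthread = { 0 };
	struct task t = { 0 }, p = { 0 }, c = { 0 };
	struct task *reaper;

	init_task.real_parent = &init_task;	/* init_task is its own parent */
	pid1.real_parent = &init_task;
	kthread.real_parent = &init_task;
	t.real_parent = &kthread;
	t.is_child_subreaper = t_is_subreaper;
	t.exiting = t_exiting;
	p.real_parent = &t;
	c.real_parent = &p;

	reaper = find_reaper(&c, &pid1, &init_task, check_init_task);
	if (!reaper)
		return WALK_RUNAWAY;
	return reaper == &t ? FOUND_T : FOUND_CHILD_REAPER;
}
```

In this model run_scenario(true, true, false) is the PF_EXITING window
and run_scenario(false, false, false) is T clearing its flag after the
fork; both run past init_task without ever stopping. With the last
argument set to true, both fall back to the namespace's child reaper.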





Damn. And why do we check PF_EXITING but not exit_state? This is
because we have to drop tasklist for exit_task_namespaces(), see
762a24beed3f3ab93224bd447710e6c36fcf1968. However, there were a
lot of changes since then. AFAICS we can change do_notify_parent()
to use task_active_pid_ns(tsk->parent) and then we can call
exit_task_namespaces() before exit_notify(). In this case we can
change exit_notify/forget_original_parent to reparent and set
exit_state under tasklist; this also saves an unlock+lock.

Oleg.

