All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kay Sievers <kay.sievers@vrfy.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	lennart@poettering.net, linux-man@vger.kernel.org,
	roland@hack.frob.com, torvalds@linux-foundation.org
Subject: Re: + prctl-add-pr_setget_child_reaper-to-allow-simple-process-supervision .patch added to -mm tree
Date: Wed, 17 Aug 2011 17:45:03 +0200	[thread overview]
Message-ID: <CAPXgP101N_GESzpqu=P_H8cLoekMzb2_W2eWyAqATSjm4Gj9CA@mail.gmail.com> (raw)
In-Reply-To: <20110817134516.GA14136@redhat.com>

On Wed, Aug 17, 2011 at 15:45, Oleg Nesterov <oleg@redhat.com> wrote:
> On 08/17, Kay Sievers wrote:
>> On Wed, Aug 17, 2011 at 13:55, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > I try to never argue with the new features. But to be honest, this
>> > doesn't look very good to me.
>> >
>> > OK, a service manager M does prctl(PR_SET_CHILD_REAPER), then it forks
>> > a service X which forks another child C and exits. Then C exits and
>> > notifies M.
>> >
>> > But. How can M know that the service X should be restarted? It only
>> > knows the pid.
>>
>> Legacy services write pid files and we read them, so we know the pid
>> to watch for. Proper services never double-fork and reparent in a
>> modern init environment.
>
> OK. So, this patch can only help to handle the legacy services?

It helps them with services that need it. It is not recommended to
double-fork ever with a modern init system, but it's historic default
and common practice, and we are not going to change that any time
soon.

> And
> the service should participate (write pid files for example). And,

This is not meant as a security feature, if that's what your asking.
It will not prevent services from doing nasty things and escape the
process that started them. But it's still a feature that today only
PID 1 and which we need for more processes.

>> > What if wait(WEXITED) succeeds because C in turn does
>> > fork + exit?
>>
>> Nothing is really doing this.
>
> OK. But this means you propose this patch to solve the very specific
> problems.

No, it's for a very common problem. But again, it's not a security feature.

> IOW, imho this doesn't look very useful "in general" to me.

It is very useful if you have an init-like daemon.

> May be we need something else instead... And iiuc you don't really
> need to change the reparenting, you only want the notification if
> the process exits.

No, we want to be the parent of the process, and we want to be the one
who reaps all the child process, not only receive some out-of-band
notifications. The sub-init is the babysitter of all the things it has
started, and that should be reflected in the parent child relation.

>> >> @@ -1296,6 +1296,8 @@ struct task_struct {
>> >>                                * execve */
>> >>       unsigned in_iowait:1;
>> >>
>> >> +     /* Reparent child processes to this process instead of pid 1. */
>> >> +     unsigned child_reaper:1;
>> >
>> > First of all - this is already very wrong imho. This should be
>> > per-process, not per-thread.
>>
>> What do you mean? That would go where instead?
>
> You should mark the whole process as sub-reaper, not a single thread
> which does prctl(). The parent/child relationship is process-wide.

Ok.

> If nothing else. Suppose that application does pthread_create(), the
> new thread does prctl(REAPER) and exits.

I get the (weird) idea. :)

>> >> +     /* find the first ancestor which is marked as child_reaper */
>> >> +     for (reaper = father->parent;
>> >> +          reaper != &init_task && reaper != pid_ns->child_reaper;
>> >> +          reaper = reaper->parent)
>> >
>> > This loop can never reach init_task/child_reaper and crash the kernel.
>>
>> You mean: *if* this loop can never ...?
>
> Yes.
>
>> > For example, father->parent can point to init_task's sub-thread.
>> >
>> > OTOH you shouldn't use init_task at all.
>>
>> What would we use instead?
>
> You should check ->child_reaper only. But see above, it can be multithreaded.

The main PID 1 from the system has no ->child_reaper set as far as I
see, hence we check for init_task.

>> > Also. You shouldn't do this if the sub-namespace init exits, this is
>> > wrong.
>>
>> It we find a sub-init, before the namespace PID1, why wouldn't we return it?
>
> Ah, I meant pid_ns->child_reaper, not task->child_reaper.
>
> If pid_ns->child_reaper exits we should never try to "reparent" its
> children, see zap_pid_ns_processes() in particular. IOW, this should
> go into the "else" branch of "if (pid_ns->child_reaper == father)"

I don't understand this. If we find a marked task->child_reaper
_before_ we find a pid_ns->child_reaper in the chain of parents, why
wouldn't we return it?

Kay

WARNING: multiple messages have this Message-ID (diff)
From: Kay Sievers <kay.sievers-tD+1rO4QERM@public.gmane.org>
To: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	lennart-mdGvqq1h2p+GdvJs77BJ7Q@public.gmane.org,
	linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	roland-/Z5OmTQCD9xF6kxbq+BtvQ@public.gmane.org,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org
Subject: Re: + prctl-add-pr_setget_child_reaper-to-allow-simple-process-supervision .patch added to -mm tree
Date: Wed, 17 Aug 2011 17:45:03 +0200	[thread overview]
Message-ID: <CAPXgP101N_GESzpqu=P_H8cLoekMzb2_W2eWyAqATSjm4Gj9CA@mail.gmail.com> (raw)
In-Reply-To: <20110817134516.GA14136-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Wed, Aug 17, 2011 at 15:45, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 08/17, Kay Sievers wrote:
>> On Wed, Aug 17, 2011 at 13:55, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> >
>> > I try to never argue with the new features. But to be honest, this
>> > doesn't look very good to me.
>> >
>> > OK, a service manager M does prctl(PR_SET_CHILD_REAPER), then it forks
>> > a service X which forks another child C and exits. Then C exits and
>> > notifies M.
>> >
>> > But. How can M know that the service X should be restarted? It only
>> > knows the pid.
>>
>> Legacy services write pid files and we read them, so we know the pid
>> to watch for. Proper services never double-fork and reparent in a
>> modern init environment.
>
> OK. So, this patch can only help to handle the legacy services?

It helps them with services that need it. It is not recommended to
double-fork ever with a modern init system, but it's historic default
and common practice, and we are not going to change that any time
soon.

> And
> the service should participate (write pid files for example). And,

This is not meant as a security feature, if that's what your asking.
It will not prevent services from doing nasty things and escape the
process that started them. But it's still a feature that today only
PID 1 and which we need for more processes.

>> > What if wait(WEXITED) succeeds because C in turn does
>> > fork + exit?
>>
>> Nothing is really doing this.
>
> OK. But this means you propose this patch to solve the very specific
> problems.

No, it's for a very common problem. But again, it's not a security feature.

> IOW, imho this doesn't look very useful "in general" to me.

It is very useful if you have an init-like daemon.

> May be we need something else instead... And iiuc you don't really
> need to change the reparenting, you only want the notification if
> the process exits.

No, we want to be the parent of the process, and we want to be the one
who reaps all the child process, not only receive some out-of-band
notifications. The sub-init is the babysitter of all the things it has
started, and that should be reflected in the parent child relation.

>> >> @@ -1296,6 +1296,8 @@ struct task_struct {
>> >>                                * execve */
>> >>       unsigned in_iowait:1;
>> >>
>> >> +     /* Reparent child processes to this process instead of pid 1. */
>> >> +     unsigned child_reaper:1;
>> >
>> > First of all - this is already very wrong imho. This should be
>> > per-process, not per-thread.
>>
>> What do you mean? That would go where instead?
>
> You should mark the whole process as sub-reaper, not a single thread
> which does prctl(). The parent/child relationship is process-wide.

Ok.

> If nothing else. Suppose that application does pthread_create(), the
> new thread does prctl(REAPER) and exits.

I get the (weird) idea. :)

>> >> +     /* find the first ancestor which is marked as child_reaper */
>> >> +     for (reaper = father->parent;
>> >> +          reaper != &init_task && reaper != pid_ns->child_reaper;
>> >> +          reaper = reaper->parent)
>> >
>> > This loop can never reach init_task/child_reaper and crash the kernel.
>>
>> You mean: *if* this loop can never ...?
>
> Yes.
>
>> > For example, father->parent can point to init_task's sub-thread.
>> >
>> > OTOH you shouldn't use init_task at all.
>>
>> What would we use instead?
>
> You should check ->child_reaper only. But see above, it can be multithreaded.

The main PID 1 from the system has no ->child_reaper set as far as I
see, hence we check for init_task.

>> > Also. You shouldn't do this if the sub-namespace init exits, this is
>> > wrong.
>>
>> It we find a sub-init, before the namespace PID1, why wouldn't we return it?
>
> Ah, I meant pid_ns->child_reaper, not task->child_reaper.
>
> If pid_ns->child_reaper exits we should never try to "reparent" its
> children, see zap_pid_ns_processes() in particular. IOW, this should
> go into the "else" branch of "if (pid_ns->child_reaper == father)"

I don't understand this. If we find a marked task->child_reaper
_before_ we find a pid_ns->child_reaper in the chain of parents, why
wouldn't we return it?

Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2011-08-17 15:45 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-16 20:11 + prctl-add-pr_setget_child_reaper-to-allow-simple-process-supervision.patch added to -mm tree akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
2011-08-17 11:55 ` + prctl-add-pr_setget_child_reaper-to-allow-simple-process-supervision .patch " Oleg Nesterov
2011-08-17 11:55   ` Oleg Nesterov
2011-08-17 13:05   ` Oleg Nesterov
2011-08-17 13:05     ` Oleg Nesterov
2011-08-17 13:21     ` Kay Sievers
2011-08-17 13:21       ` Kay Sievers
2011-08-17 13:37       ` Alan Cox
2011-08-17 13:37         ` Alan Cox
2011-08-23  0:30         ` Colin Walters
2011-08-23  0:30           ` Colin Walters
2011-08-17 14:16       ` Oleg Nesterov
2011-08-17 14:16         ` Oleg Nesterov
2011-08-17 16:03       ` Denys Vlasenko
2011-08-17 16:03         ` Denys Vlasenko
2011-08-17 13:13   ` Kay Sievers
2011-08-17 13:45     ` Oleg Nesterov
2011-08-17 13:45       ` Oleg Nesterov
2011-08-17 15:45       ` Kay Sievers [this message]
2011-08-17 15:45         ` Kay Sievers
2011-08-17 15:53         ` Alan Cox
2011-08-17 15:53           ` Alan Cox
2011-08-17 16:20         ` Oleg Nesterov
2011-08-17 16:20           ` Oleg Nesterov
2011-08-17 16:47           ` Kay Sievers
2011-08-17 16:47             ` Kay Sievers
2011-08-17 18:57             ` Oleg Nesterov
2011-08-17 18:57               ` Oleg Nesterov
2011-08-17 20:56               ` Kay Sievers
2011-08-17 20:56                 ` Kay Sievers
2011-08-18 12:43       ` Lennart Poettering
2011-08-18 12:43         ` Lennart Poettering
2011-08-18 14:25         ` Oleg Nesterov
2011-08-18 14:25           ` Oleg Nesterov
2011-08-18 18:11           ` Kay Sievers
2011-08-18 18:48             ` Oleg Nesterov
2011-08-18 18:48               ` Oleg Nesterov
2011-08-19  1:31               ` Kay Sievers
2011-08-19  1:31                 ` Kay Sievers
2011-08-19 12:25                 ` Oleg Nesterov
2011-08-19 12:25                   ` Oleg Nesterov
2011-08-19 12:44                   ` Kay Sievers
2011-08-19 12:44                     ` Kay Sievers
2011-08-19 13:13                     ` Oleg Nesterov
2011-08-19 13:13                       ` Oleg Nesterov
2011-08-19 14:20                       ` Kay Sievers
2011-08-19 14:58                         ` Oleg Nesterov
2011-08-19 14:58                           ` Oleg Nesterov
2011-08-20 15:33                           ` Oleg Nesterov
2011-08-20 15:33                             ` Oleg Nesterov
2011-08-21 18:33                             ` Kay Sievers
2011-08-22 11:14                               ` Oleg Nesterov
2011-08-22 11:14                                 ` Oleg Nesterov
2011-08-22 23:48                                 ` Kay Sievers
2011-08-22 23:48                                   ` Kay Sievers
2011-08-18 21:23             ` Linus Torvalds
2011-08-18 21:23               ` Linus Torvalds
2011-08-18 21:55               ` Kay Sievers
2011-08-18 21:55                 ` Kay Sievers
2011-08-18 22:22                 ` Linus Torvalds
2011-08-18 22:22                   ` Linus Torvalds
2011-08-19  0:48                   ` Kay Sievers
2011-08-19  0:48                     ` Kay Sievers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPXgP101N_GESzpqu=P_H8cLoekMzb2_W2eWyAqATSjm4Gj9CA@mail.gmail.com' \
    --to=kay.sievers@vrfy.org \
    --cc=akpm@linux-foundation.org \
    --cc=lennart@poettering.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=roland@hack.frob.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.