[PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
@ 2010-09-16 18:59 Will Drewry
       [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-16 18:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
	Roland McGrath, Neil Horman, Andi Kleen, Eric W. Biederman,
	containers, linux-fsdevel, Will Drewry

format_corename uses task_tgid_vnr to provide the numeric pid of a
core-dumping process.  For file-based coredumps, this is perfectly
satisfactory.  However, when the core_pattern contains a pipe, the
substituted PID is invalid in the namespace of the core_pattern pipe
helper, the init namespace.

By changing this, any core collector may now find the process in the
init namespace /proc.  This helps with VFS namespacing too since the
mount root is available via /proc.  Unfortunately, it does not help in
cases of more complex namespacing, like net namespaces.  For that, the
helper thread will need to be migrated to the core-dump namespaces.

I have a separate patch series which implements migrating the
____call_usermodehelper thread to the coredump namespace, but it adds a
fair amount of complexity which might be better handled by someone who
understands that code.  I'm happy to mail it out as well though (it
works, but I don't assign a namespaced pid which may open up issues on
its own).

Any and all comments, feedback will be appreciated!

Signed-off-by: Will Drewry <wad@chromium.org>
---
 fs/exec.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..0b8a874 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
 	char *const out_end = corename + CORENAME_MAX_SIZE;
 	int rc;
 	int pid_in_pattern = 0;
+	pid_t pid = task_tgid_vnr(current);
+
+	/* The pipe helper runs in the init namespace and should
+	 * receive the matching pid until that changes.
+	 */
+	if (ispipe)
+		pid = task_tgid_nr(current);

 	/* Repeat as long as we have more pattern to process and more output
 	   space */
@@ -1489,7 +1496,7 @@ static int format_corename(char *corename, long signr)
 			case 'p':
 				pid_in_pattern = 1;
 				rc = snprintf(out_ptr, out_end - out_ptr,
-					      "%d", task_tgid_vnr(current));
+					      "%d", pid);
 				if (rc > out_end - out_ptr)
 					goto out;
 				out_ptr += rc;
@@ -1568,7 +1575,7 @@ static int format_corename(char *corename, long signr)
 	 * the filename. Do not do this for piped commands. */
 	if (!ispipe && !pid_in_pattern && core_uses_pid) {
 		rc = snprintf(out_ptr, out_end - out_ptr,
-			      ".%d", task_tgid_vnr(current));
+			      ".%d", pid);
 		if (rc > out_end - out_ptr)
 			goto out;
 		out_ptr += rc;
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2010-09-16 19:35   ` Oleg Nesterov
  2010-09-17 13:26   ` Andi Kleen
  1 sibling, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-16 19:35 UTC (permalink / raw)
  To: Will Drewry
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Andrew Morton, Roland McGrath, Eric W. Biederman

On 09/16, Will Drewry wrote:
>
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>  	char *const out_end = corename + CORENAME_MAX_SIZE;
>  	int rc;
>  	int pid_in_pattern = 0;
> +	pid_t pid = task_tgid_vnr(current);
> +
> +	/* The pipe helper runs in the init namespace and should
> +	 * receive the matching pid until that changes.
> +	 */
> +	if (ispipe)
> +		pid = task_tgid_nr(current);

Agreed, it doesn't make sense to pass the "local" pid to the init ns.

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry
       [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2010-09-16 19:35 ` Oleg Nesterov
       [not found]   ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2010-09-16 20:12   ` Eric W. Biederman
  2010-09-17 13:26 ` Andi Kleen
  2 siblings, 2 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-16 19:35 UTC (permalink / raw)
  To: Will Drewry
  Cc: linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro,
	Roland McGrath, Neil Horman, Andi Kleen, Eric W. Biederman,
	containers, linux-fsdevel

On 09/16, Will Drewry wrote:
>
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>  	char *const out_end = corename + CORENAME_MAX_SIZE;
>  	int rc;
>  	int pid_in_pattern = 0;
> +	pid_t pid = task_tgid_vnr(current);
> +
> +	/* The pipe helper runs in the init namespace and should
> +	 * receive the matching pid until that changes.
> +	 */
> +	if (ispipe)
> +		pid = task_tgid_nr(current);

Agreed, it doesn't make sense to pass the "local" pid to the init ns.

Oleg.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found]   ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2010-09-16 20:12     ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2010-09-16 20:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Andrew Morton, Roland McGrath

Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> On 09/16, Will Drewry wrote:
>>
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>>  	char *const out_end = corename + CORENAME_MAX_SIZE;
>>  	int rc;
>>  	int pid_in_pattern = 0;
>> +	pid_t pid = task_tgid_vnr(current);
>> +
>> +	/* The pipe helper runs in the init namespace and should
>> +	 * receive the matching pid until that changes.
>> +	 */
>> +	if (ispipe)
>> +		pid = task_tgid_nr(current);
>
> Agreed, it doesn't make sense to pass the "local" pid to the init ns.

Yes, passing the local pid to the init pid namespace doesn't make a lot
of sense.  I have recently fixed unix domain sockets to avoid doing that
kind of silliness.

That said I don't think this is a complete fix.  We also potentially
have the same issue with the uts namespace and the user namespace.

I believe the core file holds all of this information relative to the
process that is dying, one elf note or other so we don't need to worry
about information loss.

As for how to implement this for the pid case can we simply grab the
namespace at the ispipe stage and use task_tgid_nr_ns.  Something like:
pid_ns = current->nsproxy->pid_ns;
if (is_pipe)
	pid_ns = &init_pid_ns;

....
	/* pid */
	case 'p':
		pid_in_pattern = 1;
		rc = snprintf(out_ptr, out_end - out_ptr,
			      "%d", task_tgid_pid_nr_ns(pid_ns, current));
		if (rc > out_end - out_ptr)
			goto out;
		out_ptr += rc;
		break;

Will you were saying something about complex namespacing (in particular
the network namespace giving you problems).  The network namespace has
no influence over pipes so how does that come into play?

I can imagine that it would be nice to have different core patterns
depending on where you are in the process tree, so a container can
do something different than the system outside of the container.  Are
you thinking along those lines, or are you imagining something else?

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-16 19:35 ` Oleg Nesterov
       [not found]   ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2010-09-16 20:12   ` Eric W. Biederman
  2010-09-16 21:02     ` Will Drewry
                       ` (2 more replies)
  1 sibling, 3 replies; 43+ messages in thread
From: Eric W. Biederman @ 2010-09-16 20:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Will Drewry, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen,
	containers, linux-fsdevel

Oleg Nesterov <oleg@redhat.com> writes:

> On 09/16, Will Drewry wrote:
>>
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>>  	char *const out_end = corename + CORENAME_MAX_SIZE;
>>  	int rc;
>>  	int pid_in_pattern = 0;
>> +	pid_t pid = task_tgid_vnr(current);
>> +
>> +	/* The pipe helper runs in the init namespace and should
>> +	 * receive the matching pid until that changes.
>> +	 */
>> +	if (ispipe)
>> +		pid = task_tgid_nr(current);
>
> Agreed, it doesn't make sense to pass the "local" pid to the init ns.

Yes, passing the local pid to the init pid namespace doesn't make a lot
of sense.  I have recently fixed unix domain sockets to avoid doing that
kind of silliness.

That said I don't think this is a complete fix.  We also potentially
have the same issue with the uts namespace and the user namespace.

I believe the core file holds all of this information relative to the
process that is dying, one elf note or other so we don't need to worry
about information loss.

As for how to implement this for the pid case can we simply grab the
namespace at the ispipe stage and use task_tgid_nr_ns.  Something like:
pid_ns = current->nsproxy->pid_ns;
if (is_pipe)
	pid_ns = &init_pid_ns;

....
	/* pid */
	case 'p':
		pid_in_pattern = 1;
		rc = snprintf(out_ptr, out_end - out_ptr,
			      "%d", task_tgid_pid_nr_ns(pid_ns, current));
		if (rc > out_end - out_ptr)
			goto out;
		out_ptr += rc;
		break;

Will you were saying something about complex namespacing (in particular
the network namespace giving you problems).  The network namespace has
no influence over pipes so how does that come into play?

I can imagine that it would be nice to have different core patterns
depending on where you are in the process tree, so a container can
do something different than the system outside of the container.  Are
you thinking along those lines, or are you imagining something else?

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found]     ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
@ 2010-09-16 21:02       ` Will Drewry
  2010-09-17 19:08       ` Roland McGrath
  1 sibling, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-16 21:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen,
	Alexander Viro, KOSAKI Motohiro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Roland McGrath

On Thu, Sep 16, 2010 at 3:12 PM, Eric W. Biederman
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote:
> Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>
>> On 09/16, Will Drewry wrote:
>>>
>>> --- a/fs/exec.c
>>> +++ b/fs/exec.c
>>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>>>      char *const out_end = corename + CORENAME_MAX_SIZE;
>>>      int rc;
>>>      int pid_in_pattern = 0;
>>> +    pid_t pid = task_tgid_vnr(current);
>>> +
>>> +    /* The pipe helper runs in the init namespace and should
>>> +     * receive the matching pid until that changes.
>>> +     */
>>> +    if (ispipe)
>>> +            pid = task_tgid_nr(current);
>>
>> Agreed, it doesn't make sense to pass the "local" pid to the init ns.
>
> Yes, passing the local pid to the init pid namespace doesn't make a lot
> of sense.  I have recently fixed unix domain sockets to avoid doing that
> kind of silliness.
>
> That said I don't think this is a complete fix.  We also potentially
> have the same issue with the uts namespace and the user namespace.

Hrm true.  I was looking at getting the proper pid to the core handler
as a means for it to get the rest from
/proc/<pid>/[status|mountinfo|root] and so on.  As is, it seemed
non-trivial to reliably walk the /proc tree to find the dumping
process.

> I believe the core file holds all of this information relative to the
> process that is dying, one elf note or other so we don't need to worry
> about information loss.

Yeah - my main concern was /proc access to get information that might
otherwise be lost.

> As for how to implement this for the pid case can we simply grab the
> namespace at the ispipe stage and use task_tgid_nr_ns.  Something like:
> pid_ns = current->nsproxy->pid_ns;
> if (is_pipe)
>        pid_ns = &init_pid_ns;
>
> ....
>        /* pid */
>        case 'p':
>                pid_in_pattern = 1;
>                rc = snprintf(out_ptr, out_end - out_ptr,
>                              "%d", task_tgid_pid_nr_ns(pid_ns, current));
>                if (rc > out_end - out_ptr)
>                        goto out;
>                out_ptr += rc;
>                break;

Would that yield different results than the task_tgid_nr versus vnr,
or is it just cleaner?

> Will you were saying something about complex namespacing (in particular
> the network namespace giving you problems).  The network namespace has
> no influence over pipes so how does that come into play?

Ah! All I meant is that this patch lets you access anything that is in
/proc/[tgid]/... for the dumping process.  However, it seems that in
the long run, core_pattern should run the pipe handler inside the
namespace(s) of the process do_coredump is running for.  For instance,
if it wanted to make a network connection on coredump, it'd be doing
it from the init namespace.

> I can imagine that it would be nice to have different core patterns
> depending on where you are in the process tree, so a container can
> do something different than the system outside of the container.  Are
> you thinking along those lines, or are you imagining something else?

Yup - in general, I was thinking that even if core_pattern was not
split by namespace, it would be nice to attempt to execute it in the
namespace of the process that needs it.  My understand of namespaces
is that they were a gateway to longer term lightweight containers (as
the patches go into kernel.org) which would mean that a change to
core_pattern shouldn't result in a process running in the init
namespace (at least not by default!).

To that end, I implemented a function similar to your setns() syscall
patch which calls create_new_namespaces (and copy_fs_struct) to create
an unattached fs and nsproxy which I then migrate the
____call_usermodehelper thread over to in umh_setup_pipe.  It leaves
it with a pid of 0, but otherwise functional.  (I also played with
changing its pid or best effort assigning one in the namespace as per
older threads, but it all seemed to fragile.) I can post them on this
thread (or wherever) if you'd like to see them, but they're pretty
trivial, and I'm unsure of their correctness.

cheers!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-16 20:12   ` Eric W. Biederman
@ 2010-09-16 21:02     ` Will Drewry
       [not found]     ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
  2010-09-17 19:08     ` Roland McGrath
  2 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-16 21:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen,
	containers, linux-fsdevel

On Thu, Sep 16, 2010 at 3:12 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Oleg Nesterov <oleg@redhat.com> writes:
>
>> On 09/16, Will Drewry wrote:
>>>
>>> --- a/fs/exec.c
>>> +++ b/fs/exec.c
>>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
>>>      char *const out_end = corename + CORENAME_MAX_SIZE;
>>>      int rc;
>>>      int pid_in_pattern = 0;
>>> +    pid_t pid = task_tgid_vnr(current);
>>> +
>>> +    /* The pipe helper runs in the init namespace and should
>>> +     * receive the matching pid until that changes.
>>> +     */
>>> +    if (ispipe)
>>> +            pid = task_tgid_nr(current);
>>
>> Agreed, it doesn't make sense to pass the "local" pid to the init ns.
>
> Yes, passing the local pid to the init pid namespace doesn't make a lot
> of sense.  I have recently fixed unix domain sockets to avoid doing that
> kind of silliness.
>
> That said I don't think this is a complete fix.  We also potentially
> have the same issue with the uts namespace and the user namespace.

Hrm true.  I was looking at getting the proper pid to the core handler
as a means for it to get the rest from
/proc/<pid>/[status|mountinfo|root] and so on.  As is, it seemed
non-trivial to reliably walk the /proc tree to find the dumping
process.

> I believe the core file holds all of this information relative to the
> process that is dying, one elf note or other so we don't need to worry
> about information loss.

Yeah - my main concern was /proc access to get information that might
otherwise be lost.

> As for how to implement this for the pid case can we simply grab the
> namespace at the ispipe stage and use task_tgid_nr_ns.  Something like:
> pid_ns = current->nsproxy->pid_ns;
> if (is_pipe)
>        pid_ns = &init_pid_ns;
>
> ....
>        /* pid */
>        case 'p':
>                pid_in_pattern = 1;
>                rc = snprintf(out_ptr, out_end - out_ptr,
>                              "%d", task_tgid_pid_nr_ns(pid_ns, current));
>                if (rc > out_end - out_ptr)
>                        goto out;
>                out_ptr += rc;
>                break;

Would that yield different results than the task_tgid_nr versus vnr,
or is it just cleaner?

> Will you were saying something about complex namespacing (in particular
> the network namespace giving you problems).  The network namespace has
> no influence over pipes so how does that come into play?

Ah! All I meant is that this patch lets you access anything that is in
/proc/[tgid]/... for the dumping process.  However, it seems that in
the long run, core_pattern should run the pipe handler inside the
namespace(s) of the process do_coredump is running for.  For instance,
if it wanted to make a network connection on coredump, it'd be doing
it from the init namespace.

> I can imagine that it would be nice to have different core patterns
> depending on where you are in the process tree, so a container can
> do something different than the system outside of the container.  Are
> you thinking along those lines, or are you imagining something else?

Yup - in general, I was thinking that even if core_pattern was not
split by namespace, it would be nice to attempt to execute it in the
namespace of the process that needs it.  My understand of namespaces
is that they were a gateway to longer term lightweight containers (as
the patches go into kernel.org) which would mean that a change to
core_pattern shouldn't result in a process running in the init
namespace (at least not by default!).

To that end, I implemented a function similar to your setns() syscall
patch which calls create_new_namespaces (and copy_fs_struct) to create
an unattached fs and nsproxy which I then migrate the
____call_usermodehelper thread over to in umh_setup_pipe.  It leaves
it with a pid of 0, but otherwise functional.  (I also played with
changing its pid or best effort assigning one in the namespace as per
older threads, but it all seemed to fragile.) I can post them on this
thread (or wherever) if you'd like to see them, but they're pretty
trivial, and I'm unsure of their correctness.

cheers!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  2010-09-16 19:35   ` Oleg Nesterov
@ 2010-09-17 13:26   ` Andi Kleen
  1 sibling, 0 replies; 43+ messages in thread
From: Andi Kleen @ 2010-09-17 13:26 UTC (permalink / raw)
  To: Will Drewry
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro, KOSAKI Motohiro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Roland McGrath, Eric W. Biederman

On Thu, 16 Sep 2010 13:59:59 -0500
Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:

> format_corename uses task_tgid_vnr to provide the numeric pid of a
> core-dumping process.  For file-based coredumps, this is perfectly
> satisfactory.  However, when the core_pattern contains a pipe, the
> substituted PID is invalid in the namespace of the core_pattern pipe
> helper, the init namespace.

Nasty problem. I wonder how many more similar problems name spaces
have introduced.

But wouldn't it be better to place the helper into the name space(s)
of the executed process? I guess it would risk breaking some existing
set ups, but it seem like the cleanest solution to me. If you want
to move the core dump out of the name space you could still
use a named pipe or something like that with someone outside listening.

That would also fix the net namespace problem you mentioned

-Andi

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry
       [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  2010-09-16 19:35 ` Oleg Nesterov
@ 2010-09-17 13:26 ` Andi Kleen
  2010-09-17 14:52     ` Will Drewry
                     ` (3 more replies)
  2 siblings, 4 replies; 43+ messages in thread
From: Andi Kleen @ 2010-09-17 13:26 UTC (permalink / raw)
  To: Will Drewry
  Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, linux-fsdevel

On Thu, 16 Sep 2010 13:59:59 -0500
Will Drewry <wad@chromium.org> wrote:

> format_corename uses task_tgid_vnr to provide the numeric pid of a
> core-dumping process.  For file-based coredumps, this is perfectly
> satisfactory.  However, when the core_pattern contains a pipe, the
> substituted PID is invalid in the namespace of the core_pattern pipe
> helper, the init namespace.

Nasty problem. I wonder how many more similar problems name spaces
have introduced.

But wouldn't it be better to place the helper into the name space(s)
of the executed process? I guess it would risk breaking some existing
set ups, but it seem like the cleanest solution to me. If you want
to move the core dump out of the name space you could still
use a named pipe or something like that with someone outside listening.

That would also fix the net namespace problem you mentioned

-Andi



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
@ 2010-09-17 14:52     ` Will Drewry
  2010-09-17 15:16     ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry
  2010-09-17 15:16     ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
  2 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro, KOSAKI Motohiro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Eugene Teo,
	Roland McGrath, Eric W. Biederman

On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote:
> On Thu, 16 Sep 2010 13:59:59 -0500
> Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>
>> format_corename uses task_tgid_vnr to provide the numeric pid of a
>> core-dumping process.  For file-based coredumps, this is perfectly
>> satisfactory.  However, when the core_pattern contains a pipe, the
>> substituted PID is invalid in the namespace of the core_pattern pipe
>> helper, the init namespace.
>
> Nasty problem. I wonder how many more similar problems name spaces
> have introduced.
>
> But wouldn't it be better to place the helper into the name space(s)
> of the executed process? I guess it would risk breaking some existing
> set ups, but it seem like the cleanest solution to me. If you want
> to move the core dump out of the name space you could still
> use a named pipe or something like that with someone outside listening.
>
> That would also fix the net namespace problem you mentioned

I agree.  This was what I did first, but I wasn't confident I'd done
it right.  I'll go ahead and post the two patches I used to accomplish
it.   I'm happy to spend some time polishing them until they are
functionally correct, if everyone is happy enough with the approach.

more to come,
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-17 13:26 ` Andi Kleen
@ 2010-09-17 14:52     ` Will Drewry
       [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, linux-fsdevel, Eugene Teo

On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi@firstfloor.org> wrote:
> On Thu, 16 Sep 2010 13:59:59 -0500
> Will Drewry <wad@chromium.org> wrote:
>
>> format_corename uses task_tgid_vnr to provide the numeric pid of a
>> core-dumping process.  For file-based coredumps, this is perfectly
>> satisfactory.  However, when the core_pattern contains a pipe, the
>> substituted PID is invalid in the namespace of the core_pattern pipe
>> helper, the init namespace.
>
> Nasty problem. I wonder how many more similar problems name spaces
> have introduced.
>
> But wouldn't it be better to place the helper into the name space(s)
> of the executed process? I guess it would risk breaking some existing
> set ups, but it seem like the cleanest solution to me. If you want
> to move the core dump out of the name space you could still
> use a named pipe or something like that with someone outside listening.
>
> That would also fix the net namespace problem you mentioned

I agree.  This was what I did first, but I wasn't confident I'd done
it right.  I'll go ahead and post the two patches I used to accomplish
it.   I'm happy to spend some time polishing them until they are
functionally correct, if everyone is happy enough with the approach.

more to come,
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
@ 2010-09-17 14:52     ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, linux-fsdevel, Eugene Teo

On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi@firstfloor.org> wrote:
> On Thu, 16 Sep 2010 13:59:59 -0500
> Will Drewry <wad@chromium.org> wrote:
>
>> format_corename uses task_tgid_vnr to provide the numeric pid of a
>> core-dumping process.  For file-based coredumps, this is perfectly
>> satisfactory.  However, when the core_pattern contains a pipe, the
>> substituted PID is invalid in the namespace of the core_pattern pipe
>> helper, the init namespace.
>
> Nasty problem. I wonder how many more similar problems name spaces
> have introduced.
>
> But wouldn't it be better to place the helper into the name space(s)
> of the executed process? I guess it would risk breaking some existing
> set ups, but it seem like the cleanest solution to me. If you want
> to move the core dump out of the name space you could still
> use a named pipe or something like that with someone outside listening.
>
> That would also fix the net namespace problem you mentioned

I agree.  This was what I did first, but I wasn't confident I'd done
it right.  I'll go ahead and post the two patches I used to accomplish
it.   I'm happy to spend some time polishing them until they are
functionally correct, if everyone is happy enough with the approach.

more to come,
will
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/2] nsproxy: add copy_namespaces_unattached
       [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
  2010-09-17 14:52     ` Will Drewry
@ 2010-09-17 15:16     ` Will Drewry
  2010-09-17 15:16     ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
  2 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn,
	Andrew Morton, Alexey Dobriyan, Roland McGrath,
	Eric W. Biederman

This changes adds copy_namespaces_unattached which provides similar
behavior to copy_namespaces() for clone, but is meant for use when a
new namespace needs to be derived from an existing process outside
of process creation.

The next patch in this series shows this function used in fs/exec.c to
insert the core_pattern pipe thread into the crashed processes
namespaces.

This patch is similar to the setns patches floated earlier this year,
but the goal is less lofty though not incompatible!

Any and all input, thoughts, etc will be appreciated.

Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---
 include/linux/nsproxy.h |    2 ++
 kernel/nsproxy.c        |   39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 7b370c7..4c823d2 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -63,6 +63,8 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk)
 }
 
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
+int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk,
+			       struct nsproxy **nsproxy, struct fs_struct **fs);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index f74e6c0..ddaea4d 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -22,6 +22,7 @@
 #include <linux/pid_namespace.h>
 #include <net/net_namespace.h>
 #include <linux/ipc_namespace.h>
+#include <linux/fs_struct.h>
 
 static struct kmem_cache *nsproxy_cachep;
 
@@ -161,6 +162,44 @@ out:
 	return err;
 }
 
+/**
+ * copy_namespaces_unattached: creates a new nsproxy and fs from a given task
+ * @flags:	clone flags to change namespace creation/copy behavior
+ * @tsk:	task's namespace to base the nsproxy and fs on
+ * @nsproxy:	pointer which will contain the newly created nsproxy
+ * @fs:		pointer which will contain the newly created fs_struct
+ *
+ * Returns 0 on success and non-zero on failure.
+ *
+ * This function should aid in migrating processes across namespaces when after
+ * creation.
+ */
+int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk,
+			       struct nsproxy **nsproxy, struct fs_struct **fs)
+{
+	int err = 0;
+	if (!fs || !nsproxy) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	*fs = copy_fs_struct(tsk->fs);
+	if (!*fs) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	*nsproxy = create_new_namespaces(flags, tsk, *fs);
+	if (IS_ERR(*nsproxy)) {
+		err = PTR_ERR(*nsproxy);
+		free_fs_struct(*fs);
+		goto out;
+	}
+
+out:
+	return err;
+}
+
 void free_nsproxy(struct nsproxy *ns)
 {
 	if (ns->mnt_ns)
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 1/2] nsproxy: add copy_namespaces_unattached
  2010-09-17 13:26 ` Andi Kleen
  2010-09-17 14:52     ` Will Drewry
       [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
@ 2010-09-17 15:16   ` Will Drewry
  2010-09-17 15:16   ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
  3 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel
  Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
	Roland McGrath, Neil Horman, Eric W. Biederman, containers,
	Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel, Will Drewry

This changes adds copy_namespaces_unattached which provides similar
behavior to copy_namespaces() for clone, but is meant for use when a
new namespace needs to be derived from an existing process outside
of process creation.

The next patch in this series shows this function used in fs/exec.c to
insert the core_pattern pipe thread into the crashed processes
namespaces.

This patch is similar to the setns patches floated earlier this year,
but the goal is less lofty though not incompatible!

Any and all input, thoughts, etc will be appreciated.

Signed-off-by: Will Drewry <wad@chromium.org>
---
 include/linux/nsproxy.h |    2 ++
 kernel/nsproxy.c        |   39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 7b370c7..4c823d2 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -63,6 +63,8 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk)
 }
 
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
+int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk,
+			       struct nsproxy **nsproxy, struct fs_struct **fs);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index f74e6c0..ddaea4d 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -22,6 +22,7 @@
 #include <linux/pid_namespace.h>
 #include <net/net_namespace.h>
 #include <linux/ipc_namespace.h>
+#include <linux/fs_struct.h>
 
 static struct kmem_cache *nsproxy_cachep;
 
@@ -161,6 +162,44 @@ out:
 	return err;
 }
 
+/**
+ * copy_namespaces_unattached: creates a new nsproxy and fs from a given task
+ * @flags:	clone flags to change namespace creation/copy behavior
+ * @tsk:	task's namespace to base the nsproxy and fs on
+ * @nsproxy:	pointer which will contain the newly created nsproxy
+ * @fs:		pointer which will contain the newly created fs_struct
+ *
+ * Returns 0 on success and non-zero on failure.
+ *
+ * This function should aid in migrating processes across namespaces when after
+ * creation.
+ */
+int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk,
+			       struct nsproxy **nsproxy, struct fs_struct **fs)
+{
+	int err = 0;
+	if (!fs || !nsproxy) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	*fs = copy_fs_struct(tsk->fs);
+	if (!*fs) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	*nsproxy = create_new_namespaces(flags, tsk, *fs);
+	if (IS_ERR(*nsproxy)) {
+		err = PTR_ERR(*nsproxy);
+		free_fs_struct(*fs);
+		goto out;
+	}
+
+out:
+	return err;
+}
+
 void free_nsproxy(struct nsproxy *ns)
 {
 	if (ns->mnt_ns)
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
       [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
  2010-09-17 14:52     ` Will Drewry
  2010-09-17 15:16     ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry
@ 2010-09-17 15:16     ` Will Drewry
  2 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn,
	Andrew Morton, Alexey Dobriyan, Roland McGrath,
	Eric W. Biederman

Presently, a core_pattern pipe endpoint will be run in the init
namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
core dumping process but will have no access to that processes /proc
without walking the init namespace /proc looking through all the global
pids until it finds the one it thinks matches.

One option for implementing this I've already posed:
  https://patchwork.kernel.org/patch/185912/
However, it's unclear if it is desirable for the core_pattern to run
outside the namespace.  In particular, it can easily access the mounts
via /proc/[pid_nr]/root, but if there is a net namespace, it will not
have access.  (Originally introduced in 2007 in commit
b488893a390edfe027bae7a46e9af8083e740668 )

Instead, this change implements the more complex option two.  It
migrates the ____call_usermodehelper() thread into the same namespaces
as the dumping process.  It does not assign a pid in that namespace so
the collector will appear to be pid 0.  (Using alloc_pid and change_pid
are options though. I avoided it because kernel_thread's returned pid
will then mismatch the actual threads pid.)

Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---
 fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..3b82607 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -51,6 +51,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/kmod.h>
+#include <linux/nsproxy.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
 #include <linux/pipe_fs_i.h>
@@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
 unsigned int core_pipe_limit;
 int suid_dumpable = 0;
 
+struct coredump_pipe_params {
+	struct coredump_params *params;
+	struct nsproxy *nsproxy;
+	struct fs_struct *fs;
+};
+
 /* The maximal length of core_pattern is also specified in sysctl.c */
 
 static LIST_HEAD(formats);
@@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
 {
 	struct file *rp, *wp;
 	struct fdtable *fdt;
-	struct coredump_params *cp = (struct coredump_params *)info->data;
-	struct files_struct *cf = current->files;
+	struct coredump_pipe_params *pipe_params =
+		(struct coredump_pipe_params *)info->data;
+	struct coredump_params *cp = pipe_params->params;
+	struct task_struct *tsk = current;
+	struct files_struct *cf = tsk->files;
+	struct fs_struct *cfs = tsk->fs;
+
+	/* Migrate this thread into the crashing namespaces, but
+	 * don't change its pid struct to avoid breaking any other
+	 * dependencies.  This will mean the process is pid=0 if it
+	 * was migrated into a pid namespace.
+	 */
+	if (pipe_params->nsproxy && pipe_params->fs) {
+		int kill;
+		switch_task_namespaces(tsk, pipe_params->nsproxy);
+		pipe_params->nsproxy = NULL;
+
+		task_lock(tsk);
+		tsk->fs = pipe_params->fs;
+		task_unlock(tsk);
+		pipe_params->fs = NULL;
+		/* Clean up the previous fs struct */
+		write_lock(&cfs->lock);
+		kill = !--cfs->users;
+		write_unlock(&cfs->lock);
+		if (kill)
+			free_fs_struct(cfs);
+	}
 
 	wp = create_write_pipe(0);
 	if (IS_ERR(wp))
@@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
  	if (ispipe) {
 		int dump_count;
 		char **helper_argv;
+		struct coredump_pipe_params pipe_params = { .params = &cprm };
 
 		if (cprm.limit == 1) {
 			/*
@@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
 			goto fail_dropcount;
 		}
 
+		/* Run the core_collector in the crashing namespaces */
+		if (copy_namespaces_unattached(0, current,
+			&pipe_params.nsproxy, &pipe_params.fs)) {
+			printk(KERN_WARNING "%s failed to copy namespaces\n",
+			       __func__);
+			argv_free(helper_argv);
+			goto fail_dropcount;
+		}
+
 		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
 					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
-					NULL, &cprm);
+					NULL, &pipe_params);
 		argv_free(helper_argv);
+		/* nsproxy and fs will survive if call_usermodehelper_fns hits
+		 * ENOMEM prior to creating a new thread.
+		 */
+		if (pipe_params.nsproxy)
+			put_nsproxy(pipe_params.nsproxy);
+		if (pipe_params.fs)  /* not in use by anything else */
+			free_fs_struct(pipe_params.fs);
 		if (retval) {
  			printk(KERN_INFO "Core dump to %s pipe failed\n",
 			       corename);
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-17 13:26 ` Andi Kleen
                     ` (2 preceding siblings ...)
  2010-09-17 15:16   ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry
@ 2010-09-17 15:16   ` Will Drewry
       [not found]     ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  2010-09-17 18:15     ` [PATCH 2/2] " Neil Horman
  3 siblings, 2 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel
  Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
	Roland McGrath, Neil Horman, Eric W. Biederman, containers,
	Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel, Will Drewry

Presently, a core_pattern pipe endpoint will be run in the init
namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
core dumping process but will have no access to that processes /proc
without walking the init namespace /proc looking through all the global
pids until it finds the one it thinks matches.

One option for implementing this I've already posed:
  https://patchwork.kernel.org/patch/185912/
However, it's unclear if it is desirable for the core_pattern to run
outside the namespace.  In particular, it can easily access the mounts
via /proc/[pid_nr]/root, but if there is a net namespace, it will not
have access.  (Originally introduced in 2007 in commit
b488893a390edfe027bae7a46e9af8083e740668 )

Instead, this change implements the more complex option two.  It
migrates the ____call_usermodehelper() thread into the same namespaces
as the dumping process.  It does not assign a pid in that namespace so
the collector will appear to be pid 0.  (Using alloc_pid and change_pid
are options though. I avoided it because kernel_thread's returned pid
will then mismatch the actual threads pid.)

Signed-off-by: Will Drewry <wad@chromium.org>
---
 fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..3b82607 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -51,6 +51,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/kmod.h>
+#include <linux/nsproxy.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
 #include <linux/pipe_fs_i.h>
@@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
 unsigned int core_pipe_limit;
 int suid_dumpable = 0;
 
+struct coredump_pipe_params {
+	struct coredump_params *params;
+	struct nsproxy *nsproxy;
+	struct fs_struct *fs;
+};
+
 /* The maximal length of core_pattern is also specified in sysctl.c */
 
 static LIST_HEAD(formats);
@@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
 {
 	struct file *rp, *wp;
 	struct fdtable *fdt;
-	struct coredump_params *cp = (struct coredump_params *)info->data;
-	struct files_struct *cf = current->files;
+	struct coredump_pipe_params *pipe_params =
+		(struct coredump_pipe_params *)info->data;
+	struct coredump_params *cp = pipe_params->params;
+	struct task_struct *tsk = current;
+	struct files_struct *cf = tsk->files;
+	struct fs_struct *cfs = tsk->fs;
+
+	/* Migrate this thread into the crashing namespaces, but
+	 * don't change its pid struct to avoid breaking any other
+	 * dependencies.  This will mean the process is pid=0 if it
+	 * was migrated into a pid namespace.
+	 */
+	if (pipe_params->nsproxy && pipe_params->fs) {
+		int kill;
+		switch_task_namespaces(tsk, pipe_params->nsproxy);
+		pipe_params->nsproxy = NULL;
+
+		task_lock(tsk);
+		tsk->fs = pipe_params->fs;
+		task_unlock(tsk);
+		pipe_params->fs = NULL;
+		/* Clean up the previous fs struct */
+		write_lock(&cfs->lock);
+		kill = !--cfs->users;
+		write_unlock(&cfs->lock);
+		if (kill)
+			free_fs_struct(cfs);
+	}
 
 	wp = create_write_pipe(0);
 	if (IS_ERR(wp))
@@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
  	if (ispipe) {
 		int dump_count;
 		char **helper_argv;
+		struct coredump_pipe_params pipe_params = { .params = &cprm };
 
 		if (cprm.limit == 1) {
 			/*
@@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
 			goto fail_dropcount;
 		}
 
+		/* Run the core_collector in the crashing namespaces */
+		if (copy_namespaces_unattached(0, current,
+			&pipe_params.nsproxy, &pipe_params.fs)) {
+			printk(KERN_WARNING "%s failed to copy namespaces\n",
+			       __func__);
+			argv_free(helper_argv);
+			goto fail_dropcount;
+		}
+
 		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
 					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
-					NULL, &cprm);
+					NULL, &pipe_params);
 		argv_free(helper_argv);
+		/* nsproxy and fs will survive if call_usermodehelper_fns hits
+		 * ENOMEM prior to creating a new thread.
+		 */
+		if (pipe_params.nsproxy)
+			put_nsproxy(pipe_params.nsproxy);
+		if (pipe_params.fs)  /* not in use by anything else */
+			free_fs_struct(pipe_params.fs);
 		if (retval) {
  			printk(KERN_INFO "Core dump to %s pipe failed\n",
 			       corename);
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
       [not found]     ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2010-09-17 18:15       ` Neil Horman
  2010-09-18  1:29         ` Oleg Nesterov
  1 sibling, 0 replies; 43+ messages in thread
From: Neil Horman @ 2010-09-17 18:15 UTC (permalink / raw)
  To: Will Drewry
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote:
> Presently, a core_pattern pipe endpoint will be run in the init
> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
> core dumping process but will have no access to that processes /proc
> without walking the init namespace /proc looking through all the global
> pids until it finds the one it thinks matches.
> 
> One option for implementing this I've already posed:
>   https://patchwork.kernel.org/patch/185912/
> However, it's unclear if it is desirable for the core_pattern to run
> outside the namespace.  In particular, it can easily access the mounts
> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
> have access.  (Originally introduced in 2007 in commit
> b488893a390edfe027bae7a46e9af8083e740668 )
> 
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0.  (Using alloc_pid and change_pid
> are options though. I avoided it because kernel_thread's returned pid
> will then mismatch the actual threads pid.)
> 
> Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> ---
>  fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 53 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 828dd24..3b82607 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -51,6 +51,7 @@
>  #include <linux/audit.h>
>  #include <linux/tracehook.h>
>  #include <linux/kmod.h>
> +#include <linux/nsproxy.h>
>  #include <linux/fsnotify.h>
>  #include <linux/fs_struct.h>
>  #include <linux/pipe_fs_i.h>
> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
>  unsigned int core_pipe_limit;
>  int suid_dumpable = 0;
>  
> +struct coredump_pipe_params {
> +	struct coredump_params *params;
> +	struct nsproxy *nsproxy;
> +	struct fs_struct *fs;
> +};
> +
Generally, I like this approach, but i think perhaps it would be better if you
made the coredump_pipe_params struct a member of the coredump_params structure
(perhaps within an anonymous union, so you can handle future core dump types, if
those ever come along).  That will save you needing the extra pointer to the
coredump_params structure itself.

>  /* The maximal length of core_pattern is also specified in sysctl.c */
>  
>  static LIST_HEAD(formats);
> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
>  {
>  	struct file *rp, *wp;
>  	struct fdtable *fdt;
> -	struct coredump_params *cp = (struct coredump_params *)info->data;
> -	struct files_struct *cf = current->files;
> +	struct coredump_pipe_params *pipe_params =
> +		(struct coredump_pipe_params *)info->data;
If you do the above, you can just reference this as cp->pipe_params, or
what-not.

> +	struct coredump_params *cp = pipe_params->params;
> +	struct task_struct *tsk = current;
> +	struct files_struct *cf = tsk->files;
> +	struct fs_struct *cfs = tsk->fs;
> +
> +	/* Migrate this thread into the crashing namespaces, but
> +	 * don't change its pid struct to avoid breaking any other
> +	 * dependencies.  This will mean the process is pid=0 if it
> +	 * was migrated into a pid namespace.
> +	 */
> +	if (pipe_params->nsproxy && pipe_params->fs) {
> +		int kill;
> +		switch_task_namespaces(tsk, pipe_params->nsproxy);
> +		pipe_params->nsproxy = NULL;
> +
> +		task_lock(tsk);
> +		tsk->fs = pipe_params->fs;
> +		task_unlock(tsk);
> +		pipe_params->fs = NULL;
> +		/* Clean up the previous fs struct */
> +		write_lock(&cfs->lock);
> +		kill = !--cfs->users;
> +		write_unlock(&cfs->lock);
> +		if (kill)
> +			free_fs_struct(cfs);
> +	}
>  
>  	wp = create_write_pipe(0);
>  	if (IS_ERR(wp))
> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>   	if (ispipe) {
>  		int dump_count;
>  		char **helper_argv;
> +		struct coredump_pipe_params pipe_params = { .params = &cprm };
>  
And you won't have to allocate this, it will just be part of the
coredump_params.

>  		if (cprm.limit == 1) {
>  			/*
> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>  			goto fail_dropcount;
>  		}
>  
> +		/* Run the core_collector in the crashing namespaces */
> +		if (copy_namespaces_unattached(0, current,
> +			&pipe_params.nsproxy, &pipe_params.fs)) {
> +			printk(KERN_WARNING "%s failed to copy namespaces\n",
> +			       __func__);
> +			argv_free(helper_argv);
> +			goto fail_dropcount;
> +		}
> +
>  		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
>  					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
> -					NULL, &cprm);
> +					NULL, &pipe_params);
And you can pass a pointer to coredump_params here, instead of a pointer to a
pipe_specific coredumps, to a function that isn't specific to pipes

>  		argv_free(helper_argv);
> +		/* nsproxy and fs will survive if call_usermodehelper_fns hits
> +		 * ENOMEM prior to creating a new thread.
> +		 */
> +		if (pipe_params.nsproxy)
> +			put_nsproxy(pipe_params.nsproxy);
> +		if (pipe_params.fs)  /* not in use by anything else */
> +			free_fs_struct(pipe_params.fs);
>  		if (retval) {
>   			printk(KERN_INFO "Core dump to %s pipe failed\n",
>  			       corename);
> -- 
> 1.7.0.4
> 
> 


Regards
Neil

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-17 15:16   ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
       [not found]     ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2010-09-17 18:15     ` Neil Horman
       [not found]       ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2010-09-18  2:33       ` Will Drewry
  1 sibling, 2 replies; 43+ messages in thread
From: Neil Horman @ 2010-09-17 18:15 UTC (permalink / raw)
  To: Will Drewry
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	Oleg Nesterov, KOSAKI Motohiro, Roland McGrath,
	Eric W. Biederman, containers, Eugene Teo, Tejun Heo,
	Serge Hallyn, Alexey Dobriyan, linux-fsdevel

On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote:
> Presently, a core_pattern pipe endpoint will be run in the init
> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
> core dumping process but will have no access to that processes /proc
> without walking the init namespace /proc looking through all the global
> pids until it finds the one it thinks matches.
> 
> One option for implementing this I've already posed:
>   https://patchwork.kernel.org/patch/185912/
> However, it's unclear if it is desirable for the core_pattern to run
> outside the namespace.  In particular, it can easily access the mounts
> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
> have access.  (Originally introduced in 2007 in commit
> b488893a390edfe027bae7a46e9af8083e740668 )
> 
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0.  (Using alloc_pid and change_pid
> are options though. I avoided it because kernel_thread's returned pid
> will then mismatch the actual threads pid.)
> 
> Signed-off-by: Will Drewry <wad@chromium.org>
> ---
>  fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 53 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 828dd24..3b82607 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -51,6 +51,7 @@
>  #include <linux/audit.h>
>  #include <linux/tracehook.h>
>  #include <linux/kmod.h>
> +#include <linux/nsproxy.h>
>  #include <linux/fsnotify.h>
>  #include <linux/fs_struct.h>
>  #include <linux/pipe_fs_i.h>
> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
>  unsigned int core_pipe_limit;
>  int suid_dumpable = 0;
>  
> +struct coredump_pipe_params {
> +	struct coredump_params *params;
> +	struct nsproxy *nsproxy;
> +	struct fs_struct *fs;
> +};
> +
Generally, I like this approach, but i think perhaps it would be better if you
made the coredump_pipe_params struct a member of the coredump_params structure
(perhaps within an anonymous union, so you can handle future core dump types, if
those ever come along).  That will save you needing the extra pointer to the
coredump_params structure itself.

>  /* The maximal length of core_pattern is also specified in sysctl.c */
>  
>  static LIST_HEAD(formats);
> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
>  {
>  	struct file *rp, *wp;
>  	struct fdtable *fdt;
> -	struct coredump_params *cp = (struct coredump_params *)info->data;
> -	struct files_struct *cf = current->files;
> +	struct coredump_pipe_params *pipe_params =
> +		(struct coredump_pipe_params *)info->data;
If you do the above, you can just reference this as cp->pipe_params, or
what-not.

> +	struct coredump_params *cp = pipe_params->params;
> +	struct task_struct *tsk = current;
> +	struct files_struct *cf = tsk->files;
> +	struct fs_struct *cfs = tsk->fs;
> +
> +	/* Migrate this thread into the crashing namespaces, but
> +	 * don't change its pid struct to avoid breaking any other
> +	 * dependencies.  This will mean the process is pid=0 if it
> +	 * was migrated into a pid namespace.
> +	 */
> +	if (pipe_params->nsproxy && pipe_params->fs) {
> +		int kill;
> +		switch_task_namespaces(tsk, pipe_params->nsproxy);
> +		pipe_params->nsproxy = NULL;
> +
> +		task_lock(tsk);
> +		tsk->fs = pipe_params->fs;
> +		task_unlock(tsk);
> +		pipe_params->fs = NULL;
> +		/* Clean up the previous fs struct */
> +		write_lock(&cfs->lock);
> +		kill = !--cfs->users;
> +		write_unlock(&cfs->lock);
> +		if (kill)
> +			free_fs_struct(cfs);
> +	}
>  
>  	wp = create_write_pipe(0);
>  	if (IS_ERR(wp))
> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>   	if (ispipe) {
>  		int dump_count;
>  		char **helper_argv;
> +		struct coredump_pipe_params pipe_params = { .params = &cprm };
>  
And you won't have to allocate this, it will just be part of the
coredump_params.

>  		if (cprm.limit == 1) {
>  			/*
> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>  			goto fail_dropcount;
>  		}
>  
> +		/* Run the core_collector in the crashing namespaces */
> +		if (copy_namespaces_unattached(0, current,
> +			&pipe_params.nsproxy, &pipe_params.fs)) {
> +			printk(KERN_WARNING "%s failed to copy namespaces\n",
> +			       __func__);
> +			argv_free(helper_argv);
> +			goto fail_dropcount;
> +		}
> +
>  		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
>  					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
> -					NULL, &cprm);
> +					NULL, &pipe_params);
And you can pass a pointer to coredump_params here, instead of a pointer to a
pipe_specific coredumps, to a function that isn't specific to pipes

>  		argv_free(helper_argv);
> +		/* nsproxy and fs will survive if call_usermodehelper_fns hits
> +		 * ENOMEM prior to creating a new thread.
> +		 */
> +		if (pipe_params.nsproxy)
> +			put_nsproxy(pipe_params.nsproxy);
> +		if (pipe_params.fs)  /* not in use by anything else */
> +			free_fs_struct(pipe_params.fs);
>  		if (retval) {
>   			printk(KERN_INFO "Core dump to %s pipe failed\n",
>  			       corename);
> -- 
> 1.7.0.4
> 
> 


Regards
Neil


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
       [not found]     ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
  2010-09-16 21:02       ` Will Drewry
@ 2010-09-17 19:08       ` Roland McGrath
  1 sibling, 0 replies; 43+ messages in thread
From: Roland McGrath @ 2010-09-17 19:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen,
	Alexander Viro, KOSAKI Motohiro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton

> That said I don't think this is a complete fix.  We also potentially
> have the same issue with the uts namespace and the user namespace.

True.

> I believe the core file holds all of this information relative to the
> process that is dying, one elf note or other so we don't need to worry
> about information loss.

That's correct (see linux/elfcore.h, pr_*id fields in prstatus and prpsinfo).

> I can imagine that it would be nice to have different core patterns
> depending on where you are in the process tree, so a container can
> do something different than the system outside of the container.  Are
> you thinking along those lines, or are you imagining something else?

I agree.  The format string being part of the pid_ns makes sense to me.  
(I don't have any ideas about the interface for setting it.)
Of course, then what would make most sense is for the pipe handler to
run in the innermost namespace that set its own format, rather than in
the global ns.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
  2010-09-16 20:12   ` Eric W. Biederman
  2010-09-16 21:02     ` Will Drewry
       [not found]     ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
@ 2010-09-17 19:08     ` Roland McGrath
  2 siblings, 0 replies; 43+ messages in thread
From: Roland McGrath @ 2010-09-17 19:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, Will Drewry, linux-kernel, Alexander Viro,
	Andrew Morton, KOSAKI Motohiro, Neil Horman, Andi Kleen,
	containers, linux-fsdevel

> That said I don't think this is a complete fix.  We also potentially
> have the same issue with the uts namespace and the user namespace.

True.

> I believe the core file holds all of this information relative to the
> process that is dying, one elf note or other so we don't need to worry
> about information loss.

That's correct (see linux/elfcore.h, pr_*id fields in prstatus and prpsinfo).

> I can imagine that it would be nice to have different core patterns
> depending on where you are in the process tree, so a container can
> do something different than the system outside of the container.  Are
> you thinking along those lines, or are you imagining something else?

I agree.  The format string being part of the pid_ns makes sense to me.  
(I don't have any ideas about the interface for setting it.)
Of course, then what would make most sense is for the pipe handler to
run in the innermost namespace that set its own format, rather than in
the global ns.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-17 15:16   ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
@ 2010-09-18  1:29         ` Oleg Nesterov
  2010-09-17 18:15     ` [PATCH 2/2] " Neil Horman
  1 sibling, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-18  1:29 UTC (permalink / raw)
  To: Will Drewry
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On 09/17, Will Drewry wrote:
>
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0.

Hmm... You mean, it won't visible in that namespace? Afacis, it should
have the correct pid in the init ns, no?

I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
this is OK... Say, sys_getpid() returns 0, strange.

> +		/* Run the core_collector in the crashing namespaces */
> +		if (copy_namespaces_unattached(0, current,
> +			&pipe_params.nsproxy, &pipe_params.fs)) {
> +			printk(KERN_WARNING "%s failed to copy namespaces\n",
> +			       __func__);
> +			argv_free(helper_argv);
> +			goto fail_dropcount;
> +		}

This looks overcomplicated to me, or I missed something.

I do not understand why should we do this beforehand, and why we need
copy_namespaces_unattached().

Can't you just pass current to umh_pipe_setup() (or another helper) as
the argument? Then this helper can copy ->fs and ->nsproxy itself.

In fact, I do not understand why create_new_namespaces() is used. It
is called with flags == 0 anyway, can't we just do

	ns = coredumping_task->nsproxy;
	get_nsproxy(ns);	
	switch_task_namespaces(current, ns);

?

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
@ 2010-09-18  1:29         ` Oleg Nesterov
  0 siblings, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-18  1:29 UTC (permalink / raw)
  To: Will Drewry
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On 09/17, Will Drewry wrote:
>
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0.

Hmm... You mean, it won't visible in that namespace? Afacis, it should
have the correct pid in the init ns, no?

I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
this is OK... Say, sys_getpid() returns 0, strange.

> +		/* Run the core_collector in the crashing namespaces */
> +		if (copy_namespaces_unattached(0, current,
> +			&pipe_params.nsproxy, &pipe_params.fs)) {
> +			printk(KERN_WARNING "%s failed to copy namespaces\n",
> +			       __func__);
> +			argv_free(helper_argv);
> +			goto fail_dropcount;
> +		}

This looks overcomplicated to me, or I missed something.

I do not understand why should we do this beforehand, and why we need
copy_namespaces_unattached().

Can't you just pass current to umh_pipe_setup() (or another helper) as
the argument? Then this helper can copy ->fs and ->nsproxy itself.

In fact, I do not understand why create_new_namespaces() is used. It
is called with flags == 0 anyway, can't we just do

	ns = coredumping_task->nsproxy;
	get_nsproxy(ns);	
	switch_task_namespaces(current, ns);

?

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
       [not found]       ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2010-09-18  2:33         ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  2:33 UTC (permalink / raw)
  To: Neil Horman
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On Fri, Sep 17, 2010 at 1:15 PM, Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org> wrote:
> On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote:
>> Presently, a core_pattern pipe endpoint will be run in the init
>> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
>> core dumping process but will have no access to that processes /proc
>> without walking the init namespace /proc looking through all the global
>> pids until it finds the one it thinks matches.
>>
>> One option for implementing this I've already posed:
>>   https://patchwork.kernel.org/patch/185912/
>> However, it's unclear if it is desirable for the core_pattern to run
>> outside the namespace.  In particular, it can easily access the mounts
>> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
>> have access.  (Originally introduced in 2007 in commit
>> b488893a390edfe027bae7a46e9af8083e740668 )
>>
>> Instead, this change implements the more complex option two.  It
>> migrates the ____call_usermodehelper() thread into the same namespaces
>> as the dumping process.  It does not assign a pid in that namespace so
>> the collector will appear to be pid 0.  (Using alloc_pid and change_pid
>> are options though. I avoided it because kernel_thread's returned pid
>> will then mismatch the actual threads pid.)
>>
>> Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>> ---
>>  fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>>  1 files changed, 53 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 828dd24..3b82607 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -51,6 +51,7 @@
>>  #include <linux/audit.h>
>>  #include <linux/tracehook.h>
>>  #include <linux/kmod.h>
>> +#include <linux/nsproxy.h>
>>  #include <linux/fsnotify.h>
>>  #include <linux/fs_struct.h>
>>  #include <linux/pipe_fs_i.h>
>> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
>>  unsigned int core_pipe_limit;
>>  int suid_dumpable = 0;
>>
>> +struct coredump_pipe_params {
>> +     struct coredump_params *params;
>> +     struct nsproxy *nsproxy;
>> +     struct fs_struct *fs;
>> +};
>> +
> Generally, I like this approach, but i think perhaps it would be better if you
> made the coredump_pipe_params struct a member of the coredump_params structure
> (perhaps within an anonymous union, so you can handle future core dump types, if
> those ever come along).  That will save you needing the extra pointer to the
> coredump_params structure itself.

I'll rework it as suggested while also taking into account Oleg's
feedback (another mail coming).

Thanks!

>>  /* The maximal length of core_pattern is also specified in sysctl.c */
>>
>>  static LIST_HEAD(formats);
>> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
>>  {
>>       struct file *rp, *wp;
>>       struct fdtable *fdt;
>> -     struct coredump_params *cp = (struct coredump_params *)info->data;
>> -     struct files_struct *cf = current->files;
>> +     struct coredump_pipe_params *pipe_params =
>> +             (struct coredump_pipe_params *)info->data;
> If you do the above, you can just reference this as cp->pipe_params, or
> what-not.
>
>> +     struct coredump_params *cp = pipe_params->params;
>> +     struct task_struct *tsk = current;
>> +     struct files_struct *cf = tsk->files;
>> +     struct fs_struct *cfs = tsk->fs;
>> +
>> +     /* Migrate this thread into the crashing namespaces, but
>> +      * don't change its pid struct to avoid breaking any other
>> +      * dependencies.  This will mean the process is pid=0 if it
>> +      * was migrated into a pid namespace.
>> +      */
>> +     if (pipe_params->nsproxy && pipe_params->fs) {
>> +             int kill;
>> +             switch_task_namespaces(tsk, pipe_params->nsproxy);
>> +             pipe_params->nsproxy = NULL;
>> +
>> +             task_lock(tsk);
>> +             tsk->fs = pipe_params->fs;
>> +             task_unlock(tsk);
>> +             pipe_params->fs = NULL;
>> +             /* Clean up the previous fs struct */
>> +             write_lock(&cfs->lock);
>> +             kill = !--cfs->users;
>> +             write_unlock(&cfs->lock);
>> +             if (kill)
>> +                     free_fs_struct(cfs);
>> +     }
>>
>>       wp = create_write_pipe(0);
>>       if (IS_ERR(wp))
>> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>>       if (ispipe) {
>>               int dump_count;
>>               char **helper_argv;
>> +             struct coredump_pipe_params pipe_params = { .params = &cprm };
>>
> And you won't have to allocate this, it will just be part of the
> coredump_params.
>
>>               if (cprm.limit == 1) {
>>                       /*
>> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>>                       goto fail_dropcount;
>>               }
>>
>> +             /* Run the core_collector in the crashing namespaces */
>> +             if (copy_namespaces_unattached(0, current,
>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>> +                            __func__);
>> +                     argv_free(helper_argv);
>> +                     goto fail_dropcount;
>> +             }
>> +
>>               retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
>>                                       NULL, UMH_WAIT_EXEC, umh_pipe_setup,
>> -                                     NULL, &cprm);
>> +                                     NULL, &pipe_params);
> And you can pass a pointer to coredump_params here, instead of a pointer to a
> pipe_specific coredumps, to a function that isn't specific to pipes
>
>>               argv_free(helper_argv);
>> +             /* nsproxy and fs will survive if call_usermodehelper_fns hits
>> +              * ENOMEM prior to creating a new thread.
>> +              */
>> +             if (pipe_params.nsproxy)
>> +                     put_nsproxy(pipe_params.nsproxy);
>> +             if (pipe_params.fs)  /* not in use by anything else */
>> +                     free_fs_struct(pipe_params.fs);
>>               if (retval) {
>>                       printk(KERN_INFO "Core dump to %s pipe failed\n",
>>                              corename);
>> --
>> 1.7.0.4
>>
>>
>
>
> Regards
> Neil
>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-17 18:15     ` [PATCH 2/2] " Neil Horman
       [not found]       ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2010-09-18  2:33       ` Will Drewry
  1 sibling, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  2:33 UTC (permalink / raw)
  To: Neil Horman
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	Oleg Nesterov, KOSAKI Motohiro, Roland McGrath,
	Eric W. Biederman, containers, Eugene Teo, Tejun Heo,
	Serge Hallyn, Alexey Dobriyan, linux-fsdevel

On Fri, Sep 17, 2010 at 1:15 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote:
>> Presently, a core_pattern pipe endpoint will be run in the init
>> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
>> core dumping process but will have no access to that processes /proc
>> without walking the init namespace /proc looking through all the global
>> pids until it finds the one it thinks matches.
>>
>> One option for implementing this I've already posed:
>>   https://patchwork.kernel.org/patch/185912/
>> However, it's unclear if it is desirable for the core_pattern to run
>> outside the namespace.  In particular, it can easily access the mounts
>> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
>> have access.  (Originally introduced in 2007 in commit
>> b488893a390edfe027bae7a46e9af8083e740668 )
>>
>> Instead, this change implements the more complex option two.  It
>> migrates the ____call_usermodehelper() thread into the same namespaces
>> as the dumping process.  It does not assign a pid in that namespace so
>> the collector will appear to be pid 0.  (Using alloc_pid and change_pid
>> are options though. I avoided it because kernel_thread's returned pid
>> will then mismatch the actual threads pid.)
>>
>> Signed-off-by: Will Drewry <wad@chromium.org>
>> ---
>>  fs/exec.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>>  1 files changed, 53 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 828dd24..3b82607 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -51,6 +51,7 @@
>>  #include <linux/audit.h>
>>  #include <linux/tracehook.h>
>>  #include <linux/kmod.h>
>> +#include <linux/nsproxy.h>
>>  #include <linux/fsnotify.h>
>>  #include <linux/fs_struct.h>
>>  #include <linux/pipe_fs_i.h>
>> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core";
>>  unsigned int core_pipe_limit;
>>  int suid_dumpable = 0;
>>
>> +struct coredump_pipe_params {
>> +     struct coredump_params *params;
>> +     struct nsproxy *nsproxy;
>> +     struct fs_struct *fs;
>> +};
>> +
> Generally, I like this approach, but i think perhaps it would be better if you
> made the coredump_pipe_params struct a member of the coredump_params structure
> (perhaps within an anonymous union, so you can handle future core dump types, if
> those ever come along).  That will save you needing the extra pointer to the
> coredump_params structure itself.

I'll rework it as suggested while also taking into account Oleg's
feedback (another mail coming).

Thanks!

>>  /* The maximal length of core_pattern is also specified in sysctl.c */
>>
>>  static LIST_HEAD(formats);
>> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info)
>>  {
>>       struct file *rp, *wp;
>>       struct fdtable *fdt;
>> -     struct coredump_params *cp = (struct coredump_params *)info->data;
>> -     struct files_struct *cf = current->files;
>> +     struct coredump_pipe_params *pipe_params =
>> +             (struct coredump_pipe_params *)info->data;
> If you do the above, you can just reference this as cp->pipe_params, or
> what-not.
>
>> +     struct coredump_params *cp = pipe_params->params;
>> +     struct task_struct *tsk = current;
>> +     struct files_struct *cf = tsk->files;
>> +     struct fs_struct *cfs = tsk->fs;
>> +
>> +     /* Migrate this thread into the crashing namespaces, but
>> +      * don't change its pid struct to avoid breaking any other
>> +      * dependencies.  This will mean the process is pid=0 if it
>> +      * was migrated into a pid namespace.
>> +      */
>> +     if (pipe_params->nsproxy && pipe_params->fs) {
>> +             int kill;
>> +             switch_task_namespaces(tsk, pipe_params->nsproxy);
>> +             pipe_params->nsproxy = NULL;
>> +
>> +             task_lock(tsk);
>> +             tsk->fs = pipe_params->fs;
>> +             task_unlock(tsk);
>> +             pipe_params->fs = NULL;
>> +             /* Clean up the previous fs struct */
>> +             write_lock(&cfs->lock);
>> +             kill = !--cfs->users;
>> +             write_unlock(&cfs->lock);
>> +             if (kill)
>> +                     free_fs_struct(cfs);
>> +     }
>>
>>       wp = create_write_pipe(0);
>>       if (IS_ERR(wp))
>> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>>       if (ispipe) {
>>               int dump_count;
>>               char **helper_argv;
>> +             struct coredump_pipe_params pipe_params = { .params = &cprm };
>>
> And you won't have to allocate this, it will just be part of the
> coredump_params.
>
>>               if (cprm.limit == 1) {
>>                       /*
>> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
>>                       goto fail_dropcount;
>>               }
>>
>> +             /* Run the core_collector in the crashing namespaces */
>> +             if (copy_namespaces_unattached(0, current,
>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>> +                            __func__);
>> +                     argv_free(helper_argv);
>> +                     goto fail_dropcount;
>> +             }
>> +
>>               retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
>>                                       NULL, UMH_WAIT_EXEC, umh_pipe_setup,
>> -                                     NULL, &cprm);
>> +                                     NULL, &pipe_params);
> And you can pass a pointer to coredump_params here, instead of a pointer to a
> pipe_specific coredumps, to a function that isn't specific to pipes
>
>>               argv_free(helper_argv);
>> +             /* nsproxy and fs will survive if call_usermodehelper_fns hits
>> +              * ENOMEM prior to creating a new thread.
>> +              */
>> +             if (pipe_params.nsproxy)
>> +                     put_nsproxy(pipe_params.nsproxy);
>> +             if (pipe_params.fs)  /* not in use by anything else */
>> +                     free_fs_struct(pipe_params.fs);
>>               if (retval) {
>>                       printk(KERN_INFO "Core dump to %s pipe failed\n",
>>                              corename);
>> --
>> 1.7.0.4
>>
>>
>
>
> Regards
> Neil
>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  1:29         ` Oleg Nesterov
@ 2010-09-18  2:34             ` Will Drewry
  -1 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  2:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 09/17, Will Drewry wrote:
>>
>> Instead, this change implements the more complex option two.  It
>> migrates the ____call_usermodehelper() thread into the same namespaces
>> as the dumping process.  It does not assign a pid in that namespace so
>> the collector will appear to be pid 0.
>
> Hmm... You mean, it won't visible in that namespace? Afacis, it should
> have the correct pid in the init ns, no?

Exactly - it will be unmapped in its new pid namespace, returning 0
only in that case, as you point out below!

>
> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
> this is OK... Say, sys_getpid() returns 0, strange.
>
>> +             /* Run the core_collector in the crashing namespaces */
>> +             if (copy_namespaces_unattached(0, current,
>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>> +                            __func__);
>> +                     argv_free(helper_argv);
>> +                     goto fail_dropcount;
>> +             }
>
> This looks overcomplicated to me, or I missed something.
>
> I do not understand why should we do this beforehand, and why we need
> copy_namespaces_unattached().
>
> Can't you just pass current to umh_pipe_setup() (or another helper) as
> the argument? Then this helper can copy ->fs and ->nsproxy itself.

I wasn't sure if it was reasonable to pass the current task_struct
over, but I certainly can.

> In fact, I do not understand why create_new_namespaces() is used. It
> is called with flags == 0 anyway, can't we just do
>
>        ns = coredumping_task->nsproxy;
>        get_nsproxy(ns);
>        switch_task_namespaces(current, ns);
>
> ?

So that was my first thought (which I tried).  I did exactly what you
suggested to the khelper thread, and the lack of the fs struct bit me.
 Since the older patch from Eric Biederman (setns()) had taken the
route of deflecting the work through create_new_namespaces(), I did
too.  I figured it would ensure that any namespacing behavior that
needed to be done would be done.

In practice, this seems to amount to just adding a refcount to all the
namespaces and creating a new nsproxy which isn't really needed.  Most
likely, doing what you've suggested above plus the copy_fs_struct and
the swap out will do the trick.  I'll try it out and see.  That's make
it much clearer I think.

thanks!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
@ 2010-09-18  2:34             ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  2:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 09/17, Will Drewry wrote:
>>
>> Instead, this change implements the more complex option two.  It
>> migrates the ____call_usermodehelper() thread into the same namespaces
>> as the dumping process.  It does not assign a pid in that namespace so
>> the collector will appear to be pid 0.
>
> Hmm... You mean, it won't visible in that namespace? Afacis, it should
> have the correct pid in the init ns, no?

Exactly - it will be unmapped in its new pid namespace, returning 0
only in that case, as you point out below!

>
> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
> this is OK... Say, sys_getpid() returns 0, strange.
>
>> +             /* Run the core_collector in the crashing namespaces */
>> +             if (copy_namespaces_unattached(0, current,
>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>> +                            __func__);
>> +                     argv_free(helper_argv);
>> +                     goto fail_dropcount;
>> +             }
>
> This looks overcomplicated to me, or I missed something.
>
> I do not understand why should we do this beforehand, and why we need
> copy_namespaces_unattached().
>
> Can't you just pass current to umh_pipe_setup() (or another helper) as
> the argument? Then this helper can copy ->fs and ->nsproxy itself.

I wasn't sure if it was reasonable to pass the current task_struct
over, but I certainly can.

> In fact, I do not understand why create_new_namespaces() is used. It
> is called with flags == 0 anyway, can't we just do
>
>        ns = coredumping_task->nsproxy;
>        get_nsproxy(ns);
>        switch_task_namespaces(current, ns);
>
> ?

So that was my first thought (which I tried).  I did exactly what you
suggested to the khelper thread, and the lack of the fs struct bit me.
 Since the older patch from Eric Biederman (setns()) had taken the
route of deflecting the work through create_new_namespaces(), I did
too.  I figured it would ensure that any namespacing behavior that
needed to be done would be done.

In practice, this seems to amount to just adding a refcount to all the
namespaces and creating a new nsproxy which isn't really needed.  Most
likely, doing what you've suggested above plus the copy_fs_struct and
the swap out will do the trick.  I'll try it out and see.  That's make
it much clearer I think.

thanks!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
       [not found]         ` <20100918012939.GA25046-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2010-09-18  2:34             ` Will Drewry
@ 2010-09-18  3:13           ` Will Drewry
  1 sibling, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  3:13 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn,
	Andrew Morton, Alexey Dobriyan, Roland McGrath,
	Eric W. Biederman

Presently, a core_pattern pipe endpoint will be run in the init
namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
core dumping process but will have no access to that processes /proc
without walking the init namespace /proc looking through all the global
pids until it finds the one it thinks matches.

One option for fixing this is to change the reported pid:
  https://patchwork.kernel.org/patch/185912/
However, it's unclear if it is desirable for the core_pattern to run
outside the namespace.  In particular, it can easily access the mounts
via /proc/[pid_nr]/root, but if there is a net namespace, it will not
have access.  (Originally introduced in 2007 in commit
b488893a390edfe027bae7a46e9af8083e740668 )

Instead, this change implements the more complex option two.  It
migrates the ____call_usermodehelper() thread into the same namespaces
as the dumping process.  It does not assign a pid in that namespace so
the collector will appear to be pid 0 in the namespace.

Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

---
v2: dropped patch 1/2
    removed use of new copy_namespaces_unattached (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org)
    moved pipe_params into coredump_params (nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org)

 fs/exec.c               |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/binfmts.h |    9 +++++++++
 2 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..4cbb735 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -51,6 +51,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/kmod.h>
+#include <linux/nsproxy.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
 #include <linux/pipe_fs_i.h>
@@ -1820,7 +1821,32 @@ static int umh_pipe_setup(struct subprocess_info *info)
 	struct file *rp, *wp;
 	struct fdtable *fdt;
 	struct coredump_params *cp = (struct coredump_params *)info->data;
-	struct files_struct *cf = current->files;
+	struct coredump_pipe_params *pipe_params = &cp->pipe_params;
+	struct task_struct *tsk = current;
+	struct files_struct *cf = tsk->files;
+	struct fs_struct *cfs = tsk->fs;
+
+	/* Migrate this thread into the crashing namespaces, but
+	 * don't change its pid struct to avoid breaking any other
+	 * dependencies.  This will mean the process is pid=0 if it
+	 * was migrated into a pid namespace.
+	 */
+	if (pipe_params->nsproxy && pipe_params->fs) {
+		int kill;
+		switch_task_namespaces(tsk, pipe_params->nsproxy);
+		pipe_params->nsproxy = NULL;
+
+		task_lock(tsk);
+		tsk->fs = pipe_params->fs;
+		task_unlock(tsk);
+		pipe_params->fs = NULL;
+		/* Clean up the previous fs struct */
+		write_lock(&cfs->lock);
+		kill = !--cfs->users;
+		write_unlock(&cfs->lock);
+		if (kill)
+			free_fs_struct(cfs);
+	}
 
 	wp = create_write_pipe(0);
 	if (IS_ERR(wp))
@@ -1950,10 +1976,28 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
 			goto fail_dropcount;
 		}
 
+		/* Run the core_collector in the crashing namespaces */
+		cprm.pipe_params.fs = copy_fs_struct(current->fs);
+		if (!cprm.pipe_params.fs) {
+			printk(KERN_WARNING "%s failed to copy fs\n",
+				__func__);
+			argv_free(helper_argv);
+			goto fail_dropcount;
+		}
+		get_nsproxy(current->nsproxy);
+		cprm.pipe_params.nsproxy = current->nsproxy;
+
 		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
 					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
 					NULL, &cprm);
 		argv_free(helper_argv);
+		/* nsproxy and fs will survive if call_usermodehelper_fns hits
+		 * ENOMEM prior to creating a new thread.
+		 */
+		if (cprm.pipe_params.nsproxy)
+			put_nsproxy(cprm.pipe_params.nsproxy);
+		if (cprm.pipe_params.fs)  /* not in use by anything else */
+			free_fs_struct(cprm.pipe_params.fs);
 		if (retval) {
  			printk(KERN_INFO "Core dump to %s pipe failed\n",
 			       corename);
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index a065612..2629603 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -68,6 +68,12 @@ struct linux_binprm{
 
 #define BINPRM_MAX_RECURSION 4
 
+/* Parameters to be passed to a pipe target during coredump */
+struct coredump_pipe_params {
+	struct nsproxy *nsproxy;
+	struct fs_struct *fs;
+};
+
 /* Function parameter for binfmt->coredump */
 struct coredump_params {
 	long signr;
@@ -75,6 +81,9 @@ struct coredump_params {
 	struct file *file;
 	unsigned long limit;
 	unsigned long mm_flags;
+	union {
+		struct coredump_pipe_params pipe_params;
+	};
 };
 
 /*
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  1:29         ` Oleg Nesterov
  (?)
  (?)
@ 2010-09-18  3:13         ` Will Drewry
  2010-09-20 18:34           ` Eric W. Biederman
       [not found]           ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  -1 siblings, 2 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  3:13 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel
  Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
	Roland McGrath, Neil Horman, Eric W. Biederman, containers,
	Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel, Will Drewry

Presently, a core_pattern pipe endpoint will be run in the init
namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
core dumping process but will have no access to that processes /proc
without walking the init namespace /proc looking through all the global
pids until it finds the one it thinks matches.

One option for fixing this is to change the reported pid:
  https://patchwork.kernel.org/patch/185912/
However, it's unclear if it is desirable for the core_pattern to run
outside the namespace.  In particular, it can easily access the mounts
via /proc/[pid_nr]/root, but if there is a net namespace, it will not
have access.  (Originally introduced in 2007 in commit
b488893a390edfe027bae7a46e9af8083e740668 )

Instead, this change implements the more complex option two.  It
migrates the ____call_usermodehelper() thread into the same namespaces
as the dumping process.  It does not assign a pid in that namespace so
the collector will appear to be pid 0 in the namespace.

Signed-off-by: Will Drewry <wad@chromium.org>

---
v2: dropped patch 1/2
    removed use of new copy_namespaces_unattached (oleg@redhat.com)
    moved pipe_params into coredump_params (nhorman@tuxdriver.com)

 fs/exec.c               |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/binfmts.h |    9 +++++++++
 2 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..4cbb735 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -51,6 +51,7 @@
 #include <linux/audit.h>
 #include <linux/tracehook.h>
 #include <linux/kmod.h>
+#include <linux/nsproxy.h>
 #include <linux/fsnotify.h>
 #include <linux/fs_struct.h>
 #include <linux/pipe_fs_i.h>
@@ -1820,7 +1821,32 @@ static int umh_pipe_setup(struct subprocess_info *info)
 	struct file *rp, *wp;
 	struct fdtable *fdt;
 	struct coredump_params *cp = (struct coredump_params *)info->data;
-	struct files_struct *cf = current->files;
+	struct coredump_pipe_params *pipe_params = &cp->pipe_params;
+	struct task_struct *tsk = current;
+	struct files_struct *cf = tsk->files;
+	struct fs_struct *cfs = tsk->fs;
+
+	/* Migrate this thread into the crashing namespaces, but
+	 * don't change its pid struct to avoid breaking any other
+	 * dependencies.  This will mean the process is pid=0 if it
+	 * was migrated into a pid namespace.
+	 */
+	if (pipe_params->nsproxy && pipe_params->fs) {
+		int kill;
+		switch_task_namespaces(tsk, pipe_params->nsproxy);
+		pipe_params->nsproxy = NULL;
+
+		task_lock(tsk);
+		tsk->fs = pipe_params->fs;
+		task_unlock(tsk);
+		pipe_params->fs = NULL;
+		/* Clean up the previous fs struct */
+		write_lock(&cfs->lock);
+		kill = !--cfs->users;
+		write_unlock(&cfs->lock);
+		if (kill)
+			free_fs_struct(cfs);
+	}
 
 	wp = create_write_pipe(0);
 	if (IS_ERR(wp))
@@ -1950,10 +1976,28 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs)
 			goto fail_dropcount;
 		}
 
+		/* Run the core_collector in the crashing namespaces */
+		cprm.pipe_params.fs = copy_fs_struct(current->fs);
+		if (!cprm.pipe_params.fs) {
+			printk(KERN_WARNING "%s failed to copy fs\n",
+				__func__);
+			argv_free(helper_argv);
+			goto fail_dropcount;
+		}
+		get_nsproxy(current->nsproxy);
+		cprm.pipe_params.nsproxy = current->nsproxy;
+
 		retval = call_usermodehelper_fns(helper_argv[0], helper_argv,
 					NULL, UMH_WAIT_EXEC, umh_pipe_setup,
 					NULL, &cprm);
 		argv_free(helper_argv);
+		/* nsproxy and fs will survive if call_usermodehelper_fns hits
+		 * ENOMEM prior to creating a new thread.
+		 */
+		if (cprm.pipe_params.nsproxy)
+			put_nsproxy(cprm.pipe_params.nsproxy);
+		if (cprm.pipe_params.fs)  /* not in use by anything else */
+			free_fs_struct(cprm.pipe_params.fs);
 		if (retval) {
  			printk(KERN_INFO "Core dump to %s pipe failed\n",
 			       corename);
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index a065612..2629603 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -68,6 +68,12 @@ struct linux_binprm{
 
 #define BINPRM_MAX_RECURSION 4
 
+/* Parameters to be passed to a pipe target during coredump */
+struct coredump_pipe_params {
+	struct nsproxy *nsproxy;
+	struct fs_struct *fs;
+};
+
 /* Function parameter for binfmt->coredump */
 struct coredump_params {
 	long signr;
@@ -75,6 +81,9 @@ struct coredump_params {
 	struct file *file;
 	unsigned long limit;
 	unsigned long mm_flags;
+	union {
+		struct coredump_pipe_params pipe_params;
+	};
 };
 
 /*
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  2:34             ` Will Drewry
  (?)
@ 2010-09-18  3:14             ` Will Drewry
  -1 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  3:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On Fri, Sep 17, 2010 at 9:34 PM, Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 09/17, Will Drewry wrote:
>>>
>>> Instead, this change implements the more complex option two.  It
>>> migrates the ____call_usermodehelper() thread into the same namespaces
>>> as the dumping process.  It does not assign a pid in that namespace so
>>> the collector will appear to be pid 0.
>>
>> Hmm... You mean, it won't visible in that namespace? Afacis, it should
>> have the correct pid in the init ns, no?
>
> Exactly - it will be unmapped in its new pid namespace, returning 0
> only in that case, as you point out below!
>
>>
>> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
>> this is OK... Say, sys_getpid() returns 0, strange.
>>
>>> +             /* Run the core_collector in the crashing namespaces */
>>> +             if (copy_namespaces_unattached(0, current,
>>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>>> +                            __func__);
>>> +                     argv_free(helper_argv);
>>> +                     goto fail_dropcount;
>>> +             }
>>
>> This looks overcomplicated to me, or I missed something.
>>
>> I do not understand why should we do this beforehand, and why we need
>> copy_namespaces_unattached().
>>
>> Can't you just pass current to umh_pipe_setup() (or another helper) as
>> the argument? Then this helper can copy ->fs and ->nsproxy itself.
>
> I wasn't sure if it was reasonable to pass the current task_struct
> over, but I certainly can.
>
>> In fact, I do not understand why create_new_namespaces() is used. It
>> is called with flags == 0 anyway, can't we just do
>>
>>        ns = coredumping_task->nsproxy;
>>        get_nsproxy(ns);
>>        switch_task_namespaces(current, ns);
>>
>> ?
>
> So that was my first thought (which I tried).  I did exactly what you
> suggested to the khelper thread, and the lack of the fs struct bit me.
>  Since the older patch from Eric Biederman (setns()) had taken the
> route of deflecting the work through create_new_namespaces(), I did
> too.  I figured it would ensure that any namespacing behavior that
> needed to be done would be done.
>
> In practice, this seems to amount to just adding a refcount to all the
> namespaces and creating a new nsproxy which isn't really needed.  Most
> likely, doing what you've suggested above plus the copy_fs_struct and
> the swap out will do the trick.  I'll try it out and see.  That's make
> it much clearer I think.

That seems to work -- I'll post an update shortly with your comments
and Neil's integrated - hopefully accurately.  Any other thoughts on
style/cleanup/etc will certainly be appreciated.

cheers!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  2:34             ` Will Drewry
  (?)
  (?)
@ 2010-09-18  3:14             ` Will Drewry
  -1 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-18  3:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Fri, Sep 17, 2010 at 9:34 PM, Will Drewry <wad@chromium.org> wrote:
> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 09/17, Will Drewry wrote:
>>>
>>> Instead, this change implements the more complex option two.  It
>>> migrates the ____call_usermodehelper() thread into the same namespaces
>>> as the dumping process.  It does not assign a pid in that namespace so
>>> the collector will appear to be pid 0.
>>
>> Hmm... You mean, it won't visible in that namespace? Afacis, it should
>> have the correct pid in the init ns, no?
>
> Exactly - it will be unmapped in its new pid namespace, returning 0
> only in that case, as you point out below!
>
>>
>> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps
>> this is OK... Say, sys_getpid() returns 0, strange.
>>
>>> +             /* Run the core_collector in the crashing namespaces */
>>> +             if (copy_namespaces_unattached(0, current,
>>> +                     &pipe_params.nsproxy, &pipe_params.fs)) {
>>> +                     printk(KERN_WARNING "%s failed to copy namespaces\n",
>>> +                            __func__);
>>> +                     argv_free(helper_argv);
>>> +                     goto fail_dropcount;
>>> +             }
>>
>> This looks overcomplicated to me, or I missed something.
>>
>> I do not understand why should we do this beforehand, and why we need
>> copy_namespaces_unattached().
>>
>> Can't you just pass current to umh_pipe_setup() (or another helper) as
>> the argument? Then this helper can copy ->fs and ->nsproxy itself.
>
> I wasn't sure if it was reasonable to pass the current task_struct
> over, but I certainly can.
>
>> In fact, I do not understand why create_new_namespaces() is used. It
>> is called with flags == 0 anyway, can't we just do
>>
>>        ns = coredumping_task->nsproxy;
>>        get_nsproxy(ns);
>>        switch_task_namespaces(current, ns);
>>
>> ?
>
> So that was my first thought (which I tried).  I did exactly what you
> suggested to the khelper thread, and the lack of the fs struct bit me.
>  Since the older patch from Eric Biederman (setns()) had taken the
> route of deflecting the work through create_new_namespaces(), I did
> too.  I figured it would ensure that any namespacing behavior that
> needed to be done would be done.
>
> In practice, this seems to amount to just adding a refcount to all the
> namespaces and creating a new nsproxy which isn't really needed.  Most
> likely, doing what you've suggested above plus the copy_fs_struct and
> the swap out will do the trick.  I'll try it out and see.  That's make
> it much clearer I think.

That seems to work -- I'll post an update shortly with your comments
and Neil's integrated - hopefully accurately.  Any other thoughts on
style/cleanup/etc will certainly be appreciated.

cheers!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
       [not found]           ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2010-09-20 18:34             ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2010-09-20 18:34 UTC (permalink / raw)
  To: Will Drewry
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath

Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> writes:

> Presently, a core_pattern pipe endpoint will be run in the init
> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
> core dumping process but will have no access to that processes /proc
> without walking the init namespace /proc looking through all the global
> pids until it finds the one it thinks matches.
>
> One option for fixing this is to change the reported pid:
>   https://patchwork.kernel.org/patch/185912/
> However, it's unclear if it is desirable for the core_pattern to run
> outside the namespace.  In particular, it can easily access the mounts
> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
> have access.  (Originally introduced in 2007 in commit
> b488893a390edfe027bae7a46e9af8083e740668 )
>
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0 in the namespace.

I like the direction this is going but as structured this looks like a
security exploit waiting to happen.

The pipe process needs to run in the namespaces of the process who set
the core pattern, not in the namespaces of the dumping process.
Otherwise it is possible to trigger a privileged process to run in a
context where it's reality that it expected, causing it to misuse
it's privileges.  Even if we don't have a privilege problem I think
we will have a case of mismatched functionality where the core pattern
will not work as expected.

I believe the solution here is to move the core pattern into the pid
namespace and to capture the namespaces when set, but if the core
pattern hasn't been set to use the core pattern of the parent pid
namespace.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  3:13         ` Will Drewry
@ 2010-09-20 18:34           ` Eric W. Biederman
  2010-09-20 19:12             ` Andi Kleen
       [not found]             ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
       [not found]           ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  1 sibling, 2 replies; 43+ messages in thread
From: Eric W. Biederman @ 2010-09-20 18:34 UTC (permalink / raw)
  To: Will Drewry
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

Will Drewry <wad@chromium.org> writes:

> Presently, a core_pattern pipe endpoint will be run in the init
> namespace.  It will receive the virtual pid (task_tgid_vnr->%p) of the
> core dumping process but will have no access to that processes /proc
> without walking the init namespace /proc looking through all the global
> pids until it finds the one it thinks matches.
>
> One option for fixing this is to change the reported pid:
>   https://patchwork.kernel.org/patch/185912/
> However, it's unclear if it is desirable for the core_pattern to run
> outside the namespace.  In particular, it can easily access the mounts
> via /proc/[pid_nr]/root, but if there is a net namespace, it will not
> have access.  (Originally introduced in 2007 in commit
> b488893a390edfe027bae7a46e9af8083e740668 )
>
> Instead, this change implements the more complex option two.  It
> migrates the ____call_usermodehelper() thread into the same namespaces
> as the dumping process.  It does not assign a pid in that namespace so
> the collector will appear to be pid 0 in the namespace.

I like the direction this is going but as structured this looks like a
security exploit waiting to happen.

The pipe process needs to run in the namespaces of the process who set
the core pattern, not in the namespaces of the dumping process.
Otherwise it is possible to trigger a privileged process to run in a
context where it's reality that it expected, causing it to misuse
it's privileges.  Even if we don't have a privilege problem I think
we will have a case of mismatched functionality where the core pattern
will not work as expected.

I believe the solution here is to move the core pattern into the pid
namespace and to capture the namespaces when set, but if the core
pattern hasn't been set to use the core pattern of the parent pid
namespace.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  2:34             ` Will Drewry
                               ` (2 preceding siblings ...)
  (?)
@ 2010-09-20 18:50             ` Oleg Nesterov
  -1 siblings, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-20 18:50 UTC (permalink / raw)
  To: Will Drewry
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On 09/17, Will Drewry wrote:
>
> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >
> > This looks overcomplicated to me, or I missed something.
> >
> > I do not understand why should we do this beforehand, and why we need
> > copy_namespaces_unattached().
> >
> > Can't you just pass current to umh_pipe_setup() (or another helper) as
> > the argument? Then this helper can copy ->fs and ->nsproxy itself.
>
> I wasn't sure if it was reasonable to pass the current task_struct
> over, but I certainly can.

Why not? current calls call_usermodehelper_exec(), it can't go away
until subprocess_info->init() returns, it sleeps on wait_for_completion().

> In practice, this seems to amount to just adding a refcount to all the
> namespaces and creating a new nsproxy which isn't really needed.  Most
> likely, doing what you've suggested above plus the copy_fs_struct and
> the swap out will do the trick.  I'll try it out and see.  That's make
> it much clearer I think.

Yes, just get_nsproxy() (like fork() does) should be fine in this case.

As for copying ->fs, I am not sure actually. core_pattern is global,
say it is "|/coredumper". If you change ->root, then exec can fail
because that binary is not visible to the coredumping process?

Probably we should move core_pattern into ->pid_ns, I dunno.

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-18  2:34             ` Will Drewry
                               ` (3 preceding siblings ...)
  (?)
@ 2010-09-20 18:50             ` Oleg Nesterov
       [not found]               ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2010-09-20 20:28                 ` Will Drewry
  -1 siblings, 2 replies; 43+ messages in thread
From: Oleg Nesterov @ 2010-09-20 18:50 UTC (permalink / raw)
  To: Will Drewry
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On 09/17, Will Drewry wrote:
>
> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > This looks overcomplicated to me, or I missed something.
> >
> > I do not understand why should we do this beforehand, and why we need
> > copy_namespaces_unattached().
> >
> > Can't you just pass current to umh_pipe_setup() (or another helper) as
> > the argument? Then this helper can copy ->fs and ->nsproxy itself.
>
> I wasn't sure if it was reasonable to pass the current task_struct
> over, but I certainly can.

Why not? current calls call_usermodehelper_exec(), it can't go away
until subprocess_info->init() returns, it sleeps on wait_for_completion().

> In practice, this seems to amount to just adding a refcount to all the
> namespaces and creating a new nsproxy which isn't really needed.  Most
> likely, doing what you've suggested above plus the copy_fs_struct and
> the swap out will do the trick.  I'll try it out and see.  That's make
> it much clearer I think.

Yes, just get_nsproxy() (like fork() does) should be fine in this case.

As for copying ->fs, I am not sure actually. core_pattern is global,
say it is "|/coredumper". If you change ->root, then exec can fail
because that binary is not visible to the coredumping process?

Probably we should move core_pattern into ->pid_ns, I dunno.

Oleg.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
       [not found]             ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
@ 2010-09-20 19:12               ` Andi Kleen
  0 siblings, 0 replies; 43+ messages in thread
From: Andi Kleen @ 2010-09-20 19:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Oleg Nesterov,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath

> The pipe process needs to run in the namespaces of the process who set
> the core pattern, not in the namespaces of the dumping process.
> Otherwise it is possible to trigger a privileged process to run in a
> context where it's reality that it expected, causing it to misuse
> it's privileges.  Even if we don't have a privilege problem I think
> we will have a case of mismatched functionality where the core pattern
> will not work as expected.

For me it seems rather the other way around: running the helper in some
highly priviledged namespace is more dangerous. If it runs in the 
same context as the crasher it can do the least amount of damage
relative to the crash process.

And as Will pointed out it's the only sane way to deal with net namespaces.

-Andi

-- 
ak-VuQAYsv1563Yd54FQh9/CA@public.gmane.org -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
  2010-09-20 18:34           ` Eric W. Biederman
@ 2010-09-20 19:12             ` Andi Kleen
  2010-09-20 20:26                 ` Will Drewry
       [not found]               ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
       [not found]             ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
  1 sibling, 2 replies; 43+ messages in thread
From: Andi Kleen @ 2010-09-20 19:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Will Drewry, Andi Kleen, linux-kernel, Alexander Viro,
	Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath,
	Neil Horman, containers, Eugene Teo, Tejun Heo, Serge Hallyn,
	Alexey Dobriyan, linux-fsdevel

> The pipe process needs to run in the namespaces of the process who set
> the core pattern, not in the namespaces of the dumping process.
> Otherwise it is possible to trigger a privileged process to run in a
> context where it's reality that it expected, causing it to misuse
> it's privileges.  Even if we don't have a privilege problem I think
> we will have a case of mismatched functionality where the core pattern
> will not work as expected.

For me it seems rather the other way around: running the helper in some
highly priviledged namespace is more dangerous. If it runs in the 
same context as the crasher it can do the least amount of damage
relative to the crash process.

And as Will pointed out it's the only sane way to deal with net namespaces.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
       [not found]               ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
@ 2010-09-20 20:26                 ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Alexander Viro

On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote:
>> The pipe process needs to run in the namespaces of the process who set
>> the core pattern, not in the namespaces of the dumping process.
>> Otherwise it is possible to trigger a privileged process to run in a
>> context where it's reality that it expected, causing it to misuse
>> it's privileges.  Even if we don't have a privilege problem I think
>> we will have a case of mismatched functionality where the core pattern
>> will not work as expected.
>
> For me it seems rather the other way around: running the helper in some
> highly priviledged namespace is more dangerous. If it runs in the
> same context as the crasher it can do the least amount of damage
> relative to the crash process.
>
> And as Will pointed out it's the only sane way to deal with net namespaces.

I think you're both right.  How it is implemented right now is an
escape from the linux container. If you allow the root user in a
container to mount proc and update core_pattern, they can escape.
(core_pattern = |/well/known/binary_or_scripting_lang)  I'm sure there
are other escapes too (and any umh call is likely an escape like this
one -- e.g., modprobe_path).  That said, using my patch above might
let you traverse a path otherwise blocked by an LSM enforcement (E.g.,
root user runs a process which sets up a vfs namespace with an
encrypted mount and the lsm blocks access to the /proc/[pid]/root -
but core_pattern still runs and with access).

That said, using the setters namespace makes sense to me as a consumer
of core_pattern too. You can set the core_pattern outside of a
chroot/container and collect core dumps there _or_ you can let a
"root" user in a container have their own core collector without
providing a simple escape.  Making format_corename use the correct pid
namespace for translation would make these cases even more seamless.

Unfortunately, I haven't yet looked at doing it that way yet.  The
namespace-transition patch posted is what occurred to me initially.
Perhaps it won't be so hard.  I'll take a look at what it'd take to do
move core_pattern since it'd resolve both the escape/lsm-bypass
scenarios and the mismatch between the arbitrary namespace and the
core_pattern values.

Thanks!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
  2010-09-20 19:12             ` Andi Kleen
@ 2010-09-20 20:26                 ` Will Drewry
       [not found]               ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
  1 sibling, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Eric W. Biederman, linux-kernel, Alexander Viro, Andrew Morton,
	Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> The pipe process needs to run in the namespaces of the process who set
>> the core pattern, not in the namespaces of the dumping process.
>> Otherwise it is possible to trigger a privileged process to run in a
>> context where it's reality that it expected, causing it to misuse
>> it's privileges.  Even if we don't have a privilege problem I think
>> we will have a case of mismatched functionality where the core pattern
>> will not work as expected.
>
> For me it seems rather the other way around: running the helper in some
> highly priviledged namespace is more dangerous. If it runs in the
> same context as the crasher it can do the least amount of damage
> relative to the crash process.
>
> And as Will pointed out it's the only sane way to deal with net namespaces.

I think you're both right.  How it is implemented right now is an
escape from the linux container. If you allow the root user in a
container to mount proc and update core_pattern, they can escape.
(core_pattern = |/well/known/binary_or_scripting_lang)  I'm sure there
are other escapes too (and any umh call is likely an escape like this
one -- e.g., modprobe_path).  That said, using my patch above might
let you traverse a path otherwise blocked by an LSM enforcement (E.g.,
root user runs a process which sets up a vfs namespace with an
encrypted mount and the lsm blocks access to the /proc/[pid]/root -
but core_pattern still runs and with access).

That said, using the setters namespace makes sense to me as a consumer
of core_pattern too. You can set the core_pattern outside of a
chroot/container and collect core dumps there _or_ you can let a
"root" user in a container have their own core collector without
providing a simple escape.  Making format_corename use the correct pid
namespace for translation would make these cases even more seamless.

Unfortunately, I haven't yet looked at doing it that way yet.  The
namespace-transition patch posted is what occurred to me initially.
Perhaps it won't be so hard.  I'll take a look at what it'd take to do
move core_pattern since it'd resolve both the escape/lsm-bypass
scenarios and the mismatch between the arbitrary namespace and the
core_pattern values.

Thanks!
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace
@ 2010-09-20 20:26                 ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Eric W. Biederman, linux-kernel, Alexander Viro, Andrew Morton,
	Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> The pipe process needs to run in the namespaces of the process who set
>> the core pattern, not in the namespaces of the dumping process.
>> Otherwise it is possible to trigger a privileged process to run in a
>> context where it's reality that it expected, causing it to misuse
>> it's privileges.  Even if we don't have a privilege problem I think
>> we will have a case of mismatched functionality where the core pattern
>> will not work as expected.
>
> For me it seems rather the other way around: running the helper in some
> highly priviledged namespace is more dangerous. If it runs in the
> same context as the crasher it can do the least amount of damage
> relative to the crash process.
>
> And as Will pointed out it's the only sane way to deal with net namespaces.

I think you're both right.  How it is implemented right now is an
escape from the linux container. If you allow the root user in a
container to mount proc and update core_pattern, they can escape.
(core_pattern = |/well/known/binary_or_scripting_lang)  I'm sure there
are other escapes too (and any umh call is likely an escape like this
one -- e.g., modprobe_path).  That said, using my patch above might
let you traverse a path otherwise blocked by an LSM enforcement (E.g.,
root user runs a process which sets up a vfs namespace with an
encrypted mount and the lsm blocks access to the /proc/[pid]/root -
but core_pattern still runs and with access).

That said, using the setters namespace makes sense to me as a consumer
of core_pattern too. You can set the core_pattern outside of a
chroot/container and collect core dumps there _or_ you can let a
"root" user in a container have their own core collector without
providing a simple escape.  Making format_corename use the correct pid
namespace for translation would make these cases even more seamless.

Unfortunately, I haven't yet looked at doing it that way yet.  The
namespace-transition patch posted is what occurred to me initially.
Perhaps it won't be so hard.  I'll take a look at what it'd take to do
move core_pattern since it'd resolve both the escape/lsm-bypass
scenarios and the mismatch between the arbitrary namespace and the
core_pattern values.

Thanks!
will
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
       [not found]               ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2010-09-20 20:28                 ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro,
	KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton,
	Alexey Dobriyan, Roland McGrath, Eric W. Biederman

On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 09/17, Will Drewry wrote:
>>
>> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> >
>> > This looks overcomplicated to me, or I missed something.
>> >
>> > I do not understand why should we do this beforehand, and why we need
>> > copy_namespaces_unattached().
>> >
>> > Can't you just pass current to umh_pipe_setup() (or another helper) as
>> > the argument? Then this helper can copy ->fs and ->nsproxy itself.
>>
>> I wasn't sure if it was reasonable to pass the current task_struct
>> over, but I certainly can.
>
> Why not? current calls call_usermodehelper_exec(), it can't go away
> until subprocess_info->init() returns, it sleeps on wait_for_completion().

yeah - I wasn't sure because the other coredump_params didn't pass it,
so I assumed there was some history around that.  Though it sounds
like the current approach may not be the way forward anyhow.

>> In practice, this seems to amount to just adding a refcount to all the
>> namespaces and creating a new nsproxy which isn't really needed.  Most
>> likely, doing what you've suggested above plus the copy_fs_struct and
>> the swap out will do the trick.  I'll try it out and see.  That's make
>> it much clearer I think.
>
> Yes, just get_nsproxy() (like fork() does) should be fine in this case.
>
> As for copying ->fs, I am not sure actually. core_pattern is global,
> say it is "|/coredumper". If you change ->root, then exec can fail
> because that binary is not visible to the coredumping process?

Yeah - it's lose-lose I think.  On one hand, it may not run, on the
other hand it may have access where it shouldn't or not have access it
where it needs it.

> Probably we should move core_pattern into ->pid_ns, I dunno.

Sounds like this is worth doing. I'll look into it and post something
for further consideration!
thanks again -
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
  2010-09-20 18:50             ` Oleg Nesterov
@ 2010-09-20 20:28                 ` Will Drewry
  2010-09-20 20:28                 ` Will Drewry
  1 sibling, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 09/17, Will Drewry wrote:
>>
>> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > This looks overcomplicated to me, or I missed something.
>> >
>> > I do not understand why should we do this beforehand, and why we need
>> > copy_namespaces_unattached().
>> >
>> > Can't you just pass current to umh_pipe_setup() (or another helper) as
>> > the argument? Then this helper can copy ->fs and ->nsproxy itself.
>>
>> I wasn't sure if it was reasonable to pass the current task_struct
>> over, but I certainly can.
>
> Why not? current calls call_usermodehelper_exec(), it can't go away
> until subprocess_info->init() returns, it sleeps on wait_for_completion().

yeah - I wasn't sure because the other coredump_params didn't pass it,
so I assumed there was some history around that.  Though it sounds
like the current approach may not be the way forward anyhow.

>> In practice, this seems to amount to just adding a refcount to all the
>> namespaces and creating a new nsproxy which isn't really needed.  Most
>> likely, doing what you've suggested above plus the copy_fs_struct and
>> the swap out will do the trick.  I'll try it out and see.  That's make
>> it much clearer I think.
>
> Yes, just get_nsproxy() (like fork() does) should be fine in this case.
>
> As for copying ->fs, I am not sure actually. core_pattern is global,
> say it is "|/coredumper". If you change ->root, then exec can fail
> because that binary is not visible to the coredumping process?

Yeah - it's lose-lose I think.  On one hand, it may not run, on the
other hand it may have access where it shouldn't or not have access it
where it needs it.

> Probably we should move core_pattern into ->pid_ns, I dunno.

Sounds like this is worth doing. I'll look into it and post something
for further consideration!
thanks again -
will

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace
@ 2010-09-20 20:28                 ` Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton,
	KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman,
	containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan,
	linux-fsdevel

On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 09/17, Will Drewry wrote:
>>
>> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > This looks overcomplicated to me, or I missed something.
>> >
>> > I do not understand why should we do this beforehand, and why we need
>> > copy_namespaces_unattached().
>> >
>> > Can't you just pass current to umh_pipe_setup() (or another helper) as
>> > the argument? Then this helper can copy ->fs and ->nsproxy itself.
>>
>> I wasn't sure if it was reasonable to pass the current task_struct
>> over, but I certainly can.
>
> Why not? current calls call_usermodehelper_exec(), it can't go away
> until subprocess_info->init() returns, it sleeps on wait_for_completion().

yeah - I wasn't sure because the other coredump_params didn't pass it,
so I assumed there was some history around that.  Though it sounds
like the current approach may not be the way forward anyhow.

>> In practice, this seems to amount to just adding a refcount to all the
>> namespaces and creating a new nsproxy which isn't really needed.  Most
>> likely, doing what you've suggested above plus the copy_fs_struct and
>> the swap out will do the trick.  I'll try it out and see.  That's make
>> it much clearer I think.
>
> Yes, just get_nsproxy() (like fork() does) should be fine in this case.
>
> As for copying ->fs, I am not sure actually. core_pattern is global,
> say it is "|/coredumper". If you change ->root, then exec can fail
> because that binary is not visible to the coredumping process?

Yeah - it's lose-lose I think.  On one hand, it may not run, on the
other hand it may have access where it shouldn't or not have access it
where it needs it.

> Probably we should move core_pattern into ->pid_ns, I dunno.

Sounds like this is worth doing. I'll look into it and post something
for further consideration!
thanks again -
will
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper
@ 2010-09-16 18:59 Will Drewry
  0 siblings, 0 replies; 43+ messages in thread
From: Will Drewry @ 2010-09-16 18:59 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Will Drewry, Neil Horman,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov, Andi Kleen, Alexander Viro, KOSAKI Motohiro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Roland McGrath, Eric W. Biederman

format_corename uses task_tgid_vnr to provide the numeric pid of a
core-dumping process.  For file-based coredumps, this is perfectly
satisfactory.  However, when the core_pattern contains a pipe, the
substituted PID is invalid in the namespace of the core_pattern pipe
helper, the init namespace.

By changing this, any core collector may now find the process in the
init namespace /proc.  This helps with VFS namespacing too since the
mount root is available via /proc.  Unfortunately, it does not help in
cases of more complex namespacing, like net namespaces.  For that, the
helper thread will need to be migrated to the core-dump namespaces.

I have a separate patch series which implements migrating the
____call_usermodehelper thread to the coredump namespace, but it adds a
fair amount of complexity which might be better handled by someone who
understands that code.  I'm happy to mail it out as well though (it
works, but I don't assign a namespaced pid which may open up issues on
its own).

Any and all comments, feedback will be appreciated!

Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---
 fs/exec.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 828dd24..0b8a874 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr)
 	char *const out_end = corename + CORENAME_MAX_SIZE;
 	int rc;
 	int pid_in_pattern = 0;
+	pid_t pid = task_tgid_vnr(current);
+
+	/* The pipe helper runs in the init namespace and should
+	 * receive the matching pid until that changes.
+	 */
+	if (ispipe)
+		pid = task_tgid_nr(current);

 	/* Repeat as long as we have more pattern to process and more output
 	   space */
@@ -1489,7 +1496,7 @@ static int format_corename(char *corename, long signr)
 			case 'p':
 				pid_in_pattern = 1;
 				rc = snprintf(out_ptr, out_end - out_ptr,
-					      "%d", task_tgid_vnr(current));
+					      "%d", pid);
 				if (rc > out_end - out_ptr)
 					goto out;
 				out_ptr += rc;
@@ -1568,7 +1575,7 @@ static int format_corename(char *corename, long signr)
 	 * the filename. Do not do this for piped commands. */
 	if (!ispipe && !pid_in_pattern && core_uses_pid) {
 		rc = snprintf(out_ptr, out_end - out_ptr,
-			      ".%d", task_tgid_vnr(current));
+			      ".%d", pid);
 		if (rc > out_end - out_ptr)
 			goto out;
 		out_ptr += rc;
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2010-09-20 20:28 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry
     [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2010-09-16 19:35   ` Oleg Nesterov
2010-09-17 13:26   ` Andi Kleen
2010-09-16 19:35 ` Oleg Nesterov
     [not found]   ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2010-09-16 20:12     ` Eric W. Biederman
2010-09-16 20:12   ` Eric W. Biederman
2010-09-16 21:02     ` Will Drewry
     [not found]     ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-09-16 21:02       ` Will Drewry
2010-09-17 19:08       ` Roland McGrath
2010-09-17 19:08     ` Roland McGrath
2010-09-17 13:26 ` Andi Kleen
2010-09-17 14:52   ` Will Drewry
2010-09-17 14:52     ` Will Drewry
     [not found]   ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>
2010-09-17 14:52     ` Will Drewry
2010-09-17 15:16     ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry
2010-09-17 15:16     ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
2010-09-17 15:16   ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry
2010-09-17 15:16   ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry
     [not found]     ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2010-09-17 18:15       ` Neil Horman
2010-09-18  1:29       ` Oleg Nesterov
2010-09-18  1:29         ` Oleg Nesterov
     [not found]         ` <20100918012939.GA25046-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2010-09-18  2:34           ` Will Drewry
2010-09-18  2:34             ` Will Drewry
2010-09-18  3:14             ` Will Drewry
2010-09-18  3:14             ` Will Drewry
2010-09-20 18:50             ` Oleg Nesterov
2010-09-20 18:50             ` Oleg Nesterov
     [not found]               ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2010-09-20 20:28                 ` Will Drewry
2010-09-20 20:28               ` Will Drewry
2010-09-20 20:28                 ` Will Drewry
2010-09-18  3:13           ` [PATCH][RFC] v2 " Will Drewry
2010-09-18  3:13         ` Will Drewry
2010-09-20 18:34           ` Eric W. Biederman
2010-09-20 19:12             ` Andi Kleen
2010-09-20 20:26               ` Will Drewry
2010-09-20 20:26                 ` Will Drewry
     [not found]               ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
2010-09-20 20:26                 ` Will Drewry
     [not found]             ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-09-20 19:12               ` Andi Kleen
     [not found]           ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2010-09-20 18:34             ` Eric W. Biederman
2010-09-17 18:15     ` [PATCH 2/2] " Neil Horman
     [not found]       ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-09-18  2:33         ` Will Drewry
2010-09-18  2:33       ` Will Drewry
2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.