* [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper @ 2010-09-16 18:59 Will Drewry [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> ` (2 more replies) 0 siblings, 3 replies; 43+ messages in thread From: Will Drewry @ 2010-09-16 18:59 UTC (permalink / raw) To: linux-kernel Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen, Eric W. Biederman, containers, linux-fsdevel, Will Drewry format_corename uses task_tgid_vnr to provide the numeric pid of a core-dumping process. For file-based coredumps, this is perfectly satisfactory. However, when the core_pattern contains a pipe, the substituted PID is invalid in the namespace of the core_pattern pipe helper, the init namespace. By changing this, any core collector may now find the process in the init namespace /proc. This helps with VFS namespacing too since the mount root is available via /proc. Unfortunately, it does not help in cases of more complex namespacing, like net namespaces. For that, the helper thread will need to be migrated to the core-dump namespaces. I have a separate patch series which implements migrating the ____call_usermodehelper thread to the coredump namespace, but it adds a fair amount of complexity which might be better handled by someone who understands that code. I'm happy to mail it out as well though (it works, but I don't assign a namespaced pid which may open up issues on its own). Any and all comments, feedback will be appreciated! Signed-off-by: Will Drewry <wad@chromium.org> --- fs/exec.c | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..0b8a874 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) char *const out_end = corename + CORENAME_MAX_SIZE; int rc; int pid_in_pattern = 0; + pid_t pid = task_tgid_vnr(current); + + /* The pipe helper runs in the init namespace and should + * receive the matching pid until that changes. + */ + if (ispipe) + pid = task_tgid_nr(current); /* Repeat as long as we have more pattern to process and more output space */ @@ -1489,7 +1496,7 @@ static int format_corename(char *corename, long signr) case 'p': pid_in_pattern = 1; rc = snprintf(out_ptr, out_end - out_ptr, - "%d", task_tgid_vnr(current)); + "%d", pid); if (rc > out_end - out_ptr) goto out; out_ptr += rc; @@ -1568,7 +1575,7 @@ static int format_corename(char *corename, long signr) * the filename. Do not do this for piped commands. */ if (!ispipe && !pid_in_pattern && core_uses_pid) { rc = snprintf(out_ptr, out_end - out_ptr, - ".%d", task_tgid_vnr(current)); + ".%d", pid); if (rc > out_end - out_ptr) goto out; out_ptr += rc; -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
[parent not found: <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> @ 2010-09-16 19:35 ` Oleg Nesterov 2010-09-17 13:26 ` Andi Kleen 1 sibling, 0 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-16 19:35 UTC (permalink / raw) To: Will Drewry Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Roland McGrath, Eric W. Biederman On 09/16, Will Drewry wrote: > > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) > char *const out_end = corename + CORENAME_MAX_SIZE; > int rc; > int pid_in_pattern = 0; > + pid_t pid = task_tgid_vnr(current); > + > + /* The pipe helper runs in the init namespace and should > + * receive the matching pid until that changes. > + */ > + if (ispipe) > + pid = task_tgid_nr(current); Agreed, it doesn't make sense to pass the "local" pid to the init ns. Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-16 19:35 ` Oleg Nesterov @ 2010-09-17 13:26 ` Andi Kleen 1 sibling, 0 replies; 43+ messages in thread From: Andi Kleen @ 2010-09-17 13:26 UTC (permalink / raw) To: Will Drewry Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Roland McGrath, Eric W. Biederman On Thu, 16 Sep 2010 13:59:59 -0500 Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > format_corename uses task_tgid_vnr to provide the numeric pid of a > core-dumping process. For file-based coredumps, this is perfectly > satisfactory. However, when the core_pattern contains a pipe, the > substituted PID is invalid in the namespace of the core_pattern pipe > helper, the init namespace. Nasty problem. I wonder how many more similar problems name spaces have introduced. But wouldn't it be better to place the helper into the name space(s) of the executed process? I guess it would risk breaking some existing set ups, but it seem like the cleanest solution to me. If you want to move the core dump out of the name space you could still use a named pipe or something like that with someone outside listening. That would also fix the net namespace problem you mentioned -Andi ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> @ 2010-09-16 19:35 ` Oleg Nesterov [not found] ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-16 20:12 ` Eric W. Biederman 2010-09-17 13:26 ` Andi Kleen 2 siblings, 2 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-16 19:35 UTC (permalink / raw) To: Will Drewry Cc: linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen, Eric W. Biederman, containers, linux-fsdevel On 09/16, Will Drewry wrote: > > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) > char *const out_end = corename + CORENAME_MAX_SIZE; > int rc; > int pid_in_pattern = 0; > + pid_t pid = task_tgid_vnr(current); > + > + /* The pipe helper runs in the init namespace and should > + * receive the matching pid until that changes. > + */ > + if (ispipe) > + pid = task_tgid_nr(current); Agreed, it doesn't make sense to pass the "local" pid to the init ns. Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-09-16 20:12 ` Eric W. Biederman 0 siblings, 0 replies; 43+ messages in thread From: Eric W. Biederman @ 2010-09-16 20:12 UTC (permalink / raw) To: Oleg Nesterov Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Roland McGrath Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > On 09/16, Will Drewry wrote: >> >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) >> char *const out_end = corename + CORENAME_MAX_SIZE; >> int rc; >> int pid_in_pattern = 0; >> + pid_t pid = task_tgid_vnr(current); >> + >> + /* The pipe helper runs in the init namespace and should >> + * receive the matching pid until that changes. >> + */ >> + if (ispipe) >> + pid = task_tgid_nr(current); > > Agreed, it doesn't make sense to pass the "local" pid to the init ns. Yes, passing the local pid to the init pid namespace doesn't make a lot of sense. I have recently fixed unix domain sockets to avoid doing that kind of silliness. That said I don't think this is a complete fix. We also potentially have the same issue with the uts namespace and the user namespace. I believe the core file holds all of this information relative to the process that is dying, one elf note or other so we don't need to worry about information loss. As for how to implement this for the pid case can we simply grab the namespace at the ispipe stage and use task_tgid_nr_ns. Something like: pid_ns = current->nsproxy->pid_ns; if (is_pipe) pid_ns = &init_pid_ns; .... /* pid */ case 'p': pid_in_pattern = 1; rc = snprintf(out_ptr, out_end - out_ptr, "%d", task_tgid_pid_nr_ns(pid_ns, current)); if (rc > out_end - out_ptr) goto out; out_ptr += rc; break; Will you were saying something about complex namespacing (in particular the network namespace giving you problems). The network namespace has no influence over pipes so how does that come into play? I can imagine that it would be nice to have different core patterns depending on where you are in the process tree, so a container can do something different than the system outside of the container. Are you thinking along those lines, or are you imagining something else? Eric ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-16 19:35 ` Oleg Nesterov [not found] ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-09-16 20:12 ` Eric W. Biederman 2010-09-16 21:02 ` Will Drewry ` (2 more replies) 1 sibling, 3 replies; 43+ messages in thread From: Eric W. Biederman @ 2010-09-16 20:12 UTC (permalink / raw) To: Oleg Nesterov Cc: Will Drewry, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen, containers, linux-fsdevel Oleg Nesterov <oleg@redhat.com> writes: > On 09/16, Will Drewry wrote: >> >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) >> char *const out_end = corename + CORENAME_MAX_SIZE; >> int rc; >> int pid_in_pattern = 0; >> + pid_t pid = task_tgid_vnr(current); >> + >> + /* The pipe helper runs in the init namespace and should >> + * receive the matching pid until that changes. >> + */ >> + if (ispipe) >> + pid = task_tgid_nr(current); > > Agreed, it doesn't make sense to pass the "local" pid to the init ns. Yes, passing the local pid to the init pid namespace doesn't make a lot of sense. I have recently fixed unix domain sockets to avoid doing that kind of silliness. That said I don't think this is a complete fix. We also potentially have the same issue with the uts namespace and the user namespace. I believe the core file holds all of this information relative to the process that is dying, one elf note or other so we don't need to worry about information loss. As for how to implement this for the pid case can we simply grab the namespace at the ispipe stage and use task_tgid_nr_ns. Something like: pid_ns = current->nsproxy->pid_ns; if (is_pipe) pid_ns = &init_pid_ns; .... /* pid */ case 'p': pid_in_pattern = 1; rc = snprintf(out_ptr, out_end - out_ptr, "%d", task_tgid_pid_nr_ns(pid_ns, current)); if (rc > out_end - out_ptr) goto out; out_ptr += rc; break; Will you were saying something about complex namespacing (in particular the network namespace giving you problems). The network namespace has no influence over pipes so how does that come into play? I can imagine that it would be nice to have different core patterns depending on where you are in the process tree, so a container can do something different than the system outside of the container. Are you thinking along those lines, or are you imagining something else? Eric ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-16 20:12 ` Eric W. Biederman @ 2010-09-16 21:02 ` Will Drewry [not found] ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> 2010-09-17 19:08 ` Roland McGrath 2 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-16 21:02 UTC (permalink / raw) To: Eric W. Biederman Cc: Oleg Nesterov, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Andi Kleen, containers, linux-fsdevel On Thu, Sep 16, 2010 at 3:12 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Oleg Nesterov <oleg@redhat.com> writes: > >> On 09/16, Will Drewry wrote: >>> >>> --- a/fs/exec.c >>> +++ b/fs/exec.c >>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) >>> char *const out_end = corename + CORENAME_MAX_SIZE; >>> int rc; >>> int pid_in_pattern = 0; >>> + pid_t pid = task_tgid_vnr(current); >>> + >>> + /* The pipe helper runs in the init namespace and should >>> + * receive the matching pid until that changes. >>> + */ >>> + if (ispipe) >>> + pid = task_tgid_nr(current); >> >> Agreed, it doesn't make sense to pass the "local" pid to the init ns. > > Yes, passing the local pid to the init pid namespace doesn't make a lot > of sense. I have recently fixed unix domain sockets to avoid doing that > kind of silliness. > > That said I don't think this is a complete fix. We also potentially > have the same issue with the uts namespace and the user namespace. Hrm true. I was looking at getting the proper pid to the core handler as a means for it to get the rest from /proc/<pid>/[status|mountinfo|root] and so on. As is, it seemed non-trivial to reliably walk the /proc tree to find the dumping process. > I believe the core file holds all of this information relative to the > process that is dying, one elf note or other so we don't need to worry > about information loss. Yeah - my main concern was /proc access to get information that might otherwise be lost. > As for how to implement this for the pid case can we simply grab the > namespace at the ispipe stage and use task_tgid_nr_ns. Something like: > pid_ns = current->nsproxy->pid_ns; > if (is_pipe) > pid_ns = &init_pid_ns; > > .... > /* pid */ > case 'p': > pid_in_pattern = 1; > rc = snprintf(out_ptr, out_end - out_ptr, > "%d", task_tgid_pid_nr_ns(pid_ns, current)); > if (rc > out_end - out_ptr) > goto out; > out_ptr += rc; > break; Would that yield different results than the task_tgid_nr versus vnr, or is it just cleaner? > Will you were saying something about complex namespacing (in particular > the network namespace giving you problems). The network namespace has > no influence over pipes so how does that come into play? Ah! All I meant is that this patch lets you access anything that is in /proc/[tgid]/... for the dumping process. However, it seems that in the long run, core_pattern should run the pipe handler inside the namespace(s) of the process do_coredump is running for. For instance, if it wanted to make a network connection on coredump, it'd be doing it from the init namespace. > I can imagine that it would be nice to have different core patterns > depending on where you are in the process tree, so a container can > do something different than the system outside of the container. Are > you thinking along those lines, or are you imagining something else? Yup - in general, I was thinking that even if core_pattern was not split by namespace, it would be nice to attempt to execute it in the namespace of the process that needs it. My understand of namespaces is that they were a gateway to longer term lightweight containers (as the patches go into kernel.org) which would mean that a change to core_pattern shouldn't result in a process running in the init namespace (at least not by default!). To that end, I implemented a function similar to your setns() syscall patch which calls create_new_namespaces (and copy_fs_struct) to create an unattached fs and nsproxy which I then migrate the ____call_usermodehelper thread over to in umh_setup_pipe. It leaves it with a pid of 0, but otherwise functional. (I also played with changing its pid or best effort assigning one in the namespace as per older threads, but it all seemed to fragile.) I can post them on this thread (or wherever) if you'd like to see them, but they're pretty trivial, and I'm unsure of their correctness. cheers! will ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>]
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> @ 2010-09-16 21:02 ` Will Drewry 2010-09-17 19:08 ` Roland McGrath 1 sibling, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-16 21:02 UTC (permalink / raw) To: Eric W. Biederman Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Roland McGrath On Thu, Sep 16, 2010 at 3:12 PM, Eric W. Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote: > Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > >> On 09/16, Will Drewry wrote: >>> >>> --- a/fs/exec.c >>> +++ b/fs/exec.c >>> @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) >>> char *const out_end = corename + CORENAME_MAX_SIZE; >>> int rc; >>> int pid_in_pattern = 0; >>> + pid_t pid = task_tgid_vnr(current); >>> + >>> + /* The pipe helper runs in the init namespace and should >>> + * receive the matching pid until that changes. >>> + */ >>> + if (ispipe) >>> + pid = task_tgid_nr(current); >> >> Agreed, it doesn't make sense to pass the "local" pid to the init ns. > > Yes, passing the local pid to the init pid namespace doesn't make a lot > of sense. I have recently fixed unix domain sockets to avoid doing that > kind of silliness. > > That said I don't think this is a complete fix. We also potentially > have the same issue with the uts namespace and the user namespace. Hrm true. I was looking at getting the proper pid to the core handler as a means for it to get the rest from /proc/<pid>/[status|mountinfo|root] and so on. As is, it seemed non-trivial to reliably walk the /proc tree to find the dumping process. > I believe the core file holds all of this information relative to the > process that is dying, one elf note or other so we don't need to worry > about information loss. Yeah - my main concern was /proc access to get information that might otherwise be lost. > As for how to implement this for the pid case can we simply grab the > namespace at the ispipe stage and use task_tgid_nr_ns. Something like: > pid_ns = current->nsproxy->pid_ns; > if (is_pipe) > pid_ns = &init_pid_ns; > > .... > /* pid */ > case 'p': > pid_in_pattern = 1; > rc = snprintf(out_ptr, out_end - out_ptr, > "%d", task_tgid_pid_nr_ns(pid_ns, current)); > if (rc > out_end - out_ptr) > goto out; > out_ptr += rc; > break; Would that yield different results than the task_tgid_nr versus vnr, or is it just cleaner? > Will you were saying something about complex namespacing (in particular > the network namespace giving you problems). The network namespace has > no influence over pipes so how does that come into play? Ah! All I meant is that this patch lets you access anything that is in /proc/[tgid]/... for the dumping process. However, it seems that in the long run, core_pattern should run the pipe handler inside the namespace(s) of the process do_coredump is running for. For instance, if it wanted to make a network connection on coredump, it'd be doing it from the init namespace. > I can imagine that it would be nice to have different core patterns > depending on where you are in the process tree, so a container can > do something different than the system outside of the container. Are > you thinking along those lines, or are you imagining something else? Yup - in general, I was thinking that even if core_pattern was not split by namespace, it would be nice to attempt to execute it in the namespace of the process that needs it. My understand of namespaces is that they were a gateway to longer term lightweight containers (as the patches go into kernel.org) which would mean that a change to core_pattern shouldn't result in a process running in the init namespace (at least not by default!). To that end, I implemented a function similar to your setns() syscall patch which calls create_new_namespaces (and copy_fs_struct) to create an unattached fs and nsproxy which I then migrate the ____call_usermodehelper thread over to in umh_setup_pipe. It leaves it with a pid of 0, but otherwise functional. (I also played with changing its pid or best effort assigning one in the namespace as per older threads, but it all seemed to fragile.) I can post them on this thread (or wherever) if you'd like to see them, but they're pretty trivial, and I'm unsure of their correctness. cheers! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> 2010-09-16 21:02 ` Will Drewry @ 2010-09-17 19:08 ` Roland McGrath 1 sibling, 0 replies; 43+ messages in thread From: Roland McGrath @ 2010-09-17 19:08 UTC (permalink / raw) To: Eric W. Biederman Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton > That said I don't think this is a complete fix. We also potentially > have the same issue with the uts namespace and the user namespace. True. > I believe the core file holds all of this information relative to the > process that is dying, one elf note or other so we don't need to worry > about information loss. That's correct (see linux/elfcore.h, pr_*id fields in prstatus and prpsinfo). > I can imagine that it would be nice to have different core patterns > depending on where you are in the process tree, so a container can > do something different than the system outside of the container. Are > you thinking along those lines, or are you imagining something else? I agree. The format string being part of the pid_ns makes sense to me. (I don't have any ideas about the interface for setting it.) Of course, then what would make most sense is for the pipe handler to run in the innermost namespace that set its own format, rather than in the global ns. Thanks, Roland ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-16 20:12 ` Eric W. Biederman 2010-09-16 21:02 ` Will Drewry [not found] ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> @ 2010-09-17 19:08 ` Roland McGrath 2 siblings, 0 replies; 43+ messages in thread From: Roland McGrath @ 2010-09-17 19:08 UTC (permalink / raw) To: Eric W. Biederman Cc: Oleg Nesterov, Will Drewry, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Neil Horman, Andi Kleen, containers, linux-fsdevel > That said I don't think this is a complete fix. We also potentially > have the same issue with the uts namespace and the user namespace. True. > I believe the core file holds all of this information relative to the > process that is dying, one elf note or other so we don't need to worry > about information loss. That's correct (see linux/elfcore.h, pr_*id fields in prstatus and prpsinfo). > I can imagine that it would be nice to have different core patterns > depending on where you are in the process tree, so a container can > do something different than the system outside of the container. Are > you thinking along those lines, or are you imagining something else? I agree. The format string being part of the pid_ns makes sense to me. (I don't have any ideas about the interface for setting it.) Of course, then what would make most sense is for the pipe handler to run in the innermost namespace that set its own format, rather than in the global ns. Thanks, Roland ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-16 19:35 ` Oleg Nesterov @ 2010-09-17 13:26 ` Andi Kleen 2010-09-17 14:52 ` Will Drewry ` (3 more replies) 2 siblings, 4 replies; 43+ messages in thread From: Andi Kleen @ 2010-09-17 13:26 UTC (permalink / raw) To: Will Drewry Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, linux-fsdevel On Thu, 16 Sep 2010 13:59:59 -0500 Will Drewry <wad@chromium.org> wrote: > format_corename uses task_tgid_vnr to provide the numeric pid of a > core-dumping process. For file-based coredumps, this is perfectly > satisfactory. However, when the core_pattern contains a pipe, the > substituted PID is invalid in the namespace of the core_pattern pipe > helper, the init namespace. Nasty problem. I wonder how many more similar problems name spaces have introduced. But wouldn't it be better to place the helper into the name space(s) of the executed process? I guess it would risk breaking some existing set ups, but it seem like the cleanest solution to me. If you want to move the core dump out of the name space you could still use a named pipe or something like that with someone outside listening. That would also fix the net namespace problem you mentioned -Andi ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper 2010-09-17 13:26 ` Andi Kleen @ 2010-09-17 14:52 ` Will Drewry [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> ` (2 subsequent siblings) 3 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw) To: Andi Kleen Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, linux-fsdevel, Eugene Teo On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi@firstfloor.org> wrote: > On Thu, 16 Sep 2010 13:59:59 -0500 > Will Drewry <wad@chromium.org> wrote: > >> format_corename uses task_tgid_vnr to provide the numeric pid of a >> core-dumping process. For file-based coredumps, this is perfectly >> satisfactory. However, when the core_pattern contains a pipe, the >> substituted PID is invalid in the namespace of the core_pattern pipe >> helper, the init namespace. > > Nasty problem. I wonder how many more similar problems name spaces > have introduced. > > But wouldn't it be better to place the helper into the name space(s) > of the executed process? I guess it would risk breaking some existing > set ups, but it seem like the cleanest solution to me. If you want > to move the core dump out of the name space you could still > use a named pipe or something like that with someone outside listening. > > That would also fix the net namespace problem you mentioned I agree. This was what I did first, but I wasn't confident I'd done it right. I'll go ahead and post the two patches I used to accomplish it. I'm happy to spend some time polishing them until they are functionally correct, if everyone is happy enough with the approach. more to come, will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper @ 2010-09-17 14:52 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw) To: Andi Kleen Cc: linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, linux-fsdevel, Eugene Teo On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi@firstfloor.org> wrote: > On Thu, 16 Sep 2010 13:59:59 -0500 > Will Drewry <wad@chromium.org> wrote: > >> format_corename uses task_tgid_vnr to provide the numeric pid of a >> core-dumping process. For file-based coredumps, this is perfectly >> satisfactory. However, when the core_pattern contains a pipe, the >> substituted PID is invalid in the namespace of the core_pattern pipe >> helper, the init namespace. > > Nasty problem. I wonder how many more similar problems name spaces > have introduced. > > But wouldn't it be better to place the helper into the name space(s) > of the executed process? I guess it would risk breaking some existing > set ups, but it seem like the cleanest solution to me. If you want > to move the core dump out of the name space you could still > use a named pipe or something like that with someone outside listening. > > That would also fix the net namespace problem you mentioned I agree. This was what I did first, but I wasn't confident I'd done it right. I'll go ahead and post the two patches I used to accomplish it. I'm happy to spend some time polishing them until they are functionally correct, if everyone is happy enough with the approach. more to come, will -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org>]
* Re: [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> @ 2010-09-17 14:52 ` Will Drewry 2010-09-17 15:16 ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry 2 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 14:52 UTC (permalink / raw) To: Andi Kleen Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Eugene Teo, Roland McGrath, Eric W. Biederman On Fri, Sep 17, 2010 at 8:26 AM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote: > On Thu, 16 Sep 2010 13:59:59 -0500 > Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > >> format_corename uses task_tgid_vnr to provide the numeric pid of a >> core-dumping process. For file-based coredumps, this is perfectly >> satisfactory. However, when the core_pattern contains a pipe, the >> substituted PID is invalid in the namespace of the core_pattern pipe >> helper, the init namespace. > > Nasty problem. I wonder how many more similar problems name spaces > have introduced. > > But wouldn't it be better to place the helper into the name space(s) > of the executed process? I guess it would risk breaking some existing > set ups, but it seem like the cleanest solution to me. If you want > to move the core dump out of the name space you could still > use a named pipe or something like that with someone outside listening. > > That would also fix the net namespace problem you mentioned I agree. This was what I did first, but I wasn't confident I'd done it right. I'll go ahead and post the two patches I used to accomplish it. I'm happy to spend some time polishing them until they are functionally correct, if everyone is happy enough with the approach. more to come, will ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 1/2] nsproxy: add copy_namespaces_unattached [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> 2010-09-17 14:52 ` Will Drewry @ 2010-09-17 15:16 ` Will Drewry 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry 2 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw) To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman This changes adds copy_namespaces_unattached which provides similar behavior to copy_namespaces() for clone, but is meant for use when a new namespace needs to be derived from an existing process outside of process creation. The next patch in this series shows this function used in fs/exec.c to insert the core_pattern pipe thread into the crashed processes namespaces. This patch is similar to the setns patches floated earlier this year, but the goal is less lofty though not incompatible! Any and all input, thoughts, etc will be appreciated. Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> --- include/linux/nsproxy.h | 2 ++ kernel/nsproxy.c | 39 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 41 insertions(+), 0 deletions(-) diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index 7b370c7..4c823d2 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -63,6 +63,8 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk) } int copy_namespaces(unsigned long flags, struct task_struct *tsk); +int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk, + struct nsproxy **nsproxy, struct fs_struct **fs); void exit_task_namespaces(struct task_struct *tsk); void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new); void free_nsproxy(struct nsproxy *ns); diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index f74e6c0..ddaea4d 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -22,6 +22,7 @@ #include <linux/pid_namespace.h> #include <net/net_namespace.h> #include <linux/ipc_namespace.h> +#include <linux/fs_struct.h> static struct kmem_cache *nsproxy_cachep; @@ -161,6 +162,44 @@ out: return err; } +/** + * copy_namespaces_unattached: creates a new nsproxy and fs from a given task + * @flags: clone flags to change namespace creation/copy behavior + * @tsk: task's namespace to base the nsproxy and fs on + * @nsproxy: pointer which will contain the newly created nsproxy + * @fs: pointer which will contain the newly created fs_struct + * + * Returns 0 on success and non-zero on failure. + * + * This function should aid in migrating processes across namespaces when after + * creation. + */ +int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk, + struct nsproxy **nsproxy, struct fs_struct **fs) +{ + int err = 0; + if (!fs || !nsproxy) { + err = -EINVAL; + goto out; + } + + *fs = copy_fs_struct(tsk->fs); + if (!*fs) { + err = -ENOMEM; + goto out; + } + + *nsproxy = create_new_namespaces(flags, tsk, *fs); + if (IS_ERR(*nsproxy)) { + err = PTR_ERR(*nsproxy); + free_fs_struct(*fs); + goto out; + } + +out: + return err; +} + void free_nsproxy(struct nsproxy *ns) { if (ns->mnt_ns) -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> 2010-09-17 14:52 ` Will Drewry 2010-09-17 15:16 ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry @ 2010-09-17 15:16 ` Will Drewry 2 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw) To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman Presently, a core_pattern pipe endpoint will be run in the init namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the core dumping process but will have no access to that processes /proc without walking the init namespace /proc looking through all the global pids until it finds the one it thinks matches. One option for implementing this I've already posed: https://patchwork.kernel.org/patch/185912/ However, it's unclear if it is desirable for the core_pattern to run outside the namespace. In particular, it can easily access the mounts via /proc/[pid_nr]/root, but if there is a net namespace, it will not have access. (Originally introduced in 2007 in commit b488893a390edfe027bae7a46e9af8083e740668 ) Instead, this change implements the more complex option two. It migrates the ____call_usermodehelper() thread into the same namespaces as the dumping process. It does not assign a pid in that namespace so the collector will appear to be pid 0. (Using alloc_pid and change_pid are options though. I avoided it because kernel_thread's returned pid will then mismatch the actual threads pid.) Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> --- fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 53 insertions(+), 3 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..3b82607 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -51,6 +51,7 @@ #include <linux/audit.h> #include <linux/tracehook.h> #include <linux/kmod.h> +#include <linux/nsproxy.h> #include <linux/fsnotify.h> #include <linux/fs_struct.h> #include <linux/pipe_fs_i.h> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; unsigned int core_pipe_limit; int suid_dumpable = 0; +struct coredump_pipe_params { + struct coredump_params *params; + struct nsproxy *nsproxy; + struct fs_struct *fs; +}; + /* The maximal length of core_pattern is also specified in sysctl.c */ static LIST_HEAD(formats); @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) { struct file *rp, *wp; struct fdtable *fdt; - struct coredump_params *cp = (struct coredump_params *)info->data; - struct files_struct *cf = current->files; + struct coredump_pipe_params *pipe_params = + (struct coredump_pipe_params *)info->data; + struct coredump_params *cp = pipe_params->params; + struct task_struct *tsk = current; + struct files_struct *cf = tsk->files; + struct fs_struct *cfs = tsk->fs; + + /* Migrate this thread into the crashing namespaces, but + * don't change its pid struct to avoid breaking any other + * dependencies. This will mean the process is pid=0 if it + * was migrated into a pid namespace. + */ + if (pipe_params->nsproxy && pipe_params->fs) { + int kill; + switch_task_namespaces(tsk, pipe_params->nsproxy); + pipe_params->nsproxy = NULL; + + task_lock(tsk); + tsk->fs = pipe_params->fs; + task_unlock(tsk); + pipe_params->fs = NULL; + /* Clean up the previous fs struct */ + write_lock(&cfs->lock); + kill = !--cfs->users; + write_unlock(&cfs->lock); + if (kill) + free_fs_struct(cfs); + } wp = create_write_pipe(0); if (IS_ERR(wp)) @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) if (ispipe) { int dump_count; char **helper_argv; + struct coredump_pipe_params pipe_params = { .params = &cprm }; if (cprm.limit == 1) { /* @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) goto fail_dropcount; } + /* Run the core_collector in the crashing namespaces */ + if (copy_namespaces_unattached(0, current, + &pipe_params.nsproxy, &pipe_params.fs)) { + printk(KERN_WARNING "%s failed to copy namespaces\n", + __func__); + argv_free(helper_argv); + goto fail_dropcount; + } + retval = call_usermodehelper_fns(helper_argv[0], helper_argv, NULL, UMH_WAIT_EXEC, umh_pipe_setup, - NULL, &cprm); + NULL, &pipe_params); argv_free(helper_argv); + /* nsproxy and fs will survive if call_usermodehelper_fns hits + * ENOMEM prior to creating a new thread. + */ + if (pipe_params.nsproxy) + put_nsproxy(pipe_params.nsproxy); + if (pipe_params.fs) /* not in use by anything else */ + free_fs_struct(pipe_params.fs); if (retval) { printk(KERN_INFO "Core dump to %s pipe failed\n", corename); -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH 1/2] nsproxy: add copy_namespaces_unattached 2010-09-17 13:26 ` Andi Kleen 2010-09-17 14:52 ` Will Drewry [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> @ 2010-09-17 15:16 ` Will Drewry 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry 3 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw) To: Andi Kleen, linux-kernel Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel, Will Drewry This changes adds copy_namespaces_unattached which provides similar behavior to copy_namespaces() for clone, but is meant for use when a new namespace needs to be derived from an existing process outside of process creation. The next patch in this series shows this function used in fs/exec.c to insert the core_pattern pipe thread into the crashed processes namespaces. This patch is similar to the setns patches floated earlier this year, but the goal is less lofty though not incompatible! Any and all input, thoughts, etc will be appreciated. Signed-off-by: Will Drewry <wad@chromium.org> --- include/linux/nsproxy.h | 2 ++ kernel/nsproxy.c | 39 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 41 insertions(+), 0 deletions(-) diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index 7b370c7..4c823d2 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -63,6 +63,8 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk) } int copy_namespaces(unsigned long flags, struct task_struct *tsk); +int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk, + struct nsproxy **nsproxy, struct fs_struct **fs); void exit_task_namespaces(struct task_struct *tsk); void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new); void free_nsproxy(struct nsproxy *ns); diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index f74e6c0..ddaea4d 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -22,6 +22,7 @@ #include <linux/pid_namespace.h> #include <net/net_namespace.h> #include <linux/ipc_namespace.h> +#include <linux/fs_struct.h> static struct kmem_cache *nsproxy_cachep; @@ -161,6 +162,44 @@ out: return err; } +/** + * copy_namespaces_unattached: creates a new nsproxy and fs from a given task + * @flags: clone flags to change namespace creation/copy behavior + * @tsk: task's namespace to base the nsproxy and fs on + * @nsproxy: pointer which will contain the newly created nsproxy + * @fs: pointer which will contain the newly created fs_struct + * + * Returns 0 on success and non-zero on failure. + * + * This function should aid in migrating processes across namespaces when after + * creation. + */ +int copy_namespaces_unattached(unsigned long flags, struct task_struct *tsk, + struct nsproxy **nsproxy, struct fs_struct **fs) +{ + int err = 0; + if (!fs || !nsproxy) { + err = -EINVAL; + goto out; + } + + *fs = copy_fs_struct(tsk->fs); + if (!*fs) { + err = -ENOMEM; + goto out; + } + + *nsproxy = create_new_namespaces(flags, tsk, *fs); + if (IS_ERR(*nsproxy)) { + err = PTR_ERR(*nsproxy); + free_fs_struct(*fs); + goto out; + } + +out: + return err; +} + void free_nsproxy(struct nsproxy *ns) { if (ns->mnt_ns) -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-17 13:26 ` Andi Kleen ` (2 preceding siblings ...) 2010-09-17 15:16 ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry @ 2010-09-17 15:16 ` Will Drewry [not found] ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-17 18:15 ` [PATCH 2/2] " Neil Horman 3 siblings, 2 replies; 43+ messages in thread From: Will Drewry @ 2010-09-17 15:16 UTC (permalink / raw) To: Andi Kleen, linux-kernel Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel, Will Drewry Presently, a core_pattern pipe endpoint will be run in the init namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the core dumping process but will have no access to that processes /proc without walking the init namespace /proc looking through all the global pids until it finds the one it thinks matches. One option for implementing this I've already posed: https://patchwork.kernel.org/patch/185912/ However, it's unclear if it is desirable for the core_pattern to run outside the namespace. In particular, it can easily access the mounts via /proc/[pid_nr]/root, but if there is a net namespace, it will not have access. (Originally introduced in 2007 in commit b488893a390edfe027bae7a46e9af8083e740668 ) Instead, this change implements the more complex option two. It migrates the ____call_usermodehelper() thread into the same namespaces as the dumping process. It does not assign a pid in that namespace so the collector will appear to be pid 0. (Using alloc_pid and change_pid are options though. I avoided it because kernel_thread's returned pid will then mismatch the actual threads pid.) Signed-off-by: Will Drewry <wad@chromium.org> --- fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 53 insertions(+), 3 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..3b82607 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -51,6 +51,7 @@ #include <linux/audit.h> #include <linux/tracehook.h> #include <linux/kmod.h> +#include <linux/nsproxy.h> #include <linux/fsnotify.h> #include <linux/fs_struct.h> #include <linux/pipe_fs_i.h> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; unsigned int core_pipe_limit; int suid_dumpable = 0; +struct coredump_pipe_params { + struct coredump_params *params; + struct nsproxy *nsproxy; + struct fs_struct *fs; +}; + /* The maximal length of core_pattern is also specified in sysctl.c */ static LIST_HEAD(formats); @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) { struct file *rp, *wp; struct fdtable *fdt; - struct coredump_params *cp = (struct coredump_params *)info->data; - struct files_struct *cf = current->files; + struct coredump_pipe_params *pipe_params = + (struct coredump_pipe_params *)info->data; + struct coredump_params *cp = pipe_params->params; + struct task_struct *tsk = current; + struct files_struct *cf = tsk->files; + struct fs_struct *cfs = tsk->fs; + + /* Migrate this thread into the crashing namespaces, but + * don't change its pid struct to avoid breaking any other + * dependencies. This will mean the process is pid=0 if it + * was migrated into a pid namespace. + */ + if (pipe_params->nsproxy && pipe_params->fs) { + int kill; + switch_task_namespaces(tsk, pipe_params->nsproxy); + pipe_params->nsproxy = NULL; + + task_lock(tsk); + tsk->fs = pipe_params->fs; + task_unlock(tsk); + pipe_params->fs = NULL; + /* Clean up the previous fs struct */ + write_lock(&cfs->lock); + kill = !--cfs->users; + write_unlock(&cfs->lock); + if (kill) + free_fs_struct(cfs); + } wp = create_write_pipe(0); if (IS_ERR(wp)) @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) if (ispipe) { int dump_count; char **helper_argv; + struct coredump_pipe_params pipe_params = { .params = &cprm }; if (cprm.limit == 1) { /* @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) goto fail_dropcount; } + /* Run the core_collector in the crashing namespaces */ + if (copy_namespaces_unattached(0, current, + &pipe_params.nsproxy, &pipe_params.fs)) { + printk(KERN_WARNING "%s failed to copy namespaces\n", + __func__); + argv_free(helper_argv); + goto fail_dropcount; + } + retval = call_usermodehelper_fns(helper_argv[0], helper_argv, NULL, UMH_WAIT_EXEC, umh_pipe_setup, - NULL, &cprm); + NULL, &pipe_params); argv_free(helper_argv); + /* nsproxy and fs will survive if call_usermodehelper_fns hits + * ENOMEM prior to creating a new thread. + */ + if (pipe_params.nsproxy) + put_nsproxy(pipe_params.nsproxy); + if (pipe_params.fs) /* not in use by anything else */ + free_fs_struct(pipe_params.fs); if (retval) { printk(KERN_INFO "Core dump to %s pipe failed\n", corename); -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
[parent not found: <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace [not found] ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> @ 2010-09-17 18:15 ` Neil Horman 2010-09-18 1:29 ` Oleg Nesterov 1 sibling, 0 replies; 43+ messages in thread From: Neil Horman @ 2010-09-17 18:15 UTC (permalink / raw) To: Will Drewry Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote: > Presently, a core_pattern pipe endpoint will be run in the init > namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the > core dumping process but will have no access to that processes /proc > without walking the init namespace /proc looking through all the global > pids until it finds the one it thinks matches. > > One option for implementing this I've already posed: > https://patchwork.kernel.org/patch/185912/ > However, it's unclear if it is desirable for the core_pattern to run > outside the namespace. In particular, it can easily access the mounts > via /proc/[pid_nr]/root, but if there is a net namespace, it will not > have access. (Originally introduced in 2007 in commit > b488893a390edfe027bae7a46e9af8083e740668 ) > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0. (Using alloc_pid and change_pid > are options though. I avoided it because kernel_thread's returned pid > will then mismatch the actual threads pid.) > > Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > --- > fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- > 1 files changed, 53 insertions(+), 3 deletions(-) > > diff --git a/fs/exec.c b/fs/exec.c > index 828dd24..3b82607 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -51,6 +51,7 @@ > #include <linux/audit.h> > #include <linux/tracehook.h> > #include <linux/kmod.h> > +#include <linux/nsproxy.h> > #include <linux/fsnotify.h> > #include <linux/fs_struct.h> > #include <linux/pipe_fs_i.h> > @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; > unsigned int core_pipe_limit; > int suid_dumpable = 0; > > +struct coredump_pipe_params { > + struct coredump_params *params; > + struct nsproxy *nsproxy; > + struct fs_struct *fs; > +}; > + Generally, I like this approach, but i think perhaps it would be better if you made the coredump_pipe_params struct a member of the coredump_params structure (perhaps within an anonymous union, so you can handle future core dump types, if those ever come along). That will save you needing the extra pointer to the coredump_params structure itself. > /* The maximal length of core_pattern is also specified in sysctl.c */ > > static LIST_HEAD(formats); > @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) > { > struct file *rp, *wp; > struct fdtable *fdt; > - struct coredump_params *cp = (struct coredump_params *)info->data; > - struct files_struct *cf = current->files; > + struct coredump_pipe_params *pipe_params = > + (struct coredump_pipe_params *)info->data; If you do the above, you can just reference this as cp->pipe_params, or what-not. > + struct coredump_params *cp = pipe_params->params; > + struct task_struct *tsk = current; > + struct files_struct *cf = tsk->files; > + struct fs_struct *cfs = tsk->fs; > + > + /* Migrate this thread into the crashing namespaces, but > + * don't change its pid struct to avoid breaking any other > + * dependencies. This will mean the process is pid=0 if it > + * was migrated into a pid namespace. > + */ > + if (pipe_params->nsproxy && pipe_params->fs) { > + int kill; > + switch_task_namespaces(tsk, pipe_params->nsproxy); > + pipe_params->nsproxy = NULL; > + > + task_lock(tsk); > + tsk->fs = pipe_params->fs; > + task_unlock(tsk); > + pipe_params->fs = NULL; > + /* Clean up the previous fs struct */ > + write_lock(&cfs->lock); > + kill = !--cfs->users; > + write_unlock(&cfs->lock); > + if (kill) > + free_fs_struct(cfs); > + } > > wp = create_write_pipe(0); > if (IS_ERR(wp)) > @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) > if (ispipe) { > int dump_count; > char **helper_argv; > + struct coredump_pipe_params pipe_params = { .params = &cprm }; > And you won't have to allocate this, it will just be part of the coredump_params. > if (cprm.limit == 1) { > /* > @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) > goto fail_dropcount; > } > > + /* Run the core_collector in the crashing namespaces */ > + if (copy_namespaces_unattached(0, current, > + &pipe_params.nsproxy, &pipe_params.fs)) { > + printk(KERN_WARNING "%s failed to copy namespaces\n", > + __func__); > + argv_free(helper_argv); > + goto fail_dropcount; > + } > + > retval = call_usermodehelper_fns(helper_argv[0], helper_argv, > NULL, UMH_WAIT_EXEC, umh_pipe_setup, > - NULL, &cprm); > + NULL, &pipe_params); And you can pass a pointer to coredump_params here, instead of a pointer to a pipe_specific coredumps, to a function that isn't specific to pipes > argv_free(helper_argv); > + /* nsproxy and fs will survive if call_usermodehelper_fns hits > + * ENOMEM prior to creating a new thread. > + */ > + if (pipe_params.nsproxy) > + put_nsproxy(pipe_params.nsproxy); > + if (pipe_params.fs) /* not in use by anything else */ > + free_fs_struct(pipe_params.fs); > if (retval) { > printk(KERN_INFO "Core dump to %s pipe failed\n", > corename); > -- > 1.7.0.4 > > Regards Neil ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry @ 2010-09-18 1:29 ` Oleg Nesterov 2010-09-17 18:15 ` [PATCH 2/2] " Neil Horman 1 sibling, 0 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-18 1:29 UTC (permalink / raw) To: Will Drewry Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On 09/17, Will Drewry wrote: > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0. Hmm... You mean, it won't visible in that namespace? Afacis, it should have the correct pid in the init ns, no? I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps this is OK... Say, sys_getpid() returns 0, strange. > + /* Run the core_collector in the crashing namespaces */ > + if (copy_namespaces_unattached(0, current, > + &pipe_params.nsproxy, &pipe_params.fs)) { > + printk(KERN_WARNING "%s failed to copy namespaces\n", > + __func__); > + argv_free(helper_argv); > + goto fail_dropcount; > + } This looks overcomplicated to me, or I missed something. I do not understand why should we do this beforehand, and why we need copy_namespaces_unattached(). Can't you just pass current to umh_pipe_setup() (or another helper) as the argument? Then this helper can copy ->fs and ->nsproxy itself. In fact, I do not understand why create_new_namespaces() is used. It is called with flags == 0 anyway, can't we just do ns = coredumping_task->nsproxy; get_nsproxy(ns); switch_task_namespaces(current, ns); ? Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace @ 2010-09-18 1:29 ` Oleg Nesterov 0 siblings, 0 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-18 1:29 UTC (permalink / raw) To: Will Drewry Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On 09/17, Will Drewry wrote: > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0. Hmm... You mean, it won't visible in that namespace? Afacis, it should have the correct pid in the init ns, no? I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps this is OK... Say, sys_getpid() returns 0, strange. > + /* Run the core_collector in the crashing namespaces */ > + if (copy_namespaces_unattached(0, current, > + &pipe_params.nsproxy, &pipe_params.fs)) { > + printk(KERN_WARNING "%s failed to copy namespaces\n", > + __func__); > + argv_free(helper_argv); > + goto fail_dropcount; > + } This looks overcomplicated to me, or I missed something. I do not understand why should we do this beforehand, and why we need copy_namespaces_unattached(). Can't you just pass current to umh_pipe_setup() (or another helper) as the argument? Then this helper can copy ->fs and ->nsproxy itself. In fact, I do not understand why create_new_namespaces() is used. It is called with flags == 0 anyway, can't we just do ns = coredumping_task->nsproxy; get_nsproxy(ns); switch_task_namespaces(current, ns); ? Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100918012939.GA25046-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 1:29 ` Oleg Nesterov @ 2010-09-18 2:34 ` Will Drewry -1 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 2:34 UTC (permalink / raw) To: Oleg Nesterov Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 09/17, Will Drewry wrote: >> >> Instead, this change implements the more complex option two. It >> migrates the ____call_usermodehelper() thread into the same namespaces >> as the dumping process. It does not assign a pid in that namespace so >> the collector will appear to be pid 0. > > Hmm... You mean, it won't visible in that namespace? Afacis, it should > have the correct pid in the init ns, no? Exactly - it will be unmapped in its new pid namespace, returning 0 only in that case, as you point out below! > > I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps > this is OK... Say, sys_getpid() returns 0, strange. > >> + /* Run the core_collector in the crashing namespaces */ >> + if (copy_namespaces_unattached(0, current, >> + &pipe_params.nsproxy, &pipe_params.fs)) { >> + printk(KERN_WARNING "%s failed to copy namespaces\n", >> + __func__); >> + argv_free(helper_argv); >> + goto fail_dropcount; >> + } > > This looks overcomplicated to me, or I missed something. > > I do not understand why should we do this beforehand, and why we need > copy_namespaces_unattached(). > > Can't you just pass current to umh_pipe_setup() (or another helper) as > the argument? Then this helper can copy ->fs and ->nsproxy itself. I wasn't sure if it was reasonable to pass the current task_struct over, but I certainly can. > In fact, I do not understand why create_new_namespaces() is used. It > is called with flags == 0 anyway, can't we just do > > ns = coredumping_task->nsproxy; > get_nsproxy(ns); > switch_task_namespaces(current, ns); > > ? So that was my first thought (which I tried). I did exactly what you suggested to the khelper thread, and the lack of the fs struct bit me. Since the older patch from Eric Biederman (setns()) had taken the route of deflecting the work through create_new_namespaces(), I did too. I figured it would ensure that any namespacing behavior that needed to be done would be done. In practice, this seems to amount to just adding a refcount to all the namespaces and creating a new nsproxy which isn't really needed. Most likely, doing what you've suggested above plus the copy_fs_struct and the swap out will do the trick. I'll try it out and see. That's make it much clearer I think. thanks! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace @ 2010-09-18 2:34 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 2:34 UTC (permalink / raw) To: Oleg Nesterov Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote: > On 09/17, Will Drewry wrote: >> >> Instead, this change implements the more complex option two. It >> migrates the ____call_usermodehelper() thread into the same namespaces >> as the dumping process. It does not assign a pid in that namespace so >> the collector will appear to be pid 0. > > Hmm... You mean, it won't visible in that namespace? Afacis, it should > have the correct pid in the init ns, no? Exactly - it will be unmapped in its new pid namespace, returning 0 only in that case, as you point out below! > > I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps > this is OK... Say, sys_getpid() returns 0, strange. > >> + /* Run the core_collector in the crashing namespaces */ >> + if (copy_namespaces_unattached(0, current, >> + &pipe_params.nsproxy, &pipe_params.fs)) { >> + printk(KERN_WARNING "%s failed to copy namespaces\n", >> + __func__); >> + argv_free(helper_argv); >> + goto fail_dropcount; >> + } > > This looks overcomplicated to me, or I missed something. > > I do not understand why should we do this beforehand, and why we need > copy_namespaces_unattached(). > > Can't you just pass current to umh_pipe_setup() (or another helper) as > the argument? Then this helper can copy ->fs and ->nsproxy itself. I wasn't sure if it was reasonable to pass the current task_struct over, but I certainly can. > In fact, I do not understand why create_new_namespaces() is used. It > is called with flags == 0 anyway, can't we just do > > ns = coredumping_task->nsproxy; > get_nsproxy(ns); > switch_task_namespaces(current, ns); > > ? So that was my first thought (which I tried). I did exactly what you suggested to the khelper thread, and the lack of the fs struct bit me. Since the older patch from Eric Biederman (setns()) had taken the route of deflecting the work through create_new_namespaces(), I did too. I figured it would ensure that any namespacing behavior that needed to be done would be done. In practice, this seems to amount to just adding a refcount to all the namespaces and creating a new nsproxy which isn't really needed. Most likely, doing what you've suggested above plus the copy_fs_struct and the swap out will do the trick. I'll try it out and see. That's make it much clearer I think. thanks! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 2:34 ` Will Drewry (?) @ 2010-09-18 3:14 ` Will Drewry -1 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 3:14 UTC (permalink / raw) To: Oleg Nesterov Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On Fri, Sep 17, 2010 at 9:34 PM, Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: >> On 09/17, Will Drewry wrote: >>> >>> Instead, this change implements the more complex option two. It >>> migrates the ____call_usermodehelper() thread into the same namespaces >>> as the dumping process. It does not assign a pid in that namespace so >>> the collector will appear to be pid 0. >> >> Hmm... You mean, it won't visible in that namespace? Afacis, it should >> have the correct pid in the init ns, no? > > Exactly - it will be unmapped in its new pid namespace, returning 0 > only in that case, as you point out below! > >> >> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps >> this is OK... Say, sys_getpid() returns 0, strange. >> >>> + /* Run the core_collector in the crashing namespaces */ >>> + if (copy_namespaces_unattached(0, current, >>> + &pipe_params.nsproxy, &pipe_params.fs)) { >>> + printk(KERN_WARNING "%s failed to copy namespaces\n", >>> + __func__); >>> + argv_free(helper_argv); >>> + goto fail_dropcount; >>> + } >> >> This looks overcomplicated to me, or I missed something. >> >> I do not understand why should we do this beforehand, and why we need >> copy_namespaces_unattached(). >> >> Can't you just pass current to umh_pipe_setup() (or another helper) as >> the argument? Then this helper can copy ->fs and ->nsproxy itself. > > I wasn't sure if it was reasonable to pass the current task_struct > over, but I certainly can. > >> In fact, I do not understand why create_new_namespaces() is used. It >> is called with flags == 0 anyway, can't we just do >> >> ns = coredumping_task->nsproxy; >> get_nsproxy(ns); >> switch_task_namespaces(current, ns); >> >> ? > > So that was my first thought (which I tried). I did exactly what you > suggested to the khelper thread, and the lack of the fs struct bit me. > Since the older patch from Eric Biederman (setns()) had taken the > route of deflecting the work through create_new_namespaces(), I did > too. I figured it would ensure that any namespacing behavior that > needed to be done would be done. > > In practice, this seems to amount to just adding a refcount to all the > namespaces and creating a new nsproxy which isn't really needed. Most > likely, doing what you've suggested above plus the copy_fs_struct and > the swap out will do the trick. I'll try it out and see. That's make > it much clearer I think. That seems to work -- I'll post an update shortly with your comments and Neil's integrated - hopefully accurately. Any other thoughts on style/cleanup/etc will certainly be appreciated. cheers! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 2:34 ` Will Drewry (?) (?) @ 2010-09-18 3:14 ` Will Drewry -1 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 3:14 UTC (permalink / raw) To: Oleg Nesterov Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Fri, Sep 17, 2010 at 9:34 PM, Will Drewry <wad@chromium.org> wrote: > On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote: >> On 09/17, Will Drewry wrote: >>> >>> Instead, this change implements the more complex option two. It >>> migrates the ____call_usermodehelper() thread into the same namespaces >>> as the dumping process. It does not assign a pid in that namespace so >>> the collector will appear to be pid 0. >> >> Hmm... You mean, it won't visible in that namespace? Afacis, it should >> have the correct pid in the init ns, no? > > Exactly - it will be unmapped in its new pid namespace, returning 0 > only in that case, as you point out below! > >> >> I am a bit worried task_active_pid_ns() != nsproxy->pid_ns, but perhaps >> this is OK... Say, sys_getpid() returns 0, strange. >> >>> + /* Run the core_collector in the crashing namespaces */ >>> + if (copy_namespaces_unattached(0, current, >>> + &pipe_params.nsproxy, &pipe_params.fs)) { >>> + printk(KERN_WARNING "%s failed to copy namespaces\n", >>> + __func__); >>> + argv_free(helper_argv); >>> + goto fail_dropcount; >>> + } >> >> This looks overcomplicated to me, or I missed something. >> >> I do not understand why should we do this beforehand, and why we need >> copy_namespaces_unattached(). >> >> Can't you just pass current to umh_pipe_setup() (or another helper) as >> the argument? Then this helper can copy ->fs and ->nsproxy itself. > > I wasn't sure if it was reasonable to pass the current task_struct > over, but I certainly can. > >> In fact, I do not understand why create_new_namespaces() is used. It >> is called with flags == 0 anyway, can't we just do >> >> ns = coredumping_task->nsproxy; >> get_nsproxy(ns); >> switch_task_namespaces(current, ns); >> >> ? > > So that was my first thought (which I tried). I did exactly what you > suggested to the khelper thread, and the lack of the fs struct bit me. > Since the older patch from Eric Biederman (setns()) had taken the > route of deflecting the work through create_new_namespaces(), I did > too. I figured it would ensure that any namespacing behavior that > needed to be done would be done. > > In practice, this seems to amount to just adding a refcount to all the > namespaces and creating a new nsproxy which isn't really needed. Most > likely, doing what you've suggested above plus the copy_fs_struct and > the swap out will do the trick. I'll try it out and see. That's make > it much clearer I think. That seems to work -- I'll post an update shortly with your comments and Neil's integrated - hopefully accurately. Any other thoughts on style/cleanup/etc will certainly be appreciated. cheers! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 2:34 ` Will Drewry ` (2 preceding siblings ...) (?) @ 2010-09-20 18:50 ` Oleg Nesterov -1 siblings, 0 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-20 18:50 UTC (permalink / raw) To: Will Drewry Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On 09/17, Will Drewry wrote: > > On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > This looks overcomplicated to me, or I missed something. > > > > I do not understand why should we do this beforehand, and why we need > > copy_namespaces_unattached(). > > > > Can't you just pass current to umh_pipe_setup() (or another helper) as > > the argument? Then this helper can copy ->fs and ->nsproxy itself. > > I wasn't sure if it was reasonable to pass the current task_struct > over, but I certainly can. Why not? current calls call_usermodehelper_exec(), it can't go away until subprocess_info->init() returns, it sleeps on wait_for_completion(). > In practice, this seems to amount to just adding a refcount to all the > namespaces and creating a new nsproxy which isn't really needed. Most > likely, doing what you've suggested above plus the copy_fs_struct and > the swap out will do the trick. I'll try it out and see. That's make > it much clearer I think. Yes, just get_nsproxy() (like fork() does) should be fine in this case. As for copying ->fs, I am not sure actually. core_pattern is global, say it is "|/coredumper". If you change ->root, then exec can fail because that binary is not visible to the coredumping process? Probably we should move core_pattern into ->pid_ns, I dunno. Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 2:34 ` Will Drewry ` (3 preceding siblings ...) (?) @ 2010-09-20 18:50 ` Oleg Nesterov [not found] ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-20 20:28 ` Will Drewry -1 siblings, 2 replies; 43+ messages in thread From: Oleg Nesterov @ 2010-09-20 18:50 UTC (permalink / raw) To: Will Drewry Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On 09/17, Will Drewry wrote: > > On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote: > > > > This looks overcomplicated to me, or I missed something. > > > > I do not understand why should we do this beforehand, and why we need > > copy_namespaces_unattached(). > > > > Can't you just pass current to umh_pipe_setup() (or another helper) as > > the argument? Then this helper can copy ->fs and ->nsproxy itself. > > I wasn't sure if it was reasonable to pass the current task_struct > over, but I certainly can. Why not? current calls call_usermodehelper_exec(), it can't go away until subprocess_info->init() returns, it sleeps on wait_for_completion(). > In practice, this seems to amount to just adding a refcount to all the > namespaces and creating a new nsproxy which isn't really needed. Most > likely, doing what you've suggested above plus the copy_fs_struct and > the swap out will do the trick. I'll try it out and see. That's make > it much clearer I think. Yes, just get_nsproxy() (like fork() does) should be fine in this case. As for copying ->fs, I am not sure actually. core_pattern is global, say it is "|/coredumper". If you change ->root, then exec can fail because that binary is not visible to the coredumping process? Probably we should move core_pattern into ->pid_ns, I dunno. Oleg. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace [not found] ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2010-09-20 20:28 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw) To: Oleg Nesterov Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 09/17, Will Drewry wrote: >> >> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: >> > >> > This looks overcomplicated to me, or I missed something. >> > >> > I do not understand why should we do this beforehand, and why we need >> > copy_namespaces_unattached(). >> > >> > Can't you just pass current to umh_pipe_setup() (or another helper) as >> > the argument? Then this helper can copy ->fs and ->nsproxy itself. >> >> I wasn't sure if it was reasonable to pass the current task_struct >> over, but I certainly can. > > Why not? current calls call_usermodehelper_exec(), it can't go away > until subprocess_info->init() returns, it sleeps on wait_for_completion(). yeah - I wasn't sure because the other coredump_params didn't pass it, so I assumed there was some history around that. Though it sounds like the current approach may not be the way forward anyhow. >> In practice, this seems to amount to just adding a refcount to all the >> namespaces and creating a new nsproxy which isn't really needed. Most >> likely, doing what you've suggested above plus the copy_fs_struct and >> the swap out will do the trick. I'll try it out and see. That's make >> it much clearer I think. > > Yes, just get_nsproxy() (like fork() does) should be fine in this case. > > As for copying ->fs, I am not sure actually. core_pattern is global, > say it is "|/coredumper". If you change ->root, then exec can fail > because that binary is not visible to the coredumping process? Yeah - it's lose-lose I think. On one hand, it may not run, on the other hand it may have access where it shouldn't or not have access it where it needs it. > Probably we should move core_pattern into ->pid_ns, I dunno. Sounds like this is worth doing. I'll look into it and post something for further consideration! thanks again - will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-20 18:50 ` Oleg Nesterov @ 2010-09-20 20:28 ` Will Drewry 2010-09-20 20:28 ` Will Drewry 1 sibling, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw) To: Oleg Nesterov Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg@redhat.com> wrote: > On 09/17, Will Drewry wrote: >> >> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote: >> > >> > This looks overcomplicated to me, or I missed something. >> > >> > I do not understand why should we do this beforehand, and why we need >> > copy_namespaces_unattached(). >> > >> > Can't you just pass current to umh_pipe_setup() (or another helper) as >> > the argument? Then this helper can copy ->fs and ->nsproxy itself. >> >> I wasn't sure if it was reasonable to pass the current task_struct >> over, but I certainly can. > > Why not? current calls call_usermodehelper_exec(), it can't go away > until subprocess_info->init() returns, it sleeps on wait_for_completion(). yeah - I wasn't sure because the other coredump_params didn't pass it, so I assumed there was some history around that. Though it sounds like the current approach may not be the way forward anyhow. >> In practice, this seems to amount to just adding a refcount to all the >> namespaces and creating a new nsproxy which isn't really needed. Most >> likely, doing what you've suggested above plus the copy_fs_struct and >> the swap out will do the trick. I'll try it out and see. That's make >> it much clearer I think. > > Yes, just get_nsproxy() (like fork() does) should be fine in this case. > > As for copying ->fs, I am not sure actually. core_pattern is global, > say it is "|/coredumper". If you change ->root, then exec can fail > because that binary is not visible to the coredumping process? Yeah - it's lose-lose I think. On one hand, it may not run, on the other hand it may have access where it shouldn't or not have access it where it needs it. > Probably we should move core_pattern into ->pid_ns, I dunno. Sounds like this is worth doing. I'll look into it and post something for further consideration! thanks again - will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace @ 2010-09-20 20:28 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:28 UTC (permalink / raw) To: Oleg Nesterov Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Mon, Sep 20, 2010 at 1:50 PM, Oleg Nesterov <oleg@redhat.com> wrote: > On 09/17, Will Drewry wrote: >> >> On Fri, Sep 17, 2010 at 8:29 PM, Oleg Nesterov <oleg@redhat.com> wrote: >> > >> > This looks overcomplicated to me, or I missed something. >> > >> > I do not understand why should we do this beforehand, and why we need >> > copy_namespaces_unattached(). >> > >> > Can't you just pass current to umh_pipe_setup() (or another helper) as >> > the argument? Then this helper can copy ->fs and ->nsproxy itself. >> >> I wasn't sure if it was reasonable to pass the current task_struct >> over, but I certainly can. > > Why not? current calls call_usermodehelper_exec(), it can't go away > until subprocess_info->init() returns, it sleeps on wait_for_completion(). yeah - I wasn't sure because the other coredump_params didn't pass it, so I assumed there was some history around that. Though it sounds like the current approach may not be the way forward anyhow. >> In practice, this seems to amount to just adding a refcount to all the >> namespaces and creating a new nsproxy which isn't really needed. Most >> likely, doing what you've suggested above plus the copy_fs_struct and >> the swap out will do the trick. I'll try it out and see. That's make >> it much clearer I think. > > Yes, just get_nsproxy() (like fork() does) should be fine in this case. > > As for copying ->fs, I am not sure actually. core_pattern is global, > say it is "|/coredumper". If you change ->root, then exec can fail > because that binary is not visible to the coredumping process? Yeah - it's lose-lose I think. On one hand, it may not run, on the other hand it may have access where it shouldn't or not have access it where it needs it. > Probably we should move core_pattern into ->pid_ns, I dunno. Sounds like this is worth doing. I'll look into it and post something for further consideration! thanks again - will -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace [not found] ` <20100918012939.GA25046-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-18 2:34 ` Will Drewry @ 2010-09-18 3:13 ` Will Drewry 1 sibling, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 3:13 UTC (permalink / raw) To: Andi Kleen, linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman Presently, a core_pattern pipe endpoint will be run in the init namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the core dumping process but will have no access to that processes /proc without walking the init namespace /proc looking through all the global pids until it finds the one it thinks matches. One option for fixing this is to change the reported pid: https://patchwork.kernel.org/patch/185912/ However, it's unclear if it is desirable for the core_pattern to run outside the namespace. In particular, it can easily access the mounts via /proc/[pid_nr]/root, but if there is a net namespace, it will not have access. (Originally introduced in 2007 in commit b488893a390edfe027bae7a46e9af8083e740668 ) Instead, this change implements the more complex option two. It migrates the ____call_usermodehelper() thread into the same namespaces as the dumping process. It does not assign a pid in that namespace so the collector will appear to be pid 0 in the namespace. Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> --- v2: dropped patch 1/2 removed use of new copy_namespaces_unattached (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org) moved pipe_params into coredump_params (nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org) fs/exec.c | 46 +++++++++++++++++++++++++++++++++++++++++++++- include/linux/binfmts.h | 9 +++++++++ 2 files changed, 54 insertions(+), 1 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..4cbb735 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -51,6 +51,7 @@ #include <linux/audit.h> #include <linux/tracehook.h> #include <linux/kmod.h> +#include <linux/nsproxy.h> #include <linux/fsnotify.h> #include <linux/fs_struct.h> #include <linux/pipe_fs_i.h> @@ -1820,7 +1821,32 @@ static int umh_pipe_setup(struct subprocess_info *info) struct file *rp, *wp; struct fdtable *fdt; struct coredump_params *cp = (struct coredump_params *)info->data; - struct files_struct *cf = current->files; + struct coredump_pipe_params *pipe_params = &cp->pipe_params; + struct task_struct *tsk = current; + struct files_struct *cf = tsk->files; + struct fs_struct *cfs = tsk->fs; + + /* Migrate this thread into the crashing namespaces, but + * don't change its pid struct to avoid breaking any other + * dependencies. This will mean the process is pid=0 if it + * was migrated into a pid namespace. + */ + if (pipe_params->nsproxy && pipe_params->fs) { + int kill; + switch_task_namespaces(tsk, pipe_params->nsproxy); + pipe_params->nsproxy = NULL; + + task_lock(tsk); + tsk->fs = pipe_params->fs; + task_unlock(tsk); + pipe_params->fs = NULL; + /* Clean up the previous fs struct */ + write_lock(&cfs->lock); + kill = !--cfs->users; + write_unlock(&cfs->lock); + if (kill) + free_fs_struct(cfs); + } wp = create_write_pipe(0); if (IS_ERR(wp)) @@ -1950,10 +1976,28 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) goto fail_dropcount; } + /* Run the core_collector in the crashing namespaces */ + cprm.pipe_params.fs = copy_fs_struct(current->fs); + if (!cprm.pipe_params.fs) { + printk(KERN_WARNING "%s failed to copy fs\n", + __func__); + argv_free(helper_argv); + goto fail_dropcount; + } + get_nsproxy(current->nsproxy); + cprm.pipe_params.nsproxy = current->nsproxy; + retval = call_usermodehelper_fns(helper_argv[0], helper_argv, NULL, UMH_WAIT_EXEC, umh_pipe_setup, NULL, &cprm); argv_free(helper_argv); + /* nsproxy and fs will survive if call_usermodehelper_fns hits + * ENOMEM prior to creating a new thread. + */ + if (cprm.pipe_params.nsproxy) + put_nsproxy(cprm.pipe_params.nsproxy); + if (cprm.pipe_params.fs) /* not in use by anything else */ + free_fs_struct(cprm.pipe_params.fs); if (retval) { printk(KERN_INFO "Core dump to %s pipe failed\n", corename); diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index a065612..2629603 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -68,6 +68,12 @@ struct linux_binprm{ #define BINPRM_MAX_RECURSION 4 +/* Parameters to be passed to a pipe target during coredump */ +struct coredump_pipe_params { + struct nsproxy *nsproxy; + struct fs_struct *fs; +}; + /* Function parameter for binfmt->coredump */ struct coredump_params { long signr; @@ -75,6 +81,9 @@ struct coredump_params { struct file *file; unsigned long limit; unsigned long mm_flags; + union { + struct coredump_pipe_params pipe_params; + }; }; /* -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 1:29 ` Oleg Nesterov (?) (?) @ 2010-09-18 3:13 ` Will Drewry 2010-09-20 18:34 ` Eric W. Biederman [not found] ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> -1 siblings, 2 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 3:13 UTC (permalink / raw) To: Andi Kleen, linux-kernel Cc: Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel, Will Drewry Presently, a core_pattern pipe endpoint will be run in the init namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the core dumping process but will have no access to that processes /proc without walking the init namespace /proc looking through all the global pids until it finds the one it thinks matches. One option for fixing this is to change the reported pid: https://patchwork.kernel.org/patch/185912/ However, it's unclear if it is desirable for the core_pattern to run outside the namespace. In particular, it can easily access the mounts via /proc/[pid_nr]/root, but if there is a net namespace, it will not have access. (Originally introduced in 2007 in commit b488893a390edfe027bae7a46e9af8083e740668 ) Instead, this change implements the more complex option two. It migrates the ____call_usermodehelper() thread into the same namespaces as the dumping process. It does not assign a pid in that namespace so the collector will appear to be pid 0 in the namespace. Signed-off-by: Will Drewry <wad@chromium.org> --- v2: dropped patch 1/2 removed use of new copy_namespaces_unattached (oleg@redhat.com) moved pipe_params into coredump_params (nhorman@tuxdriver.com) fs/exec.c | 46 +++++++++++++++++++++++++++++++++++++++++++++- include/linux/binfmts.h | 9 +++++++++ 2 files changed, 54 insertions(+), 1 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..4cbb735 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -51,6 +51,7 @@ #include <linux/audit.h> #include <linux/tracehook.h> #include <linux/kmod.h> +#include <linux/nsproxy.h> #include <linux/fsnotify.h> #include <linux/fs_struct.h> #include <linux/pipe_fs_i.h> @@ -1820,7 +1821,32 @@ static int umh_pipe_setup(struct subprocess_info *info) struct file *rp, *wp; struct fdtable *fdt; struct coredump_params *cp = (struct coredump_params *)info->data; - struct files_struct *cf = current->files; + struct coredump_pipe_params *pipe_params = &cp->pipe_params; + struct task_struct *tsk = current; + struct files_struct *cf = tsk->files; + struct fs_struct *cfs = tsk->fs; + + /* Migrate this thread into the crashing namespaces, but + * don't change its pid struct to avoid breaking any other + * dependencies. This will mean the process is pid=0 if it + * was migrated into a pid namespace. + */ + if (pipe_params->nsproxy && pipe_params->fs) { + int kill; + switch_task_namespaces(tsk, pipe_params->nsproxy); + pipe_params->nsproxy = NULL; + + task_lock(tsk); + tsk->fs = pipe_params->fs; + task_unlock(tsk); + pipe_params->fs = NULL; + /* Clean up the previous fs struct */ + write_lock(&cfs->lock); + kill = !--cfs->users; + write_unlock(&cfs->lock); + if (kill) + free_fs_struct(cfs); + } wp = create_write_pipe(0); if (IS_ERR(wp)) @@ -1950,10 +1976,28 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) goto fail_dropcount; } + /* Run the core_collector in the crashing namespaces */ + cprm.pipe_params.fs = copy_fs_struct(current->fs); + if (!cprm.pipe_params.fs) { + printk(KERN_WARNING "%s failed to copy fs\n", + __func__); + argv_free(helper_argv); + goto fail_dropcount; + } + get_nsproxy(current->nsproxy); + cprm.pipe_params.nsproxy = current->nsproxy; + retval = call_usermodehelper_fns(helper_argv[0], helper_argv, NULL, UMH_WAIT_EXEC, umh_pipe_setup, NULL, &cprm); argv_free(helper_argv); + /* nsproxy and fs will survive if call_usermodehelper_fns hits + * ENOMEM prior to creating a new thread. + */ + if (cprm.pipe_params.nsproxy) + put_nsproxy(cprm.pipe_params.nsproxy); + if (cprm.pipe_params.fs) /* not in use by anything else */ + free_fs_struct(cprm.pipe_params.fs); if (retval) { printk(KERN_INFO "Core dump to %s pipe failed\n", corename); diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index a065612..2629603 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -68,6 +68,12 @@ struct linux_binprm{ #define BINPRM_MAX_RECURSION 4 +/* Parameters to be passed to a pipe target during coredump */ +struct coredump_pipe_params { + struct nsproxy *nsproxy; + struct fs_struct *fs; +}; + /* Function parameter for binfmt->coredump */ struct coredump_params { long signr; @@ -75,6 +81,9 @@ struct coredump_params { struct file *file; unsigned long limit; unsigned long mm_flags; + union { + struct coredump_pipe_params pipe_params; + }; }; /* -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace 2010-09-18 3:13 ` Will Drewry @ 2010-09-20 18:34 ` Eric W. Biederman 2010-09-20 19:12 ` Andi Kleen [not found] ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> [not found] ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 1 sibling, 2 replies; 43+ messages in thread From: Eric W. Biederman @ 2010-09-20 18:34 UTC (permalink / raw) To: Will Drewry Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel Will Drewry <wad@chromium.org> writes: > Presently, a core_pattern pipe endpoint will be run in the init > namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the > core dumping process but will have no access to that processes /proc > without walking the init namespace /proc looking through all the global > pids until it finds the one it thinks matches. > > One option for fixing this is to change the reported pid: > https://patchwork.kernel.org/patch/185912/ > However, it's unclear if it is desirable for the core_pattern to run > outside the namespace. In particular, it can easily access the mounts > via /proc/[pid_nr]/root, but if there is a net namespace, it will not > have access. (Originally introduced in 2007 in commit > b488893a390edfe027bae7a46e9af8083e740668 ) > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0 in the namespace. I like the direction this is going but as structured this looks like a security exploit waiting to happen. The pipe process needs to run in the namespaces of the process who set the core pattern, not in the namespaces of the dumping process. Otherwise it is possible to trigger a privileged process to run in a context where it's reality that it expected, causing it to misuse it's privileges. Even if we don't have a privilege problem I think we will have a case of mismatched functionality where the core pattern will not work as expected. I believe the solution here is to move the core pattern into the pid namespace and to capture the namespaces when set, but if the core pattern hasn't been set to use the core pattern of the parent pid namespace. Eric ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace 2010-09-20 18:34 ` Eric W. Biederman @ 2010-09-20 19:12 ` Andi Kleen 2010-09-20 20:26 ` Will Drewry [not found] ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org> [not found] ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> 1 sibling, 2 replies; 43+ messages in thread From: Andi Kleen @ 2010-09-20 19:12 UTC (permalink / raw) To: Eric W. Biederman Cc: Will Drewry, Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel > The pipe process needs to run in the namespaces of the process who set > the core pattern, not in the namespaces of the dumping process. > Otherwise it is possible to trigger a privileged process to run in a > context where it's reality that it expected, causing it to misuse > it's privileges. Even if we don't have a privilege problem I think > we will have a case of mismatched functionality where the core pattern > will not work as expected. For me it seems rather the other way around: running the helper in some highly priviledged namespace is more dangerous. If it runs in the same context as the crasher it can do the least amount of damage relative to the crash process. And as Will pointed out it's the only sane way to deal with net namespaces. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace 2010-09-20 19:12 ` Andi Kleen @ 2010-09-20 20:26 ` Will Drewry [not found] ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org> 1 sibling, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw) To: Andi Kleen Cc: Eric W. Biederman, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi@firstfloor.org> wrote: >> The pipe process needs to run in the namespaces of the process who set >> the core pattern, not in the namespaces of the dumping process. >> Otherwise it is possible to trigger a privileged process to run in a >> context where it's reality that it expected, causing it to misuse >> it's privileges. Even if we don't have a privilege problem I think >> we will have a case of mismatched functionality where the core pattern >> will not work as expected. > > For me it seems rather the other way around: running the helper in some > highly priviledged namespace is more dangerous. If it runs in the > same context as the crasher it can do the least amount of damage > relative to the crash process. > > And as Will pointed out it's the only sane way to deal with net namespaces. I think you're both right. How it is implemented right now is an escape from the linux container. If you allow the root user in a container to mount proc and update core_pattern, they can escape. (core_pattern = |/well/known/binary_or_scripting_lang) I'm sure there are other escapes too (and any umh call is likely an escape like this one -- e.g., modprobe_path). That said, using my patch above might let you traverse a path otherwise blocked by an LSM enforcement (E.g., root user runs a process which sets up a vfs namespace with an encrypted mount and the lsm blocks access to the /proc/[pid]/root - but core_pattern still runs and with access). That said, using the setters namespace makes sense to me as a consumer of core_pattern too. You can set the core_pattern outside of a chroot/container and collect core dumps there _or_ you can let a "root" user in a container have their own core collector without providing a simple escape. Making format_corename use the correct pid namespace for translation would make these cases even more seamless. Unfortunately, I haven't yet looked at doing it that way yet. The namespace-transition patch posted is what occurred to me initially. Perhaps it won't be so hard. I'll take a look at what it'd take to do move core_pattern since it'd resolve both the escape/lsm-bypass scenarios and the mismatch between the arbitrary namespace and the core_pattern values. Thanks! will ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace @ 2010-09-20 20:26 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw) To: Andi Kleen Cc: Eric W. Biederman, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Neil Horman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi@firstfloor.org> wrote: >> The pipe process needs to run in the namespaces of the process who set >> the core pattern, not in the namespaces of the dumping process. >> Otherwise it is possible to trigger a privileged process to run in a >> context where it's reality that it expected, causing it to misuse >> it's privileges. Even if we don't have a privilege problem I think >> we will have a case of mismatched functionality where the core pattern >> will not work as expected. > > For me it seems rather the other way around: running the helper in some > highly priviledged namespace is more dangerous. If it runs in the > same context as the crasher it can do the least amount of damage > relative to the crash process. > > And as Will pointed out it's the only sane way to deal with net namespaces. I think you're both right. How it is implemented right now is an escape from the linux container. If you allow the root user in a container to mount proc and update core_pattern, they can escape. (core_pattern = |/well/known/binary_or_scripting_lang) I'm sure there are other escapes too (and any umh call is likely an escape like this one -- e.g., modprobe_path). That said, using my patch above might let you traverse a path otherwise blocked by an LSM enforcement (E.g., root user runs a process which sets up a vfs namespace with an encrypted mount and the lsm blocks access to the /proc/[pid]/root - but core_pattern still runs and with access). That said, using the setters namespace makes sense to me as a consumer of core_pattern too. You can set the core_pattern outside of a chroot/container and collect core dumps there _or_ you can let a "root" user in a container have their own core collector without providing a simple escape. Making format_corename use the correct pid namespace for translation would make these cases even more seamless. Unfortunately, I haven't yet looked at doing it that way yet. The namespace-transition patch posted is what occurred to me initially. Perhaps it won't be so hard. I'll take a look at what it'd take to do move core_pattern since it'd resolve both the escape/lsm-bypass scenarios and the mismatch between the arbitrary namespace and the core_pattern values. Thanks! will -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>]
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace [not found] ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org> @ 2010-09-20 20:26 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-20 20:26 UTC (permalink / raw) To: Andi Kleen Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Alexander Viro On Mon, Sep 20, 2010 at 2:12 PM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote: >> The pipe process needs to run in the namespaces of the process who set >> the core pattern, not in the namespaces of the dumping process. >> Otherwise it is possible to trigger a privileged process to run in a >> context where it's reality that it expected, causing it to misuse >> it's privileges. Even if we don't have a privilege problem I think >> we will have a case of mismatched functionality where the core pattern >> will not work as expected. > > For me it seems rather the other way around: running the helper in some > highly priviledged namespace is more dangerous. If it runs in the > same context as the crasher it can do the least amount of damage > relative to the crash process. > > And as Will pointed out it's the only sane way to deal with net namespaces. I think you're both right. How it is implemented right now is an escape from the linux container. If you allow the root user in a container to mount proc and update core_pattern, they can escape. (core_pattern = |/well/known/binary_or_scripting_lang) I'm sure there are other escapes too (and any umh call is likely an escape like this one -- e.g., modprobe_path). That said, using my patch above might let you traverse a path otherwise blocked by an LSM enforcement (E.g., root user runs a process which sets up a vfs namespace with an encrypted mount and the lsm blocks access to the /proc/[pid]/root - but core_pattern still runs and with access). That said, using the setters namespace makes sense to me as a consumer of core_pattern too. You can set the core_pattern outside of a chroot/container and collect core dumps there _or_ you can let a "root" user in a container have their own core collector without providing a simple escape. Making format_corename use the correct pid namespace for translation would make these cases even more seamless. Unfortunately, I haven't yet looked at doing it that way yet. The namespace-transition patch posted is what occurred to me initially. Perhaps it won't be so hard. I'll take a look at what it'd take to do move core_pattern since it'd resolve both the escape/lsm-bypass scenarios and the mismatch between the arbitrary namespace and the core_pattern values. Thanks! will ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>]
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace [not found] ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> @ 2010-09-20 19:12 ` Andi Kleen 0 siblings, 0 replies; 43+ messages in thread From: Andi Kleen @ 2010-09-20 19:12 UTC (permalink / raw) To: Eric W. Biederman Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Oleg Nesterov, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath > The pipe process needs to run in the namespaces of the process who set > the core pattern, not in the namespaces of the dumping process. > Otherwise it is possible to trigger a privileged process to run in a > context where it's reality that it expected, causing it to misuse > it's privileges. Even if we don't have a privilege problem I think > we will have a case of mismatched functionality where the core pattern > will not work as expected. For me it seems rather the other way around: running the helper in some highly priviledged namespace is more dangerous. If it runs in the same context as the crasher it can do the least amount of damage relative to the crash process. And as Will pointed out it's the only sane way to deal with net namespaces. -Andi -- ak-VuQAYsv1563Yd54FQh9/CA@public.gmane.org -- Speaking for myself only. ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH][RFC] v2 exec: move core_pattern pipe helper into the crashing namespace [not found] ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> @ 2010-09-20 18:34 ` Eric W. Biederman 0 siblings, 0 replies; 43+ messages in thread From: Eric W. Biederman @ 2010-09-20 18:34 UTC (permalink / raw) To: Will Drewry Cc: Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> writes: > Presently, a core_pattern pipe endpoint will be run in the init > namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the > core dumping process but will have no access to that processes /proc > without walking the init namespace /proc looking through all the global > pids until it finds the one it thinks matches. > > One option for fixing this is to change the reported pid: > https://patchwork.kernel.org/patch/185912/ > However, it's unclear if it is desirable for the core_pattern to run > outside the namespace. In particular, it can easily access the mounts > via /proc/[pid_nr]/root, but if there is a net namespace, it will not > have access. (Originally introduced in 2007 in commit > b488893a390edfe027bae7a46e9af8083e740668 ) > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0 in the namespace. I like the direction this is going but as structured this looks like a security exploit waiting to happen. The pipe process needs to run in the namespaces of the process who set the core pattern, not in the namespaces of the dumping process. Otherwise it is possible to trigger a privileged process to run in a context where it's reality that it expected, causing it to misuse it's privileges. Even if we don't have a privilege problem I think we will have a case of mismatched functionality where the core pattern will not work as expected. I believe the solution here is to move the core pattern into the pid namespace and to capture the namespaces when set, but if the core pattern hasn't been set to use the core pattern of the parent pid namespace. Eric ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry [not found] ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> @ 2010-09-17 18:15 ` Neil Horman [not found] ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2010-09-18 2:33 ` Will Drewry 1 sibling, 2 replies; 43+ messages in thread From: Neil Horman @ 2010-09-17 18:15 UTC (permalink / raw) To: Will Drewry Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote: > Presently, a core_pattern pipe endpoint will be run in the init > namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the > core dumping process but will have no access to that processes /proc > without walking the init namespace /proc looking through all the global > pids until it finds the one it thinks matches. > > One option for implementing this I've already posed: > https://patchwork.kernel.org/patch/185912/ > However, it's unclear if it is desirable for the core_pattern to run > outside the namespace. In particular, it can easily access the mounts > via /proc/[pid_nr]/root, but if there is a net namespace, it will not > have access. (Originally introduced in 2007 in commit > b488893a390edfe027bae7a46e9af8083e740668 ) > > Instead, this change implements the more complex option two. It > migrates the ____call_usermodehelper() thread into the same namespaces > as the dumping process. It does not assign a pid in that namespace so > the collector will appear to be pid 0. (Using alloc_pid and change_pid > are options though. I avoided it because kernel_thread's returned pid > will then mismatch the actual threads pid.) > > Signed-off-by: Will Drewry <wad@chromium.org> > --- > fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- > 1 files changed, 53 insertions(+), 3 deletions(-) > > diff --git a/fs/exec.c b/fs/exec.c > index 828dd24..3b82607 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -51,6 +51,7 @@ > #include <linux/audit.h> > #include <linux/tracehook.h> > #include <linux/kmod.h> > +#include <linux/nsproxy.h> > #include <linux/fsnotify.h> > #include <linux/fs_struct.h> > #include <linux/pipe_fs_i.h> > @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; > unsigned int core_pipe_limit; > int suid_dumpable = 0; > > +struct coredump_pipe_params { > + struct coredump_params *params; > + struct nsproxy *nsproxy; > + struct fs_struct *fs; > +}; > + Generally, I like this approach, but i think perhaps it would be better if you made the coredump_pipe_params struct a member of the coredump_params structure (perhaps within an anonymous union, so you can handle future core dump types, if those ever come along). That will save you needing the extra pointer to the coredump_params structure itself. > /* The maximal length of core_pattern is also specified in sysctl.c */ > > static LIST_HEAD(formats); > @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) > { > struct file *rp, *wp; > struct fdtable *fdt; > - struct coredump_params *cp = (struct coredump_params *)info->data; > - struct files_struct *cf = current->files; > + struct coredump_pipe_params *pipe_params = > + (struct coredump_pipe_params *)info->data; If you do the above, you can just reference this as cp->pipe_params, or what-not. > + struct coredump_params *cp = pipe_params->params; > + struct task_struct *tsk = current; > + struct files_struct *cf = tsk->files; > + struct fs_struct *cfs = tsk->fs; > + > + /* Migrate this thread into the crashing namespaces, but > + * don't change its pid struct to avoid breaking any other > + * dependencies. This will mean the process is pid=0 if it > + * was migrated into a pid namespace. > + */ > + if (pipe_params->nsproxy && pipe_params->fs) { > + int kill; > + switch_task_namespaces(tsk, pipe_params->nsproxy); > + pipe_params->nsproxy = NULL; > + > + task_lock(tsk); > + tsk->fs = pipe_params->fs; > + task_unlock(tsk); > + pipe_params->fs = NULL; > + /* Clean up the previous fs struct */ > + write_lock(&cfs->lock); > + kill = !--cfs->users; > + write_unlock(&cfs->lock); > + if (kill) > + free_fs_struct(cfs); > + } > > wp = create_write_pipe(0); > if (IS_ERR(wp)) > @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) > if (ispipe) { > int dump_count; > char **helper_argv; > + struct coredump_pipe_params pipe_params = { .params = &cprm }; > And you won't have to allocate this, it will just be part of the coredump_params. > if (cprm.limit == 1) { > /* > @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) > goto fail_dropcount; > } > > + /* Run the core_collector in the crashing namespaces */ > + if (copy_namespaces_unattached(0, current, > + &pipe_params.nsproxy, &pipe_params.fs)) { > + printk(KERN_WARNING "%s failed to copy namespaces\n", > + __func__); > + argv_free(helper_argv); > + goto fail_dropcount; > + } > + > retval = call_usermodehelper_fns(helper_argv[0], helper_argv, > NULL, UMH_WAIT_EXEC, umh_pipe_setup, > - NULL, &cprm); > + NULL, &pipe_params); And you can pass a pointer to coredump_params here, instead of a pointer to a pipe_specific coredumps, to a function that isn't specific to pipes > argv_free(helper_argv); > + /* nsproxy and fs will survive if call_usermodehelper_fns hits > + * ENOMEM prior to creating a new thread. > + */ > + if (pipe_params.nsproxy) > + put_nsproxy(pipe_params.nsproxy); > + if (pipe_params.fs) /* not in use by anything else */ > + free_fs_struct(pipe_params.fs); > if (retval) { > printk(KERN_INFO "Core dump to %s pipe failed\n", > corename); > -- > 1.7.0.4 > > Regards Neil ^ permalink raw reply [flat|nested] 43+ messages in thread
[parent not found: <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace [not found] ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2010-09-18 2:33 ` Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 2:33 UTC (permalink / raw) To: Neil Horman Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Eugene Teo, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Alexander Viro, KOSAKI Motohiro, Tejun Heo, Serge Hallyn, Andrew Morton, Alexey Dobriyan, Roland McGrath, Eric W. Biederman On Fri, Sep 17, 2010 at 1:15 PM, Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org> wrote: > On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote: >> Presently, a core_pattern pipe endpoint will be run in the init >> namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the >> core dumping process but will have no access to that processes /proc >> without walking the init namespace /proc looking through all the global >> pids until it finds the one it thinks matches. >> >> One option for implementing this I've already posed: >> https://patchwork.kernel.org/patch/185912/ >> However, it's unclear if it is desirable for the core_pattern to run >> outside the namespace. In particular, it can easily access the mounts >> via /proc/[pid_nr]/root, but if there is a net namespace, it will not >> have access. (Originally introduced in 2007 in commit >> b488893a390edfe027bae7a46e9af8083e740668 ) >> >> Instead, this change implements the more complex option two. It >> migrates the ____call_usermodehelper() thread into the same namespaces >> as the dumping process. It does not assign a pid in that namespace so >> the collector will appear to be pid 0. (Using alloc_pid and change_pid >> are options though. I avoided it because kernel_thread's returned pid >> will then mismatch the actual threads pid.) >> >> Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >> --- >> fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- >> 1 files changed, 53 insertions(+), 3 deletions(-) >> >> diff --git a/fs/exec.c b/fs/exec.c >> index 828dd24..3b82607 100644 >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -51,6 +51,7 @@ >> #include <linux/audit.h> >> #include <linux/tracehook.h> >> #include <linux/kmod.h> >> +#include <linux/nsproxy.h> >> #include <linux/fsnotify.h> >> #include <linux/fs_struct.h> >> #include <linux/pipe_fs_i.h> >> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; >> unsigned int core_pipe_limit; >> int suid_dumpable = 0; >> >> +struct coredump_pipe_params { >> + struct coredump_params *params; >> + struct nsproxy *nsproxy; >> + struct fs_struct *fs; >> +}; >> + > Generally, I like this approach, but i think perhaps it would be better if you > made the coredump_pipe_params struct a member of the coredump_params structure > (perhaps within an anonymous union, so you can handle future core dump types, if > those ever come along). That will save you needing the extra pointer to the > coredump_params structure itself. I'll rework it as suggested while also taking into account Oleg's feedback (another mail coming). Thanks! >> /* The maximal length of core_pattern is also specified in sysctl.c */ >> >> static LIST_HEAD(formats); >> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) >> { >> struct file *rp, *wp; >> struct fdtable *fdt; >> - struct coredump_params *cp = (struct coredump_params *)info->data; >> - struct files_struct *cf = current->files; >> + struct coredump_pipe_params *pipe_params = >> + (struct coredump_pipe_params *)info->data; > If you do the above, you can just reference this as cp->pipe_params, or > what-not. > >> + struct coredump_params *cp = pipe_params->params; >> + struct task_struct *tsk = current; >> + struct files_struct *cf = tsk->files; >> + struct fs_struct *cfs = tsk->fs; >> + >> + /* Migrate this thread into the crashing namespaces, but >> + * don't change its pid struct to avoid breaking any other >> + * dependencies. This will mean the process is pid=0 if it >> + * was migrated into a pid namespace. >> + */ >> + if (pipe_params->nsproxy && pipe_params->fs) { >> + int kill; >> + switch_task_namespaces(tsk, pipe_params->nsproxy); >> + pipe_params->nsproxy = NULL; >> + >> + task_lock(tsk); >> + tsk->fs = pipe_params->fs; >> + task_unlock(tsk); >> + pipe_params->fs = NULL; >> + /* Clean up the previous fs struct */ >> + write_lock(&cfs->lock); >> + kill = !--cfs->users; >> + write_unlock(&cfs->lock); >> + if (kill) >> + free_fs_struct(cfs); >> + } >> >> wp = create_write_pipe(0); >> if (IS_ERR(wp)) >> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) >> if (ispipe) { >> int dump_count; >> char **helper_argv; >> + struct coredump_pipe_params pipe_params = { .params = &cprm }; >> > And you won't have to allocate this, it will just be part of the > coredump_params. > >> if (cprm.limit == 1) { >> /* >> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) >> goto fail_dropcount; >> } >> >> + /* Run the core_collector in the crashing namespaces */ >> + if (copy_namespaces_unattached(0, current, >> + &pipe_params.nsproxy, &pipe_params.fs)) { >> + printk(KERN_WARNING "%s failed to copy namespaces\n", >> + __func__); >> + argv_free(helper_argv); >> + goto fail_dropcount; >> + } >> + >> retval = call_usermodehelper_fns(helper_argv[0], helper_argv, >> NULL, UMH_WAIT_EXEC, umh_pipe_setup, >> - NULL, &cprm); >> + NULL, &pipe_params); > And you can pass a pointer to coredump_params here, instead of a pointer to a > pipe_specific coredumps, to a function that isn't specific to pipes > >> argv_free(helper_argv); >> + /* nsproxy and fs will survive if call_usermodehelper_fns hits >> + * ENOMEM prior to creating a new thread. >> + */ >> + if (pipe_params.nsproxy) >> + put_nsproxy(pipe_params.nsproxy); >> + if (pipe_params.fs) /* not in use by anything else */ >> + free_fs_struct(pipe_params.fs); >> if (retval) { >> printk(KERN_INFO "Core dump to %s pipe failed\n", >> corename); >> -- >> 1.7.0.4 >> >> > > > Regards > Neil > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace 2010-09-17 18:15 ` [PATCH 2/2] " Neil Horman [not found] ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2010-09-18 2:33 ` Will Drewry 1 sibling, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-18 2:33 UTC (permalink / raw) To: Neil Horman Cc: Andi Kleen, linux-kernel, Alexander Viro, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Roland McGrath, Eric W. Biederman, containers, Eugene Teo, Tejun Heo, Serge Hallyn, Alexey Dobriyan, linux-fsdevel On Fri, Sep 17, 2010 at 1:15 PM, Neil Horman <nhorman@tuxdriver.com> wrote: > On Fri, Sep 17, 2010 at 10:16:58AM -0500, Will Drewry wrote: >> Presently, a core_pattern pipe endpoint will be run in the init >> namespace. It will receive the virtual pid (task_tgid_vnr->%p) of the >> core dumping process but will have no access to that processes /proc >> without walking the init namespace /proc looking through all the global >> pids until it finds the one it thinks matches. >> >> One option for implementing this I've already posed: >> https://patchwork.kernel.org/patch/185912/ >> However, it's unclear if it is desirable for the core_pattern to run >> outside the namespace. In particular, it can easily access the mounts >> via /proc/[pid_nr]/root, but if there is a net namespace, it will not >> have access. (Originally introduced in 2007 in commit >> b488893a390edfe027bae7a46e9af8083e740668 ) >> >> Instead, this change implements the more complex option two. It >> migrates the ____call_usermodehelper() thread into the same namespaces >> as the dumping process. It does not assign a pid in that namespace so >> the collector will appear to be pid 0. (Using alloc_pid and change_pid >> are options though. I avoided it because kernel_thread's returned pid >> will then mismatch the actual threads pid.) >> >> Signed-off-by: Will Drewry <wad@chromium.org> >> --- >> fs/exec.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--- >> 1 files changed, 53 insertions(+), 3 deletions(-) >> >> diff --git a/fs/exec.c b/fs/exec.c >> index 828dd24..3b82607 100644 >> --- a/fs/exec.c >> +++ b/fs/exec.c >> @@ -51,6 +51,7 @@ >> #include <linux/audit.h> >> #include <linux/tracehook.h> >> #include <linux/kmod.h> >> +#include <linux/nsproxy.h> >> #include <linux/fsnotify.h> >> #include <linux/fs_struct.h> >> #include <linux/pipe_fs_i.h> >> @@ -65,6 +66,12 @@ char core_pattern[CORENAME_MAX_SIZE] = "core"; >> unsigned int core_pipe_limit; >> int suid_dumpable = 0; >> >> +struct coredump_pipe_params { >> + struct coredump_params *params; >> + struct nsproxy *nsproxy; >> + struct fs_struct *fs; >> +}; >> + > Generally, I like this approach, but i think perhaps it would be better if you > made the coredump_pipe_params struct a member of the coredump_params structure > (perhaps within an anonymous union, so you can handle future core dump types, if > those ever come along). That will save you needing the extra pointer to the > coredump_params structure itself. I'll rework it as suggested while also taking into account Oleg's feedback (another mail coming). Thanks! >> /* The maximal length of core_pattern is also specified in sysctl.c */ >> >> static LIST_HEAD(formats); >> @@ -1819,8 +1826,34 @@ static int umh_pipe_setup(struct subprocess_info *info) >> { >> struct file *rp, *wp; >> struct fdtable *fdt; >> - struct coredump_params *cp = (struct coredump_params *)info->data; >> - struct files_struct *cf = current->files; >> + struct coredump_pipe_params *pipe_params = >> + (struct coredump_pipe_params *)info->data; > If you do the above, you can just reference this as cp->pipe_params, or > what-not. > >> + struct coredump_params *cp = pipe_params->params; >> + struct task_struct *tsk = current; >> + struct files_struct *cf = tsk->files; >> + struct fs_struct *cfs = tsk->fs; >> + >> + /* Migrate this thread into the crashing namespaces, but >> + * don't change its pid struct to avoid breaking any other >> + * dependencies. This will mean the process is pid=0 if it >> + * was migrated into a pid namespace. >> + */ >> + if (pipe_params->nsproxy && pipe_params->fs) { >> + int kill; >> + switch_task_namespaces(tsk, pipe_params->nsproxy); >> + pipe_params->nsproxy = NULL; >> + >> + task_lock(tsk); >> + tsk->fs = pipe_params->fs; >> + task_unlock(tsk); >> + pipe_params->fs = NULL; >> + /* Clean up the previous fs struct */ >> + write_lock(&cfs->lock); >> + kill = !--cfs->users; >> + write_unlock(&cfs->lock); >> + if (kill) >> + free_fs_struct(cfs); >> + } >> >> wp = create_write_pipe(0); >> if (IS_ERR(wp)) >> @@ -1911,6 +1944,7 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) >> if (ispipe) { >> int dump_count; >> char **helper_argv; >> + struct coredump_pipe_params pipe_params = { .params = &cprm }; >> > And you won't have to allocate this, it will just be part of the > coredump_params. > >> if (cprm.limit == 1) { >> /* >> @@ -1950,10 +1984,26 @@ void do_coredump(long signr, int exit_code, struct pt_regs *regs) >> goto fail_dropcount; >> } >> >> + /* Run the core_collector in the crashing namespaces */ >> + if (copy_namespaces_unattached(0, current, >> + &pipe_params.nsproxy, &pipe_params.fs)) { >> + printk(KERN_WARNING "%s failed to copy namespaces\n", >> + __func__); >> + argv_free(helper_argv); >> + goto fail_dropcount; >> + } >> + >> retval = call_usermodehelper_fns(helper_argv[0], helper_argv, >> NULL, UMH_WAIT_EXEC, umh_pipe_setup, >> - NULL, &cprm); >> + NULL, &pipe_params); > And you can pass a pointer to coredump_params here, instead of a pointer to a > pipe_specific coredumps, to a function that isn't specific to pipes > >> argv_free(helper_argv); >> + /* nsproxy and fs will survive if call_usermodehelper_fns hits >> + * ENOMEM prior to creating a new thread. >> + */ >> + if (pipe_params.nsproxy) >> + put_nsproxy(pipe_params.nsproxy); >> + if (pipe_params.fs) /* not in use by anything else */ >> + free_fs_struct(pipe_params.fs); >> if (retval) { >> printk(KERN_INFO "Core dump to %s pipe failed\n", >> corename); >> -- >> 1.7.0.4 >> >> > > > Regards > Neil > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper @ 2010-09-16 18:59 Will Drewry 0 siblings, 0 replies; 43+ messages in thread From: Will Drewry @ 2010-09-16 18:59 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: Will Drewry, Neil Horman, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oleg Nesterov, Andi Kleen, Alexander Viro, KOSAKI Motohiro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Roland McGrath, Eric W. Biederman format_corename uses task_tgid_vnr to provide the numeric pid of a core-dumping process. For file-based coredumps, this is perfectly satisfactory. However, when the core_pattern contains a pipe, the substituted PID is invalid in the namespace of the core_pattern pipe helper, the init namespace. By changing this, any core collector may now find the process in the init namespace /proc. This helps with VFS namespacing too since the mount root is available via /proc. Unfortunately, it does not help in cases of more complex namespacing, like net namespaces. For that, the helper thread will need to be migrated to the core-dump namespaces. I have a separate patch series which implements migrating the ____call_usermodehelper thread to the coredump namespace, but it adds a fair amount of complexity which might be better handled by someone who understands that code. I'm happy to mail it out as well though (it works, but I don't assign a namespaced pid which may open up issues on its own). Any and all comments, feedback will be appreciated! Signed-off-by: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> --- fs/exec.c | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 828dd24..0b8a874 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1467,6 +1467,13 @@ static int format_corename(char *corename, long signr) char *const out_end = corename + CORENAME_MAX_SIZE; int rc; int pid_in_pattern = 0; + pid_t pid = task_tgid_vnr(current); + + /* The pipe helper runs in the init namespace and should + * receive the matching pid until that changes. + */ + if (ispipe) + pid = task_tgid_nr(current); /* Repeat as long as we have more pattern to process and more output space */ @@ -1489,7 +1496,7 @@ static int format_corename(char *corename, long signr) case 'p': pid_in_pattern = 1; rc = snprintf(out_ptr, out_end - out_ptr, - "%d", task_tgid_vnr(current)); + "%d", pid); if (rc > out_end - out_ptr) goto out; out_ptr += rc; @@ -1568,7 +1575,7 @@ static int format_corename(char *corename, long signr) * the filename. Do not do this for piped commands. */ if (!ispipe && !pid_in_pattern && core_uses_pid) { rc = snprintf(out_ptr, out_end - out_ptr, - ".%d", task_tgid_vnr(current)); + ".%d", pid); if (rc > out_end - out_ptr) goto out; out_ptr += rc; -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 43+ messages in thread
end of thread, other threads:[~2010-09-20 20:28 UTC | newest] Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry [not found] ` <1284663599-3549-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-16 19:35 ` Oleg Nesterov 2010-09-17 13:26 ` Andi Kleen 2010-09-16 19:35 ` Oleg Nesterov [not found] ` <20100916193543.GA11016-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-16 20:12 ` Eric W. Biederman 2010-09-16 20:12 ` Eric W. Biederman 2010-09-16 21:02 ` Will Drewry [not found] ` <m1zkvh4fdc.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> 2010-09-16 21:02 ` Will Drewry 2010-09-17 19:08 ` Roland McGrath 2010-09-17 19:08 ` Roland McGrath 2010-09-17 13:26 ` Andi Kleen 2010-09-17 14:52 ` Will Drewry 2010-09-17 14:52 ` Will Drewry [not found] ` <20100917152639.0e88341a-3rXA9MLqAseW/qJFnhkgxti2O/JbrIOy@public.gmane.org> 2010-09-17 14:52 ` Will Drewry 2010-09-17 15:16 ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry 2010-09-17 15:16 ` [PATCH 1/2] nsproxy: add copy_namespaces_unattached Will Drewry 2010-09-17 15:16 ` [PATCH 2/2] exec: move core_pattern pipe helper into the crashing namespace Will Drewry [not found] ` <1284736618-27153-2-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-17 18:15 ` Neil Horman 2010-09-18 1:29 ` Oleg Nesterov 2010-09-18 1:29 ` Oleg Nesterov [not found] ` <20100918012939.GA25046-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-18 2:34 ` Will Drewry 2010-09-18 2:34 ` Will Drewry 2010-09-18 3:14 ` Will Drewry 2010-09-18 3:14 ` Will Drewry 2010-09-20 18:50 ` Oleg Nesterov 2010-09-20 18:50 ` Oleg Nesterov [not found] ` <20100920185001.GA955-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2010-09-20 20:28 ` Will Drewry 2010-09-20 20:28 ` Will Drewry 2010-09-20 20:28 ` Will Drewry 2010-09-18 3:13 ` [PATCH][RFC] v2 " Will Drewry 2010-09-18 3:13 ` Will Drewry 2010-09-20 18:34 ` Eric W. Biederman 2010-09-20 19:12 ` Andi Kleen 2010-09-20 20:26 ` Will Drewry 2010-09-20 20:26 ` Will Drewry [not found] ` <20100920191214.GB7496-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org> 2010-09-20 20:26 ` Will Drewry [not found] ` <m1eico1cyv.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org> 2010-09-20 19:12 ` Andi Kleen [not found] ` <1284779629-15273-1-git-send-email-wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> 2010-09-20 18:34 ` Eric W. Biederman 2010-09-17 18:15 ` [PATCH 2/2] " Neil Horman [not found] ` <20100917181556.GA2499-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2010-09-18 2:33 ` Will Drewry 2010-09-18 2:33 ` Will Drewry 2010-09-16 18:59 [PATCH][RFC] fs/exec.c: provide the correct process pid to the pipe helper Will Drewry
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.