* [PATCH v2 0/3] Write dump into container's filesystem for pipe_type core_pattern @ 2016-06-07 9:52 Zhao Lei 2016-06-07 11:29 ` [PATCH v2 1/3] Save dump_root into pid_namespace Zhao Lei ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Zhao Lei @ 2016-06-07 9:52 UTC (permalink / raw) To: linux-kernel Cc: containers, Eric W. Biederman, Mateusz Guzik, Kamezawa Hiroyuki, Zhao Lei In current system, when we set core_pattern to a pipe, both pipe program and program's output are in host's filesystem. But when we set core_pattern to a file, the container will write dump into container's filesystem. For example, when we set following core_pattern: # echo "|/my_dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern and trigger a segment fault in a container, my_dump_pipe is searched from host's filesystem, and it will write coredump into host's filesystem too. In a privileged container, user can destroy host system by following command: # # In a container # echo "|/bin/dd of=/boot/vmlinuz" >/proc/sys/kernel/core_pattern # make_dump Actually, all operation in a container should not change host's environment, the container should use core_pattern as its private setting. In detail, in core dump action: 1: Search pipe program in container's fs namespace. 2: Run pipe program in container's fs namespace to write coredump to it. I rewrited this patch from origional: http://www.gossamer-threads.com/lists/linux/kernel/2395715?do=post_view_flat and changed the impliment way and function detail discussed in: http://www.gossamer-threads.com/lists/linux/kernel/2397602?nohighlight=1#2397602 Changelog v1->v2: 1: Move path_put() out of spin_lock, suggested by: Al Viro <viro@ftp.linux.org.uk> Changelog RFC->v1: 1: RFC->v1 2: Rebase on top of v4.7-rc2 Changes against previous impliment: 1: Avoid forking thread from the crach process. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> 2: To keep compatibility with current code, if user hadn't change core_pattern in container, the dump file will still write to the host filesystem. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Zhao Lei (3): Save dump_root into pid_namespace Make dump_pipe thread possilbe to select the rootfs Write dump into container's filesystem for pipe_type core_pattern fs/coredump.c | 19 ++++++++++++++++++- fs/fs_struct.c | 25 ++++++++++++++++--------- include/linux/fs_struct.h | 3 ++- include/linux/kmod.h | 4 +++- include/linux/pid_namespace.h | 3 +++ include/linux/sched.h | 5 +++-- init/do_mounts_initrd.c | 3 ++- init/main.c | 4 ++-- kernel/fork.c | 34 ++++++++++++++++++++-------------- kernel/kmod.c | 13 ++++++++----- kernel/kthread.c | 3 ++- kernel/pid.c | 1 + kernel/pid_namespace.c | 6 ++++++ kernel/sysctl.c | 33 +++++++++++++++++++++++++++++---- lib/kobject_uevent.c | 3 ++- security/keys/request_key.c | 2 +- 16 files changed, 118 insertions(+), 43 deletions(-) -- 1.8.5.1 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/3] Save dump_root into pid_namespace 2016-06-07 9:52 [PATCH v2 0/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei @ 2016-06-07 11:29 ` Zhao Lei 2016-06-07 11:29 ` [PATCH v2 2/3] Make dump_pipe thread possilbe to select the rootfs Zhao Lei 2016-06-07 11:29 ` [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2 siblings, 0 replies; 6+ messages in thread From: Zhao Lei @ 2016-06-07 11:29 UTC (permalink / raw) To: linux-kernel Cc: containers, Eric W. Biederman, Mateusz Guzik, Kamezawa Hiroyuki, Zhao Lei In current system, when we set core_pattern to a pipe, both pipe program and program's output are in host's filesystem. But when we set core_pattern to a file, the container will write dump into container's filesystem. Reason of above different is: In pipe_mode dump_pattern setting, the process who write the dumpfile is a kernel thread, whose fs_root always point to host's root fs. This patch save the dump_root into pid_namespace, and when a crach happened in container, this dump_root can be used as fs_root of dump_writter_thread. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- include/linux/pid_namespace.h | 3 +++ kernel/pid.c | 1 + kernel/pid_namespace.c | 6 ++++++ kernel/sysctl.c | 33 +++++++++++++++++++++++++++++---- 4 files changed, 39 insertions(+), 4 deletions(-) diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h index 918b117..535a532 100644 --- a/include/linux/pid_namespace.h +++ b/include/linux/pid_namespace.h @@ -9,6 +9,7 @@ #include <linux/nsproxy.h> #include <linux/kref.h> #include <linux/ns_common.h> +#include <linux/path.h> struct pidmap { atomic_t nr_free; @@ -45,6 +46,8 @@ struct pid_namespace { int hide_pid; int reboot; /* group exit code if this pidns was rebooted */ struct ns_common ns; + spinlock_t root_for_dump_lock; + struct path root_for_dump; }; extern struct pid_namespace init_pid_ns; diff --git a/kernel/pid.c b/kernel/pid.c index f66162f..ef4cd85 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -83,6 +83,7 @@ struct pid_namespace init_pid_ns = { #ifdef CONFIG_PID_NS .ns.ops = &pidns_operations, #endif + .root_for_dump_lock = __SPIN_LOCK_UNLOCKED(init_pid_ns.root_for_dump_lock), }; EXPORT_SYMBOL_GPL(init_pid_ns); diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index a65ba13..3d0eced 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -123,6 +123,8 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns for (i = 1; i < PIDMAP_ENTRIES; i++) atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE); + spin_lock_init(&ns->root_for_dump_lock); + return ns; out_free_map: @@ -147,6 +149,10 @@ static void destroy_pid_namespace(struct pid_namespace *ns) for (i = 0; i < PIDMAP_ENTRIES; i++) kfree(ns->pidmap[i].page); put_user_ns(ns->user_ns); + + if (ns->root_for_dump.mnt) + path_put(&ns->root_for_dump); + call_rcu(&ns->rcu, delayed_free_pidns); } diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 87b2fc3..0e82b2f 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -65,6 +65,7 @@ #include <linux/sched/sysctl.h> #include <linux/kexec.h> #include <linux/bpf.h> +#include <linux/fs_struct.h> #include <asm/uaccess.h> #include <asm/processor.h> @@ -2372,10 +2373,34 @@ static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write, static int proc_dostring_coredump(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos) { - int error = proc_dostring(table, write, buffer, lenp, ppos); - if (!error) - validate_coredump_safety(); - return error; + struct pid_namespace *pid_ns; + struct path old_path; + int error; + + error = proc_dostring(table, write, buffer, lenp, ppos); + if (error) + return error; + + pid_ns = task_active_pid_ns(current); + if (WARN_ON(!pid_ns)) + return -EINVAL; + + spin_lock(&pid_ns->root_for_dump_lock); + + old_path = pid_ns->root_for_dump; + + spin_lock(¤t->fs->lock); + pid_ns->root_for_dump = current->fs->root; + path_get(&pid_ns->root_for_dump); + spin_unlock(¤t->fs->lock); + + spin_unlock(&pid_ns->root_for_dump_lock); + + if (old_path.mnt) + path_put(&old_path); + + validate_coredump_safety(); + return 0; } #endif -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/3] Make dump_pipe thread possilbe to select the rootfs 2016-06-07 9:52 [PATCH v2 0/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2016-06-07 11:29 ` [PATCH v2 1/3] Save dump_root into pid_namespace Zhao Lei @ 2016-06-07 11:29 ` Zhao Lei 2016-06-07 11:29 ` [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2 siblings, 0 replies; 6+ messages in thread From: Zhao Lei @ 2016-06-07 11:29 UTC (permalink / raw) To: linux-kernel Cc: containers, Eric W. Biederman, Mateusz Guzik, Kamezawa Hiroyuki, Zhao Lei To make the dump_pipe thread run in container's filesystem, we need to make it possible to select its fs_root from fork. Then the dump_pipe thread will exec user_defined pipe program in container's fs_root, and the problem will also write dumpdata into the same fs_root. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- fs/coredump.c | 3 ++- fs/fs_struct.c | 25 ++++++++++++++++--------- include/linux/fs_struct.h | 3 ++- include/linux/kmod.h | 4 +++- include/linux/sched.h | 5 +++-- init/do_mounts_initrd.c | 3 ++- init/main.c | 4 ++-- kernel/fork.c | 34 ++++++++++++++++++++-------------- kernel/kmod.c | 13 ++++++++----- kernel/kthread.c | 3 ++- lib/kobject_uevent.c | 3 ++- security/keys/request_key.c | 2 +- 12 files changed, 63 insertions(+), 39 deletions(-) diff --git a/fs/coredump.c b/fs/coredump.c index 38a7ab8..864985e 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -641,7 +641,8 @@ void do_coredump(const siginfo_t *siginfo) retval = -ENOMEM; sub_info = call_usermodehelper_setup(helper_argv[0], helper_argv, NULL, GFP_KERNEL, - umh_pipe_setup, NULL, &cprm); + umh_pipe_setup, NULL, &cprm, + NULL); if (sub_info) retval = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC); diff --git a/fs/fs_struct.c b/fs/fs_struct.c index 7dca743..0ff30ad 100644 --- a/fs/fs_struct.c +++ b/fs/fs_struct.c @@ -107,7 +107,8 @@ void exit_fs(struct task_struct *tsk) } } -struct fs_struct *copy_fs_struct(struct fs_struct *old) +struct fs_struct *copy_fs_struct(struct fs_struct *old, + struct path *root_override) { struct fs_struct *fs = kmem_cache_alloc(fs_cachep, GFP_KERNEL); /* We don't need to lock fs - think why ;-) */ @@ -117,13 +118,19 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old) spin_lock_init(&fs->lock); seqcount_init(&fs->seq); fs->umask = old->umask; - - spin_lock(&old->lock); - fs->root = old->root; - path_get(&fs->root); - fs->pwd = old->pwd; - path_get(&fs->pwd); - spin_unlock(&old->lock); + if (root_override) { + fs->root = *root_override; + path_get(&fs->root); + fs->pwd = *root_override; + path_get(&fs->pwd); + } else { + spin_lock(&old->lock); + fs->root = old->root; + path_get(&fs->root); + fs->pwd = old->pwd; + path_get(&fs->pwd); + spin_unlock(&old->lock); + } } return fs; } @@ -131,7 +138,7 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old) int unshare_fs_struct(void) { struct fs_struct *fs = current->fs; - struct fs_struct *new_fs = copy_fs_struct(fs); + struct fs_struct *new_fs = copy_fs_struct(fs, NULL); int kill; if (!new_fs) diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h index 0efc3e6..7274b29 100644 --- a/include/linux/fs_struct.h +++ b/include/linux/fs_struct.h @@ -19,7 +19,8 @@ extern struct kmem_cache *fs_cachep; extern void exit_fs(struct task_struct *); extern void set_fs_root(struct fs_struct *, const struct path *); extern void set_fs_pwd(struct fs_struct *, const struct path *); -extern struct fs_struct *copy_fs_struct(struct fs_struct *); +extern struct fs_struct *copy_fs_struct(struct fs_struct *, + struct path *root_override); extern void free_fs_struct(struct fs_struct *); extern int unshare_fs_struct(void); diff --git a/include/linux/kmod.h b/include/linux/kmod.h index fcfd2bf..73f5265 100644 --- a/include/linux/kmod.h +++ b/include/linux/kmod.h @@ -56,6 +56,7 @@ struct file; struct subprocess_info { struct work_struct work; struct completion *complete; + struct path *root_override; char *path; char **argv; char **envp; @@ -72,7 +73,8 @@ call_usermodehelper(char *path, char **argv, char **envp, int wait); extern struct subprocess_info * call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask, int (*init)(struct subprocess_info *info, struct cred *new), - void (*cleanup)(struct subprocess_info *), void *data); + void (*cleanup)(struct subprocess_info *), void *data, + struct path *root_override); extern int call_usermodehelper_exec(struct subprocess_info *info, int wait); diff --git a/include/linux/sched.h b/include/linux/sched.h index 6e42ada..aee2230 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -134,6 +134,7 @@ struct perf_event_context; struct blk_plug; struct filename; struct nameidata; +struct path; #define VMACACHE_BITS 2 #define VMACACHE_SIZE (1U << VMACACHE_BITS) @@ -2804,10 +2805,10 @@ extern int do_execveat(int, struct filename *, const char __user * const __user *, const char __user * const __user *, int); -extern long _do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *, unsigned long); +extern long _do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *, unsigned long, struct path *); extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *); struct task_struct *fork_idle(int); -extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags); +extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags, struct path *); extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec); static inline void set_task_comm(struct task_struct *tsk, const char *from) diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index a1000ca..b401b22 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -72,7 +72,8 @@ static void __init handle_initrd(void) current->flags |= PF_FREEZER_SKIP; info = call_usermodehelper_setup("/linuxrc", argv, envp_init, - GFP_KERNEL, init_linuxrc, NULL, NULL); + GFP_KERNEL, init_linuxrc, NULL, NULL, + NULL); if (!info) return; call_usermodehelper_exec(info, UMH_WAIT_PROC); diff --git a/init/main.c b/init/main.c index 4c17fda..6ea4bbc 100644 --- a/init/main.c +++ b/init/main.c @@ -390,9 +390,9 @@ static noinline void __init_refok rest_init(void) * the init task will end up wanting to create kthreads, which, if * we schedule it before we create kthreadd, will OOPS. */ - kernel_thread(kernel_init, NULL, CLONE_FS); + kernel_thread(kernel_init, NULL, CLONE_FS, NULL); numa_default_policy(); - pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES); + pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES, NULL); rcu_read_lock(); kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); rcu_read_unlock(); diff --git a/kernel/fork.c b/kernel/fork.c index 5c2c355..b6543e1 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1028,7 +1028,8 @@ fail_nomem: return retval; } -static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) +static int copy_fs(unsigned long clone_flags, struct task_struct *tsk, + struct path *root_override) { struct fs_struct *fs = current->fs; if (clone_flags & CLONE_FS) { @@ -1042,7 +1043,7 @@ static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) spin_unlock(&fs->lock); return 0; } - tsk->fs = copy_fs_struct(fs); + tsk->fs = copy_fs_struct(fs, root_override); if (!tsk->fs) return -ENOMEM; return 0; @@ -1284,7 +1285,8 @@ static struct task_struct *copy_process(unsigned long clone_flags, struct pid *pid, int trace, unsigned long tls, - int node) + int node, + struct path *root_override) { int retval; struct task_struct *p; @@ -1472,7 +1474,7 @@ static struct task_struct *copy_process(unsigned long clone_flags, retval = copy_files(clone_flags, p); if (retval) goto bad_fork_cleanup_semundo; - retval = copy_fs(clone_flags, p); + retval = copy_fs(clone_flags, p, root_override); if (retval) goto bad_fork_cleanup_files; retval = copy_sighand(clone_flags, p); @@ -1715,7 +1717,7 @@ struct task_struct *fork_idle(int cpu) { struct task_struct *task; task = copy_process(CLONE_VM, 0, 0, NULL, &init_struct_pid, 0, 0, - cpu_to_node(cpu)); + cpu_to_node(cpu), NULL); if (!IS_ERR(task)) { init_idle_pids(task->pids); init_idle(task, cpu); @@ -1735,7 +1737,8 @@ long _do_fork(unsigned long clone_flags, unsigned long stack_size, int __user *parent_tidptr, int __user *child_tidptr, - unsigned long tls) + unsigned long tls, + struct path *root_override) { struct task_struct *p; int trace = 0; @@ -1760,7 +1763,8 @@ long _do_fork(unsigned long clone_flags, } p = copy_process(clone_flags, stack_start, stack_size, - child_tidptr, NULL, trace, tls, NUMA_NO_NODE); + child_tidptr, NULL, trace, tls, NUMA_NO_NODE, + root_override); /* * Do this prior waking up the new thread - the thread pointer * might get invalid after that point, if the thread exits quickly. @@ -1811,24 +1815,25 @@ long do_fork(unsigned long clone_flags, int __user *child_tidptr) { return _do_fork(clone_flags, stack_start, stack_size, - parent_tidptr, child_tidptr, 0); + parent_tidptr, child_tidptr, 0, NULL); } #endif /* * Create a kernel thread. */ -pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags) +pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags, + struct path *root_override) { return _do_fork(flags|CLONE_VM|CLONE_UNTRACED, (unsigned long)fn, - (unsigned long)arg, NULL, NULL, 0); + (unsigned long)arg, NULL, NULL, 0, root_override); } #ifdef __ARCH_WANT_SYS_FORK SYSCALL_DEFINE0(fork) { #ifdef CONFIG_MMU - return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0); + return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0, NULL); #else /* can not support in nommu mode */ return -EINVAL; @@ -1840,7 +1845,7 @@ SYSCALL_DEFINE0(fork) SYSCALL_DEFINE0(vfork) { return _do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0, - 0, NULL, NULL, 0); + 0, NULL, NULL, 0, NULL); } #endif @@ -1868,7 +1873,8 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, unsigned long, tls) #endif { - return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, tls); + return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, + tls, NULL); } #endif @@ -1964,7 +1970,7 @@ static int unshare_fs(unsigned long unshare_flags, struct fs_struct **new_fsp) if (fs->users == 1) return 0; - *new_fsp = copy_fs_struct(fs); + *new_fsp = copy_fs_struct(fs, NULL); if (!*new_fsp) return -ENOMEM; diff --git a/kernel/kmod.c b/kernel/kmod.c index 0277d12..0d7f9e0 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -91,7 +91,7 @@ static int call_modprobe(char *module_name, int wait) argv[4] = NULL; info = call_usermodehelper_setup(modprobe_path, argv, envp, GFP_KERNEL, - NULL, free_modprobe_argv, NULL); + NULL, free_modprobe_argv, NULL, NULL); if (!info) goto free_module_name; @@ -272,7 +272,8 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info) /* If SIGCLD is ignored sys_wait4 won't populate the status. */ kernel_sigaction(SIGCHLD, SIG_DFL); - pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD); + pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD, + sub_info->root_override); if (pid < 0) { sub_info->retval = pid; } else { @@ -333,7 +334,8 @@ static void call_usermodehelper_exec_work(struct work_struct *work) * that always ignores SIGCHLD to ensure auto-reaping. */ pid = kernel_thread(call_usermodehelper_exec_async, sub_info, - CLONE_PARENT | SIGCHLD); + CLONE_PARENT | SIGCHLD, + sub_info->root_override); if (pid < 0) { sub_info->retval = pid; umh_complete(sub_info); @@ -520,7 +522,7 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask, int (*init)(struct subprocess_info *info, struct cred *new), void (*cleanup)(struct subprocess_info *info), - void *data) + void *data, struct path *root_override) { struct subprocess_info *sub_info; sub_info = kzalloc(sizeof(struct subprocess_info), gfp_mask); @@ -528,6 +530,7 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv, goto out; INIT_WORK(&sub_info->work, call_usermodehelper_exec_work); + sub_info->root_override = root_override; sub_info->path = path; sub_info->argv = argv; sub_info->envp = envp; @@ -619,7 +622,7 @@ int call_usermodehelper(char *path, char **argv, char **envp, int wait) gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL; info = call_usermodehelper_setup(path, argv, envp, gfp_mask, - NULL, NULL, NULL); + NULL, NULL, NULL, NULL); if (info == NULL) return -ENOMEM; diff --git a/kernel/kthread.c b/kernel/kthread.c index 9ff173d..cc3b143 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -230,7 +230,8 @@ static void create_kthread(struct kthread_create_info *create) current->pref_node_fork = create->node; #endif /* We want our own signal handler (we take no signals by default). */ - pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD); + pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD, + NULL); if (pid < 0) { /* If user was SIGKILLed, I release the structure. */ struct completion *done = xchg(&create->done, NULL); diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c index f6c2c1e..490d268 100644 --- a/lib/kobject_uevent.c +++ b/lib/kobject_uevent.c @@ -345,7 +345,8 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, retval = -ENOMEM; info = call_usermodehelper_setup(env->argv[0], env->argv, env->envp, GFP_KERNEL, - NULL, cleanup_uevent_env, env); + NULL, cleanup_uevent_env, env, + NULL); if (info) { retval = call_usermodehelper_exec(info, UMH_NO_WAIT); env = NULL; /* freed by cleanup_uevent_env */ diff --git a/security/keys/request_key.c b/security/keys/request_key.c index a29e355..ed81a5b 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -79,7 +79,7 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp, info = call_usermodehelper_setup(path, argv, envp, GFP_KERNEL, umh_keys_init, umh_keys_cleanup, - session_keyring); + session_keyring, NULL); if (!info) return -ENOMEM; -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern 2016-06-07 9:52 [PATCH v2 0/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2016-06-07 11:29 ` [PATCH v2 1/3] Save dump_root into pid_namespace Zhao Lei 2016-06-07 11:29 ` [PATCH v2 2/3] Make dump_pipe thread possilbe to select the rootfs Zhao Lei @ 2016-06-07 11:29 ` Zhao Lei 2016-06-07 19:09 ` Mateusz Guzik 2 siblings, 1 reply; 6+ messages in thread From: Zhao Lei @ 2016-06-07 11:29 UTC (permalink / raw) To: linux-kernel Cc: containers, Eric W. Biederman, Mateusz Guzik, Kamezawa Hiroyuki, Zhao Lei In current system, when we set core_pattern to a pipe, both pipe program and program's output are in host's filesystem. But when we set core_pattern to a file, the container will write dump into container's filesystem. For example, when we set following core_pattern: # echo "|/my_dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern and trigger a segment fault in a container, my_dump_pipe is searched from host's filesystem, and it will write coredump into host's filesystem too. In a privileged container, user can destroy host system by following command: # # In a container # echo "|/bin/dd of=/boot/vmlinuz" >/proc/sys/kernel/core_pattern # make_dump Actually, all operation in a container should not change host's environment, the container should use core_pattern as its private setting. In detail, in core dump action: 1: Search pipe program in container's fs namespace. 2: Run pipe program in container's fs namespace to write coredump to it. This patch fixed above problem by running pipe program with container's fs_root. Test: 1: do dump in host should have same action with current code. [HOST] # ulimit -c 1024000 [HOST] # rm -f /tmp/*dump* [HOST] # echo "|/dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern [HOST] # ./make_dump [HOST] Segmentation fault (core dumped) [HOST] # ls -l /tmp/*dump* # Should see host_dump_*. [HOST] -rw-r--r-- 1 root root 331776 Apr 15 18:01 /tmp/host_dump_11_1048576000_2356_0_0_1460714470 2: do dump after change core_pattern in container the container should write dump into its filesystem. [HOST] # rm -f /tmp/*dump* [HOST] # echo "|/dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern [HOST] # lxc-start -n vm_dumptest [GUEST]Please press Enter to activate this console. [GUEST]# ulimit -c 1024000 [GUEST]# rm -f /tmp/*dump* [GUEST]# echo "|/dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern [GUEST]# ./make_dump [GUEST]Segmentation fault (core dumped) [GUEST]# ls -l /tmp/*dump* # Should see guest_dump_* [GUEST]-rw-r--r-- 1 root root 331776 Apr 15 10:01 /tmp/guest_dump_11_524288000_12_0_0_1460714482 3: do dump without change core_pattern in container the container should write dump into host's filesystem to keep compatibility. [HOST] # rm -f /tmp/*dump* [HOST] # echo "|/dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern [HOST] # lxc-start -n vm_dumptest [GUEST]Please press Enter to activate this console. [GUEST]# ulimit -c 1024000 [GUEST]# rm -f /tmp/*dump* [GUEST]# ./make_dump [GUEST]Segmentation fault (core dumped) [GUEST]# ls -l /tmp/*dump* # Should not see dump file [GUEST]ls: /tmp/*dump*: No such file or directory [HOST] # ls -l /tmp/*dump* # Should see dump file [HOST] -rw-r--r-- 1 root root 331776 Apr 15 18:01 /tmp/host_dump_11_524288000_12_0_0_1460714516 Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- fs/coredump.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/fs/coredump.c b/fs/coredump.c index 864985e..4616a25 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -592,6 +592,8 @@ void do_coredump(const siginfo_t *siginfo) int dump_count; char **helper_argv; struct subprocess_info *sub_info; + struct pid_namespace *pid_ns; + struct path root_fs; if (ispipe < 0) { printk(KERN_WARNING "format_corename failed\n"); @@ -638,15 +640,29 @@ void do_coredump(const siginfo_t *siginfo) goto fail_dropcount; } + pid_ns = task_active_pid_ns(current); + spin_lock(&pid_ns->root_for_dump_lock); + while (pid_ns != &init_pid_ns) { + if (pid_ns->root_for_dump.mnt) + break; + spin_unlock(&pid_ns->root_for_dump_lock); + pid_ns = pid_ns->parent, + spin_lock(&pid_ns->root_for_dump_lock); + } + root_fs = pid_ns->root_for_dump; + path_get(&root_fs); + spin_unlock(&pid_ns->root_for_dump_lock); + retval = -ENOMEM; sub_info = call_usermodehelper_setup(helper_argv[0], helper_argv, NULL, GFP_KERNEL, umh_pipe_setup, NULL, &cprm, - NULL); + &root_fs); if (sub_info) retval = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC); + path_put(&root_fs); argv_free(helper_argv); if (retval) { printk(KERN_INFO "Core dump to |%s pipe failed\n", -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern 2016-06-07 11:29 ` [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei @ 2016-06-07 19:09 ` Mateusz Guzik 2016-06-07 21:38 ` Eric W. Biederman 0 siblings, 1 reply; 6+ messages in thread From: Mateusz Guzik @ 2016-06-07 19:09 UTC (permalink / raw) To: Zhao Lei; +Cc: linux-kernel, containers, Eric W. Biederman, Kamezawa Hiroyuki On Tue, Jun 07, 2016 at 07:29:37PM +0800, Zhao Lei wrote: > In current system, when we set core_pattern to a pipe, both pipe program > and program's output are in host's filesystem. > But when we set core_pattern to a file, the container will write dump > into container's filesystem. > > For example, when we set following core_pattern: > # echo "|/my_dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern > and trigger a segment fault in a container, my_dump_pipe is searched from > host's filesystem, and it will write coredump into host's filesystem too. > > In a privileged container, user can destroy host system by following > command: > # # In a container > # echo "|/bin/dd of=/boot/vmlinuz" >/proc/sys/kernel/core_pattern > # make_dump > > Actually, all operation in a container should not change host's > environment, the container should use core_pattern as its private setting. > In detail, in core dump action: > 1: Search pipe program in container's fs namespace. > 2: Run pipe program in container's fs namespace to write coredump to it. > > This patch fixed above problem by running pipe program with container's > fs_root. > This does not look sufficient, but I can't easily verify. For instance, is the spawned process able to attach itself with ptrace to processes outside of the original container? Even if not, can it mount procfs and use /proc/pid/root of processes outside of said container? The spawned process should be subject to all limitations imposed on the container (which may mean it just must be in the container). -- Mateusz Guzik ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern 2016-06-07 19:09 ` Mateusz Guzik @ 2016-06-07 21:38 ` Eric W. Biederman 0 siblings, 0 replies; 6+ messages in thread From: Eric W. Biederman @ 2016-06-07 21:38 UTC (permalink / raw) To: Mateusz Guzik; +Cc: Zhao Lei, linux-kernel, containers, Kamezawa Hiroyuki Mateusz Guzik <mguzik@redhat.com> writes: > On Tue, Jun 07, 2016 at 07:29:37PM +0800, Zhao Lei wrote: >> In current system, when we set core_pattern to a pipe, both pipe program >> and program's output are in host's filesystem. >> But when we set core_pattern to a file, the container will write dump >> into container's filesystem. >> >> For example, when we set following core_pattern: >> # echo "|/my_dump_pipe %s %c %p %u %g %t e" >/proc/sys/kernel/core_pattern >> and trigger a segment fault in a container, my_dump_pipe is searched from >> host's filesystem, and it will write coredump into host's filesystem too. >> >> In a privileged container, user can destroy host system by following >> command: >> # # In a container >> # echo "|/bin/dd of=/boot/vmlinuz" >/proc/sys/kernel/core_pattern >> # make_dump >> >> Actually, all operation in a container should not change host's >> environment, the container should use core_pattern as its private setting. >> In detail, in core dump action: >> 1: Search pipe program in container's fs namespace. >> 2: Run pipe program in container's fs namespace to write coredump to it. >> >> This patch fixed above problem by running pipe program with container's >> fs_root. >> > > This does not look sufficient, but I can't easily verify. > > For instance, is the spawned process able to attach itself with ptrace > to processes outside of the original container? Even if not, can it > mount procfs and use /proc/pid/root of processes outside of said > container? > > The spawned process should be subject to all limitations imposed on the > container (which may mean it just must be in the container). Pretty much. The most constructive suggestion I have seen is to have the containers init process logically fork in the kernel and spawn a process that way. Anything that uses call_usermodehelper is trivially not safe because of these concerns. Eric ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-07 21:50 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-06-07 9:52 [PATCH v2 0/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2016-06-07 11:29 ` [PATCH v2 1/3] Save dump_root into pid_namespace Zhao Lei 2016-06-07 11:29 ` [PATCH v2 2/3] Make dump_pipe thread possilbe to select the rootfs Zhao Lei 2016-06-07 11:29 ` [PATCH v2 3/3] Write dump into container's filesystem for pipe_type core_pattern Zhao Lei 2016-06-07 19:09 ` Mateusz Guzik 2016-06-07 21:38 ` Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).