* [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:39 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening There continues to be unexpected side-effects and security exposures via CLONE_NEWUSER. For many end-users running distro kernels with CONFIG_USER_NS enabled, there is no way to disable this feature when desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so admins not running containers or Chrome can avoid the risks of this feature. -Kees ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:39 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening There continues to be unexpected side-effects and security exposures via CLONE_NEWUSER. For many end-users running distro kernels with CONFIG_USER_NS enabled, there is no way to disable this feature when desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so admins not running containers or Chrome can avoid the risks of this feature. -Kees ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-22 22:39 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Several sysctls expect a state where the highest value (in extra2) is locked once set for that boot. Yama does this, and kptr_restrict should be doing it. This extracts Yama's logic and adds it to the existing proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean states (which do not get locked). Since Yama wants to be checking a different capability, we build wrappers for both cases (CAP_SYS_ADMIN and CAP_SYS_PTRACE). Signed-off-by: Kees Cook <keescook@chromium.org> --- Documentation/sysctl/kernel.txt | 4 +++- include/linux/sysctl.h | 18 ++++++++++++++++++ kernel/sysctl.c | 34 +++++++++++++++++++++------------- security/yama/yama_lsm.c | 18 +----------------- 4 files changed, 43 insertions(+), 31 deletions(-) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 73c6b1ef0e84..bbfc5e339a3d 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -385,7 +385,9 @@ to protect against uses of %pK in dmesg(8) if leaking kernel pointer values to unprivileged users is a concern. When kptr_restrict is set to (2), kernel pointers printed using -%pK will be replaced with 0's regardless of privileges. +%pK will be replaced with 0's regardless of privileges, and the value +will be locked at "2", so that the root user cannot remove this +restriction. ============================================================== diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index fa7bc29925c9..f8f0b991fe3e 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -23,6 +23,7 @@ #include <linux/list.h> #include <linux/rcupdate.h> +#include <linux/capability.h> #include <linux/wait.h> #include <linux/rbtree.h> #include <uapi/linux/sysctl.h> @@ -55,6 +56,23 @@ extern int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int, void __user *, size_t *, loff_t *); extern int proc_do_large_bitmap(struct ctl_table *, int, void __user *, size_t *, loff_t *); +extern int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos); +static inline int proc_dointvec_minmax_cap_sysadmin(struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos) +{ + return proc_dointvec_minmax_cap(CAP_SYS_ADMIN, table, write, buffer, + lenp, ppos); +} +static inline int proc_dointvec_minmax_cap_ptrace(struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos) +{ + return proc_dointvec_minmax_cap(CAP_SYS_PTRACE, table, write, buffer, + lenp, ppos); +} /* * Register a set of sysctl names by calling register_sysctl_table diff --git a/kernel/sysctl.c b/kernel/sysctl.c index c810f8afdb7f..fc8899dd636d 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -181,11 +181,6 @@ static int proc_taint(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos); #endif -#ifdef CONFIG_PRINTK -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos); -#endif - static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos); #ifdef CONFIG_COREDUMP @@ -803,7 +798,7 @@ static struct ctl_table kern_table[] = { .data = &dmesg_restrict, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec_minmax_sysadmin, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, .extra1 = &zero, .extra2 = &one, }, @@ -812,7 +807,7 @@ static struct ctl_table kern_table[] = { .data = &kptr_restrict, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec_minmax_sysadmin, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, .extra1 = &zero, .extra2 = &two, }, @@ -2217,16 +2212,29 @@ static int proc_taint(struct ctl_table *table, int write, return err; } -#ifdef CONFIG_PRINTK -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos) +int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) { - if (write && !capable(CAP_SYS_ADMIN)) + struct ctl_table table_copy; + int value; + + /* Require init capabilities to make changes. */ + if (write && !capable(cap)) return -EPERM; - return proc_dointvec_minmax(table, write, buffer, lenp, ppos); + /* + * To deal with const sysctl tables, we make a copy to perform + * the locking. When data is >1 and ==extra2, lock extra1 to + * extra2 to stop the value from being changed any further at + * runtime. + */ + table_copy = *table; + value = *(int *)table_copy.data; + if (value > 1 && value == *(int *)table_copy.extra2) + table_copy.extra1 = table_copy.extra2; + + return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); } -#endif struct do_proc_dointvec_minmax_conv_param { int *min; diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c index d3c19c970a06..3215afd08fbd 100644 --- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -354,22 +354,6 @@ static struct security_hook_list yama_hooks[] = { }; #ifdef CONFIG_SYSCTL -static int yama_dointvec_minmax(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos) -{ - struct ctl_table table_copy; - - if (write && !capable(CAP_SYS_PTRACE)) - return -EPERM; - - /* Lock the max value if it ever gets set. */ - table_copy = *table; - if (*(int *)table_copy.data == *(int *)table_copy.extra2) - table_copy.extra1 = table_copy.extra2; - - return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); -} - static int zero; static int max_scope = YAMA_SCOPE_NO_ATTACH; @@ -385,7 +369,7 @@ static struct ctl_table yama_sysctl_table[] = { .data = &ptrace_scope, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = yama_dointvec_minmax, + .proc_handler = proc_dointvec_minmax_cap_ptrace, .extra1 = &zero, .extra2 = &max_scope, }, -- 2.6.3 ^ permalink raw reply related [flat|nested] 80+ messages in thread
* [kernel-hardening] [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin @ 2016-01-22 22:39 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Several sysctls expect a state where the highest value (in extra2) is locked once set for that boot. Yama does this, and kptr_restrict should be doing it. This extracts Yama's logic and adds it to the existing proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean states (which do not get locked). Since Yama wants to be checking a different capability, we build wrappers for both cases (CAP_SYS_ADMIN and CAP_SYS_PTRACE). Signed-off-by: Kees Cook <keescook@chromium.org> --- Documentation/sysctl/kernel.txt | 4 +++- include/linux/sysctl.h | 18 ++++++++++++++++++ kernel/sysctl.c | 34 +++++++++++++++++++++------------- security/yama/yama_lsm.c | 18 +----------------- 4 files changed, 43 insertions(+), 31 deletions(-) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 73c6b1ef0e84..bbfc5e339a3d 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -385,7 +385,9 @@ to protect against uses of %pK in dmesg(8) if leaking kernel pointer values to unprivileged users is a concern. When kptr_restrict is set to (2), kernel pointers printed using -%pK will be replaced with 0's regardless of privileges. +%pK will be replaced with 0's regardless of privileges, and the value +will be locked at "2", so that the root user cannot remove this +restriction. ============================================================== diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index fa7bc29925c9..f8f0b991fe3e 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -23,6 +23,7 @@ #include <linux/list.h> #include <linux/rcupdate.h> +#include <linux/capability.h> #include <linux/wait.h> #include <linux/rbtree.h> #include <uapi/linux/sysctl.h> @@ -55,6 +56,23 @@ extern int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int, void __user *, size_t *, loff_t *); extern int proc_do_large_bitmap(struct ctl_table *, int, void __user *, size_t *, loff_t *); +extern int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos); +static inline int proc_dointvec_minmax_cap_sysadmin(struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos) +{ + return proc_dointvec_minmax_cap(CAP_SYS_ADMIN, table, write, buffer, + lenp, ppos); +} +static inline int proc_dointvec_minmax_cap_ptrace(struct ctl_table *table, + int write, void __user *buffer, + size_t *lenp, loff_t *ppos) +{ + return proc_dointvec_minmax_cap(CAP_SYS_PTRACE, table, write, buffer, + lenp, ppos); +} /* * Register a set of sysctl names by calling register_sysctl_table diff --git a/kernel/sysctl.c b/kernel/sysctl.c index c810f8afdb7f..fc8899dd636d 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -181,11 +181,6 @@ static int proc_taint(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos); #endif -#ifdef CONFIG_PRINTK -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos); -#endif - static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos); #ifdef CONFIG_COREDUMP @@ -803,7 +798,7 @@ static struct ctl_table kern_table[] = { .data = &dmesg_restrict, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec_minmax_sysadmin, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, .extra1 = &zero, .extra2 = &one, }, @@ -812,7 +807,7 @@ static struct ctl_table kern_table[] = { .data = &kptr_restrict, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = proc_dointvec_minmax_sysadmin, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, .extra1 = &zero, .extra2 = &two, }, @@ -2217,16 +2212,29 @@ static int proc_taint(struct ctl_table *table, int write, return err; } -#ifdef CONFIG_PRINTK -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos) +int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) { - if (write && !capable(CAP_SYS_ADMIN)) + struct ctl_table table_copy; + int value; + + /* Require init capabilities to make changes. */ + if (write && !capable(cap)) return -EPERM; - return proc_dointvec_minmax(table, write, buffer, lenp, ppos); + /* + * To deal with const sysctl tables, we make a copy to perform + * the locking. When data is >1 and ==extra2, lock extra1 to + * extra2 to stop the value from being changed any further at + * runtime. + */ + table_copy = *table; + value = *(int *)table_copy.data; + if (value > 1 && value == *(int *)table_copy.extra2) + table_copy.extra1 = table_copy.extra2; + + return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); } -#endif struct do_proc_dointvec_minmax_conv_param { int *min; diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c index d3c19c970a06..3215afd08fbd 100644 --- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -354,22 +354,6 @@ static struct security_hook_list yama_hooks[] = { }; #ifdef CONFIG_SYSCTL -static int yama_dointvec_minmax(struct ctl_table *table, int write, - void __user *buffer, size_t *lenp, loff_t *ppos) -{ - struct ctl_table table_copy; - - if (write && !capable(CAP_SYS_PTRACE)) - return -EPERM; - - /* Lock the max value if it ever gets set. */ - table_copy = *table; - if (*(int *)table_copy.data == *(int *)table_copy.extra2) - table_copy.extra1 = table_copy.extra2; - - return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); -} - static int zero; static int max_scope = YAMA_SCOPE_NO_ATTACH; @@ -385,7 +369,7 @@ static struct ctl_table yama_sysctl_table[] = { .data = &ptrace_scope, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = yama_dointvec_minmax, + .proc_handler = proc_dointvec_minmax_cap_ptrace, .extra1 = &zero, .extra2 = &max_scope, }, -- 2.6.3 ^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-23 3:10 ` Eric W. Biederman -1 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-23 3:10 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > Several sysctls expect a state where the highest value (in extra2) is > locked once set for that boot. Yama does this, and kptr_restrict should > be doing it. This extracts Yama's logic and adds it to the existing > proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean > states (which do not get locked). Since Yama wants to be checking a > different capability, we build wrappers for both cases (CAP_SYS_ADMIN > and CAP_SYS_PTRACE). Sigh this sysctl appears susceptible to known attacks. In my quick skim I believe this sysctl implementation that checks capabilities is susceptible to attacks where the already open file descriptor is set as stdout on a setuid root application. Can we come up with an interface that isn't exploitable by an application that will act as a setuid cat? Eric > -#ifdef CONFIG_PRINTK > -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, > - void __user *buffer, size_t *lenp, loff_t *ppos) > +int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, int write, > + void __user *buffer, size_t *lenp, loff_t *ppos) > { > - if (write && !capable(CAP_SYS_ADMIN)) > + struct ctl_table table_copy; > + int value; > + > + /* Require init capabilities to make changes. */ > + if (write && !capable(cap)) > return -EPERM; > > - return proc_dointvec_minmax(table, write, buffer, lenp, ppos); > + /* > + * To deal with const sysctl tables, we make a copy to perform > + * the locking. When data is >1 and ==extra2, lock extra1 to > + * extra2 to stop the value from being changed any further at > + * runtime. > + */ > + table_copy = *table; > + value = *(int *)table_copy.data; > + if (value > 1 && value == *(int *)table_copy.extra2) > + table_copy.extra1 = table_copy.extra2; > + > + return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); > } > -#endif ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin @ 2016-01-23 3:10 ` Eric W. Biederman 0 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-23 3:10 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > Several sysctls expect a state where the highest value (in extra2) is > locked once set for that boot. Yama does this, and kptr_restrict should > be doing it. This extracts Yama's logic and adds it to the existing > proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean > states (which do not get locked). Since Yama wants to be checking a > different capability, we build wrappers for both cases (CAP_SYS_ADMIN > and CAP_SYS_PTRACE). Sigh this sysctl appears susceptible to known attacks. In my quick skim I believe this sysctl implementation that checks capabilities is susceptible to attacks where the already open file descriptor is set as stdout on a setuid root application. Can we come up with an interface that isn't exploitable by an application that will act as a setuid cat? Eric > -#ifdef CONFIG_PRINTK > -static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write, > - void __user *buffer, size_t *lenp, loff_t *ppos) > +int proc_dointvec_minmax_cap(int cap, struct ctl_table *table, int write, > + void __user *buffer, size_t *lenp, loff_t *ppos) > { > - if (write && !capable(CAP_SYS_ADMIN)) > + struct ctl_table table_copy; > + int value; > + > + /* Require init capabilities to make changes. */ > + if (write && !capable(cap)) > return -EPERM; > > - return proc_dointvec_minmax(table, write, buffer, lenp, ppos); > + /* > + * To deal with const sysctl tables, we make a copy to perform > + * the locking. When data is >1 and ==extra2, lock extra1 to > + * extra2 to stop the value from being changed any further at > + * runtime. > + */ > + table_copy = *table; > + value = *(int *)table_copy.data; > + if (value > 1 && value == *(int *)table_copy.extra2) > + table_copy.extra1 = table_copy.extra2; > + > + return proc_dointvec_minmax(&table_copy, write, buffer, lenp, ppos); > } > -#endif ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-23 3:10 ` [kernel-hardening] " Eric W. Biederman (?) @ 2016-01-23 22:25 ` Jann Horn 2016-01-24 1:20 ` Eric W. Biederman -1 siblings, 1 reply; 80+ messages in thread From: Jann Horn @ 2016-01-23 22:25 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening [-- Attachment #1: Type: text/plain, Size: 1207 bytes --] On Fri, Jan 22, 2016 at 09:10:07PM -0600, Eric W. Biederman wrote: > Kees Cook <keescook@chromium.org> writes: > > > Several sysctls expect a state where the highest value (in extra2) is > > locked once set for that boot. Yama does this, and kptr_restrict should > > be doing it. This extracts Yama's logic and adds it to the existing > > proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean > > states (which do not get locked). Since Yama wants to be checking a > > different capability, we build wrappers for both cases (CAP_SYS_ADMIN > > and CAP_SYS_PTRACE). > > Sigh this sysctl appears susceptible to known attacks. > > In my quick skim I believe this sysctl implementation that checks > capabilities is susceptible to attacks where the already open file > descriptor is set as stdout on a setuid root application. > > Can we come up with an interface that isn't exploitable by an > application that will act as a setuid cat? Adding the struct file * to the parameters of all proc_handler functions would work, right? (Or just filp->f_cred? That would be less generic.) A quick grep says that's just about 160 functions that'll need to be changed. :/ [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-23 22:25 ` Jann Horn @ 2016-01-24 1:20 ` Eric W. Biederman 2016-01-24 1:43 ` Al Viro 0 siblings, 1 reply; 80+ messages in thread From: Eric W. Biederman @ 2016-01-24 1:20 UTC (permalink / raw) To: Jann Horn Cc: kernel-hardening, Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Jann Horn <jann@thejh.net> writes: > On Fri, Jan 22, 2016 at 09:10:07PM -0600, Eric W. Biederman wrote: >> Kees Cook <keescook@chromium.org> writes: >> >> > Several sysctls expect a state where the highest value (in extra2) is >> > locked once set for that boot. Yama does this, and kptr_restrict should >> > be doing it. This extracts Yama's logic and adds it to the existing >> > proc_dointvec_minmax_sysadmin, taking care to avoid the simple boolean >> > states (which do not get locked). Since Yama wants to be checking a >> > different capability, we build wrappers for both cases (CAP_SYS_ADMIN >> > and CAP_SYS_PTRACE). >> >> Sigh this sysctl appears susceptible to known attacks. >> >> In my quick skim I believe this sysctl implementation that checks >> capabilities is susceptible to attacks where the already open file >> descriptor is set as stdout on a setuid root application. >> >> Can we come up with an interface that isn't exploitable by an >> application that will act as a setuid cat? > > Adding the struct file * to the parameters of all proc_handler > functions would work, right? (Or just filp->f_cred? That would be > less generic.) > > A quick grep says that's just about 160 functions that'll need to > be changed. :/ Yep. That is about the size of it. file * used to be passed to the sysctl methods but it was removed several years ago because no one was using it. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-24 1:20 ` Eric W. Biederman @ 2016-01-24 1:43 ` Al Viro 2016-01-24 1:56 ` Jann Horn 0 siblings, 1 reply; 80+ messages in thread From: Al Viro @ 2016-01-24 1:43 UTC (permalink / raw) To: Eric W. Biederman Cc: Jann Horn, kernel-hardening, Kees Cook, Andrew Morton, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Sat, Jan 23, 2016 at 07:20:17PM -0600, Eric W. Biederman wrote: > Yep. That is about the size of it. file * used to be passed to the > sysctl methods but it was removed several years ago because no one was > using it. Generally cred would be better... Alternatively we could eat one more pointer in task_struct and stash a reference to that sucker there, rather than adding an explicit argument (again, with cred instead of file). Not sure... ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-24 1:43 ` Al Viro @ 2016-01-24 1:56 ` Jann Horn 2016-01-24 6:02 ` Eric W. Biederman 0 siblings, 1 reply; 80+ messages in thread From: Jann Horn @ 2016-01-24 1:56 UTC (permalink / raw) To: Al Viro Cc: Eric W. Biederman, kernel-hardening, Kees Cook, Andrew Morton, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel [-- Attachment #1: Type: text/plain, Size: 866 bytes --] On Sun, Jan 24, 2016 at 01:43:42AM +0000, Al Viro wrote: > On Sat, Jan 23, 2016 at 07:20:17PM -0600, Eric W. Biederman wrote: > > > Yep. That is about the size of it. file * used to be passed to the > > sysctl methods but it was removed several years ago because no one was > > using it. > > Generally cred would be better... > Alternatively we could eat one more > pointer in task_struct and stash a reference to that sucker there, rather > than adding an explicit argument (again, with cred instead of file). > Not sure... I think it makes sense to do this the same way as the rest of the VFS code here (which passes the creds down through an argument). And adding the arguments everywhere doesn't really mean more work - either way, someone should probably go through all of those sysctl handlers and fix them up to use the file creds. [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-24 1:56 ` Jann Horn @ 2016-01-24 6:02 ` Eric W. Biederman 2016-01-24 6:32 ` Jann Horn 0 siblings, 1 reply; 80+ messages in thread From: Eric W. Biederman @ 2016-01-24 6:02 UTC (permalink / raw) To: Jann Horn Cc: Al Viro, kernel-hardening, Kees Cook, Andrew Morton, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Jann Horn <jann@thejh.net> writes: > On Sun, Jan 24, 2016 at 01:43:42AM +0000, Al Viro wrote: >> On Sat, Jan 23, 2016 at 07:20:17PM -0600, Eric W. Biederman wrote: >> >> > Yep. That is about the size of it. file * used to be passed to the >> > sysctl methods but it was removed several years ago because no one was >> > using it. >> >> Generally cred would be better... > >> Alternatively we could eat one more >> pointer in task_struct and stash a reference to that sucker there, rather >> than adding an explicit argument (again, with cred instead of file). >> Not sure... > > I think it makes sense to do this the same way as the rest of the VFS code > here (which passes the creds down through an argument). > > And adding the arguments everywhere doesn't really mean more work - either > way, someone should probably go through all of those sysctl handlers and > fix them up to use the file creds. Not all of them need it. It might be worth figuring out the necessary rigamarole to hook into sysctl_perm the way the networking code does and have that require the capability at open time. The advantage is that open time is when it is actually appropraite to check permissions. I could be wrong but I doubt there is enough madness with the handful of sysctl users that call capable to require the checks to happen on write and not on open. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-24 6:02 ` Eric W. Biederman @ 2016-01-24 6:32 ` Jann Horn 2016-01-24 6:44 ` Eric W. Biederman 0 siblings, 1 reply; 80+ messages in thread From: Jann Horn @ 2016-01-24 6:32 UTC (permalink / raw) To: Eric W. Biederman Cc: Al Viro, kernel-hardening, Kees Cook, Andrew Morton, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1993 bytes --] On Sun, Jan 24, 2016 at 12:02:41AM -0600, Eric W. Biederman wrote: > Jann Horn <jann@thejh.net> writes: > > > On Sun, Jan 24, 2016 at 01:43:42AM +0000, Al Viro wrote: > >> On Sat, Jan 23, 2016 at 07:20:17PM -0600, Eric W. Biederman wrote: > >> > >> > Yep. That is about the size of it. file * used to be passed to the > >> > sysctl methods but it was removed several years ago because no one was > >> > using it. > >> > >> Generally cred would be better... > > > >> Alternatively we could eat one more > >> pointer in task_struct and stash a reference to that sucker there, rather > >> than adding an explicit argument (again, with cred instead of file). > >> Not sure... > > > > I think it makes sense to do this the same way as the rest of the VFS code > > here (which passes the creds down through an argument). > > > > And adding the arguments everywhere doesn't really mean more work - either > > way, someone should probably go through all of those sysctl handlers and > > fix them up to use the file creds. > > Not all of them need it. It might be worth figuring out the necessary > rigamarole to hook into sysctl_perm the way the networking code does and > have that require the capability at open time. > > The advantage is that open time is when it is actually appropraite to > check permissions. I could be wrong but I doubt there is enough madness > with the handful of sysctl users that call capable to require the checks > to happen on write and not on open. That would work - if all sysctls know whether a capability will be needed for writing later on and don't decide it based on the written data. Is that always true? Looking through some of the sysctl handlers, I found proc_do_uts_string and pid_ns_ctl_handler, which operate on a namespace looked up through current at write time. I think that's buggy and ought to be done using the file opener creds and on the file opener's namespaces, but where can those be stored? [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin 2016-01-24 6:32 ` Jann Horn @ 2016-01-24 6:44 ` Eric W. Biederman 0 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-24 6:44 UTC (permalink / raw) To: Jann Horn Cc: Al Viro, kernel-hardening, Kees Cook, Andrew Morton, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Jann Horn <jann@thejh.net> writes: > On Sun, Jan 24, 2016 at 12:02:41AM -0600, Eric W. Biederman wrote: >> Jann Horn <jann@thejh.net> writes: >> >> > On Sun, Jan 24, 2016 at 01:43:42AM +0000, Al Viro wrote: >> >> On Sat, Jan 23, 2016 at 07:20:17PM -0600, Eric W. Biederman wrote: >> >> >> >> > Yep. That is about the size of it. file * used to be passed to the >> >> > sysctl methods but it was removed several years ago because no one was >> >> > using it. >> >> >> >> Generally cred would be better... >> > >> >> Alternatively we could eat one more >> >> pointer in task_struct and stash a reference to that sucker there, rather >> >> than adding an explicit argument (again, with cred instead of file). >> >> Not sure... >> > >> > I think it makes sense to do this the same way as the rest of the VFS code >> > here (which passes the creds down through an argument). >> > >> > And adding the arguments everywhere doesn't really mean more work - either >> > way, someone should probably go through all of those sysctl handlers and >> > fix them up to use the file creds. >> >> Not all of them need it. It might be worth figuring out the necessary >> rigamarole to hook into sysctl_perm the way the networking code does and >> have that require the capability at open time. >> >> The advantage is that open time is when it is actually appropraite to >> check permissions. I could be wrong but I doubt there is enough madness >> with the handful of sysctl users that call capable to require the checks >> to happen on write and not on open. > > That would work - if all sysctls know whether a capability will be needed > for writing later on and don't decide it based on the written data. Is that > always true? That is certainly the common case to stick a pointer to static data in struct sysctl_table. > Looking through some of the sysctl handlers, I found proc_do_uts_string and > pid_ns_ctl_handler, which operate on a namespace looked up through current > at write time. I think that's buggy and ought to be done using the file > opener creds and on the file opener's namespaces, but where can those be > stored? Interesting point. I had not though about it from that angle. From all other angles it has just been something that would be nice to fix but I hadn't seen the need. We have all of the infrastructure needed to register sysctls per namespace and the networking stack uses it. That would not be hard to use for uts, pid and ipc namespaces as well. That would remove any race. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-22 22:39 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening There continues to be many CONFIG_USER_NS related security exposures. For admins running distro kernels with CONFIG_USER_NS, there is no way to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, this provides a way for sysadmins to disable the feature. This is inspired by a similar restriction in Grsecurity, but adds a sysctl. Signed-off-by: Kees Cook <keescook@chromium.org> --- Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ kernel/sysctl.c | 14 ++++++++++++++ kernel/user_namespace.c | 7 +++++++ 3 files changed, 38 insertions(+) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index bbfc5e339a3d..e9e8a4f949f5 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: - tainted - threads-max - unknown_nmi_panic +- userns_restrict - watchdog - watchdog_thresh - version @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. ============================================================== +userns_restrict: + +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER +has many unexpected side-effects and security exposures, this allows the +sysadmin to disable the feature without needing to rebuild the kernel. + +When userns_restrict is set to (0), the default, there are no restrictions. + +When userns_restrict is set to (1), CLONE_NEWUSER is only available to +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. + +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, +and the value is locked to "2" for the duration of the boot. + +============================================================== + watchdog: This parameter can be used to disable or enable the soft lockup detector diff --git a/kernel/sysctl.c b/kernel/sysctl.c index fc8899dd636d..ceb8b107fe28 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; #ifndef CONFIG_MMU extern int sysctl_nr_trim_pages; #endif +#ifdef CONFIG_USER_NS +extern int sysctl_userns_restrict; +#endif /* Constants used for minimum and maximum */ #ifdef CONFIG_LOCKUP_DETECTOR @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { .extra2 = &two, }, #endif +#ifdef CONFIG_USER_NS + { + .procname = "userns_restrict", + .data = &sysctl_userns_restrict, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, + .extra1 = &zero, + .extra2 = &two, + }, +#endif { .procname = "ngroups_max", .data = &ngroups_max, diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 9bafc211930c..38395f9625ff 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -25,6 +25,7 @@ static struct kmem_cache *user_ns_cachep __read_mostly; static DEFINE_MUTEX(userns_state_mutex); +int sysctl_userns_restrict __read_mostly; static bool new_idmap_permitted(const struct file *file, struct user_namespace *ns, int cap_setid, @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) !kgid_has_mapping(parent_ns, group)) return -EPERM; + if (sysctl_userns_restrict == 2 || + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || + !capable(CAP_SETUID) || + !capable(CAP_SETGID)))) + return -EPERM; + ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); if (!ns) return -ENOMEM; -- 2.6.3 ^ permalink raw reply related [flat|nested] 80+ messages in thread
* [kernel-hardening] [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:39 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:39 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening There continues to be many CONFIG_USER_NS related security exposures. For admins running distro kernels with CONFIG_USER_NS, there is no way to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, this provides a way for sysadmins to disable the feature. This is inspired by a similar restriction in Grsecurity, but adds a sysctl. Signed-off-by: Kees Cook <keescook@chromium.org> --- Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ kernel/sysctl.c | 14 ++++++++++++++ kernel/user_namespace.c | 7 +++++++ 3 files changed, 38 insertions(+) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index bbfc5e339a3d..e9e8a4f949f5 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: - tainted - threads-max - unknown_nmi_panic +- userns_restrict - watchdog - watchdog_thresh - version @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. ============================================================== +userns_restrict: + +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER +has many unexpected side-effects and security exposures, this allows the +sysadmin to disable the feature without needing to rebuild the kernel. + +When userns_restrict is set to (0), the default, there are no restrictions. + +When userns_restrict is set to (1), CLONE_NEWUSER is only available to +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. + +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, +and the value is locked to "2" for the duration of the boot. + +============================================================== + watchdog: This parameter can be used to disable or enable the soft lockup detector diff --git a/kernel/sysctl.c b/kernel/sysctl.c index fc8899dd636d..ceb8b107fe28 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; #ifndef CONFIG_MMU extern int sysctl_nr_trim_pages; #endif +#ifdef CONFIG_USER_NS +extern int sysctl_userns_restrict; +#endif /* Constants used for minimum and maximum */ #ifdef CONFIG_LOCKUP_DETECTOR @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { .extra2 = &two, }, #endif +#ifdef CONFIG_USER_NS + { + .procname = "userns_restrict", + .data = &sysctl_userns_restrict, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax_cap_sysadmin, + .extra1 = &zero, + .extra2 = &two, + }, +#endif { .procname = "ngroups_max", .data = &ngroups_max, diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 9bafc211930c..38395f9625ff 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -25,6 +25,7 @@ static struct kmem_cache *user_ns_cachep __read_mostly; static DEFINE_MUTEX(userns_state_mutex); +int sysctl_userns_restrict __read_mostly; static bool new_idmap_permitted(const struct file *file, struct user_namespace *ns, int cap_setid, @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) !kgid_has_mapping(parent_ns, group)) return -EPERM; + if (sysctl_userns_restrict == 2 || + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || + !capable(CAP_SETUID) || + !capable(CAP_SETGID)))) + return -EPERM; + ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); if (!ns) return -ENOMEM; -- 2.6.3 ^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-22 22:47 ` Robert Święcki -1 siblings, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-22 22:47 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Seems that Debian and some older Ubuntu versions are already using $ sysctl -a | grep usern kernel.unprivileged_userns_clone = 0 Shall we be consistent wit it? 2016-01-22 23:39 GMT+01:00 Kees Cook <keescook@chromium.org>: > There continues to be many CONFIG_USER_NS related security exposures. > For admins running distro kernels with CONFIG_USER_NS, there is no way > to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, > this provides a way for sysadmins to disable the feature. > > This is inspired by a similar restriction in Grsecurity, but adds > a sysctl. > > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ > kernel/sysctl.c | 14 ++++++++++++++ > kernel/user_namespace.c | 7 +++++++ > 3 files changed, 38 insertions(+) > > diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt > index bbfc5e339a3d..e9e8a4f949f5 100644 > --- a/Documentation/sysctl/kernel.txt > +++ b/Documentation/sysctl/kernel.txt > @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: > - tainted > - threads-max > - unknown_nmi_panic > +- userns_restrict > - watchdog > - watchdog_thresh > - version > @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. > > ============================================================== > > +userns_restrict: > + > +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER > +has many unexpected side-effects and security exposures, this allows the > +sysadmin to disable the feature without needing to rebuild the kernel. > + > +When userns_restrict is set to (0), the default, there are no restrictions. > + > +When userns_restrict is set to (1), CLONE_NEWUSER is only available to > +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. > + > +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, > +and the value is locked to "2" for the duration of the boot. > + > +============================================================== > + > watchdog: > > This parameter can be used to disable or enable the soft lockup detector > diff --git a/kernel/sysctl.c b/kernel/sysctl.c > index fc8899dd636d..ceb8b107fe28 100644 > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; > #ifndef CONFIG_MMU > extern int sysctl_nr_trim_pages; > #endif > +#ifdef CONFIG_USER_NS > +extern int sysctl_userns_restrict; > +#endif > > /* Constants used for minimum and maximum */ > #ifdef CONFIG_LOCKUP_DETECTOR > @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { > .extra2 = &two, > }, > #endif > +#ifdef CONFIG_USER_NS > + { > + .procname = "userns_restrict", > + .data = &sysctl_userns_restrict, > + .maxlen = sizeof(int), > + .mode = 0644, > + .proc_handler = proc_dointvec_minmax_cap_sysadmin, > + .extra1 = &zero, > + .extra2 = &two, > + }, > +#endif > { > .procname = "ngroups_max", > .data = &ngroups_max, > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c > index 9bafc211930c..38395f9625ff 100644 > --- a/kernel/user_namespace.c > +++ b/kernel/user_namespace.c > @@ -25,6 +25,7 @@ > > static struct kmem_cache *user_ns_cachep __read_mostly; > static DEFINE_MUTEX(userns_state_mutex); > +int sysctl_userns_restrict __read_mostly; > > static bool new_idmap_permitted(const struct file *file, > struct user_namespace *ns, int cap_setid, > @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) > !kgid_has_mapping(parent_ns, group)) > return -EPERM; > > + if (sysctl_userns_restrict == 2 || > + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || > + !capable(CAP_SETUID) || > + !capable(CAP_SETGID)))) > + return -EPERM; > + > ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); > if (!ns) > return -ENOMEM; > -- > 2.6.3 > -- Robert Święcki ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:47 ` Robert Święcki 0 siblings, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-22 22:47 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Seems that Debian and some older Ubuntu versions are already using $ sysctl -a | grep usern kernel.unprivileged_userns_clone = 0 Shall we be consistent wit it? 2016-01-22 23:39 GMT+01:00 Kees Cook <keescook@chromium.org>: > There continues to be many CONFIG_USER_NS related security exposures. > For admins running distro kernels with CONFIG_USER_NS, there is no way > to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, > this provides a way for sysadmins to disable the feature. > > This is inspired by a similar restriction in Grsecurity, but adds > a sysctl. > > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ > kernel/sysctl.c | 14 ++++++++++++++ > kernel/user_namespace.c | 7 +++++++ > 3 files changed, 38 insertions(+) > > diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt > index bbfc5e339a3d..e9e8a4f949f5 100644 > --- a/Documentation/sysctl/kernel.txt > +++ b/Documentation/sysctl/kernel.txt > @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: > - tainted > - threads-max > - unknown_nmi_panic > +- userns_restrict > - watchdog > - watchdog_thresh > - version > @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. > > ============================================================== > > +userns_restrict: > + > +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER > +has many unexpected side-effects and security exposures, this allows the > +sysadmin to disable the feature without needing to rebuild the kernel. > + > +When userns_restrict is set to (0), the default, there are no restrictions. > + > +When userns_restrict is set to (1), CLONE_NEWUSER is only available to > +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. > + > +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, > +and the value is locked to "2" for the duration of the boot. > + > +============================================================== > + > watchdog: > > This parameter can be used to disable or enable the soft lockup detector > diff --git a/kernel/sysctl.c b/kernel/sysctl.c > index fc8899dd636d..ceb8b107fe28 100644 > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; > #ifndef CONFIG_MMU > extern int sysctl_nr_trim_pages; > #endif > +#ifdef CONFIG_USER_NS > +extern int sysctl_userns_restrict; > +#endif > > /* Constants used for minimum and maximum */ > #ifdef CONFIG_LOCKUP_DETECTOR > @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { > .extra2 = &two, > }, > #endif > +#ifdef CONFIG_USER_NS > + { > + .procname = "userns_restrict", > + .data = &sysctl_userns_restrict, > + .maxlen = sizeof(int), > + .mode = 0644, > + .proc_handler = proc_dointvec_minmax_cap_sysadmin, > + .extra1 = &zero, > + .extra2 = &two, > + }, > +#endif > { > .procname = "ngroups_max", > .data = &ngroups_max, > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c > index 9bafc211930c..38395f9625ff 100644 > --- a/kernel/user_namespace.c > +++ b/kernel/user_namespace.c > @@ -25,6 +25,7 @@ > > static struct kmem_cache *user_ns_cachep __read_mostly; > static DEFINE_MUTEX(userns_state_mutex); > +int sysctl_userns_restrict __read_mostly; > > static bool new_idmap_permitted(const struct file *file, > struct user_namespace *ns, int cap_setid, > @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) > !kgid_has_mapping(parent_ns, group)) > return -EPERM; > > + if (sysctl_userns_restrict == 2 || > + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || > + !capable(CAP_SETUID) || > + !capable(CAP_SETGID)))) > + return -EPERM; > + > ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); > if (!ns) > return -ENOMEM; > -- > 2.6.3 > -- Robert Święcki ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:47 ` [kernel-hardening] " Robert Święcki @ 2016-01-22 22:50 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:50 UTC (permalink / raw) To: Robert Święcki Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 2:47 PM, Robert Święcki <robert@swiecki.net> wrote: > Seems that Debian and some older Ubuntu versions are already using > > $ sysctl -a | grep usern > kernel.unprivileged_userns_clone = 0 > > Shall we be consistent wit it? Oh! I didn't see that on systems I checked. On which version did you find that? I'd kind of like to keep the _restrict name, as that follows kptr_ and dmesg_... -Kees > > 2016-01-22 23:39 GMT+01:00 Kees Cook <keescook@chromium.org>: >> There continues to be many CONFIG_USER_NS related security exposures. >> For admins running distro kernels with CONFIG_USER_NS, there is no way >> to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, >> this provides a way for sysadmins to disable the feature. >> >> This is inspired by a similar restriction in Grsecurity, but adds >> a sysctl. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> --- >> Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ >> kernel/sysctl.c | 14 ++++++++++++++ >> kernel/user_namespace.c | 7 +++++++ >> 3 files changed, 38 insertions(+) >> >> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt >> index bbfc5e339a3d..e9e8a4f949f5 100644 >> --- a/Documentation/sysctl/kernel.txt >> +++ b/Documentation/sysctl/kernel.txt >> @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: >> - tainted >> - threads-max >> - unknown_nmi_panic >> +- userns_restrict >> - watchdog >> - watchdog_thresh >> - version >> @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. >> >> ============================================================== >> >> +userns_restrict: >> + >> +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER >> +has many unexpected side-effects and security exposures, this allows the >> +sysadmin to disable the feature without needing to rebuild the kernel. >> + >> +When userns_restrict is set to (0), the default, there are no restrictions. >> + >> +When userns_restrict is set to (1), CLONE_NEWUSER is only available to >> +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. >> + >> +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, >> +and the value is locked to "2" for the duration of the boot. >> + >> +============================================================== >> + >> watchdog: >> >> This parameter can be used to disable or enable the soft lockup detector >> diff --git a/kernel/sysctl.c b/kernel/sysctl.c >> index fc8899dd636d..ceb8b107fe28 100644 >> --- a/kernel/sysctl.c >> +++ b/kernel/sysctl.c >> @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; >> #ifndef CONFIG_MMU >> extern int sysctl_nr_trim_pages; >> #endif >> +#ifdef CONFIG_USER_NS >> +extern int sysctl_userns_restrict; >> +#endif >> >> /* Constants used for minimum and maximum */ >> #ifdef CONFIG_LOCKUP_DETECTOR >> @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { >> .extra2 = &two, >> }, >> #endif >> +#ifdef CONFIG_USER_NS >> + { >> + .procname = "userns_restrict", >> + .data = &sysctl_userns_restrict, >> + .maxlen = sizeof(int), >> + .mode = 0644, >> + .proc_handler = proc_dointvec_minmax_cap_sysadmin, >> + .extra1 = &zero, >> + .extra2 = &two, >> + }, >> +#endif >> { >> .procname = "ngroups_max", >> .data = &ngroups_max, >> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c >> index 9bafc211930c..38395f9625ff 100644 >> --- a/kernel/user_namespace.c >> +++ b/kernel/user_namespace.c >> @@ -25,6 +25,7 @@ >> >> static struct kmem_cache *user_ns_cachep __read_mostly; >> static DEFINE_MUTEX(userns_state_mutex); >> +int sysctl_userns_restrict __read_mostly; >> >> static bool new_idmap_permitted(const struct file *file, >> struct user_namespace *ns, int cap_setid, >> @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) >> !kgid_has_mapping(parent_ns, group)) >> return -EPERM; >> >> + if (sysctl_userns_restrict == 2 || >> + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || >> + !capable(CAP_SETUID) || >> + !capable(CAP_SETGID)))) >> + return -EPERM; >> + >> ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); >> if (!ns) >> return -ENOMEM; >> -- >> 2.6.3 >> > > > > -- > Robert Święcki -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:50 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 22:50 UTC (permalink / raw) To: Robert Święcki Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 2:47 PM, Robert Święcki <robert@swiecki.net> wrote: > Seems that Debian and some older Ubuntu versions are already using > > $ sysctl -a | grep usern > kernel.unprivileged_userns_clone = 0 > > Shall we be consistent wit it? Oh! I didn't see that on systems I checked. On which version did you find that? I'd kind of like to keep the _restrict name, as that follows kptr_ and dmesg_... -Kees > > 2016-01-22 23:39 GMT+01:00 Kees Cook <keescook@chromium.org>: >> There continues to be many CONFIG_USER_NS related security exposures. >> For admins running distro kernels with CONFIG_USER_NS, there is no way >> to disable CLONE_NEWUSER. As many systems do not need CLONE_NEWUSER, >> this provides a way for sysadmins to disable the feature. >> >> This is inspired by a similar restriction in Grsecurity, but adds >> a sysctl. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> --- >> Documentation/sysctl/kernel.txt | 17 +++++++++++++++++ >> kernel/sysctl.c | 14 ++++++++++++++ >> kernel/user_namespace.c | 7 +++++++ >> 3 files changed, 38 insertions(+) >> >> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt >> index bbfc5e339a3d..e9e8a4f949f5 100644 >> --- a/Documentation/sysctl/kernel.txt >> +++ b/Documentation/sysctl/kernel.txt >> @@ -85,6 +85,7 @@ show up in /proc/sys/kernel: >> - tainted >> - threads-max >> - unknown_nmi_panic >> +- userns_restrict >> - watchdog >> - watchdog_thresh >> - version >> @@ -933,6 +934,22 @@ example. If a system hangs up, try pressing the NMI switch. >> >> ============================================================== >> >> +userns_restrict: >> + >> +This toggle indicates whether CLONE_NEWUSER is available. As CLONE_NEWUSER >> +has many unexpected side-effects and security exposures, this allows the >> +sysadmin to disable the feature without needing to rebuild the kernel. >> + >> +When userns_restrict is set to (0), the default, there are no restrictions. >> + >> +When userns_restrict is set to (1), CLONE_NEWUSER is only available to >> +processes that have CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. >> + >> +When userns_restrict is set to (2), CLONE_NEWUSER is not available at all, >> +and the value is locked to "2" for the duration of the boot. >> + >> +============================================================== >> + >> watchdog: >> >> This parameter can be used to disable or enable the soft lockup detector >> diff --git a/kernel/sysctl.c b/kernel/sysctl.c >> index fc8899dd636d..ceb8b107fe28 100644 >> --- a/kernel/sysctl.c >> +++ b/kernel/sysctl.c >> @@ -112,6 +112,9 @@ extern int sysctl_nr_open_min, sysctl_nr_open_max; >> #ifndef CONFIG_MMU >> extern int sysctl_nr_trim_pages; >> #endif >> +#ifdef CONFIG_USER_NS >> +extern int sysctl_userns_restrict; >> +#endif >> >> /* Constants used for minimum and maximum */ >> #ifdef CONFIG_LOCKUP_DETECTOR >> @@ -812,6 +815,17 @@ static struct ctl_table kern_table[] = { >> .extra2 = &two, >> }, >> #endif >> +#ifdef CONFIG_USER_NS >> + { >> + .procname = "userns_restrict", >> + .data = &sysctl_userns_restrict, >> + .maxlen = sizeof(int), >> + .mode = 0644, >> + .proc_handler = proc_dointvec_minmax_cap_sysadmin, >> + .extra1 = &zero, >> + .extra2 = &two, >> + }, >> +#endif >> { >> .procname = "ngroups_max", >> .data = &ngroups_max, >> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c >> index 9bafc211930c..38395f9625ff 100644 >> --- a/kernel/user_namespace.c >> +++ b/kernel/user_namespace.c >> @@ -25,6 +25,7 @@ >> >> static struct kmem_cache *user_ns_cachep __read_mostly; >> static DEFINE_MUTEX(userns_state_mutex); >> +int sysctl_userns_restrict __read_mostly; >> >> static bool new_idmap_permitted(const struct file *file, >> struct user_namespace *ns, int cap_setid, >> @@ -84,6 +85,12 @@ int create_user_ns(struct cred *new) >> !kgid_has_mapping(parent_ns, group)) >> return -EPERM; >> >> + if (sysctl_userns_restrict == 2 || >> + (sysctl_userns_restrict == 1 && (!capable(CAP_SYS_ADMIN) || >> + !capable(CAP_SETUID) || >> + !capable(CAP_SETGID)))) >> + return -EPERM; >> + >> ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL); >> if (!ns) >> return -ENOMEM; >> -- >> 2.6.3 >> > > > > -- > Robert Święcki -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:50 ` [kernel-hardening] " Kees Cook @ 2016-01-22 22:55 ` Robert Święcki -1 siblings, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-22 22:55 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >> Seems that Debian and some older Ubuntu versions are already using >> >> $ sysctl -a | grep usern >> kernel.unprivileged_userns_clone = 0 >> >> Shall we be consistent wit it? > > Oh! I didn't see that on systems I checked. On which version did you find that? $ uname -a Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 (2016-01-07) x86_64 GNU/Linux $ cat /etc/debian_version 8.2 IIRC some older kernels delivered with Ubuntu Precise were also using it (but maybe I'm mistaken) -- Robert Święcki ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:55 ` Robert Święcki 0 siblings, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-22 22:55 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >> Seems that Debian and some older Ubuntu versions are already using >> >> $ sysctl -a | grep usern >> kernel.unprivileged_userns_clone = 0 >> >> Shall we be consistent wit it? > > Oh! I didn't see that on systems I checked. On which version did you find that? $ uname -a Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 (2016-01-07) x86_64 GNU/Linux $ cat /etc/debian_version 8.2 IIRC some older kernels delivered with Ubuntu Precise were also using it (but maybe I'm mistaken) -- Robert Święcki ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:55 ` [kernel-hardening] " Robert Święcki @ 2016-01-22 23:00 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 23:00 UTC (permalink / raw) To: Robert Święcki, Serge E. Hallyn, Ben Hutchings Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > >>> Seems that Debian and some older Ubuntu versions are already using >>> >>> $ sysctl -a | grep usern >>> kernel.unprivileged_userns_clone = 0 >>> >>> Shall we be consistent wit it? >> >> Oh! I didn't see that on systems I checked. On which version did you find that? > > $ uname -a > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > (2016-01-07) x86_64 GNU/Linux > $ cat /etc/debian_version > 8.2 Ah-ha, Debian only, though it looks like this was just committed to the Ubuntu kernel tree too: > IIRC some older kernels delivered with Ubuntu Precise were also using > it (but maybe I'm mistaken) I don't see it there. I think my patch is more complete, but I'm happy to change the name if this sysctl has already started to enter the global consciousness. ;) Serge, Ben, what do you think? -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 23:00 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-22 23:00 UTC (permalink / raw) To: Robert Święcki, Serge E. Hallyn, Ben Hutchings Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > >>> Seems that Debian and some older Ubuntu versions are already using >>> >>> $ sysctl -a | grep usern >>> kernel.unprivileged_userns_clone = 0 >>> >>> Shall we be consistent wit it? >> >> Oh! I didn't see that on systems I checked. On which version did you find that? > > $ uname -a > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > (2016-01-07) x86_64 GNU/Linux > $ cat /etc/debian_version > 8.2 Ah-ha, Debian only, though it looks like this was just committed to the Ubuntu kernel tree too: > IIRC some older kernels delivered with Ubuntu Precise were also using > it (but maybe I'm mistaken) I don't see it there. I think my patch is more complete, but I'm happy to change the name if this sysctl has already started to enter the global consciousness. ;) Serge, Ben, what do you think? -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 23:00 ` [kernel-hardening] " Kees Cook @ 2016-01-23 0:44 ` Serge Hallyn -1 siblings, 0 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-23 0:44 UTC (permalink / raw) To: Kees Cook Cc: Robert Święcki, Ben Hutchings, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Quoting Kees Cook (keescook@chromium.org): > On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > > > >>> Seems that Debian and some older Ubuntu versions are already using > >>> > >>> $ sysctl -a | grep usern > >>> kernel.unprivileged_userns_clone = 0 > >>> > >>> Shall we be consistent wit it? > >> > >> Oh! I didn't see that on systems I checked. On which version did you find that? > > > > $ uname -a > > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > > (2016-01-07) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 8.2 > > Ah-ha, Debian only, though it looks like this was just committed to > the Ubuntu kernel tree too: > > > > IIRC some older kernels delivered with Ubuntu Precise were also using > > it (but maybe I'm mistaken) > > I don't see it there. > > I think my patch is more complete, but I'm happy to change the name if > this sysctl has already started to enter the global consciousness. ;) > > Serge, Ben, what do you think? > > -Kees Hey, I had originally written this for Ubuntu when userns was still new and not upstream. Then we dropped it when it got upstream. The reason we are re-adding it is because we're going to be pushing the envelop again wrt unprivileged userns usage. Seth has been working on supporting mounts of fuse, for instance. When everything is upstream, (or we drop it :) we'll drop the patch again. -serge ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-23 0:44 ` Serge Hallyn 0 siblings, 0 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-23 0:44 UTC (permalink / raw) To: Kees Cook Cc: Robert Święcki, Ben Hutchings, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Quoting Kees Cook (keescook@chromium.org): > On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > > > >>> Seems that Debian and some older Ubuntu versions are already using > >>> > >>> $ sysctl -a | grep usern > >>> kernel.unprivileged_userns_clone = 0 > >>> > >>> Shall we be consistent wit it? > >> > >> Oh! I didn't see that on systems I checked. On which version did you find that? > > > > $ uname -a > > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > > (2016-01-07) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 8.2 > > Ah-ha, Debian only, though it looks like this was just committed to > the Ubuntu kernel tree too: > > > > IIRC some older kernels delivered with Ubuntu Precise were also using > > it (but maybe I'm mistaken) > > I don't see it there. > > I think my patch is more complete, but I'm happy to change the name if > this sysctl has already started to enter the global consciousness. ;) > > Serge, Ben, what do you think? > > -Kees Hey, I had originally written this for Ubuntu when userns was still new and not upstream. Then we dropped it when it got upstream. The reason we are re-adding it is because we're going to be pushing the envelop again wrt unprivileged userns usage. Seth has been working on supporting mounts of fuse, for instance. When everything is upstream, (or we drop it :) we'll drop the patch again. -serge ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 23:00 ` [kernel-hardening] " Kees Cook @ 2016-01-23 0:44 ` Serge Hallyn -1 siblings, 0 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-23 0:44 UTC (permalink / raw) To: Kees Cook Cc: Robert Święcki, Ben Hutchings, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Quoting Kees Cook (keescook@chromium.org): > On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > > > >>> Seems that Debian and some older Ubuntu versions are already using > >>> > >>> $ sysctl -a | grep usern > >>> kernel.unprivileged_userns_clone = 0 > >>> > >>> Shall we be consistent wit it? > >> > >> Oh! I didn't see that on systems I checked. On which version did you find that? > > > > $ uname -a > > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > > (2016-01-07) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 8.2 > > Ah-ha, Debian only, though it looks like this was just committed to > the Ubuntu kernel tree too: > > > > IIRC some older kernels delivered with Ubuntu Precise were also using > > it (but maybe I'm mistaken) > > I don't see it there. > > I think my patch is more complete, but I'm happy to change the name if > this sysctl has already started to enter the global consciousness. ;) > > Serge, Ben, what do you think? Oh, sorry - as for the name of it, what is the alternative you are proposing? ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-23 0:44 ` Serge Hallyn 0 siblings, 0 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-23 0:44 UTC (permalink / raw) To: Kees Cook Cc: Robert Święcki, Ben Hutchings, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening Quoting Kees Cook (keescook@chromium.org): > On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > > > >>> Seems that Debian and some older Ubuntu versions are already using > >>> > >>> $ sysctl -a | grep usern > >>> kernel.unprivileged_userns_clone = 0 > >>> > >>> Shall we be consistent wit it? > >> > >> Oh! I didn't see that on systems I checked. On which version did you find that? > > > > $ uname -a > > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > > (2016-01-07) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 8.2 > > Ah-ha, Debian only, though it looks like this was just committed to > the Ubuntu kernel tree too: > > > > IIRC some older kernels delivered with Ubuntu Precise were also using > > it (but maybe I'm mistaken) > > I don't see it there. > > I think my patch is more complete, but I'm happy to change the name if > this sysctl has already started to enter the global consciousness. ;) > > Serge, Ben, what do you think? Oh, sorry - as for the name of it, what is the alternative you are proposing? ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 23:00 ` [kernel-hardening] " Kees Cook ` (2 preceding siblings ...) (?) @ 2016-01-23 0:59 ` Ben Hutchings 2016-01-24 20:59 ` Kees Cook -1 siblings, 1 reply; 80+ messages in thread From: Ben Hutchings @ 2016-01-23 0:59 UTC (permalink / raw) To: kernel-hardening, Robert Święcki, Serge E. Hallyn Cc: Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML [-- Attachment #1: Type: text/plain, Size: 1639 bytes --] On Fri, 2016-01-22 at 15:00 -0800, Kees Cook wrote: > On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: > > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: > > > > > > Seems that Debian and some older Ubuntu versions are already using > > > > > > > > $ sysctl -a | grep usern > > > > kernel.unprivileged_userns_clone = 0 > > > > > > > > Shall we be consistent wit it? > > > > > > Oh! I didn't see that on systems I checked. On which version did you find that? > > > > $ uname -a > > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 > > (2016-01-07) x86_64 GNU/Linux > > $ cat /etc/debian_version > > 8.2 > > Ah-ha, Debian only, though it looks like this was just committed to > the Ubuntu kernel tree too: > > > > IIRC some older kernels delivered with Ubuntu Precise were also using > > it (but maybe I'm mistaken) > > I don't see it there. > > I think my patch is more complete, but I'm happy to change the name if > this sysctl has already started to enter the global consciousness. ;) > > Serge, Ben, what do you think? I agree that using the '_restrict' suffix for new restrictions makes sense. I also don't think that a third possible value for kernel.unprivileged_userns_clone would would be understandable. I would probably make kernel.unprivileged_userns_clone a wrapper for kernel.userns_restrict in Debian, then deprecate and eventually remove it. Ben. -- Ben Hutchings Life is what happens to you while you're busy making other plans. - John Lennon [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 811 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-23 0:59 ` Ben Hutchings @ 2016-01-24 20:59 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-24 20:59 UTC (permalink / raw) To: kernel-hardening Cc: Robert Święcki, Serge E. Hallyn, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML On Fri, Jan 22, 2016 at 4:59 PM, Ben Hutchings <ben@decadent.org.uk> wrote: > On Fri, 2016-01-22 at 15:00 -0800, Kees Cook wrote: >> On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: >> > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >> > >> > > > Seems that Debian and some older Ubuntu versions are already using >> > > > >> > > > $ sysctl -a | grep usern >> > > > kernel.unprivileged_userns_clone = 0 >> > > > >> > > > Shall we be consistent wit it? >> > > >> > > Oh! I didn't see that on systems I checked. On which version did you find that? >> > >> > $ uname -a >> > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 >> > (2016-01-07) x86_64 GNU/Linux >> > $ cat /etc/debian_version >> > 8.2 >> >> Ah-ha, Debian only, though it looks like this was just committed to >> the Ubuntu kernel tree too: >> >> >> > IIRC some older kernels delivered with Ubuntu Precise were also using >> > it (but maybe I'm mistaken) >> >> I don't see it there. >> >> I think my patch is more complete, but I'm happy to change the name if >> this sysctl has already started to enter the global consciousness. ;) >> >> Serge, Ben, what do you think? > > I agree that using the '_restrict' suffix for new restrictions makes > sense. I also don't think that a third possible value for > kernel.unprivileged_userns_clone would would be understandable. > > I would probably make kernel.unprivileged_userns_clone a wrapper for > kernel.userns_restrict in Debian, then deprecate and eventually remove > it. Okay, cool. We'll keep my patch as-is then. Thanks! -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-24 20:59 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-24 20:59 UTC (permalink / raw) To: kernel-hardening Cc: Robert Święcki, Serge E. Hallyn, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Andy Lutomirski, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML On Fri, Jan 22, 2016 at 4:59 PM, Ben Hutchings <ben@decadent.org.uk> wrote: > On Fri, 2016-01-22 at 15:00 -0800, Kees Cook wrote: >> On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: >> > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >> > >> > > > Seems that Debian and some older Ubuntu versions are already using >> > > > >> > > > $ sysctl -a | grep usern >> > > > kernel.unprivileged_userns_clone = 0 >> > > > >> > > > Shall we be consistent wit it? >> > > >> > > Oh! I didn't see that on systems I checked. On which version did you find that? >> > >> > $ uname -a >> > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 >> > (2016-01-07) x86_64 GNU/Linux >> > $ cat /etc/debian_version >> > 8.2 >> >> Ah-ha, Debian only, though it looks like this was just committed to >> the Ubuntu kernel tree too: >> >> >> > IIRC some older kernels delivered with Ubuntu Precise were also using >> > it (but maybe I'm mistaken) >> >> I don't see it there. >> >> I think my patch is more complete, but I'm happy to change the name if >> this sysctl has already started to enter the global consciousness. ;) >> >> Serge, Ben, what do you think? > > I agree that using the '_restrict' suffix for new restrictions makes > sense. I also don't think that a third possible value for > kernel.unprivileged_userns_clone would would be understandable. > > I would probably make kernel.unprivileged_userns_clone a wrapper for > kernel.userns_restrict in Debian, then deprecate and eventually remove > it. Okay, cool. We'll keep my patch as-is then. Thanks! -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-24 20:59 ` Kees Cook (?) @ 2016-01-24 22:20 ` Andy Lutomirski 2016-01-25 18:51 ` Kees Cook -1 siblings, 1 reply; 80+ messages in thread From: Andy Lutomirski @ 2016-01-24 22:20 UTC (permalink / raw) To: Kees Cook Cc: kernel-hardening, Robert Święcki, Serge E. Hallyn, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML On Sun, Jan 24, 2016 at 12:59 PM, Kees Cook <keescook@chromium.org> wrote: > On Fri, Jan 22, 2016 at 4:59 PM, Ben Hutchings <ben@decadent.org.uk> wrote: >> On Fri, 2016-01-22 at 15:00 -0800, Kees Cook wrote: >>> On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: >>> > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >>> > >>> > > > Seems that Debian and some older Ubuntu versions are already using >>> > > > >>> > > > $ sysctl -a | grep usern >>> > > > kernel.unprivileged_userns_clone = 0 >>> > > > >>> > > > Shall we be consistent wit it? >>> > > >>> > > Oh! I didn't see that on systems I checked. On which version did you find that? >>> > >>> > $ uname -a >>> > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 >>> > (2016-01-07) x86_64 GNU/Linux >>> > $ cat /etc/debian_version >>> > 8.2 >>> >>> Ah-ha, Debian only, though it looks like this was just committed to >>> the Ubuntu kernel tree too: >>> >>> >>> > IIRC some older kernels delivered with Ubuntu Precise were also using >>> > it (but maybe I'm mistaken) >>> >>> I don't see it there. >>> >>> I think my patch is more complete, but I'm happy to change the name if >>> this sysctl has already started to enter the global consciousness. ;) >>> >>> Serge, Ben, what do you think? >> >> I agree that using the '_restrict' suffix for new restrictions makes >> sense. I also don't think that a third possible value for >> kernel.unprivileged_userns_clone would would be understandable. >> >> I would probably make kernel.unprivileged_userns_clone a wrapper for >> kernel.userns_restrict in Debian, then deprecate and eventually remove >> it. > > Okay, cool. We'll keep my patch as-is then. Thanks! We still need to deal with the capable check in the write handler though, right? But I must be missing something: why is mode 0644 insufficient? --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-24 22:20 ` Andy Lutomirski @ 2016-01-25 18:51 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 18:51 UTC (permalink / raw) To: Andy Lutomirski Cc: kernel-hardening, Robert Święcki, Serge E. Hallyn, Andrew Morton, Al Viro, Richard Weinberger, Eric W. Biederman, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML On Sun, Jan 24, 2016 at 2:20 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Sun, Jan 24, 2016 at 12:59 PM, Kees Cook <keescook@chromium.org> wrote: >> On Fri, Jan 22, 2016 at 4:59 PM, Ben Hutchings <ben@decadent.org.uk> wrote: >>> On Fri, 2016-01-22 at 15:00 -0800, Kees Cook wrote: >>>> On Fri, Jan 22, 2016 at 2:55 PM, Robert Święcki <robert@swiecki.net> wrote: >>>> > 2016-01-22 23:50 GMT+01:00 Kees Cook <keescook@chromium.org>: >>>> > >>>> > > > Seems that Debian and some older Ubuntu versions are already using >>>> > > > >>>> > > > $ sysctl -a | grep usern >>>> > > > kernel.unprivileged_userns_clone = 0 >>>> > > > >>>> > > > Shall we be consistent wit it? >>>> > > >>>> > > Oh! I didn't see that on systems I checked. On which version did you find that? >>>> > >>>> > $ uname -a >>>> > Linux bc1 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 >>>> > (2016-01-07) x86_64 GNU/Linux >>>> > $ cat /etc/debian_version >>>> > 8.2 >>>> >>>> Ah-ha, Debian only, though it looks like this was just committed to >>>> the Ubuntu kernel tree too: >>>> >>>> >>>> > IIRC some older kernels delivered with Ubuntu Precise were also using >>>> > it (but maybe I'm mistaken) >>>> >>>> I don't see it there. >>>> >>>> I think my patch is more complete, but I'm happy to change the name if >>>> this sysctl has already started to enter the global consciousness. ;) >>>> >>>> Serge, Ben, what do you think? >>> >>> I agree that using the '_restrict' suffix for new restrictions makes >>> sense. I also don't think that a third possible value for >>> kernel.unprivileged_userns_clone would would be understandable. >>> >>> I would probably make kernel.unprivileged_userns_clone a wrapper for >>> kernel.userns_restrict in Debian, then deprecate and eventually remove >>> it. >> >> Okay, cool. We'll keep my patch as-is then. Thanks! > > We still need to deal with the capable check in the write handler though, right? > > But I must be missing something: why is mode 0644 insufficient? Yeah, separate issue. I think it's a corner case: a non-cap root user using a setuid tool to write to sysctls. It's worth solving, but I'd like to land the CLONE_NEWUSER sysctl first; it's much more urgent. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-22 22:49 ` Richard Weinberger -1 siblings, 0 replies; 80+ messages in thread From: Richard Weinberger @ 2016-01-22 22:49 UTC (permalink / raw) To: Kees Cook, Andrew Morton Cc: Al Viro, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Am 22.01.2016 um 23:39 schrieb Kees Cook: > There continues to be unexpected side-effects and security exposures > via CLONE_NEWUSER. For many end-users running distro kernels with > CONFIG_USER_NS enabled, there is no way to disable this feature when > desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so > admins not running containers or Chrome can avoid the risks of this > feature. Last time such a patch came up I was not thrilled because hiding a scary feature behind a knob IMHO doesn't make it any better nor helps finding issues. But as userns is still a source of a lot of issues and distros enable it by default a knob for the admin seems to be a good idea by now. ;-\ Thanks, //richard ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-22 22:49 ` Richard Weinberger 0 siblings, 0 replies; 80+ messages in thread From: Richard Weinberger @ 2016-01-22 22:49 UTC (permalink / raw) To: Kees Cook, Andrew Morton Cc: Al Viro, Eric W. Biederman, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Am 22.01.2016 um 23:39 schrieb Kees Cook: > There continues to be unexpected side-effects and security exposures > via CLONE_NEWUSER. For many end-users running distro kernels with > CONFIG_USER_NS enabled, there is no way to disable this feature when > desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so > admins not running containers or Chrome can avoid the risks of this > feature. Last time such a patch came up I was not thrilled because hiding a scary feature behind a knob IMHO doesn't make it any better nor helps finding issues. But as userns is still a source of a lot of issues and distros enable it by default a knob for the admin seems to be a good idea by now. ;-\ Thanks, //richard ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook @ 2016-01-23 3:02 ` Eric W. Biederman -1 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-23 3:02 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > There continues to be unexpected side-effects and security exposures > via CLONE_NEWUSER. For many end-users running distro kernels with > CONFIG_USER_NS enabled, there is no way to disable this feature when > desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so > admins not running containers or Chrome can avoid the risks of this > feature. I don't actually think there do continue to be unexpected side-effects and security exposures with CLONE_NEWUSER. It takes a while for all of the fixes to trickle out to distros. At most what I have seen recently are problems with other kernel interfaces being amplified with user namespaces. AKA the current mess with devpts, and the unexpected issues with bind mounts in mount namespaces. I have a couple of concerns with a sysctl. 1) As user namespaces settle out this sysctl has the potential to decrease the security of the system overall as sandboxing features of the kernel will not be available to unprivileged applications. Web browsing with chrome will be less safe for example. 2) I strongly suspect the granularity of a sysctl is wrong for access to user namespaces on a production system. In general I suspect what we want is something like seccomp. I believe all of the relevant bits are in registers. I actually thought that was enough for seccomp. Does seccomp not work for some reason? 3) A sysctl breeds a false sense of security in thinking that if a security issue is discovered you can just flip a switch, disable all new user namespaces and you won't be vulnerable. In fact most of the issues in the past have only required being in a user namespace to trigger. Which means any containers or user namespaces that already exist could be used to exploit any new found issue. Which means that a I don't think a sysctl will give the desired level of protection. In my analysis of the issues to date I don't know of anything short of a reboot that would meaninfully remove the threat. 4) With applications like docker coming on-line I don't think a restriction to processes with capabilities is actually meaninful for restricting access to user namespaces. So I have concerns about both efficacy and usability with the proposed sysctl. So to keep this productive. Please tell me about the threat model you envision, and how you envision knobs in the kernel being used to counter those threats. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-23 3:02 ` Eric W. Biederman 0 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-23 3:02 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > There continues to be unexpected side-effects and security exposures > via CLONE_NEWUSER. For many end-users running distro kernels with > CONFIG_USER_NS enabled, there is no way to disable this feature when > desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so > admins not running containers or Chrome can avoid the risks of this > feature. I don't actually think there do continue to be unexpected side-effects and security exposures with CLONE_NEWUSER. It takes a while for all of the fixes to trickle out to distros. At most what I have seen recently are problems with other kernel interfaces being amplified with user namespaces. AKA the current mess with devpts, and the unexpected issues with bind mounts in mount namespaces. I have a couple of concerns with a sysctl. 1) As user namespaces settle out this sysctl has the potential to decrease the security of the system overall as sandboxing features of the kernel will not be available to unprivileged applications. Web browsing with chrome will be less safe for example. 2) I strongly suspect the granularity of a sysctl is wrong for access to user namespaces on a production system. In general I suspect what we want is something like seccomp. I believe all of the relevant bits are in registers. I actually thought that was enough for seccomp. Does seccomp not work for some reason? 3) A sysctl breeds a false sense of security in thinking that if a security issue is discovered you can just flip a switch, disable all new user namespaces and you won't be vulnerable. In fact most of the issues in the past have only required being in a user namespace to trigger. Which means any containers or user namespaces that already exist could be used to exploit any new found issue. Which means that a I don't think a sysctl will give the desired level of protection. In my analysis of the issues to date I don't know of anything short of a reboot that would meaninfully remove the threat. 4) With applications like docker coming on-line I don't think a restriction to processes with capabilities is actually meaninful for restricting access to user namespaces. So I have concerns about both efficacy and usability with the proposed sysctl. So to keep this productive. Please tell me about the threat model you envision, and how you envision knobs in the kernel being used to counter those threats. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-23 3:02 ` [kernel-hardening] " Eric W. Biederman @ 2016-01-24 20:57 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-24 20:57 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> There continues to be unexpected side-effects and security exposures >> via CLONE_NEWUSER. For many end-users running distro kernels with >> CONFIG_USER_NS enabled, there is no way to disable this feature when >> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >> admins not running containers or Chrome can avoid the risks of this >> feature. > > I don't actually think there do continue to be unexpected side-effects > and security exposures with CLONE_NEWUSER. It takes a while for all of > the fixes to trickle out to distros. At most what I have seen recently > are problems with other kernel interfaces being amplified with user > namespaces. AKA the current mess with devpts, and the unexpected > issues with bind mounts in mount namespaces. Access to CLONE_NEWUSER has lead to a lot of security issues over the last 3 years. There has to be a way to avoid this for people that have no interest in containers. For admins running servers where there are no containers (which is still a giant number of systems -- containers are popular but not ubiquitous), the sysctl makes perfect sense. > I have a couple of concerns with a sysctl. > > 1) As user namespaces settle out this sysctl has the potential to > decrease the security of the system overall as sandboxing > features of the kernel will not be available to unprivileged > applications. > > Web browsing with chrome will be less safe for example. I don't propose this for Desktops. > 2) I strongly suspect the granularity of a sysctl is wrong for access to > user namespaces on a production system. > > In general I suspect what we want is something like seccomp. I > believe all of the relevant bits are in registers. I actually > thought that was enough for seccomp. Does seccomp not work for > some reason? Setting a global seccomp filter on init is not possible with any inits yet, and for some architectures it would push all processes onto the slow path. It's an extraordinarily big hammer for wanting to turn off a single area of the kernel with a long history of problems. Also, seccomp is arguably a program author's policy tool, not a system policy tool. We could offer this sysctl as an LSM too, but that's even messier. This is a trivial change to user namespaces and provides a large protection to people that aren't interested in the risks of running containers. > 3) A sysctl breeds a false sense of security in thinking that if a > security issue is discovered you can just flip a switch, disable > all new user namespaces and you won't be vulnerable. > > In fact most of the issues in the past have only required being in > a user namespace to trigger. Which means any containers or user > namespaces that already exist could be used to exploit any new > found issue. Which means that a I don't think a sysctl will give > the desired level of protection. > > In my analysis of the issues to date I don't know of anything > short of a reboot that would meaninfully remove the threat. Any admin that decides to just turn off CLONE_NEWUSER in the middle of still using it is insane. I don't think this breeds any false sense of security as most sysctls are set at boot time. > 4) With applications like docker coming on-line I don't think a > restriction to processes with capabilities is actually meaninful > for restricting access to user namespaces. Admins who are currently using containers are already exposed to so much attack surface. This is not for them, it's for people that don't use containers. > So I have concerns about both efficacy and usability with the proposed > sysctl. Two distros already have this sysctl because it was so strongly requested by their users. This needs to be upstream so we can manage the effects correctly. > So to keep this productive. Please tell me about the threat model > you envision, and how you envision knobs in the kernel being used to > counter those threats. The threat model I envision is post-intrusion escalation of privileges on systems that run distro kernels and do not use containers. I envision the sysctl being used at boot time to kill the entire class of current and future vulnerabilities exposed by CLONE_NEWUSER. Just like the sysctls used to turn off modules at boot or turn off kexec at boot. As Linux developers I feel we have an obligation to provide our end users with run-time choices (not just compile-time choices), since most of our users are using kernels built by someone else. Given the repeated problems with module auto-loading, we provided a way to disable module loading. Given the physical-memory-rewriting exposure of kexec, we provides a way to disable kexec. Given the conflict between hibernation and kASLR, we provided a way to choose one at runtime. Here, we're looking back on three years of vulnerabilities around CLONE_NEWUSER with no end in sight, and we have an obligation to help the end users that don't want to be exposed to this any more. Note I'm not suggesting we stop trying to fix the problems we find with user namespaces, but we need to provide a way to disable them. Having this sysctl is vastly superior to telling people how to rewrite their kernel memory at boot time to disable syscalls: https://outflux.net/blog/archives/2013/12/10/live-patching-the-kernel/ -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-24 20:57 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-24 20:57 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML, kernel-hardening On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> There continues to be unexpected side-effects and security exposures >> via CLONE_NEWUSER. For many end-users running distro kernels with >> CONFIG_USER_NS enabled, there is no way to disable this feature when >> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >> admins not running containers or Chrome can avoid the risks of this >> feature. > > I don't actually think there do continue to be unexpected side-effects > and security exposures with CLONE_NEWUSER. It takes a while for all of > the fixes to trickle out to distros. At most what I have seen recently > are problems with other kernel interfaces being amplified with user > namespaces. AKA the current mess with devpts, and the unexpected > issues with bind mounts in mount namespaces. Access to CLONE_NEWUSER has lead to a lot of security issues over the last 3 years. There has to be a way to avoid this for people that have no interest in containers. For admins running servers where there are no containers (which is still a giant number of systems -- containers are popular but not ubiquitous), the sysctl makes perfect sense. > I have a couple of concerns with a sysctl. > > 1) As user namespaces settle out this sysctl has the potential to > decrease the security of the system overall as sandboxing > features of the kernel will not be available to unprivileged > applications. > > Web browsing with chrome will be less safe for example. I don't propose this for Desktops. > 2) I strongly suspect the granularity of a sysctl is wrong for access to > user namespaces on a production system. > > In general I suspect what we want is something like seccomp. I > believe all of the relevant bits are in registers. I actually > thought that was enough for seccomp. Does seccomp not work for > some reason? Setting a global seccomp filter on init is not possible with any inits yet, and for some architectures it would push all processes onto the slow path. It's an extraordinarily big hammer for wanting to turn off a single area of the kernel with a long history of problems. Also, seccomp is arguably a program author's policy tool, not a system policy tool. We could offer this sysctl as an LSM too, but that's even messier. This is a trivial change to user namespaces and provides a large protection to people that aren't interested in the risks of running containers. > 3) A sysctl breeds a false sense of security in thinking that if a > security issue is discovered you can just flip a switch, disable > all new user namespaces and you won't be vulnerable. > > In fact most of the issues in the past have only required being in > a user namespace to trigger. Which means any containers or user > namespaces that already exist could be used to exploit any new > found issue. Which means that a I don't think a sysctl will give > the desired level of protection. > > In my analysis of the issues to date I don't know of anything > short of a reboot that would meaninfully remove the threat. Any admin that decides to just turn off CLONE_NEWUSER in the middle of still using it is insane. I don't think this breeds any false sense of security as most sysctls are set at boot time. > 4) With applications like docker coming on-line I don't think a > restriction to processes with capabilities is actually meaninful > for restricting access to user namespaces. Admins who are currently using containers are already exposed to so much attack surface. This is not for them, it's for people that don't use containers. > So I have concerns about both efficacy and usability with the proposed > sysctl. Two distros already have this sysctl because it was so strongly requested by their users. This needs to be upstream so we can manage the effects correctly. > So to keep this productive. Please tell me about the threat model > you envision, and how you envision knobs in the kernel being used to > counter those threats. The threat model I envision is post-intrusion escalation of privileges on systems that run distro kernels and do not use containers. I envision the sysctl being used at boot time to kill the entire class of current and future vulnerabilities exposed by CLONE_NEWUSER. Just like the sysctls used to turn off modules at boot or turn off kexec at boot. As Linux developers I feel we have an obligation to provide our end users with run-time choices (not just compile-time choices), since most of our users are using kernels built by someone else. Given the repeated problems with module auto-loading, we provided a way to disable module loading. Given the physical-memory-rewriting exposure of kexec, we provides a way to disable kexec. Given the conflict between hibernation and kASLR, we provided a way to choose one at runtime. Here, we're looking back on three years of vulnerabilities around CLONE_NEWUSER with no end in sight, and we have an obligation to help the end users that don't want to be exposed to this any more. Note I'm not suggesting we stop trying to fix the problems we find with user namespaces, but we need to provide a way to disable them. Having this sysctl is vastly superior to telling people how to rewrite their kernel memory at boot time to disable syscalls: https://outflux.net/blog/archives/2013/12/10/live-patching-the-kernel/ -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-24 20:57 ` [kernel-hardening] " Kees Cook (?) @ 2016-01-26 7:38 ` Serge Hallyn -1 siblings, 0 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-26 7:38 UTC (permalink / raw) To: kernel-hardening Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Andy Lutomirski, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, LKML Quoting Kees Cook (keescook@chromium.org): > On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman > > So I have concerns about both efficacy and usability with the proposed > > sysctl. > > Two distros already have this sysctl because it was so strongly > requested by their users. This needs to be upstream so we can manage > the effects correctly. Which two distros? Was it in fact requested by their users? My opinion remains that long-term this is a bad thing. If we're going to have this upstream, it should be clearly marked so as to be easily removable at some point down the road. Userspace that cannot count on a feature (in the best case) won't use it or (much worse) will fall back to broken behavior in one case or the other. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-23 3:02 ` [kernel-hardening] " Eric W. Biederman @ 2016-01-24 22:22 ` Andy Lutomirski -1 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-24 22:22 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> There continues to be unexpected side-effects and security exposures >> via CLONE_NEWUSER. For many end-users running distro kernels with >> CONFIG_USER_NS enabled, there is no way to disable this feature when >> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >> admins not running containers or Chrome can avoid the risks of this >> feature. > > I don't actually think there do continue to be unexpected side-effects > and security exposures with CLONE_NEWUSER. It takes a while for all of > the fixes to trickle out to distros. At most what I have seen recently > are problems with other kernel interfaces being amplified with user > namespaces. AKA the current mess with devpts, and the unexpected > issues with bind mounts in mount namespaces. > > > So to keep this productive. Please tell me about the threat model > you envision, and how you envision knobs in the kernel being used to > counter those threats. I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN over /any/ network namespace and to thus access the network configuration API to be a huge risk. For example, unprivileged users can program iptables. I'll eat my hat if there are no privilege escalations in there. (They can't request module loading, but still.) --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-24 22:22 ` Andy Lutomirski 0 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-24 22:22 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> There continues to be unexpected side-effects and security exposures >> via CLONE_NEWUSER. For many end-users running distro kernels with >> CONFIG_USER_NS enabled, there is no way to disable this feature when >> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >> admins not running containers or Chrome can avoid the risks of this >> feature. > > I don't actually think there do continue to be unexpected side-effects > and security exposures with CLONE_NEWUSER. It takes a while for all of > the fixes to trickle out to distros. At most what I have seen recently > are problems with other kernel interfaces being amplified with user > namespaces. AKA the current mess with devpts, and the unexpected > issues with bind mounts in mount namespaces. > > > So to keep this productive. Please tell me about the threat model > you envision, and how you envision knobs in the kernel being used to > counter those threats. I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN over /any/ network namespace and to thus access the network configuration API to be a huge risk. For example, unprivileged users can program iptables. I'll eat my hat if there are no privilege escalations in there. (They can't request module loading, but still.) --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-24 22:22 ` [kernel-hardening] " Andy Lutomirski @ 2016-01-25 18:51 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 18:51 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >> >>> There continues to be unexpected side-effects and security exposures >>> via CLONE_NEWUSER. For many end-users running distro kernels with >>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>> admins not running containers or Chrome can avoid the risks of this >>> feature. >> >> I don't actually think there do continue to be unexpected side-effects >> and security exposures with CLONE_NEWUSER. It takes a while for all of >> the fixes to trickle out to distros. At most what I have seen recently >> are problems with other kernel interfaces being amplified with user >> namespaces. AKA the current mess with devpts, and the unexpected >> issues with bind mounts in mount namespaces. >> > >> >> So to keep this productive. Please tell me about the threat model >> you envision, and how you envision knobs in the kernel being used to >> counter those threats. > > I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN > over /any/ network namespace and to thus access the network > configuration API to be a huge risk. For example, unprivileged users > can program iptables. I'll eat my hat if there are no privilege > escalations in there. (They can't request module loading, but still.) Should I consider this an Ack for the patch? :) -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 18:51 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 18:51 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >> >>> There continues to be unexpected side-effects and security exposures >>> via CLONE_NEWUSER. For many end-users running distro kernels with >>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>> admins not running containers or Chrome can avoid the risks of this >>> feature. >> >> I don't actually think there do continue to be unexpected side-effects >> and security exposures with CLONE_NEWUSER. It takes a while for all of >> the fixes to trickle out to distros. At most what I have seen recently >> are problems with other kernel interfaces being amplified with user >> namespaces. AKA the current mess with devpts, and the unexpected >> issues with bind mounts in mount namespaces. >> > >> >> So to keep this productive. Please tell me about the threat model >> you envision, and how you envision knobs in the kernel being used to >> counter those threats. > > I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN > over /any/ network namespace and to thus access the network > configuration API to be a huge risk. For example, unprivileged users > can program iptables. I'll eat my hat if there are no privilege > escalations in there. (They can't request module loading, but still.) Should I consider this an Ack for the patch? :) -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 18:51 ` [kernel-hardening] " Kees Cook @ 2016-01-25 18:53 ` Andy Lutomirski -1 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-25 18:53 UTC (permalink / raw) To: Kees Cook Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 10:51 AM, Kees Cook <keescook@chromium.org> wrote: > On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>> >>>> There continues to be unexpected side-effects and security exposures >>>> via CLONE_NEWUSER. For many end-users running distro kernels with >>>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>>> admins not running containers or Chrome can avoid the risks of this >>>> feature. >>> >>> I don't actually think there do continue to be unexpected side-effects >>> and security exposures with CLONE_NEWUSER. It takes a while for all of >>> the fixes to trickle out to distros. At most what I have seen recently >>> are problems with other kernel interfaces being amplified with user >>> namespaces. AKA the current mess with devpts, and the unexpected >>> issues with bind mounts in mount namespaces. >>> >> >>> >>> So to keep this productive. Please tell me about the threat model >>> you envision, and how you envision knobs in the kernel being used to >>> counter those threats. >> >> I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN >> over /any/ network namespace and to thus access the network >> configuration API to be a huge risk. For example, unprivileged users >> can program iptables. I'll eat my hat if there are no privilege >> escalations in there. (They can't request module loading, but still.) > > Should I consider this an Ack for the patch? :) Only if you explain why you need the CAP_SYS_ADMIN check. :) IOW, I think you could change that one line of code and have a less weird version of the patch that would work just fine. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 18:53 ` Andy Lutomirski 0 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-25 18:53 UTC (permalink / raw) To: Kees Cook Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 10:51 AM, Kees Cook <keescook@chromium.org> wrote: > On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>> >>>> There continues to be unexpected side-effects and security exposures >>>> via CLONE_NEWUSER. For many end-users running distro kernels with >>>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>>> admins not running containers or Chrome can avoid the risks of this >>>> feature. >>> >>> I don't actually think there do continue to be unexpected side-effects >>> and security exposures with CLONE_NEWUSER. It takes a while for all of >>> the fixes to trickle out to distros. At most what I have seen recently >>> are problems with other kernel interfaces being amplified with user >>> namespaces. AKA the current mess with devpts, and the unexpected >>> issues with bind mounts in mount namespaces. >>> >> >>> >>> So to keep this productive. Please tell me about the threat model >>> you envision, and how you envision knobs in the kernel being used to >>> counter those threats. >> >> I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN >> over /any/ network namespace and to thus access the network >> configuration API to be a huge risk. For example, unprivileged users >> can program iptables. I'll eat my hat if there are no privilege >> escalations in there. (They can't request module loading, but still.) > > Should I consider this an Ack for the patch? :) Only if you explain why you need the CAP_SYS_ADMIN check. :) IOW, I think you could change that one line of code and have a less weird version of the patch that would work just fine. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 18:53 ` [kernel-hardening] " Andy Lutomirski @ 2016-01-25 18:56 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 18:56 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 10:53 AM, Andy Lutomirski <luto@amacapital.net> wrote: > On Mon, Jan 25, 2016 at 10:51 AM, Kees Cook <keescook@chromium.org> wrote: >> On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>> On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman >>> <ebiederm@xmission.com> wrote: >>>> Kees Cook <keescook@chromium.org> writes: >>>> >>>>> There continues to be unexpected side-effects and security exposures >>>>> via CLONE_NEWUSER. For many end-users running distro kernels with >>>>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>>>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>>>> admins not running containers or Chrome can avoid the risks of this >>>>> feature. >>>> >>>> I don't actually think there do continue to be unexpected side-effects >>>> and security exposures with CLONE_NEWUSER. It takes a while for all of >>>> the fixes to trickle out to distros. At most what I have seen recently >>>> are problems with other kernel interfaces being amplified with user >>>> namespaces. AKA the current mess with devpts, and the unexpected >>>> issues with bind mounts in mount namespaces. >>>> >>> >>>> >>>> So to keep this productive. Please tell me about the threat model >>>> you envision, and how you envision knobs in the kernel being used to >>>> counter those threats. >>> >>> I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN >>> over /any/ network namespace and to thus access the network >>> configuration API to be a huge risk. For example, unprivileged users >>> can program iptables. I'll eat my hat if there are no privilege >>> escalations in there. (They can't request module loading, but still.) >> >> Should I consider this an Ack for the patch? :) > > Only if you explain why you need the CAP_SYS_ADMIN check. :) Hm? In the sysctl write? Because otherwise a non-cap root user could turn "1" to "0". The restriction on CLONE_NEWUSER checks caps, not uid, so the uid must be protected by cap checks. The DAC permissions on sysctls for cap-based restrictions make no sense -- they need to be doing cap checks not DAC checks. It's the same logic for why dmesg_restrict and kptr_restrict use the same cap check. > IOW, I think you could change that one line of code and have a less > weird version of the patch that would work just fine. Well, I don't know about less weird, but it would leave a unneeded hole in the permission checks. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 18:56 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 18:56 UTC (permalink / raw) To: Andy Lutomirski Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 10:53 AM, Andy Lutomirski <luto@amacapital.net> wrote: > On Mon, Jan 25, 2016 at 10:51 AM, Kees Cook <keescook@chromium.org> wrote: >> On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>> On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman >>> <ebiederm@xmission.com> wrote: >>>> Kees Cook <keescook@chromium.org> writes: >>>> >>>>> There continues to be unexpected side-effects and security exposures >>>>> via CLONE_NEWUSER. For many end-users running distro kernels with >>>>> CONFIG_USER_NS enabled, there is no way to disable this feature when >>>>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so >>>>> admins not running containers or Chrome can avoid the risks of this >>>>> feature. >>>> >>>> I don't actually think there do continue to be unexpected side-effects >>>> and security exposures with CLONE_NEWUSER. It takes a while for all of >>>> the fixes to trickle out to distros. At most what I have seen recently >>>> are problems with other kernel interfaces being amplified with user >>>> namespaces. AKA the current mess with devpts, and the unexpected >>>> issues with bind mounts in mount namespaces. >>>> >>> >>>> >>>> So to keep this productive. Please tell me about the threat model >>>> you envision, and how you envision knobs in the kernel being used to >>>> counter those threats. >>> >>> I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN >>> over /any/ network namespace and to thus access the network >>> configuration API to be a huge risk. For example, unprivileged users >>> can program iptables. I'll eat my hat if there are no privilege >>> escalations in there. (They can't request module loading, but still.) >> >> Should I consider this an Ack for the patch? :) > > Only if you explain why you need the CAP_SYS_ADMIN check. :) Hm? In the sysctl write? Because otherwise a non-cap root user could turn "1" to "0". The restriction on CLONE_NEWUSER checks caps, not uid, so the uid must be protected by cap checks. The DAC permissions on sysctls for cap-based restrictions make no sense -- they need to be doing cap checks not DAC checks. It's the same logic for why dmesg_restrict and kptr_restrict use the same cap check. > IOW, I think you could change that one line of code and have a less > weird version of the patch that would work just fine. Well, I don't know about less weird, but it would leave a unneeded hole in the permission checks. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 18:56 ` [kernel-hardening] " Kees Cook @ 2016-01-25 19:33 ` Eric W. Biederman -1 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-25 19:33 UTC (permalink / raw) To: Kees Cook Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > > Well, I don't know about less weird, but it would leave a unneeded > hole in the permission checks. To be clear the current patch has my: Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> The code is buggy, and poorly thought through. Your lack of interest in fixing the bugs in your patch is distressing. So broken code, not willing to fix. No. We are not merging this sysctl. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 19:33 ` Eric W. Biederman 0 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-25 19:33 UTC (permalink / raw) To: Kees Cook Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > > Well, I don't know about less weird, but it would leave a unneeded > hole in the permission checks. To be clear the current patch has my: Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> The code is buggy, and poorly thought through. Your lack of interest in fixing the bugs in your patch is distressing. So broken code, not willing to fix. No. We are not merging this sysctl. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 19:33 ` [kernel-hardening] " Eric W. Biederman @ 2016-01-25 22:34 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 22:34 UTC (permalink / raw) To: Eric W. Biederman Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: >> >> Well, I don't know about less weird, but it would leave a unneeded >> hole in the permission checks. > > To be clear the current patch has my: > > Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > The code is buggy, and poorly thought through. Your lack of interest in > fixing the bugs in your patch is distressing. I'm not sure where you see me having a "lack of interest". The existing cap-checking sysctls have a corner-case bug, which is orthogonal to this change. > So broken code, not willing to fix. No. We are not merging this sysctl. I think you're jumping to conclusions. :) This feature is already implemented by two distros, and likely wanted by others. We cannot ignore that. The sysctl default doesn't change the existing behavior, so this doesn't get in your way at all. Can you please respond to my earlier email where I rebutted each of your arguments against it? Just saying "no" and putting words in my mouth isn't very productive. Andy, given your interest in this feature, and my explanation of the CAP_SYSADMIN check, what are your thoughts? -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 22:34 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-25 22:34 UTC (permalink / raw) To: Eric W. Biederman Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: >> >> Well, I don't know about less weird, but it would leave a unneeded >> hole in the permission checks. > > To be clear the current patch has my: > > Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > The code is buggy, and poorly thought through. Your lack of interest in > fixing the bugs in your patch is distressing. I'm not sure where you see me having a "lack of interest". The existing cap-checking sysctls have a corner-case bug, which is orthogonal to this change. > So broken code, not willing to fix. No. We are not merging this sysctl. I think you're jumping to conclusions. :) This feature is already implemented by two distros, and likely wanted by others. We cannot ignore that. The sysctl default doesn't change the existing behavior, so this doesn't get in your way at all. Can you please respond to my earlier email where I rebutted each of your arguments against it? Just saying "no" and putting words in my mouth isn't very productive. Andy, given your interest in this feature, and my explanation of the CAP_SYSADMIN check, what are your thoughts? -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 22:34 ` [kernel-hardening] " Kees Cook @ 2016-01-25 23:33 ` Andy Lutomirski -1 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-25 23:33 UTC (permalink / raw) To: Kees Cook Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 2:34 PM, Kees Cook <keescook@chromium.org> wrote: > On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >>> >>> Well, I don't know about less weird, but it would leave a unneeded >>> hole in the permission checks. >> >> To be clear the current patch has my: >> >> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >> The code is buggy, and poorly thought through. Your lack of interest in >> fixing the bugs in your patch is distressing. > > I'm not sure where you see me having a "lack of interest". The > existing cap-checking sysctls have a corner-case bug, which is > orthogonal to this change. > >> So broken code, not willing to fix. No. We are not merging this sysctl. > > I think you're jumping to conclusions. :) > > This feature is already implemented by two distros, and likely wanted > by others. We cannot ignore that. The sysctl default doesn't change > the existing behavior, so this doesn't get in your way at all. Can you > please respond to my earlier email where I rebutted each of your > arguments against it? Just saying "no" and putting words in my mouth > isn't very productive. > > Andy, given your interest in this feature, and my explanation of the > CAP_SYSADMIN check, what are your thoughts? > I think the sysctl sucks, but I don't have a better idea, so I think it should go in. There's clearly demand. A better change would be to have an option to tighten up what namespaces can be manipulated in which manner. If anyone wants to step up and do that, it sounds useful. Denying CAP_NET_ADMIN in an unprivileged user ns would address one piece of attack surface. There are probably others. *However*, I think that trying to protect against a hypothetical attacker with uid == global root who has procfs access but doesn't have CAP_SYS_ADMIN isn't important enough to be worth introducing yet another bad capable() call. Whoever wants to tilt at the windmill of fixng sysctl permissions can address all of them, and then maybe this sysctl would be worth tightening up. --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-25 23:33 ` Andy Lutomirski 0 siblings, 0 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-25 23:33 UTC (permalink / raw) To: Kees Cook Cc: Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 2:34 PM, Kees Cook <keescook@chromium.org> wrote: > On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >>> >>> Well, I don't know about less weird, but it would leave a unneeded >>> hole in the permission checks. >> >> To be clear the current patch has my: >> >> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >> The code is buggy, and poorly thought through. Your lack of interest in >> fixing the bugs in your patch is distressing. > > I'm not sure where you see me having a "lack of interest". The > existing cap-checking sysctls have a corner-case bug, which is > orthogonal to this change. > >> So broken code, not willing to fix. No. We are not merging this sysctl. > > I think you're jumping to conclusions. :) > > This feature is already implemented by two distros, and likely wanted > by others. We cannot ignore that. The sysctl default doesn't change > the existing behavior, so this doesn't get in your way at all. Can you > please respond to my earlier email where I rebutted each of your > arguments against it? Just saying "no" and putting words in my mouth > isn't very productive. > > Andy, given your interest in this feature, and my explanation of the > CAP_SYSADMIN check, what are your thoughts? > I think the sysctl sucks, but I don't have a better idea, so I think it should go in. There's clearly demand. A better change would be to have an option to tighten up what namespaces can be manipulated in which manner. If anyone wants to step up and do that, it sounds useful. Denying CAP_NET_ADMIN in an unprivileged user ns would address one piece of attack surface. There are probably others. *However*, I think that trying to protect against a hypothetical attacker with uid == global root who has procfs access but doesn't have CAP_SYS_ADMIN isn't important enough to be worth introducing yet another bad capable() call. Whoever wants to tilt at the windmill of fixng sysctl permissions can address all of them, and then maybe this sysctl would be worth tightening up. --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 22:34 ` [kernel-hardening] " Kees Cook (?) (?) @ 2016-01-26 2:27 ` Daniel Micay -1 siblings, 0 replies; 80+ messages in thread From: Daniel Micay @ 2016-01-26 2:27 UTC (permalink / raw) To: kernel-hardening, Eric W. Biederman Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel [-- Attachment #1: Type: text/plain, Size: 418 bytes --] > This feature is already implemented by two distros, and likely wanted > by others. We cannot ignore that. Date point: Arch Linux won't be enabling CONFIG_USERNS until there's a way to disable unprivileged user namespaces. The kernel maintainers are unwilling to carry long-term out-of-tree patches. https://github.com/sandstorm-io/sandstorm/blob/d270755b1b55e5be6c96df2cce7c914f35f0d2a2/install.sh#L464-L474 [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-25 22:34 ` [kernel-hardening] " Kees Cook @ 2016-01-26 4:57 ` Eric W. Biederman -1 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-26 4:57 UTC (permalink / raw) To: Kees Cook Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >>> >>> Well, I don't know about less weird, but it would leave a unneeded >>> hole in the permission checks. >> >> To be clear the current patch has my: >> >> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >> The code is buggy, and poorly thought through. Your lack of interest in >> fixing the bugs in your patch is distressing. > > I'm not sure where you see me having a "lack of interest". The > existing cap-checking sysctls have a corner-case bug, which is > orthogonal to this change. That certainly doesn't sound like you have any plans to change anything there. >> So broken code, not willing to fix. No. We are not merging this sysctl. > > I think you're jumping to conclusions. :) I think I am the maintainer. What you are proposing is very much something that is only of interst to people who are not using user namespaces. It is fatally flawed as a way to avoid new attack surfaces for people who don't care as the sysctl leaves user namespaces enabled by default. It is fatally flawed as remediation to recommend to people to change if a new user namespace related but is discovered. Any running process that happens to be created while user namespace creation was enabled will continue to exist. Effectively a reboot will be required as part of a mitigation. Many sysadmins will get that wrong. I can't possibly see your sysctl as proposed achieving it's goals. A person has to be entirely too aware of subtlety and nuance to use it effectively. > This feature is already implemented by two distros, and likely wanted > by others. We cannot ignore that. The sysctl default doesn't change > the existing behavior, so this doesn't get in your way at all. Can you > please respond to my earlier email where I rebutted each of your > arguments against it? Just saying "no" and putting words in my mouth > isn't very productive. Calling people who make mistakes insane is not a rebuttal. In security usability matters, and your sysctl has low usability. Further you seem to have missed something crucial in your understanding. As was explained earlier the sysctl was added to ubuntu to allow early adopters to experiment not as a long term way of managing user namespaces. What sounds like a generally useful feature that would cover your use case and many others is a per user limit on the number of user namespaces users may create. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-26 4:57 ` Eric W. Biederman 0 siblings, 0 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-26 4:57 UTC (permalink / raw) To: Kees Cook Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening Kees Cook <keescook@chromium.org> writes: > On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >>> >>> Well, I don't know about less weird, but it would leave a unneeded >>> hole in the permission checks. >> >> To be clear the current patch has my: >> >> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >> The code is buggy, and poorly thought through. Your lack of interest in >> fixing the bugs in your patch is distressing. > > I'm not sure where you see me having a "lack of interest". The > existing cap-checking sysctls have a corner-case bug, which is > orthogonal to this change. That certainly doesn't sound like you have any plans to change anything there. >> So broken code, not willing to fix. No. We are not merging this sysctl. > > I think you're jumping to conclusions. :) I think I am the maintainer. What you are proposing is very much something that is only of interst to people who are not using user namespaces. It is fatally flawed as a way to avoid new attack surfaces for people who don't care as the sysctl leaves user namespaces enabled by default. It is fatally flawed as remediation to recommend to people to change if a new user namespace related but is discovered. Any running process that happens to be created while user namespace creation was enabled will continue to exist. Effectively a reboot will be required as part of a mitigation. Many sysadmins will get that wrong. I can't possibly see your sysctl as proposed achieving it's goals. A person has to be entirely too aware of subtlety and nuance to use it effectively. > This feature is already implemented by two distros, and likely wanted > by others. We cannot ignore that. The sysctl default doesn't change > the existing behavior, so this doesn't get in your way at all. Can you > please respond to my earlier email where I rebutted each of your > arguments against it? Just saying "no" and putting words in my mouth > isn't very productive. Calling people who make mistakes insane is not a rebuttal. In security usability matters, and your sysctl has low usability. Further you seem to have missed something crucial in your understanding. As was explained earlier the sysctl was added to ubuntu to allow early adopters to experiment not as a long term way of managing user namespaces. What sounds like a generally useful feature that would cover your use case and many others is a per user limit on the number of user namespaces users may create. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 4:57 ` [kernel-hardening] " Eric W. Biederman @ 2016-01-26 14:38 ` Josh Boyer -1 siblings, 0 replies; 80+ messages in thread From: Josh Boyer @ 2016-01-26 14:38 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>>> >>>> Well, I don't know about less weird, but it would leave a unneeded >>>> hole in the permission checks. >>> >>> To be clear the current patch has my: >>> >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>> >>> The code is buggy, and poorly thought through. Your lack of interest in >>> fixing the bugs in your patch is distressing. >> >> I'm not sure where you see me having a "lack of interest". The >> existing cap-checking sysctls have a corner-case bug, which is >> orthogonal to this change. > > That certainly doesn't sound like you have any plans to change anything > there. > >>> So broken code, not willing to fix. No. We are not merging this sysctl. >> >> I think you're jumping to conclusions. :) > > I think I am the maintainer. > > What you are proposing is very much something that is only of interst to > people who are not using user namespaces. It is fatally flawed as > a way to avoid new attack surfaces for people who don't care as the > sysctl leaves user namespaces enabled by default. It is fatally flawed > as remediation to recommend to people to change if a new user namespace > related but is discovered. Any running process that happens to be > created while user namespace creation was enabled will continue to > exist. Effectively a reboot will be required as part of a mitigation. > Many sysadmins will get that wrong. > > I can't possibly see your sysctl as proposed achieving it's goals. A > person has to be entirely too aware of subtlety and nuance to use it > effectively. What you're saying is true for the "oh crap" case of a new userns related CVE being found. However, there is the case where sysadmins know for a fact that a set of machines should not allow user namespaces to be enabled. Currently they have 2 choices, 1) use their distro kernel as-is, which may not meet their goal of having userns disabled, or 2) rebuild their kernel to disable it, which may invalidate any support contracts they have. I tend to agree with you on the lack of value around runtime mitigation, but allowing an admin to toggle this as a blatant on/off switch on reboot does have value. >> This feature is already implemented by two distros, and likely wanted >> by others. We cannot ignore that. The sysctl default doesn't change >> the existing behavior, so this doesn't get in your way at all. Can you >> please respond to my earlier email where I rebutted each of your >> arguments against it? Just saying "no" and putting words in my mouth >> isn't very productive. > > Calling people who make mistakes insane is not a rebuttal. In security > usability matters, and your sysctl has low usability. > > Further you seem to have missed something crucial in your understanding. > As was explained earlier the sysctl was added to ubuntu to allow early > adopters to experiment not as a long term way of managing user > namespaces. > > > What sounds like a generally useful feature that would cover your use > case and many others is a per user limit on the number of user > namespaces users may create. Where that number may be zero? I don't see how that is really any better than a sysctl. Could you elaborate? josh ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-26 14:38 ` Josh Boyer 0 siblings, 0 replies; 80+ messages in thread From: Josh Boyer @ 2016-01-26 14:38 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>>> >>>> Well, I don't know about less weird, but it would leave a unneeded >>>> hole in the permission checks. >>> >>> To be clear the current patch has my: >>> >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>> >>> The code is buggy, and poorly thought through. Your lack of interest in >>> fixing the bugs in your patch is distressing. >> >> I'm not sure where you see me having a "lack of interest". The >> existing cap-checking sysctls have a corner-case bug, which is >> orthogonal to this change. > > That certainly doesn't sound like you have any plans to change anything > there. > >>> So broken code, not willing to fix. No. We are not merging this sysctl. >> >> I think you're jumping to conclusions. :) > > I think I am the maintainer. > > What you are proposing is very much something that is only of interst to > people who are not using user namespaces. It is fatally flawed as > a way to avoid new attack surfaces for people who don't care as the > sysctl leaves user namespaces enabled by default. It is fatally flawed > as remediation to recommend to people to change if a new user namespace > related but is discovered. Any running process that happens to be > created while user namespace creation was enabled will continue to > exist. Effectively a reboot will be required as part of a mitigation. > Many sysadmins will get that wrong. > > I can't possibly see your sysctl as proposed achieving it's goals. A > person has to be entirely too aware of subtlety and nuance to use it > effectively. What you're saying is true for the "oh crap" case of a new userns related CVE being found. However, there is the case where sysadmins know for a fact that a set of machines should not allow user namespaces to be enabled. Currently they have 2 choices, 1) use their distro kernel as-is, which may not meet their goal of having userns disabled, or 2) rebuild their kernel to disable it, which may invalidate any support contracts they have. I tend to agree with you on the lack of value around runtime mitigation, but allowing an admin to toggle this as a blatant on/off switch on reboot does have value. >> This feature is already implemented by two distros, and likely wanted >> by others. We cannot ignore that. The sysctl default doesn't change >> the existing behavior, so this doesn't get in your way at all. Can you >> please respond to my earlier email where I rebutted each of your >> arguments against it? Just saying "no" and putting words in my mouth >> isn't very productive. > > Calling people who make mistakes insane is not a rebuttal. In security > usability matters, and your sysctl has low usability. > > Further you seem to have missed something crucial in your understanding. > As was explained earlier the sysctl was added to ubuntu to allow early > adopters to experiment not as a long term way of managing user > namespaces. > > > What sounds like a generally useful feature that would cover your use > case and many others is a per user limit on the number of user > namespaces users may create. Where that number may be zero? I don't see how that is really any better than a sysctl. Could you elaborate? josh ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 14:38 ` [kernel-hardening] " Josh Boyer @ 2016-01-26 14:46 ` Austin S. Hemmelgarn -1 siblings, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-26 14:46 UTC (permalink / raw) To: Josh Boyer, Eric W. Biederman Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On 2016-01-26 09:38, Josh Boyer wrote: > On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >> >>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>> <ebiederm@xmission.com> wrote: >>>> Kees Cook <keescook@chromium.org> writes: >>>>> >>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>> hole in the permission checks. >>>> >>>> To be clear the current patch has my: >>>> >>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>> >>>> The code is buggy, and poorly thought through. Your lack of interest in >>>> fixing the bugs in your patch is distressing. >>> >>> I'm not sure where you see me having a "lack of interest". The >>> existing cap-checking sysctls have a corner-case bug, which is >>> orthogonal to this change. >> >> That certainly doesn't sound like you have any plans to change anything >> there. >> >>>> So broken code, not willing to fix. No. We are not merging this sysctl. >>> >>> I think you're jumping to conclusions. :) >> >> I think I am the maintainer. >> >> What you are proposing is very much something that is only of interst to >> people who are not using user namespaces. It is fatally flawed as >> a way to avoid new attack surfaces for people who don't care as the >> sysctl leaves user namespaces enabled by default. It is fatally flawed >> as remediation to recommend to people to change if a new user namespace >> related but is discovered. Any running process that happens to be >> created while user namespace creation was enabled will continue to >> exist. Effectively a reboot will be required as part of a mitigation. >> Many sysadmins will get that wrong. >> >> I can't possibly see your sysctl as proposed achieving it's goals. A >> person has to be entirely too aware of subtlety and nuance to use it >> effectively. > > What you're saying is true for the "oh crap" case of a new userns > related CVE being found. However, there is the case where sysadmins > know for a fact that a set of machines should not allow user > namespaces to be enabled. Currently they have 2 choices, 1) use their > distro kernel as-is, which may not meet their goal of having userns > disabled, or 2) rebuild their kernel to disable it, which may > invalidate any support contracts they have. > > I tend to agree with you on the lack of value around runtime > mitigation, but allowing an admin to toggle this as a blatant on/off > switch on reboot does have value. > >>> This feature is already implemented by two distros, and likely wanted >>> by others. We cannot ignore that. The sysctl default doesn't change >>> the existing behavior, so this doesn't get in your way at all. Can you >>> please respond to my earlier email where I rebutted each of your >>> arguments against it? Just saying "no" and putting words in my mouth >>> isn't very productive. >> >> Calling people who make mistakes insane is not a rebuttal. In security >> usability matters, and your sysctl has low usability. >> >> Further you seem to have missed something crucial in your understanding. >> As was explained earlier the sysctl was added to ubuntu to allow early >> adopters to experiment not as a long term way of managing user >> namespaces. >> >> >> What sounds like a generally useful feature that would cover your use >> case and many others is a per user limit on the number of user >> namespaces users may create. > > Where that number may be zero? I don't see how that is really any > better than a sysctl. Could you elaborate? It's a better option because it would allow better configurability. Take for example a single user desktop system with some network daemons. On such a system, the actual login used for the graphical environment by the user should be allowed at least a few user namespaces, because some software depends on them for security (Chrome for example, as well as some distro's build systems), but system users should be limited to at most one if they need it, and ideally zero, so that remote exploits couldn't give access to a user namespace. Conversely, on a server system, it's not unreasonable to completely disable user namespaces for almost everything, except for giving one to services that use them properly for sand-boxing. I will state though that I only feel this is a better solution given that two criteria are met: 1. You can set 0 as the limit. 2. You can configure this without needing some special software (this in particular means that seccomp is not an option). ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-26 14:46 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-26 14:46 UTC (permalink / raw) To: Josh Boyer, Eric W. Biederman Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On 2016-01-26 09:38, Josh Boyer wrote: > On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman > <ebiederm@xmission.com> wrote: >> Kees Cook <keescook@chromium.org> writes: >> >>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>> <ebiederm@xmission.com> wrote: >>>> Kees Cook <keescook@chromium.org> writes: >>>>> >>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>> hole in the permission checks. >>>> >>>> To be clear the current patch has my: >>>> >>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>> >>>> The code is buggy, and poorly thought through. Your lack of interest in >>>> fixing the bugs in your patch is distressing. >>> >>> I'm not sure where you see me having a "lack of interest". The >>> existing cap-checking sysctls have a corner-case bug, which is >>> orthogonal to this change. >> >> That certainly doesn't sound like you have any plans to change anything >> there. >> >>>> So broken code, not willing to fix. No. We are not merging this sysctl. >>> >>> I think you're jumping to conclusions. :) >> >> I think I am the maintainer. >> >> What you are proposing is very much something that is only of interst to >> people who are not using user namespaces. It is fatally flawed as >> a way to avoid new attack surfaces for people who don't care as the >> sysctl leaves user namespaces enabled by default. It is fatally flawed >> as remediation to recommend to people to change if a new user namespace >> related but is discovered. Any running process that happens to be >> created while user namespace creation was enabled will continue to >> exist. Effectively a reboot will be required as part of a mitigation. >> Many sysadmins will get that wrong. >> >> I can't possibly see your sysctl as proposed achieving it's goals. A >> person has to be entirely too aware of subtlety and nuance to use it >> effectively. > > What you're saying is true for the "oh crap" case of a new userns > related CVE being found. However, there is the case where sysadmins > know for a fact that a set of machines should not allow user > namespaces to be enabled. Currently they have 2 choices, 1) use their > distro kernel as-is, which may not meet their goal of having userns > disabled, or 2) rebuild their kernel to disable it, which may > invalidate any support contracts they have. > > I tend to agree with you on the lack of value around runtime > mitigation, but allowing an admin to toggle this as a blatant on/off > switch on reboot does have value. > >>> This feature is already implemented by two distros, and likely wanted >>> by others. We cannot ignore that. The sysctl default doesn't change >>> the existing behavior, so this doesn't get in your way at all. Can you >>> please respond to my earlier email where I rebutted each of your >>> arguments against it? Just saying "no" and putting words in my mouth >>> isn't very productive. >> >> Calling people who make mistakes insane is not a rebuttal. In security >> usability matters, and your sysctl has low usability. >> >> Further you seem to have missed something crucial in your understanding. >> As was explained earlier the sysctl was added to ubuntu to allow early >> adopters to experiment not as a long term way of managing user >> namespaces. >> >> >> What sounds like a generally useful feature that would cover your use >> case and many others is a per user limit on the number of user >> namespaces users may create. > > Where that number may be zero? I don't see how that is really any > better than a sysctl. Could you elaborate? It's a better option because it would allow better configurability. Take for example a single user desktop system with some network daemons. On such a system, the actual login used for the graphical environment by the user should be allowed at least a few user namespaces, because some software depends on them for security (Chrome for example, as well as some distro's build systems), but system users should be limited to at most one if they need it, and ideally zero, so that remote exploits couldn't give access to a user namespace. Conversely, on a server system, it's not unreasonable to completely disable user namespaces for almost everything, except for giving one to services that use them properly for sand-boxing. I will state though that I only feel this is a better solution given that two criteria are met: 1. You can set 0 as the limit. 2. You can configure this without needing some special software (this in particular means that seccomp is not an option). ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 14:46 ` [kernel-hardening] " Austin S. Hemmelgarn @ 2016-01-26 14:56 ` Josh Boyer -1 siblings, 0 replies; 80+ messages in thread From: Josh Boyer @ 2016-01-26 14:56 UTC (permalink / raw) To: Austin S. Hemmelgarn Cc: Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Tue, Jan 26, 2016 at 9:46 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: > On 2016-01-26 09:38, Josh Boyer wrote: >> >> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> >>> Kees Cook <keescook@chromium.org> writes: >>> >>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>> <ebiederm@xmission.com> wrote: >>>>> >>>>> Kees Cook <keescook@chromium.org> writes: >>>>>> >>>>>> >>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>> hole in the permission checks. >>>>> >>>>> >>>>> To be clear the current patch has my: >>>>> >>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>> >>>>> The code is buggy, and poorly thought through. Your lack of interest >>>>> in >>>>> fixing the bugs in your patch is distressing. >>>> >>>> >>>> I'm not sure where you see me having a "lack of interest". The >>>> existing cap-checking sysctls have a corner-case bug, which is >>>> orthogonal to this change. >>> >>> >>> That certainly doesn't sound like you have any plans to change anything >>> there. >>> >>>>> So broken code, not willing to fix. No. We are not merging this >>>>> sysctl. >>>> >>>> >>>> I think you're jumping to conclusions. :) >>> >>> >>> I think I am the maintainer. >>> >>> What you are proposing is very much something that is only of interst to >>> people who are not using user namespaces. It is fatally flawed as >>> a way to avoid new attack surfaces for people who don't care as the >>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>> as remediation to recommend to people to change if a new user namespace >>> related but is discovered. Any running process that happens to be >>> created while user namespace creation was enabled will continue to >>> exist. Effectively a reboot will be required as part of a mitigation. >>> Many sysadmins will get that wrong. >>> >>> I can't possibly see your sysctl as proposed achieving it's goals. A >>> person has to be entirely too aware of subtlety and nuance to use it >>> effectively. >> >> >> What you're saying is true for the "oh crap" case of a new userns >> related CVE being found. However, there is the case where sysadmins >> know for a fact that a set of machines should not allow user >> namespaces to be enabled. Currently they have 2 choices, 1) use their >> distro kernel as-is, which may not meet their goal of having userns >> disabled, or 2) rebuild their kernel to disable it, which may >> invalidate any support contracts they have. >> >> I tend to agree with you on the lack of value around runtime >> mitigation, but allowing an admin to toggle this as a blatant on/off >> switch on reboot does have value. >> >>>> This feature is already implemented by two distros, and likely wanted >>>> by others. We cannot ignore that. The sysctl default doesn't change >>>> the existing behavior, so this doesn't get in your way at all. Can you >>>> please respond to my earlier email where I rebutted each of your >>>> arguments against it? Just saying "no" and putting words in my mouth >>>> isn't very productive. >>> >>> >>> Calling people who make mistakes insane is not a rebuttal. In security >>> usability matters, and your sysctl has low usability. >>> >>> Further you seem to have missed something crucial in your understanding. >>> As was explained earlier the sysctl was added to ubuntu to allow early >>> adopters to experiment not as a long term way of managing user >>> namespaces. >>> >>> >>> What sounds like a generally useful feature that would cover your use >>> case and many others is a per user limit on the number of user >>> namespaces users may create. >> >> >> Where that number may be zero? I don't see how that is really any >> better than a sysctl. Could you elaborate? > > It's a better option because it would allow better configurability. Take for > example a single user desktop system with some network daemons. On such a > system, the actual login used for the graphical environment by the user > should be allowed at least a few user namespaces, because some software > depends on them for security (Chrome for example, as well as some distro's > build systems), but system users should be limited to at most one if they > need it, and ideally zero, so that remote exploits couldn't give access to a > user namespace. > > Conversely, on a server system, it's not unreasonable to completely disable > user namespaces for almost everything, except for giving one to services > that use them properly for sand-boxing. OK, so better granularity. Fine. > I will state though that I only feel this is a better solution given that > two criteria are met: > 1. You can set 0 as the limit. > 2. You can configure this without needing some special software (this in > particular means that seccomp is not an option). I'd have to add 3. You can set a global default for all users that can be overridden on a per user basis. Otherwise you play whack-a-mole with every new user or daemon that adds its own uid. josh ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-26 14:56 ` Josh Boyer 0 siblings, 0 replies; 80+ messages in thread From: Josh Boyer @ 2016-01-26 14:56 UTC (permalink / raw) To: Austin S. Hemmelgarn Cc: Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Tue, Jan 26, 2016 at 9:46 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: > On 2016-01-26 09:38, Josh Boyer wrote: >> >> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> >>> Kees Cook <keescook@chromium.org> writes: >>> >>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>> <ebiederm@xmission.com> wrote: >>>>> >>>>> Kees Cook <keescook@chromium.org> writes: >>>>>> >>>>>> >>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>> hole in the permission checks. >>>>> >>>>> >>>>> To be clear the current patch has my: >>>>> >>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>> >>>>> The code is buggy, and poorly thought through. Your lack of interest >>>>> in >>>>> fixing the bugs in your patch is distressing. >>>> >>>> >>>> I'm not sure where you see me having a "lack of interest". The >>>> existing cap-checking sysctls have a corner-case bug, which is >>>> orthogonal to this change. >>> >>> >>> That certainly doesn't sound like you have any plans to change anything >>> there. >>> >>>>> So broken code, not willing to fix. No. We are not merging this >>>>> sysctl. >>>> >>>> >>>> I think you're jumping to conclusions. :) >>> >>> >>> I think I am the maintainer. >>> >>> What you are proposing is very much something that is only of interst to >>> people who are not using user namespaces. It is fatally flawed as >>> a way to avoid new attack surfaces for people who don't care as the >>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>> as remediation to recommend to people to change if a new user namespace >>> related but is discovered. Any running process that happens to be >>> created while user namespace creation was enabled will continue to >>> exist. Effectively a reboot will be required as part of a mitigation. >>> Many sysadmins will get that wrong. >>> >>> I can't possibly see your sysctl as proposed achieving it's goals. A >>> person has to be entirely too aware of subtlety and nuance to use it >>> effectively. >> >> >> What you're saying is true for the "oh crap" case of a new userns >> related CVE being found. However, there is the case where sysadmins >> know for a fact that a set of machines should not allow user >> namespaces to be enabled. Currently they have 2 choices, 1) use their >> distro kernel as-is, which may not meet their goal of having userns >> disabled, or 2) rebuild their kernel to disable it, which may >> invalidate any support contracts they have. >> >> I tend to agree with you on the lack of value around runtime >> mitigation, but allowing an admin to toggle this as a blatant on/off >> switch on reboot does have value. >> >>>> This feature is already implemented by two distros, and likely wanted >>>> by others. We cannot ignore that. The sysctl default doesn't change >>>> the existing behavior, so this doesn't get in your way at all. Can you >>>> please respond to my earlier email where I rebutted each of your >>>> arguments against it? Just saying "no" and putting words in my mouth >>>> isn't very productive. >>> >>> >>> Calling people who make mistakes insane is not a rebuttal. In security >>> usability matters, and your sysctl has low usability. >>> >>> Further you seem to have missed something crucial in your understanding. >>> As was explained earlier the sysctl was added to ubuntu to allow early >>> adopters to experiment not as a long term way of managing user >>> namespaces. >>> >>> >>> What sounds like a generally useful feature that would cover your use >>> case and many others is a per user limit on the number of user >>> namespaces users may create. >> >> >> Where that number may be zero? I don't see how that is really any >> better than a sysctl. Could you elaborate? > > It's a better option because it would allow better configurability. Take for > example a single user desktop system with some network daemons. On such a > system, the actual login used for the graphical environment by the user > should be allowed at least a few user namespaces, because some software > depends on them for security (Chrome for example, as well as some distro's > build systems), but system users should be limited to at most one if they > need it, and ideally zero, so that remote exploits couldn't give access to a > user namespace. > > Conversely, on a server system, it's not unreasonable to completely disable > user namespaces for almost everything, except for giving one to services > that use them properly for sand-boxing. OK, so better granularity. Fine. > I will state though that I only feel this is a better solution given that > two criteria are met: > 1. You can set 0 as the limit. > 2. You can configure this without needing some special software (this in > particular means that seccomp is not an option). I'd have to add 3. You can set a global default for all users that can be overridden on a per user basis. Otherwise you play whack-a-mole with every new user or daemon that adds its own uid. josh ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 14:56 ` [kernel-hardening] " Josh Boyer (?) @ 2016-01-26 17:20 ` Serge Hallyn 2016-01-26 19:56 ` Josh Boyer -1 siblings, 1 reply; 80+ messages in thread From: Serge Hallyn @ 2016-01-26 17:20 UTC (permalink / raw) To: kernel-hardening Cc: Austin S. Hemmelgarn, Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Quoting Josh Boyer (jwboyer@fedoraproject.org): > On Tue, Jan 26, 2016 at 9:46 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: > > On 2016-01-26 09:38, Josh Boyer wrote: > >> > >> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman > >> <ebiederm@xmission.com> wrote: > >>> > >>> Kees Cook <keescook@chromium.org> writes: > >>> > >>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > >>>> <ebiederm@xmission.com> wrote: > >>>>> > >>>>> Kees Cook <keescook@chromium.org> writes: > >>>>>> > >>>>>> > >>>>>> Well, I don't know about less weird, but it would leave a unneeded > >>>>>> hole in the permission checks. > >>>>> > >>>>> > >>>>> To be clear the current patch has my: > >>>>> > >>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > >>>>> > >>>>> The code is buggy, and poorly thought through. Your lack of interest > >>>>> in > >>>>> fixing the bugs in your patch is distressing. > >>>> > >>>> > >>>> I'm not sure where you see me having a "lack of interest". The > >>>> existing cap-checking sysctls have a corner-case bug, which is > >>>> orthogonal to this change. > >>> > >>> > >>> That certainly doesn't sound like you have any plans to change anything > >>> there. > >>> > >>>>> So broken code, not willing to fix. No. We are not merging this > >>>>> sysctl. > >>>> > >>>> > >>>> I think you're jumping to conclusions. :) > >>> > >>> > >>> I think I am the maintainer. > >>> > >>> What you are proposing is very much something that is only of interst to > >>> people who are not using user namespaces. It is fatally flawed as > >>> a way to avoid new attack surfaces for people who don't care as the > >>> sysctl leaves user namespaces enabled by default. It is fatally flawed > >>> as remediation to recommend to people to change if a new user namespace > >>> related but is discovered. Any running process that happens to be > >>> created while user namespace creation was enabled will continue to > >>> exist. Effectively a reboot will be required as part of a mitigation. > >>> Many sysadmins will get that wrong. > >>> > >>> I can't possibly see your sysctl as proposed achieving it's goals. A > >>> person has to be entirely too aware of subtlety and nuance to use it > >>> effectively. > >> > >> > >> What you're saying is true for the "oh crap" case of a new userns > >> related CVE being found. However, there is the case where sysadmins > >> know for a fact that a set of machines should not allow user > >> namespaces to be enabled. Currently they have 2 choices, 1) use their > >> distro kernel as-is, which may not meet their goal of having userns > >> disabled, or 2) rebuild their kernel to disable it, which may > >> invalidate any support contracts they have. > >> > >> I tend to agree with you on the lack of value around runtime > >> mitigation, but allowing an admin to toggle this as a blatant on/off > >> switch on reboot does have value. > >> > >>>> This feature is already implemented by two distros, and likely wanted > >>>> by others. We cannot ignore that. The sysctl default doesn't change > >>>> the existing behavior, so this doesn't get in your way at all. Can you > >>>> please respond to my earlier email where I rebutted each of your > >>>> arguments against it? Just saying "no" and putting words in my mouth > >>>> isn't very productive. > >>> > >>> > >>> Calling people who make mistakes insane is not a rebuttal. In security > >>> usability matters, and your sysctl has low usability. > >>> > >>> Further you seem to have missed something crucial in your understanding. > >>> As was explained earlier the sysctl was added to ubuntu to allow early > >>> adopters to experiment not as a long term way of managing user > >>> namespaces. > >>> > >>> > >>> What sounds like a generally useful feature that would cover your use > >>> case and many others is a per user limit on the number of user > >>> namespaces users may create. > >> > >> > >> Where that number may be zero? I don't see how that is really any > >> better than a sysctl. Could you elaborate? > > > > It's a better option because it would allow better configurability. Take for > > example a single user desktop system with some network daemons. On such a > > system, the actual login used for the graphical environment by the user > > should be allowed at least a few user namespaces, because some software > > depends on them for security (Chrome for example, as well as some distro's > > build systems), but system users should be limited to at most one if they > > need it, and ideally zero, so that remote exploits couldn't give access to a > > user namespace. > > > > Conversely, on a server system, it's not unreasonable to completely disable > > user namespaces for almost everything, except for giving one to services > > that use them properly for sand-boxing. > > OK, so better granularity. Fine. > > > I will state though that I only feel this is a better solution given that > > two criteria are met: > > 1. You can set 0 as the limit. > > 2. You can configure this without needing some special software (this in > > particular means that seccomp is not an option). > > I'd have to add 3. You can set a global default for all users that can > be overridden on a per user basis. > > Otherwise you play whack-a-mole with every new user or daemon that > adds its own uid. Given that you want per-user, does a per-uid rlimit, which could be -1 (unlimited) by default, inherited for all uids mapped into a namespace owned by the uid, and which can be set (only reduced) by pam on login, make sense? I'm still not actually seeing the value of this apart from another knob to prevent kernel memory abuse. But at least it does kill two birds with one stone (also satisfying people who want it turned off altogether). Is there a third use case for limiting number of user namespaces per uid? ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 17:20 ` Serge Hallyn @ 2016-01-26 19:56 ` Josh Boyer 2016-01-26 20:11 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 80+ messages in thread From: Josh Boyer @ 2016-01-26 19:56 UTC (permalink / raw) To: Serge Hallyn Cc: kernel-hardening, Austin S. Hemmelgarn, Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Tue, Jan 26, 2016 at 12:20 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: > Quoting Josh Boyer (jwboyer@fedoraproject.org): >> On Tue, Jan 26, 2016 at 9:46 AM, Austin S. Hemmelgarn >> <ahferroin7@gmail.com> wrote: >> > On 2016-01-26 09:38, Josh Boyer wrote: >> >> >> >> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >> >> <ebiederm@xmission.com> wrote: >> >>> >> >>> Kees Cook <keescook@chromium.org> writes: >> >>> >> >>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >> >>>> <ebiederm@xmission.com> wrote: >> >>>>> >> >>>>> Kees Cook <keescook@chromium.org> writes: >> >>>>>> >> >>>>>> >> >>>>>> Well, I don't know about less weird, but it would leave a unneeded >> >>>>>> hole in the permission checks. >> >>>>> >> >>>>> >> >>>>> To be clear the current patch has my: >> >>>>> >> >>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >> >>>>> >> >>>>> The code is buggy, and poorly thought through. Your lack of interest >> >>>>> in >> >>>>> fixing the bugs in your patch is distressing. >> >>>> >> >>>> >> >>>> I'm not sure where you see me having a "lack of interest". The >> >>>> existing cap-checking sysctls have a corner-case bug, which is >> >>>> orthogonal to this change. >> >>> >> >>> >> >>> That certainly doesn't sound like you have any plans to change anything >> >>> there. >> >>> >> >>>>> So broken code, not willing to fix. No. We are not merging this >> >>>>> sysctl. >> >>>> >> >>>> >> >>>> I think you're jumping to conclusions. :) >> >>> >> >>> >> >>> I think I am the maintainer. >> >>> >> >>> What you are proposing is very much something that is only of interst to >> >>> people who are not using user namespaces. It is fatally flawed as >> >>> a way to avoid new attack surfaces for people who don't care as the >> >>> sysctl leaves user namespaces enabled by default. It is fatally flawed >> >>> as remediation to recommend to people to change if a new user namespace >> >>> related but is discovered. Any running process that happens to be >> >>> created while user namespace creation was enabled will continue to >> >>> exist. Effectively a reboot will be required as part of a mitigation. >> >>> Many sysadmins will get that wrong. >> >>> >> >>> I can't possibly see your sysctl as proposed achieving it's goals. A >> >>> person has to be entirely too aware of subtlety and nuance to use it >> >>> effectively. >> >> >> >> >> >> What you're saying is true for the "oh crap" case of a new userns >> >> related CVE being found. However, there is the case where sysadmins >> >> know for a fact that a set of machines should not allow user >> >> namespaces to be enabled. Currently they have 2 choices, 1) use their >> >> distro kernel as-is, which may not meet their goal of having userns >> >> disabled, or 2) rebuild their kernel to disable it, which may >> >> invalidate any support contracts they have. >> >> >> >> I tend to agree with you on the lack of value around runtime >> >> mitigation, but allowing an admin to toggle this as a blatant on/off >> >> switch on reboot does have value. >> >> >> >>>> This feature is already implemented by two distros, and likely wanted >> >>>> by others. We cannot ignore that. The sysctl default doesn't change >> >>>> the existing behavior, so this doesn't get in your way at all. Can you >> >>>> please respond to my earlier email where I rebutted each of your >> >>>> arguments against it? Just saying "no" and putting words in my mouth >> >>>> isn't very productive. >> >>> >> >>> >> >>> Calling people who make mistakes insane is not a rebuttal. In security >> >>> usability matters, and your sysctl has low usability. >> >>> >> >>> Further you seem to have missed something crucial in your understanding. >> >>> As was explained earlier the sysctl was added to ubuntu to allow early >> >>> adopters to experiment not as a long term way of managing user >> >>> namespaces. >> >>> >> >>> >> >>> What sounds like a generally useful feature that would cover your use >> >>> case and many others is a per user limit on the number of user >> >>> namespaces users may create. >> >> >> >> >> >> Where that number may be zero? I don't see how that is really any >> >> better than a sysctl. Could you elaborate? >> > >> > It's a better option because it would allow better configurability. Take for >> > example a single user desktop system with some network daemons. On such a >> > system, the actual login used for the graphical environment by the user >> > should be allowed at least a few user namespaces, because some software >> > depends on them for security (Chrome for example, as well as some distro's >> > build systems), but system users should be limited to at most one if they >> > need it, and ideally zero, so that remote exploits couldn't give access to a >> > user namespace. >> > >> > Conversely, on a server system, it's not unreasonable to completely disable >> > user namespaces for almost everything, except for giving one to services >> > that use them properly for sand-boxing. >> >> OK, so better granularity. Fine. >> >> > I will state though that I only feel this is a better solution given that >> > two criteria are met: >> > 1. You can set 0 as the limit. >> > 2. You can configure this without needing some special software (this in >> > particular means that seccomp is not an option). >> >> I'd have to add 3. You can set a global default for all users that can >> be overridden on a per user basis. >> >> Otherwise you play whack-a-mole with every new user or daemon that >> adds its own uid. > > Given that you want per-user, does a per-uid rlimit, which could be -1 > (unlimited) by default, inherited for all uids mapped into a namespace > owned by the uid, and which can be set (only reduced) by pam on login, > make sense? To clarify, I don't actively want per-user. Eric suggested it and I'm thinking through it from a theoretical perspective. I'd likely be fine with a big ban-hammer that sysadmins can set, but that doesn't mean I'm opposed to something more flexible if it makes sense. I'm not sure if rlimit makes sense in the way you describe it. I don't care about uids within an existing user namespace really. That is icing on the cake. I was looking for something that would disallow uids to create user namespaces to begin with (so inheritance wouldn't matter). If rlimit is that mechanism, then I guess. Seems like an odd fit though, particularly if you tie it to pam. > I'm still not actually seeing the value of this apart from another knob > to prevent kernel memory abuse. But at least it does kill two birds > with one stone (also satisfying people who want it turned off altogether). > Is there a third use case for limiting number of user namespaces per uid? Not that I can think of. josh ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 19:56 ` Josh Boyer @ 2016-01-26 20:11 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-26 20:11 UTC (permalink / raw) To: Josh Boyer, Serge Hallyn Cc: kernel-hardening, Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On 2016-01-26 14:56, Josh Boyer wrote: > On Tue, Jan 26, 2016 at 12:20 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: >> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>> On Tue, Jan 26, 2016 at 9:46 AM, Austin S. Hemmelgarn >>> <ahferroin7@gmail.com> wrote: >>>> On 2016-01-26 09:38, Josh Boyer wrote: >>>>> >>>>> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >>>>> <ebiederm@xmission.com> wrote: >>>>>> >>>>>> Kees Cook <keescook@chromium.org> writes: >>>>>> >>>>>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>>>>> <ebiederm@xmission.com> wrote: >>>>>>>> >>>>>>>> Kees Cook <keescook@chromium.org> writes: >>>>>>>>> >>>>>>>>> >>>>>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>>>>> hole in the permission checks. >>>>>>>> >>>>>>>> >>>>>>>> To be clear the current patch has my: >>>>>>>> >>>>>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>>>>> >>>>>>>> The code is buggy, and poorly thought through. Your lack of interest >>>>>>>> in >>>>>>>> fixing the bugs in your patch is distressing. >>>>>>> >>>>>>> >>>>>>> I'm not sure where you see me having a "lack of interest". The >>>>>>> existing cap-checking sysctls have a corner-case bug, which is >>>>>>> orthogonal to this change. >>>>>> >>>>>> >>>>>> That certainly doesn't sound like you have any plans to change anything >>>>>> there. >>>>>> >>>>>>>> So broken code, not willing to fix. No. We are not merging this >>>>>>>> sysctl. >>>>>>> >>>>>>> >>>>>>> I think you're jumping to conclusions. :) >>>>>> >>>>>> >>>>>> I think I am the maintainer. >>>>>> >>>>>> What you are proposing is very much something that is only of interst to >>>>>> people who are not using user namespaces. It is fatally flawed as >>>>>> a way to avoid new attack surfaces for people who don't care as the >>>>>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>>>>> as remediation to recommend to people to change if a new user namespace >>>>>> related but is discovered. Any running process that happens to be >>>>>> created while user namespace creation was enabled will continue to >>>>>> exist. Effectively a reboot will be required as part of a mitigation. >>>>>> Many sysadmins will get that wrong. >>>>>> >>>>>> I can't possibly see your sysctl as proposed achieving it's goals. A >>>>>> person has to be entirely too aware of subtlety and nuance to use it >>>>>> effectively. >>>>> >>>>> >>>>> What you're saying is true for the "oh crap" case of a new userns >>>>> related CVE being found. However, there is the case where sysadmins >>>>> know for a fact that a set of machines should not allow user >>>>> namespaces to be enabled. Currently they have 2 choices, 1) use their >>>>> distro kernel as-is, which may not meet their goal of having userns >>>>> disabled, or 2) rebuild their kernel to disable it, which may >>>>> invalidate any support contracts they have. >>>>> >>>>> I tend to agree with you on the lack of value around runtime >>>>> mitigation, but allowing an admin to toggle this as a blatant on/off >>>>> switch on reboot does have value. >>>>> >>>>>>> This feature is already implemented by two distros, and likely wanted >>>>>>> by others. We cannot ignore that. The sysctl default doesn't change >>>>>>> the existing behavior, so this doesn't get in your way at all. Can you >>>>>>> please respond to my earlier email where I rebutted each of your >>>>>>> arguments against it? Just saying "no" and putting words in my mouth >>>>>>> isn't very productive. >>>>>> >>>>>> >>>>>> Calling people who make mistakes insane is not a rebuttal. In security >>>>>> usability matters, and your sysctl has low usability. >>>>>> >>>>>> Further you seem to have missed something crucial in your understanding. >>>>>> As was explained earlier the sysctl was added to ubuntu to allow early >>>>>> adopters to experiment not as a long term way of managing user >>>>>> namespaces. >>>>>> >>>>>> >>>>>> What sounds like a generally useful feature that would cover your use >>>>>> case and many others is a per user limit on the number of user >>>>>> namespaces users may create. >>>>> >>>>> >>>>> Where that number may be zero? I don't see how that is really any >>>>> better than a sysctl. Could you elaborate? >>>> >>>> It's a better option because it would allow better configurability. Take for >>>> example a single user desktop system with some network daemons. On such a >>>> system, the actual login used for the graphical environment by the user >>>> should be allowed at least a few user namespaces, because some software >>>> depends on them for security (Chrome for example, as well as some distro's >>>> build systems), but system users should be limited to at most one if they >>>> need it, and ideally zero, so that remote exploits couldn't give access to a >>>> user namespace. >>>> >>>> Conversely, on a server system, it's not unreasonable to completely disable >>>> user namespaces for almost everything, except for giving one to services >>>> that use them properly for sand-boxing. >>> >>> OK, so better granularity. Fine. >>> >>>> I will state though that I only feel this is a better solution given that >>>> two criteria are met: >>>> 1. You can set 0 as the limit. >>>> 2. You can configure this without needing some special software (this in >>>> particular means that seccomp is not an option). >>> >>> I'd have to add 3. You can set a global default for all users that can >>> be overridden on a per user basis. >>> >>> Otherwise you play whack-a-mole with every new user or daemon that >>> adds its own uid. >> >> Given that you want per-user, does a per-uid rlimit, which could be -1 >> (unlimited) by default, inherited for all uids mapped into a namespace >> owned by the uid, and which can be set (only reduced) by pam on login, >> make sense? > > To clarify, I don't actively want per-user. Eric suggested it and I'm > thinking through it from a theoretical perspective. I'd likely be > fine with a big ban-hammer that sysadmins can set, but that doesn't > mean I'm opposed to something more flexible if it makes sense. > > I'm not sure if rlimit makes sense in the way you describe it. I > don't care about uids within an existing user namespace really. That > is icing on the cake. I was looking for something that would disallow > uids to create user namespaces to begin with (so inheritance wouldn't > matter). If rlimit is that mechanism, then I guess. Seems like an > odd fit though, particularly if you tie it to pam. The PAM connection would be a side effect of it being an rlimit, not something by itself. It's not used on a lot of smaller systems because Linux is not used as much as a time sharing system, but part of the purpose of PAM was to be able to set rlimits and similar things on new sessions before the user could do anything. This is still used today, and /etc/security/limits.conf can be found on any modern Linux system which uses PAM. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 14:38 ` [kernel-hardening] " Josh Boyer (?) (?) @ 2016-01-26 17:15 ` Serge Hallyn 2016-01-26 18:09 ` Austin S. Hemmelgarn ` (2 more replies) -1 siblings, 3 replies; 80+ messages in thread From: Serge Hallyn @ 2016-01-26 17:15 UTC (permalink / raw) To: kernel-hardening Cc: Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Quoting Josh Boyer (jwboyer@fedoraproject.org): > On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman > <ebiederm@xmission.com> wrote: > > Kees Cook <keescook@chromium.org> writes: > > > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > >> <ebiederm@xmission.com> wrote: > >>> Kees Cook <keescook@chromium.org> writes: > >>>> > >>>> Well, I don't know about less weird, but it would leave a unneeded > >>>> hole in the permission checks. > >>> > >>> To be clear the current patch has my: > >>> > >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > >>> > >>> The code is buggy, and poorly thought through. Your lack of interest in > >>> fixing the bugs in your patch is distressing. > >> > >> I'm not sure where you see me having a "lack of interest". The > >> existing cap-checking sysctls have a corner-case bug, which is > >> orthogonal to this change. > > > > That certainly doesn't sound like you have any plans to change anything > > there. > > > >>> So broken code, not willing to fix. No. We are not merging this sysctl. > >> > >> I think you're jumping to conclusions. :) > > > > I think I am the maintainer. > > > > What you are proposing is very much something that is only of interst to > > people who are not using user namespaces. It is fatally flawed as > > a way to avoid new attack surfaces for people who don't care as the > > sysctl leaves user namespaces enabled by default. It is fatally flawed > > as remediation to recommend to people to change if a new user namespace > > related but is discovered. Any running process that happens to be > > created while user namespace creation was enabled will continue to > > exist. Effectively a reboot will be required as part of a mitigation. > > Many sysadmins will get that wrong. > > > > I can't possibly see your sysctl as proposed achieving it's goals. A > > person has to be entirely too aware of subtlety and nuance to use it > > effectively. > > What you're saying is true for the "oh crap" case of a new userns > related CVE being found. However, there is the case where sysadmins > know for a fact that a set of machines should not allow user > namespaces to be enabled. Currently they have 2 choices, 1) use their Hi - can you give a specific example of this? (Where users really should not be able to use them - not where they might not need them) I think it'll help the discussion tremendously. Because so far the only good arguments I've seen have been about actual bugs in the user namespaces, which would not warrant a designed-in permanent disable switch. If there are good use cases where such a disable switch will always be needed (and compiling out can't satisfy) that'd be helpful. thanks, -serge ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 17:15 ` Serge Hallyn @ 2016-01-26 18:09 ` Austin S. Hemmelgarn 2016-01-26 18:27 ` Andy Lutomirski 2016-01-26 23:13 ` Kees Cook 2016-01-26 23:47 ` Josh Boyer 2 siblings, 1 reply; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-26 18:09 UTC (permalink / raw) To: Serge Hallyn, kernel-hardening Cc: Eric W. Biederman, Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On 2016-01-26 12:15, Serge Hallyn wrote: > Quoting Josh Boyer (jwboyer@fedoraproject.org): >> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>> >>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>> <ebiederm@xmission.com> wrote: >>>>> Kees Cook <keescook@chromium.org> writes: >>>>>> >>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>> hole in the permission checks. >>>>> >>>>> To be clear the current patch has my: >>>>> >>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>> >>>>> The code is buggy, and poorly thought through. Your lack of interest in >>>>> fixing the bugs in your patch is distressing. >>>> >>>> I'm not sure where you see me having a "lack of interest". The >>>> existing cap-checking sysctls have a corner-case bug, which is >>>> orthogonal to this change. >>> >>> That certainly doesn't sound like you have any plans to change anything >>> there. >>> >>>>> So broken code, not willing to fix. No. We are not merging this sysctl. >>>> >>>> I think you're jumping to conclusions. :) >>> >>> I think I am the maintainer. >>> >>> What you are proposing is very much something that is only of interst to >>> people who are not using user namespaces. It is fatally flawed as >>> a way to avoid new attack surfaces for people who don't care as the >>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>> as remediation to recommend to people to change if a new user namespace >>> related but is discovered. Any running process that happens to be >>> created while user namespace creation was enabled will continue to >>> exist. Effectively a reboot will be required as part of a mitigation. >>> Many sysadmins will get that wrong. >>> >>> I can't possibly see your sysctl as proposed achieving it's goals. A >>> person has to be entirely too aware of subtlety and nuance to use it >>> effectively. >> >> What you're saying is true for the "oh crap" case of a new userns >> related CVE being found. However, there is the case where sysadmins >> know for a fact that a set of machines should not allow user >> namespaces to be enabled. Currently they have 2 choices, 1) use their > > Hi - can you give a specific example of this? (Where users really should > not be able to use them - not where they might not need them) I think > it'll help the discussion tremendously. Because so far the only good > arguments I've seen have been about actual bugs in the user namespaces, > which would not warrant a designed-in permanent disable switch. If > there are good use cases where such a disable switch will always be > needed (and compiling out can't satisfy) that'd be helpful. In general, if a particular daemon provides a network service and does not use user namespaces for sand-boxing, it should not be allowed to use user namespaces, because those then become something else to potentially land an exploit through. ntpd, postfix, and most other regularly used network servers fall into this category. If you're hosting a shared system providing terminal server like usage where the users actually have shell access, then they probably should not be able to use user namespaces on the server. In essence, if there are cases where you know for certain that users do not need user namespaces, they should not be allowed to use them. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 18:09 ` Austin S. Hemmelgarn @ 2016-01-26 18:27 ` Andy Lutomirski 2016-01-26 18:45 ` Austin S. Hemmelgarn 2016-01-26 23:15 ` Kees Cook 0 siblings, 2 replies; 80+ messages in thread From: Andy Lutomirski @ 2016-01-26 18:27 UTC (permalink / raw) To: Austin S. Hemmelgarn Cc: Serge Hallyn, kernel-hardening, Eric W. Biederman, Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Tue, Jan 26, 2016 at 10:09 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: > On 2016-01-26 12:15, Serge Hallyn wrote: >> >> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>> >>> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >>> <ebiederm@xmission.com> wrote: >>>> >>>> Kees Cook <keescook@chromium.org> writes: >>>> >>>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>>> <ebiederm@xmission.com> wrote: >>>>>> >>>>>> Kees Cook <keescook@chromium.org> writes: >>>>>>> >>>>>>> >>>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>>> hole in the permission checks. >>>>>> >>>>>> >>>>>> To be clear the current patch has my: >>>>>> >>>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>>> >>>>>> The code is buggy, and poorly thought through. Your lack of interest >>>>>> in >>>>>> fixing the bugs in your patch is distressing. >>>>> >>>>> >>>>> I'm not sure where you see me having a "lack of interest". The >>>>> existing cap-checking sysctls have a corner-case bug, which is >>>>> orthogonal to this change. >>>> >>>> >>>> That certainly doesn't sound like you have any plans to change anything >>>> there. >>>> >>>>>> So broken code, not willing to fix. No. We are not merging this >>>>>> sysctl. >>>>> >>>>> >>>>> I think you're jumping to conclusions. :) >>>> >>>> >>>> I think I am the maintainer. >>>> >>>> What you are proposing is very much something that is only of interst to >>>> people who are not using user namespaces. It is fatally flawed as >>>> a way to avoid new attack surfaces for people who don't care as the >>>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>>> as remediation to recommend to people to change if a new user namespace >>>> related but is discovered. Any running process that happens to be >>>> created while user namespace creation was enabled will continue to >>>> exist. Effectively a reboot will be required as part of a mitigation. >>>> Many sysadmins will get that wrong. >>>> >>>> I can't possibly see your sysctl as proposed achieving it's goals. A >>>> person has to be entirely too aware of subtlety and nuance to use it >>>> effectively. >>> >>> >>> What you're saying is true for the "oh crap" case of a new userns >>> related CVE being found. However, there is the case where sysadmins >>> know for a fact that a set of machines should not allow user >>> namespaces to be enabled. Currently they have 2 choices, 1) use their >> >> >> Hi - can you give a specific example of this? (Where users really should >> not be able to use them - not where they might not need them) I think >> it'll help the discussion tremendously. Because so far the only good >> arguments I've seen have been about actual bugs in the user namespaces, >> which would not warrant a designed-in permanent disable switch. If >> there are good use cases where such a disable switch will always be >> needed (and compiling out can't satisfy) that'd be helpful. > > In general, if a particular daemon provides a network service and does not > use user namespaces for sand-boxing, it should not be allowed to use user > namespaces, because those then become something else to potentially land an > exploit through. ntpd, postfix, and most other regularly used network > servers fall into this category. seccomp handles this issue quite nicely. > > If you're hosting a shared system providing terminal server like usage where > the users actually have shell access, then they probably should not be able > to use user namespaces on the server. > Au contraire. If they have user ns access, then can sandbox their own programs. --Andy ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 18:27 ` Andy Lutomirski @ 2016-01-26 18:45 ` Austin S. Hemmelgarn 2016-01-26 23:15 ` Kees Cook 1 sibling, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-26 18:45 UTC (permalink / raw) To: Andy Lutomirski Cc: Serge Hallyn, kernel-hardening, Eric W. Biederman, Kees Cook, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On 2016-01-26 13:27, Andy Lutomirski wrote: > On Tue, Jan 26, 2016 at 10:09 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: >> On 2016-01-26 12:15, Serge Hallyn wrote: >>> >>> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>>> >>>> On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman >>>> <ebiederm@xmission.com> wrote: >>>>> >>>>> Kees Cook <keescook@chromium.org> writes: >>>>> >>>>>> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >>>>>> <ebiederm@xmission.com> wrote: >>>>>>> >>>>>>> Kees Cook <keescook@chromium.org> writes: >>>>>>>> >>>>>>>> >>>>>>>> Well, I don't know about less weird, but it would leave a unneeded >>>>>>>> hole in the permission checks. >>>>>>> >>>>>>> >>>>>>> To be clear the current patch has my: >>>>>>> >>>>>>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>>>>>> >>>>>>> The code is buggy, and poorly thought through. Your lack of interest >>>>>>> in >>>>>>> fixing the bugs in your patch is distressing. >>>>>> >>>>>> >>>>>> I'm not sure where you see me having a "lack of interest". The >>>>>> existing cap-checking sysctls have a corner-case bug, which is >>>>>> orthogonal to this change. >>>>> >>>>> >>>>> That certainly doesn't sound like you have any plans to change anything >>>>> there. >>>>> >>>>>>> So broken code, not willing to fix. No. We are not merging this >>>>>>> sysctl. >>>>>> >>>>>> >>>>>> I think you're jumping to conclusions. :) >>>>> >>>>> >>>>> I think I am the maintainer. >>>>> >>>>> What you are proposing is very much something that is only of interst to >>>>> people who are not using user namespaces. It is fatally flawed as >>>>> a way to avoid new attack surfaces for people who don't care as the >>>>> sysctl leaves user namespaces enabled by default. It is fatally flawed >>>>> as remediation to recommend to people to change if a new user namespace >>>>> related but is discovered. Any running process that happens to be >>>>> created while user namespace creation was enabled will continue to >>>>> exist. Effectively a reboot will be required as part of a mitigation. >>>>> Many sysadmins will get that wrong. >>>>> >>>>> I can't possibly see your sysctl as proposed achieving it's goals. A >>>>> person has to be entirely too aware of subtlety and nuance to use it >>>>> effectively. >>>> >>>> >>>> What you're saying is true for the "oh crap" case of a new userns >>>> related CVE being found. However, there is the case where sysadmins >>>> know for a fact that a set of machines should not allow user >>>> namespaces to be enabled. Currently they have 2 choices, 1) use their >>> >>> >>> Hi - can you give a specific example of this? (Where users really should >>> not be able to use them - not where they might not need them) I think >>> it'll help the discussion tremendously. Because so far the only good >>> arguments I've seen have been about actual bugs in the user namespaces, >>> which would not warrant a designed-in permanent disable switch. If >>> there are good use cases where such a disable switch will always be >>> needed (and compiling out can't satisfy) that'd be helpful. >> >> In general, if a particular daemon provides a network service and does not >> use user namespaces for sand-boxing, it should not be allowed to use user >> namespaces, because those then become something else to potentially land an >> exploit through. ntpd, postfix, and most other regularly used network >> servers fall into this category. > > seccomp handles this issue quite nicely. > seccomp is a pain to set up given current tooling, and isn't supported by most server software. Unless there's some tool out there to hook arbitrary seccomp filters into an arbitrary program, then this isn't an option for most people. >> >> If you're hosting a shared system providing terminal server like usage where >> the users actually have shell access, then they probably should not be able >> to use user namespaces on the server. >> > > Au contraire. If they have user ns access, then can sandbox their own programs. I should clarify, by 'terminal server like usage' I meant thin client setups, not Sun Ray or Windows style terminal servers. IOW, a file server that provides a few extra services (DHCP, TFTP and similar) and only allows shell access so users can move around their own files directly on the server. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 18:27 ` Andy Lutomirski 2016-01-26 18:45 ` Austin S. Hemmelgarn @ 2016-01-26 23:15 ` Kees Cook 1 sibling, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-26 23:15 UTC (permalink / raw) To: Andy Lutomirski Cc: Austin S. Hemmelgarn, Serge Hallyn, kernel-hardening, Eric W. Biederman, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Tue, Jan 26, 2016 at 10:27 AM, Andy Lutomirski <luto@amacapital.net> wrote: > On Tue, Jan 26, 2016 at 10:09 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: >> On 2016-01-26 12:15, Serge Hallyn wrote: >>> >>> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>>> What you're saying is true for the "oh crap" case of a new userns >>>> related CVE being found. However, there is the case where sysadmins >>>> know for a fact that a set of machines should not allow user >>>> namespaces to be enabled. Currently they have 2 choices, 1) use their >>> >>> >>> Hi - can you give a specific example of this? (Where users really should >>> not be able to use them - not where they might not need them) I think >>> it'll help the discussion tremendously. Because so far the only good >>> arguments I've seen have been about actual bugs in the user namespaces, >>> which would not warrant a designed-in permanent disable switch. If >>> there are good use cases where such a disable switch will always be >>> needed (and compiling out can't satisfy) that'd be helpful. >> >> In general, if a particular daemon provides a network service and does not >> use user namespaces for sand-boxing, it should not be allowed to use user >> namespaces, because those then become something else to potentially land an >> exploit through. ntpd, postfix, and most other regularly used network >> servers fall into this category. > > seccomp handles this issue quite nicely. > >> >> If you're hosting a shared system providing terminal server like usage where >> the users actually have shell access, then they probably should not be able >> to use user namespaces on the server. >> > > Au contraire. If they have user ns access, then can sandbox their own programs. The open-ended cases of web servers and shell access aren't cleanly handled by seccomp. And we're talking about protecting them as soon as this knob exists, not after each program or service grows its own sandboxing solution. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 17:15 ` Serge Hallyn 2016-01-26 18:09 ` Austin S. Hemmelgarn @ 2016-01-26 23:13 ` Kees Cook 2016-01-27 10:27 ` Eric W. Biederman 2016-01-26 23:47 ` Josh Boyer 2 siblings, 1 reply; 80+ messages in thread From: Kees Cook @ 2016-01-26 23:13 UTC (permalink / raw) To: Serge Hallyn Cc: kernel-hardening, Eric W. Biederman, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Tue, Jan 26, 2016 at 9:15 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: > Quoting Josh Boyer (jwboyer@fedoraproject.org): >> What you're saying is true for the "oh crap" case of a new userns >> related CVE being found. However, there is the case where sysadmins >> know for a fact that a set of machines should not allow user >> namespaces to be enabled. Currently they have 2 choices, 1) use their > > Hi - can you give a specific example of this? (Where users really should > not be able to use them - not where they might not need them) I think > it'll help the discussion tremendously. Because so far the only good > arguments I've seen have been about actual bugs in the user namespaces, > which would not warrant a designed-in permanent disable switch. If > there are good use cases where such a disable switch will always be > needed (and compiling out can't satisfy) that'd be helpful. My example is a machine in a colo rack serving web pages. A site gets attacked, and www-data uses user namespaces to continue their attack to gain root privileges. The admin of such a machine could have disabled userns months earlier and limited the scope of the attack. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 23:13 ` Kees Cook @ 2016-01-27 10:27 ` Eric W. Biederman 2016-01-27 12:32 ` Austin S. Hemmelgarn 2016-01-28 14:41 ` Robert Święcki 0 siblings, 2 replies; 80+ messages in thread From: Eric W. Biederman @ 2016-01-27 10:27 UTC (permalink / raw) To: Kees Cook Cc: Serge Hallyn, kernel-hardening, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel Kees Cook <keescook@chromium.org> writes: > On Tue, Jan 26, 2016 at 9:15 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: >> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>> What you're saying is true for the "oh crap" case of a new userns >>> related CVE being found. However, there is the case where sysadmins >>> know for a fact that a set of machines should not allow user >>> namespaces to be enabled. Currently they have 2 choices, 1) use their >> >> Hi - can you give a specific example of this? (Where users really should >> not be able to use them - not where they might not need them) I think >> it'll help the discussion tremendously. Because so far the only good >> arguments I've seen have been about actual bugs in the user namespaces, >> which would not warrant a designed-in permanent disable switch. If >> there are good use cases where such a disable switch will always be >> needed (and compiling out can't satisfy) that'd be helpful. > > My example is a machine in a colo rack serving web pages. A site gets > attacked, and www-data uses user namespaces to continue their attack > to gain root privileges. > > The admin of such a machine could have disabled userns months earlier > and limited the scope of the attack. Of course for the paranoid there is already a mechanism to do this. /sbin/chroot. No new user namespaces are allowed to be created inside of a chroot. Eric ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-27 10:27 ` Eric W. Biederman @ 2016-01-27 12:32 ` Austin S. Hemmelgarn 2016-01-28 14:41 ` Robert Święcki 1 sibling, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-27 12:32 UTC (permalink / raw) To: Eric W. Biederman, Kees Cook Cc: Serge Hallyn, kernel-hardening, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On 2016-01-27 05:27, Eric W. Biederman wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Tue, Jan 26, 2016 at 9:15 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote: >>> Quoting Josh Boyer (jwboyer@fedoraproject.org): >>>> What you're saying is true for the "oh crap" case of a new userns >>>> related CVE being found. However, there is the case where sysadmins >>>> know for a fact that a set of machines should not allow user >>>> namespaces to be enabled. Currently they have 2 choices, 1) use their >>> >>> Hi - can you give a specific example of this? (Where users really should >>> not be able to use them - not where they might not need them) I think >>> it'll help the discussion tremendously. Because so far the only good >>> arguments I've seen have been about actual bugs in the user namespaces, >>> which would not warrant a designed-in permanent disable switch. If >>> there are good use cases where such a disable switch will always be >>> needed (and compiling out can't satisfy) that'd be helpful. >> >> My example is a machine in a colo rack serving web pages. A site gets >> attacked, and www-data uses user namespaces to continue their attack >> to gain root privileges. >> >> The admin of such a machine could have disabled userns months earlier >> and limited the scope of the attack. > > Of course for the paranoid there is already a mechanism to do this. > /sbin/chroot. > > No new user namespaces are allowed to be created inside of a chroot. I would like to point out that this is undocumented outside of the kernel source code on every Linux system I've seen. And, it's not hugely obvious from looking at the source code unless you have some experience with kernel programming. Also, being able to limit to exactly one (or possibly two depending on the application's usage of them) user namespace would be useful, as that would allow sand-boxing without the ability to create any more through some exploit. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-27 10:27 ` Eric W. Biederman @ 2016-01-28 14:41 ` Robert Święcki 2016-01-28 14:41 ` Robert Święcki 1 sibling, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-28 14:41 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Serge Hallyn, kernel-hardening, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel >> The admin of such a machine could have disabled userns months earlier >> and limited the scope of the attack. > > Of course for the paranoid there is already a mechanism to do this. > /sbin/chroot. > > No new user namespaces are allowed to be created inside of a chroot. Another alternative is to create a custom kernel module which will disable the user namespace (by limiting to privileged users only, or disabling it altogether). IMO people tend to use distro kernels for convenience, and a suggestion of creating a chroot dir for every service exposed to users, or building a custom kernel module is an advice that not many sysadmins using distro kernels would take, even if they have concerns about the the increased attack surface enabled by CLONE_NEWUSER. Also, I don't think the willingness to disable the feature or limit it to the already privileged users will be something that only truly paranoid sysadmins/users would have. We've seen a fair amount of privilege escalation / DoS bugs that this kernel feature enabled in the recent 18 months or so, and they still seem to be found on a somewhat regular basis. Therefore this discussion and its outcome might be of interest to less paranoid folk as well. I agree with Kees that the "knob" will be mostly used by admins of web-servers (and similar services) as a protection against privilege escalation after getting low-priv code execution on the system. Therefore a sysctl enabled from /etc/sysctl.conf should work well here, even if it's set to the permissive mode by default at boot. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-28 14:41 ` Robert Święcki 0 siblings, 0 replies; 80+ messages in thread From: Robert Święcki @ 2016-01-28 14:41 UTC (permalink / raw) To: Eric W. Biederman Cc: Kees Cook, Serge Hallyn, kernel-hardening, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel >> The admin of such a machine could have disabled userns months earlier >> and limited the scope of the attack. > > Of course for the paranoid there is already a mechanism to do this. > /sbin/chroot. > > No new user namespaces are allowed to be created inside of a chroot. Another alternative is to create a custom kernel module which will disable the user namespace (by limiting to privileged users only, or disabling it altogether). IMO people tend to use distro kernels for convenience, and a suggestion of creating a chroot dir for every service exposed to users, or building a custom kernel module is an advice that not many sysadmins using distro kernels would take, even if they have concerns about the the increased attack surface enabled by CLONE_NEWUSER. Also, I don't think the willingness to disable the feature or limit it to the already privileged users will be something that only truly paranoid sysadmins/users would have. We've seen a fair amount of privilege escalation / DoS bugs that this kernel feature enabled in the recent 18 months or so, and they still seem to be found on a somewhat regular basis. Therefore this discussion and its outcome might be of interest to less paranoid folk as well. I agree with Kees that the "knob" will be mostly used by admins of web-servers (and similar services) as a protection against privilege escalation after getting low-priv code execution on the system. Therefore a sysctl enabled from /etc/sysctl.conf should work well here, even if it's set to the permissive mode by default at boot. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 17:15 ` Serge Hallyn 2016-01-26 18:09 ` Austin S. Hemmelgarn 2016-01-26 23:13 ` Kees Cook @ 2016-01-26 23:47 ` Josh Boyer 2 siblings, 0 replies; 80+ messages in thread From: Josh Boyer @ 2016-01-26 23:47 UTC (permalink / raw) To: Serge E. Hallyn Cc: Kostya Serebryany, linux-doc, Al Viro, Andrew Morton, David Howells, kernel-hardening, Andy Lutomirski, Eric W. Biederman, Alexander Potapenko, Kees Cook, Eric Dumazet, Robert Święcki, Miklos Szeredi, Dmitry Vyukov, Richard Weinberger, linux-kernel, Sasha Levin [-- Attachment #1: Type: text/plain, Size: 3131 bytes --] On Jan 26, 2016 3:58 PM, "Serge Hallyn" <serge.hallyn@ubuntu.com> wrote: > > Quoting Josh Boyer (jwboyer@fedoraproject.org): > > On Mon, Jan 25, 2016 at 11:57 PM, Eric W. Biederman > > <ebiederm@xmission.com> wrote: > > > Kees Cook <keescook@chromium.org> writes: > > > > > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman > > >> <ebiederm@xmission.com> wrote: > > >>> Kees Cook <keescook@chromium.org> writes: > > >>>> > > >>>> Well, I don't know about less weird, but it would leave a unneeded > > >>>> hole in the permission checks. > > >>> > > >>> To be clear the current patch has my: > > >>> > > >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > >>> > > >>> The code is buggy, and poorly thought through. Your lack of interest in > > >>> fixing the bugs in your patch is distressing. > > >> > > >> I'm not sure where you see me having a "lack of interest". The > > >> existing cap-checking sysctls have a corner-case bug, which is > > >> orthogonal to this change. > > > > > > That certainly doesn't sound like you have any plans to change anything > > > there. > > > > > >>> So broken code, not willing to fix. No. We are not merging this sysctl. > > >> > > >> I think you're jumping to conclusions. :) > > > > > > I think I am the maintainer. > > > > > > What you are proposing is very much something that is only of interst to > > > people who are not using user namespaces. It is fatally flawed as > > > a way to avoid new attack surfaces for people who don't care as the > > > sysctl leaves user namespaces enabled by default. It is fatally flawed > > > as remediation to recommend to people to change if a new user namespace > > > related but is discovered. Any running process that happens to be > > > created while user namespace creation was enabled will continue to > > > exist. Effectively a reboot will be required as part of a mitigation. > > > Many sysadmins will get that wrong. > > > > > > I can't possibly see your sysctl as proposed achieving it's goals. A > > > person has to be entirely too aware of subtlety and nuance to use it > > > effectively. > > > > What you're saying is true for the "oh crap" case of a new userns > > related CVE being found. However, there is the case where sysadmins > > know for a fact that a set of machines should not allow user > > namespaces to be enabled. Currently they have 2 choices, 1) use their > > Hi - can you give a specific example of this? (Where users really should > not be able to use them - not where they might not need them) I think > it'll help the discussion tremendously. Because so far the only good > arguments I've seen have been about actual bugs in the user namespaces, > which would not warrant a designed-in permanent disable switch. If > there are good use cases where such a disable switch will always be > needed (and compiling out can't satisfy) that'd be helpful. I already did in the email you quoted. Limiting the surface where the admin knows that shouldn't be used without having to rebuild a distribution kernel. Think of it like blacklisting a module, except user namespaces can't be modules. josh [-- Attachment #2: Type: text/html, Size: 4499 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 4:57 ` [kernel-hardening] " Eric W. Biederman @ 2016-01-26 16:37 ` Kees Cook -1 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-26 16:37 UTC (permalink / raw) To: Eric W. Biederman Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 8:57 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>>> >>>> Well, I don't know about less weird, but it would leave a unneeded >>>> hole in the permission checks. >>> >>> To be clear the current patch has my: >>> >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>> >>> The code is buggy, and poorly thought through. Your lack of interest in >>> fixing the bugs in your patch is distressing. >> >> I'm not sure where you see me having a "lack of interest". The >> existing cap-checking sysctls have a corner-case bug, which is >> orthogonal to this change. > > That certainly doesn't sound like you have any plans to change anything > there. Again, not sure why you think that. My primary role in kernel development is fixing or helping coordinate fixing of security issues and features. I already acknowledged the issue (it is a corner case, and no one seems to debate that). I'm working based on priorities; I have a long list of things to do. :) >>> So broken code, not willing to fix. No. We are not merging this sysctl. >> >> I think you're jumping to conclusions. :) > > I think I am the maintainer. Sure, no debate there. In fact, I'm certain you're the maintainer. :) > What you are proposing is very much something that is only of interst to > people who are not using user namespaces. It is fatally flawed as > a way to avoid new attack surfaces for people who don't care as the > sysctl leaves user namespaces enabled by default. It is fatally flawed > as remediation to recommend to people to change if a new user namespace > related but is discovered. Any running process that happens to be > created while user namespace creation was enabled will continue to > exist. Effectively a reboot will be required as part of a mitigation. > Many sysadmins will get that wrong. I disagree. The same kinds of issues exist with any of the *_restrict sysctls: if you turn them on later, things that happened before are still going to be a problem. You'll have already leaked a kernel base address, etc. This would be no different. I'm open to having this sysctl kill all CLONE_NEWUSERed process trees, if you think that'll be more useful? > I can't possibly see your sysctl as proposed achieving it's goals. A > person has to be entirely too aware of subtlety and nuance to use it > effectively. Again, I disagree. There are plenty of people who want to have user ns disabled. This gives them the knob to do so. >> This feature is already implemented by two distros, and likely wanted >> by others. We cannot ignore that. The sysctl default doesn't change >> the existing behavior, so this doesn't get in your way at all. Can you >> please respond to my earlier email where I rebutted each of your >> arguments against it? Just saying "no" and putting words in my mouth >> isn't very productive. > > Calling people who make mistakes insane is not a rebuttal. In security I said this: >> Any admin that decides to just turn off CLONE_NEWUSER in the middle of >> still using it is insane. I don't think this breeds any false sense of >> security as most sysctls are set at boot time. I was arguing that admins that use the sysctl are not going to be the admins that are using containers already. I didn't mean it as "making a mistake is insane" but rather "it would appear that a person using both would be seeking opposing goals". > usability matters, and your sysctl has low usability. Unsurprisingly, we disagree here too. This sysctl serves as an attack surface reduction tool. I never saw it as a way to evict existing containers. > Further you seem to have missed something crucial in your understanding. > As was explained earlier the sysctl was added to ubuntu to allow early > adopters to experiment not as a long term way of managing user > namespaces. It's not about management: the audience of the sysctl is only those that are not using user namespaces. Providing attack surface reduction tools to admins is a net win for Linux security as a whole. We both want the same thing: a safer Linux environment. There's no debate that having user ns exposes a larger attack surface than not having it. Being able to disable it for people not interested in using user ns means a reduction in their attack surface. > What sounds like a generally useful feature that would cover your use > case and many others is a per user limit on the number of user > namespaces users may create. That sounds fine to me. Are you thinking of a new RLIMIT, or something else? I don't need a sysctl, I just want a way to effectively disable user ns. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled @ 2016-01-26 16:37 ` Kees Cook 0 siblings, 0 replies; 80+ messages in thread From: Kees Cook @ 2016-01-26 16:37 UTC (permalink / raw) To: Eric W. Biederman Cc: Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel, kernel-hardening On Mon, Jan 25, 2016 at 8:57 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Mon, Jan 25, 2016 at 11:33 AM, Eric W. Biederman >> <ebiederm@xmission.com> wrote: >>> Kees Cook <keescook@chromium.org> writes: >>>> >>>> Well, I don't know about less weird, but it would leave a unneeded >>>> hole in the permission checks. >>> >>> To be clear the current patch has my: >>> >>> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> >>> >>> The code is buggy, and poorly thought through. Your lack of interest in >>> fixing the bugs in your patch is distressing. >> >> I'm not sure where you see me having a "lack of interest". The >> existing cap-checking sysctls have a corner-case bug, which is >> orthogonal to this change. > > That certainly doesn't sound like you have any plans to change anything > there. Again, not sure why you think that. My primary role in kernel development is fixing or helping coordinate fixing of security issues and features. I already acknowledged the issue (it is a corner case, and no one seems to debate that). I'm working based on priorities; I have a long list of things to do. :) >>> So broken code, not willing to fix. No. We are not merging this sysctl. >> >> I think you're jumping to conclusions. :) > > I think I am the maintainer. Sure, no debate there. In fact, I'm certain you're the maintainer. :) > What you are proposing is very much something that is only of interst to > people who are not using user namespaces. It is fatally flawed as > a way to avoid new attack surfaces for people who don't care as the > sysctl leaves user namespaces enabled by default. It is fatally flawed > as remediation to recommend to people to change if a new user namespace > related but is discovered. Any running process that happens to be > created while user namespace creation was enabled will continue to > exist. Effectively a reboot will be required as part of a mitigation. > Many sysadmins will get that wrong. I disagree. The same kinds of issues exist with any of the *_restrict sysctls: if you turn them on later, things that happened before are still going to be a problem. You'll have already leaked a kernel base address, etc. This would be no different. I'm open to having this sysctl kill all CLONE_NEWUSERed process trees, if you think that'll be more useful? > I can't possibly see your sysctl as proposed achieving it's goals. A > person has to be entirely too aware of subtlety and nuance to use it > effectively. Again, I disagree. There are plenty of people who want to have user ns disabled. This gives them the knob to do so. >> This feature is already implemented by two distros, and likely wanted >> by others. We cannot ignore that. The sysctl default doesn't change >> the existing behavior, so this doesn't get in your way at all. Can you >> please respond to my earlier email where I rebutted each of your >> arguments against it? Just saying "no" and putting words in my mouth >> isn't very productive. > > Calling people who make mistakes insane is not a rebuttal. In security I said this: >> Any admin that decides to just turn off CLONE_NEWUSER in the middle of >> still using it is insane. I don't think this breeds any false sense of >> security as most sysctls are set at boot time. I was arguing that admins that use the sysctl are not going to be the admins that are using containers already. I didn't mean it as "making a mistake is insane" but rather "it would appear that a person using both would be seeking opposing goals". > usability matters, and your sysctl has low usability. Unsurprisingly, we disagree here too. This sysctl serves as an attack surface reduction tool. I never saw it as a way to evict existing containers. > Further you seem to have missed something crucial in your understanding. > As was explained earlier the sysctl was added to ubuntu to allow early > adopters to experiment not as a long term way of managing user > namespaces. It's not about management: the audience of the sysctl is only those that are not using user namespaces. Providing attack surface reduction tools to admins is a net win for Linux security as a whole. We both want the same thing: a safer Linux environment. There's no debate that having user ns exposes a larger attack surface than not having it. Being able to disable it for people not interested in using user ns means a reduction in their attack surface. > What sounds like a generally useful feature that would cover your use > case and many others is a per user limit on the number of user > namespaces users may create. That sounds fine to me. Are you thinking of a new RLIMIT, or something else? I don't need a sysctl, I just want a way to effectively disable user ns. -Kees -- Kees Cook Chrome OS & Brillo Security ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-26 4:57 ` [kernel-hardening] " Eric W. Biederman ` (2 preceding siblings ...) (?) @ 2016-01-28 8:56 ` Serge E. Hallyn 2016-01-28 12:53 ` Austin S. Hemmelgarn -1 siblings, 1 reply; 80+ messages in thread From: Serge E. Hallyn @ 2016-01-28 8:56 UTC (permalink / raw) To: kernel-hardening Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On Mon, Jan 25, 2016 at 10:57:32PM -0600, Eric W. Biederman wrote: > What sounds like a generally useful feature that would cover your use > case and many others is a per user limit on the number of user > namespaces users may create. Ok, I'm sorry, but after thinking about this quite awhile, I think this is a bad idea. If I'm allowed to create exactly one, then (a) I won't be able to run two instances of chrome (does chrome use one userns per tab or per application?), yet (b) i can easily just not use chrome and use my allocation to run a vulnerability. IMO, having a (hopefully temporary, so cleanly separated out) sysctl, which perhaps goes so far as to kill all non-init user namespaces when set to -1, makes the most sense. I still think the harm due to having userspace not being able to rely on user namespaces will, long term, be worse than the security implications of having user namespaces always enabled. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled 2016-01-28 8:56 ` Serge E. Hallyn @ 2016-01-28 12:53 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 80+ messages in thread From: Austin S. Hemmelgarn @ 2016-01-28 12:53 UTC (permalink / raw) To: Serge E. Hallyn, kernel-hardening Cc: Kees Cook, Andy Lutomirski, Andrew Morton, Al Viro, Richard Weinberger, Robert Święcki, Dmitry Vyukov, David Howells, Miklos Szeredi, Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Sasha Levin, linux-doc, linux-kernel On 2016-01-28 03:56, Serge E. Hallyn wrote: > On Mon, Jan 25, 2016 at 10:57:32PM -0600, Eric W. Biederman wrote: >> What sounds like a generally useful feature that would cover your use >> case and many others is a per user limit on the number of user >> namespaces users may create. > > Ok, I'm sorry, but after thinking about this quite awhile, I think this > is a bad idea. If I'm allowed to create exactly one, then (a) I won't > be able to run two instances of chrome (does chrome use one userns per > tab or per application?), yet (b) i can easily just not use chrome and > use my allocation to run a vulnerability. Ignore regular users WRT the ability to limit to one or two namespace instances, the advantage of being able to limit to one or two user namespaces for a given user is that it prevents daemons that use it for sand-boxing from being exploited to create more. If we go with Eric's suggestion, there is absolutely no reason on a regular desktop system that the users used for a graphical login would have to be limited. > > IMO, having a (hopefully temporary, so cleanly separated out) sysctl, > which perhaps goes so far as to kill all non-init user namespaces when > set to -1, makes the most sense. I still think the harm due to having > userspace not being able to rely on user namespaces will, long term, be > worse than the security implications of having user namespaces always > enabled. Userspace can't rely on them as it is right now. The default kernel config has them disabled, and a number of distros are refusing to ship kernels with them enabled at least in the short term, wheras a number are only shipping kernels with them enabled (I'd be willing to bet that part of what prompted this originally was Chrome). Just by virtue of this, you can't rely on them being there and need to gracefully handle the situation if they aren't. Adding a sysctl allowing them to be disabled completely on a kernel that has support built in would not make them all that much more unreliable than they already are. ^ permalink raw reply [flat|nested] 80+ messages in thread
end of thread, other threads:[~2016-01-28 14:41 UTC | newest] Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-01-22 22:39 [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled Kees Cook 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook 2016-01-22 22:39 ` [PATCH 1/2] sysctl: expand use of proc_dointvec_minmax_sysadmin Kees Cook 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook 2016-01-23 3:10 ` Eric W. Biederman 2016-01-23 3:10 ` [kernel-hardening] " Eric W. Biederman 2016-01-23 22:25 ` Jann Horn 2016-01-24 1:20 ` Eric W. Biederman 2016-01-24 1:43 ` Al Viro 2016-01-24 1:56 ` Jann Horn 2016-01-24 6:02 ` Eric W. Biederman 2016-01-24 6:32 ` Jann Horn 2016-01-24 6:44 ` Eric W. Biederman 2016-01-22 22:39 ` [PATCH 2/2] sysctl: allow CLONE_NEWUSER to be disabled Kees Cook 2016-01-22 22:39 ` [kernel-hardening] " Kees Cook 2016-01-22 22:47 ` Robert Święcki 2016-01-22 22:47 ` [kernel-hardening] " Robert Święcki 2016-01-22 22:50 ` Kees Cook 2016-01-22 22:50 ` [kernel-hardening] " Kees Cook 2016-01-22 22:55 ` Robert Święcki 2016-01-22 22:55 ` [kernel-hardening] " Robert Święcki 2016-01-22 23:00 ` Kees Cook 2016-01-22 23:00 ` [kernel-hardening] " Kees Cook 2016-01-23 0:44 ` Serge Hallyn 2016-01-23 0:44 ` [kernel-hardening] " Serge Hallyn 2016-01-23 0:44 ` Serge Hallyn 2016-01-23 0:44 ` [kernel-hardening] " Serge Hallyn 2016-01-23 0:59 ` Ben Hutchings 2016-01-24 20:59 ` Kees Cook 2016-01-24 20:59 ` Kees Cook 2016-01-24 22:20 ` Andy Lutomirski 2016-01-25 18:51 ` Kees Cook 2016-01-22 22:49 ` [PATCH 0/2] " Richard Weinberger 2016-01-22 22:49 ` [kernel-hardening] " Richard Weinberger 2016-01-23 3:02 ` Eric W. Biederman 2016-01-23 3:02 ` [kernel-hardening] " Eric W. Biederman 2016-01-24 20:57 ` Kees Cook 2016-01-24 20:57 ` [kernel-hardening] " Kees Cook 2016-01-26 7:38 ` Serge Hallyn 2016-01-24 22:22 ` Andy Lutomirski 2016-01-24 22:22 ` [kernel-hardening] " Andy Lutomirski 2016-01-25 18:51 ` Kees Cook 2016-01-25 18:51 ` [kernel-hardening] " Kees Cook 2016-01-25 18:53 ` Andy Lutomirski 2016-01-25 18:53 ` [kernel-hardening] " Andy Lutomirski 2016-01-25 18:56 ` Kees Cook 2016-01-25 18:56 ` [kernel-hardening] " Kees Cook 2016-01-25 19:33 ` Eric W. Biederman 2016-01-25 19:33 ` [kernel-hardening] " Eric W. Biederman 2016-01-25 22:34 ` Kees Cook 2016-01-25 22:34 ` [kernel-hardening] " Kees Cook 2016-01-25 23:33 ` Andy Lutomirski 2016-01-25 23:33 ` [kernel-hardening] " Andy Lutomirski 2016-01-26 2:27 ` Daniel Micay 2016-01-26 4:57 ` Eric W. Biederman 2016-01-26 4:57 ` [kernel-hardening] " Eric W. Biederman 2016-01-26 14:38 ` Josh Boyer 2016-01-26 14:38 ` [kernel-hardening] " Josh Boyer 2016-01-26 14:46 ` Austin S. Hemmelgarn 2016-01-26 14:46 ` [kernel-hardening] " Austin S. Hemmelgarn 2016-01-26 14:56 ` Josh Boyer 2016-01-26 14:56 ` [kernel-hardening] " Josh Boyer 2016-01-26 17:20 ` Serge Hallyn 2016-01-26 19:56 ` Josh Boyer 2016-01-26 20:11 ` Austin S. Hemmelgarn 2016-01-26 17:15 ` Serge Hallyn 2016-01-26 18:09 ` Austin S. Hemmelgarn 2016-01-26 18:27 ` Andy Lutomirski 2016-01-26 18:45 ` Austin S. Hemmelgarn 2016-01-26 23:15 ` Kees Cook 2016-01-26 23:13 ` Kees Cook 2016-01-27 10:27 ` Eric W. Biederman 2016-01-27 12:32 ` Austin S. Hemmelgarn 2016-01-28 14:41 ` Robert Święcki 2016-01-28 14:41 ` Robert Święcki 2016-01-26 23:47 ` Josh Boyer 2016-01-26 16:37 ` Kees Cook 2016-01-26 16:37 ` [kernel-hardening] " Kees Cook 2016-01-28 8:56 ` Serge E. Hallyn 2016-01-28 12:53 ` Austin S. Hemmelgarn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.