* [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
From: Iulia Manda @ 2015-01-29 18:43 UTC
To: gnomes
Cc: serge.hallyn, linux-kernel, akpm, paulmck, josh, peterz, mhocko

There are a lot of embedded systems that run most or all of their functionality
in init, running as root:root. For these systems, supporting multiple users is
not necessary.

This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root
users, non-root groups, and capabilities optional.

When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups,
setfsuid, setfsgid, capget, capset.

Also, groups.c is compiled out completely.

This change saves about 25 KB on a defconfig build.

The kernel was booted in Qemu. All the common functionalities work. Adding
users/groups is not possible, failing with -ENOSYS.

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
---
Changes since v1:
- refactor code;
- compile out groups.c;
- if groups_alloc is called, enable NON_ROOT;

 arch/s390/Kconfig                     |    1 +
 drivers/staging/lustre/lustre/Kconfig |    1 +
 fs/nfsd/Kconfig                       |    1 +
 include/linux/capability.h            |   29 +++++++++++++++++++++++++++
 include/linux/cred.h                  |   23 ++++++++++++++++++----
 include/linux/uidgid.h                |   12 +++++++++++
 init/Kconfig                          |   19 +++++++++++++++++-
 kernel/Makefile                       |    4 +++-
 kernel/capability.c                   |   35 ++++++++++++++++++---------------
 kernel/cred.c                         |    3 +++
 kernel/groups.c                       |    3 ---
 kernel/sys.c                          |    2 ++
 kernel/sys_ni.c                       |   14 +++++++++++++
 net/sunrpc/Kconfig                    |    2 ++
 14 files changed, 124 insertions(+), 25 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 68b68d7..b2d2116 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -324,6 +324,7 @@ config COMPAT
 	select COMPAT_BINFMT_ELF if BINFMT_ELF
 	select ARCH_WANT_OLD_COMPAT_IPC
 	select COMPAT_OLD_SIGACTION
+	select NON_ROOT
 	help
 	  Select this option if you want to enable your system kernel to
 	  handle system-calls from ELF binaries for 31 bit ESA.  This option
diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig
index 6725467..b975f62 100644
--- a/drivers/staging/lustre/lustre/Kconfig
+++ b/drivers/staging/lustre/lustre/Kconfig
@@ -10,6 +10,7 @@ config LUSTRE_FS
 	select CRYPTO_SHA1
 	select CRYPTO_SHA256
 	select CRYPTO_SHA512
+	select NON_ROOT
 	help
 	  This option enables Lustre file system client support. Choose Y
 	  here if you want to access a Lustre file system cluster.  To compile
diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index 7339515..1a8d6d9 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -6,6 +6,7 @@ config NFSD
 	select SUNRPC
 	select EXPORTFS
 	select NFS_ACL_SUPPORT if NFSD_V2_ACL
+	select NON_ROOT
 	help
 	  Choose Y here if you want to allow other computers to access
 	  files residing on this system using Sun's Network File System
diff --git a/include/linux/capability.h b/include/linux/capability.h
index aa93e5e..601c5de 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a,
 			   cap_intersect(permitted, __cap_nfsd_set));
 }
 
+#ifdef CONFIG_NON_ROOT
 extern bool has_capability(struct task_struct *t, int cap);
 extern bool has_ns_capability(struct task_struct *t,
 			      struct user_namespace *ns, int cap);
@@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
 				      struct user_namespace *ns, int cap);
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
+#else
+static inline bool has_capability(struct task_struct *t, int cap)
+{
+	return true;
+}
+static inline bool has_ns_capability(struct task_struct *t,
+			      struct user_namespace *ns, int cap)
+{
+	return true;
+}
+static inline bool has_capability_noaudit(struct task_struct *t, int cap)
+{
+	return true;
+}
+static inline bool has_ns_capability_noaudit(struct task_struct *t,
+				      struct user_namespace *ns, int cap)
+{
+	return true;
+}
+static inline bool capable(int cap)
+{
+	return true;
+}
+static inline bool ns_capable(struct user_namespace *ns, int cap)
+{
+	return true;
+}
+#endif /* CONFIG_NON_ROOT */
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
 
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2fb2ca2..08ea5c6 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -62,9 +62,27 @@ do {							\
 		groups_free(group_info);		\
 } while (0)
 
-extern struct group_info *groups_alloc(int);
 extern struct group_info init_groups;
+#ifdef CONFIG_NON_ROOT
+extern struct group_info *groups_alloc(int);
 extern void groups_free(struct group_info *);
+
+extern int in_group_p(kgid_t);
+extern int in_egroup_p(kgid_t);
+#else
+static inline void groups_free(struct group_info *group_info)
+{
+}
+
+static inline int in_group_p(kgid_t grp)
+{
+	return 1;
+}
+static inline int in_egroup_p(kgid_t grp)
+{
+	return 1;
+}
+#endif
 extern int set_current_groups(struct group_info *);
 extern void set_groups(struct cred *, struct group_info *);
 extern int groups_search(const struct group_info *, kgid_t);
@@ -74,9 +92,6 @@ extern bool may_setgroups(void);
 #define GROUP_AT(gi, i) \
 	((gi)->blocks[(i) / NGROUPS_PER_BLOCK][(i) % NGROUPS_PER_BLOCK])
 
-extern int in_group_p(kgid_t);
-extern int in_egroup_p(kgid_t);
-
 /*
  * The security context of a task
  *
diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
index 2d1f9b6..22bd1fa 100644
--- a/include/linux/uidgid.h
+++ b/include/linux/uidgid.h
@@ -29,6 +29,7 @@ typedef struct {
 #define KUIDT_INIT(value) (kuid_t){ value }
 #define KGIDT_INIT(value) (kgid_t){ value }
 
+#ifdef CONFIG_NON_ROOT
 static inline uid_t __kuid_val(kuid_t uid)
 {
 	return uid.val;
@@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid)
 {
 	return gid.val;
 }
+#else
+static inline uid_t __kuid_val(kuid_t uid)
+{
+	return 0;
+}
+
+static inline gid_t __kgid_val(kgid_t gid)
+{
+	return 0;
+}
+#endif
 
 #define GLOBAL_ROOT_UID KUIDT_INIT(0)
 #define GLOBAL_ROOT_GID KGIDT_INIT(0)
diff --git a/init/Kconfig b/init/Kconfig
index 9afb971..dc5bfd4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -394,6 +394,7 @@ endchoice
 
 config BSD_PROCESS_ACCT
 	bool "BSD Process Accounting"
+	select NON_ROOT
 	help
 	  If you say Y here, a user level program will be able to instruct the
 	  kernel (via a special system call) to write process accounting
@@ -420,6 +421,7 @@ config BSD_PROCESS_ACCT_V3
 config TASKSTATS
 	bool "Export task/process statistics through netlink"
 	depends on NET
+	select NON_ROOT
 	default n
 	help
 	  Export selected statistics for tasks/processes through the
@@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE
 
 menuconfig NAMESPACES
 	bool "Namespaces support" if EXPERT
+	depends on NON_ROOT
 	default !EXPERT
 	help
 	  Provides the way to make tasks work with different objects using
@@ -1352,11 +1355,25 @@ menuconfig EXPERT
 
 config UID16
 	bool "Enable 16-bit UID system calls" if EXPERT
-	depends on HAVE_UID16
+	depends on HAVE_UID16 && NON_ROOT
 	default y
 	help
 	  This enables the legacy 16-bit UID syscall wrappers.
 
+config NON_ROOT
+	bool "Multiple users, groups and capabilities support" if EXPERT
+	default y
+	help
+	  This option enables support for non-root users, groups and
+	  capabilities.
+
+	  If you say N here, all processes will run with UID 0, GID 0, and all
+	  possible capabilities.  Saying N here also compiles out support for
+	  system calls related to UIDs, GIDs, and capabilities, such as setuid,
+	  setgid, and capset.
+
+	  If unsure, say Y here.
+
 config SGETMASK_SYSCALL
 	bool "sgetmask/ssetmask syscalls support" if EXPERT
 	def_bool PARISC || MN10300 || BLACKFIN || M68K || PPC || MIPS || X86 || SPARC || CRIS || MICROBLAZE || SUPERH
diff --git a/kernel/Makefile b/kernel/Makefile
index a59481a..d5ca6b8 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,9 @@ obj-y     = fork.o exec_domain.o panic.o \
 	    extable.o params.o \
 	    kthread.o sys_ni.o nsproxy.o \
 	    notifier.o ksysfs.o cred.o reboot.o \
-	    async.o range.o groups.o smpboot.o
+	    async.o range.o smpboot.o
+
+obj-$(CONFIG_NON_ROOT) += groups.o
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace debug files and internal ftrace files
diff --git a/kernel/capability.c b/kernel/capability.c
index 989f5bf..2638412 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str)
 }
 __setup("no_file_caps", file_caps_disable);
 
+#ifdef CONFIG_NON_ROOT
 /*
  * More recent versions of libcap are available from:
  *
@@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap)
 }
 EXPORT_SYMBOL(ns_capable);
 
+
+/**
+ * capable - Determine if the current task has a superior capability in effect
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool capable(int cap)
+{
+	return ns_capable(&init_user_ns, cap);
+}
+EXPORT_SYMBOL(capable);
+#endif /* CONFIG_NON_ROOT */
+
 /**
  * file_ns_capable - Determine if the file's opener had a capability in effect
  * @file:  The file we want to check
@@ -412,22 +431,6 @@ bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 EXPORT_SYMBOL(file_ns_capable);
 
 /**
- * capable - Determine if the current task has a superior capability in effect
- * @cap: The capability to be tested for
- *
- * Return true if the current task has the given superior capability currently
- * available for use, false if not.
- *
- * This sets PF_SUPERPRIV on the task if the capability is available on the
- * assumption that it's about to be used.
- */
-bool capable(int cap)
-{
-	return ns_capable(&init_user_ns, cap);
-}
-EXPORT_SYMBOL(capable);
-
-/**
  * capable_wrt_inode_uidgid - Check nsown_capable and uid and gid mapped
  * @inode: The inode in question
  * @cap: The capability in question
diff --git a/kernel/cred.c b/kernel/cred.c
index e0573a4..ec1c076 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -29,6 +29,9 @@
 
 static struct kmem_cache *cred_jar;
 
+/* init to 2 - one for init_task, one to ensure it is never freed */
+struct group_info init_groups = { .usage = ATOMIC_INIT(2) };
+
 /*
  * The initial credentials for the initial task
  */
diff --git a/kernel/groups.c b/kernel/groups.c
index 664411f..74d431d 100644
--- a/kernel/groups.c
+++ b/kernel/groups.c
@@ -9,9 +9,6 @@
 #include <linux/user_namespace.h>
 #include <asm/uaccess.h>
 
-/* init to 2 - one for init_task, one to ensure it is never freed */
-struct group_info init_groups = { .usage = ATOMIC_INIT(2) };
-
 struct group_info *groups_alloc(int gidsetsize)
 {
 	struct group_info *group_info;
diff --git a/kernel/sys.c b/kernel/sys.c
index a8c9f5a..bfe532b 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -319,6 +319,7 @@ out_unlock:
  * SMP: There are not races, the GIDs are checked only by filesystem
  *      operations (as far as semantic preservation is concerned).
  */
+#ifdef CONFIG_NON_ROOT
 SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid)
 {
 	struct user_namespace *ns = current_user_ns();
@@ -809,6 +810,7 @@ change_okay:
 	commit_creds(new);
 	return old_fsgid;
 }
+#endif /* CONFIG_NON_ROOT */
 
 /**
  * sys_getpid - return the thread group id of the current process
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 5adcb0a..7995ef5 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -159,6 +159,20 @@ cond_syscall(sys_uselib);
 cond_syscall(sys_fadvise64);
 cond_syscall(sys_fadvise64_64);
 cond_syscall(sys_madvise);
+cond_syscall(sys_setuid);
+cond_syscall(sys_setregid);
+cond_syscall(sys_setgid);
+cond_syscall(sys_setreuid);
+cond_syscall(sys_setresuid);
+cond_syscall(sys_getresuid);
+cond_syscall(sys_setresgid);
+cond_syscall(sys_getresgid);
+cond_syscall(sys_setgroups);
+cond_syscall(sys_getgroups);
+cond_syscall(sys_setfsuid);
+cond_syscall(sys_setfsgid);
+cond_syscall(sys_capget);
+cond_syscall(sys_capset);
 
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig
index fb78117..2b2c471 100644
--- a/net/sunrpc/Kconfig
+++ b/net/sunrpc/Kconfig
@@ -1,9 +1,11 @@
 config SUNRPC
 	tristate
+	select NON_ROOT
 
 config SUNRPC_GSS
 	tristate
 	select OID_REGISTRY
+	select NON_ROOT
 
 config SUNRPC_BACKCHANNEL
 	bool
--
1.7.10.4

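The capability.h hunk in the patch above swaps extern declarations for static inline stubs that always return true when the option is off. As a rough userspace illustration of that pattern (not the kernel code itself), the sketch below uses a made-up macro DEMO_MULTIUSER in place of CONFIG_NON_ROOT; every demo_* name and the simplified capability bitmask are assumptions introduced only for this example.

```c
/* Illustrative sketch only: DEMO_MULTIUSER stands in for CONFIG_NON_ROOT,
 * and every demo_* identifier is hypothetical, not kernel API. */
#include <stdbool.h>

#define DEMO_CAP_SYS_ADMIN 21	/* same number as CAP_SYS_ADMIN, used as a label */

#ifdef DEMO_MULTIUSER
/* Full build: test a (much simplified) per-task capability bitmask. */
static unsigned long demo_effective_caps;	/* empty by default */

static inline bool demo_capable(int cap)
{
	return (demo_effective_caps >> cap) & 1UL;
}
#else
/* Single-user build: the check is a constant, so the compiler can
 * drop the failure branch in every caller. */
static inline bool demo_capable(int cap)
{
	(void)cap;
	return true;
}
#endif

/* The caller is written once and is identical in both configurations. */
static int demo_privileged_op(void)
{
	if (!demo_capable(DEMO_CAP_SYS_ADMIN))
		return -1;	/* stands in for -EPERM */
	return 0;
}
```

Built without -DDEMO_MULTIUSER, demo_privileged_op() always takes the success path; built with it, the result depends on the bitmask. That is the sense in which call sites need no changes when the option is disabled.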
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
From: Geert Uytterhoeven @ 2015-01-29 18:59 UTC
To: Iulia Manda
Cc: One Thousand Gnomes, Serge Hallyn, linux-kernel, Andrew Morton, Paul McKenney, Josh Triplett, Peter Zijlstra, Michal Hocko

Hi Iulia,

On Thu, Jan 29, 2015 at 7:43 PM, Iulia Manda <iulia.manda21@gmail.com> wrote:
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 68b68d7..b2d2116 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -324,6 +324,7 @@ config COMPAT
>  	select COMPAT_BINFMT_ELF if BINFMT_ELF
>  	select ARCH_WANT_OLD_COMPAT_IPC
>  	select COMPAT_OLD_SIGACTION
> +	select NON_ROOT

> @@ -10,6 +10,7 @@ config LUSTRE_FS

> +	select NON_ROOT

> @@ -6,6 +6,7 @@ config NFSD

> +	select NON_ROOT

>  config BSD_PROCESS_ACCT
>  	bool "BSD Process Accounting"
> +	select NON_ROOT

>  config TASKSTATS

> +	select NON_ROOT

Is there a specific reason why you chose to use "select NON_ROOT"
instead of "depends on NON_ROOT" for all these options?
As configuring NON_ROOT=n is quite a drastic decision, I don't
think you should let that be revertible so easily by all those selects.

> @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE
>
>  menuconfig NAMESPACES
>  	bool "Namespaces support" if EXPERT
> +	depends on NON_ROOT

> @@ -1352,11 +1355,25 @@ menuconfig EXPERT
>
>  config UID16
>  	bool "Enable 16-bit UID system calls" if EXPERT
> -	depends on HAVE_UID16
> +	depends on HAVE_UID16 && NON_ROOT

Ah, finally a few "depends on".

> +config NON_ROOT
> +	bool "Multiple users, groups and capabilities support" if EXPERT
> +	default y
> +	help
> +	  This option enables support for non-root users, groups and
> +	  capabilities.
> +
> +	  If you say N here, all processes will run with UID 0, GID 0, and all
> +	  possible capabilities.  Saying N here also compiles out support for
> +	  system calls related to UIDs, GIDs, and capabilities, such as setuid,
> +	  setgid, and capset.
> +
> +	  If unsure, say Y here.

I think it would be clearer to use positive instead of negative logic.
What about calling the option "MULTIUSER" instead of "NON_ROOT"?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
From: josh @ 2015-01-29 20:01 UTC
To: Geert Uytterhoeven
Cc: Iulia Manda, One Thousand Gnomes, Serge Hallyn, linux-kernel, Andrew Morton, Paul McKenney, Peter Zijlstra, Michal Hocko

On Thu, Jan 29, 2015 at 07:59:09PM +0100, Geert Uytterhoeven wrote:
> Hi Iulia,
>
> On Thu, Jan 29, 2015 at 7:43 PM, Iulia Manda <iulia.manda21@gmail.com> wrote:
> > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> > index 68b68d7..b2d2116 100644
> > --- a/arch/s390/Kconfig
> > +++ b/arch/s390/Kconfig
> > @@ -324,6 +324,7 @@ config COMPAT
> >  	select COMPAT_BINFMT_ELF if BINFMT_ELF
> >  	select ARCH_WANT_OLD_COMPAT_IPC
> >  	select COMPAT_OLD_SIGACTION
> > +	select NON_ROOT
>
> > @@ -10,6 +10,7 @@ config LUSTRE_FS
>
> > +	select NON_ROOT
>
> > @@ -6,6 +6,7 @@ config NFSD
>
> > +	select NON_ROOT
>
> >  config BSD_PROCESS_ACCT
> >  	bool "BSD Process Accounting"
> > +	select NON_ROOT
>
> >  config TASKSTATS
>
> > +	select NON_ROOT
>
> Is there a specific reason why you chose to use "select NON_ROOT"
> instead of "depends on NON_ROOT" for all these options?
> As configuring NON_ROOT=n is quite a drastic decision, I don't
> think you should let that be revertible so easily by all those selects.

In the past, there's been quite a bit of negative feedback about
"depends on", because that makes various options invisible and
un-enableable. (Kconfig can be awkward that way.) However, I think
it'd be perfectly reasonable to make all of these "depends on NON_ROOT"
instead, if there aren't any objections to doing so.

> > @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE
> >
> >  menuconfig NAMESPACES
> >  	bool "Namespaces support" if EXPERT
> > +	depends on NON_ROOT
>
> > @@ -1352,11 +1355,25 @@ menuconfig EXPERT
> >
> >  config UID16
> >  	bool "Enable 16-bit UID system calls" if EXPERT
> > -	depends on HAVE_UID16
> > +	depends on HAVE_UID16 && NON_ROOT
>
> Ah, finally a few "depends on".
>
> > +config NON_ROOT
> > +	bool "Multiple users, groups and capabilities support" if EXPERT
> > +	default y
> > +	help
> > +	  This option enables support for non-root users, groups and
> > +	  capabilities.
> > +
> > +	  If you say N here, all processes will run with UID 0, GID 0, and all
> > +	  possible capabilities.  Saying N here also compiles out support for
> > +	  system calls related to UIDs, GIDs, and capabilities, such as setuid,
> > +	  setgid, and capset.
> > +
> > +	  If unsure, say Y here.
>
> I think it would be clearer to use positive instead of negative logic.
> What about calling the option "MULTIUSER" instead of "NON_ROOT"?

Nice name idea; reminiscent of Multics versus UNIX.

The original motivation for CONFIG_NON_ROOT was to ensure that 'y' was
the option that added code to the kernel, so that "allnoconfig" does the
right thing. As long as the logic stays that way around, changing the
name of the option seems perfectly fine.

(As long as we're bikeshedding: CONFIG_MULTIUSER or CONFIG_MULTI_USER?)

- Josh Triplett

* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
From: Geert Uytterhoeven @ 2015-01-29 20:16 UTC
To: Josh Triplett
Cc: Iulia Manda, One Thousand Gnomes, Serge Hallyn, linux-kernel, Andrew Morton, Paul McKenney, Peter Zijlstra, Michal Hocko

Hi Josh,

On Thu, Jan 29, 2015 at 9:01 PM, <josh@joshtriplett.org> wrote:
>> > +	select NON_ROOT
>>
>> Is there a specific reason why you chose to use "select NON_ROOT"
>> instead of "depends on NON_ROOT" for all these options?
>> As configuring NON_ROOT=n is quite a drastic decision, I don't
>> think you should let that be revertible so easily by all those selects.
>
> In the past, there's been quite a bit of negative feedback about
> "depends on", because that makes various options invisible and
> un-enableable. (Kconfig can be awkward that way.) However, I think
> it'd be perfectly reasonable to make all of these "depends on NON_ROOT"
> instead, if there aren't any objections to doing so.

There have been more complaints about select, as it bypasses other
dependencies...

> (As long as we're bikeshedding: CONFIG_MULTIUSER or CONFIG_MULTI_USER?)

(I had checked before) ARM already has a MULTI_USER define, which does
something different. CIFS has CIFS_MOUNT_MULTIUSER.
So CONFIG_MULTIUSER sounds like the best color ;-)

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-29 18:43 [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities Iulia Manda 2015-01-29 18:59 ` Geert Uytterhoeven @ 2015-01-29 23:44 ` Casey Schaufler 2015-01-30 0:32 ` Paul E. McKenney 2015-01-30 0:43 ` josh 1 sibling, 2 replies; 22+ messages in thread From: Casey Schaufler @ 2015-01-29 23:44 UTC (permalink / raw) To: Iulia Manda, gnomes Cc: serge.hallyn, linux-kernel, akpm, paulmck, josh, peterz, mhocko, Casey Schaufler, LSM On 1/29/2015 10:43 AM, Iulia Manda wrote: > There are a lot of embedded systems that run most or all of their functionality > in init, running as root:root. For these systems, supporting multiple users is > not necessary. > > This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root > users, non-root groups, and capabilities optional. > > When this symbol is not defined, UID and GID are zero in any possible case > and processes always have all capabilities. > > The following syscalls are compiled out: setuid, setregid, setgid, > setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, > setfsuid, setfsgid, capget, capset. > > Also, groups.c is compiled out completely. > > This change saves about 25 KB on a defconfig build. > > The kernel was booted in Qemu. All the common functionalities work. Adding > users/groups is not possible, failing with -ENOSYS. > > Bloat-o-meter output: > add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) > > Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> > Reviewed-by: Josh Triplett <josh@joshtriplett.org> v2 does nothing to address the longstanding position of the community that disabling the traditional user based access controls is unacceptable. 
If the community has abandoned that position, and I see no reason to believe that is true, the correct implementation is to rework the LSM from an additional controls model to an authoritative hook model. Speaking of the LSM, what is your expectation regarding the use of security modules in addition to "NON_ROOT"? Is it forbidden, allowed or encouraged? Hacking security code out with ifdefs is a common enough practice, but I like to think the kernel community knows better. > --- > Changes since v1: > - refactor code; > - compile out groups.c; > - if groups_alloc is called, enable NON_ROOT; > > arch/s390/Kconfig | 1 + > drivers/staging/lustre/lustre/Kconfig | 1 + > fs/nfsd/Kconfig | 1 + > include/linux/capability.h | 29 +++++++++++++++++++++++++++ > include/linux/cred.h | 23 ++++++++++++++++++---- > include/linux/uidgid.h | 12 +++++++++++ > init/Kconfig | 19 +++++++++++++++++- > kernel/Makefile | 4 +++- > kernel/capability.c | 35 ++++++++++++++++++--------------- > kernel/cred.c | 3 +++ > kernel/groups.c | 3 --- > kernel/sys.c | 2 ++ > kernel/sys_ni.c | 14 +++++++++++++ > net/sunrpc/Kconfig | 2 ++ > 14 files changed, 124 insertions(+), 25 deletions(-) > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > index 68b68d7..b2d2116 100644 > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -324,6 +324,7 @@ config COMPAT > select COMPAT_BINFMT_ELF if BINFMT_ELF > select ARCH_WANT_OLD_COMPAT_IPC > select COMPAT_OLD_SIGACTION > + select NON_ROOT > help > Select this option if you want to enable your system kernel to > handle system-calls from ELF binaries for 31 bit ESA. 
This option > diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig > index 6725467..b975f62 100644 > --- a/drivers/staging/lustre/lustre/Kconfig > +++ b/drivers/staging/lustre/lustre/Kconfig > @@ -10,6 +10,7 @@ config LUSTRE_FS > select CRYPTO_SHA1 > select CRYPTO_SHA256 > select CRYPTO_SHA512 > + select NON_ROOT > help > This option enables Lustre file system client support. Choose Y > here if you want to access a Lustre file system cluster. To compile > diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig > index 7339515..1a8d6d9 100644 > --- a/fs/nfsd/Kconfig > +++ b/fs/nfsd/Kconfig > @@ -6,6 +6,7 @@ config NFSD > select SUNRPC > select EXPORTFS > select NFS_ACL_SUPPORT if NFSD_V2_ACL > + select NON_ROOT > help > Choose Y here if you want to allow other computers to access > files residing on this system using Sun's Network File System > diff --git a/include/linux/capability.h b/include/linux/capability.h > index aa93e5e..601c5de 100644 > --- a/include/linux/capability.h > +++ b/include/linux/capability.h > @@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a, > cap_intersect(permitted, __cap_nfsd_set)); > } > > +#ifdef CONFIG_NON_ROOT > extern bool has_capability(struct task_struct *t, int cap); > extern bool has_ns_capability(struct task_struct *t, > struct user_namespace *ns, int cap); > @@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t, > struct user_namespace *ns, int cap); > extern bool capable(int cap); > extern bool ns_capable(struct user_namespace *ns, int cap); > +#else > +static inline bool has_capability(struct task_struct *t, int cap) > +{ > + return true; > +} > +static inline bool has_ns_capability(struct task_struct *t, > + struct user_namespace *ns, int cap) > +{ > + return true; > +} > +static inline bool has_capability_noaudit(struct task_struct *t, int cap) > +{ > + return true; > +} > +static inline bool has_ns_capability_noaudit(struct 
task_struct *t, > + struct user_namespace *ns, int cap) > +{ > + return true; > +} > +static inline bool capable(int cap) > +{ > + return true; > +} > +static inline bool ns_capable(struct user_namespace *ns, int cap) > +{ > + return true; > +} > +#endif /* CONFIG_NON_ROOT */ > extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap); > extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap); > > diff --git a/include/linux/cred.h b/include/linux/cred.h > index 2fb2ca2..08ea5c6 100644 > --- a/include/linux/cred.h > +++ b/include/linux/cred.h > @@ -62,9 +62,27 @@ do { \ > groups_free(group_info); \ > } while (0) > > -extern struct group_info *groups_alloc(int); > extern struct group_info init_groups; > +#ifdef CONFIG_NON_ROOT > +extern struct group_info *groups_alloc(int); > extern void groups_free(struct group_info *); > + > +extern int in_group_p(kgid_t); > +extern int in_egroup_p(kgid_t); > +#else > +static inline void groups_free(struct group_info *group_info) > +{ > +} > + > +static inline int in_group_p(kgid_t grp) > +{ > + return 1; > +} > +static inline int in_egroup_p(kgid_t grp) > +{ > + return 1; > +} > +#endif > extern int set_current_groups(struct group_info *); > extern void set_groups(struct cred *, struct group_info *); > extern int groups_search(const struct group_info *, kgid_t); > @@ -74,9 +92,6 @@ extern bool may_setgroups(void); > #define GROUP_AT(gi, i) \ > ((gi)->blocks[(i) / NGROUPS_PER_BLOCK][(i) % NGROUPS_PER_BLOCK]) > > -extern int in_group_p(kgid_t); > -extern int in_egroup_p(kgid_t); > - > /* > * The security context of a task > * > diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h > index 2d1f9b6..22bd1fa 100644 > --- a/include/linux/uidgid.h > +++ b/include/linux/uidgid.h > @@ -29,6 +29,7 @@ typedef struct { > #define KUIDT_INIT(value) (kuid_t){ value } > #define KGIDT_INIT(value) (kgid_t){ value } > > +#ifdef CONFIG_NON_ROOT > static inline uid_t __kuid_val(kuid_t uid) > { > 
return uid.val; > @@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid) > { > return gid.val; > } > +#else > +static inline uid_t __kuid_val(kuid_t uid) > +{ > + return 0; > +} > + > +static inline gid_t __kgid_val(kgid_t gid) > +{ > + return 0; > +} > +#endif > > #define GLOBAL_ROOT_UID KUIDT_INIT(0) > #define GLOBAL_ROOT_GID KGIDT_INIT(0) > diff --git a/init/Kconfig b/init/Kconfig > index 9afb971..dc5bfd4 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -394,6 +394,7 @@ endchoice > > config BSD_PROCESS_ACCT > bool "BSD Process Accounting" > + select NON_ROOT > help > If you say Y here, a user level program will be able to instruct the > kernel (via a special system call) to write process accounting > @@ -420,6 +421,7 @@ config BSD_PROCESS_ACCT_V3 > config TASKSTATS > bool "Export task/process statistics through netlink" > depends on NET > + select NON_ROOT > default n > help > Export selected statistics for tasks/processes through the > @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE > > menuconfig NAMESPACES > bool "Namespaces support" if EXPERT > + depends on NON_ROOT > default !EXPERT > help > Provides the way to make tasks work with different objects using > @@ -1352,11 +1355,25 @@ menuconfig EXPERT > > config UID16 > bool "Enable 16-bit UID system calls" if EXPERT > - depends on HAVE_UID16 > + depends on HAVE_UID16 && NON_ROOT > default y > help > This enables the legacy 16-bit UID syscall wrappers. > > +config NON_ROOT > + bool "Multiple users, groups and capabilities support" if EXPERT > + default y > + help > + This option enables support for non-root users, groups and > + capabilities. > + > + If you say N here, all processes will run with UID 0, GID 0, and all > + possible capabilities. Saying N here also compiles out support for > + system calls related to UIDs, GIDs, and capabilities, such as setuid, > + setgid, and capset. > + > + If unsure, say Y here. 
> + > config SGETMASK_SYSCALL > bool "sgetmask/ssetmask syscalls support" if EXPERT > def_bool PARISC || MN10300 || BLACKFIN || M68K || PPC || MIPS || X86 || SPARC || CRIS || MICROBLAZE || SUPERH > diff --git a/kernel/Makefile b/kernel/Makefile > index a59481a..d5ca6b8 100644 > --- a/kernel/Makefile > +++ b/kernel/Makefile > @@ -9,7 +9,9 @@ obj-y = fork.o exec_domain.o panic.o \ > extable.o params.o \ > kthread.o sys_ni.o nsproxy.o \ > notifier.o ksysfs.o cred.o reboot.o \ > - async.o range.o groups.o smpboot.o > + async.o range.o smpboot.o > + > +obj-$(CONFIG_NON_ROOT) += groups.o > > ifdef CONFIG_FUNCTION_TRACER > # Do not trace debug files and internal ftrace files > diff --git a/kernel/capability.c b/kernel/capability.c > index 989f5bf..2638412 100644 > --- a/kernel/capability.c > +++ b/kernel/capability.c > @@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str) > } > __setup("no_file_caps", file_caps_disable); > > +#ifdef CONFIG_NON_ROOT > /* > * More recent versions of libcap are available from: > * > @@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap) > } > EXPORT_SYMBOL(ns_capable); > > + > +/** > + * capable - Determine if the current task has a superior capability in effect > + * @cap: The capability to be tested for > + * > + * Return true if the current task has the given superior capability currently > + * available for use, false if not. > + * > + * This sets PF_SUPERPRIV on the task if the capability is available on the > + * assumption that it's about to be used. 
> + */ > +bool capable(int cap) > +{ > + return ns_capable(&init_user_ns, cap); > +} > +EXPORT_SYMBOL(capable); > +#endif /* CONFIG_NON_ROOT */ > + > /** > * file_ns_capable - Determine if the file's opener had a capability in effect > * @file: The file we want to check > @@ -412,22 +431,6 @@ bool file_ns_capable(const struct file *file, struct user_namespace *ns, > EXPORT_SYMBOL(file_ns_capable); > > /** > - * capable - Determine if the current task has a superior capability in effect > - * @cap: The capability to be tested for > - * > - * Return true if the current task has the given superior capability currently > - * available for use, false if not. > - * > - * This sets PF_SUPERPRIV on the task if the capability is available on the > - * assumption that it's about to be used. > - */ > -bool capable(int cap) > -{ > - return ns_capable(&init_user_ns, cap); > -} > -EXPORT_SYMBOL(capable); > - > -/** > * capable_wrt_inode_uidgid - Check nsown_capable and uid and gid mapped > * @inode: The inode in question > * @cap: The capability in question > diff --git a/kernel/cred.c b/kernel/cred.c > index e0573a4..ec1c076 100644 > --- a/kernel/cred.c > +++ b/kernel/cred.c > @@ -29,6 +29,9 @@ > > static struct kmem_cache *cred_jar; > > +/* init to 2 - one for init_task, one to ensure it is never freed */ > +struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > + > /* > * The initial credentials for the initial task > */ > diff --git a/kernel/groups.c b/kernel/groups.c > index 664411f..74d431d 100644 > --- a/kernel/groups.c > +++ b/kernel/groups.c > @@ -9,9 +9,6 @@ > #include <linux/user_namespace.h> > #include <asm/uaccess.h> > > -/* init to 2 - one for init_task, one to ensure it is never freed */ > -struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > - > struct group_info *groups_alloc(int gidsetsize) > { > struct group_info *group_info; > diff --git a/kernel/sys.c b/kernel/sys.c > index a8c9f5a..bfe532b 100644 > --- a/kernel/sys.c > +++ b/kernel/sys.c > 
@@ -319,6 +319,7 @@ out_unlock: > * SMP: There are not races, the GIDs are checked only by filesystem > * operations (as far as semantic preservation is concerned). > */ > +#ifdef CONFIG_NON_ROOT > SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid) > { > struct user_namespace *ns = current_user_ns(); > @@ -809,6 +810,7 @@ change_okay: > commit_creds(new); > return old_fsgid; > } > +#endif /* CONFIG_NON_ROOT */ > > /** > * sys_getpid - return the thread group id of the current process > diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > index 5adcb0a..7995ef5 100644 > --- a/kernel/sys_ni.c > +++ b/kernel/sys_ni.c > @@ -159,6 +159,20 @@ cond_syscall(sys_uselib); > cond_syscall(sys_fadvise64); > cond_syscall(sys_fadvise64_64); > cond_syscall(sys_madvise); > +cond_syscall(sys_setuid); > +cond_syscall(sys_setregid); > +cond_syscall(sys_setgid); > +cond_syscall(sys_setreuid); > +cond_syscall(sys_setresuid); > +cond_syscall(sys_getresuid); > +cond_syscall(sys_setresgid); > +cond_syscall(sys_getresgid); > +cond_syscall(sys_setgroups); > +cond_syscall(sys_getgroups); > +cond_syscall(sys_setfsuid); > +cond_syscall(sys_setfsgid); > +cond_syscall(sys_capget); > +cond_syscall(sys_capset); > > /* arch-specific weak syscall entries */ > cond_syscall(sys_pciconfig_read); > diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig > index fb78117..2b2c471 100644 > --- a/net/sunrpc/Kconfig > +++ b/net/sunrpc/Kconfig > @@ -1,9 +1,11 @@ > config SUNRPC > tristate > + select NON_ROOT > > config SUNRPC_GSS > tristate > select OID_REGISTRY > + select NON_ROOT > > config SUNRPC_BACKCHANNEL > bool ^ permalink raw reply [flat|nested] 22+ messages in thread
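As an illustration of the -ENOSYS behaviour mentioned in the commit message, a user-space program could probe whether the UID system calls are present. The following is a minimal sketch and not part of the patch: the function name uid_syscalls_available is hypothetical, and it assumes that a kernel built without CONFIG_NON_ROOT routes getresuid to the "not implemented" handler through the cond_syscall() entries quoted above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * Issues the raw getresuid system call, which only reads credentials.
 * Returns 1 if the call is implemented, 0 if the kernel reports ENOSYS
 * (the assumed behaviour when UID support is compiled out), and -1 on
 * any other error.
 */
int uid_syscalls_available(void)
{
	uid_t ruid, euid, suid;

	if (syscall(SYS_getresuid, &ruid, &euid, &suid) == 0)
		return 1;
	return errno == ENOSYS ? 0 : -1;
}
```

A caller could treat a return value of 0 as a sign that credential-changing calls such as setuid will also fail, and skip any privilege-dropping logic.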
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
  2015-01-29 23:44 ` Casey Schaufler
@ 2015-01-30  0:32   ` Paul E. McKenney
  2015-01-30  1:25     ` Casey Schaufler
  2015-01-30  0:43   ` josh
  1 sibling, 1 reply; 22+ messages in thread
From: Paul E. McKenney @ 2015-01-30  0:32 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, josh, peterz, mhocko, LSM

On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote:
> On 1/29/2015 10:43 AM, Iulia Manda wrote:
> > There are a lot of embedded systems that run most or all of their functionality
> > in init, running as root:root. For these systems, supporting multiple users is
> > not necessary.
> >
> > This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root
> > users, non-root groups, and capabilities optional.
> >
> > When this symbol is not defined, UID and GID are zero in any possible case
> > and processes always have all capabilities.
> >
> > The following syscalls are compiled out: setuid, setregid, setgid,
> > setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups,
> > setfsuid, setfsgid, capget, capset.
> >
> > Also, groups.c is compiled out completely.
> >
> > This change saves about 25 KB on a defconfig build.
> >
> > The kernel was booted in Qemu. All the common functionalities work. Adding
> > users/groups is not possible, failing with -ENOSYS.
> >
> > Bloat-o-meter output:
> > add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)
> >
> > Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
>
> v2 does nothing to address the longstanding position of
> the community that disabling the traditional user based
> access controls is unacceptable.
>
> If the community has abandoned that position, and I see no
> reason to believe that is true, the correct implementation
> is to rework the LSM from an additional controls model to
> an authoritative hook model.
>
> Speaking of the LSM, what is your expectation regarding the
> use of security modules in addition to "NON_ROOT"? Is it
> forbidden, allowed or encouraged?

I am guessing that people who remove uids and gids from their
kernels would tend not to add LSM. From what I understand, these
kernels are designed for special-purpose applications that have
very limited and stylized interactions with the outside world.
Applications that, back in the day, would have been written to
run on bare metal without any OS whatsoever.

> Hacking security code out with ifdefs is a common enough
> practice, but I like to think the kernel community knows
> better.

From what I understand, the alternative in this case is for the
applications to use some other "OS" that lacks security from the get-go,
so one can argue that NON_ROOT or MULTIUSER or whatever isn't resulting
in a net decrease in security.

Thanx, Paul > > --- > > Changes since v1: > > - refactor code; > > - compile out groups.c; > > - if groups_alloc is called, enable NON_ROOT; > > > > arch/s390/Kconfig | 1 + > > drivers/staging/lustre/lustre/Kconfig | 1 + > > fs/nfsd/Kconfig | 1 + > > include/linux/capability.h | 29 +++++++++++++++++++++++++++ > > include/linux/cred.h | 23 ++++++++++++++++++---- > > include/linux/uidgid.h | 12 +++++++++++ > > init/Kconfig | 19 +++++++++++++++++- > > kernel/Makefile | 4 +++- > > kernel/capability.c | 35 ++++++++++++++++++--------------- > > kernel/cred.c | 3 +++ > > kernel/groups.c | 3 --- > > kernel/sys.c | 2 ++ > > kernel/sys_ni.c | 14 +++++++++++++ > > net/sunrpc/Kconfig | 2 ++ > > 14 files changed, 124 insertions(+), 25 deletions(-) > > > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > > index 68b68d7..b2d2116 100644 > > --- a/arch/s390/Kconfig > > +++ b/arch/s390/Kconfig > > @@ -324,6 +324,7 @@ config COMPAT > > select COMPAT_BINFMT_ELF if BINFMT_ELF > > select ARCH_WANT_OLD_COMPAT_IPC > > select COMPAT_OLD_SIGACTION > > + select NON_ROOT > > help > > Select this option if you want to enable your system kernel to > > handle system-calls from ELF binaries for 31 bit ESA. This option > > diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig > > index 6725467..b975f62 100644 > > --- a/drivers/staging/lustre/lustre/Kconfig > > +++ b/drivers/staging/lustre/lustre/Kconfig > > @@ -10,6 +10,7 @@ config LUSTRE_FS > > select CRYPTO_SHA1 > > select CRYPTO_SHA256 > > select CRYPTO_SHA512 > > + select NON_ROOT > > help > > This option enables Lustre file system client support. Choose Y > > here if you want to access a Lustre file system cluster. 
To compile > > diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig > > index 7339515..1a8d6d9 100644 > > --- a/fs/nfsd/Kconfig > > +++ b/fs/nfsd/Kconfig > > @@ -6,6 +6,7 @@ config NFSD > > select SUNRPC > > select EXPORTFS > > select NFS_ACL_SUPPORT if NFSD_V2_ACL > > + select NON_ROOT > > help > > Choose Y here if you want to allow other computers to access > > files residing on this system using Sun's Network File System > > diff --git a/include/linux/capability.h b/include/linux/capability.h > > index aa93e5e..601c5de 100644 > > --- a/include/linux/capability.h > > +++ b/include/linux/capability.h > > @@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a, > > cap_intersect(permitted, __cap_nfsd_set)); > > } > > > > +#ifdef CONFIG_NON_ROOT > > extern bool has_capability(struct task_struct *t, int cap); > > extern bool has_ns_capability(struct task_struct *t, > > struct user_namespace *ns, int cap); > > @@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t, > > struct user_namespace *ns, int cap); > > extern bool capable(int cap); > > extern bool ns_capable(struct user_namespace *ns, int cap); > > +#else > > +static inline bool has_capability(struct task_struct *t, int cap) > > +{ > > + return true; > > +} > > +static inline bool has_ns_capability(struct task_struct *t, > > + struct user_namespace *ns, int cap) > > +{ > > + return true; > > +} > > +static inline bool has_capability_noaudit(struct task_struct *t, int cap) > > +{ > > + return true; > > +} > > +static inline bool has_ns_capability_noaudit(struct task_struct *t, > > + struct user_namespace *ns, int cap) > > +{ > > + return true; > > +} > > +static inline bool capable(int cap) > > +{ > > + return true; > > +} > > +static inline bool ns_capable(struct user_namespace *ns, int cap) > > +{ > > + return true; > > +} > > +#endif /* CONFIG_NON_ROOT */ > > extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap); > > extern bool 
file_ns_capable(const struct file *file, struct user_namespace *ns, int cap); > > > > diff --git a/include/linux/cred.h b/include/linux/cred.h > > index 2fb2ca2..08ea5c6 100644 > > --- a/include/linux/cred.h > > +++ b/include/linux/cred.h > > @@ -62,9 +62,27 @@ do { \ > > groups_free(group_info); \ > > } while (0) > > > > -extern struct group_info *groups_alloc(int); > > extern struct group_info init_groups; > > +#ifdef CONFIG_NON_ROOT > > +extern struct group_info *groups_alloc(int); > > extern void groups_free(struct group_info *); > > + > > +extern int in_group_p(kgid_t); > > +extern int in_egroup_p(kgid_t); > > +#else > > +static inline void groups_free(struct group_info *group_info) > > +{ > > +} > > + > > +static inline int in_group_p(kgid_t grp) > > +{ > > + return 1; > > +} > > +static inline int in_egroup_p(kgid_t grp) > > +{ > > + return 1; > > +} > > +#endif > > extern int set_current_groups(struct group_info *); > > extern void set_groups(struct cred *, struct group_info *); > > extern int groups_search(const struct group_info *, kgid_t); > > @@ -74,9 +92,6 @@ extern bool may_setgroups(void); > > #define GROUP_AT(gi, i) \ > > ((gi)->blocks[(i) / NGROUPS_PER_BLOCK][(i) % NGROUPS_PER_BLOCK]) > > > > -extern int in_group_p(kgid_t); > > -extern int in_egroup_p(kgid_t); > > - > > /* > > * The security context of a task > > * > > diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h > > index 2d1f9b6..22bd1fa 100644 > > --- a/include/linux/uidgid.h > > +++ b/include/linux/uidgid.h > > @@ -29,6 +29,7 @@ typedef struct { > > #define KUIDT_INIT(value) (kuid_t){ value } > > #define KGIDT_INIT(value) (kgid_t){ value } > > > > +#ifdef CONFIG_NON_ROOT > > static inline uid_t __kuid_val(kuid_t uid) > > { > > return uid.val; > > @@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid) > > { > > return gid.val; > > } > > +#else > > +static inline uid_t __kuid_val(kuid_t uid) > > +{ > > + return 0; > > +} > > + > > +static inline gid_t __kgid_val(kgid_t gid) 
> > +{ > > + return 0; > > +} > > +#endif > > > > #define GLOBAL_ROOT_UID KUIDT_INIT(0) > > #define GLOBAL_ROOT_GID KGIDT_INIT(0) > > diff --git a/init/Kconfig b/init/Kconfig > > index 9afb971..dc5bfd4 100644 > > --- a/init/Kconfig > > +++ b/init/Kconfig > > @@ -394,6 +394,7 @@ endchoice > > > > config BSD_PROCESS_ACCT > > bool "BSD Process Accounting" > > + select NON_ROOT > > help > > If you say Y here, a user level program will be able to instruct the > > kernel (via a special system call) to write process accounting > > @@ -420,6 +421,7 @@ config BSD_PROCESS_ACCT_V3 > > config TASKSTATS > > bool "Export task/process statistics through netlink" > > depends on NET > > + select NON_ROOT > > default n > > help > > Export selected statistics for tasks/processes through the > > @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE > > > > menuconfig NAMESPACES > > bool "Namespaces support" if EXPERT > > + depends on NON_ROOT > > default !EXPERT > > help > > Provides the way to make tasks work with different objects using > > @@ -1352,11 +1355,25 @@ menuconfig EXPERT > > > > config UID16 > > bool "Enable 16-bit UID system calls" if EXPERT > > - depends on HAVE_UID16 > > + depends on HAVE_UID16 && NON_ROOT > > default y > > help > > This enables the legacy 16-bit UID syscall wrappers. > > > > +config NON_ROOT > > + bool "Multiple users, groups and capabilities support" if EXPERT > > + default y > > + help > > + This option enables support for non-root users, groups and > > + capabilities. > > + > > + If you say N here, all processes will run with UID 0, GID 0, and all > > + possible capabilities. Saying N here also compiles out support for > > + system calls related to UIDs, GIDs, and capabilities, such as setuid, > > + setgid, and capset. > > + > > + If unsure, say Y here. 
> > + > > config SGETMASK_SYSCALL > > bool "sgetmask/ssetmask syscalls support" if EXPERT > > def_bool PARISC || MN10300 || BLACKFIN || M68K || PPC || MIPS || X86 || SPARC || CRIS || MICROBLAZE || SUPERH > > diff --git a/kernel/Makefile b/kernel/Makefile > > index a59481a..d5ca6b8 100644 > > --- a/kernel/Makefile > > +++ b/kernel/Makefile > > @@ -9,7 +9,9 @@ obj-y = fork.o exec_domain.o panic.o \ > > extable.o params.o \ > > kthread.o sys_ni.o nsproxy.o \ > > notifier.o ksysfs.o cred.o reboot.o \ > > - async.o range.o groups.o smpboot.o > > + async.o range.o smpboot.o > > + > > +obj-$(CONFIG_NON_ROOT) += groups.o > > > > ifdef CONFIG_FUNCTION_TRACER > > # Do not trace debug files and internal ftrace files > > diff --git a/kernel/capability.c b/kernel/capability.c > > index 989f5bf..2638412 100644 > > --- a/kernel/capability.c > > +++ b/kernel/capability.c > > @@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str) > > } > > __setup("no_file_caps", file_caps_disable); > > > > +#ifdef CONFIG_NON_ROOT > > /* > > * More recent versions of libcap are available from: > > * > > @@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap) > > } > > EXPORT_SYMBOL(ns_capable); > > > > + > > +/** > > + * capable - Determine if the current task has a superior capability in effect > > + * @cap: The capability to be tested for > > + * > > + * Return true if the current task has the given superior capability currently > > + * available for use, false if not. > > + * > > + * This sets PF_SUPERPRIV on the task if the capability is available on the > > + * assumption that it's about to be used. 
> > + */ > > +bool capable(int cap) > > +{ > > + return ns_capable(&init_user_ns, cap); > > +} > > +EXPORT_SYMBOL(capable); > > +#endif /* CONFIG_NON_ROOT */ > > + > > /** > > * file_ns_capable - Determine if the file's opener had a capability in effect > > * @file: The file we want to check > > @@ -412,22 +431,6 @@ bool file_ns_capable(const struct file *file, struct user_namespace *ns, > > EXPORT_SYMBOL(file_ns_capable); > > > > /** > > - * capable - Determine if the current task has a superior capability in effect > > - * @cap: The capability to be tested for > > - * > > - * Return true if the current task has the given superior capability currently > > - * available for use, false if not. > > - * > > - * This sets PF_SUPERPRIV on the task if the capability is available on the > > - * assumption that it's about to be used. > > - */ > > -bool capable(int cap) > > -{ > > - return ns_capable(&init_user_ns, cap); > > -} > > -EXPORT_SYMBOL(capable); > > - > > -/** > > * capable_wrt_inode_uidgid - Check nsown_capable and uid and gid mapped > > * @inode: The inode in question > > * @cap: The capability in question > > diff --git a/kernel/cred.c b/kernel/cred.c > > index e0573a4..ec1c076 100644 > > --- a/kernel/cred.c > > +++ b/kernel/cred.c > > @@ -29,6 +29,9 @@ > > > > static struct kmem_cache *cred_jar; > > > > +/* init to 2 - one for init_task, one to ensure it is never freed */ > > +struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > > + > > /* > > * The initial credentials for the initial task > > */ > > diff --git a/kernel/groups.c b/kernel/groups.c > > index 664411f..74d431d 100644 > > --- a/kernel/groups.c > > +++ b/kernel/groups.c > > @@ -9,9 +9,6 @@ > > #include <linux/user_namespace.h> > > #include <asm/uaccess.h> > > > > -/* init to 2 - one for init_task, one to ensure it is never freed */ > > -struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > > - > > struct group_info *groups_alloc(int gidsetsize) > > { > > struct group_info 
*group_info; > > diff --git a/kernel/sys.c b/kernel/sys.c > > index a8c9f5a..bfe532b 100644 > > --- a/kernel/sys.c > > +++ b/kernel/sys.c > > @@ -319,6 +319,7 @@ out_unlock: > > * SMP: There are not races, the GIDs are checked only by filesystem > > * operations (as far as semantic preservation is concerned). > > */ > > +#ifdef CONFIG_NON_ROOT > > SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid) > > { > > struct user_namespace *ns = current_user_ns(); > > @@ -809,6 +810,7 @@ change_okay: > > commit_creds(new); > > return old_fsgid; > > } > > +#endif /* CONFIG_NON_ROOT */ > > > > /** > > * sys_getpid - return the thread group id of the current process > > diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > > index 5adcb0a..7995ef5 100644 > > --- a/kernel/sys_ni.c > > +++ b/kernel/sys_ni.c > > @@ -159,6 +159,20 @@ cond_syscall(sys_uselib); > > cond_syscall(sys_fadvise64); > > cond_syscall(sys_fadvise64_64); > > cond_syscall(sys_madvise); > > +cond_syscall(sys_setuid); > > +cond_syscall(sys_setregid); > > +cond_syscall(sys_setgid); > > +cond_syscall(sys_setreuid); > > +cond_syscall(sys_setresuid); > > +cond_syscall(sys_getresuid); > > +cond_syscall(sys_setresgid); > > +cond_syscall(sys_getresgid); > > +cond_syscall(sys_setgroups); > > +cond_syscall(sys_getgroups); > > +cond_syscall(sys_setfsuid); > > +cond_syscall(sys_setfsgid); > > +cond_syscall(sys_capget); > > +cond_syscall(sys_capset); > > > > /* arch-specific weak syscall entries */ > > cond_syscall(sys_pciconfig_read); > > diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig > > index fb78117..2b2c471 100644 > > --- a/net/sunrpc/Kconfig > > +++ b/net/sunrpc/Kconfig > > @@ -1,9 +1,11 @@ > > config SUNRPC > > tristate > > + select NON_ROOT > > > > config SUNRPC_GSS > > tristate > > select OID_REGISTRY > > + select NON_ROOT > > > > config SUNRPC_BACKCHANNEL > > bool > ^ permalink raw reply [flat|nested] 22+ messages in thread
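The stub pattern used in the quoted include/linux/capability.h hunk, where each capability check collapses to a constant when the option is off, can be modelled in user space. The sketch below is only an illustration under stated assumptions: DEMO_NON_ROOT, struct demo_task and demo_has_capability are hypothetical names that stand in for CONFIG_NON_ROOT, the kernel's task structure and has_capability(), and the bitmask check is a simplification rather than the kernel's real logic.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's CONFIG_NON_ROOT symbol.
 * Uncomment to model a kernel built with multi-user support.
 */
/* #define DEMO_NON_ROOT 1 */

/* Hypothetical, simplified task: one bit per capability number. */
struct demo_task {
	unsigned long effective_caps;
};

#ifdef DEMO_NON_ROOT
/* Multi-user model: consult the task's capability bitmask. */
static inline bool demo_has_capability(const struct demo_task *t, int cap)
{
	return (t->effective_caps >> cap) & 1UL;
}
#else
/*
 * Single-user model: the check is a constant, so the compiler can
 * drop both the test and any failure path at every call site.
 */
static inline bool demo_has_capability(const struct demo_task *t, int cap)
{
	(void)t;
	(void)cap;
	return true;
}
#endif
```

Because the stub is a static inline function that returns a constant, callers need no #ifdef of their own, which is presumably where much of the reported size reduction comes from.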
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities
  2015-01-30  0:32 ` Paul E. McKenney
@ 2015-01-30  1:25   ` Casey Schaufler
  2015-01-30  1:36     ` Paul E. McKenney
  0 siblings, 1 reply; 22+ messages in thread
From: Casey Schaufler @ 2015-01-30  1:25 UTC (permalink / raw)
  To: paulmck
  Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, josh, peterz, mhocko, LSM, Casey Schaufler

On 1/29/2015 4:32 PM, Paul E. McKenney wrote:
> On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote:
>> On 1/29/2015 10:43 AM, Iulia Manda wrote:
>>> There are a lot of embedded systems that run most or all of their functionality
>>> in init, running as root:root. For these systems, supporting multiple users is
>>> not necessary.
>>>
>>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root
>>> users, non-root groups, and capabilities optional.
>>>
>>> When this symbol is not defined, UID and GID are zero in any possible case
>>> and processes always have all capabilities.
>>>
>>> The following syscalls are compiled out: setuid, setregid, setgid,
>>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups,
>>> setfsuid, setfsgid, capget, capset.
>>>
>>> Also, groups.c is compiled out completely.
>>>
>>> This change saves about 25 KB on a defconfig build.
>>>
>>> The kernel was booted in Qemu. All the common functionalities work. Adding
>>> users/groups is not possible, failing with -ENOSYS.
>>>
>>> Bloat-o-meter output:
>>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)
>>>
>>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
>>> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
>> v2 does nothing to address the longstanding position of
>> the community that disabling the traditional user based
>> access controls is unacceptable.
>>
>> If the community has abandoned that position, and I see no
>> reason to believe that is true, the correct implementation
>> is to rework the LSM from an additional controls model to
>> an authoritative hook model.
>>
>> Speaking of the LSM, what is your expectation regarding the
>> use of security modules in addition to "NON_ROOT"? Is it
>> forbidden, allowed or encouraged?
> I am guessing that people who remove uids and gids from their
> kernels would tend not to add LSM. From what I understand, these
> kernels are designed for special-purpose applications that have
> very limited and stylized interactions with the outside world.
> Applications that, back in the day, would have been written to
> run on bare metal without any OS whatsoever.

Linux is still going to be too big for those applications.
Taking the UID, GID and capability processing out is, at 25k,
hardly significant. Yes, you'll save some processing time, but
the benchmarks I've run in the dim dark past indicated that the
impact is actually trivial. I would of course invite the
advocates of this patch to produce numbers.

No, if you are looking to switch from a RTOS to a Linux kernel,
UID processing isn't going to be your first (second, or third)
concern.

As for LSMs, I can easily see putting in the security model from
the old RTOS on top of a NON_ROOT configuration. Won't that be
fun when the CVEs start to fly?

Do you think you'll be running system services like systemd on
top of this? Anyone *else* remember what happened when they put
capability handling into sendmail?

>
>> Hacking security code out with ifdefs is a common enough
>> practice, but I like to think the kernel community knows
>> better.
> From what I understand, the alternative in this case is for the
> applications to use some other "OS" that lacks security from the get-go,
> so one can argue that NON_ROOT or MULTIUSER or whatever isn't resulting
> in a net decrease in security.
> > Thanx, Paul > >>> --- >>> Changes since v1: >>> - refactor code; >>> - compile out groups.c; >>> - if groups_alloc is called, enable NON_ROOT; >>> >>> arch/s390/Kconfig | 1 + >>> drivers/staging/lustre/lustre/Kconfig | 1 + >>> fs/nfsd/Kconfig | 1 + >>> include/linux/capability.h | 29 +++++++++++++++++++++++++++ >>> include/linux/cred.h | 23 ++++++++++++++++++---- >>> include/linux/uidgid.h | 12 +++++++++++ >>> init/Kconfig | 19 +++++++++++++++++- >>> kernel/Makefile | 4 +++- >>> kernel/capability.c | 35 ++++++++++++++++++--------------- >>> kernel/cred.c | 3 +++ >>> kernel/groups.c | 3 --- >>> kernel/sys.c | 2 ++ >>> kernel/sys_ni.c | 14 +++++++++++++ >>> net/sunrpc/Kconfig | 2 ++ >>> 14 files changed, 124 insertions(+), 25 deletions(-) >>> >>> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig >>> index 68b68d7..b2d2116 100644 >>> --- a/arch/s390/Kconfig >>> +++ b/arch/s390/Kconfig >>> @@ -324,6 +324,7 @@ config COMPAT >>> select COMPAT_BINFMT_ELF if BINFMT_ELF >>> select ARCH_WANT_OLD_COMPAT_IPC >>> select COMPAT_OLD_SIGACTION >>> + select NON_ROOT >>> help >>> Select this option if you want to enable your system kernel to >>> handle system-calls from ELF binaries for 31 bit ESA. This option >>> diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig >>> index 6725467..b975f62 100644 >>> --- a/drivers/staging/lustre/lustre/Kconfig >>> +++ b/drivers/staging/lustre/lustre/Kconfig >>> @@ -10,6 +10,7 @@ config LUSTRE_FS >>> select CRYPTO_SHA1 >>> select CRYPTO_SHA256 >>> select CRYPTO_SHA512 >>> + select NON_ROOT >>> help >>> This option enables Lustre file system client support. Choose Y >>> here if you want to access a Lustre file system cluster. 
To compile >>> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig >>> index 7339515..1a8d6d9 100644 >>> --- a/fs/nfsd/Kconfig >>> +++ b/fs/nfsd/Kconfig >>> @@ -6,6 +6,7 @@ config NFSD >>> select SUNRPC >>> select EXPORTFS >>> select NFS_ACL_SUPPORT if NFSD_V2_ACL >>> + select NON_ROOT >>> help >>> Choose Y here if you want to allow other computers to access >>> files residing on this system using Sun's Network File System >>> diff --git a/include/linux/capability.h b/include/linux/capability.h >>> index aa93e5e..601c5de 100644 >>> --- a/include/linux/capability.h >>> +++ b/include/linux/capability.h >>> @@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a, >>> cap_intersect(permitted, __cap_nfsd_set)); >>> } >>> >>> +#ifdef CONFIG_NON_ROOT >>> extern bool has_capability(struct task_struct *t, int cap); >>> extern bool has_ns_capability(struct task_struct *t, >>> struct user_namespace *ns, int cap); >>> @@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t, >>> struct user_namespace *ns, int cap); >>> extern bool capable(int cap); >>> extern bool ns_capable(struct user_namespace *ns, int cap); >>> +#else >>> +static inline bool has_capability(struct task_struct *t, int cap) >>> +{ >>> + return true; >>> +} >>> +static inline bool has_ns_capability(struct task_struct *t, >>> + struct user_namespace *ns, int cap) >>> +{ >>> + return true; >>> +} >>> +static inline bool has_capability_noaudit(struct task_struct *t, int cap) >>> +{ >>> + return true; >>> +} >>> +static inline bool has_ns_capability_noaudit(struct task_struct *t, >>> + struct user_namespace *ns, int cap) >>> +{ >>> + return true; >>> +} >>> +static inline bool capable(int cap) >>> +{ >>> + return true; >>> +} >>> +static inline bool ns_capable(struct user_namespace *ns, int cap) >>> +{ >>> + return true; >>> +} >>> +#endif /* CONFIG_NON_ROOT */ >>> extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap); >>> extern bool 
file_ns_capable(const struct file *file, struct user_namespace *ns, int cap); >>> >>> diff --git a/include/linux/cred.h b/include/linux/cred.h >>> index 2fb2ca2..08ea5c6 100644 >>> --- a/include/linux/cred.h >>> +++ b/include/linux/cred.h >>> @@ -62,9 +62,27 @@ do { \ >>> groups_free(group_info); \ >>> } while (0) >>> >>> -extern struct group_info *groups_alloc(int); >>> extern struct group_info init_groups; >>> +#ifdef CONFIG_NON_ROOT >>> +extern struct group_info *groups_alloc(int); >>> extern void groups_free(struct group_info *); >>> + >>> +extern int in_group_p(kgid_t); >>> +extern int in_egroup_p(kgid_t); >>> +#else >>> +static inline void groups_free(struct group_info *group_info) >>> +{ >>> +} >>> + >>> +static inline int in_group_p(kgid_t grp) >>> +{ >>> + return 1; >>> +} >>> +static inline int in_egroup_p(kgid_t grp) >>> +{ >>> + return 1; >>> +} >>> +#endif >>> extern int set_current_groups(struct group_info *); >>> extern void set_groups(struct cred *, struct group_info *); >>> extern int groups_search(const struct group_info *, kgid_t); >>> @@ -74,9 +92,6 @@ extern bool may_setgroups(void); >>> #define GROUP_AT(gi, i) \ >>> ((gi)->blocks[(i) / NGROUPS_PER_BLOCK][(i) % NGROUPS_PER_BLOCK]) >>> >>> -extern int in_group_p(kgid_t); >>> -extern int in_egroup_p(kgid_t); >>> - >>> /* >>> * The security context of a task >>> * >>> diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h >>> index 2d1f9b6..22bd1fa 100644 >>> --- a/include/linux/uidgid.h >>> +++ b/include/linux/uidgid.h >>> @@ -29,6 +29,7 @@ typedef struct { >>> #define KUIDT_INIT(value) (kuid_t){ value } >>> #define KGIDT_INIT(value) (kgid_t){ value } >>> >>> +#ifdef CONFIG_NON_ROOT >>> static inline uid_t __kuid_val(kuid_t uid) >>> { >>> return uid.val; >>> @@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid) >>> { >>> return gid.val; >>> } >>> +#else >>> +static inline uid_t __kuid_val(kuid_t uid) >>> +{ >>> + return 0; >>> +} >>> + >>> +static inline gid_t __kgid_val(kgid_t gid) 
>>> +{ >>> + return 0; >>> +} >>> +#endif >>> >>> #define GLOBAL_ROOT_UID KUIDT_INIT(0) >>> #define GLOBAL_ROOT_GID KGIDT_INIT(0) >>> diff --git a/init/Kconfig b/init/Kconfig >>> index 9afb971..dc5bfd4 100644 >>> --- a/init/Kconfig >>> +++ b/init/Kconfig >>> @@ -394,6 +394,7 @@ endchoice >>> >>> config BSD_PROCESS_ACCT >>> bool "BSD Process Accounting" >>> + select NON_ROOT >>> help >>> If you say Y here, a user level program will be able to instruct the >>> kernel (via a special system call) to write process accounting >>> @@ -420,6 +421,7 @@ config BSD_PROCESS_ACCT_V3 >>> config TASKSTATS >>> bool "Export task/process statistics through netlink" >>> depends on NET >>> + select NON_ROOT >>> default n >>> help >>> Export selected statistics for tasks/processes through the >>> @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE >>> >>> menuconfig NAMESPACES >>> bool "Namespaces support" if EXPERT >>> + depends on NON_ROOT >>> default !EXPERT >>> help >>> Provides the way to make tasks work with different objects using >>> @@ -1352,11 +1355,25 @@ menuconfig EXPERT >>> >>> config UID16 >>> bool "Enable 16-bit UID system calls" if EXPERT >>> - depends on HAVE_UID16 >>> + depends on HAVE_UID16 && NON_ROOT >>> default y >>> help >>> This enables the legacy 16-bit UID syscall wrappers. >>> >>> +config NON_ROOT >>> + bool "Multiple users, groups and capabilities support" if EXPERT >>> + default y >>> + help >>> + This option enables support for non-root users, groups and >>> + capabilities. >>> + >>> + If you say N here, all processes will run with UID 0, GID 0, and all >>> + possible capabilities. Saying N here also compiles out support for >>> + system calls related to UIDs, GIDs, and capabilities, such as setuid, >>> + setgid, and capset. >>> + >>> + If unsure, say Y here. 
>>> + >>> config SGETMASK_SYSCALL >>> bool "sgetmask/ssetmask syscalls support" if EXPERT >>> def_bool PARISC || MN10300 || BLACKFIN || M68K || PPC || MIPS || X86 || SPARC || CRIS || MICROBLAZE || SUPERH >>> diff --git a/kernel/Makefile b/kernel/Makefile >>> index a59481a..d5ca6b8 100644 >>> --- a/kernel/Makefile >>> +++ b/kernel/Makefile >>> @@ -9,7 +9,9 @@ obj-y = fork.o exec_domain.o panic.o \ >>> extable.o params.o \ >>> kthread.o sys_ni.o nsproxy.o \ >>> notifier.o ksysfs.o cred.o reboot.o \ >>> - async.o range.o groups.o smpboot.o >>> + async.o range.o smpboot.o >>> + >>> +obj-$(CONFIG_NON_ROOT) += groups.o >>> >>> ifdef CONFIG_FUNCTION_TRACER >>> # Do not trace debug files and internal ftrace files >>> diff --git a/kernel/capability.c b/kernel/capability.c >>> index 989f5bf..2638412 100644 >>> --- a/kernel/capability.c >>> +++ b/kernel/capability.c >>> @@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str) >>> } >>> __setup("no_file_caps", file_caps_disable); >>> >>> +#ifdef CONFIG_NON_ROOT >>> /* >>> * More recent versions of libcap are available from: >>> * >>> @@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap) >>> } >>> EXPORT_SYMBOL(ns_capable); >>> >>> + >>> +/** >>> + * capable - Determine if the current task has a superior capability in effect >>> + * @cap: The capability to be tested for >>> + * >>> + * Return true if the current task has the given superior capability currently >>> + * available for use, false if not. >>> + * >>> + * This sets PF_SUPERPRIV on the task if the capability is available on the >>> + * assumption that it's about to be used. 
>>> + */ >>> +bool capable(int cap) >>> +{ >>> + return ns_capable(&init_user_ns, cap); >>> +} >>> +EXPORT_SYMBOL(capable); >>> +#endif /* CONFIG_NON_ROOT */ >>> + >>> /** >>> * file_ns_capable - Determine if the file's opener had a capability in effect >>> * @file: The file we want to check >>> @@ -412,22 +431,6 @@ bool file_ns_capable(const struct file *file, struct user_namespace *ns, >>> EXPORT_SYMBOL(file_ns_capable); >>> >>> /** >>> - * capable - Determine if the current task has a superior capability in effect >>> - * @cap: The capability to be tested for >>> - * >>> - * Return true if the current task has the given superior capability currently >>> - * available for use, false if not. >>> - * >>> - * This sets PF_SUPERPRIV on the task if the capability is available on the >>> - * assumption that it's about to be used. >>> - */ >>> -bool capable(int cap) >>> -{ >>> - return ns_capable(&init_user_ns, cap); >>> -} >>> -EXPORT_SYMBOL(capable); >>> - >>> -/** >>> * capable_wrt_inode_uidgid - Check nsown_capable and uid and gid mapped >>> * @inode: The inode in question >>> * @cap: The capability in question >>> diff --git a/kernel/cred.c b/kernel/cred.c >>> index e0573a4..ec1c076 100644 >>> --- a/kernel/cred.c >>> +++ b/kernel/cred.c >>> @@ -29,6 +29,9 @@ >>> >>> static struct kmem_cache *cred_jar; >>> >>> +/* init to 2 - one for init_task, one to ensure it is never freed */ >>> +struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; >>> + >>> /* >>> * The initial credentials for the initial task >>> */ >>> diff --git a/kernel/groups.c b/kernel/groups.c >>> index 664411f..74d431d 100644 >>> --- a/kernel/groups.c >>> +++ b/kernel/groups.c >>> @@ -9,9 +9,6 @@ >>> #include <linux/user_namespace.h> >>> #include <asm/uaccess.h> >>> >>> -/* init to 2 - one for init_task, one to ensure it is never freed */ >>> -struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; >>> - >>> struct group_info *groups_alloc(int gidsetsize) >>> { >>> struct group_info 
*group_info; >>> diff --git a/kernel/sys.c b/kernel/sys.c >>> index a8c9f5a..bfe532b 100644 >>> --- a/kernel/sys.c >>> +++ b/kernel/sys.c >>> @@ -319,6 +319,7 @@ out_unlock: >>> * SMP: There are not races, the GIDs are checked only by filesystem >>> * operations (as far as semantic preservation is concerned). >>> */ >>> +#ifdef CONFIG_NON_ROOT >>> SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid) >>> { >>> struct user_namespace *ns = current_user_ns(); >>> @@ -809,6 +810,7 @@ change_okay: >>> commit_creds(new); >>> return old_fsgid; >>> } >>> +#endif /* CONFIG_NON_ROOT */ >>> >>> /** >>> * sys_getpid - return the thread group id of the current process >>> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c >>> index 5adcb0a..7995ef5 100644 >>> --- a/kernel/sys_ni.c >>> +++ b/kernel/sys_ni.c >>> @@ -159,6 +159,20 @@ cond_syscall(sys_uselib); >>> cond_syscall(sys_fadvise64); >>> cond_syscall(sys_fadvise64_64); >>> cond_syscall(sys_madvise); >>> +cond_syscall(sys_setuid); >>> +cond_syscall(sys_setregid); >>> +cond_syscall(sys_setgid); >>> +cond_syscall(sys_setreuid); >>> +cond_syscall(sys_setresuid); >>> +cond_syscall(sys_getresuid); >>> +cond_syscall(sys_setresgid); >>> +cond_syscall(sys_getresgid); >>> +cond_syscall(sys_setgroups); >>> +cond_syscall(sys_getgroups); >>> +cond_syscall(sys_setfsuid); >>> +cond_syscall(sys_setfsgid); >>> +cond_syscall(sys_capget); >>> +cond_syscall(sys_capset); >>> >>> /* arch-specific weak syscall entries */ >>> cond_syscall(sys_pciconfig_read); >>> diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig >>> index fb78117..2b2c471 100644 >>> --- a/net/sunrpc/Kconfig >>> +++ b/net/sunrpc/Kconfig >>> @@ -1,9 +1,11 @@ >>> config SUNRPC >>> tristate >>> + select NON_ROOT >>> >>> config SUNRPC_GSS >>> tristate >>> select OID_REGISTRY >>> + select NON_ROOT >>> >>> config SUNRPC_BACKCHANNEL >>> bool >
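The sys_ni.c hunk quoted above relies on cond_syscall(), which in kernels of this era emits a weak alias to sys_ni_syscall so that a compiled-out system call still links and returns -ENOSYS. A minimal userspace sketch of that idea follows, assuming GCC or Clang on an ELF target. The sketch_ names are hypothetical and are not kernel symbols.

```c
#include <errno.h>

/* Fallback used for every system call that was compiled out. */
long sketch_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Weak alias: if no strong definition of sketch_setuid() is linked in,
 * calls resolve to sketch_ni_syscall(). Hypothetical name, illustration only.
 */
long sketch_setuid(void) __attribute__((weak, alias("sketch_ni_syscall")));
```

With no strong sketch_setuid() in the link, calling it yields -ENOSYS, which matches the behaviour the commit message reports for user and group management on a NON_ROOT=n kernel.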
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 1:25 ` Casey Schaufler @ 2015-01-30 1:36 ` Paul E. McKenney 2015-01-30 2:25 ` Casey Schaufler 0 siblings, 1 reply; 22+ messages in thread From: Paul E. McKenney @ 2015-01-30 1:36 UTC (permalink / raw) To: Casey Schaufler Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, josh, peterz, mhocko, LSM On Thu, Jan 29, 2015 at 05:25:56PM -0800, Casey Schaufler wrote: > On 1/29/2015 4:32 PM, Paul E. McKenney wrote: > > On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: > >> On 1/29/2015 10:43 AM, Iulia Manda wrote: > >>> There are a lot of embedded systems that run most or all of their functionality > >>> in init, running as root:root. For these systems, supporting multiple users is > >>> not necessary. > >>> > >>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root > >>> users, non-root groups, and capabilities optional. > >>> > >>> When this symbol is not defined, UID and GID are zero in any possible case > >>> and processes always have all capabilities. > >>> > >>> The following syscalls are compiled out: setuid, setregid, setgid, > >>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, > >>> setfsuid, setfsgid, capget, capset. > >>> > >>> Also, groups.c is compiled out completely. > >>> > >>> This change saves about 25 KB on a defconfig build. > >>> > >>> The kernel was booted in Qemu. All the common functionalities work. Adding > >>> users/groups is not possible, failing with -ENOSYS. > >>> > >>> Bloat-o-meter output: > >>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) > >>> > >>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> > >>> Reviewed-by: Josh Triplett <josh@joshtriplett.org> > >> v2 does nothing to address the longstanding position of > >> the community that disabling the traditional user based > >> access controls is unacceptable. 
> >> > >> If the community has abandoned that position, and I see no > >> reason to believe that is true, the correct implementation > >> is to rework the LSM from an additional controls model to > >> an authoritative hook model. > >> > >> Speaking of the LSM, what is your expectation regarding the > >> use of security modules in addition to "NON_ROOT"? Is it > >> forbidden, allowed or encouraged? > > I am guessing that people who remove uids and gids from their > > kernels would tend not to add LSM. From what I understand, these > > kernels are designed for special-purpose applications that have > > very limited and stylized interactions with the outside world. > > Applications that, back in the day, would have been written to > > run on bare metal without any OS whatsoever. > > Linux is still going to be too big for those applications. Taking > the UID, GID and capability processing out is, at 25k, hardly significant. > Yes, you'll save some processing time, but the benchmarks I've run in the > dim dark past indicated that the impact is actually trivial. I would of > course invite the advocates of this patch to produce numbers. No, if you > are looking to switch from a RTOS to a Linux kernel, UID processing isn't > going to be your first (second, or third) concern. A few K here, a few K there, and pretty soon you actually fit into the small-memory 32-bit SoCs. I do not believe that the processing time is the issue. > As for LSMs, I can easily see putting in the security model from the old > RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs > start to fly? > > Do you think you'll be running system services like systemd on top of this? > Anyone *else* remember what happened when they put capability handling into > sendmail? Nope, I don't expect these systems to be using LSM, systemd, or sendmail. I think that many of these will instead run the application directly out of the init process. 
Thanx, Paul > >> Hacking security code out with ifdefs is a common enough > >> practice, but I like to think the kernel community knows > >> better. > > From what I understand, the alternative in this case is for the > > applications to use some other "OS" that lacks security from the get-go, > > so one can argue that NON_ROOT or MULTIUSER or whatever isn't resulting > > in a net decrease in security. > > > > Thanx, Paul > > > >>> --- > >>> Changes since v1: > >>> - refactor code; > >>> - compile out groups.c; > >>> - if groups_alloc is called, enable NON_ROOT; > >>> > >>> arch/s390/Kconfig | 1 + > >>> drivers/staging/lustre/lustre/Kconfig | 1 + > >>> fs/nfsd/Kconfig | 1 + > >>> include/linux/capability.h | 29 +++++++++++++++++++++++++++ > >>> include/linux/cred.h | 23 ++++++++++++++++++---- > >>> include/linux/uidgid.h | 12 +++++++++++ > >>> init/Kconfig | 19 +++++++++++++++++- > >>> kernel/Makefile | 4 +++- > >>> kernel/capability.c | 35 ++++++++++++++++++--------------- > >>> kernel/cred.c | 3 +++ > >>> kernel/groups.c | 3 --- > >>> kernel/sys.c | 2 ++ > >>> kernel/sys_ni.c | 14 +++++++++++++ > >>> net/sunrpc/Kconfig | 2 ++ > >>> 14 files changed, 124 insertions(+), 25 deletions(-) > >>> > >>> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > >>> index 68b68d7..b2d2116 100644 > >>> --- a/arch/s390/Kconfig > >>> +++ b/arch/s390/Kconfig > >>> @@ -324,6 +324,7 @@ config COMPAT > >>> select COMPAT_BINFMT_ELF if BINFMT_ELF > >>> select ARCH_WANT_OLD_COMPAT_IPC > >>> select COMPAT_OLD_SIGACTION > >>> + select NON_ROOT > >>> help > >>> Select this option if you want to enable your system kernel to > >>> handle system-calls from ELF binaries for 31 bit ESA.
This option > >>> diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig > >>> index 6725467..b975f62 100644 > >>> --- a/drivers/staging/lustre/lustre/Kconfig > >>> +++ b/drivers/staging/lustre/lustre/Kconfig > >>> @@ -10,6 +10,7 @@ config LUSTRE_FS > >>> select CRYPTO_SHA1 > >>> select CRYPTO_SHA256 > >>> select CRYPTO_SHA512 > >>> + select NON_ROOT > >>> help > >>> This option enables Lustre file system client support. Choose Y > >>> here if you want to access a Lustre file system cluster. To compile > >>> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig > >>> index 7339515..1a8d6d9 100644 > >>> --- a/fs/nfsd/Kconfig > >>> +++ b/fs/nfsd/Kconfig > >>> @@ -6,6 +6,7 @@ config NFSD > >>> select SUNRPC > >>> select EXPORTFS > >>> select NFS_ACL_SUPPORT if NFSD_V2_ACL > >>> + select NON_ROOT > >>> help > >>> Choose Y here if you want to allow other computers to access > >>> files residing on this system using Sun's Network File System > >>> diff --git a/include/linux/capability.h b/include/linux/capability.h > >>> index aa93e5e..601c5de 100644 > >>> --- a/include/linux/capability.h > >>> +++ b/include/linux/capability.h > >>> @@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a, > >>> cap_intersect(permitted, __cap_nfsd_set)); > >>> } > >>> > >>> +#ifdef CONFIG_NON_ROOT > >>> extern bool has_capability(struct task_struct *t, int cap); > >>> extern bool has_ns_capability(struct task_struct *t, > >>> struct user_namespace *ns, int cap); > >>> @@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t, > >>> struct user_namespace *ns, int cap); > >>> extern bool capable(int cap); > >>> extern bool ns_capable(struct user_namespace *ns, int cap); > >>> +#else > >>> +static inline bool has_capability(struct task_struct *t, int cap) > >>> +{ > >>> + return true; > >>> +} > >>> +static inline bool has_ns_capability(struct task_struct *t, > >>> + struct user_namespace *ns, int cap) > >>> 
+{ > >>> + return true; > >>> +} > >>> +static inline bool has_capability_noaudit(struct task_struct *t, int cap) > >>> +{ > >>> + return true; > >>> +} > >>> +static inline bool has_ns_capability_noaudit(struct task_struct *t, > >>> + struct user_namespace *ns, int cap) > >>> +{ > >>> + return true; > >>> +} > >>> +static inline bool capable(int cap) > >>> +{ > >>> + return true; > >>> +} > >>> +static inline bool ns_capable(struct user_namespace *ns, int cap) > >>> +{ > >>> + return true; > >>> +} > >>> +#endif /* CONFIG_NON_ROOT */ > >>> extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap); > >>> extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap); > >>> > >>> diff --git a/include/linux/cred.h b/include/linux/cred.h > >>> index 2fb2ca2..08ea5c6 100644 > >>> --- a/include/linux/cred.h > >>> +++ b/include/linux/cred.h > >>> @@ -62,9 +62,27 @@ do { \ > >>> groups_free(group_info); \ > >>> } while (0) > >>> > >>> -extern struct group_info *groups_alloc(int); > >>> extern struct group_info init_groups; > >>> +#ifdef CONFIG_NON_ROOT > >>> +extern struct group_info *groups_alloc(int); > >>> extern void groups_free(struct group_info *); > >>> + > >>> +extern int in_group_p(kgid_t); > >>> +extern int in_egroup_p(kgid_t); > >>> +#else > >>> +static inline void groups_free(struct group_info *group_info) > >>> +{ > >>> +} > >>> + > >>> +static inline int in_group_p(kgid_t grp) > >>> +{ > >>> + return 1; > >>> +} > >>> +static inline int in_egroup_p(kgid_t grp) > >>> +{ > >>> + return 1; > >>> +} > >>> +#endif > >>> extern int set_current_groups(struct group_info *); > >>> extern void set_groups(struct cred *, struct group_info *); > >>> extern int groups_search(const struct group_info *, kgid_t); > >>> @@ -74,9 +92,6 @@ extern bool may_setgroups(void); > >>> #define GROUP_AT(gi, i) \ > >>> ((gi)->blocks[(i) / NGROUPS_PER_BLOCK][(i) % NGROUPS_PER_BLOCK]) > >>> > >>> -extern int in_group_p(kgid_t); > >>> -extern int 
in_egroup_p(kgid_t); > >>> - > >>> /* > >>> * The security context of a task > >>> * > >>> diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h > >>> index 2d1f9b6..22bd1fa 100644 > >>> --- a/include/linux/uidgid.h > >>> +++ b/include/linux/uidgid.h > >>> @@ -29,6 +29,7 @@ typedef struct { > >>> #define KUIDT_INIT(value) (kuid_t){ value } > >>> #define KGIDT_INIT(value) (kgid_t){ value } > >>> > >>> +#ifdef CONFIG_NON_ROOT > >>> static inline uid_t __kuid_val(kuid_t uid) > >>> { > >>> return uid.val; > >>> @@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid) > >>> { > >>> return gid.val; > >>> } > >>> +#else > >>> +static inline uid_t __kuid_val(kuid_t uid) > >>> +{ > >>> + return 0; > >>> +} > >>> + > >>> +static inline gid_t __kgid_val(kgid_t gid) > >>> +{ > >>> + return 0; > >>> +} > >>> +#endif > >>> > >>> #define GLOBAL_ROOT_UID KUIDT_INIT(0) > >>> #define GLOBAL_ROOT_GID KGIDT_INIT(0) > >>> diff --git a/init/Kconfig b/init/Kconfig > >>> index 9afb971..dc5bfd4 100644 > >>> --- a/init/Kconfig > >>> +++ b/init/Kconfig > >>> @@ -394,6 +394,7 @@ endchoice > >>> > >>> config BSD_PROCESS_ACCT > >>> bool "BSD Process Accounting" > >>> + select NON_ROOT > >>> help > >>> If you say Y here, a user level program will be able to instruct the > >>> kernel (via a special system call) to write process accounting > >>> @@ -420,6 +421,7 @@ config BSD_PROCESS_ACCT_V3 > >>> config TASKSTATS > >>> bool "Export task/process statistics through netlink" > >>> depends on NET > >>> + select NON_ROOT > >>> default n > >>> help > >>> Export selected statistics for tasks/processes through the > >>> @@ -1140,6 +1142,7 @@ config CHECKPOINT_RESTORE > >>> > >>> menuconfig NAMESPACES > >>> bool "Namespaces support" if EXPERT > >>> + depends on NON_ROOT > >>> default !EXPERT > >>> help > >>> Provides the way to make tasks work with different objects using > >>> @@ -1352,11 +1355,25 @@ menuconfig EXPERT > >>> > >>> config UID16 > >>> bool "Enable 16-bit UID system calls" if 
EXPERT > >>> - depends on HAVE_UID16 > >>> + depends on HAVE_UID16 && NON_ROOT > >>> default y > >>> help > >>> This enables the legacy 16-bit UID syscall wrappers. > >>> > >>> +config NON_ROOT > >>> + bool "Multiple users, groups and capabilities support" if EXPERT > >>> + default y > >>> + help > >>> + This option enables support for non-root users, groups and > >>> + capabilities. > >>> + > >>> + If you say N here, all processes will run with UID 0, GID 0, and all > >>> + possible capabilities. Saying N here also compiles out support for > >>> + system calls related to UIDs, GIDs, and capabilities, such as setuid, > >>> + setgid, and capset. > >>> + > >>> + If unsure, say Y here. > >>> + > >>> config SGETMASK_SYSCALL > >>> bool "sgetmask/ssetmask syscalls support" if EXPERT > >>> def_bool PARISC || MN10300 || BLACKFIN || M68K || PPC || MIPS || X86 || SPARC || CRIS || MICROBLAZE || SUPERH > >>> diff --git a/kernel/Makefile b/kernel/Makefile > >>> index a59481a..d5ca6b8 100644 > >>> --- a/kernel/Makefile > >>> +++ b/kernel/Makefile > >>> @@ -9,7 +9,9 @@ obj-y = fork.o exec_domain.o panic.o \ > >>> extable.o params.o \ > >>> kthread.o sys_ni.o nsproxy.o \ > >>> notifier.o ksysfs.o cred.o reboot.o \ > >>> - async.o range.o groups.o smpboot.o > >>> + async.o range.o smpboot.o > >>> + > >>> +obj-$(CONFIG_NON_ROOT) += groups.o > >>> > >>> ifdef CONFIG_FUNCTION_TRACER > >>> # Do not trace debug files and internal ftrace files > >>> diff --git a/kernel/capability.c b/kernel/capability.c > >>> index 989f5bf..2638412 100644 > >>> --- a/kernel/capability.c > >>> +++ b/kernel/capability.c > >>> @@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str) > >>> } > >>> __setup("no_file_caps", file_caps_disable); > >>> > >>> +#ifdef CONFIG_NON_ROOT > >>> /* > >>> * More recent versions of libcap are available from: > >>> * > >>> @@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap) > >>> } > >>> EXPORT_SYMBOL(ns_capable); > >>> > >>> + > >>> +/** > >>> + 
* capable - Determine if the current task has a superior capability in effect > >>> + * @cap: The capability to be tested for > >>> + * > >>> + * Return true if the current task has the given superior capability currently > >>> + * available for use, false if not. > >>> + * > >>> + * This sets PF_SUPERPRIV on the task if the capability is available on the > >>> + * assumption that it's about to be used. > >>> + */ > >>> +bool capable(int cap) > >>> +{ > >>> + return ns_capable(&init_user_ns, cap); > >>> +} > >>> +EXPORT_SYMBOL(capable); > >>> +#endif /* CONFIG_NON_ROOT */ > >>> + > >>> /** > >>> * file_ns_capable - Determine if the file's opener had a capability in effect > >>> * @file: The file we want to check > >>> @@ -412,22 +431,6 @@ bool file_ns_capable(const struct file *file, struct user_namespace *ns, > >>> EXPORT_SYMBOL(file_ns_capable); > >>> > >>> /** > >>> - * capable - Determine if the current task has a superior capability in effect > >>> - * @cap: The capability to be tested for > >>> - * > >>> - * Return true if the current task has the given superior capability currently > >>> - * available for use, false if not. > >>> - * > >>> - * This sets PF_SUPERPRIV on the task if the capability is available on the > >>> - * assumption that it's about to be used. 
> >>> - */ > >>> -bool capable(int cap) > >>> -{ > >>> - return ns_capable(&init_user_ns, cap); > >>> -} > >>> -EXPORT_SYMBOL(capable); > >>> - > >>> -/** > >>> * capable_wrt_inode_uidgid - Check nsown_capable and uid and gid mapped > >>> * @inode: The inode in question > >>> * @cap: The capability in question > >>> diff --git a/kernel/cred.c b/kernel/cred.c > >>> index e0573a4..ec1c076 100644 > >>> --- a/kernel/cred.c > >>> +++ b/kernel/cred.c > >>> @@ -29,6 +29,9 @@ > >>> > >>> static struct kmem_cache *cred_jar; > >>> > >>> +/* init to 2 - one for init_task, one to ensure it is never freed */ > >>> +struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > >>> + > >>> /* > >>> * The initial credentials for the initial task > >>> */ > >>> diff --git a/kernel/groups.c b/kernel/groups.c > >>> index 664411f..74d431d 100644 > >>> --- a/kernel/groups.c > >>> +++ b/kernel/groups.c > >>> @@ -9,9 +9,6 @@ > >>> #include <linux/user_namespace.h> > >>> #include <asm/uaccess.h> > >>> > >>> -/* init to 2 - one for init_task, one to ensure it is never freed */ > >>> -struct group_info init_groups = { .usage = ATOMIC_INIT(2) }; > >>> - > >>> struct group_info *groups_alloc(int gidsetsize) > >>> { > >>> struct group_info *group_info; > >>> diff --git a/kernel/sys.c b/kernel/sys.c > >>> index a8c9f5a..bfe532b 100644 > >>> --- a/kernel/sys.c > >>> +++ b/kernel/sys.c > >>> @@ -319,6 +319,7 @@ out_unlock: > >>> * SMP: There are not races, the GIDs are checked only by filesystem > >>> * operations (as far as semantic preservation is concerned). 
> >>> */ > >>> +#ifdef CONFIG_NON_ROOT > >>> SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid) > >>> { > >>> struct user_namespace *ns = current_user_ns(); > >>> @@ -809,6 +810,7 @@ change_okay: > >>> commit_creds(new); > >>> return old_fsgid; > >>> } > >>> +#endif /* CONFIG_NON_ROOT */ > >>> > >>> /** > >>> * sys_getpid - return the thread group id of the current process > >>> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > >>> index 5adcb0a..7995ef5 100644 > >>> --- a/kernel/sys_ni.c > >>> +++ b/kernel/sys_ni.c > >>> @@ -159,6 +159,20 @@ cond_syscall(sys_uselib); > >>> cond_syscall(sys_fadvise64); > >>> cond_syscall(sys_fadvise64_64); > >>> cond_syscall(sys_madvise); > >>> +cond_syscall(sys_setuid); > >>> +cond_syscall(sys_setregid); > >>> +cond_syscall(sys_setgid); > >>> +cond_syscall(sys_setreuid); > >>> +cond_syscall(sys_setresuid); > >>> +cond_syscall(sys_getresuid); > >>> +cond_syscall(sys_setresgid); > >>> +cond_syscall(sys_getresgid); > >>> +cond_syscall(sys_setgroups); > >>> +cond_syscall(sys_getgroups); > >>> +cond_syscall(sys_setfsuid); > >>> +cond_syscall(sys_setfsgid); > >>> +cond_syscall(sys_capget); > >>> +cond_syscall(sys_capset); > >>> > >>> /* arch-specific weak syscall entries */ > >>> cond_syscall(sys_pciconfig_read); > >>> diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig > >>> index fb78117..2b2c471 100644 > >>> --- a/net/sunrpc/Kconfig > >>> +++ b/net/sunrpc/Kconfig > >>> @@ -1,9 +1,11 @@ > >>> config SUNRPC > >>> tristate > >>> + select NON_ROOT > >>> > >>> config SUNRPC_GSS > >>> tristate > >>> select OID_REGISTRY > >>> + select NON_ROOT > >>> > >>> config SUNRPC_BACKCHANNEL > >>> bool > > >
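The capability.h and cred.h hunks quoted above follow a common kernel pattern: when the option is off, the extern declarations are replaced by static inline stubs that return a constant, so callers compile unchanged and the checks can be folded away by the compiler. A minimal sketch of that pattern, assuming a hypothetical SKETCH_NON_ROOT macro in place of the Kconfig symbol and hypothetical sketch_ function names (none of these are kernel identifiers):

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_NON_ROOT; left undefined to model "N". */
/* #define SKETCH_NON_ROOT 1 */

#ifdef SKETCH_NON_ROOT
/* Multi-user build: real checks would be defined in another file. */
bool sketch_capable(int cap);
int sketch_in_group_p(unsigned int gid);
#else
/* Single-user build: every capability check succeeds. */
static inline bool sketch_capable(int cap)
{
	(void)cap;
	return true;
}

/* Single-user build: the caller is treated as a member of every group. */
static inline int sketch_in_group_p(unsigned int gid)
{
	(void)gid;
	return 1;
}
#endif
```

Because the stubs are visible in the header, a call such as sketch_capable(21) reduces to a constant at compile time, which is presumably where much of the size reduction reported by bloat-o-meter comes from.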
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 1:36 ` Paul E. McKenney @ 2015-01-30 2:25 ` Casey Schaufler 2015-01-30 7:07 ` Paul E. McKenney 2015-01-30 19:13 ` josh 0 siblings, 2 replies; 22+ messages in thread From: Casey Schaufler @ 2015-01-30 2:25 UTC (permalink / raw) To: paulmck Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, josh, peterz, mhocko, LSM, Casey Schaufler On 1/29/2015 5:36 PM, Paul E. McKenney wrote: > On Thu, Jan 29, 2015 at 05:25:56PM -0800, Casey Schaufler wrote: >> On 1/29/2015 4:32 PM, Paul E. McKenney wrote: >>> On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: >>>> On 1/29/2015 10:43 AM, Iulia Manda wrote: >>>>> There are a lot of embedded systems that run most or all of their functionality >>>>> in init, running as root:root. For these systems, supporting multiple users is >>>>> not necessary. >>>>> >>>>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root >>>>> users, non-root groups, and capabilities optional. >>>>> >>>>> When this symbol is not defined, UID and GID are zero in any possible case >>>>> and processes always have all capabilities. >>>>> >>>>> The following syscalls are compiled out: setuid, setregid, setgid, >>>>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, >>>>> setfsuid, setfsgid, capget, capset. >>>>> >>>>> Also, groups.c is compiled out completely. >>>>> >>>>> This change saves about 25 KB on a defconfig build. >>>>> >>>>> The kernel was booted in Qemu. All the common functionalities work. Adding >>>>> users/groups is not possible, failing with -ENOSYS. 
>>>>> >>>>> Bloat-o-meter output: >>>>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) >>>>> >>>>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> >>>>> Reviewed-by: Josh Triplett <josh@joshtriplett.org> >>>> v2 does nothing to address the longstanding position of >>>> the community that disabling the traditional user based >>>> access controls is unacceptable. >>>> >>>> If the community has abandoned that position, and I see no >>>> reason to believe that is true, the correct implementation >>>> is to rework the LSM from an additional controls model to >>>> an authoritative hook model. >>>> >>>> Speaking of the LSM, what is your expectation regarding the >>>> use of security modules in addition to "NON_ROOT"? Is it >>>> forbidden, allowed or encouraged? >>> I am guessing that people who remove uids and gids from their >>> kernels would tend not to add LSM. From what I understand, these >>> kernels are designed for special-purpose applications that have >>> very limited and stylized interactions with the outside world. >>> Applications that, back in the day, would have been written to >>> run on bare metal without any OS whatsoever. >> Linux is still going to be too big for those applications. Taking >> the UID, GID and capability processing out is, at 25k, hardly significant. >> Yes, you'll save some processing time, but the benchmarks I've run in the >> dim dark past indicated that the impact is actually trivial. I would of >> course invite the advocates of this patch to produce numbers. No, if you >> are looking to switch from a RTOS to a Linux kernel, UID processing isn't >> going to be your first (second, or third) concern. > A few K here, a few K there, and pretty soon you actually fit into the > small-memory 32-bit SoCs. I do not believe that the processing time > is the issue. And UNIX, with UID and GID processing, used to run in 64K of RAM, without swap or paging. Bluntly, there are many other places to look before you go here. 
>> As for LSMs, I can easily see putting in the security model from the old >> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs >> start to fly? >> >> Do you think you'll be running system services like systemd on top of this? >> Anyone *else* remember what happened when they put capability handling into >> sendmail? > Nope, I don't expect these systems to be using LSM, systemd, or sendmail. > I think that many of these will instead run the application directly > out of the init process. Where an "application" might be something like CrossWalk, which is going to pull in all sorts of fun services that no one will want to maintain or change for the environment. Resulting in all sorts of security issues. It would be inappropriate for me to sit aside and let you go in that direction unwarned. I'm not going to try to stop you, because I know that's futile. Let me know what I can do to help when the time comes. > > Thanx, Paul > >
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 2:25 ` Casey Schaufler @ 2015-01-30 7:07 ` Paul E. McKenney 2015-01-30 19:13 ` josh 1 sibling, 0 replies; 22+ messages in thread From: Paul E. McKenney @ 2015-01-30 7:07 UTC (permalink / raw) To: Casey Schaufler Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, josh, peterz, mhocko, LSM On Thu, Jan 29, 2015 at 06:25:23PM -0800, Casey Schaufler wrote: > On 1/29/2015 5:36 PM, Paul E. McKenney wrote: > > On Thu, Jan 29, 2015 at 05:25:56PM -0800, Casey Schaufler wrote: > >> On 1/29/2015 4:32 PM, Paul E. McKenney wrote: > >>> On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: > >>>> On 1/29/2015 10:43 AM, Iulia Manda wrote: > >>>>> There are a lot of embedded systems that run most or all of their functionality > >>>>> in init, running as root:root. For these systems, supporting multiple users is > >>>>> not necessary. > >>>>> > >>>>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root > >>>>> users, non-root groups, and capabilities optional. > >>>>> > >>>>> When this symbol is not defined, UID and GID are zero in any possible case > >>>>> and processes always have all capabilities. > >>>>> > >>>>> The following syscalls are compiled out: setuid, setregid, setgid, > >>>>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, > >>>>> setfsuid, setfsgid, capget, capset. > >>>>> > >>>>> Also, groups.c is compiled out completely. > >>>>> > >>>>> This change saves about 25 KB on a defconfig build. > >>>>> > >>>>> The kernel was booted in Qemu. All the common functionalities work. Adding > >>>>> users/groups is not possible, failing with -ENOSYS. 
> >>>>> > >>>>> Bloat-o-meter output: > >>>>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) > >>>>> > >>>>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> > >>>>> Reviewed-by: Josh Triplett <josh@joshtriplett.org> > >>>> v2 does nothing to address the longstanding position of > >>>> the community that disabling the traditional user based > >>>> access controls is unacceptable. > >>>> > >>>> If the community has abandoned that position, and I see no > >>>> reason to believe that is true, the correct implementation > >>>> is to rework the LSM from an additional controls model to > >>>> an authoritative hook model. > >>>> > >>>> Speaking of the LSM, what is your expectation regarding the > >>>> use of security modules in addition to "NON_ROOT"? Is it > >>>> forbidden, allowed or encouraged? > >>> I am guessing that people who remove uids and gids from their > >>> kernels would tend not to add LSM. From what I understand, these > >>> kernels are designed for special-purpose applications that have > >>> very limited and stylized interactions with the outside world. > >>> Applications that, back in the day, would have been written to > >>> run on bare metal without any OS whatsoever. > >> Linux is still going to be too big for those applications. Taking > >> the UID, GID and capability processing out is, at 25k, hardly significant. > >> Yes, you'll save some processing time, but the benchmarks I've run in the > >> dim dark past indicated that the impact is actually trivial. I would of > >> course invite the advocates of this patch to produce numbers. No, if you > >> are looking to switch from a RTOS to a Linux kernel, UID processing isn't > >> going to be your first (second, or third) concern. > > A few K here, a few K there, and pretty soon you actually fit into the > > small-memory 32-bit SoCs. I do not believe that the processing time > > is the issue. 
> > And UNIX, with UID and GID processing, used to run in 64K of RAM, > without swap or paging. Bluntly, there are many other places to look > before you go here. Even more bluntly, it is not me that you need to convince, and I would expect them to be profoundly unimpressed by your interesting suggestion that they drop back to BSD 2.8 and PDP11s. > >> As for LSMs, I can easily see putting in the security model from the old > >> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs > >> start to fly? > >> > >> Do you think you'll be running system services like systemd on top of this? > >> Anyone *else* remember what happened when they put capability handling into > >> sendmail? > > Nope, I don't expect these systems to be using LSM, systemd, or sendmail. > > I think that many of these will instead run the application directly > > out of the init process. > > Where an "application" might be something like CrossWalk, which is going to pull > in all sorts of fun services that no one will want to maintain or change for the > environment. Resulting in all sorts of security issues. It would be inappropriate > for me to sit aside and let you go in that direction unwarned. I'm not going to > try to stop you, because I know that's futile. Let me know what I can do to help > when the time comes. I have no idea what sort of application framework they will choose. But of course, there will be all sorts of security issues no matter what technology they choose. After all, it is not like we are lacking in security issues in the current non-Linux installed base. Or in the Linux installed base, for that matter. Thanx, Paul
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 2:25 ` Casey Schaufler 2015-01-30 7:07 ` Paul E. McKenney @ 2015-01-30 19:13 ` josh 2015-01-30 19:48 ` Casey Schaufler 1 sibling, 1 reply; 22+ messages in thread From: josh @ 2015-01-30 19:13 UTC (permalink / raw) To: Casey Schaufler Cc: paulmck, Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, peterz, mhocko, LSM On Thu, Jan 29, 2015 at 06:25:23PM -0800, Casey Schaufler wrote: > On 1/29/2015 5:36 PM, Paul E. McKenney wrote: > > A few K here, a few K there, and pretty soon you actually fit into the > > small-memory 32-bit SoCs. I do not believe that the processing time > > is the issue. > > And UNIX, with UID and GID processing, used to run in 64K of RAM, > without swap or paging. Bluntly, there are many other places to look > before you go here. And we're looking in all those places too. Each patch is worth evaluating independently. We've *already* gone here, the code is written (and being revised based on feedback), and "go work over there out of my backyard" is not going to work. One of these days, we're going to run in 64k again. > >> As for LSMs, I can easily see putting in the security model from the old > >> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs > >> start to fly? The security model is "there's one process on this system". (Expect patches for CONFIG_FORK=n and CONFIG_EXEC=n at some point.) > >> Do you think you'll be running system services like systemd on top of this? > >> Anyone *else* remember what happened when they put capability handling into > >> sendmail? > > Nope, I don't expect these systems to be using LSM, systemd, or sendmail. > > I think that many of these will instead run the application directly > > out of the init process. > > Where an "application" might be something like CrossWalk, No, not a chance. 
If you're running a web runtime, you're on a much larger system, and you're going to be less concerned about shaving kilobytes; you're also going to want many of the kernel facilities for sandboxing code. The kinds of applications we're talking about here run entirely in one binary, serving a few very narrow functions. We're not talking "automobile IVI system" here; we're talking "two buttons and an output", or "a few sensors and an SD card". - Josh Triplett
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 19:13 ` josh @ 2015-01-30 19:48 ` Casey Schaufler 2015-01-30 20:20 ` Austin S Hemmelgarn ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Casey Schaufler @ 2015-01-30 19:48 UTC (permalink / raw) To: josh Cc: paulmck, Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, peterz, mhocko, LSM, Casey Schaufler On 1/30/2015 11:13 AM, josh@joshtriplett.org wrote: > On Thu, Jan 29, 2015 at 06:25:23PM -0800, Casey Schaufler wrote: >> On 1/29/2015 5:36 PM, Paul E. McKenney wrote: >>> A few K here, a few K there, and pretty soon you actually fit into the >>> small-memory 32-bit SoCs. I do not believe that the processing time >>> is the issue. >> And UNIX, with UID and GID processing, used to run in 64K of RAM, >> without swap or paging. Bluntly, there are many other places to look >> before you go here. > And we're looking in all those places too. Each patch is worth > evaluating independently. We've *already* gone here, the code is > written (and being revised based on feedback), and "go work over there > out of my backyard" is not going to work. One of these days, we're > going to run in 64k again. Oh good heavens. Don't take this personally. I don't. >>>> As for LSMs, I can easily see putting in the security model from the old >>>> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs >>>> start to fly? > The security model is "there's one process on this system". (Expect > patches for CONFIG_FORK=n and CONFIG_EXEC=n at some point.) Ok. Why not use Bada? >>>> Do you think you'll be running system services like systemd on top of this? >>>> Anyone *else* remember what happened when they put capability handling into >>>> sendmail? >>> Nope, I don't expect these systems to be using LSM, systemd, or sendmail. >>> I think that many of these will instead run the application directly >>> out of the init process. 
>> Where an "application" might be something like CrossWalk, > No, not a chance. If you're running a web runtime, you're on a much > larger system, and you're going to be less concerned about shaving > kilobytes; you're also going to want many of the kernel facilities for > sandboxing code. > > The kinds of applications we're talking about here run entirely in one > binary, serving a few very narrow functions. We're not talking > "automobile IVI system" here; we're talking "two buttons and an output", > or "a few sensors and an SD card". Linux is an insane choice for such a system. Why would you even consider it? > > - Josh Triplett > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 19:48 ` Casey Schaufler @ 2015-01-30 20:20 ` Austin S Hemmelgarn 2015-01-30 21:40 ` Josh Triplett 2015-01-31 17:00 ` Jarkko Sakkinen 2 siblings, 0 replies; 22+ messages in thread From: Austin S Hemmelgarn @ 2015-01-30 20:20 UTC (permalink / raw) To: Casey Schaufler, josh Cc: paulmck, Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, peterz, mhocko, LSM [-- Attachment #1: Type: text/plain, Size: 2948 bytes --] On 2015-01-30 14:48, Casey Schaufler wrote: > On 1/30/2015 11:13 AM, josh@joshtriplett.org wrote: >> On Thu, Jan 29, 2015 at 06:25:23PM -0800, Casey Schaufler wrote: >>> On 1/29/2015 5:36 PM, Paul E. McKenney wrote: >>>> A few K here, a few K there, and pretty soon you actually fit into the >>>> small-memory 32-bit SoCs. I do not believe that the processing time >>>> is the issue. >>> And UNIX, with UID and GID processing, used to run in 64K of RAM, >>> without swap or paging. Bluntly, there are many other places to look >>> before you go here. >> And we're looking in all those places too. Each patch is worth >> evaluating independently. We've *already* gone here, the code is >> written (and being revised based on feedback), and "go work over there >> out of my backyard" is not going to work. One of these days, we're >> going to run in 64k again. > > Oh good heavens. Don't take this personally. I don't. > >>>>> As for LSMs, I can easily see putting in the security model from the old >>>>> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs >>>>> start to fly? >> The security model is "there's one process on this system". (Expect >> patches for CONFIG_FORK=n and CONFIG_EXEC=n at some point.) > > Ok. Why not use Bada? > >>>>> Do you think you'll be running system services like systemd on top of this? >>>>> Anyone *else* remember what happened when they put capability handling into >>>>> sendmail? 
>>>> Nope, I don't expect these systems to be using LSM, systemd, or sendmail. >>>> I think that many of these will instead run the application directly >>>> out of the init process. >>> Where an "application" might be something like CrossWalk, >> No, not a chance. If you're running a web runtime, you're on a much >> larger system, and you're going to be less concerned about shaving >> kilobytes; you're also going to want many of the kernel facilities for >> sandboxing code. >> >> The kinds of applications we're talking about here run entirely in one >> binary, serving a few very narrow functions. We're not talking >> "automobile IVI system" here; we're talking "two buttons and an output", >> or "a few sensors and an SD card". > > Linux is an insane choice for such a system. Why would you > even consider it? > Because there are weird people out there who want to do embedded development in Python, and insane people out there who want to do it in Perl, and people who want to do real-time stuff but can't for some reason learn to use something sensible for that like RTEMS or FreeRTOS. Also, Linux isn't as crazy as some other choices. Many ATMs and cash registers (at least in the US) run Windows with all of the software running with administrator privileges, and I've seen my fair share of minimalistic systems running DOS. While Linux may not be the _best_ choice for such use cases, it is by far not the worst. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 19:48 ` Casey Schaufler 2015-01-30 20:20 ` Austin S Hemmelgarn @ 2015-01-30 21:40 ` Josh Triplett 2015-01-30 21:56 ` Richard Weinberger 2015-01-31 17:00 ` Jarkko Sakkinen 2 siblings, 1 reply; 22+ messages in thread From: Josh Triplett @ 2015-01-30 21:40 UTC (permalink / raw) To: Casey Schaufler Cc: paulmck, Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, peterz, mhocko, LSM On Fri, Jan 30, 2015 at 11:48:04AM -0800, Casey Schaufler wrote: > On 1/30/2015 11:13 AM, josh@joshtriplett.org wrote: > > On Thu, Jan 29, 2015 at 06:25:23PM -0800, Casey Schaufler wrote: > >> On 1/29/2015 5:36 PM, Paul E. McKenney wrote: > >>> Casey Schaufler wrote: > >>>> As for LSMs, I can easily see putting in the security model from the old > >>>> RTOS on top of a NON_ROOT configuration. Won't that be fun when the CVEs > >>>> start to fly? > > The security model is "there's one process on this system". (Expect > > patches for CONFIG_FORK=n and CONFIG_EXEC=n at some point.) > > Ok. Why not use Bada? If you're asking about that particular OS: perhaps because it's proprietary, and also dead now (as of early 2013 according to Wikipedia)? From a quick look, it also looks much larger than desired. Leaving aside all the other reasons to not run a non-Linux OS, which I'd hope most people on these lists don't need much convincing of. More generally, why not run some random tiny RTOS? Because they're mostly proprietary, and not Linux, and with few exceptions don't live nearly as long as Linux, and don't have as many expert developers as Linux... > >>>> Do you think you'll be running system services like systemd on top of this? > >>>> Anyone *else* remember what happened when they put capability handling into > >>>> sendmail? > >>> Nope, I don't expect these systems to be using LSM, systemd, or sendmail. 
> >>> I think that many of these will instead run the application directly > >>> out of the init process. > >> Where an "application" might be something like CrossWalk, > > No, not a chance. If you're running a web runtime, you're on a much > > larger system, and you're going to be less concerned about shaving > > kilobytes; you're also going to want many of the kernel facilities for > > sandboxing code. > > > > The kinds of applications we're talking about here run entirely in one > > binary, serving a few very narrow functions. We're not talking > > "automobile IVI system" here; we're talking "two buttons and an output", > > or "a few sensors and an SD card". > > Linux is an insane choice for such a system. Why would you > even consider it? Linux was once an insane choice for supercomputers with thousands of CPUs. SMP support once had to be added, and it was significant enough to warrant bumping the major version number, back when that had significance. *Today*, Linux is a challenging choice for a tiny embedded system. We're trying to fix that. Why write yet another driver when Linux has one? Why reimplement code and risk rediscovering yet another bug Linux solved long ago? Why not use a system that scales, such that if you need more functionality, you can turn on a few more options without rewriting all your code for another environment? Why not use an OS that will definitely run on your next piece of hardware too, without needing a total rewrite? I would hope that I'm preaching to the choir here. Let's take a step back from the philosophy for a moment, and go back to reviewing code and patches. - Josh Triplett ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 21:40 ` Josh Triplett @ 2015-01-30 21:56 ` Richard Weinberger 2015-01-31 23:30 ` Paul E. McKenney 0 siblings, 1 reply; 22+ messages in thread From: Richard Weinberger @ 2015-01-30 21:56 UTC (permalink / raw) To: Josh Triplett Cc: Casey Schaufler, Paul McKenney, Iulia Manda, One Thousand Gnomes, Serge Hallyn, LKML, Andrew Morton, Peter Zijlstra, Michal Hocko, LSM On Fri, Jan 30, 2015 at 10:40 PM, Josh Triplett <josh@joshtriplett.org> wrote: > *Today*, Linux is a challenging choice for a tiny embedded system. > We're trying to fix that. Can you please be more specific about exactly which embedded systems you're talking about? I find this patch rather controversial as it removes a lot of security. Embedded systems *are* a target for all kinds of attacks. Misguided embedded engineers will abuse this feature and produce even more weak targets. -- Thanks, //richard ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 21:56 ` Richard Weinberger @ 2015-01-31 23:30 ` Paul E. McKenney 2015-01-31 23:33 ` Richard Weinberger 0 siblings, 1 reply; 22+ messages in thread From: Paul E. McKenney @ 2015-01-31 23:30 UTC (permalink / raw) To: Richard Weinberger Cc: Josh Triplett, Casey Schaufler, Iulia Manda, One Thousand Gnomes, Serge Hallyn, LKML, Andrew Morton, Peter Zijlstra, Michal Hocko, LSM On Fri, Jan 30, 2015 at 10:56:14PM +0100, Richard Weinberger wrote: > On Fri, Jan 30, 2015 at 10:40 PM, Josh Triplett <josh@joshtriplett.org> wrote: > > *Today*, Linux is a challenging choice for a tiny embedded system. > > We're trying to fix that. > > Can you please more specific about the embedded systems exactly you're > talking about? > > I find this patch rather controversial as it removes a lot of security. > Embedded systems *are* a target for all kind of attacks. > Misguided embedded engineers will abuse this feature and produce even more > weak targets. Without this patch, those same engineers would simply run everything as root. "Make a foolproof system, and they will invent a better fool". ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-31 23:30 ` Paul E. McKenney @ 2015-01-31 23:33 ` Richard Weinberger 2015-02-01 19:45 ` Paul E. McKenney 0 siblings, 1 reply; 22+ messages in thread From: Richard Weinberger @ 2015-01-31 23:33 UTC (permalink / raw) To: paulmck, Richard Weinberger Cc: Josh Triplett, Casey Schaufler, Iulia Manda, One Thousand Gnomes, Serge Hallyn, LKML, Andrew Morton, Peter Zijlstra, Michal Hocko, LSM Am 01.02.2015 um 00:30 schrieb Paul E. McKenney: > On Fri, Jan 30, 2015 at 10:56:14PM +0100, Richard Weinberger wrote: >> On Fri, Jan 30, 2015 at 10:40 PM, Josh Triplett <josh@joshtriplett.org> wrote: >>> *Today*, Linux is a challenging choice for a tiny embedded system. >>> We're trying to fix that. >> >> Can you please more specific about the embedded systems exactly you're >> talking about? >> >> I find this patch rather controversial as it removes a lot of security. >> Embedded systems *are* a target for all kind of attacks. >> Misguided embedded engineers will abuse this feature and produce even more >> weak targets. > > Without this patch, those same engineers would simply run everything as > root. "Make a foolproof system, and they will invent a better fool". ;-) Luckily many services will run as non-root by default and some even refuse to run as root. :-) Thanks, //richard ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-31 23:33 ` Richard Weinberger @ 2015-02-01 19:45 ` Paul E. McKenney 0 siblings, 0 replies; 22+ messages in thread From: Paul E. McKenney @ 2015-02-01 19:45 UTC (permalink / raw) To: Richard Weinberger Cc: Richard Weinberger, Josh Triplett, Casey Schaufler, Iulia Manda, One Thousand Gnomes, Serge Hallyn, LKML, Andrew Morton, Peter Zijlstra, Michal Hocko, LSM On Sun, Feb 01, 2015 at 12:33:23AM +0100, Richard Weinberger wrote: > Am 01.02.2015 um 00:30 schrieb Paul E. McKenney: > > On Fri, Jan 30, 2015 at 10:56:14PM +0100, Richard Weinberger wrote: > >> On Fri, Jan 30, 2015 at 10:40 PM, Josh Triplett <josh@joshtriplett.org> wrote: > >>> *Today*, Linux is a challenging choice for a tiny embedded system. > >>> We're trying to fix that. > >> > >> Can you please more specific about the embedded systems exactly you're > >> talking about? > >> > >> I find this patch rather controversial as it removes a lot of security. > >> Embedded systems *are* a target for all kind of attacks. > >> Misguided embedded engineers will abuse this feature and produce even more > >> weak targets. > > > > Without this patch, those same engineers would simply run everything as > > root. "Make a foolproof system, and they will invent a better fool". ;-) > > Luckily many services will run as non-root by default and some even refuse to > run as root. :-) Here is hoping that this helps those engineers to use a decent security model for their devices! ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 19:48 ` Casey Schaufler 2015-01-30 20:20 ` Austin S Hemmelgarn 2015-01-30 21:40 ` Josh Triplett @ 2015-01-31 17:00 ` Jarkko Sakkinen 2 siblings, 0 replies; 22+ messages in thread From: Jarkko Sakkinen @ 2015-01-31 17:00 UTC (permalink / raw) To: Casey Schaufler Cc: josh, paulmck, Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, peterz, mhocko, LSM On Fri, Jan 30, 2015 at 11:48:04AM -0800, Casey Schaufler wrote: > > The kinds of applications we're talking about here run entirely in one > > binary, serving a few very narrow functions. We're not talking > > "automobile IVI system" here; we're talking "two buttons and an output", > > or "a few sensors and an SD card". > > Linux is an insane choice for such a system. Why would you > even consider it? One of the reasons would be developer productivity. You can use the same tools and platform across all scales. > > > > - Josh Triplett > > /Jarkko ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-29 23:44 ` Casey Schaufler 2015-01-30 0:32 ` Paul E. McKenney @ 2015-01-30 0:43 ` josh 2015-01-30 2:05 ` Casey Schaufler 1 sibling, 1 reply; 22+ messages in thread From: josh @ 2015-01-30 0:43 UTC (permalink / raw) To: Casey Schaufler Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, paulmck, peterz, mhocko, LSM On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: > On 1/29/2015 10:43 AM, Iulia Manda wrote: > > There are a lot of embedded systems that run most or all of their functionality > > in init, running as root:root. For these systems, supporting multiple users is > > not necessary. > > > > This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root > > users, non-root groups, and capabilities optional. > > > > When this symbol is not defined, UID and GID are zero in any possible case > > and processes always have all capabilities. > > > > The following syscalls are compiled out: setuid, setregid, setgid, > > setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, > > setfsuid, setfsgid, capget, capset. > > > > Also, groups.c is compiled out completely. > > > > This change saves about 25 KB on a defconfig build. > > > > The kernel was booted in Qemu. All the common functionalities work. Adding > > users/groups is not possible, failing with -ENOSYS. > > > > Bloat-o-meter output: > > add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) > > > > Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> > > Reviewed-by: Josh Triplett <josh@joshtriplett.org> > > v2 does nothing to address the longstanding position of > the community that disabling the traditional user based > access controls is unacceptable. So far that "longstanding position" consists of you griping that we're not implementing authoritative LSM hooks for you and re-fighting that battle for you. 
Patches for authoritative LSM hooks did indeed get refused long ago, which is an excellent reason for us to not recast this patch to reimplement them that way. If it does turn out that the security maintainers in the kernel are open to the idea of authoritative LSM hooks, by all means I would encourage you to revisit such hooks. But there's a significant difference between "add the ability to disable access controls" and "add a framework that allows replacing the user/group security model with arbitrary access controls", and it's not at all obvious that the "right" solution for the former is an implementation of the latter; it also seems entirely plausible that the kernel community remains opposed to the latter, which does not necessarily rule out the former. Given that, it would be helpful to hear feedback from more of the community. > Speaking of the LSM, what is your expectation regarding the > use of security modules in addition to "NON_ROOT"? Is it > forbidden, allowed or encouraged? I would expect that any security module would need to depend on NON_ROOT (or MULTIUSER as v3 may end up calling it, per Geert Uytterhoeven's suggestion). A kernel configuration with this option turned off intentionally does not *have* user/group access controls. The intent of this option isn't "turn standard access controls off so they get out of the way of non-standard access controls"; the intent is "turn all access controls off because there will never be unprivileged processes on this system". So, on that basis, it sounds like v3 should add a dependency from SECURITY to MULTIUSER. - Josh Triplett ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 0:43 ` josh @ 2015-01-30 2:05 ` Casey Schaufler 2015-01-30 21:04 ` Josh Triplett 0 siblings, 1 reply; 22+ messages in thread From: Casey Schaufler @ 2015-01-30 2:05 UTC (permalink / raw) To: josh Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, paulmck, peterz, mhocko, LSM, Casey Schaufler On 1/29/2015 4:43 PM, josh@joshtriplett.org wrote: > On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: >> On 1/29/2015 10:43 AM, Iulia Manda wrote: >>> There are a lot of embedded systems that run most or all of their functionality >>> in init, running as root:root. For these systems, supporting multiple users is >>> not necessary. >>> >>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root >>> users, non-root groups, and capabilities optional. >>> >>> When this symbol is not defined, UID and GID are zero in any possible case >>> and processes always have all capabilities. >>> >>> The following syscalls are compiled out: setuid, setregid, setgid, >>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, >>> setfsuid, setfsgid, capget, capset. >>> >>> Also, groups.c is compiled out completely. >>> >>> This change saves about 25 KB on a defconfig build. >>> >>> The kernel was booted in Qemu. All the common functionalities work. Adding >>> users/groups is not possible, failing with -ENOSYS. >>> >>> Bloat-o-meter output: >>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) >>> >>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> >>> Reviewed-by: Josh Triplett <josh@joshtriplett.org> >> v2 does nothing to address the longstanding position of >> the community that disabling the traditional user based >> access controls is unacceptable. > So far that "longstanding position" consists of you griping that we're > not implementing authoritative LSM hooks for you and re-fighting that > battle for you. 
Patches for authoritative LSM hooks did indeed get > refused long ago, which is an excellent reason for us to not recast this > patch to reimplement them that way. The reason for bringing up authoritative hooks is that they allowed for a configuration like the one you have implemented. That fact was presented as an important reason why authoritative hooks could not be allowed. The point is not that I wanted authoritative hooks. The point is that the community opposed the very configuration you have implemented. I mention the authoritative hooks argument because that's where the issue was discussed. And if I felt sufficiently strongly about bringing back authoritative hooks I wouldn't whinge to you about it. I'd go do it, and make a proper job of it. There are bigger and more important fish frying in the LSM community just now. > If it does turn out that the security maintainers in the kernel are open > to the idea of authoritative LSM hooks, by all means I would encourage > you to revisit such hooks. But there's a significant difference between > "add the ability to disable access controls" and "add a framework that > allows replacing the user/group security model with arbitrary access > controls", and it's not at all obvious that the "right" solution for the > former is an implementation of the latter; it also seems entirely > plausible that the kernel community remains opposed to the latter, which > does not necessarily rule out the former. My concern is that you've got a very specific configuration that is going to have all sorts of application compatibility problems. I'm all for that as an experimental environment, but I don't think it's anywhere near ready or perhaps appropriate for upstream. > Given that, it would be helpful to hear feedback from more of the > community. Oh, I agree. I would also be curious about the user-space environment you hope to support with this kernel.
>> Speaking of the LSM, what is your expectation regarding the >> use of security modules in addition to "NON_ROOT"? Is it >> forbidden, allowed or encouraged? > I would expect that any security module would need to depend on NON_ROOT > (or MULTIUSER as v3 may end up calling it, per Geert Uytterhoeven's > suggestion). A kernel configuration with this option turned off > intentionally does not *have* user/group access controls. The intent of > this option isn't "turn standard access controls off so they get out of > the way of non-standard access controls"; the intent is "turn all access > controls off because there will never be unprivileged processes on this > system". Pretty limiting, and completely inappropriate for any system that gets connected as a part of the Internet of Things. So I'm back to thinking that while this may be a fun experiment, it doesn't belong as a supported upstream configuration. I hate thinking of Ubuntu running on top of this kernel, but someone will want to try it, you can bet. > So, on that basis, it sounds like v3 should add a dependency from > SECURITY to MULTIUSER. Your goals, your call, of course. If it's not generally useful though ... > - Josh Triplett ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] kernel: Conditionally support non-root users, groups and capabilities 2015-01-30 2:05 ` Casey Schaufler @ 2015-01-30 21:04 ` Josh Triplett 0 siblings, 0 replies; 22+ messages in thread From: Josh Triplett @ 2015-01-30 21:04 UTC (permalink / raw) To: Casey Schaufler Cc: Iulia Manda, gnomes, serge.hallyn, linux-kernel, akpm, paulmck, peterz, mhocko, LSM On Thu, Jan 29, 2015 at 06:05:53PM -0800, Casey Schaufler wrote: > On 1/29/2015 4:43 PM, josh@joshtriplett.org wrote: > > On Thu, Jan 29, 2015 at 03:44:46PM -0800, Casey Schaufler wrote: > >> On 1/29/2015 10:43 AM, Iulia Manda wrote: > >>> There are a lot of embedded systems that run most or all of their functionality > >>> in init, running as root:root. For these systems, supporting multiple users is > >>> not necessary. > >>> > >>> This patch adds a new symbol, CONFIG_NON_ROOT, that makes support for non-root > >>> users, non-root groups, and capabilities optional. > >>> > >>> When this symbol is not defined, UID and GID are zero in any possible case > >>> and processes always have all capabilities. > >>> > >>> The following syscalls are compiled out: setuid, setregid, setgid, > >>> setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, > >>> setfsuid, setfsgid, capget, capset. > >>> > >>> Also, groups.c is compiled out completely. > >>> > >>> This change saves about 25 KB on a defconfig build. > >>> > >>> The kernel was booted in Qemu. All the common functionalities work. Adding > >>> users/groups is not possible, failing with -ENOSYS. > >>> > >>> Bloat-o-meter output: > >>> add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) > >>> > >>> Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> > >>> Reviewed-by: Josh Triplett <josh@joshtriplett.org> > >> v2 does nothing to address the longstanding position of > >> the community that disabling the traditional user based > >> access controls is unacceptable. 
> > So far that "longstanding position" consists of you griping that we're > > not implementing authoritative LSM hooks for you and re-fighting that > > battle for you. Patches for authoritative LSM hooks did indeed get > > refused long ago, which is an excellent reason for us to not recast this > > patch to reimplement them that way. > > The reason for bringing up authoritative hooks is that they allowed for > a configuration like the one you have implemented. That fact was presented > as an important reason why authoritative hooks could not be allowed. > The point is not that I wanted authoritative hooks. The point is that the > community opposed the very configuration you have implemented. I mention > the authoritative hooks argument because that's where the issue was discussed. I recall those same discussions when they happened; the portions of the discussion I saw were less concerned about the ability to turn *off* the security model, and more concerned that you could entirely *replace* the UNIX security model with something more complex that could grant permissions where you thought you'd denied them. Referencing one of the mails from that discussion (http://lwn.net/2001/1108/a/cs-hooks.php3 , which for the record I entirely agree with your position and arguments in), the three concerns expressed against authoritative LSM hooks were: > 1. It is more invasive. > 2. It increases the likelihood that modules can accidentally > undermine the base logic. > 3. It increases the likelihood that the LSM patch will introduce an > error into the base kernel. 1) This patch is far less invasive. 2) There's no "accidentally" here; this is a config option to *intentionally* turn off the base logic. 3) Again, this patch is much simpler, single-purpose, and easier to review. So, by all means, let's have this discussion again, but let's not frame it in the context of authoritative versus restrictive LSM hooks, which is an entirely different argument. 
> > If it does turn out that the security maintainers in the kernel are open > > to the idea of authoritative LSM hooks, by all means I would encourage > > you to revisit such hooks. But there's a significant difference between > > "add the ability to disable access controls" and "add a framework that > > allows replacing the user/group security model with arbitrary access > > controls", and it's not at all obvious that the "right" solution for the > > former is an implementation of the latter; it also seems entirely > > plausible that the kernel community remains opposed to the latter, which > > does not necessarily rule out the former. > > My concern is that you've got a very specific configuration that is going > to have all sort of application compatibility problems. I'm all for that > as an experimental environment, but I don't think it's anywhere near ready > or perhaps appropriate for upstream. This is a configuration option in the middle of the "potential application compatibility problems" menu, also known as CONFIG_EXPERT. Any and all options there can and will disable parts of core kernel functionality. This option is intended to support purpose-built applications that know full well what they're getting themselves into by turning it off. > > Given that, it would be helpful to hear feedback from more of the > > community. > > Oh, I agree. I would also be curious about the user-space environment > you hope to support with this kernel. We've successfully booted kernels and simple userspaces with this patch in qemu. The userspace environment for this kind of kernel would typically have init=/application, little to no filesystem, likely a read-only filesystem if any, and reboot-on-panic as an application supervision mechanism. > >> Speaking of the LSM, what is your expectation regarding the > >> use of security modules in addition to "NON_ROOT"? Is it > >> forbidden, allowed or encouraged? 
> > I would expect that any security module would need to depend on NON_ROOT > > (or MULTIUSER as v3 may end up calling it, per Geert Uytterhoeven's > > suggestion). A kernel configuration with this option turned off > > intentionally does not *have* user/group access controls. The intent of > > this option isn't "turn standard access controls off so they get out of > > the way of non-standard access controls"; the intent is "turn all access > > controls off because there will never be unprivileged processes on this > > system". > > Pretty limiting, and completely inappropriate for any system that > gets connected as a part of the Internet of Things. So I'm back to > thinking that while this may be a fun experiment, it doesn't belong > as a supported upstream configuration. I hate thinking of Ubuntu running > on top of this kernel, but someone will want to try it, you can bet. I would hate to see that too. Distributions know better than to enable this kind of option in a distro kernel; give them a little credit. So, someone using this option will have to manually build a kernel with this option intentionally disabled. This option isn't a loaded and pre-pointed footgun; it would take non-trivial effort to shoot yourself in the foot with it. And, in fact, any normal distribution will fail to work out of the box with this option disabled, making it even more difficult to misuse. If you could switch to another user ID, but then that user ID was still root-equivalent, that would be quite dangerous; here, any attempt to switch to a non-root user ID (or group ID, or reduced set of capabilities) will fail up front with ENOSYS. - Josh Triplett ^ permalink raw reply [flat|nested] 22+ messages in thread
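As an illustration of the last point in the message above (identity-changing calls failing up front with ENOSYS), here is a minimal userspace sketch in C of how a purpose-built application might detect such a kernel. It is not part of the patch. The names `uid_support`, `classify_setuid` and `probe_setuid` are hypothetical, and the probe assumes that calling `setuid()` with the caller's own real UID is harmless on a conventional kernel (in a set-user-ID binary it would drop the elevated effective UID, so it should not be used there).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch; not a kernel or libc API. */
enum uid_support {
    UID_SUPPORTED,     /* the call succeeded: user IDs are implemented */
    UID_COMPILED_OUT,  /* ENOSYS: the syscall is not present in this kernel */
    UID_OTHER_ERROR    /* some other failure, e.g. EPERM */
};

/* Classify the outcome of a set*id-style call from its return value and errno. */
static enum uid_support classify_setuid(int rc, int err)
{
    if (rc == 0)
        return UID_SUPPORTED;
    if (err == ENOSYS)
        return UID_COMPILED_OUT;
    return UID_OTHER_ERROR;
}

/*
 * Probe the running kernel by asking for the real UID we already have.
 * Assumption: on a kernel with user/group support this leaves a normal
 * (non-set-user-ID) process unchanged; on a kernel built without it, the
 * thread above says the call fails with ENOSYS.
 */
static enum uid_support probe_setuid(void)
{
    errno = 0;
    int rc = setuid(getuid());
    return classify_setuid(rc, errno);
}
```

A program could call `probe_setuid()` once at startup and refuse to run, or skip its privilege-dropping path, when the result is `UID_COMPILED_OUT`. Treating ENOSYS as a hard failure rather than continuing silently matches the behaviour described in the thread, where normal distributions fail to work out of the box on such a kernel.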