* [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions @ 2016-10-04 4:41 John Stultz 2016-10-04 4:41 ` [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks John Stultz ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: John Stultz @ 2016-10-04 4:41 UTC (permalink / raw) To: lkml Cc: John Stultz, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir As a heads up, this is just a first RFC and not a submission. I wanted to send this out again, as the last time I submitted this (https://marc.info/?l=linux-kernel&m=143217972215192&w=2) the discussion got out into the separate issue of how Android at one time abused memcg (but I believe now memcg is no longer used). So for this revision, I've removed any memcg usage so we can try to focus on just the actively used cpuset and cpuctrl cgroups. Android currently loosens the cgroup attchment permissions, allowing tasks with CAP_SYS_NICE to be able to allow tasks to move arbitrary tasks across cgroups. Android currently uses cgroups to bound tasks in various states (ie: foreground applications, background applications, audio application, system audio, and system tasks), to specific cpus as well as to limit cpu time. This allows for things like audio applications to be SCHED_FIFO but not run-away hogging infinite cpu, and background task cpu usage to be similarly cputime limited, and kept to only low-power cpus. The migration of a task from the foreground to background, or to elevate a task to audio priority, may be done by system service that does not run as root. So this patch allows processes with CAP_SYS_NICE to be able to migrate tasks between cgroups. I suspect if there was a specific cap (CAP_SYS_CHANGE_CGROUP) for this, it would be usable here, but in its absence, they've overloaded CAP_SYS_NICE for this use. At first glance, overloading CAP_SYS_NICE seems a bit hackish, but this shows that there is a active and widely deployed use for different cgroup attachment rules then what is currently available. I've tried to rework the patches so this attachment policy is build time configurable, and wanted to send them out for review so folks might give their thoughts on this implementation and what they might see as a better way to go about achieving the same goal. Thoughts and feedback would be appriciated! thanks -john Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: cgroups@vger.kernel.org Cc: Android Kernel Team <kernel-team@android.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Colin Cross <ccross@android.com> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Christian Poetzsch <christian.potzsch@imgtec.com> Cc: Amit Pundir <amit.pundir@linaro.org> Colin Cross (1): cgroup: Add generic cgroup subsystem permission checks Rom Lemarchand (1): cgroup: Add a allow_attach policy for Android Documentation/cgroup-v1/cgroups.txt | 9 ++++++ include/linux/cgroup-defs.h | 1 + include/linux/cgroup.h | 16 ++++++++++ init/Kconfig | 7 +++++ kernel/cgroup.c | 61 +++++++++++++++++++++++++++++++++++-- kernel/cpuset.c | 3 ++ kernel/sched/core.c | 3 ++ 7 files changed, 98 insertions(+), 2 deletions(-) -- 1.9.1 ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks 2016-10-04 4:41 [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions John Stultz @ 2016-10-04 4:41 ` John Stultz 2016-10-05 19:09 ` Dmitry Torokhov 2016-10-04 4:41 ` [PATCH 2/2] cgroup: Add a allow_attach policy for Android John Stultz 2016-10-04 16:16 ` [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions Tejun Heo 2 siblings, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-04 4:41 UTC (permalink / raw) To: lkml Cc: Colin Cross, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, John Stultz From: Colin Cross <ccross@android.com> Rather than using explicit euid == 0 checks when trying to move tasks into a cgroup, move permission checks into each specific cgroup subsystem. If a subsystem does not specify a 'allow_attach' handler, then we fall back to doing the checks the old way. This patch adds a 'allow_attach' handler instead of reusing the 'can_attach' handler, since if the 'can_attach' handler was reused, a new cgroup that implements 'can_attach' but not the permission checks could end up with no permission checks at all. This also includes folded down fixes from: Christian Poetzsch <christian.potzsch@imgtec.com> Amit Pundir <amit.pundir@linaro.org> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: cgroups@vger.kernel.org Cc: Android Kernel Team <kernel-team@android.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Colin Cross <ccross@android.com> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Christian Poetzsch <christian.potzsch@imgtec.com> Cc: Amit Pundir <amit.pundir@linaro.org> Original-Author: San Mehat <san@google.com> Signed-off-by: Colin Cross <ccross@android.com> [jstultz: Rewording of commit message, folded down fixes] Signed-off-by: John Stultz <john.stultz@linaro.org> --- Documentation/cgroup-v1/cgroups.txt | 9 +++++++++ include/linux/cgroup-defs.h | 1 + kernel/cgroup.c | 40 +++++++++++++++++++++++++++++++++++-- 3 files changed, 48 insertions(+), 2 deletions(-) diff --git a/Documentation/cgroup-v1/cgroups.txt b/Documentation/cgroup-v1/cgroups.txt index 308e5ff..295f026 100644 --- a/Documentation/cgroup-v1/cgroups.txt +++ b/Documentation/cgroup-v1/cgroups.txt @@ -578,6 +578,15 @@ is completely unused; @cgrp->parent is still valid. (Note - can also be called for a newly-created cgroup if an error occurs after this subsystem's create() method has been called for the new cgroup). +int allow_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) +(cgroup_mutex held by caller) + +Called prior to moving a task into a cgroup; if the subsystem +returns an error, this will abort the attach operation. Used +to extend the permission checks - if all subsystems in a cgroup +return 0, the attach will be allowed to proceed, even if the +default permission check (root or same user) fails. + int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) (cgroup_mutex held by caller) diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h index 5b17de6..0f4548c 100644 --- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -441,6 +441,7 @@ struct cgroup_subsys { void (*css_free)(struct cgroup_subsys_state *css); void (*css_reset)(struct cgroup_subsys_state *css); + int (*allow_attach)(struct cgroup_taskset *tset); int (*can_attach)(struct cgroup_taskset *tset); void (*cancel_attach)(struct cgroup_taskset *tset); void (*attach)(struct cgroup_taskset *tset); diff --git a/kernel/cgroup.c b/kernel/cgroup.c index d6b729b..e6afe2d 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2833,6 +2833,25 @@ static int cgroup_attach_task(struct cgroup *dst_cgrp, return ret; } +static int cgroup_allow_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) +{ + struct cgroup_subsys_state *css; + int i; + int ret; + + for_each_css(css, i, cgrp) { + if (css->ss->allow_attach) { + ret = css->ss->allow_attach(tset); + if (ret) + return ret; + } else { + return -EACCES; + } + } + + return 0; +} + static int cgroup_procs_write_permission(struct task_struct *task, struct cgroup *dst_cgrp, struct kernfs_open_file *of) @@ -2847,8 +2866,25 @@ static int cgroup_procs_write_permission(struct task_struct *task, */ if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) && !uid_eq(cred->euid, tcred->uid) && - !uid_eq(cred->euid, tcred->suid)) - ret = -EACCES; + !uid_eq(cred->euid, tcred->suid)) { + /* + * if the default permission check fails, give each + * cgroup a chance to extend the permission check + */ + struct cgroup_taskset tset = { + .src_csets = LIST_HEAD_INIT(tset.src_csets), + .dst_csets = LIST_HEAD_INIT(tset.dst_csets), + .csets = &tset.src_csets, + }; + struct css_set *cset; + + cset = task_css_set(task); + list_add(&cset->mg_node, &tset.src_csets); + ret = cgroup_allow_attach(dst_cgrp, &tset); + list_del(&tset.src_csets); + if (ret) + ret = -EACCES; + } if (!ret && cgroup_on_dfl(dst_cgrp)) { struct super_block *sb = of->file->f_path.dentry->d_sb; -- 1.9.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks 2016-10-04 4:41 ` [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks John Stultz @ 2016-10-05 19:09 ` Dmitry Torokhov 2016-10-05 19:16 ` John Stultz 0 siblings, 1 reply; 20+ messages in thread From: Dmitry Torokhov @ 2016-10-05 19:09 UTC (permalink / raw) To: John Stultz Cc: lkml, Colin Cross, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou [ Some comments are form Ricky Zhou <rickyz@chromium.org>, some from myself ] On Mon, Oct 03, 2016 at 09:41:29PM -0700, John Stultz wrote: > From: Colin Cross <ccross@android.com> > > Rather than using explicit euid == 0 checks when trying to move > tasks into a cgroup, move permission checks into each specific > cgroup subsystem. If a subsystem does not specify a 'allow_attach' > handler, then we fall back to doing the checks the old way. > > This patch adds a 'allow_attach' handler instead of reusing the > 'can_attach' handler, since if the 'can_attach' handler was > reused, a new cgroup that implements 'can_attach' but not the > permission checks could end up with no permission checks at all. > > This also includes folded down fixes from: > Christian Poetzsch <christian.potzsch@imgtec.com> > Amit Pundir <amit.pundir@linaro.org> > > Cc: Tejun Heo <tj@kernel.org> > Cc: Li Zefan <lizefan@huawei.com> > Cc: Jonathan Corbet <corbet@lwn.net> > Cc: cgroups@vger.kernel.org > Cc: Android Kernel Team <kernel-team@android.com> > Cc: Rom Lemarchand <romlem@android.com> > Cc: Colin Cross <ccross@android.com> > Cc: Dmitry Shmidt <dimitrysh@google.com> > Cc: Todd Kjos <tkjos@google.com> > Cc: Christian Poetzsch <christian.potzsch@imgtec.com> > Cc: Amit Pundir <amit.pundir@linaro.org> > Original-Author: San Mehat <san@google.com> > Signed-off-by: Colin Cross <ccross@android.com> > [jstultz: Rewording of commit message, folded down fixes] > Signed-off-by: John Stultz <john.stultz@linaro.org> > --- > Documentation/cgroup-v1/cgroups.txt | 9 +++++++++ > include/linux/cgroup-defs.h | 1 + > kernel/cgroup.c | 40 +++++++++++++++++++++++++++++++++++-- > 3 files changed, 48 insertions(+), 2 deletions(-) > > diff --git a/Documentation/cgroup-v1/cgroups.txt b/Documentation/cgroup-v1/cgroups.txt > index 308e5ff..295f026 100644 > --- a/Documentation/cgroup-v1/cgroups.txt > +++ b/Documentation/cgroup-v1/cgroups.txt > @@ -578,6 +578,15 @@ is completely unused; @cgrp->parent is still valid. (Note - can also > be called for a newly-created cgroup if an error occurs after this > subsystem's create() method has been called for the new cgroup). > > +int allow_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) > +(cgroup_mutex held by caller) > + > +Called prior to moving a task into a cgroup; if the subsystem > +returns an error, this will abort the attach operation. Used > +to extend the permission checks - if all subsystems in a cgroup > +return 0, the attach will be allowed to proceed, even if the > +default permission check (root or same user) fails. > + > int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) > (cgroup_mutex held by caller) > > diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h > index 5b17de6..0f4548c 100644 > --- a/include/linux/cgroup-defs.h > +++ b/include/linux/cgroup-defs.h > @@ -441,6 +441,7 @@ struct cgroup_subsys { > void (*css_free)(struct cgroup_subsys_state *css); > void (*css_reset)(struct cgroup_subsys_state *css); > > + int (*allow_attach)(struct cgroup_taskset *tset); > int (*can_attach)(struct cgroup_taskset *tset); > void (*cancel_attach)(struct cgroup_taskset *tset); > void (*attach)(struct cgroup_taskset *tset); > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index d6b729b..e6afe2d 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -2833,6 +2833,25 @@ static int cgroup_attach_task(struct cgroup *dst_cgrp, > return ret; > } > > +static int cgroup_allow_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) > +{ > + struct cgroup_subsys_state *css; > + int i; > + int ret; > + > + for_each_css(css, i, cgrp) { > + if (css->ss->allow_attach) { > + ret = css->ss->allow_attach(tset); > + if (ret) > + return ret; > + } else { > + return -EACCES; > + } > + } > + > + return 0; > +} > + > static int cgroup_procs_write_permission(struct task_struct *task, > struct cgroup *dst_cgrp, > struct kernfs_open_file *of) > @@ -2847,8 +2866,25 @@ static int cgroup_procs_write_permission(struct task_struct *task, > */ > if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) && > !uid_eq(cred->euid, tcred->uid) && > - !uid_eq(cred->euid, tcred->suid)) > - ret = -EACCES; > + !uid_eq(cred->euid, tcred->suid)) { > + /* > + * if the default permission check fails, give each > + * cgroup a chance to extend the permission check > + */ > + struct cgroup_taskset tset = { > + .src_csets = LIST_HEAD_INIT(tset.src_csets), > + .dst_csets = LIST_HEAD_INIT(tset.dst_csets), > + .csets = &tset.src_csets, > + }; > + struct css_set *cset; > + > + cset = task_css_set(task); Do we need to take css_set_lock here? If not, why? > + list_add(&cset->mg_node, &tset.src_csets); > + ret = cgroup_allow_attach(dst_cgrp, &tset); > + list_del(&tset.src_csets); This should be list_del_init(&cset->mg_node); since you are deleting task's cset from the tset list, not other way around. It only happen to work because there is exactly 1 member in tset.src_csets and list_del done on it is exactly list_del_init on the node, so you are not leaving with uncorrupted mg_node in task's cset. > + if (ret) > + ret = -EACCES; > + } > > if (!ret && cgroup_on_dfl(dst_cgrp)) { > struct super_block *sb = of->file->f_path.dentry->d_sb; Isn't this, generally speaking, racy? We take current task's cset and check if we have rights to move it over. But we do not have any locking between check and actual move, so can the cset change between these 2 operations? And if cset can't really change and it is only 1 task, then why do we bother with forming taskset at all? Can we make allow_attach take just the target task argument? Thanks. -- Dmitry ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks 2016-10-05 19:09 ` Dmitry Torokhov @ 2016-10-05 19:16 ` John Stultz 2016-10-05 19:23 ` Dmitry Torokhov 0 siblings, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-05 19:16 UTC (permalink / raw) To: Dmitry Torokhov Cc: lkml, Colin Cross, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Wed, Oct 5, 2016 at 12:09 PM, Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote: > [ Some comments are form Ricky Zhou <rickyz@chromium.org>, some from > myself ] > On Mon, Oct 03, 2016 at 09:41:29PM -0700, John Stultz wrote: >> From: Colin Cross <ccross@android.com> >> [snip] >> + >> + cset = task_css_set(task); > > Do we need to take css_set_lock here? If not, why? > >> + list_add(&cset->mg_node, &tset.src_csets); >> + ret = cgroup_allow_attach(dst_cgrp, &tset); >> + list_del(&tset.src_csets); > > This should be > > list_del_init(&cset->mg_node); > > since you are deleting task's cset from the tset list, not other way > around. It only happen to work because there is exactly 1 member in > tset.src_csets and list_del done on it is exactly list_del_init on the > node, so you are not leaving with uncorrupted mg_node in task's cset. > >> + if (ret) >> + ret = -EACCES; >> + } >> >> if (!ret && cgroup_on_dfl(dst_cgrp)) { >> struct super_block *sb = of->file->f_path.dentry->d_sb; > > Isn't this, generally speaking, racy? We take current task's cset and > check if we have rights to move it over. But we do not have any locking > between check and actual move, so can the cset change between these 2 > operations? > > And if cset can't really change and it is only 1 task, then why do we > bother with forming taskset at all? Can we make allow_attach take just > the target task argument? After Tejun's feedback, I've tried reworking the same functionality in a much simpler fashion by introducing a new capability bit. https://lkml.org/lkml/2016/10/4/479 I believe that approach doesn't have the drawbacks you've pointed out here, but would appreciate your input on it. As for your feedback on this patch, I'll have to look into it a bit, as I don't have good answers for you for you right off. But these do seem like valid concerns and since the Android common.git kernels are using the code I submitted here, this issues likely need to be fixed there. thanks -john ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks 2016-10-05 19:16 ` John Stultz @ 2016-10-05 19:23 ` Dmitry Torokhov 0 siblings, 0 replies; 20+ messages in thread From: Dmitry Torokhov @ 2016-10-05 19:23 UTC (permalink / raw) To: John Stultz Cc: lkml, Colin Cross, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Wed, Oct 5, 2016 at 12:16 PM, John Stultz <john.stultz@linaro.org> wrote: > On Wed, Oct 5, 2016 at 12:09 PM, Dmitry Torokhov > <dmitry.torokhov@gmail.com> wrote: >> [ Some comments are form Ricky Zhou <rickyz@chromium.org>, some from >> myself ] >> On Mon, Oct 03, 2016 at 09:41:29PM -0700, John Stultz wrote: >>> From: Colin Cross <ccross@android.com> >>> > [snip] >>> + >>> + cset = task_css_set(task); >> >> Do we need to take css_set_lock here? If not, why? >> >>> + list_add(&cset->mg_node, &tset.src_csets); >>> + ret = cgroup_allow_attach(dst_cgrp, &tset); >>> + list_del(&tset.src_csets); >> >> This should be >> >> list_del_init(&cset->mg_node); >> >> since you are deleting task's cset from the tset list, not other way >> around. It only happen to work because there is exactly 1 member in >> tset.src_csets and list_del done on it is exactly list_del_init on the >> node, so you are not leaving with uncorrupted mg_node in task's cset. >> >>> + if (ret) >>> + ret = -EACCES; >>> + } >>> >>> if (!ret && cgroup_on_dfl(dst_cgrp)) { >>> struct super_block *sb = of->file->f_path.dentry->d_sb; >> >> Isn't this, generally speaking, racy? We take current task's cset and >> check if we have rights to move it over. But we do not have any locking >> between check and actual move, so can the cset change between these 2 >> operations? >> >> And if cset can't really change and it is only 1 task, then why do we >> bother with forming taskset at all? Can we make allow_attach take just >> the target task argument? > > After Tejun's feedback, I've tried reworking the same functionality in > a much simpler fashion by introducing a new capability bit. > https://lkml.org/lkml/2016/10/4/479 > > I believe that approach doesn't have the drawbacks you've pointed out > here, but would appreciate your input on it. > > As for your feedback on this patch, I'll have to look into it a bit, > as I don't have good answers for you for you right off. But these do > seem like valid concerns and since the Android common.git kernels are > using the code I submitted here, this issues likely need to be fixed > there. Yeah, we are looking into the same for ChromeOS, so we have this: https://chromium-review.googlesource.com/c/393907/ Thanks. -- Dmitry ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/2] cgroup: Add a allow_attach policy for Android 2016-10-04 4:41 [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions John Stultz 2016-10-04 4:41 ` [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks John Stultz @ 2016-10-04 4:41 ` John Stultz 2016-10-05 19:10 ` Dmitry Torokhov 2016-10-04 16:16 ` [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions Tejun Heo 2 siblings, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-04 4:41 UTC (permalink / raw) To: lkml Cc: Rom Lemarchand, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, John Stultz From: Rom Lemarchand <romlem@android.com> If CONFIG_CGROUP_NICE_ATTACH is enabled, this implements an allow_attach policy for Android, which allows any process with CAP_SYS_NICE to move tasks across cpuset and cpuctrl cgroups. This includes folded down fixes from: Dmitry Shmidt <dimitrysh@google.com> Riley Andrews <riandrews@google.com> Amit Pundir <amit.pundir@linaro.org> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: cgroups@vger.kernel.org Cc: Android Kernel Team <kernel-team@android.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Colin Cross <ccross@android.com> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Christian Poetzsch <christian.potzsch@imgtec.com> Cc: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Rom Lemarchand <romlem@android.com> [jstultz: Majorly reworked to make this policy function configurable, and folded in fixes] Signed-off-by: John Stultz <john.stultz@linaro.org> --- include/linux/cgroup.h | 16 ++++++++++++++++ init/Kconfig | 8 ++++++++ kernel/cgroup.c | 22 ++++++++++++++++++++++ kernel/cpuset.c | 3 +++ kernel/sched/core.c | 3 +++ 5 files changed, 52 insertions(+) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 984f73b..eab4311 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -554,6 +554,17 @@ static inline void pr_cont_cgroup_path(struct cgroup *cgrp) pr_cont_kernfs_path(cgrp->kn); } +#ifdef CONFIG_CGROUP_NICE_ATTACH +/* + * Default Android check for whether the current process is allowed to move a + * task across cgroups, either because CAP_SYS_NICE is set or because the uid + * of the calling process is the same as the moved task or because we are + * running as root. + * Returns 0 if this is allowed, or -EACCES otherwise. + */ +int cgroup_nice_allow_attach(struct cgroup_taskset *tset); +#endif + #else /* !CONFIG_CGROUPS */ struct cgroup_subsys_state; @@ -647,6 +658,11 @@ copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, return old_ns; } +static inline int subsys_cgroup_allow_attach(void *tset) +{ + return -EINVAL; +} + #endif /* !CONFIG_CGROUPS */ static inline void get_cgroup_ns(struct cgroup_namespace *ns) diff --git a/init/Kconfig b/init/Kconfig index cac3f09..c000734 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1021,6 +1021,14 @@ config DEBUG_BLK_CGROUP Enable some debugging help. Currently it exports additional stat files in a cgroup which can be useful for debugging. +config CGROUP_NICE_ATTACH + bool "Enabled Android-style loosening of perm checks for attachment" + default n + ---help--- + Allows non-root processes to add arbitrary processes to cpuset and + cpuctrl cgroups if they have CAP_SYS_NICE set. This is useful for + Android. + config CGROUP_WRITEBACK bool depends on MEMCG && BLK_CGROUP diff --git a/kernel/cgroup.c b/kernel/cgroup.c index e6afe2d..a53f0be 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2833,6 +2833,28 @@ static int cgroup_attach_task(struct cgroup *dst_cgrp, return ret; } +#ifdef CONFIG_CGROUP_NICE_ATTACH +int cgroup_nice_allow_attach(struct cgroup_taskset *tset) +{ + const struct cred *cred = current_cred(), *tcred; + struct task_struct *task; + struct cgroup_subsys_state *css; + + if (capable(CAP_SYS_NICE)) + return 0; + + cgroup_taskset_for_each(task, css, tset) { + tcred = __task_cred(task); + + if (current != task && !uid_eq(cred->euid, tcred->uid) && + !uid_eq(cred->euid, tcred->suid)) + return -EACCES; + } + + return 0; +} +#endif + static int cgroup_allow_attach(struct cgroup *cgrp, struct cgroup_taskset *tset) { struct cgroup_subsys_state *css; diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 2b4c20a..87aede4 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -2100,6 +2100,9 @@ struct cgroup_subsys cpuset_cgrp_subsys = { .css_offline = cpuset_css_offline, .css_free = cpuset_css_free, .can_attach = cpuset_can_attach, +#ifdef CONFIG_CGROUP_NICE_ATTACH + .allow_attach = cgroup_nice_allow_attach, +#endif .cancel_attach = cpuset_cancel_attach, .attach = cpuset_attach, .post_attach = cpuset_post_attach, diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 44817c6..5573505 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8657,6 +8657,9 @@ struct cgroup_subsys cpu_cgrp_subsys = { .fork = cpu_cgroup_fork, .can_attach = cpu_cgroup_can_attach, .attach = cpu_cgroup_attach, +#ifdef CONFIG_CGROUP_NICE_ATTACH + .allow_attach = cgroup_nice_allow_attach, +#endif .legacy_cftypes = cpu_files, .early_init = true, }; -- 1.9.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cgroup: Add a allow_attach policy for Android 2016-10-04 4:41 ` [PATCH 2/2] cgroup: Add a allow_attach policy for Android John Stultz @ 2016-10-05 19:10 ` Dmitry Torokhov 2016-10-05 19:18 ` John Stultz 0 siblings, 1 reply; 20+ messages in thread From: Dmitry Torokhov @ 2016-10-05 19:10 UTC (permalink / raw) To: John Stultz Cc: lkml, Rom Lemarchand, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Mon, Oct 03, 2016 at 09:41:30PM -0700, John Stultz wrote: > +#ifdef CONFIG_CGROUP_NICE_ATTACH > +int cgroup_nice_allow_attach(struct cgroup_taskset *tset) > +{ > + const struct cred *cred = current_cred(), *tcred; > + struct task_struct *task; > + struct cgroup_subsys_state *css; > + > + if (capable(CAP_SYS_NICE)) > + return 0; > + > + cgroup_taskset_for_each(task, css, tset) { > + tcred = __task_cred(task); __task_cred() requires RCU lock (courtesy Ricky Z). > + > + if (current != task && !uid_eq(cred->euid, tcred->uid) && > + !uid_eq(cred->euid, tcred->suid)) > + return -EACCES; > + } > + > + return 0; > +} > +#endif Thanks. -- Dmitry ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cgroup: Add a allow_attach policy for Android 2016-10-05 19:10 ` Dmitry Torokhov @ 2016-10-05 19:18 ` John Stultz 2016-10-06 22:43 ` Dmitry Torokhov 0 siblings, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-05 19:18 UTC (permalink / raw) To: Dmitry Torokhov Cc: lkml, Rom Lemarchand, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Wed, Oct 5, 2016 at 12:10 PM, Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote: > On Mon, Oct 03, 2016 at 09:41:30PM -0700, John Stultz wrote: >> +#ifdef CONFIG_CGROUP_NICE_ATTACH >> +int cgroup_nice_allow_attach(struct cgroup_taskset *tset) >> +{ >> + const struct cred *cred = current_cred(), *tcred; >> + struct task_struct *task; >> + struct cgroup_subsys_state *css; >> + >> + if (capable(CAP_SYS_NICE)) >> + return 0; >> + >> + cgroup_taskset_for_each(task, css, tset) { >> + tcred = __task_cred(task); > > __task_cred() requires RCU lock (courtesy Ricky Z). Again, hopefully this isn't an issue with the new approach, but for the short term I'll see if we can get this fixed in the android tree. Thanks again for your careful review! -john ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cgroup: Add a allow_attach policy for Android 2016-10-05 19:18 ` John Stultz @ 2016-10-06 22:43 ` Dmitry Torokhov 2016-10-06 22:52 ` Dmitry Torokhov 0 siblings, 1 reply; 20+ messages in thread From: Dmitry Torokhov @ 2016-10-06 22:43 UTC (permalink / raw) To: John Stultz Cc: lkml, Rom Lemarchand, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Wed, Oct 05, 2016 at 12:18:17PM -0700, John Stultz wrote: > On Wed, Oct 5, 2016 at 12:10 PM, Dmitry Torokhov > <dmitry.torokhov@gmail.com> wrote: > > On Mon, Oct 03, 2016 at 09:41:30PM -0700, John Stultz wrote: > >> +#ifdef CONFIG_CGROUP_NICE_ATTACH > >> +int cgroup_nice_allow_attach(struct cgroup_taskset *tset) > >> +{ > >> + const struct cred *cred = current_cred(), *tcred; > >> + struct task_struct *task; > >> + struct cgroup_subsys_state *css; > >> + > >> + if (capable(CAP_SYS_NICE)) > >> + return 0; > >> + > >> + cgroup_taskset_for_each(task, css, tset) { > >> + tcred = __task_cred(task); > > > > __task_cred() requires RCU lock (courtesy Ricky Z). > > Again, hopefully this isn't an issue with the new approach, but for > the short term I'll see if we can get this fixed in the android tree. > Actually, it should all be simply removed from there right away, as this ends up being basically noop (but with all the locking violations and races): cgroup_taskset_for_each() needs tasks to be placed on cset->mg_tasks list, but nobody does this in the ->allow_access() code path, so this loops always executes exactly 0 times and the whole thing is exactly equivalent of doing return capable(CAP_SYS_NICE) ? 0 : -EACESS; which can be done right in cgroup_procs_write_permission(). Thanks. -- Dmitry ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cgroup: Add a allow_attach policy for Android 2016-10-06 22:43 ` Dmitry Torokhov @ 2016-10-06 22:52 ` Dmitry Torokhov 0 siblings, 0 replies; 20+ messages in thread From: Dmitry Torokhov @ 2016-10-06 22:52 UTC (permalink / raw) To: John Stultz Cc: lkml, Rom Lemarchand, Tejun Heo, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Ricky Zhou On Thu, Oct 06, 2016 at 03:43:51PM -0700, Dmitry Torokhov wrote: > On Wed, Oct 05, 2016 at 12:18:17PM -0700, John Stultz wrote: > > On Wed, Oct 5, 2016 at 12:10 PM, Dmitry Torokhov > > <dmitry.torokhov@gmail.com> wrote: > > > On Mon, Oct 03, 2016 at 09:41:30PM -0700, John Stultz wrote: > > >> +#ifdef CONFIG_CGROUP_NICE_ATTACH > > >> +int cgroup_nice_allow_attach(struct cgroup_taskset *tset) > > >> +{ > > >> + const struct cred *cred = current_cred(), *tcred; > > >> + struct task_struct *task; > > >> + struct cgroup_subsys_state *css; > > >> + > > >> + if (capable(CAP_SYS_NICE)) > > >> + return 0; > > >> + > > >> + cgroup_taskset_for_each(task, css, tset) { > > >> + tcred = __task_cred(task); > > > > > > __task_cred() requires RCU lock (courtesy Ricky Z). > > > > Again, hopefully this isn't an issue with the new approach, but for > > the short term I'll see if we can get this fixed in the android tree. > > > > Actually, it should all be simply removed from there right away, as this > ends up being basically noop (but with all the locking violations and > races): > > cgroup_taskset_for_each() needs tasks to be placed on cset->mg_tasks > list, but nobody does this in the ->allow_access() code path, so this > loops always executes exactly 0 times and the whole thing is exactly > equivalent of doing > > return capable(CAP_SYS_NICE) ? 0 : -EACESS; > > which can be done right in cgroup_procs_write_permission(). Umm, sorry, no, it actually always returns 0, regardless even of capabilities. Permissions are indeed being relaxed ;) -- Dmitry ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 4:41 [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions John Stultz 2016-10-04 4:41 ` [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks John Stultz 2016-10-04 4:41 ` [PATCH 2/2] cgroup: Add a allow_attach policy for Android John Stultz @ 2016-10-04 16:16 ` Tejun Heo 2016-10-04 18:01 ` John Stultz 2016-10-04 18:03 ` John Stultz 2 siblings, 2 replies; 20+ messages in thread From: Tejun Heo @ 2016-10-04 16:16 UTC (permalink / raw) To: John Stultz Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir Hello, John. On Mon, Oct 03, 2016 at 09:41:28PM -0700, John Stultz wrote: > The migration of a task from the foreground to background, or to > elevate a task to audio priority, may be done by system service that > does not run as root. So this patch allows processes with CAP_SYS_NICE > to be able to migrate tasks between cgroups. I suspect if there was a > specific cap (CAP_SYS_CHANGE_CGROUP) for this, it would be usable here, > but in its absence, they've overloaded CAP_SYS_NICE for this use. CAP_SYS_RESOURCE won't do? > At first glance, overloading CAP_SYS_NICE seems a bit hackish, but this > shows that there is a active and widely deployed use for different cgroup > attachment rules then what is currently available. I'm curious who issues these migrations. Is that restricted to certain uids? If so, would it work for android if cgroupfs supports ACL so that those uids can be approved via setfacl? That'd be an a lot more generic approach. Thanks. -- tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 16:16 ` [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions Tejun Heo @ 2016-10-04 18:01 ` John Stultz 2016-10-04 19:38 ` Tejun Heo 2016-10-04 18:03 ` John Stultz 1 sibling, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-04 18:01 UTC (permalink / raw) To: Tejun Heo Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir On Tue, Oct 4, 2016 at 9:16 AM, Tejun Heo <tj@kernel.org> wrote: > On Mon, Oct 03, 2016 at 09:41:28PM -0700, John Stultz wrote: >> The migration of a task from the foreground to background, or to >> elevate a task to audio priority, may be done by system service that >> does not run as root. So this patch allows processes with CAP_SYS_NICE >> to be able to migrate tasks between cgroups. I suspect if there was a >> specific cap (CAP_SYS_CHANGE_CGROUP) for this, it would be usable here, >> but in its absence, they've overloaded CAP_SYS_NICE for this use. > > CAP_SYS_RESOURCE won't do? > >> At first glance, overloading CAP_SYS_NICE seems a bit hackish, but this >> shows that there is a active and widely deployed use for different cgroup >> attachment rules then what is currently available. > > I'm curious who issues these migrations. The system_server process via the sched_policy logic: http://androidxref.com/7.0.0_r1/xref/system/core/libcutils/sched_policy.c#274 See set_cpuset_policy() and set_sched_policy(). > Is that restricted to > certain uids? If so, would it work for android if cgroupfs supports > ACL so that those uids can be approved via setfacl? That'd be an a > lot more generic approach. So tasks might move themselves in some cases to specific groups, but mostly its controlled by the system_server. So to make sure I understand your suggestion, you're suggesting the cgroupfs files like: cpuctrl/tasks, cpuctrl/bg_non_interactive/tasks, cpuset/foreground/tasks, cpuset/background/tasks, etc use ACL permissions to specify the specific uids that can write to them? I guess this would be conceptually similar to just setting the owner to the system task, no? Though I'm not sure that would be sufficient since it would still fail the cgroup_procs_write_permission() checks. Or are you suggesting we add extra logic to make the file owner uid as sufficient to change other tasks? thanks -john ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 18:01 ` John Stultz @ 2016-10-04 19:38 ` Tejun Heo 2016-10-04 19:46 ` John Stultz 2016-10-04 20:18 ` Serge E. Hallyn 0 siblings, 2 replies; 20+ messages in thread From: Tejun Heo @ 2016-10-04 19:38 UTC (permalink / raw) To: John Stultz Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Serge Hallyn Hello, John. On Tue, Oct 04, 2016 at 11:01:12AM -0700, John Stultz wrote: > So to make sure I understand your suggestion, you're suggesting the > cgroupfs files like: > cpuctrl/tasks, > cpuctrl/bg_non_interactive/tasks, > cpuset/foreground/tasks, > cpuset/background/tasks, > etc > use ACL permissions to specify the specific uids that can write to > them? I guess this would be conceptually similar to just setting the > owner to the system task, no? Though I'm not sure that would be Yeah, finer grained but essentially just giving write perms. > sufficient since it would still fail the > cgroup_procs_write_permission() checks. Or are you suggesting we add > extra logic to make the file owner uid as sufficient to change other > tasks? Hah, now I'm not sure how this is supposed to work inside a userns as it's checking against GLOBAL_ROOT_UID. cc'ing Serge. Serge, can you please have a look? But back on subject, yeah, I think a capability based approach is better here too. No idea how difficult it is to add a new CAP but I think it's worth trying. Can you please spin up a patch? Thanks! -- tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 19:38 ` Tejun Heo @ 2016-10-04 19:46 ` John Stultz 2016-10-04 19:49 ` Tejun Heo 2016-10-04 20:18 ` Serge E. Hallyn 1 sibling, 1 reply; 20+ messages in thread From: John Stultz @ 2016-10-04 19:46 UTC (permalink / raw) To: Tejun Heo Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Serge Hallyn On Tue, Oct 4, 2016 at 12:38 PM, Tejun Heo <tj@kernel.org> wrote: > Hello, John. > > On Tue, Oct 04, 2016 at 11:01:12AM -0700, John Stultz wrote: >> So to make sure I understand your suggestion, you're suggesting the >> cgroupfs files like: >> cpuctrl/tasks, >> cpuctrl/bg_non_interactive/tasks, >> cpuset/foreground/tasks, >> cpuset/background/tasks, >> etc >> use ACL permissions to specify the specific uids that can write to >> them? I guess this would be conceptually similar to just setting the >> owner to the system task, no? Though I'm not sure that would be > > Yeah, finer grained but essentially just giving write perms. > >> sufficient since it would still fail the >> cgroup_procs_write_permission() checks. Or are you suggesting we add >> extra logic to make the file owner uid as sufficient to change other >> tasks? > > Hah, now I'm not sure how this is supposed to work inside a userns as > it's checking against GLOBAL_ROOT_UID. cc'ing Serge. Serge, can you > please have a look? > > But back on subject, yeah, I think a capability based approach is > better here too. No idea how difficult it is to add a new CAP but I > think it's worth trying. Can you please spin up a patch? Ok. I'll respin this introducing and using a new CAP value. That said, while CAP_SYS_NICE seems a bit overloaded here, it doesn't conceptually have that much friction for use with cpuset and cpuctrl cgroups: (from the man page: http://man7.org/linux/man-pages/man7/capabilities.7.html ) CAP_SYS_NICE * Raise process nice value (nice(2), setpriority(2)) and change the nice value for arbitrary processes; * set real-time scheduling policies for calling process, and set scheduling policies and priorities for arbitrary processes (sched_setscheduler(2), sched_setparam(2), shed_setattr(2)); * set CPU affinity for arbitrary processes (sched_setaffinity(2)); * set I/O scheduling class and priority for arbitrary processes (ioprio_set(2)); * apply migrate_pages(2) to arbitrary processes and allow processes to be migrated to arbitrary nodes; * apply move_pages(2) to arbitrary processes; * use the MPOL_MF_MOVE_ALL flag with mbind(2) and move_pages(2). If you can tweak nice value and set realtime scheduling policy that really seems to me just as invasive as what moving tasks between the cpuctrl and cpuset cgroups could do. thanks -john ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 19:46 ` John Stultz @ 2016-10-04 19:49 ` Tejun Heo 0 siblings, 0 replies; 20+ messages in thread From: Tejun Heo @ 2016-10-04 19:49 UTC (permalink / raw) To: John Stultz Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir, Serge Hallyn Hello, On Tue, Oct 04, 2016 at 12:46:24PM -0700, John Stultz wrote: > Ok. I'll respin this introducing and using a new CAP value. > > That said, while CAP_SYS_NICE seems a bit overloaded here, it doesn't > conceptually have that much friction for use with cpuset and cpuctrl > cgroups: We need to solve it for userns too and I wanna avoid pushing permission logic into specific controllers. The logical extensions of that would be protecting control interface files by different CAPs too. It might work for some knobs and then there will all but certainly unclear corner cases and so on. Let's please not go there. Thanks. -- tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 19:38 ` Tejun Heo 2016-10-04 19:46 ` John Stultz @ 2016-10-04 20:18 ` Serge E. Hallyn 2016-10-04 20:33 ` Tejun Heo 1 sibling, 1 reply; 20+ messages in thread From: Serge E. Hallyn @ 2016-10-04 20:18 UTC (permalink / raw) To: Tejun Heo Cc: John Stultz, lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir Quoting Tejun Heo (tj@kernel.org): > Hello, John. > > On Tue, Oct 04, 2016 at 11:01:12AM -0700, John Stultz wrote: > > So to make sure I understand your suggestion, you're suggesting the > > cgroupfs files like: > > cpuctrl/tasks, > > cpuctrl/bg_non_interactive/tasks, > > cpuset/foreground/tasks, > > cpuset/background/tasks, > > etc > > use ACL permissions to specify the specific uids that can write to > > them? I guess this would be conceptually similar to just setting the > > owner to the system task, no? Though I'm not sure that would be > > Yeah, finer grained but essentially just giving write perms. > > > sufficient since it would still fail the > > cgroup_procs_write_permission() checks. Or are you suggesting we add > > extra logic to make the file owner uid as sufficient to change other > > tasks? > > Hah, now I'm not sure how this is supposed to work inside a userns as > it's checking against GLOBAL_ROOT_UID. cc'ing Serge. Serge, can you > please have a look? Hi, thanks for the cc, how about changing the GLOBAL_ROOT_UID check with a targeted capability check, like if (!ns_capable(tcred->user_ns, CAP_SYS_NICE) && !uid_eq(cred->euid, tcred->uid) && !uid_eq(cred->euid, tcred->suid)) ret = -EACCES; where the actual capability to use may require some thought. > But back on subject, yeah, I think a capability based approach is > better here too. No idea how difficult it is to add a new CAP but I > think it's worth trying. Can you please spin up a patch? > > Thanks! > > -- > tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 20:18 ` Serge E. Hallyn @ 2016-10-04 20:33 ` Tejun Heo 2016-10-04 21:26 ` Serge E. Hallyn 0 siblings, 1 reply; 20+ messages in thread From: Tejun Heo @ 2016-10-04 20:33 UTC (permalink / raw) To: Serge E. Hallyn Cc: John Stultz, lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir Hello, Serge. On Tue, Oct 04, 2016 at 03:18:40PM -0500, Serge E. Hallyn wrote: > how about changing the GLOBAL_ROOT_UID check with a targeted > capability check, like > > if (!ns_capable(tcred->user_ns, CAP_SYS_NICE) && > !uid_eq(cred->euid, tcred->uid) && > !uid_eq(cred->euid, tcred->suid)) > ret = -EACCES; > > where the actual capability to use may require some thought. Yeah, that's the direction I'm thinking too. We can't use CAP_SYS_NICE in general tho. Let's see if a dedicated CAP sticks. Thanks. -- tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 20:33 ` Tejun Heo @ 2016-10-04 21:26 ` Serge E. Hallyn 2016-10-04 21:29 ` Tejun Heo 0 siblings, 1 reply; 20+ messages in thread From: Serge E. Hallyn @ 2016-10-04 21:26 UTC (permalink / raw) To: Tejun Heo Cc: Serge E. Hallyn, John Stultz, lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir Quoting Tejun Heo (tj@kernel.org): > Hello, Serge. > > On Tue, Oct 04, 2016 at 03:18:40PM -0500, Serge E. Hallyn wrote: > > how about changing the GLOBAL_ROOT_UID check with a targeted > > capability check, like > > > > if (!ns_capable(tcred->user_ns, CAP_SYS_NICE) && > > !uid_eq(cred->euid, tcred->uid) && > > !uid_eq(cred->euid, tcred->suid)) > > ret = -EACCES; > > > > where the actual capability to use may require some thought. > > Yeah, that's the direction I'm thinking too. We can't use > CAP_SYS_NICE in general tho. Let's see if a dedicated CAP sticks. One possibility would be to let each cgroup subsystem define a move_caps capability mask which is required over the target task. And add a new CAP_CGROUP which always suffices? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 21:26 ` Serge E. Hallyn @ 2016-10-04 21:29 ` Tejun Heo 0 siblings, 0 replies; 20+ messages in thread From: Tejun Heo @ 2016-10-04 21:29 UTC (permalink / raw) To: Serge E. Hallyn Cc: John Stultz, lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir On Tue, Oct 04, 2016 at 04:26:43PM -0500, Serge E. Hallyn wrote: > Quoting Tejun Heo (tj@kernel.org): > > Hello, Serge. > > > > On Tue, Oct 04, 2016 at 03:18:40PM -0500, Serge E. Hallyn wrote: > > > how about changing the GLOBAL_ROOT_UID check with a targeted > > > capability check, like > > > > > > if (!ns_capable(tcred->user_ns, CAP_SYS_NICE) && > > > !uid_eq(cred->euid, tcred->uid) && > > > !uid_eq(cred->euid, tcred->suid)) > > > ret = -EACCES; > > > > > > where the actual capability to use may require some thought. > > > > Yeah, that's the direction I'm thinking too. We can't use > > CAP_SYS_NICE in general tho. Let's see if a dedicated CAP sticks. > > One possibility would be to let each cgroup subsystem define > a move_caps capability mask which is required over the target > task. And add a new CAP_CGROUP which always suffices? As I wrote in another reply, I really don't wanna do that. It brings in the question about control knob permissions too and makes the permission checks a lot more difficult to predit. I'd much rather just get rid of the extra checks, at least on the v2 hierarchy. The extra checks are protecting against pulling in random processes into a delegated subtree and v2 hierarchy already has a protection against that. Thanks. -- tejun ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions 2016-10-04 16:16 ` [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions Tejun Heo 2016-10-04 18:01 ` John Stultz @ 2016-10-04 18:03 ` John Stultz 1 sibling, 0 replies; 20+ messages in thread From: John Stultz @ 2016-10-04 18:03 UTC (permalink / raw) To: Tejun Heo Cc: lkml, Li Zefan, Jonathan Corbet, cgroups, Android Kernel Team, Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos, Christian Poetzsch, Amit Pundir On Tue, Oct 4, 2016 at 9:16 AM, Tejun Heo <tj@kernel.org> wrote: > On Mon, Oct 03, 2016 at 09:41:28PM -0700, John Stultz wrote: >> The migration of a task from the foreground to background, or to >> elevate a task to audio priority, may be done by system service that >> does not run as root. So this patch allows processes with CAP_SYS_NICE >> to be able to migrate tasks between cgroups. I suspect if there was a >> specific cap (CAP_SYS_CHANGE_CGROUP) for this, it would be usable here, >> but in its absence, they've overloaded CAP_SYS_NICE for this use. > > CAP_SYS_RESOURCE won't do? Oh, and I'll have to look into this one to see what CAP_SYS_RESOURCE actually allows. We ran into trouble in the past changing the CAPs required to something higher-level, which then causes trouble because they want to avoid elevating permissions on system tasks and keep them as restricted as possible. thans -john ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2016-10-06 22:53 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-10-04 4:41 [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions John Stultz 2016-10-04 4:41 ` [PATCH 1/2] cgroup: Add generic cgroup subsystem permission checks John Stultz 2016-10-05 19:09 ` Dmitry Torokhov 2016-10-05 19:16 ` John Stultz 2016-10-05 19:23 ` Dmitry Torokhov 2016-10-04 4:41 ` [PATCH 2/2] cgroup: Add a allow_attach policy for Android John Stultz 2016-10-05 19:10 ` Dmitry Torokhov 2016-10-05 19:18 ` John Stultz 2016-10-06 22:43 ` Dmitry Torokhov 2016-10-06 22:52 ` Dmitry Torokhov 2016-10-04 16:16 ` [RFC][PATCH 0/2] Another pass at Android style loosening of cgroup attach permissions Tejun Heo 2016-10-04 18:01 ` John Stultz 2016-10-04 19:38 ` Tejun Heo 2016-10-04 19:46 ` John Stultz 2016-10-04 19:49 ` Tejun Heo 2016-10-04 20:18 ` Serge E. Hallyn 2016-10-04 20:33 ` Tejun Heo 2016-10-04 21:26 ` Serge E. Hallyn 2016-10-04 21:29 ` Tejun Heo 2016-10-04 18:03 ` John Stultz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).