All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks
@ 2021-12-09 21:47 Tejun Heo
  2021-12-09 21:47 ` [PATCH 1/6] cgroup: Use open-time credentials for process migraton " Tejun Heo
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team

Hello,

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials and cgroup
namespace which is a potential security weakness as it may allow scenarios
where a less privileged process tricks a more privileged one into writing
into a fd that it created.

This patchset make the perm checks use credentials and cgroup namespace
stored at the time of open and contains the following patches.

 0001-cgroup-Use-open-time-credentials-for-process-migrato.patch
 0002-cgroup-Allocate-cgroup_file_ctx-for-kernfs_open_file.patch
 0003-cgroup-Use-open-time-cgroup-namespace-for-process-mi.patch
 0004-selftests-cgroup-Make-cg_create-use-0755-for-permiss.patch
 0005-selftests-cgroup-Test-open-time-credential-usage-for.patch
 0006-selftests-cgroup-Test-open-time-cgroup-namespace-usa.patch

The patchset is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-migration-perms

Michal, does this also fix the original bug you were trying to fix? For now,
I didn't add Fixes / stable tags. If ppl are okay with the patchset, I'll
route it through cgroup/for-5.16-fixes.

diffstat follows. Thanks.

 kernel/cgroup/cgroup-internal.h              |   14 ++++
 kernel/cgroup/cgroup-v1.c                    |    7 +-
 kernel/cgroup/cgroup.c                       |   82 ++++++++++++++++++-------
 tools/testing/selftests/cgroup/cgroup_util.c |    2 
 tools/testing/selftests/cgroup/test_core.c   |  165 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 244 insertions(+), 26 deletions(-)

--
tejun


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] cgroup: Use open-time credentials for process migraton perm checks
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  2021-12-10 17:41   ` Linus Torvalds
  2021-12-09 21:47 ` [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv Tejun Heo
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup-v1.c | 7 ++++---
 kernel/cgroup/cgroup.c    | 9 ++++++++-
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 81c9e0685948..0e7369103ba6 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -504,10 +504,11 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of,
 		goto out_unlock;
 
 	/*
-	 * Even if we're attaching all tasks in the thread group, we only
-	 * need to check permissions on one of them.
+	 * Even if we're attaching all tasks in the thread group, we only need
+	 * to check permissions on one of them. Check permissions using the
+	 * credentials from file open to protect against inherited fd attacks.
 	 */
-	cred = current_cred();
+	cred = of->file->f_cred;
 	tcred = get_task_cred(task);
 	if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
 	    !uid_eq(cred->euid, tcred->uid) &&
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 919194de39c8..2632e46da1d4 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4892,6 +4892,7 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
 {
 	struct cgroup *src_cgrp, *dst_cgrp;
 	struct task_struct *task;
+	const struct cred *saved_cred;
 	ssize_t ret;
 	bool locked;
 
@@ -4909,9 +4910,15 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
 	src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root);
 	spin_unlock_irq(&css_set_lock);
 
-	/* process and thread migrations follow same delegation rule */
+	/*
+	 * Process and thread migrations follow same delegation rule. Check
+	 * permissions using the credentials from file open to protect against
+	 * inherited fd attacks.
+	 */
+	saved_cred = override_creds(of->file->f_cred);
 	ret = cgroup_attach_permissions(src_cgrp, dst_cgrp,
 					of->file->f_path.dentry->d_sb, threadgroup);
+	revert_creds(saved_cred);
 	if (ret)
 		goto out_finish;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
  2021-12-09 21:47 ` [PATCH 1/6] cgroup: Use open-time credentials for process migraton " Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  2021-12-10 17:53   ` Linus Torvalds
  2021-12-09 21:47 ` [PATCH 3/6] cgroup: Use open-time cgroup namespace for process migration perm checks Tejun Heo
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used to in the following patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup/cgroup-internal.h | 12 +++++++++
 kernel/cgroup/cgroup.c          | 47 ++++++++++++++++++++++++---------
 2 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index bfbeabc17a9d..8f681f14828c 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -65,6 +65,18 @@ static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc)
 	return container_of(kfc, struct cgroup_fs_context, kfc);
 }
 
+struct cgroup_file_ctx {
+	union {
+		struct {
+			struct css_task_iter	*it;
+		} procs;
+
+		struct {
+			void			*trigger;
+		} psi;
+	};
+};
+
 /*
  * A cgroup can be associated with multiple css_sets as different tasks may
  * belong to different cgroups on different hierarchies.  In the other
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 2632e46da1d4..2992eb7e8244 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3630,6 +3630,7 @@ static int cgroup_cpu_pressure_show(struct seq_file *seq, void *v)
 static ssize_t cgroup_pressure_write(struct kernfs_open_file *of, char *buf,
 					  size_t nbytes, enum psi_res res)
 {
+	struct cgroup_file_ctx *ctx = of->priv;
 	struct psi_trigger *new;
 	struct cgroup *cgrp;
 	struct psi_group *psi;
@@ -3648,7 +3649,7 @@ static ssize_t cgroup_pressure_write(struct kernfs_open_file *of, char *buf,
 		return PTR_ERR(new);
 	}
 
-	psi_trigger_replace(&of->priv, new);
+	psi_trigger_replace(&ctx->psi.trigger, new);
 
 	cgroup_put(cgrp);
 
@@ -3679,12 +3680,16 @@ static ssize_t cgroup_cpu_pressure_write(struct kernfs_open_file *of,
 static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
 					  poll_table *pt)
 {
-	return psi_trigger_poll(&of->priv, of->file, pt);
+	struct cgroup_file_ctx *ctx = of->priv;
+
+	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
 }
 
 static void cgroup_pressure_release(struct kernfs_open_file *of)
 {
-	psi_trigger_replace(&of->priv, NULL);
+	struct cgroup_file_ctx *ctx = of->priv;
+
+	psi_trigger_replace(&ctx->psi.trigger, NULL);
 }
 
 bool cgroup_psi_enabled(void)
@@ -3811,18 +3816,31 @@ static ssize_t cgroup_kill_write(struct kernfs_open_file *of, char *buf,
 static int cgroup_file_open(struct kernfs_open_file *of)
 {
 	struct cftype *cft = of_cft(of);
+	struct cgroup_file_ctx *ctx;
+	int ret;
 
-	if (cft->open)
-		return cft->open(of);
-	return 0;
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+	of->priv = ctx;
+
+	if (!cft->open)
+		return 0;
+
+	ret = cft->open(of);
+	if (ret)
+		kfree(ctx);
+	return ret;
 }
 
 static void cgroup_file_release(struct kernfs_open_file *of)
 {
 	struct cftype *cft = of_cft(of);
+	struct cgroup_file_ctx *ctx = of->priv;
 
 	if (cft->release)
 		cft->release(of);
+	kfree(ctx);
 }
 
 static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
@@ -4751,21 +4769,23 @@ void css_task_iter_end(struct css_task_iter *it)
 
 static void cgroup_procs_release(struct kernfs_open_file *of)
 {
-	if (of->priv) {
-		css_task_iter_end(of->priv);
-		kfree(of->priv);
+	struct cgroup_file_ctx *ctx = of->priv;
+
+	if (ctx->procs.it) {
+		css_task_iter_end(ctx->procs.it);
+		kfree(ctx->procs.it);
 	}
 }
 
 static void *cgroup_procs_next(struct seq_file *s, void *v, loff_t *pos)
 {
 	struct kernfs_open_file *of = s->private;
-	struct css_task_iter *it = of->priv;
+	struct cgroup_file_ctx *ctx = of->priv;
 
 	if (pos)
 		(*pos)++;
 
-	return css_task_iter_next(it);
+	return css_task_iter_next(ctx->procs.it);
 }
 
 static void *__cgroup_procs_start(struct seq_file *s, loff_t *pos,
@@ -4773,7 +4793,8 @@ static void *__cgroup_procs_start(struct seq_file *s, loff_t *pos,
 {
 	struct kernfs_open_file *of = s->private;
 	struct cgroup *cgrp = seq_css(s)->cgroup;
-	struct css_task_iter *it = of->priv;
+	struct cgroup_file_ctx *ctx = of->priv;
+	struct css_task_iter *it = ctx->procs.it;
 
 	/*
 	 * When a seq_file is seeked, it's always traversed sequentially
@@ -4786,7 +4807,7 @@ static void *__cgroup_procs_start(struct seq_file *s, loff_t *pos,
 		it = kzalloc(sizeof(*it), GFP_KERNEL);
 		if (!it)
 			return ERR_PTR(-ENOMEM);
-		of->priv = it;
+		ctx->procs.it = it;
 		css_task_iter_start(&cgrp->self, iter_flags, it);
 	} else if (!(*pos)) {
 		css_task_iter_end(it);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/6] cgroup: Use open-time cgroup namespace for process migration perm checks
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
  2021-12-09 21:47 ` [PATCH 1/6] cgroup: Use open-time credentials for process migraton " Tejun Heo
  2021-12-09 21:47 ` [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  2021-12-09 21:47 ` [PATCH 4/6] selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 Tejun Heo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.

This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup-internal.h |  2 ++
 kernel/cgroup/cgroup.c          | 28 +++++++++++++++++++---------
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 8f681f14828c..eb0585245b07 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -66,6 +66,8 @@ static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc)
 }
 
 struct cgroup_file_ctx {
+	struct cgroup_namespace	*ns;
+
 	union {
 		struct {
 			struct css_task_iter	*it;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 2992eb7e8244..a3292558e96c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3822,14 +3822,19 @@ static int cgroup_file_open(struct kernfs_open_file *of)
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
+
+	ctx->ns = current->nsproxy->cgroup_ns;
+	get_cgroup_ns(ctx->ns);
 	of->priv = ctx;
 
 	if (!cft->open)
 		return 0;
 
 	ret = cft->open(of);
-	if (ret)
+	if (ret) {
+		put_cgroup_ns(ctx->ns);
 		kfree(ctx);
+	}
 	return ret;
 }
 
@@ -3840,13 +3845,14 @@ static void cgroup_file_release(struct kernfs_open_file *of)
 
 	if (cft->release)
 		cft->release(of);
+	put_cgroup_ns(ctx->ns);
 	kfree(ctx);
 }
 
 static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
 				 size_t nbytes, loff_t off)
 {
-	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
+	struct cgroup_file_ctx *ctx = of->priv;
 	struct cgroup *cgrp = of->kn->parent->priv;
 	struct cftype *cft = of_cft(of);
 	struct cgroup_subsys_state *css;
@@ -3863,7 +3869,7 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
 	 */
 	if ((cgrp->root->flags & CGRP_ROOT_NS_DELEGATE) &&
 	    !(cft->flags & CFTYPE_NS_DELEGATABLE) &&
-	    ns != &init_cgroup_ns && ns->root_cset->dfl_cgrp == cgrp)
+	    ctx->ns != &init_cgroup_ns && ctx->ns->root_cset->dfl_cgrp == cgrp)
 		return -EPERM;
 
 	if (cft->write)
@@ -4859,9 +4865,9 @@ static int cgroup_may_write(const struct cgroup *cgrp, struct super_block *sb)
 
 static int cgroup_procs_write_permission(struct cgroup *src_cgrp,
 					 struct cgroup *dst_cgrp,
-					 struct super_block *sb)
+					 struct super_block *sb,
+					 struct cgroup_namespace *ns)
 {
-	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup *com_cgrp = src_cgrp;
 	int ret;
 
@@ -4890,11 +4896,12 @@ static int cgroup_procs_write_permission(struct cgroup *src_cgrp,
 
 static int cgroup_attach_permissions(struct cgroup *src_cgrp,
 				     struct cgroup *dst_cgrp,
-				     struct super_block *sb, bool threadgroup)
+				     struct super_block *sb, bool threadgroup,
+				     struct cgroup_namespace *ns)
 {
 	int ret = 0;
 
-	ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp, sb);
+	ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp, sb, ns);
 	if (ret)
 		return ret;
 
@@ -4911,6 +4918,7 @@ static int cgroup_attach_permissions(struct cgroup *src_cgrp,
 static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
 				    bool threadgroup)
 {
+	struct cgroup_file_ctx *ctx = of->priv;
 	struct cgroup *src_cgrp, *dst_cgrp;
 	struct task_struct *task;
 	const struct cred *saved_cred;
@@ -4938,7 +4946,8 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
 	 */
 	saved_cred = override_creds(of->file->f_cred);
 	ret = cgroup_attach_permissions(src_cgrp, dst_cgrp,
-					of->file->f_path.dentry->d_sb, threadgroup);
+					of->file->f_path.dentry->d_sb,
+					threadgroup, ctx->ns);
 	revert_creds(saved_cred);
 	if (ret)
 		goto out_finish;
@@ -6158,7 +6167,8 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
 		goto err;
 
 	ret = cgroup_attach_permissions(cset->dfl_cgrp, dst_cgrp, sb,
-					!(kargs->flags & CLONE_THREAD));
+					!(kargs->flags & CLONE_THREAD),
+					current->nsproxy->cgroup_ns);
 	if (ret)
 		goto err;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
                   ` (2 preceding siblings ...)
  2021-12-09 21:47 ` [PATCH 3/6] cgroup: Use open-time cgroup namespace for process migration perm checks Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  2021-12-09 21:47 ` [PATCH 5/6] selftests: cgroup: Test open-time credential usage for migration checks Tejun Heo
  2021-12-09 21:47 ` [PATCH 6/6] selftests: cgroup: Test open-time cgroup namespace " Tejun Heo
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 tools/testing/selftests/cgroup/cgroup_util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index 623cec04ad42..0cf7e90c0052 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -221,7 +221,7 @@ int cg_find_unified_root(char *root, size_t len)
 
 int cg_create(const char *cgroup)
 {
-	return mkdir(cgroup, 0644);
+	return mkdir(cgroup, 0755);
 }
 
 int cg_wait_for_proc_count(const char *cgroup, int count)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/6] selftests: cgroup: Test open-time credential usage for migration checks
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
                   ` (3 preceding siblings ...)
  2021-12-09 21:47 ` [PATCH 4/6] selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  2021-12-09 21:47 ` [PATCH 6/6] selftests: cgroup: Test open-time cgroup namespace " Tejun Heo
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 tools/testing/selftests/cgroup/test_core.c | 68 ++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c
index 3df648c37876..01b766506973 100644
--- a/tools/testing/selftests/cgroup/test_core.c
+++ b/tools/testing/selftests/cgroup/test_core.c
@@ -674,6 +674,73 @@ static int test_cgcore_thread_migration(const char *root)
 	return ret;
 }
 
+/*
+ * cgroup migration permission check should be performed based on the
+ * credentials at the time of open instead of write.
+ */
+static int test_cgcore_lesser_euid_open(const char *root)
+{
+	const uid_t test_euid = 65534;	/* usually nobody, any !root is fine */
+	int ret = KSFT_FAIL;
+	char *cg_test_a = NULL, *cg_test_b = NULL;
+	char *cg_test_a_procs = NULL, *cg_test_b_procs = NULL;
+	int cg_test_b_procs_fd = -1;
+	uid_t saved_uid;
+
+	cg_test_a = cg_name(root, "cg_test_a");
+	cg_test_b = cg_name(root, "cg_test_b");
+
+	if (!cg_test_a || !cg_test_b)
+		goto cleanup;
+
+	cg_test_a_procs = cg_name(cg_test_a, "cgroup.procs");
+	cg_test_b_procs = cg_name(cg_test_b, "cgroup.procs");
+
+	if (!cg_test_a_procs || !cg_test_b_procs)
+		goto cleanup;
+
+	if (cg_create(cg_test_a) || cg_create(cg_test_b))
+		goto cleanup;
+
+	if (cg_enter_current(cg_test_a))
+		goto cleanup;
+
+	if (chown(cg_test_a_procs, test_euid, -1) ||
+	    chown(cg_test_b_procs, test_euid, -1))
+		goto cleanup;
+
+	saved_uid = geteuid();
+	if (seteuid(test_euid))
+		goto cleanup;
+
+	cg_test_b_procs_fd = open(cg_test_b_procs, O_RDWR);
+
+	if (seteuid(saved_uid))
+		goto cleanup;
+
+	if (cg_test_b_procs_fd < 0)
+		goto cleanup;
+
+	if (write(cg_test_b_procs_fd, "0", 1) >= 0 || errno != EACCES)
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	if (cg_test_b_procs_fd >= 0)
+		close(cg_test_b_procs_fd);
+	if (cg_test_b)
+		cg_destroy(cg_test_b);
+	if (cg_test_a)
+		cg_destroy(cg_test_a);
+	free(cg_test_b_procs);
+	free(cg_test_a_procs);
+	free(cg_test_b);
+	free(cg_test_a);
+	return ret;
+}
+
 #define T(x) { x, #x }
 struct corecg_test {
 	int (*fn)(const char *root);
@@ -689,6 +756,7 @@ struct corecg_test {
 	T(test_cgcore_proc_migration),
 	T(test_cgcore_thread_migration),
 	T(test_cgcore_destroy),
+	T(test_cgcore_lesser_euid_open),
 };
 #undef T
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] selftests: cgroup: Test open-time cgroup namespace usage for migration checks
  2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
                   ` (4 preceding siblings ...)
  2021-12-09 21:47 ` [PATCH 5/6] selftests: cgroup: Test open-time credential usage for migration checks Tejun Heo
@ 2021-12-09 21:47 ` Tejun Heo
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 21:47 UTC (permalink / raw)
  To: torvalds, ebiederm, mkoutny, axboe, keescook, oleg, peterz, tglx,
	jnewsome, legion, luto, jannh
  Cc: linux-kernel, security, kernel-team, Tejun Heo

When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 tools/testing/selftests/cgroup/test_core.c | 97 ++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c
index 01b766506973..600123503063 100644
--- a/tools/testing/selftests/cgroup/test_core.c
+++ b/tools/testing/selftests/cgroup/test_core.c
@@ -1,11 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#define _GNU_SOURCE
 #include <linux/limits.h>
+#include <linux/sched.h>
 #include <sys/types.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
 #include <unistd.h>
 #include <fcntl.h>
+#include <sched.h>
 #include <stdio.h>
 #include <errno.h>
 #include <signal.h>
@@ -741,6 +744,99 @@ static int test_cgcore_lesser_euid_open(const char *root)
 	return ret;
 }
 
+struct lesser_ns_open_thread_arg {
+	const char	*path;
+	int		fd;
+	int		err;
+};
+
+static int lesser_ns_open_thread_fn(void *arg)
+{
+	struct lesser_ns_open_thread_arg *targ = arg;
+
+	targ->fd = open(targ->path, O_RDWR);
+	targ->err = errno;
+	return 0;
+}
+
+/*
+ * cgroup migration permission check should be performed based on the cgroup
+ * namespace at the time of open instead of write.
+ */
+static int test_cgcore_lesser_ns_open(const char *root)
+{
+	static char stack[65536];
+	const uid_t test_euid = 65534;	/* usually nobody, any !root is fine */
+	int ret = KSFT_FAIL;
+	char *cg_test_a = NULL, *cg_test_b = NULL;
+	char *cg_test_a_procs = NULL, *cg_test_b_procs = NULL;
+	int cg_test_b_procs_fd = -1;
+	struct lesser_ns_open_thread_arg targ = { .fd = -1 };
+	pid_t pid;
+	int status;
+
+	cg_test_a = cg_name(root, "cg_test_a");
+	cg_test_b = cg_name(root, "cg_test_b");
+
+	if (!cg_test_a || !cg_test_b)
+		goto cleanup;
+
+	cg_test_a_procs = cg_name(cg_test_a, "cgroup.procs");
+	cg_test_b_procs = cg_name(cg_test_b, "cgroup.procs");
+
+	if (!cg_test_a_procs || !cg_test_b_procs)
+		goto cleanup;
+
+	if (cg_create(cg_test_a) || cg_create(cg_test_b))
+		goto cleanup;
+
+	if (cg_enter_current(cg_test_b))
+		goto cleanup;
+
+	if (chown(cg_test_a_procs, test_euid, -1) ||
+	    chown(cg_test_b_procs, test_euid, -1))
+		goto cleanup;
+
+	targ.path = cg_test_b_procs;
+	pid = clone(lesser_ns_open_thread_fn, stack + sizeof(stack),
+		    CLONE_NEWCGROUP | CLONE_FILES | CLONE_VM | SIGCHLD,
+		    &targ);
+	if (pid < 0)
+		goto cleanup;
+
+	if (waitpid(pid, &status, 0) < 0)
+		goto cleanup;
+
+	if (!WIFEXITED(status))
+		goto cleanup;
+
+	cg_test_b_procs_fd = targ.fd;
+	if (cg_test_b_procs_fd < 0)
+		goto cleanup;
+
+	if (cg_enter_current(cg_test_a))
+		goto cleanup;
+
+	if ((status = write(cg_test_b_procs_fd, "0", 1)) >= 0 || errno != ENOENT)
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	if (cg_test_b_procs_fd >= 0)
+		close(cg_test_b_procs_fd);
+	if (cg_test_b)
+		cg_destroy(cg_test_b);
+	if (cg_test_a)
+		cg_destroy(cg_test_a);
+	free(cg_test_b_procs);
+	free(cg_test_a_procs);
+	free(cg_test_b);
+	free(cg_test_a);
+	return ret;
+}
+
 #define T(x) { x, #x }
 struct corecg_test {
 	int (*fn)(const char *root);
@@ -757,6 +853,7 @@ struct corecg_test {
 	T(test_cgcore_thread_migration),
 	T(test_cgcore_destroy),
 	T(test_cgcore_lesser_euid_open),
+	T(test_cgcore_lesser_ns_open),
 };
 #undef T
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/6] cgroup: Use open-time credentials for process migraton perm checks
  2021-12-09 21:47 ` [PATCH 1/6] cgroup: Use open-time credentials for process migraton " Tejun Heo
@ 2021-12-10 17:41   ` Linus Torvalds
  0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2021-12-10 17:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

On Thu, Dec 9, 2021 at 1:47 PM Tejun Heo <tj@kernel.org> wrote:
>
> +       saved_cred = override_creds(of->file->f_cred);
>         ret = cgroup_attach_permissions(src_cgrp, dst_cgrp,
>                                         of->file->f_path.dentry->d_sb, threadgroup);
> +       revert_creds(saved_cred);

I'm not happy about adding another override_creds/revert_creds pair,
but looking at what that thing ends up doing I don't see a better
alternative - no sane way to pass in explicit creds due to the way
that "descend to common parent, use inode_permission" thing works.

So it's not pretty, and I don't love it, but I don't see a better
solution either.

            Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-09 21:47 ` [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv Tejun Heo
@ 2021-12-10 17:53   ` Linus Torvalds
  2021-12-10 18:38     ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2021-12-10 17:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

On Thu, Dec 9, 2021 at 1:47 PM Tejun Heo <tj@kernel.org> wrote:
>
> of->priv is currently used by each interface file implementation to store
> private information. This patch collects the current two private data usages
> into struct cgroup_file_ctx which is allocated and freed by the common path.
> This allows generic private data which applies to multiple files, which will
> be used to in the following patch.

I'm not sure if it's worth it having that union just to make the
struct be 8 bytes instead of 16 (and later 16 bytes instead of 24),
when the real cost is that dynamic allocation overhead, and there's
likely only one or two active actual allocations at a time.

IOW, I'm not convinced there's any real memory savings, and making it
a union means that now you have to be very careful about who does
what..

And yes, people historically had to be very careful _anyway_, because
the "union" was kind of implicit in how there was just a shared 'void
*priv' thing that was either that iterator pointer or that psi trigger
pointer.

So in that sense this is a semantically minimal patch, but I think
that practically speaking we'd be better off without that possible
source of confusion, and just always have that cgroup proc file have a
full structure.

In fact, I think it means we could just make the thing *contain* the
iterator, instead of having a pointer to an iterator, and getting rid
of the now redundant dynamic alloc/free of the iterator pointer.

Wouldn't that simplify things? And might there not be some cgroup
pressure user that also wants to use the iterator interfaces? Maybe
not, my point is more that once we have an explicit struct allocation
for cgroup proc files, we might as well clarify and simplify the
code..

           Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-10 17:53   ` Linus Torvalds
@ 2021-12-10 18:38     ` Tejun Heo
  2021-12-10 18:45       ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-12-10 18:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

Hello,

On Fri, Dec 10, 2021 at 09:53:41AM -0800, Linus Torvalds wrote:
> On Thu, Dec 9, 2021 at 1:47 PM Tejun Heo <tj@kernel.org> wrote:
> >
> > of->priv is currently used by each interface file implementation to store
> > private information. This patch collects the current two private data usages
> > into struct cgroup_file_ctx which is allocated and freed by the common path.
> > This allows generic private data which applies to multiple files, which will
> > be used to in the following patch.
> 
> I'm not sure if it's worth it having that union just to make the
> struct be 8 bytes instead of 16 (and later 16 bytes instead of 24),
> when the real cost is that dynamic allocation overhead, and there's
> likely only one or two active actual allocations at a time.

Yeah, I was initially doing ctx->procs_it and ->psi_trigger which looked
kinda silly so then nested structs and then why not the union which is kinda
logical in semantic sense.

> Wouldn't that simplify things? And might there not be some cgroup
> pressure user that also wants to use the iterator interfaces? Maybe
> not, my point is more that once we have an explicit struct allocation
> for cgroup proc files, we might as well clarify and simplify the
> code..

It's a bit of bikeshedding but I wanna explicitly denote who at currently
uses the fields, so how about nested structs w/ embedded iterator? If other
files ever want to share fields, we can shift those fields to the common
area together with ->ns.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-10 18:38     ` Tejun Heo
@ 2021-12-10 18:45       ` Linus Torvalds
  2021-12-10 19:06         ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2021-12-10 18:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

On Fri, Dec 10, 2021 at 10:38 AM Tejun Heo <tj@kernel.org> wrote:
>
> It's a bit of bikeshedding but I wanna explicitly denote who at currently
> uses the fields, so how about nested structs w/ embedded iterator?

I don't mind being careful about uses if it makes sense and it's
really statically obvious and never used outside that one file so you
have no non-local cases,,,

Yeah, you export the struct in an internal header file right now, but
I think that is just for the upcoming 'ns' member, not the union
members.

But if it then ever becomes a possible source of confusion and it's
not obvious from the context who uses what, I'd rather use the extra 8
bytes in the allocation.

Ok?

                   Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-10 18:45       ` Linus Torvalds
@ 2021-12-10 19:06         ` Tejun Heo
  2021-12-10 19:14           ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-12-10 19:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

On Fri, Dec 10, 2021 at 10:45:33AM -0800, Linus Torvalds wrote:
> But if it then ever becomes a possible source of confusion and it's
> not obvious from the context who uses what, I'd rather use the extra 8
> bytes in the allocation.
> 
> Ok?

Just so that I'm understanding you correctly. The following is what I was
suggesting. ie. just dropping the union and making the iterator embedded:

  struct cgroup_file_ctx {
          struct cgroup_namespace         ns;

          struct {
                  struct css_task_iter    it;
          } procs;

          struct {
                  void                    *trigger;
          } psi;
  };

and I was wondering whether you wanted something like the following:

  struct cgroup_file_ctx {
          struct cgroup_namespace         ns;
          struct css_task_iter            it;
          void                            *trigger;
  };

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
  2021-12-10 19:06         ` Tejun Heo
@ 2021-12-10 19:14           ` Linus Torvalds
  0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2021-12-10 19:14 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric W. Biederman, Michal Koutny, Jens Axboe, Kees Cook,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Jim Newsome,
	Alexey Gladkov, Andy Lutomirski, Jann Horn,
	Linux Kernel Mailing List, Security Officers, Kernel Team

On Fri, Dec 10, 2021 at 11:06 AM Tejun Heo <tj@kernel.org> wrote:
>
> Just so that I'm understanding you correctly. The following is what I was
> suggesting. ie. just dropping the union and making the iterator embedded:

Oh, I thought that by keeping them separate you meant still keeping the union.

But you just meant separating them into their own "namespace" sub-structures.

That's fine by me, I don't think it matters, and if you find the
namespacing helpful go right ahead.

             Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-12-10 19:15 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-09 21:47 [PATCHSET cgroup/for-5.16-fixes] cgroup: Use open-time creds and namespace for migration perm checks Tejun Heo
2021-12-09 21:47 ` [PATCH 1/6] cgroup: Use open-time credentials for process migraton " Tejun Heo
2021-12-10 17:41   ` Linus Torvalds
2021-12-09 21:47 ` [PATCH 2/6] cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv Tejun Heo
2021-12-10 17:53   ` Linus Torvalds
2021-12-10 18:38     ` Tejun Heo
2021-12-10 18:45       ` Linus Torvalds
2021-12-10 19:06         ` Tejun Heo
2021-12-10 19:14           ` Linus Torvalds
2021-12-09 21:47 ` [PATCH 3/6] cgroup: Use open-time cgroup namespace for process migration perm checks Tejun Heo
2021-12-09 21:47 ` [PATCH 4/6] selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 Tejun Heo
2021-12-09 21:47 ` [PATCH 5/6] selftests: cgroup: Test open-time credential usage for migration checks Tejun Heo
2021-12-09 21:47 ` [PATCH 6/6] selftests: cgroup: Test open-time cgroup namespace " Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.