All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Honor cgroup namespace when resolving cgroup id
@ 2022-08-26 16:52 ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup id is becoming a new way for userspace how to refer to cgroups it
wants to act upon. As opposed to cgroupfs (paths, opened FDs), the
current approach does not reflect limited view by (non-init) cgroup
namespaces.

This patches don't aim to limit what a user can do (consider an uid=0 in
mere cgroup namespace) but to provide consistent view within a
namespace.

The series is based on bpf-next with the new cgroup_iter. I've only
boot-tested it (especially I didn't run the BPF selftest).

Michal Koutný (4):
  cgroup: Honor caller's cgroup NS when resolving path
  cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id
  cgroup: Homogenize cgroup_get_from_id() return value
  cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors

 block/blk-cgroup-fc-appid.c |  4 +--
 include/linux/cgroup.h      |  8 +++---
 kernel/bpf/cgroup_iter.c    |  9 ++++---
 kernel/cgroup/cgroup.c      | 53 ++++++++++++++++++++++++++++---------
 mm/memcontrol.c             |  4 +--
 5 files changed, 54 insertions(+), 24 deletions(-)


base-commit: 343949e10798a52c6d6a14effc962e010ed471ae
-- 
2.37.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 0/4] Honor cgroup namespace when resolving cgroup id
@ 2022-08-26 16:52 ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup id is becoming a new way for userspace how to refer to cgroups it
wants to act upon. As opposed to cgroupfs (paths, opened FDs), the
current approach does not reflect limited view by (non-init) cgroup
namespaces.

This patches don't aim to limit what a user can do (consider an uid=0 in
mere cgroup namespace) but to provide consistent view within a
namespace.

The series is based on bpf-next with the new cgroup_iter. I've only
boot-tested it (especially I didn't run the BPF selftest).

Michal Koutn√Ω (4):
  cgroup: Honor caller's cgroup NS when resolving path
  cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id
  cgroup: Homogenize cgroup_get_from_id() return value
  cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors

 block/blk-cgroup-fc-appid.c |  4 +--
 include/linux/cgroup.h      |  8 +++---
 kernel/bpf/cgroup_iter.c    |  9 ++++---
 kernel/cgroup/cgroup.c      | 53 ++++++++++++++++++++++++++++---------
 mm/memcontrol.c             |  4 +--
 5 files changed, 54 insertions(+), 24 deletions(-)


base-commit: 343949e10798a52c6d6a14effc962e010ed471ae
-- 
2.37.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] cgroup: Honor caller's cgroup NS when resolving path
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

cgroup_get_from_path() is not widely used function. Its callers presume
the path is resolved under cgroup namespace. (There is one caller
currently and resolving in init NS won't make harm (netfilter). However,
future users may be subject to different effects when resolving
globally.)
Since, there's currently no use for the global resolution, modify the
existing function to take cgroup NS into account.

Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 5f4502aa2b3b..1a8b50d15ebf 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6598,8 +6598,12 @@ struct cgroup *cgroup_get_from_path(const char *path)
 {
 	struct kernfs_node *kn;
 	struct cgroup *cgrp = ERR_PTR(-ENOENT);
+	struct cgroup *root_cgrp;
 
-	kn = kernfs_walk_and_get(cgrp_dfl_root.cgrp.kn, path);
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	kn = kernfs_walk_and_get(root_cgrp->kn, path);
+	spin_unlock_irq(&css_set_lock);
 	if (!kn)
 		goto out;
 
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/4] cgroup: Honor caller's cgroup NS when resolving path
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, bpf-u79uwXL29TY76Z2rM5mHXA
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

cgroup_get_from_path() is not widely used function. Its callers presume
the path is resolved under cgroup namespace. (There is one caller
currently and resolving in init NS won't make harm (netfilter). However,
future users may be subject to different effects when resolving
globally.)
Since, there's currently no use for the global resolution, modify the
existing function to take cgroup NS into account.

Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 kernel/cgroup/cgroup.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 5f4502aa2b3b..1a8b50d15ebf 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6598,8 +6598,12 @@ struct cgroup *cgroup_get_from_path(const char *path)
 {
 	struct kernfs_node *kn;
 	struct cgroup *cgrp = ERR_PTR(-ENOENT);
+	struct cgroup *root_cgrp;
 
-	kn = kernfs_walk_and_get(cgrp_dfl_root.cgrp.kn, path);
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	kn = kernfs_walk_and_get(root_cgrp->kn, path);
+	spin_unlock_irq(&css_set_lock);
 	if (!kn)
 		goto out;
 
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id
  2022-08-26 16:52 ` Michal Koutný
@ 2022-08-26 16:52   ` Michal Koutný
  -1 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup ids are resolved in the global scope. That may be needed sometime
(in future) but currently it violates virtual view provided through
cgroup namespaces.

There are currently following users of the resolution:
- fc_appid_store
- bpf_iter_attach_cgroup
- mem_cgroup_get_from_ino

None of the is a called on behalf of kernel but the resolution is made
with proper userspace context, hence the default to current->nsproxy
makes sens. (This doesn't rule out cgroup_get_from_id with cgroup NS
parameter in the future.)

Since cgroup ids are defined on v2 hierarchy only, we simply check
existence in the cgroup namespace by looking at ancestry on the default
hierarchy.

Fixes: 6b658c4863c1 ("scsi: cgroup: Add cgroup_get_from_id()")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1a8b50d15ebf..4ca90ee6b902 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6007,11 +6007,12 @@ void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
  * cgroup_get_from_id : get the cgroup associated with cgroup id
  * @id: cgroup id
  * On success return the cgrp, on failure return NULL
+ * Only cgroups within current task's cgroup NS are valid.
  */
 struct cgroup *cgroup_get_from_id(u64 id)
 {
 	struct kernfs_node *kn;
-	struct cgroup *cgrp = NULL;
+	struct cgroup *cgrp = NULL, *root_cgrp;
 
 	kn = kernfs_find_and_get_node_by_id(cgrp_dfl_root.kf_root, id);
 	if (!kn)
@@ -6024,8 +6025,18 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		cgrp = NULL;
 
 	rcu_read_unlock();
-
 	kernfs_put(kn);
+
+	if (!cgrp)
+		goto out;
+
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	spin_unlock_irq(&css_set_lock);
+	if (!cgroup_is_descendant(cgrp, root_cgrp)) {
+		cgroup_put(cgrp);
+		cgrp = NULL;
+	}
 out:
 	return cgrp;
 }
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup ids are resolved in the global scope. That may be needed sometime
(in future) but currently it violates virtual view provided through
cgroup namespaces.

There are currently following users of the resolution:
- fc_appid_store
- bpf_iter_attach_cgroup
- mem_cgroup_get_from_ino

None of the is a called on behalf of kernel but the resolution is made
with proper userspace context, hence the default to current->nsproxy
makes sens. (This doesn't rule out cgroup_get_from_id with cgroup NS
parameter in the future.)

Since cgroup ids are defined on v2 hierarchy only, we simply check
existence in the cgroup namespace by looking at ancestry on the default
hierarchy.

Fixes: 6b658c4863c1 ("scsi: cgroup: Add cgroup_get_from_id()")
Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1a8b50d15ebf..4ca90ee6b902 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6007,11 +6007,12 @@ void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
  * cgroup_get_from_id : get the cgroup associated with cgroup id
  * @id: cgroup id
  * On success return the cgrp, on failure return NULL
+ * Only cgroups within current task's cgroup NS are valid.
  */
 struct cgroup *cgroup_get_from_id(u64 id)
 {
 	struct kernfs_node *kn;
-	struct cgroup *cgrp = NULL;
+	struct cgroup *cgrp = NULL, *root_cgrp;
 
 	kn = kernfs_find_and_get_node_by_id(cgrp_dfl_root.kf_root, id);
 	if (!kn)
@@ -6024,8 +6025,18 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		cgrp = NULL;
 
 	rcu_read_unlock();
-
 	kernfs_put(kn);
+
+	if (!cgrp)
+		goto out;
+
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	spin_unlock_irq(&css_set_lock);
+	if (!cgroup_is_descendant(cgrp, root_cgrp)) {
+		cgroup_put(cgrp);
+		cgrp = NULL;
+	}
 out:
 	return cgrp;
 }
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] cgroup: Homogenize cgroup_get_from_id() return value
  2022-08-26 16:52 ` Michal Koutný
@ 2022-08-26 16:52   ` Michal Koutný
  -1 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup id is user provided datum hence extend its return domain to
include possible error reason (similar to cgroup_get_from_fd()).

This change also fixes commit d4ccaf58a847 ("bpf: Introduce cgroup
iter") that would use NULL instead of proper error handling in
d4ccaf58a847 ("bpf: Introduce cgroup iter").

Additionally, neither of: fc_appid_store, bpf_iter_attach_cgroup,
mem_cgroup_get_from_ino (callers of cgroup_get_from_fd) is built without
CONFIG_CGROUPS (depends via CONFIG_BLK_CGROUP, direct, transitive
CONFIG_MEMCG respectively) transitive, so drop the singular definition
not needed with !CONFIG_CGROUPS.

Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 block/blk-cgroup-fc-appid.c | 4 ++--
 include/linux/cgroup.h      | 5 -----
 kernel/cgroup/cgroup.c      | 4 ++--
 mm/memcontrol.c             | 4 ++--
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/block/blk-cgroup-fc-appid.c b/block/blk-cgroup-fc-appid.c
index 760a2e1878dd..842e5e1c0f3c 100644
--- a/block/blk-cgroup-fc-appid.c
+++ b/block/blk-cgroup-fc-appid.c
@@ -19,8 +19,8 @@ int blkcg_set_fc_appid(char *app_id, u64 cgrp_id, size_t app_id_len)
 		return -EINVAL;
 
 	cgrp = cgroup_get_from_id(cgrp_id);
-	if (!cgrp)
-		return -ENOENT;
+	if (IS_ERR(cgrp))
+		return PTR_ERR(cgrp);
 	css = cgroup_get_e_css(cgrp, &io_cgrp_subsys);
 	if (!css) {
 		ret = -ENOENT;
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index ed53bfe7c46c..b6a9528374a8 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -752,11 +752,6 @@ static inline bool task_under_cgroup_hierarchy(struct task_struct *task,
 
 static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
 {}
-
-static inline struct cgroup *cgroup_get_from_id(u64 id)
-{
-	return NULL;
-}
 #endif /* !CONFIG_CGROUPS */
 
 #ifdef CONFIG_CGROUPS
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 4ca90ee6b902..c0377726031f 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6006,7 +6006,7 @@ void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
 /*
  * cgroup_get_from_id : get the cgroup associated with cgroup id
  * @id: cgroup id
- * On success return the cgrp, on failure return NULL
+ * On success return the cgrp or ERR_PTR on failure
  * Only cgroups within current task's cgroup NS are valid.
  */
 struct cgroup *cgroup_get_from_id(u64 id)
@@ -6038,7 +6038,7 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		cgrp = NULL;
 	}
 out:
-	return cgrp;
+	return cgrp ?: ERR_PTR(-ENOENT);
 }
 EXPORT_SYMBOL_GPL(cgroup_get_from_id);
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c9ced5..86f5ca8c6fa6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5110,8 +5110,8 @@ struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
 	struct mem_cgroup *memcg;
 
 	cgrp = cgroup_get_from_id(ino);
-	if (!cgrp)
-		return ERR_PTR(-ENOENT);
+	if (IS_ERR(cgrp))
+		return PTR_ERR(cgrp);
 
 	css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
 	if (css)
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] cgroup: Homogenize cgroup_get_from_id() return value
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

Cgroup id is user provided datum hence extend its return domain to
include possible error reason (similar to cgroup_get_from_fd()).

This change also fixes commit d4ccaf58a847 ("bpf: Introduce cgroup
iter") that would use NULL instead of proper error handling in
d4ccaf58a847 ("bpf: Introduce cgroup iter").

Additionally, neither of: fc_appid_store, bpf_iter_attach_cgroup,
mem_cgroup_get_from_ino (callers of cgroup_get_from_fd) is built without
CONFIG_CGROUPS (depends via CONFIG_BLK_CGROUP, direct, transitive
CONFIG_MEMCG respectively) transitive, so drop the singular definition
not needed with !CONFIG_CGROUPS.

Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 block/blk-cgroup-fc-appid.c | 4 ++--
 include/linux/cgroup.h      | 5 -----
 kernel/cgroup/cgroup.c      | 4 ++--
 mm/memcontrol.c             | 4 ++--
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/block/blk-cgroup-fc-appid.c b/block/blk-cgroup-fc-appid.c
index 760a2e1878dd..842e5e1c0f3c 100644
--- a/block/blk-cgroup-fc-appid.c
+++ b/block/blk-cgroup-fc-appid.c
@@ -19,8 +19,8 @@ int blkcg_set_fc_appid(char *app_id, u64 cgrp_id, size_t app_id_len)
 		return -EINVAL;
 
 	cgrp = cgroup_get_from_id(cgrp_id);
-	if (!cgrp)
-		return -ENOENT;
+	if (IS_ERR(cgrp))
+		return PTR_ERR(cgrp);
 	css = cgroup_get_e_css(cgrp, &io_cgrp_subsys);
 	if (!css) {
 		ret = -ENOENT;
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index ed53bfe7c46c..b6a9528374a8 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -752,11 +752,6 @@ static inline bool task_under_cgroup_hierarchy(struct task_struct *task,
 
 static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
 {}
-
-static inline struct cgroup *cgroup_get_from_id(u64 id)
-{
-	return NULL;
-}
 #endif /* !CONFIG_CGROUPS */
 
 #ifdef CONFIG_CGROUPS
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 4ca90ee6b902..c0377726031f 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6006,7 +6006,7 @@ void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
 /*
  * cgroup_get_from_id : get the cgroup associated with cgroup id
  * @id: cgroup id
- * On success return the cgrp, on failure return NULL
+ * On success return the cgrp or ERR_PTR on failure
  * Only cgroups within current task's cgroup NS are valid.
  */
 struct cgroup *cgroup_get_from_id(u64 id)
@@ -6038,7 +6038,7 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		cgrp = NULL;
 	}
 out:
-	return cgrp;
+	return cgrp ?: ERR_PTR(-ENOENT);
 }
 EXPORT_SYMBOL_GPL(cgroup_get_from_id);
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c9ced5..86f5ca8c6fa6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5110,8 +5110,8 @@ struct mem_cgroup *mem_cgroup_get_from_ino(unsigned long ino)
 	struct mem_cgroup *memcg;
 
 	cgrp = cgroup_get_from_id(ino);
-	if (!cgrp)
-		return ERR_PTR(-ENOENT);
+	if (IS_ERR(cgrp))
+		return PTR_ERR(cgrp);
 
 	css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
 	if (css)
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel, cgroups, bpf
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

The iterator with BPF_CGROUP_ITER_ANCESTORS_UP can traverse up across a
cgroup namespace level, which may be surprising within a non-init cgroup
namespace.

Introduce and use a new cgroup_parent_ns() helper that stops according
to cgroup namespace boundary. With BPF_CGROUP_ITER_ANCESTORS_UP. We use
the cgroup namespace of the iterator caller, not that one of the creator
(might be different, the former is relevant).

Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 include/linux/cgroup.h   |  3 +++
 kernel/bpf/cgroup_iter.c |  9 ++++++---
 kernel/cgroup/cgroup.c   | 32 +++++++++++++++++++++++---------
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b6a9528374a8..b63a80e03fae 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -858,6 +858,9 @@ struct cgroup_namespace *copy_cgroup_ns(unsigned long flags,
 int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
 		   struct cgroup_namespace *ns);
 
+struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
+				struct cgroup_namespace *ns);
+
 #else /* !CONFIG_CGROUPS */
 
 static inline void free_cgroup_ns(struct cgroup_namespace *ns) { }
diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
index c69bce2f4403..06ee4a0c5870 100644
--- a/kernel/bpf/cgroup_iter.c
+++ b/kernel/bpf/cgroup_iter.c
@@ -104,6 +104,7 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
 	struct cgroup_subsys_state *curr = (struct cgroup_subsys_state *)v;
 	struct cgroup_iter_priv *p = seq->private;
+	struct cgroup *parent;
 
 	++*pos;
 	if (p->terminate)
@@ -113,9 +114,11 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 		return css_next_descendant_pre(curr, p->start_css);
 	else if (p->order == BPF_CGROUP_ITER_DESCENDANTS_POST)
 		return css_next_descendant_post(curr, p->start_css);
-	else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP)
-		return curr->parent;
-	else  /* BPF_CGROUP_ITER_SELF_ONLY */
+	else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP) {
+		parent = cgroup_parent_ns(curr->cgroup,
+					  current->nsproxy->cgroup_ns);
+		return parent ? &parent->self : NULL;
+	} else  /* BPF_CGROUP_ITER_SELF_ONLY */
 		return NULL;
 }
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c0377726031f..d60b5dfbbbc9 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1417,11 +1417,11 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 }
 
 /*
- * look up cgroup associated with current task's cgroup namespace on the
+ * look up cgroup associated with given cgroup namespace on the
  * specified hierarchy
  */
-static struct cgroup *
-current_cgns_cgroup_from_root(struct cgroup_root *root)
+static struct cgroup *cgns_cgroup_from_root(struct cgroup_root *root,
+					    struct cgroup_namespace *ns)
 {
 	struct cgroup *res = NULL;
 	struct css_set *cset;
@@ -1430,7 +1430,7 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 
 	rcu_read_lock();
 
-	cset = current->nsproxy->cgroup_ns->root_cset;
+	cset = ns->root_cset;
 	res = __cset_cgroup_from_root(cset, root);
 
 	rcu_read_unlock();
@@ -1852,15 +1852,15 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	int len = 0;
 	char *buf = NULL;
 	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
-	struct cgroup *ns_cgroup;
+	struct cgroup *root_cgroup;
 
 	buf = kmalloc(PATH_MAX, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
 	spin_lock_irq(&css_set_lock);
-	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
-	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
+	root_cgroup = cgns_cgroup_from_root(kf_cgroot, current->nsproxy->cgroup_ns);
+	len = kernfs_path_from_node(kf_node, root_cgroup->kn, buf, PATH_MAX);
 	spin_unlock_irq(&css_set_lock);
 
 	if (len >= PATH_MAX)
@@ -2330,6 +2330,18 @@ int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
 }
 EXPORT_SYMBOL_GPL(cgroup_path_ns);
 
+struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
+				   struct cgroup_namespace *ns)
+{
+	struct cgroup *root_cgrp;
+
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = cgns_cgroup_from_root(cgrp->root, ns);
+	spin_unlock_irq(&css_set_lock);
+
+	return cgrp == root_cgrp ? NULL : cgroup_parent(cgrp);
+}
+
 /**
  * task_cgroup_path - cgroup path of a task in the first cgroup hierarchy
  * @task: target task
@@ -6031,7 +6043,8 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		goto out;
 
 	spin_lock_irq(&css_set_lock);
-	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
+					  current->nsproxy->cgroup_ns);
 	spin_unlock_irq(&css_set_lock);
 	if (!cgroup_is_descendant(cgrp, root_cgrp)) {
 		cgroup_put(cgrp);
@@ -6612,7 +6625,8 @@ struct cgroup *cgroup_get_from_path(const char *path)
 	struct cgroup *root_cgrp;
 
 	spin_lock_irq(&css_set_lock);
-	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
+					  current->nsproxy->cgroup_ns);
 	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	spin_unlock_irq(&css_set_lock);
 	if (!kn)
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
@ 2022-08-26 16:52   ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-26 16:52 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, bpf-u79uwXL29TY76Z2rM5mHXA
  Cc: Tejun Heo, Aditya Kali, Serge Hallyn, Roman Gushchin,
	Yonghong Song, Muneendra Kumar, Yosry Ahmed, Hao Luo

The iterator with BPF_CGROUP_ITER_ANCESTORS_UP can traverse up across a
cgroup namespace level, which may be surprising within a non-init cgroup
namespace.

Introduce and use a new cgroup_parent_ns() helper that stops according
to cgroup namespace boundary. With BPF_CGROUP_ITER_ANCESTORS_UP. We use
the cgroup namespace of the iterator caller, not that one of the creator
(might be different, the former is relevant).

Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 include/linux/cgroup.h   |  3 +++
 kernel/bpf/cgroup_iter.c |  9 ++++++---
 kernel/cgroup/cgroup.c   | 32 +++++++++++++++++++++++---------
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b6a9528374a8..b63a80e03fae 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -858,6 +858,9 @@ struct cgroup_namespace *copy_cgroup_ns(unsigned long flags,
 int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
 		   struct cgroup_namespace *ns);
 
+struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
+				struct cgroup_namespace *ns);
+
 #else /* !CONFIG_CGROUPS */
 
 static inline void free_cgroup_ns(struct cgroup_namespace *ns) { }
diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
index c69bce2f4403..06ee4a0c5870 100644
--- a/kernel/bpf/cgroup_iter.c
+++ b/kernel/bpf/cgroup_iter.c
@@ -104,6 +104,7 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
 	struct cgroup_subsys_state *curr = (struct cgroup_subsys_state *)v;
 	struct cgroup_iter_priv *p = seq->private;
+	struct cgroup *parent;
 
 	++*pos;
 	if (p->terminate)
@@ -113,9 +114,11 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 		return css_next_descendant_pre(curr, p->start_css);
 	else if (p->order == BPF_CGROUP_ITER_DESCENDANTS_POST)
 		return css_next_descendant_post(curr, p->start_css);
-	else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP)
-		return curr->parent;
-	else  /* BPF_CGROUP_ITER_SELF_ONLY */
+	else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP) {
+		parent = cgroup_parent_ns(curr->cgroup,
+					  current->nsproxy->cgroup_ns);
+		return parent ? &parent->self : NULL;
+	} else  /* BPF_CGROUP_ITER_SELF_ONLY */
 		return NULL;
 }
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c0377726031f..d60b5dfbbbc9 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1417,11 +1417,11 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 }
 
 /*
- * look up cgroup associated with current task's cgroup namespace on the
+ * look up cgroup associated with given cgroup namespace on the
  * specified hierarchy
  */
-static struct cgroup *
-current_cgns_cgroup_from_root(struct cgroup_root *root)
+static struct cgroup *cgns_cgroup_from_root(struct cgroup_root *root,
+					    struct cgroup_namespace *ns)
 {
 	struct cgroup *res = NULL;
 	struct css_set *cset;
@@ -1430,7 +1430,7 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 
 	rcu_read_lock();
 
-	cset = current->nsproxy->cgroup_ns->root_cset;
+	cset = ns->root_cset;
 	res = __cset_cgroup_from_root(cset, root);
 
 	rcu_read_unlock();
@@ -1852,15 +1852,15 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	int len = 0;
 	char *buf = NULL;
 	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
-	struct cgroup *ns_cgroup;
+	struct cgroup *root_cgroup;
 
 	buf = kmalloc(PATH_MAX, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
 	spin_lock_irq(&css_set_lock);
-	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
-	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
+	root_cgroup = cgns_cgroup_from_root(kf_cgroot, current->nsproxy->cgroup_ns);
+	len = kernfs_path_from_node(kf_node, root_cgroup->kn, buf, PATH_MAX);
 	spin_unlock_irq(&css_set_lock);
 
 	if (len >= PATH_MAX)
@@ -2330,6 +2330,18 @@ int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
 }
 EXPORT_SYMBOL_GPL(cgroup_path_ns);
 
+struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
+				   struct cgroup_namespace *ns)
+{
+	struct cgroup *root_cgrp;
+
+	spin_lock_irq(&css_set_lock);
+	root_cgrp = cgns_cgroup_from_root(cgrp->root, ns);
+	spin_unlock_irq(&css_set_lock);
+
+	return cgrp == root_cgrp ? NULL : cgroup_parent(cgrp);
+}
+
 /**
  * task_cgroup_path - cgroup path of a task in the first cgroup hierarchy
  * @task: target task
@@ -6031,7 +6043,8 @@ struct cgroup *cgroup_get_from_id(u64 id)
 		goto out;
 
 	spin_lock_irq(&css_set_lock);
-	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
+					  current->nsproxy->cgroup_ns);
 	spin_unlock_irq(&css_set_lock);
 	if (!cgroup_is_descendant(cgrp, root_cgrp)) {
 		cgroup_put(cgrp);
@@ -6612,7 +6625,8 @@ struct cgroup *cgroup_get_from_path(const char *path)
 	struct cgroup *root_cgrp;
 
 	spin_lock_irq(&css_set_lock);
-	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
+	root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
+					  current->nsproxy->cgroup_ns);
 	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	spin_unlock_irq(&css_set_lock);
 	if (!kn)
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
  2022-08-26 16:52   ` Michal Koutný
  (?)
@ 2022-08-26 17:41   ` Yosry Ahmed
  2022-08-29 12:59       ` Michal Koutný
  -1 siblings, 1 reply; 19+ messages in thread
From: Yosry Ahmed @ 2022-08-26 17:41 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Linux Kernel Mailing List, Cgroups, bpf, Tejun Heo, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

Hi there!

Thanks for following up with this series!

On Fri, Aug 26, 2022 at 9:53 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> The iterator with BPF_CGROUP_ITER_ANCESTORS_UP can traverse up across a
> cgroup namespace level, which may be surprising within a non-init cgroup
> namespace.
>
> Introduce and use a new cgroup_parent_ns() helper that stops according
> to cgroup namespace boundary. With BPF_CGROUP_ITER_ANCESTORS_UP. We use
> the cgroup namespace of the iterator caller, not that one of the creator
> (might be different, the former is relevant).
>
> Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  include/linux/cgroup.h   |  3 +++
>  kernel/bpf/cgroup_iter.c |  9 ++++++---
>  kernel/cgroup/cgroup.c   | 32 +++++++++++++++++++++++---------
>  3 files changed, 32 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index b6a9528374a8..b63a80e03fae 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -858,6 +858,9 @@ struct cgroup_namespace *copy_cgroup_ns(unsigned long flags,
>  int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
>                    struct cgroup_namespace *ns);
>
> +struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
> +                               struct cgroup_namespace *ns);
> +
>  #else /* !CONFIG_CGROUPS */
>
>  static inline void free_cgroup_ns(struct cgroup_namespace *ns) { }
> diff --git a/kernel/bpf/cgroup_iter.c b/kernel/bpf/cgroup_iter.c
> index c69bce2f4403..06ee4a0c5870 100644
> --- a/kernel/bpf/cgroup_iter.c
> +++ b/kernel/bpf/cgroup_iter.c
> @@ -104,6 +104,7 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
>  {
>         struct cgroup_subsys_state *curr = (struct cgroup_subsys_state *)v;
>         struct cgroup_iter_priv *p = seq->private;
> +       struct cgroup *parent;
>
>         ++*pos;
>         if (p->terminate)
> @@ -113,9 +114,11 @@ static void *cgroup_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
>                 return css_next_descendant_pre(curr, p->start_css);
>         else if (p->order == BPF_CGROUP_ITER_DESCENDANTS_POST)
>                 return css_next_descendant_post(curr, p->start_css);
> -       else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP)
> -               return curr->parent;
> -       else  /* BPF_CGROUP_ITER_SELF_ONLY */
> +       else if (p->order == BPF_CGROUP_ITER_ANCESTORS_UP) {
> +               parent = cgroup_parent_ns(curr->cgroup,
> +                                         current->nsproxy->cgroup_ns);
> +               return parent ? &parent->self : NULL;
> +       } else  /* BPF_CGROUP_ITER_SELF_ONLY */
>                 return NULL;
>  }
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index c0377726031f..d60b5dfbbbc9 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -1417,11 +1417,11 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
>  }
>
>  /*
> - * look up cgroup associated with current task's cgroup namespace on the
> + * look up cgroup associated with given cgroup namespace on the
>   * specified hierarchy
>   */
> -static struct cgroup *
> -current_cgns_cgroup_from_root(struct cgroup_root *root)
> +static struct cgroup *cgns_cgroup_from_root(struct cgroup_root *root,
> +                                           struct cgroup_namespace *ns)
>  {
>         struct cgroup *res = NULL;
>         struct css_set *cset;
> @@ -1430,7 +1430,7 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
>
>         rcu_read_lock();
>
> -       cset = current->nsproxy->cgroup_ns->root_cset;
> +       cset = ns->root_cset;
>         res = __cset_cgroup_from_root(cset, root);
>
>         rcu_read_unlock();
> @@ -1852,15 +1852,15 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
>         int len = 0;
>         char *buf = NULL;
>         struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> -       struct cgroup *ns_cgroup;
> +       struct cgroup *root_cgroup;
>
>         buf = kmalloc(PATH_MAX, GFP_KERNEL);
>         if (!buf)
>                 return -ENOMEM;
>
>         spin_lock_irq(&css_set_lock);
> -       ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
> -       len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
> +       root_cgroup = cgns_cgroup_from_root(kf_cgroot, current->nsproxy->cgroup_ns);
> +       len = kernfs_path_from_node(kf_node, root_cgroup->kn, buf, PATH_MAX);
>         spin_unlock_irq(&css_set_lock);
>
>         if (len >= PATH_MAX)
> @@ -2330,6 +2330,18 @@ int cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen,
>  }
>  EXPORT_SYMBOL_GPL(cgroup_path_ns);
>
> +struct cgroup *cgroup_parent_ns(struct cgroup *cgrp,
> +                                  struct cgroup_namespace *ns)
> +{
> +       struct cgroup *root_cgrp;
> +
> +       spin_lock_irq(&css_set_lock);
> +       root_cgrp = cgns_cgroup_from_root(cgrp->root, ns);
> +       spin_unlock_irq(&css_set_lock);
> +
> +       return cgrp == root_cgrp ? NULL : cgroup_parent(cgrp);

I understand that currently cgroup_iter is the only user of this, but
for future use cases, is it safe to assume that cgrp will always be
inside ns? Would it be safer to do something like:

struct cgroup *parent = cgroup_parent(cgrp);

if (!parent)
    return NULL;

return cgroup_is_descendant(parent, root_cgrp) ? parent : NULL;


> +}
> +
>  /**
>   * task_cgroup_path - cgroup path of a task in the first cgroup hierarchy
>   * @task: target task
> @@ -6031,7 +6043,8 @@ struct cgroup *cgroup_get_from_id(u64 id)
>                 goto out;
>
>         spin_lock_irq(&css_set_lock);
> -       root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
> +       root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
> +                                         current->nsproxy->cgroup_ns);
>         spin_unlock_irq(&css_set_lock);
>         if (!cgroup_is_descendant(cgrp, root_cgrp)) {
>                 cgroup_put(cgrp);
> @@ -6612,7 +6625,8 @@ struct cgroup *cgroup_get_from_path(const char *path)
>         struct cgroup *root_cgrp;
>
>         spin_lock_irq(&css_set_lock);
> -       root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
> +       root_cgrp = cgns_cgroup_from_root(&cgrp_dfl_root,
> +                                         current->nsproxy->cgroup_ns);
>         kn = kernfs_walk_and_get(root_cgrp->kn, path);
>         spin_unlock_irq(&css_set_lock);
>         if (!kn)
> --
> 2.37.0
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] Honor cgroup namespace when resolving cgroup id
  2022-08-26 16:52 ` Michal Koutný
                   ` (4 preceding siblings ...)
  (?)
@ 2022-08-26 20:59 ` Tejun Heo
  -1 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-08-26 20:59 UTC (permalink / raw)
  To: Michal Koutný
  Cc: linux-kernel, cgroups, bpf, Aditya Kali, Serge Hallyn,
	Roman Gushchin, Yonghong Song, Muneendra Kumar, Yosry Ahmed,
	Hao Luo

On Fri, Aug 26, 2022 at 06:52:34PM +0200, Michal Koutný wrote:
> Cgroup id is becoming a new way for userspace how to refer to cgroups it
> wants to act upon. As opposed to cgroupfs (paths, opened FDs), the
> current approach does not reflect limited view by (non-init) cgroup
> namespaces.
> 
> This patches don't aim to limit what a user can do (consider an uid=0 in
> mere cgroup namespace) but to provide consistent view within a
> namespace.

Applied 1-3 to cgroup/for-6.1. The branch will be stable, so please feel
free to pull from bpf/for-next.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] Honor cgroup namespace when resolving cgroup id
  2022-08-26 16:52 ` Michal Koutný
                   ` (5 preceding siblings ...)
  (?)
@ 2022-08-26 21:08 ` Tejun Heo
  -1 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-08-26 21:08 UTC (permalink / raw)
  To: Michal Koutný
  Cc: linux-kernel, cgroups, bpf, Aditya Kali, Serge Hallyn,
	Roman Gushchin, Yonghong Song, Muneendra Kumar, Yosry Ahmed,
	Hao Luo

Hello,

On Fri, Aug 26, 2022 at 06:52:34PM +0200, Michal Koutný wrote:
> Cgroup id is becoming a new way for userspace how to refer to cgroups it
> wants to act upon. As opposed to cgroupfs (paths, opened FDs), the
> current approach does not reflect limited view by (non-init) cgroup
> namespaces.

Looking at the code, I'm not quite sure we're actually plugging all holes in
terms of lookup. I think cgroup_get_from_path() would allow walking up past
the ns boundary. We aren't using kernfs ns support and I don't see anything
preventing ..'ing past the boundary.

> This patches don't aim to limit what a user can do (consider an uid=0 in
> mere cgroup namespace) but to provide consistent view within a
> namespace.

Considering userns and the fact that we try to isolate two separate sub
hierarchies delegated to the same UID, I think we'd have to tighten down on
the behaviors so that visiblity scope matches the permission scope.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
@ 2022-08-29 12:59       ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-29 12:59 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Linux Kernel Mailing List, Cgroups, bpf, Tejun Heo, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

[-- Attachment #1: Type: text/plain, Size: 918 bytes --]

On Fri, Aug 26, 2022 at 10:41:37AM -0700, Yosry Ahmed <yosryahmed@google.com> wrote:
> I understand that currently cgroup_iter is the only user of this, but
> for future use cases, is it safe to assume that cgrp will always be
> inside ns? Would it be safer to do something like:

I preferred the simpler root_cgrp comparison to avoid pointer
arithmetics in cgroup_is_descendant. But I also made the assumption of
cgrp in ns.

Thanks, I'll likely adjust cgroup_path_ns to make it more robust for
an external cgrp.


I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
cgroup iterator, exposes it via bpffs and than a process B in a narrowed
cgroup ns (which excludes the origin cgroup) wants to traverse the
iterator, should it fail straight ahead (regardless of iter order)?
The alternative would be to allow self-dereference but prohibit any
iterator moves (regardless of order).


Thanks,
Michal

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
@ 2022-08-29 12:59       ` Michal Koutný
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Koutný @ 2022-08-29 12:59 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Linux Kernel Mailing List, Cgroups, bpf, Tejun Heo, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

[-- Attachment #1: Type: text/plain, Size: 947 bytes --]

On Fri, Aug 26, 2022 at 10:41:37AM -0700, Yosry Ahmed <yosryahmed-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> I understand that currently cgroup_iter is the only user of this, but
> for future use cases, is it safe to assume that cgrp will always be
> inside ns? Would it be safer to do something like:

I preferred the simpler root_cgrp comparison to avoid pointer
arithmetics in cgroup_is_descendant. But I also made the assumption of
cgrp in ns.

Thanks, I'll likely adjust cgroup_path_ns to make it more robust for
an external cgrp.


I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
cgroup iterator, exposes it via bpffs and than a process B in a narrowed
cgroup ns (which excludes the origin cgroup) wants to traverse the
iterator, should it fail straight ahead (regardless of iter order)?
The alternative would be to allow self-dereference but prohibit any
iterator moves (regardless of order).


Thanks,
Michal

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
  2022-08-29 12:59       ` Michal Koutný
@ 2022-08-29 17:30         ` Yosry Ahmed
  -1 siblings, 0 replies; 19+ messages in thread
From: Yosry Ahmed @ 2022-08-29 17:30 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Linux Kernel Mailing List, Cgroups, bpf, Tejun Heo, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

On Mon, Aug 29, 2022 at 6:00 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Fri, Aug 26, 2022 at 10:41:37AM -0700, Yosry Ahmed <yosryahmed@google.com> wrote:
> > I understand that currently cgroup_iter is the only user of this, but
> > for future use cases, is it safe to assume that cgrp will always be
> > inside ns? Would it be safer to do something like:
>
> I preferred the simpler root_cgrp comparison to avoid pointer
> arithmetics in cgroup_is_descendant. But I also made the assumption of
> cgrp in ns.
>
> Thanks, I'll likely adjust cgroup_path_ns to make it more robust for
> an external cgrp.
>

Great, thanks!

>
> I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
> cgroup iterator, exposes it via bpffs and than a process B in a narrowed
> cgroup ns (which excludes the origin cgroup) wants to traverse the
> iterator, should it fail straight ahead (regardless of iter order)?
> The alternative would be to allow self-dereference but prohibit any
> iterator moves (regardless of order).
>

imo it should fail straight ahead, but maybe others (Tejun? Hao?) have
other opinions here.

>
> Thanks,
> Michal

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
@ 2022-08-29 17:30         ` Yosry Ahmed
  0 siblings, 0 replies; 19+ messages in thread
From: Yosry Ahmed @ 2022-08-29 17:30 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Linux Kernel Mailing List, Cgroups, bpf, Tejun Heo, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

On Mon, Aug 29, 2022 at 6:00 AM Michal Koutn√Ω <mkoutny@suse.com> wrote:
>
> On Fri, Aug 26, 2022 at 10:41:37AM -0700, Yosry Ahmed <yosryahmed@google.com> wrote:
> > I understand that currently cgroup_iter is the only user of this, but
> > for future use cases, is it safe to assume that cgrp will always be
> > inside ns? Would it be safer to do something like:
>
> I preferred the simpler root_cgrp comparison to avoid pointer
> arithmetics in cgroup_is_descendant. But I also made the assumption of
> cgrp in ns.
>
> Thanks, I'll likely adjust cgroup_path_ns to make it more robust for
> an external cgrp.
>

Great, thanks!

>
> I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
> cgroup iterator, exposes it via bpffs and than a process B in a narrowed
> cgroup ns (which excludes the origin cgroup) wants to traverse the
> iterator, should it fail straight ahead (regardless of iter order)?
> The alternative would be to allow self-dereference but prohibit any
> iterator moves (regardless of order).
>

imo it should fail straight ahead, but maybe others (Tejun? Hao?) have
other opinions here.

>
> Thanks,
> Michal

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
  2022-08-29 17:30         ` Yosry Ahmed
  (?)
@ 2022-08-29 17:49         ` Tejun Heo
  2022-08-29 18:02           ` Hao Luo
  -1 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2022-08-29 17:49 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Michal Koutný,
	Linux Kernel Mailing List, Cgroups, bpf, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar,
	Hao Luo

On Mon, Aug 29, 2022 at 10:30:45AM -0700, Yosry Ahmed wrote:
> > I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
> > cgroup iterator, exposes it via bpffs and than a process B in a narrowed
> > cgroup ns (which excludes the origin cgroup) wants to traverse the
> > iterator, should it fail straight ahead (regardless of iter order)?
> > The alternative would be to allow self-dereference but prohibit any
> > iterator moves (regardless of order).
> >
> 
> imo it should fail straight ahead, but maybe others (Tejun? Hao?) have
> other opinions here.

Yeah, I'd prefer it to fail right away as that's simple and gives us the
most choices for the future.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors
  2022-08-29 17:49         ` Tejun Heo
@ 2022-08-29 18:02           ` Hao Luo
  0 siblings, 0 replies; 19+ messages in thread
From: Hao Luo @ 2022-08-29 18:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Yosry Ahmed, Michal Koutný,
	Linux Kernel Mailing List, Cgroups, bpf, Aditya Kali,
	Serge Hallyn, Roman Gushchin, Yonghong Song, Muneendra Kumar

On Mon, Aug 29, 2022 at 10:49 AM Tejun Heo <tj@kernel.org> wrote:
>
> On Mon, Aug 29, 2022 at 10:30:45AM -0700, Yosry Ahmed wrote:
> > > I'd like to clarify, if a process A in a broad cgroup ns sets up a BPF
> > > cgroup iterator, exposes it via bpffs and than a process B in a narrowed
> > > cgroup ns (which excludes the origin cgroup) wants to traverse the
> > > iterator, should it fail straight ahead (regardless of iter order)?
> > > The alternative would be to allow self-dereference but prohibit any
> > > iterator moves (regardless of order).
> > >
> >
> > imo it should fail straight ahead, but maybe others (Tejun? Hao?) have
> > other opinions here.
>
> Yeah, I'd prefer it to fail right away as that's simple and gives us the
> most choices for the future.
>

Thanks Michal for fixing the cgroup iter use case! I agree that
failing straight ahead is better. I don't envision a use case that
wants the alternative.

> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-08-29 18:02 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-26 16:52 [PATCH 0/4] Honor cgroup namespace when resolving cgroup id Michal Koutný
2022-08-26 16:52 ` Michal Koutný
2022-08-26 16:52 ` [PATCH 1/4] cgroup: Honor caller's cgroup NS when resolving path Michal Koutný
2022-08-26 16:52   ` Michal Koutný
2022-08-26 16:52 ` [PATCH 2/4] cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id Michal Koutný
2022-08-26 16:52   ` Michal Koutný
2022-08-26 16:52 ` [PATCH 3/4] cgroup: Homogenize cgroup_get_from_id() return value Michal Koutný
2022-08-26 16:52   ` Michal Koutný
2022-08-26 16:52 ` [PATCH 4/4] cgroup/bpf: Honor cgroup NS in cgroup_iter for ancestors Michal Koutný
2022-08-26 16:52   ` Michal Koutný
2022-08-26 17:41   ` Yosry Ahmed
2022-08-29 12:59     ` Michal Koutný
2022-08-29 12:59       ` Michal Koutný
2022-08-29 17:30       ` Yosry Ahmed
2022-08-29 17:30         ` Yosry Ahmed
2022-08-29 17:49         ` Tejun Heo
2022-08-29 18:02           ` Hao Luo
2022-08-26 20:59 ` [PATCH 0/4] Honor cgroup namespace when resolving cgroup id Tejun Heo
2022-08-26 21:08 ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.