All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michal Koutný" <mkoutny@suse.com>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Tejun Heo <tj@kernel.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Dave Chinner <dchinner@redhat.com>,
	Rik van Riel <riel@surriel.com>, Jiri Wiesner <jwiesner@suse.de>
Subject: [RFC PATCH 3/3] cgroup: Do not take css_set_lock in cgroup_show_path
Date: Tue,  2 May 2023 15:38:47 +0200	[thread overview]
Message-ID: <20230502133847.14570-4-mkoutny@suse.com> (raw)
In-Reply-To: <20230502133847.14570-1-mkoutny@suse.com>

/proc/$pid/mountinfo may accumulate lots of entries (causing frequent
re-reads of whole file) or lots cgroupfs entries alone.
The cgroupfs entries rendered with cgroup_show_path() may increase/be
subject of css_set_lock contention causing further slowdown -- not only
mountinfo rendering but any other css_set_lock user.

We leverage the fact that mountinfo reading happens with namespace_sem
taken and hierarchy roots thus cannot be freed concurrently.

There are three relevant nodes for each cgroupfs entry:

        R ... cgroup hierarchy root
        M ... mount root
        C ... reader's cgroup NS root

mountinfo is supposed to show path from C to M.

Current's css_set (and linked root cgroups) are stable under
namespace_sem, therefore current_cgns_cgroup_from_root() doesn't need
css_set_lock.

When the path is assembled in kernfs_path_from_node(), we know that:
- C kernfs_node is (transitively) pinned via current->nsproxy,
- M kernfs_node is pinned thanks to namespace_sem,
- path C to M is pinned via child->parent references (this holds also
  when C and M are in distinct subtrees).

Streamline mountinfo rendering a bit by relieving css_set_lock and add
careful notes about that.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 32d693a797b9..e2ec6f0028be 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1407,12 +1407,18 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 	struct cgroup *res_cgroup = NULL;
 
 	if (cset == &init_css_set) {
+		/* callers must ensure root stability */
 		res_cgroup = &root->cgrp;
 	} else if (root == &cgrp_dfl_root) {
 		res_cgroup = cset->dfl_cgrp;
 	} else {
 		struct cgrp_cset_link *link;
-		lockdep_assert_held(&css_set_lock);
+		/* cset's cgroups are pinned unless they are root cgroups that
+		 * were unmounted.  We look at links to !cgrp_dfl_root
+		 * cgroup_root, either lock ensures the list is not mutated
+		 */
+		lockdep_assert(lockdep_is_held(&css_set_lock) ||
+			       lockdep_is_held_type(&namespace_sem, -1));
 
 		list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
 			struct cgroup *c = link->cgrp;
@@ -1438,8 +1444,6 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 	struct cgroup *res = NULL;
 	struct css_set *cset;
 
-	lockdep_assert_held(&css_set_lock);
-
 	/* namespace_sem ensures `root` stability on unmount */
 	lockdep_assert(lockdep_is_held_type(&namespace_sem, -1));
 
@@ -1905,10 +1909,8 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	if (!buf)
 		return -ENOMEM;
 
-	spin_lock_irq(&css_set_lock);
 	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
 	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
-	spin_unlock_irq(&css_set_lock);
 
 	if (len >= PATH_MAX)
 		len = -ERANGE;
-- 
2.40.1


WARNING: multiple messages have this Message-ID (diff)
From: "Michal Koutný" <mkoutny@suse.com>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Tejun Heo <tj@kernel.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Dave Chinner <dchinner@redhat.com>,
	Rik van Riel <riel@surriel.com>, Jiri Wiesner <jwiesner@suse.de>
Subject: [RFC PATCH 3/3] cgroup: Do not take css_set_lock in cgroup_show_path
Date: Tue,  2 May 2023 15:38:47 +0200	[thread overview]
Message-ID: <20230502133847.14570-4-mkoutny@suse.com> (raw)
In-Reply-To: <20230502133847.14570-1-mkoutny@suse.com>

/proc/$pid/mountinfo may accumulate lots of entries (causing frequent
re-reads of whole file) or lots cgroupfs entries alone.
The cgroupfs entries rendered with cgroup_show_path() may increase/be
subject of css_set_lock contention causing further slowdown -- not only
mountinfo rendering but any other css_set_lock user.

We leverage the fact that mountinfo reading happens with namespace_sem
taken and hierarchy roots thus cannot be freed concurrently.

There are three relevant nodes for each cgroupfs entry:

        R ... cgroup hierarchy root
        M ... mount root
        C ... reader's cgroup NS root

mountinfo is supposed to show path from C to M.

Current's css_set (and linked root cgroups) are stable under
namespace_sem, therefore current_cgns_cgroup_from_root() doesn't need
css_set_lock.

When the path is assembled in kernfs_path_from_node(), we know that:
- C kernfs_node is (transitively) pinned via current->nsproxy,
- M kernfs_node is pinned thanks to namespace_sem,
- path C to M is pinned via child->parent references (this holds also
  when C and M are in distinct subtrees).

Streamline mountinfo rendering a bit by relieving css_set_lock and add
careful notes about that.

Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 32d693a797b9..e2ec6f0028be 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1407,12 +1407,18 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 	struct cgroup *res_cgroup = NULL;
 
 	if (cset == &init_css_set) {
+		/* callers must ensure root stability */
 		res_cgroup = &root->cgrp;
 	} else if (root == &cgrp_dfl_root) {
 		res_cgroup = cset->dfl_cgrp;
 	} else {
 		struct cgrp_cset_link *link;
-		lockdep_assert_held(&css_set_lock);
+		/* cset's cgroups are pinned unless they are root cgroups that
+		 * were unmounted.  We look at links to !cgrp_dfl_root
+		 * cgroup_root, either lock ensures the list is not mutated
+		 */
+		lockdep_assert(lockdep_is_held(&css_set_lock) ||
+			       lockdep_is_held_type(&namespace_sem, -1));
 
 		list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
 			struct cgroup *c = link->cgrp;
@@ -1438,8 +1444,6 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 	struct cgroup *res = NULL;
 	struct css_set *cset;
 
-	lockdep_assert_held(&css_set_lock);
-
 	/* namespace_sem ensures `root` stability on unmount */
 	lockdep_assert(lockdep_is_held_type(&namespace_sem, -1));
 
@@ -1905,10 +1909,8 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	if (!buf)
 		return -ENOMEM;
 
-	spin_lock_irq(&css_set_lock);
 	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
 	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
-	spin_unlock_irq(&css_set_lock);
 
 	if (len >= PATH_MAX)
 		len = -ERANGE;
-- 
2.40.1


  parent reply	other threads:[~2023-05-02 13:40 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-02 13:38 [RFC PATCH 0/3] Rework locking when rendering mountinfo cgroup paths Michal Koutný
2023-05-02 13:38 ` Michal Koutný
2023-05-02 13:38 ` [RFC PATCH 1/3] cgroup: Drop unused function for cgroup_path Michal Koutný
2023-05-02 13:38   ` Michal Koutný
2023-05-02 19:58   ` Waiman Long
2023-05-02 13:38 ` [RFC PATCH 2/3] cgroup: Rely on namespace_sem in current_cgns_cgroup_from_root explicitly Michal Koutný
2023-05-02 13:38   ` Michal Koutný
2023-05-02 19:50   ` Waiman Long
2023-05-23 10:42   ` Christian Brauner
2023-05-23 10:42     ` Christian Brauner
2023-05-23 19:12     ` Tejun Heo
2023-05-02 13:38 ` Michal Koutný [this message]
2023-05-02 13:38   ` [RFC PATCH 3/3] cgroup: Do not take css_set_lock in cgroup_show_path Michal Koutný
2023-05-02 19:56   ` Waiman Long
2023-05-02 19:56     ` Waiman Long
2023-05-05 15:45   ` Tejun Heo
2023-05-05 17:32     ` Michal Koutný
2023-05-05 18:17       ` Tejun Heo
2023-05-09 10:34         ` Michal Koutný
2023-05-22 20:55           ` Tejun Heo
2023-05-23 12:09 ` [RFC PATCH 0/3] Rework locking when rendering mountinfo cgroup paths Christian Brauner
2023-05-23 12:09   ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230502133847.14570-4-mkoutny@suse.com \
    --to=mkoutny@suse.com \
    --cc=brauner@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=dchinner@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=jwiesner@suse.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=riel@surriel.com \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.