From: "Michal Koutný" <mkoutny@suse.com>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Tejun Heo <tj@kernel.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Dave Chinner <dchinner@redhat.com>, Rik van Riel <riel@surriel.com>,
	Jiri Wiesner <jwiesner@suse.de>
Subject: [RFC PATCH 3/3] cgroup: Do not take css_set_lock in cgroup_show_path
Date: Tue, 2 May 2023 15:38:47 +0200
Message-ID: <20230502133847.14570-4-mkoutny@suse.com>
In-Reply-To: <20230502133847.14570-1-mkoutny@suse.com>

/proc/$pid/mountinfo may accumulate lots of entries (causing frequent
re-reads of the whole file) or lots of cgroupfs entries alone.
The cgroupfs entries rendered with cgroup_show_path() may increase, or
be subject to, css_set_lock contention, causing further slowdown -- not
only of mountinfo rendering but of any other css_set_lock user.

We leverage the fact that mountinfo reading happens with namespace_sem
taken and hierarchy roots thus cannot be freed concurrently.

There are three relevant nodes for each cgroupfs entry:

	R ... cgroup hierarchy root
	M ... mount root
	C ... reader's cgroup NS root

mountinfo is supposed to show the path from C to M.

Current's css_set (and linked root cgroups) are stable under
namespace_sem, therefore current_cgns_cgroup_from_root() doesn't need
css_set_lock.

When the path is assembled in kernfs_path_from_node(), we know that:
- C kernfs_node is (transitively) pinned via current->nsproxy,
- M kernfs_node is pinned thanks to namespace_sem,
- path C to M is pinned via child->parent references (this also holds
  when C and M are in distinct subtrees).

Streamline mountinfo rendering a bit by no longer taking css_set_lock
and add careful notes about that.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 32d693a797b9..e2ec6f0028be 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1407,12 +1407,18 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 	struct cgroup *res_cgroup = NULL;
 
 	if (cset == &init_css_set) {
+		/* callers must ensure root stability */
 		res_cgroup = &root->cgrp;
 	} else if (root == &cgrp_dfl_root) {
 		res_cgroup = cset->dfl_cgrp;
 	} else {
 		struct cgrp_cset_link *link;
-		lockdep_assert_held(&css_set_lock);
+		/* cset's cgroups are pinned unless they are root cgroups that
+		 * were unmounted.  We look at links to !cgrp_dfl_root
+		 * cgroup_root, either lock ensures the list is not mutated
+		 */
+		lockdep_assert(lockdep_is_held(&css_set_lock) ||
+			       lockdep_is_held_type(&namespace_sem, -1));
 
 		list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
 			struct cgroup *c = link->cgrp;
@@ -1438,8 +1444,6 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 	struct cgroup *res = NULL;
 	struct css_set *cset;
 
-	lockdep_assert_held(&css_set_lock);
-
 	/* namespace_sem ensures `root` stability on unmount */
 	lockdep_assert(lockdep_is_held_type(&namespace_sem, -1));
 
@@ -1905,10 +1909,8 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 	if (!buf)
 		return -ENOMEM;
 
-	spin_lock_irq(&css_set_lock);
 	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
 	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
-	spin_unlock_irq(&css_set_lock);
 
 	if (len >= PATH_MAX)
 		len = -ERANGE;
-- 
2.40.1
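
For readers who want the C-to-M walk from the commit message made concrete, here is a minimal userspace sketch, not kernel code. It assumes a simplified node with only a name and a parent pointer; struct node, common_ancestor(), append(), append_down() and rel_path() are all hypothetical names introduced for illustration, and the real kernfs_path_from_node() differs in its locking and buffer handling. The sketch only shows that rendering the path from C (`from`) to M (`to`) needs nothing beyond child->parent references, including when the two nodes sit in distinct subtrees.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernfs node: just a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;
};

static int depth(const struct node *n)
{
	int d = 0;

	while (n->parent) {
		n = n->parent;
		d++;
	}
	return d;
}

/* Deepest node that is an ancestor of (or equal to) both a and b.
 * Returns NULL if the nodes do not share a root. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	int da = depth(a), db = depth(b);

	while (da > db) { a = a->parent; da--; }
	while (db > da) { b = b->parent; db--; }
	while (a != b) { a = a->parent; b = b->parent; }
	return a;
}

/* Append s at offset len, truncating to fit; returns the untruncated length. */
static size_t append(char *buf, size_t size, size_t len, const char *s)
{
	size_t n = strlen(s);

	if (len < size) {
		size_t room = size - len - 1;
		size_t copy = n < room ? n : room;

		memcpy(buf + len, s, copy);
		buf[len + copy] = '\0';
	}
	return len + n;
}

/* Append "/name" for every node strictly below stop down to n, top first. */
static size_t append_down(char *buf, size_t size, size_t len,
			  const struct node *n, const struct node *stop)
{
	if (n == stop)
		return len;
	len = append_down(buf, size, len, n->parent, stop);
	len = append(buf, size, len, "/");
	return append(buf, size, len, n->name);
}

/* Render the path leading from `from` to `to`, using only parent links.
 * Returns the full length (>= size means truncated) or -1 on error. */
static int rel_path(const struct node *to, const struct node *from,
		    char *buf, size_t size)
{
	const struct node *common = common_ancestor(to, from);
	size_t len = 0;

	if (!common)
		return -1;
	if (size)
		buf[0] = '\0';
	/* One "/.." per level climbed from `from` up to the common ancestor. */
	for (const struct node *n = from; n != common; n = n->parent)
		len = append(buf, size, len, "/..");
	/* Then descend from the common ancestor to `to`. */
	len = append_down(buf, size, len, to, common);
	if (len == 0)
		len = append(buf, size, len, "/");
	return (int)len;
}
```

With a hierarchy root R that has children a and c, and a node b under a, calling rel_path(&b, &c, buf, sizeof(buf)) climbs one level from c and descends through a to b. The return value follows the same convention as the `len >= PATH_MAX` check in the patch: a result that does not fit is reported by length rather than by a separate error path.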