All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <hawk@kernel.org>
To: tj@kernel.org, hannes@cmpxchg.org, lizefan.x@bytedance.com,
	cgroups@vger.kernel.org, yosryahmed@google.com,
	longman@redhat.com
Cc: Jesper Dangaard Brouer <hawk@kernel.org>,
	netdev@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, shakeel.butt@linux.dev,
	kernel-team@cloudflare.com, linux-kernel@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	mhocko@kernel.org
Subject: [PATCH v1 2/3] cgroup/rstat: convert cgroup_rstat_lock back to mutex
Date: Tue, 16 Apr 2024 19:51:33 +0200	[thread overview]
Message-ID: <171328989335.3930751.3091577850420501533.stgit@firesoul> (raw)
In-Reply-To: <171328983017.3930751.9484082608778623495.stgit@firesoul>

Since kernel v4.18, cgroup_rstat_lock has been an IRQ-disabling spinlock,
as introduced by commit 0fa294fb1985 ("cgroup: Replace cgroup_rstat_mutex
with a spinlock").

Despite efforts in cgroup_rstat_flush_locked() to yield the lock when
necessary during the collection of per-CPU stats, this approach has led
to several scaling issues observed in production environments. Holding
this IRQ lock has caused starvation of other critical kernel functions,
such as softirq (e.g., timers and netstack). Although kernel v6.8
introduced optimizations in this area, we continue to observe instances
where the spin_lock is held for 64-128 ms in production.

This patch converts cgroup_rstat_lock back to being a mutex lock. This
change is made possible thanks to the significant effort by Yosry Ahmed
to eliminate all atomic context use-cases through multiple commits,
ending in 0a2dc6ac3329 ("cgroup: removecgroup_rstat_flush_atomic()"),
included in kernel v6.5.

After this patch lock contention will be less obvious, as converting this
to a mutex avoids multiple CPUs spinning while waiting for the lock, but
it doesn't remove the lock contention. It is recommended to use the
tracepoints to diagnose this.

Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
 kernel/cgroup/rstat.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index ff68c904e647..a90d68a7c27f 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -9,7 +9,7 @@
 
 #include <trace/events/cgroup.h>
 
-static DEFINE_SPINLOCK(cgroup_rstat_lock);
+static DEFINE_MUTEX(cgroup_rstat_lock);
 static DEFINE_PER_CPU(raw_spinlock_t, cgroup_rstat_cpu_lock);
 
 static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu);
@@ -238,10 +238,10 @@ static inline void __cgroup_rstat_lock(struct cgroup *cgrp, int cpu_in_loop)
 {
 	bool contended;
 
-	contended = !spin_trylock_irq(&cgroup_rstat_lock);
+	contended = !mutex_trylock(&cgroup_rstat_lock);
 	if (contended) {
 		trace_cgroup_rstat_lock_contended(cgrp, cpu_in_loop, contended);
-		spin_lock_irq(&cgroup_rstat_lock);
+		mutex_lock(&cgroup_rstat_lock);
 	}
 	trace_cgroup_rstat_locked(cgrp, cpu_in_loop, contended);
 }
@@ -250,7 +250,7 @@ static inline void __cgroup_rstat_unlock(struct cgroup *cgrp, int cpu_in_loop)
 	__releases(&cgroup_rstat_lock)
 {
 	trace_cgroup_rstat_unlock(cgrp, cpu_in_loop, false);
-	spin_unlock_irq(&cgroup_rstat_lock);
+	mutex_unlock(&cgroup_rstat_lock);
 }
 
 /* see cgroup_rstat_flush() */
@@ -278,7 +278,7 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp)
 		}
 
 		/* play nice and yield if necessary */
-		if (need_resched() || spin_needbreak(&cgroup_rstat_lock)) {
+		if (need_resched()) {
 			__cgroup_rstat_unlock(cgrp, cpu);
 			if (!cond_resched())
 				cpu_relax();



  parent reply	other threads:[~2024-04-16 17:51 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-16 17:51 [PATCH v1 0/3] cgroup/rstat: global cgroup_rstat_lock changes Jesper Dangaard Brouer
2024-04-16 17:51 ` [PATCH v1 1/3] cgroup/rstat: add cgroup_rstat_lock helpers and tracepoints Jesper Dangaard Brouer
2024-04-16 21:36   ` Tejun Heo
2024-04-18  8:00     ` Jesper Dangaard Brouer
2024-04-23 16:53   ` Simon Horman
2024-04-29 11:36     ` Jesper Dangaard Brouer
2024-04-29 17:48       ` Simon Horman
2024-04-16 17:51 ` Jesper Dangaard Brouer [this message]
2024-04-18  2:19   ` [PATCH v1 2/3] cgroup/rstat: convert cgroup_rstat_lock back to mutex Yosry Ahmed
2024-04-18  9:02     ` Jesper Dangaard Brouer
2024-04-18 14:49       ` Shakeel Butt
2024-04-18 20:39         ` Yosry Ahmed
2024-04-19 13:15           ` Jesper Dangaard Brouer
2024-04-19 16:11             ` Shakeel Butt
2024-04-19 19:21               ` Yosry Ahmed
2024-04-18 20:38       ` Yosry Ahmed
2024-04-16 17:51 ` [PATCH v1 3/3] cgroup/rstat: introduce ratelimited rstat flushing Jesper Dangaard Brouer
2024-04-18  2:21   ` Yosry Ahmed
2024-04-18 11:00     ` Jesper Dangaard Brouer
2024-04-18 15:49       ` Shakeel Butt
2024-04-18 21:00       ` Yosry Ahmed
2024-04-18 21:15         ` Tejun Heo
2024-04-18 21:22           ` Yosry Ahmed
2024-04-18 21:32             ` Tejun Heo
2024-04-19 10:16         ` Jesper Dangaard Brouer
2024-04-19 19:25           ` Yosry Ahmed
2024-04-16 21:38 ` [PATCH v1 0/3] cgroup/rstat: global cgroup_rstat_lock changes Tejun Heo
2024-04-18  2:13   ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=171328989335.3930751.3091577850420501533.stgit@firesoul \
    --to=hawk@kernel.org \
    --cc=acme@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@cloudflare.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan.x@bytedance.com \
    --cc=longman@redhat.com \
    --cc=mhocko@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shakeel.butt@linux.dev \
    --cc=tj@kernel.org \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.