All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: "Tejun Heo" <tj@kernel.org>, "Josef Bacik" <josef@toxicpanda.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Koutný" <mkoutny@suse.com>
Cc: Vasily Averin <vasily.averin@linux.dev>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	bpf@vger.kernel.org, Yosry Ahmed <yosryahmed@google.com>,
	Michal Hocko <mhocko@suse.com>
Subject: [PATCH v3 6/8] workingset: memcg: sleep when flushing stats in workingset_refault()
Date: Thu, 30 Mar 2023 19:17:59 +0000	[thread overview]
Message-ID: <20230330191801.1967435-7-yosryahmed@google.com> (raw)
In-Reply-To: <20230330191801.1967435-1-yosryahmed@google.com>

In workingset_refault(), we call
mem_cgroup_flush_stats_atomic_ratelimited() to read accurate stats
within an RCU read section and with sleeping disallowed. Move the
call above the RCU read section to make it non-atomic.

Flushing is an expensive operation that scales with the number of cpus
and the number of cgroups in the system, so avoid doing it atomically
where possible.

Since workingset_refault() is the only caller of
mem_cgroup_flush_stats_atomic_ratelimited(), just make it non-atomic,
and rename it to mem_cgroup_flush_stats_ratelimited().

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/memcontrol.h | 4 ++--
 mm/memcontrol.c            | 4 ++--
 mm/workingset.c            | 5 +++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b424ba3ebd09..a4bc3910a2eb 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1038,7 +1038,7 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
 
 void mem_cgroup_flush_stats(void);
 void mem_cgroup_flush_stats_atomic(void);
-void mem_cgroup_flush_stats_atomic_ratelimited(void);
+void mem_cgroup_flush_stats_ratelimited(void);
 
 void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 			      int val);
@@ -1540,7 +1540,7 @@ static inline void mem_cgroup_flush_stats_atomic(void)
 {
 }
 
-static inline void mem_cgroup_flush_stats_atomic_ratelimited(void)
+static inline void mem_cgroup_flush_stats_ratelimited(void)
 {
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2ce3aa10d94..361c0bbf7283 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -673,10 +673,10 @@ void mem_cgroup_flush_stats_atomic(void)
 		do_flush_stats(true);
 }
 
-void mem_cgroup_flush_stats_atomic_ratelimited(void)
+void mem_cgroup_flush_stats_ratelimited(void)
 {
 	if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
-		mem_cgroup_flush_stats_atomic();
+		mem_cgroup_flush_stats();
 }
 
 static void flush_memcg_stats_dwork(struct work_struct *w)
diff --git a/mm/workingset.c b/mm/workingset.c
index dab0c362b9e3..3025beee9b34 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -406,6 +406,9 @@ void workingset_refault(struct folio *folio, void *shadow)
 	unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
 	eviction <<= bucket_order;
 
+	/* Flush stats (and potentially sleep) before holding RCU read lock */
+	mem_cgroup_flush_stats_ratelimited();
+
 	rcu_read_lock();
 	/*
 	 * Look up the memcg associated with the stored ID. It might
@@ -461,8 +464,6 @@ void workingset_refault(struct folio *folio, void *shadow)
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
-
-	mem_cgroup_flush_stats_atomic_ratelimited();
 	/*
 	 * Compare the distance to the existing workingset size. We
 	 * don't activate pages that couldn't stay resident even if
-- 
2.40.0.348.gf938b09366-goog


  parent reply	other threads:[~2023-03-30 19:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-30 19:17 [PATCH v3 0/8] memcg: avoid flushing stats atomically where possible Yosry Ahmed
2023-03-30 19:17 ` [PATCH v3 1/8] cgroup: rename cgroup_rstat_flush_"irqsafe" to "atomic" Yosry Ahmed
2023-03-30 19:17   ` Yosry Ahmed
2023-03-30 19:17 ` [PATCH v3 2/8] memcg: rename mem_cgroup_flush_stats_"delayed" to "ratelimited" Yosry Ahmed
2023-03-30 19:17 ` [PATCH v3 3/8] memcg: do not flush stats in irq context Yosry Ahmed
2023-03-30 19:17   ` Yosry Ahmed
2023-03-30 19:17 ` [PATCH v3 4/8] memcg: replace stats_flush_lock with an atomic Yosry Ahmed
2023-04-04 16:52   ` Michal Koutný
2023-04-04 16:52     ` Michal Koutný
2023-04-04 17:13     ` Shakeel Butt
2023-04-04 17:13       ` Shakeel Butt
2023-04-04 17:21       ` Shakeel Butt
2023-04-04 17:21         ` Shakeel Butt
2023-04-04 17:32         ` Michal Koutný
2023-03-30 19:17 ` [PATCH v3 5/8] memcg: sleep during flushing stats in safe contexts Yosry Ahmed
2023-03-30 19:30   ` Johannes Weiner
2023-04-04 15:09   ` Johannes Weiner
2023-03-30 19:17 ` Yosry Ahmed [this message]
2023-04-04 16:53   ` [PATCH v3 6/8] workingset: memcg: sleep when flushing stats in workingset_refault() Michal Koutný
2023-04-04 16:53     ` Michal Koutný
2023-04-04 18:09     ` Yosry Ahmed
2023-04-05  8:00       ` Michal Koutný
2023-04-05  8:00         ` Michal Koutný
2023-04-05  8:02         ` Yosry Ahmed
2023-04-05  8:02           ` Yosry Ahmed
2023-03-30 19:18 ` [PATCH v3 7/8] vmscan: memcg: sleep when flushing stats during reclaim Yosry Ahmed
2023-03-30 19:18   ` Yosry Ahmed
2023-03-30 19:18 ` [PATCH v3 8/8] memcg: do not modify rstat tree for zero updates Yosry Ahmed
2023-03-30 19:18   ` Yosry Ahmed
2023-04-04 16:53   ` Michal Koutný

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230330191801.1967435-7-yosryahmed@google.com \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=bpf@vger.kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan.x@bytedance.com \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    --cc=vasily.averin@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.