All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michal Koutný" <mkoutny@suse.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christian Brauner <brauner@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Mike Christie <michael.christie@oracle.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Andrei Vagin <avagin@gmail.com>,
	Shakeel Butt <shakeelb@google.com>, Adam Majer <amajer@suse.com>,
	Jan Kara <jack@suse.cz>, Michal Hocko <mhocko@kernel.org>
Subject: [PATCH] mm: Sync percpu mm RSS counters before querying
Date: Fri, 16 Jun 2023 20:07:18 +0200	[thread overview]
Message-ID: <20230616180718.17725-1-mkoutny@suse.com> (raw)

An issue was observed with stats collected in struct rusage on ppc64le
with 64kB pages. The percpu counters use batching with
	percpu_counter_batch = max(32, nr*2) # in PAGE_SIZE
i.e. with larger pages but similar RSS consumption (bytes), there'll be
less flushes and error more noticeable.

In this given case (getting consumption of exited child), we can request
percpu counter's flush without worrying about contention with updaters.

Fortunately, the commit f1a7941243c1 ("mm: convert mm's rss stats into
percpu_counter") didn't eradicate all traces of SPLIT_RSS_COUNTING and
this mechanism already provided some synchronization points before
reading stats.
Therefore, use sync_mm_rss as carrier for percpu counters refreshes and
forget SPLIT_RSS_COUNTING macro for good.

Impact of summing on a 8 CPU machine:
Benchmark 1: taskset -c 1 ./shell-bench.sh

Before
  Time (mean ± σ):      9.950 s ±  0.052 s    [User: 7.773 s, System: 2.023 s]

After
  Time (mean ± σ):      9.990 s ±  0.070 s    [User: 7.825 s, System: 2.011 s]

cat >shell-bench.sh <<EOD
for (( i = 0; i < 20000; i++ )); do
	/bin/true
done
EOD

The script is meant to stress fork-exit path (exit is where sync_mm_rss
is most called, add_mm_rss_vec should be covered in fork).

Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
Reported-by: Adam Majer <amajer@suse.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---

Notes:
    - dummy approach to fix correctness, performance should be treated separately
    - add percpu_counter_set to really update the value
    - RFC https://lore.kernel.org/r/20230608171256.17827-1-mkoutny@suse.com/

 include/linux/mm.h | 7 +++----
 kernel/fork.c      | 4 ----
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..c7ac1cbc6fb3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2547,13 +2547,12 @@ static inline void setmax_mm_hiwater_rss(unsigned long *maxrss,
 		*maxrss = hiwater_rss;
 }
 
-#if defined(SPLIT_RSS_COUNTING)
-void sync_mm_rss(struct mm_struct *mm);
-#else
 static inline void sync_mm_rss(struct mm_struct *mm)
 {
+	for (int i = 0; i < NR_MM_COUNTERS; ++i)
+		percpu_counter_set(&mm->rss_stat[i],
+				   percpu_counter_sum(&mm->rss_stat[i]));
 }
-#endif
 
 #ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
 static inline int pte_special(pte_t pte)
diff --git a/kernel/fork.c b/kernel/fork.c
index 81cba91f30bb..e030eb902e4b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2412,10 +2412,6 @@ __latent_entropy struct task_struct *copy_process(
 	p->io_uring = NULL;
 #endif
 
-#if defined(SPLIT_RSS_COUNTING)
-	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
-#endif
-
 	p->default_timer_slack_ns = current->timer_slack_ns;
 
 #ifdef CONFIG_PSI
-- 
2.41.0


             reply	other threads:[~2023-06-16 18:07 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-16 18:07 Michal Koutný [this message]
2023-06-16 19:31 ` [PATCH] mm: Sync percpu mm RSS counters before querying Andrew Morton
2023-06-19 12:53   ` Michal Koutný
2023-06-19 13:10 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230616180718.17725-1-mkoutny@suse.com \
    --to=mkoutny@suse.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=amajer@suse.com \
    --cc=avagin@gmail.com \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhocko@kernel.org \
    --cc=michael.christie@oracle.com \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=shakeelb@google.com \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.