All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yafang Shao <laoar.shao@gmail.com>
To: akpm@linux-foundation.org, mhocko@kernel.org, vdavydov.dev@gmail.com
Cc: linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>,
	Chris Down <chris@chrisdown.name>, Roman Gushchin <guro@fb.com>,
	stable@vger.kernel.org
Subject: [PATCH] mm, memcg: fix wrong mem cgroup protection
Date: Thu, 23 Apr 2020 02:16:29 -0400	[thread overview]
Message-ID: <20200423061629.24185-1-laoar.shao@gmail.com> (raw)

This patch is an improvement of a previous version[1], as the previous
version is not easy to understand.
This issue persists in the newest kernel, I have to resend the fix. As
the implementation is changed, I drop Roman's ack from the previous
version.

Here's the explanation of this issue.
memory.{low,min} won't take effect if the to-be-reclaimed memcg is the
sc->target_mem_cgroup, that can also be proved by the implementation in
mem_cgroup_protected(), see bellow,
	mem_cgroup_protected
		if (memcg == root) [2]
			return MEMCG_PROT_NONE;

But this rule is ignored in mem_cgroup_protection(), which will read
memory.{emin, elow} as the protection whatever the memcg is.

How would this issue happen?
Because in mem_cgroup_protected() we forget to clear the
memory.{emin, elow} if the memcg is target_mem_cgroup [2].

An example to illustrate this issue.
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 800M ('current' must be greater than 'min')
Once kswapd starts to reclaim memcg A, it assigns 512M to memory.emin of A.
Then kswapd stops.
As a result of it, the memory values of A will be,
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 512M (approximately)
            memory.emin: 512M

Then a new workload starts to run in memcg A, and it will trigger memcg
relcaim in A soon. As memcg A is the target_mem_cgroup of this
reclaimer, so it return directly without touching memory.{emin, elow}.[2]
The memory values of A will be,
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 1024M (approximately)
            memory.emin: 512M
Then this memory.emin will be used in mem_cgroup_protection() to get the
scan count, which is obvoiusly a wrong scan count.

[1]. https://lore.kernel.org/linux-mm/20200216145249.6900-1-laoar.shao@gmail.com/

Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Cc: Chris Down <chris@chrisdown.name>
Cc: Roman Gushchin <guro@fb.com>
Cc: stable@vger.kernel.org
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 13 +++++++++++--
 mm/vmscan.c                |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..114cfe06bf60 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -344,12 +344,20 @@ static inline bool mem_cgroup_disabled(void)
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	if (mem_cgroup_disabled())
 		return 0;
 
+	/*
+	 * Memcg protection won't take effect if the memcg is the target
+	 * root memcg.
+	 */
+	if (root == memcg)
+		return 0;
+
 	if (in_low_reclaim)
 		return READ_ONCE(memcg->memory.emin);
 
@@ -835,7 +843,8 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 {
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b06868fc4926..ad2782f754ab 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2346,9 +2346,9 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		unsigned long protection;
 
 		lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
-		protection = mem_cgroup_protection(memcg,
+		protection = mem_cgroup_protection(sc->target_mem_cgroup,
+						   memcg,
 						   sc->memcg_low_reclaim);
-
 		if (protection) {
 			/*
 			 * Scale a cgroup's reclaim pressure by proportioning
-- 
2.18.2


             reply	other threads:[~2020-04-23  6:16 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-23  6:16 Yafang Shao [this message]
2020-04-23 15:33 ` [PATCH] mm, memcg: fix wrong mem cgroup protection Chris Down
2020-04-23 21:13   ` Roman Gushchin
2020-04-24  0:32     ` Yafang Shao
2020-04-24  0:32       ` Yafang Shao
2020-04-24 10:40     ` Michal Hocko
2020-04-24 10:57       ` Yafang Shao
2020-04-24 10:57         ` Yafang Shao
2020-04-24  0:49   ` Yafang Shao
2020-04-24  0:49     ` Yafang Shao
2020-04-24 12:18     ` Chris Down
2020-04-24 12:44       ` Yafang Shao
2020-04-24 12:44         ` Yafang Shao
2020-04-24 13:05         ` Chris Down
2020-04-24 13:10           ` Yafang Shao
2020-04-24 13:10             ` Yafang Shao
2020-04-23 21:06 ` Roman Gushchin
2020-04-24  0:29   ` Yafang Shao
2020-04-24  0:29     ` Yafang Shao
2020-04-24 13:14 ` Johannes Weiner
2020-04-24 13:44   ` Johannes Weiner
2020-04-24 14:33     ` Michal Hocko
2020-04-24 16:08     ` Yafang Shao
2020-04-24 16:08       ` Yafang Shao
2020-04-24 14:29   ` Michal Hocko
2020-04-24 15:10     ` Johannes Weiner
2020-04-24 16:21       ` Michal Hocko
2020-04-24 16:51         ` Johannes Weiner
2020-04-27  8:25           ` Michal Hocko
2020-04-27  8:37             ` Yafang Shao
2020-04-27  8:37               ` Yafang Shao
2020-04-27 16:52             ` Johannes Weiner
2020-04-24 16:21     ` Roman Gushchin
2020-04-24 16:30       ` Yafang Shao
2020-04-24 16:30         ` Yafang Shao
2020-04-24 16:00   ` Yafang Shao
2020-04-24 16:00     ` Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200423061629.24185-1-laoar.shao@gmail.com \
    --to=laoar.shao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris@chrisdown.name \
    --cc=guro@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.