linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm, memcg: fix wrong mem cgroup protection
@ 2020-04-23  6:16 Yafang Shao
  2020-04-23 15:33 ` Chris Down
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Yafang Shao @ 2020-04-23  6:16 UTC (permalink / raw)
  To: akpm, mhocko, vdavydov.dev
  Cc: linux-mm, Yafang Shao, Chris Down, Roman Gushchin, stable

This patch is an improvement of a previous version[1], as the previous
version is not easy to understand.
This issue persists in the newest kernel, I have to resend the fix. As
the implementation is changed, I drop Roman's ack from the previous
version.

Here's the explanation of this issue.
memory.{low,min} won't take effect if the to-be-reclaimed memcg is the
sc->target_mem_cgroup, that can also be proved by the implementation in
mem_cgroup_protected(), see bellow,
	mem_cgroup_protected
		if (memcg == root) [2]
			return MEMCG_PROT_NONE;

But this rule is ignored in mem_cgroup_protection(), which will read
memory.{emin, elow} as the protection whatever the memcg is.

How would this issue happen?
Because in mem_cgroup_protected() we forget to clear the
memory.{emin, elow} if the memcg is target_mem_cgroup [2].

An example to illustrate this issue.
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 800M ('current' must be greater than 'min')
Once kswapd starts to reclaim memcg A, it assigns 512M to memory.emin of A.
Then kswapd stops.
As a result of it, the memory values of A will be,
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 512M (approximately)
            memory.emin: 512M

Then a new workload starts to run in memcg A, and it will trigger memcg
relcaim in A soon. As memcg A is the target_mem_cgroup of this
reclaimer, so it return directly without touching memory.{emin, elow}.[2]
The memory values of A will be,
   root_mem_cgroup
         /
        A   memory.max: 1024M
            memory.min: 512M
            memory.current: 1024M (approximately)
            memory.emin: 512M
Then this memory.emin will be used in mem_cgroup_protection() to get the
scan count, which is obvoiusly a wrong scan count.

[1]. https://lore.kernel.org/linux-mm/20200216145249.6900-1-laoar.shao@gmail.com/

Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Cc: Chris Down <chris@chrisdown.name>
Cc: Roman Gushchin <guro@fb.com>
Cc: stable@vger.kernel.org
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 13 +++++++++++--
 mm/vmscan.c                |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..114cfe06bf60 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -344,12 +344,20 @@ static inline bool mem_cgroup_disabled(void)
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	if (mem_cgroup_disabled())
 		return 0;
 
+	/*
+	 * Memcg protection won't take effect if the memcg is the target
+	 * root memcg.
+	 */
+	if (root == memcg)
+		return 0;
+
 	if (in_low_reclaim)
 		return READ_ONCE(memcg->memory.emin);
 
@@ -835,7 +843,8 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 {
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b06868fc4926..ad2782f754ab 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2346,9 +2346,9 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		unsigned long protection;
 
 		lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
-		protection = mem_cgroup_protection(memcg,
+		protection = mem_cgroup_protection(sc->target_mem_cgroup,
+						   memcg,
 						   sc->memcg_low_reclaim);
-
 		if (protection) {
 			/*
 			 * Scale a cgroup's reclaim pressure by proportioning
-- 
2.18.2



^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-04-27 16:52 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-23  6:16 [PATCH] mm, memcg: fix wrong mem cgroup protection Yafang Shao
2020-04-23 15:33 ` Chris Down
2020-04-23 21:13   ` Roman Gushchin
2020-04-24  0:32     ` Yafang Shao
2020-04-24 10:40     ` Michal Hocko
2020-04-24 10:57       ` Yafang Shao
2020-04-24  0:49   ` Yafang Shao
2020-04-24 12:18     ` Chris Down
2020-04-24 12:44       ` Yafang Shao
2020-04-24 13:05         ` Chris Down
2020-04-24 13:10           ` Yafang Shao
2020-04-23 21:06 ` Roman Gushchin
2020-04-24  0:29   ` Yafang Shao
2020-04-24 13:14 ` Johannes Weiner
2020-04-24 13:44   ` Johannes Weiner
2020-04-24 14:33     ` Michal Hocko
2020-04-24 16:08     ` Yafang Shao
2020-04-24 14:29   ` Michal Hocko
2020-04-24 15:10     ` Johannes Weiner
2020-04-24 16:21       ` Michal Hocko
2020-04-24 16:51         ` Johannes Weiner
2020-04-27  8:25           ` Michal Hocko
2020-04-27  8:37             ` Yafang Shao
2020-04-27 16:52             ` Johannes Weiner
2020-04-24 16:21     ` Roman Gushchin
2020-04-24 16:30       ` Yafang Shao
2020-04-24 16:00   ` Yafang Shao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).