Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/4] mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
Date: Fri, 12 Apr 2019 11:15:07 -0400
Message-ID: <20190412151507.2769-5-hannes@cmpxchg.org> (raw)
In-Reply-To: <20190412151507.2769-1-hannes@cmpxchg.org>

When a cgroup is reclaimed on behalf of a configured limit, reclaim
needs to round-robin through all NUMA nodes that hold pages of the
memcg in question. However, when assembling the mask of candidate NUMA
nodes, the code only consults the *local* cgroup LRU counters, not the
recursive counters for the entire subtree. Cgroup limits are
frequently configured against intermediate cgroups that do not have
memory on their own LRUs. In this case, the node mask will always come
up empty and reclaim falls back to scanning only the current node.

If a cgroup subtree has some memory on one node but the processes are
bound to another node afterwards, the limit reclaim will never age or
reclaim that memory anymore.

To fix this, use the recursive LRU counts for a cgroup subtree to
determine which nodes hold memory of that cgroup.

The code has been broken like this forever, so it doesn't seem to be a
problem in practice. I just noticed it while reviewing the way the LRU
counters are used in general.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2eb2d4ef9b34..2535e54e7989 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1512,13 +1512,13 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
 {
 	struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
 
-	if (lruvec_page_state_local(lruvec, NR_INACTIVE_FILE) ||
-	    lruvec_page_state_local(lruvec, NR_ACTIVE_FILE))
+	if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
+	    lruvec_page_state(lruvec, NR_ACTIVE_FILE))
 		return true;
 	if (noswap || !total_swap_pages)
 		return false;
-	if (lruvec_page_state_local(lruvec, NR_INACTIVE_ANON) ||
-	    lruvec_page_state_local(lruvec, NR_ACTIVE_ANON))
+	if (lruvec_page_state(lruvec, NR_INACTIVE_ANON) ||
+	    lruvec_page_state(lruvec, NR_ACTIVE_ANON))
 		return true;
 	return false;
 
-- 
2.21.0


  parent reply index

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-12 15:15 [PATCH 0/4] mm: memcontrol: memory.stat cost & correctness Johannes Weiner
2019-04-12 15:15 ` [PATCH 1/4] mm: memcontrol: make cgroup stats and events query API explicitly local Johannes Weiner
2019-04-12 15:15 ` [PATCH 2/4] mm: memcontrol: move stat/event counting functions out-of-line Johannes Weiner
2019-04-12 15:15 ` [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalabilty Johannes Weiner
2019-04-12 19:55   ` Shakeel Butt
2019-04-12 20:10     ` Johannes Weiner
2019-04-12 20:38       ` Shakeel Butt
2019-04-12 20:15     ` Roman Gushchin
2019-04-12 20:50       ` Shakeel Butt
2019-04-12 15:15 ` Johannes Weiner [this message]
2019-04-12 20:07 ` [PATCH 0/4] mm: memcontrol: memory.stat cost & correctness Shakeel Butt
2019-04-12 22:04 ` Roman Gushchin

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190412151507.2769-5-hannes@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org linux-mm@archiver.kernel.org
	public-inbox-index linux-mm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox