linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nhat Pham <nphamcs@gmail.com>
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, bfoster@redhat.com,
	willy@infradead.org, kernel-team@meta.com
Subject: [PATCH v2 1/4] workingset: fix confusion around eviction vs refault container
Date: Mon,  5 Dec 2022 09:51:37 -0800	[thread overview]
Message-ID: <20221205175140.1543229-2-nphamcs@gmail.com> (raw)
In-Reply-To: <20221205175140.1543229-1-nphamcs@gmail.com>

From: Johannes Weiner <hannes@cmpxchg.org>

Refault decisions are made based on the lruvec where the page was
evicted, as that determined its LRU order while it was alive. Stats
and workingset aging must then occur on the lruvec of the new page, as
that's the node and cgroup that experience the refault and that's the
lruvec whose nonresident info ages out by a new resident page. Those
lruvecs could be different when a page is shared between cgroups, or
the refaulting page is allocated on a different node.

There are currently two mix-ups:

1. When swap is available, the resident anon set must be considered
   when comparing the refault distance. The comparison is made against
   the right anon set, but the check for swap is not. When pages get
   evicted from a cgroup with swap, and refault in one without, this
   can incorrectly consider a hot refault as cold - and vice
   versa. Fix that by using the eviction cgroup for the swap check.

2. The stats and workingset age are updated against the wrong lruvec
   altogether: the right cgroup but the wrong NUMA node. When a page
   refaults on a different NUMA node, this will have confusing stats
   and distort the workingset age on a different lruvec - again
   possibly resulting in hot/cold misclassifications down the line.

Fix the swap check and the refault pgdat to address both concerns.

This was found during code review. It hasn't caused notable issues in
production, suggesting that those refault-migrations are relatively
rare in practice.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Co-developed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
---
 mm/workingset.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index ae7e984b23c6..79585d55c45d 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -457,6 +457,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 	 */
 	nr = folio_nr_pages(folio);
 	memcg = folio_memcg(folio);
+	pgdat = folio_pgdat(folio);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);

 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
@@ -474,7 +475,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_INACTIVE_FILE);
 	}
-	if (mem_cgroup_get_nr_swap_pages(memcg) > 0) {
+	if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_ACTIVE_ANON);
 		if (file) {
--
2.30.2

  reply	other threads:[~2022-12-05 17:52 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-05 17:51 [PATCH v2 0/4] cachestat: a new syscall for page cache state of files Nhat Pham
2022-12-05 17:51 ` Nhat Pham [this message]
2022-12-05 17:51 ` [PATCH v2 2/4] workingset: refactor LRU refault to expose refault recency check Nhat Pham
2022-12-05 23:49   ` Yu Zhao
2022-12-06  1:19     ` Nhat Pham
2022-12-06  1:28       ` Yu Zhao
2022-12-06  2:22         ` Nhat Pham
2022-12-06  2:25           ` Yu Zhao
2022-12-08 18:07             ` Nhat Pham
2022-12-06 15:22   ` Matthew Wilcox
2022-12-07 17:28     ` Nhat Pham
2022-12-05 17:51 ` [PATCH v2 3/4] cachestat: implement cachestat syscall Nhat Pham
2022-12-05 23:15   ` kernel test robot
2022-12-06  4:48   ` kernel test robot
2022-12-06  4:59   ` kernel test robot
2022-12-07  1:42   ` kernel test robot
2022-12-10 23:51   ` Al Viro
2022-12-12 13:22     ` Brian Foster
2022-12-12 16:50       ` Nhat Pham
2022-12-12 16:23     ` Nhat Pham
2022-12-12 16:24       ` Matthew Wilcox
2022-12-12 16:37         ` Nhat Pham
2022-12-05 17:51 ` [PATCH v2 4/4] selftests: Add selftests for cachestat Nhat Pham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221205175140.1543229-2-nphamcs@gmail.com \
    --to=nphamcs@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bfoster@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).