linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Alistair Popple <apopple@nvidia.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@surriel.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Yu Zhao <yuzhao@google.com>, Greg Thelen <gthelen@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 13/13] mm/thp: shrink_page_list() avoid splitting VM_LOCKED THP
Date: Mon, 14 Feb 2022 18:42:33 -0800 (PST)	[thread overview]
Message-ID: <531d13ee-bc7d-329a-9748-5e272f699d78@google.com> (raw)
In-Reply-To: <55a49083-37f9-3766-1de9-9feea7428ac@google.com>

4.8 commit 7751b2da6be0 ("vmscan: split file huge pages before paging
them out") inserted a split_huge_page_to_list() into shrink_page_list()
without considering the mlock case: no problem if the page has already
been marked as Mlocked (the !page_evictable check much higher up will
have skipped all this), but it has always been the case that races or
omissions in setting Mlocked can rely on page reclaim to detect this
and correct it before actually reclaiming - and that remains so, but
what a shame if a hugepage is needlessly split before discovering it.

It is surprising that page_check_references() returns PAGEREF_RECLAIM
when VM_LOCKED, but there was a good reason for that: try_to_unmap_one()
is where the condition is detected and corrected; and until now it could
not be done in page_referenced_one(), because that does not always have
the page locked.  Now that mlock's requirement for page lock has gone,
copy try_to_unmap_one()'s mlock restoration into page_referenced_one(),
and let page_check_references() return PAGEREF_ACTIVATE in this case.

But page_referenced_one() may find a pte mapping one part of a hugepage:
what hold should a pte mapped in a VM_LOCKED area exert over the entire
huge page?  That's debatable.  The approach taken here is to treat that
pte mapping in page_referenced_one() as if not VM_LOCKED, and if no
VM_LOCKED pmd mapping is found later in the walk, and lack of reference
permits, then PAGEREF_RECLAIM take it to attempted splitting as before.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
v2: same as v1.

 mm/rmap.c   | 7 +++++--
 mm/vmscan.c | 6 +++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 714bfdc72c7b..c7921c102bc0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -812,7 +812,10 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
 
-		if (vma->vm_flags & VM_LOCKED) {
+		if ((vma->vm_flags & VM_LOCKED) &&
+		    (!PageTransCompound(page) || !pvmw.pte)) {
+			/* Restore the mlock which got missed */
+			mlock_vma_page(page, vma, !pvmw.pte);
 			page_vma_mapped_walk_done(&pvmw);
 			pra->vm_flags |= VM_LOCKED;
 			return false; /* To break the loop */
@@ -851,7 +854,7 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 
 	if (referenced) {
 		pra->referenced++;
-		pra->vm_flags |= vma->vm_flags;
+		pra->vm_flags |= vma->vm_flags & ~VM_LOCKED;
 	}
 
 	if (!pra->mapcount)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 090bfb605ecf..a160efba3c73 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1386,11 +1386,11 @@ static enum page_references page_check_references(struct page *page,
 	referenced_page = TestClearPageReferenced(page);
 
 	/*
-	 * Mlock lost the isolation race with us.  Let try_to_unmap()
-	 * move the page to the unevictable list.
+	 * The supposedly reclaimable page was found to be in a VM_LOCKED vma.
+	 * Let the page, now marked Mlocked, be moved to the unevictable list.
 	 */
 	if (vm_flags & VM_LOCKED)
-		return PAGEREF_RECLAIM;
+		return PAGEREF_ACTIVATE;
 
 	if (referenced_ptes) {
 		/*
-- 
2.34.1


  parent reply	other threads:[~2022-02-15  2:42 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-15  2:18 [PATCH v2 00/13] mm/munlock: rework of mlock+munlock page handling Hugh Dickins
2022-02-15  2:20 ` [PATCH v2 01/13] mm/munlock: delete page_mlock() and all its works Hugh Dickins
2022-02-15  2:21 ` [PATCH v2 02/13] mm/munlock: delete FOLL_MLOCK and FOLL_POPULATE Hugh Dickins
2022-02-15  2:23 ` [PATCH v2 03/13] mm/munlock: delete munlock_vma_pages_all(), allow oomreap Hugh Dickins
2022-02-15  2:26 ` [PATCH v2 04/13] mm/munlock: rmap call mlock_vma_page() munlock_vma_page() Hugh Dickins
2022-02-15 15:22   ` Matthew Wilcox
2022-02-15 21:38     ` Hugh Dickins
2022-02-15 23:21       ` Matthew Wilcox
2022-02-15  2:28 ` [PATCH v2 05/13] mm/munlock: replace clear_page_mlock() by final clearance Hugh Dickins
2022-02-15  2:29 ` [PATCH v2 06/13] mm/munlock: maintain page->mlock_count while unevictable Hugh Dickins
2022-02-15  2:31 ` [PATCH v2 07/13] mm/munlock: mlock_pte_range() when mlocking or munlocking Hugh Dickins
2022-02-18  6:35   ` [mm/munlock] 237b445401: stress-ng.remap.ops_per_sec -62.6% regression kernel test robot
2022-02-18  8:49     ` Hugh Dickins
2022-02-21  6:32       ` Hugh Dickins
2022-02-24  8:37         ` Oliver Sang
2022-02-15  2:33 ` [PATCH v2 08/13] mm/migrate: __unmap_and_move() push good newpage to LRU Hugh Dickins
2022-02-15  2:34 ` [PATCH v2 09/13] mm/munlock: delete smp_mb() from __pagevec_lru_add_fn() Hugh Dickins
2022-02-15  2:37 ` [PATCH v2 10/13] mm/munlock: mlock_page() munlock_page() batch by pagevec Hugh Dickins
2022-02-15 16:40   ` Matthew Wilcox
2022-02-15 21:02     ` Hugh Dickins
2022-02-15 22:56       ` Matthew Wilcox
2022-02-15  2:38 ` [PATCH v2 11/13] mm/munlock: page migration needs mlock pagevec drained Hugh Dickins
2022-02-15  2:40 ` [PATCH v2 12/13] mm/thp: collapse_file() do try_to_unmap(TTU_BATCH_FLUSH) Hugh Dickins
2022-02-15  2:42 ` Hugh Dickins [this message]
2022-02-15 19:35 ` [PATCH v2 00/13] mm/munlock: rework of mlock+munlock page handling Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=531d13ee-bc7d-329a-9748-5e272f699d78@google.com \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=david@redhat.com \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=riel@surriel.com \
    --cc=shakeelb@google.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).