From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Yang Shi <shy828301@gmail.com>,
	Wang Yugui <wangyugui@e16-tech.com>,
	Matthew Wilcox <willy@infradead.org>,
	Alistair Popple <apopple@nvidia.com>,
	Ralph Campbell <rcampbell@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Peter Xu <peterx@redhat.com>, Will Deacon <will@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 10/11] mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
Date: Wed, 9 Jun 2021 23:52:37 -0700 (PDT)
Message-ID: <fedb8632-1798-de42-f39e-873551d5bc81@google.com>
In-Reply-To: <589b358c-febc-c88e-d4c2-7834b37fa7bf@google.com>

Running certain tests with a DEBUG_VM kernel would crash within hours,
on the total_mapcount BUG() in split_huge_page_to_list(), while trying
to free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which,
on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).

Crash dumps showed two tail pages of a shmem huge page remained mapped
by pte: ptes in a non-huge-aligned vma of a gVisor process, at the end
of a long unmapped range; and no page table had yet been allocated for
the head of the huge page to be mapped into.

Although designed to handle these odd misaligned huge-page-mapped-by-pte
cases, page_vma_mapped_walk() falls short by returning false prematurely
when !pmd_present or !pud_present or !p4d_present or !pgd_present: a
huge page may span such a boundary, with ptes present in the range
beyond it.

Restructure page_vma_mapped_walk() as a loop to continue in these cases,
while keeping its layout much as before. Add a step_forward() helper to
advance pvmw->address across those boundaries. Originally I tried to use
mm's standard p?d_addr_end() macros, but still hit the same crash, only
512 times less often: redundant levels are folded together, and folded
differently in different configurations, so it was just too difficult
to use the macros correctly; and step_forward() is simpler anyway.

Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
---
 mm/page_vma_mapped.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)
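
Illustration only, not part of the patch: a minimal userspace sketch of
the address arithmetic in step_forward() and of why continuing past a
missing page table matters. The names toy_step_forward(), toy_walk() and
TOY_PMD_SIZE are hypothetical, the 2MB size assumes x86_64 with 4K pages,
and the "page table" is just an array of booleans, so this models only
the control flow, not the kernel's real walk.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for PMD_SIZE: 2MB, as on x86_64 with 4K pages. */
#define TOY_PMD_SIZE (1UL << 21)

/*
 * Same arithmetic as step_forward() in the patch, applied to a bare
 * address: round up to the next size-aligned boundary, and saturate to
 * ULONG_MAX if the addition wrapped to 0, so that a caller's
 * "address < end" test fails and its loop terminates.
 */
static unsigned long toy_step_forward(unsigned long address, unsigned long size)
{
	address = (address + size) & ~(size - 1);
	if (!address)
		address = ULONG_MAX;
	return address;
}

/*
 * Toy walk over [start, end): pmd_present[i] says whether a page table
 * exists for the i-th TOY_PMD_SIZE-sized range. With keep_going false,
 * the walk gives up at the first missing page table (the behaviour the
 * patch removes); with keep_going true, it steps to the next boundary
 * and retries. On success, *found is the first address that has a page
 * table behind it.
 */
static bool toy_walk(const bool *pmd_present, unsigned long start,
		     unsigned long end, bool keep_going, unsigned long *found)
{
	unsigned long address = start;

	do {
		if (!pmd_present[address / TOY_PMD_SIZE]) {
			if (!keep_going)
				return false;
			address = toy_step_forward(address, TOY_PMD_SIZE);
			continue;	/* re-evaluates address < end */
		}
		*found = address;
		return true;
	} while (address < end);

	return false;
}
```

With pmd_present = { false, true } and a range from 0x1fe000 to 0x3fe000
(a misaligned 2MB span whose head has no page table), the keep_going
variant steps from 0x1fe000 to 0x200000 and finds the populated range,
where the early-return variant reports nothing mapped.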

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index f6839f536645..6eb2f1863506 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -116,6 +116,13 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
 	return pfn_is_match(pvmw->page, pfn);
 }
 
+static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
+{
+	pvmw->address = (pvmw->address + size) & ~(size - 1);
+	if (!pvmw->address)
+		pvmw->address = ULONG_MAX;
+}
+
 /**
  * page_vma_mapped_walk - check if @pvmw->page is mapped in @pvmw->vma at
  * @pvmw->address
@@ -183,16 +190,22 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	if (pvmw->pte)
 		goto next_pte;
 restart:
-	{
+	do {
 		pgd = pgd_offset(mm, pvmw->address);
-		if (!pgd_present(*pgd))
-			return false;
+		if (!pgd_present(*pgd)) {
+			step_forward(pvmw, PGDIR_SIZE);
+			continue;
+		}
 		p4d = p4d_offset(pgd, pvmw->address);
-		if (!p4d_present(*p4d))
-			return false;
+		if (!p4d_present(*p4d)) {
+			step_forward(pvmw, P4D_SIZE);
+			continue;
+		}
 		pud = pud_offset(p4d, pvmw->address);
-		if (!pud_present(*pud))
-			return false;
+		if (!pud_present(*pud)) {
+			step_forward(pvmw, PUD_SIZE);
+			continue;
+		}
 
 		pvmw->pmd = pmd_offset(pud, pvmw->address);
 		/*
@@ -240,7 +253,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 
 				spin_unlock(ptl);
 			}
-			return false;
+			step_forward(pvmw, PMD_SIZE);
+			continue;
 		}
 		if (!map_pte(pvmw))
 			goto next_pte;
@@ -270,7 +284,9 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			spin_lock(pvmw->ptl);
 		}
 		goto this_pte;
-	}
+	} while (pvmw->address < end);
+
+	return false;
 }
 
 /**
-- 
2.26.2

