From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com,
	jack@suse.cz, juew@google.com, kirill.shutemov@linux.intel.com,
	linmiaohe@huawei.com, linux-mm@kvack.org, minchan@kernel.org,
	mm-commits@vger.kernel.org, naoya.horiguchi@nec.com,
	osalvador@suse.de, peterx@redhat.com, rcampbell@nvidia.com,
	shakeelb@google.com, shy828301@gmail.com, stable@vger.kernel.org,
	torvalds@linux-foundation.org, wangyugui@e16-tech.com,
	willy@infradead.org, ziy@nvidia.com
Subject: [patch 12/18] mm/thp: make is_huge_zero_pmd() safe and quicker
Date: Tue, 15 Jun 2021 18:23:49 -0700
Message-ID: <20210616012349.Q0qeVCwYo%akpm@linux-foundation.org>
In-Reply-To: <20210615182248.9a0ba90e8e66b9f4a53c0d23@linux-foundation.org>

From: Hugh Dickins <hughd@google.com>
Subject: mm/thp: make is_huge_zero_pmd() safe and quicker

Most callers of is_huge_zero_pmd() supply a pmd already verified present;
but a few (notably zap_huge_pmd()) do not.  In those cases the pmd might be
a migration entry, in which the pfn is encoded differently from a present
pmd.  Such an entry might pass the is_huge_zero_pmd() test (though not on
x86, since L1TF forced us to protect against that); or pmd_page() applied
to a swap-like entry might even crash.

Make it safe by adding a pmd_present() check to is_huge_zero_pmd() itself;
and make it quicker by saving huge_zero_pfn, so that is_huge_zero_pmd()
does not need to do the pmd_page() lookup each time.
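
The effect of the two changes can be sketched in a small userspace model.
This is only an illustration under stated assumptions: the toy_* names, the
bit layout and the shift are hypothetical and do not match any real
architecture's pmd format, and the "unchecked" helper models a bare pfn
comparison rather than the old pmd_page() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy layout: bit 0 is "present", pfn-like bits start at 12. */
#define TOY_PRESENT   (UINT64_C(1) << 0)
#define TOY_PFN_SHIFT 12

typedef struct { uint64_t val; } toy_pmd_t;

/* Cached pfn of the huge zero page; all-ones means "not allocated". */
static uint64_t toy_huge_zero_pfn = ~UINT64_C(0);

static uint64_t toy_pmd_pfn(toy_pmd_t pmd)
{
	return pmd.val >> TOY_PFN_SHIFT;
}

static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PRESENT) != 0;
}

/* Build a present entry mapping the given pfn. */
static toy_pmd_t toy_mk_present(uint64_t pfn)
{
	return (toy_pmd_t){ (pfn << TOY_PFN_SHIFT) | TOY_PRESENT };
}

/*
 * Build a non-present, swap-like entry.  Its payload occupies the same
 * bits that a present entry uses for the pfn, so the two can alias.
 */
static toy_pmd_t toy_mk_nonpresent(uint64_t payload)
{
	return (toy_pmd_t){ payload << TOY_PFN_SHIFT };
}

/* pfn comparison alone: can be fooled by a non-present entry. */
static bool toy_is_zero_unchecked(toy_pmd_t pmd)
{
	return toy_huge_zero_pfn == toy_pmd_pfn(pmd);
}

/* Same shape as the patched check: cached pfn compare plus present test. */
static bool toy_is_huge_zero_pmd(toy_pmd_t pmd)
{
	return toy_huge_zero_pfn == toy_pmd_pfn(pmd) && toy_pmd_present(pmd);
}
```

In this model, once toy_huge_zero_pfn is set to some pfn, a non-present
entry built with the same payload passes toy_is_zero_unchecked() but is
rejected by toy_is_huge_zero_pmd(); and while the cached value is all-ones,
no entry can match, since the shifted pfn field can never be all-ones.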

__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() itself checks pmd_present().

Link: https://lkml.kernel.org/r/21ea9ca-a1f5-8b90-5e88-95fb1c49bbfa@google.com
Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/huge_mm.h |    8 +++++++-
 mm/huge_memory.c        |    5 ++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

--- a/include/linux/huge_mm.h~mm-thp-make-is_huge_zero_pmd-safe-and-quicker
+++ a/include/linux/huge_mm.h
@@ -286,6 +286,7 @@ struct page *follow_devmap_pud(struct vm
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t orig_pmd);
 
 extern struct page *huge_zero_page;
+extern unsigned long huge_zero_pfn;
 
 static inline bool is_huge_zero_page(struct page *page)
 {
@@ -294,7 +295,7 @@ static inline bool is_huge_zero_page(str
 
 static inline bool is_huge_zero_pmd(pmd_t pmd)
 {
-	return is_huge_zero_page(pmd_page(pmd));
+	return READ_ONCE(huge_zero_pfn) == pmd_pfn(pmd) && pmd_present(pmd);
 }
 
 static inline bool is_huge_zero_pud(pud_t pud)
@@ -439,6 +440,11 @@ static inline bool is_huge_zero_page(str
 {
 	return false;
 }
+
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+	return false;
+}
 
 static inline bool is_huge_zero_pud(pud_t pud)
 {
--- a/mm/huge_memory.c~mm-thp-make-is_huge_zero_pmd-safe-and-quicker
+++ a/mm/huge_memory.c
@@ -62,6 +62,7 @@ static struct shrinker deferred_split_sh
 
 static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
+unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
 bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
@@ -98,6 +99,7 @@ retry:
 		__free_pages(zero_page, compound_order(zero_page));
 		goto retry;
 	}
+	WRITE_ONCE(huge_zero_pfn, page_to_pfn(zero_page));
 
 	/* We take additional reference here. It will be put back by shrinker */
 	atomic_set(&huge_zero_refcount, 2);
@@ -147,6 +149,7 @@ static unsigned long shrink_huge_zero_pa
 	if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
 		struct page *zero_page = xchg(&huge_zero_page, NULL);
 		BUG_ON(zero_page == NULL);
+		WRITE_ONCE(huge_zero_pfn, ~0UL);
 		__free_pages(zero_page, compound_order(zero_page));
 		return HPAGE_PMD_NR;
 	}
@@ -2071,7 +2074,7 @@ static void __split_huge_pmd_locked(stru
 		return;
 	}
 
-	if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
+	if (is_huge_zero_pmd(*pmd)) {
 		/*
 		 * FIXME: Do we want to invalidate secondary mmu by calling
 		 * mmu_notifier_invalidate_range() see comments below inside
_
