linux-mm.kvack.org archive mirror
* [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers
@ 2022-06-15 17:29 Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 1/7] mm: khugepaged: check THP flag in hugepage_vma_check() Yang Shi
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel


v4: * Consolidated the transhuge_vma_size_ok() helper proposed in the
      earlier versions into transhuge_vma_suitable(), per Zach.
    * Fixed the regression introduced by patch 3/7, per Zach and Miaohe.
    * Reworded the comment for transhuge_vma_suitable(), per Zach.
    * Removed khugepaged_enter() per Miaohe.
    * More comments for hugepage_vma_check(), per Zach.
    * Squashed patch 4/7 (mm: khugepaged: use transhuge_vma_suitable replace open-code)
      in the earlier version into patch 2/7 of this version.
    * Minor correction to the doc about THPeligible (patch 7/7), so the
      total number of patches is kept at 7.
v3: * Fixed the comment from Willy
v2: * Rebased to the latest mm-unstable
    * Fixed potential regression for smaps's THPeligible

This series is the follow-up of the discussion about cleaning up transhuge_xxx
helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.

THP has a bunch of helpers that do VMA sanity checks for different paths.
They perform similar checks at most callsites and carry a lot of duplicated
code, and it is confusing which helper should be used under which
conditions.

This series reorganizes and cleans up the code so that all the checks are
consolidated into hugepage_vma_check().

transhuge_vma_enabled(), transparent_hugepage_active() and
__transparent_hugepage_enabled() are all killed by this series.
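
After the series, every path funnels into a single entry point. For
reference, the call sites as they look once patch 5 adds the final
four-argument form (collected from the diffs below; the two bool
arguments are smaps and in_pf):

	/* smaps (fs/proc/task_mmu.c): report THP eligibility */
	hugepage_vma_check(vma, vma->vm_flags, true, false);

	/* khugepaged (mm/khugepaged.c) */
	hugepage_vma_check(vma, vm_flags, false, false);

	/* page fault (mm/memory.c): allows DAX and !anon_vma cases */
	hugepage_vma_check(vma, vm_flags, false, true);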


Yang Shi (7):
      mm: khugepaged: check THP flag in hugepage_vma_check()
      mm: thp: consolidate vma size check to transhuge_vma_suitable
      mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
      mm: thp: kill transparent_hugepage_active()
      mm: thp: kill __transparent_hugepage_enabled()
      mm: khugepaged: reorg some khugepaged helpers
      doc: proc: fix the description to THPeligible

 Documentation/filesystems/proc.rst |  4 ++-
 fs/proc/task_mmu.c                 |  2 +-
 include/linux/huge_mm.h            | 75 +++++++++++++++++++------------------------------------
 include/linux/khugepaged.h         | 30 ----------------------
 mm/huge_memory.c                   | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
 mm/khugepaged.c                    | 84 +++++++++++++++++++-------------------------------------------
 mm/memory.c                        |  7 ++++--
 7 files changed, 130 insertions(+), 153 deletions(-)




* [v4 PATCH 1/7] mm: khugepaged: check THP flag in hugepage_vma_check()
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable Yang Shi
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

Currently the THP flag check in hugepage_vma_check() will fall through if
the flag is NEVER and VM_HUGEPAGE is set.  This is not a problem for now
since all the callers have the flag checked beforehand or can't be invoked
if the flag is NEVER.

However, a following patch will call hugepage_vma_check() in more places,
for example, the page fault path, so this flag must be checked in
hugepage_vma_check() itself.
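
To illustrate the fallthrough, a standalone userspace model of the
khugepaged_enabled()/khugepaged_always() macros (their definitions are
visible in patch 6's diff below; this is an illustrative sketch, not
kernel code):

  #include <stdbool.h>
  #include <stdio.h>

  /* Modelled after TRANSPARENT_HUGEPAGE_FLAG and
   * TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG; both clear means "never". */
  #define THP_ALWAYS   (1UL << 0)
  #define THP_MADVISE  (1UL << 1)
  #define VM_HUGEPAGE  (1UL << 2)

  static unsigned long thp_flags;      /* "never": no bits set */

  static bool khugepaged_enabled(void)
  {
          return thp_flags & (THP_ALWAYS | THP_MADVISE);
  }

  static bool khugepaged_always(void)
  {
          return thp_flags & THP_ALWAYS;
  }

  int main(void)
  {
          unsigned long vm_flags = VM_HUGEPAGE;  /* MADV_HUGEPAGE vma */

          /* The pre-patch check alone falls through for this vma ... */
          bool old_pass = !(!(vm_flags & VM_HUGEPAGE) && !khugepaged_always());
          /* ... the added khugepaged_enabled() check rejects it first. */
          bool new_pass = khugepaged_enabled() && old_pass;

          printf("old: %d new: %d\n", old_pass, new_pass);  /* old: 1 new: 0 */
          return 0;
  }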

Reviewed-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 476d79360101..b1dab94c0f1e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -458,6 +458,9 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (shmem_file(vma->vm_file))
 		return shmem_huge_enabled(vma);
 
+	if (!khugepaged_enabled())
+		return false;
+
 	/* THP settings require madvise. */
 	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
 		return false;
-- 
2.26.3




* [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 1/7] mm: khugepaged: check THP flag in hugepage_vma_check() Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-15 23:55   ` Zach O'Keefe
  2022-06-15 17:29 ` [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate Yang Shi
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

There are a couple of places that check whether the vma size is
suitable for THP or whether an address fits; they are open coded and
duplicated.  Use transhuge_vma_suitable() to do the job by passing in
(vma->vm_end - HPAGE_PMD_SIZE).

Move the vma size check into hugepage_vma_check().  This makes
khugepaged_enter() the same as khugepaged_enter_vma().  There is just
one caller of khugepaged_enter(); replace it with khugepaged_enter_vma()
and remove khugepaged_enter().
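
To see why probing with (vma->vm_end - HPAGE_PMD_SIZE) works as a size
check, here is a standalone sketch of the open-coded range test being
replaced (assuming 2M PMDs; illustrative only, not kernel code):

  #include <stdbool.h>
  #include <stdio.h>

  #define HPAGE_PMD_SIZE  (2UL << 20)
  #define HPAGE_PMD_MASK  (~(HPAGE_PMD_SIZE - 1))

  /* The open-coded test removed from hugepage_vma_revalidate() below. */
  static bool fits(unsigned long vm_start, unsigned long vm_end,
                   unsigned long addr)
  {
          unsigned long haddr = addr & HPAGE_PMD_MASK;
          unsigned long hstart = (vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
          unsigned long hend = vm_end & HPAGE_PMD_MASK;

          return haddr >= hstart && haddr + HPAGE_PMD_SIZE <= hend;
  }

  int main(void)
  {
          /* 4M aligned vma: an aligned PMD area fits -> 1 */
          printf("%d\n", fits(0x200000, 0x600000, 0x600000 - HPAGE_PMD_SIZE));
          /* ~2M misaligned vma: no aligned PMD area fits -> 0 */
          printf("%d\n", fits(0x100000, 0x2ff000, 0x2ff000 - HPAGE_PMD_SIZE));
          return 0;
  }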

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/huge_mm.h    | 11 +++++++++++
 include/linux/khugepaged.h | 14 --------------
 mm/huge_memory.c           |  2 +-
 mm/khugepaged.c            | 19 ++++++-------------
 4 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 648cb3ce7099..8a5a8bfce0f5 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -116,6 +116,17 @@ extern struct kobj_attribute shmem_enabled_attr;
 
 extern unsigned long transparent_hugepage_flags;
 
+/*
+ * Do the below checks:
+ *   - For file vma, check if the linear page offset of vma is
+ *     HPAGE_PMD_NR aligned within the file.  The hugepage is
+ *     guaranteed to be hugepage-aligned within the file, but we must
+ *     check that the PMD-aligned addresses in the VMA map to
+ *     PMD-aligned offsets within the file, else the hugepage will
+ *     not be PMD-mappable.
+ *   - For all vmas, check if the haddr is in an aligned HPAGE_PMD_SIZE
+ *     area.
+ */
 static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
 		unsigned long addr)
 {
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 392d34c3c59a..31ca8a7f78f4 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -51,16 +51,6 @@ static inline void khugepaged_exit(struct mm_struct *mm)
 	if (test_bit(MMF_VM_HUGEPAGE, &mm->flags))
 		__khugepaged_exit(mm);
 }
-
-static inline void khugepaged_enter(struct vm_area_struct *vma,
-				   unsigned long vm_flags)
-{
-	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    khugepaged_enabled()) {
-		if (hugepage_vma_check(vma, vm_flags))
-			__khugepaged_enter(vma->vm_mm);
-	}
-}
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
@@ -68,10 +58,6 @@ static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm
 static inline void khugepaged_exit(struct mm_struct *mm)
 {
 }
-static inline void khugepaged_enter(struct vm_area_struct *vma,
-				    unsigned long vm_flags)
-{
-}
 static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
 					unsigned long vm_flags)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4f9bbb4eab23..b530462c4493 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -726,7 +726,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 		return VM_FAULT_FALLBACK;
 	if (unlikely(anon_vma_prepare(vma)))
 		return VM_FAULT_OOM;
-	khugepaged_enter(vma, vma->vm_flags);
+	khugepaged_enter_vma(vma, vma->vm_flags);
 
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm) &&
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b1dab94c0f1e..db0b334a7d1f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -450,8 +450,8 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (vma_is_dax(vma))
 		return false;
 
-	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
-				vma->vm_pgoff, HPAGE_PMD_NR))
+	/* Check alignment for file vma and size for both file and anon vma */
+	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
 		return false;
 
 	/* Enabled via shmem mount options or sysfs settings. */
@@ -512,9 +512,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    khugepaged_enabled() &&
-	    (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
-	     (vma->vm_end & HPAGE_PMD_MASK))) {
+	    khugepaged_enabled()) {
 		if (hugepage_vma_check(vma, vm_flags))
 			__khugepaged_enter(vma->vm_mm);
 	}
@@ -950,7 +948,6 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 		struct vm_area_struct **vmap)
 {
 	struct vm_area_struct *vma;
-	unsigned long hstart, hend;
 
 	if (unlikely(khugepaged_test_exit(mm)))
 		return SCAN_ANY_PROCESS;
@@ -959,9 +956,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	if (!vma)
 		return SCAN_VMA_NULL;
 
-	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
-	hend = vma->vm_end & HPAGE_PMD_MASK;
-	if (address < hstart || address + HPAGE_PMD_SIZE > hend)
+	if (!transhuge_vma_suitable(vma, address))
 		return SCAN_ADDRESS_RANGE;
 	if (!hugepage_vma_check(vma, vma->vm_flags))
 		return SCAN_VMA_CHECK;
@@ -2147,10 +2142,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 			progress++;
 			continue;
 		}
-		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
-		hend = vma->vm_end & HPAGE_PMD_MASK;
-		if (hstart >= hend)
-			goto skip;
+		hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
+		hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
 		if (khugepaged_scan.address > hend)
 			goto skip;
 		if (khugepaged_scan.address < hstart)
-- 
2.26.3




* [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 1/7] mm: khugepaged: check THP flag in hugepage_vma_check() Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-15 23:57   ` Zach O'Keefe
  2022-06-15 17:29 ` [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active() Yang Shi
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

hugepage_vma_revalidate() needs to check whether the vma is still an
anonymous vma, since the address may be unmapped and then remapped to a
file before khugepaged reacquires the mmap_lock.

The old comment is not quite helpful; elaborate on this with a better
comment.
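
The window in question, sketched as an interleaving (illustrative
only):

	/*
	 * khugepaged				another thread
	 * ----------				--------------
	 * pick anon address to collapse
	 * release mmap_lock
	 *					munmap(addr, len)
	 *					mmap(addr, ..., fd, 0)
	 * reacquire mmap_lock
	 * hugepage_vma_revalidate()
	 *   hugepage_vma_check() may still pass,
	 *   since it accepts qualified file vmas,
	 *   so the explicit anon_vma check must
	 *   reject the now-file-backed vma.
	 */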

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index db0b334a7d1f..5baa394e34c8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -960,7 +960,13 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 		return SCAN_ADDRESS_RANGE;
 	if (!hugepage_vma_check(vma, vma->vm_flags))
 		return SCAN_VMA_CHECK;
-	/* Anon VMA expected */
+	/*
+	 * Anon VMA expected, the address may be unmapped then
+	 * remapped to file after khugepaged reacquired the mmap_lock.
+	 *
+	 * hugepage_vma_check may return true for qualified file
+	 * vmas.
+	 */
 	if (!vma->anon_vma || !vma_is_anonymous(vma))
 		return SCAN_VMA_CHECK;
 	return 0;
-- 
2.26.3




* [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active()
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
                   ` (2 preceding siblings ...)
  2022-06-15 17:29 ` [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-16  0:17   ` Zach O'Keefe
  2022-06-15 17:29 ` [v4 PATCH 5/7] mm: thp: kill __transparent_hugepage_enabled() Yang Shi
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

transparent_hugepage_active() was introduced to show the THP eligibility
bit in smaps in proc; smaps is the only user.  But it actually does a
similar check to hugepage_vma_check(), which is used by khugepaged.  We
definitely don't have to maintain two similar checks, so kill
transparent_hugepage_active().

This patch also fixes the wrong behavior for VM_NO_KHUGEPAGED vmas.

Also move hugepage_vma_check() to huge_memory.c and huge_mm.h since it
is not only for khugepaged anymore.
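
Concretely, for the smaps behavior fix (an illustrative case;
VM_NO_KHUGEPAGED covers special and hugetlb mappings, per the comment
patch 5 adds below):

	/*
	 * An anonymous vma with e.g. VM_MIXEDMAP set previously went
	 * through __transparent_hugepage_enabled(), which never looked
	 * at VM_NO_KHUGEPAGED, so smaps could report "THPeligible: 1"
	 * for a vma khugepaged would refuse.  With the consolidated
	 * check, the same vma reports 0:
	 */
	seq_printf(m, "THPeligible:    %d\n",
		   hugepage_vma_check(vma, vma->vm_flags, true));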

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 fs/proc/task_mmu.c         |  2 +-
 include/linux/huge_mm.h    | 11 ++++++++-
 include/linux/khugepaged.h |  2 --
 mm/huge_memory.c           | 50 +++++++++++++++++++++++++++++++-------
 mm/khugepaged.c            | 48 +++---------------------------------
 5 files changed, 56 insertions(+), 57 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 37ccb5c9f4f8..39a40ec181e7 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -863,7 +863,7 @@ static int show_smap(struct seq_file *m, void *v)
 	__show_smap(m, &mss, false);
 
 	seq_printf(m, "THPeligible:    %d\n",
-		   transparent_hugepage_active(vma));
+		   hugepage_vma_check(vma, vma->vm_flags, true));
 
 	if (arch_pkeys_enabled())
 		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 8a5a8bfce0f5..aeb13119ee28 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -202,7 +202,9 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
 	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
 }
 
-bool transparent_hugepage_active(struct vm_area_struct *vma);
+bool hugepage_vma_check(struct vm_area_struct *vma,
+			unsigned long vm_flags,
+			bool smaps);
 
 #define transparent_hugepage_use_zero_page()				\
 	(transparent_hugepage_flags &					\
@@ -368,6 +370,13 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
 	return false;
 }
 
+static inline bool hugepage_vma_check(struct vm_area_struct *vma,
+				       unsigned long vm_flags,
+				       bool smaps)
+{
+	return false;
+}
+
 static inline void prep_transhuge_page(struct page *page) {}
 
 #define transparent_hugepage_flags 0UL
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 31ca8a7f78f4..ea5fd4c398f7 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -10,8 +10,6 @@ extern struct attribute_group khugepaged_attr_group;
 extern int khugepaged_init(void);
 extern void khugepaged_destroy(void);
 extern int start_stop_khugepaged(void);
-extern bool hugepage_vma_check(struct vm_area_struct *vma,
-			       unsigned long vm_flags);
 extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter_vma(struct vm_area_struct *vma,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b530462c4493..a28c6100b491 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -69,21 +69,53 @@ static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
 unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
-bool transparent_hugepage_active(struct vm_area_struct *vma)
+bool hugepage_vma_check(struct vm_area_struct *vma,
+			unsigned long vm_flags,
+			bool smaps)
 {
-	/* The addr is used to check if the vma size fits */
-	unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
+	if (!transhuge_vma_enabled(vma, vm_flags))
+		return false;
+
+	if (vm_flags & VM_NO_KHUGEPAGED)
+		return false;
+
+	/* Don't run khugepaged against DAX vma */
+	if (vma_is_dax(vma))
+		return false;
 
-	if (!transhuge_vma_suitable(vma, addr))
+	/* Check alignment for file vma and size for both file and anon vma */
+	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
 		return false;
-	if (vma_is_anonymous(vma))
-		return __transparent_hugepage_enabled(vma);
-	if (vma_is_shmem(vma))
+
+	/* Enabled via shmem mount options or sysfs settings. */
+	if (shmem_file(vma->vm_file))
 		return shmem_huge_enabled(vma);
-	if (transhuge_vma_enabled(vma, vma->vm_flags) && file_thp_enabled(vma))
+
+	if (!khugepaged_enabled())
+		return false;
+
+	/* THP settings require madvise. */
+	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
+		return false;
+
+	/* Only regular file is valid */
+	if (file_thp_enabled(vma))
 		return true;
 
-	return false;
+	if (!vma_is_anonymous(vma))
+		return false;
+
+	if (vma_is_temporary_stack(vma))
+		return false;
+
+	/*
+	 * THPeligible bit of smaps should show 1 for proper VMAs even
+	 * though anon_vma is not initialized yet.
+	 */
+	if (!vma->anon_vma)
+		return smaps;
+
+	return true;
 }
 
 static bool get_huge_zero_page(void)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5baa394e34c8..3afd87f8c0b1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -437,46 +437,6 @@ static inline int khugepaged_test_exit(struct mm_struct *mm)
 	return atomic_read(&mm->mm_users) == 0;
 }
 
-bool hugepage_vma_check(struct vm_area_struct *vma,
-			unsigned long vm_flags)
-{
-	if (!transhuge_vma_enabled(vma, vm_flags))
-		return false;
-
-	if (vm_flags & VM_NO_KHUGEPAGED)
-		return false;
-
-	/* Don't run khugepaged against DAX vma */
-	if (vma_is_dax(vma))
-		return false;
-
-	/* Check alignment for file vma and size for both file and anon vma */
-	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
-		return false;
-
-	/* Enabled via shmem mount options or sysfs settings. */
-	if (shmem_file(vma->vm_file))
-		return shmem_huge_enabled(vma);
-
-	if (!khugepaged_enabled())
-		return false;
-
-	/* THP settings require madvise. */
-	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
-		return false;
-
-	/* Only regular file is valid */
-	if (file_thp_enabled(vma))
-		return true;
-
-	if (!vma->anon_vma || !vma_is_anonymous(vma))
-		return false;
-	if (vma_is_temporary_stack(vma))
-		return false;
-
-	return true;
-}
-
 void __khugepaged_enter(struct mm_struct *mm)
 {
 	struct mm_slot *mm_slot;
@@ -513,7 +473,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
 	    khugepaged_enabled()) {
-		if (hugepage_vma_check(vma, vm_flags))
+		if (hugepage_vma_check(vma, vm_flags, false))
 			__khugepaged_enter(vma->vm_mm);
 	}
 }
@@ -958,7 +918,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 
 	if (!transhuge_vma_suitable(vma, address))
 		return SCAN_ADDRESS_RANGE;
-	if (!hugepage_vma_check(vma, vma->vm_flags))
+	if (!hugepage_vma_check(vma, vma->vm_flags, false))
 		return SCAN_VMA_CHECK;
 	/*
 	 * Anon VMA expected, the address may be unmapped then
@@ -1448,7 +1408,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	 * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
 	 * will not fail the vma for missing VM_HUGEPAGE
 	 */
-	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
+	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false))
 		return;
 
 	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
@@ -2143,7 +2103,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 			progress++;
 			break;
 		}
-		if (!hugepage_vma_check(vma, vma->vm_flags)) {
+		if (!hugepage_vma_check(vma, vma->vm_flags, false)) {
 skip:
 			progress++;
 			continue;
-- 
2.26.3




* [v4 PATCH 5/7] mm: thp: kill __transparent_hugepage_enabled()
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
                   ` (3 preceding siblings ...)
  2022-06-15 17:29 ` [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active() Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-16  0:56   ` Zach O'Keefe
  2022-06-15 17:29 ` [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers Yang Shi
  2022-06-15 17:29 ` [v4 PATCH 7/7] doc: proc: fix the description to THPeligible Yang Shi
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

The page fault path checks THP eligibility with
__transparent_hugepage_enabled(), which does a similar thing to
hugepage_vma_check(), so use hugepage_vma_check() instead.

However, the page fault path allows DAX and !anon_vma cases, so a new
flag, in_pf, is added to hugepage_vma_check() to make page faults work
correctly.

The in_pf flag is also used to skip shmem and file THP for page faults,
since shmem handles THP in its own shmem_fault() and file THP allocation
on fault is not supported yet.

Also remove transhuge_vma_enabled() since hugepage_vma_check() is its
only caller now; it is not necessary to have a helper function.
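
For quick reference, what in_pf changes in the hunks below (a
compressed view of the new checks, not a separate implementation):

	vma_is_dax(vma)             -> return in_pf;       /* fault-only */
	transhuge_vma_suitable(...) -> skipped when in_pf  /* huge fault
	                               handlers check it themselves */
	shmem_file(vma->vm_file)    -> skipped when in_pf  /* shmem_fault() */
	file_thp_enabled(vma)       -> skipped when in_pf  /* no fault
	                               allocation for file THP yet */
	!vma->anon_vma              -> return smaps || in_pf;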

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 fs/proc/task_mmu.c      |  2 +-
 include/linux/huge_mm.h | 57 ++---------------------------------------
 mm/huge_memory.c        | 51 ++++++++++++++++++++++++++++--------
 mm/khugepaged.c         |  8 +++---
 mm/memory.c             |  7 +++--
 5 files changed, 52 insertions(+), 73 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 39a40ec181e7..cef72e49acc5 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -863,7 +863,7 @@ static int show_smap(struct seq_file *m, void *v)
 	__show_smap(m, &mss, false);
 
 	seq_printf(m, "THPeligible:    %d\n",
-		   hugepage_vma_check(vma, vma->vm_flags, true));
+		   hugepage_vma_check(vma, vma->vm_flags, true, false));
 
 	if (arch_pkeys_enabled())
 		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index aeb13119ee28..9d97d7ee6234 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -146,48 +146,6 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
 	return true;
 }
 
-static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
-					  unsigned long vm_flags)
-{
-	/* Explicitly disabled through madvise. */
-	if ((vm_flags & VM_NOHUGEPAGE) ||
-	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
-		return false;
-	return true;
-}
-
-/*
- * to be used on vmas which are known to support THP.
- * Use transparent_hugepage_active otherwise
- */
-static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
-{
-
-	/*
-	 * If the hardware/firmware marked hugepage support disabled.
-	 */
-	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
-		return false;
-
-	if (!transhuge_vma_enabled(vma, vma->vm_flags))
-		return false;
-
-	if (vma_is_temporary_stack(vma))
-		return false;
-
-	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
-		return true;
-
-	if (vma_is_dax(vma))
-		return true;
-
-	if (transparent_hugepage_flags &
-				(1 << TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG))
-		return !!(vma->vm_flags & VM_HUGEPAGE);
-
-	return false;
-}
-
 static inline bool file_thp_enabled(struct vm_area_struct *vma)
 {
 	struct inode *inode;
@@ -204,7 +162,7 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
 
 bool hugepage_vma_check(struct vm_area_struct *vma,
 			unsigned long vm_flags,
-			bool smaps);
+			bool smaps, bool in_pf);
 
 #define transparent_hugepage_use_zero_page()				\
 	(transparent_hugepage_flags &					\
@@ -348,11 +306,6 @@ static inline bool folio_test_pmd_mappable(struct folio *folio)
 	return false;
 }
 
-static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
-{
-	return false;
-}
-
 static inline bool transparent_hugepage_active(struct vm_area_struct *vma)
 {
 	return false;
@@ -364,15 +317,9 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
 	return false;
 }
 
-static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
-					  unsigned long vm_flags)
-{
-	return false;
-}
-
 static inline bool hugepage_vma_check(struct vm_area_struct *vma,
 				       unsigned long vm_flags,
-				       bool smaps)
+				       bool smaps, bool in_pf)
 {
 	return false;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a28c6100b491..d0c37d99917b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -71,24 +71,50 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
 bool hugepage_vma_check(struct vm_area_struct *vma,
 			unsigned long vm_flags,
-			bool smaps)
+			bool smaps, bool in_pf)
 {
-	if (!transhuge_vma_enabled(vma, vm_flags))
+	/*
+	 * Explicitly disabled through madvise or prctl, or some
+	 * architectures may disable THP for some mappings, for
+	 * example, s390 kvm.
+	 */
+	if ((vm_flags & VM_NOHUGEPAGE) ||
+	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
 		return false;
-
-	if (vm_flags & VM_NO_KHUGEPAGED)
+	/*
+	 * If the hardware/firmware marked hugepage support disabled.
+	 */
+	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
 		return false;
 
-	/* Don't run khugepaged against DAX vma */
+	/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
 	if (vma_is_dax(vma))
+		return in_pf;
+
+	/*
+	 * Special VMA and hugetlb VMA.
+	 * Must be checked after dax since some dax mappings may have
+	 * VM_MIXEDMAP set.
+	 */
+	if (vm_flags & VM_NO_KHUGEPAGED)
 		return false;
 
-	/* Check alignment for file vma and size for both file and anon vma */
-	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
+	/*
+	 * Check alignment for file vma and size for both file and anon vma.
+	 *
+	 * Skip the check for page fault. Huge fault does the check in fault
+	 * handlers. And this check is not suitable for huge PUD fault.
+	 */
+	if (!in_pf &&
+	    !transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
 		return false;
 
-	/* Enabled via shmem mount options or sysfs settings. */
-	if (shmem_file(vma->vm_file))
+	/*
+	 * Enabled via shmem mount options or sysfs settings.
+	 * Must be done before hugepage flags check since shmem has its
+	 * own flags.
+	 */
+	if (!in_pf && shmem_file(vma->vm_file))
 		return shmem_huge_enabled(vma);
 
 	if (!khugepaged_enabled())
@@ -99,7 +125,7 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 
 	/* Only regular file is valid */
-	if (file_thp_enabled(vma))
+	if (!in_pf && file_thp_enabled(vma))
 		return true;
 
 	if (!vma_is_anonymous(vma))
@@ -111,9 +137,12 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
 	/*
 	 * THPeligible bit of smaps should show 1 for proper VMAs even
 	 * though anon_vma is not initialized yet.
+	 *
+	 * Allow page fault since anon_vma may be not initialized until
+	 * the first page fault.
 	 */
 	if (!vma->anon_vma)
-		return smaps;
+		return (smaps || in_pf);
 
 	return true;
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 3afd87f8c0b1..2a676f37c921 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -473,7 +473,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
 	    khugepaged_enabled()) {
-		if (hugepage_vma_check(vma, vm_flags, false))
+		if (hugepage_vma_check(vma, vm_flags, false, false))
 			__khugepaged_enter(vma->vm_mm);
 	}
 }
@@ -918,7 +918,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 
 	if (!transhuge_vma_suitable(vma, address))
 		return SCAN_ADDRESS_RANGE;
-	if (!hugepage_vma_check(vma, vma->vm_flags, false))
+	if (!hugepage_vma_check(vma, vma->vm_flags, false, false))
 		return SCAN_VMA_CHECK;
 	/*
 	 * Anon VMA expected, the address may be unmapped then
@@ -1408,7 +1408,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	 * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
 	 * will not fail the vma for missing VM_HUGEPAGE
 	 */
-	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false))
+	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false, false))
 		return;
 
 	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
@@ -2103,7 +2103,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 			progress++;
 			break;
 		}
-		if (!hugepage_vma_check(vma, vma->vm_flags, false)) {
+		if (!hugepage_vma_check(vma, vma->vm_flags, false, false)) {
 skip:
 			progress++;
 			continue;
diff --git a/mm/memory.c b/mm/memory.c
index be724238a9d3..fee2884481f2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4985,6 +4985,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		.gfp_mask = __get_fault_gfp_mask(vma),
 	};
 	struct mm_struct *mm = vma->vm_mm;
+	unsigned long vm_flags = vma->vm_flags;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	vm_fault_t ret;
@@ -4998,7 +4999,8 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	if (!vmf.pud)
 		return VM_FAULT_OOM;
 retry_pud:
-	if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
+	if (pud_none(*vmf.pud) &&
+	    hugepage_vma_check(vma, vm_flags, false, true)) {
 		ret = create_huge_pud(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))
 			return ret;
@@ -5031,7 +5033,8 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	if (pud_trans_unstable(vmf.pud))
 		goto retry_pud;
 
-	if (pmd_none(*vmf.pmd) && __transparent_hugepage_enabled(vma)) {
+	if (pmd_none(*vmf.pmd) &&
+	    hugepage_vma_check(vma, vm_flags, false, true)) {
 		ret = create_huge_pmd(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))
 			return ret;
-- 
2.26.3




* [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
                   ` (4 preceding siblings ...)
  2022-06-15 17:29 ` [v4 PATCH 5/7] mm: thp: kill __transparent_hugepage_enabled() Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-16  0:27   ` Zach O'Keefe
  2022-06-15 17:29 ` [v4 PATCH 7/7] doc: proc: fix the description to THPeligible Yang Shi
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

The khugepaged_{enabled|always|req_madv} macros are not khugepaged-only
anymore; move them to huge_mm.h, rename them to hugepage_flags_xxx, and
remove khugepaged_req_madv since it has no users.

Also move khugepaged_defrag() to khugepaged.c since its only caller is in
that file; it doesn't have to be in a header file.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/huge_mm.h    |  8 ++++++++
 include/linux/khugepaged.h | 14 --------------
 mm/huge_memory.c           |  4 ++--
 mm/khugepaged.c            | 18 +++++++++++-------
 4 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 9d97d7ee6234..4cf546af7d97 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -116,6 +116,14 @@ extern struct kobj_attribute shmem_enabled_attr;
 
 extern unsigned long transparent_hugepage_flags;
 
+#define hugepage_flags_enabled()					       \
+	(transparent_hugepage_flags &				       \
+	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |		       \
+	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)))
+#define hugepage_flags_always()				\
+	(transparent_hugepage_flags &			\
+	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
+
 /*
  * Do the below checks:
  *   - For file vma, check if the linear page offset of vma is
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index ea5fd4c398f7..384f034ae947 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -24,20 +24,6 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 }
 #endif
 
-#define khugepaged_enabled()					       \
-	(transparent_hugepage_flags &				       \
-	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |		       \
-	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)))
-#define khugepaged_always()				\
-	(transparent_hugepage_flags &			\
-	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
-#define khugepaged_req_madv()					\
-	(transparent_hugepage_flags &				\
-	 (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG))
-#define khugepaged_defrag()					\
-	(transparent_hugepage_flags &				\
-	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
-
 static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
 	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d0c37d99917b..0f2cce2d7041 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -117,11 +117,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (!in_pf && shmem_file(vma->vm_file))
 		return shmem_huge_enabled(vma);
 
-	if (!khugepaged_enabled())
+	if (!hugepage_flags_enabled())
 		return false;
 
 	/* THP settings require madvise. */
-	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
+	if (!(vm_flags & VM_HUGEPAGE) && !hugepage_flags_always())
 		return false;
 
 	/* Only regular file is valid */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2a676f37c921..d8ebb60aae36 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -472,7 +472,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    khugepaged_enabled()) {
+	    hugepage_flags_enabled()) {
 		if (hugepage_vma_check(vma, vm_flags, false, false))
 			__khugepaged_enter(vma->vm_mm);
 	}
@@ -763,6 +763,10 @@ static bool khugepaged_scan_abort(int nid)
 	return false;
 }
 
+#define khugepaged_defrag()					\
+	(transparent_hugepage_flags &				\
+	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
+
 /* Defrag for khugepaged will enter direct reclaim/compaction if necessary */
 static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void)
 {
@@ -860,7 +864,7 @@ static struct page *khugepaged_alloc_hugepage(bool *wait)
 			khugepaged_alloc_sleep();
 		} else
 			count_vm_event(THP_COLLAPSE_ALLOC);
-	} while (unlikely(!hpage) && likely(khugepaged_enabled()));
+	} while (unlikely(!hpage) && likely(hugepage_flags_enabled()));
 
 	return hpage;
 }
@@ -2186,7 +2190,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 static int khugepaged_has_work(void)
 {
 	return !list_empty(&khugepaged_scan.mm_head) &&
-		khugepaged_enabled();
+		hugepage_flags_enabled();
 }
 
 static int khugepaged_wait_event(void)
@@ -2251,7 +2255,7 @@ static void khugepaged_wait_work(void)
 		return;
 	}
 
-	if (khugepaged_enabled())
+	if (hugepage_flags_enabled())
 		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
 }
 
@@ -2282,7 +2286,7 @@ static void set_recommended_min_free_kbytes(void)
 	int nr_zones = 0;
 	unsigned long recommended_min;
 
-	if (!khugepaged_enabled()) {
+	if (!hugepage_flags_enabled()) {
 		calculate_min_free_kbytes();
 		goto update_wmarks;
 	}
@@ -2332,7 +2336,7 @@ int start_stop_khugepaged(void)
 	int err = 0;
 
 	mutex_lock(&khugepaged_mutex);
-	if (khugepaged_enabled()) {
+	if (hugepage_flags_enabled()) {
 		if (!khugepaged_thread)
 			khugepaged_thread = kthread_run(khugepaged, NULL,
 							"khugepaged");
@@ -2358,7 +2362,7 @@ int start_stop_khugepaged(void)
 void khugepaged_min_free_kbytes_update(void)
 {
 	mutex_lock(&khugepaged_mutex);
-	if (khugepaged_enabled() && khugepaged_thread)
+	if (hugepage_flags_enabled() && khugepaged_thread)
 		set_recommended_min_free_kbytes();
 	mutex_unlock(&khugepaged_mutex);
 }
-- 
2.26.3




* [v4 PATCH 7/7] doc: proc: fix the description to THPeligible
  2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
                   ` (5 preceding siblings ...)
  2022-06-15 17:29 ` [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers Yang Shi
@ 2022-06-15 17:29 ` Yang Shi
  2022-06-16  0:20   ` Zach O'Keefe
  6 siblings, 1 reply; 15+ messages in thread
From: Yang Shi @ 2022-06-15 17:29 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, willy, zokeefe, linmiaohe, akpm
  Cc: shy828301, linux-mm, linux-kernel

The THPeligible bit shows 1 if and only if the VMA is eligible for
allocating THP and the THP is also PMD mappable.  Some misaligned file
VMAs may be eligible for allocating THP but the THP can't be mapped by a
PMD.  Make this more explicit to avoid ambiguity.
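
An illustrative misaligned case (assuming 4K pages and 2M PMDs, so
PAGE_SHIFT == 12 and HPAGE_PMD_NR == 512; the numbers are
hypothetical):

	/* file mapping at a PMD-aligned address, but 4K into the file */
	vma->vm_start = 0x200000;	/* 0x200000 >> PAGE_SHIFT == 0x200 */
	vma->vm_pgoff = 1;
	/*
	 * (vm_start >> PAGE_SHIFT) - vm_pgoff == 0x1ff, which is not a
	 * multiple of HPAGE_PMD_NR, so huge pages in the page cache
	 * cannot be PMD-mapped into this vma; THPeligible must read 0
	 * even though THP allocation itself is allowed.
	 */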

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 Documentation/filesystems/proc.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 1bc91fb8c321..a5e41e636a1a 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -514,8 +514,10 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
 "SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
 does not take into account swapped out page of underlying shmem objects.
 "Locked" indicates whether the mapping is locked in memory or not.
+
 "THPeligible" indicates whether the mapping is eligible for allocating THP
-pages - 1 if true, 0 otherwise. It just shows the current status.
+pages and whether the THP is PMD mappable - 1 if true, 0 otherwise.
+It just shows the current status.
 
 "VmFlags" field deserves a separate description. This member represents the
 kernel flags associated with the particular virtual memory area in two letter
-- 
2.26.3




* Re: [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable
  2022-06-15 17:29 ` [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable Yang Shi
@ 2022-06-15 23:55   ` Zach O'Keefe
  0 siblings, 0 replies; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-15 23:55 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> There are a couple of places that check whether the vma size is
> suitable for THP or whether an address fits; they are open coded and
> duplicated.  Use transhuge_vma_suitable() to do the job by passing in
> (vma->vm_end - HPAGE_PMD_SIZE).
> 
> Move the vma size check into hugepage_vma_check().  This makes
> khugepaged_enter() the same as khugepaged_enter_vma().  There is just
> one caller of khugepaged_enter(); replace it with khugepaged_enter_vma()
> and remove khugepaged_enter().
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  include/linux/huge_mm.h    | 11 +++++++++++
>  include/linux/khugepaged.h | 14 --------------
>  mm/huge_memory.c           |  2 +-
>  mm/khugepaged.c            | 19 ++++++-------------
>  4 files changed, 18 insertions(+), 28 deletions(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 648cb3ce7099..8a5a8bfce0f5 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -116,6 +116,17 @@ extern struct kobj_attribute shmem_enabled_attr;
>  
>  extern unsigned long transparent_hugepage_flags;
>  
> +/*
> + * Do the below checks:
> + *   - For file vma, check if the linear page offset of vma is
> + *     HPAGE_PMD_NR aligned within the file.  The hugepage is
> + *     guaranteed to be hugepage-aligned within the file, but we must
> + *     check that the PMD-aligned addresses in the VMA map to
> + *     PMD-aligned offsets within the file, else the hugepage will
> + *     not be PMD-mappable.
> + *   - For all vmas, check if the haddr is in an aligned HPAGE_PMD_SIZE
> + *     area.
> + */
>  static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
>  		unsigned long addr)
>  {
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index 392d34c3c59a..31ca8a7f78f4 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -51,16 +51,6 @@ static inline void khugepaged_exit(struct mm_struct *mm)
>  	if (test_bit(MMF_VM_HUGEPAGE, &mm->flags))
>  		__khugepaged_exit(mm);
>  }
> -
> -static inline void khugepaged_enter(struct vm_area_struct *vma,
> -				   unsigned long vm_flags)
> -{
> -	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> -	    khugepaged_enabled()) {
> -		if (hugepage_vma_check(vma, vm_flags))
> -			__khugepaged_enter(vma->vm_mm);
> -	}
> -}
>  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
>  static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
>  {
> @@ -68,10 +58,6 @@ static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm
>  static inline void khugepaged_exit(struct mm_struct *mm)
>  {
>  }
> -static inline void khugepaged_enter(struct vm_area_struct *vma,
> -				    unsigned long vm_flags)
> -{
> -}
>  static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
>  					unsigned long vm_flags)
>  {
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4f9bbb4eab23..b530462c4493 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -726,7 +726,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  		return VM_FAULT_FALLBACK;
>  	if (unlikely(anon_vma_prepare(vma)))
>  		return VM_FAULT_OOM;
> -	khugepaged_enter(vma, vma->vm_flags);
> +	khugepaged_enter_vma(vma, vma->vm_flags);
>  
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>  			!mm_forbids_zeropage(vma->vm_mm) &&
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b1dab94c0f1e..db0b334a7d1f 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -450,8 +450,8 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
>  	if (vma_is_dax(vma))
>  		return false;
>  
> -	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
> -				vma->vm_pgoff, HPAGE_PMD_NR))
> +	/* Check alignment for file vma and size for both file and anon vma */
> +	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
>  		return false;
>  
>  	/* Enabled via shmem mount options or sysfs settings. */
> @@ -512,9 +512,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>  			  unsigned long vm_flags)
>  {
>  	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> -	    khugepaged_enabled() &&
> -	    (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
> -	     (vma->vm_end & HPAGE_PMD_MASK))) {
> +	    khugepaged_enabled()) {
>  		if (hugepage_vma_check(vma, vm_flags))
>  			__khugepaged_enter(vma->vm_mm);
>  	}
> @@ -950,7 +948,6 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  		struct vm_area_struct **vmap)
>  {
>  	struct vm_area_struct *vma;
> -	unsigned long hstart, hend;
>  
>  	if (unlikely(khugepaged_test_exit(mm)))
>  		return SCAN_ANY_PROCESS;
> @@ -959,9 +956,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  	if (!vma)
>  		return SCAN_VMA_NULL;
>  
> -	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> -	hend = vma->vm_end & HPAGE_PMD_MASK;
> -	if (address < hstart || address + HPAGE_PMD_SIZE > hend)
> +	if (!transhuge_vma_suitable(vma, address))
>  		return SCAN_ADDRESS_RANGE;
>  	if (!hugepage_vma_check(vma, vma->vm_flags))
>  		return SCAN_VMA_CHECK;
> @@ -2147,10 +2142,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>  			progress++;
>  			continue;
>  		}
> -		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> -		hend = vma->vm_end & HPAGE_PMD_MASK;
> -		if (hstart >= hend)
> -			goto skip;
> +		hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
> +		hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
>  		if (khugepaged_scan.address > hend)
>  			goto skip;
>  		if (khugepaged_scan.address < hstart)
> -- 
> 2.26.3
> 

Thanks for the patches, Yang.

Reviewed-by: Zach O'Keefe <zokeefe@google.com>



* Re: [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
  2022-06-15 17:29 ` [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate Yang Shi
@ 2022-06-15 23:57   ` Zach O'Keefe
  0 siblings, 0 replies; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-15 23:57 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> hugepage_vma_revalidate() needs to check whether the vma is still an
> anonymous vma, since the address may be unmapped and then remapped to a
> file before khugepaged reacquires the mmap_lock.
> 
> The old comment is not quite helpful; elaborate on this with a better
> comment.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/khugepaged.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index db0b334a7d1f..5baa394e34c8 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -960,7 +960,13 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  		return SCAN_ADDRESS_RANGE;
>  	if (!hugepage_vma_check(vma, vma->vm_flags))
>  		return SCAN_VMA_CHECK;
> -	/* Anon VMA expected */
> +	/*
> +	 * Anon VMA expected, the address may be unmapped then
> +	 * remapped to file after khugepaged reacquired the mmap_lock.
> +	 *
> +	 * hugepage_vma_check may return true for qualified file
> +	 * vmas.
> +	 */
>  	if (!vma->anon_vma || !vma_is_anonymous(vma))
>  		return SCAN_VMA_CHECK;
>  	return 0;
> -- 
> 2.26.3
> 

Reviewed-by: Zach O'Keefe <zokeefe@google.com>



* Re: [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active()
  2022-06-15 17:29 ` [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active() Yang Shi
@ 2022-06-16  0:17   ` Zach O'Keefe
  2022-06-16  0:32     ` Yang Shi
  0 siblings, 1 reply; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-16  0:17 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> transparent_hugepage_active() was introduced to show the THP eligibility
> bit in smaps in proc; smaps is the only user.  But it actually does a
> similar check to hugepage_vma_check(), which is used by khugepaged.  We
> definitely don't have to maintain two similar checks, so kill
> transparent_hugepage_active().
> 
> This patch also fixes the wrong behavior for VM_NO_KHUGEPAGED vmas.
> 
> Also move hugepage_vma_check() to huge_memory.c and huge_mm.h since it
> is not only for khugepaged anymore.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  fs/proc/task_mmu.c         |  2 +-
>  include/linux/huge_mm.h    | 11 ++++++++-
>  include/linux/khugepaged.h |  2 --
>  mm/huge_memory.c           | 50 +++++++++++++++++++++++++++++++-------
>  mm/khugepaged.c            | 48 +++---------------------------------
>  5 files changed, 56 insertions(+), 57 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 37ccb5c9f4f8..39a40ec181e7 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -863,7 +863,7 @@ static int show_smap(struct seq_file *m, void *v)
>  	__show_smap(m, &mss, false);
>  
>  	seq_printf(m, "THPeligible:    %d\n",
> -		   transparent_hugepage_active(vma));
> +		   hugepage_vma_check(vma, vma->vm_flags, true));
>  
>  	if (arch_pkeys_enabled())
>  		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 8a5a8bfce0f5..aeb13119ee28 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -202,7 +202,9 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
>  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
>  }
>  
> -bool transparent_hugepage_active(struct vm_area_struct *vma);

Like v3, I think this also needs to be deleted from !CONFIG_TRANSPARENT_HUGEPAGE

Otherwise can add

Reviewed-by: Zach O'Keefe <zokeefe@google.com>

> +bool hugepage_vma_check(struct vm_area_struct *vma,
> +			unsigned long vm_flags,
> +			bool smaps);
>  
>  #define transparent_hugepage_use_zero_page()				\
>  	(transparent_hugepage_flags &					\
> @@ -368,6 +370,13 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
>  	return false;
>  }
>  
> +static inline bool hugepage_vma_check(struct vm_area_struct *vma,
> +				       unsigned long vm_flags,
> +				       bool smaps)
> +{
> +	return false;
> +}
> +
>  static inline void prep_transhuge_page(struct page *page) {}
>  
>  #define transparent_hugepage_flags 0UL
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index 31ca8a7f78f4..ea5fd4c398f7 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -10,8 +10,6 @@ extern struct attribute_group khugepaged_attr_group;
>  extern int khugepaged_init(void);
>  extern void khugepaged_destroy(void);
>  extern int start_stop_khugepaged(void);
> -extern bool hugepage_vma_check(struct vm_area_struct *vma,
> -			       unsigned long vm_flags);
>  extern void __khugepaged_enter(struct mm_struct *mm);
>  extern void __khugepaged_exit(struct mm_struct *mm);
>  extern void khugepaged_enter_vma(struct vm_area_struct *vma,
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index b530462c4493..a28c6100b491 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -69,21 +69,53 @@ static atomic_t huge_zero_refcount;
>  struct page *huge_zero_page __read_mostly;
>  unsigned long huge_zero_pfn __read_mostly = ~0UL;
>  
> -bool transparent_hugepage_active(struct vm_area_struct *vma)
> +bool hugepage_vma_check(struct vm_area_struct *vma,
> +			unsigned long vm_flags,
> +			bool smaps)
>  {
> -	/* The addr is used to check if the vma size fits */
> -	unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
> +	if (!transhuge_vma_enabled(vma, vm_flags))
> +		return false;
> +
> +	if (vm_flags & VM_NO_KHUGEPAGED)
> +		return false;
> +
> +	/* Don't run khugepaged against DAX vma */
> +	if (vma_is_dax(vma))
> +		return false;
>  
> -	if (!transhuge_vma_suitable(vma, addr))
> +	/* Check alignment for file vma and size for both file and anon vma */
> +	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
>  		return false;
> -	if (vma_is_anonymous(vma))
> -		return __transparent_hugepage_enabled(vma);
> -	if (vma_is_shmem(vma))
> +
> +	/* Enabled via shmem mount options or sysfs settings. */
> +	if (shmem_file(vma->vm_file))
>  		return shmem_huge_enabled(vma);
> -	if (transhuge_vma_enabled(vma, vma->vm_flags) && file_thp_enabled(vma))
> +
> +	if (!khugepaged_enabled())
> +		return false;
> +
> +	/* THP settings require madvise. */
> +	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
> +		return false;
> +
> +	/* Only regular file is valid */
> +	if (file_thp_enabled(vma))
>  		return true;
>  
> -	return false;
> +	if (!vma_is_anonymous(vma))
> +		return false;
> +
> +	if (vma_is_temporary_stack(vma))
> +		return false;
> +
> +	/*
> +	 * THPeligible bit of smaps should show 1 for proper VMAs even
> +	 * though anon_vma is not initialized yet.
> +	 */
> +	if (!vma->anon_vma)
> +		return smaps;
> +
> +	return true;
>  }
>  
>  static bool get_huge_zero_page(void)
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 5baa394e34c8..3afd87f8c0b1 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -437,46 +437,6 @@ static inline int khugepaged_test_exit(struct mm_struct *mm)
>  	return atomic_read(&mm->mm_users) == 0;
>  }
>  
> -bool hugepage_vma_check(struct vm_area_struct *vma,
> -			unsigned long vm_flags)
> -{
> -	if (!transhuge_vma_enabled(vma, vm_flags))
> -		return false;
> -
> -	if (vm_flags & VM_NO_KHUGEPAGED)
> -		return false;
> -
> -	/* Don't run khugepaged against DAX vma */
> -	if (vma_is_dax(vma))
> -		return false;
> -
> -	/* Check alignment for file vma and size for both file and anon vma */
> -	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
> -		return false;
> -
> -	/* Enabled via shmem mount options or sysfs settings. */
> -	if (shmem_file(vma->vm_file))
> -		return shmem_huge_enabled(vma);
> -
> -	if (!khugepaged_enabled())
> -		return false;
> -
> -	/* THP settings require madvise. */
> -	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
> -		return false;
> -
> -	/* Only regular file is valid */
> -	if (file_thp_enabled(vma))
> -		return true;
> -
> -	if (!vma->anon_vma || !vma_is_anonymous(vma))
> -		return false;
> -	if (vma_is_temporary_stack(vma))
> -		return false;
> -
> -	return true;
> -}
> -
>  void __khugepaged_enter(struct mm_struct *mm)
>  {
>  	struct mm_slot *mm_slot;
> @@ -513,7 +473,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>  {
>  	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
>  	    khugepaged_enabled()) {
> -		if (hugepage_vma_check(vma, vm_flags))
> +		if (hugepage_vma_check(vma, vm_flags, false))
>  			__khugepaged_enter(vma->vm_mm);
>  	}
>  }
> @@ -958,7 +918,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  
>  	if (!transhuge_vma_suitable(vma, address))
>  		return SCAN_ADDRESS_RANGE;
> -	if (!hugepage_vma_check(vma, vma->vm_flags))
> +	if (!hugepage_vma_check(vma, vma->vm_flags, false))
>  		return SCAN_VMA_CHECK;
>  	/*
>  	 * Anon VMA expected, the address may be unmapped then
> @@ -1448,7 +1408,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
>  	 * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
>  	 * will not fail the vma for missing VM_HUGEPAGE
>  	 */
> -	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
> +	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false))
>  		return;
>  
>  	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
> @@ -2143,7 +2103,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>  			progress++;
>  			break;
>  		}
> -		if (!hugepage_vma_check(vma, vma->vm_flags)) {
> +		if (!hugepage_vma_check(vma, vma->vm_flags, false)) {
>  skip:
>  			progress++;
>  			continue;
> -- 
> 2.26.3
> 



* Re: [v4 PATCH 7/7] doc: proc: fix the description to THPeligible
  2022-06-15 17:29 ` [v4 PATCH 7/7] doc: proc: fix the description to THPeligible Yang Shi
@ 2022-06-16  0:20   ` Zach O'Keefe
  0 siblings, 0 replies; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-16  0:20 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> The THPeligible bit shows 1 if and only if the VMA is eligible for
> allocating THP and the THP is also PMD mappable.  Some misaligned file
> VMAs may be eligible for allocating THP but the THP can't be mapped by a
> PMD.  Make this more explicit to avoid ambiguity.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  Documentation/filesystems/proc.rst | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 1bc91fb8c321..a5e41e636a1a 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -514,8 +514,10 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
>  "SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
>  does not take into account swapped out page of underlying shmem objects.
>  "Locked" indicates whether the mapping is locked in memory or not.
> +
>  "THPeligible" indicates whether the mapping is eligible for allocating THP
> -pages - 1 if true, 0 otherwise. It just shows the current status.
> +pages as well as the THP is PMD mappable or not - 1 if true, 0 otherwise.
> +It just shows the current status.
>  
>  "VmFlags" field deserves a separate description. This member represents the
>  kernel flags associated with the particular virtual memory area in two letter
> -- 
> 2.26.3
> 

Thanks for updating this!
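
For anyone who wants to see the new semantics in action, here is a
minimal userspace sketch (illustrative only, not part of this series)
that prints the THPeligible line for every mapping of the current
process; with the clarified wording, a 1 means both "eligible to
allocate THP" and "PMD mappable":

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512];
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return 1;
	/* each mapping's entry carries a "THPeligible:    0|1" line */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "THPeligible:", 12))
			fputs(line, stdout);
	fclose(f);
	return 0;
}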

Reviewed-by: Zach O'Keefe <zokeefe@google.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers
  2022-06-15 17:29 ` [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers Yang Shi
@ 2022-06-16  0:27   ` Zach O'Keefe
  0 siblings, 0 replies; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-16  0:27 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> The khugepaged_{enabled|always|req_madv} helpers are no longer
> khugepaged-only, so move them to huge_mm.h and rename them to
> hugepage_flags_xxx; khugepaged_req_madv is removed since it has no
> users left.
> 
> Also move khugepaged_defrag to khugepaged.c: its only caller is in
> that file, so it doesn't have to live in a header.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  include/linux/huge_mm.h    |  8 ++++++++
>  include/linux/khugepaged.h | 14 --------------
>  mm/huge_memory.c           |  4 ++--
>  mm/khugepaged.c            | 18 +++++++++++-------
>  4 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 9d97d7ee6234..4cf546af7d97 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -116,6 +116,14 @@ extern struct kobj_attribute shmem_enabled_attr;
>  
>  extern unsigned long transparent_hugepage_flags;
>  
> +#define hugepage_flags_enabled()					       \
> +	(transparent_hugepage_flags &				       \
> +	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |		       \
> +	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)))
> +#define hugepage_flags_always()				\
> +	(transparent_hugepage_flags &			\
> +	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
> +
>  /*
>   * Do the below checks:
>   *   - For file vma, check if the linear page offset of vma is
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index ea5fd4c398f7..384f034ae947 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -24,20 +24,6 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
>  }
>  #endif
>  
> -#define khugepaged_enabled()					       \
> -	(transparent_hugepage_flags &				       \
> -	 ((1<<TRANSPARENT_HUGEPAGE_FLAG) |		       \
> -	  (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)))
> -#define khugepaged_always()				\
> -	(transparent_hugepage_flags &			\
> -	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
> -#define khugepaged_req_madv()					\
> -	(transparent_hugepage_flags &				\
> -	 (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG))
> -#define khugepaged_defrag()					\
> -	(transparent_hugepage_flags &				\
> -	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
> -
>  static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
>  {
>  	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d0c37d99917b..0f2cce2d7041 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -117,11 +117,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
>  	if (!in_pf && shmem_file(vma->vm_file))
>  		return shmem_huge_enabled(vma);
>  
> -	if (!khugepaged_enabled())
> +	if (!hugepage_flags_enabled())
>  		return false;
>  
>  	/* THP settings require madvise. */
> -	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
> +	if (!(vm_flags & VM_HUGEPAGE) && !hugepage_flags_always())
>  		return false;
>  
>  	/* Only regular file is valid */
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 2a676f37c921..d8ebb60aae36 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -472,7 +472,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>  			  unsigned long vm_flags)
>  {
>  	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> -	    khugepaged_enabled()) {
> +	    hugepage_flags_enabled()) {
>  		if (hugepage_vma_check(vma, vm_flags, false, false))
>  			__khugepaged_enter(vma->vm_mm);
>  	}
> @@ -763,6 +763,10 @@ static bool khugepaged_scan_abort(int nid)
>  	return false;
>  }
>  
> +#define khugepaged_defrag()					\
> +	(transparent_hugepage_flags &				\
> +	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
> +
>  /* Defrag for khugepaged will enter direct reclaim/compaction if necessary */
>  static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void)
>  {
> @@ -860,7 +864,7 @@ static struct page *khugepaged_alloc_hugepage(bool *wait)
>  			khugepaged_alloc_sleep();
>  		} else
>  			count_vm_event(THP_COLLAPSE_ALLOC);
> -	} while (unlikely(!hpage) && likely(khugepaged_enabled()));
> +	} while (unlikely(!hpage) && likely(hugepage_flags_enabled()));
>  
>  	return hpage;
>  }
> @@ -2186,7 +2190,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>  static int khugepaged_has_work(void)
>  {
>  	return !list_empty(&khugepaged_scan.mm_head) &&
> -		khugepaged_enabled();
> +		hugepage_flags_enabled();
>  }
>  
>  static int khugepaged_wait_event(void)
> @@ -2251,7 +2255,7 @@ static void khugepaged_wait_work(void)
>  		return;
>  	}
>  
> -	if (khugepaged_enabled())
> +	if (hugepage_flags_enabled())
>  		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
>  }
>  
> @@ -2282,7 +2286,7 @@ static void set_recommended_min_free_kbytes(void)
>  	int nr_zones = 0;
>  	unsigned long recommended_min;
>  
> -	if (!khugepaged_enabled()) {
> +	if (!hugepage_flags_enabled()) {
>  		calculate_min_free_kbytes();
>  		goto update_wmarks;
>  	}
> @@ -2332,7 +2336,7 @@ int start_stop_khugepaged(void)
>  	int err = 0;
>  
>  	mutex_lock(&khugepaged_mutex);
> -	if (khugepaged_enabled()) {
> +	if (hugepage_flags_enabled()) {
>  		if (!khugepaged_thread)
>  			khugepaged_thread = kthread_run(khugepaged, NULL,
>  							"khugepaged");
> @@ -2358,7 +2362,7 @@ int start_stop_khugepaged(void)
>  void khugepaged_min_free_kbytes_update(void)
>  {
>  	mutex_lock(&khugepaged_mutex);
> -	if (khugepaged_enabled() && khugepaged_thread)
> +	if (hugepage_flags_enabled() && khugepaged_thread)
>  		set_recommended_min_free_kbytes();
>  	mutex_unlock(&khugepaged_mutex);
>  }
> -- 
> 2.26.3
> 
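
As a side note for readers: both renamed helpers still test the same
transparent_hugepage_flags bits, which track the sysfs "always" and
"madvise" modes. A self-contained userspace model of the distinction
(my sketch; the model_* names and THP_FLAG_* bits are stand-ins for
the kernel's TRANSPARENT_HUGEPAGE_* flags, here set to "madvise"
mode):

#include <stdio.h>

#define THP_FLAG_ALWAYS		(1UL << 0)	/* stand-in for TRANSPARENT_HUGEPAGE_FLAG */
#define THP_FLAG_REQ_MADV	(1UL << 1)	/* stand-in for ..._REQ_MADV_FLAG */

static unsigned long thp_flags = THP_FLAG_REQ_MADV;	/* "madvise" mode */

/* mirrors hugepage_flags_enabled(): true in "always" or "madvise" mode */
#define model_flags_enabled() \
	(thp_flags & (THP_FLAG_ALWAYS | THP_FLAG_REQ_MADV))
/* mirrors hugepage_flags_always(): true only in "always" mode */
#define model_flags_always() \
	(thp_flags & THP_FLAG_ALWAYS)

int main(void)
{
	printf("enabled=%d always=%d\n",
	       !!model_flags_enabled(), !!model_flags_always());
	return 0;	/* prints: enabled=1 always=0 */
}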

Reviewed-by: Zach O'Keefe <zokeefe@google.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active()
  2022-06-16  0:17   ` Zach O'Keefe
@ 2022-06-16  0:32     ` Yang Shi
  0 siblings, 0 replies; 15+ messages in thread
From: Yang Shi @ 2022-06-16  0:32 UTC (permalink / raw)
  To: Zach O'Keefe
  Cc: Vlastimil Babka, Kirill A. Shutemov, Matthew Wilcox, Miaohe Lin,
	Andrew Morton, Linux MM, Linux Kernel Mailing List

On Wed, Jun 15, 2022 at 5:17 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> On 15 Jun 10:29, Yang Shi wrote:
> > transparent_hugepage_active() was introduced to show the THP
> > eligibility bit in smaps in proc, and smaps is its only user.  But it
> > does much the same check as hugepage_vma_check(), which is used by
> > khugepaged.  There is no need to maintain two similar checks, so kill
> > transparent_hugepage_active().
> >
> > This patch also fixes the wrong behavior for VM_NO_KHUGEPAGED vmas.
> >
> > Also move hugepage_vma_check() to huge_memory.c and huge_mm.h since it
> > is not only for khugepaged anymore.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  fs/proc/task_mmu.c         |  2 +-
> >  include/linux/huge_mm.h    | 11 ++++++++-
> >  include/linux/khugepaged.h |  2 --
> >  mm/huge_memory.c           | 50 +++++++++++++++++++++++++++++++-------
> >  mm/khugepaged.c            | 48 +++---------------------------------
> >  5 files changed, 56 insertions(+), 57 deletions(-)
> >
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 37ccb5c9f4f8..39a40ec181e7 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -863,7 +863,7 @@ static int show_smap(struct seq_file *m, void *v)
> >       __show_smap(m, &mss, false);
> >
> >       seq_printf(m, "THPeligible:    %d\n",
> > -                transparent_hugepage_active(vma));
> > +                hugepage_vma_check(vma, vma->vm_flags, true));
> >
> >       if (arch_pkeys_enabled())
> >               seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 8a5a8bfce0f5..aeb13119ee28 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -202,7 +202,9 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
> >              !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> >  }
> >
> > -bool transparent_hugepage_active(struct vm_area_struct *vma);
>
> Like v3, I think this also needs to be deleted from !CONFIG_TRANSPARENT_HUGEPAGE

Oops, thanks for catching this. Will fix it.
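
For anyone reading along, the fixup should look roughly like this (a
sketch against this patch, not the final version; the same stub is
still visible as unchanged context in patch 5/7):

--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ (in the !CONFIG_TRANSPARENT_HUGEPAGE section)
-static inline bool transparent_hugepage_active(struct vm_area_struct *vma)
-{
-	return false;
-}
-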

>
> Otherwise can add
>
> Reviewed-by: Zach O'Keefe <zokeefe@google.com>
>
> > +bool hugepage_vma_check(struct vm_area_struct *vma,
> > +                     unsigned long vm_flags,
> > +                     bool smaps);
> >
> >  #define transparent_hugepage_use_zero_page()                         \
> >       (transparent_hugepage_flags &                                   \
> > @@ -368,6 +370,13 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
> >       return false;
> >  }
> >
> > +static inline bool hugepage_vma_check(struct vm_area_struct *vma,
> > +                                    unsigned long vm_flags,
> > +                                    bool smaps)
> > +{
> > +     return false;
> > +}
> > +
> >  static inline void prep_transhuge_page(struct page *page) {}
> >
> >  #define transparent_hugepage_flags 0UL
> > diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> > index 31ca8a7f78f4..ea5fd4c398f7 100644
> > --- a/include/linux/khugepaged.h
> > +++ b/include/linux/khugepaged.h
> > @@ -10,8 +10,6 @@ extern struct attribute_group khugepaged_attr_group;
> >  extern int khugepaged_init(void);
> >  extern void khugepaged_destroy(void);
> >  extern int start_stop_khugepaged(void);
> > -extern bool hugepage_vma_check(struct vm_area_struct *vma,
> > -                            unsigned long vm_flags);
> >  extern void __khugepaged_enter(struct mm_struct *mm);
> >  extern void __khugepaged_exit(struct mm_struct *mm);
> >  extern void khugepaged_enter_vma(struct vm_area_struct *vma,
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index b530462c4493..a28c6100b491 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -69,21 +69,53 @@ static atomic_t huge_zero_refcount;
> >  struct page *huge_zero_page __read_mostly;
> >  unsigned long huge_zero_pfn __read_mostly = ~0UL;
> >
> > -bool transparent_hugepage_active(struct vm_area_struct *vma)
> > +bool hugepage_vma_check(struct vm_area_struct *vma,
> > +                     unsigned long vm_flags,
> > +                     bool smaps)
> >  {
> > -     /* The addr is used to check if the vma size fits */
> > -     unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
> > +     if (!transhuge_vma_enabled(vma, vm_flags))
> > +             return false;
> > +
> > +     if (vm_flags & VM_NO_KHUGEPAGED)
> > +             return false;
> > +
> > +     /* Don't run khugepaged against DAX vma */
> > +     if (vma_is_dax(vma))
> > +             return false;
> >
> > -     if (!transhuge_vma_suitable(vma, addr))
> > +     /* Check alignment for file vma and size for both file and anon vma */
> > +     if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
> >               return false;
> > -     if (vma_is_anonymous(vma))
> > -             return __transparent_hugepage_enabled(vma);
> > -     if (vma_is_shmem(vma))
> > +
> > +     /* Enabled via shmem mount options or sysfs settings. */
> > +     if (shmem_file(vma->vm_file))
> >               return shmem_huge_enabled(vma);
> > -     if (transhuge_vma_enabled(vma, vma->vm_flags) && file_thp_enabled(vma))
> > +
> > +     if (!khugepaged_enabled())
> > +             return false;
> > +
> > +     /* THP settings require madvise. */
> > +     if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
> > +             return false;
> > +
> > +     /* Only regular file is valid */
> > +     if (file_thp_enabled(vma))
> >               return true;
> >
> > -     return false;
> > +     if (!vma_is_anonymous(vma))
> > +             return false;
> > +
> > +     if (vma_is_temporary_stack(vma))
> > +             return false;
> > +
> > +     /*
> > +      * THPeligible bit of smaps should show 1 for proper VMAs even
> > +      * though anon_vma is not initialized yet.
> > +      */
> > +     if (!vma->anon_vma)
> > +             return smaps;
> > +
> > +     return true;
> >  }
> >
> >  static bool get_huge_zero_page(void)
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 5baa394e34c8..3afd87f8c0b1 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -437,46 +437,6 @@ static inline int khugepaged_test_exit(struct mm_struct *mm)
> >       return atomic_read(&mm->mm_users) == 0;
> >  }
> >
> > -bool hugepage_vma_check(struct vm_area_struct *vma,
> > -                     unsigned long vm_flags)
> > -{
> > -     if (!transhuge_vma_enabled(vma, vm_flags))
> > -             return false;
> > -
> > -     if (vm_flags & VM_NO_KHUGEPAGED)
> > -             return false;
> > -
> > -     /* Don't run khugepaged against DAX vma */
> > -     if (vma_is_dax(vma))
> > -             return false;
> > -
> > -     /* Check alignment for file vma and size for both file and anon vma */
> > -     if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
> > -             return false;
> > -
> > -     /* Enabled via shmem mount options or sysfs settings. */
> > -     if (shmem_file(vma->vm_file))
> > -             return shmem_huge_enabled(vma);
> > -
> > -     if (!khugepaged_enabled())
> > -             return false;
> > -
> > -     /* THP settings require madvise. */
> > -     if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
> > -             return false;
> > -
> > -     /* Only regular file is valid */
> > -     if (file_thp_enabled(vma))
> > -             return true;
> > -
> > -     if (!vma->anon_vma || !vma_is_anonymous(vma))
> > -             return false;
> > -     if (vma_is_temporary_stack(vma))
> > -             return false;
> > -
> > -     return true;
> > -}
> > -
> >  void __khugepaged_enter(struct mm_struct *mm)
> >  {
> >       struct mm_slot *mm_slot;
> > @@ -513,7 +473,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
> >  {
> >       if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> >           khugepaged_enabled()) {
> > -             if (hugepage_vma_check(vma, vm_flags))
> > +             if (hugepage_vma_check(vma, vm_flags, false))
> >                       __khugepaged_enter(vma->vm_mm);
> >       }
> >  }
> > @@ -958,7 +918,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
> >
> >       if (!transhuge_vma_suitable(vma, address))
> >               return SCAN_ADDRESS_RANGE;
> > -     if (!hugepage_vma_check(vma, vma->vm_flags))
> > +     if (!hugepage_vma_check(vma, vma->vm_flags, false))
> >               return SCAN_VMA_CHECK;
> >       /*
> >        * Anon VMA expected, the address may be unmapped then
> > @@ -1448,7 +1408,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
> >        * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
> >        * will not fail the vma for missing VM_HUGEPAGE
> >        */
> > -     if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
> > +     if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false))
> >               return;
> >
> >       /* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
> > @@ -2143,7 +2103,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
> >                       progress++;
> >                       break;
> >               }
> > -             if (!hugepage_vma_check(vma, vma->vm_flags)) {
> > +             if (!hugepage_vma_check(vma, vma->vm_flags, false)) {
> >  skip:
> >                       progress++;
> >                       continue;
> > --
> > 2.26.3
> >


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v4 PATCH 5/7] mm: thp: kill __transhuge_page_enabled()
  2022-06-15 17:29 ` [v4 PATCH 5/7] mm: thp: kill __transhuge_page_enabled() Yang Shi
@ 2022-06-16  0:56   ` Zach O'Keefe
  0 siblings, 0 replies; 15+ messages in thread
From: Zach O'Keefe @ 2022-06-16  0:56 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, willy, linmiaohe, akpm, linux-mm, linux-kernel

On 15 Jun 10:29, Yang Shi wrote:
> The page fault path checks THP eligibility with
> __transparent_hugepage_enabled(), which does much the same thing as
> hugepage_vma_check(), so use hugepage_vma_check() instead.
> 
> However, the page fault path allows DAX and !anon_vma cases, so add a
> new flag, in_pf, to hugepage_vma_check() to make page fault work
> correctly.
> 
> The in_pf flag is also used to skip shmem and file THP for page fault
> since shmem handles THP in its own shmem_fault() and file THP
> allocation on fault is not supported yet.
> 
> Also remove transhuge_vma_enabled() since hugepage_vma_check() is its
> only caller now; there is no need to keep a separate helper.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  fs/proc/task_mmu.c      |  2 +-
>  include/linux/huge_mm.h | 57 ++---------------------------------------
>  mm/huge_memory.c        | 51 ++++++++++++++++++++++++++++--------
>  mm/khugepaged.c         |  8 +++---
>  mm/memory.c             |  7 +++--
>  5 files changed, 52 insertions(+), 73 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 39a40ec181e7..cef72e49acc5 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -863,7 +863,7 @@ static int show_smap(struct seq_file *m, void *v)
>  	__show_smap(m, &mss, false);
>  
>  	seq_printf(m, "THPeligible:    %d\n",
> -		   hugepage_vma_check(vma, vma->vm_flags, true));
> +		   hugepage_vma_check(vma, vma->vm_flags, true, false));
>  
>  	if (arch_pkeys_enabled())
>  		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index aeb13119ee28..9d97d7ee6234 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -146,48 +146,6 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
>  	return true;
>  }
>  
> -static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
> -					  unsigned long vm_flags)
> -{
> -	/* Explicitly disabled through madvise. */
> -	if ((vm_flags & VM_NOHUGEPAGE) ||
> -	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> -		return false;
> -	return true;
> -}
> -
> -/*
> - * to be used on vmas which are known to support THP.
> - * Use transparent_hugepage_active otherwise
> - */
> -static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
> -{
> -
> -	/*
> -	 * If the hardware/firmware marked hugepage support disabled.
> -	 */
> -	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
> -		return false;
> -
> -	if (!transhuge_vma_enabled(vma, vma->vm_flags))
> -		return false;
> -
> -	if (vma_is_temporary_stack(vma))
> -		return false;
> -
> -	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
> -		return true;
> -
> -	if (vma_is_dax(vma))
> -		return true;
> -
> -	if (transparent_hugepage_flags &
> -				(1 << TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG))
> -		return !!(vma->vm_flags & VM_HUGEPAGE);
> -
> -	return false;
> -}
> -
>  static inline bool file_thp_enabled(struct vm_area_struct *vma)
>  {
>  	struct inode *inode;
> @@ -204,7 +162,7 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
>  
>  bool hugepage_vma_check(struct vm_area_struct *vma,
>  			unsigned long vm_flags,
> -			bool smaps);
> +			bool smaps, bool in_pf);
>  
>  #define transparent_hugepage_use_zero_page()				\
>  	(transparent_hugepage_flags &					\
> @@ -348,11 +306,6 @@ static inline bool folio_test_pmd_mappable(struct folio *folio)
>  	return false;
>  }
>  
> -static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
> -{
> -	return false;
> -}
> -
>  static inline bool transparent_hugepage_active(struct vm_area_struct *vma)
>  {
>  	return false;
> @@ -364,15 +317,9 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
>  	return false;
>  }
>  
> -static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,
> -					  unsigned long vm_flags)
> -{
> -	return false;
> -}
> -
>  static inline bool hugepage_vma_check(struct vm_area_struct *vma,
>  				       unsigned long vm_flags,
> -				       bool smaps)
> +				       bool smaps, bool in_pf)
>  {
>  	return false;
>  }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a28c6100b491..d0c37d99917b 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -71,24 +71,50 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL;
>  
>  bool hugepage_vma_check(struct vm_area_struct *vma,
>  			unsigned long vm_flags,
> -			bool smaps)
> +			bool smaps, bool in_pf)
>  {
> -	if (!transhuge_vma_enabled(vma, vm_flags))
> +	/*
> +	 * Explicitly disabled through madvise or prctl, or some
> +	 * architectures may disable THP for some mappings, for
> +	 * example, s390 kvm.
> +	 */
> +	if ((vm_flags & VM_NOHUGEPAGE) ||
> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>  		return false;
> -
> -	if (vm_flags & VM_NO_KHUGEPAGED)
> +	/*
> +	 * If the hardware/firmware marked hugepage support disabled.
> +	 */
> +	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
>  		return false;
>  
> -	/* Don't run khugepaged against DAX vma */
> +	/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
>  	if (vma_is_dax(vma))
> +		return in_pf;
> +
> +	/*
> +	 * Special VMA and hugetlb VMA.
> +	 * Must be checked after dax since some dax mappings may have
> +	 * VM_MIXEDMAP set.
> +	 */
> +	if (vm_flags & VM_NO_KHUGEPAGED)
>  		return false;
>  
> -	/* Check alignment for file vma and size for both file and anon vma */
> -	if (!transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
> +	/*
> +	 * Check alignment for file vma and size for both file and anon vma.
> +	 *
> +	 * Skip the check for page fault. Huge fault does the check in fault
> +	 * handlers. And this check is not suitable for huge PUD fault.
> +	 */
> +	if (!in_pf &&
> +	    !transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE)))
>  		return false;

Good catch.
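
As a side note, the probe address used by the non-fault paths,
(vma->vm_end - HPAGE_PMD_SIZE), works because testing the aligned
window around the last possible PMD start is equivalent to testing
whether the VMA can hold any aligned PMD-sized region at all. A
minimal userspace model of that size check (my sketch: assumes 2MB
PMDs and vm_end >= HPAGE_PMD_SIZE; the real helper also checks file
offset alignment):

#include <stdbool.h>
#include <stdio.h>

#define HPAGE_PMD_SIZE	(2UL << 20)		/* assume 2MB PMDs (x86-64) */
#define HPAGE_PMD_MASK	(~(HPAGE_PMD_SIZE - 1))

/* simplified model of the size half of transhuge_vma_suitable() */
static bool fits_one_pmd(unsigned long vm_start, unsigned long vm_end)
{
	unsigned long addr = vm_end - HPAGE_PMD_SIZE;	/* the probe */
	unsigned long haddr = addr & HPAGE_PMD_MASK;

	return haddr >= vm_start && haddr + HPAGE_PMD_SIZE <= vm_end;
}

int main(void)
{
	/* 4MB VMA with an unaligned start: holds an aligned 2MB slot */
	printf("%d\n", fits_one_pmd(0x100000UL, 0x500000UL));	/* 1 */
	/* 1MB VMA: too small for any PMD mapping */
	printf("%d\n", fits_one_pmd(0x100000UL, 0x200000UL));	/* 0 */
	return 0;
}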

>  
> -	/* Enabled via shmem mount options or sysfs settings. */
> -	if (shmem_file(vma->vm_file))
> +	/*
> +	 * Enabled via shmem mount options or sysfs settings.
> +	 * Must be done before hugepage flags check since shmem has its
> +	 * own flags.
> +	 */
> +	if (!in_pf && shmem_file(vma->vm_file))
>  		return shmem_huge_enabled(vma);
>  
>  	if (!khugepaged_enabled())
> @@ -99,7 +125,7 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
>  		return false;
>  
>  	/* Only regular file is valid */
> -	if (file_thp_enabled(vma))
> +	if (!in_pf && file_thp_enabled(vma))
>  		return true;
>  
>  	if (!vma_is_anonymous(vma))
> @@ -111,9 +137,12 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
>  	/*
>  	 * THPeligible bit of smaps should show 1 for proper VMAs even
>  	 * though anon_vma is not initialized yet.
> +	 *
> +	 * Allow page fault since anon_vma may not be initialized until
> +	 * the first page fault.
>  	 */
>  	if (!vma->anon_vma)
> -		return smaps;
> +		return (smaps || in_pf);
>  
>  	return true;
>  }
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 3afd87f8c0b1..2a676f37c921 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -473,7 +473,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>  {
>  	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
>  	    khugepaged_enabled()) {
> -		if (hugepage_vma_check(vma, vm_flags, false))
> +		if (hugepage_vma_check(vma, vm_flags, false, false))
>  			__khugepaged_enter(vma->vm_mm);
>  	}
>  }
> @@ -918,7 +918,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  
>  	if (!transhuge_vma_suitable(vma, address))
>  		return SCAN_ADDRESS_RANGE;
> -	if (!hugepage_vma_check(vma, vma->vm_flags, false))
> +	if (!hugepage_vma_check(vma, vma->vm_flags, false, false))
>  		return SCAN_VMA_CHECK;
>  	/*
>  	 * Anon VMA expected, the address may be unmapped then
> @@ -1408,7 +1408,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
>  	 * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
>  	 * will not fail the vma for missing VM_HUGEPAGE
>  	 */
> -	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false))
> +	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE, false, false))
>  		return;
>  
>  	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
> @@ -2103,7 +2103,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>  			progress++;
>  			break;
>  		}
> -		if (!hugepage_vma_check(vma, vma->vm_flags, false)) {
> +		if (!hugepage_vma_check(vma, vma->vm_flags, false, false)) {
>  skip:
>  			progress++;
>  			continue;
> diff --git a/mm/memory.c b/mm/memory.c
> index be724238a9d3..fee2884481f2 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4985,6 +4985,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  		.gfp_mask = __get_fault_gfp_mask(vma),
>  	};
>  	struct mm_struct *mm = vma->vm_mm;
> +	unsigned long vm_flags = vma->vm_flags;
>  	pgd_t *pgd;
>  	p4d_t *p4d;
>  	vm_fault_t ret;
> @@ -4998,7 +4999,8 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  	if (!vmf.pud)
>  		return VM_FAULT_OOM;
>  retry_pud:
> -	if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
> +	if (pud_none(*vmf.pud) &&
> +	    hugepage_vma_check(vma, vm_flags, false, true)) {
>  		ret = create_huge_pud(&vmf);
>  		if (!(ret & VM_FAULT_FALLBACK))
>  			return ret;
> @@ -5031,7 +5033,8 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  	if (pud_trans_unstable(vmf.pud))
>  		goto retry_pud;
>  
> -	if (pmd_none(*vmf.pmd) && __transparent_hugepage_enabled(vma)) {
> +	if (pmd_none(*vmf.pmd) &&
> +	    hugepage_vma_check(vma, vm_flags, false, true)) {
>  		ret = create_huge_pmd(&vmf);
>  		if (!(ret & VM_FAULT_FALLBACK))
>  			return ret;
> -- 
> 2.26.3
> 

Thanks for the updates / improvements!
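
For quick reference, the (smaps, in_pf) combinations at each call
site after this patch, read off the hunks above:

	call site                        smaps  in_pf
	show_smap() THPeligible          true   false
	khugepaged paths                 false  false
	__handle_mm_fault() PUD/PMD      false  true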

Reviewed-by: Zach O'Keefe <zokeefe@google.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-06-16  0:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
2022-06-15 17:29 [mm-unstable v4 PATCH 0/7] Cleanup transhuge_xxx helpers Yang Shi
2022-06-15 17:29 ` [v4 PATCH 1/7] mm: khugepaged: check THP flag in hugepage_vma_check() Yang Shi
2022-06-15 17:29 ` [v4 PATCH 2/7] mm: thp: consolidate vma size check to transhuge_vma_suitable Yang Shi
2022-06-15 23:55   ` Zach O'Keefe
2022-06-15 17:29 ` [v4 PATCH 3/7] mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate Yang Shi
2022-06-15 23:57   ` Zach O'Keefe
2022-06-15 17:29 ` [v4 PATCH 4/7] mm: thp: kill transparent_hugepage_active() Yang Shi
2022-06-16  0:17   ` Zach O'Keefe
2022-06-16  0:32     ` Yang Shi
2022-06-15 17:29 ` [v4 PATCH 5/7] mm: thp: kill __transhuge_page_enabled() Yang Shi
2022-06-16  0:56   ` Zach O'Keefe
2022-06-15 17:29 ` [v4 PATCH 6/7] mm: khugepaged: reorg some khugepaged helpers Yang Shi
2022-06-16  0:27   ` Zach O'Keefe
2022-06-15 17:29 ` [v4 PATCH 7/7] doc: proc: fix the description to THPeligible Yang Shi
2022-06-16  0:20   ` Zach O'Keefe
