* [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent
@ 2022-05-10 20:32 Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel


Changelog
v4: * Incorporated Vlastimil's comments for patch 6/8.
    * Reworked the commit log of patch 8/8 to make what the series fixed
      clearer.
    * Rebased onto mm-unstable tree.
    * Collected the acks from Vlastimil.
v3: * Register mm to khugepaged in common mmap path instead of touching
      filesystem code (patch 8/8).
    * New patch 7/8 cleaned up and renamed khugepaged_enter_vma_merge()
      to khugepaged_enter_vma().
    * Collected acked-by from Song Liu for patch 1 ~ 6.
    * Rebased on top of 5.18-rc1.
v2: * Collected reviewed-by tags from Miaohe Lin.
    * Fixed build error for patch 4/8.

The readonly FS THP relies on khugepaged to collapse THP for suitable
vmas.  But the behavior is inconsistent for "always" mode (https://lore.kernel.org/linux-mm/00f195d4-d039-3cf2-d3a1-a2c88de397a0@suse.cz/).

The "always" mode means THP allocation should be tried all the time and
khugepaged should try to collapse THP all the time. Of course the
allocation and collapse may fail due to other factors and conditions.

Currently file THP may not be collapsed by khugepaged even though all
the conditions are met. That breaks the semantics of "always" mode.

So make sure readonly FS vmas are registered with khugepaged to fix the
breakage.

Suitable vmas are now registered in the common mmap path, which covers
both readonly FS vmas and shmem vmas, so the khugepaged calls in
shmem.c are removed.

Patches 1 ~ 7 are minor bug fixes, cleanups and preparation patches.
Patch 8 is the real meat.


Tested with the khugepaged test in selftests and the test case provided
by Vlastimil Babka in https://lore.kernel.org/lkml/df3b5d1c-a36b-2c73-3e27-99e74983de3a@suse.cz/
with the MADV_HUGEPAGE call commented out.


Yang Shi (8):
      sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
      mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
      mm: khugepaged: skip DAX vma
      mm: thp: only regular file could be THP eligible
      mm: khugepaged: make khugepaged_enter() void function
      mm: khugepaged: make hugepage_vma_check() non-static
      mm: khugepaged: introduce khugepaged_enter_vma() helper
      mm: mmap: register suitable readonly file vmas for khugepaged

 include/linux/huge_mm.h        | 14 ++++++++++++++
 include/linux/khugepaged.h     | 44 ++++++++++++++++++--------------------------
 include/linux/sched/coredump.h |  3 ++-
 kernel/fork.c                  |  4 +---
 mm/huge_memory.c               | 15 ++++-----------
 mm/khugepaged.c                | 61 ++++++++++++++++++++++++++-----------------------------------
 mm/mmap.c                      | 18 ++++++++++++------
 mm/shmem.c                     | 12 ------------
 8 files changed, 77 insertions(+), 94 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [v4 PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

MMF_VM_HUGEPAGE is set by khugepaged_enter() as long as the mm is
available to khugepaged, not only when VM_HUGEPAGE is set on a vma.
Correct the comment to avoid confusion.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/sched/coredump.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 4d9e3a656875..4d0a5be28b70 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -57,7 +57,8 @@ static inline int get_dumpable(struct mm_struct *mm)
 #endif
 					/* leave room for more dump flags */
 #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
-#define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
+#define MMF_VM_HUGEPAGE		17	/* set when mm is available for
+					   khugepaged */
 /*
  * This one-shot flag is dropped due to necessity of changing exe once again
  * on NFS restore
-- 
2.26.3



* [v4 PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

The hugepage_vma_check() called by khugepaged_enter_vma_merge() already
checks VM_NO_KHUGEPAGED.  Remove the check from the caller and move the
check in hugepage_vma_check() up.

More checks may be run for VM_NO_KHUGEPAGED vmas, but MADV_HUGEPAGE is
definitely not a hot path, so the cleaner code outweighs the cost.
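The reordering can be pictured with a tiny userspace model (the flag
values and helper name below are hypothetical, for illustration only;
the real definitions live in the kernel's mm headers): vmas that opted
out via VM_NO_KHUGEPAGED are now rejected up front, making the extra
check in the MADV_HUGEPAGE caller redundant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only. */
#define VM_NO_KHUGEPAGED (1UL << 0)
#define VM_HUGEPAGE      (1UL << 1)

/* Sketch of the reordered predicate: reject explicitly opted-out vmas
 * first, before any of the later per-vma checks would run. */
static bool hugepage_vma_check_model(unsigned long vm_flags)
{
	if (vm_flags & VM_NO_KHUGEPAGED)
		return false;
	/* ... alignment, DAX and file-type checks would follow here ... */
	return true;
}
```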

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 76c4ad60b9a9..dc8849d9dde4 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -365,8 +365,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
 		 * register it here without waiting a page fault that
 		 * may not happen any time soon.
 		 */
-		if (!(*vm_flags & VM_NO_KHUGEPAGED) &&
-				khugepaged_enter_vma_merge(vma, *vm_flags))
+		if (khugepaged_enter_vma_merge(vma, *vm_flags))
 			return -ENOMEM;
 		break;
 	case MADV_NOHUGEPAGE:
@@ -445,6 +444,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (!transhuge_vma_enabled(vma, vm_flags))
 		return false;
 
+	if (vm_flags & VM_NO_KHUGEPAGED)
+		return false;
+
 	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
 				vma->vm_pgoff, HPAGE_PMD_NR))
 		return false;
@@ -470,7 +472,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 	if (vma_is_temporary_stack(vma))
 		return false;
-	return !(vm_flags & VM_NO_KHUGEPAGED);
+
+	return true;
 }
 
 int __khugepaged_enter(struct mm_struct *mm)
-- 
2.26.3



* [v4 PATCH 3/8] mm: khugepaged: skip DAX vma
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

A DAX vma may be seen by khugepaged when the mm has other vmas suitable
for khugepaged.  So khugepaged may try to collapse THP for a DAX vma,
but it will fail due to the page sanity checks, for example, the page
is not on the LRU.

So it is not harmful, but it is definitely pointless, to run khugepaged
against a DAX vma; skip it in the early check.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index dc8849d9dde4..a2380d88c3ea 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -447,6 +447,10 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (vm_flags & VM_NO_KHUGEPAGED)
 		return false;
 
+	/* Don't run khugepaged against DAX vma */
+	if (vma_is_dax(vma))
+		return false;
+
 	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
 				vma->vm_pgoff, HPAGE_PMD_NR))
 		return false;
-- 
2.26.3



* [v4 PATCH 4/8] mm: thp: only regular file could be THP eligible
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (2 preceding siblings ...)
  2022-05-10 20:32 ` [v4 PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

Since commit a4aeaa06d45e ("mm: khugepaged: skip huge page collapse for
special files"), khugepaged only collapses THP for regular files, which
is the intended use case for readonly FS THP.  Only show regular files
as THP eligible accordingly.

Also make file_thp_enabled() available to khugepaged in order to remove
duplicate code.
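The consolidated predicate can be sketched as a userspace model (the
struct and field names below are illustrative stand-ins, not the
kernel's types): readonly FS THP only applies to an executable mapping
of a regular file that nobody holds open for write.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the vma/inode state the kernel predicate
 * consults; all names here are hypothetical. */
struct file_vma_model {
	bool has_file;        /* vma->vm_file != NULL */
	bool vm_exec;         /* vma->vm_flags & VM_EXEC */
	bool open_for_write;  /* inode_is_open_for_write(inode) */
	bool is_regular;      /* S_ISREG(inode->i_mode) */
};

static bool file_thp_enabled_model(const struct file_vma_model *v)
{
	if (!v->has_file)
		return false;
	return v->vm_exec && !v->open_for_write && v->is_regular;
}
```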

Acked-by: Song Liu <song@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/huge_mm.h | 14 ++++++++++++++
 mm/huge_memory.c        | 11 ++---------
 mm/khugepaged.c         |  9 ++-------
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fbf36bb1be22..de29821231c9 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -173,6 +173,20 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool file_thp_enabled(struct vm_area_struct *vma)
+{
+	struct inode *inode;
+
+	if (!vma->vm_file)
+		return false;
+
+	inode = vma->vm_file->f_inode;
+
+	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) &&
+	       (vma->vm_flags & VM_EXEC) &&
+	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
+}
+
 bool transparent_hugepage_active(struct vm_area_struct *vma);
 
 #define transparent_hugepage_use_zero_page()				\
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d0c26a3b3b17..82434a9d4499 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -69,13 +69,6 @@ static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
 unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
-static inline bool file_thp_enabled(struct vm_area_struct *vma)
-{
-	return transhuge_vma_enabled(vma, vma->vm_flags) && vma->vm_file &&
-	       !inode_is_open_for_write(vma->vm_file->f_inode) &&
-	       (vma->vm_flags & VM_EXEC);
-}
-
 bool transparent_hugepage_active(struct vm_area_struct *vma)
 {
 	/* The addr is used to check if the vma size fits */
@@ -87,8 +80,8 @@ bool transparent_hugepage_active(struct vm_area_struct *vma)
 		return __transparent_hugepage_enabled(vma);
 	if (vma_is_shmem(vma))
 		return shmem_huge_enabled(vma);
-	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS))
-		return file_thp_enabled(vma);
+	if (transhuge_vma_enabled(vma, vma->vm_flags) && file_thp_enabled(vma))
+		return true;
 
 	return false;
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a2380d88c3ea..c0d3215008ba 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -464,13 +464,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 
 	/* Only regular file is valid */
-	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
-	    (vm_flags & VM_EXEC)) {
-		struct inode *inode = vma->vm_file->f_inode;
-
-		return !inode_is_open_for_write(inode) &&
-			S_ISREG(inode->i_mode);
-	}
+	if (file_thp_enabled(vma))
+		return true;
 
 	if (!vma->anon_vma || !vma_is_anonymous(vma))
 		return false;
-- 
2.26.3



* [v4 PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (3 preceding siblings ...)
  2022-05-10 20:32 ` [v4 PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static Yang Shi
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

Most callers of khugepaged_enter() don't care about the return value.
Only dup_mmap(), the anonymous THP page fault path and MADV_HUGEPAGE
handle the error by returning -ENOMEM.  Actually it is not harmful for
them to ignore the error case either.  It also seems like overkill to
fail fork() and the page fault early due to a khugepaged_enter() error,
and MADV_HUGEPAGE sets the VM_HUGEPAGE flag regardless of the error.

Acked-by: Song Liu <song@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h | 30 ++++++++++++------------------
 kernel/fork.c              |  4 +---
 mm/huge_memory.c           |  4 ++--
 mm/khugepaged.c            | 18 +++++++-----------
 4 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 2fcc01891b47..0423d3619f26 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -12,10 +12,10 @@ extern struct attribute_group khugepaged_attr_group;
 extern int khugepaged_init(void);
 extern void khugepaged_destroy(void);
 extern int start_stop_khugepaged(void);
-extern int __khugepaged_enter(struct mm_struct *mm);
+extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
-extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-				      unsigned long vm_flags);
+extern void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+				       unsigned long vm_flags);
 extern void khugepaged_min_free_kbytes_update(void);
 #ifdef CONFIG_SHMEM
 extern void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr);
@@ -40,11 +40,10 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 	(transparent_hugepage_flags &				\
 	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
 
-static inline int khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
 	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
-		return __khugepaged_enter(mm);
-	return 0;
+		__khugepaged_enter(mm);
 }
 
 static inline void khugepaged_exit(struct mm_struct *mm)
@@ -53,7 +52,7 @@ static inline void khugepaged_exit(struct mm_struct *mm)
 		__khugepaged_exit(mm);
 }
 
-static inline int khugepaged_enter(struct vm_area_struct *vma,
+static inline void khugepaged_enter(struct vm_area_struct *vma,
 				   unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
@@ -62,27 +61,22 @@ static inline int khugepaged_enter(struct vm_area_struct *vma,
 		     (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
 		    !(vm_flags & VM_NOHUGEPAGE) &&
 		    !test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
-			if (__khugepaged_enter(vma->vm_mm))
-				return -ENOMEM;
-	return 0;
+			__khugepaged_enter(vma->vm_mm);
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
-static inline int khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
-	return 0;
 }
 static inline void khugepaged_exit(struct mm_struct *mm)
 {
 }
-static inline int khugepaged_enter(struct vm_area_struct *vma,
-				   unsigned long vm_flags)
+static inline void khugepaged_enter(struct vm_area_struct *vma,
+				    unsigned long vm_flags)
 {
-	return 0;
 }
-static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-					     unsigned long vm_flags)
+static inline void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+					      unsigned long vm_flags)
 {
-	return 0;
 }
 static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 					   unsigned long addr)
diff --git a/kernel/fork.c b/kernel/fork.c
index 536dc3289734..6692f5d78371 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -608,9 +608,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	retval = ksm_fork(mm, oldmm);
 	if (retval)
 		goto out;
-	retval = khugepaged_fork(mm, oldmm);
-	if (retval)
-		goto out;
+	khugepaged_fork(mm, oldmm);
 
 	retval = mas_expected_entries(&mas, oldmm->map_count);
 	if (retval)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 82434a9d4499..80e8b58b4f39 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -726,8 +726,8 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 		return VM_FAULT_FALLBACK;
 	if (unlikely(anon_vma_prepare(vma)))
 		return VM_FAULT_OOM;
-	if (unlikely(khugepaged_enter(vma, vma->vm_flags)))
-		return VM_FAULT_OOM;
+	khugepaged_enter(vma, vma->vm_flags);
+
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm) &&
 			transparent_hugepage_use_zero_page()) {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index c0d3215008ba..7815218ab960 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -365,8 +365,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
 		 * register it here without waiting a page fault that
 		 * may not happen any time soon.
 		 */
-		if (khugepaged_enter_vma_merge(vma, *vm_flags))
-			return -ENOMEM;
+		khugepaged_enter_vma_merge(vma, *vm_flags);
 		break;
 	case MADV_NOHUGEPAGE:
 		*vm_flags &= ~VM_HUGEPAGE;
@@ -475,20 +474,20 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	return true;
 }
 
-int __khugepaged_enter(struct mm_struct *mm)
+void __khugepaged_enter(struct mm_struct *mm)
 {
 	struct mm_slot *mm_slot;
 	int wakeup;
 
 	mm_slot = alloc_mm_slot();
 	if (!mm_slot)
-		return -ENOMEM;
+		return;
 
 	/* __khugepaged_exit() must not run from under us */
 	VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
 	if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
 		free_mm_slot(mm_slot);
-		return 0;
+		return;
 	}
 
 	spin_lock(&khugepaged_mm_lock);
@@ -504,11 +503,9 @@ int __khugepaged_enter(struct mm_struct *mm)
 	mmgrab(mm);
 	if (wakeup)
 		wake_up_interruptible(&khugepaged_wait);
-
-	return 0;
 }
 
-int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 			       unsigned long vm_flags)
 {
 	unsigned long hstart, hend;
@@ -519,13 +516,12 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 	 * file-private shmem THP is not supported.
 	 */
 	if (!hugepage_vma_check(vma, vm_flags))
-		return 0;
+		return;
 
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
-		return khugepaged_enter(vma, vm_flags);
-	return 0;
+		khugepaged_enter(vma, vm_flags);
 }
 
 void __khugepaged_exit(struct mm_struct *mm)
-- 
2.26.3



* [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (4 preceding siblings ...)
  2022-05-10 20:32 ` [v4 PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 21:05   ` Andrew Morton
  2022-05-10 20:32 ` [v4 PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_vma() helper Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 8/8] mm: mmap: register suitable readonly file vmas for khugepaged Yang Shi
  7 siblings, 1 reply; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

The hugepage_vma_check() could be reused by khugepaged_enter() and
khugepaged_enter_vma_merge(), but it is static in khugepaged.c.
Make it non-static and declare it in khugepaged.h.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h | 14 ++++++--------
 mm/khugepaged.c            | 25 +++++++++----------------
 2 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 0423d3619f26..c340f6bb39d6 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -3,8 +3,6 @@
 #define _LINUX_KHUGEPAGED_H
 
 #include <linux/sched/coredump.h> /* MMF_VM_HUGEPAGE */
-#include <linux/shmem_fs.h>
-
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern struct attribute_group khugepaged_attr_group;
@@ -12,6 +10,8 @@ extern struct attribute_group khugepaged_attr_group;
 extern int khugepaged_init(void);
 extern void khugepaged_destroy(void);
 extern int start_stop_khugepaged(void);
+extern bool hugepage_vma_check(struct vm_area_struct *vma,
+			       unsigned long vm_flags);
 extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
@@ -55,13 +55,11 @@ static inline void khugepaged_exit(struct mm_struct *mm)
 static inline void khugepaged_enter(struct vm_area_struct *vma,
 				   unsigned long vm_flags)
 {
-	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
-		if ((khugepaged_always() ||
-		     (shmem_file(vma->vm_file) && shmem_huge_enabled(vma)) ||
-		     (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
-		    !(vm_flags & VM_NOHUGEPAGE) &&
-		    !test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
+	    khugepaged_enabled()) {
+		if (hugepage_vma_check(vma, vm_flags))
 			__khugepaged_enter(vma->vm_mm);
+	}
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7815218ab960..dec449339964 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -437,8 +437,8 @@ static inline int khugepaged_test_exit(struct mm_struct *mm)
 	return atomic_read(&mm->mm_users) == 0;
 }
 
-static bool hugepage_vma_check(struct vm_area_struct *vma,
-			       unsigned long vm_flags)
+bool hugepage_vma_check(struct vm_area_struct *vma,
+			unsigned long vm_flags)
 {
 	if (!transhuge_vma_enabled(vma, vm_flags))
 		return false;
@@ -508,20 +508,13 @@ void __khugepaged_enter(struct mm_struct *mm)
 void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 			       unsigned long vm_flags)
 {
-	unsigned long hstart, hend;
-
-	/*
-	 * khugepaged only supports read-only files for non-shmem files.
-	 * khugepaged does not yet work on special mappings. And
-	 * file-private shmem THP is not supported.
-	 */
-	if (!hugepage_vma_check(vma, vm_flags))
-		return;
-
-	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
-	hend = vma->vm_end & HPAGE_PMD_MASK;
-	if (hstart < hend)
-		khugepaged_enter(vma, vm_flags);
+	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
+	    khugepaged_enabled() &&
+	    (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
+	     (vma->vm_end & HPAGE_PMD_MASK))) {
+		if (hugepage_vma_check(vma, vm_flags))
+			__khugepaged_enter(vma->vm_mm);
+	}
 }
 
 void __khugepaged_exit(struct mm_struct *mm)
-- 
2.26.3



* [v4 PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_vma() helper
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (5 preceding siblings ...)
  2022-05-10 20:32 ` [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  2022-05-10 20:32 ` [v4 PATCH 8/8] mm: mmap: register suitable readonly file vmas for khugepaged Yang Shi
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

The khugepaged_enter_vma_merge() does the same thing as the
khugepaged_enter() calls in shmem_mmap() and shmem_zero_setup(), so
consolidate them into one helper and rename it to
khugepaged_enter_vma().
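The hstart/hend test that both call sites were applying, and which the
consolidated helper keeps, can be exercised in userspace (assuming
2MB PMD-sized huge pages; the helper name below is illustrative): a vma
only qualifies if it contains at least one fully aligned PMD-sized
range.

```c
#include <assert.h>
#include <stdbool.h>

#define HPAGE_PMD_SIZE (2UL << 20)              /* assume 2MB huge pages */
#define HPAGE_PMD_MASK (~(HPAGE_PMD_SIZE - 1))

/* Round the start up and the end down to PMD alignment; the vma can
 * host a collapsed THP only if an aligned range survives, i.e.
 * hstart < hend. */
static bool spans_aligned_pmd(unsigned long vm_start, unsigned long vm_end)
{
	unsigned long hstart = (vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
	unsigned long hend = vm_end & HPAGE_PMD_MASK;

	return hstart < hend;
}
```

For example, a 2MB mapping starting at offset 4096 never contains an
aligned 2MB range, while a 4MB one starting there still does.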

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h |  8 ++++----
 mm/khugepaged.c            |  6 +++---
 mm/mmap.c                  | 12 ++++++------
 mm/shmem.c                 | 12 ++----------
 4 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index c340f6bb39d6..392d34c3c59a 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -14,8 +14,8 @@ extern bool hugepage_vma_check(struct vm_area_struct *vma,
 			       unsigned long vm_flags);
 extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
-extern void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-				       unsigned long vm_flags);
+extern void khugepaged_enter_vma(struct vm_area_struct *vma,
+				 unsigned long vm_flags);
 extern void khugepaged_min_free_kbytes_update(void);
 #ifdef CONFIG_SHMEM
 extern void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr);
@@ -72,8 +72,8 @@ static inline void khugepaged_enter(struct vm_area_struct *vma,
 				    unsigned long vm_flags)
 {
 }
-static inline void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-					      unsigned long vm_flags)
+static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
+					unsigned long vm_flags)
 {
 }
 static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index dec449339964..32db587c5224 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -365,7 +365,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
 		 * register it here without waiting a page fault that
 		 * may not happen any time soon.
 		 */
-		khugepaged_enter_vma_merge(vma, *vm_flags);
+		khugepaged_enter_vma(vma, *vm_flags);
 		break;
 	case MADV_NOHUGEPAGE:
 		*vm_flags &= ~VM_HUGEPAGE;
@@ -505,8 +505,8 @@ void __khugepaged_enter(struct mm_struct *mm)
 		wake_up_interruptible(&khugepaged_wait);
 }
 
-void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-			       unsigned long vm_flags)
+void khugepaged_enter_vma(struct vm_area_struct *vma,
+			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
 	    khugepaged_enabled() &&
diff --git a/mm/mmap.c b/mm/mmap.c
index 3445a8c304af..34ff1600426c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1122,7 +1122,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 					 end, prev->vm_pgoff, NULL, prev);
 		if (err)
 			return NULL;
-		khugepaged_enter_vma_merge(prev, vm_flags);
+		khugepaged_enter_vma(prev, vm_flags);
 		return prev;
 	}
 
@@ -1149,7 +1149,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 		}
 		if (err)
 			return NULL;
-		khugepaged_enter_vma_merge(area, vm_flags);
+		khugepaged_enter_vma(area, vm_flags);
 		return area;
 	}
 
@@ -2046,7 +2046,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
-	khugepaged_enter_vma_merge(vma, vma->vm_flags);
+	khugepaged_enter_vma(vma, vma->vm_flags);
 	return error;
 }
 #endif /* CONFIG_STACK_GROWSUP || CONFIG_IA64 */
@@ -2127,7 +2127,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
-	khugepaged_enter_vma_merge(vma, vma->vm_flags);
+	khugepaged_enter_vma(vma, vma->vm_flags);
 	return error;
 }
 
@@ -2635,7 +2635,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	/* Actually expand, if possible */
 	if (vma &&
 	    !vma_expand(&mas, vma, merge_start, merge_end, vm_pgoff, next)) {
-		khugepaged_enter_vma_merge(vma, vm_flags);
+		khugepaged_enter_vma(vma, vm_flags);
 		goto expanded;
 	}
 
@@ -3051,7 +3051,7 @@ static int do_brk_flags(struct ma_state *mas, struct vm_area_struct *vma,
 			anon_vma_interval_tree_post_update_vma(vma);
 			anon_vma_unlock_write(vma->anon_vma);
 		}
-		khugepaged_enter_vma_merge(vma, flags);
+		khugepaged_enter_vma(vma, flags);
 		goto out;
 	}
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 29701be579f8..89f6f4fec3f9 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2232,11 +2232,7 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-			((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
-			(vma->vm_end & HPAGE_PMD_MASK)) {
-		khugepaged_enter(vma, vma->vm_flags);
-	}
+	khugepaged_enter_vma(vma, vma->vm_flags);
 	return 0;
 }
 
@@ -4137,11 +4133,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	vma->vm_file = file;
 	vma->vm_ops = &shmem_vm_ops;
 
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-			((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
-			(vma->vm_end & HPAGE_PMD_MASK)) {
-		khugepaged_enter(vma, vma->vm_flags);
-	}
+	khugepaged_enter_vma(vma, vma->vm_flags);
 
 	return 0;
 }
-- 
2.26.3



* [v4 PATCH 8/8] mm: mmap: register suitable readonly file vmas for khugepaged
  2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (6 preceding siblings ...)
  2022-05-10 20:32 ` [v4 PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_vma() helper Yang Shi
@ 2022-05-10 20:32 ` Yang Shi
  7 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 20:32 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, akpm
  Cc: shy828301, linux-mm, linux-fsdevel, linux-kernel

The readonly FS THP relies on khugepaged to collapse THP for suitable
vmas.  But the behavior is inconsistent for "always" mode (https://lore.kernel.org/linux-mm/00f195d4-d039-3cf2-d3a1-a2c88de397a0@suse.cz/).

The "always" mode means THP allocation should be tried all the time and
khugepaged should try to collapse THP all the time. Of course the
allocation and collapse may fail due to other factors and conditions.

Currently file THP may not be collapsed by khugepaged even though all
the conditions are met. That breaks the semantics of "always" mode.

So make sure readonly FS vmas are registered with khugepaged to fix
the breakage.

Register suitable vmas in the common mmap path, which covers both
readonly FS vmas and shmem vmas, and remove the khugepaged calls from
shmem.c.

Still need to keep the khugepaged call in vma_merge() since vma_merge()
is called in a lot of places, for example, madvise, mprotect, etc.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/mmap.c  | 6 ++++++
 mm/shmem.c | 4 ----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 34ff1600426c..6d7a6c7b50bb 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2745,6 +2745,12 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		i_mmap_unlock_write(vma->vm_file->f_mapping);
 	}
 
+	/*
+	 * vma_merge() also calls khugepaged_enter_vma(); the call
+	 * below covers the non-merge case.
+	 */
+	khugepaged_enter_vma(vma, vma->vm_flags);
+
 	/* Once vma denies write, undo our temporary denial count */
 unmap_writable:
 	if (file && vm_flags & VM_SHARED)
diff --git a/mm/shmem.c b/mm/shmem.c
index 89f6f4fec3f9..67a3f3b05fb2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -34,7 +34,6 @@
 #include <linux/export.h>
 #include <linux/swap.h>
 #include <linux/uio.h>
-#include <linux/khugepaged.h>
 #include <linux/hugetlb.h>
 #include <linux/fs_parser.h>
 #include <linux/swapfile.h>
@@ -2232,7 +2231,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
-	khugepaged_enter_vma(vma, vma->vm_flags);
 	return 0;
 }
 
@@ -4133,8 +4131,6 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	vma->vm_file = file;
 	vma->vm_ops = &shmem_vm_ops;
 
-	khugepaged_enter_vma(vma, vma->vm_flags);
-
 	return 0;
 }
 
-- 
2.26.3



* Re: [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static
  2022-05-10 20:32 ` [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static Yang Shi
@ 2022-05-10 21:05   ` Andrew Morton
  2022-05-10 22:35     ` Yang Shi
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2022-05-10 21:05 UTC (permalink / raw)
  To: Yang Shi
  Cc: vbabka, kirill.shutemov, linmiaohe, songliubraving, riel, willy,
	ziy, tytso, linux-mm, linux-fsdevel, linux-kernel

On Tue, 10 May 2022 13:32:20 -0700 Yang Shi <shy828301@gmail.com> wrote:

> The hugepage_vma_check() could be reused by khugepaged_enter() and
> khugepaged_enter_vma_merge(), but it is static in khugepaged.c.
> Make it non-static and declare it in khugepaged.h.
> 
> ..
>
> @@ -508,20 +508,13 @@ void __khugepaged_enter(struct mm_struct *mm)
>  void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>  			       unsigned long vm_flags)
>  {
> -	unsigned long hstart, hend;
> -
> -	/*
> -	 * khugepaged only supports read-only files for non-shmem files.
> -	 * khugepaged does not yet work on special mappings. And
> -	 * file-private shmem THP is not supported.
> -	 */
> -	if (!hugepage_vma_check(vma, vm_flags))
> -		return;
> -
> -	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> -	hend = vma->vm_end & HPAGE_PMD_MASK;
> -	if (hstart < hend)
> -		khugepaged_enter(vma, vm_flags);
> +	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> +	    khugepaged_enabled() &&
> +	    (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
> +	     (vma->vm_end & HPAGE_PMD_MASK))) {

Reviewing these bounds-checking tests is so hard :(  Can we simplify?

> +		if (hugepage_vma_check(vma, vm_flags))
> +			__khugepaged_enter(vma->vm_mm);
> +	}
>  }

void khugepaged_enter_vma(struct vm_area_struct *vma,
			  unsigned long vm_flags)
{
	if (test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
		return;
	if (!khugepaged_enabled())
		return;
	if (round_up(vma->vm_start, HPAGE_PMD_SIZE) >=
			(vma->vm_end & HPAGE_PMD_MASK))
		return;		/* vma is too small */
	if (!hugepage_vma_check(vma, vm_flags))
		return;
	__khugepaged_enter(vma->vm_mm);
}


Also, it might be slightly faster to have checked MMF_VM_HUGEPAGE
before khugepaged_enabled(), but it looks odd.  And it might be slower,
too - more pointer chasing.

I wish someone would document hugepage_vma_check().


* Re: [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static
  2022-05-10 21:05   ` Andrew Morton
@ 2022-05-10 22:35     ` Yang Shi
  0 siblings, 0 replies; 11+ messages in thread
From: Yang Shi @ 2022-05-10 22:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Kirill A. Shutemov, Miaohe Lin, Song Liu,
	Rik van Riel, Matthew Wilcox, Zi Yan, Theodore Ts'o,
	Linux MM, Linux FS-devel Mailing List, Linux Kernel Mailing List

On Tue, May 10, 2022 at 2:05 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 10 May 2022 13:32:20 -0700 Yang Shi <shy828301@gmail.com> wrote:
>
> > The hugepage_vma_check() could be reused by khugepaged_enter() and
> > khugepaged_enter_vma_merge(), but it is static in khugepaged.c.
> > Make it non-static and declare it in khugepaged.h.
> >
> > ..
> >
> > @@ -508,20 +508,13 @@ void __khugepaged_enter(struct mm_struct *mm)
> >  void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
> >                              unsigned long vm_flags)
> >  {
> > -     unsigned long hstart, hend;
> > -
> > -     /*
> > -      * khugepaged only supports read-only files for non-shmem files.
> > -      * khugepaged does not yet work on special mappings. And
> > -      * file-private shmem THP is not supported.
> > -      */
> > -     if (!hugepage_vma_check(vma, vm_flags))
> > -             return;
> > -
> > -     hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> > -     hend = vma->vm_end & HPAGE_PMD_MASK;
> > -     if (hstart < hend)
> > -             khugepaged_enter(vma, vm_flags);
> > +     if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> > +         khugepaged_enabled() &&
> > +         (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
> > +          (vma->vm_end & HPAGE_PMD_MASK))) {
>
> Reviewing these bounds-checking tests is so hard :(  Can we simplify?

Yeah, I think they can be moved into a helper with a more descriptive name.

>
> > +             if (hugepage_vma_check(vma, vm_flags))
> > +                     __khugepaged_enter(vma->vm_mm);
> > +     }
> >  }
>
> void khugepaged_enter_vma(struct vm_area_struct *vma,
>                           unsigned long vm_flags)
> {
>         if (test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
>                 return;
>         if (!khugepaged_enabled())
>                 return;
>         if (round_up(vma->vm_start, HPAGE_PMD_SIZE) >=
>                         (vma->vm_end & HPAGE_PMD_MASK))
>                 return;         /* vma is too small */
>         if (!hugepage_vma_check(vma, vm_flags))
>                 return;
>         __khugepaged_enter(vma->vm_mm);
> }
>
>
> Also, it might be slightly faster to have checked MMF_VM_HUGEPAGE
> before khugepaged_enabled(), but it looks odd.  And it might be slower,
> too - more pointer chasing.

I think most configurations have always or madvise mode set (i.e.
khugepaged_enabled() returns true), so checking MMF_VM_HUGEPAGE before
khugepaged_enabled() seems slightly better, but either way it should
not have a measurable effect IMHO.

>
> I wish someone would document hugepage_vma_check().

I will clean up all the stuff further in a new patchset, for example,
trying to consolidate all the different
hugepage_suitable/enabled/active checks.


end of thread, other threads:[~2022-05-10 22:35 UTC | newest]

Thread overview: 11+ messages
2022-05-10 20:32 [mm-unstable v4 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
2022-05-10 20:32 ` [v4 PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
2022-05-10 20:32 ` [v4 PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
2022-05-10 20:32 ` [v4 PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
2022-05-10 20:32 ` [v4 PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
2022-05-10 20:32 ` [v4 PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
2022-05-10 20:32 ` [v4 PATCH 6/8] mm: khugepaged: make hugepage_vma_check() non-static Yang Shi
2022-05-10 21:05   ` Andrew Morton
2022-05-10 22:35     ` Yang Shi
2022-05-10 20:32 ` [v4 PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_vma() helper Yang Shi
2022-05-10 20:32 ` [v4 PATCH 8/8] mm: mmap: register suitable readonly file vmas for khugepaged Yang Shi
