All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent
@ 2022-02-28 23:57 Yang Shi
  2022-02-28 23:57 ` [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel


The readonly FS THP relies on khugepaged to collapse THP for suitable
vmas.  But it is kind of "random luck" for khugepaged to see the
readonly FS vmas (see report: https://lore.kernel.org/linux-mm/00f195d4-d039-3cf2-d3a1-a2c88de397a0@suse.cz/) since currently the vmas are registered to khugepaged when:
  - Anon huge pmd page fault
  - VMA merge
  - MADV_HUGEPAGE
  - Shmem mmap

If the above conditions are not met, even though khugepaged is enabled
it won't see readonly FS vmas at all.  MADV_HUGEPAGE could be specified
explicitly to tell khugepaged to collapse this area, but when khugepaged
mode is "always" it should scan suitable vmas as long as VM_NOHUGEPAGE
is not set.

So make sure readonly FS vmas are registered to khugepaged to make the
behavior more consistent.

Registering the vmas in mmap path seems more preferred from performance
point of view since page fault path is definitely hot path.


The patch 1 ~ 7 are minor bug fixes, clean up and preparation patches.
The patch 8 converts ext4 and xfs.  We may need convert more filesystems,
but I'd like to hear some comments before doing that.


Tested with khugepaged test in selftests and the testcase provided by
Vlastimil Babka in https://lore.kernel.org/lkml/df3b5d1c-a36b-2c73-3e27-99e74983de3a@suse.cz/
by commenting out MADV_HUGEPAGE call.


 b/fs/ext4/file.c                 |    4 +++
 b/fs/xfs/xfs_file.c              |    4 +++
 b/include/linux/huge_mm.h        |    9 +++++++
 b/include/linux/khugepaged.h     |   69 +++++++++++++++++++++----------------------------------------
 b/include/linux/sched/coredump.h |    3 +-
 b/kernel/fork.c                  |    4 ---
 b/mm/huge_memory.c               |   15 +++----------
 b/mm/khugepaged.c                |   71 ++++++++++++++++++++++++++++++++++++++++++++-------------------
 b/mm/shmem.c                     |   14 +++---------
 9 files changed, 102 insertions(+), 91 deletions(-)



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-03-01  8:45   ` Miaohe Lin
  2022-02-28 23:57 ` [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

MMF_VM_HUGEPAGE is set as long as the mm is available for khugepaged by
khugepaged_enter(), not only when VM_HUGEPAGE is set on vma.  Correct
the comment to avoid confusion.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/sched/coredump.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 4d9e3a656875..4d0a5be28b70 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -57,7 +57,8 @@ static inline int get_dumpable(struct mm_struct *mm)
 #endif
 					/* leave room for more dump flags */
 #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
-#define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
+#define MMF_VM_HUGEPAGE		17	/* set when mm is available for
+					   khugepaged */
 /*
  * This one-shot flag is dropped due to necessity of changing exe once again
  * on NFS restore
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
  2022-02-28 23:57 ` [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-03-01  9:07   ` Miaohe Lin
  2022-02-28 23:57 ` [PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

The hugepage_vma_check() called by khugepaged_enter_vma_merge() does
check VM_NO_KHUGEPAGED. Remove the check from caller and move the check
in hugepage_vma_check() up.

More checks may be run for VM_NO_KHUGEPAGED vmas, but MADV_HUGEPAGE is
definitely not a hot path, so cleaner code does outweigh.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 131492fd1148..82c71c6da9ce 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -366,8 +366,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
 		 * register it here without waiting a page fault that
 		 * may not happen any time soon.
 		 */
-		if (!(*vm_flags & VM_NO_KHUGEPAGED) &&
-				khugepaged_enter_vma_merge(vma, *vm_flags))
+		if (khugepaged_enter_vma_merge(vma, *vm_flags))
 			return -ENOMEM;
 		break;
 	case MADV_NOHUGEPAGE:
@@ -446,6 +445,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (!transhuge_vma_enabled(vma, vm_flags))
 		return false;
 
+	if (vm_flags & VM_NO_KHUGEPAGED)
+		return false;
+
 	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
 				vma->vm_pgoff, HPAGE_PMD_NR))
 		return false;
@@ -471,7 +473,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 	if (vma_is_temporary_stack(vma))
 		return false;
-	return !(vm_flags & VM_NO_KHUGEPAGED);
+
+	return true;
 }
 
 int __khugepaged_enter(struct mm_struct *mm)
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/8] mm: khugepaged: skip DAX vma
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
  2022-02-28 23:57 ` [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
  2022-02-28 23:57 ` [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-03-02  3:21   ` Miaohe Lin
  2022-02-28 23:57 ` [PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

The DAX vma may be seen by khugepaged when the mm has other khugepaged
suitable vmas.  So khugepaged may try to collapse THP for DAX vma, but
it will fail due to page sanity check, for example, page is not
on LRU.

So it is not harmful, but it is definitely pointless to run khugepaged
against DAX vma, so skip it in early check.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/khugepaged.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 82c71c6da9ce..a0e4fa33660e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -448,6 +448,10 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	if (vm_flags & VM_NO_KHUGEPAGED)
 		return false;
 
+	/* Don't run khugepaged against DAX vma */
+	if (vma_is_dax(vma))
+		return false;
+
 	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
 				vma->vm_pgoff, HPAGE_PMD_NR))
 		return false;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/8] mm: thp: only regular file could be THP eligible
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (2 preceding siblings ...)
  2022-02-28 23:57 ` [PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-02-28 23:57 ` [PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

Since commit a4aeaa06d45e ("mm: khugepaged: skip huge page collapse for
special files"), khugepaged just collapses THP for regular file which is
the intended usecase for readonly fs THP.  Only show regular file as THP
eligible accordingly.

And make file_thp_enabled() available for khugepaged too in order to remove
duplicate code.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/huge_mm.h |  9 +++++++++
 mm/huge_memory.c        | 11 ++---------
 mm/khugepaged.c         |  9 ++-------
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e4c18ba8d3bf..e6d867f72458 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -172,6 +172,15 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool file_thp_enabled(struct vm_area_struct *vma)
+{
+	struct inode *inode = vma->vm_file->f_inode;
+
+	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
+	       (vma->vm_flags & VM_EXEC) &&
+	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
+}
+
 bool transparent_hugepage_active(struct vm_area_struct *vma);
 
 #define transparent_hugepage_use_zero_page()				\
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 406a3c28c026..a87b3df63209 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -64,13 +64,6 @@ static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
 unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
-static inline bool file_thp_enabled(struct vm_area_struct *vma)
-{
-	return transhuge_vma_enabled(vma, vma->vm_flags) && vma->vm_file &&
-	       !inode_is_open_for_write(vma->vm_file->f_inode) &&
-	       (vma->vm_flags & VM_EXEC);
-}
-
 bool transparent_hugepage_active(struct vm_area_struct *vma)
 {
 	/* The addr is used to check if the vma size fits */
@@ -82,8 +75,8 @@ bool transparent_hugepage_active(struct vm_area_struct *vma)
 		return __transparent_hugepage_enabled(vma);
 	if (vma_is_shmem(vma))
 		return shmem_huge_enabled(vma);
-	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS))
-		return file_thp_enabled(vma);
+	if (transhuge_vma_enabled(vma, vma->vm_flags) && file_thp_enabled(vma))
+		return true;
 
 	return false;
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a0e4fa33660e..3dbac3e23f43 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -465,13 +465,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 
 	/* Only regular file is valid */
-	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
-	    (vm_flags & VM_EXEC)) {
-		struct inode *inode = vma->vm_file->f_inode;
-
-		return !inode_is_open_for_write(inode) &&
-			S_ISREG(inode->i_mode);
-	}
+	if (file_thp_enabled(vma))
+		return true;
 
 	if (!vma->anon_vma || vma->vm_ops)
 		return false;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (3 preceding siblings ...)
  2022-02-28 23:57 ` [PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-02-28 23:57 ` [PATCH 6/8] mm: khugepaged: move some khugepaged_* functions to khugepaged.c Yang Shi
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

The most callers of khugepaged_enter() don't care about the return
value.  Only dup_mmap(), anonymous THP page fault and MADV_HUGEPAGE handle
the error by returning -ENOMEM.  Actually it is not harmful for them to
ignore the error case either.  It also sounds overkilling to fail fork()
and page fault early due to khugepaged_enter() error, and MADV_HUGEPAGE
does set VM_HUGEPAGE flag regardless of the error.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h | 30 ++++++++++++------------------
 kernel/fork.c              |  4 +---
 mm/huge_memory.c           |  4 ++--
 mm/khugepaged.c            | 18 +++++++-----------
 4 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 2fcc01891b47..0423d3619f26 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -12,10 +12,10 @@ extern struct attribute_group khugepaged_attr_group;
 extern int khugepaged_init(void);
 extern void khugepaged_destroy(void);
 extern int start_stop_khugepaged(void);
-extern int __khugepaged_enter(struct mm_struct *mm);
+extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
-extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-				      unsigned long vm_flags);
+extern void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+				       unsigned long vm_flags);
 extern void khugepaged_min_free_kbytes_update(void);
 #ifdef CONFIG_SHMEM
 extern void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr);
@@ -40,11 +40,10 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 	(transparent_hugepage_flags &				\
 	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
 
-static inline int khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
 	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
-		return __khugepaged_enter(mm);
-	return 0;
+		__khugepaged_enter(mm);
 }
 
 static inline void khugepaged_exit(struct mm_struct *mm)
@@ -53,7 +52,7 @@ static inline void khugepaged_exit(struct mm_struct *mm)
 		__khugepaged_exit(mm);
 }
 
-static inline int khugepaged_enter(struct vm_area_struct *vma,
+static inline void khugepaged_enter(struct vm_area_struct *vma,
 				   unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
@@ -62,27 +61,22 @@ static inline int khugepaged_enter(struct vm_area_struct *vma,
 		     (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
 		    !(vm_flags & VM_NOHUGEPAGE) &&
 		    !test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
-			if (__khugepaged_enter(vma->vm_mm))
-				return -ENOMEM;
-	return 0;
+			__khugepaged_enter(vma->vm_mm);
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
-static inline int khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
-	return 0;
 }
 static inline void khugepaged_exit(struct mm_struct *mm)
 {
 }
-static inline int khugepaged_enter(struct vm_area_struct *vma,
-				   unsigned long vm_flags)
+static inline void khugepaged_enter(struct vm_area_struct *vma,
+				    unsigned long vm_flags)
 {
-	return 0;
 }
-static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
-					     unsigned long vm_flags)
+static inline void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+					      unsigned long vm_flags)
 {
-	return 0;
 }
 static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 					   unsigned long addr)
diff --git a/kernel/fork.c b/kernel/fork.c
index a024bf6254df..dc85418c426a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -523,9 +523,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	retval = ksm_fork(mm, oldmm);
 	if (retval)
 		goto out;
-	retval = khugepaged_fork(mm, oldmm);
-	if (retval)
-		goto out;
+	khugepaged_fork(mm, oldmm);
 
 	prev = NULL;
 	for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a87b3df63209..ec2490d6af09 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -725,8 +725,8 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 		return VM_FAULT_FALLBACK;
 	if (unlikely(anon_vma_prepare(vma)))
 		return VM_FAULT_OOM;
-	if (unlikely(khugepaged_enter(vma, vma->vm_flags)))
-		return VM_FAULT_OOM;
+	khugepaged_enter(vma, vma->vm_flags);
+
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm) &&
 			transparent_hugepage_use_zero_page()) {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 3dbac3e23f43..b87af297e652 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -366,8 +366,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
 		 * register it here without waiting a page fault that
 		 * may not happen any time soon.
 		 */
-		if (khugepaged_enter_vma_merge(vma, *vm_flags))
-			return -ENOMEM;
+		khugepaged_enter_vma_merge(vma, *vm_flags);
 		break;
 	case MADV_NOHUGEPAGE:
 		*vm_flags &= ~VM_HUGEPAGE;
@@ -476,20 +475,20 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	return true;
 }
 
-int __khugepaged_enter(struct mm_struct *mm)
+void __khugepaged_enter(struct mm_struct *mm)
 {
 	struct mm_slot *mm_slot;
 	int wakeup;
 
 	mm_slot = alloc_mm_slot();
 	if (!mm_slot)
-		return -ENOMEM;
+		return;
 
 	/* __khugepaged_exit() must not run from under us */
 	VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
 	if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
 		free_mm_slot(mm_slot);
-		return 0;
+		return;
 	}
 
 	spin_lock(&khugepaged_mm_lock);
@@ -505,11 +504,9 @@ int __khugepaged_enter(struct mm_struct *mm)
 	mmgrab(mm);
 	if (wakeup)
 		wake_up_interruptible(&khugepaged_wait);
-
-	return 0;
 }
 
-int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 			       unsigned long vm_flags)
 {
 	unsigned long hstart, hend;
@@ -520,13 +517,12 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 	 * file-private shmem THP is not supported.
 	 */
 	if (!hugepage_vma_check(vma, vm_flags))
-		return 0;
+		return;
 
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
-		return khugepaged_enter(vma, vm_flags);
-	return 0;
+		khugepaged_enter(vma, vm_flags);
 }
 
 void __khugepaged_exit(struct mm_struct *mm)
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/8] mm: khugepaged: move some khugepaged_* functions to khugepaged.c
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (4 preceding siblings ...)
  2022-02-28 23:57 ` [PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-02-28 23:57 ` [PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_file() helper Yang Shi
  2022-02-28 23:57 ` [PATCH 8/8] fs: register suitable readonly vmas for khugepaged Yang Shi
  7 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

This move also makes the following patches easier.  The following patches
will call khugepaged_enter() for regular filesystems to make readonly FS
THP collapse more consistent.  They need to use some macros defined in
huge_mm.h, for example, HPAGE_PMD_*, but it seems not preferred to
polute filesystems code with including unnecessary header files.  With
this move the filesystems code just need include khugepaged.h, which is
quite small and the use is quite specific, to call khugepaged_enter()
to hook mm with khugepaged.

And the khugepaged_* functions actually are wrappers for some non-inline
functions, so it seems the benefits are not too much to keep them inline.

This also helps to reuse hugepage_vma_check() for khugepaged_enter() so
that we could remove some duplicate checks.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h | 33 ++++++---------------------------
 mm/khugepaged.c            | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 0423d3619f26..54e169116d49 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -16,6 +16,12 @@ extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 				       unsigned long vm_flags);
+extern void khugepaged_fork(struct mm_struct *mm,
+			    struct mm_struct *oldmm);
+extern void khugepaged_exit(struct mm_struct *mm);
+extern void khugepaged_enter(struct vm_area_struct *vma,
+			     unsigned long vm_flags);
+
 extern void khugepaged_min_free_kbytes_update(void);
 #ifdef CONFIG_SHMEM
 extern void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr);
@@ -33,36 +39,9 @@ static inline void collapse_pte_mapped_thp(struct mm_struct *mm,
 #define khugepaged_always()				\
 	(transparent_hugepage_flags &			\
 	 (1<<TRANSPARENT_HUGEPAGE_FLAG))
-#define khugepaged_req_madv()					\
-	(transparent_hugepage_flags &				\
-	 (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG))
 #define khugepaged_defrag()					\
 	(transparent_hugepage_flags &				\
 	 (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG))
-
-static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
-{
-	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
-		__khugepaged_enter(mm);
-}
-
-static inline void khugepaged_exit(struct mm_struct *mm)
-{
-	if (test_bit(MMF_VM_HUGEPAGE, &mm->flags))
-		__khugepaged_exit(mm);
-}
-
-static inline void khugepaged_enter(struct vm_area_struct *vma,
-				   unsigned long vm_flags)
-{
-	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
-		if ((khugepaged_always() ||
-		     (shmem_file(vma->vm_file) && shmem_huge_enabled(vma)) ||
-		     (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
-		    !(vm_flags & VM_NOHUGEPAGE) &&
-		    !test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
-			__khugepaged_enter(vma->vm_mm);
-}
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b87af297e652..4cb4379ecf25 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -557,6 +557,26 @@ void __khugepaged_exit(struct mm_struct *mm)
 	}
 }
 
+void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm)
+{
+	if (test_bit(MMF_VM_HUGEPAGE, &oldmm->flags))
+		__khugepaged_enter(mm);
+}
+
+void khugepaged_exit(struct mm_struct *mm)
+{
+	if (test_bit(MMF_VM_HUGEPAGE, &mm->flags))
+		__khugepaged_exit(mm);
+}
+
+void khugepaged_enter(struct vm_area_struct *vma, unsigned long vm_flags)
+{
+	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
+	    khugepaged_enabled())
+		if (hugepage_vma_check(vma, vm_flags))
+			__khugepaged_enter(vma->vm_mm);
+}
+
 static void release_pte_page(struct page *page)
 {
 	mod_node_page_state(page_pgdat(page),
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_file() helper
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (5 preceding siblings ...)
  2022-02-28 23:57 ` [PATCH 6/8] mm: khugepaged: move some khugepaged_* functions to khugepaged.c Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  2022-02-28 23:57 ` [PATCH 8/8] fs: register suitable readonly vmas for khugepaged Yang Shi
  7 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

The following patch will have filesystems code call khugepaged_enter()
to make readonly FS THP collapse more consistent.  Extract the current
implementation used by shmem in khugepaged_enter_file() helper so that
it could be reused by other filesystems and export the symbol for
modules.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/khugepaged.h |  6 ++++++
 mm/khugepaged.c            | 11 +++++++++++
 mm/shmem.c                 | 14 ++++----------
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 54e169116d49..06464e9a1f91 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -21,6 +21,8 @@ extern void khugepaged_fork(struct mm_struct *mm,
 extern void khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter(struct vm_area_struct *vma,
 			     unsigned long vm_flags);
+extern void khugepaged_enter_file(struct vm_area_struct *vma,
+				  unsigned long vm_flags);
 
 extern void khugepaged_min_free_kbytes_update(void);
 #ifdef CONFIG_SHMEM
@@ -53,6 +55,10 @@ static inline void khugepaged_enter(struct vm_area_struct *vma,
 				    unsigned long vm_flags)
 {
 }
+static inline void khugepaged_enter_file(struct vm_area_struct *vma,
+					 unsigned long vm_flags)
+{
+}
 static inline void khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 					      unsigned long vm_flags)
 {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4cb4379ecf25..93c9072983e2 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -577,6 +577,17 @@ void khugepaged_enter(struct vm_area_struct *vma, unsigned long vm_flags)
 			__khugepaged_enter(vma->vm_mm);
 }
 
+void khugepaged_enter_file(struct vm_area_struct *vma, unsigned long vm_flags)
+{
+	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
+	    khugepaged_enabled() &&
+	    (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
+	     (vma->vm_end & HPAGE_PMD_MASK)))
+		if (hugepage_vma_check(vma, vm_flags))
+			__khugepaged_enter(vma->vm_mm);
+}
+EXPORT_SYMBOL_GPL(khugepaged_enter_file);
+
 static void release_pte_page(struct page *page)
 {
 	mod_node_page_state(page_pgdat(page),
diff --git a/mm/shmem.c b/mm/shmem.c
index a09b29ec2b45..c2346e5d2b24 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2233,11 +2233,9 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-			((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
-			(vma->vm_end & HPAGE_PMD_MASK)) {
-		khugepaged_enter(vma, vma->vm_flags);
-	}
+
+	khugepaged_enter_file(vma, vma->vm_flags);
+
 	return 0;
 }
 
@@ -4132,11 +4130,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 	vma->vm_file = file;
 	vma->vm_ops = &shmem_vm_ops;
 
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-			((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
-			(vma->vm_end & HPAGE_PMD_MASK)) {
-		khugepaged_enter(vma, vma->vm_flags);
-	}
+	khugepaged_enter_file(vma, vma->vm_flags);
 
 	return 0;
 }
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 8/8] fs: register suitable readonly vmas for khugepaged
  2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
                   ` (6 preceding siblings ...)
  2022-02-28 23:57 ` [PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_file() helper Yang Shi
@ 2022-02-28 23:57 ` Yang Shi
  7 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-02-28 23:57 UTC (permalink / raw)
  To: vbabka, kirill.shutemov, songliubraving, linmiaohe, riel, willy,
	ziy, akpm, tytso, adilger.kernel, darrick.wong
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel

The readonly FS THP relies on khugepaged to collapse THP for suitable
vmas.  But it is kind of "random luck" for khugepaged to see the
readonly FS vmas (https://lore.kernel.org/linux-mm/00f195d4-d039-3cf2-d3a1-a2c88de397a0@suse.cz/)
since currently the vmas are registered to khugepaged when:
  - Anon huge pmd page fault
  - VMA merge
  - MADV_HUGEPAGE
  - Shmem mmap

If the above conditions are not met, even though khugepaged is enabled
it won't see readonly FS vmas at all.  MADV_HUGEPAGE could be specified
explicitly to tell khugepaged to collapse this area, but when khugepaged
mode is "always" it should scan suitable vmas as long as VM_NOHUGEPAGE
is not set.

So make sure readonly FS vmas are registered to khugepaged to make the
behavior more consistent.

Registering the vmas in mmap path seems more preferred from performance
point of view since page fault path is definitely hot path.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 fs/ext4/file.c    | 4 ++++
 fs/xfs/xfs_file.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8cc11715518a..b894cd5aff44 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -30,6 +30,7 @@
 #include <linux/uio.h>
 #include <linux/mman.h>
 #include <linux/backing-dev.h>
+#include <linux/khugepaged.h>
 #include "ext4.h"
 #include "ext4_jbd2.h"
 #include "xattr.h"
@@ -782,6 +783,9 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
 	} else {
 		vma->vm_ops = &ext4_file_vm_ops;
 	}
+
+	khugepaged_enter_file(vma, vma->vm_flags);
+
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5bddb1e9e0b3..d94144b1fb0f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -30,6 +30,7 @@
 #include <linux/mman.h>
 #include <linux/fadvise.h>
 #include <linux/mount.h>
+#include <linux/khugepaged.h>
 
 static const struct vm_operations_struct xfs_file_vm_ops;
 
@@ -1407,6 +1408,9 @@ xfs_file_mmap(
 	vma->vm_ops = &xfs_file_vm_ops;
 	if (IS_DAX(inode))
 		vma->vm_flags |= VM_HUGEPAGE;
+
+	khugepaged_enter_file(vma, vma->vm_flags);
+
 	return 0;
 }
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
  2022-02-28 23:57 ` [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
@ 2022-03-01  8:45   ` Miaohe Lin
  2022-03-01 21:49     ` Yang Shi
  0 siblings, 1 reply; 23+ messages in thread
From: Miaohe Lin @ 2022-03-01  8:45 UTC (permalink / raw)
  To: Yang Shi
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel,
	vbabka, kirill.shutemov, songliubraving, riel, willy, ziy, akpm,
	tytso, adilger.kernel, darrick.wong

On 2022/3/1 7:57, Yang Shi wrote:
> MMF_VM_HUGEPAGE is set as long as the mm is available for khugepaged by
> khugepaged_enter(), not only when VM_HUGEPAGE is set on vma.  Correct
> the comment to avoid confusion.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  include/linux/sched/coredump.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index 4d9e3a656875..4d0a5be28b70 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -57,7 +57,8 @@ static inline int get_dumpable(struct mm_struct *mm)
>  #endif
>  					/* leave room for more dump flags */
>  #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
> -#define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
> +#define MMF_VM_HUGEPAGE		17	/* set when mm is available for
> +					   khugepaged */

I think this comment could be written in one line. Anyway, this patch looks good to me.
Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

>  /*
>   * This one-shot flag is dropped due to necessity of changing exe once again
>   * on NFS restore
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
  2022-02-28 23:57 ` [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
@ 2022-03-01  9:07   ` Miaohe Lin
  2022-03-02 18:43     ` Yang Shi
  0 siblings, 1 reply; 23+ messages in thread
From: Miaohe Lin @ 2022-03-01  9:07 UTC (permalink / raw)
  To: Yang Shi
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel,
	vbabka, kirill.shutemov, songliubraving, riel, willy, ziy, akpm,
	tytso, adilger.kernel, darrick.wong

On 2022/3/1 7:57, Yang Shi wrote:
> The hugepage_vma_check() called by khugepaged_enter_vma_merge() does
> check VM_NO_KHUGEPAGED. Remove the check from caller and move the check
> in hugepage_vma_check() up.
> 
> More checks may be run for VM_NO_KHUGEPAGED vmas, but MADV_HUGEPAGE is
> definitely not a hot path, so cleaner code does outweigh.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/khugepaged.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 131492fd1148..82c71c6da9ce 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -366,8 +366,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
>  		 * register it here without waiting a page fault that
>  		 * may not happen any time soon.
>  		 */
> -		if (!(*vm_flags & VM_NO_KHUGEPAGED) &&
> -				khugepaged_enter_vma_merge(vma, *vm_flags))
> +		if (khugepaged_enter_vma_merge(vma, *vm_flags))
>  			return -ENOMEM;
>  		break;
>  	case MADV_NOHUGEPAGE:
> @@ -446,6 +445,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>  	if (!transhuge_vma_enabled(vma, vm_flags))
>  		return false;
>  
> +	if (vm_flags & VM_NO_KHUGEPAGED)
> +		return false;
> +

This patch does improve the readability. But I have a question.
It seems VM_NO_KHUGEPAGED is not checked in the below if-condition:

	/* Only regular file is valid */
	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
	    (vm_flags & VM_EXEC)) {
		struct inode *inode = vma->vm_file->f_inode;

		return !inode_is_open_for_write(inode) &&
			S_ISREG(inode->i_mode);
	}

If we return false due to VM_NO_KHUGEPAGED here, it seems it will affect the
return value of this CONFIG_READ_ONLY_THP_FOR_FS condition check.
Or am I miss something?

Thanks.

>  	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
>  				vma->vm_pgoff, HPAGE_PMD_NR))
>  		return false;
> @@ -471,7 +473,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>  		return false;
>  	if (vma_is_temporary_stack(vma))
>  		return false;
> -	return !(vm_flags & VM_NO_KHUGEPAGED);
> +
> +	return true;
>  }
>  
>  int __khugepaged_enter(struct mm_struct *mm)
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
  2022-02-28 23:57 ` [PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
  (?)
@ 2022-03-03 11:43 ` Dan Carpenter
  -1 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2022-03-01 16:57 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 3011 bytes --]

CC: kbuild-all(a)lists.01.org
BCC: lkp(a)intel.com
In-Reply-To: <20220228235741.102941-5-shy828301@gmail.com>
References: <20220228235741.102941-5-shy828301@gmail.com>
TO: Yang Shi <shy828301@gmail.com>
TO: vbabka(a)suse.cz
TO: kirill.shutemov(a)linux.intel.com
TO: songliubraving(a)fb.com
TO: linmiaohe(a)huawei.com
TO: riel(a)surriel.com
TO: willy(a)infradead.org
TO: ziy(a)nvidia.com
TO: akpm(a)linux-foundation.org
TO: tytso(a)mit.edu
TO: adilger.kernel(a)dilger.ca
TO: darrick.wong(a)oracle.com
CC: linux-mm(a)kvack.org
CC: linux-fsdevel(a)vger.kernel.org
CC: linux-ext4(a)vger.kernel.org
CC: linux-xfs(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org

Hi Yang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tytso-ext4/dev]
[also build test WARNING on xfs-linux/for-next linus/master v5.17-rc6 next-20220301]
[cannot apply to hnaz-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
:::::: branch date: 17 hours ago
:::::: commit date: 17 hours ago
config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp(a)intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)

vim +179 include/linux/huge_mm.h

16981d763501c0 Dan Williams 2017-07-10  174  
2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
2224ed1155c07b Yang Shi     2022-02-28  176  {
2224ed1155c07b Yang Shi     2022-02-28 @177  	struct inode *inode = vma->vm_file->f_inode;
2224ed1155c07b Yang Shi     2022-02-28  178  
2224ed1155c07b Yang Shi     2022-02-28 @179  	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
2224ed1155c07b Yang Shi     2022-02-28  180  	       (vma->vm_flags & VM_EXEC) &&
2224ed1155c07b Yang Shi     2022-02-28  181  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
2224ed1155c07b Yang Shi     2022-02-28  182  }
2224ed1155c07b Yang Shi     2022-02-28  183  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
  2022-03-01  8:45   ` Miaohe Lin
@ 2022-03-01 21:49     ` Yang Shi
  2022-03-02  1:39       ` Miaohe Lin
  0 siblings, 1 reply; 23+ messages in thread
From: Yang Shi @ 2022-03-01 21:49 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Linux MM, Linux FS-devel Mailing List, linux-ext4, linux-xfs,
	Linux Kernel Mailing List, Vlastimil Babka, Kirill A. Shutemov,
	Song Liu, Rik van Riel, Matthew Wilcox, Zi Yan, Andrew Morton,
	Theodore Ts'o, Andreas Dilger, darrick.wong

On Tue, Mar 1, 2022 at 12:45 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2022/3/1 7:57, Yang Shi wrote:
> > MMF_VM_HUGEPAGE is set as long as the mm is available for khugepaged by
> > khugepaged_enter(), not only when VM_HUGEPAGE is set on vma.  Correct
> > the comment to avoid confusion.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  include/linux/sched/coredump.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> > index 4d9e3a656875..4d0a5be28b70 100644
> > --- a/include/linux/sched/coredump.h
> > +++ b/include/linux/sched/coredump.h
> > @@ -57,7 +57,8 @@ static inline int get_dumpable(struct mm_struct *mm)
> >  #endif
> >                                       /* leave room for more dump flags */
> >  #define MMF_VM_MERGEABLE     16      /* KSM may merge identical pages */
> > -#define MMF_VM_HUGEPAGE              17      /* set when VM_HUGEPAGE is set on vma */
> > +#define MMF_VM_HUGEPAGE              17      /* set when mm is available for
> > +                                        khugepaged */
>
> I think this comment could be written in one line. Anyway, this patch looks good to me.
> Thanks.

Yes, as long as we don't care about the 80 characters limit. I know
the limit was bumped to 100 by checkpatch, but I have seen 80 was
still preferred by a lot of people.

>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.

>
> >  /*
> >   * This one-shot flag is dropped due to necessity of changing exe once again
> >   * on NFS restore
> >
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE
  2022-03-01 21:49     ` Yang Shi
@ 2022-03-02  1:39       ` Miaohe Lin
  0 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2022-03-02  1:39 UTC (permalink / raw)
  To: Yang Shi
  Cc: Linux MM, Linux FS-devel Mailing List, linux-ext4, linux-xfs,
	Linux Kernel Mailing List, Vlastimil Babka, Kirill A. Shutemov,
	Song Liu, Rik van Riel, Matthew Wilcox, Zi Yan, Andrew Morton,
	Theodore Ts'o, Andreas Dilger, darrick.wong

On 2022/3/2 5:49, Yang Shi wrote:
> On Tue, Mar 1, 2022 at 12:45 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> On 2022/3/1 7:57, Yang Shi wrote:
>>> MMF_VM_HUGEPAGE is set as long as the mm is available for khugepaged by
>>> khugepaged_enter(), not only when VM_HUGEPAGE is set on vma.  Correct
>>> the comment to avoid confusion.
>>>
>>> Signed-off-by: Yang Shi <shy828301@gmail.com>
>>> ---
>>>  include/linux/sched/coredump.h | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
>>> index 4d9e3a656875..4d0a5be28b70 100644
>>> --- a/include/linux/sched/coredump.h
>>> +++ b/include/linux/sched/coredump.h
>>> @@ -57,7 +57,8 @@ static inline int get_dumpable(struct mm_struct *mm)
>>>  #endif
>>>                                       /* leave room for more dump flags */
>>>  #define MMF_VM_MERGEABLE     16      /* KSM may merge identical pages */
>>> -#define MMF_VM_HUGEPAGE              17      /* set when VM_HUGEPAGE is set on vma */
>>> +#define MMF_VM_HUGEPAGE              17      /* set when mm is available for
>>> +                                        khugepaged */
>>
>> I think this comment could be written in one line. Anyway, this patch looks good to me.
>> Thanks.
> 
> Yes, as long as we don't care about the 80 characters limit. I know
> the limit was bumped to 100 by checkpatch, but I have seen 80 was
> still preferred by a lot of people.

I see. Many thanks for clarifying.

> 
>>
>> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> 
> Thanks.
> 
>>
>>>  /*
>>>   * This one-shot flag is dropped due to necessity of changing exe once again
>>>   * on NFS restore
>>>
>>
> .
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/8] mm: khugepaged: skip DAX vma
  2022-02-28 23:57 ` [PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
@ 2022-03-02  3:21   ` Miaohe Lin
  0 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2022-03-02  3:21 UTC (permalink / raw)
  To: Yang Shi
  Cc: linux-mm, linux-fsdevel, linux-ext4, linux-xfs, linux-kernel,
	vbabka, kirill.shutemov, songliubraving, riel, willy, ziy, akpm,
	tytso, adilger.kernel, darrick.wong

On 2022/3/1 7:57, Yang Shi wrote:
> The DAX vma may be seen by khugepaged when the mm has other khugepaged
> suitable vmas.  So khugepaged may try to collapse THP for DAX vma, but
> it will fail due to page sanity check, for example, page is not
> on LRU.

Yep. There is PageLRU check in khugepaged_scan_{file,pmd}.

> 
> So it is not harmful, but it is definitely pointless to run khugepaged
> against DAX vma, so skip it in early check.
> 

This patch looks good to me. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/khugepaged.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 82c71c6da9ce..a0e4fa33660e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -448,6 +448,10 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>  	if (vm_flags & VM_NO_KHUGEPAGED)
>  		return false;
>  
> +	/* Don't run khugepaged against DAX vma */
> +	if (vma_is_dax(vma))
> +		return false;
> +
>  	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
>  				vma->vm_pgoff, HPAGE_PMD_NR))
>  		return false;
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
  2022-03-01  9:07   ` Miaohe Lin
@ 2022-03-02 18:43     ` Yang Shi
  2022-03-03 10:53       ` Miaohe Lin
  0 siblings, 1 reply; 23+ messages in thread
From: Yang Shi @ 2022-03-02 18:43 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Linux MM, Linux FS-devel Mailing List, linux-ext4, linux-xfs,
	Linux Kernel Mailing List, Vlastimil Babka, Kirill A. Shutemov,
	Song Liu, Rik van Riel, Matthew Wilcox, Zi Yan, Andrew Morton,
	Theodore Ts'o, Andreas Dilger, darrick.wong

On Tue, Mar 1, 2022 at 1:07 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2022/3/1 7:57, Yang Shi wrote:
> > The hugepage_vma_check() called by khugepaged_enter_vma_merge() does
> > check VM_NO_KHUGEPAGED. Remove the check from caller and move the check
> > in hugepage_vma_check() up.
> >
> > More checks may be run for VM_NO_KHUGEPAGED vmas, but MADV_HUGEPAGE is
> > definitely not a hot path, so cleaner code does outweigh.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/khugepaged.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 131492fd1148..82c71c6da9ce 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -366,8 +366,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
> >                * register it here without waiting a page fault that
> >                * may not happen any time soon.
> >                */
> > -             if (!(*vm_flags & VM_NO_KHUGEPAGED) &&
> > -                             khugepaged_enter_vma_merge(vma, *vm_flags))
> > +             if (khugepaged_enter_vma_merge(vma, *vm_flags))
> >                       return -ENOMEM;
> >               break;
> >       case MADV_NOHUGEPAGE:
> > @@ -446,6 +445,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
> >       if (!transhuge_vma_enabled(vma, vm_flags))
> >               return false;
> >
> > +     if (vm_flags & VM_NO_KHUGEPAGED)
> > +             return false;
> > +
>
> This patch does improve the readability. But I have a question.
> It seems VM_NO_KHUGEPAGED is not checked in the below if-condition:
>
>         /* Only regular file is valid */
>         if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
>             (vm_flags & VM_EXEC)) {
>                 struct inode *inode = vma->vm_file->f_inode;
>
>                 return !inode_is_open_for_write(inode) &&
>                         S_ISREG(inode->i_mode);
>         }
>
> If we return false due to VM_NO_KHUGEPAGED here, it seems it will affect the
> return value of this CONFIG_READ_ONLY_THP_FOR_FS condition check.
> Or am I miss something?

Yes, it will return false instead of true if that file THP check is
true, but wasn't that old behavior actually problematic? Khugepaged
definitely can't collapse VM_NO_KHUGEPAGED vmas even though it
satisfies all the readonly file THP checks. With the old behavior
khugepaged may scan an exec file hugetlb vma IIUC although it will
fail later due to other page sanity checks, i.e. page compound check.

>
> Thanks.
>
> >       if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
> >                               vma->vm_pgoff, HPAGE_PMD_NR))
> >               return false;
> > @@ -471,7 +473,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
> >               return false;
> >       if (vma_is_temporary_stack(vma))
> >               return false;
> > -     return !(vm_flags & VM_NO_KHUGEPAGED);
> > +
> > +     return true;
> >  }
> >
> >  int __khugepaged_enter(struct mm_struct *mm)
> >
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED
  2022-03-02 18:43     ` Yang Shi
@ 2022-03-03 10:53       ` Miaohe Lin
  0 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2022-03-03 10:53 UTC (permalink / raw)
  To: Yang Shi
  Cc: Linux MM, Linux FS-devel Mailing List, linux-ext4, linux-xfs,
	Linux Kernel Mailing List, Vlastimil Babka, Kirill A. Shutemov,
	Song Liu, Rik van Riel, Matthew Wilcox, Zi Yan, Andrew Morton,
	Theodore Ts'o, Andreas Dilger, darrick.wong

On 2022/3/3 2:43, Yang Shi wrote:
> On Tue, Mar 1, 2022 at 1:07 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> On 2022/3/1 7:57, Yang Shi wrote:
>>> The hugepage_vma_check() called by khugepaged_enter_vma_merge() does
>>> check VM_NO_KHUGEPAGED. Remove the check from caller and move the check
>>> in hugepage_vma_check() up.
>>>
>>> More checks may be run for VM_NO_KHUGEPAGED vmas, but MADV_HUGEPAGE is
>>> definitely not a hot path, so cleaner code does outweigh.
>>>
>>> Signed-off-by: Yang Shi <shy828301@gmail.com>
>>> ---
>>>  mm/khugepaged.c | 9 ++++++---
>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 131492fd1148..82c71c6da9ce 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -366,8 +366,7 @@ int hugepage_madvise(struct vm_area_struct *vma,
>>>                * register it here without waiting a page fault that
>>>                * may not happen any time soon.
>>>                */
>>> -             if (!(*vm_flags & VM_NO_KHUGEPAGED) &&
>>> -                             khugepaged_enter_vma_merge(vma, *vm_flags))
>>> +             if (khugepaged_enter_vma_merge(vma, *vm_flags))
>>>                       return -ENOMEM;
>>>               break;
>>>       case MADV_NOHUGEPAGE:
>>> @@ -446,6 +445,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>>>       if (!transhuge_vma_enabled(vma, vm_flags))
>>>               return false;
>>>
>>> +     if (vm_flags & VM_NO_KHUGEPAGED)
>>> +             return false;
>>> +
>>
>> This patch does improve the readability. But I have a question.
>> It seems VM_NO_KHUGEPAGED is not checked in the below if-condition:
>>
>>         /* Only regular file is valid */
>>         if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
>>             (vm_flags & VM_EXEC)) {
>>                 struct inode *inode = vma->vm_file->f_inode;
>>
>>                 return !inode_is_open_for_write(inode) &&
>>                         S_ISREG(inode->i_mode);
>>         }
>>
>> If we return false due to VM_NO_KHUGEPAGED here, it seems it will affect the
>> return value of this CONFIG_READ_ONLY_THP_FOR_FS condition check.
>> Or am I miss something?
> 
> Yes, it will return false instead of true if that file THP check is
> true, but wasn't that old behavior actually problematic? Khugepaged
> definitely can't collapse VM_NO_KHUGEPAGED vmas even though it
> satisfies all the readonly file THP checks. With the old behavior
> khugepaged may scan an exec file hugetlb vma IIUC although it will
> fail later due to other page sanity checks, i.e. page compound check.

Sounds reasonable to me. Khugepaged shouldn't collapse VM_NO_KHUGEPAGED vmas.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.

> 
>>
>> Thanks.
>>
>>>       if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
>>>                               vma->vm_pgoff, HPAGE_PMD_NR))
>>>               return false;
>>> @@ -471,7 +473,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>>>               return false;
>>>       if (vma_is_temporary_stack(vma))
>>>               return false;
>>> -     return !(vm_flags & VM_NO_KHUGEPAGED);
>>> +
>>> +     return true;
>>>  }
>>>
>>>  int __khugepaged_enter(struct mm_struct *mm)
>>>
>>
>>
> .
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
@ 2022-03-03 11:43 ` Dan Carpenter
  0 siblings, 0 replies; 23+ messages in thread
From: Dan Carpenter @ 2022-03-03 11:43 UTC (permalink / raw)
  To: kbuild, Yang Shi, vbabka, kirill.shutemov, songliubraving,
	linmiaohe, riel, willy, ziy, akpm, tytso, adilger.kernel,
	darrick.wong
  Cc: lkp, kbuild-all, linux-mm, linux-fsdevel, linux-ext4, linux-xfs,
	linux-kernel

Hi Yang,

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)

vim +179 include/linux/huge_mm.h

2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
2224ed1155c07b Yang Shi     2022-02-28  176  {
2224ed1155c07b Yang Shi     2022-02-28 @177  	struct inode *inode = vma->vm_file->f_inode;
                                                                      ^^^^^^^^^^^^^^
Dereference.

2224ed1155c07b Yang Shi     2022-02-28  178  
2224ed1155c07b Yang Shi     2022-02-28 @179  	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
                                                                                                    ^^^^^^^^^^^^
Checked too late.

2224ed1155c07b Yang Shi     2022-02-28  180  	       (vma->vm_flags & VM_EXEC) &&
2224ed1155c07b Yang Shi     2022-02-28  181  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
2224ed1155c07b Yang Shi     2022-02-28  182  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
@ 2022-03-03 11:43 ` Dan Carpenter
  0 siblings, 0 replies; 23+ messages in thread
From: Dan Carpenter @ 2022-03-03 11:43 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1990 bytes --]

Hi Yang,

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp(a)intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)

vim +179 include/linux/huge_mm.h

2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
2224ed1155c07b Yang Shi     2022-02-28  176  {
2224ed1155c07b Yang Shi     2022-02-28 @177  	struct inode *inode = vma->vm_file->f_inode;
                                                                      ^^^^^^^^^^^^^^
Dereference.

2224ed1155c07b Yang Shi     2022-02-28  178  
2224ed1155c07b Yang Shi     2022-02-28 @179  	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
                                                                                                    ^^^^^^^^^^^^
Checked too late.

2224ed1155c07b Yang Shi     2022-02-28  180  	       (vma->vm_flags & VM_EXEC) &&
2224ed1155c07b Yang Shi     2022-02-28  181  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
2224ed1155c07b Yang Shi     2022-02-28  182  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
  2022-03-03 11:43 ` Dan Carpenter
@ 2022-03-03 11:48   ` Miaohe Lin
  -1 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2022-03-03 11:48 UTC (permalink / raw)
  To: Dan Carpenter, Yang Shi
  Cc: lkp, kbuild-all, linux-mm, linux-fsdevel, linux-ext4, linux-xfs,
	linux-kernel, kbuild, vbabka, kirill.shutemov, songliubraving,
	riel, willy, ziy, akpm, tytso, adilger.kernel, darrick.wong

On 2022/3/3 19:43, Dan Carpenter wrote:
> Hi Yang,
> 
> url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp@intel.com/config)
> compiler: aarch64-linux-gcc (GCC) 11.2.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> smatch warnings:
> include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
> include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> 
> vim +179 include/linux/huge_mm.h
> 
> 2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
> 2224ed1155c07b Yang Shi     2022-02-28  176  {
> 2224ed1155c07b Yang Shi     2022-02-28 @177  	struct inode *inode = vma->vm_file->f_inode;
>                                                                       ^^^^^^^^^^^^^^
> Dereference.
> 
> 2224ed1155c07b Yang Shi     2022-02-28  178  
> 2224ed1155c07b Yang Shi     2022-02-28 @179  	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
>                                                                                                     ^^^^^^^^^^^^
> Checked too late.

Yep. We should check vma->vm_file first before we access vma->vm_file->f_inode.

Thanks.

> 
> 2224ed1155c07b Yang Shi     2022-02-28  180  	       (vma->vm_flags & VM_EXEC) &&
> 2224ed1155c07b Yang Shi     2022-02-28  181  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> 2224ed1155c07b Yang Shi     2022-02-28  182  }
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> 
> .
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
@ 2022-03-03 11:48   ` Miaohe Lin
  0 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2022-03-03 11:48 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

On 2022/3/3 19:43, Dan Carpenter wrote:
> Hi Yang,
> 
> url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp(a)intel.com/config)
> compiler: aarch64-linux-gcc (GCC) 11.2.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> smatch warnings:
> include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
> include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> 
> vim +179 include/linux/huge_mm.h
> 
> 2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
> 2224ed1155c07b Yang Shi     2022-02-28  176  {
> 2224ed1155c07b Yang Shi     2022-02-28 @177  	struct inode *inode = vma->vm_file->f_inode;
>                                                                       ^^^^^^^^^^^^^^
> Dereference.
> 
> 2224ed1155c07b Yang Shi     2022-02-28  178  
> 2224ed1155c07b Yang Shi     2022-02-28 @179  	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
>                                                                                                     ^^^^^^^^^^^^
> Checked too late.

Yep. We should check vma->vm_file first before we access vma->vm_file->f_inode.

Thanks.

> 
> 2224ed1155c07b Yang Shi     2022-02-28  180  	       (vma->vm_flags & VM_EXEC) &&
> 2224ed1155c07b Yang Shi     2022-02-28  181  	       !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> 2224ed1155c07b Yang Shi     2022-02-28  182  }
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
> 
> .
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
  2022-03-03 11:48   ` Miaohe Lin
@ 2022-03-03 19:14     ` Yang Shi
  -1 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-03-03 19:14 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Dan Carpenter, kernel test robot, kbuild-all, Linux MM,
	Linux FS-devel Mailing List, linux-ext4, linux-xfs,
	Linux Kernel Mailing List, kbuild, Vlastimil Babka,
	Kirill A. Shutemov, Song Liu, Rik van Riel, Matthew Wilcox,
	Zi Yan, Andrew Morton, Theodore Ts'o, Andreas Dilger,
	darrick.wong

On Thu, Mar 3, 2022 at 3:48 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2022/3/3 19:43, Dan Carpenter wrote:
> > Hi Yang,
> >
> > url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> > config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp@intel.com/config)
> > compiler: aarch64-linux-gcc (GCC) 11.2.0
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> >
> > smatch warnings:
> > include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> > mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
> > include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> >
> > vim +179 include/linux/huge_mm.h
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
> > 2224ed1155c07b Yang Shi     2022-02-28  176  {
> > 2224ed1155c07b Yang Shi     2022-02-28 @177   struct inode *inode = vma->vm_file->f_inode;
> >                                                                       ^^^^^^^^^^^^^^
> > Dereference.
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  178
> > 2224ed1155c07b Yang Shi     2022-02-28 @179   return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
> >                                                                                                     ^^^^^^^^^^^^
> > Checked too late.
>
> Yep. We should check vma->vm_file first before we access vma->vm_file->f_inode.

Ah, yes, thanks for the report and the suggestion.

>
> Thanks.
>
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  180          (vma->vm_flags & VM_EXEC) &&
> > 2224ed1155c07b Yang Shi     2022-02-28  181          !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> > 2224ed1155c07b Yang Shi     2022-02-28  182  }
> >
> > ---
> > 0-DAY CI Kernel Test Service, Intel Corporation
> > https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> >
> > .
> >
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/8] mm: thp: only regular file could be THP eligible
@ 2022-03-03 19:14     ` Yang Shi
  0 siblings, 0 replies; 23+ messages in thread
From: Yang Shi @ 2022-03-03 19:14 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2420 bytes --]

On Thu, Mar 3, 2022 at 3:48 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2022/3/3 19:43, Dan Carpenter wrote:
> > Hi Yang,
> >
> > url:    https://github.com/0day-ci/linux/commits/Yang-Shi/Make-khugepaged-collapse-readonly-FS-THP-more-consistent/20220301-075903
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> > config: arm64-randconfig-m031-20220227 (https://download.01.org/0day-ci/archive/20220302/202203020034.2Ii9kTrs-lkp(a)intel.com/config)
> > compiler: aarch64-linux-gcc (GCC) 11.2.0
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> >
> > smatch warnings:
> > include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> > mm/khugepaged.c:468 hugepage_vma_check() error: we previously assumed 'vma->vm_file' could be null (see line 455)
> > include/linux/huge_mm.h:179 file_thp_enabled() warn: variable dereferenced before check 'vma->vm_file' (see line 177)
> >
> > vim +179 include/linux/huge_mm.h
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  175  static inline bool file_thp_enabled(struct vm_area_struct *vma)
> > 2224ed1155c07b Yang Shi     2022-02-28  176  {
> > 2224ed1155c07b Yang Shi     2022-02-28 @177   struct inode *inode = vma->vm_file->f_inode;
> >                                                                       ^^^^^^^^^^^^^^
> > Dereference.
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  178
> > 2224ed1155c07b Yang Shi     2022-02-28 @179   return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && vma->vm_file &&
> >                                                                                                     ^^^^^^^^^^^^
> > Checked too late.
>
> Yep. We should check vma->vm_file first before we access vma->vm_file->f_inode.

Ah, yes, thanks for the report and the suggestion.

>
> Thanks.
>
> >
> > 2224ed1155c07b Yang Shi     2022-02-28  180          (vma->vm_flags & VM_EXEC) &&
> > 2224ed1155c07b Yang Shi     2022-02-28  181          !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
> > 2224ed1155c07b Yang Shi     2022-02-28  182  }
> >
> > ---
> > 0-DAY CI Kernel Test Service, Intel Corporation
> > https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
> >
> > .
> >
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-03-03 19:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-28 23:57 [PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent Yang Shi
2022-02-28 23:57 ` [PATCH 1/8] sched: coredump.h: clarify the use of MMF_VM_HUGEPAGE Yang Shi
2022-03-01  8:45   ` Miaohe Lin
2022-03-01 21:49     ` Yang Shi
2022-03-02  1:39       ` Miaohe Lin
2022-02-28 23:57 ` [PATCH 2/8] mm: khugepaged: remove redundant check for VM_NO_KHUGEPAGED Yang Shi
2022-03-01  9:07   ` Miaohe Lin
2022-03-02 18:43     ` Yang Shi
2022-03-03 10:53       ` Miaohe Lin
2022-02-28 23:57 ` [PATCH 3/8] mm: khugepaged: skip DAX vma Yang Shi
2022-03-02  3:21   ` Miaohe Lin
2022-02-28 23:57 ` [PATCH 4/8] mm: thp: only regular file could be THP eligible Yang Shi
2022-02-28 23:57 ` [PATCH 5/8] mm: khugepaged: make khugepaged_enter() void function Yang Shi
2022-02-28 23:57 ` [PATCH 6/8] mm: khugepaged: move some khugepaged_* functions to khugepaged.c Yang Shi
2022-02-28 23:57 ` [PATCH 7/8] mm: khugepaged: introduce khugepaged_enter_file() helper Yang Shi
2022-02-28 23:57 ` [PATCH 8/8] fs: register suitable readonly vmas for khugepaged Yang Shi
2022-03-01 16:57 [PATCH 4/8] mm: thp: only regular file could be THP eligible kernel test robot
2022-03-03 11:43 ` Dan Carpenter
2022-03-03 11:43 ` Dan Carpenter
2022-03-03 11:48 ` Miaohe Lin
2022-03-03 11:48   ` Miaohe Lin
2022-03-03 19:14   ` Yang Shi
2022-03-03 19:14     ` Yang Shi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.