incoming

* incoming
@ 2021-10-28 21:35 Andrew Morton
  2021-10-28 21:36 ` [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT Andrew Morton
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

11 patches, based on 411a44c24a561e449b592ff631b7ae321f1eb559.

Subsystems affected by this patch series:

  mm/memcg
  mm/memory-failure
  mm/oom-kill
  ocfs2
  mm/secretmem
  mm/vmalloc
  mm/hugetlb
  mm/damon
  mm/tools

Subsystem: mm/memcg

    Shakeel Butt <shakeelb@google.com>:
      memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT

Subsystem: mm/memory-failure

    Yang Shi <shy828301@gmail.com>:
      mm: hwpoison: remove the unnecessary THP check
      mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

Subsystem: mm/oom-kill

    Suren Baghdasaryan <surenb@google.com>:
      mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap

Subsystem: ocfs2

    Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>:
      ocfs2: fix race between searching chunks and release journal_head from buffer_head

Subsystem: mm/secretmem

    Kees Cook <keescook@chromium.org>:
      mm/secretmem: avoid letting secretmem_users drop to zero

Subsystem: mm/vmalloc

    Chen Wandun <chenwandun@huawei.com>:
      mm/vmalloc: fix numa spreading for large hash tables

Subsystem: mm/hugetlb

    Rongwei Wang <rongwei.wang@linux.alibaba.com>:
      mm, thp: bail out early in collapse_file for writeback page

    Yang Shi <shy828301@gmail.com>:
      mm: khugepaged: skip huge page collapse for special files

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
      mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'

Subsystem: mm/tools

    David Yang <davidcomponentone@gmail.com>:
      tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer

 fs/ocfs2/suballoc.c                               |   22 ++++++++++-------
 include/linux/page-flags.h                        |   23 ++++++++++++++++++
 mm/damon/core-test.h                              |    4 +--
 mm/huge_memory.c                                  |    2 +
 mm/khugepaged.c                                   |   26 +++++++++++++-------
 mm/memory-failure.c                               |   28 +++++++++++-----------
 mm/memory.c                                       |    9 +++++++
 mm/oom_kill.c                                     |   23 +++++++++---------
 mm/page_alloc.c                                   |    8 +++++-
 mm/secretmem.c                                    |    2 -
 mm/vmalloc.c                                      |   15 +++++++----
 tools/testing/selftests/vm/split_huge_page_test.c |    2 -
 12 files changed, 110 insertions(+), 54 deletions(-)



* [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT
  2021-10-28 21:35 incoming Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 02/11] mm: hwpoison: remove the unnecessary THP check Andrew Morton
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, david, guro, hannes, linux-mm, mhocko, mm-commits,
	shakeelb, torvalds, vvs

From: Shakeel Butt <shakeelb@google.com>
Subject: memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT

Commit 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in
__vmalloc_area_node()") switched to the bulk page allocator for the order-0
allocations backing vmalloc.  However, the bulk page allocator does not
support __GFP_ACCOUNT allocations, and there are several users of
kvmalloc(__GFP_ACCOUNT).

For now, make __GFP_ACCOUNT allocations bypass the bulk page allocator.  If
a workload turns up that would benefit significantly from bulk allocation
with __GFP_ACCOUNT support, we can revisit the decision.

Link: https://lkml.kernel.org/r/20211014151607.2171970-1-shakeelb@google.com
Fixes: 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/mm/page_alloc.c~memcg-page_alloc-skip-bulk-allocator-for-__gfp_account
+++ a/mm/page_alloc.c
@@ -5223,6 +5223,10 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
 		goto out;
 
+	/* Bulk allocator does not support memcg accounting. */
+	if (memcg_kmem_enabled() && (gfp & __GFP_ACCOUNT))
+		goto failed;
+
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
 		goto failed;
_
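
As a rough userspace illustration of the check added above (not kernel
code): the bulk path is skipped whenever accounting is enabled and the
request carries the accounting flag, so the caller falls back to
page-at-a-time allocation.  The names MODEL_GFP_ACCOUNT,
model_kmem_enabled and model_use_bulk_path() are hypothetical stand-ins
for __GFP_ACCOUNT, memcg_kmem_enabled() and the decision made inside
__alloc_pages_bulk(); the "nothing left to allocate" early exit is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_ACCOUNT; the bit value is arbitrary. */
#define MODEL_GFP_ACCOUNT (1u << 0)

/* Hypothetical stand-in for memcg_kmem_enabled(). */
bool model_kmem_enabled = true;

/*
 * Returns true when the bulk path may serve the remaining pages.
 * Mirrors the order of the checks in the hunk above: accounted requests
 * and single-page requests both go to the single page allocator.
 */
bool model_use_bulk_path(unsigned int gfp, int nr_pages, int nr_populated)
{
	/* This sketch assumes at least one page is still missing. */
	assert(nr_pages > nr_populated);

	if (model_kmem_enabled && (gfp & MODEL_GFP_ACCOUNT))
		return false;
	if (nr_pages - nr_populated == 1)
		return false;
	return true;
}
```

With accounting enabled, model_use_bulk_path(MODEL_GFP_ACCOUNT, 8, 0)
is false while model_use_bulk_path(0, 8, 0) is true.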


* [patch 02/11] mm: hwpoison: remove the unnecessary THP check
  2021-10-28 21:35 incoming Andrew Morton
  2021-10-28 21:36 ` [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 03/11] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Andrew Morton
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, stable, torvalds,
	willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: hwpoison: remove the unnecessary THP check

When handling a THP, hwpoison checked whether the THP was in the allocation
or free stage, since hwpoison could mistreat it as a hugetlb page.  Commit
415c64c1453a ("mm/memory-failure: split thp earlier in memory error
handling") fixed that problem, so the check is no longer needed.
Remove it.  A side effect of the removal is that hwpoison may report an
unsplit THP instead of an unknown error for shmem THP.  That does not seem
like a big deal.

The following patch, "mm: filemap: check if THP has hwpoisoned subpage for
PMD page fault", depends on this one.  It fixes shmem THPs with hwpoisoned
subpage(s) being wrongly PMD mapped.  So this patch needs to be backported
to -stable as well.

Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   14 --------------
 1 file changed, 14 deletions(-)

--- a/mm/memory-failure.c~mm-hwpoison-remove-the-unnecessary-thp-check
+++ a/mm/memory-failure.c
@@ -1147,20 +1147,6 @@ static int __get_hwpoison_page(struct pa
 	if (!HWPoisonHandlable(head))
 		return -EBUSY;
 
-	if (PageTransHuge(head)) {
-		/*
-		 * Non anonymous thp exists only in allocation/free time. We
-		 * can't handle such a case correctly, so let's give it up.
-		 * This should be better than triggering BUG_ON when kernel
-		 * tries to touch the "partially handled" page.
-		 */
-		if (!PageAnon(head)) {
-			pr_err("Memory failure: %#lx: non anonymous thp\n",
-				page_to_pfn(page));
-			return 0;
-		}
-	}
-
 	if (get_page_unless_zero(head)) {
 		if (head == compound_head(page))
 			return 1;
_


* [patch 03/11] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
  2021-10-28 21:35 incoming Andrew Morton
  2021-10-28 21:36 ` [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT Andrew Morton
  2021-10-28 21:36 ` [patch 02/11] mm: hwpoison: remove the unnecessary THP check Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 04/11] mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap Andrew Morton
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, stable, torvalds,
	willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

When handling a shmem page fault, a THP with a corrupted subpage could be
PMD mapped if certain conditions are satisfied.  But the kernel is supposed
to send SIGBUS when trying to map a hwpoisoned page.

There are two paths which may do a PMD map: fault around and regular fault.

Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") things were even worse in the fault around path.  The THP could
be PMD mapped as long as the VMA fit, regardless of which subpage was
accessed or corrupted.  After that commit, the THP could be PMD mapped as
long as the head page was not corrupted.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page was not accessed and the VMA fit.

This loophole could be fixed by iterating over every subpage to check
whether any of them is hwpoisoned, but that is somewhat costly in the page
fault path.

So introduce a new page flag called HasHWPoisoned on the first tail page.
It indicates that the THP has hwpoisoned subpage(s).  It is set if any
subpage of the THP is found hwpoisoned by memory failure, after the refcount
has been bumped successfully, and is cleared when the THP is freed or split.

The soft offline path doesn't need this, since the soft offline handler only
marks a subpage hwpoisoned once the subpage has been migrated successfully,
and a shmem THP doesn't get split and then migrated at all.

Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-flags.h |   23 +++++++++++++++++++++++
 mm/huge_memory.c           |    2 ++
 mm/memory-failure.c        |   14 ++++++++++++++
 mm/memory.c                |    9 +++++++++
 mm/page_alloc.c            |    4 +++-
 5 files changed, 51 insertions(+), 1 deletion(-)

--- a/include/linux/page-flags.h~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/include/linux/page-flags.h
@@ -171,6 +171,15 @@ enum pageflags {
 	/* Compound pages. Stored in first tail page's flags */
 	PG_double_map = PG_workingset,
 
+#ifdef CONFIG_MEMORY_FAILURE
+	/*
+	 * Compound pages. Stored in first tail page's flags.
+	 * Indicates that at least one subpage is hwpoisoned in the
+	 * THP.
+	 */
+	PG_has_hwpoisoned = PG_mappedtodisk,
+#endif
+
 	/* non-lru isolated movable page */
 	PG_isolated = PG_reclaim,
 
@@ -668,6 +677,20 @@ PAGEFLAG_FALSE(DoubleMap)
 	TESTSCFLAG_FALSE(DoubleMap)
 #endif
 
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+/*
+ * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
+ * compound page.
+ *
+ * This flag is set by hwpoison handler.  Cleared by THP split or free page.
+ */
+PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+	TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+#else
+PAGEFLAG_FALSE(HasHWPoisoned)
+	TESTSCFLAG_FALSE(HasHWPoisoned)
+#endif
+
 /*
  * Check if a page is currently marked HWPoisoned. Note that this check is
  * best effort only and inherently racy: there is no way to synchronize with
--- a/mm/huge_memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/huge_memory.c
@@ -2426,6 +2426,8 @@ static void __split_huge_page(struct pag
 	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
 	lruvec = lock_page_lruvec(head);
 
+	ClearPageHasHWPoisoned(head);
+
 	for (i = nr - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond EOF: drop them from page cache */
--- a/mm/memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/memory.c
@@ -3907,6 +3907,15 @@ vm_fault_t do_set_pmd(struct vm_fault *v
 		return ret;
 
 	/*
+	 * Just backoff if any subpage of a THP is corrupted otherwise
+	 * the corrupted page may mapped by PMD silently to escape the
+	 * check.  This kind of THP just can be PTE mapped.  Access to
+	 * the corrupted subpage should trigger SIGBUS as expected.
+	 */
+	if (unlikely(PageHasHWPoisoned(page)))
+		return ret;
+
+	/*
 	 * Archs like ppc64 need additional space to store information
 	 * related to pte entry. Use the preallocated table for that.
 	 */
--- a/mm/memory-failure.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/memory-failure.c
@@ -1694,6 +1694,20 @@ try_again:
 	}
 
 	if (PageTransHuge(hpage)) {
+		/*
+		 * The flag must be set after the refcount is bumped
+		 * otherwise it may race with THP split.
+		 * And the flag can't be set in get_hwpoison_page() since
+		 * it is called by soft offline too and it is just called
+		 * for !MF_COUNT_INCREASE.  So here seems to be the best
+		 * place.
+		 *
+		 * Don't need care about the above error handling paths for
+		 * get_hwpoison_page() since they handle either free page
+		 * or unhandlable page.  The refcount is bumped iff the
+		 * page is a valid handlable page.
+		 */
+		SetPageHasHWPoisoned(hpage);
 		if (try_to_split_thp_page(p, "Memory Failure") < 0) {
 			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
 			res = -EBUSY;
--- a/mm/page_alloc.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/page_alloc.c
@@ -1312,8 +1312,10 @@ static __always_inline bool free_pages_p
 
 		VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
 
-		if (compound)
+		if (compound) {
 			ClearPageDoubleMap(page);
+			ClearPageHasHWPoisoned(page);
+		}
 		for (i = 1; i < (1 << order); i++) {
 			if (compound)
 				bad += free_tail_pages_check(page, page + i);
_
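
The idea behind HasHWPoisoned can be sketched in userspace as follows:
a summary bit kept on the first tail page lets the fault path answer
"is any subpage poisoned?" in constant time instead of scanning the
whole THP.  This is only a model under stated assumptions: struct
model_page, the bit values and all model_*() helpers are hypothetical,
and the sketch sets the per-subpage mark and the summary bit in one
helper, which the kernel does not do in a single step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of struct page: only a flags word. */
struct model_page {
	unsigned long flags;
};

/* Arbitrary bit choices for this sketch; not the kernel's values. */
#define MODEL_PG_HWPOISON       (1UL << 0) /* per-subpage poison mark */
#define MODEL_PG_HAS_HWPOISONED (1UL << 1) /* kept on the first tail page */

/* Mark one subpage poisoned and record that once for the whole THP. */
void model_poison_subpage(struct model_page *head, size_t nr, size_t idx)
{
	assert(nr >= 2 && idx < nr);
	head[idx].flags |= MODEL_PG_HWPOISON;
	head[1].flags |= MODEL_PG_HAS_HWPOISONED;
}

/* O(1) test for the fault path; no walk over the subpages is needed. */
bool model_has_hwpoisoned(const struct model_page *head)
{
	return (head[1].flags & MODEL_PG_HAS_HWPOISONED) != 0;
}

/* Split or free clears the summary bit; per-subpage marks stay. */
void model_clear_has_hwpoisoned(struct model_page *head)
{
	head[1].flags &= ~MODEL_PG_HAS_HWPOISONED;
}

/* PMD mapping is refused while any subpage is known to be poisoned. */
bool model_can_pmd_map(const struct model_page *head)
{
	return !model_has_hwpoisoned(head);
}
```

Storing the bit on the first tail page follows the convention already
used for PG_double_map in the hunk above.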


* [patch 04/11] mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
  2021-10-28 21:35 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-10-28 21:36 ` [patch 03/11] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 05/11] ocfs2: fix race between searching chunks and release journal_head from buffer_head Andrew Morton
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, christian.brauner, christian, david, fweimer, guro, hannes,
	hch, jannh, jengelh, linux-mm, luto, mhocko, minchan, mm-commits,
	oleg, riel, rientjes, shakeelb, surenb, torvalds, willy

From: Suren Baghdasaryan <surenb@google.com>
Subject: mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap

A race between process_mrelease and exit_mmap, where free_pgtables is called
while __oom_reap_task_mm is in progress, leads to a kernel crash during the
pte_offset_map_lock call.  The oom-reaper avoids this race by setting the
MMF_OOM_VICTIM flag and causing exit_mmap to take and release
mmap_write_lock, blocking it until the oom-reaper releases mmap_read_lock.

Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
fix this race, however that would be considered a hack.  Fix this race by
elevating mm->mm_users and preventing exit_mmap from executing until
process_mrelease is finished.  The patch slightly refactors the code to
handle a possible mmget_not_zero failure.

This fix has a considerable negative impact on process_mrelease performance
and will likely need later optimization.

Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com
Fixes: 884a7e5964e0 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/oom_kill.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

--- a/mm/oom_kill.c~mm-prevent-a-race-between-process_mrelease-and-exit_mmap
+++ a/mm/oom_kill.c
@@ -1150,7 +1150,7 @@ SYSCALL_DEFINE2(process_mrelease, int, p
 	struct task_struct *task;
 	struct task_struct *p;
 	unsigned int f_flags;
-	bool reap = true;
+	bool reap = false;
 	struct pid *pid;
 	long ret = 0;
 
@@ -1177,15 +1177,15 @@ SYSCALL_DEFINE2(process_mrelease, int, p
 		goto put_task;
 	}
 
-	mm = p->mm;
-	mmgrab(mm);
-
-	/* If the work has been done already, just exit with success */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		reap = false;
-	else if (!task_will_free_mem(p)) {
-		reap = false;
-		ret = -EINVAL;
+	if (mmget_not_zero(p->mm)) {
+		mm = p->mm;
+		if (task_will_free_mem(p))
+			reap = true;
+		else {
+			/* Error only if the work has not been done already */
+			if (!test_bit(MMF_OOM_SKIP, &mm->flags))
+				ret = -EINVAL;
+		}
 	}
 	task_unlock(p);
 
@@ -1201,7 +1201,8 @@ SYSCALL_DEFINE2(process_mrelease, int, p
 	mmap_read_unlock(mm);
 
 drop_mm:
-	mmdrop(mm);
+	if (mm)
+		mmput(mm);
 put_task:
 	put_task_struct(task);
 put_pid:
_
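
The "take a reference unless the count is already zero" pattern that
mmget_not_zero() relies on can be sketched with C11 atomics.  This is a
userspace model, not the kernel implementation: struct model_mm,
model_get_not_zero() and model_put() are hypothetical names.  The point
is that a holder of such a reference keeps teardown from starting, and
a failed attempt means teardown has already begun.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical miniature of mm_struct: only the user count. */
struct model_mm {
	atomic_int users;
};

/*
 * Take a reference only if the count is still non-zero.  A zero count
 * means teardown has already begun and the object must not be revived.
 */
bool model_get_not_zero(struct model_mm *mm)
{
	int old = atomic_load(&mm->users);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
bool model_put(struct model_mm *mm)
{
	int old = atomic_fetch_sub(&mm->users, 1);

	assert(old > 0);
	return old == 1;
}
```

A caller that gets false back must skip the work entirely, which is why
the patch initialises reap to false and only calls mmput() when mm was
actually obtained.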


* [patch 05/11] ocfs2: fix race between searching chunks and release journal_head from buffer_head
  2021-10-28 21:35 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-10-28 21:36 ` [patch 04/11] mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 06/11] mm/secretmem: avoid letting secretmem_users drop to zero Andrew Morton
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, gautham.ananthakrishna, gechangwei, ghe, jlbec, joseph.qi,
	junxiao.bi, linux-mm, mark, mm-commits, piaojun,
	rajesh.sivaramasubramaniom, stable, torvalds

From: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Subject: ocfs2: fix race between searching chunks and release journal_head from buffer_head

Encountered a race between ocfs2_test_bg_bit_allocatable() and
jbd2_journal_put_journal_head() resulting in the below vmcore.

PID: 106879  TASK: ffff880244ba9c00  CPU: 2   COMMAND: "loop3"
 0 [ffff8802435ff1c0] panic at ffffffff816ed175
 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9
 2 [ffff8802435ff270] no_context at ffffffff8106eccf
 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d
 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143
 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b
 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f
 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667
    [exception RIP: ocfs2_block_group_find_clear_bits+316]
    RIP: ffffffffc11ef6fc  RSP: ffff8802435ff498  RFLAGS: 00010206
    RAX: 0000000000003918  RBX: 0000000000000001  RCX: 0000000000000018
    RDX: 0000000000003918  RSI: 0000000000000000  RDI: ffff880060194040
    RBP: ffff8802435ff4f8   R8: ffffffffff000000   R9: ffffffffffffffff
    R10: ffff8802435ff730  R11: ffff8802a94e5800  R12: 0000000000000007
    R13: 0000000000007e00  R14: 0000000000003918  R15: ffff88017c973a28
    ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2]
 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2]
10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2]
11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2]
12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2]
13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf [ocfs2]
14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2]
15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2]
16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2]
17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2]
18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2]
19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2]
20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2]
21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2]
22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2]
23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d
24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802
25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2
26 [ffff8802435ffec0] kthread at ffffffff810a7afb
27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh),
bg_bh->b_private was NULL because jbd2_journal_put_journal_head() had raced
and released the journal head from the buffer head.  Take the bit lock
for the 'BH_JournalHead' bit to fix this race.

Link: https://lkml.kernel.org/r/1634820718-6043-1-git-send-email-gautham.ananthakrishna@oracle.com
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: <rajesh.sivaramasubramaniom@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/suballoc.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

--- a/fs/ocfs2/suballoc.c~ocfs2-race-between-searching-chunks-and-release-journal_head-from-buffer_head
+++ a/fs/ocfs2/suballoc.c
@@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable
 {
 	struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	struct journal_head *jh;
-	int ret;
+	int ret = 1;
 
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
@@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable
 	if (!buffer_jbd(bg_bh))
 		return 1;
 
-	jh = bh2jh(bg_bh);
-	spin_lock(&jh->b_state_lock);
-	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
-	if (bg)
-		ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
-	else
-		ret = 1;
-	spin_unlock(&jh->b_state_lock);
+	jbd_lock_bh_journal_head(bg_bh);
+	if (buffer_jbd(bg_bh)) {
+		jh = bh2jh(bg_bh);
+		spin_lock(&jh->b_state_lock);
+		bg = (struct ocfs2_group_desc *) jh->b_committed_data;
+		if (bg)
+			ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
+		else
+			ret = 1;
+		spin_unlock(&jh->b_state_lock);
+	}
+	jbd_unlock_bh_journal_head(bg_bh);
 
 	return ret;
 }
_
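
The shape of the fix, re-checking for the journal head after taking the
lock that the releaser also holds, can be sketched in userspace.  All
names here are hypothetical: struct model_bh and struct model_jh are
miniatures, and an atomic_flag spinlock stands in for the
BH_JournalHead bit lock.  The unlocked fast-path check and the inner
b_state_lock from the real function are left out of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_jh {
	const unsigned char *committed_bitmap; /* may be NULL */
};

struct model_bh {
	atomic_flag jh_lock;         /* stands in for the bit lock */
	struct model_jh *jh;         /* NULL once the head is released */
	const unsigned char *bitmap; /* current allocation bitmap */
};

static bool model_test_bit(const unsigned char *map, unsigned int nr)
{
	return (map[nr / 8] >> (nr % 8)) & 1;
}

void model_lock(struct model_bh *bh)
{
	while (atomic_flag_test_and_set(&bh->jh_lock))
		; /* spin until the holder clears the flag */
}

void model_unlock(struct model_bh *bh)
{
	atomic_flag_clear(&bh->jh_lock);
}

/* The releaser detaches the journal head only while holding the lock. */
void model_release_jh(struct model_bh *bh)
{
	model_lock(bh);
	bh->jh = NULL;
	model_unlock(bh);
}

/* Returns 1 if bit nr is free in both the current and committed maps. */
int model_bit_allocatable(struct model_bh *bh, unsigned int nr)
{
	int ret = 1;

	assert(bh->bitmap != NULL);
	if (model_test_bit(bh->bitmap, nr))
		return 0;

	model_lock(bh);
	/* Re-check under the lock: the head may be gone by now. */
	if (bh->jh && bh->jh->committed_bitmap)
		ret = !model_test_bit(bh->jh->committed_bitmap, nr);
	model_unlock(bh);

	return ret;
}
```

Initialising ret to 1 up front, as the patch does, gives the "no
journal head" case a defined result without an extra branch.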


* [patch 06/11] mm/secretmem: avoid letting secretmem_users drop to zero
  2021-10-28 21:35 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-10-28 21:36 ` [patch 05/11] ocfs2: fix race between searching chunks and release journal_head from buffer_head Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 07/11] mm/vmalloc: fix numa spreading for large hash tables Andrew Morton
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, david, dvyukov, jordy, keescook, linux-mm, mm-commits,
	rppt, torvalds

From: Kees Cook <keescook@chromium.org>
Subject: mm/secretmem: avoid letting secretmem_users drop to zero

Quoting Dmitry: "refcount_inc() needs to be done before fd_install(). 
After fd_install() finishes, the fd can be used by userspace and we can
have secret data in memory before the refcount_inc().

A straightforward misuse where a user will predict the returned fd in
another thread before the syscall returns and will use it to store secret
data is somewhat dubious because such a user just shoots themself in the
foot.

But a more interesting misuse would be to close the predicted fd and
decrement the refcount before the corresponding refcount_inc, this way one
can briefly drop the refcount to zero while there are other users of
secretmem."

Move fd_install() after refcount_inc().

Link: https://lkml.kernel.org/r/20211021154046.880251-1-keescook@chromium.org
Link: https://lore.kernel.org/lkml/CACT4Y+b1sW6-Hkn8HQYw_SsT7X3tp-CJNh2ci0wG3ZnQz9jjig@mail.gmail.com
Fixes: 9a436f8ff631 ("PM: hibernate: disable when there are active secretmem users")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jordy Zomer <jordy@pwning.systems>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/secretmem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/secretmem.c~mm-secretmem-avoid-letting-secretmem_users-drop-to-zero
+++ a/mm/secretmem.c
@@ -218,8 +218,8 @@ SYSCALL_DEFINE1(memfd_secret, unsigned i
 
 	file->f_flags |= O_LARGEFILE;
 
-	fd_install(fd, file);
 	atomic_inc(&secretmem_users);
+	fd_install(fd, file);
 	return fd;
 
 err_put_fd:
_
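
The ordering rule in this patch, account first and publish second, can
be modelled in userspace.  The sketch below is only an illustration
under stated assumptions: model_users, model_fd_table and the model_*()
helpers are hypothetical stand-ins for secretmem_users, the fd table
and fd_install(), and the close path is simplified.  The assertion
inside model_create() states the invariant that the swapped order
protects: the count never lags behind what is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_FDS 8

atomic_int model_users;              /* stands in for secretmem_users */
void *model_fd_table[MODEL_MAX_FDS]; /* stands in for the fd table */

/* Number of descriptors currently visible to "userspace". */
int model_installed(void)
{
	int n = 0;

	for (size_t i = 0; i < MODEL_MAX_FDS; i++)
		n += (model_fd_table[i] != NULL);
	return n;
}

/* Account first, publish second. */
int model_create(int fd, void *file)
{
	assert(fd >= 0 && fd < MODEL_MAX_FDS);
	assert(file != NULL && model_fd_table[fd] == NULL);

	atomic_fetch_add(&model_users, 1);
	/* Before publication the count is already ahead of visibility. */
	assert(model_installed() < atomic_load(&model_users));
	model_fd_table[fd] = file; /* the fd_install() step */
	return fd;
}

/* Closing unpublishes the descriptor and then drops the count. */
void model_close(int fd)
{
	assert(fd >= 0 && fd < MODEL_MAX_FDS && model_fd_table[fd] != NULL);
	model_fd_table[fd] = NULL;
	atomic_fetch_sub(&model_users, 1);
}
```

If the two steps in model_create() were swapped, another thread could
call model_close() on the published slot before the increment and
briefly push the count below the number of live users.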


* [patch 07/11] mm/vmalloc: fix numa spreading for large hash tables
  2021-10-28 21:35 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-10-28 21:36 ` [patch 06/11] mm/secretmem: avoid letting secretmem_users drop to zero Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 08/11] mm, thp: bail out early in collapse_file for writeback page Andrew Morton
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, chenwandun, edumazet, guohanjun, linux-mm, mm-commits,
	npiggin, shakeelb, torvalds, urezki, wangkefeng.wang

From: Chen Wandun <chenwandun@huawei.com>
Subject: mm/vmalloc: fix numa spreading for large hash tables

Eric Dumazet reported strange NUMA spreading info in [1], and found that
commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings") introduced
this issue [2].

Digging into the difference before and after that commit, the page
allocation paths differ:
before:
alloc_large_system_hash
    __vmalloc
        __vmalloc_node(..., NUMA_NO_NODE, ...)
            __vmalloc_node_range
                __vmalloc_area_node
                    alloc_page /* because NUMA_NO_NODE, so choose alloc_page branch */
                        alloc_pages_current
                            alloc_page_interleave /* can be proved by print policy mode */

after:
alloc_large_system_hash
    __vmalloc
        __vmalloc_node(..., NUMA_NO_NODE, ...)
            __vmalloc_node_range
                __vmalloc_area_node
                    alloc_pages_node /* choose nid by numa_mem_id() */
                        __alloc_pages_node(nid, ....)

So after commit 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings"),
memory is allocated on the current node instead of being interleaved
across nodes.

[1]
https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/

[2]
https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/

Link: https://lkml.kernel.org/r/20211021080744.874701-2-chenwandun@huawei.com
Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-fix-numa-spreading-for-large-hash-tables
+++ a/mm/vmalloc.c
@@ -2816,6 +2816,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		unsigned int order, unsigned int nr_pages, struct page **pages)
 {
 	unsigned int nr_allocated = 0;
+	struct page *page;
+	int i;
 
 	/*
 	 * For order-0 pages we make use of bulk allocator, if
@@ -2823,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order) {
+	if (!order && nid != NUMA_NO_NODE) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2848,7 +2850,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else
+	} else if (order)
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.
@@ -2856,11 +2858,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		gfp |= __GFP_COMP;
 
 	/* High-order pages or fallback path if "bulk" fails. */
-	while (nr_allocated < nr_pages) {
-		struct page *page;
-		int i;
 
-		page = alloc_pages_node(nid, gfp, order);
+	while (nr_allocated < nr_pages) {
+		if (nid == NUMA_NO_NODE)
+			page = alloc_pages(gfp, order);
+		else
+			page = alloc_pages_node(nid, gfp, order);
 		if (unlikely(!page))
 			break;
 
_
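
The path selection after this patch can be summarised in a small
userspace model.  The enum and both functions are hypothetical names
introduced for illustration; only the value of NUMA_NO_NODE (-1) and
the branch conditions are taken from the kernel and the hunks above.

```c
#include <assert.h>

#define MODEL_NUMA_NO_NODE (-1) /* same value as the kernel's NUMA_NO_NODE */

enum model_path {
	MODEL_BULK,   /* order-0 bulk allocator on a specific node */
	MODEL_POLICY, /* alloc_pages(): follows the task's memory policy */
	MODEL_NODE,   /* alloc_pages_node(): allocates on nid */
};

/* Path taken by the page-at-a-time loop. */
enum model_path model_fallback_path(int nid)
{
	assert(nid >= MODEL_NUMA_NO_NODE);
	return nid == MODEL_NUMA_NO_NODE ? MODEL_POLICY : MODEL_NODE;
}

/* First path tried for a request of the given order. */
enum model_path model_first_path(int nid, unsigned int order)
{
	/* Bulk allocation is used only for order-0 on an explicit node. */
	if (!order && nid != MODEL_NUMA_NO_NODE)
		return MODEL_BULK;
	return model_fallback_path(nid);
}
```

For the alloc_large_system_hash() case traced above, nid is
NUMA_NO_NODE and order is 0, so the model picks MODEL_POLICY, which is
where an interleave policy can take effect again.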


* [patch 08/11] mm, thp: bail out early in collapse_file for writeback page
  2021-10-28 21:35 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-10-28 21:36 ` [patch 07/11] mm/vmalloc: fix numa spreading for large hash tables Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 09/11] mm: khugepaged: skip huge page collapse for special files Andrew Morton
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mike.kravetz, mm-commits,
	rongwei.wang, shy828301, song, stable, torvalds,
	william.kucharski, willy, xuyu

From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Subject: mm, thp: bail out early in collapse_file for writeback page

Currently collapse_file does not explicitly check PG_writeback; instead,
page_has_private and try_to_release_page are used to filter out writeback
pages.  This does not work for xfs with a blocksize equal to or larger than
the pagesize, because in such a case xfs has no page->private.

Make collapse_file bail out early for a writeback page.  Otherwise,
xfs end_page_writeback will panic as follows.

page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32
aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so"
flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback)
raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8
raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000
page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
page->mem_cgroup:ffff0000c3e9a000
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1212!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in:
BUG: Bad page state in process khugepaged  pfn:84ef32
 xfs(E)
page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32
 libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ...
CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ...
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : end_page_writeback+0x1c0/0x214
lr : end_page_writeback+0x1c0/0x214
sp : ffff800011ce3cc0
x29: ffff800011ce3cc0 x28: 0000000000000000
x27: ffff000c04608040 x26: 0000000000000000
x25: ffff000c04608040 x24: 0000000000001000
x23: ffff0003f88c8530 x22: 0000000000001000
x21: ffff0003f88c8530 x20: 0000000000000000
x19: fffffe00201bcc80 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: ffff000c018f9760 x14: ffffffffffffffff
x13: ffff8000119d72b0 x12: ffff8000119d6ee3
x11: ffff8000117b69b8 x10: 00000000ffff8000
x9 : ffff800010617534 x8 : 0000000000000000
x7 : ffff8000114f69b8 x6 : 000000000000000f
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 0000000000000400 x2 : 0000000000000000
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 end_page_writeback+0x1c0/0x214
 iomap_finish_page_writeback+0x13c/0x204
 iomap_finish_ioend+0xe8/0x19c
 iomap_writepage_end_bio+0x38/0x50
 bio_endio+0x168/0x1ec
 blk_update_request+0x278/0x3f0
 blk_mq_end_request+0x34/0x15c
 virtblk_request_done+0x38/0x74 [virtio_blk]
 blk_done_softirq+0xc4/0x110
 __do_softirq+0x128/0x38c
 __irq_exit_rcu+0x118/0x150
 irq_exit+0x1c/0x30
 __handle_domain_irq+0x8c/0xf0
 gic_handle_irq+0x84/0x108
 el1_irq+0xcc/0x180
 arch_cpu_idle+0x18/0x40
 default_idle_call+0x4c/0x1a0
 cpuidle_idle_call+0x168/0x1e0
 do_idle+0xb4/0x104
 cpu_startup_entry+0x30/0x9c
 secondary_start_kernel+0x104/0x180
Code: d4210000 b0006161 910c8021 94013f4d (d4210000)
---[ end trace 4a88c6a074082f8c ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt

Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Suggested-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <song@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/khugepaged.c~mm-thp-bail-out-early-in-collapse_file-for-writeback-page
+++ a/mm/khugepaged.c
@@ -1763,6 +1763,10 @@ static void collapse_file(struct mm_stru
 				filemap_flush(mapping);
 				result = SCAN_FAIL;
 				goto xa_unlocked;
+			} else if (PageWriteback(page)) {
+				xas_unlock_irq(&xas);
+				result = SCAN_FAIL;
+				goto xa_unlocked;
 			} else if (trylock_page(page)) {
 				get_page(page);
 				xas_unlock_irq(&xas);
@@ -1798,7 +1802,8 @@ static void collapse_file(struct mm_stru
 			goto out_unlock;
 		}
 
-		if (!is_shmem && PageDirty(page)) {
+		if (!is_shmem && (PageDirty(page) ||
+				  PageWriteback(page))) {
 			/*
 			 * khugepaged only works on read-only fd, so this
 			 * page is dirty because it hasn't been flushed
_

* [patch 09/11] mm: khugepaged: skip huge page collapse for special files
  2021-10-28 21:35 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-10-28 21:36 ` [patch 08/11] mm, thp: bail out early in collapse_file for writeback page Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 10/11] mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' Andrew Morton
  2021-10-28 21:36 ` [patch 11/11] tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, andrea.righi, hughd, kirill.shutemov, linux-mm, mm-commits,
	shy828301, songliubraving, stable, sunhao.th, torvalds, willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: khugepaged: skip huge page collapse for special files

The read-only THP for filesystems will collapse THP for files opened
readonly and mapped with VM_EXEC.  The intended usecase is to avoid TLB
misses for large text segments.  But it doesn't restrict the file types, so
a THP could be collapsed for a non-regular file, for example a block
device, if it is opened readonly and mapped with EXEC permission.  This
may cause bugs, like [1] and [2].

This is definitely not the intended usecase, so only collapse THP for
regular files in order to close the attack surface.

[1] https://lore.kernel.org/lkml/CACkBjsYwLYLRmX8GpsDpMthagWOjWWrNxqY6ZLNQVr6yx+f5vA@mail.gmail.com/
[2] https://lore.kernel.org/linux-mm/000000000000c6a82505ce284e4c@google.com/

[shy828301@gmail.com: fix vm_file check]
  Link: https://lkml.kernel.org/r/CAHbLzkqTW9U3VvTu1Ki5v_cLRC9gHW+znBukg_ycergE0JWj-A@mail.gmail.com
Link: https://lkml.kernel.org/r/20211027195221.3825-1-shy828301@gmail.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Reported-by: syzbot+aae069be1de40fb11825@syzkaller.appspotmail.com
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

--- a/mm/khugepaged.c~mm-khugepaged-skip-huge-page-collapse-for-special-files
+++ a/mm/khugepaged.c
@@ -445,22 +445,25 @@ static bool hugepage_vma_check(struct vm
 	if (!transhuge_vma_enabled(vma, vm_flags))
 		return false;
 
+	if (vma->vm_file && !IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) -
+				vma->vm_pgoff, HPAGE_PMD_NR))
+		return false;
+
 	/* Enabled via shmem mount options or sysfs settings. */
-	if (shmem_file(vma->vm_file) && shmem_huge_enabled(vma)) {
-		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
-				HPAGE_PMD_NR);
-	}
+	if (shmem_file(vma->vm_file))
+		return shmem_huge_enabled(vma);
 
 	/* THP settings require madvise. */
 	if (!(vm_flags & VM_HUGEPAGE) && !khugepaged_always())
 		return false;
 
-	/* Read-only file mappings need to be aligned for THP to work. */
+	/* Only regular file is valid */
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && vma->vm_file &&
-	    !inode_is_open_for_write(vma->vm_file->f_inode) &&
 	    (vm_flags & VM_EXEC)) {
-		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
-				HPAGE_PMD_NR);
+		struct inode *inode = vma->vm_file->f_inode;
+
+		return !inode_is_open_for_write(inode) &&
+			S_ISREG(inode->i_mode);
 	}
 
 	if (!vma->anon_vma || vma->vm_ops)
_
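
For readers following the logic, the checks this patch arranges for
file-backed VMAs can be modeled in userspace.  The sketch below is not
kernel code: thp_file_eligible() and model_pmd_aligned() are hypothetical
names, the shmem and madvise/sysfs checks are left out, and
MODEL_PAGE_SHIFT = 12 and MODEL_HPAGE_PMD_NR = 512 are assumed values
(4 KiB pages, 2 MiB PMD) that differ on other configurations.

```c
#define _DEFAULT_SOURCE		/* expose S_IFREG/S_IFBLK under strict -std= modes */
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Assumed values for illustration: 4 KiB base pages, 2 MiB PMD-sized THP. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_HPAGE_PMD_NR 512UL

/*
 * Hypothetical helper: true if the VMA's first page index, relative to the
 * file offset, falls on a PMD boundary.  This models the IS_ALIGNED() test
 * that the patch now applies to every file-backed VMA.
 */
static bool model_pmd_aligned(unsigned long vm_start, unsigned long vm_pgoff)
{
	unsigned long idx = (vm_start >> MODEL_PAGE_SHIFT) - vm_pgoff;

	return (idx % MODEL_HPAGE_PMD_NR) == 0;
}

/*
 * Hypothetical model of the read-only-THP-for-FS branch: the mapping must be
 * aligned and executable, the inode must not be open for write, and (new
 * with this patch) the inode must be a regular file.
 */
static bool thp_file_eligible(mode_t i_mode, bool open_for_write, bool vm_exec,
			      unsigned long vm_start, unsigned long vm_pgoff)
{
	if (!model_pmd_aligned(vm_start, vm_pgoff))
		return false;

	return vm_exec && !open_for_write && S_ISREG(i_mode);
}
```

In this model, a read-only executable mapping of a block device
(S_IFBLK) is rejected even when it is perfectly aligned, which is the
case the patch is meant to close.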

* [patch 10/11] mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'
  2021-10-28 21:35 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-10-28 21:36 ` [patch 09/11] mm: khugepaged: skip huge page collapse for special files Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  2021-10-28 21:36 ` [patch 11/11] tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'

Kunit test cases for 'damon_split_regions_of()' expect the number of
regions after calling the function to be the same as the requested number
('nr_sub').  However, the requested number is just an upper limit, because
the function randomly decides the size of each sub-region.  This commit
fixes the wrong expectation.

Link: https://lkml.kernel.org/r/20211028090628.14948-1-sj@kernel.org
Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core-test.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/damon/core-test.h~mm-damon-core-test-fix-wrong-expectations-for-damon_split_regions_of
+++ a/mm/damon/core-test.h
@@ -219,14 +219,14 @@ static void damon_test_split_regions_of(
 	r = damon_new_region(0, 22);
 	damon_add_region(r, t);
 	damon_split_regions_of(c, t, 2);
-	KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 2u);
+	KUNIT_EXPECT_LE(test, damon_nr_regions(t), 2u);
 	damon_free_target(t);
 
 	t = damon_new_target(42);
 	r = damon_new_region(0, 220);
 	damon_add_region(r, t);
 	damon_split_regions_of(c, t, 4);
-	KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 4u);
+	KUNIT_EXPECT_LE(test, damon_nr_regions(t), 4u);
 	damon_free_target(t);
 	damon_destroy_ctx(c);
 }
_
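
To illustrate why the requested count is only an upper limit, here is a
simplified userspace model of a random split.  It is a sketch, not the
kernel implementation: model_split_count() and its parameters are
hypothetical, rand() stands in for the kernel's random helper, and the
loop structure (pick a random 10%..90% share, round it down, skip
degenerate sizes, continue with the left piece) is an assumption made for
illustration.  min_region must be nonzero.

```c
#include <stdlib.h>

/*
 * Simplified model: try up to nr_sub - 1 random splits of a region of the
 * given size and return how many pieces result.  Not kernel code.
 */
static unsigned int model_split_count(unsigned long size, unsigned int nr_sub,
				      unsigned long min_region)
{
	unsigned int pieces = 1;
	unsigned long sz_region = size;
	unsigned int i;

	for (i = 0; i + 1 < nr_sub && sz_region > 2 * min_region; i++) {
		/* Pick 10%..90% of the current piece. */
		unsigned long factor = 1 + (unsigned long)(rand() % 9);
		unsigned long sz_sub = factor * sz_region / 10;

		/* Round down to a multiple of min_region. */
		sz_sub -= sz_sub % min_region;

		/* A zero-sized or full-sized split would leave a blank piece. */
		if (sz_sub == 0 || sz_sub >= sz_region)
			continue;

		pieces++;
		sz_region = sz_sub;	/* keep splitting the left piece */
	}

	return pieces;
}
```

In this model, a region of size 220 with nr_sub = 4 and min_region = 1 can
shrink 220 -> 22 -> 2 after two splits, at which point the piece is too
small to split again and only three pieces exist.  An equality check
against 4 would fail in that case, while a less-than-or-equal check holds.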

* [patch 11/11] tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
  2021-10-28 21:35 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-10-28 21:36 ` [patch 10/11] mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' Andrew Morton
@ 2021-10-28 21:36 ` Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2021-10-28 21:36 UTC (permalink / raw)
  To: akpm, davidcomponentone, linux-mm, mm-commits, shuah, torvalds,
	zealci, ziy

From: David Yang <davidcomponentone@gmail.com>
Subject: tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer

The coccinelle check reports:
"./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42:
ERROR: application of sizeof to pointer"
Use strlen() to fix it.

Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/split_huge_page_test.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/testing/selftests/vm/split_huge_page_test.c~fix-application-of-sizeof-to-pointer
+++ a/tools/testing/selftests/vm/split_huge_page_test.c
@@ -341,7 +341,7 @@ void split_file_backed_thp(void)
 	}
 
 	/* write something to the file, so a file-backed THP can be allocated */
-	num_written = write(fd, tmpfs_loc, sizeof(tmpfs_loc));
+	num_written = write(fd, tmpfs_loc, strlen(tmpfs_loc) + 1);
 	close(fd);
 
 	if (num_written < 1) {
_
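
The difference the coccinelle check points at can be shown with a small
standalone sketch.  The helper names len_via_sizeof() and len_via_strlen()
are hypothetical and exist only for this illustration; they are not part
of the selftest.

```c
#include <stddef.h>
#include <string.h>

/*
 * What the old code computed: sizeof applied to a pointer yields the size
 * of the pointer itself, no matter how long the string it points to is.
 */
static size_t len_via_sizeof(const char *s)
{
	return sizeof(s);
}

/*
 * What the fixed code computes: the number of characters in the string
 * plus one for the terminating NUL byte.
 */
static size_t len_via_strlen(const char *s)
{
	return strlen(s) + 1;
}
```

For any string, len_via_sizeof() returns sizeof(char *), so the old
write() call would send a pointer-sized number of bytes rather than the
whole string, whereas len_via_strlen("abc") returns 4.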
