* incoming
@ 2021-07-15  4:26 Andrew Morton
  2021-07-15  4:26 ` [patch 01/13] mm: move helper to check slub_debug_enabled Andrew Morton
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

13 patches, based on 40226a3d96ef8ab8980f032681c8bfd46d63874e.

Subsystems affected by this patch series:

  mm/kasan
  mm/pagealloc
  mm/rmap
  mm/hmm
  hfs
  mm/hugetlb

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      mm: move helper to check slub_debug_enabled

    Yee Lee <yee.lee@mediatek.com>:
      kasan: add memzero init for unaligned size at DEBUG

    Marco Elver <elver@google.com>:
      kasan: fix build by including kernel.h

Subsystem: mm/pagealloc

    Matteo Croce <mcroce@microsoft.com>:
      Revert "mm/page_alloc: make should_fail_alloc_page() static"

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: avoid page allocator recursion with pagesets.lock held

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/page_alloc: correct return value when failing at preparing

    Chuck Lever <chuck.lever@oracle.com>:
      mm/page_alloc: further fix __alloc_pages_bulk() return value

Subsystem: mm/rmap

    Christoph Hellwig <hch@lst.de>:
      mm: fix the try_to_unmap prototype for !CONFIG_MMU

Subsystem: mm/hmm

    Alistair Popple <apopple@nvidia.com>:
      lib/test_hmm: remove set but unused page variable

Subsystem: hfs

    Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>:
    Patch series "hfs: fix various errors", v2:
      hfs: add missing clean-up in hfs_fill_super
      hfs: fix high memory mapping in hfs_bnode_read
      hfs: add lock nesting notation to hfs_find_init

Subsystem: mm/hugetlb

    Joao Martins <joao.m.martins@oracle.com>:
      mm/hugetlb: fix refs calculation from unaligned @vaddr

 fs/hfs/bfind.c        |   14 +++++++++++++-
 fs/hfs/bnode.c        |   25 ++++++++++++++++++++-----
 fs/hfs/btree.h        |    7 +++++++
 fs/hfs/super.c        |   10 +++++-----
 include/linux/kasan.h |    1 +
 include/linux/rmap.h  |    4 +++-
 lib/test_hmm.c        |    2 --
 mm/hugetlb.c          |    5 +++--
 mm/kasan/kasan.h      |   12 ++++++++++++
 mm/page_alloc.c       |   30 ++++++++++++++++++++++--------
 mm/slab.h             |   15 +++++++++++----
 mm/slub.c             |   14 --------------
 12 files changed, 97 insertions(+), 42 deletions(-)




* [patch 01/13] mm: move helper to check slub_debug_enabled
  2021-07-15  4:26 incoming Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 02/13] kasan: add memzero init for unaligned size at DEBUG Andrew Morton
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, andreyknvl, chinwen.chang, dvyukov, elver, glider,
	Kuan-Ying.Lee, linux-mm, mm-commits, nicholas.tang, ryabinin.a.a,
	torvalds, willy, yee.lee

From: Marco Elver <elver@google.com>
Subject: mm: move helper to check slub_debug_enabled

Move the helper that checks slub_debug_enabled into slab.h, so that users
outside slub.c can also avoid #ifdef CONFIG_SLUB_DEBUG guards.
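
As a hypothetical illustration (not part of this patch): code outside slub.c
that includes mm/slab.h can now branch on slub_debug without any preprocessor
guard, e.g.

  /* hypothetical call site; when CONFIG_SLUB_DEBUG=n this compiles
   * to 'if (false)' and the branch is discarded entirely */
  if (__slub_debug_enabled())
          pr_debug("slub_debug active, objects carry debug metadata\n");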

Link: https://lkml.kernel.org/r/20210705103229.8505-2-yee.lee@mediatek.com
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.h |   15 +++++++++++----
 mm/slub.c |   14 --------------
 2 files changed, 11 insertions(+), 18 deletions(-)

--- a/mm/slab.h~mm-move-helper-to-check-slub_debug_enabled
+++ a/mm/slab.h
@@ -216,10 +216,18 @@ DECLARE_STATIC_KEY_FALSE(slub_debug_enab
 #endif
 extern void print_tracking(struct kmem_cache *s, void *object);
 long validate_slab_cache(struct kmem_cache *s);
+static inline bool __slub_debug_enabled(void)
+{
+	return static_branch_unlikely(&slub_debug_enabled);
+}
 #else
 static inline void print_tracking(struct kmem_cache *s, void *object)
 {
 }
+static inline bool __slub_debug_enabled(void)
+{
+	return false;
+}
 #endif
 
 /*
@@ -229,11 +237,10 @@ static inline void print_tracking(struct
  */
 static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t flags)
 {
-#ifdef CONFIG_SLUB_DEBUG
-	VM_WARN_ON_ONCE(!(flags & SLAB_DEBUG_FLAGS));
-	if (static_branch_unlikely(&slub_debug_enabled))
+	if (IS_ENABLED(CONFIG_SLUB_DEBUG))
+		VM_WARN_ON_ONCE(!(flags & SLAB_DEBUG_FLAGS));
+	if (__slub_debug_enabled())
 		return s->flags & flags;
-#endif
 	return false;
 }
 
--- a/mm/slub.c~mm-move-helper-to-check-slub_debug_enabled
+++ a/mm/slub.c
@@ -120,25 +120,11 @@
  */
 
 #ifdef CONFIG_SLUB_DEBUG
-
 #ifdef CONFIG_SLUB_DEBUG_ON
 DEFINE_STATIC_KEY_TRUE(slub_debug_enabled);
 #else
 DEFINE_STATIC_KEY_FALSE(slub_debug_enabled);
 #endif
-
-static inline bool __slub_debug_enabled(void)
-{
-	return static_branch_unlikely(&slub_debug_enabled);
-}
-
-#else		/* CONFIG_SLUB_DEBUG */
-
-static inline bool __slub_debug_enabled(void)
-{
-	return false;
-}
-
 #endif		/* CONFIG_SLUB_DEBUG */
 
 static inline bool kmem_cache_debug(struct kmem_cache *s)
_



* [patch 02/13] kasan: add memzero init for unaligned size at DEBUG
  2021-07-15  4:26 incoming Andrew Morton
  2021-07-15  4:26 ` [patch 01/13] mm: move helper to check slub_debug_enabled Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 03/13] kasan: fix build by including kernel.h Andrew Morton
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, andreyknvl, chinwen.chang, dvyukov, elver, glider,
	Kuan-Ying.Lee, linux-mm, mm-commits, nicholas.tang, ryabinin.a.a,
	torvalds, willy, yee.lee

From: Yee Lee <yee.lee@mediatek.com>
Subject: kasan: add memzero init for unaligned size at DEBUG

Issue: when SLUB debug is on, the hardware tag-based kasan_unpoison() would
overwrite the redzone of an object with an unaligned size.

An additional memzero_explicit() path is added that replaces init by the
hardware tag instruction for such unaligned sizes in SLUB debug mode.

The penalty is acceptable since redzones are only enabled in debug mode, not
in production builds.  A comment block is added for explanation.
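
A worked example of the new path (illustrative size; with arm64 MTE,
KASAN_GRANULE_SIZE is 16, so KASAN_GRANULE_MASK is 15):

  /* kmalloc object of size 100 with a SLUB debug redzone after it:
   *   100 & 15 == 4          -> unaligned, take the memzero path
   *   memzero_explicit(addr, 100) zeroes exactly the object
   *   init = false; round_up(100, 16) == 112
   * hw_set_mem_tag_range() still tags all 112 bytes, but no longer
   * hardware-initializes bytes 100..111, which belong to the redzone.
   */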

Link: https://lkml.kernel.org/r/20210705103229.8505-3-yee.lee@mediatek.com
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Suggested-by: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kasan/kasan.h |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/mm/kasan/kasan.h~kasan-add-memzero-int-for-unaligned-size-at-debug
+++ a/mm/kasan/kasan.h
@@ -9,6 +9,7 @@
 #ifdef CONFIG_KASAN_HW_TAGS
 
 #include <linux/static_key.h>
+#include "../slab.h"
 
 DECLARE_STATIC_KEY_FALSE(kasan_flag_stacktrace);
 extern bool kasan_flag_async __ro_after_init;
@@ -387,6 +388,17 @@ static inline void kasan_unpoison(const
 
 	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
 		return;
+	/*
+	 * Explicitly initialize the memory with the precise object size to
+	 * avoid overwriting the SLAB redzone. This disables initialization in
+	 * the arch code and may thus lead to performance penalty. The penalty
+	 * is accepted since SLAB redzones aren't enabled in production builds.
+	 */
+	if (__slub_debug_enabled() &&
+	    init && ((unsigned long)size & KASAN_GRANULE_MASK)) {
+		init = false;
+		memzero_explicit((void *)addr, size);
+	}
 	size = round_up(size, KASAN_GRANULE_SIZE);
 
 	hw_set_mem_tag_range((void *)addr, size, tag, init);
_



* [patch 03/13] kasan: fix build by including kernel.h
  2021-07-15  4:26 incoming Andrew Morton
  2021-07-15  4:26 ` [patch 01/13] mm: move helper to check slub_debug_enabled Andrew Morton
  2021-07-15  4:26 ` [patch 02/13] kasan: add memzero init for unaligned size at DEBUG Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 04/13] Revert "mm/page_alloc: make should_fail_alloc_page() static" Andrew Morton
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, andreyknvl, andy.shevchenko, catalin.marinas, dvyukov,
	elver, glider, linux-mm, mm-commits, pcc, ryabinin.a.a, torvalds,
	vincenzo.frascino

From: Marco Elver <elver@google.com>
Subject: kasan: fix build by including kernel.h

The <linux/kasan.h> header relies on _RET_IP_ being defined, and had been
receiving that definition via the inclusion of bug.h, which includes kernel.h.
However, since f39650de687e ("kernel.h: split out panic and oops helpers")
that is no longer the case, and we get the following build error when building
CONFIG_KASAN_HW_TAGS on arm64:

  In file included from arch/arm64/mm/kasan_init.c:10:
  ./include/linux/kasan.h: In function 'kasan_slab_free':
  ./include/linux/kasan.h:230:39: error: '_RET_IP_' undeclared (first use in this function)
    230 |   return __kasan_slab_free(s, object, _RET_IP_, init);

Fix it by including kernel.h from kasan.h.
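
For reference, the definition being picked up (quoted from
include/linux/kernel.h):

  #define _RET_IP_		(unsigned long)__builtin_return_address(0)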

Link: https://lkml.kernel.org/r/20210705072716.2125074-1-elver@google.com
Fixes: f39650de687e ("kernel.h: split out panic and oops helpers")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kasan.h |    1 +
 1 file changed, 1 insertion(+)

--- a/include/linux/kasan.h~kasan-fix-build-by-including-kernelh
+++ a/include/linux/kasan.h
@@ -3,6 +3,7 @@
 #define _LINUX_KASAN_H
 
 #include <linux/bug.h>
+#include <linux/kernel.h>
 #include <linux/static_key.h>
 #include <linux/types.h>
 
_



* [patch 04/13] Revert "mm/page_alloc: make should_fail_alloc_page() static"
  2021-07-15  4:26 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-07-15  4:26 ` [patch 03/13] kasan: fix build by including kernel.h Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 05/13] mm/page_alloc: avoid page allocator recursion with pagesets.lock held Andrew Morton
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, david, ddstreet, jhubbard, linux-mm, mcroce, mgorman,
	mhocko, mm-commits, shy828301, torvalds, vbabka

From: Matteo Croce <mcroce@microsoft.com>
Subject: Revert "mm/page_alloc: make should_fail_alloc_page() static"

This reverts commit f7173090033c70886d925995e9dfdfb76dbb2441.

Fix an unresolved symbol error when CONFIG_DEBUG_INFO_BTF=y:

  LD      vmlinux
  BTFIDS  vmlinux
FAILED unresolved symbol should_fail_alloc_page
make: *** [Makefile:1199: vmlinux] Error 255
make: *** Deleting file 'vmlinux'

Link: https://lkml.kernel.org/r/20210708191128.153796-1-mcroce@linux.microsoft.com
Fixes: f7173090033c ("mm/page_alloc: make should_fail_alloc_page() static")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~revert-mm-page_alloc-make-should_fail_alloc_page-static
+++ a/mm/page_alloc.c
@@ -3820,7 +3820,7 @@ static inline bool __should_fail_alloc_p
 
 #endif /* CONFIG_FAIL_PAGE_ALLOC */
 
-static noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
+noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 {
 	return __should_fail_alloc_page(gfp_mask, order);
 }
_



* [patch 05/13] mm/page_alloc: avoid page allocator recursion with pagesets.lock held
  2021-07-15  4:26 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-07-15  4:26 ` [patch 04/13] Revert "mm/page_alloc: make should_fail_alloc_page() static" Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 06/13] mm/page_alloc: correct return value when failing at preparing Andrew Morton
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, aquini, desmondcheongzx, linux-mm, mgorman, mm-commits,
	Qiang.Zhang, skhan, torvalds

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/page_alloc: avoid page allocator recursion with pagesets.lock held

Syzbot is reporting potential deadlocks due to pagesets.lock when
PAGE_OWNER is enabled.  One example from Desmond Cheong Zhi Xi is as
follows:

  __alloc_pages_bulk()
    local_lock_irqsave(&pagesets.lock, flags) <---- outer lock here
    prep_new_page():
      post_alloc_hook():
        set_page_owner():
          __set_page_owner():
            save_stack():
              stack_depot_save():
                alloc_pages():
                  alloc_page_interleave():
                    __alloc_pages():
                      get_page_from_freelist():
                        rmqueue():
                          rmqueue_pcplist():
                            local_lock_irqsave(&pagesets.lock, flags);
                            *** DEADLOCK ***

Zhang, Qiang also reported

  BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
  in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
  .....
  __dump_stack lib/dump_stack.c:79 [inline]
  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
  ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
  prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
  __alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
  alloc_page_interleave+0x1e/0x200 mm/mempolicy.c:2147
  alloc_pages+0x238/0x2a0 mm/mempolicy.c:2270
  stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
  save_stack+0x15e/0x1e0 mm/page_owner.c:120
  __set_page_owner+0x50/0x290 mm/page_owner.c:181
  prep_new_page mm/page_alloc.c:2445 [inline]
  __alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
  alloc_pages_bulk_array_node include/linux/gfp.h:557 [inline]
  vm_area_alloc_pages mm/vmalloc.c:2775 [inline]
  __vmalloc_area_node mm/vmalloc.c:2845 [inline]
  __vmalloc_node_range+0x39d/0x960 mm/vmalloc.c:2947
  __vmalloc_node mm/vmalloc.c:2996 [inline]
  vzalloc+0x67/0x80 mm/vmalloc.c:3066

There are a number of ways it could be fixed.  The page owner code could
be audited to strip GFP flags that allow sleeping, but it'll impair the
functionality of PAGE_OWNER if allocations fail.  The bulk allocator could
add a special case to release/reacquire the lock for prep_new_page and look
up the PCP after the lock is reacquired, at the cost of performance.  The
pages requiring prep could be tracked using the least significant bit and
looping through the array, although it is more complicated for the list
interface.  The options are relatively complex, and the second one still
incurs a performance penalty when PAGE_OWNER is active, so this patch takes
the simple approach -- disable bulk allocation if PAGE_OWNER is active.
The caller will be forced to allocate one page at a time, incurring a
performance penalty, but PAGE_OWNER is already a performance penalty.
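
A sketch of the resulting caller-side pattern (hypothetical caller; names
and flow illustrative only):

  unsigned long nr = alloc_pages_bulk_array(gfp, nr_pages, pages);

  /* with PAGE_OWNER active the bulk path now bails out early, so
   * callers fill any remaining slots one page at a time */
  while (nr < nr_pages) {
          struct page *page = alloc_page(gfp);

          if (!page)
                  break;
          pages[nr++] = page;
  }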

Link: https://lkml.kernel.org/r/20210708081434.GV3840@techsingularity.net
Fixes: dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reported-by: "Zhang, Qiang" <Qiang.Zhang@windriver.com>
Reported-by: syzbot+127fd7828d6eeb611703@syzkaller.appspotmail.com
Tested-by: syzbot+127fd7828d6eeb611703@syzkaller.appspotmail.com
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/mm/page_alloc.c~mm-page_alloc-avoid-page-allocator-recursion-with-pagesetslock-held
+++ a/mm/page_alloc.c
@@ -5239,6 +5239,18 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	if (nr_pages - nr_populated == 1)
 		goto failed;
 
+#ifdef CONFIG_PAGE_OWNER
+	/*
+	 * PAGE_OWNER may recurse into the allocator to allocate space to
+	 * save the stack with pagesets.lock held. Releasing/reacquiring
+	 * removes much of the performance benefit of bulk allocation so
+	 * force the caller to allocate one page at a time as it'll have
+	 * similar performance to added complexity to the bulk allocator.
+	 */
+	if (static_branch_unlikely(&page_owner_inited))
+		goto failed;
+#endif
+
 	/* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
_



* [patch 06/13] mm/page_alloc: correct return value when failing at preparing
  2021-07-15  4:26 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-07-15  4:26 ` [patch 05/13] mm/page_alloc: avoid page allocator recursion with pagesets.lock held Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 07/13] mm/page_alloc: further fix __alloc_pages_bulk() return value Andrew Morton
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, linux-mm, mgorman, mm-commits, torvalds, yanfei.xu

From: Yanfei Xu <yanfei.xu@windriver.com>
Subject: mm/page_alloc: correct return value when failing at preparing

If the array passed in is already partially populated, we should return
"nr_populated" even when failing at the argument-preparation stage.

Link: https://lkml.kernel.org/r/20210713152100.10381-3-mgorman@techsingularity.net
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20210709102855.55058-1-yanfei.xu@windriver.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-correct-return-value-when-failing-at-preparing
+++ a/mm/page_alloc.c
@@ -5255,7 +5255,7 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
 	if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask, &ac, &alloc_gfp, &alloc_flags))
-		return 0;
+		return nr_populated;
 	gfp = alloc_gfp;
 
 	/* Find an allowed local zone that meets the low watermark. */
_



* [patch 07/13] mm/page_alloc: further fix __alloc_pages_bulk() return value
  2021-07-15  4:26 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-07-15  4:26 ` [patch 06/13] mm/page_alloc: correct return value when failing at preparing Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 08/13] mm: fix the try_to_unmap prototype for !CONFIG_MMU Andrew Morton
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, brouer, chuck.lever, desmondcheongzx, linux-mm, mcroce,
	mgorman, mm-commits, Qiang.Zhang, torvalds, yanfei.xu

From: Chuck Lever <chuck.lever@oracle.com>
Subject: mm/page_alloc: further fix __alloc_pages_bulk() return value

The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
bounds check after checking populated elements") was possibly
confused by the mixture of return values throughout the function.

The API contract is clear that the function "Returns the number of pages
on the list or array." It does not list zero as a unique return value with
a special meaning.  Therefore zero is a plausible return value only if
@nr_pages is zero or less.

Clean up the return logic to make it clear that the returned value is
always the total number of pages in the array/list, not the number of
pages that were allocated during this call.

The only change in behavior with this patch is the value returned if
prepare_alloc_pages() fails.  To match the API contract, the number of
pages currently in the array/list is returned in this case.

The call site in __page_pool_alloc_pages_slow() also seems to be confused
on this matter.  It should be attended to by someone who is familiar with
that code.
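
An illustrative reading of the contract after this patch (hypothetical
values):

  /* pages[] already has 3 populated slots; request 8 in total */
  nr = alloc_pages_bulk_array(GFP_KERNEL, 8, pages);
  /* nr is always the total number of populated slots:
   *   8    on full success,
   *   3..7 on partial allocation,
   *   3    if prepare_alloc_pages() fails (previously 0)
   */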

[mel@techsingularity.net: Return nr_populated if 0 pages are requested]
Link: https://lkml.kernel.org/r/20210713152100.10381-4-mgorman@techsingularity.net
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: Zhang Qiang <Qiang.Zhang@windriver.com>
Cc: Yanfei Xu <yanfei.xu@windriver.com>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-further-fix-__alloc_pages_bulk-return-value
+++ a/mm/page_alloc.c
@@ -5221,9 +5221,6 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
 	int nr_populated = 0, nr_account = 0;
 
-	if (unlikely(nr_pages <= 0))
-		return 0;
-
 	/*
 	 * Skip populated array elements to determine if any pages need
 	 * to be allocated before disabling IRQs.
@@ -5231,9 +5228,13 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	while (page_array && nr_populated < nr_pages && page_array[nr_populated])
 		nr_populated++;
 
+	/* No pages requested? */
+	if (unlikely(nr_pages <= 0))
+		goto out;
+
 	/* Already populated array? */
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return nr_populated;
+		goto out;
 
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
@@ -5255,7 +5256,7 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
 	if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask, &ac, &alloc_gfp, &alloc_flags))
-		return nr_populated;
+		goto out;
 	gfp = alloc_gfp;
 
 	/* Find an allowed local zone that meets the low watermark. */
@@ -5323,6 +5324,7 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	__count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
 	zone_statistics(ac.preferred_zoneref->zone, zone, nr_account);
 
+out:
 	return nr_populated;
 
 failed_irq:
@@ -5338,7 +5340,7 @@ failed:
 		nr_populated++;
 	}
 
-	return nr_populated;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(__alloc_pages_bulk);
 
_



* [patch 08/13] mm: fix the try_to_unmap prototype for !CONFIG_MMU
  2021-07-15  4:26 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-07-15  4:26 ` [patch 07/13] mm/page_alloc: further fix __alloc_pages_bulk() return value Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:26 ` [patch 09/13] lib/test_hmm: remove set but unused page variable Andrew Morton
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, hch, linux-mm, mm-commits, shy828301, torvalds

From: Christoph Hellwig <hch@lst.de>
Subject: mm: fix the try_to_unmap prototype for !CONFIG_MMU

Adjust the nommu stub of try_to_unmap to match the changed prototype of the
full version.  Turn it into an inline instead of a macro to generally
improve the type checking.
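
To illustrate the type-checking gain (hypothetical misuse):

  /* old stub:  #define try_to_unmap(page, refs) false
   *            try_to_unmap(mm, 0);  silently expands to 'false'
   * new stub:  static inline void try_to_unmap(struct page *page,
   *                                            enum ttu_flags flags)
   *            try_to_unmap(mm, 0);  rejected with a type error
   */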

Link: https://lkml.kernel.org/r/20210705053944.885828-1-hch@lst.de
Fixes: 1fb08ac63bee ("mm: rmap: make try_to_unmap() void function")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/rmap.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/include/linux/rmap.h~mm-fix-the-try_to_unmap-prototype-for-config_mmu
+++ a/include/linux/rmap.h
@@ -291,7 +291,9 @@ static inline int page_referenced(struct
 	return 0;
 }
 
-#define try_to_unmap(page, refs) false
+static inline void try_to_unmap(struct page *page, enum ttu_flags flags)
+{
+}
 
 static inline int page_mkclean(struct page *page)
 {
_



* [patch 09/13] lib/test_hmm: remove set but unused page variable
  2021-07-15  4:26 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-07-15  4:26 ` [patch 08/13] mm: fix the try_to_unmap prototype for !CONFIG_MMU Andrew Morton
@ 2021-07-15  4:26 ` Andrew Morton
  2021-07-15  4:27 ` [patch 10/13] hfs: add missing clean-up in hfs_fill_super Andrew Morton
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: akpm, apopple, hulkci, jrdr.linux, linux-mm, mm-commits,
	oliver.sang, torvalds, yangyingliang

From: Alistair Popple <apopple@nvidia.com>
Subject: lib/test_hmm: remove set but unused page variable

The HMM selftests use atomic_check_access() to check that atomic access to a
page has been revoked.  It doesn't matter if the page mapping has been
removed from the mirrored page tables as that also implies atomic access
has been revoked.  Therefore remove the unused page variable to fix this
compiler warning:

  lib/test_hmm.c:631:16: warning: variable `page' set but not used [-Wunused-but-set-variable]
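
For context, a sketch of the xarray tagged-pointer usage that remains
(mirroring the code left by this patch):

  void *entry = xa_load(&dmirror->pt, pfn);

  /* the tag lives in the low bits of the entry itself, so it can be
   * read with xa_pointer_tag() directly; xa_untag_pointer() is only
   * needed to use the page pointer, which this check never does */
  if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC)
          return -EPERM;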

Link: https://lkml.kernel.org/r/20210706025603.4059-1-apopple@nvidia.com
Fixes: b659baea7546 ("mm: selftests for exclusive device memory")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_hmm.c |    2 --
 1 file changed, 2 deletions(-)

--- a/lib/test_hmm.c~lib-test_hmm-remove-set-but-unused-page-variable
+++ a/lib/test_hmm.c
@@ -628,10 +628,8 @@ static int dmirror_check_atomic(struct d
 
 	for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) {
 		void *entry;
-		struct page *page;
 
 		entry = xa_load(&dmirror->pt, pfn);
-		page = xa_untag_pointer(entry);
 		if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC)
 			return -EPERM;
 	}
_



* [patch 10/13] hfs: add missing clean-up in hfs_fill_super
  2021-07-15  4:26 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-07-15  4:26 ` [patch 09/13] lib/test_hmm: remove set but unused page variable Andrew Morton
@ 2021-07-15  4:27 ` Andrew Morton
  2021-07-15  4:27 ` [patch 11/13] hfs: fix high memory mapping in hfs_bnode_read Andrew Morton
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:27 UTC (permalink / raw)
  To: akpm, desmondcheongzx, gregkh, gustavoars, linux-mm, mm-commits,
	skhan, slava, torvalds, viro

From: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Subject: hfs: add missing clean-up in hfs_fill_super

Patch series "hfs: fix various errors", v2.

This series ultimately aims to address a lockdep warning in hfs_find_init
reported by Syzbot:
https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585db

The work done for this led to the discovery of another bug, and the
Syzkaller repro test also reveals an invalid memory access error after
clearing the lockdep warning.  Hence, this series is broken up into three
patches:

1. Add a missing call to hfs_find_exit for an error path in
   hfs_fill_super

2. Fix memory mapping in hfs_bnode_read by fixing calls to kmap

3. Add lock nesting notation to tell lockdep that the observed locking
   hierarchy is safe


This patch (of 3):

Before exiting hfs_fill_super, the struct hfs_find_data used in
hfs_find_init should be passed to hfs_find_exit to be cleaned up, and to
release the lock held on the btree.

The call to hfs_find_exit is missing from an error path.  We add it back
in by consolidating calls to hfs_find_exit for error paths.

Link: https://lkml.kernel.org/r/20210701030756.58760-1-desmondcheongzx@gmail.com
Link: https://lkml.kernel.org/r/20210701030756.58760-2-desmondcheongzx@gmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hfs/super.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/fs/hfs/super.c~hfs-add-missing-clean-up-in-hfs_fill_super
+++ a/fs/hfs/super.c
@@ -420,14 +420,12 @@ static int hfs_fill_super(struct super_b
 	if (!res) {
 		if (fd.entrylength > sizeof(rec) || fd.entrylength < 0) {
 			res =  -EIO;
-			goto bail;
+			goto bail_hfs_find;
 		}
 		hfs_bnode_read(fd.bnode, &rec, fd.entryoffset, fd.entrylength);
 	}
-	if (res) {
-		hfs_find_exit(&fd);
-		goto bail_no_root;
-	}
+	if (res)
+		goto bail_hfs_find;
 	res = -EINVAL;
 	root_inode = hfs_iget(sb, &fd.search_key->cat, &rec);
 	hfs_find_exit(&fd);
@@ -443,6 +441,8 @@ static int hfs_fill_super(struct super_b
 	/* everything's okay */
 	return 0;
 
+bail_hfs_find:
+	hfs_find_exit(&fd);
 bail_no_root:
 	pr_err("get root inode failed\n");
 bail:
_



* [patch 11/13] hfs: fix high memory mapping in hfs_bnode_read
  2021-07-15  4:26 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-07-15  4:27 ` [patch 10/13] hfs: add missing clean-up in hfs_fill_super Andrew Morton
@ 2021-07-15  4:27 ` Andrew Morton
  2021-07-15  4:27 ` [patch 12/13] hfs: add lock nesting notation to hfs_find_init Andrew Morton
  2021-07-15  4:27 ` [patch 13/13] mm/hugetlb: fix refs calculation from unaligned @vaddr Andrew Morton
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:27 UTC (permalink / raw)
  To: akpm, desmondcheongzx, gregkh, gustavoars, linux-mm, mm-commits,
	skhan, slava, torvalds, viro

From: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Subject: hfs: fix high memory mapping in hfs_bnode_read

Pages that we read in hfs_bnode_read need to be kmapped into kernel
address space.  However, currently only the 0th page is kmapped.  If the
given offset + length exceeds this 0th page, then we have an invalid
memory access.

To fix this, we kmap the relevant pages one by one and copy each page's
portion of the data.
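
A worked example of the new loop (illustrative numbers, 4096-byte pages):

  /* off = 4090, len = 100 after adding node->page_offset:
   *   pagenum = 4090 >> PAGE_SHIFT = 0, in-page offset = 4090
   *   pass 1: min(100 - 0, 4096 - 4090) = 6 bytes from page[0]
   *   pass 2: min(100 - 6, 4096 - 0)   = 94 bytes from page[1]
   * the old code copied all 100 bytes from page[0], running past
   * the end of that page's mapping.
   */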

An example of invalid memory access occurring without this fix can be seen
in the following crash report:

==================================================================
BUG: KASAN: use-after-free in memcpy include/linux/fortify-string.h:191 [inline]
BUG: KASAN: use-after-free in hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26
Read of size 2 at addr ffff888125fdcffe by task syz-executor5/4634

CPU: 0 PID: 4634 Comm: syz-executor5 Not tainted 5.13.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x195/0x1f8 lib/dump_stack.c:120
 print_address_description.constprop.0+0x1d/0x110 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report.cold+0x7b/0xd4 mm/kasan/report.c:436
 check_region_inline mm/kasan/generic.c:180 [inline]
 kasan_check_range+0x154/0x1b0 mm/kasan/generic.c:186
 memcpy+0x24/0x60 mm/kasan/shadow.c:65
 memcpy include/linux/fortify-string.h:191 [inline]
 hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26
 hfs_bnode_read_u16 fs/hfs/bnode.c:34 [inline]
 hfs_bnode_find+0x880/0xcc0 fs/hfs/bnode.c:365
 hfs_brec_find+0x2d8/0x540 fs/hfs/bfind.c:126
 hfs_brec_read+0x27/0x120 fs/hfs/bfind.c:165
 hfs_cat_find_brec+0x19a/0x3b0 fs/hfs/catalog.c:194
 hfs_fill_super+0xc13/0x1460 fs/hfs/super.c:419
 mount_bdev+0x331/0x3f0 fs/super.c:1368
 hfs_mount+0x35/0x40 fs/hfs/super.c:457
 legacy_get_tree+0x10c/0x220 fs/fs_context.c:592
 vfs_get_tree+0x93/0x300 fs/super.c:1498
 do_new_mount fs/namespace.c:2905 [inline]
 path_mount+0x13f5/0x20e0 fs/namespace.c:3235
 do_mount fs/namespace.c:3248 [inline]
 __do_sys_mount fs/namespace.c:3456 [inline]
 __se_sys_mount fs/namespace.c:3433 [inline]
 __x64_sys_mount+0x2b8/0x340 fs/namespace.c:3433
 do_syscall_64+0x37/0xc0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x45e63a
Code: 48 c7 c2 bc ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d2 e8 88 04 00 00 0f 1f 84 00 00 00 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f9404d410d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000020000248 RCX: 000000000045e63a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007f9404d41120
RBP: 00007f9404d41120 R08: 00000000200002c0 R09: 0000000020000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 0000000000000003 R14: 00000000004ad5d8 R15: 0000000000000000

The buggy address belongs to the page:
page:00000000dadbcf3e refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x125fdc
flags: 0x2fffc0000000000(node=0|zone=2|lastcpupid=0x3fff)
raw: 02fffc0000000000 ffffea000497f748 ffffea000497f6c8 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888125fdce80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888125fdcf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888125fdcf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                                ^
 ffff888125fdd000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888125fdd080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================

Link: https://lkml.kernel.org/r/20210701030756.58760-3-desmondcheongzx@gmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hfs/bnode.c |   25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

--- a/fs/hfs/bnode.c~hfs-fix-high-memory-mapping-in-hfs_bnode_read
+++ a/fs/hfs/bnode.c
@@ -15,16 +15,31 @@
 
 #include "btree.h"
 
-void hfs_bnode_read(struct hfs_bnode *node, void *buf,
-		int off, int len)
+void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len)
 {
 	struct page *page;
+	int pagenum;
+	int bytes_read;
+	int bytes_to_read;
+	void *vaddr;
 
 	off += node->page_offset;
-	page = node->page[0];
+	pagenum = off >> PAGE_SHIFT;
+	off &= ~PAGE_MASK; /* compute page offset for the first page */
 
-	memcpy(buf, kmap(page) + off, len);
-	kunmap(page);
+	for (bytes_read = 0; bytes_read < len; bytes_read += bytes_to_read) {
+		if (pagenum >= node->tree->pages_per_bnode)
+			break;
+		page = node->page[pagenum];
+		bytes_to_read = min_t(int, len - bytes_read, PAGE_SIZE - off);
+
+		vaddr = kmap_atomic(page);
+		memcpy(buf + bytes_read, vaddr + off, bytes_to_read);
+		kunmap_atomic(vaddr);
+
+		pagenum++;
+		off = 0; /* page offset only applies to the first page */
+	}
 }
 
 u16 hfs_bnode_read_u16(struct hfs_bnode *node, int off)
_



* [patch 12/13] hfs: add lock nesting notation to hfs_find_init
  2021-07-15  4:26 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2021-07-15  4:27 ` [patch 11/13] hfs: fix high memory mapping in hfs_bnode_read Andrew Morton
@ 2021-07-15  4:27 ` Andrew Morton
  2021-07-15  4:27 ` [patch 13/13] mm/hugetlb: fix refs calculation from unaligned @vaddr Andrew Morton
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:27 UTC (permalink / raw)
  To: akpm, desmondcheongzx, gregkh, gustavoars, linux-mm, mm-commits,
	skhan, slava, torvalds, viro

From: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Subject: hfs: add lock nesting notation to hfs_find_init

Syzbot reports a possible recursive lock:
https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585db

This happens due to missing lock nesting information.  From the logs, we
see that a call to hfs_fill_super is made to mount the hfs filesystem. 
While searching for the root inode, the lock on the catalog btree is
grabbed.  Then, when the parent of the root isn't found, a call to
__hfs_bnode_create is made to create the parent of the root.  This
eventually leads to a call to hfs_ext_read_extent which grabs a lock on
the extents btree.

Since the order of locking is catalog btree -> extents btree, this lock
hierarchy does not lead to a deadlock.

To tell lockdep that this locking is safe, we add nesting notation to
distinguish between catalog btrees, extents btrees, and attributes btrees
(for HFS+).  This has already been done in hfsplus.
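
Sketched, the nesting this teaches lockdep (ordering from the changelog;
tree pointers illustrative):

  /* mount path: catalog btree is locked first ... */
  mutex_lock_nested(&cat_tree->tree_lock, CATALOG_BTREE_MUTEX);
  /* ... then the extents btree; a distinct lockdep subclass, so the
   * second acquisition is no longer reported as recursive locking */
  mutex_lock_nested(&ext_tree->tree_lock, EXTENTS_BTREE_MUTEX);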

Link: https://lkml.kernel.org/r/20210701030756.58760-4-desmondcheongzx@gmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reported-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com
Tested-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hfs/bfind.c |   14 +++++++++++++-
 fs/hfs/btree.h |    7 +++++++
 2 files changed, 20 insertions(+), 1 deletion(-)

--- a/fs/hfs/bfind.c~hfs-add-lock-nesting-notation-to-hfs_find_init
+++ a/fs/hfs/bfind.c
@@ -25,7 +25,19 @@ int hfs_find_init(struct hfs_btree *tree
 	fd->key = ptr + tree->max_key_len + 2;
 	hfs_dbg(BNODE_REFS, "find_init: %d (%p)\n",
 		tree->cnid, __builtin_return_address(0));
-	mutex_lock(&tree->tree_lock);
+	switch (tree->cnid) {
+	case HFS_CAT_CNID:
+		mutex_lock_nested(&tree->tree_lock, CATALOG_BTREE_MUTEX);
+		break;
+	case HFS_EXT_CNID:
+		mutex_lock_nested(&tree->tree_lock, EXTENTS_BTREE_MUTEX);
+		break;
+	case HFS_ATTR_CNID:
+		mutex_lock_nested(&tree->tree_lock, ATTR_BTREE_MUTEX);
+		break;
+	default:
+		return -EINVAL;
+	}
 	return 0;
 }
 
--- a/fs/hfs/btree.h~hfs-add-lock-nesting-notation-to-hfs_find_init
+++ a/fs/hfs/btree.h
@@ -13,6 +13,13 @@ typedef int (*btree_keycmp)(const btree_
 
 #define NODE_HASH_SIZE  256
 
+/* B-tree mutex nested subclasses */
+enum hfs_btree_mutex_classes {
+	CATALOG_BTREE_MUTEX,
+	EXTENTS_BTREE_MUTEX,
+	ATTR_BTREE_MUTEX,
+};
+
 /* A HFS BTree held in memory */
 struct hfs_btree {
 	struct super_block *sb;
_



* [patch 13/13] mm/hugetlb: fix refs calculation from unaligned @vaddr
  2021-07-15  4:26 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2021-07-15  4:27 ` [patch 12/13] hfs: add lock nesting notation to hfs_find_init Andrew Morton
@ 2021-07-15  4:27 ` Andrew Morton
  12 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-07-15  4:27 UTC (permalink / raw)
  To: akpm, joao.m.martins, linux-mm, mike.kravetz, mm-commits, stable,
	torvalds

From: Joao Martins <joao.m.martins@oracle.com>
Subject: mm/hugetlb: fix refs calculation from unaligned @vaddr

commit 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording") refactored
the count of subpages but missed an edge case when @vaddr is not aligned to
PAGE_SIZE, e.g. when close to vma->vm_end.  It would then erroneously set
@refs to 0, and record_subpages_vmas() wouldn't set the @pages array
element to its value, consequently causing the reported null-deref by
syzbot.

Fix it by aligning down @vaddr by PAGE_SIZE in @refs calculation.
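
A worked example of the failure (illustrative addresses, 4096-byte pages):

  /* vma->vm_end = 0x20000000, vaddr = 0x1ffff800 (unaligned)
   * old: (0x20000000 - 0x1ffff800) >> 12 = 0x800 >> 12 = 0
   *      -> refs = min3(...) == 0, so the pages[] entry is never set
   * new: ALIGN_DOWN(0x1ffff800, 4096) = 0x1ffff000
   *      (0x20000000 - 0x1ffff000) >> 12 = 0x1000 >> 12 = 1
   */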

Link: https://lkml.kernel.org/r/20210713152440.28650-1-joao.m.martins@oracle.com
Fixes: 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording")
Reported-by: syzbot+a3fcd59df1b372066f5a@syzkaller.appspotmail.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-fix-refs-calculation-from-unaligned-vaddr
+++ a/mm/hugetlb.c
@@ -5440,8 +5440,9 @@ long follow_hugetlb_page(struct mm_struc
 			continue;
 		}
 
-		refs = min3(pages_per_huge_page(h) - pfn_offset,
-			    (vma->vm_end - vaddr) >> PAGE_SHIFT, remainder);
+		/* vaddr may not be aligned to PAGE_SIZE */
+		refs = min3(pages_per_huge_page(h) - pfn_offset, remainder,
+		    (vma->vm_end - ALIGN_DOWN(vaddr, PAGE_SIZE)) >> PAGE_SHIFT);
 
 		if (pages || vmas)
 			record_subpages_vmas(mem_map_offset(page, pfn_offset),
_


