linux-mm.kvack.org archive mirror
* incoming
@ 2021-07-23 22:49 Andrew Morton
  2021-07-23 22:50 ` [patch 01/15] userfaultfd: do not untag user pointers Andrew Morton
                   ` (14 more replies)
  0 siblings, 15 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

15 patches, based on 704f4cba43d4ed31ef4beb422313f1263d87bc55.

Subsystems affected by this patch series:

  mm/userfaultfd
  mm/kfence
  mm/highmem
  mm/pagealloc
  mm/memblock
  mm/pagecache
  mm/secretmem
  mm/pagemap
  mm/hugetlbfs

Subsystem: mm/userfaultfd

    Peter Collingbourne <pcc@google.com>:
    Patch series "userfaultfd: do not untag user pointers", v5:
      userfaultfd: do not untag user pointers
      selftest: use mmap instead of posix_memalign to allocate memory

Subsystem: mm/kfence

    Weizhao Ouyang <o451686892@gmail.com>:
      kfence: defer kfence_test_init to ensure that kunit debugfs is created

    Alexander Potapenko <glider@google.com>:
      kfence: move the size check to the beginning of __kfence_alloc()
      kfence: skip all GFP_ZONEMASK allocations

Subsystem: mm/highmem

    Christoph Hellwig <hch@lst.de>:
      mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
      mm: use kmap_local_page in memzero_page

Subsystem: mm/pagealloc

    Sergei Trofimovich <slyfox@gentoo.org>:
      mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
      memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions

Subsystem: mm/pagecache

    Roman Gushchin <guro@fb.com>:
      writeback, cgroup: remove wb from offline list before releasing refcnt
      writeback, cgroup: do not reparent dax inodes

Subsystem: mm/secretmem

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/secretmem: wire up ->set_page_dirty

Subsystem: mm/pagemap

    Muchun Song <songmuchun@bytedance.com>:
      mm: mmap_lock: fix disabling preemption directly

    Qi Zheng <zhengqi.arch@bytedance.com>:
      mm: fix the deadlock in finish_fault()

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: fix mount mode command line processing

 Documentation/arm64/tagged-address-abi.rst |   26 ++++++++++++++++++--------
 fs/fs-writeback.c                          |    3 +++
 fs/hugetlbfs/inode.c                       |    2 +-
 fs/userfaultfd.c                           |   26 ++++++++++++--------------
 include/linux/highmem.h                    |    6 ++++--
 include/linux/memblock.h                   |    4 ++--
 mm/backing-dev.c                           |    2 +-
 mm/kfence/core.c                           |   19 ++++++++++++++++---
 mm/kfence/kfence_test.c                    |    2 +-
 mm/memblock.c                              |    3 ++-
 mm/memory.c                                |   11 ++++++++++-
 mm/mmap_lock.c                             |    4 ++--
 mm/page_alloc.c                            |   29 ++++++++++++++++-------------
 mm/secretmem.c                             |    1 +
 tools/testing/selftests/vm/userfaultfd.c   |    6 ++++--
 15 files changed, 93 insertions(+), 51 deletions(-)



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 01/15] userfaultfd: do not untag user pointers
  2021-07-23 22:49 incoming Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 02/15] selftest: use mmap instead of posix_memalign to allocate memory Andrew Morton
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: aarcange, adelva, akpm, andreyknvl, catalin.marinas, Dave.Martin,
	eugenis, linux-mm, lokeshgidra, mitchp, mm-commits, pcc, stable,
	torvalds, vincenzo.frascino, will, willmcvicker

From: Peter Collingbourne <pcc@google.com>
Subject: userfaultfd: do not untag user pointers

Patch series "userfaultfd: do not untag user pointers", v5.

If a user program uses userfaultfd on ranges of heap memory, it may end up
passing a tagged pointer to the kernel in the range.start field of the
UFFDIO_REGISTER ioctl.  This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg.  However, from the application's perspective, the tagged address
*is* the memory address, so if the application is unaware of memory tags,
it may get confused by receiving an address that is, from its point of
view, outside of the bounds of the allocation.  We observed this behavior
in the kselftest for userfaultfd [2] but other applications could have the
same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls. 
Instead, let the system call fail.  Also change the kselftest to use mmap
so that it doesn't encounter this problem.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c


This patch (of 2):

If a user program uses userfaultfd on ranges of heap memory, it may end up
passing a tagged pointer to the kernel in the range.start field of the
UFFDIO_REGISTER ioctl.  This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg.  However, from the application's perspective, the tagged address
*is* the memory address, so if the application is unaware of memory tags,
it may get confused by receiving an address that is, from its point of
view, outside of the bounds of the allocation.  We observed this behavior
in the kselftest for userfaultfd [2] but other applications could have the
same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls. 
Instead, let the system call fail.  This will provide an early indication
of problems with tag-unaware userspace code instead of letting the code
get confused later, and is consistent with how we decided to handle
brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual
address aliases in brk()/mmap()/mremap()"), as well as being consistent
with the existing tagged address ABI documentation relating to how ioctl
arguments are handled.

The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag
user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b
Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Delva <adelva@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William McVicker <willmcvicker@google.com>
Cc: <stable@vger.kernel.org>	[5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/arm64/tagged-address-abi.rst |   26 +++++++++++++------
 fs/userfaultfd.c                           |   26 ++++++++-----------
 2 files changed, 30 insertions(+), 22 deletions(-)

--- a/Documentation/arm64/tagged-address-abi.rst~userfaultfd-do-not-untag-user-pointers
+++ a/Documentation/arm64/tagged-address-abi.rst
@@ -45,14 +45,24 @@ how the user addresses are used by the k
 
 1. User addresses not accessed by the kernel but used for address space
    management (e.g. ``mprotect()``, ``madvise()``). The use of valid
-   tagged pointers in this context is allowed with the exception of
-   ``brk()``, ``mmap()`` and the ``new_address`` argument to
-   ``mremap()`` as these have the potential to alias with existing
-   user addresses.
-
-   NOTE: This behaviour changed in v5.6 and so some earlier kernels may
-   incorrectly accept valid tagged pointers for the ``brk()``,
-   ``mmap()`` and ``mremap()`` system calls.
+   tagged pointers in this context is allowed with these exceptions:
+
+   - ``brk()``, ``mmap()`` and the ``new_address`` argument to
+     ``mremap()`` as these have the potential to alias with existing
+      user addresses.
+
+     NOTE: This behaviour changed in v5.6 and so some earlier kernels may
+     incorrectly accept valid tagged pointers for the ``brk()``,
+     ``mmap()`` and ``mremap()`` system calls.
+
+   - The ``range.start``, ``start`` and ``dst`` arguments to the
+     ``UFFDIO_*`` ``ioctl()``s used on a file descriptor obtained from
+     ``userfaultfd()``, as fault addresses subsequently obtained by reading
+     the file descriptor will be untagged, which may otherwise confuse
+     tag-unaware programs.
+
+     NOTE: This behaviour changed in v5.14 and so some earlier kernels may
+     incorrectly accept valid tagged pointers for this system call.
 
 2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
    relaxation is disabled by default and the application thread needs to
--- a/fs/userfaultfd.c~userfaultfd-do-not-untag-user-pointers
+++ a/fs/userfaultfd.c
@@ -1236,23 +1236,21 @@ static __always_inline void wake_userfau
 }
 
 static __always_inline int validate_range(struct mm_struct *mm,
-					  __u64 *start, __u64 len)
+					  __u64 start, __u64 len)
 {
 	__u64 task_size = mm->task_size;
 
-	*start = untagged_addr(*start);
-
-	if (*start & ~PAGE_MASK)
+	if (start & ~PAGE_MASK)
 		return -EINVAL;
 	if (len & ~PAGE_MASK)
 		return -EINVAL;
 	if (!len)
 		return -EINVAL;
-	if (*start < mmap_min_addr)
+	if (start < mmap_min_addr)
 		return -EINVAL;
-	if (*start >= task_size)
+	if (start >= task_size)
 		return -EINVAL;
-	if (len > task_size - *start)
+	if (len > task_size - start)
 		return -EINVAL;
 	return 0;
 }
@@ -1316,7 +1314,7 @@ static int userfaultfd_register(struct u
 		vm_flags |= VM_UFFD_MINOR;
 	}
 
-	ret = validate_range(mm, &uffdio_register.range.start,
+	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
 	if (ret)
 		goto out;
@@ -1522,7 +1520,7 @@ static int userfaultfd_unregister(struct
 	if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
 		goto out;
 
-	ret = validate_range(mm, &uffdio_unregister.start,
+	ret = validate_range(mm, uffdio_unregister.start,
 			     uffdio_unregister.len);
 	if (ret)
 		goto out;
@@ -1671,7 +1669,7 @@ static int userfaultfd_wake(struct userf
 	if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
+	ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
 	if (ret)
 		goto out;
 
@@ -1711,7 +1709,7 @@ static int userfaultfd_copy(struct userf
 			   sizeof(uffdio_copy)-sizeof(__s64)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
+	ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
 	if (ret)
 		goto out;
 	/*
@@ -1768,7 +1766,7 @@ static int userfaultfd_zeropage(struct u
 			   sizeof(uffdio_zeropage)-sizeof(__s64)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
+	ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
 			     uffdio_zeropage.range.len);
 	if (ret)
 		goto out;
@@ -1818,7 +1816,7 @@ static int userfaultfd_writeprotect(stru
 			   sizeof(struct uffdio_writeprotect)))
 		return -EFAULT;
 
-	ret = validate_range(ctx->mm, &uffdio_wp.range.start,
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
 			     uffdio_wp.range.len);
 	if (ret)
 		return ret;
@@ -1866,7 +1864,7 @@ static int userfaultfd_continue(struct u
 			   sizeof(uffdio_continue) - (sizeof(__s64))))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_continue.range.start,
+	ret = validate_range(ctx->mm, uffdio_continue.range.start,
 			     uffdio_continue.range.len);
 	if (ret)
 		goto out;
_



* [patch 02/15] selftest: use mmap instead of posix_memalign to allocate memory
  2021-07-23 22:49 incoming Andrew Morton
  2021-07-23 22:50 ` [patch 01/15] userfaultfd: do not untag user pointers Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 03/15] kfence: defer kfence_test_init to ensure that kunit debugfs is created Andrew Morton
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: aarcange, adelva, akpm, andreyknvl, catalin.marinas, Dave.Martin,
	eugenis, linux-mm, lokeshgidra, mitchp, mm-commits, pcc, stable,
	torvalds, vincenzo.frascino, will, willmcvicker

From: Peter Collingbourne <pcc@google.com>
Subject: selftest: use mmap instead of posix_memalign to allocate memory

This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs.  This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI the
kernel rejects tagged addresses passed to these APIs, which would end up
causing the test to fail.  To make this test compatible with such system
allocators, stop using the system allocator to allocate memory in
anon_allocate_area, and instead just use mmap.

Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b8fc241
Fixes: c47174fc362a ("userfaultfd: selftest")
Co-developed-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alistair Delva <adelva@google.com>
Cc: William McVicker <willmcvicker@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: <stable@vger.kernel.org>	[5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/userfaultfd.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~selftest-use-mmap-instead-of-posix_memalign-to-allocate-memory
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -210,8 +210,10 @@ static void anon_release_pages(char *rel
 
 static void anon_allocate_area(void **alloc_area)
 {
-	if (posix_memalign(alloc_area, page_size, nr_pages * page_size))
-		err("posix_memalign() failed");
+	*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+			   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	if (*alloc_area == MAP_FAILED)
+		err("mmap of anonymous memory failed");
 }
 
 static void noop_alias_mapping(__u64 *start, size_t len, unsigned long offset)
_



* [patch 03/15] kfence: defer kfence_test_init to ensure that kunit debugfs is created
  2021-07-23 22:49 incoming Andrew Morton
  2021-07-23 22:50 ` [patch 01/15] userfaultfd: do not untag user pointers Andrew Morton
  2021-07-23 22:50 ` [patch 02/15] selftest: use mmap instead of posix_memalign to allocate memory Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 04/15] kfence: move the size check to the beginning of __kfence_alloc() Andrew Morton
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, linux-mm, mm-commits, o451686892, torvalds

From: Weizhao Ouyang <o451686892@gmail.com>
Subject: kfence: defer kfence_test_init to ensure that kunit debugfs is created

kfence_test_init and kunit_init both use the same initcall level,
late_initcall, which means that if kfence_test_init is linked ahead of
kunit_init, kfence_test_init gets a NULL debugfs_rootdir as its parent
dentry.  Then kfence_test_init and kfence_debugfs_init both create a
debugfs node named "kfence" under debugfs_mount->mnt_root, and the kernel
complains "debugfs: Directory 'kfence' with parent '/' already present!"
with EEXIST.  So kfence_test_init should be deferred.

Link: https://lkml.kernel.org/r/20210714113140.2949995-1-o451686892@gmail.com
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/kfence_test.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/kfence/kfence_test.c~kfence-defer-kfence_test_init-to-ensure-that-kunit-debugfs-is-created
+++ a/mm/kfence/kfence_test.c
@@ -852,7 +852,7 @@ static void kfence_test_exit(void)
 	tracepoint_synchronize_unregister();
 }
 
-late_initcall(kfence_test_init);
+late_initcall_sync(kfence_test_init);
 module_exit(kfence_test_exit);
 
 MODULE_LICENSE("GPL v2");
_



* [patch 04/15] kfence: move the size check to the beginning of __kfence_alloc()
  2021-07-23 22:49 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-07-23 22:50 ` [patch 03/15] kfence: defer kfence_test_init to ensure that kunit debugfs is created Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 05/15] kfence: skip all GFP_ZONEMASK allocations Andrew Morton
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, gregkh, linux-mm, mm-commits,
	stable, torvalds

From: Alexander Potapenko <glider@google.com>
Subject: kfence: move the size check to the beginning of __kfence_alloc()

Check the allocation size before toggling kfence_allocation_gate.  This
way allocations that can't be served by KFENCE will not result in waiting
for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating anything.

Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/mm/kfence/core.c~kfence-move-the-size-check-to-the-beginning-of-__kfence_alloc
+++ a/mm/kfence/core.c
@@ -734,6 +734,13 @@ void kfence_shutdown_cache(struct kmem_c
 void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
 	/*
+	 * Perform size check before switching kfence_allocation_gate, so that
+	 * we don't disable KFENCE without making an allocation.
+	 */
+	if (size > PAGE_SIZE)
+		return NULL;
+
+	/*
 	 * allocation_gate only needs to become non-zero, so it doesn't make
 	 * sense to continue writing to it and pay the associated contention
 	 * cost, in case we have a large number of concurrent allocations.
@@ -757,9 +764,6 @@ void *__kfence_alloc(struct kmem_cache *
 	if (!READ_ONCE(kfence_enabled))
 		return NULL;
 
-	if (size > PAGE_SIZE)
-		return NULL;
-
 	return kfence_guarded_alloc(s, size, flags);
 }
 
_



* [patch 05/15] kfence: skip all GFP_ZONEMASK allocations
  2021-07-23 22:49 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-07-23 22:50 ` [patch 04/15] kfence: move the size check to the beginning of __kfence_alloc() Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page() Andrew Morton
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, gregkh, jrdr.linux, linux-mm,
	mm-commits, stable, torvalds

From: Alexander Potapenko <glider@google.com>
Subject: kfence: skip all GFP_ZONEMASK allocations

Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot
be fulfilled by KFENCE, because KFENCE memory pool is located in a zone
different from the requested one.

Because callers of kmem_cache_alloc() may actually rely on the allocation
to reside in the requested zone (e.g.  memory allocations done with
__GFP_DMA must be DMAable), skip all allocations done with GFP_ZONEMASK
and/or respective SLAB flags (SLAB_CACHE_DMA and SLAB_CACHE_DMA32).

Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/mm/kfence/core.c~kfence-skip-all-gfp_zonemask-allocations
+++ a/mm/kfence/core.c
@@ -741,6 +741,15 @@ void *__kfence_alloc(struct kmem_cache *
 		return NULL;
 
 	/*
+	 * Skip allocations from non-default zones, including DMA. We cannot
+	 * guarantee that pages in the KFENCE pool will have the requested
+	 * properties (e.g. reside in DMAable memory).
+	 */
+	if ((flags & GFP_ZONEMASK) ||
+	    (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32)))
+		return NULL;
+
+	/*
 	 * allocation_gate only needs to become non-zero, so it doesn't make
 	 * sense to continue writing to it and pay the associated contention
 	 * cost, in case we have a large number of concurrent allocations.
_



* [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
  2021-07-23 22:49 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-07-23 22:50 ` [patch 05/15] kfence: skip all GFP_ZONEMASK allocations Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-24  6:59   ` Christoph Hellwig
  2021-07-23 22:50 ` [patch 07/15] mm: use kmap_local_page in memzero_page Andrew Morton
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, chaitanya.kulkarni, hch, ira.weiny, linux-mm, mm-commits,
	stable, torvalds

From: Christoph Hellwig <hch@lst.de>
Subject: mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()

memcpy_to_page and memzero_page can write to arbitrary pages, which could
be in the page cache or in high memory, so call flush_kernel_dcache_pages
to flush the dcache.

This is a problem when using these helpers on dcache challenged
architectures.  Right now there are just a few users, so chances are no
one has used the PC floppy driver, the aha1542 driver for an ISA SCSI
HBA, or the few advanced and optional btrfs and ext4 features on those
platforms since the conversion.

Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de
Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core")
Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/highmem.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/include/linux/highmem.h~mm-call-flush_dcache_page-in-memcpy_to_page-and-memzero_page
+++ a/include/linux/highmem.h
@@ -318,6 +318,7 @@ static inline void memcpy_to_page(struct
 
 	VM_BUG_ON(offset + len > PAGE_SIZE);
 	memcpy(to + offset, from, len);
+	flush_dcache_page(page);
 	kunmap_local(to);
 }
 
@@ -325,6 +326,7 @@ static inline void memzero_page(struct p
 {
 	char *addr = kmap_atomic(page);
 	memset(addr + offset, 0, len);
+	flush_dcache_page(page);
 	kunmap_atomic(addr);
 }
 
_



* [patch 07/15] mm: use kmap_local_page in memzero_page
  2021-07-23 22:49 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-07-23 22:50 ` [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page() Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 08/15] mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction Andrew Morton
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, chaitanya.kulkarni, hch, ira.weiny, linux-mm, mm-commits, torvalds

From: Christoph Hellwig <hch@lst.de>
Subject: mm: use kmap_local_page in memzero_page

The commit message introducing the global memzero_page explicitly mentions
switching to kmap_local_page in the commit log but doesn't actually do
that.

Link: https://lkml.kernel.org/r/20210713055231.137602-3-hch@lst.de
Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/highmem.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/highmem.h~mm-use-kmap_local_page-in-memzero_page
+++ a/include/linux/highmem.h
@@ -324,10 +324,10 @@ static inline void memcpy_to_page(struct
 
 static inline void memzero_page(struct page *page, size_t offset, size_t len)
 {
-	char *addr = kmap_atomic(page);
+	char *addr = kmap_local_page(page);
 	memset(addr + offset, 0, len);
 	flush_dcache_page(page);
-	kunmap_atomic(addr);
+	kunmap_local(addr);
 }
 
 #endif /* _LINUX_HIGHMEM_H */
_



* [patch 08/15] mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
  2021-07-23 22:49 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-07-23 22:50 ` [patch 07/15] mm: use kmap_local_page in memzero_page Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 09/15] memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions Andrew Morton
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, bowsingbetee, bowsingbetee, david, glider, keescook,
	linux-mm, mm-commits, mmorfikov, slyfox, stable, tglx, torvalds,
	vbabka

From: Sergei Trofimovich <slyfox@gentoo.org>
Subject: mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction

To reproduce the failure we need the following system:
  - kernel command: page_poison=1 init_on_free=0 init_on_alloc=0
  - kernel config:
    * CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
    * CONFIG_INIT_ON_FREE_DEFAULT_ON=y
    * CONFIG_PAGE_POISONING=y

    0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G     U     O      5.13.1-gentoo-x86_64 #1
    Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021
    Call Trace:
     dump_stack+0x64/0x7c
     __kernel_unpoison_pages.cold+0x48/0x84
     post_alloc_hook+0x60/0xa0
     get_page_from_freelist+0xdb8/0x1000
     __alloc_pages+0x163/0x2b0
     __get_free_pages+0xc/0x30
     pgd_alloc+0x2e/0x1a0
     ? dup_mm+0x37/0x4f0
     mm_init+0x185/0x270
     dup_mm+0x6b/0x4f0
     ? __lock_task_sighand+0x35/0x70
     copy_process+0x190d/0x1b10
     kernel_clone+0xba/0x3b0
     __do_sys_clone+0x8f/0xb0
     do_syscall_64+0x68/0x80
     ? do_syscall_64+0x11/0x80
     entry_SYSCALL_64_after_hwframe+0x44/0xae

Before commit 51cba1eb ("init_on_alloc: Optimize static branches"),
init_on_alloc never enabled the static branch by default.  It could only
be enabled explicitly by init_mem_debugging_and_hardening().

But after 51cba1eb the static branch could already be enabled by default,
and there was no code to ever disable it.  That caused the page_poison=1 /
init_on_free=1 conflict.

This change extends init_mem_debugging_and_hardening() to also disable
the static branches if needed.

Link: https://lkml.kernel.org/r/20210714031935.4094114-1-keescook@chromium.org
Link: https://lore.kernel.org/r/20210712215816.1512739-1-slyfox@gentoo.org
Fixes: 51cba1ebc60d ("init_on_alloc: Optimize static branches")
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Reported-by: Mikhail Morfikov <mmorfikov@gmail.com>
Reported-by: <bowsingbetee@pm.me>
Tested-by: <bowsingbetee@protonmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-fix-page_poison=1-init_on_alloc_default_on-interaction
+++ a/mm/page_alloc.c
@@ -840,21 +840,24 @@ void init_mem_debugging_and_hardening(vo
 	}
 #endif
 
-	if (_init_on_alloc_enabled_early) {
-		if (page_poisoning_requested)
-			pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
-				"will take precedence over init_on_alloc\n");
-		else
-			static_branch_enable(&init_on_alloc);
-	}
-	if (_init_on_free_enabled_early) {
-		if (page_poisoning_requested)
-			pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
-				"will take precedence over init_on_free\n");
-		else
-			static_branch_enable(&init_on_free);
+	if ((_init_on_alloc_enabled_early || _init_on_free_enabled_early) &&
+	    page_poisoning_requested) {
+		pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
+			"will take precedence over init_on_alloc and init_on_free\n");
+		_init_on_alloc_enabled_early = false;
+		_init_on_free_enabled_early = false;
 	}
 
+	if (_init_on_alloc_enabled_early)
+		static_branch_enable(&init_on_alloc);
+	else
+		static_branch_disable(&init_on_alloc);
+
+	if (_init_on_free_enabled_early)
+		static_branch_enable(&init_on_free);
+	else
+		static_branch_disable(&init_on_free);
+
 #ifdef CONFIG_DEBUG_PAGEALLOC
 	if (!debug_pagealloc_enabled())
 		return;
_



* [patch 09/15] memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions
  2021-07-23 22:49 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-07-23 22:50 ` [patch 08/15] mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 10/15] writeback, cgroup: remove wb from offline list before releasing refcnt Andrew Morton
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, david, groug, linux-mm, mm-commits, rppt, stable, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions

Commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with
for_each_mem_range()") didn't take into account that when the
movable_node parameter is present on the kernel command line,
for_each_mem_range() skips ranges marked with MEMBLOCK_HOTPLUG.

The page table setup code on POWER uses for_each_mem_range() to create
the linear mapping of the physical memory and, since the regions marked
as MEMBLOCK_HOTPLUG are skipped, they never make it to the linear map.

A later access to the memory in those ranges will fail:

[    2.271743] BUG: Unable to handle kernel data access on write at 0xc000000400000000
[    2.271984] Faulting instruction address: 0xc00000000008a3c0
[    2.272568] Oops: Kernel access of bad area, sig: 11 [#1]
[    2.272683] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[    2.273063] Modules linked in:
[    2.273435] CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7
[    2.273832] NIP:  c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040
[    2.273918] REGS: c000000008a57770 TRAP: 0300   Not tainted  (5.13.0)
[    2.274036] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 84222202  XER: 20040000
[    2.274454] CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0
[    2.274454] GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000
[    2.274454] GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200
[    2.274454] GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300
[    2.274454] GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00
[    2.274454] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    2.274454] GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000
[    2.274454] GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0
[    2.274454] GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680
[    2.275520] NIP [c00000000008a3c0] clear_user_page+0x50/0x80
[    2.276333] LR [c0000000003c1ed8] __handle_mm_fault+0xc88/0x1910
[    2.276688] Call Trace:
[    2.276839] [c000000008a57a10] [c0000000003c1e94] __handle_mm_fault+0xc44/0x1910 (unreliable)
[    2.277142] [c000000008a57af0] [c0000000003c2c90] handle_mm_fault+0x130/0x2a0
[    2.277331] [c000000008a57b40] [c0000000003b5f08] __get_user_pages+0x248/0x610
[    2.277541] [c000000008a57c40] [c0000000003b848c] __get_user_pages_remote+0x12c/0x3e0
[    2.277768] [c000000008a57cd0] [c000000000473f24] get_arg_page+0x54/0xf0
[    2.277959] [c000000008a57d10] [c000000000474a7c] copy_string_kernel+0x11c/0x210
[    2.278159] [c000000008a57d80] [c00000000047663c] kernel_execve+0x16c/0x220
[    2.278361] [c000000008a57dd0] [c000000000166270] call_usermodehelper_exec_async+0x1b0/0x2f0
[    2.278543] [c000000008a57e10] [c00000000000d5ec] ret_from_kernel_thread+0x5c/0x70
[    2.278870] Instruction dump:
[    2.279214] 79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050
[    2.279416] 7d4903a6 60000000 60000000 60000000 <7c001fec> 7c091fec 7c081fec 7c051fec
[    2.280193] ---[ end trace 490b8c67e6075e09 ]---

Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the
traversal fixes this issue.
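As a userspace sketch (not the kernel code; the flag bit value and names here are illustrative), the skip decision that the patch changes can be modeled like this:

```c
#include <assert.h>
#include <stdbool.h>

#define MEMBLOCK_NONE    0x0u
#define MEMBLOCK_HOTPLUG 0x1u	/* illustrative flag bit */

/*
 * Simplified model of should_skip_region(): a hotpluggable region is
 * skipped when movable_node is enabled, unless the caller passed
 * MEMBLOCK_HOTPLUG in flags -- which for_each_mem_range() now does.
 */
static bool skip_region(bool movable_node_enabled, bool region_is_hotplug,
			unsigned int flags)
{
	if (movable_node_enabled && region_is_hotplug &&
	    !(flags & MEMBLOCK_HOTPLUG))
		return true;
	return false;
}
```

With the old MEMBLOCK_NONE argument, a hotpluggable region is skipped whenever movable_node is enabled and never reaches the linear map; passing MEMBLOCK_HOTPLUG makes the traversal include it.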

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100
Link: https://lkml.kernel.org/r/20210712071132.20902-1-rppt@kernel.org
Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>	[5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memblock.h |    4 ++--
 mm/memblock.c            |    3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

--- a/include/linux/memblock.h~memblock-make-for_each_mem_range-traverse-memblock_hotplug-regions
+++ a/include/linux/memblock.h
@@ -209,7 +209,7 @@ static inline void __next_physmem_range(
  */
 #define for_each_mem_range(i, p_start, p_end) \
 	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,	\
-			     MEMBLOCK_NONE, p_start, p_end, NULL)
+			     MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
 
 /**
  * for_each_mem_range_rev - reverse iterate through memblock areas from
@@ -220,7 +220,7 @@ static inline void __next_physmem_range(
  */
 #define for_each_mem_range_rev(i, p_start, p_end)			\
 	__for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \
-				 MEMBLOCK_NONE, p_start, p_end, NULL)
+				 MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
 
 /**
  * for_each_reserved_mem_range - iterate over all reserved memblock areas
--- a/mm/memblock.c~memblock-make-for_each_mem_range-traverse-memblock_hotplug-regions
+++ a/mm/memblock.c
@@ -947,7 +947,8 @@ static bool should_skip_region(struct me
 		return true;
 
 	/* skip hotpluggable memory regions if needed */
-	if (movable_node_is_enabled() && memblock_is_hotpluggable(m))
+	if (movable_node_is_enabled() && memblock_is_hotpluggable(m) &&
+	    !(flags & MEMBLOCK_HOTPLUG))
 		return true;
 
 	/* if we want mirror memory skip non-mirror memory regions */
_



* [patch 10/15] writeback, cgroup: remove wb from offline list before releasing refcnt
  2021-07-23 22:49 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-07-23 22:50 ` [patch 09/15] memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 11/15] writeback, cgroup: do not reparent dax inodes Andrew Morton
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, bxue, dchinner, djwong, guro, jack, jencce.kernel,
	linux-mm, mm-commits, torvalds, will

From: Roman Gushchin <guro@fb.com>
Subject: writeback, cgroup: remove wb from offline list before releasing refcnt

Boyang reported that the commit c22d70a162d3 ("writeback, cgroup: release
dying cgwbs by switching attached inodes") causes the kernel to crash
while running xfstests generic/256 on ext4 on aarch64 and ppc64le.

  [ 4366.380974] run fstests generic/256 at 2021-07-12 05:41:40
  [ 4368.337078] EXT4-fs (vda3): mounted filesystem with ordered data
  mode. Opts: . Quota mode: none.
  [ 4371.275986] Unable to handle kernel NULL pointer dereference at
  virtual address 0000000000000000
  [ 4371.278210] Mem abort info:
  [ 4371.278880]   ESR = 0x96000005
  [ 4371.279603]   EC = 0x25: DABT (current EL), IL = 32 bits
  [ 4371.280878]   SET = 0, FnV = 0
  [ 4371.281621]   EA = 0, S1PTW = 0
  [ 4371.282396]   FSC = 0x05: level 1 translation fault
  [ 4371.283635] Data abort info:
  [ 4371.284333]   ISV = 0, ISS = 0x00000005
  [ 4371.285246]   CM = 0, WnR = 0
  [ 4371.285975] user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
  [ 4371.287640] [0000000000000000] pgd=0000000000000000,
  p4d=0000000000000000, pud=0000000000000000
  [ 4371.290016] Internal error: Oops: 96000005 [#1] SMP
  [ 4371.291251] Modules linked in: dm_flakey dm_snapshot dm_bufio
  dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
  nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2
  drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64
  sha1_ce virtio_blk virtio_net net_failover virtio_console failover
  virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
  [ 4371.300059] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G
         X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
  [ 4371.303009] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  [ 4371.304685] Workqueue: events_unbound cleanup_offline_cgwbs_workfn
  [ 4371.306329] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
  [ 4371.307867] pc : cleanup_offline_cgwbs_workfn+0x320/0x394
  [ 4371.309254] lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
  [ 4371.310597] sp : ffff80001554fd10
  [ 4371.311443] x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
  [ 4371.313320] x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
  [ 4371.315159] x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
  [ 4371.316945] x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
  [ 4371.318690] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
  [ 4371.320437] x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
  [ 4371.322444] x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
  [ 4371.324243] x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
  [ 4371.326049] x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
  [ 4371.327898] x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
  [ 4371.329748] Call trace:
  [ 4371.330372]  cleanup_offline_cgwbs_workfn+0x320/0x394
  [ 4371.331694]  process_one_work+0x1f4/0x4b0
  [ 4371.332767]  worker_thread+0x184/0x540
  [ 4371.333732]  kthread+0x114/0x120
  [ 4371.334535]  ret_from_fork+0x10/0x18
  [ 4371.335440] Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
  [ 4371.337174] ---[ end trace e250fe289272792a ]---
  [ 4371.338365] Kernel panic - not syncing: Oops: Fatal exception
  [ 4371.339884] SMP: stopping secondary CPUs
  [ 4372.424137] SMP: failed to stop secondary CPUs 0-2
  [ 4372.436894] Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
  [ 4372.438408] PHYS_OFFSET: 0xfff0defca0000000
  [ 4372.439496] CPU features: 0x00200251,23200840
  [ 4372.440603] Memory Limit: none
  [ 4372.441374] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

The problem happens when cgwb_release_workfn() races with
cleanup_offline_cgwbs_workfn(): wb_tryget() in
cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit()
has already run in cgwb_release_workfn(), which is basically a
use-after-free error.

Fix the problem by removing the writeback structure from the offline
list before releasing the percpu reference counter.  This guarantees
that cleanup_offline_cgwbs_workfn() will not see, and will not access,
writeback structures which are about to be released.
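The window the patch closes can be illustrated with a small sequential model (a userspace sketch, not the kernel code): with the buggy ordering there is a moment where the wb is still discoverable on the offline list although its percpu ref has already been exited.

```c
#include <assert.h>
#include <stdbool.h>

struct wb_model {
	bool on_offline_list;	/* discoverable by the cleanup worker */
	bool ref_exited;	/* percpu_ref_exit() already ran */
};

/*
 * Returns true if, between the two release steps, a concurrent
 * cleanup_offline_cgwbs_workfn() could still find the wb on the
 * offline list after its percpu ref was exited -- the state in which
 * wb_tryget() becomes a use-after-free.
 */
static bool has_uaf_window(bool list_del_first)
{
	struct wb_model wb = { .on_offline_list = true, .ref_exited = false };

	if (list_del_first)
		wb.on_offline_list = false;	/* fixed order: list_del() first */
	else
		wb.ref_exited = true;		/* buggy order: percpu_ref_exit() first */

	/* state a concurrent worker may observe between the two steps */
	return wb.on_offline_list && wb.ref_exited;
}
```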

Link: https://lkml.kernel.org/r/20210716201039.3762203-1-guro@fb.com
Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Boyang Xue <bxue@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/backing-dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/backing-dev.c~writeback-cgroup-remove-wb-from-offline-list-before-releasing-refcnt
+++ a/mm/backing-dev.c
@@ -398,12 +398,12 @@ static void cgwb_release_workfn(struct w
 	blkcg_unpin_online(blkcg);
 
 	fprop_local_destroy_percpu(&wb->memcg_completions);
-	percpu_ref_exit(&wb->refcnt);
 
 	spin_lock_irq(&cgwb_lock);
 	list_del(&wb->offline_node);
 	spin_unlock_irq(&cgwb_lock);
 
+	percpu_ref_exit(&wb->refcnt);
 	wb_exit(wb);
 	WARN_ON_ONCE(!list_empty(&wb->b_attached));
 	kfree_rcu(wb, rcu);
_



* [patch 11/15] writeback, cgroup: do not reparent dax inodes
  2021-07-23 22:49 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-07-23 22:50 ` [patch 10/15] writeback, cgroup: remove wb from offline list before releasing refcnt Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 12/15] mm/secretmem: wire up ->set_page_dirty Andrew Morton
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, dchinner, djwong, guro, jack, jencce.kernel, linux-mm,
	mm-commits, torvalds, willy

From: Roman Gushchin <guro@fb.com>
Subject: writeback, cgroup: do not reparent dax inodes

The inode switching code is not suited for dax inodes.  An attempt to
switch a dax inode to a parent writeback structure (as a part of a
writeback cleanup procedure) results in a panic like this:

  [  987.071651] run fstests generic/270 at 2021-07-15 05:54:02
  [  988.704940] XFS (pmem0p2): EXPERIMENTAL big timestamp feature in
  use.  Use at your own risk!
  [  988.746847] XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use
  at your own risk
  [  988.786070] XFS (pmem0p2): EXPERIMENTAL inode btree counters
  feature in use. Use at your own risk!
  [  988.828639] XFS (pmem0p2): Mounting V5 Filesystem
  [  988.854019] XFS (pmem0p2): Ending clean mount
  [  988.874550] XFS (pmem0p2): Quotacheck needed: Please wait.
  [  988.900618] XFS (pmem0p2): Quotacheck: Done.
  [  989.090783] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  [  989.092751] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  [  989.092962] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  [ 1010.105586] BUG: unable to handle page fault for address: 0000000005b0f669
  [ 1010.141817] #PF: supervisor read access in kernel mode
  [ 1010.167824] #PF: error_code(0x0000) - not-present page
  [ 1010.191499] PGD 0 P4D 0
  [ 1010.203346] Oops: 0000 [#1] SMP PTI
  [ 1010.219596] CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted
  5.14.0-rc1-master-8096acd7442e+ #8
  [ 1010.260441] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360
  Gen9, BIOS P89 09/13/2016
  [ 1010.297792] Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
  [ 1010.324832] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  [ 1010.347261] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
  c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
  ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
  0f 85
  [ 1010.434307] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  [ 1010.457795] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  [ 1010.489922] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  [ 1010.522085] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  [ 1010.554234] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  [ 1010.586414] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  [ 1010.619394] FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000)
  knlGS:0000000000000000
  [ 1010.658874] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1010.688085] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  [ 1010.722129] Call Trace:
  [ 1010.733132]  inode_switch_wbs_work_fn+0xb6/0x2a0
  [ 1010.754121]  process_one_work+0x1e6/0x380
  [ 1010.772512]  worker_thread+0x53/0x3d0
  [ 1010.789221]  ? process_one_work+0x380/0x380
  [ 1010.807964]  kthread+0x10f/0x130
  [ 1010.822043]  ? set_kthread_struct+0x40/0x40
  [ 1010.840818]  ret_from_fork+0x22/0x30
  [ 1010.856851] Modules linked in: xt_CHECKSUM xt_MASQUERADE
  xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat
  nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables
  nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr
  intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp
  coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt
  irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl
  syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma
  dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr
  intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad
  lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod
  t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3
  ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi
  dm_mirror dm_region_hash dm_log dm_mod
  [ 1011.200864] CR2: 0000000005b0f669
  [ 1011.215700] ---[ end trace ed2105faff8384f3 ]---
  [ 1011.241727] RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  [ 1011.264306] Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48
  c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff
  ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08
  0f 85
  [ 1011.348821] RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  [ 1011.372734] RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  [ 1011.405826] RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  [ 1011.437852] RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  [ 1011.469926] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  [ 1011.502179] R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  [ 1011.534233] FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000)
  knlGS:0000000000000000
  [ 1011.571247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1011.597063] CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  [ 1011.629160] Kernel panic - not syncing: Fatal exception
  [ 1011.653802] Kernel Offset: 0x15200000 from 0xffffffff81000000
  (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [ 1011.713723] ---[ end Kernel panic - not syncing: Fatal exception ]---

The crash happens on an attempt to iterate over attached pagecache pages
and check the dirty flag: a dax inode's xarray contains pfn's instead of
generic struct page pointers.

This happens for DAX and not for other kinds of non-page entries in the
inodes because it's a tagged iteration, and shadow/swap entries are never
tagged; only DAX entries get tagged.
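A rough userspace model of why the tagged walk is dangerous here (a sketch under the assumption that, as in the xarray, value entries carry the low bit set while struct page pointers are aligned):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * DAX stores pfns in the mapping as "value entries" (low bit set,
 * payload shifted up by one), while regular pagecache stores aligned
 * struct page pointers (low bit clear).  A tagged iteration over a
 * DAX mapping therefore yields entries that must never be
 * dereferenced as struct page -- which is what the crashing walk did.
 */
static bool is_value_entry(unsigned long entry)
{
	return entry & 1UL;
}

static bool safe_to_treat_as_page(unsigned long entry)
{
	return entry != 0 && !is_value_entry(entry);
}
```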

Fix the problem by bailing out (returning false) from
inode_prepare_wbs_switch() if a dax inode is passed.

[willy@infradead.org: changelog addition]
Link: https://lkml.kernel.org/r/20210719171350.3876830-1-guro@fb.com
Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fs-writeback.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/fs-writeback.c~writeback-cgroup-do-not-reparent-dax-inodes
+++ a/fs/fs-writeback.c
@@ -521,6 +521,9 @@ static bool inode_prepare_wbs_switch(str
 	 */
 	smp_mb();
 
+	if (IS_DAX(inode))
+		return false;
+
 	/* while holding I_WB_SWITCH, no one else can update the association */
 	spin_lock(&inode->i_lock);
 	if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
_



* [patch 12/15] mm/secretmem: wire up ->set_page_dirty
  2021-07-23 22:49 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2021-07-23 22:50 ` [patch 11/15] writeback, cgroup: do not reparent dax inodes Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 13/15] mm: mmap_lock: fix disabling preemption directly Andrew Morton
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, david, linux-mm, mm-commits, rppt, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mm/secretmem: wire up ->set_page_dirty

Bring secretmem up to date with the changes done in commit 0af573780b0b
("mm: require ->set_page_dirty to be explicitly wired up") so that an
unconditional call to this method won't cause crashes.

Link: https://lkml.kernel.org/r/20210716063933.31633-1-rppt@kernel.org
Fixes: 0af573780b0b ("mm: require ->set_page_dirty to be explicitly wired up")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/secretmem.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/secretmem.c~mm-secretmem-wire-up-set_page_dirty
+++ a/mm/secretmem.c
@@ -152,6 +152,7 @@ static void secretmem_freepage(struct pa
 }
 
 const struct address_space_operations secretmem_aops = {
+	.set_page_dirty	= __set_page_dirty_no_writeback,
 	.freepage	= secretmem_freepage,
 	.migratepage	= secretmem_migratepage,
 	.isolate_page	= secretmem_isolate_page,
_



* [patch 13/15] mm: mmap_lock: fix disabling preemption directly
  2021-07-23 22:49 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2021-07-23 22:50 ` [patch 12/15] mm/secretmem: wire up ->set_page_dirty Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 14/15] mm: fix the deadlock in finish_fault() Andrew Morton
  2021-07-23 22:50 ` [patch 15/15] hugetlbfs: fix mount mode command line processing Andrew Morton
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, linux-mm, mgorman, mm-commits, pankaj.gupta, shy828301,
	songmuchun, torvalds

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: mmap_lock: fix disabling preemption directly

Commit 832b50725373 ("mm: mmap_lock: use local locks instead of
disabling preemption") fixed a bug by using local locks, but commit
d01079f3d0c0 ("mm/mmap_lock: remove dead code for !CONFIG_TRACING
configurations") changed those lines back to the original version,
presumably as a result of conflict resolution while merging.

Link: https://lkml.kernel.org/r/20210720074228.76342-1-songmuchun@bytedance.com
Fixes: d01079f3d0c0 ("mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap_lock.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/mmap_lock.c~mm-mmap_lock-fix-disabling-preemption-directly
+++ a/mm/mmap_lock.c
@@ -156,14 +156,14 @@ static inline void put_memcg_path_buf(vo
 #define TRACE_MMAP_LOCK_EVENT(type, mm, ...)                                   \
 	do {                                                                   \
 		const char *memcg_path;                                        \
-		preempt_disable();                                             \
+		local_lock(&memcg_paths.lock);                                 \
 		memcg_path = get_mm_memcg_path(mm);                            \
 		trace_mmap_lock_##type(mm,                                     \
 				       memcg_path != NULL ? memcg_path : "",   \
 				       ##__VA_ARGS__);                         \
 		if (likely(memcg_path != NULL))                                \
 			put_memcg_path_buf();                                  \
-		preempt_enable();                                              \
+		local_unlock(&memcg_paths.lock);                               \
 	} while (0)
 
 #else /* !CONFIG_MEMCG */
_



* [patch 14/15] mm: fix the deadlock in finish_fault()
  2021-07-23 22:49 incoming Andrew Morton
                   ` (12 preceding siblings ...)
  2021-07-23 22:50 ` [patch 13/15] mm: mmap_lock: fix disabling preemption directly Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-23 22:50 ` [patch 15/15] hugetlbfs: fix mount mode command line processing Andrew Morton
  14 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, hannes, kirill.shutemov, linux-mm, mhocko, mm-commits,
	songmuchun, stable, tglx, torvalds, vdavydov.dev, zhengqi.arch

From: Qi Zheng <zhengqi.arch@bytedance.com>
Subject: mm: fix the deadlock in finish_fault()

Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback")
fixed the following ABBA deadlock by pre-allocating the pte page table
without holding the page lock.

	                                lock_page(A)
                                        SetPageWriteback(A)
                                        unlock_page(A)
  lock_page(B)
                                        lock_page(B)
  pte_alloc_one
    shrink_page_list
      wait_on_page_writeback(A)
                                        SetPageWriteback(B)
                                        unlock_page(B)

                                        # flush A, B to clear the writeback

Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") reworked the relevant code but ignored this race.  This
allows the deadlock above to reappear, so fix it.

Link: https://lkml.kernel.org/r/20210721074849.57004-1-zhengqi.arch@bytedance.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/mm/memory.c~mm-fix-the-deadlock-in-finish_fault
+++ a/mm/memory.c
@@ -4026,8 +4026,17 @@ vm_fault_t finish_fault(struct vm_fault
 				return ret;
 		}
 
-		if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+		if (vmf->prealloc_pte) {
+			vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
+			if (likely(pmd_none(*vmf->pmd))) {
+				mm_inc_nr_ptes(vma->vm_mm);
+				pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
+				vmf->prealloc_pte = NULL;
+			}
+			spin_unlock(vmf->ptl);
+		} else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
 			return VM_FAULT_OOM;
+		}
 	}
 
 	/* See comment in handle_pte_fault() */
_



* [patch 15/15] hugetlbfs: fix mount mode command line processing
  2021-07-23 22:49 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2021-07-23 22:50 ` [patch 14/15] mm: fix the deadlock in finish_fault() Andrew Morton
@ 2021-07-23 22:50 ` Andrew Morton
  2021-07-24  1:41   ` Al Viro
  14 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2021-07-23 22:50 UTC (permalink / raw)
  To: akpm, bugs+kernel.org, dhowells, linux-mm, mike.kravetz,
	mm-commits, stable, torvalds, viro, willy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlbfs: fix mount mode command line processing

In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of
the mount mode string was changed from match_octal() to fsparam_u32.  This
changed existing behavior as match_octal does not require octal values to
have a '0' prefix, but fsparam_u32 does.

Use fsparam_u32oct which provides the same behavior as match_octal.
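The behavioral difference can be reproduced in userspace with strtoul() (an analogy for the kernel parsers, not the fs_parser code itself): fsparam_u32 auto-detects the base, so "700" parses as decimal, while fsparam_u32oct forces base 8 like the old match_octal().

```c
#include <assert.h>
#include <stdlib.h>

/* like fsparam_u32: base 0 auto-detects, so octal needs a "0" prefix */
static unsigned long parse_auto(const char *s)
{
	return strtoul(s, NULL, 0);
}

/* like fsparam_u32oct / match_octal(): always base 8 */
static unsigned long parse_octal(const char *s)
{
	return strtoul(s, NULL, 8);
}
```

parse_auto("700") yields decimal 700, not the mode the user meant; parse_octal("700") yields 0700 (448).  With an explicit "0700" both agree.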

Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Dennis Camera <bugs+kernel.org@dtnr.ch>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hugetlbfs/inode.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/hugetlbfs/inode.c~hugetlbfs-fix-mount-mode-command-line-processing
+++ a/fs/hugetlbfs/inode.c
@@ -77,7 +77,7 @@ enum hugetlb_param {
 static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
 	fsparam_u32   ("gid",		Opt_gid),
 	fsparam_string("min_size",	Opt_min_size),
-	fsparam_u32   ("mode",		Opt_mode),
+	fsparam_u32oct("mode",		Opt_mode),
 	fsparam_string("nr_inodes",	Opt_nr_inodes),
 	fsparam_string("pagesize",	Opt_pagesize),
 	fsparam_string("size",		Opt_size),
_



* Re: [patch 15/15] hugetlbfs: fix mount mode command line processing
  2021-07-23 22:50 ` [patch 15/15] hugetlbfs: fix mount mode command line processing Andrew Morton
@ 2021-07-24  1:41   ` Al Viro
  2021-07-26  5:22     ` Andrew Morton
  0 siblings, 1 reply; 20+ messages in thread
From: Al Viro @ 2021-07-24  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: bugs+kernel.org, dhowells, linux-mm, mike.kravetz, mm-commits,
	stable, torvalds, willy

On Fri, Jul 23, 2021 at 03:50:44PM -0700, Andrew Morton wrote:
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Subject: hugetlbfs: fix mount mode command line processing
> 
> In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of
> the mount mode string was changed from match_octal() to fsparam_u32.  This
> changed existing behavior as match_octal does not require octal values to
> have a '0' prefix, but fsparam_u32 does.
> 
> Use fsparam_u32oct which provides the same behavior as match_octal.

Looks sane...  Which tree do you want it to go through?



* Re: [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
  2021-07-23 22:50 ` [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page() Andrew Morton
@ 2021-07-24  6:59   ` Christoph Hellwig
  2021-07-24 16:23     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-24  6:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: chaitanya.kulkarni, hch, ira.weiny, linux-mm, mm-commits, stable,
	torvalds

On Fri, Jul 23, 2021 at 03:50:17PM -0700, Andrew Morton wrote:
> one used the PC floppy dr\u0456ver, the aha1542 driver for an ISA SCSI

Looks like I produced some messed up utf8 chars again - the above garbage
should read "driver" of course.



* Re: [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
  2021-07-24  6:59   ` Christoph Hellwig
@ 2021-07-24 16:23     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2021-07-24 16:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, chaitanya.kulkarni, ira.weiny, linux-mm,
	mm-commits, stable, torvalds

On Sat, Jul 24, 2021 at 08:59:54AM +0200, Christoph Hellwig wrote:
> On Fri, Jul 23, 2021 at 03:50:17PM -0700, Andrew Morton wrote:
> > one used the PC floppy dr\u0456ver, the aha1542 driver for an ISA SCSI
> 
> Looks like I produced some messed up utf8 chars again - the above garbage
> should read "driver" of course.

I went back and looked it up, and you did indeed manage to type:

U+0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I character (&#x0456;)

It's on the list:
http://www.unicode.org/Public/security/revision-05/confusables.txt

Maybe someone could do something with that file to prevent the
confusables from slipping in when unwanted?
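A full answer would consume confusables.txt, but even a minimal scan for non-ASCII bytes in changelog text would have flagged this instance (a userspace sketch with a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first non-ASCII byte in s, or -1 if the
 * string is pure ASCII.  U+0456 encodes as the UTF-8 bytes 0xd1 0x96,
 * so the "dr\u0456ver" above trips the check at its third byte.
 */
static long first_non_ascii(const char *s)
{
	for (size_t i = 0; s[i]; i++)
		if ((unsigned char)s[i] > 0x7f)
			return (long)i;
	return -1;
}
```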



* Re: [patch 15/15] hugetlbfs: fix mount mode command line processing
  2021-07-24  1:41   ` Al Viro
@ 2021-07-26  5:22     ` Andrew Morton
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2021-07-26  5:22 UTC (permalink / raw)
  To: Al Viro
  Cc: bugs+kernel.org, dhowells, linux-mm, mike.kravetz, mm-commits,
	stable, torvalds, willy

On Sat, 24 Jul 2021 01:41:52 +0000 Al Viro <viro@zeniv.linux.org.uk> wrote:

> On Fri, Jul 23, 2021 at 03:50:44PM -0700, Andrew Morton wrote:
> > From: Mike Kravetz <mike.kravetz@oracle.com>
> > Subject: hugetlbfs: fix mount mode command line processing
> > 
> > In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of
> > the mount mode string was changed from match_octal() to fsparam_u32.  This
> > changed existing behavior as match_octal does not require octal values to
> > have a '0' prefix, but fsparam_u32 does.
> > 
> > Use fsparam_u32oct which provides the same behavior as match_octal.
> 
> Looks sane...  Which tree do you want it to go through?

It's now in mainline, with cc:stable, Fixes: 32021982a324.
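[Editor's note: the parsing difference the patch fixes can be modeled outside the kernel. The Python sketch below is an analogy to the two parsers' behavior, not the kernel code; the function names are invented for illustration.]

```python
def mode_match_octal(s: str) -> int:
    """Old behavior (match_octal): always base 8, no '0' prefix required."""
    return int(s, 8)

def mode_fsparam_u32(s: str) -> int:
    """Regressed behavior: a leading '0' selects octal, otherwise decimal."""
    return int(s, 8) if s.startswith("0") else int(s, 10)

# mode=755 used to mean rwxr-xr-x (0o755 == 493) ...
assert mode_match_octal("755") == 0o755
# ... but the fsparam_u32-style parser reads it as decimal 755 instead
assert mode_fsparam_u32("755") == 755
assert mode_fsparam_u32("755") != 0o755
# Writing mode=0755, or switching to fsparam_u32oct (always base 8),
# yields the intended octal value again.
assert mode_fsparam_u32("0755") == mode_match_octal("755")
```

So mounts that passed `mode=755` silently got the wrong permissions after the fs_context conversion, which is why the fix (using fsparam_u32oct) was tagged for stable.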


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-07-26  5:22 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-23 22:49 incoming Andrew Morton
2021-07-23 22:50 ` [patch 01/15] userfaultfd: do not untag user pointers Andrew Morton
2021-07-23 22:50 ` [patch 02/15] selftest: use mmap instead of posix_memalign to allocate memory Andrew Morton
2021-07-23 22:50 ` [patch 03/15] kfence: defer kfence_test_init to ensure that kunit debugfs is created Andrew Morton
2021-07-23 22:50 ` [patch 04/15] kfence: move the size check to the beginning of __kfence_alloc() Andrew Morton
2021-07-23 22:50 ` [patch 05/15] kfence: skip all GFP_ZONEMASK allocations Andrew Morton
2021-07-23 22:50 ` [patch 06/15] mm: call flush_dcache_page() in memcpy_to_page() and memzero_page() Andrew Morton
2021-07-24  6:59   ` Christoph Hellwig
2021-07-24 16:23     ` Matthew Wilcox
2021-07-23 22:50 ` [patch 07/15] mm: use kmap_local_page in memzero_page Andrew Morton
2021-07-23 22:50 ` [patch 08/15] mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction Andrew Morton
2021-07-23 22:50 ` [patch 09/15] memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions Andrew Morton
2021-07-23 22:50 ` [patch 10/15] writeback, cgroup: remove wb from offline list before releasing refcnt Andrew Morton
2021-07-23 22:50 ` [patch 11/15] writeback, cgroup: do not reparent dax inodes Andrew Morton
2021-07-23 22:50 ` [patch 12/15] mm/secretmem: wire up ->set_page_dirty Andrew Morton
2021-07-23 22:50 ` [patch 13/15] mm: mmap_lock: fix disabling preemption directly Andrew Morton
2021-07-23 22:50 ` [patch 14/15] mm: fix the deadlock in finish_fault() Andrew Morton
2021-07-23 22:50 ` [patch 15/15] hugetlbfs: fix mount mode command line processing Andrew Morton
2021-07-24  1:41   ` Al Viro
2021-07-26  5:22     ` Andrew Morton
