mm-commits.vger.kernel.org archive mirror
* incoming
@ 2020-07-03 22:14 Andrew Morton
  2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
                   ` (232 more replies)
  0 siblings, 233 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

5 patches, based on cdd3bb54332f82295ed90cd0c09c78cd0c0ee822.

Subsystems affected by this patch series:

  mm/hugetlb
  samples
  mm/cma
  mm/vmalloc
  mm/pagealloc

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      mm/hugetlb.c: fix pages per hugetlb calculation

Subsystem: samples

    Kees Cook <keescook@chromium.org>:
      samples/vfs: avoid warning in statx override

Subsystem: mm/cma

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/cma.c: use exact_nid true to fix possible per-numa cma leak

Subsystem: mm/vmalloc

    Christoph Hellwig <hch@lst.de>:
      vmalloc: fix the owner argument for the new __vmalloc_node_range callers

Subsystem: mm/pagealloc

    Joel Savitz <jsavitz@redhat.com>:
      mm/page_alloc: fix documentation error

 arch/arm64/kernel/probes/kprobes.c |    2 +-
 arch/x86/hyperv/hv_init.c          |    3 ++-
 kernel/module.c                    |    2 +-
 mm/cma.c                           |    4 ++--
 mm/hugetlb.c                       |    2 +-
 mm/page_alloc.c                    |    2 +-
 samples/vfs/test-statx.c           |    2 ++
 7 files changed, 10 insertions(+), 7 deletions(-)


* [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation
  2020-07-03 22:14 incoming Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-03 22:15 ` [patch 2/5] samples/vfs: avoid warning in statx override Andrew Morton
                   ` (231 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
  To: akpm, kirill.shutemov, linux-mm, mhocko, mike.kravetz,
	mm-commits, stable, torvalds, willy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: mm/hugetlb.c: fix pages per hugetlb calculation

The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page.  hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.

Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error.  The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration.  In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE.  And, the vma containing the huge page would also need
to be smaller than PMD_SIZE.
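
For reference, a condensed sketch of the two helpers as they looked at the
time (paraphrased from include/linux/huge_mm.h and include/linux/hugetlb.h;
bodies simplified):

    static inline int hpage_nr_pages(struct page *page)
    {
            /* true for any compound head page, including hugetlb */
            if (unlikely(PageTransHuge(page)))
                    return HPAGE_PMD_NR;    /* fixed PMD-sized count */
            return 1;
    }

    static inline unsigned long pages_per_huge_page(struct hstate *h)
    {
            return 1UL << h->order;         /* right for any hugetlb size */
    }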

Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-fix-pages-per-hugetlb-calculation
+++ a/mm/hugetlb.c
@@ -1593,7 +1593,7 @@ static struct address_space *_get_hugetl
 
 	/* Use first found vma */
 	pgoff_start = page_to_pgoff(hpage);
-	pgoff_end = pgoff_start + hpage_nr_pages(hpage) - 1;
+	pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
 					pgoff_start, pgoff_end) {
 		struct vm_area_struct *vma = avc->vma;
_


* [patch 2/5] samples/vfs: avoid warning in statx override
  2020-07-03 22:14 incoming Andrew Morton
  2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
                   ` (230 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
  To: akpm, dhowells, keescook, linux-mm, mm-commits, mszeredi, torvalds, viro

From: Kees Cook <keescook@chromium.org>
Subject: samples/vfs: avoid warning in statx override

Something changed recently to uncover this warning:

samples/vfs/test-statx.c:24:15: warning: `struct foo' declared inside parameter list will not be visible outside of this definition or declaration
   24 | #define statx foo
      |               ^~~

This is due to the use of "struct statx" (here, "struct foo") in a function
prototype argument list before the struct has been defined:

 int
 # 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
    foo
 # 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
          (int __dirfd, const char *__restrict __path, int __flags,
            unsigned int __mask, struct
 # 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
                                       foo
 # 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
                                             *__restrict __buf)
   __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 5)));

Add explicit struct forward declarations before the #include to avoid the
warning.
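
A minimal illustration of the scoping rule at play (hypothetical file, not
from the patch):

    #define statx foo
    /* "struct statx" below expands to "struct foo"; with no prior
     * declaration, foo's scope is just this one prototype -- hence
     * the warning: */
    int statx(int dirfd, const char *path, int flags,
              unsigned int mask, struct statx *buf);

    struct statx;   /* file-scope forward declaration ("struct foo;") */
    /* later prototypes now refer to one visible type, no warning */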

Link: http://lkml.kernel.org/r/202006282213.C516EA6@keescook
Fixes: f1b5618e013a ("vfs: Add a sample program for the new mount API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 samples/vfs/test-statx.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/samples/vfs/test-statx.c~samples-vfs-avoid-warning-in-statx-override
+++ a/samples/vfs/test-statx.c
@@ -23,6 +23,8 @@
 #include <linux/fcntl.h>
 #define statx foo
 #define statx_timestamp foo_timestamp
+struct statx;
+struct statx_timestamp;
 #include <sys/stat.h>
 #undef statx
 #undef statx_timestamp
_


* [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak
  2020-07-03 22:14 incoming Andrew Morton
  2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
  2020-07-03 22:15 ` [patch 2/5] samples/vfs: avoid warning in statx override Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-03 22:15 ` [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers Andrew Morton
                   ` (229 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
  To: akpm, andreas.schaufler, aslan, guro, Jonathan.Cameron, js1304,
	linux-mm, mhocko, mike.kravetz, mm-commits, riel, robin.murphy,
	song.bao.hua, stable, torvalds

From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/cma.c: use exact_nid true to fix possible per-numa cma leak

Calling cma_declare_contiguous_nid() with exact_nid=false for per-numa
reservation can easily cause a cma leak and various confusion.  For
example, mm/hugetlb.c tries to reserve per-numa cma for gigantic pages,
but it can easily leak cma and confuse users when the system has
memoryless nodes.

Suppose the system has 4 numa nodes and only node0 has memory.  If we set
hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for the 4
numa nodes.  Since exact_nid=false in the current code, all 4 numa nodes
will get cma successfully from node0, but hugetlb_cma[1] to hugetlb_cma[3]
will never be available to hugepage as mm/hugetlb.c will only allocate
memory from hugetlb_cma[0].

Now suppose the system has 4 numa nodes where only nodes 0 and 2 have
memory.  If we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma
areas for the 4 numa nodes.  Since exact_nid=false in the current code,
all 4 numa nodes will get cma successfully from node0 or node2, but
hugetlb_cma[1] and hugetlb_cma[3] will never be available to hugepage as
mm/hugetlb.c will only allocate memory from hugetlb_cma[0] and
hugetlb_cma[2].  This causes a permanent leak of the cma areas which were
supposed to be used by the memoryless nodes.

Of course we can work around the issue by letting mm/hugetlb.c scan all
cma areas in alloc_gigantic_page() even when node_mask includes node0
only; that way, we could get a page from hugetlb_cma[1] to hugetlb_cma[3].
But this would cause a kernel crash in free_gigantic_page() when it tries
to free the page via:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)

On the other hand, exact_nid=false doesn't consider numa distance, and it
might not be that useful to leverage cma areas on remote nodes.  It is
much simpler to make exact_nid true and make everything clear.  After
that, memoryless nodes won't be able to reserve per-numa CMA from other
nodes which have memory.
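
The mechanism behind the leak, roughly: memblock_alloc_range_nid() falls
back to any node when the requested node has no memory and exact_nid is
false.  A condensed, editorial paraphrase of mm/memblock.c (the real
function also handles mirrored memory and performs the reservation):

    phys_addr_t memblock_alloc_range_nid(phys_addr_t size, phys_addr_t align,
                                         phys_addr_t start, phys_addr_t end,
                                         int nid, bool exact_nid)
    {
            phys_addr_t found;

            found = memblock_find_in_range_node(size, align, start, end,
                                                nid, MEMBLOCK_NONE);
            if (!found && !exact_nid && nid != NUMA_NO_NODE)
                    /* silent fallback to any node -- with per-numa cma,
                     * this is what strands areas on memoryless nodes */
                    found = memblock_find_in_range_node(size, align, start,
                                                        end, NUMA_NO_NODE,
                                                        MEMBLOCK_NONE);
            return found;   /* reservation step omitted */
    }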

Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/cma.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/cma.c~mm-cmac-use-exact_nid-true-to-fix-possible-per-numa-cma-leak
+++ a/mm/cma.c
@@ -339,13 +339,13 @@ int __init cma_declare_contiguous_nid(ph
 		 */
 		if (base < highmem_start && limit > highmem_start) {
 			addr = memblock_alloc_range_nid(size, alignment,
-					highmem_start, limit, nid, false);
+					highmem_start, limit, nid, true);
 			limit = highmem_start;
 		}
 
 		if (!addr) {
 			addr = memblock_alloc_range_nid(size, alignment, base,
-					limit, nid, false);
+					limit, nid, true);
 			if (!addr) {
 				ret = -ENOMEM;
 				goto err;
_


* [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers
  2020-07-03 22:14 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-03 22:15 ` [patch 5/5] mm/page_alloc: fix documentation error Andrew Morton
                   ` (228 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
  To: akpm, ardb, hch, linux-mm, mm-commits, torvalds

From: Christoph Hellwig <hch@lst.de>
Subject: vmalloc: fix the owner argument for the new __vmalloc_node_range callers

Fix the recently added new __vmalloc_node_range callers to pass the
correct values as the owner for display in /proc/vmallocinfo.
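
For context, /proc/vmallocinfo resolves the owner pointer as a symbol,
roughly (paraphrased from s_show() in mm/vmalloc.c):

    if (v->caller)
            seq_printf(m, " %pS", v->caller);   /* expects a code address */

__builtin_return_address(0) yields the calling function's address, which
%pS renders as a symbol name plus offset; __func__ is a pointer to a
string constant in .rodata, which %pS cannot map back to any allocating
function.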

Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de
Fixes: 800e26b81311 ("x86/hyperv: allocate the hypercall page with only read and execute bits")
Fixes: 10d5e97c1bf8 ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page")
Fixes: 7a0e27b2a0ce ("mm: remove vmalloc_exec")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/kernel/probes/kprobes.c |    2 +-
 arch/x86/hyperv/hv_init.c          |    3 ++-
 kernel/module.c                    |    2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

--- a/arch/arm64/kernel/probes/kprobes.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/arch/arm64/kernel/probes/kprobes.c
@@ -122,7 +122,7 @@ void *alloc_insn_page(void)
 {
 	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
 			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
-			NUMA_NO_NODE, __func__);
+			NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 /* arm kprobe: install breakpoint in text */
--- a/arch/x86/hyperv/hv_init.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/arch/x86/hyperv/hv_init.c
@@ -377,7 +377,8 @@ void __init hyperv_init(void)
 
 	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
 			VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX,
-			VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, __func__);
+			VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
+			__builtin_return_address(0));
 	if (hv_hypercall_pg == NULL) {
 		wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
 		goto remove_cpuhp_state;
--- a/kernel/module.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/kernel/module.c
@@ -2785,7 +2785,7 @@ void * __weak module_alloc(unsigned long
 {
 	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
 			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
-			NUMA_NO_NODE, __func__);
+			NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 bool __weak module_init_section(const char *name)
_


* [patch 5/5] mm/page_alloc: fix documentation error
  2020-07-03 22:14 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2020-07-03 22:15 ` [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
                   ` (227 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
  To: akpm, aquini, fdangelo, jsavitz, linux-mm, mm-commits, torvalds, willy

From: Joel Savitz <jsavitz@redhat.com>
Subject: mm/page_alloc: fix documentation error

When I increased the upper bound of the min_free_kbytes value in
ee8eb9a5fe863 ("mm/page_alloc: increase default min_free_kbytes bound") I
forgot to tweak the above comment to reflect the new value.  This patch
fixes that mistake.

Link: http://lkml.kernel.org/r/20200624221236.29560-1-jsavitz@redhat.com
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-fix-documentation-error
+++ a/mm/page_alloc.c
@@ -7832,7 +7832,7 @@ void setup_per_zone_wmarks(void)
  * Initialise min_free_kbytes.
  *
  * For small machines we want it small (128k min).  For large machines
- * we want it large (64MB max).  But it is not linear, because network
+ * we want it large (256MB max).  But it is not linear, because network
  * bandwidth does not increase linearly with machine size.  We use
  *
  *	min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
_
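
Sanity-checking the corrected figure with the formula in the same comment
(editorial arithmetic, not part of the patch):

    min_free_kbytes = 4 * sqrt(lowmem_kbytes), clamped to [128, 262144] kB

    16 GB of lowmem: 4 * sqrt(16777216) = 16384 kB (16 MB)
    the 256 MB ceiling (262144 kB) is reached near
    (262144 / 4)^2 kB = 4 TB of lowmem, whereas the old
    64 MB ceiling was already reached at (65536 / 4)^2 kB = 256 GB.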


* + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2020-07-03 22:15 ` [patch 5/5] mm/page_alloc: fix documentation error Andrew Morton
@ 2020-07-06 22:41 ` Andrew Morton
  2020-07-06 22:41   ` Andrew Morton
  2020-07-06 22:46 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch " Andrew Morton
                   ` (226 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:41 UTC (permalink / raw)
  To: aryabinin, dvyukov, glider, mark.rutland, mm-commits, vincenzo.frascino


The patch titled
     Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
has been added to the -mm tree.  Its filename is
     kasan-remove-kasan_unpoison_stack_above_sp_to.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()

kasan_unpoison_stack_above_sp_to() is defined in kasan code but never
used.  The function was introduced as part of the commit:

   commit 9f7d416c36124667 ("kprobes: Unpoison stack in jprobe_return() for KASAN")

... where it was necessary because x86's jprobe_return() would leave
stale shadow on the stack, and was an oddity in that regard.

Since then, jprobes were removed entirely, and as of commit:

  commit 80006dbee674f9fa ("kprobes/x86: Remove jprobe implementation")

... there have been no callers of this function.

Remove the declaration and the implementation.

Link: http://lkml.kernel.org/r/20200706143505.23299-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kasan.h |    2 --
 mm/kasan/common.c     |   15 ---------------
 2 files changed, 17 deletions(-)

--- a/include/linux/kasan.h~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/include/linux/kasan.h
@@ -38,7 +38,6 @@ extern void kasan_disable_current(void);
 void kasan_unpoison_shadow(const void *address, size_t size);
 
 void kasan_unpoison_task_stack(struct task_struct *task);
-void kasan_unpoison_stack_above_sp_to(const void *watermark);
 
 void kasan_alloc_pages(struct page *page, unsigned int order);
 void kasan_free_pages(struct page *page, unsigned int order);
@@ -101,7 +100,6 @@ void kasan_restore_multi_shot(bool enabl
 static inline void kasan_unpoison_shadow(const void *address, size_t size) {}
 
 static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
-static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {}
 
 static inline void kasan_enable_current(void) {}
 static inline void kasan_disable_current(void) {}
--- a/mm/kasan/common.c~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/mm/kasan/common.c
@@ -180,21 +180,6 @@ asmlinkage void kasan_unpoison_task_stac
 	kasan_unpoison_shadow(base, watermark - base);
 }
 
-/*
- * Clear all poison for the region between the current SP and a provided
- * watermark value, as is sometimes required prior to hand-crafted asm function
- * returns in the middle of functions.
- */
-void kasan_unpoison_stack_above_sp_to(const void *watermark)
-{
-	const void *sp = __builtin_frame_address(0);
-	size_t size = watermark - sp;
-
-	if (WARN_ON(sp > watermark))
-		return;
-	kasan_unpoison_shadow(sp, size);
-}
-
 void kasan_alloc_pages(struct page *page, unsigned int order)
 {
 	u8 tag;
_

* + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree
  2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
@ 2020-07-06 22:41   ` Andrew Morton
  0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:41 UTC (permalink / raw)
  To: aryabinin, dvyukov, glider, mark.rutland, mm-commits, vincenzo.frascino


The patch titled
     Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
has been added to the -mm tree.  Its filename is
     kasan-remove-kasan_unpoison_stack_above_sp_to.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()

kasan_unpoison_stack_above_sp_to() is defined in kasan code but never
used.  The function was introduced as part of the commit:

   commit 9f7d416c36124667 ("kprobes: Unpoison stack in jprobe_return() for KASAN")

... where it was necessary because x86's jprobe_return() would leave
stale shadow on the stack, and was an oddity in that regard.

Since then, jprobes were removed entirely, and as of commit:

  commit 80006dbee674f9fa ("kprobes/x86: Remove jprobe implementation")

... there have been no callers of this function.

Remove the declaration and the implementation.

Link: http://lkml.kernel.org/r/20200706143505.23299-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kasan.h |    2 --
 mm/kasan/common.c     |   15 ---------------
 2 files changed, 17 deletions(-)

--- a/include/linux/kasan.h~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/include/linux/kasan.h
@@ -38,7 +38,6 @@ extern void kasan_disable_current(void);
 void kasan_unpoison_shadow(const void *address, size_t size);
 
 void kasan_unpoison_task_stack(struct task_struct *task);
-void kasan_unpoison_stack_above_sp_to(const void *watermark);
 
 void kasan_alloc_pages(struct page *page, unsigned int order);
 void kasan_free_pages(struct page *page, unsigned int order);
@@ -101,7 +100,6 @@ void kasan_restore_multi_shot(bool enabl
 static inline void kasan_unpoison_shadow(const void *address, size_t size) {}
 
 static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
-static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {}
 
 static inline void kasan_enable_current(void) {}
 static inline void kasan_disable_current(void) {}
--- a/mm/kasan/common.c~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/mm/kasan/common.c
@@ -180,21 +180,6 @@ asmlinkage void kasan_unpoison_task_stac
 	kasan_unpoison_shadow(base, watermark - base);
 }
 
-/*
- * Clear all poison for the region between the current SP and a provided
- * watermark value, as is sometimes required prior to hand-crafted asm function
- * returns in the middle of functions.
- */
-void kasan_unpoison_stack_above_sp_to(const void *watermark)
-{
-	const void *sp = __builtin_frame_address(0);
-	size_t size = watermark - sp;
-
-	if (WARN_ON(sp > watermark))
-		return;
-	kasan_unpoison_shadow(sp, size);
-}
-
 void kasan_alloc_pages(struct page *page, unsigned int order)
 {
 	u8 tag;
_

Patches currently in -mm which might be from vincenzo.frascino@arm.com are

kasan-remove-kasan_unpoison_stack_above_sp_to.patch



* + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
@ 2020-07-06 22:46 ` Andrew Morton
  2020-07-06 22:49 ` + lib-test_bitops-do-the-full-test-during-module-init.patch " Andrew Morton
                   ` (225 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:46 UTC (permalink / raw)
  To: andreyknvl, aryabinin, dvyukov, glider, matthias.bgg, mm-commits,
	walter-zh.wu


The patch titled
     Subject: lib/test_kasan.c: fix KASAN unit tests for tag-based KASAN
has been added to the -mm tree.  Its filename is
     kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Walter Wu <walter-zh.wu@mediatek.com>
Subject: lib/test_kasan.c: fix KASAN unit tests for tag-based KASAN

With tag-based KASAN, the KASAN unit tests do not detect out-of-bounds
memory accesses.  They need to be fixed.

With tag-based KASAN, the state of each 16-byte-aligned granule of memory
is encoded in one shadow byte, and the shadow value is the tag of the
pointer.  For an out-of-bounds access to be detected, it has to reach the
next granule, whose shadow byte does not match the tag value of the
pointer; only then does tag-based KASAN report it.
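
Concretely (an editorial sketch of the granule arithmetic, not from the
patch):

    #define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 13)

    /* Generic KASAN tracks validity per byte, so ptr[size] is already
     * out of bounds.  Tag-based KASAN tags 16-byte granules: for
     * kmalloc(123), bytes 112..127 share the allocation's tag, so
     * ptr[123] still matches and goes unreported, while
     * ptr[123 + OOB_TAG_OFF] = ptr[136] lands in the next granule,
     * whose tag differs, and is caught. */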

Link: http://lkml.kernel.org/r/20200706115039.16750-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_kasan.c |   47 ++++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 17 deletions(-)

--- a/lib/test_kasan.c~kasan-fix-kasan-unit-tests-for-tag-based-kasan
+++ a/lib/test_kasan.c
@@ -23,6 +23,8 @@
 
 #include <asm/page.h>
 
+#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 13)
+
 /*
  * We assign some test results to these globals to make sure the tests
  * are not eliminated as dead code.
@@ -48,7 +50,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	ptr[size] = 'x';
+	ptr[size + OOB_TAG_OFF] = 'x';
+
 	kfree(ptr);
 }
 
@@ -100,7 +103,8 @@ static noinline void __init kmalloc_page
 		return;
 	}
 
-	ptr[size] = 0;
+	ptr[size + OOB_TAG_OFF] = 0;
+
 	kfree(ptr);
 }
 
@@ -170,7 +174,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	ptr2[size2] = 'x';
+	ptr2[size2 + OOB_TAG_OFF] = 'x';
+
 	kfree(ptr2);
 }
 
@@ -188,7 +193,9 @@ static noinline void __init kmalloc_oob_
 		kfree(ptr1);
 		return;
 	}
-	ptr2[size2] = 'x';
+
+	ptr2[size2 + OOB_TAG_OFF] = 'x';
+
 	kfree(ptr2);
 }
 
@@ -224,7 +231,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	memset(ptr+7, 0, 2);
+	memset(ptr + 7 + OOB_TAG_OFF, 0, 2);
+
 	kfree(ptr);
 }
 
@@ -240,7 +248,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	memset(ptr+5, 0, 4);
+	memset(ptr + 5 + OOB_TAG_OFF, 0, 4);
+
 	kfree(ptr);
 }
 
@@ -257,7 +266,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	memset(ptr+1, 0, 8);
+	memset(ptr + 1 + OOB_TAG_OFF, 0, 8);
+
 	kfree(ptr);
 }
 
@@ -273,7 +283,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	memset(ptr+1, 0, 16);
+	memset(ptr + 1 + OOB_TAG_OFF, 0, 16);
+
 	kfree(ptr);
 }
 
@@ -289,7 +300,8 @@ static noinline void __init kmalloc_oob_
 		return;
 	}
 
-	memset(ptr, 0, size+5);
+	memset(ptr, 0, size + 5 + OOB_TAG_OFF);
+
 	kfree(ptr);
 }
 
@@ -423,7 +435,8 @@ static noinline void __init kmem_cache_o
 		return;
 	}
 
-	*p = p[size];
+	*p = p[size + OOB_TAG_OFF];
+
 	kmem_cache_free(cache, p);
 	kmem_cache_destroy(cache);
 }
@@ -520,25 +533,25 @@ static noinline void __init copy_user_te
 	}
 
 	pr_info("out-of-bounds in copy_from_user()\n");
-	unused = copy_from_user(kmem, usermem, size + 1);
+	unused = copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in copy_to_user()\n");
-	unused = copy_to_user(usermem, kmem, size + 1);
+	unused = copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in __copy_from_user()\n");
-	unused = __copy_from_user(kmem, usermem, size + 1);
+	unused = __copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in __copy_to_user()\n");
-	unused = __copy_to_user(usermem, kmem, size + 1);
+	unused = __copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in __copy_from_user_inatomic()\n");
-	unused = __copy_from_user_inatomic(kmem, usermem, size + 1);
+	unused = __copy_from_user_inatomic(kmem, usermem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in __copy_to_user_inatomic()\n");
-	unused = __copy_to_user_inatomic(usermem, kmem, size + 1);
+	unused = __copy_to_user_inatomic(usermem, kmem, size + 1 + OOB_TAG_OFF);
 
 	pr_info("out-of-bounds in strncpy_from_user()\n");
-	unused = strncpy_from_user(kmem, usermem, size + 1);
+	unused = strncpy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
 
 	vm_munmap((unsigned long)usermem, PAGE_SIZE);
 	kfree(kmem);
_

Patches currently in -mm which might be from walter-zh.wu@mediatek.com are

kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
rcu-kasan-record-and-print-call_rcu-call-stack.patch
kasan-record-and-print-the-free-track.patch
kasan-add-tests-for-call_rcu-stack-recording.patch
kasan-update-documentation-for-generic-kasan.patch


* + lib-test_bitops-do-the-full-test-during-module-init.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2020-07-06 22:46 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch " Andrew Morton
@ 2020-07-06 22:49 ` Andrew Morton
  2020-07-06 23:03 ` [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from " Andrew Morton
                   ` (224 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:49 UTC (permalink / raw)
  To: andriy.shevchenko, geert, jesse.brandeburg, mm-commits, richard.weiyang


The patch titled
     Subject: lib/test_bitops: do the full test during module init
has been added to the -mm tree.  Its filename is
     lib-test_bitops-do-the-full-test-during-module-init.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/lib-test_bitops-do-the-full-test-during-module-init.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/lib-test_bitops-do-the-full-test-during-module-init.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Geert Uytterhoeven <geert@linux-m68k.org>
Subject: lib/test_bitops: do the full test during module init

Currently, the bitops test consists of two parts: one part is executed
during module load, the second part during module unload.  This is
cumbersome for the user, as he has to perform two steps to execute all
tests, and is different from most (all?) other tests.

Merge the two parts, so both are executed during module load.
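
The resulting shape of the module, condensed from the diff below:

    static int __init test_bitops_startup(void)
    {
            pr_info("Starting bitops test\n");
            /* set bits, run the order_comb/order_comb_long checks ... */
            barrier();
            /* ... clear bits, verify no stray bit is left set */
            pr_info("Completed bitops test\n");
            return 0;
    }

    static void __exit test_bitops_unstartup(void)
    {
    }

    module_init(test_bitops_startup);
    module_exit(test_bitops_unstartup);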

Link: http://lkml.kernel.org/r/20200706112900.7097-1-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_bitops.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

--- a/lib/test_bitops.c~lib-test_bitops-do-the-full-test-during-module-init
+++ a/lib/test_bitops.c
@@ -52,9 +52,9 @@ static unsigned long order_comb_long[][2
 
 static int __init test_bitops_startup(void)
 {
-	int i;
+	int i, bit_set;
 
-	pr_warn("Loaded test module\n");
+	pr_info("Starting bitops test\n");
 	set_bit(BITOPS_4, g_bitmap);
 	set_bit(BITOPS_7, g_bitmap);
 	set_bit(BITOPS_11, g_bitmap);
@@ -81,12 +81,8 @@ static int __init test_bitops_startup(vo
 				       order_comb_long[i][0]);
 	}
 #endif
-	return 0;
-}
 
-static void __exit test_bitops_unstartup(void)
-{
-	int bit_set;
+	barrier();
 
 	clear_bit(BITOPS_4, g_bitmap);
 	clear_bit(BITOPS_7, g_bitmap);
@@ -98,7 +94,13 @@ static void __exit test_bitops_unstartup
 	if (bit_set != BITOPS_LAST)
 		pr_err("ERROR: FOUND SET BIT %d\n", bit_set);
 
-	pr_warn("Unloaded test module\n");
+	pr_info("Completed bitops test\n");
+
+	return 0;
+}
+
+static void __exit test_bitops_unstartup(void)
+{
 }
 
 module_init(test_bitops_startup);
_

Patches currently in -mm which might be from geert@linux-m68k.org are

lib-test_bitops-do-the-full-test-during-module-init.patch


* [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2020-07-06 22:49 ` + lib-test_bitops-do-the-full-test-during-module-init.patch " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
  2020-07-06 23:03 ` [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
                   ` (223 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, walken, willy, yang.shi


The patch titled
     Subject: mm/mremap: format the check in move_normal_pmd() same as move_huge_pmd()
has been removed from the -mm tree.  Its filename was
     mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch

This patch was dropped because it was nacked

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: format the check in move_normal_pmd() same as move_huge_pmd()

Patch series "mm/mremap: cleanup move_page_tables() a little", v2.

move_page_tables() tries to move page tables by PMD or by PTE.

The root reason is that if it tries to move a PMD, both the old and the
new range must be PMD aligned.  But the current code calculates the old
range and the new range separately.  This leads to some redundant checks
and calculations.

This cleanup consolidates the range check in one place to reduce some
extra range handling.


This patch (of 4):

No functional change, just improve the readability and prepare for
following cleanup.

Link: http://lkml.kernel.org/r/20200626135216.24314-1-richard.weiyang@linux.alibaba.com
Link: http://lkml.kernel.org/r/20200626135216.24314-2-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mremap.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/mm/mremap.c~mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd
+++ a/mm/mremap.c
@@ -200,8 +200,9 @@ static bool move_normal_pmd(struct vm_ar
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
-	    || old_end - old_addr < PMD_SIZE)
+	if ((old_addr & ~PMD_MASK) ||
+	    (new_addr & ~PMD_MASK) ||
+	    old_end - old_addr < PMD_SIZE)
 		return false;
 
 	/*
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch


* [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2020-07-06 23:03 ` [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
  2020-07-06 23:03 ` [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
                   ` (222 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, walken, willy, yang.shi


The patch titled
     Subject: mm/mremap: it is sure to have enough space when extent meets requirement
has been removed from the -mm tree.  Its filename was
     mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: it is sure to have enough space when extent meets requirement

old_end is passed to these two functions to check whether there is enough
space to do the move, but this check is already done before invoking these
functions.

These two functions would only be invoked when extent meets the
requirement, and there is one check before invoking them:

    if (extent > old_end - old_addr)
        extent = old_end - old_addr;

This implies that (old_end - old_addr) can never fail the check in these
two functions.
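
Spelling the implication out (editorial):

    extent <= old_end - old_addr     (clamped by the caller, as quoted)
    extent == PMD_SIZE               (precondition for calling
                                      move_normal_pmd(); HPAGE_PMD_SIZE
                                      for move_huge_pmd())
    =>  old_end - old_addr >= PMD_SIZE, so the removed check never fires.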

Link: http://lkml.kernel.org/r/20200626135216.24314-3-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/huge_mm.h |    2 +-
 mm/huge_memory.c        |    7 ++-----
 mm/mremap.c             |   11 ++++-------
 3 files changed, 7 insertions(+), 13 deletions(-)

--- a/include/linux/huge_mm.h~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/include/linux/huge_mm.h
@@ -42,7 +42,7 @@ extern int mincore_huge_pmd(struct vm_ar
 			unsigned long addr, unsigned long end,
 			unsigned char *vec);
 extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-			 unsigned long new_addr, unsigned long old_end,
+			 unsigned long new_addr,
 			 pmd_t *old_pmd, pmd_t *new_pmd);
 extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			unsigned long addr, pgprot_t newprot,
--- a/mm/huge_memory.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/huge_memory.c
@@ -1722,17 +1722,14 @@ static pmd_t move_soft_dirty_pmd(pmd_t p
 }
 
 bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-		  unsigned long new_addr, unsigned long old_end,
-		  pmd_t *old_pmd, pmd_t *new_pmd)
+		  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
 {
 	spinlock_t *old_ptl, *new_ptl;
 	pmd_t pmd;
 	struct mm_struct *mm = vma->vm_mm;
 	bool force_flush = false;
 
-	if ((old_addr & ~HPAGE_PMD_MASK) ||
-	    (new_addr & ~HPAGE_PMD_MASK) ||
-	    old_end - old_addr < HPAGE_PMD_SIZE)
+	if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
 		return false;
 
 	/*
--- a/mm/mremap.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/mremap.c
@@ -193,16 +193,13 @@ static void move_ptes(struct vm_area_str
 
 #ifdef CONFIG_HAVE_MOVE_PMD
 static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-		  unsigned long new_addr, unsigned long old_end,
-		  pmd_t *old_pmd, pmd_t *new_pmd)
+		  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
 {
 	spinlock_t *old_ptl, *new_ptl;
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) ||
-	    (new_addr & ~PMD_MASK) ||
-	    old_end - old_addr < PMD_SIZE)
+	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
 		return false;
 
 	/*
@@ -274,7 +271,7 @@ unsigned long move_page_tables(struct vm
 				if (need_rmap_locks)
 					take_rmap_locks(vma);
 				moved = move_huge_pmd(vma, old_addr, new_addr,
-						    old_end, old_pmd, new_pmd);
+						      old_pmd, new_pmd);
 				if (need_rmap_locks)
 					drop_rmap_locks(vma);
 				if (moved)
@@ -294,7 +291,7 @@ unsigned long move_page_tables(struct vm
 			if (need_rmap_locks)
 				take_rmap_locks(vma);
 			moved = move_normal_pmd(vma, old_addr, new_addr,
-					old_end, old_pmd, new_pmd);
+						old_pmd, new_pmd);
 			if (need_rmap_locks)
 				drop_rmap_locks(vma);
 			if (moved)
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch


* [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2020-07-06 23:03 ` [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
  2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
                   ` (221 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, walken, willy, yang.shi


The patch titled
     Subject: mm/mremap: calculate extent in one place
has been removed from the -mm tree.  Its filename was
     mm-mremap-calculate-extent-in-one-place.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: calculate extent in one place

Page tables are moved at PMD granularity.  This requires that both the
source and the destination range meet the alignment requirement.

The current code works since move_huge_pmd() and move_normal_pmd() check
old_addr and new_addr again, and fall back to move_ptes() if either of
them is not aligned.

Instead of calculating the extent separately, it is better to calculate
it in one place, so we know when it is not necessary to try moving a pmd.
By doing so, the logic becomes a little clearer.
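
After the change, the step is clamped against both ranges in one place
(sketch, condensed from the diff below):

    extent = next - old_addr;               /* to the next old-side PMD */
    if (extent > old_end - old_addr)
            extent = old_end - old_addr;    /* remaining length */
    next = (new_addr + PMD_SIZE) & PMD_MASK;
    if (extent > next - new_addr)
            extent = next - new_addr;       /* to the next new-side PMD */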

Link: http://lkml.kernel.org/r/20200626135216.24314-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mremap.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/mremap.c~mm-mremap-calculate-extent-in-one-place
+++ a/mm/mremap.c
@@ -258,6 +258,9 @@ unsigned long move_page_tables(struct vm
 		extent = next - old_addr;
 		if (extent > old_end - old_addr)
 			extent = old_end - old_addr;
+		next = (new_addr + PMD_SIZE) & PMD_MASK;
+		if (extent > next - new_addr)
+			extent = next - new_addr;
 		old_pmd = get_old_pmd(vma->vm_mm, old_addr);
 		if (!old_pmd)
 			continue;
@@ -301,9 +304,6 @@ unsigned long move_page_tables(struct vm
 
 		if (pte_alloc(new_vma->vm_mm, new_pmd))
 			break;
-		next = (new_addr + PMD_SIZE) & PMD_MASK;
-		if (extent > next - new_addr)
-			extent = next - new_addr;
 		move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma,
 			  new_pmd, new_addr, need_rmap_locks);
 	}
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch


* [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2020-07-06 23:03 ` [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
@ 2020-07-06 23:04 ` Andrew Morton
  2020-07-06 23:04   ` Andrew Morton
  2020-07-06 23:15 ` + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to " Andrew Morton
                   ` (220 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:04 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, walken, willy, yang.shi


The patch titled
     Subject: mm/mremap: start addresses are properly aligned
has been removed from the -mm tree.  Its filename was
     mm-mremap-start-addresses-are-properly-aligned.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned

After the previous cleanup, extent is the minimal step for both source
and destination.  This means that when extent is HPAGE_PMD_SIZE or
PMD_SIZE, old_addr and new_addr are properly aligned too.

Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.

Link: http://lkml.kernel.org/r/20200626135216.24314-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    3 ---
 mm/mremap.c      |    3 ---
 2 files changed, 6 deletions(-)

--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
 	struct mm_struct *mm = vma->vm_mm;
 	bool force_flush = false;
 
-	if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
 	 */
_

* [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch removed from -mm tree
  2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-06 23:04   ` Andrew Morton
  0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:04 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, walken, willy, yang.shi


The patch titled
     Subject: mm/mremap: start addresses are properly aligned
has been removed from the -mm tree.  Its filename was
     mm-mremap-start-addresses-are-properly-aligned.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned

After the previous cleanup, extent is the minimal step for both source
and destination.  This means that when extent is HPAGE_PMD_SIZE or
PMD_SIZE, old_addr and new_addr are properly aligned too.

Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.

Link: http://lkml.kernel.org/r/20200626135216.24314-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    3 ---
 mm/mremap.c      |    3 ---
 2 files changed, 6 deletions(-)

--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
 	struct mm_struct *mm = vma->vm_mm;
 	bool force_flush = false;
 
-	if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch



* + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-06 23:15 ` Andrew Morton
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch " Andrew Morton
                   ` (219 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:15 UTC (permalink / raw)
  To: david, mm-commits, penberg, songmuchun


The patch titled
     Subject: mm/page_alloc.c: skip setting nodemask when we are in interrupt
has been added to the -mm tree.  Its filename is
     mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Muchun Song <songmuchun@bytedance.com>
Subject: mm/page_alloc.c: skip setting nodemask when we are in interrupt

When we are in interrupt context, the allocation has no relation to the
current task's context.  If we nevertheless use the current task's
mems_allowed, the fast path allocation is confined to the nodes in that
mems_allowed and falls back to the slow path whenever those nodes are
short of memory, even though other nodes could satisfy the request.  This
slows down memory allocation from interrupt context.  So we can skip
setting the nodemask in that case and allow any node to be used, so that
the fast path allocation can succeed.

Link: http://lkml.kernel.org/r/20200706025921.53683-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt
+++ a/mm/page_alloc.c
@@ -4790,7 +4790,11 @@ static inline bool prepare_alloc_pages(g
 
 	if (cpusets_enabled()) {
 		*alloc_mask |= __GFP_HARDWALL;
-		if (!ac->nodemask)
+		/*
+		 * When we are in interrupt context, the allocation has no
+		 * relation to the current task context, so any node is OK.
+		 */
+		if (!in_interrupt() && !ac->nodemask)
 			ac->nodemask = &cpuset_current_mems_allowed;
 		else
 			*alloc_flags |= ALLOC_CPUSET;
_

Patches currently in -mm which might be from songmuchun@bytedance.com are

mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (12 preceding siblings ...)
  2020-07-06 23:15 ` + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch " Andrew Morton
                   ` (218 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: mm/debug_vm_pgtable: add tests validating arch helpers for core MM features
has been added to the -mm tree.  Its filename is
     mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add tests validating arch helpers for core MM features

Patch series "mm/debug_vm_pgtable: Add some more tests", v4.

This series adds some more arch page table helper validation tests related
to core and advanced memory functions.  It also adds documentation listing
the expected semantics for all page table helpers, as suggested by Mike
Rapoport previously (https://lkml.org/lkml/2020/1/30/40).

There are many CONFIG_TRANSPARENT_HUGEPAGE and
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD ifdefs scattered across the
test.  But consolidating all the fallback stubs is not very
straightforward because CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD is not
explicitly dependent on CONFIG_TRANSPARENT_HUGEPAGE.
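
To make this concrete, the sketch below distills the fallback-stub
pattern from the patch itself (it is not additional kernel code).
Because the PUD option is not explicitly dependent on the PMD-level
one, every level needs its own stub in the right #else branch:

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot)
{
	/* real PMD THP test */
}

#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot)
{
	/* real PUD THP test */
}
#else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
#else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot) { }
static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */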

Tested on arm64 and x86 platforms, but only build tested on all the other
platforms enabled through ARCH_HAS_DEBUG_VM_PGTABLE, i.e. powerpc, arc and
s390.  The previously mentioned failure on arm64 still exists; it will be
fixed by the upcoming THP migration enablement series for arm64.

WARNING .... mm/debug_vm_pgtable.c:860 debug_vm_pgtable+0x940/0xa54
WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))))


This patch (of 4):

This adds new tests validating arch page table helpers for the following
core memory features.  The tests create and exercise specific mapping
types at various page table levels.

1. SPECIAL mapping
2. PROTNONE mapping
3. DEVMAP mapping
4. SOFTDIRTY mapping
5. SWAP mapping
6. MIGRATION mapping
7. HUGETLB mapping
8. THP mapping

Link: http://lkml.kernel.org/r/1593996516-7186-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1593996516-7186-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>	[arc]
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug_vm_pgtable.c |  302 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 301 insertions(+), 1 deletion(-)

--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features
+++ a/mm/debug_vm_pgtable.c
@@ -282,6 +282,278 @@ static void __init pmd_populate_tests(st
 	WARN_ON(pmd_bad(pmd));
 }
 
+static void __init pte_special_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL))
+		return;
+
+	WARN_ON(!pte_special(pte_mkspecial(pte)));
+}
+
+static void __init pte_protnone_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+		return;
+
+	WARN_ON(!pte_protnone(pte));
+	WARN_ON(!pte_present(pte));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_protnone_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
+
+	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+		return;
+
+	WARN_ON(!pmd_protnone(pmd));
+	WARN_ON(!pmd_present(pmd));
+}
+#else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_protnone_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
+static void __init pte_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	WARN_ON(!pmd_devmap(pmd_mkdevmap(pmd)));
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+	pud_t pud = pfn_pud(pfn, prot);
+
+	WARN_ON(!pud_devmap(pud_mkdevmap(pud)));
+}
+#else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#else  /* CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#else
+static void __init pte_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
+
+static void __init pte_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+		return;
+
+	WARN_ON(!pte_soft_dirty(pte_mksoft_dirty(pte)));
+	WARN_ON(pte_soft_dirty(pte_clear_soft_dirty(pte)));
+}
+
+static void __init pte_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+		return;
+
+	WARN_ON(!pte_swp_soft_dirty(pte_swp_mksoft_dirty(pte)));
+	WARN_ON(pte_swp_soft_dirty(pte_swp_clear_soft_dirty(pte)));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+		return;
+
+	WARN_ON(!pmd_soft_dirty(pmd_mksoft_dirty(pmd)));
+	WARN_ON(pmd_soft_dirty(pmd_clear_soft_dirty(pmd)));
+}
+
+static void __init pmd_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) ||
+		!IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
+		return;
+
+	WARN_ON(!pmd_swp_soft_dirty(pmd_swp_mksoft_dirty(pmd)));
+	WARN_ON(pmd_swp_soft_dirty(pmd_swp_clear_soft_dirty(pmd)));
+}
+#else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_soft_dirty_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+static void __init pte_swap_tests(unsigned long pfn, pgprot_t prot)
+{
+	swp_entry_t swp;
+	pte_t pte;
+
+	pte = pfn_pte(pfn, prot);
+	swp = __pte_to_swp_entry(pte);
+	pte = __swp_entry_to_pte(swp);
+	WARN_ON(pfn != pte_pfn(pte));
+}
+
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot)
+{
+	swp_entry_t swp;
+	pmd_t pmd;
+
+	pmd = pfn_pmd(pfn, prot);
+	swp = __pmd_to_swp_entry(pmd);
+	pmd = __swp_entry_to_pmd(swp);
+	WARN_ON(pfn != pmd_pfn(pmd));
+}
+#else  /* !CONFIG_ARCH_ENABLE_THP_MIGRATION */
+static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
+
+static void __init swap_migration_tests(void)
+{
+	struct page *page;
+	swp_entry_t swp;
+
+	if (!IS_ENABLED(CONFIG_MIGRATION))
+		return;
+	/*
+	 * swap_migration_tests() requires a dedicated page as it needs to
+	 * be locked before creating a migration entry from it. Locking the
+	 * page that actually maps kernel text ('start_kernel') can be really
+	 * problematic. Let's allocate a dedicated page explicitly for this
+	 * purpose that will be freed subsequently.
+	 */
+	page = alloc_page(GFP_KERNEL);
+	if (!page) {
+		pr_err("page allocation failed\n");
+		return;
+	}
+
+	/*
+	 * make_migration_entry() expects the given page to be
+	 * locked, otherwise it stumbles upon a BUG_ON().
+	 */
+	__SetPageLocked(page);
+	swp = make_migration_entry(page, 1);
+	WARN_ON(!is_migration_entry(swp));
+	WARN_ON(!is_write_migration_entry(swp));
+
+	make_migration_entry_read(&swp);
+	WARN_ON(!is_migration_entry(swp));
+	WARN_ON(is_write_migration_entry(swp));
+
+	swp = make_migration_entry(page, 0);
+	WARN_ON(!is_migration_entry(swp));
+	WARN_ON(is_write_migration_entry(swp));
+	__ClearPageLocked(page);
+	__free_page(page);
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	struct page *page;
+	pte_t pte;
+
+	/*
+	 * Accessing the page associated with the pfn is safe here,
+	 * as it was previously derived from a real kernel symbol.
+	 */
+	page = pfn_to_page(pfn);
+	pte = mk_huge_pte(page, prot);
+
+	WARN_ON(!huge_pte_dirty(huge_pte_mkdirty(pte)));
+	WARN_ON(!huge_pte_write(huge_pte_mkwrite(huge_pte_wrprotect(pte))));
+	WARN_ON(huge_pte_write(huge_pte_wrprotect(huge_pte_mkwrite(pte))));
+
+#ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
+	pte = pfn_pte(pfn, prot);
+
+	WARN_ON(!pte_huge(pte_mkhuge(pte)));
+#endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
+}
+#else  /* !CONFIG_HUGETLB_PAGE */
+static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HUGETLB_PAGE */
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd;
+
+	if (!has_transparent_hugepage())
+		return;
+
+	/*
+	 * pmd_trans_huge() and pmd_present() must return positive after
+	 * MMU invalidation with pmd_mkinvalid(). This behavior is an
+	 * optimization for transparent huge page. pmd_trans_huge() must
+	 * be true if pmd_page() returns a valid THP to avoid taking the
+	 * pmd_lock when others walk over non-transhuge pmds (i.e. there
+	 * are no THPs allocated). Especially when splitting a THP and
+	 * removing the present bit from the pmd, pmd_trans_huge() still
+	 * needs to return true. pmd_present() should be true whenever
+	 * pmd_trans_huge() returns true.
+	 */
+	pmd = pfn_pmd(pfn, prot);
+	WARN_ON(!pmd_trans_huge(pmd_mkhuge(pmd)));
+
+#ifndef __HAVE_ARCH_PMDP_INVALIDATE
+	WARN_ON(!pmd_trans_huge(pmd_mkinvalid(pmd_mkhuge(pmd))));
+	WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))));
+#endif /* __HAVE_ARCH_PMDP_INVALIDATE */
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot)
+{
+	pud_t pud;
+
+	if (!has_transparent_hugepage())
+		return;
+
+	pud = pfn_pud(pfn, prot);
+	WARN_ON(!pud_trans_huge(pud_mkhuge(pud)));
+
+	/*
+	 * pud_mkinvalid() has been dropped for now. Enable back
+	 * these tests when it comes back with a modified pud_present().
+	 *
+	 * WARN_ON(!pud_trans_huge(pud_mkinvalid(pud_mkhuge(pud))));
+	 * WARN_ON(!pud_present(pud_mkinvalid(pud_mkhuge(pud))));
+	 */
+}
+#else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 static unsigned long __init get_random_vaddr(void)
 {
 	unsigned long random_vaddr, random_pages, total_user_pages;
@@ -303,7 +575,7 @@ static int __init debug_vm_pgtable(void)
 	pmd_t *pmdp, *saved_pmdp, pmd;
 	pte_t *ptep;
 	pgtable_t saved_ptep;
-	pgprot_t prot;
+	pgprot_t prot, protnone;
 	phys_addr_t paddr;
 	unsigned long vaddr, pte_aligned, pmd_aligned;
 	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
@@ -319,6 +591,12 @@ static int __init debug_vm_pgtable(void)
 	}
 
 	/*
+	 * __P000 (or even __S000) will help create page table entries with
+	 * PROT_NONE permission as required for pxx_protnone_tests().
+	 */
+	protnone = __P000;
+
+	/*
 	 * PFN for mapping at PTE level is determined from a standard kernel
 	 * text symbol. But pfns for higher page table levels are derived by
 	 * masking lower bits of this real pfn. These derived pfns might not
@@ -373,6 +651,28 @@ static int __init debug_vm_pgtable(void)
 	p4d_populate_tests(mm, p4dp, saved_pudp);
 	pgd_populate_tests(mm, pgdp, saved_p4dp);
 
+	pte_special_tests(pte_aligned, prot);
+	pte_protnone_tests(pte_aligned, protnone);
+	pmd_protnone_tests(pmd_aligned, protnone);
+
+	pte_devmap_tests(pte_aligned, prot);
+	pmd_devmap_tests(pmd_aligned, prot);
+	pud_devmap_tests(pud_aligned, prot);
+
+	pte_soft_dirty_tests(pte_aligned, prot);
+	pmd_soft_dirty_tests(pmd_aligned, prot);
+	pte_swap_soft_dirty_tests(pte_aligned, prot);
+	pmd_swap_soft_dirty_tests(pmd_aligned, prot);
+
+	pte_swap_tests(pte_aligned, prot);
+	pmd_swap_tests(pmd_aligned, prot);
+
+	swap_migration_tests();
+	hugetlb_basic_tests(pte_aligned, prot);
+
+	pmd_thp_tests(pmd_aligned, prot);
+	pud_thp_tests(pud_aligned, prot);
+
 	p4d_free(mm, saved_p4dp);
 	pud_free(mm, saved_pudp);
 	pmd_free(mm, saved_pmdp);
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch " Andrew Morton
                   ` (217 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: mm/debug_vm_pgtable: add tests validating advanced arch page table helpers
has been added to the -mm tree.  Its filename is
     mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add tests validating advanced arch page table helpers

This adds new tests validating the following advanced arch page table
helpers.  The tests create and exercise specific mapping types at various
page table levels.

1. pxxp_set_wrprotect()
2. pxxp_get_and_clear()
3. pxxp_set_access_flags()
4. pxxp_get_and_clear_full()
5. pxxp_test_and_clear_young()
6. pxx_leaf()
7. pxx_set_huge()
8. pxx_(clear|mk)_savedwrite()
9. huge_pxxp_xxx()

Link: http://lkml.kernel.org/r/1593996516-7186-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>	[arc]
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug_vm_pgtable.c |  312 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 312 insertions(+)

--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers
+++ a/mm/debug_vm_pgtable.c
@@ -21,6 +21,7 @@
 #include <linux/module.h>
 #include <linux/pfn_t.h>
 #include <linux/printk.h>
+#include <linux/pgtable.h>
 #include <linux/random.h>
 #include <linux/spinlock.h>
 #include <linux/swap.h>
@@ -28,6 +29,7 @@
 #include <linux/start_kernel.h>
 #include <linux/sched/mm.h>
 #include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
 
 #define VMFLAGS	(VM_READ|VM_WRITE|VM_EXEC)
 
@@ -55,6 +57,55 @@ static void __init pte_basic_tests(unsig
 	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
 }
 
+static void __init pte_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pte_t *ptep,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	pte = pfn_pte(pfn, prot);
+	set_pte_at(mm, vaddr, ptep, pte);
+	ptep_set_wrprotect(mm, vaddr, ptep);
+	pte = ptep_get(ptep);
+	WARN_ON(pte_write(pte));
+
+	pte = pfn_pte(pfn, prot);
+	set_pte_at(mm, vaddr, ptep, pte);
+	ptep_get_and_clear(mm, vaddr, ptep);
+	pte = ptep_get(ptep);
+	WARN_ON(!pte_none(pte));
+
+	pte = pfn_pte(pfn, prot);
+	pte = pte_wrprotect(pte);
+	pte = pte_mkclean(pte);
+	set_pte_at(mm, vaddr, ptep, pte);
+	pte = pte_mkwrite(pte);
+	pte = pte_mkdirty(pte);
+	ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
+	pte = ptep_get(ptep);
+	WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
+
+	pte = pfn_pte(pfn, prot);
+	set_pte_at(mm, vaddr, ptep, pte);
+	ptep_get_and_clear_full(mm, vaddr, ptep, 1);
+	pte = ptep_get(ptep);
+	WARN_ON(!pte_none(pte));
+
+	pte = pte_mkyoung(pte);
+	set_pte_at(mm, vaddr, ptep, pte);
+	ptep_test_and_clear_young(vma, vaddr, ptep);
+	pte = ptep_get(ptep);
+	WARN_ON(pte_young(pte));
+}
+
+static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
+	WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
+}
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 {
@@ -77,6 +128,90 @@ static void __init pmd_basic_tests(unsig
 	WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
 }
 
+static void __init pmd_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pmd_t *pmdp,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	if (!has_transparent_hugepage())
+		return;
+
+	/* Align the address wrt HPAGE_PMD_SIZE */
+	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
+
+	pmd = pfn_pmd(pfn, prot);
+	set_pmd_at(mm, vaddr, pmdp, pmd);
+	pmdp_set_wrprotect(mm, vaddr, pmdp);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(pmd_write(pmd));
+
+	pmd = pfn_pmd(pfn, prot);
+	set_pmd_at(mm, vaddr, pmdp, pmd);
+	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(!pmd_none(pmd));
+
+	pmd = pfn_pmd(pfn, prot);
+	pmd = pmd_wrprotect(pmd);
+	pmd = pmd_mkclean(pmd);
+	set_pmd_at(mm, vaddr, pmdp, pmd);
+	pmd = pmd_mkwrite(pmd);
+	pmd = pmd_mkdirty(pmd);
+	pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
+
+	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
+	set_pmd_at(mm, vaddr, pmdp, pmd);
+	pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(!pmd_none(pmd));
+
+	pmd = pmd_mkyoung(pmd);
+	set_pmd_at(mm, vaddr, pmdp, pmd);
+	pmdp_test_and_clear_young(vma, vaddr, pmdp);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(pmd_young(pmd));
+}
+
+static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	/*
+	 * PMD based THP is a leaf entry.
+	 */
+	pmd = pmd_mkhuge(pmd);
+	WARN_ON(!pmd_leaf(pmd));
+}
+
+static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd;
+
+	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+		return;
+	/*
+	 * X86 defined pmd_set_huge() verifies that the given
+	 * PMD is not a populated non-leaf entry.
+	 */
+	WRITE_ONCE(*pmdp, __pmd(0));
+	WARN_ON(!pmd_set_huge(pmdp, __pfn_to_phys(pfn), prot));
+	WARN_ON(!pmd_clear_huge(pmdp));
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(!pmd_none(pmd));
+}
+
+static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
+	WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
+}
+
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
 {
@@ -100,12 +235,119 @@ static void __init pud_basic_tests(unsig
 	 */
 	WARN_ON(!pud_bad(pud_mkhuge(pud)));
 }
+
+static void __init pud_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pud_t *pudp,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+	pud_t pud = pfn_pud(pfn, prot);
+
+	if (!has_transparent_hugepage())
+		return;
+
+	/* Align the address wrt HPAGE_PUD_SIZE */
+	vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
+
+	set_pud_at(mm, vaddr, pudp, pud);
+	pudp_set_wrprotect(mm, vaddr, pudp);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(pud_write(pud));
+
+#ifndef __PAGETABLE_PMD_FOLDED
+	pud = pfn_pud(pfn, prot);
+	set_pud_at(mm, vaddr, pudp, pud);
+	pudp_huge_get_and_clear(mm, vaddr, pudp);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!pud_none(pud));
+
+	pud = pfn_pud(pfn, prot);
+	set_pud_at(mm, vaddr, pudp, pud);
+	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!pud_none(pud));
+#endif /* __PAGETABLE_PMD_FOLDED */
+	pud = pfn_pud(pfn, prot);
+	pud = pud_wrprotect(pud);
+	pud = pud_mkclean(pud);
+	set_pud_at(mm, vaddr, pudp, pud);
+	pud = pud_mkwrite(pud);
+	pud = pud_mkdirty(pud);
+	pudp_set_access_flags(vma, vaddr, pudp, pud, 1);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!(pud_write(pud) && pud_dirty(pud)));
+
+	pud = pud_mkyoung(pud);
+	set_pud_at(mm, vaddr, pudp, pud);
+	pudp_test_and_clear_young(vma, vaddr, pudp);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(pud_young(pud));
+}
+
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
+{
+	pud_t pud = pfn_pud(pfn, prot);
+
+	/*
+	 * PUD based THP is a leaf entry.
+	 */
+	pud = pud_mkhuge(pud);
+	WARN_ON(!pud_leaf(pud));
+}
+
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+	pud_t pud;
+
+	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+		return;
+	/*
+	 * X86 defined pud_set_huge() verifies that the given
+	 * PUD is not a populated non-leaf entry.
+	 */
+	WRITE_ONCE(*pudp, __pud(0));
+	WARN_ON(!pud_set_huge(pudp, __pfn_to_phys(pfn), prot));
+	WARN_ON(!pud_clear_huge(pudp));
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!pud_none(pud));
+}
 #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pud_t *pudp,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+}
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+}
 #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 #else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
 static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
 static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pmd_t *pmdp,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+}
+static void __init pud_advanced_tests(struct mm_struct *mm,
+				      struct vm_area_struct *vma, pud_t *pudp,
+				      unsigned long pfn, unsigned long vaddr,
+				      pgprot_t prot)
+{
+}
+static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
+{
+}
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+}
+static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot) { }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot)
@@ -495,8 +737,56 @@ static void __init hugetlb_basic_tests(u
 	WARN_ON(!pte_huge(pte_mkhuge(pte)));
 #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
 }
+
+static void __init hugetlb_advanced_tests(struct mm_struct *mm,
+					  struct vm_area_struct *vma,
+					  pte_t *ptep, unsigned long pfn,
+					  unsigned long vaddr, pgprot_t prot)
+{
+	struct page *page = pfn_to_page(pfn);
+	pte_t pte = ptep_get(ptep);
+	unsigned long paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
+
+	pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));
+	set_huge_pte_at(mm, vaddr, ptep, pte);
+	barrier();
+	WARN_ON(!pte_same(pte, huge_ptep_get(ptep)));
+	huge_pte_clear(mm, vaddr, ptep, PMD_SIZE);
+	pte = huge_ptep_get(ptep);
+	WARN_ON(!huge_pte_none(pte));
+
+	pte = mk_huge_pte(page, prot);
+	set_huge_pte_at(mm, vaddr, ptep, pte);
+	barrier();
+	huge_ptep_set_wrprotect(mm, vaddr, ptep);
+	pte = huge_ptep_get(ptep);
+	WARN_ON(huge_pte_write(pte));
+
+	pte = mk_huge_pte(page, prot);
+	set_huge_pte_at(mm, vaddr, ptep, pte);
+	barrier();
+	huge_ptep_get_and_clear(mm, vaddr, ptep);
+	pte = huge_ptep_get(ptep);
+	WARN_ON(!huge_pte_none(pte));
+
+	pte = mk_huge_pte(page, prot);
+	pte = huge_pte_wrprotect(pte);
+	set_huge_pte_at(mm, vaddr, ptep, pte);
+	barrier();
+	pte = huge_pte_mkwrite(pte);
+	pte = huge_pte_mkdirty(pte);
+	huge_ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
+	pte = huge_ptep_get(ptep);
+	WARN_ON(!(huge_pte_write(pte) && huge_pte_dirty(pte)));
+}
 #else  /* !CONFIG_HUGETLB_PAGE */
 static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init hugetlb_advanced_tests(struct mm_struct *mm,
+					  struct vm_area_struct *vma,
+					  pte_t *ptep, unsigned long pfn,
+					  unsigned long vaddr, pgprot_t prot)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE */
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -568,6 +858,7 @@ static unsigned long __init get_random_v
 
 static int __init debug_vm_pgtable(void)
 {
+	struct vm_area_struct *vma;
 	struct mm_struct *mm;
 	pgd_t *pgdp;
 	p4d_t *p4dp, *saved_p4dp;
@@ -596,6 +887,12 @@ static int __init debug_vm_pgtable(void)
 	 */
 	protnone = __P000;
 
+	vma = vm_area_alloc(mm);
+	if (!vma) {
+		pr_err("vma allocation failed\n");
+		return 1;
+	}
+
 	/*
 	 * PFN for mapping at PTE level is determined from a standard kernel
 	 * text symbol. But pfns for higher page table levels are derived by
@@ -644,6 +941,20 @@ static int __init debug_vm_pgtable(void)
 	p4d_clear_tests(mm, p4dp);
 	pgd_clear_tests(mm, pgdp);
 
+	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
+	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
+	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+
+	pmd_leaf_tests(pmd_aligned, prot);
+	pud_leaf_tests(pud_aligned, prot);
+
+	pmd_huge_tests(pmdp, pmd_aligned, prot);
+	pud_huge_tests(pudp, pud_aligned, prot);
+
+	pte_savedwrite_tests(pte_aligned, prot);
+	pmd_savedwrite_tests(pmd_aligned, prot);
+
 	pte_unmap_unlock(ptep, ptl);
 
 	pmd_populate_tests(mm, pmdp, saved_ptep);
@@ -678,6 +989,7 @@ static int __init debug_vm_pgtable(void)
 	pmd_free(mm, saved_pmdp);
 	pte_free(mm, saved_ptep);
 
+	vm_area_free(vma);
 	mm_dec_nr_puds(mm);
 	mm_dec_nr_pmds(mm);
 	mm_dec_nr_ptes(mm);
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (14 preceding siblings ...)
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
  2020-07-06 23:28 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch " Andrew Morton
                   ` (216 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: mm/debug_vm_pgtable: add debug prints for individual tests
has been added to the -mm tree.  Its filename is
     mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add debug prints for individual tests

This adds debug prints that list all tests being executed on a given
platform.  With dynamic debug enabled, the following information is
printed during boot.  For compactness, both the time stamp and the prefix
(i.e. debug_vm_pgtable) have been dropped from this sample output.

[debug_vm_pgtable      ]: Validating architecture page table helpers
[pte_basic_tests       ]: Validating PTE basic
[pmd_basic_tests       ]: Validating PMD basic
[p4d_basic_tests       ]: Validating P4D basic
[pgd_basic_tests       ]: Validating PGD basic
[pte_clear_tests       ]: Validating PTE clear
[pmd_clear_tests       ]: Validating PMD clear
[pte_advanced_tests    ]: Validating PTE advanced
[pmd_advanced_tests    ]: Validating PMD advanced
[hugetlb_advanced_tests]: Validating HugeTLB advanced
[pmd_leaf_tests        ]: Validating PMD leaf
[pmd_huge_tests        ]: Validating PMD huge
[pte_savedwrite_tests  ]: Validating PTE saved write
[pmd_savedwrite_tests  ]: Validating PMD saved write
[pmd_populate_tests    ]: Validating PMD populate
[pte_special_tests     ]: Validating PTE special
[pte_protnone_tests    ]: Validating PTE protnone
[pmd_protnone_tests    ]: Validating PMD protnone
[pte_devmap_tests      ]: Validating PTE devmap
[pmd_devmap_tests      ]: Validating PMD devmap
[pte_swap_tests        ]: Validating PTE swap
[swap_migration_tests  ]: Validating swap migration
[hugetlb_basic_tests   ]: Validating HugeTLB basic
[pmd_thp_tests         ]: Validating PMD based THP

Link: http://lkml.kernel.org/r/1593996516-7186-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>	[arc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug_vm_pgtable.c |   46 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-debug-prints-for-individual-tests
+++ a/mm/debug_vm_pgtable.c
@@ -8,7 +8,7 @@
  *
  * Author: Anshuman Khandual <anshuman.khandual@arm.com>
  */
-#define pr_fmt(fmt) "debug_vm_pgtable: %s: " fmt, __func__
+#define pr_fmt(fmt) "debug_vm_pgtable: [%-25s]: " fmt, __func__
 
 #include <linux/gfp.h>
 #include <linux/highmem.h>
@@ -48,6 +48,7 @@ static void __init pte_basic_tests(unsig
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	pr_debug("Validating PTE basic\n");
 	WARN_ON(!pte_same(pte, pte));
 	WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
 	WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
@@ -64,6 +65,7 @@ static void __init pte_advanced_tests(st
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	pr_debug("Validating PTE advanced\n");
 	pte = pfn_pte(pfn, prot);
 	set_pte_at(mm, vaddr, ptep, pte);
 	ptep_set_wrprotect(mm, vaddr, ptep);
@@ -103,6 +105,7 @@ static void __init pte_savedwrite_tests(
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	pr_debug("Validating PTE saved write\n");
 	WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
 	WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
 }
@@ -114,6 +117,7 @@ static void __init pmd_basic_tests(unsig
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PMD basic\n");
 	WARN_ON(!pmd_same(pmd, pmd));
 	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
 	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
@@ -138,6 +142,7 @@ static void __init pmd_advanced_tests(st
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PMD advanced\n");
 	/* Align the address wrt HPAGE_PMD_SIZE */
 	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
 
@@ -180,6 +185,7 @@ static void __init pmd_leaf_tests(unsign
 {
 	pmd_t pmd = pfn_pmd(pfn, prot);
 
+	pr_debug("Validating PMD leaf\n");
 	/*
 	 * PMD based THP is a leaf entry.
 	 */
@@ -193,6 +199,8 @@ static void __init pmd_huge_tests(pmd_t
 
 	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
 		return;
+
+	pr_debug("Validating PMD huge\n");
 	/*
 	 * X86 defined pmd_set_huge() verifies that the given
 	 * PMD is not a populated non-leaf entry.
@@ -208,6 +216,7 @@ static void __init pmd_savedwrite_tests(
 {
 	pmd_t pmd = pfn_pmd(pfn, prot);
 
+	pr_debug("Validating PMD saved write\n");
 	WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
 	WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
 }
@@ -220,6 +229,7 @@ static void __init pud_basic_tests(unsig
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PUD basic\n");
 	WARN_ON(!pud_same(pud, pud));
 	WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
 	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
@@ -246,6 +256,7 @@ static void __init pud_advanced_tests(st
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PUD advanced\n");
 	/* Align the address wrt HPAGE_PUD_SIZE */
 	vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
 
@@ -288,6 +299,7 @@ static void __init pud_leaf_tests(unsign
 {
 	pud_t pud = pfn_pud(pfn, prot);
 
+	pr_debug("Validating PUD leaf\n");
 	/*
 	 * PUD based THP is a leaf entry.
 	 */
@@ -301,6 +313,8 @@ static void __init pud_huge_tests(pud_t
 
 	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
 		return;
+
+	pr_debug("Validating PUD huge\n");
 	/*
 	 * X86 defined pud_set_huge() verifies that the given
 	 * PUD is not a populated non-leaf entry.
@@ -354,6 +368,7 @@ static void __init p4d_basic_tests(unsig
 {
 	p4d_t p4d;
 
+	pr_debug("Validating P4D basic\n");
 	memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
 	WARN_ON(!p4d_same(p4d, p4d));
 }
@@ -362,6 +377,7 @@ static void __init pgd_basic_tests(unsig
 {
 	pgd_t pgd;
 
+	pr_debug("Validating PGD basic\n");
 	memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
 	WARN_ON(!pgd_same(pgd, pgd));
 }
@@ -374,6 +390,7 @@ static void __init pud_clear_tests(struc
 	if (mm_pmd_folded(mm))
 		return;
 
+	pr_debug("Validating PUD clear\n");
 	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
 	WRITE_ONCE(*pudp, pud);
 	pud_clear(pudp);
@@ -388,6 +405,8 @@ static void __init pud_populate_tests(st
 
 	if (mm_pmd_folded(mm))
 		return;
+
+	pr_debug("Validating PUD populate\n");
 	/*
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as pud_bad().
@@ -414,6 +433,7 @@ static void __init p4d_clear_tests(struc
 	if (mm_pud_folded(mm))
 		return;
 
+	pr_debug("Validating P4D clear\n");
 	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
 	WRITE_ONCE(*p4dp, p4d);
 	p4d_clear(p4dp);
@@ -429,6 +449,7 @@ static void __init p4d_populate_tests(st
 	if (mm_pud_folded(mm))
 		return;
 
+	pr_debug("Validating P4D populate\n");
 	/*
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as p4d_bad().
@@ -447,6 +468,7 @@ static void __init pgd_clear_tests(struc
 	if (mm_p4d_folded(mm))
 		return;
 
+	pr_debug("Validating PGD clear\n");
 	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
 	WRITE_ONCE(*pgdp, pgd);
 	pgd_clear(pgdp);
@@ -462,6 +484,7 @@ static void __init pgd_populate_tests(st
 	if (mm_p4d_folded(mm))
 		return;
 
+	pr_debug("Validating PGD populate\n");
 	/*
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as pgd_bad().
@@ -490,6 +513,7 @@ static void __init pte_clear_tests(struc
 {
 	pte_t pte = ptep_get(ptep);
 
+	pr_debug("Validating PTE clear\n");
 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
 	set_pte_at(mm, vaddr, ptep, pte);
 	barrier();
@@ -502,6 +526,7 @@ static void __init pmd_clear_tests(struc
 {
 	pmd_t pmd = READ_ONCE(*pmdp);
 
+	pr_debug("Validating PMD clear\n");
 	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
 	WRITE_ONCE(*pmdp, pmd);
 	pmd_clear(pmdp);
@@ -514,6 +539,7 @@ static void __init pmd_populate_tests(st
 {
 	pmd_t pmd;
 
+	pr_debug("Validating PMD populate\n");
 	/*
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as pmd_bad().
@@ -531,6 +557,7 @@ static void __init pte_special_tests(uns
 	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL))
 		return;
 
+	pr_debug("Validating PTE special\n");
 	WARN_ON(!pte_special(pte_mkspecial(pte)));
 }
 
@@ -541,6 +568,7 @@ static void __init pte_protnone_tests(un
 	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
 		return;
 
+	pr_debug("Validating PTE protnone\n");
 	WARN_ON(!pte_protnone(pte));
 	WARN_ON(!pte_present(pte));
 }
@@ -553,6 +581,7 @@ static void __init pmd_protnone_tests(un
 	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
 		return;
 
+	pr_debug("Validating PMD protnone\n");
 	WARN_ON(!pmd_protnone(pmd));
 	WARN_ON(!pmd_present(pmd));
 }
@@ -565,6 +594,7 @@ static void __init pte_devmap_tests(unsi
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	pr_debug("Validating PTE devmap\n");
 	WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
 }
 
@@ -573,6 +603,7 @@ static void __init pmd_devmap_tests(unsi
 {
 	pmd_t pmd = pfn_pmd(pfn, prot);
 
+	pr_debug("Validating PMD devmap\n");
 	WARN_ON(!pmd_devmap(pmd_mkdevmap(pmd)));
 }
 
@@ -581,6 +612,7 @@ static void __init pud_devmap_tests(unsi
 {
 	pud_t pud = pfn_pud(pfn, prot);
 
+	pr_debug("Validating PUD devmap\n");
 	WARN_ON(!pud_devmap(pud_mkdevmap(pud)));
 }
 #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
@@ -603,6 +635,7 @@ static void __init pte_soft_dirty_tests(
 	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
 		return;
 
+	pr_debug("Validating PTE soft dirty\n");
 	WARN_ON(!pte_soft_dirty(pte_mksoft_dirty(pte)));
 	WARN_ON(pte_soft_dirty(pte_clear_soft_dirty(pte)));
 }
@@ -614,6 +647,7 @@ static void __init pte_swap_soft_dirty_t
 	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
 		return;
 
+	pr_debug("Validating PTE swap soft dirty\n");
 	WARN_ON(!pte_swp_soft_dirty(pte_swp_mksoft_dirty(pte)));
 	WARN_ON(pte_swp_soft_dirty(pte_swp_clear_soft_dirty(pte)));
 }
@@ -626,6 +660,7 @@ static void __init pmd_soft_dirty_tests(
 	if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
 		return;
 
+	pr_debug("Validating PMD soft dirty\n");
 	WARN_ON(!pmd_soft_dirty(pmd_mksoft_dirty(pmd)));
 	WARN_ON(pmd_soft_dirty(pmd_clear_soft_dirty(pmd)));
 }
@@ -638,6 +673,7 @@ static void __init pmd_swap_soft_dirty_t
 		!IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
 		return;
 
+	pr_debug("Validating PMD swap soft dirty\n");
 	WARN_ON(!pmd_swp_soft_dirty(pmd_swp_mksoft_dirty(pmd)));
 	WARN_ON(pmd_swp_soft_dirty(pmd_swp_clear_soft_dirty(pmd)));
 }
@@ -653,6 +689,7 @@ static void __init pte_swap_tests(unsign
 	swp_entry_t swp;
 	pte_t pte;
 
+	pr_debug("Validating PTE swap\n");
 	pte = pfn_pte(pfn, prot);
 	swp = __pte_to_swp_entry(pte);
 	pte = __swp_entry_to_pte(swp);
@@ -665,6 +702,7 @@ static void __init pmd_swap_tests(unsign
 	swp_entry_t swp;
 	pmd_t pmd;
 
+	pr_debug("Validating PMD swap\n");
 	pmd = pfn_pmd(pfn, prot);
 	swp = __pmd_to_swp_entry(pmd);
 	pmd = __swp_entry_to_pmd(swp);
@@ -681,6 +719,8 @@ static void __init swap_migration_tests(
 
 	if (!IS_ENABLED(CONFIG_MIGRATION))
 		return;
+
+	pr_debug("Validating swap migration\n");
 	/*
 	 * swap_migration_tests() requires a dedicated page as it needs to
 	 * be locked before creating a migration entry from it. Locking the
@@ -720,6 +760,7 @@ static void __init hugetlb_basic_tests(u
 	struct page *page;
 	pte_t pte;
 
+	pr_debug("Validating HugeTLB basic\n");
 	/*
 	 * Accessing the page associated with the pfn is safe here,
 	 * as it was previously derived from a real kernel symbol.
@@ -747,6 +788,7 @@ static void __init hugetlb_advanced_test
 	pte_t pte = ptep_get(ptep);
 	unsigned long paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
 
+	pr_debug("Validating HugeTLB advanced\n");
 	pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));
 	set_huge_pte_at(mm, vaddr, ptep, pte);
 	barrier();
@@ -797,6 +839,7 @@ static void __init pmd_thp_tests(unsigne
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PMD based THP\n");
 	/*
 	 * pmd_trans_huge() and pmd_present() must return positive after
 	 * MMU invalidation with pmd_mkinvalid(). This behavior is an
@@ -825,6 +868,7 @@ static void __init pud_thp_tests(unsigne
 	if (!has_transparent_hugepage())
 		return;
 
+	pr_debug("Validating PUD based THP\n");
 	pud = pfn_pud(pfn, prot);
 	WARN_ON(!pud_trans_huge(pud_mkhuge(pud)));
 
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (15 preceding siblings ...)
  2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
  2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
                   ` (215 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: Documentation/mm: Add descriptions for arch page table helpers
has been added to the -mm tree.  Its filename is
     documentation-mm-add-descriptions-for-arch-page-table-helpers.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Documentation/mm: Add descriptions for arch page table helpers

This adds a description file for all arch page table helpers, kept in sync
with the semantics being tested via CONFIG_DEBUG_VM_PGTABLE.  Any future
change to either these descriptions or the debug test must keep the two in
sync.

Link: http://lkml.kernel.org/r/1593996516-7186-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Mike Rapoport <rppt@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/arch_pgtable_helpers.rst |  258 ++++++++++++++++++++
 mm/debug_vm_pgtable.c                     |    6 
 2 files changed, 264 insertions(+)

--- /dev/null
+++ a/Documentation/vm/arch_pgtable_helpers.rst
@@ -0,0 +1,258 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _arch_page_table_helpers:
+
+===============================
+Architecture Page Table Helpers
+===============================
+
+Generic MM expects architectures (with an MMU) to provide helpers to create,
+access and modify page table entries at various levels for different memory
+functions. These page table helpers need to conform to common semantics across
+platforms. The following tables describe the expected semantics, which can also
+be tested during boot via the CONFIG_DEBUG_VM_PGTABLE option. All future changes
+here or in the debug test need to be kept in sync.
+
+======================
+PTE Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pte_same                  | Tests whether both PTE entries are the same      |
+--------------------------------------------------------------------------------
+| pte_bad                   | Tests a non-table mapped PTE                     |
+--------------------------------------------------------------------------------
+| pte_present               | Tests a valid mapped PTE                         |
+--------------------------------------------------------------------------------
+| pte_young                 | Tests a young PTE                                |
+--------------------------------------------------------------------------------
+| pte_dirty                 | Tests a dirty PTE                                |
+--------------------------------------------------------------------------------
+| pte_write                 | Tests a writable PTE                             |
+--------------------------------------------------------------------------------
+| pte_special               | Tests a special PTE                              |
+--------------------------------------------------------------------------------
+| pte_protnone              | Tests a PROT_NONE PTE                            |
+--------------------------------------------------------------------------------
+| pte_devmap                | Tests a ZONE_DEVICE mapped PTE                   |
+--------------------------------------------------------------------------------
+| pte_soft_dirty            | Tests a soft dirty PTE                           |
+--------------------------------------------------------------------------------
+| pte_swp_soft_dirty        | Tests a soft dirty swapped PTE                   |
+--------------------------------------------------------------------------------
+| pte_mkyoung               | Creates a young PTE                              |
+--------------------------------------------------------------------------------
+| pte_mkold                 | Creates an old PTE                               |
+--------------------------------------------------------------------------------
+| pte_mkdirty               | Creates a dirty PTE                              |
+--------------------------------------------------------------------------------
+| pte_mkclean               | Creates a clean PTE                              |
+--------------------------------------------------------------------------------
+| pte_mkwrite               | Creates a writable PTE                           |
+--------------------------------------------------------------------------------
+| pte_wrprotect             | Creates a write protected PTE                    |
+--------------------------------------------------------------------------------
+| pte_mkspecial             | Creates a special PTE                            |
+--------------------------------------------------------------------------------
+| pte_mkdevmap              | Creates a ZONE_DEVICE mapped PTE                 |
+--------------------------------------------------------------------------------
+| pte_mksoft_dirty          | Creates a soft dirty PTE                         |
+--------------------------------------------------------------------------------
+| pte_clear_soft_dirty      | Clears a soft dirty PTE                          |
+--------------------------------------------------------------------------------
+| pte_swp_mksoft_dirty      | Creates a soft dirty swapped PTE                 |
+--------------------------------------------------------------------------------
+| pte_swp_clear_soft_dirty  | Clears a soft dirty swapped PTE                  |
+--------------------------------------------------------------------------------
+| pte_mknotpresent          | Invalidates a mapped PTE                         |
+--------------------------------------------------------------------------------
+| ptep_get_and_clear        | Clears a PTE                                     |
+--------------------------------------------------------------------------------
+| ptep_get_and_clear_full   | Clears a PTE                                     |
+--------------------------------------------------------------------------------
+| ptep_test_and_clear_young | Clears young from a PTE                          |
+--------------------------------------------------------------------------------
+| ptep_set_wrprotect        | Converts into a write protected PTE              |
+--------------------------------------------------------------------------------
+| ptep_set_access_flags     | Converts into a more permissive PTE              |
+--------------------------------------------------------------------------------
+
+======================
+PMD Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pmd_same                  | Tests whether both PMD entries are the same      |
+--------------------------------------------------------------------------------
+| pmd_bad                   | Tests a non-table mapped PMD                     |
+--------------------------------------------------------------------------------
+| pmd_leaf                  | Tests a leaf mapped PMD                          |
+--------------------------------------------------------------------------------
+| pmd_huge                  | Tests a HugeTLB mapped PMD                       |
+--------------------------------------------------------------------------------
+| pmd_trans_huge            | Tests a Transparent Huge Page (THP) at PMD       |
+--------------------------------------------------------------------------------
+| pmd_present               | Tests a valid mapped PMD                         |
+--------------------------------------------------------------------------------
+| pmd_young                 | Tests a young PMD                                |
+--------------------------------------------------------------------------------
+| pmd_dirty                 | Tests a dirty PMD                                |
+--------------------------------------------------------------------------------
+| pmd_write                 | Tests a writable PMD                             |
+--------------------------------------------------------------------------------
+| pmd_special               | Tests a special PMD                              |
+--------------------------------------------------------------------------------
+| pmd_protnone              | Tests a PROT_NONE PMD                            |
+--------------------------------------------------------------------------------
+| pmd_devmap                | Tests a ZONE_DEVICE mapped PMD                   |
+--------------------------------------------------------------------------------
+| pmd_soft_dirty            | Tests a soft dirty PMD                           |
+--------------------------------------------------------------------------------
+| pmd_swp_soft_dirty        | Tests a soft dirty swapped PMD                   |
+--------------------------------------------------------------------------------
+| pmd_mkyoung               | Creates a young PMD                              |
+--------------------------------------------------------------------------------
+| pmd_mkold                 | Creates an old PMD                               |
+--------------------------------------------------------------------------------
+| pmd_mkdirty               | Creates a dirty PMD                              |
+--------------------------------------------------------------------------------
+| pmd_mkclean               | Creates a clean PMD                              |
+--------------------------------------------------------------------------------
+| pmd_mkwrite               | Creates a writable PMD                           |
+--------------------------------------------------------------------------------
+| pmd_wrprotect             | Creates a write protected PMD                    |
+--------------------------------------------------------------------------------
+| pmd_mkspecial             | Creates a special PMD                            |
+--------------------------------------------------------------------------------
+| pmd_mkdevmap              | Creates a ZONE_DEVICE mapped PMD                 |
+--------------------------------------------------------------------------------
+| pmd_mksoft_dirty          | Creates a soft dirty PMD                         |
+--------------------------------------------------------------------------------
+| pmd_clear_soft_dirty      | Clears a soft dirty PMD                          |
+--------------------------------------------------------------------------------
+| pmd_swp_mksoft_dirty      | Creates a soft dirty swapped PMD                 |
+--------------------------------------------------------------------------------
+| pmd_swp_clear_soft_dirty  | Clears a soft dirty swapped PMD                  |
+--------------------------------------------------------------------------------
+| pmd_mkinvalid             | Invalidates a mapped PMD [1]                     |
+--------------------------------------------------------------------------------
+| pmd_set_huge              | Creates a PMD huge mapping                       |
+--------------------------------------------------------------------------------
+| pmd_clear_huge            | Clears a PMD huge mapping                        |
+--------------------------------------------------------------------------------
+| pmdp_get_and_clear        | Clears a PMD                                     |
+--------------------------------------------------------------------------------
+| pmdp_get_and_clear_full   | Clears a PMD                                     |
+--------------------------------------------------------------------------------
+| pmdp_test_and_clear_young | Clears young from a PMD                          |
+--------------------------------------------------------------------------------
+| pmdp_set_wrprotect        | Converts into a write protected PMD              |
+--------------------------------------------------------------------------------
+| pmdp_set_access_flags     | Converts into a more permissive PMD              |
+--------------------------------------------------------------------------------
+
+======================
+PUD Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pud_same                  | Tests whether both PUD entries are the same      |
+--------------------------------------------------------------------------------
+| pud_bad                   | Tests a non-table mapped PUD                     |
+--------------------------------------------------------------------------------
+| pud_leaf                  | Tests a leaf mapped PUD                          |
+--------------------------------------------------------------------------------
+| pud_huge                  | Tests a HugeTLB mapped PUD                       |
+--------------------------------------------------------------------------------
+| pud_trans_huge            | Tests a Transparent Huge Page (THP) at PUD       |
+--------------------------------------------------------------------------------
+| pud_present               | Tests a valid mapped PUD                         |
+--------------------------------------------------------------------------------
+| pud_young                 | Tests a young PUD                                |
+--------------------------------------------------------------------------------
+| pud_dirty                 | Tests a dirty PUD                                |
+--------------------------------------------------------------------------------
+| pud_write                 | Tests a writable PUD                             |
+--------------------------------------------------------------------------------
+| pud_devmap                | Tests a ZONE_DEVICE mapped PUD                   |
+--------------------------------------------------------------------------------
+| pud_mkyoung               | Creates a young PUD                              |
+--------------------------------------------------------------------------------
+| pud_mkold                 | Creates an old PUD                               |
+--------------------------------------------------------------------------------
+| pud_mkdirty               | Creates a dirty PUD                              |
+--------------------------------------------------------------------------------
+| pud_mkclean               | Creates a clean PUD                              |
+--------------------------------------------------------------------------------
+| pud_mkwrite               | Creates a writable PUD                           |
+--------------------------------------------------------------------------------
+| pud_wrprotect             | Creates a write protected PUD                    |
+--------------------------------------------------------------------------------
+| pud_mkdevmap              | Creates a ZONE_DEVICE mapped PUD                 |
+--------------------------------------------------------------------------------
+| pud_mkinvalid             | Invalidates a mapped PUD [1]                     |
+--------------------------------------------------------------------------------
+| pud_set_huge              | Creates a PUD huge mapping                       |
+--------------------------------------------------------------------------------
+| pud_clear_huge            | Clears a PUD huge mapping                        |
+--------------------------------------------------------------------------------
+| pudp_get_and_clear        | Clears a PUD                                     |
+--------------------------------------------------------------------------------
+| pudp_get_and_clear_full   | Clears a PUD                                     |
+--------------------------------------------------------------------------------
+| pudp_test_and_clear_young | Clears young from a PUD                          |
+--------------------------------------------------------------------------------
+| pudp_set_wrprotect        | Converts into a write protected PUD              |
+--------------------------------------------------------------------------------
+| pudp_set_access_flags     | Converts into a more permissive PUD              |
+--------------------------------------------------------------------------------
+
+==========================
+HugeTLB Page Table Helpers
+==========================
+
+--------------------------------------------------------------------------------
+| pte_huge                  | Tests a HugeTLB                                  |
+--------------------------------------------------------------------------------
+| pte_mkhuge                | Creates a HugeTLB                                |
+--------------------------------------------------------------------------------
+| huge_pte_dirty            | Tests a dirty HugeTLB                            |
+--------------------------------------------------------------------------------
+| huge_pte_write            | Tests a writable HugeTLB                         |
+--------------------------------------------------------------------------------
+| huge_pte_mkdirty          | Creates a dirty HugeTLB                          |
+--------------------------------------------------------------------------------
+| huge_pte_mkwrite          | Creates a writable HugeTLB                       |
+--------------------------------------------------------------------------------
+| huge_pte_wrprotect        | Creates a write protected HugeTLB                |
+--------------------------------------------------------------------------------
+| huge_ptep_get_and_clear   | Clears a HugeTLB                                 |
+--------------------------------------------------------------------------------
+| huge_ptep_set_wrprotect   | Converts into a write protected HugeTLB          |
+--------------------------------------------------------------------------------
+| huge_ptep_set_access_flags  | Converts into a more permissive HugeTLB        |
+--------------------------------------------------------------------------------
+
+========================
+SWAP Page Table Helpers
+========================
+
+--------------------------------------------------------------------------------
+| __pte_to_swp_entry        | Creates a swapped entry (arch) from a mapped PTE |
+--------------------------------------------------------------------------------
+| __swp_entry_to_pte        | Creates a mapped PTE from a swapped entry (arch) |
+--------------------------------------------------------------------------------
+| __pmd_to_swp_entry        | Creates a swapped entry (arch) from a mapped PMD |
+--------------------------------------------------------------------------------
+| __swp_entry_to_pmd        | Creates a mapped PMD from a swapped entry (arch) |
+--------------------------------------------------------------------------------
+| is_migration_entry        | Tests a migration (read or write) swapped entry  |
+--------------------------------------------------------------------------------
+| is_write_migration_entry  | Tests a write migration swapped entry            |
+--------------------------------------------------------------------------------
+| make_migration_entry_read | Converts into read migration swapped entry       |
+--------------------------------------------------------------------------------
+| make_migration_entry      | Creates a migration swapped entry (read or write)|
+--------------------------------------------------------------------------------
+
+[1] https://lore.kernel.org/linux-mm/20181017020930.GN30832@redhat.com/
--- a/mm/debug_vm_pgtable.c~documentation-mm-add-descriptions-for-arch-page-table-helpers
+++ a/mm/debug_vm_pgtable.c
@@ -31,6 +31,12 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 
+/*
+ * Please refer Documentation/vm/arch_pgtable_helpers.rst for the semantics
+ * expectations that are being validated here. All future changes in here
+ * or the documentation need to be in sync.
+ */
+
 #define VMFLAGS	(VM_READ|VM_WRITE|VM_EXEC)
 
 /*
_
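
For reference, a minimal sketch (not part of the patch) of the kind of
cross-check mm/debug_vm_pgtable.c performs on the helpers documented
above, where each "creator" helper must be observable through its
matching test:

static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
{
	pte_t pte = pfn_pte(pfn, prot);

	/* a creator followed by its test must round-trip */
	WARN_ON(!pte_same(pte, pte));
	WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
	WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
	WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
	/* and the inverse creator must clear the attribute again */
	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
}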

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (16 preceding siblings ...)
  2020-07-06 23:28 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch " Andrew Morton
@ 2020-07-06 23:33 ` Andrew Morton
  2020-07-06 23:33   ` Andrew Morton
  2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
                   ` (214 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:33 UTC (permalink / raw)
  To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao


The patch titled
     Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been removed from the -mm tree.  Its filename was
     slub-drop-lockdep_assert_held-from-put_map.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()

There is no point in using lockdep_assert_held() on a lock that is about
to be unlocked.  The assertion only does anything with lockdep enabled,
and lockdep will complain anyway if spin_unlock() is used on a lock that
has not been locked.

Remove superfluous lockdep_assert_held().
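
To illustrate the point (a sketch, not part of the patch): with lockdep
enabled, the unlock path itself performs the equivalent check, so the
explicit assertion adds nothing:

	spin_lock(&object_map_lock);
	...
	/*
	 * Under CONFIG_PROVE_LOCKING, spin_unlock() goes through
	 * lock_release(), which already warns if object_map_lock is
	 * not held here; a lockdep_assert_held() immediately before
	 * the unlock is therefore redundant.
	 */
	spin_unlock(&object_map_lock);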

Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    2 --
 1 file changed, 2 deletions(-)

--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
 static void put_map(unsigned long *map) __releases(&object_map_lock)
 {
 	VM_BUG_ON(map != object_map);
-	lockdep_assert_held(&object_map_lock);
-
 	spin_unlock(&object_map_lock);
 }
 
_

Patches currently in -mm which might be from bigeasy@linutronix.de are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + slub-drop-lockdep_assert_held-from-put_map.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (17 preceding siblings ...)
  2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
@ 2020-07-06 23:34 ` Andrew Morton
  2020-07-06 23:34   ` Andrew Morton
  2020-07-06 23:34 ` [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from " Andrew Morton
                   ` (213 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:34 UTC (permalink / raw)
  To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao


The patch titled
     Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been added to the -mm tree.  Its filename is
     slub-drop-lockdep_assert_held-from-put_map.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()

There is no point in using lockdep_assert_held() on a lock that is about
to be unlocked.  The assertion only does anything with lockdep enabled,
and lockdep will complain anyway if spin_unlock() is used on a lock that
has not been locked.

Remove superfluous lockdep_assert_held().

Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    2 --
 1 file changed, 2 deletions(-)

--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
 static void put_map(unsigned long *map) __releases(&object_map_lock)
 {
 	VM_BUG_ON(map != object_map);
-	lockdep_assert_held(&object_map_lock);
-
 	spin_unlock(&object_map_lock);
 }
 
_

Patches currently in -mm which might be from bigeasy@linutronix.de are

slub-drop-lockdep_assert_held-from-put_map.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (18 preceding siblings ...)
  2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
@ 2020-07-06 23:34 ` Andrew Morton
  2020-07-06 23:36 ` + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to " Andrew Morton
                   ` (212 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:34 UTC (permalink / raw)
  To: corbet, koct9i, mm-commits


The patch titled
     Subject: mailmap: add entry for obsolete email address
has been removed from the -mm tree.  Its filename was
     mailmap-add-entry-for-obsolete-email-address.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Konstantin Khlebnikov <koct9i@gmail.com>
Subject: mailmap: add entry for obsolete email address

Map old corporate email address @yandex-team.ru to stable private address.

Link: http://lkml.kernel.org/r/159360469186.24918.10108157093572183535.stgit@zurg
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 .mailmap |    1 +
 1 file changed, 1 insertion(+)

--- a/.mailmap~mailmap-add-entry-for-obsolete-email-address
+++ a/.mailmap
@@ -146,6 +146,7 @@ Kamil Konieczny <k.konieczny@samsung.com
 Kay Sievers <kay.sievers@vrfy.org>
 Kenneth W Chen <kenneth.w.chen@intel.com>
 Konstantin Khlebnikov <koct9i@gmail.com> <k.khlebnikov@samsung.com>
+Konstantin Khlebnikov <koct9i@gmail.com> <khlebnikov@yandex-team.ru>
 Koushik <raghavendra.koushik@neterion.com>
 Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski@samsung.com>
 Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski.k@gmail.com>
_

Patches currently in -mm which might be from koct9i@gmail.com are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (19 preceding siblings ...)
  2020-07-06 23:34 ` [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from " Andrew Morton
@ 2020-07-06 23:36 ` Andrew Morton
  2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch " Andrew Morton
                   ` (211 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:36 UTC (permalink / raw)
  To: akpm, mm-commits, thunder.leizhen


The patch titled
     Subject: mm/mmap: optimize a branch judgment in ksys_mmap_pgoff()
has been added to the -mm tree.  Its filename is
     mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/mmap: optimize a branch judgment in ksys_mmap_pgoff()

Look at the pseudo code below.  The test "!is_file_hugepages(file)" at 3)
duplicates the one at 1), so we can use "else if" to avoid the second
check.  And the assignment "retval = -EINVAL" at 2) is only needed by
branch 3), since "retval" is overwritten at 4).

No functional change, but it reduces the code size slightly and is
arguably clearer.
Before:
text    data     bss     dec     hex filename
28733    1590       1   30324    7674 mm/mmap.o

After:
text    data     bss     dec     hex filename
28701    1590       1   30292    7654 mm/mmap.o

====pseudo code====:
	if (!(flags & MAP_ANONYMOUS)) {
		...
1)		if (is_file_hugepages(file))
			len = ALIGN(len, huge_page_size(hstate_file(file)));
2)		retval = -EINVAL;
3)		if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
			goto out_fput;
	} else if (flags & MAP_HUGETLB) {
		...
	}
	...

4)	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
out_fput:
	...
	return retval;
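
And the same path after the change (a sketch matching the diff below):

	if (!(flags & MAP_ANONYMOUS)) {
		...
		if (is_file_hugepages(file)) {
			len = ALIGN(len, huge_page_size(hstate_file(file)));
		} else if (unlikely(flags & MAP_HUGETLB)) {
			retval = -EINVAL;
			goto out_fput;
		}
	} else if (flags & MAP_HUGETLB) {
		...
	}
	...
	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);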

Link: http://lkml.kernel.org/r/20200705080112.1405-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/mmap.c~mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff
+++ a/mm/mmap.c
@@ -1562,11 +1562,12 @@ unsigned long ksys_mmap_pgoff(unsigned l
 		file = fget(fd);
 		if (!file)
 			return -EBADF;
-		if (is_file_hugepages(file))
+		if (is_file_hugepages(file)) {
 			len = ALIGN(len, huge_page_size(hstate_file(file)));
-		retval = -EINVAL;
-		if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
+		} else if (unlikely(flags & MAP_HUGETLB)) {
+			retval = -EINVAL;
 			goto out_fput;
+		}
 	} else if (flags & MAP_HUGETLB) {
 		struct user_struct *user = NULL;
 		struct hstate *hs;
_

Patches currently in -mm which might be from thunder.leizhen@huawei.com are

mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (20 preceding siblings ...)
  2020-07-06 23:36 ` + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to " Andrew Morton
@ 2020-07-06 23:50 ` Andrew Morton
  2020-07-06 23:52 ` [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from " Andrew Morton
                   ` (210 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:50 UTC (permalink / raw)
  To: adilger, cgxu519, chris, dxu, gregkh, hughd, mm-commits, stable,
	tj, viro


The patch titled
     Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
has been added to the -mm tree.  Its filename is
     vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chengguang Xu <cgxu519@mykernel.net>
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way

After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of
kmalloc"), a simple xattr entry is allocated with kvmalloc() instead of
kmalloc(), so it must be released with kvfree() instead of kfree().
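
The underlying rule (a generic sketch, not part of the patch; "len" here
is illustrative): kvmalloc() may fall back to vmalloc() for larger sizes,
and vmalloc()ed memory must not be passed to kfree(), so the allocation
and release sides have to be paired:

	/* may be backed by either kmalloc() or vmalloc() */
	struct simple_xattr *new_xattr = kvmalloc(len, GFP_KERNEL);

	if (!new_xattr)
		return NULL;
	...
	/* kvfree() handles both kmalloc()ed and vmalloc()ed memory */
	kvfree(new_xattr);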

Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net
Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc")
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Chris Down <chris@chrisdown.name>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[5.7]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/xattr.h |    3 ++-
 mm/shmem.c            |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/include/linux/xattr.h~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/include/linux/xattr.h
@@ -15,6 +15,7 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
+#include <linux/mm.h>
 #include <uapi/linux/xattr.h>
 
 struct inode;
@@ -94,7 +95,7 @@ static inline void simple_xattrs_free(st
 
 	list_for_each_entry_safe(xattr, node, &xattrs->head, list) {
 		kfree(xattr->name);
-		kfree(xattr);
+		kvfree(xattr);
 	}
 }
 
--- a/mm/shmem.c~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/mm/shmem.c
@@ -3178,7 +3178,7 @@ static int shmem_initxattrs(struct inode
 		new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
 					  GFP_KERNEL);
 		if (!new_xattr->name) {
-			kfree(new_xattr);
+			kvfree(new_xattr);
 			return -ENOMEM;
 		}
 
_

Patches currently in -mm which might be from cgxu519@mykernel.net are

vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
mm-shmem-fix-freeing-new_attr-in-shmem_initxattrs.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (21 preceding siblings ...)
  2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch " Andrew Morton
@ 2020-07-06 23:52 ` Andrew Morton
  2020-07-06 23:53 ` + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to " Andrew Morton
                   ` (209 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:52 UTC (permalink / raw)
  To: cl, iamjoonsoo.kim, lonuxli.64, mm-commits, penberg, rientjes, willy


The patch titled
     Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
has been removed from the -mm tree.  Its filename was
     mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Long Li <lonuxli.64@gmail.com>
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

kmalloc() cannot allocate memory from HIGHMEM.  Large allocation requests
currently bypass the GFP_SLAB_BUG_MASK check and simply leak the memory
when page_address() returns NULL.  To fix this, factor the
GFP_SLAB_BUG_MASK check out of slab & slub, and call it from
kmalloc_order() as well.  To keep the code clear, the warning message is
emitted in one place.

Link: http://lkml.kernel.org/r/20200701151645.GA26223@lilong
Signed-off-by: Long Li <lonuxli.64@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c        |   10 +++-------
 mm/slab.h        |    1 +
 mm/slab_common.c |   17 +++++++++++++++++
 mm/slub.c        |    9 ++-------
 4 files changed, 23 insertions(+), 14 deletions(-)

--- a/mm/slab.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.c
@@ -2589,13 +2589,9 @@ static struct page *cache_grow_begin(str
 	 * Be lazy and only check for valid flags here,  keeping it out of the
 	 * critical path in kmem_cache_alloc().
 	 */
-	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-		gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-		flags &= ~GFP_SLAB_BUG_MASK;
-		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
-				invalid_mask, &invalid_mask, flags, &flags);
-		dump_stack();
-	}
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_invalid_flags(flags);
+
 	WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
 	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
--- a/mm/slab_common.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab_common.c
@@ -26,6 +26,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/kmem.h>
 
+#include "internal.h"
+
 #include "slab.h"
 
 enum slab_state slab_state;
@@ -1311,6 +1313,18 @@ void __init create_kmalloc_caches(slab_f
 }
 #endif /* !CONFIG_SLOB */
 
+gfp_t kmalloc_invalid_flags(gfp_t flags)
+{
+	gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
+
+	flags &= ~GFP_SLAB_BUG_MASK;
+	pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
+			invalid_mask, &invalid_mask, flags, &flags);
+	dump_stack();
+
+	return flags;
+}
+
 /*
  * To avoid unnecessary overhead, we pass through large allocation requests
  * directly to the page allocator. We use __GFP_COMP, because we will need to
@@ -1321,6 +1335,9 @@ void *kmalloc_order(size_t size, gfp_t f
 	void *ret = NULL;
 	struct page *page;
 
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_invalid_flags(flags);
+
 	flags |= __GFP_COMP;
 	page = alloc_pages(flags, order);
 	if (likely(page)) {
--- a/mm/slab.h~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.h
@@ -152,6 +152,7 @@ void create_kmalloc_caches(slab_flags_t)
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
 #endif
 
+gfp_t kmalloc_invalid_flags(gfp_t flags);
 
 /* Functions provided by the slab allocators */
 int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
--- a/mm/slub.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slub.c
@@ -1745,13 +1745,8 @@ out:
 
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
-	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-		gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-		flags &= ~GFP_SLAB_BUG_MASK;
-		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
-				invalid_mask, &invalid_mask, flags, &flags);
-		dump_stack();
-	}
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_invalid_flags(flags);
 
 	return allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_

Patches currently in -mm which might be from lonuxli.64@gmail.com are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (22 preceding siblings ...)
  2020-07-06 23:52 ` [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from " Andrew Morton
@ 2020-07-06 23:53 ` Andrew Morton
  2020-07-07  1:53 ` mmotm 2020-07-06-18-53 uploaded Andrew Morton
                   ` (208 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:53 UTC (permalink / raw)
  To: cl, iamjoonsoo.kim, lonuxli.64, mm-commits, penberg, rientjes, willy


The patch titled
     Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
has been added to the -mm tree.  Its filename is
     mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Long Li <lonuxli.64@gmail.com>
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

kmalloc() cannot allocate memory from HIGHMEM.  Large allocation requests
currently bypass the GFP_SLAB_BUG_MASK check and simply leak the memory
when page_address() returns NULL.  To fix this, factor the
GFP_SLAB_BUG_MASK check out of slab & slub, and call it from
kmalloc_order() as well.  To keep the code clear, the warning message is
emitted in one place.
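
As a hypothetical illustration of what the check catches (__GFP_HIGHMEM
is part of GFP_SLAB_BUG_MASK): a large request like the one below now
triggers the warning in kmalloc_order() and has the bogus flag stripped,
instead of silently leaking the pages when page_address() returns NULL:

	/* illustrative caller: a 64-page request bypasses the kmalloc
	 * caches and goes straight to kmalloc_order() */
	void *buf = kmalloc(64 * PAGE_SIZE, GFP_KERNEL | __GFP_HIGHMEM);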

Link: http://lkml.kernel.org/r/20200704035027.GA62481@lilong
Signed-off-by: Long Li <lonuxli.64@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c        |   10 +++-------
 mm/slab.h        |    1 +
 mm/slab_common.c |   17 +++++++++++++++++
 mm/slub.c        |    9 ++-------
 4 files changed, 23 insertions(+), 14 deletions(-)

--- a/mm/slab.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.c
@@ -2589,13 +2589,9 @@ static struct page *cache_grow_begin(str
 	 * Be lazy and only check for valid flags here,  keeping it out of the
 	 * critical path in kmem_cache_alloc().
 	 */
-	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-		gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-		flags &= ~GFP_SLAB_BUG_MASK;
-		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
-				invalid_mask, &invalid_mask, flags, &flags);
-		dump_stack();
-	}
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_fix_flags(flags);
+
 	WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
 	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
--- a/mm/slab_common.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab_common.c
@@ -26,6 +26,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/kmem.h>
 
+#include "internal.h"
+
 #include "slab.h"
 
 enum slab_state slab_state;
@@ -1311,6 +1313,18 @@ void __init create_kmalloc_caches(slab_f
 }
 #endif /* !CONFIG_SLOB */
 
+gfp_t kmalloc_fix_flags(gfp_t flags)
+{
+	gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
+
+	flags &= ~GFP_SLAB_BUG_MASK;
+	pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
+			invalid_mask, &invalid_mask, flags, &flags);
+	dump_stack();
+
+	return flags;
+}
+
 /*
  * To avoid unnecessary overhead, we pass through large allocation requests
  * directly to the page allocator. We use __GFP_COMP, because we will need to
@@ -1321,6 +1335,9 @@ void *kmalloc_order(size_t size, gfp_t f
 	void *ret = NULL;
 	struct page *page;
 
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_fix_flags(flags);
+
 	flags |= __GFP_COMP;
 	page = alloc_pages(flags, order);
 	if (likely(page)) {
--- a/mm/slab.h~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.h
@@ -152,6 +152,7 @@ void create_kmalloc_caches(slab_flags_t)
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
 #endif
 
+gfp_t kmalloc_fix_flags(gfp_t flags);
 
 /* Functions provided by the slab allocators */
 int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
--- a/mm/slub.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slub.c
@@ -1745,13 +1745,8 @@ out:
 
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
-	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-		gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-		flags &= ~GFP_SLAB_BUG_MASK;
-		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
-				invalid_mask, &invalid_mask, flags, &flags);
-		dump_stack();
-	}
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_fix_flags(flags);
 
 	return allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_

Patches currently in -mm which might be from lonuxli.64@gmail.com are

mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-06-18-53 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (23 preceding siblings ...)
  2020-07-06 23:53 ` + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to " Andrew Morton
@ 2020-07-07  1:53 ` Andrew Morton
  2020-07-07 19:17 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree Andrew Morton
                   ` (207 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07  1:53 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-06-18-53 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random intervals,
hopefully more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of each mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it is constantly rebased.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* checkpatch-test-git_dir-changes.patch
* kthread-work-could-not-be-queued-when-worker-being-destroyed.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-page_isolation-prefer-the-node-of-the-source-page.patch
* mm-migrate-move-migration-helper-from-h-to-c.patch
* mm-hugetlb-unify-migration-callbacks.patch
* mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
* mm-migrate-make-a-standard-migration-target-allocation-function.patch
* mm-gup-use-a-standard-migration-target-allocation-callback.patch
* mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
* mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* umh-fix-refcount-underflow-in-fork_usermode_blob.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-remove-call-to-memset-after-dma_alloc_coherent.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (24 preceding siblings ...)
  2020-07-07  1:53 ` mmotm 2020-07-06-18-53 uploaded Andrew Morton
@ 2020-07-07 19:17 ` Andrew Morton
  2020-07-07 19:20 ` + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch " Andrew Morton
                   ` (206 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:17 UTC (permalink / raw)
  To: catalin.marinas, hannes, hdanton, hughd, josef, kirill.shutemov,
	mm-commits, will.deacon, willy, xuyu, yang.shi


The patch titled
     Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault
has been added to the -mm tree.  Its filename is
     mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yang Shi <yang.shi@linux.alibaba.com>
Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault

Recently we found a regression when running the will_it_scale/page_fault3
test on ARM64: over 70% down for the multi-process cases and over 20%
down for the multi-thread cases.  It turns out the regression is caused
by commit 89b15332af7c0 ("mm: drop mmap_sem before calling
balance_dirty_pages() in write fault").

The test mmaps a file the size of memory and then writes to the mapping.
This dirties all of the memory and triggers dirty page throttling, so
that upstream commit releases mmap_sem and then retries the page fault.
The retried page fault sees the correct PTEs installed by the first try,
then updates the access flags and flushes TLBs.  The regression is caused
by the excessive TLB flushes.  x86 is fine since it doesn't need to flush
the TLB for an access flag update.

The page fault would be retried due to:
1. Waiting for page readahead
2. Waiting for the page to be swapped in
3. Waiting for dirty page throttling

The first two cases don't have PTEs set up at all, so the retried page
fault installs the PTEs and never reaches the access flag update.  But
case #3 usually has PTEs installed, so the retried page fault does reach
the access flag update.  Updating the access flags for #3 seems
unnecessary, since a retried page fault is not a real "second access",
so it is safe to skip the access flag update for retried page faults
(see the sketch below).
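
As an illustrative sketch (paraphrasing typical arch fault handlers, not
part of this patch), the retry path is what sets FAULT_FLAG_TRIED in the
first place:

	/* sketch of a typical arch do_page_fault() retry loop */
	fault = handle_mm_fault(vma, address, flags);
	if (fault & VM_FAULT_RETRY) {
		/* mmap_sem was dropped; mark the second attempt */
		flags |= FAULT_FLAG_TRIED;
		goto retry;
	}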

With this fix the test results get back to normal.

Link: http://lkml.kernel.org/r/1594148072-91273-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Xu Yu <xuyu@linux.alibaba.com>
Debugged-by: Xu Yu <xuyu@linux.alibaba.com>
Tested-by: Xu Yu <xuyu@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/memory.c~mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault
+++ a/mm/memory.c
@@ -4241,8 +4241,13 @@ static vm_fault_t handle_pte_fault(struc
 	if (vmf->flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry))
 			return do_wp_page(vmf);
-		entry = pte_mkdirty(entry);
 	}
+
+	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vmf->flags & FAULT_FLAG_TRIED))
+		entry = pte_mkdirty(entry); 
+	else if (vmf->flags & FAULT_FLAG_TRIED)
+		goto unlock;
+
 	entry = pte_mkyoung(entry);
 	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
 				vmf->flags & FAULT_FLAG_WRITE)) {
_

Patches currently in -mm which might be from yang.shi@linux.alibaba.com are

mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
mm-filemap-clear-idle-flag-for-writes.patch
mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
mm-thp-remove-debug_cow-switch.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (25 preceding siblings ...)
  2020-07-07 19:17 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
  2020-07-07 19:20 ` + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch " Andrew Morton
                   ` (205 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
  To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
	rientjes, shakeelb, vbabka


The patch titled
     Subject: mm: memcg/slab: remove unused argument by charge_slab_page()
has been added to the -mm tree.  Its filename is
     mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: memcg/slab: remove unused argument by charge_slab_page()

charge_slab_page() no longer uses the gfp argument, so
remove it.

Link: http://lkml.kernel.org/r/20200707173612.124425-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c |    2 +-
 mm/slab.h |    3 +--
 mm/slub.c |    2 +-
 3 files changed, 3 insertions(+), 4 deletions(-)

--- a/mm/slab.c~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slab.c
@@ -1379,7 +1379,7 @@ static struct page *kmem_getpages(struct
 		return NULL;
 	}
 
-	charge_slab_page(page, flags, cachep->gfporder, cachep);
+	charge_slab_page(page, cachep->gfporder, cachep);
 	__SetPageSlab(page);
 	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
 	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
--- a/mm/slab.h~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slab.h
@@ -440,8 +440,7 @@ static inline struct kmem_cache *virt_to
 	return page->slab_cache;
 }
 
-static __always_inline void charge_slab_page(struct page *page,
-					     gfp_t gfp, int order,
+static __always_inline void charge_slab_page(struct page *page, int order,
 					     struct kmem_cache *s)
 {
 	mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
--- a/mm/slub.c~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slub.c
@@ -1621,7 +1621,7 @@ static inline struct page *alloc_slab_pa
 		page = __alloc_pages_node(node, flags, order);
 
 	if (page)
-		charge_slab_page(page, flags, order, s);
+		charge_slab_page(page, order, s);
 
 	return page;
 }
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (26 preceding siblings ...)
  2020-07-07 19:20 ` + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch " Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
  2020-07-07 19:20 ` + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch " Andrew Morton
                   ` (204 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
  To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
	rientjes, shakeelb, vbabka


The patch titled
     Subject: mm: slab: rename (un)charge_slab_page() to (un)account_slab_page()
has been added to the -mm tree.  Its filename is
     mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: slab: rename (un)charge_slab_page() to (un)account_slab_page()

charge_slab_page() and uncharge_slab_page() are no longer related to
memcg charging and uncharging.  In order to make their names less
confusing, let's rename them to account_slab_page() and
unaccount_slab_page() respectively.

Link: http://lkml.kernel.org/r/20200707173612.124425-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c |    4 ++--
 mm/slab.h |    8 ++++----
 mm/slub.c |    4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

--- a/mm/slab.c~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slab.c
@@ -1379,7 +1379,7 @@ static struct page *kmem_getpages(struct
 		return NULL;
 	}
 
-	charge_slab_page(page, cachep->gfporder, cachep);
+	account_slab_page(page, cachep->gfporder, cachep);
 	__SetPageSlab(page);
 	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
 	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
@@ -1403,7 +1403,7 @@ static void kmem_freepages(struct kmem_c
 
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
-	uncharge_slab_page(page, order, cachep);
+	unaccount_slab_page(page, order, cachep);
 	__free_pages(page, order);
 }
 
--- a/mm/slab.h~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slab.h
@@ -440,15 +440,15 @@ static inline struct kmem_cache *virt_to
 	return page->slab_cache;
 }
 
-static __always_inline void charge_slab_page(struct page *page, int order,
-					     struct kmem_cache *s)
+static __always_inline void account_slab_page(struct page *page, int order,
+					      struct kmem_cache *s)
 {
 	mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
 			    PAGE_SIZE << order);
 }
 
-static __always_inline void uncharge_slab_page(struct page *page, int order,
-					       struct kmem_cache *s)
+static __always_inline void unaccount_slab_page(struct page *page, int order,
+						struct kmem_cache *s)
 {
 	if (memcg_kmem_enabled())
 		memcg_free_page_obj_cgroups(page);
--- a/mm/slub.c~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slub.c
@@ -1621,7 +1621,7 @@ static inline struct page *alloc_slab_pa
 		page = __alloc_pages_node(node, flags, order);
 
 	if (page)
-		charge_slab_page(page, order, s);
+		account_slab_page(page, order, s);
 
 	return page;
 }
@@ -1844,7 +1844,7 @@ static void __free_slab(struct kmem_cach
 	page->mapping = NULL;
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += pages;
-	uncharge_slab_page(page, order, s);
+	unaccount_slab_page(page, order, s);
 	__free_pages(page, order);
 }
 
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (27 preceding siblings ...)
  2020-07-07 19:20 ` + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch " Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
                   ` (203 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
  To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
	rientjes, shakeelb, vbabka


The patch titled
     Subject: mm: kmem: switch to static_branch_likely() in memcg_kmem_enabled()
has been added to the -mm tree.  Its filename is
     mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: kmem: switch to static_branch_likely() in memcg_kmem_enabled()

Currently memcg_kmem_enabled() is optimized for the case where kernel
memory accounting is off.  It was that way for a long time, arguably
because kernel memory accounting was initially an opt-in feature.
However, it is now on by default on both cgroup v1 and cgroup v2, and
it's on for all cgroups.  So let's switch over to
static_branch_likely() to reflect this fact.

It is unlikely that there is a significant performance difference, as the
cost of a memory allocation and its accounting significantly exceeds the
cost of a jump.  However, the conversion makes the code more logical.
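
For illustration (a hedged sketch, not from this patch): a static key
compiles to a patchable jump, and the likely/unlikely variant only
chooses which case becomes the inline fall-through path:

	#include <linux/jump_label.h>

	static DEFINE_STATIC_KEY_FALSE(my_key);	/* hypothetical key */

	/*
	 * With static_branch_likely() the enabled case is the
	 * straight-line path and the disabled case jumps out of line;
	 * with static_branch_unlikely() it is the reverse.
	 */
	if (static_branch_likely(&my_key))
		do_accounting();	/* hypothetical helper */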

Link: http://lkml.kernel.org/r/20200707173612.124425-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/memcontrol.h~mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled
+++ a/include/linux/memcontrol.h
@@ -1456,7 +1456,7 @@ void memcg_put_cache_ids(void);
 
 static inline bool memcg_kmem_enabled(void)
 {
-	return static_branch_unlikely(&memcg_kmem_enabled_key);
+	return static_branch_likely(&memcg_kmem_enabled_key);
 }
 
 static inline bool memcg_kmem_bypass(void)
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-check-return-value-of-sb_getblk.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (28 preceding siblings ...)
  2020-07-07 19:20 ` + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
                   ` (202 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: check return value of sb_getblk()
has been added to the -mm tree.  Its filename is
     fs-minix-check-return-value-of-sb_getblk.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-check-return-value-of-sb_getblk.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-check-return-value-of-sb_getblk.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: check return value of sb_getblk()

Patch series "fs/minix: fix syzbot bugs and set s_maxbytes".

This series fixes all syzbot bugs in the minix filesystem:

	KASAN: null-ptr-deref Write in get_block
	KASAN: use-after-free Write in get_block
	KASAN: use-after-free Read in get_block
	WARNING in inc_nlink
	KMSAN: uninit-value in get_block
	WARNING in drop_nlink

It also fixes the minix filesystem to set s_maxbytes correctly, so that
userspace sees the correct behavior when exceeding the max file size.


This patch (of 6):

sb_getblk() can fail, so check its return value.

This fixes a NULL pointer dereference.

Originally from Qiujun Huang.
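
A minimal sketch of the required pattern (illustrative only; the patch's
actual error path also frees the block it just allocated):

	struct buffer_head *bh;

	bh = sb_getblk(inode->i_sb, block);
	if (!bh)
		return -ENOMEM;	/* buffer head allocation failed */
	/* ... use bh ... */
	brelse(bh);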

Link: http://lkml.kernel.org/r/20200628060846.682158-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200628060846.682158-2-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: syzbot+4a88b2b9dc280f47baf4@syzkaller.appspotmail.com
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/itree_common.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/fs/minix/itree_common.c~fs-minix-check-return-value-of-sb_getblk
+++ a/fs/minix/itree_common.c
@@ -75,6 +75,7 @@ static int alloc_branch(struct inode *in
 	int n = 0;
 	int i;
 	int parent = minix_new_block(inode);
+	int err = -ENOSPC;
 
 	branch[0].key = cpu_to_block(parent);
 	if (parent) for (n = 1; n < num; n++) {
@@ -85,6 +86,11 @@ static int alloc_branch(struct inode *in
 			break;
 		branch[n].key = cpu_to_block(nr);
 		bh = sb_getblk(inode->i_sb, parent);
+		if (!bh) {
+			minix_free_block(inode, nr);
+			err = -ENOMEM;
+			break;
+		}
 		lock_buffer(bh);
 		memset(bh->b_data, 0, bh->b_size);
 		branch[n].bh = bh;
@@ -103,7 +109,7 @@ static int alloc_branch(struct inode *in
 		bforget(branch[i].bh);
 	for (i = 0; i < n; i++)
 		minix_free_block(inode, block_to_cpu(branch[i].key));
-	return -ENOSPC;
+	return err;
 }
 
 static inline int splice_branch(struct inode *inode,
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-dont-allow-getting-deleted-inodes.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (29 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
                   ` (201 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: don't allow getting deleted inodes
has been added to the -mm tree.  Its filename is
     fs-minix-dont-allow-getting-deleted-inodes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: don't allow getting deleted inodes

If an inode has no links, we need to mark it bad rather than allowing it
to be accessed.  This avoids WARNINGs in inc_nlink() and drop_nlink() when
doing directory operations on a fuzzed filesystem.

Link: http://lkml.kernel.org/r/20200628060846.682158-3-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+a9ac3de1b5de5fb10efc@syzkaller.appspotmail.com
Reported-by: syzbot+df958cf5688a96ad3287@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/inode.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/fs/minix/inode.c~fs-minix-dont-allow-getting-deleted-inodes
+++ a/fs/minix/inode.c
@@ -468,6 +468,13 @@ static struct inode *V1_minix_iget(struc
 		iget_failed(inode);
 		return ERR_PTR(-EIO);
 	}
+	if (raw_inode->i_nlinks == 0) {
+		printk("MINIX-fs: deleted inode referenced: %lu\n",
+		       inode->i_ino);
+		brelse(bh);
+		iget_failed(inode);
+		return ERR_PTR(-ESTALE);
+	}
 	inode->i_mode = raw_inode->i_mode;
 	i_uid_write(inode, raw_inode->i_uid);
 	i_gid_write(inode, raw_inode->i_gid);
@@ -501,6 +508,13 @@ static struct inode *V2_minix_iget(struc
 		iget_failed(inode);
 		return ERR_PTR(-EIO);
 	}
+	if (raw_inode->i_nlinks == 0) {
+		printk("MINIX-fs: deleted inode referenced: %lu\n",
+		       inode->i_ino);
+		brelse(bh);
+		iget_failed(inode);
+		return ERR_PTR(-ESTALE);
+	}
 	inode->i_mode = raw_inode->i_mode;
 	i_uid_write(inode, raw_inode->i_uid);
 	i_gid_write(inode, raw_inode->i_gid);
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-reject-too-large-maximum-file-size.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (30 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-set-s_maxbytes-correctly.patch " Andrew Morton
                   ` (200 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: reject too-large maximum file size
has been added to the -mm tree.  Its filename is
     fs-minix-reject-too-large-maximum-file-size.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-reject-too-large-maximum-file-size.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-reject-too-large-maximum-file-size.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: reject too-large maximum file size

If the minix filesystem tries to map a very large logical block number to
its on-disk location, block_to_path() can return offsets that are too
large, causing out-of-bounds memory accesses when accessing indirect index
blocks.  This should be prevented by the check against the maximum file
size, but this doesn't work because the maximum file size is read directly
from the on-disk superblock and isn't validated itself.

Fix this by validating the maximum file size at mount time.
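
For reference (derived from the V1 on-disk format, not spelled out in
the patch): V1 uses 1024-byte blocks and 16-bit zone pointers, so one
indirect block holds 512 pointers, and the limit enforced below is

	(7 direct + 512 indirect + 512*512 double-indirect) blocks
		= 262663 blocks * 1024 bytes/block, roughly 256 MiB

which is what the (7 + 512 + 512*512) * BLOCK_SIZE expression encodes.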

Link: http://lkml.kernel.org/r/20200628060846.682158-4-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+c7d9ec7a1a7272dd71b3@syzkaller.appspotmail.com
Reported-by: syzbot+3b7b03a0c28948054fb5@syzkaller.appspotmail.com
Reported-by: syzbot+6e056ee473568865f3e6@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/inode.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

--- a/fs/minix/inode.c~fs-minix-reject-too-large-maximum-file-size
+++ a/fs/minix/inode.c
@@ -150,6 +150,23 @@ static int minix_remount (struct super_b
 	return 0;
 }
 
+static bool minix_check_superblock(struct minix_sb_info *sbi)
+{
+	if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
+		return false;
+
+	/*
+	 * s_max_size must not exceed the block mapping limitation.  This check
+	 * is only needed for V1 filesystems, since V2/V3 support an extra level
+	 * of indirect blocks which places the limit well above U32_MAX.
+	 */
+	if (sbi->s_version == MINIX_V1 &&
+	    sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+		return false;
+
+	return true;
+}
+
 static int minix_fill_super(struct super_block *s, void *data, int silent)
 {
 	struct buffer_head *bh;
@@ -228,11 +245,12 @@ static int minix_fill_super(struct super
 	} else
 		goto out_no_fs;
 
+	if (!minix_check_superblock(sbi))
+		goto out_illegal_sb;
+
 	/*
 	 * Allocate the buffer map to keep the superblock small.
 	 */
-	if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
-		goto out_illegal_sb;
 	i = (sbi->s_imap_blocks + sbi->s_zmap_blocks) * sizeof(bh);
 	map = kzalloc(i, GFP_KERNEL);
 	if (!map)
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-set-s_maxbytes-correctly.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (31 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-fix-block-limit-check-for-v1-filesystems.patch " Andrew Morton
                   ` (199 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, viro


The patch titled
     Subject: fs/minix: set s_maxbytes correctly
has been added to the -mm tree.  Its filename is
     fs-minix-set-s_maxbytes-correctly.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-set-s_maxbytes-correctly.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-set-s_maxbytes-correctly.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: set s_maxbytes correctly

The minix filesystem leaves super_block::s_maxbytes at MAX_NON_LFS rather
than setting it to the actual filesystem-specific limit.  This is broken
because it means userspace doesn't see the standard behavior, such as
getting EFBIG and SIGXFSZ, when exceeding the maximum file size.

Fix this by setting s_maxbytes correctly.
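
For context, s_maxbytes is what the generic VFS write path consults to
raise EFBIG; a simplified, hedged paraphrase of generic_write_checks()
(not code from this patch):

	if (iocb->ki_pos >= inode->i_sb->s_maxbytes)
		return -EFBIG;

With s_maxbytes left at MAX_NON_LFS, this check fires at the wrong
offset instead of at the filesystem's real limit.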

Link: http://lkml.kernel.org/r/20200628060846.682158-5-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/inode.c    |   12 +++++++-----
 fs/minix/itree_v1.c |    2 +-
 fs/minix/itree_v2.c |    3 +--
 fs/minix/minix.h    |    1 -
 4 files changed, 9 insertions(+), 9 deletions(-)

--- a/fs/minix/inode.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/inode.c
@@ -150,8 +150,10 @@ static int minix_remount (struct super_b
 	return 0;
 }
 
-static bool minix_check_superblock(struct minix_sb_info *sbi)
+static bool minix_check_superblock(struct super_block *sb)
 {
+	struct minix_sb_info *sbi = minix_sb(sb);
+
 	if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
 		return false;
 
@@ -161,7 +163,7 @@ static bool minix_check_superblock(struc
 	 * of indirect blocks which places the limit well above U32_MAX.
 	 */
 	if (sbi->s_version == MINIX_V1 &&
-	    sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+	    sb->s_maxbytes > (7 + 512 + 512*512) * BLOCK_SIZE)
 		return false;
 
 	return true;
@@ -202,7 +204,7 @@ static int minix_fill_super(struct super
 	sbi->s_zmap_blocks = ms->s_zmap_blocks;
 	sbi->s_firstdatazone = ms->s_firstdatazone;
 	sbi->s_log_zone_size = ms->s_log_zone_size;
-	sbi->s_max_size = ms->s_max_size;
+	s->s_maxbytes = ms->s_max_size;
 	s->s_magic = ms->s_magic;
 	if (s->s_magic == MINIX_SUPER_MAGIC) {
 		sbi->s_version = MINIX_V1;
@@ -233,7 +235,7 @@ static int minix_fill_super(struct super
 		sbi->s_zmap_blocks = m3s->s_zmap_blocks;
 		sbi->s_firstdatazone = m3s->s_firstdatazone;
 		sbi->s_log_zone_size = m3s->s_log_zone_size;
-		sbi->s_max_size = m3s->s_max_size;
+		s->s_maxbytes = m3s->s_max_size;
 		sbi->s_ninodes = m3s->s_ninodes;
 		sbi->s_nzones = m3s->s_zones;
 		sbi->s_dirsize = 64;
@@ -245,7 +247,7 @@ static int minix_fill_super(struct super
 	} else
 		goto out_no_fs;
 
-	if (!minix_check_superblock(sbi))
+	if (!minix_check_superblock(s))
 		goto out_illegal_sb;
 
 	/*
--- a/fs/minix/itree_v1.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode *
 	if (block < 0) {
 		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
 			block, inode->i_sb->s_bdev);
-	} else if (block >= (minix_sb(inode->i_sb)->s_max_size/BLOCK_SIZE)) {
+	} else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
 		if (printk_ratelimit())
 			printk("MINIX-fs: block_to_path: "
 			       "block %ld too big on dev %pg\n",
--- a/fs/minix/itree_v2.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/itree_v2.c
@@ -32,8 +32,7 @@ static int block_to_path(struct inode *
 	if (block < 0) {
 		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
 			block, sb->s_bdev);
-	} else if ((u64)block * (u64)sb->s_blocksize >=
-			minix_sb(sb)->s_max_size) {
+	} else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
 		if (printk_ratelimit())
 			printk("MINIX-fs: block_to_path: "
 			       "block %ld too big on dev %pg\n",
--- a/fs/minix/minix.h~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/minix.h
@@ -32,7 +32,6 @@ struct minix_sb_info {
 	unsigned long s_zmap_blocks;
 	unsigned long s_firstdatazone;
 	unsigned long s_log_zone_size;
-	unsigned long s_max_size;
 	int s_dirsize;
 	int s_namelen;
 	struct buffer_head ** s_imap;
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-fix-block-limit-check-for-v1-filesystems.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (32 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-set-s_maxbytes-correctly.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-remove-expected-error-message-in-block_to_path.patch " Andrew Morton
                   ` (198 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, viro


The patch titled
     Subject: fs/minix: fix block limit check for V1 filesystems
has been added to the -mm tree.  Its filename is
     fs-minix-fix-block-limit-check-for-v1-filesystems.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-fix-block-limit-check-for-v1-filesystems.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-fix-block-limit-check-for-v1-filesystems.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: fix block limit check for V1 filesystems

The minix filesystem reads its maximum file size from its on-disk
superblock.  This value isn't necessarily a multiple of the block size. 
When it's not, the V1 block mapping code doesn't allow mapping the last
possible block.  Commit 6ed6a722f9ab ("minixfs: fix block limit check")
fixed this in the V2 mapping code.  Fix it in the V1 mapping code too.

Link: http://lkml.kernel.org/r/20200628060846.682158-6-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/itree_v1.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/minix/itree_v1.c~fs-minix-fix-block-limit-check-for-v1-filesystems
+++ a/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode *
 	if (block < 0) {
 		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
 			block, inode->i_sb->s_bdev);
-	} else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
+	} else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
 		if (printk_ratelimit())
 			printk("MINIX-fs: block_to_path: "
 			       "block %ld too big on dev %pg\n",
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-minix-remove-expected-error-message-in-block_to_path.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (33 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-fix-block-limit-check-for-v1-filesystems.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:27 ` + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch " Andrew Morton
                   ` (197 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
  To: anenbupt, ebiggers, mm-commits, viro


The patch titled
     Subject: fs/minix: remove expected error message in block_to_path()
has been added to the -mm tree.  Its filename is
     fs-minix-remove-expected-error-message-in-block_to_path.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-remove-expected-error-message-in-block_to_path.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-remove-expected-error-message-in-block_to_path.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: remove expected error message in block_to_path()

When truncating a file to a size within the last allowed logical block,
block_to_path() is called with the *next* block.  This exceeds the limit,
causing the "block %ld too big" error message to be printed.

This case isn't actually an error; there are just no more blocks past that
point.  So, remove this error message.

Link: http://lkml.kernel.org/r/20200628060846.682158-7-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/itree_v1.c |   12 ++++++------
 fs/minix/itree_v2.c |   12 ++++++------
 2 files changed, 12 insertions(+), 12 deletions(-)

--- a/fs/minix/itree_v1.c~fs-minix-remove-expected-error-message-in-block_to_path
+++ a/fs/minix/itree_v1.c
@@ -29,12 +29,12 @@ static int block_to_path(struct inode *
 	if (block < 0) {
 		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
 			block, inode->i_sb->s_bdev);
-	} else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
-		if (printk_ratelimit())
-			printk("MINIX-fs: block_to_path: "
-			       "block %ld too big on dev %pg\n",
-				block, inode->i_sb->s_bdev);
-	} else if (block < 7) {
+		return 0;
+	}
+	if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes)
+		return 0;
+
+	if (block < 7) {
 		offsets[n++] = block;
 	} else if ((block -= 7) < 512) {
 		offsets[n++] = 7;
--- a/fs/minix/itree_v2.c~fs-minix-remove-expected-error-message-in-block_to_path
+++ a/fs/minix/itree_v2.c
@@ -32,12 +32,12 @@ static int block_to_path(struct inode *
 	if (block < 0) {
 		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
 			block, sb->s_bdev);
-	} else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
-		if (printk_ratelimit())
-			printk("MINIX-fs: block_to_path: "
-			       "block %ld too big on dev %pg\n",
-				block, sb->s_bdev);
-	} else if (block < DIRCOUNT) {
+		return 0;
+	}
+	if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes)
+		return 0;
+
+	if (block < DIRCOUNT) {
 		offsets[n++] = block;
 	} else if ((block -= DIRCOUNT) < INDIRCOUNT(sb)) {
 		offsets[n++] = DIRCOUNT;
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (34 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-remove-expected-error-message-in-block_to_path.patch " Andrew Morton
@ 2020-07-07 19:27 ` Andrew Morton
  2020-07-07 19:28 ` + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch " Andrew Morton
                   ` (196 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:27 UTC (permalink / raw)
  To: david, jroedel, mm-commits, rppt


The patch titled
     Subject: mm: vmalloc: remove redundant assignment in unmap_kernel_range_noflush()
has been added to the -mm tree.  Its filename is
     mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mm: vmalloc: remove redundant assignment in unmap_kernel_range_noflush()

'addr' is set to 'start' and then a few lines afterwards 'start' is set to
'addr'.  Remove the second assignment.

Link: http://lkml.kernel.org/r/20200707163226.374685-1-rppt@kernel.org
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/vmalloc.c~mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush
+++ a/mm/vmalloc.c
@@ -175,7 +175,6 @@ void unmap_kernel_range_noflush(unsigned
 	pgtbl_mod_mask mask = 0;
 
 	BUG_ON(addr >= end);
-	start = addr;
 	pgd = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
_

Patches currently in -mm which might be from rppt@linux.ibm.com are

mm-remove-unneeded-includes-of-asm-pgalloch.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (35 preceding siblings ...)
  2020-07-07 19:27 ` + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch " Andrew Morton
@ 2020-07-07 19:28 ` Andrew Morton
  2020-07-07 19:36 ` + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch " Andrew Morton
                   ` (195 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:28 UTC (permalink / raw)
  To: bigeasy, colin.king, davem, ddstreet, herbert, lgoncalv,
	mahipalreddy2006, mm-commits, sjenning, song.bao.hua,
	vitaly.wool, wangzhou1


The patch titled
     Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
has been added to the -mm tree.  Its filename is
     mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration

Right now, all new ZIP drivers use the crypto_acomp APIs rather than the
legacy crypto_comp APIs, but zswap.c is still using the old APIs.  That
means zswap won't be able to use any new ZIP drivers in the kernel.

This patch moves zswap to the crypto_acomp APIs to fix the problem.  On
the other hand, traditional compressors like lz4 and lzo have been
wrapped into acomp via the scomp backend, so platforms without async
compressors can fall back to acomp via scomp.

zswap is probably the first real user of acomp, but perhaps not a good
example of how multiple acomp requests can be executed in parallel in one
acomp instance: frontswap loads and stores pages one by one and has no
queuing or buffering mechanism that would let a single thread submit
multiple pages to frontswap simultaneously.  However, this patch creates
multiple acomp instances, so multiple threads running on different CPUs
can do (de)compression in parallel, leveraging the power of multiple ZIP
hardware queues.  This is also consistent with frontswap's page
management model.

On the other hand, the current zswap implementation has per-cpu global
resources such as zswap_dstmem, so we create one acomp instance per CPU,
just as zswap previously created one comp instance per CPU.
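
For orientation, the calling convention adopted above boils down to this
minimal sketch (acomp_compress_example() and its buffers are hypothetical;
only the crypto_acomp and scatterlist calls are real kernel API, and the
per-CPU context caching zswap does is omitted):

#include <crypto/acompress.h>
#include <linux/scatterlist.h>
#include <linux/err.h>
#include <linux/mm.h>

/* Compress one page-sized buffer; *dlen holds the dst capacity on entry
 * and the compressed length on success. */
static int acomp_compress_example(const char *alg, void *src, void *dst,
				  unsigned int *dlen)
{
	struct crypto_acomp *acomp;
	struct acomp_req *req;
	struct crypto_wait wait;
	struct scatterlist input, output;
	int ret;

	acomp = crypto_alloc_acomp(alg, 0, 0);	/* async zip or scomp-wrapped */
	if (IS_ERR(acomp))
		return PTR_ERR(acomp);

	req = acomp_request_alloc(acomp);
	if (!req) {
		crypto_free_acomp(acomp);
		return -ENOMEM;
	}

	crypto_init_wait(&wait);
	/* An async backend completes via crypto_req_done(); a scomp backend
	 * runs inline, so the wait below never blocks for it. */
	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);

	sg_init_one(&input, src, PAGE_SIZE);
	sg_init_one(&output, dst, *dlen);
	acomp_request_set_params(req, &input, &output, PAGE_SIZE, *dlen);

	/* Submit asynchronously, then wait for completion synchronously. */
	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
	if (!ret)
		*dlen = req->dlen;

	acomp_request_free(req);
	crypto_free_acomp(acomp);
	return ret;
}

zswap amortizes the allocations by keeping one such context (transform,
request, wait and dstmem) per CPU, serialized by a per-context mutex.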

Link: http://lkml.kernel.org/r/20200707125210.33256-1-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mahipal Challa <mahipalreddy2006@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/zswap.c |  177 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 134 insertions(+), 43 deletions(-)

--- a/mm/zswap.c~mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration
+++ a/mm/zswap.c
@@ -24,8 +24,10 @@
 #include <linux/rbtree.h>
 #include <linux/swap.h>
 #include <linux/crypto.h>
+#include <linux/scatterlist.h>
 #include <linux/mempool.h>
 #include <linux/zpool.h>
+#include <crypto/acompress.h>
 
 #include <linux/mm_types.h>
 #include <linux/page-flags.h>
@@ -127,9 +129,17 @@ module_param_named(same_filled_pages_ena
 * data structures
 **********************************/
 
+struct crypto_acomp_ctx {
+	struct crypto_acomp *acomp;
+	struct acomp_req *req;
+	struct crypto_wait wait;
+	u8 *dstmem;
+	struct mutex mutex;
+};
+
 struct zswap_pool {
 	struct zpool *zpool;
-	struct crypto_comp * __percpu *tfm;
+	struct crypto_acomp_ctx * __percpu *acomp_ctx;
 	struct kref kref;
 	struct list_head list;
 	struct work_struct release_work;
@@ -415,30 +425,73 @@ static int zswap_dstmem_dead(unsigned in
 static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
-	struct crypto_comp *tfm;
+	struct crypto_acomp *acomp;
+	struct acomp_req *req;
+	struct crypto_acomp_ctx *acomp_ctx;
+	int ret;
 
-	if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu)))
+	if (WARN_ON(*per_cpu_ptr(pool->acomp_ctx, cpu)))
 		return 0;
 
-	tfm = crypto_alloc_comp(pool->tfm_name, 0, 0);
-	if (IS_ERR_OR_NULL(tfm)) {
-		pr_err("could not alloc crypto comp %s : %ld\n",
-		       pool->tfm_name, PTR_ERR(tfm));
+	acomp_ctx = kzalloc(sizeof(*acomp_ctx), GFP_KERNEL);
+	if (!acomp_ctx)
 		return -ENOMEM;
+
+	acomp = crypto_alloc_acomp(pool->tfm_name, 0, 0);
+	if (IS_ERR(acomp)) {
+		pr_err("could not alloc crypto acomp %s : %ld\n",
+				pool->tfm_name, PTR_ERR(acomp));
+		ret = PTR_ERR(acomp);
+		goto free_ctx;
+	}
+	acomp_ctx->acomp = acomp;
+
+	req = acomp_request_alloc(acomp_ctx->acomp);
+	if (!req) {
+		pr_err("could not alloc crypto acomp_request %s\n",
+		       pool->tfm_name);
+		ret = -ENOMEM;
+		goto free_acomp;
 	}
-	*per_cpu_ptr(pool->tfm, cpu) = tfm;
+	acomp_ctx->req = req;
+
+	mutex_init(&acomp_ctx->mutex);
+	crypto_init_wait(&acomp_ctx->wait);
+	/*
+	 * If the backend of acomp is an async zip driver, crypto_req_done()
+	 * will wake up crypto_wait_req(); if the backend is scomp, the
+	 * callback won't be called and crypto_wait_req() returns immediately.
+	 */
+	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+				   crypto_req_done, &acomp_ctx->wait);
+
+	acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
+	*per_cpu_ptr(pool->acomp_ctx, cpu) = acomp_ctx;
+
 	return 0;
+
+free_acomp:
+	crypto_free_acomp(acomp_ctx->acomp);
+free_ctx:
+	kfree(acomp_ctx);
+	return ret;
 }
 
 static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
-	struct crypto_comp *tfm;
+	struct crypto_acomp_ctx *acomp_ctx;
+
+	acomp_ctx = *per_cpu_ptr(pool->acomp_ctx, cpu);
+	if (!IS_ERR_OR_NULL(acomp_ctx)) {
+		if (!IS_ERR_OR_NULL(acomp_ctx->req))
+			acomp_request_free(acomp_ctx->req);
+		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
+			crypto_free_acomp(acomp_ctx->acomp);
+		kfree(acomp_ctx);
+	}
+	*per_cpu_ptr(pool->acomp_ctx, cpu) = NULL;
 
-	tfm = *per_cpu_ptr(pool->tfm, cpu);
-	if (!IS_ERR_OR_NULL(tfm))
-		crypto_free_comp(tfm);
-	*per_cpu_ptr(pool->tfm, cpu) = NULL;
 	return 0;
 }
 
@@ -561,8 +614,9 @@ static struct zswap_pool *zswap_pool_cre
 	pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
 
 	strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
-	pool->tfm = alloc_percpu(struct crypto_comp *);
-	if (!pool->tfm) {
+
+	pool->acomp_ctx = alloc_percpu(struct crypto_acomp_ctx *);
+	if (!pool->acomp_ctx) {
 		pr_err("percpu alloc failed\n");
 		goto error;
 	}
@@ -585,7 +639,7 @@ static struct zswap_pool *zswap_pool_cre
 	return pool;
 
 error:
-	free_percpu(pool->tfm);
+	free_percpu(pool->acomp_ctx);
 	if (pool->zpool)
 		zpool_destroy_pool(pool->zpool);
 	kfree(pool);
@@ -596,14 +650,14 @@ static __init struct zswap_pool *__zswap
 {
 	bool has_comp, has_zpool;
 
-	has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+	has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
 	if (!has_comp && strcmp(zswap_compressor,
 				CONFIG_ZSWAP_COMPRESSOR_DEFAULT)) {
 		pr_err("compressor %s not available, using default %s\n",
 		       zswap_compressor, CONFIG_ZSWAP_COMPRESSOR_DEFAULT);
 		param_free_charp(&zswap_compressor);
 		zswap_compressor = CONFIG_ZSWAP_COMPRESSOR_DEFAULT;
-		has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+		has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
 	}
 	if (!has_comp) {
 		pr_err("default compressor %s not available\n",
@@ -639,7 +693,7 @@ static void zswap_pool_destroy(struct zs
 	zswap_pool_debug("destroying", pool);
 
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
-	free_percpu(pool->tfm);
+	free_percpu(pool->acomp_ctx);
 	zpool_destroy_pool(pool->zpool);
 	kfree(pool);
 }
@@ -723,7 +777,7 @@ static int __zswap_param_set(const char
 		}
 		type = s;
 	} else if (!compressor) {
-		if (!crypto_has_comp(s, 0, 0)) {
+		if (!crypto_has_acomp(s, 0, 0)) {
 			pr_err("compressor %s not available\n", s);
 			return -ENOENT;
 		}
@@ -774,7 +828,7 @@ static int __zswap_param_set(const char
 		 * failed, maybe both compressor and zpool params were bad.
 		 * Allow changing this param, so pool creation will succeed
 		 * when the other param is changed. We already verified this
-		 * param is ok in the zpool_has_pool() or crypto_has_comp()
+		 * param is ok in the zpool_has_pool() or crypto_has_acomp()
 		 * checks above.
 		 */
 		ret = param_set_charp(s, kp);
@@ -876,7 +930,9 @@ static int zswap_writeback_entry(struct
 	pgoff_t offset;
 	struct zswap_entry *entry;
 	struct page *page;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
+
 	u8 *src, *dst;
 	unsigned int dlen;
 	int ret;
@@ -916,14 +972,21 @@ static int zswap_writeback_entry(struct
 
 	case ZSWAP_SWAPCACHE_NEW: /* page is locked */
 		/* decompress */
+		acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
 		dlen = PAGE_SIZE;
 		src = (u8 *)zhdr + sizeof(struct zswap_header);
-		dst = kmap_atomic(page);
-		tfm = *get_cpu_ptr(entry->pool->tfm);
-		ret = crypto_comp_decompress(tfm, src, entry->length,
-					     dst, &dlen);
-		put_cpu_ptr(entry->pool->tfm);
-		kunmap_atomic(dst);
+		dst = kmap(page);
+
+		mutex_lock(&acomp_ctx->mutex);
+		sg_init_one(&input, src, entry->length);
+		sg_init_one(&output, dst, dlen);
+		acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+		ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+		dlen = acomp_ctx->req->dlen;
+		mutex_unlock(&acomp_ctx->mutex);
+
+		kunmap(page);
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
 
@@ -1004,7 +1067,8 @@ static int zswap_frontswap_store(unsigne
 {
 	struct zswap_tree *tree = zswap_trees[type];
 	struct zswap_entry *entry, *dupentry;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
 	int ret;
 	unsigned int hlen, dlen = PAGE_SIZE;
 	unsigned long handle, value;
@@ -1074,12 +1138,32 @@ static int zswap_frontswap_store(unsigne
 	}
 
 	/* compress */
-	dst = get_cpu_var(zswap_dstmem);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
-	src = kmap_atomic(page);
-	ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
-	kunmap_atomic(src);
-	put_cpu_ptr(entry->pool->tfm);
+	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
+	mutex_lock(&acomp_ctx->mutex);
+
+	src = kmap(page);
+	dst = acomp_ctx->dstmem;
+	sg_init_one(&input, src, PAGE_SIZE);
+	/* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
+	sg_init_one(&output, dst, PAGE_SIZE * 2);
+	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+	/*
+	 * It may look a little silly that we send an asynchronous request
+	 * and then wait for its completion synchronously; in effect the
+	 * whole operation is synchronous.
+	 * In theory, acomp lets users submit multiple requests to one acomp
+	 * instance and have them completed simultaneously.  But in this
+	 * case frontswap stores and loads pages one by one, and there is no
+	 * way for a single thread doing frontswap to send a second page
+	 * before the first one is done.
+	 * Different threads running on different CPUs do get different
+	 * acomp instances, though, so multiple threads can (de)compress in parallel.
+	 */
+	ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	dlen = acomp_ctx->req->dlen;
+	kunmap(page);
+
 	if (ret) {
 		ret = -EINVAL;
 		goto put_dstmem;
@@ -1103,7 +1187,7 @@ static int zswap_frontswap_store(unsigne
 	memcpy(buf, &zhdr, hlen);
 	memcpy(buf + hlen, dst, dlen);
 	zpool_unmap_handle(entry->pool->zpool, handle);
-	put_cpu_var(zswap_dstmem);
+	mutex_unlock(&acomp_ctx->mutex);
 
 	/* populate entry */
 	entry->offset = offset;
@@ -1131,7 +1215,7 @@ insert_entry:
 	return 0;
 
 put_dstmem:
-	put_cpu_var(zswap_dstmem);
+	mutex_unlock(&acomp_ctx->mutex);
 	zswap_pool_put(entry->pool);
 freepage:
 	zswap_entry_cache_free(entry);
@@ -1148,7 +1232,8 @@ static int zswap_frontswap_load(unsigned
 {
 	struct zswap_tree *tree = zswap_trees[type];
 	struct zswap_entry *entry;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src, *dst;
 	unsigned int dlen;
 	int ret;
@@ -1175,11 +1260,17 @@ static int zswap_frontswap_load(unsigned
 	src = zpool_map_handle(entry->pool->zpool, entry->handle, ZPOOL_MM_RO);
 	if (zpool_evictable(entry->pool->zpool))
 		src += sizeof(struct zswap_header);
-	dst = kmap_atomic(page);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
-	ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
-	put_cpu_ptr(entry->pool->tfm);
-	kunmap_atomic(dst);
+	dst = kmap(page);
+
+	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
+	sg_init_one(&input, src, entry->length);
+	sg_init_one(&output, dst, dlen);
+	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+	ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+	mutex_unlock(&acomp_ctx->mutex);
+
+	kunmap(page);
 	zpool_unmap_handle(entry->pool->zpool, entry->handle);
 	BUG_ON(ret);
 
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (36 preceding siblings ...)
  2020-07-07 19:28 ` + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch " Andrew Morton
@ 2020-07-07 19:36 ` Andrew Morton
  2020-07-07 19:37 ` + lib-test_lockupc-make-symbol-test_works-static.patch " Andrew Morton
                   ` (194 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:36 UTC (permalink / raw)
  To: mm-commits, pmladek, stamatis.iliass


The patch titled
     Subject: kthread: remove incorrect comment in kthread_create_on_cpu()
has been added to the -mm tree.  Its filename is
     kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Ilias Stamatis <stamatis.iliass@gmail.com>
Subject: kthread: remove incorrect comment in kthread_create_on_cpu()

Originally kthread_create_on_cpu() parked and woke up the new thread. 
However, since commit a65d40961dc7 ("kthread/smpboot: do not park in
kthread_create_on_cpu()") this is no longer the case.  This patch removes
the comment that has been left behind and is now incorrect / stale.

Link: http://lkml.kernel.org/r/20200611135920.240551-1-stamatis.iliass@gmail.com
Fixes: a65d40961dc7 ("kthread/smpboot: do not park in kthread_create_on_cpu()")
Signed-off-by: Ilias Stamatis <stamatis.iliass@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kthread.c |    1 -
 1 file changed, 1 deletion(-)

--- a/kernel/kthread.c~kthread-remove-incorrect-comment-in-kthread_create_on_cpu
+++ a/kernel/kthread.c
@@ -478,7 +478,6 @@ EXPORT_SYMBOL(kthread_bind);
  *	     to "name.*%u". Code fills in cpu number.
  *
  * Description: This helper function creates and names a kernel thread
- * The thread will be woken and put into park mode.
  */
 struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
 					  void *data, unsigned int cpu,
_

Patches currently in -mm which might be from stamatis.iliass@gmail.com are

kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + lib-test_lockupc-make-symbol-test_works-static.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (37 preceding siblings ...)
  2020-07-07 19:36 ` + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch " Andrew Morton
@ 2020-07-07 19:37 ` Andrew Morton
  2020-07-07 19:39 ` [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from " Andrew Morton
                   ` (193 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:37 UTC (permalink / raw)
  To: hulkci, mm-commits, weiyongjun1


The patch titled
     Subject: lib/test_lockup.c: make symbol 'test_works' static
has been added to the -mm tree.  Its filename is
     lib-test_lockupc-make-symbol-test_works-static.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/lib-test_lockupc-make-symbol-test_works-static.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/lib-test_lockupc-make-symbol-test_works-static.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yongjun <weiyongjun1@huawei.com>
Subject: lib/test_lockup.c: make symbol 'test_works' static

Fix sparse build warning:

lib/test_lockup.c:403:1: warning:
 symbol '__pcpu_scope_test_works' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20200707112252.9047-1-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_lockup.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/test_lockup.c~lib-test_lockupc-make-symbol-test_works-static
+++ a/lib/test_lockup.c
@@ -400,7 +400,7 @@ static void test_lockup(bool master)
 	test_unlock(master, true);
 }
 
-DEFINE_PER_CPU(struct work_struct, test_works);
+static DEFINE_PER_CPU(struct work_struct, test_works);
 
 static void test_work_fn(struct work_struct *work)
 {
_

Patches currently in -mm which might be from weiyongjun1@huawei.com are

mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
lib-test_lockupc-make-symbol-test_works-static.patch
bits-add-tests-of-genmask-fix-2.patch
kcov-make-some-symbols-static.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (38 preceding siblings ...)
  2020-07-07 19:37 ` + lib-test_lockupc-make-symbol-test_works-static.patch " Andrew Morton
@ 2020-07-07 19:39 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
                   ` (192 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:39 UTC (permalink / raw)
  To: ben.dooks, bfields, cl, mm-commits, peterz, pmladek, qiang.zhang, tj


The patch titled
     Subject: kthread: work could not be queued when worker being destroyed
has been removed from the -mm tree.  Its filename was
     kthread-work-could-not-be-queued-when-worker-being-destroyed.patch

This patch was dropped because it had testing failures

------------------------------------------------------
From: Zhang Qiang <qiang.zhang@windriver.com>
Subject: kthread: work could not be queued when worker being destroyed

The "queuing_blocked" func should print warning message and returns true
when the worker being destroyed.

Before the work is put into the queue of the worker thread, the state of
the worker thread needs to be detected,because the worker thread may be in
the destruction state at this time.
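
To illustrate the window this check guards, here is a minimal sketch of
the kthread_worker lifecycle (example_init()/example_exit() and work_fn()
are hypothetical; only the kthread_* calls are real kernel API):

#include <linux/kthread.h>
#include <linux/err.h>

static struct kthread_worker *worker;
static struct kthread_work work;

static void work_fn(struct kthread_work *w)
{
	/* process the queued work item */
}

static int example_init(void)
{
	worker = kthread_create_worker(0, "example-worker");
	if (IS_ERR(worker))
		return PTR_ERR(worker);

	kthread_init_work(&work, work_fn);
	kthread_queue_work(worker, &work);	/* fine: worker->task is live */
	return 0;
}

static void example_exit(void)
{
	/*
	 * If a racing kthread_queue_work() observed worker->task == NULL
	 * during destruction, the check added by this (since dropped)
	 * patch would WARN in queuing_blocked() and refuse to queue.
	 */
	kthread_destroy_worker(worker);
}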

Link: http://lkml.kernel.org/r/20200705013018.7375-1-qiang.zhang@windriver.com
Link: http://lkml.kernel.org/r/20200702070156.5862-1-qiang.zhang@windriver.com
Signed-off-by: Zhang Qiang <qiang.zhang@windriver.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Liang Chen <cl@rock-chips.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kthread.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/kernel/kthread.c~kthread-work-could-not-be-queued-when-worker-being-destroyed
+++ a/kernel/kthread.c
@@ -814,6 +814,9 @@ static inline bool queuing_blocked(struc
 {
 	lockdep_assert_held(&worker->lock);
 
+	if (WARN_ON(!worker->task))
+		return true;
+
 	return !list_empty(&work->node) || work->canceling;
 }
 
_

Patches currently in -mm which might be from qiang.zhang@windriver.com are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (39 preceding siblings ...)
  2020-07-07 19:39 ` [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
                   ` (191 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/page_isolation: prefer the node of the source page
has been removed from the -mm tree.  Its filename was
     mm-page_isolation-prefer-the-node-of-the-source-page.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_isolation: prefer the node of the source page

Patch series "clean-up the migration target allocation functions", v3.

This patchset cleans up the migration target allocation functions.

Contributions of this patchset are:
1. unify the two hugetlb allocation functions, so that only one remains;
2. turn the one external hugetlb allocation function into an internal one;
3. unify the three functions for migration target allocation.


This patch (of 8):

For locality, it's better to migrate the page to the node of the source
page rather than to the node of the current caller's CPU.

Link: http://lkml.kernel.org/r/1592892828-1934-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1592892828-1934-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_isolation.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/page_isolation.c~mm-page_isolation-prefer-the-node-of-the-source-page
+++ a/mm/page_isolation.c
@@ -309,5 +309,7 @@ int test_pages_isolated(unsigned long st
 
 struct page *alloc_migrate_target(struct page *page, unsigned long private)
 {
-	return new_page_nodemask(page, numa_node_id(), &node_states[N_MEMORY]);
+	int nid = page_to_nid(page);
+
+	return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (40 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
                   ` (190 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/migrate: move migration helper from .h to .c
has been removed from the -mm tree.  Its filename was
     mm-migrate-move-migration-helper-from-h-to-c.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: move migration helper from .h to .c

It's not a performance-sensitive function, so move it to .c.  This is a
preparation step for a future change.

Link: http://lkml.kernel.org/r/1592892828-1934-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/migrate.h |   33 +++++----------------------------
 mm/migrate.c            |   29 +++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 28 deletions(-)

--- a/include/linux/migrate.h~mm-migrate-move-migration-helper-from-h-to-c
+++ a/include/linux/migrate.h
@@ -31,34 +31,6 @@ enum migrate_reason {
 /* In mm/debug.c; also keep sync with include/trace/events/migrate.h */
 extern const char *migrate_reason_names[MR_TYPES];
 
-static inline struct page *new_page_nodemask(struct page *page,
-				int preferred_nid, nodemask_t *nodemask)
-{
-	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
-	unsigned int order = 0;
-	struct page *new_page = NULL;
-
-	if (PageHuge(page))
-		return alloc_huge_page_nodemask(page_hstate(compound_head(page)),
-				preferred_nid, nodemask);
-
-	if (PageTransHuge(page)) {
-		gfp_mask |= GFP_TRANSHUGE;
-		order = HPAGE_PMD_ORDER;
-	}
-
-	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
-		gfp_mask |= __GFP_HIGHMEM;
-
-	new_page = __alloc_pages_nodemask(gfp_mask, order,
-				preferred_nid, nodemask);
-
-	if (new_page && PageTransHuge(new_page))
-		prep_transhuge_page(new_page);
-
-	return new_page;
-}
-
 #ifdef CONFIG_MIGRATION
 
 extern void putback_movable_pages(struct list_head *l);
@@ -67,6 +39,8 @@ extern int migrate_page(struct address_s
 			enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
+extern struct page *new_page_nodemask(struct page *page,
+		int preferred_nid, nodemask_t *nodemask);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
@@ -85,6 +59,9 @@ static inline int migrate_pages(struct l
 		free_page_t free, unsigned long private, enum migrate_mode mode,
 		int reason)
 	{ return -ENOSYS; }
+static inline struct page *new_page_nodemask(struct page *page,
+		int preferred_nid, nodemask_t *nodemask)
+	{ return NULL; }
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	{ return -EBUSY; }
 
--- a/mm/migrate.c~mm-migrate-move-migration-helper-from-h-to-c
+++ a/mm/migrate.c
@@ -1513,6 +1513,35 @@ out:
 	return rc;
 }
 
+struct page *new_page_nodemask(struct page *page,
+				int preferred_nid, nodemask_t *nodemask)
+{
+	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+	unsigned int order = 0;
+	struct page *new_page = NULL;
+
+	if (PageHuge(page))
+		return alloc_huge_page_nodemask(
+				page_hstate(compound_head(page)),
+				preferred_nid, nodemask);
+
+	if (PageTransHuge(page)) {
+		gfp_mask |= GFP_TRANSHUGE;
+		order = HPAGE_PMD_ORDER;
+	}
+
+	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	new_page = __alloc_pages_nodemask(gfp_mask, order,
+				preferred_nid, nodemask);
+
+	if (new_page && PageTransHuge(new_page))
+		prep_transhuge_page(new_page);
+
+	return new_page;
+}
+
 #ifdef CONFIG_NUMA
 
 static int store_status(int __user *status, int start, int value, int nr)
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-hugetlb-unify-migration-callbacks.patch
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (41 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch " Andrew Morton
                   ` (189 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/hugetlb: unify migration callbacks
has been removed from the -mm tree.  Its filename was
     mm-hugetlb-unify-migration-callbacks.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/hugetlb: unify migration callbacks

There is no difference between the two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except for the
__GFP_THISNODE handling.

This patch adds a gfp_mask argument to alloc_huge_page_nodemask() and
replaces each call site of alloc_huge_page_node() with a call to
alloc_huge_page_nodemask(..., __GFP_THISNODE).

It's safe to remove the node id check in alloc_huge_page_node() since
no caller passes NUMA_NO_NODE as a node id.
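
In effect, a pinned-node call site changes as in this sketch (condensed
from the mempolicy.c hunk below):

	/* before: dedicated single-node helper */
	page = alloc_huge_page_node(h, nid);

	/* after: unified callback with the same __GFP_THISNODE semantics */
	page = alloc_huge_page_nodemask(h, nid, NULL, __GFP_THISNODE);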

Link: http://lkml.kernel.org/r/1592892828-1934-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |   11 +++--------
 mm/hugetlb.c            |   26 +++-----------------------
 mm/mempolicy.c          |    9 +++++----
 mm/migrate.c            |    5 +++--
 4 files changed, 14 insertions(+), 37 deletions(-)

--- a/include/linux/hugetlb.h~mm-hugetlb-unify-migration-callbacks
+++ a/include/linux/hugetlb.h
@@ -504,9 +504,8 @@ struct huge_bootmem_page {
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
-struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-				nodemask_t *nmask);
+				nodemask_t *nmask, gfp_t gfp_mask);
 struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
 struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
@@ -759,13 +758,9 @@ static inline struct page *alloc_huge_pa
 	return NULL;
 }
 
-static inline struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
-	return NULL;
-}
-
 static inline struct page *
-alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, nodemask_t *nmask)
+alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
+			nodemask_t *nmask, gfp_t gfp_mask)
 {
 	return NULL;
 }
--- a/mm/hugetlb.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/hugetlb.c
@@ -1980,30 +1980,10 @@ struct page *alloc_buddy_huge_page_with_
 }
 
 /* page migration callback function */
-struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
-	gfp_t gfp_mask = htlb_alloc_mask(h);
-	struct page *page = NULL;
-
-	if (nid != NUMA_NO_NODE)
-		gfp_mask |= __GFP_THISNODE;
-
-	spin_lock(&hugetlb_lock);
-	if (h->free_huge_pages - h->resv_huge_pages > 0)
-		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
-	spin_unlock(&hugetlb_lock);
-
-	if (!page)
-		page = alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
-
-	return page;
-}
-
-/* page migration callback function */
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-		nodemask_t *nmask)
+		nodemask_t *nmask, gfp_t gfp_mask)
 {
-	gfp_t gfp_mask = htlb_alloc_mask(h);
+	gfp_mask |= htlb_alloc_mask(h);
 
 	spin_lock(&hugetlb_lock);
 	if (h->free_huge_pages - h->resv_huge_pages > 0) {
@@ -2032,7 +2012,7 @@ struct page *alloc_huge_page_vma(struct
 
 	gfp_mask = htlb_alloc_mask(h);
 	node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = alloc_huge_page_nodemask(h, node, nodemask);
+	page = alloc_huge_page_nodemask(h, node, nodemask, 0);
 	mpol_cond_put(mpol);
 
 	return page;
--- a/mm/mempolicy.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/mempolicy.c
@@ -1068,10 +1068,11 @@ static int migrate_page_add(struct page
 /* page allocation callback for NUMA node migration */
 struct page *alloc_new_node_page(struct page *page, unsigned long node)
 {
-	if (PageHuge(page))
-		return alloc_huge_page_node(page_hstate(compound_head(page)),
-					node);
-	else if (PageTransHuge(page)) {
+	if (PageHuge(page)) {
+		return alloc_huge_page_nodemask(
+			page_hstate(compound_head(page)), node,
+			NULL, __GFP_THISNODE);
+	} else if (PageTransHuge(page)) {
 		struct page *thp;
 
 		thp = alloc_pages_node(node,
--- a/mm/migrate.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/migrate.c
@@ -1520,10 +1520,11 @@ struct page *new_page_nodemask(struct pa
 	unsigned int order = 0;
 	struct page *new_page = NULL;
 
-	if (PageHuge(page))
+	if (PageHuge(page)) {
 		return alloc_huge_page_nodemask(
 				page_hstate(compound_head(page)),
-				preferred_nid, nodemask);
+				preferred_nid, nodemask, 0);
+	}
 
 	if (PageTransHuge(page)) {
 		gfp_mask |= GFP_TRANSHUGE;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (42 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
                   ` (188 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/hugetlb: make hugetlb migration callback CMA aware
has been removed from the -mm tree.  Its filename was
     mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/hugetlb: make hugetlb migration callback CMA aware

new_non_cma_page() in gup.c, which tries to allocate a migration target
page, must allocate the new page outside of the CMA area.
new_non_cma_page() implements this by removing the __GFP_MOVABLE flag.
This works well for THP and normal pages, but not for hugetlb pages.

hugetlb page allocation is a two-step process: first, dequeue from the
pool; second, if no page is available on the queue, allocate from the
page allocator.

new_non_cma_page() can control the allocation from the page allocator by
specifying the correct gfp flags.  Dequeuing, however, could not be
controlled until now, so new_non_cma_page() skips dequeuing completely.
That is suboptimal because new_non_cma_page() cannot utilize the hugetlb
pages already on the queue, so this patch tries to fix this situation.

This patch makes the hugetlb dequeue function CMA aware: it skips CMA
pages when the newly added skip_cma argument is passed as true.
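
A caller that must avoid CMA-backed hugetlb pages can then still try the
pool first, roughly as the gup.c hunk below does (sketch; h, nid and
gfp_mask are whatever the caller already computed):

	/*
	 * Dequeue from the pool while skipping CMA pages; on a miss, the
	 * fallback path clears __GFP_MOVABLE internally so the page
	 * allocator avoids the CMA area as well.
	 */
	page = alloc_huge_page_nodemask(h, nid, NULL, gfp_mask, true);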

Link: http://lkml.kernel.org/r/1592892828-1934-5-git-send-email-iamjoonsoo.kim@lge.com
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |    6 ++----
 mm/gup.c                |    3 ++-
 mm/hugetlb.c            |   31 ++++++++++++++++++++++---------
 mm/mempolicy.c          |    2 +-
 mm/migrate.c            |    2 +-
 5 files changed, 28 insertions(+), 16 deletions(-)

--- a/include/linux/hugetlb.h~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/include/linux/hugetlb.h
@@ -505,11 +505,9 @@ struct huge_bootmem_page {
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-				nodemask_t *nmask, gfp_t gfp_mask);
+				nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma);
 struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
-struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
-				     int nid, nodemask_t *nmask);
 int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx);
 
@@ -760,7 +758,7 @@ static inline struct page *alloc_huge_pa
 
 static inline struct page *
 alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-			nodemask_t *nmask, gfp_t gfp_mask)
+			nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma)
 {
 	return NULL;
 }
--- a/mm/gup.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/gup.c
@@ -1630,11 +1630,12 @@ static struct page *new_non_cma_page(str
 #ifdef CONFIG_HUGETLB_PAGE
 	if (PageHuge(page)) {
 		struct hstate *h = page_hstate(page);
+
 		/*
 		 * We don't want to dequeue from the pool because pool pages will
 		 * mostly be from the CMA region.
 		 */
-		return alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
+		return alloc_huge_page_nodemask(h, nid, NULL, gfp_mask, true);
 	}
 #endif
 	if (PageTransHuge(page)) {
--- a/mm/hugetlb.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/hugetlb.c
@@ -1034,13 +1034,18 @@ static void enqueue_huge_page(struct hst
 	h->free_huge_pages_node[nid]++;
 }
 
-static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
+static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid, bool skip_cma)
 {
 	struct page *page;
 
-	list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
+	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
+		if (skip_cma && is_migrate_cma_page(page))
+			continue;
+
 		if (!PageHWPoison(page))
 			break;
+	}
+
 	/*
 	 * if 'non-isolated free hugepage' not found on the list,
 	 * the allocation fails.
@@ -1055,7 +1060,7 @@ static struct page *dequeue_huge_page_no
 }
 
 static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
-		nodemask_t *nmask)
+		nodemask_t *nmask, bool skip_cma)
 {
 	unsigned int cpuset_mems_cookie;
 	struct zonelist *zonelist;
@@ -1080,7 +1085,7 @@ retry_cpuset:
 			continue;
 		node = zone_to_nid(zone);
 
-		page = dequeue_huge_page_node_exact(h, node);
+		page = dequeue_huge_page_node_exact(h, node, skip_cma);
 		if (page)
 			return page;
 	}
@@ -1125,7 +1130,7 @@ static struct page *dequeue_huge_page_vm
 
 	gfp_mask = htlb_alloc_mask(h);
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask, false);
 	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
 		SetPagePrivate(page);
 		h->resv_huge_pages--;
@@ -1938,7 +1943,7 @@ out_unlock:
 	return page;
 }
 
-struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
+static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
 				     int nid, nodemask_t *nmask)
 {
 	struct page *page;
@@ -1981,7 +1986,7 @@ struct page *alloc_buddy_huge_page_with_
 
 /* page migration callback function */
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-		nodemask_t *nmask, gfp_t gfp_mask)
+		nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma)
 {
 	gfp_mask |= htlb_alloc_mask(h);
 
@@ -1989,7 +1994,8 @@ struct page *alloc_huge_page_nodemask(st
 	if (h->free_huge_pages - h->resv_huge_pages > 0) {
 		struct page *page;
 
-		page = dequeue_huge_page_nodemask(h, gfp_mask, preferred_nid, nmask);
+		page = dequeue_huge_page_nodemask(h, gfp_mask,
+					preferred_nid, nmask, skip_cma);
 		if (page) {
 			spin_unlock(&hugetlb_lock);
 			return page;
@@ -1997,6 +2003,13 @@ struct page *alloc_huge_page_nodemask(st
 	}
 	spin_unlock(&hugetlb_lock);
 
+	/*
+	 * To skip the memory on CMA area, we need to clear __GFP_MOVABLE.
+	 * Clearing __GFP_MOVABLE at the top of this function would also skip
+	 * the proper allocation candidates for dequeue so clearing it here.
+	 */
+	if (skip_cma)
+		gfp_mask &= ~__GFP_MOVABLE;
 	return alloc_migrate_huge_page(h, gfp_mask, preferred_nid, nmask);
 }
 
@@ -2012,7 +2025,7 @@ struct page *alloc_huge_page_vma(struct
 
 	gfp_mask = htlb_alloc_mask(h);
 	node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = alloc_huge_page_nodemask(h, node, nodemask, 0);
+	page = alloc_huge_page_nodemask(h, node, nodemask, 0, false);
 	mpol_cond_put(mpol);
 
 	return page;
--- a/mm/mempolicy.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/mempolicy.c
@@ -1071,7 +1071,7 @@ struct page *alloc_new_node_page(struct
 	if (PageHuge(page)) {
 		return alloc_huge_page_nodemask(
 			page_hstate(compound_head(page)), node,
-			NULL, __GFP_THISNODE);
+			NULL, __GFP_THISNODE, false);
 	} else if (PageTransHuge(page)) {
 		struct page *thp;
 
--- a/mm/migrate.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/migrate.c
@@ -1523,7 +1523,7 @@ struct page *new_page_nodemask(struct pa
 	if (PageHuge(page)) {
 		return alloc_huge_page_nodemask(
 				page_hstate(compound_head(page)),
-				preferred_nid, nodemask, 0);
+				preferred_nid, nodemask, 0, false);
 	}
 
 	if (PageTransHuge(page)) {
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (43 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
                   ` (187 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/migrate: make a standard migration target allocation function
has been removed from the -mm tree.  Its filename was
     mm-migrate-make-a-standard-migration-target-allocation-function.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: make a standard migration target allocation function

There are some similar functions for migration target allocation.  Since
there is no fundamental difference between them, it's better to keep just
one rather than keeping all the variants.  This patch implements the base
migration target allocation function.  In the following patches, the
variants will be converted to use this function.

Note that the PageHighMem() call in the previous function is changed to
an open-coded is_highmem_idx() check, since that is more readable.

Link: http://lkml.kernel.org/r/1592892828-1934-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/migrate.h |    5 +++--
 mm/internal.h           |    7 +++++++
 mm/memory-failure.c     |    8 ++++++--
 mm/memory_hotplug.c     |   14 +++++++++-----
 mm/migrate.c            |   21 +++++++++++++--------
 mm/page_isolation.c     |    8 ++++++--
 6 files changed, 44 insertions(+), 19 deletions(-)

--- a/include/linux/migrate.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/include/linux/migrate.h
@@ -10,6 +10,8 @@
 typedef struct page *new_page_t(struct page *page, unsigned long private);
 typedef void free_page_t(struct page *page, unsigned long private);
 
+struct migration_target_control;
+
 /*
  * Return values from addresss_space_operations.migratepage():
  * - negative errno on page migration failure;
@@ -39,8 +41,7 @@ extern int migrate_page(struct address_s
 			enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
-extern struct page *new_page_nodemask(struct page *page,
-		int preferred_nid, nodemask_t *nodemask);
+extern struct page *alloc_migration_target(struct page *page, unsigned long private);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
--- a/mm/internal.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/internal.h
@@ -614,4 +614,11 @@ static inline bool is_migrate_highatomic
 
 void setup_zone_pageset(struct zone *zone);
 extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
+
+struct migration_target_control {
+	int nid;		/* preferred node id */
+	nodemask_t *nmask;
+	gfp_t gfp_mask;
+};
+
 #endif	/* __MM_INTERNAL_H */
--- a/mm/memory-failure.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory-failure.c
@@ -1648,9 +1648,13 @@ EXPORT_SYMBOL(unpoison_memory);
 
 static struct page *new_page(struct page *p, unsigned long private)
 {
-	int nid = page_to_nid(p);
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(p),
+		.nmask = &node_states[N_MEMORY],
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
-	return new_page_nodemask(p, nid, &node_states[N_MEMORY]);
+	return alloc_migration_target(p, (unsigned long)&mtc);
 }
 
 /*
--- a/mm/memory_hotplug.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory_hotplug.c
@@ -1267,19 +1267,23 @@ found:
 
 static struct page *new_node_page(struct page *page, unsigned long private)
 {
-	int nid = page_to_nid(page);
 	nodemask_t nmask = node_states[N_MEMORY];
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(page),
+		.nmask = &nmask,
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
 	/*
 	 * try to allocate from a different node but reuse this node if there
 	 * are no other online nodes to be used (e.g. we are offlining a part
 	 * of the only existing node)
 	 */
-	node_clear(nid, nmask);
-	if (nodes_empty(nmask))
-		node_set(nid, nmask);
+	node_clear(mtc.nid, *mtc.nmask);
+	if (nodes_empty(*mtc.nmask))
+		node_set(mtc.nid, *mtc.nmask);
 
-	return new_page_nodemask(page, nid, &nmask);
+	return alloc_migration_target(page, (unsigned long)&mtc);
 }
 
 static int
--- a/mm/migrate.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/migrate.c
@@ -1513,29 +1513,34 @@ out:
 	return rc;
 }
 
-struct page *new_page_nodemask(struct page *page,
-				int preferred_nid, nodemask_t *nodemask)
+struct page *alloc_migration_target(struct page *page, unsigned long private)
 {
-	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+	struct migration_target_control *mtc;
+	gfp_t gfp_mask;
 	unsigned int order = 0;
 	struct page *new_page = NULL;
+	int zidx;
+
+	mtc = (struct migration_target_control *)private;
+	gfp_mask = mtc->gfp_mask;
 
 	if (PageHuge(page)) {
 		return alloc_huge_page_nodemask(
-				page_hstate(compound_head(page)),
-				preferred_nid, nodemask, 0, false);
+				page_hstate(compound_head(page)), mtc->nid,
+				mtc->nmask, gfp_mask, false);
 	}
 
 	if (PageTransHuge(page)) {
+		gfp_mask &= ~__GFP_RECLAIM;
 		gfp_mask |= GFP_TRANSHUGE;
 		order = HPAGE_PMD_ORDER;
 	}
-
-	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+	zidx = zone_idx(page_zone(page));
+	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
 		gfp_mask |= __GFP_HIGHMEM;
 
 	new_page = __alloc_pages_nodemask(gfp_mask, order,
-				preferred_nid, nodemask);
+				mtc->nid, mtc->nmask);
 
 	if (new_page && PageTransHuge(new_page))
 		prep_transhuge_page(new_page);
--- a/mm/page_isolation.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/page_isolation.c
@@ -309,7 +309,11 @@ int test_pages_isolated(unsigned long st
 
 struct page *alloc_migrate_target(struct page *page, unsigned long private)
 {
-	int nid = page_to_nid(page);
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(page),
+		.nmask = &node_states[N_MEMORY],
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
-	return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
+	return alloc_migration_target(page, (unsigned long)&mtc);
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (44 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
                   ` (186 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/gup: use a standard migration target allocation callback
has been removed from the -mm tree.  Its filename was
     mm-gup-use-a-standard-migration-target-allocation-callback.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/gup: use a standard migration target allocation callback

There is a well-defined migration target allocation callback.  It's mostly
similar to new_non_cma_page(), except for how CMA pages are considered.

This patch adds CMA handling to the standard migration target allocation
callback and uses that callback in gup.c.

Link: http://lkml.kernel.org/r/1592892828-1934-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/gup.c      |   57 ++++++------------------------------------------
 mm/internal.h |    1 
 mm/migrate.c  |    4 ++-
 3 files changed, 12 insertions(+), 50 deletions(-)

--- a/mm/gup.c~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/gup.c
@@ -1608,56 +1608,15 @@ static bool check_dax_vmas(struct vm_are
 }
 
 #ifdef CONFIG_CMA
-static struct page *new_non_cma_page(struct page *page, unsigned long private)
+static struct page *alloc_migration_target_non_cma(struct page *page, unsigned long private)
 {
-	/*
-	 * We want to make sure we allocate the new page from the same node
-	 * as the source page.
-	 */
-	int nid = page_to_nid(page);
-	/*
-	 * Trying to allocate a page for migration. Ignore allocation
-	 * failure warnings. We don't force __GFP_THISNODE here because
-	 * this node here is the node where we have CMA reservation and
-	 * in some case these nodes will have really less non movable
-	 * allocation memory.
-	 */
-	gfp_t gfp_mask = GFP_USER | __GFP_NOWARN;
-
-	if (PageHighMem(page))
-		gfp_mask |= __GFP_HIGHMEM;
-
-#ifdef CONFIG_HUGETLB_PAGE
-	if (PageHuge(page)) {
-		struct hstate *h = page_hstate(page);
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(page),
+		.gfp_mask = GFP_USER | __GFP_NOWARN,
+		.skip_cma = true,
+	};
 
-		/*
-		 * We don't want to dequeue from the pool because pool pages will
-		 * mostly be from the CMA region.
-		 */
-		return alloc_huge_page_nodemask(h, nid, NULL, gfp_mask, true);
-	}
-#endif
-	if (PageTransHuge(page)) {
-		struct page *thp;
-		/*
-		 * ignore allocation failure warnings
-		 */
-		gfp_t thp_gfpmask = GFP_TRANSHUGE | __GFP_NOWARN;
-
-		/*
-		 * Remove the movable mask so that we don't allocate from
-		 * CMA area again.
-		 */
-		thp_gfpmask &= ~__GFP_MOVABLE;
-		thp = __alloc_pages_node(nid, thp_gfpmask, HPAGE_PMD_ORDER);
-		if (!thp)
-			return NULL;
-		prep_transhuge_page(thp);
-		return thp;
-	}
-
-	return __alloc_pages_node(nid, gfp_mask, 0);
+	return alloc_migration_target(page, (unsigned long)&mtc);
 }
 
 static long check_and_migrate_cma_pages(struct task_struct *tsk,
@@ -1719,7 +1678,7 @@ check_again:
 		for (i = 0; i < nr_pages; i++)
 			put_page(pages[i]);
 
-		if (migrate_pages(&cma_page_list, new_non_cma_page,
+		if (migrate_pages(&cma_page_list, alloc_migration_target_non_cma,
 				  NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
 			/*
 			 * some of the pages failed migration. Do get_user_pages
--- a/mm/internal.h~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/internal.h
@@ -619,6 +619,7 @@ struct migration_target_control {
 	int nid;		/* preferred node id */
 	nodemask_t *nmask;
 	gfp_t gfp_mask;
+	bool skip_cma;
 };
 
 #endif	/* __MM_INTERNAL_H */
--- a/mm/migrate.c~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/migrate.c
@@ -1527,7 +1527,7 @@ struct page *alloc_migration_target(stru
 	if (PageHuge(page)) {
 		return alloc_huge_page_nodemask(
 				page_hstate(compound_head(page)), mtc->nid,
-				mtc->nmask, gfp_mask, false);
+				mtc->nmask, gfp_mask, mtc->skip_cma);
 	}
 
 	if (PageTransHuge(page)) {
@@ -1538,6 +1538,8 @@ struct page *alloc_migration_target(stru
 	zidx = zone_idx(page_zone(page));
 	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
 		gfp_mask |= __GFP_HIGHMEM;
+	if (mtc->skip_cma)
+		gfp_mask &= ~__GFP_MOVABLE;
 
 	new_page = __alloc_pages_nodemask(gfp_mask, order,
 				mtc->nid, mtc->nmask);
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (45 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:47 ` [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
                   ` (185 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/mempolicy: use a standard migration target allocation callback
has been removed from the -mm tree.  Its filename was
     mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/mempolicy: use a standard migration target allocation callback

There is a well-defined migration target allocation callback.  Use it.

Link: http://lkml.kernel.org/r/1592892828-1934-8-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/internal.h  |    1 -
 mm/mempolicy.c |   30 ++++++------------------------
 mm/migrate.c   |    8 ++++++--
 3 files changed, 12 insertions(+), 27 deletions(-)

--- a/mm/internal.h~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/internal.h
@@ -613,7 +613,6 @@ static inline bool is_migrate_highatomic
 }
 
 void setup_zone_pageset(struct zone *zone);
-extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
 
 struct migration_target_control {
 	int nid;		/* preferred node id */
--- a/mm/mempolicy.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/mempolicy.c
@@ -1065,28 +1065,6 @@ static int migrate_page_add(struct page
 	return 0;
 }
 
-/* page allocation callback for NUMA node migration */
-struct page *alloc_new_node_page(struct page *page, unsigned long node)
-{
-	if (PageHuge(page)) {
-		return alloc_huge_page_nodemask(
-			page_hstate(compound_head(page)), node,
-			NULL, __GFP_THISNODE, false);
-	} else if (PageTransHuge(page)) {
-		struct page *thp;
-
-		thp = alloc_pages_node(node,
-			(GFP_TRANSHUGE | __GFP_THISNODE),
-			HPAGE_PMD_ORDER);
-		if (!thp)
-			return NULL;
-		prep_transhuge_page(thp);
-		return thp;
-	} else
-		return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE |
-						    __GFP_THISNODE, 0);
-}
-
 /*
  * Migrate pages from one node to a target node.
  * Returns error or the number of pages not migrated.
@@ -1097,6 +1075,10 @@ static int migrate_to_node(struct mm_str
 	nodemask_t nmask;
 	LIST_HEAD(pagelist);
 	int err = 0;
+	struct migration_target_control mtc = {
+		.nid = dest,
+		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+	};
 
 	nodes_clear(nmask);
 	node_set(source, nmask);
@@ -1111,8 +1093,8 @@ static int migrate_to_node(struct mm_str
 			flags | MPOL_MF_DISCONTIG_OK, &pagelist);
 
 	if (!list_empty(&pagelist)) {
-		err = migrate_pages(&pagelist, alloc_new_node_page, NULL, dest,
-					MIGRATE_SYNC, MR_SYSCALL);
+		err = migrate_pages(&pagelist, alloc_migration_target, NULL,
+				(unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
 		if (err)
 			putback_movable_pages(&pagelist);
 	}
--- a/mm/migrate.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/migrate.c
@@ -1567,9 +1567,13 @@ static int do_move_pages_to_node(struct
 		struct list_head *pagelist, int node)
 {
 	int err;
+	struct migration_target_control mtc = {
+		.nid = node,
+		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+	};
 
-	err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
-			MIGRATE_SYNC, MR_SYSCALL);
+	err = migrate_pages(pagelist, alloc_migration_target, NULL,
+			(unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
 	if (err)
 		putback_movable_pages(pagelist);
 	return err;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (46 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
  2020-07-07 19:56 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to " Andrew Morton
                   ` (184 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, vbabka


The patch titled
     Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()
has been removed from the -mm tree.  Its filename was
     mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()

There is a well-defined standard migration target callback.  Use it
directly.

Link: http://lkml.kernel.org/r/1592892828-1934-9-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c     |    9 +++++++--
 mm/page_isolation.c |   11 -----------
 2 files changed, 7 insertions(+), 13 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_alloc.c
@@ -8354,6 +8354,11 @@ static int __alloc_contig_migrate_range(
 	unsigned long pfn = start;
 	unsigned int tries = 0;
 	int ret = 0;
+	struct migration_target_control mtc = {
+		.nid = zone_to_nid(cc->zone),
+		.nmask = &node_states[N_MEMORY],
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
 	migrate_prep();
 
@@ -8380,8 +8385,8 @@ static int __alloc_contig_migrate_range(
 							&cc->migratepages);
 		cc->nr_migratepages -= nr_reclaimed;
 
-		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
-				    NULL, 0, cc->mode, MR_CONTIG_RANGE);
+		ret = migrate_pages(&cc->migratepages, alloc_migration_target,
+				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
 	}
 	if (ret < 0) {
 		putback_movable_pages(&cc->migratepages);
--- a/mm/page_isolation.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_isolation.c
@@ -306,14 +306,3 @@ int test_pages_isolated(unsigned long st
 
 	return pfn < end_pfn ? -EBUSY : 0;
 }
-
-struct page *alloc_migrate_target(struct page *page, unsigned long private)
-{
-	struct migration_target_control mtc = {
-		.nid = page_to_nid(page),
-		.nmask = &node_states[N_MEMORY],
-		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
-	};
-
-	return alloc_migration_target(page, (unsigned long)&mtc);
-}
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (47 preceding siblings ...)
  2020-07-07 19:47 ` [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
@ 2020-07-07 19:56 ` Andrew Morton
  2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from " Andrew Morton
                   ` (183 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:56 UTC (permalink / raw)
  To: guro, jonathan.cameron, mike.kravetz, mm-commits, rppt, song.bao.hua


The patch titled
     Subject: mm/hugetlb: avoid hardcoding while checking if cma is enable
has been added to the -mm tree.  Its filename is
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enable

hugetlb_cma[0] can be NULL for various reasons; for example, node 0 may
have no memory.  So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is
not enabled: gigantic pages might have been reserved on other nodes.
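
For instance (an illustrative scenario, not from the patch): on a
two-node machine booted with hugetlb_cma=2G where node 0 is memoryless,
the whole reservation lands on node 1, so that

	hugetlb_cma[0] == NULL;	/* no CMA area could be placed on node 0 */
	hugetlb_cma[1] != NULL;	/* the 2G reservation lives here */

and the old IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0] test wrongly
concludes that hugetlb CMA is disabled.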

Link: http://lkml.kernel.org/r/20200707040204.30132-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable
+++ a/mm/hugetlb.c
@@ -2547,6 +2547,20 @@ static void __init gather_bootmem_preall
 	}
 }
 
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+	int node;
+
+	for_each_online_node(node) {
+		if (hugetlb_cma[node])
+			return true;
+	}
+#endif
+
+	return false;
+}
+
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
@@ -2572,7 +2586,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_enabled()) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (48 preceding siblings ...)
  2020-07-07 19:56 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to " Andrew Morton
@ 2020-07-07 20:11 ` Andrew Morton
  2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch " Andrew Morton
                   ` (182 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:11 UTC (permalink / raw)
  To: anshuman.khandual, hughd, mm-commits


The patch titled
     Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix
has been removed from the -mm tree.  Its filename was
     mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch

This patch was dropped because it was folded into mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch

------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix

Fix 5.7-rc6-mm1 page migration crash in unmap_and_move(): when the
page to be migrated has been freed from under us, that is considered
a MIGRATEPAGE_SUCCESS, but no newpage has been allocated (and I don't
think it would ever need to be counted as a successful THP migration).

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005210643340.482@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix
+++ a/mm/migrate.c
@@ -1245,7 +1245,7 @@ out:
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		if (PageTransHuge(newpage))
+		if (newpage && PageTransHuge(newpage))
 			thp_migration_success(true);
 		put_page(page);
 		if (reason == MR_MEMORY_FAILURE) {
_

Patches currently in -mm which might be from hughd@google.com are

mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (49 preceding siblings ...)
  2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from " Andrew Morton
@ 2020-07-07 20:11 ` Andrew Morton
  2020-07-07 20:12 ` [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch " Andrew Morton
                   ` (181 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:11 UTC (permalink / raw)
  To: anshuman.khandual, hughd, jhubbard, mm-commits, n-horiguchi, ziy


The patch titled
     Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update
has been removed from the -mm tree.  Its filename was
     mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch

This patch was dropped because it was folded into mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update

rename thp_migration_success() to thp_pmd_migration_success() per John

Link: http://lkml.kernel.org/r/1590118444-21601-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update
+++ a/mm/migrate.c
@@ -1172,7 +1172,7 @@ out:
 #endif
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
-static inline void thp_migration_success(bool success)
+static inline void thp_pmd_migration_success(bool success)
 {
 	if (success)
 		count_vm_event(THP_PMD_MIGRATION_SUCCESS);
@@ -1180,7 +1180,9 @@ static inline void thp_migration_success
 		count_vm_event(THP_PMD_MIGRATION_FAILURE);
 }
 #else
-static inline void thp_migration_success(bool success) { }
+static inline void thp_pmd_migration_success(bool success)
+{
+}
 #endif
 
 /*
@@ -1245,8 +1247,14 @@ out:
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
+		/*
+		 * When the page to be migrated has been freed from under
+		 * us, that is considered a MIGRATEPAGE_SUCCESS, but no
+		 * newpage has been allocated. It should not be counted
+		 * as a successful THP migration.
+		 */
 		if (newpage && PageTransHuge(newpage))
-			thp_migration_success(true);
+			thp_pmd_migration_success(true);
 		put_page(page);
 		if (reason == MR_MEMORY_FAILURE) {
 			/*
@@ -1489,7 +1497,7 @@ retry:
 					unlock_page(page);
 					if (!rc) {
 						list_safe_reset_next(page, page2, lru);
-						thp_migration_success(false);
+						thp_pmd_migration_success(false);
 						goto retry;
 					}
 				}
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (50 preceding siblings ...)
  2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch " Andrew Morton
@ 2020-07-07 20:12 ` Andrew Morton
  2020-07-07 20:13 ` + mm-vmstat-add-events-for-thp-migration-without-split.patch added to " Andrew Morton
                   ` (180 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:12 UTC (permalink / raw)
  To: aarcange, anshuman.khandual, cai, daniel.m.jordan, hannes, hughd,
	jhubbard, kirill.shutemov, mhocko, mm-commits, n-horiguchi,
	yang.shi, ziy


The patch titled
     Subject: mm/vmstat: add events for PMD based THP migration without split
has been removed from the -mm tree.  Its filename was
     mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vmstat: add events for PMD based THP migration without split

This adds the following two new VM events, which will help in validating
PMD-based THP migration without split.  Statistics reported through these
events will help in performance debugging.

1. THP_PMD_MIGRATION_SUCCESS
2. THP_PMD_MIGRATION_FAILURE

[hughd@google.com: fix page migration crash in unmap_and_move()]
  Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005210643340.482@eggly.anvils
[anshuman.khandual@arm.com: rename thp_migration_success() to thp_pmd_migration_success() per John]
  Link: http://lkml.kernel.org/r/1590118444-21601-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1589784156-28831-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/vm_event_item.h |    4 ++++
 mm/migrate.c                  |   23 +++++++++++++++++++++++
 mm/vmstat.c                   |    4 ++++
 3 files changed, 31 insertions(+)

--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/include/linux/vm_event_item.h
@@ -95,6 +95,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 		THP_ZERO_PAGE_ALLOC_FAILED,
 		THP_SWPOUT,
 		THP_SWPOUT_FALLBACK,
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+		THP_PMD_MIGRATION_SUCCESS,
+		THP_PMD_MIGRATION_FAILURE,
+#endif
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 		BALLOON_INFLATE,
--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/mm/migrate.c
@@ -1171,6 +1171,20 @@ out:
 #define ICE_noinline
 #endif
 
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline void thp_pmd_migration_success(bool success)
+{
+	if (success)
+		count_vm_event(THP_PMD_MIGRATION_SUCCESS);
+	else
+		count_vm_event(THP_PMD_MIGRATION_FAILURE);
+}
+#else
+static inline void thp_pmd_migration_success(bool success)
+{
+}
+#endif
+
 /*
  * Obtain the lock on page, remove all ptes and migrate the page
  * to the newly allocated page in newpage.
@@ -1233,6 +1247,14 @@ out:
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
+		/*
+		 * When the page to be migrated has been freed from under
+		 * us, that is considered a MIGRATEPAGE_SUCCESS, but no
+		 * newpage has been allocated. It should not be counted
+		 * as a successful THP migration.
+		 */
+		if (newpage && PageTransHuge(newpage))
+			thp_pmd_migration_success(true);
 		put_page(page);
 		if (reason == MR_MEMORY_FAILURE) {
 			/*
@@ -1475,6 +1497,7 @@ retry:
 					unlock_page(page);
 					if (!rc) {
 						list_safe_reset_next(page, page2, lru);
+						thp_pmd_migration_success(false);
 						goto retry;
 					}
 				}
--- a/mm/vmstat.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/mm/vmstat.c
@@ -1320,6 +1320,10 @@ const char * const vmstat_text[] = {
 	"thp_zero_page_alloc_failed",
 	"thp_swpout",
 	"thp_swpout_fallback",
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+	"thp_pmd_migration_success",
+	"thp_pmd_migration_failure",
+#endif
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 	"balloon_inflate",
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-add-events-for-thp-migration-without-split.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (51 preceding siblings ...)
  2020-07-07 20:12 ` [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch " Andrew Morton
@ 2020-07-07 20:13 ` Andrew Morton
  2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
                   ` (179 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:13 UTC (permalink / raw)
  To: anshuman.khandual, daniel.m.jordan, hughd, jhubbard, mm-commits,
	n-horiguchi, willy, ziy


The patch titled
     Subject: mm/vmstat: add events for THP migration without split
has been added to the -mm tree.  Its filename is
     mm-vmstat-add-events-for-thp-migration-without-split.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vmstat: add events for THP migration without split

Add the following new vmstat events, which will help in validating THP
migration without split.  Statistics reported through these new VM events
will help in performance debugging.

1. THP_MIGRATION_SUCCESS
2. THP_MIGRATION_FAILURE
3. THP_MIGRATION_SPLIT

In addition, these new events also update the normal page migration
statistics via PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE.  While here, the
existing 'mm_migrate_pages' trace event is updated to accommodate the
now-available THP statistics.
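
After this change, /proc/vmstat carries three additional counters, named
as in the vmstat_text additions below (the values shown are illustrative
only):

	thp_migration_success 12
	thp_migration_failure 1
	thp_migration_split 3

and, per the extended TP_printk format, an mm_migrate_pages trace record
would read along the lines of (again, made-up values):

	nr_succeeded=512 nr_failed=0 nr_thp_succeeded=1 nr_thp_failed=0 nr_thp_split=0 mode=MIGRATE_SYNC reason=compaction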

Link: http://lkml.kernel.org/r/1594080415-27924-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/page_migration.rst |   19 ++++++++++
 include/linux/vm_event_item.h       |    3 +
 include/trace/events/migrate.h      |   17 +++++++--
 mm/migrate.c                        |   49 +++++++++++++++++++++++---
 mm/vmstat.c                         |    3 +
 5 files changed, 84 insertions(+), 7 deletions(-)

--- a/Documentation/vm/page_migration.rst~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/Documentation/vm/page_migration.rst
@@ -253,5 +253,24 @@ which are function pointers of struct ad
      PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
      for own purpose.
 
+Quantifying Migration
+=====================
+The following events can be used to quantify page migration.
+
+1. PGMIGRATE_SUCCESS       /* Normal page migration success */
+2. PGMIGRATE_FAIL          /* Normal page migration failure */
+3. THP_MIGRATION_SUCCESS   /* Transparent huge page migration success */
+4. THP_MIGRATION_FAILURE   /* Transparent huge page migration failure */
+5. THP_MIGRATION_SPLIT     /* Transparent huge page got split, retried */
+
+THP_MIGRATION_SUCCESS is when a THP is migrated successfully without getting
+split into its subpages. THP_MIGRATION_FAILURE is when a THP could neither
+be migrated nor be split. THP_MIGRATION_SPLIT is when a THP could not
+be migrated as is, but instead got split into its subpages and was later
+retried as normal pages. THP events also update the normal page migration
+statistics PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE. These events will help
+in quantifying and analyzing various THP migration events, including both
+success and failure cases.
+
 Christoph Lameter, May 8, 2006.
 Minchan Kim, Mar 28, 2016.
--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/include/linux/vm_event_item.h
@@ -95,6 +95,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 		THP_ZERO_PAGE_ALLOC_FAILED,
 		THP_SWPOUT,
 		THP_SWPOUT_FALLBACK,
+		THP_MIGRATION_SUCCESS,
+		THP_MIGRATION_FAILURE,
+		THP_MIGRATION_SPLIT,
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 		BALLOON_INFLATE,
--- a/include/trace/events/migrate.h~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/include/trace/events/migrate.h
@@ -46,13 +46,18 @@ MIGRATE_REASON
 TRACE_EVENT(mm_migrate_pages,
 
 	TP_PROTO(unsigned long succeeded, unsigned long failed,
-		 enum migrate_mode mode, int reason),
+		 unsigned long thp_succeeded, unsigned long thp_failed,
+		 unsigned long thp_split, enum migrate_mode mode, int reason),
 
-	TP_ARGS(succeeded, failed, mode, reason),
+	TP_ARGS(succeeded, failed, thp_succeeded, thp_failed,
+		thp_split, mode, reason),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,		succeeded)
 		__field(	unsigned long,		failed)
+		__field(	unsigned long,		thp_succeeded)
+		__field(	unsigned long,		thp_failed)
+		__field(	unsigned long,		thp_split)
 		__field(	enum migrate_mode,	mode)
 		__field(	int,			reason)
 	),
@@ -60,13 +65,19 @@ TRACE_EVENT(mm_migrate_pages,
 	TP_fast_assign(
 		__entry->succeeded	= succeeded;
 		__entry->failed		= failed;
+		__entry->thp_succeeded	= thp_succeeded;
+		__entry->thp_failed	= thp_failed;
+		__entry->thp_split	= thp_split;
 		__entry->mode		= mode;
 		__entry->reason		= reason;
 	),
 
-	TP_printk("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s",
+	TP_printk("nr_succeeded=%lu nr_failed=%lu nr_thp_succeeded=%lu nr_thp_failed=%lu nr_thp_split=%lu mode=%s reason=%s",
 		__entry->succeeded,
 		__entry->failed,
+		__entry->thp_succeeded,
+		__entry->thp_failed,
+		__entry->thp_split,
 		__print_symbolic(__entry->mode, MIGRATE_MODE),
 		__print_symbolic(__entry->reason, MIGRATE_REASON))
 );
--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/mm/migrate.c
@@ -1429,22 +1429,35 @@ int migrate_pages(struct list_head *from
 		enum migrate_mode mode, int reason)
 {
 	int retry = 1;
+	int thp_retry = 1;
 	int nr_failed = 0;
 	int nr_succeeded = 0;
+	int nr_thp_succeeded = 0;
+	int nr_thp_failed = 0;
+	int nr_thp_split = 0;
 	int pass = 0;
+	bool is_thp = false;
 	struct page *page;
 	struct page *page2;
 	int swapwrite = current->flags & PF_SWAPWRITE;
-	int rc;
+	int rc, thp_nr_pages;
 
 	if (!swapwrite)
 		current->flags |= PF_SWAPWRITE;
 
-	for(pass = 0; pass < 10 && retry; pass++) {
+	for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
 		retry = 0;
+		thp_retry = 0;
 
 		list_for_each_entry_safe(page, page2, from, lru) {
 retry:
+			/*
+			 * THP statistics is based on the source huge page.
+			 * Capture required information that might get lost
+			 * during migration.
+			 */
+			is_thp = PageTransHuge(page);
+			thp_nr_pages = hpage_nr_pages(page);
 			cond_resched();
 
 			if (PageHuge(page))
@@ -1475,15 +1488,30 @@ retry:
 					unlock_page(page);
 					if (!rc) {
 						list_safe_reset_next(page, page2, lru);
+						nr_thp_split++;
 						goto retry;
 					}
 				}
+				if (is_thp) {
+					nr_thp_failed++;
+					nr_failed += thp_nr_pages;
+					goto out;
+				}
 				nr_failed++;
 				goto out;
 			case -EAGAIN:
+				if (is_thp) {
+					thp_retry++;
+					break;
+				}
 				retry++;
 				break;
 			case MIGRATEPAGE_SUCCESS:
+				if (is_thp) {
+					nr_thp_succeeded++;
+					nr_succeeded += thp_nr_pages;
+					break;
+				}
 				nr_succeeded++;
 				break;
 			default:
@@ -1493,19 +1521,32 @@ retry:
 				 * removed from migration page list and not
 				 * retried in the next outer loop.
 				 */
+				if (is_thp) {
+					nr_thp_failed++;
+					nr_failed += thp_nr_pages;
+					break;
+				}
 				nr_failed++;
 				break;
 			}
 		}
 	}
-	nr_failed += retry;
+	nr_failed += retry + thp_retry;
+	nr_thp_failed += thp_retry;
 	rc = nr_failed;
 out:
 	if (nr_succeeded)
 		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
 	if (nr_failed)
 		count_vm_events(PGMIGRATE_FAIL, nr_failed);
-	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
+	if (nr_thp_succeeded)
+		count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
+	if (nr_thp_failed)
+		count_vm_events(THP_MIGRATION_FAILURE, nr_thp_failed);
+	if (nr_thp_split)
+		count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
+	trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
+			       nr_thp_failed, nr_thp_split, mode, reason);
 
 	if (!swapwrite)
 		current->flags &= ~PF_SWAPWRITE;
--- a/mm/vmstat.c~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/mm/vmstat.c
@@ -1320,6 +1320,9 @@ const char * const vmstat_text[] = {
 	"thp_zero_page_alloc_failed",
 	"thp_swpout",
 	"thp_swpout_fallback",
+	"thp_migration_success",
+	"thp_migration_failure",
+	"thp_migration_split",
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 	"balloon_inflate",
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (52 preceding siblings ...)
  2020-07-07 20:13 ` + mm-vmstat-add-events-for-thp-migration-without-split.patch added to " Andrew Morton
@ 2020-07-07 22:18 ` Andrew Morton
  2020-07-08 21:48 ` + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch " Andrew Morton
                   ` (178 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 22:18 UTC (permalink / raw)
  To: alex.shi, hannes, hughd, mhocko, mm-commits, shakeelb, stable


The patch titled
     Subject: mm/memcg: fix refcount error while moving and swapping
has been added to the -mm tree.  Its filename is
     mm-memcg-fix-refcount-error-while-moving-and-swapping.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm/memcg: fix refcount error while moving and swapping

It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.

This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids. 
Concurrent activity can free up swap quicker than the task is scanned,
bringing the id refcount down to 0 (which should only be possible when
offlining).

Just skip that optimization: do that part of the accounting immediately.
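
Concretely (an illustrative walk-through, not new behaviour): suppose
three swap entries are moved.  Previously, mc.moved_swap reached 3 and
only at the end did __mem_cgroup_clear_mc() take the matching references
with mem_cgroup_id_get_many(mc.to, 3); if concurrent swap frees had
already driven the id refcount to 0 by then, that deferred get hit a zero
refcount.  With this patch, each successful mem_cgroup_move_swap_account()
immediately takes one reference via mem_cgroup_id_get_many(mc.to, 1), so
the refcount can no longer be observed at 0 in mid-move.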

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-fix-refcount-error-while-moving-and-swapping
+++ a/mm/memcontrol.c
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
 
-		mem_cgroup_id_get_many(mc.to, mc.moved_swap);
 		css_put_many(&mc.to->css, mc.moved_swap);
 
 		mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put:			/* get_mctgt_type() gets the page
 			ent = target.ent;
 			if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
 				mc.precharge--;
-				/* we fixup refcnts and charges later. */
+				mem_cgroup_id_get_many(mc.to, 1);
+				/* we fixup other refcnts and charges later. */
 				mc.moved_swap++;
 			}
 			break;
_

Patches currently in -mm which might be from hughd@google.com are

mm-memcg-fix-refcount-error-while-moving-and-swapping.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (53 preceding siblings ...)
  2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
@ 2020-07-08 21:48 ` Andrew Morton
  2020-07-08 21:50 ` + kbuild-move-wtype-limits-to-w=2.patch " Andrew Morton
                   ` (177 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 21:48 UTC (permalink / raw)
  To: grandmaster, hirofumi, mm-commits


The patch titled
     Subject: VFAT/FAT/MSDOS FILESYSTEM: Replace HTTP links with HTTPS ones
has been added to the -mm tree.  Its filename is
     vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Alexander A. Klimov" <grandmaster@al2klimov.de>
Subject: VFAT/FAT/MSDOS FILESYSTEM: Replace HTTP links with HTTPS ones

Rationale:
Reduces the MITM attack surface on kernel devs opening the links,
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# 	
]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Link: http://lkml.kernel.org/r/20200708200409.22293-1-grandmaster@al2klimov.de
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fat/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/fat/Kconfig~vfat-fat-msdos-filesystem-replace-http-links-with-https-ones
+++ a/fs/fat/Kconfig
@@ -41,7 +41,7 @@ config MSDOS_FS
 	  they are compressed; to access compressed MSDOS partitions under
 	  Linux, you can either use the DOS emulator DOSEMU, described in the
 	  DOSEMU-HOWTO, available from
-	  <http://www.tldp.org/docs.html#howto>, or try dmsdosfs in
+	  <https://www.tldp.org/docs.html#howto>, or try dmsdosfs in
 	  <ftp://ibiblio.org/pub/Linux/system/filesystems/dosfs/>. If you
 	  intend to use dosemu with a non-compressed MSDOS partition, say Y
 	  here) and MSDOS floppies. This means that file access becomes
_

Patches currently in -mm which might be from grandmaster@al2klimov.de are

vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + kbuild-move-wtype-limits-to-w=2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (54 preceding siblings ...)
  2020-07-08 21:48 ` + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch " Andrew Morton
@ 2020-07-08 21:50 ` Andrew Morton
  2020-07-08 22:17 ` [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from " Andrew Morton
                   ` (176 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 21:50 UTC (permalink / raw)
  To: andy.shevchenko, arnd, emil.l.velikov, geert, keescook,
	linus.walleij, michal.lkml, mm-commits, rikard.falkeborn,
	syednwaris, vilhelm.gray, yamada.masahiro


The patch titled
     Subject: kbuild: move -Wtype-limits to W=2
has been added to the -mm tree.  Its filename is
     kbuild-move-wtype-limits-to-w=2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kbuild-move-wtype-limits-to-w%3D2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kbuild-move-wtype-limits-to-w%3D2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Subject: kbuild: move -Wtype-limits to W=2

-Wtype-limits is included in -Wextra, which is added at W=1.  It warns
(among other things) that a comparison of an unsigned variable `< 0` is
always false.  This causes noisy warnings, especially when the comparison
is used in macros, hence it is more suitable for W=2.
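
A minimal sketch of the noise (illustrative only, not from the patch):

	/* With -Wtype-limits, gcc flags the (x) < 0 check below as
	 * "comparison is always false" for unsigned arguments, even
	 * though the macro must also accept signed types. */
	#define OUT_OF_RANGE(x)	((x) < 0 || (x) > 255)

	static int validate(unsigned int val)
	{
		if (OUT_OF_RANGE(val))	/* warns here */
			return -1;
		return 0;
	}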

Link: http://lkml.kernel.org/r/20200708190756.16810-1-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Syed Nayyar Waris <syednwaris@gmail.com>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/Makefile.extrawarn |    2 ++
 1 file changed, 2 insertions(+)

--- a/scripts/Makefile.extrawarn~kbuild-move-wtype-limits-to-w=2
+++ a/scripts/Makefile.extrawarn
@@ -35,6 +35,7 @@ KBUILD_CFLAGS += $(call cc-option, -Wstr
 # The following turn off the warnings enabled by -Wextra
 KBUILD_CFLAGS += -Wno-missing-field-initializers
 KBUILD_CFLAGS += -Wno-sign-compare
+KBUILD_CFLAGS += -Wno-type-limits
 
 KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
 
@@ -66,6 +67,7 @@ KBUILD_CFLAGS += -Wshadow
 KBUILD_CFLAGS += $(call cc-option, -Wlogical-op)
 KBUILD_CFLAGS += -Wmissing-field-initializers
 KBUILD_CFLAGS += -Wsign-compare
+KBUILD_CFLAGS += -Wtype-limits
 KBUILD_CFLAGS += $(call cc-option, -Wmaybe-uninitialized)
 KBUILD_CFLAGS += $(call cc-option, -Wunused-macros)
 
_

Patches currently in -mm which might be from rikard.falkeborn@gmail.com are

kbuild-move-wtype-limits-to-w=2.patch
bits-add-tests-of-genmask.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (55 preceding siblings ...)
  2020-07-08 21:50 ` + kbuild-move-wtype-limits-to-w=2.patch " Andrew Morton
@ 2020-07-08 22:17 ` Andrew Morton
  2020-07-08 22:20 ` + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to " Andrew Morton
                   ` (175 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:17 UTC (permalink / raw)
  To: bigeasy, colin.king, davem, ddstreet, herbert, lgoncalv,
	mahipalreddy2006, mm-commits, sjenning, song.bao.hua,
	vitaly.wool, wangzhou1


The patch titled
     Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
has been removed from the -mm tree.  Its filename was
     mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration

Right now, all new ZIP drivers use the crypto_acomp APIs rather than the
legacy crypto_comp APIs, but zswap.c still uses the old APIs.  That means
zswap won't be able to use any new ZIP drivers in the kernel.

This patch moves zswap to the crypto_acomp APIs to fix the problem.
Traditional compressors like lz4, lzo etc. have been wrapped into acomp
via the scomp backend, so platforms without async compressors can fall
back to acomp via scomp.

This is probably the first real user of acomp, though perhaps not a good
example of how multiple acomp requests can be executed in parallel in one
acomp instance: frontswap loads and stores pages one by one, and has no
queuing or buffering mechanism that would let multiple pages go through
frontswap simultaneously in one thread.  However, this patch creates
multiple acomp instances, so multiple threads running on different cpus
can do (de)compression in parallel, leveraging the power of multiple ZIP
hardware queues.  This is also consistent with frontswap's page
management model.

On the other hand, the current zswap implementation has per-cpu global
resources such as zswap_dstmem, so we create one acomp instance per CPU,
just as zswap previously created one comp instance per CPU.
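
For readers unfamiliar with the API, the "issue an async request, then
wait for it synchronously" pattern the patch adopts reduces to roughly
the following sketch (illustrative only; error handling is trimmed, and
the buffers src, dst with lengths src_len, dst_len are assumed to exist):

	struct crypto_acomp *acomp = crypto_alloc_acomp("lzo", 0, 0);
	struct acomp_req *req = acomp_request_alloc(acomp);
	struct scatterlist input, output;
	struct crypto_wait wait;
	int ret;

	crypto_init_wait(&wait);
	/* on completion, crypto_req_done() wakes the waiter below */
	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);

	sg_init_one(&input, src, src_len);
	sg_init_one(&output, dst, dst_len);
	acomp_request_set_params(req, &input, &output, src_len, dst_len);

	/* issue the request, then block until it completes */
	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);

	acomp_request_free(req);
	crypto_free_acomp(acomp);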

Link: http://lkml.kernel.org/r/20200707125210.33256-1-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mahipal Challa <mahipalreddy2006@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/zswap.c |  177 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 134 insertions(+), 43 deletions(-)

--- a/mm/zswap.c~mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration
+++ a/mm/zswap.c
@@ -24,8 +24,10 @@
 #include <linux/rbtree.h>
 #include <linux/swap.h>
 #include <linux/crypto.h>
+#include <linux/scatterlist.h>
 #include <linux/mempool.h>
 #include <linux/zpool.h>
+#include <crypto/acompress.h>
 
 #include <linux/mm_types.h>
 #include <linux/page-flags.h>
@@ -127,9 +129,17 @@ module_param_named(same_filled_pages_ena
 * data structures
 **********************************/
 
+struct crypto_acomp_ctx {
+	struct crypto_acomp *acomp;
+	struct acomp_req *req;
+	struct crypto_wait wait;
+	u8 *dstmem;
+	struct mutex mutex;
+};
+
 struct zswap_pool {
 	struct zpool *zpool;
-	struct crypto_comp * __percpu *tfm;
+	struct crypto_acomp_ctx * __percpu *acomp_ctx;
 	struct kref kref;
 	struct list_head list;
 	struct work_struct release_work;
@@ -415,30 +425,73 @@ static int zswap_dstmem_dead(unsigned in
 static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
-	struct crypto_comp *tfm;
+	struct crypto_acomp *acomp;
+	struct acomp_req *req;
+	struct crypto_acomp_ctx *acomp_ctx;
+	int ret;
 
-	if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu)))
+	if (WARN_ON(*per_cpu_ptr(pool->acomp_ctx, cpu)))
 		return 0;
 
-	tfm = crypto_alloc_comp(pool->tfm_name, 0, 0);
-	if (IS_ERR_OR_NULL(tfm)) {
-		pr_err("could not alloc crypto comp %s : %ld\n",
-		       pool->tfm_name, PTR_ERR(tfm));
+	acomp_ctx = kzalloc(sizeof(*acomp_ctx), GFP_KERNEL);
+	if (!acomp_ctx)
 		return -ENOMEM;
+
+	acomp = crypto_alloc_acomp(pool->tfm_name, 0, 0);
+	if (IS_ERR(acomp)) {
+		pr_err("could not alloc crypto acomp %s : %ld\n",
+				pool->tfm_name, PTR_ERR(acomp));
+		ret = PTR_ERR(acomp);
+		goto free_ctx;
+	}
+	acomp_ctx->acomp = acomp;
+
+	req = acomp_request_alloc(acomp_ctx->acomp);
+	if (!req) {
+		pr_err("could not alloc crypto acomp_request %s\n",
+		       pool->tfm_name);
+		ret = -ENOMEM;
+		goto free_acomp;
 	}
-	*per_cpu_ptr(pool->tfm, cpu) = tfm;
+	acomp_ctx->req = req;
+
+	mutex_init(&acomp_ctx->mutex);
+	crypto_init_wait(&acomp_ctx->wait);
+	/*
+	 * If the acomp backend is an async zip driver, crypto_req_done() will
+	 * wake up crypto_wait_req(); if the backend is scomp, the callback
+	 * won't be called and crypto_wait_req() will return without blocking.
+	 */
+	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+				   crypto_req_done, &acomp_ctx->wait);
+
+	acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
+	*per_cpu_ptr(pool->acomp_ctx, cpu) = acomp_ctx;
+
 	return 0;
+
+free_acomp:
+	crypto_free_acomp(acomp_ctx->acomp);
+free_ctx:
+	kfree(acomp_ctx);
+	return ret;
 }
 
 static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
-	struct crypto_comp *tfm;
+	struct crypto_acomp_ctx *acomp_ctx;
+
+	acomp_ctx = *per_cpu_ptr(pool->acomp_ctx, cpu);
+	if (!IS_ERR_OR_NULL(acomp_ctx)) {
+		if (!IS_ERR_OR_NULL(acomp_ctx->req))
+			acomp_request_free(acomp_ctx->req);
+		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
+			crypto_free_acomp(acomp_ctx->acomp);
+		kfree(acomp_ctx);
+	}
+	*per_cpu_ptr(pool->acomp_ctx, cpu) = NULL;
 
-	tfm = *per_cpu_ptr(pool->tfm, cpu);
-	if (!IS_ERR_OR_NULL(tfm))
-		crypto_free_comp(tfm);
-	*per_cpu_ptr(pool->tfm, cpu) = NULL;
 	return 0;
 }
 
@@ -561,8 +614,9 @@ static struct zswap_pool *zswap_pool_cre
 	pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
 
 	strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
-	pool->tfm = alloc_percpu(struct crypto_comp *);
-	if (!pool->tfm) {
+
+	pool->acomp_ctx = alloc_percpu(struct crypto_acomp_ctx *);
+	if (!pool->acomp_ctx) {
 		pr_err("percpu alloc failed\n");
 		goto error;
 	}
@@ -585,7 +639,7 @@ static struct zswap_pool *zswap_pool_cre
 	return pool;
 
 error:
-	free_percpu(pool->tfm);
+	free_percpu(pool->acomp_ctx);
 	if (pool->zpool)
 		zpool_destroy_pool(pool->zpool);
 	kfree(pool);
@@ -596,14 +650,14 @@ static __init struct zswap_pool *__zswap
 {
 	bool has_comp, has_zpool;
 
-	has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+	has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
 	if (!has_comp && strcmp(zswap_compressor,
 				CONFIG_ZSWAP_COMPRESSOR_DEFAULT)) {
 		pr_err("compressor %s not available, using default %s\n",
 		       zswap_compressor, CONFIG_ZSWAP_COMPRESSOR_DEFAULT);
 		param_free_charp(&zswap_compressor);
 		zswap_compressor = CONFIG_ZSWAP_COMPRESSOR_DEFAULT;
-		has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+		has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
 	}
 	if (!has_comp) {
 		pr_err("default compressor %s not available\n",
@@ -639,7 +693,7 @@ static void zswap_pool_destroy(struct zs
 	zswap_pool_debug("destroying", pool);
 
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
-	free_percpu(pool->tfm);
+	free_percpu(pool->acomp_ctx);
 	zpool_destroy_pool(pool->zpool);
 	kfree(pool);
 }
@@ -723,7 +777,7 @@ static int __zswap_param_set(const char
 		}
 		type = s;
 	} else if (!compressor) {
-		if (!crypto_has_comp(s, 0, 0)) {
+		if (!crypto_has_acomp(s, 0, 0)) {
 			pr_err("compressor %s not available\n", s);
 			return -ENOENT;
 		}
@@ -774,7 +828,7 @@ static int __zswap_param_set(const char
 		 * failed, maybe both compressor and zpool params were bad.
 		 * Allow changing this param, so pool creation will succeed
 		 * when the other param is changed. We already verified this
-		 * param is ok in the zpool_has_pool() or crypto_has_comp()
+		 * param is ok in the zpool_has_pool() or crypto_has_acomp()
 		 * checks above.
 		 */
 		ret = param_set_charp(s, kp);
@@ -876,7 +930,9 @@ static int zswap_writeback_entry(struct
 	pgoff_t offset;
 	struct zswap_entry *entry;
 	struct page *page;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
+
 	u8 *src, *dst;
 	unsigned int dlen;
 	int ret;
@@ -916,14 +972,21 @@ static int zswap_writeback_entry(struct
 
 	case ZSWAP_SWAPCACHE_NEW: /* page is locked */
 		/* decompress */
+		acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
 		dlen = PAGE_SIZE;
 		src = (u8 *)zhdr + sizeof(struct zswap_header);
-		dst = kmap_atomic(page);
-		tfm = *get_cpu_ptr(entry->pool->tfm);
-		ret = crypto_comp_decompress(tfm, src, entry->length,
-					     dst, &dlen);
-		put_cpu_ptr(entry->pool->tfm);
-		kunmap_atomic(dst);
+		dst = kmap(page);
+
+		mutex_lock(&acomp_ctx->mutex);
+		sg_init_one(&input, src, entry->length);
+		sg_init_one(&output, dst, dlen);
+		acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+		ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+		dlen = acomp_ctx->req->dlen;
+		mutex_unlock(&acomp_ctx->mutex);
+
+		kunmap(page);
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
 
@@ -1004,7 +1067,8 @@ static int zswap_frontswap_store(unsigne
 {
 	struct zswap_tree *tree = zswap_trees[type];
 	struct zswap_entry *entry, *dupentry;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
 	int ret;
 	unsigned int hlen, dlen = PAGE_SIZE;
 	unsigned long handle, value;
@@ -1074,12 +1138,32 @@ static int zswap_frontswap_store(unsigne
 	}
 
 	/* compress */
-	dst = get_cpu_var(zswap_dstmem);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
-	src = kmap_atomic(page);
-	ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
-	kunmap_atomic(src);
-	put_cpu_ptr(entry->pool->tfm);
+	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
+	mutex_lock(&acomp_ctx->mutex);
+
+	src = kmap(page);
+	dst = acomp_ctx->dstmem;
+	sg_init_one(&input, src, PAGE_SIZE);
+	/* zswap_dstmem is PAGE_SIZE * 2 bytes; reflect that in the sg_list */
+	sg_init_one(&output, dst, PAGE_SIZE * 2);
+	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+	/*
+	 * It may look a little silly to submit an asynchronous request and
+	 * then wait for its completion synchronously; in effect the whole
+	 * operation is synchronous.
+	 * In theory, acomp lets users queue multiple requests on one acomp
+	 * instance and have them completed concurrently.  But frontswap
+	 * stores and loads pages one by one, so a single thread doing
+	 * frontswap has no way to submit a second page before the first one
+	 * is done.
+	 * Threads on different CPUs use different acomp instances, though,
+	 * so multiple threads can still do (de)compression in parallel.
+	 */
+	ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	dlen = acomp_ctx->req->dlen;
+	kunmap(page);
+
 	if (ret) {
 		ret = -EINVAL;
 		goto put_dstmem;
@@ -1103,7 +1187,7 @@ static int zswap_frontswap_store(unsigne
 	memcpy(buf, &zhdr, hlen);
 	memcpy(buf + hlen, dst, dlen);
 	zpool_unmap_handle(entry->pool->zpool, handle);
-	put_cpu_var(zswap_dstmem);
+	mutex_unlock(&acomp_ctx->mutex);
 
 	/* populate entry */
 	entry->offset = offset;
@@ -1131,7 +1215,7 @@ insert_entry:
 	return 0;
 
 put_dstmem:
-	put_cpu_var(zswap_dstmem);
+	mutex_unlock(&acomp_ctx->mutex);
 	zswap_pool_put(entry->pool);
 freepage:
 	zswap_entry_cache_free(entry);
@@ -1148,7 +1232,8 @@ static int zswap_frontswap_load(unsigned
 {
 	struct zswap_tree *tree = zswap_trees[type];
 	struct zswap_entry *entry;
-	struct crypto_comp *tfm;
+	struct scatterlist input, output;
+	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src, *dst;
 	unsigned int dlen;
 	int ret;
@@ -1175,11 +1260,17 @@ static int zswap_frontswap_load(unsigned
 	src = zpool_map_handle(entry->pool->zpool, entry->handle, ZPOOL_MM_RO);
 	if (zpool_evictable(entry->pool->zpool))
 		src += sizeof(struct zswap_header);
-	dst = kmap_atomic(page);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
-	ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
-	put_cpu_ptr(entry->pool->tfm);
-	kunmap_atomic(dst);
+	dst = kmap(page);
+
+	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
+	sg_init_one(&input, src, entry->length);
+	sg_init_one(&output, dst, dlen);
+	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+	ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+	mutex_unlock(&acomp_ctx->mutex);
+
+	kunmap(page);
 	zpool_unmap_handle(entry->pool->zpool, entry->handle);
 	BUG_ON(ret);
 
_
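
For readers new to the crypto_acomp API, the hunks above assemble into the
following overall pattern (a condensed, illustrative sketch only: error
handling is trimmed, and the "lzo" algorithm name and the local variable
names are examples rather than the patch's own):

	/* one-time per-CPU setup */
	struct crypto_acomp *acomp = crypto_alloc_acomp("lzo", 0, 0);
	struct acomp_req *req = acomp_request_alloc(acomp);
	struct crypto_wait wait;

	crypto_init_wait(&wait);
	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);

	/* per-page compression */
	struct scatterlist input, output;
	sg_init_one(&input, src, PAGE_SIZE);
	sg_init_one(&output, dst, PAGE_SIZE * 2);
	acomp_request_set_params(req, &input, &output, PAGE_SIZE, dlen);
	/* submit asynchronously, then wait for completion */
	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
	dlen = req->dlen;	/* length of the compressed output */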

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (56 preceding siblings ...)
  2020-07-08 22:17 ` [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from " Andrew Morton
@ 2020-07-08 22:20 ` Andrew Morton
  2020-07-08 22:25 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch " Andrew Morton
                   ` (174 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:20 UTC (permalink / raw)
  To: lkp, mm-commits, rppt, sfr


The patch titled
     Subject: powerpc: fix compilation warning caused by missing include of asm/pgalloc.h
has been added to the -mm tree.  Its filename is
     mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: powerpc: fix compilation warning caused by missing include of asm/pgalloc.h

Recent rework of asm/pgalloc.h caused a compilation warning reported by
kbuild bot:

All warnings (new ones prefixed by >>):

>> arch/powerpc/mm/nohash/tlb.c:409:6: warning: no previous prototype for
>> 'tlb_flush_pgtable' [-Wmissing-prototypes]
     409 | void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address)
         |      ^~~~~~~~~~~~~~~~~

Add the missing include of asm/pgalloc.h to arch/powerpc/mm/nohash/tlb.c to
make the tlb_flush_pgtable() prototype visible there.
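
As an aside, the warning class is generic C: defining an externally visible
function with no prototype in scope trips gcc's -Wmissing-prototypes, and
including the header that declares it is the standard fix.  A hypothetical
minimal illustration (not the powerpc sources):

	/* hdr.h declares the prototype */
	void foo(void);

	/*
	 * file.c: without #include "hdr.h", building with
	 * -Wmissing-prototypes warns "no previous prototype for 'foo'";
	 * with the include the build is quiet.
	 */
	#include "hdr.h"
	void foo(void) { }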

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/mm/nohash/tlb.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/powerpc/mm/nohash/tlb.c~mm-remove-unneeded-includes-of-asm-pgalloch-fix
+++ a/arch/powerpc/mm/nohash/tlb.c
@@ -34,6 +34,7 @@
 #include <linux/of_fdt.h>
 #include <linux/hugetlb.h>
 
+#include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/tlb.h>
 #include <asm/code-patching.h>
_

Patches currently in -mm which might be from rppt@linux.ibm.com are

mm-remove-unneeded-includes-of-asm-pgalloch.patch
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (57 preceding siblings ...)
  2020-07-08 22:20 ` + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to " Andrew Morton
@ 2020-07-08 22:25 ` Andrew Morton
  2020-07-08 23:12 ` + mailmap-add-entry-for-mike-rapoport.patch " Andrew Morton
                   ` (173 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:25 UTC (permalink / raw)
  To: andreyknvl, aryabinin, dvyukov, glider, matthias.bgg, mm-commits,
	walter-zh.wu


The patch titled
     Subject: kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4
has been added to the -mm tree.  Its filename is
     kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Walter Wu <walter-zh.wu@mediatek.com>
Subject: kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4

use KASAN_SHADOW_SCALE_SIZE instead of 13

Link: http://lkml.kernel.org/r/20200708132524.11688-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_kasan.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/lib/test_kasan.c~kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4
+++ a/lib/test_kasan.c
@@ -23,7 +23,9 @@
 
 #include <asm/page.h>
 
-#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 13)
+#include "../mm/kasan/kasan.h"
+
+#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : KASAN_SHADOW_SCALE_SIZE)
 
 /*
  * We assign some test results to these globals to make sure the tests
_
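
For context, the tag-based KASAN tests use this offset roughly as below (an
illustrative fragment in the style of the kmalloc out-of-bounds tests, not
a verbatim excerpt): with tag-based KASAN, an access within the same shadow
granule as the end of the object cannot be detected, so the bad access has
to be pushed one granule further out.

	char *ptr;
	size_t size = 123;

	ptr = kmalloc(size, GFP_KERNEL);
	/* out-of-bounds write, pushed past the object's shadow granule */
	ptr[size + OOB_TAG_OFF] = 'x';
	kfree(ptr);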

Patches currently in -mm which might be from walter-zh.wu@mediatek.com are

kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
rcu-kasan-record-and-print-call_rcu-call-stack.patch
kasan-record-and-print-the-free-track.patch
kasan-add-tests-for-call_rcu-stack-recording.patch
kasan-update-documentation-for-generic-kasan.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mailmap-add-entry-for-mike-rapoport.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (58 preceding siblings ...)
  2020-07-08 22:25 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch " Andrew Morton
@ 2020-07-08 23:12 ` Andrew Morton
  2020-07-08 23:16 ` + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
                   ` (172 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:12 UTC (permalink / raw)
  To: mm-commits, rppt


The patch titled
     Subject: mailmap: add entry for Mike Rapoport
has been added to the -mm tree.  Its filename is
     mailmap-add-entry-for-mike-rapoport.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mailmap-add-entry-for-mike-rapoport.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mailmap-add-entry-for-mike-rapoport.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mailmap: add entry for Mike Rapoport

Add an entry to connect my email addresses.

Link: http://lkml.kernel.org/r/20200708095414.12275-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 .mailmap |    3 +++
 1 file changed, 3 insertions(+)

--- a/.mailmap~mailmap-add-entry-for-mike-rapoport
+++ a/.mailmap
@@ -193,6 +193,9 @@ Maxime Ripard <mripard@kernel.org> <maxi
 Mayuresh Janorkar <mayur@ti.com>
 Michael Buesch <m@bues.ch>
 Michel Dänzer <michel@tungstengraphics.com>
+Mike Rapoport <rppt@kernel.org> <mike@compulab.co.il>
+Mike Rapoport <rppt@kernel.org> <mike.rapoport@gmail.com>
+Mike Rapoport <rppt@kernel.org> <rppt@linux.ibm.com>
 Miodrag Dinic <miodrag.dinic@mips.com> <miodrag.dinic@imgtec.com>
 Miquel Raynal <miquel.raynal@bootlin.com> <miquel.raynal@free-electrons.com>
 Mitesh shah <mshah@teja.com>
_

Patches currently in -mm which might be from rppt@linux.ibm.com are

mailmap-add-entry-for-mike-rapoport.patch
mm-remove-unneeded-includes-of-asm-pgalloch.patch
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (59 preceding siblings ...)
  2020-07-08 23:12 ` + mailmap-add-entry-for-mike-rapoport.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
  2020-07-08 23:16 ` + mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
                   ` (171 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, willy, yang.shi


The patch titled
     Subject: mm/mremap: it is sure to have enough space when extent meets requirement
has been added to the -mm tree.  Its filename is
     mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: it is sure to have enough space when extent meets requirement

Patch series "mm/mremap: cleanup move_page_tables() a little".

move_page_tables() tries to move page tables at PMD or PTE granularity.

To move a whole PMD, both the old and new ranges must be PMD aligned.  But
the current code calculates the old range and the new range separately,
which leads to redundant checks and calculations.

This cleanup consolidates the range check in one place to reduce the extra
range handling.


This patch (of 4):

old_end is passed to these two functions so they can check whether there is
enough space to do the move, but this check is already done before they are
invoked.

These two functions are only invoked when extent meets the alignment
requirement, and the caller clamps extent first:

    if (extent > old_end - old_addr)
        extent = old_end - old_addr;

Since extent equals the PMD-sized step whenever these functions are called,
this clamp guarantees old_end - old_addr >= extent, so the old_end check
inside these two functions can never fail.

Link: http://lkml.kernel.org/r/20200708095028.41706-1-richard.weiyang@linux.alibaba.com
Link: http://lkml.kernel.org/r/20200708095028.41706-2-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/huge_mm.h |    2 +-
 mm/huge_memory.c        |    7 ++-----
 mm/mremap.c             |   10 ++++------
 3 files changed, 7 insertions(+), 12 deletions(-)

--- a/include/linux/huge_mm.h~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/include/linux/huge_mm.h
@@ -42,7 +42,7 @@ extern int mincore_huge_pmd(struct vm_ar
 			unsigned long addr, unsigned long end,
 			unsigned char *vec);
 extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-			 unsigned long new_addr, unsigned long old_end,
+			 unsigned long new_addr,
 			 pmd_t *old_pmd, pmd_t *new_pmd);
 extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			unsigned long addr, pgprot_t newprot,
--- a/mm/huge_memory.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/huge_memory.c
@@ -1722,17 +1722,14 @@ static pmd_t move_soft_dirty_pmd(pmd_t p
 }
 
 bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-		  unsigned long new_addr, unsigned long old_end,
-		  pmd_t *old_pmd, pmd_t *new_pmd)
+		  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
 {
 	spinlock_t *old_ptl, *new_ptl;
 	pmd_t pmd;
 	struct mm_struct *mm = vma->vm_mm;
 	bool force_flush = false;
 
-	if ((old_addr & ~HPAGE_PMD_MASK) ||
-	    (new_addr & ~HPAGE_PMD_MASK) ||
-	    old_end - old_addr < HPAGE_PMD_SIZE)
+	if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
 		return false;
 
 	/*
--- a/mm/mremap.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/mremap.c
@@ -193,15 +193,13 @@ static void move_ptes(struct vm_area_str
 
 #ifdef CONFIG_HAVE_MOVE_PMD
 static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
-		  unsigned long new_addr, unsigned long old_end,
-		  pmd_t *old_pmd, pmd_t *new_pmd)
+		  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
 {
 	spinlock_t *old_ptl, *new_ptl;
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
-	    || old_end - old_addr < PMD_SIZE)
+	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
 		return false;
 
 	/*
@@ -273,7 +271,7 @@ unsigned long move_page_tables(struct vm
 				if (need_rmap_locks)
 					take_rmap_locks(vma);
 				moved = move_huge_pmd(vma, old_addr, new_addr,
-						    old_end, old_pmd, new_pmd);
+						      old_pmd, new_pmd);
 				if (need_rmap_locks)
 					drop_rmap_locks(vma);
 				if (moved)
@@ -293,7 +291,7 @@ unsigned long move_page_tables(struct vm
 			if (need_rmap_locks)
 				take_rmap_locks(vma);
 			moved = move_normal_pmd(vma, old_addr, new_addr,
-					old_end, old_pmd, new_pmd);
+						old_pmd, new_pmd);
 			if (need_rmap_locks)
 				drop_rmap_locks(vma);
 			if (moved)
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mremap-calculate-extent-in-one-place.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (60 preceding siblings ...)
  2020-07-08 23:16 ` + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
  2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
                   ` (170 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, willy, yang.shi


The patch titled
     Subject: mm/mremap: calculate extent in one place
has been added to the -mm tree.  Its filename is
     mm-mremap-calculate-extent-in-one-place.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-calculate-extent-in-one-place.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-calculate-extent-in-one-place.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: calculate extent in one place

Page tables are moved at PMD granularity, which requires both the source
and destination ranges to be suitably aligned.

The current code works because move_huge_pmd() and move_normal_pmd() check
old_addr and new_addr again and fall back to move_ptes() if either of them
is not aligned.

Instead of calculating the extent separately, it is better to calculate it
in one place; then we know up front when it is not necessary to try a PMD
move.  This makes the logic a little clearer.

Link: http://lkml.kernel.org/r/20200708095028.41706-3-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mremap.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/mremap.c~mm-mremap-calculate-extent-in-one-place
+++ a/mm/mremap.c
@@ -258,6 +258,9 @@ unsigned long move_page_tables(struct vm
 		extent = next - old_addr;
 		if (extent > old_end - old_addr)
 			extent = old_end - old_addr;
+		next = (new_addr + PMD_SIZE) & PMD_MASK;
+		if (extent > next - new_addr)
+			extent = next - new_addr;
 		old_pmd = get_old_pmd(vma->vm_mm, old_addr);
 		if (!old_pmd)
 			continue;
@@ -301,9 +304,6 @@ unsigned long move_page_tables(struct vm
 
 		if (pte_alloc(new_vma->vm_mm, new_pmd))
 			break;
-		next = (new_addr + PMD_SIZE) & PMD_MASK;
-		if (extent > next - new_addr)
-			extent = next - new_addr;
 		move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma,
 			  new_pmd, new_addr, need_rmap_locks);
 	}
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mremap-start-addresses-are-properly-aligned.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (61 preceding siblings ...)
  2020-07-08 23:16 ` + mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
  2020-07-08 23:16 ` + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch " Andrew Morton
                   ` (169 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, willy, yang.shi


The patch titled
     Subject: mm/mremap: start addresses are properly aligned
has been added to the -mm tree.  Its filename is
     mm-mremap-start-addresses-are-properly-aligned.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned

After the previous cleanup, extent is the minimal step for both source and
destination.  This means that when extent is HPAGE_PMD_SIZE or PMD_SIZE,
old_addr and new_addr are properly aligned too.

Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.

Link: http://lkml.kernel.org/r/20200708095028.41706-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    3 ---
 mm/mremap.c      |    3 ---
 2 files changed, 6 deletions(-)

--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
 	struct mm_struct *mm = vma->vm_mm;
 	bool force_flush = false;
 
-	if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
-	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
-		return false;
-
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have release it.
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (62 preceding siblings ...)
  2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
  2020-07-08 23:41 ` + mm-swap-simplify-alloc_swap_slot_cache.patch " Andrew Morton
                   ` (168 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
  To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
	mm-commits, peterx, richard.weiyang, sean.j.christopherson,
	thellstrom, thomas_os, vbabka, willy, yang.shi


The patch titled
     Subject: mm/mremap: use pmd_addr_end to simplify the calculate of extent
has been added to the -mm tree.  Its filename is
     mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: use pmd_addr_end to simplify the calculate of extent

The purpose of this code is to calculate the smaller of the extents of the
old and new ranges.  Let's leverage pmd_addr_end() to do the calculation.

Hopefully this makes the code easier to read.
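
For context, the generic helper being leveraged is defined roughly like
this (from include/asm-generic/pgtable.h; the "- 1" comparison keeps the
clamp correct even if the PMD boundary wraps around to 0):

	#define pmd_addr_end(addr, end)					\
	({	unsigned long __boundary = ((addr) + PMD_SIZE) & PMD_MASK;	\
		(__boundary - 1 < (end) - 1) ? __boundary : (end);	\
	})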

Link: http://lkml.kernel.org/r/20200708095028.41706-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mremap.c |   16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

--- a/mm/mremap.c~mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent
+++ a/mm/mremap.c
@@ -237,11 +237,12 @@ unsigned long move_page_tables(struct vm
 		unsigned long new_addr, unsigned long len,
 		bool need_rmap_locks)
 {
-	unsigned long extent, next, old_end;
+	unsigned long extent, old_next, new_next, old_end, new_end;
 	struct mmu_notifier_range range;
 	pmd_t *old_pmd, *new_pmd;
 
 	old_end = old_addr + len;
+	new_end = new_addr + len;
 	flush_cache_range(vma, old_addr, old_end);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
@@ -250,14 +251,11 @@ unsigned long move_page_tables(struct vm
 
 	for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
 		cond_resched();
-		next = (old_addr + PMD_SIZE) & PMD_MASK;
-		/* even if next overflowed, extent below will be ok */
-		extent = next - old_addr;
-		if (extent > old_end - old_addr)
-			extent = old_end - old_addr;
-		next = (new_addr + PMD_SIZE) & PMD_MASK;
-		if (extent > next - new_addr)
-			extent = next - new_addr;
+
+		old_next = pmd_addr_end(old_addr, old_end);
+		new_next = pmd_addr_end(new_addr, new_end);
+		extent = min(old_next - old_addr, new_next - new_addr);
+
 		old_pmd = get_old_pmd(vma->vm_mm, old_addr);
 		if (!old_pmd)
 			continue;
_

Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are

mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-swap-simplify-alloc_swap_slot_cache.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (63 preceding siblings ...)
  2020-07-08 23:16 ` + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
  2020-07-08 23:41 ` + mm-swap-simplify-enable_swap_slots_cache.patch " Andrew Morton
                   ` (167 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
  To: mm-commits, thunder.leizhen, tim.c.chen


The patch titled
     Subject: mm/swap_slots.c: simplify alloc_swap_slot_cache()
has been added to the -mm tree.  Its filename is
     mm-swap-simplify-alloc_swap_slot_cache.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-simplify-alloc_swap_slot_cache.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-simplify-alloc_swap_slot_cache.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: simplify alloc_swap_slot_cache()

Patch series "clean up some functions in mm/swap_slots.c".

When I studied the code of mm/swap_slots.c, I found some places that can be
improved.


This patch (of 3):

Both "slots" and "slots_ret" only need to be freed when the cache has
already been allocated.  Moving the frees next to that check makes the
logic clearer.

No functional change.

Link: http://lkml.kernel.org/r/20200430061143.450-1-thunder.leizhen@huawei.com
Link: http://lkml.kernel.org/r/20200430061143.450-2-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swap_slots.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

--- a/mm/swap_slots.c~mm-swap-simplify-alloc_swap_slot_cache
+++ a/mm/swap_slots.c
@@ -136,9 +136,16 @@ static int alloc_swap_slot_cache(unsigne
 
 	mutex_lock(&swap_slots_cache_mutex);
 	cache = &per_cpu(swp_slots, cpu);
-	if (cache->slots || cache->slots_ret)
+	if (cache->slots || cache->slots_ret) {
 		/* cache already allocated */
-		goto out;
+		mutex_unlock(&swap_slots_cache_mutex);
+
+		kvfree(slots);
+		kvfree(slots_ret);
+
+		return 0;
+	}
+
 	if (!cache->lock_initialized) {
 		mutex_init(&cache->alloc_lock);
 		spin_lock_init(&cache->free_lock);
@@ -155,15 +162,8 @@ static int alloc_swap_slot_cache(unsigne
 	 */
 	mb();
 	cache->slots = slots;
-	slots = NULL;
 	cache->slots_ret = slots_ret;
-	slots_ret = NULL;
-out:
 	mutex_unlock(&swap_slots_cache_mutex);
-	if (slots)
-		kvfree(slots);
-	if (slots_ret)
-		kvfree(slots_ret);
 	return 0;
 }
 
_

Patches currently in -mm which might be from thunder.leizhen@huawei.com are

mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-swap-simplify-enable_swap_slots_cache.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (64 preceding siblings ...)
  2020-07-08 23:41 ` + mm-swap-simplify-alloc_swap_slot_cache.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
  2020-07-08 23:41 ` + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch " Andrew Morton
                   ` (166 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
  To: mm-commits, thunder.leizhen, tim.c.chen


The patch titled
     Subject: mm/swap_slots.c: simplify enable_swap_slots_cache()
has been added to the -mm tree.  Its filename is
     mm-swap-simplify-enable_swap_slots_cache.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-simplify-enable_swap_slots_cache.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-simplify-enable_swap_slots_cache.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: simplify enable_swap_slots_cache()

Whether swap_slot_cache_initialized is true or false,
__reenable_swap_slots_cache() is always called.  To make this clear, leave
only one call to __reenable_swap_slots_cache().  This also makes it clearer
what extra work needs to be done when swap_slot_cache_initialized is false.

No functional change.

Link: http://lkml.kernel.org/r/20200430061143.450-3-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swap_slots.c |   22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

--- a/mm/swap_slots.c~mm-swap-simplify-enable_swap_slots_cache
+++ a/mm/swap_slots.c
@@ -240,21 +240,19 @@ static int free_slot_cache(unsigned int
 
 int enable_swap_slots_cache(void)
 {
-	int ret = 0;
-
 	mutex_lock(&swap_slots_cache_enable_mutex);
-	if (swap_slot_cache_initialized) {
-		__reenable_swap_slots_cache();
-		goto out_unlock;
-	}
+	if (!swap_slot_cache_initialized) {
+		int ret;
 
-	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "swap_slots_cache",
-				alloc_swap_slot_cache, free_slot_cache);
-	if (WARN_ONCE(ret < 0, "Cache allocation failed (%s), operating "
-			       "without swap slots cache.\n", __func__))
-		goto out_unlock;
+		ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "swap_slots_cache",
+					alloc_swap_slot_cache, free_slot_cache);
+		if (WARN_ONCE(ret < 0, "Cache allocation failed (%s), operating "
+				       "without swap slots cache.\n", __func__))
+			goto out_unlock;
+
+		swap_slot_cache_initialized = true;
+	}
 
-	swap_slot_cache_initialized = true;
 	__reenable_swap_slots_cache();
 out_unlock:
 	mutex_unlock(&swap_slots_cache_enable_mutex);
_

Patches currently in -mm which might be from thunder.leizhen@huawei.com are

mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (65 preceding siblings ...)
  2020-07-08 23:41 ` + mm-swap-simplify-enable_swap_slots_cache.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
  2020-07-09  0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
                   ` (165 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
  To: mm-commits, thunder.leizhen, tim.c.chen


The patch titled
     Subject: mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized
has been added to the -mm tree.  Its filename is
     mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized

swap_slot_cache_enabled can only become true in enable_swap_slots_cache(),
and only after swap_slot_cache_initialized has been set to true.  That
means that whenever swap_slot_cache_enabled is true,
swap_slot_cache_initialized is true as well.

So the condition:
"swap_slot_cache_enabled && swap_slot_cache_initialized"
can be reduced to "swap_slot_cache_enabled"

And by De Morgan's law:
"!swap_slot_cache_enabled || !swap_slot_cache_initialized"
is equal to "!(swap_slot_cache_enabled && swap_slot_cache_initialized)"

So there is no functional change.

Link: http://lkml.kernel.org/r/20200430061143.450-4-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swap_slots.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/mm/swap_slots.c~mm-swap-remove-redundant-check-for-swap_slot_cache_initialized
+++ a/mm/swap_slots.c
@@ -46,8 +46,7 @@ static void __drain_swap_slots_cache(uns
 static void deactivate_swap_slots_cache(void);
 static void reactivate_swap_slots_cache(void);
 
-#define use_swap_slot_cache (swap_slot_cache_active && \
-		swap_slot_cache_enabled && swap_slot_cache_initialized)
+#define use_swap_slot_cache (swap_slot_cache_active && swap_slot_cache_enabled)
 #define SLOTS_CACHE 0x1
 #define SLOTS_CACHE_RET 0x2
 
@@ -94,7 +93,7 @@ static bool check_cache_active(void)
 {
 	long pages;
 
-	if (!swap_slot_cache_enabled || !swap_slot_cache_initialized)
+	if (!swap_slot_cache_enabled)
 		return false;
 
 	pages = get_nr_swap_pages();
_

Patches currently in -mm which might be from thunder.leizhen@huawei.com are

mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-do-page-fault-accounting-in-handle_mm_fault.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (66 preceding siblings ...)
  2020-07-08 23:41 ` + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06   ` Andrew Morton
  2020-07-09  0:06 ` + mm-alpha-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (164 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
	chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
	gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
	James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
	luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
	paul.walmsley


The patch titled
     Subject: mm: do page fault accounting in handle_mm_fault
has been added to the -mm tree.  Its filename is
     mm-do-page-fault-accounting-in-handle_mm_fault.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: do page fault accounting in handle_mm_fault

Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report a week ago of incorrect page fault accounting for retried
page faults after commit 4064b9827063 ("mm: allow VM_FAULT_RETRY for
multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we account a page fault (no matter
    whether it's from #PF handling, gup, or anything else) only on the
    attempt that completed the fault.  For example, page fault retries
    should not be counted in the page fault counters.  The same applies
    to the perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an ad hoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g. erroneous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there are still quite a few archs that have not enabled
    this perf event at all.

    Since this series touches nearly all the archs, we unify this perf
    event to always follow case (1), which is the one that makes the most
    sense.  And since we moved the accounting into handle_mm_fault(), the
    other two MAJ/MIN perf events are taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault to the task that triggered it.  This
    does not matter much for #PF handling, but it does for gup.  More
    information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more


This patch (of 25):

This is a preparation patch to move page fault accounting into the general
code in handle_mm_fault().  This includes both the per-task maj_flt/min_flt
counters and the major/minor page fault perf events.  To do this, the
pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in the per-arch page fault
handlers.

So far, every pt_regs pointer passed into handle_mm_fault() is NULL, which
means this patch should have no intended functional change.
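
To give an idea of the direction, the central accounting can take roughly
the following shape (a simplified sketch, not the actual mm/memory.c hunk;
treat the helper name and the exact conditions as illustrative):

	static inline void mm_account_fault(struct pt_regs *regs,
			unsigned long address, unsigned int flags,
			vm_fault_t ret)
	{
		bool major;

		/* a retried fault is accounted once it finally completes */
		if (ret & VM_FAULT_RETRY)
			return;

		major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
		if (major)
			current->maj_flt++;
		else
			current->min_flt++;

		/* without pt_regs there is no perf context to attribute */
		if (!regs)
			return;

		perf_sw_event(major ? PERF_COUNT_SW_PAGE_FAULTS_MAJ :
				      PERF_COUNT_SW_PAGE_FAULTS_MIN,
			      1, regs, address);
	}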

Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/mm/fault.c         |    2 -
 arch/arc/mm/fault.c           |    2 -
 arch/arm/mm/fault.c           |    2 -
 arch/arm64/mm/fault.c         |    2 -
 arch/csky/mm/fault.c          |    3 +
 arch/hexagon/mm/vm_fault.c    |    2 -
 arch/ia64/mm/fault.c          |    2 -
 arch/m68k/mm/fault.c          |    2 -
 arch/microblaze/mm/fault.c    |    2 -
 arch/mips/mm/fault.c          |    2 -
 arch/nds32/mm/fault.c         |    2 -
 arch/nios2/mm/fault.c         |    2 -
 arch/openrisc/mm/fault.c      |    2 -
 arch/parisc/mm/fault.c        |    2 -
 arch/powerpc/mm/copro_fault.c |    2 -
 arch/powerpc/mm/fault.c       |    2 -
 arch/riscv/mm/fault.c         |    2 -
 arch/s390/mm/fault.c          |    2 -
 arch/sh/mm/fault.c            |    2 -
 arch/sparc/mm/fault_32.c      |    4 +-
 arch/sparc/mm/fault_64.c      |    2 -
 arch/um/kernel/trap.c         |    2 -
 arch/x86/mm/fault.c           |    2 -
 arch/xtensa/mm/fault.c        |    2 -
 drivers/iommu/amd/iommu_v2.c  |    2 -
 drivers/iommu/intel/svm.c     |    3 +
 include/linux/mm.h            |    7 ++-
 mm/gup.c                      |    4 +-
 mm/hmm.c                      |    3 +
 mm/ksm.c                      |    3 +
 mm/memory.c                   |   64 +++++++++++++++++++++++++++++++-
 31 files changed, 103 insertions(+), 34 deletions(-)

--- a/arch/alpha/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/alpha/mm/fault.c
@@ -148,7 +148,7 @@ retry:
 	/* If for any reason at all we couldn't handle the fault,
 	   make sure we exit gracefully rather than endlessly redo
 	   the fault.  */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/arc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arc/mm/fault.c
@@ -130,7 +130,7 @@ retry:
 		goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
--- a/arch/arm64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm64/mm/fault.c
@@ -428,7 +428,7 @@ static vm_fault_t __do_page_fault(struct
 	 */
 	if (!(vma->vm_flags & vm_flags))
 		return VM_FAULT_BADACCESS;
-	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags);
+	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
 }
 
 static bool is_el0_instruction_abort(unsigned int esr)
--- a/arch/arm/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm/mm/fault.c
@@ -224,7 +224,7 @@ good_area:
 		goto out;
 	}
 
-	return handle_mm_fault(vma, addr & PAGE_MASK, flags);
+	return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
 
 check_stack:
 	/* Don't allow expansion below FIRST_USER_ADDRESS */
--- a/arch/csky/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/csky/mm/fault.c
@@ -150,7 +150,8 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0);
+	fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
+				NULL);
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
--- a/arch/hexagon/mm/vm_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/hexagon/mm/vm_fault.c
@@ -88,7 +88,7 @@ good_area:
 		break;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/ia64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/ia64/mm/fault.c
@@ -143,7 +143,7 @@ retry:
 	 * sure we exit gracefully rather than endlessly redo the
 	 * fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/m68k/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/m68k/mm/fault.c
@@ -134,7 +134,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
 	if (fault_signal_pending(fault, regs))
--- a/arch/microblaze/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/microblaze/mm/fault.c
@@ -214,7 +214,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/mips/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/mips/mm/fault.c
@@ -152,7 +152,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/nds32/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nds32/mm/fault.c
@@ -206,7 +206,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, addr, flags);
+	fault = handle_mm_fault(vma, addr, flags, NULL);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
--- a/arch/nios2/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nios2/mm/fault.c
@@ -131,7 +131,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/openrisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/openrisc/mm/fault.c
@@ -159,7 +159,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/parisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/parisc/mm/fault.c
@@ -302,7 +302,7 @@ good_area:
 	 * fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/arch/powerpc/mm/copro_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/copro_fault.c
@@ -64,7 +64,7 @@ int copro_handle_mm_fault(struct mm_stru
 	}
 
 	ret = 0;
-	*flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0);
+	*flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0, NULL);
 	if (unlikely(*flt & VM_FAULT_ERROR)) {
 		if (*flt & VM_FAULT_OOM) {
 			ret = -ENOMEM;
--- a/arch/powerpc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	major |= fault & VM_FAULT_MAJOR;
 
--- a/arch/riscv/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, addr, flags);
+	fault = handle_mm_fault(vma, addr, flags, NULL);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
--- a/arch/s390/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ retry:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	if (fault_signal_pending(fault, regs)) {
 		fault = VM_FAULT_SIGNAL;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
--- a/arch/sh/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sh/mm/fault.c
@@ -482,7 +482,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
 		if (mm_fault_error(regs, error_code, address, fault))
--- a/arch/sparc/mm/fault_32.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -410,7 +410,7 @@ good_area:
 		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
 			goto bad_area;
 	}
-	switch (handle_mm_fault(vma, address, flags)) {
+	switch (handle_mm_fault(vma, address, flags, NULL)) {
 	case VM_FAULT_SIGBUS:
 	case VM_FAULT_OOM:
 		goto do_sigbus;
--- a/arch/sparc/mm/fault_64.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ good_area:
 			goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		goto exit_exception;
--- a/arch/um/kernel/trap.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/um/kernel/trap.c
@@ -71,7 +71,7 @@ good_area:
 	do {
 		vm_fault_t fault;
 
-		fault = handle_mm_fault(vma, address, flags);
+		fault = handle_mm_fault(vma, address, flags, NULL);
 
 		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
 			goto out_nosemaphore;
--- a/arch/x86/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/x86/mm/fault.c
@@ -1291,7 +1291,7 @@ good_area:
 	 * userland). The return to userland is identified whenever
 	 * FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	major |= fault & VM_FAULT_MAJOR;
 
 	/* Quick path to respond to signals */
--- a/arch/xtensa/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/xtensa/mm/fault.c
@@ -107,7 +107,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
--- a/drivers/iommu/amd/iommu_v2.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/amd/iommu_v2.c
@@ -495,7 +495,7 @@ static void do_fault(struct work_struct
 	if (access_error(vma, fault))
 		goto out;
 
-	ret = handle_mm_fault(vma, address, flags);
+	ret = handle_mm_fault(vma, address, flags, NULL);
 out:
 	mmap_read_unlock(mm);
 
--- a/drivers/iommu/intel/svm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/intel/svm.c
@@ -872,7 +872,8 @@ static irqreturn_t prq_event_thread(int
 			goto invalid;
 
 		ret = handle_mm_fault(vma, address,
-				      req->wr_req ? FAULT_FLAG_WRITE : 0);
+				      req->wr_req ? FAULT_FLAG_WRITE : 0,
+				      NULL);
 		if (ret & VM_FAULT_ERROR)
 			goto invalid;
 
--- a/include/linux/mm.h~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/include/linux/mm.h
@@ -38,6 +38,7 @@ struct file_ra_state;
 struct user_struct;
 struct writeback_control;
 struct bdi_writeback;
+struct pt_regs;
 
 void init_mm_internals(void);
 
@@ -1650,7 +1651,8 @@ int invalidate_inode_page(struct page *p
 
 #ifdef CONFIG_MMU
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
-			unsigned long address, unsigned int flags);
+				  unsigned long address, unsigned int flags,
+				  struct pt_regs *regs);
 extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
 			    bool *unlocked);
@@ -1660,7 +1662,8 @@ void unmap_mapping_range(struct address_
 		loff_t const holebegin, loff_t const holelen, int even_cows);
 #else
 static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
-		unsigned long address, unsigned int flags)
+					 unsigned long address, unsigned int flags,
+					 struct pt_regs *regs)
 {
 	/* should never happen if there's no MMU */
 	BUG();
--- a/mm/gup.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/gup.c
@@ -884,7 +884,7 @@ static int faultin_page(struct task_stru
 		fault_flags |= FAULT_FLAG_TRIED;
 	}
 
-	ret = handle_mm_fault(vma, address, fault_flags);
+	ret = handle_mm_fault(vma, address, fault_flags, NULL);
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, *flags);
 
@@ -1238,7 +1238,7 @@ retry:
 	    fatal_signal_pending(current))
 		return -EINTR;
 
-	ret = handle_mm_fault(vma, address, fault_flags);
+	ret = handle_mm_fault(vma, address, fault_flags, NULL);
 	major |= ret & VM_FAULT_MAJOR;
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, 0);
--- a/mm/hmm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/hmm.c
@@ -75,7 +75,8 @@ static int hmm_vma_fault(unsigned long a
 	}
 
 	for (; addr < end; addr += PAGE_SIZE)
-		if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
+		if (handle_mm_fault(vma, addr, fault_flags, NULL) &
+		    VM_FAULT_ERROR)
 			return -EFAULT;
 	return -EBUSY;
 }
--- a/mm/ksm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/ksm.c
@@ -480,7 +480,8 @@ static int break_ksm(struct vm_area_stru
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
-					FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
+					      FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE,
+					      NULL);
 		else
 			ret = VM_FAULT_WRITE;
 		put_page(page);
--- a/mm/memory.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/memory.c
@@ -71,6 +71,8 @@
 #include <linux/dax.h>
 #include <linux/oom.h>
 #include <linux/numa.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
 
 #include <trace/events/kmem.h>
 
@@ -4365,6 +4367,64 @@ retry_pud:
 	return handle_pte_fault(&vmf);
 }
 
+/**
+ * mm_account_fault - Do page fault accountings
+ *
+ * @regs: the pt_regs struct pointer.  When set to NULL, will skip accounting
+ *        of perf event counters, but we'll still do the per-task accounting to
+ *        the task who triggered this page fault.
+ * @address: the faulted address.
+ * @flags: the fault flags.
+ * @ret: the fault retcode.
+ *
+ * This will take care of most of the page fault accountings.  Meanwhile, it
+ * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
+ * updates.  However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
+ * still be in per-arch page fault handlers at the entry of page fault.
+ */
+static inline void mm_account_fault(struct pt_regs *regs,
+				    unsigned long address, unsigned int flags,
+				    vm_fault_t ret)
+{
+	bool major;
+
+	/*
+	 * We don't do accounting for some specific faults:
+	 *
+	 * - Unsuccessful faults (e.g. when the address wasn't valid).  That
+	 *   includes arch_vma_access_permitted() failing before reaching here.
+	 *   So this is not a "this many hardware page faults" counter.  We
+	 *   should use the hw profiling for that.
+	 *
+	 * - Incomplete faults (VM_FAULT_RETRY).  They will only be counted
+	 *   once they're completed.
+	 */
+	if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
+		return;
+
+	/*
+	 * We define the fault as a major fault when the final successful fault
+	 * is VM_FAULT_MAJOR, or if it retried (which implies that we couldn't
+	 * handle it immediately previously).
+	 */
+	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+
+	/*
+	 * If the fault is done for GUP, regs will be NULL, and we will skip
+	 * the fault accounting.
+	 */
+	if (!regs)
+		return;
+
+	if (major) {
+		current->maj_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+	} else {
+		current->min_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
+	}
+}
+
 /*
  * By the time we get here, we already hold the mm semaphore
  *
@@ -4372,7 +4432,7 @@ retry_pud:
  * return value.  See filemap_fault() and __lock_page_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
-		unsigned int flags)
+			   unsigned int flags, struct pt_regs *regs)
 {
 	vm_fault_t ret;
 
@@ -4413,6 +4473,8 @@ vm_fault_t handle_mm_fault(struct vm_are
 			mem_cgroup_oom_synchronize(false);
 	}
 
+	mm_account_fault(regs, address, flags, ret);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-alpha-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (67 preceding siblings ...)
  2020-07-09  0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-arc-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (163 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: ink, mattst88, mm-commits, peterx, rth


The patch titled
     Subject: mm/alpha: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-alpha-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-alpha-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-alpha-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/alpha: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now
handled in handle_mm_fault().
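
A quick way to exercise the new event (assuming a perf binary with the
usual software event aliases) is:

	perf stat -e page-faults,minor-faults,major-faults <workload>

With this series applied, all three counts come from the unified
accounting paths.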

Link: http://lkml.kernel.org/r/20200707225021.200906-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/mm/fault.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- a/arch/alpha/mm/fault.c~mm-alpha-use-general-page-fault-accounting
+++ a/arch/alpha/mm/fault.c
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/extable.h>
 #include <linux/uaccess.h>
+#include <linux/perf_event.h>
 
 extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
 
@@ -116,6 +117,7 @@ do_page_fault(unsigned long address, uns
 #endif
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma(mm, address);
@@ -148,7 +150,7 @@ retry:
 	/* If for any reason at all we couldn't handle the fault,
 	   make sure we exit gracefully rather than endlessly redo
 	   the fault.  */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -164,10 +166,6 @@ retry:
 	}
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-arc-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (68 preceding siblings ...)
  2020-07-09  0:06 ` + mm-alpha-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-arm-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (162 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: mm-commits, peterx, vgupta


The patch titled
     Subject: mm/arc: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-arc-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-arc-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-arc-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of duplicated page
fault accounting when a page fault is retried.

Fix the PERF_COUNT_SW_PAGE_FAULTS perf event for page fault retries by
moving it before taking mmap_sem, so that it fires only once per fault.
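
Schematically, the resulting fault path looks like the following (a
condensed sketch of the flow after this patch, with elisions; not the
literal arch code):

	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
	mmap_read_lock(mm);
	/* ... find and validate the vma ... */
	fault = handle_mm_fault(vma, address, flags, regs);
	/* handle_mm_fault() now does the MAJ/MIN accounting */
	if (fault & VM_FAULT_RETRY) {
		flags |= FAULT_FLAG_TRIED;
		/* the retry does not pass the perf_sw_event above again */
		goto retry;
	}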

Link: http://lkml.kernel.org/r/20200707225021.200906-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arc/mm/fault.c |   18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

--- a/arch/arc/mm/fault.c~mm-arc-use-general-page-fault-accounting
+++ a/arch/arc/mm/fault.c
@@ -105,6 +105,7 @@ void do_page_fault(unsigned long address
 	if (write)
 		flags |= FAULT_FLAG_WRITE;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 
@@ -130,7 +131,7 @@ retry:
 		goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
@@ -155,22 +156,9 @@ bad_area:
 	 * Major/minor page fault accounting
 	 * (in case of retry we only land here once)
 	 */
-	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
-
-	if (likely(!(fault & VM_FAULT_ERROR))) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-				      regs, address);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-				      regs, address);
-		}
-
+	if (likely(!(fault & VM_FAULT_ERROR)))
 		/* Normal return path: fault Handled Gracefully */
 		return;
-	}
 
 	if (!user_mode(regs))
 		goto no_context;
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-arm-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (69 preceding siblings ...)
  2020-07-09  0:06 ` + mm-arc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-arm64-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (161 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: linux, mm-commits, peterx, will


The patch titled
     Subject: mm/arm: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-arm-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-arm-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-arm-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arm: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of duplicated page
fault accounting when a page fault is retried.  To do this, we need to
pass the pt_regs pointer into __do_page_fault().

Fix the PERF_COUNT_SW_PAGE_FAULTS perf event for page fault retries by
moving it before taking mmap_sem, so that it fires only once per fault.

Link: http://lkml.kernel.org/r/20200707225021.200906-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm/mm/fault.c |   25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

--- a/arch/arm/mm/fault.c~mm-arm-use-general-page-fault-accounting
+++ a/arch/arm/mm/fault.c
@@ -202,7 +202,8 @@ static inline bool access_error(unsigned
 
 static vm_fault_t __kprobes
 __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
-		unsigned int flags, struct task_struct *tsk)
+		unsigned int flags, struct task_struct *tsk,
+		struct pt_regs *regs)
 {
 	struct vm_area_struct *vma;
 	vm_fault_t fault;
@@ -224,7 +225,7 @@ good_area:
 		goto out;
 	}
 
-	return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
+	return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
 
 check_stack:
 	/* Don't allow expansion below FIRST_USER_ADDRESS */
@@ -266,6 +267,8 @@ do_page_fault(unsigned long addr, unsign
 	if ((fsr & FSR_WRITE) && !(fsr & FSR_CM))
 		flags |= FAULT_FLAG_WRITE;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
 	/*
 	 * As per x86, we may deadlock here.  However, since the kernel only
 	 * validly references user space from well defined areas of the code,
@@ -290,7 +293,7 @@ retry:
 #endif
 	}
 
-	fault = __do_page_fault(mm, addr, fsr, flags, tsk);
+	fault = __do_page_fault(mm, addr, fsr, flags, tsk, regs);
 
 	/* If we need to retry but a fatal signal is pending, handle the
 	 * signal first. We do not need to release the mmap_lock because
@@ -302,23 +305,7 @@ retry:
 		return 0;
 	}
 
-	/*
-	 * Major/minor page fault accounting is only done on the
-	 * initial attempt. If we go through a retry, it is extremely
-	 * likely that the page will be found in page cache at that point.
-	 */
-
-	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
 	if (!(fault & VM_FAULT_ERROR) && flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-					regs, addr);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-					regs, addr);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 			goto retry;
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-arm64-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (70 preceding siblings ...)
  2020-07-09  0:06 ` + mm-arm-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-csky-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (160 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: catalin.marinas, mm-commits, peterx, will


The patch titled
     Subject: mm/arm64: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-arm64-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-arm64-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-arm64-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arm64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.  To do this, we pass
the pt_regs pointer into __do_page_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/fault.c |   29 ++++++-----------------------
 1 file changed, 6 insertions(+), 23 deletions(-)

--- a/arch/arm64/mm/fault.c~mm-arm64-use-general-page-fault-accounting
+++ a/arch/arm64/mm/fault.c
@@ -404,7 +404,8 @@ static void do_bad_area(unsigned long ad
 #define VM_FAULT_BADACCESS	0x020000
 
 static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
-			   unsigned int mm_flags, unsigned long vm_flags)
+				  unsigned int mm_flags, unsigned long vm_flags,
+				  struct pt_regs *regs)
 {
 	struct vm_area_struct *vma = find_vma(mm, addr);
 
@@ -428,7 +429,7 @@ static vm_fault_t __do_page_fault(struct
 	 */
 	if (!(vma->vm_flags & vm_flags))
 		return VM_FAULT_BADACCESS;
-	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
+	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, regs);
 }
 
 static bool is_el0_instruction_abort(unsigned int esr)
@@ -450,7 +451,7 @@ static int __kprobes do_page_fault(unsig
 {
 	const struct fault_info *inf;
 	struct mm_struct *mm = current->mm;
-	vm_fault_t fault, major = 0;
+	vm_fault_t fault;
 	unsigned long vm_flags = VM_ACCESS_FLAGS;
 	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
 
@@ -516,8 +517,7 @@ retry:
 #endif
 	}
 
-	fault = __do_page_fault(mm, addr, mm_flags, vm_flags);
-	major |= fault & VM_FAULT_MAJOR;
+	fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
@@ -538,25 +538,8 @@ retry:
 	 * Handle the "normal" (no error) case first.
 	 */
 	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
-			      VM_FAULT_BADACCESS)))) {
-		/*
-		 * Major/minor page fault accounting is only done
-		 * once. If we go through a retry, it is extremely
-		 * likely that the page will be found in page cache at
-		 * that point.
-		 */
-		if (major) {
-			current->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
-				      addr);
-		} else {
-			current->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
-				      addr);
-		}
-
+			      VM_FAULT_BADACCESS))))
 		return 0;
-	}
 
 	/*
 	 * If we are in kernel mode at this point, we have no context to
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-csky-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (71 preceding siblings ...)
  2020-07-09  0:06 ` + mm-arm64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-hexagon-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (159 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: guoren, mm-commits, peterx


The patch titled
     Subject: mm/csky: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-csky-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-csky-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-csky-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/csky: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Link: http://lkml.kernel.org/r/20200707225021.200906-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/csky/mm/fault.c |   12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

--- a/arch/csky/mm/fault.c~mm-csky-use-general-page-fault-accounting
+++ a/arch/csky/mm/fault.c
@@ -151,7 +151,7 @@ good_area:
 	 * the fault.
 	 */
 	fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
-				NULL);
+				regs);
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
@@ -161,16 +161,6 @@ good_area:
 			goto bad_area;
 		BUG();
 	}
-	if (fault & VM_FAULT_MAJOR) {
-		tsk->maj_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
-			      address);
-	} else {
-		tsk->min_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
-			      address);
-	}
-
 	mmap_read_unlock(mm);
 	return;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hexagon-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (72 preceding siblings ...)
  2020-07-09  0:06 ` + mm-csky-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-ia64-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (158 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: bcain, mm-commits, peterx


The patch titled
     Subject: mm/hexagon: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-hexagon-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hexagon-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hexagon-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/hexagon: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().
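
To make the shape of these conversions concrete, here is an
illustrative caller-side skeleton (hypothetical code, not the literal
hexagon fault handler; access checks and error paths are elided):

	static void sketch_do_page_fault(struct pt_regs *regs,
					 unsigned long address)
	{
		struct mm_struct *mm = current->mm;
		struct vm_area_struct *vma;
		unsigned int flags = FAULT_FLAG_DEFAULT;
		vm_fault_t fault;

		/* Counted once, before the retry label, so a retried
		 * fault still produces a single event. */
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
	retry:
		mmap_read_lock(mm);
		vma = find_vma(mm, address);

		/* Passing regs lets handle_mm_fault() bump
		 * maj_flt/min_flt and emit the MAJ/MIN perf events on
		 * the final attempt. */
		fault = handle_mm_fault(vma, address, flags, regs);

		if (!(fault & VM_FAULT_ERROR) && (fault & VM_FAULT_RETRY)) {
			/* mmap_lock was already dropped for us. */
			flags |= FAULT_FLAG_TRIED;
			goto retry;
		}
		mmap_read_unlock(mm);
	}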

Link: http://lkml.kernel.org/r/20200707225021.200906-8-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Brian Cain <bcain@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/hexagon/mm/vm_fault.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/arch/hexagon/mm/vm_fault.c~mm-hexagon-use-general-page-fault-accounting
+++ a/arch/hexagon/mm/vm_fault.c
@@ -18,6 +18,7 @@
 #include <linux/signal.h>
 #include <linux/extable.h>
 #include <linux/hardirq.h>
+#include <linux/perf_event.h>
 
 /*
  * Decode of hardware exception sends us to one of several
@@ -53,6 +54,8 @@ void do_page_fault(unsigned long address
 
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma(mm, address);
@@ -88,7 +91,7 @@ good_area:
 		break;
 	}
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -96,10 +99,6 @@ good_area:
 	/* The most common case -- we are done. */
 	if (likely(!(fault & VM_FAULT_ERROR))) {
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			if (fault & VM_FAULT_MAJOR)
-				current->maj_flt++;
-			else
-				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
 				flags |= FAULT_FLAG_TRIED;
 				goto retry;
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-ia64-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (73 preceding siblings ...)
  2020-07-09  0:06 ` + mm-hexagon-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-m68k-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (157 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: mm-commits, peterx, tony.luck


The patch titled
     Subject: mm/ia64: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-ia64-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-ia64-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-ia64-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/ia64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/ia64/mm/fault.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/arch/ia64/mm/fault.c~mm-ia64-use-general-page-fault-accounting
+++ a/arch/ia64/mm/fault.c
@@ -14,6 +14,7 @@
 #include <linux/kdebug.h>
 #include <linux/prefetch.h>
 #include <linux/uaccess.h>
+#include <linux/perf_event.h>
 
 #include <asm/processor.h>
 #include <asm/exception.h>
@@ -105,6 +106,8 @@ ia64_do_page_fault (unsigned long addres
 		flags |= FAULT_FLAG_USER;
 	if (mask & VM_WRITE)
 		flags |= FAULT_FLAG_WRITE;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 
@@ -143,7 +146,7 @@ retry:
 	 * sure we exit gracefully rather than endlessly redo the
 	 * fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -166,10 +169,6 @@ retry:
 	}
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-m68k-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (74 preceding siblings ...)
  2020-07-09  0:06 ` + mm-ia64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-microblaze-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (156 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: geert, mm-commits, peterx


The patch titled
     Subject: mm/m68k: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-m68k-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-m68k-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-m68k-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/m68k: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-10-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/m68k/mm/fault.c |   14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

--- a/arch/m68k/mm/fault.c~mm-m68k-use-general-page-fault-accounting
+++ a/arch/m68k/mm/fault.c
@@ -12,6 +12,7 @@
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
+#include <linux/perf_event.h>
 
 #include <asm/setup.h>
 #include <asm/traps.h>
@@ -84,6 +85,8 @@ int do_page_fault(struct pt_regs *regs,
 
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 
@@ -134,7 +137,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
 	if (fault_signal_pending(fault, regs))
@@ -150,16 +153,7 @@ good_area:
 		BUG();
 	}
 
-	/*
-	 * Major/minor page fault accounting is only done on the
-	 * initial attempt. If we go through a retry, it is extremely
-	 * likely that the page will be found in page cache at that point.
-	 */
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-microblaze-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (75 preceding siblings ...)
  2020-07-09  0:06 ` + mm-m68k-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:06 ` + mm-mips-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (155 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: mm-commits, monstr, peterx


The patch titled
     Subject: mm/microblaze: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-microblaze-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-microblaze-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-microblaze-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/microblaze: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-11-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/microblaze/mm/fault.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/arch/microblaze/mm/fault.c~mm-microblaze-use-general-page-fault-accounting
+++ a/arch/microblaze/mm/fault.c
@@ -28,6 +28,7 @@
 #include <linux/mman.h>
 #include <linux/mm.h>
 #include <linux/interrupt.h>
+#include <linux/perf_event.h>
 
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -121,6 +122,8 @@ void do_page_fault(struct pt_regs *regs,
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
 	/* When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in the
 	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
@@ -214,7 +217,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -230,10 +233,6 @@ good_area:
 	}
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (unlikely(fault & VM_FAULT_MAJOR))
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mips-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (76 preceding siblings ...)
  2020-07-09  0:06 ` + mm-microblaze-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:06 ` Andrew Morton
  2020-07-09  0:07 ` + mm-nds32-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (154 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:06 UTC (permalink / raw)
  To: mm-commits, peterx, tsbogend


The patch titled
     Subject: mm/mips: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-mips-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mips-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mips-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/mips: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries by moving it before mmap_sem is taken.

Link: http://lkml.kernel.org/r/20200707225021.200906-12-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/mips/mm/fault.c |   14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

--- a/arch/mips/mm/fault.c~mm-mips-use-general-page-fault-accounting
+++ a/arch/mips/mm/fault.c
@@ -96,6 +96,8 @@ static void __kprobes __do_page_fault(st
 
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma(mm, address);
@@ -152,12 +154,11 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
 
-	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
@@ -168,15 +169,6 @@ good_area:
 		BUG();
 	}
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-						  regs, address);
-			tsk->maj_flt++;
-		} else {
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-						  regs, address);
-			tsk->min_flt++;
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-nds32-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (77 preceding siblings ...)
  2020-07-09  0:06 ` + mm-mips-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-nios2-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (153 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: deanbo422, green.hu, mm-commits, nickhu, peterx


The patch titled
     Subject: mm/nds32: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-nds32-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-nds32-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-nds32-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/nds32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries by moving it before mmap_sem is taken.

Link: http://lkml.kernel.org/r/20200707225021.200906-13-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/nds32/mm/fault.c |   19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

--- a/arch/nds32/mm/fault.c~mm-nds32-use-general-page-fault-accounting
+++ a/arch/nds32/mm/fault.c
@@ -121,6 +121,8 @@ void do_page_fault(unsigned long entry,
 	if (unlikely(faulthandler_disabled() || !mm))
 		goto no_context;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
 	/*
 	 * As per x86, we may deadlock here. However, since the kernel only
 	 * validly references user space from well defined areas of the code,
@@ -206,7 +208,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, addr, flags, NULL);
+	fault = handle_mm_fault(vma, addr, flags, regs);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
@@ -228,22 +230,7 @@ good_area:
 			goto bad_area;
 	}
 
-	/*
-	 * Major/minor page fault accounting is only done on the initial
-	 * attempt. If we go through a retry, it is extremely likely that the
-	 * page will be found in page cache at that point.
-	 */
-	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
-				      1, regs, addr);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
-				      1, regs, addr);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-nios2-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (78 preceding siblings ...)
  2020-07-09  0:07 ` + mm-nds32-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-openrisc-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (152 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: ley.foon.tan, mm-commits, peterx


The patch titled
     Subject: mm/nios2: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-nios2-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-nios2-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-nios2-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/nios2: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-14-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/nios2/mm/fault.c |   14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

--- a/arch/nios2/mm/fault.c~mm-nios2-use-general-page-fault-accounting
+++ a/arch/nios2/mm/fault.c
@@ -24,6 +24,7 @@
 #include <linux/mm.h>
 #include <linux/extable.h>
 #include <linux/uaccess.h>
+#include <linux/perf_event.h>
 
 #include <asm/mmu_context.h>
 #include <asm/traps.h>
@@ -83,6 +84,8 @@ asmlinkage void do_page_fault(struct pt_
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
 	if (!mmap_read_trylock(mm)) {
 		if (!user_mode(regs) && !search_exception_tables(regs->ea))
 			goto bad_area_nosemaphore;
@@ -131,7 +134,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -146,16 +149,7 @@ good_area:
 		BUG();
 	}
 
-	/*
-	 * Major/minor page fault accounting is only done on the
-	 * initial attempt. If we go through a retry, it is extremely
-	 * likely that the page will be found in page cache at that point.
-	 */
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-openrisc-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (79 preceding siblings ...)
  2020-07-09  0:07 ` + mm-nios2-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-parisc-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (151 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: jonas, mm-commits, peterx, shorne, stefan.kristiansson


The patch titled
     Subject: mm/openrisc: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-openrisc-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-openrisc-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-openrisc-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/openrisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-15-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Stafford Horne <shorne@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/openrisc/mm/fault.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/arch/openrisc/mm/fault.c~mm-openrisc-use-general-page-fault-accounting
+++ a/arch/openrisc/mm/fault.c
@@ -15,6 +15,7 @@
 #include <linux/interrupt.h>
 #include <linux/extable.h>
 #include <linux/sched/signal.h>
+#include <linux/perf_event.h>
 
 #include <linux/uaccess.h>
 #include <asm/siginfo.h>
@@ -103,6 +104,8 @@ asmlinkage void do_page_fault(struct pt_
 	if (in_interrupt() || !mm)
 		goto no_context;
 
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma(mm, address);
@@ -159,7 +162,7 @@ good_area:
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -176,10 +179,6 @@ good_area:
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
 		/*RGD modeled on Cris */
-		if (fault & VM_FAULT_MAJOR)
-			tsk->maj_flt++;
-		else
-			tsk->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-parisc-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (80 preceding siblings ...)
  2020-07-09  0:07 ` + mm-openrisc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-powerpc-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (150 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: deller, James.Bottomley, mm-commits, peterx


The patch titled
     Subject: mm/parisc: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-parisc-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-parisc-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-parisc-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/parisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page
fault accounting when a page fault retry happens.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too.  Note that
the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are
now done in handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-16-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/parisc/mm/fault.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- a/arch/parisc/mm/fault.c~mm-parisc-use-general-page-fault-accounting
+++ a/arch/parisc/mm/fault.c
@@ -18,6 +18,7 @@
 #include <linux/extable.h>
 #include <linux/uaccess.h>
 #include <linux/hugetlb.h>
+#include <linux/perf_event.h>
 
 #include <asm/traps.h>
 
@@ -281,6 +282,7 @@ void do_page_fault(struct pt_regs *regs,
 	acc_type = parisc_acctyp(code, regs->iir);
 	if (acc_type & VM_WRITE)
 		flags |= FAULT_FLAG_WRITE;
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma_prev(mm, address, &prev_vma);
@@ -302,7 +304,7 @@ good_area:
 	 * fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -323,10 +325,6 @@ good_area:
 		BUG();
 	}
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			/*
 			 * No need to mmap_read_unlock(mm) as we would
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-powerpc-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (81 preceding siblings ...)
  2020-07-09  0:07 ` + mm-parisc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-riscv-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (149 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: benh, mm-commits, mpe, paulus, peterx


The patch titled
     Subject: mm/powerpc: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-powerpc-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-powerpc-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-powerpc-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/powerpc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/mm/fault.c |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

--- a/arch/powerpc/mm/fault.c~mm-powerpc-use-general-page-fault-accounting
+++ a/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	major |= fault & VM_FAULT_MAJOR;
 
@@ -633,14 +633,9 @@ good_area:
 	/*
 	 * Major/minor page fault accounting.
 	 */
-	if (major) {
-		current->maj_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+	if (major)
 		cmo_account_page_fault();
-	} else {
-		current->min_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-	}
+
 	return 0;
 }
 NOKPROBE_SYMBOL(__do_page_fault);
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-riscv-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (82 preceding siblings ...)
  2020-07-09  0:07 ` + mm-powerpc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-s390-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (148 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: aou, mm-commits, palmer, paul.walmsley, penberg, peterx


The patch titled
     Subject: mm/riscv: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-riscv-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-riscv-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-riscv-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/riscv: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
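
The duplication comes from where the old per-arch pattern bumped the
counter: before the VM_FAULT_RETRY check, so a fault that needed a retry
was accounted once per attempt rather than once per fault.  A
compile-alone model of that control flow, with stand-in names rather than
the kernel's:

#include <stdio.h>

#define VM_FAULT_RETRY  0x1u

static unsigned long min_flt;

/* Pretend the first attempt has to drop mmap_lock and retry. */
static unsigned int fake_handle_mm_fault(int attempt)
{
        return attempt == 0 ? VM_FAULT_RETRY : 0;
}

int main(void)
{
        for (int attempt = 0; ; attempt++) {
                unsigned int fault = fake_handle_mm_fault(attempt);

                min_flt++;                      /* old accounting spot */
                if (fault & VM_FAULT_RETRY)
                        continue;               /* i.e. goto retry */
                break;
        }
        printf("min_flt=%lu\n", min_flt);       /* 2 for one visible fault */
        return 0;
}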

Link: http://lkml.kernel.org/r/20200707225021.200906-18-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/riscv/mm/fault.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

--- a/arch/riscv/mm/fault.c~mm-riscv-use-general-page-fault-accounting
+++ a/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, addr, flags, NULL);
+	fault = handle_mm_fault(vma, addr, flags, regs);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
@@ -127,21 +127,7 @@ good_area:
 		BUG();
 	}
 
-	/*
-	 * Major/minor page fault accounting is only done on the
-	 * initial attempt. If we go through a retry, it is extremely
-	 * likely that the page will be found in page cache at that point.
-	 */
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
-				      1, regs, addr);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
-				      1, regs, addr);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-s390-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (83 preceding siblings ...)
  2020-07-09  0:07 ` + mm-riscv-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-sh-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (147 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: agordeev, borntraeger, gerald.schaefer, gor, heiko.carstens,
	mm-commits, peterx


The patch titled
     Subject: mm/s390: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-s390-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-s390-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-s390-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/s390: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.

Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/s390/mm/fault.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

--- a/arch/s390/mm/fault.c~mm-s390-use-general-page-fault-accounting
+++ a/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ retry:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 	if (fault_signal_pending(fault, regs)) {
 		fault = VM_FAULT_SIGNAL;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
@@ -488,21 +488,7 @@ retry:
 	if (unlikely(fault & VM_FAULT_ERROR))
 		goto out_up;
 
-	/*
-	 * Major/minor page fault accounting is only done on the
-	 * initial attempt. If we go through a retry, it is extremely
-	 * likely that the page will be found in page cache at that point.
-	 */
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-				      regs, address);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-				      regs, address);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			if (IS_ENABLED(CONFIG_PGSTE) && gmap &&
 			    (flags & FAULT_FLAG_RETRY_NOWAIT)) {
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sh-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (84 preceding siblings ...)
  2020-07-09  0:07 ` + mm-s390-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-sparc32-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (146 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: dalias, mm-commits, peterx, ysato


The patch titled
     Subject: mm/sh: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-sh-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sh-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sh-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sh: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.

Link: http://lkml.kernel.org/r/20200707225021.200906-20-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sh/mm/fault.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

--- a/arch/sh/mm/fault.c~mm-sh-use-general-page-fault-accounting
+++ a/arch/sh/mm/fault.c
@@ -482,22 +482,13 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
 		if (mm_fault_error(regs, error_code, address, fault))
 			return;
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			tsk->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-				      regs, address);
-		} else {
-			tsk->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-				      regs, address);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sparc32-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (85 preceding siblings ...)
  2020-07-09  0:07 ` + mm-sh-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-sparc64-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (145 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: davem, mm-commits, peterx


The patch titled
     Subject: mm/sparc32: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-sparc32-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparc32-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparc32-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sparc32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.

Link: http://lkml.kernel.org/r/20200707225021.200906-21-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sparc/mm/fault_32.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

--- a/arch/sparc/mm/fault_32.c~mm-sparc32-use-general-page-fault-accounting
+++ a/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -250,15 +250,6 @@ good_area:
 	}
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			current->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
-				      1, regs, address);
-		} else {
-			current->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
-				      1, regs, address);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sparc64-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (86 preceding siblings ...)
  2020-07-09  0:07 ` + mm-sparc32-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-x86-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (144 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: davem, mm-commits, peterx


The patch titled
     Subject: mm/sparc64: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-sparc64-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparc64-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparc64-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sparc64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.

Link: http://lkml.kernel.org/r/20200707225021.200906-22-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sparc/mm/fault_64.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

--- a/arch/sparc/mm/fault_64.c~mm-sparc64-use-general-page-fault-accounting
+++ a/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ good_area:
 			goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		goto exit_exception;
@@ -438,15 +438,6 @@ good_area:
 	}
 
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR) {
-			current->maj_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
-				      1, regs, address);
-		} else {
-			current->min_flt++;
-			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
-				      1, regs, address);
-		}
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-x86-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (87 preceding siblings ...)
  2020-07-09  0:07 ` + mm-sparc64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-xtensa-use-general-page-fault-accounting.patch " Andrew Morton
                   ` (143 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: bp, dave.hansen, hpa, luto, mingo, mm-commits, peterx, peterz, tglx


The patch titled
     Subject: mm/x86: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-x86-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-x86-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-x86-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/x86: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().
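
On x86 the conversion also drops the local `major |= fault &
VM_FAULT_MAJOR` accumulation; the generic code derives a close equivalent
from FAULT_FLAG_TRIED, which the retry loop sets once a retry has
happened, so any retried fault is treated as major.  A minimal sketch of
that rule, with stand-in flag values:

#include <assert.h>
#include <stdbool.h>

#define VM_FAULT_MAJOR          0x1u
#define FAULT_FLAG_TRIED        0x2u

static bool fault_was_major(unsigned int ret, unsigned int flags)
{
        return (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
}

int main(void)
{
        assert(fault_was_major(VM_FAULT_MAJOR, 0));     /* major, first try */
        assert(fault_was_major(0, FAULT_FLAG_TRIED));   /* retried => major */
        assert(!fault_was_major(0, 0));                 /* plain minor fault */
        return 0;
}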

Link: http://lkml.kernel.org/r/20200707225021.200906-23-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/mm/fault.c |   17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

--- a/arch/x86/mm/fault.c~mm-x86-use-general-page-fault-accounting
+++ a/arch/x86/mm/fault.c
@@ -1139,7 +1139,7 @@ void do_user_addr_fault(struct pt_regs *
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
 	struct mm_struct *mm;
-	vm_fault_t fault, major = 0;
+	vm_fault_t fault;
 	unsigned int flags = FAULT_FLAG_DEFAULT;
 
 	tsk = current;
@@ -1291,8 +1291,7 @@ good_area:
 	 * userland). The return to userland is identified whenever
 	 * FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
-	major |= fault & VM_FAULT_MAJOR;
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
@@ -1319,18 +1318,6 @@ good_area:
 		return;
 	}
 
-	/*
-	 * Major/minor page fault accounting. If any of the events
-	 * returned VM_FAULT_MAJOR, we account it as a major fault.
-	 */
-	if (major) {
-		tsk->maj_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
-	} else {
-		tsk->min_flt++;
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-	}
-
 	check_v8086_mode(regs, address, tsk);
 }
 NOKPROBE_SYMBOL(do_user_addr_fault);
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-xtensa-use-general-page-fault-accounting.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (88 preceding siblings ...)
  2020-07-09  0:07 ` + mm-x86-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
                   ` (142 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: chris, jcmvbkbc, mm-commits, peterx


The patch titled
     Subject: mm/xtensa: use general page fault accounting
has been added to the -mm tree.  Its filename is
     mm-xtensa-use-general-page-fault-accounting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-xtensa-use-general-page-fault-accounting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-xtensa-use-general-page-fault-accounting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/xtensa: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.

Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because they
are now also emitted in handle_mm_fault().

Move the PERF_COUNT_SW_PAGE_FAULTS event earlier, before taking mmap_sem
for the fault, so that it matches the rest of the architectures.

Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/xtensa/mm/fault.c |   15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

--- a/arch/xtensa/mm/fault.c~mm-xtensa-use-general-page-fault-accounting
+++ a/arch/xtensa/mm/fault.c
@@ -72,6 +72,9 @@ void do_page_fault(struct pt_regs *regs)
 
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma(mm, address);
@@ -107,7 +110,7 @@ good_area:
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags, NULL);
+	fault = handle_mm_fault(vma, address, flags, regs);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -122,10 +125,6 @@ good_area:
 		BUG();
 	}
 	if (flags & FAULT_FLAG_ALLOW_RETRY) {
-		if (fault & VM_FAULT_MAJOR)
-			current->maj_flt++;
-		else
-			current->min_flt++;
 		if (fault & VM_FAULT_RETRY) {
 			flags |= FAULT_FLAG_TRIED;
 
@@ -139,12 +138,6 @@ good_area:
 	}
 
 	mmap_read_unlock(mm);
-	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
-	if (flags & VM_FAULT_MAJOR)
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
-	else
-		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-
 	return;
 
 	/* Something tried to access memory that isn't in our memory map..
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (89 preceding siblings ...)
  2020-07-09  0:07 ` + mm-xtensa-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  0:07   ` Andrew Morton
  2020-07-09  0:07 ` + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch " Andrew Morton
                   ` (141 subsequent siblings)
  232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
	chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
	gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
	James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
	luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
	paul.walmsley, paulus, penberg, peterx, peterz, rth, shorne,
	stefan.kristiansson, tglx, tony.luck, tsbogend, vgupta, will,
	ysato


The patch titled
     Subject: mm: clean up the last pieces of page fault accountings
has been added to the -mm tree.  Its filename is
     mm-clean-up-the-last-pieces-of-page-fault-accountings.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: clean up the last pieces of page fault accountings

Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():

arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
arch/sparc/mm/fault_32.c:        force_user_fault
arch/um/kernel/trap.c:           handle_page_fault
mm/gup.c:                        faultin_page
                                 fixup_user_fault
mm/hmm.c:                        hmm_vma_fault
mm/ksm.c:                        break_ksm

Some of them have the issue of duplicated accounting for page fault
retries, and some of them didn't do the accounting at all.

This patch cleans all these up by letting handle_mm_fault() do per-task
page fault accounting even if regs==NULL (though we'll still skip the
perf event accounting).  With that, we can safely remove all the
outliers now.

There's another functional change in that we now account the page faults
to the caller of gup, rather than the task_struct that was passed into
the gup code.  More information on this can be found at [1].

After this patch, the following things should never be touched again
outside handle_mm_fault():

  - task_struct.[maj|min]_flt
  - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]

[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/
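
The attribution change is worth spelling out: because the counters are
bumped on `current` inside handle_mm_fault(), a GUP-driven fault is now
charged to the task performing the gup call rather than to a task_struct
threaded through the gup code.  A rough, compile-alone model with
stand-in names:

#include <stdio.h>

struct task {
        unsigned long min_flt;
};

static struct task gup_caller;          /* models `current` at the call site */

/* regs == NULL on this path: only the caller's counter moves. */
static void handle_mm_fault_model(void)
{
        gup_caller.min_flt++;
}

/* New-style gup entry point: no task_struct parameter at all. */
static void fixup_user_fault_model(void)
{
        handle_mm_fault_model();
}

int main(void)
{
        fixup_user_fault_model();
        printf("gup caller min_flt=%lu\n", gup_caller.min_flt);
        return 0;
}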

Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/mm/copro_fault.c |    5 -----
 arch/um/kernel/trap.c         |    4 ----
 mm/gup.c                      |   13 -------------
 mm/memory.c                   |   17 ++++++++++-------
 4 files changed, 10 insertions(+), 29 deletions(-)

--- a/arch/powerpc/mm/copro_fault.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/powerpc/mm/copro_fault.c
@@ -76,11 +76,6 @@ int copro_handle_mm_fault(struct mm_stru
 		BUG();
 	}
 
-	if (*flt & VM_FAULT_MAJOR)
-		current->maj_flt++;
-	else
-		current->min_flt++;
-
 out_unlock:
 	mmap_read_unlock(mm);
 	return ret;
--- a/arch/um/kernel/trap.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/um/kernel/trap.c
@@ -88,10 +88,6 @@ good_area:
 			BUG();
 		}
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			if (fault & VM_FAULT_MAJOR)
-				current->maj_flt++;
-			else
-				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
 				flags |= FAULT_FLAG_TRIED;
 
--- a/mm/gup.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/gup.c
@@ -893,13 +893,6 @@ static int faultin_page(struct task_stru
 		BUG();
 	}
 
-	if (tsk) {
-		if (ret & VM_FAULT_MAJOR)
-			tsk->maj_flt++;
-		else
-			tsk->min_flt++;
-	}
-
 	if (ret & VM_FAULT_RETRY) {
 		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
 			*locked = 0;
@@ -1255,12 +1248,6 @@ retry:
 		goto retry;
 	}
 
-	if (tsk) {
-		if (major)
-			tsk->maj_flt++;
-		else
-			tsk->min_flt++;
-	}
 	return 0;
 }
 EXPORT_SYMBOL_GPL(fixup_user_fault);
--- a/mm/memory.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/memory.c
@@ -4409,20 +4409,23 @@ static inline void mm_account_fault(stru
 	 */
 	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
 
+	if (major)
+		current->maj_flt++;
+	else
+		current->min_flt++;
+
 	/*
-	 * If the fault is done for GUP, regs will be NULL, and we will skip
-	 * the fault accounting.
+	 * If the fault is done for GUP, regs will be NULL.  We only do the
+	 * accounting for the per thread fault counters who triggered the
+	 * fault, and we skip the perf event updates.
 	 */
 	if (!regs)
 		return;
 
-	if (major) {
-		current->maj_flt++;
+	if (major)
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
-	} else {
-		current->min_flt++;
+	else
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-	}
 }
 
 /*
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (90 preceding siblings ...)
  2020-07-09  0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
@ 2020-07-09  0:07 ` Andrew Morton
  2020-07-09  2:04 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch " Andrew Morton
                   ` (140 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  0:07 UTC (permalink / raw)
  To: jhubbard, mm-commits, peterx


The patch titled
     Subject: mm/gup: remove task_struct pointer for all gup code
has been added to the -mm tree.  Its filename is
     mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/gup: remove task_struct pointer for all gup code

After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.
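
For callers the conversion is mechanical; taking the arch/arc hunk below
as an example, a call site goes from

	ret = fixup_user_fault(current, current->mm, (unsigned long) uaddr,
			       FAULT_FLAG_WRITE, NULL);

to

	ret = fixup_user_fault(current->mm, (unsigned long) uaddr,
			       FAULT_FLAG_WRITE, NULL);

with the accounting now landing on `current` inside handle_mm_fault().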

Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arc/kernel/process.c                   |    2 
 arch/s390/kvm/interrupt.c                   |    2 
 arch/s390/kvm/kvm-s390.c                    |    2 
 arch/s390/kvm/priv.c                        |    8 -
 arch/s390/mm/gmap.c                         |    4 
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |    2 
 drivers/infiniband/core/umem_odp.c          |    2 
 drivers/vfio/vfio_iommu_type1.c             |    4 
 fs/exec.c                                   |    2 
 include/linux/mm.h                          |    9 -
 kernel/events/uprobes.c                     |    6 -
 kernel/futex.c                              |    2 
 mm/gup.c                                    |  101 +++++++-----------
 mm/memory.c                                 |    2 
 mm/process_vm_access.c                      |    2 
 security/tomoyo/domain.c                    |    2 
 virt/kvm/async_pf.c                         |    2 
 virt/kvm/kvm_main.c                         |    2 
 18 files changed, 69 insertions(+), 87 deletions(-)

--- a/arch/arc/kernel/process.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/arc/kernel/process.c
@@ -91,7 +91,7 @@ fault:
 		 goto fail;
 
 	mmap_read_lock(current->mm);
-	ret = fixup_user_fault(current, current->mm, (unsigned long) uaddr,
+	ret = fixup_user_fault(current->mm, (unsigned long) uaddr,
 			       FAULT_FLAG_WRITE, NULL);
 	mmap_read_unlock(current->mm);
 
--- a/arch/s390/kvm/interrupt.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/interrupt.c
@@ -2768,7 +2768,7 @@ static struct page *get_map_page(struct
 	struct page *page = NULL;
 
 	mmap_read_lock(kvm->mm);
-	get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
+	get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE,
 			      &page, NULL, NULL);
 	mmap_read_unlock(kvm->mm);
 	return page;
--- a/arch/s390/kvm/kvm-s390.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/kvm-s390.c
@@ -1891,7 +1891,7 @@ static long kvm_s390_set_skeys(struct kv
 
 		r = set_guest_storage_key(current->mm, hva, keys[i], 0);
 		if (r) {
-			r = fixup_user_fault(current, current->mm, hva,
+			r = fixup_user_fault(current->mm, hva,
 					     FAULT_FLAG_WRITE, &unlocked);
 			if (r)
 				break;
--- a/arch/s390/kvm/priv.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/priv.c
@@ -273,7 +273,7 @@ retry:
 	rc = get_guest_storage_key(current->mm, vmaddr, &key);
 
 	if (rc) {
-		rc = fixup_user_fault(current, current->mm, vmaddr,
+		rc = fixup_user_fault(current->mm, vmaddr,
 				      FAULT_FLAG_WRITE, &unlocked);
 		if (!rc) {
 			mmap_read_unlock(current->mm);
@@ -319,7 +319,7 @@ retry:
 	mmap_read_lock(current->mm);
 	rc = reset_guest_reference_bit(current->mm, vmaddr);
 	if (rc < 0) {
-		rc = fixup_user_fault(current, current->mm, vmaddr,
+		rc = fixup_user_fault(current->mm, vmaddr,
 				      FAULT_FLAG_WRITE, &unlocked);
 		if (!rc) {
 			mmap_read_unlock(current->mm);
@@ -390,7 +390,7 @@ static int handle_sske(struct kvm_vcpu *
 						m3 & SSKE_MC);
 
 		if (rc < 0) {
-			rc = fixup_user_fault(current, current->mm, vmaddr,
+			rc = fixup_user_fault(current->mm, vmaddr,
 					      FAULT_FLAG_WRITE, &unlocked);
 			rc = !rc ? -EAGAIN : rc;
 		}
@@ -1094,7 +1094,7 @@ static int handle_pfmf(struct kvm_vcpu *
 			rc = cond_set_guest_storage_key(current->mm, vmaddr,
 							key, NULL, nq, mr, mc);
 			if (rc < 0) {
-				rc = fixup_user_fault(current, current->mm, vmaddr,
+				rc = fixup_user_fault(current->mm, vmaddr,
 						      FAULT_FLAG_WRITE, &unlocked);
 				rc = !rc ? -EAGAIN : rc;
 			}
--- a/arch/s390/mm/gmap.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/mm/gmap.c
@@ -649,7 +649,7 @@ retry:
 		rc = vmaddr;
 		goto out_up;
 	}
-	if (fixup_user_fault(current, gmap->mm, vmaddr, fault_flags,
+	if (fixup_user_fault(gmap->mm, vmaddr, fault_flags,
 			     &unlocked)) {
 		rc = -EFAULT;
 		goto out_up;
@@ -879,7 +879,7 @@ static int gmap_pte_op_fixup(struct gmap
 
 	BUG_ON(gmap_is_shadow(gmap));
 	fault_flags = (prot == PROT_WRITE) ? FAULT_FLAG_WRITE : 0;
-	if (fixup_user_fault(current, mm, vmaddr, fault_flags, &unlocked))
+	if (fixup_user_fault(mm, vmaddr, fault_flags, &unlocked))
 		return -EFAULT;
 	if (unlocked)
 		/* lost mmap_lock, caller has to retry __gmap_translate */
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -472,7 +472,7 @@ __i915_gem_userptr_get_pages_worker(stru
 					locked = 1;
 				}
 				ret = pin_user_pages_remote
-					(work->task, mm,
+					(mm,
 					 obj->userptr.ptr + pinned * PAGE_SIZE,
 					 npages - pinned,
 					 flags,
--- a/drivers/infiniband/core/umem_odp.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/infiniband/core/umem_odp.c
@@ -437,7 +437,7 @@ int ib_umem_odp_map_dma_pages(struct ib_
 		 * complex (and doesn't gain us much performance in most use
 		 * cases).
 		 */
-		npages = get_user_pages_remote(owning_process, owning_mm,
+		npages = get_user_pages_remote(owning_mm,
 				user_virt, gup_num_pages,
 				flags, local_page_list, NULL, NULL);
 		mmap_read_unlock(owning_mm);
--- a/drivers/vfio/vfio_iommu_type1.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/vfio/vfio_iommu_type1.c
@@ -425,7 +425,7 @@ static int follow_fault_pfn(struct vm_ar
 	if (ret) {
 		bool unlocked = false;
 
-		ret = fixup_user_fault(NULL, mm, vaddr,
+		ret = fixup_user_fault(mm, vaddr,
 				       FAULT_FLAG_REMOTE |
 				       (write_fault ?  FAULT_FLAG_WRITE : 0),
 				       &unlocked);
@@ -453,7 +453,7 @@ static int vaddr_get_pfn(struct mm_struc
 		flags |= FOLL_WRITE;
 
 	mmap_read_lock(mm);
-	ret = pin_user_pages_remote(NULL, mm, vaddr, 1, flags | FOLL_LONGTERM,
+	ret = pin_user_pages_remote(mm, vaddr, 1, flags | FOLL_LONGTERM,
 				    page, NULL, NULL);
 	if (ret == 1) {
 		*pfn = page_to_pfn(page[0]);
--- a/fs/exec.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/fs/exec.c
@@ -215,7 +215,7 @@ static struct page *get_arg_page(struct
 	 * We are doing an exec().  'current' is the process
 	 * doing the exec and bprm->mm is the new process's mm.
 	 */
-	ret = get_user_pages_remote(current, bprm->mm, pos, 1, gup_flags,
+	ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
 			&page, NULL, NULL);
 	if (ret <= 0)
 		return NULL;
--- a/include/linux/mm.h~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/include/linux/mm.h
@@ -1653,7 +1653,7 @@ int invalidate_inode_page(struct page *p
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
 				  unsigned long address, unsigned int flags,
 				  struct pt_regs *regs);
-extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
+extern int fixup_user_fault(struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
 			    bool *unlocked);
 void unmap_mapping_pages(struct address_space *mapping,
@@ -1669,8 +1669,7 @@ static inline vm_fault_t handle_mm_fault
 	BUG();
 	return VM_FAULT_SIGBUS;
 }
-static inline int fixup_user_fault(struct task_struct *tsk,
-		struct mm_struct *mm, unsigned long address,
+static inline int fixup_user_fault(struct mm_struct *mm, unsigned long address,
 		unsigned int fault_flags, bool *unlocked)
 {
 	/* should never happen if there's no MMU */
@@ -1696,11 +1695,11 @@ extern int access_remote_vm(struct mm_st
 extern int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long addr, void *buf, int len, unsigned int gup_flags);
 
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
 			    unsigned long start, unsigned long nr_pages,
 			    unsigned int gup_flags, struct page **pages,
 			    struct vm_area_struct **vmas, int *locked);
-long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long pin_user_pages_remote(struct mm_struct *mm,
 			   unsigned long start, unsigned long nr_pages,
 			   unsigned int gup_flags, struct page **pages,
 			   struct vm_area_struct **vmas, int *locked);
--- a/kernel/events/uprobes.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/kernel/events/uprobes.c
@@ -376,7 +376,7 @@ __update_ref_ctr(struct mm_struct *mm, u
 	if (!vaddr || !d)
 		return -EINVAL;
 
-	ret = get_user_pages_remote(NULL, mm, vaddr, 1,
+	ret = get_user_pages_remote(mm, vaddr, 1,
 			FOLL_WRITE, &page, &vma, NULL);
 	if (unlikely(ret <= 0)) {
 		/*
@@ -477,7 +477,7 @@ retry:
 	if (is_register)
 		gup_flags |= FOLL_SPLIT_PMD;
 	/* Read the page with vaddr into memory */
-	ret = get_user_pages_remote(NULL, mm, vaddr, 1, gup_flags,
+	ret = get_user_pages_remote(mm, vaddr, 1, gup_flags,
 				    &old_page, &vma, NULL);
 	if (ret <= 0)
 		return ret;
@@ -2029,7 +2029,7 @@ static int is_trap_at_addr(struct mm_str
 	 * but we treat this as a 'remote' access since it is
 	 * essentially a kernel access to the memory.
 	 */
-	result = get_user_pages_remote(NULL, mm, vaddr, 1, FOLL_FORCE, &page,
+	result = get_user_pages_remote(mm, vaddr, 1, FOLL_FORCE, &page,
 			NULL, NULL);
 	if (result < 0)
 		return result;
--- a/kernel/futex.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/kernel/futex.c
@@ -699,7 +699,7 @@ static int fault_in_user_writeable(u32 _
 	int ret;
 
 	mmap_read_lock(mm);
-	ret = fixup_user_fault(current, mm, (unsigned long)uaddr,
+	ret = fixup_user_fault(mm, (unsigned long)uaddr,
 			       FAULT_FLAG_WRITE, NULL);
 	mmap_read_unlock(mm);
 
--- a/mm/gup.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/gup.c
@@ -859,7 +859,7 @@ unmap:
  * does not include FOLL_NOWAIT, the mmap_lock may be released.  If it
  * is, *@locked will be set to 0 and -EBUSY returned.
  */
-static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
+static int faultin_page(struct vm_area_struct *vma,
 		unsigned long address, unsigned int *flags, int *locked)
 {
 	unsigned int fault_flags = 0;
@@ -962,7 +962,6 @@ static int check_vma_flags(struct vm_are
 
 /**
  * __get_user_pages() - pin user pages in memory
- * @tsk:	task_struct of target task
  * @mm:		mm_struct of target mm
  * @start:	starting user address
  * @nr_pages:	number of pages from start to pin
@@ -1021,7 +1020,7 @@ static int check_vma_flags(struct vm_are
  * instead of __get_user_pages. __get_user_pages should be used only if
  * you need some special @gup_flags.
  */
-static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
+static long __get_user_pages(struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas, int *locked)
@@ -1103,8 +1102,7 @@ retry:
 
 		page = follow_page_mask(vma, start, foll_flags, &ctx);
 		if (!page) {
-			ret = faultin_page(tsk, vma, start, &foll_flags,
-					   locked);
+			ret = faultin_page(vma, start, &foll_flags, locked);
 			switch (ret) {
 			case 0:
 				goto retry;
@@ -1178,8 +1176,6 @@ static bool vma_permits_fault(struct vm_
 
 /**
  * fixup_user_fault() - manually resolve a user page fault
- * @tsk:	the task_struct to use for page fault accounting, or
- *		NULL if faults are not to be recorded.
  * @mm:		mm_struct of target mm
  * @address:	user address
  * @fault_flags:flags to pass down to handle_mm_fault()
@@ -1207,7 +1203,7 @@ static bool vma_permits_fault(struct vm_
  * This function will not return with an unlocked mmap_lock. So it has not the
  * same semantics wrt the @mm->mmap_lock as does filemap_fault().
  */
-int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
+int fixup_user_fault(struct mm_struct *mm,
 		     unsigned long address, unsigned int fault_flags,
 		     bool *unlocked)
 {
@@ -1256,8 +1252,7 @@ EXPORT_SYMBOL_GPL(fixup_user_fault);
  * Please note that this function, unlike __get_user_pages will not
  * return 0 for nr_pages > 0 without FOLL_NOWAIT
  */
-static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
-						struct mm_struct *mm,
+static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 						unsigned long start,
 						unsigned long nr_pages,
 						struct page **pages,
@@ -1290,7 +1285,7 @@ static __always_inline long __get_user_p
 	pages_done = 0;
 	lock_dropped = false;
 	for (;;) {
-		ret = __get_user_pages(tsk, mm, start, nr_pages, flags, pages,
+		ret = __get_user_pages(mm, start, nr_pages, flags, pages,
 				       vmas, locked);
 		if (!locked)
 			/* VM_FAULT_RETRY couldn't trigger, bypass */
@@ -1350,7 +1345,7 @@ retry:
 		}
 
 		*locked = 1;
-		ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
+		ret = __get_user_pages(mm, start, 1, flags | FOLL_TRIED,
 				       pages, NULL, locked);
 		if (!*locked) {
 			/* Continue to retry until we succeeded */
@@ -1436,7 +1431,7 @@ long populate_vma_page_range(struct vm_a
 	 * We made sure addr is within a VMA, so the following will
 	 * not result in a stack expansion that recurses back here.
 	 */
-	return __get_user_pages(current, mm, start, nr_pages, gup_flags,
+	return __get_user_pages(mm, start, nr_pages, gup_flags,
 				NULL, NULL, locked);
 }
 
@@ -1520,7 +1515,7 @@ struct page *get_dump_page(unsigned long
 	struct vm_area_struct *vma;
 	struct page *page;
 
-	if (__get_user_pages(current, current->mm, addr, 1,
+	if (__get_user_pages(current->mm, addr, 1,
 			     FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma,
 			     NULL) < 1)
 		return NULL;
@@ -1529,8 +1524,7 @@ struct page *get_dump_page(unsigned long
 }
 #endif /* CONFIG_ELF_CORE */
 #else /* CONFIG_MMU */
-static long __get_user_pages_locked(struct task_struct *tsk,
-		struct mm_struct *mm, unsigned long start,
+static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
 		unsigned long nr_pages, struct page **pages,
 		struct vm_area_struct **vmas, int *locked,
 		unsigned int foll_flags)
@@ -1646,8 +1640,7 @@ static struct page *new_non_cma_page(str
 	return __alloc_pages_node(nid, gfp_mask, 0);
 }
 
-static long check_and_migrate_cma_pages(struct task_struct *tsk,
-					struct mm_struct *mm,
+static long check_and_migrate_cma_pages(struct mm_struct *mm,
 					unsigned long start,
 					unsigned long nr_pages,
 					struct page **pages,
@@ -1721,7 +1714,7 @@ check_again:
 		 * again migrating any new CMA pages which we failed to isolate
 		 * earlier.
 		 */
-		ret = __get_user_pages_locked(tsk, mm, start, nr_pages,
+		ret = __get_user_pages_locked(mm, start, nr_pages,
 						   pages, vmas, NULL,
 						   gup_flags);
 
@@ -1735,8 +1728,7 @@ check_again:
 	return ret;
 }
 #else
-static long check_and_migrate_cma_pages(struct task_struct *tsk,
-					struct mm_struct *mm,
+static long check_and_migrate_cma_pages(struct mm_struct *mm,
 					unsigned long start,
 					unsigned long nr_pages,
 					struct page **pages,
@@ -1751,8 +1743,7 @@ static long check_and_migrate_cma_pages(
  * __gup_longterm_locked() is a wrapper for __get_user_pages_locked which
  * allows us to process the FOLL_LONGTERM flag.
  */
-static long __gup_longterm_locked(struct task_struct *tsk,
-				  struct mm_struct *mm,
+static long __gup_longterm_locked(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long nr_pages,
 				  struct page **pages,
@@ -1777,7 +1768,7 @@ static long __gup_longterm_locked(struct
 		flags = memalloc_nocma_save();
 	}
 
-	rc = __get_user_pages_locked(tsk, mm, start, nr_pages, pages,
+	rc = __get_user_pages_locked(mm, start, nr_pages, pages,
 				     vmas_tmp, NULL, gup_flags);
 
 	if (gup_flags & FOLL_LONGTERM) {
@@ -1792,7 +1783,7 @@ static long __gup_longterm_locked(struct
 			goto out;
 		}
 
-		rc = check_and_migrate_cma_pages(tsk, mm, start, rc, pages,
+		rc = check_and_migrate_cma_pages(mm, start, rc, pages,
 						 vmas_tmp, gup_flags);
 	}
 
@@ -1802,22 +1793,20 @@ out:
 	return rc;
 }
 #else /* !CONFIG_FS_DAX && !CONFIG_CMA */
-static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
-						  struct mm_struct *mm,
+static __always_inline long __gup_longterm_locked(struct mm_struct *mm,
 						  unsigned long start,
 						  unsigned long nr_pages,
 						  struct page **pages,
 						  struct vm_area_struct **vmas,
 						  unsigned int flags)
 {
-	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
 				       NULL, flags);
 }
 #endif /* CONFIG_FS_DAX || CONFIG_CMA */
 
 #ifdef CONFIG_MMU
-static long __get_user_pages_remote(struct task_struct *tsk,
-				    struct mm_struct *mm,
+static long __get_user_pages_remote(struct mm_struct *mm,
 				    unsigned long start, unsigned long nr_pages,
 				    unsigned int gup_flags, struct page **pages,
 				    struct vm_area_struct **vmas, int *locked)
@@ -1836,20 +1825,18 @@ static long __get_user_pages_remote(stru
 		 * This will check the vmas (even if our vmas arg is NULL)
 		 * and return -ENOTSUPP if DAX isn't allowed in this case:
 		 */
-		return __gup_longterm_locked(tsk, mm, start, nr_pages, pages,
+		return __gup_longterm_locked(mm, start, nr_pages, pages,
 					     vmas, gup_flags | FOLL_TOUCH |
 					     FOLL_REMOTE);
 	}
 
-	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
 				       locked,
 				       gup_flags | FOLL_TOUCH | FOLL_REMOTE);
 }
 
 /**
  * get_user_pages_remote() - pin user pages in memory
- * @tsk:	the task_struct to use for page fault accounting, or
- *		NULL if faults are not to be recorded.
  * @mm:		mm_struct of target mm
  * @start:	starting user address
  * @nr_pages:	number of pages from start to pin
@@ -1908,7 +1895,7 @@ static long __get_user_pages_remote(stru
  * should use get_user_pages_remote because it cannot pass
  * FAULT_FLAG_ALLOW_RETRY to handle_mm_fault.
  */
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
 		struct vm_area_struct **vmas, int *locked)
@@ -1920,13 +1907,13 @@ long get_user_pages_remote(struct task_s
 	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
 		return -EINVAL;
 
-	return __get_user_pages_remote(tsk, mm, start, nr_pages, gup_flags,
+	return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
 				       pages, vmas, locked);
 }
 EXPORT_SYMBOL(get_user_pages_remote);
 
 #else /* CONFIG_MMU */
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
 			   unsigned long start, unsigned long nr_pages,
 			   unsigned int gup_flags, struct page **pages,
 			   struct vm_area_struct **vmas, int *locked)
@@ -1934,8 +1921,7 @@ long get_user_pages_remote(struct task_s
 	return 0;
 }
 
-static long __get_user_pages_remote(struct task_struct *tsk,
-				    struct mm_struct *mm,
+static long __get_user_pages_remote(struct mm_struct *mm,
 				    unsigned long start, unsigned long nr_pages,
 				    unsigned int gup_flags, struct page **pages,
 				    struct vm_area_struct **vmas, int *locked)
@@ -1955,11 +1941,10 @@ static long __get_user_pages_remote(stru
  * @vmas:       array of pointers to vmas corresponding to each page.
  *              Or NULL if the caller does not require them.
  *
- * This is the same as get_user_pages_remote(), just with a
- * less-flexible calling convention where we assume that the task
- * and mm being operated on are the current task's and don't allow
- * passing of a locked parameter.  We also obviously don't pass
- * FOLL_REMOTE in here.
+ * This is the same as get_user_pages_remote(), just with a less-flexible
+ * calling convention where we assume that the mm being operated on belongs to
+ * the current task, and doesn't allow passing of a locked parameter.  We also
+ * obviously don't pass FOLL_REMOTE in here.
  */
 long get_user_pages(unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
@@ -1972,7 +1957,7 @@ long get_user_pages(unsigned long start,
 	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
 		return -EINVAL;
 
-	return __gup_longterm_locked(current, current->mm, start, nr_pages,
+	return __gup_longterm_locked(current->mm, start, nr_pages,
 				     pages, vmas, gup_flags | FOLL_TOUCH);
 }
 EXPORT_SYMBOL(get_user_pages);
@@ -1982,7 +1967,7 @@ EXPORT_SYMBOL(get_user_pages);
  *
  *      mmap_read_lock(mm);
  *      do_something()
- *      get_user_pages(tsk, mm, ..., pages, NULL);
+ *      get_user_pages(mm, ..., pages, NULL);
  *      mmap_read_unlock(mm);
  *
  *  to:
@@ -1990,7 +1975,7 @@ EXPORT_SYMBOL(get_user_pages);
  *      int locked = 1;
  *      mmap_read_lock(mm);
  *      do_something()
- *      get_user_pages_locked(tsk, mm, ..., pages, &locked);
+ *      get_user_pages_locked(mm, ..., pages, &locked);
  *      if (locked)
  *          mmap_read_unlock(mm);
  *
@@ -2028,7 +2013,7 @@ long get_user_pages_locked(unsigned long
 	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
 		return -EINVAL;
 
-	return __get_user_pages_locked(current, current->mm, start, nr_pages,
+	return __get_user_pages_locked(current->mm, start, nr_pages,
 				       pages, NULL, locked,
 				       gup_flags | FOLL_TOUCH);
 }
@@ -2038,12 +2023,12 @@ EXPORT_SYMBOL(get_user_pages_locked);
  * get_user_pages_unlocked() is suitable to replace the form:
  *
  *      mmap_read_lock(mm);
- *      get_user_pages(tsk, mm, ..., pages, NULL);
+ *      get_user_pages(mm, ..., pages, NULL);
  *      mmap_read_unlock(mm);
  *
  *  with:
  *
- *      get_user_pages_unlocked(tsk, mm, ..., pages);
+ *      get_user_pages_unlocked(mm, ..., pages);
  *
  * It is functionally equivalent to get_user_pages_fast so
  * get_user_pages_fast should be used instead if specific gup_flags
@@ -2066,7 +2051,7 @@ long get_user_pages_unlocked(unsigned lo
 		return -EINVAL;
 
 	mmap_read_lock(mm);
-	ret = __get_user_pages_locked(current, mm, start, nr_pages, pages, NULL,
+	ret = __get_user_pages_locked(mm, start, nr_pages, pages, NULL,
 				      &locked, gup_flags | FOLL_TOUCH);
 	if (locked)
 		mmap_read_unlock(mm);
@@ -2711,7 +2696,7 @@ static int __gup_longterm_unlocked(unsig
 	 */
 	if (gup_flags & FOLL_LONGTERM) {
 		mmap_read_lock(current->mm);
-		ret = __gup_longterm_locked(current, current->mm,
+		ret = __gup_longterm_locked(current->mm,
 					    start, nr_pages,
 					    pages, NULL, gup_flags);
 		mmap_read_unlock(current->mm);
@@ -2954,10 +2939,8 @@ int pin_user_pages_fast_only(unsigned lo
 EXPORT_SYMBOL_GPL(pin_user_pages_fast_only);
 
 /**
- * pin_user_pages_remote() - pin pages of a remote process (task != current)
+ * pin_user_pages_remote() - pin pages of a remote process
  *
- * @tsk:	the task_struct to use for page fault accounting, or
- *		NULL if faults are not to be recorded.
  * @mm:		mm_struct of target mm
  * @start:	starting user address
  * @nr_pages:	number of pages from start to pin
@@ -2978,7 +2961,7 @@ EXPORT_SYMBOL_GPL(pin_user_pages_fast_on
  * FOLL_PIN means that the pages must be released via unpin_user_page(). Please
  * see Documentation/core-api/pin_user_pages.rst for details.
  */
-long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long pin_user_pages_remote(struct mm_struct *mm,
 			   unsigned long start, unsigned long nr_pages,
 			   unsigned int gup_flags, struct page **pages,
 			   struct vm_area_struct **vmas, int *locked)
@@ -2988,7 +2971,7 @@ long pin_user_pages_remote(struct task_s
 		return -EINVAL;
 
 	gup_flags |= FOLL_PIN;
-	return __get_user_pages_remote(tsk, mm, start, nr_pages, gup_flags,
+	return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
 				       pages, vmas, locked);
 }
 EXPORT_SYMBOL(pin_user_pages_remote);
@@ -3020,7 +3003,7 @@ long pin_user_pages(unsigned long start,
 		return -EINVAL;
 
 	gup_flags |= FOLL_PIN;
-	return __gup_longterm_locked(current, current->mm, start, nr_pages,
+	return __gup_longterm_locked(current->mm, start, nr_pages,
 				     pages, vmas, gup_flags);
 }
 EXPORT_SYMBOL(pin_user_pages);
@@ -3065,7 +3048,7 @@ long pin_user_pages_locked(unsigned long
 		return -EINVAL;
 
 	gup_flags |= FOLL_PIN;
-	return __get_user_pages_locked(current, current->mm, start, nr_pages,
+	return __get_user_pages_locked(current->mm, start, nr_pages,
 				       pages, NULL, locked,
 				       gup_flags | FOLL_TOUCH);
 }
--- a/mm/memory.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/memory.c
@@ -4751,7 +4751,7 @@ int __access_remote_vm(struct task_struc
 		void *maddr;
 		struct page *page = NULL;
 
-		ret = get_user_pages_remote(tsk, mm, addr, 1,
+		ret = get_user_pages_remote(mm, addr, 1,
 				gup_flags, &page, &vma, NULL);
 		if (ret <= 0) {
 #ifndef CONFIG_HAVE_IOREMAP_PROT
--- a/mm/process_vm_access.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/process_vm_access.c
@@ -105,7 +105,7 @@ static int process_vm_rw_single_vec(unsi
 		 * current/current->mm
 		 */
 		mmap_read_lock(mm);
-		pinned_pages = pin_user_pages_remote(task, mm, pa, pinned_pages,
+		pinned_pages = pin_user_pages_remote(mm, pa, pinned_pages,
 						     flags, process_pages,
 						     NULL, &locked);
 		if (locked)
--- a/security/tomoyo/domain.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/security/tomoyo/domain.c
@@ -914,7 +914,7 @@ bool tomoyo_dump_page(struct linux_binpr
 	 * (represented by bprm).  'current' is the process doing
 	 * the execve().
 	 */
-	if (get_user_pages_remote(current, bprm->mm, pos, 1,
+	if (get_user_pages_remote(bprm->mm, pos, 1,
 				FOLL_FORCE, &page, NULL, NULL) <= 0)
 		return false;
 #else
--- a/virt/kvm/async_pf.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/virt/kvm/async_pf.c
@@ -61,7 +61,7 @@ static void async_pf_execute(struct work
 	 * access remotely.
 	 */
 	mmap_read_lock(mm);
-	get_user_pages_remote(NULL, mm, addr, 1, FOLL_WRITE, NULL, NULL,
+	get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, NULL,
 			&locked);
 	if (locked)
 		mmap_read_unlock(mm);
--- a/virt/kvm/kvm_main.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/virt/kvm/kvm_main.c
@@ -1830,7 +1830,7 @@ static int hva_to_pfn_remapped(struct vm
 		 * not call the fault handler, so do it here.
 		 */
 		bool unlocked = false;
-		r = fixup_user_fault(current, current->mm, addr,
+		r = fixup_user_fault(current->mm, addr,
 				     (write_fault ? FAULT_FLAG_WRITE : 0),
 				     &unlocked);
 		if (unlocked)
_

Patches currently in -mm which might be from peterx@redhat.com are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (91 preceding siblings ...)
  2020-07-09  0:07 ` + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch " Andrew Morton
@ 2020-07-09  2:04 ` Andrew Morton
  2020-07-09  2:29 ` mmotm 2020-07-08-19-28 uploaded Andrew Morton
                   ` (139 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  2:04 UTC (permalink / raw)
  To: akpm, anshuman.khandual, daniel.m.jordan, hughd, jhubbard,
	mm-commits, n-horiguchi, willy, ziy


The patch titled
     Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix
has been added to the -mm tree.  Its filename is
     mm-vmstat-add-events-for-thp-migration-without-split-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix

s/hpage_nr_pages/thp_nr_pages/ due to "mm: replace hpage_nr_pages with
thp_nr_pages".

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split-fix
+++ a/mm/migrate.c
@@ -1446,7 +1446,7 @@ retry:
 			 * during migration.
 			 */
 			is_thp = PageTransHuge(page);
-			thp_nr_pages = hpage_nr_pages(page);
+			thp_nr_pages = thp_nr_pages(page);
 			cond_resched();
 
 			if (PageHuge(page))
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-08-19-28 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (93 preceding siblings ...)
  2020-07-09  2:29 ` mmotm 2020-07-08-19-28 uploaded Andrew Morton
@ 2020-07-09  2:29 ` Andrew Morton
  2020-07-09 23:09 ` + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree Andrew Morton
                   ` (137 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09  2:29 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-08-19-28 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random,
hopefully more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch
always points to the latest release, so it is constantly rebased.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* kbuild-move-wtype-limits-to-w=2.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
* mm-mremap-calculate-extent-in-one-place.patch
* mm-mremap-start-addresses-are-properly-aligned.patch
* mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* umh-fix-refcount-underflow-in-fork_usermode_blob.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-remove-call-to-memset-after-dma_alloc_coherent.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* umh-fix-refcount-underflow-in-fork_usermode_blob.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-remove-call-to-memset-after-dma_alloc_coherent.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (94 preceding siblings ...)
  2020-07-09  2:29 ` Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:09 ` + mm-dump-compound-page-information-on-a-second-line.patch " Andrew Morton
                   ` (136 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: handle page->mapping better in dump_page
has been added to the -mm tree.  Its filename is
     mm-handle-page-mapping-better-in-dump_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-handle-page-mapping-better-in-dump_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-handle-page-mapping-better-in-dump_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: handle page->mapping better in dump_page

Patch series "Improvements for dump_page()", v2.

Here's a sample dump of a pagecache tail page with all of the patches
applied:

page:000000006d1c49ca refcount:6 mapcount:0 mapping:00000000136b8d90 index:0x109 pfn:0x6c645
head:000000008bd38076 order:2 compound_mapcount:0 compound_pincount:0
aops:xfs_address_space_operations ino:800042 dentry name:"fd"
flags: 0x4000000000012014(uptodate|lru|private|head)
raw: 4000000000000000 ffffd46ac1b19101 ffffffff00000202 dead000000000004
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
head: 4000000000012014 ffffd46ac1b1bbc8 ffffd46ac1b1bc08 ffff91976f659560
head: 0000000000000108 ffff919773220680 00000006ffffffff 0000000000000000
page dumped because: testing


This patch (of 6):

If we can't call page_mapping() to get the page mapping, handle the
anon/ksm/movable bits correctly.
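
As an illustration only (a userspace sketch, with the flag values
copied from include/linux/page-flags.h; this is not the kernel code
itself), the low bits of page->mapping encode the mapping type, so a
possibly corrupt pointer must be masked before it is treated as an
address_space pointer:

#include <stdio.h>

#define PAGE_MAPPING_ANON	0x1UL
#define PAGE_MAPPING_MOVABLE	0x2UL
#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

int main(void)
{
	/* pretend this value was read from a corrupt struct page */
	unsigned long mapping = 0xffff888012345678UL | PAGE_MAPPING_ANON;

	if (mapping & PAGE_MAPPING_ANON)
		printf("anon/ksm: no address_space to dereference\n");
	else
		printf("address_space at %#lx\n",
		       mapping & ~PAGE_MAPPING_FLAGS);
	return 0;
}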

Link: http://lkml.kernel.org/r/20200709202117.7216-1-willy@infradead.org
Link: http://lkml.kernel.org/r/20200709202117.7216-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/debug.c~mm-handle-page-mapping-better-in-dump_page
+++ a/mm/debug.c
@@ -70,7 +70,12 @@ void __dump_page(struct page *page, cons
 
 	if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
 		/* Corrupt page, cannot call page_mapping */
-		mapping = page->mapping;
+		unsigned long tmp = (unsigned long)page->mapping;
+
+		if (tmp & PAGE_MAPPING_ANON)
+			mapping = NULL;
+		else
+			mapping = (void *)(tmp & ~PAGE_MAPPING_FLAGS);
 		head = page;
 		compound = false;
 	} else {
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-dump-compound-page-information-on-a-second-line.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (95 preceding siblings ...)
  2020-07-09 23:09 ` + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:09 ` + mm-print-head-flags-in-dump_page.patch " Andrew Morton
                   ` (135 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: dump compound page information on a second line
has been added to the -mm tree.  Its filename is
     mm-dump-compound-page-information-on-a-second-line.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-dump-compound-page-information-on-a-second-line.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-dump-compound-page-information-on-a-second-line.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: dump compound page information on a second line

Simplify both the implementation and the output by splitting all the
compound page information onto a second line.

Link: http://lkml.kernel.org/r/20200709202117.7216-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |   30 ++++++++++++------------------
 1 file changed, 12 insertions(+), 18 deletions(-)

--- a/mm/debug.c~mm-dump-compound-page-information-on-a-second-line
+++ a/mm/debug.c
@@ -89,27 +89,21 @@ void __dump_page(struct page *page, cons
 	 */
 	mapcount = PageSlab(head) ? 0 : page_mapcount(page);
 
-	if (compound)
+	pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
+			page, page_ref_count(head), mapcount, mapping,
+			page_to_pgoff(page));
+	if (compound) {
 		if (hpage_pincount_available(page)) {
-			pr_warn("page:%px refcount:%d mapcount:%d mapping:%p "
-				"index:%#lx head:%px order:%u "
-				"compound_mapcount:%d compound_pincount:%d\n",
-				page, page_ref_count(head), mapcount,
-				mapping, page_to_pgoff(page), head,
-				compound_order(head), compound_mapcount(page),
-				compound_pincount(page));
+			pr_warn("head:%px order:%u compound_mapcount:%d compound_pincount:%d\n",
+					head, compound_order(head),
+					compound_mapcount(head),
+					compound_pincount(head));
 		} else {
-			pr_warn("page:%px refcount:%d mapcount:%d mapping:%p "
-				"index:%#lx head:%px order:%u "
-				"compound_mapcount:%d\n",
-				page, page_ref_count(head), mapcount,
-				mapping, page_to_pgoff(page), head,
-				compound_order(head), compound_mapcount(page));
+			pr_warn("head:%px order:%u compound_mapcount:%d\n",
+					head, compound_order(head),
+					compound_mapcount(head));
 		}
-	else
-		pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
-			page, page_ref_count(page), mapcount,
-			mapping, page_to_pgoff(page));
+	}
 	if (PageKsm(page))
 		type = "ksm ";
 	else if (PageAnon(page))
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-print-head-flags-in-dump_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (96 preceding siblings ...)
  2020-07-09 23:09 ` + mm-dump-compound-page-information-on-a-second-line.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:09 ` + mm-switch-dump_page-to-get_kernel_nofault.patch " Andrew Morton
                   ` (134 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: print head flags in dump_page
has been added to the -mm tree.  Its filename is
     mm-print-head-flags-in-dump_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-print-head-flags-in-dump_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-head-flags-in-dump_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print head flags in dump_page

Tail page flags contain very little useful information.  Print the head
page's flags instead.  While the flags will contain "head" for tail pages,
this should not be too confusing as the previous line starts with the word
"head:" and so the flags should be interpreted as belonging to the head
page.

Link: http://lkml.kernel.org/r/20200709202117.7216-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/debug.c~mm-print-head-flags-in-dump_page
+++ a/mm/debug.c
@@ -162,7 +162,7 @@ void __dump_page(struct page *page, cons
 out_mapping:
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
-	pr_warn("%sflags: %#lx(%pGp)%s\n", type, page->flags, &page->flags,
+	pr_warn("%sflags: %#lx(%pGp)%s\n", type, head->flags, &head->flags,
 		page_cma ? " CMA" : "");
 
 hex_only:
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-switch-dump_page-to-get_kernel_nofault.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (97 preceding siblings ...)
  2020-07-09 23:09 ` + mm-print-head-flags-in-dump_page.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:09 ` + mm-print-the-inode-number-in-dump_page.patch " Andrew Morton
                   ` (133 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: switch dump_page to get_kernel_nofault
has been added to the -mm tree.  Its filename is
     mm-switch-dump_page-to-get_kernel_nofault.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-switch-dump_page-to-get_kernel_nofault.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-switch-dump_page-to-get_kernel_nofault.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: switch dump_page to get_kernel_nofault

This is simpler to use than copy_from_kernel_nofault().  Also make some of
the related error messages less verbose.
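
To illustrate only the ergonomic difference (a userspace mock-up; the
real helpers live in the kernel and additionally survive faulting
accesses), the new form infers the size from the destination instead
of taking it as an explicit argument:

#include <stdio.h>
#include <string.h>

/* stand-in for the kernel helper, minus the fault handling */
static int copy_from_kernel_nofault(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

#define get_kernel_nofault(val, ptr) \
	copy_from_kernel_nofault(&(val), (ptr), sizeof(val))

int main(void)
{
	long word = 42, out;

	copy_from_kernel_nofault(&out, &word, sizeof(long));	/* old */
	get_kernel_nofault(out, &word);				/* new */
	printf("out=%ld\n", out);
	return 0;
}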

Link: http://lkml.kernel.org/r/20200709202117.7216-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |   36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

--- a/mm/debug.c~mm-switch-dump_page-to-get_kernel_nofault
+++ a/mm/debug.c
@@ -109,54 +109,50 @@ void __dump_page(struct page *page, cons
 	else if (PageAnon(page))
 		type = "anon ";
 	else if (mapping) {
-		const struct inode *host;
+		struct inode *host;
 		const struct address_space_operations *a_ops;
-		const struct hlist_node *dentry_first;
-		const struct dentry *dentry_ptr;
+		struct hlist_node *dentry_first;
+		struct dentry *dentry_ptr;
 		struct dentry dentry;
 
 		/*
 		 * mapping can be invalid pointer and we don't want to crash
 		 * accessing it, so probe everything depending on it carefully
 		 */
-		if (copy_from_kernel_nofault(&host, &mapping->host,
-					sizeof(struct inode *)) ||
-		    copy_from_kernel_nofault(&a_ops, &mapping->a_ops,
-				sizeof(struct address_space_operations *))) {
-			pr_warn("failed to read mapping->host or a_ops, mapping not a valid kernel address?\n");
+		if (get_kernel_nofault(host, &mapping->host) ||
+		    get_kernel_nofault(a_ops, &mapping->a_ops)) {
+			pr_warn("failed to read mapping contents, not a valid kernel address?\n");
 			goto out_mapping;
 		}
 
 		if (!host) {
-			pr_warn("mapping->a_ops:%ps\n", a_ops);
+			pr_warn("aops:%ps\n", a_ops);
 			goto out_mapping;
 		}
 
-		if (copy_from_kernel_nofault(&dentry_first,
-			&host->i_dentry.first, sizeof(struct hlist_node *))) {
-			pr_warn("mapping->a_ops:%ps with invalid mapping->host inode address %px\n",
-				a_ops, host);
+		if (get_kernel_nofault(dentry_first, &host->i_dentry.first)) {
+			pr_warn("aops:%ps with invalid host inode %px\n",
+					a_ops, host);
 			goto out_mapping;
 		}
 
 		if (!dentry_first) {
-			pr_warn("mapping->a_ops:%ps\n", a_ops);
+			pr_warn("aops:%ps\n", a_ops);
 			goto out_mapping;
 		}
 
 		dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
-		if (copy_from_kernel_nofault(&dentry, dentry_ptr,
-							sizeof(struct dentry))) {
-			pr_warn("mapping->aops:%ps with invalid mapping->host->i_dentry.first %px\n",
-				a_ops, dentry_ptr);
+		if (get_kernel_nofault(dentry, dentry_ptr)) {
+			pr_warn("aops:%ps with invalid dentry %px\n", a_ops,
+					dentry_ptr);
 		} else {
 			/*
 			 * if dentry is corrupted, the %pd handler may still
 			 * crash, but it's unlikely that we reach here with a
 			 * corrupted struct page
 			 */
-			pr_warn("mapping->aops:%ps dentry name:\"%pd\"\n",
-								a_ops, &dentry);
+			pr_warn("aops:%ps dentry name:\"%pd\"\n", a_ops,
+					&dentry);
 		}
 	}
 out_mapping:
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-print-the-inode-number-in-dump_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (98 preceding siblings ...)
  2020-07-09 23:09 ` + mm-switch-dump_page-to-get_kernel_nofault.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:09 ` + mm-print-hashed-address-of-struct-page.patch " Andrew Morton
                   ` (132 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: print the inode number in dump_page
has been added to the -mm tree.  Its filename is
     mm-print-the-inode-number-in-dump_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-print-the-inode-number-in-dump_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-the-inode-number-in-dump_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print the inode number in dump_page

The inode number helps correlate this page with debug messages elsewhere
in the kernel.

Link: http://lkml.kernel.org/r/20200709202117.7216-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/debug.c~mm-print-the-inode-number-in-dump_page
+++ a/mm/debug.c
@@ -137,7 +137,7 @@ void __dump_page(struct page *page, cons
 		}
 
 		if (!dentry_first) {
-			pr_warn("aops:%ps\n", a_ops);
+			pr_warn("aops:%ps ino:%lx\n", a_ops, host->i_ino);
 			goto out_mapping;
 		}
 
@@ -151,8 +151,8 @@ void __dump_page(struct page *page, cons
 			 * crash, but it's unlikely that we reach here with a
 			 * corrupted struct page
 			 */
-			pr_warn("aops:%ps dentry name:\"%pd\"\n", a_ops,
-					&dentry);
+			pr_warn("aops:%ps ino:%lx dentry name:\"%pd\"\n",
+					a_ops, host->i_ino, &dentry);
 		}
 	}
 out_mapping:
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-print-hashed-address-of-struct-page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (99 preceding siblings ...)
  2020-07-09 23:09 ` + mm-print-the-inode-number-in-dump_page.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
  2020-07-09 23:10 ` + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch " Andrew Morton
                   ` (131 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
  To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy


The patch titled
     Subject: mm/debug: print hashed address of struct page
has been added to the -mm tree.  Its filename is
     mm-print-hashed-address-of-struct-page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-print-hashed-address-of-struct-page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-hashed-address-of-struct-page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print hashed address of struct page

The actual address of the struct page isn't particularly helpful, while
the hashed address helps match with other messages elsewhere.  Add the PFN
that the page refers to in order to help diagnose problems where the page
is improperly aligned for the purpose.

Link: http://lkml.kernel.org/r/20200709202117.7216-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/debug.c~mm-print-hashed-address-of-struct-page
+++ a/mm/debug.c
@@ -89,17 +89,17 @@ void __dump_page(struct page *page, cons
 	 */
 	mapcount = PageSlab(head) ? 0 : page_mapcount(page);
 
-	pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
+	pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
 			page, page_ref_count(head), mapcount, mapping,
-			page_to_pgoff(page));
+			page_to_pgoff(page), page_to_pfn(page));
 	if (compound) {
 		if (hpage_pincount_available(page)) {
-			pr_warn("head:%px order:%u compound_mapcount:%d compound_pincount:%d\n",
+			pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
 					head, compound_order(head),
 					compound_mapcount(head),
 					compound_pincount(head));
 		} else {
-			pr_warn("head:%px order:%u compound_mapcount:%d\n",
+			pr_warn("head:%p order:%u compound_mapcount:%d\n",
 					head, compound_order(head),
 					compound_mapcount(head));
 		}
_

Patches currently in -mm which might be from willy@infradead.org are

mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (100 preceding siblings ...)
  2020-07-09 23:09 ` + mm-print-hashed-address-of-struct-page.patch " Andrew Morton
@ 2020-07-09 23:10 ` Andrew Morton
  2020-07-09 23:46 ` + mm-migrate-optimize-migrate_vma_setup-for-holes.patch " Andrew Morton
                   ` (130 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:10 UTC (permalink / raw)
  To: chris, domas, guro, hannes, mhocko, mm-commits, shakeelb, tj


The patch titled
     Subject: mm: memcontrol: avoid workload stalls when lowering memory.high
has been added to the -mm tree.  Its filename is
     mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: memcontrol: avoid workload stalls when lowering memory.high

The memory.high limit is implemented in a way such that the kernel
penalizes all threads which are allocating memory over the limit.
Forcing all threads into synchronous reclaim and adding some
artificial delays allows slowing down the memory consumption and
potentially gives some time for userspace oom handlers/resource
control agents to react.

It works nicely if the memory usage is hitting the limit from below,
however it works sub-optimally if a user adjusts memory.high to a value
way below the current memory usage.  It basically forces all workload
threads (doing any memory allocations) into synchronous reclaim and
sleep.  This makes the workload completely unresponsive for a long
period of time and can also lead to system-wide contention on lru
locks.  It can happen even if the workload is not actually tight on
memory and has, for example, a ton of cold pagecache.

In the current implementation, writing to memory.high causes an atomic
update of the page counter's high value followed by an attempt to
reclaim enough memory to fit into the new limit.  To fix the problem
described above, all we need is to change the order of execution: try to
push the memory usage under the limit first, and only then set the new
high limit.
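
A minimal userspace sketch of the reordering (try_to_reclaim() is a
simplified stand-in for the kernel's reclaim loop, not a real API):

#include <stdio.h>

static unsigned long usage = 16384;		/* pages charged */

static unsigned long try_to_reclaim(unsigned long nr)
{
	unsigned long reclaimed = nr < 1024 ? nr : 1024;

	usage -= reclaimed;
	return reclaimed;
}

int main(void)
{
	unsigned long high = ~0UL;		/* current limit: "max" */
	unsigned long new_high = 4096;

	/* push the usage under the target first... */
	while (usage > new_high && try_to_reclaim(usage - new_high))
		;
	/* ...and only then expose the new limit to allocators */
	high = new_high;

	printf("usage=%lu high=%lu\n", usage, high);
	return 0;
}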

Link: http://lkml.kernel.org/r/20200709194718.189231-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Domas Mituzas <domas@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh
+++ a/mm/memcontrol.c
@@ -6203,8 +6203,6 @@ static ssize_t memory_high_write(struct
 	if (err)
 		return err;
 
-	page_counter_set_high(&memcg->memory, high);
-
 	for (;;) {
 		unsigned long nr_pages = page_counter_read(&memcg->memory);
 		unsigned long reclaimed;
@@ -6228,6 +6226,8 @@ static ssize_t memory_high_write(struct
 			break;
 	}
 
+	page_counter_set_high(&memcg->memory, high);
+
 	return nbytes;
 }
 
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-optimize-migrate_vma_setup-for-holes.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (101 preceding siblings ...)
  2020-07-09 23:10 ` + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch " Andrew Morton
@ 2020-07-09 23:46 ` Andrew Morton
  2020-07-09 23:46 ` + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch " Andrew Morton
                   ` (129 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:46 UTC (permalink / raw)
  To: bharata, hch, jgg, jglisse, jhubbard, mm-commits, rcampbell, shuah


The patch titled
     Subject: mm/migrate: optimize migrate_vma_setup() for holes
has been added to the -mm tree.  Its filename is
     mm-migrate-optimize-migrate_vma_setup-for-holes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-optimize-migrate_vma_setup-for-holes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-optimize-migrate_vma_setup-for-holes.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Ralph Campbell <rcampbell@nvidia.com>
Subject: mm/migrate: optimize migrate_vma_setup() for holes

Patch series "mm/migrate: optimize migrate_vma_setup() for holes".

A simple optimization for migrate_vma_*() when the source vma is not an
anonymous vma and a new test case to exercise it.


This patch (of 2):

When migrating system memory to device private memory, if the source
address range is a valid VMA range and there is no memory or a zero page,
the source PFN array is marked as valid but with no PFN.

This lets the device driver allocate private memory and clear it, then
insert the new device private struct page into the CPU's page tables when
migrate_vma_pages() is called.  migrate_vma_pages() only inserts the new
page if the VMA is an anonymous range.

There is no point in telling the device driver to allocate device private
memory and then not migrate the page.  Instead, mark the source PFN array
entries as not migrating to avoid this overhead.
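
A toy sketch of the src[] encoding this relies on (MIGRATE_PFN_MIGRATE
with the value from include/linux/migrate.h; vma_is_anonymous() is
replaced by a trivial stub for illustration):

#include <stdio.h>

#define MIGRATE_PFN_MIGRATE	(1UL << 1)

static int vma_is_anonymous(int has_vm_ops)
{
	return !has_vm_ops;			/* stub for the sketch */
}

int main(void)
{
	unsigned long anon_hole = vma_is_anonymous(0) ?
					MIGRATE_PFN_MIGRATE : 0;
	unsigned long file_hole = vma_is_anonymous(1) ?
					MIGRATE_PFN_MIGRATE : 0;

	printf("anon hole: src=%#lx (driver may migrate)\n", anon_hole);
	printf("file hole: src=%#lx (driver skips the slot)\n", file_hole);
	return 0;
}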

Link: http://lkml.kernel.org/r/20200709165711.26584-1-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20200709165711.26584-2-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Bharata B Rao" <bharata@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/mm/migrate.c~mm-migrate-optimize-migrate_vma_setup-for-holes
+++ a/mm/migrate.c
@@ -2167,9 +2167,13 @@ static int migrate_vma_collect_hole(unsi
 {
 	struct migrate_vma *migrate = walk->private;
 	unsigned long addr;
+	unsigned long flags;
+
+	/* Only allow populating anonymous memory. */
+	flags = vma_is_anonymous(walk->vma) ? MIGRATE_PFN_MIGRATE : 0;
 
 	for (addr = start; addr < end; addr += PAGE_SIZE) {
-		migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE;
+		migrate->src[migrate->npages] = flags;
 		migrate->dst[migrate->npages] = 0;
 		migrate->npages++;
 		migrate->cpages++;
_

Patches currently in -mm which might be from rcampbell@nvidia.com are

mm-remove-redundant-check-non_swap_entry.patch
mm-migrate-optimize-migrate_vma_setup-for-holes.patch
mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (102 preceding siblings ...)
  2020-07-09 23:46 ` + mm-migrate-optimize-migrate_vma_setup-for-holes.patch " Andrew Morton
@ 2020-07-09 23:46 ` Andrew Morton
  2020-07-10  0:15 ` + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch " Andrew Morton
                   ` (128 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:46 UTC (permalink / raw)
  To: bharata, hch, jgg, jglisse, jhubbard, mm-commits, rcampbell, shuah


The patch titled
     Subject: mm/migrate: add migrate-shared test for migrate_vma_*()
has been added to the -mm tree.  Its filename is
     mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Ralph Campbell <rcampbell@nvidia.com>
Subject: mm/migrate: add migrate-shared test for migrate_vma_*()

Add a migrate_vma_*() self-test for mmap(MAP_SHARED) to verify that
!vma_is_anonymous() ranges won't be migrated.

Link: http://lkml.kernel.org/r/20200709165711.26584-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Bharata B Rao" <bharata@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/hmm-tests.c |   35 +++++++++++++++++++++++
 1 file changed, 35 insertions(+)

--- a/tools/testing/selftests/vm/hmm-tests.c~mm-migrate-add-migrate-shared-test-for-migrate_vma_
+++ a/tools/testing/selftests/vm/hmm-tests.c
@@ -932,6 +932,41 @@ TEST_F(hmm, migrate_fault)
 }
 
 /*
+ * Migrate anonymous shared memory to device private memory.
+ */
+TEST_F(hmm, migrate_shared)
+{
+	struct hmm_buffer *buffer;
+	unsigned long npages;
+	unsigned long size;
+	int ret;
+
+	npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift;
+	ASSERT_NE(npages, 0);
+	size = npages << self->page_shift;
+
+	buffer = malloc(sizeof(*buffer));
+	ASSERT_NE(buffer, NULL);
+
+	buffer->fd = -1;
+	buffer->size = size;
+	buffer->mirror = malloc(size);
+	ASSERT_NE(buffer->mirror, NULL);
+
+	buffer->ptr = mmap(NULL, size,
+			   PROT_READ | PROT_WRITE,
+			   MAP_SHARED | MAP_ANONYMOUS,
+			   buffer->fd, 0);
+	ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+	/* Migrate memory to device. */
+	ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+	ASSERT_EQ(ret, -ENOENT);
+
+	hmm_buffer_free(buffer);
+}
+
+/*
  * Try to migrate various memory types to device private memory.
  */
 TEST_F(hmm2, migrate_mixed)
_

Patches currently in -mm which might be from rcampbell@nvidia.com are

mm-remove-redundant-check-non_swap_entry.patch
mm-migrate-optimize-migrate_vma_setup-for-holes.patch
mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (103 preceding siblings ...)
  2020-07-09 23:46 ` + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch " Andrew Morton
@ 2020-07-10  0:15 ` Andrew Morton
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
                   ` (127 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:15 UTC (permalink / raw)
  To: laoar.shao, mhocko, mm-commits, rientjes


The patch titled
     Subject: mm, oom: make the calculation of oom badness more accurate
has been added to the -mm tree.  Its filename is
     mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yafang Shao <laoar.shao@gmail.com>
Subject: mm, oom: make the calculation of oom badness more accurate

Recently we found an issue in our production environment: when memcg
oom is triggered, the oom killer doesn't choose the process with the
largest resident memory but instead chooses the first scanned process.
Note that all processes in this memcg have the same oom_score_adj, so
the oom killer should choose the process with the largest resident
memory.

Below is part of the oom info, which is enough to analyze this issue.
[7516987.983223] memory: usage 16777216kB, limit 16777216kB, failcnt 52843037
[7516987.983224] memory+swap: usage 16777216kB, limit 9007199254740988kB, failcnt 0
[7516987.983225] kmem: usage 301464kB, limit 9007199254740988kB, failcnt 0
[...]
[7516987.983293] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[7516987.983510] [ 5740]     0  5740      257        1    32768        0          -998 pause
[7516987.983574] [58804]     0 58804     4594      771    81920        0          -998 entry_point.bas
[7516987.983577] [58908]     0 58908     7089      689    98304        0          -998 cron
[7516987.983580] [58910]     0 58910    16235     5576   163840        0          -998 supervisord
[7516987.983590] [59620]     0 59620    18074     1395   188416        0          -998 sshd
[7516987.983594] [59622]     0 59622    18680     6679   188416        0          -998 python
[7516987.983598] [59624]     0 59624  1859266     5161   548864        0          -998 odin-agent
[7516987.983600] [59625]     0 59625   707223     9248   983040        0          -998 filebeat
[7516987.983604] [59627]     0 59627   416433    64239   774144        0          -998 odin-log-agent
[7516987.983607] [59631]     0 59631   180671    15012   385024        0          -998 python3
[7516987.983612] [61396]     0 61396   791287     3189   352256        0          -998 client
[7516987.983615] [61641]     0 61641  1844642    29089   946176        0          -998 client
[7516987.983765] [ 9236]     0  9236     2642      467    53248        0          -998 php_scanner
[7516987.983911] [42898]     0 42898    15543      838   167936        0          -998 su
[7516987.983915] [42900]  1000 42900     3673      867    77824        0          -998 exec_script_vr2
[7516987.983918] [42925]  1000 42925    36475    19033   335872        0          -998 python
[7516987.983921] [57146]  1000 57146     3673      848    73728        0          -998 exec_script_J2p
[7516987.983925] [57195]  1000 57195   186359    22958   491520        0          -998 python2
[7516987.983928] [58376]  1000 58376   275764    14402   290816        0          -998 rosmaster
[7516987.983931] [58395]  1000 58395   155166     4449   245760        0          -998 rosout
[7516987.983935] [58406]  1000 58406 18285584  3967322 37101568        0          -998 data_sim
[7516987.984221] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=3aa16c9482ae3a6f6b78bda68a55d32c87c99b985e0f11331cddf05af6c4d753,mems_allowed=0-1,oom_memcg=/kubepods/podf1c273d3-9b36-11ea-b3df-246e9693c184,task_memcg=/kubepods/podf1c273d3-9b36-11ea-b3df-246e9693c184/1f246a3eeea8f70bf91141eeaf1805346a666e225f823906485ea0b6c37dfc3d,task=pause,pid=5740,uid=0
[7516987.984254] Memory cgroup out of memory: Killed process 5740 (pause) total-vm:1028kB, anon-rss:4kB, file-rss:0kB, shmem-rss:0kB
[7516988.092344] oom_reaper: reaped process 5740 (pause), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

We can see that the first scanned process, 5740 (pause), was killed,
but its rss is only one page.  That is because, when we calculate the
oom badness in oom_badness(), we always ignore negative points and
convert all of them to 1.  Now, as the oom_score_adj of all the
processes in this targeted memcg has the same value of -998, the points
of these processes are all negative.  As a result, the first scanned
process will be killed.

The oom_score_adj (-998) in this memcg is set by kubelet, because it is
a Guaranteed pod, which has higher priority and is meant to be
protected from being killed by a system oom.

To fix this issue, we should make the calculation of oom points more
accurate.  We can achieve this by converting chosen_points from
'unsigned long' to 'long'.
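
A toy calculation (userspace, with made-up numbers patterned on the
log above) showing why the old clamp ties every task at 1:

#include <stdio.h>

/* simplified oom_badness(): rss plus the oom_score_adj bonus */
static long badness(long rss, long adj, long totalpages)
{
	return rss + adj * (totalpages / 1000);
}

int main(void)
{
	long totalpages = 4L * 1024 * 1024;	/* 16GB in 4kB pages */
	long pause = badness(1, -998, totalpages);
	long data_sim = badness(3967322, -998, totalpages);

	/* old behaviour: both scores clamp to 1, scan order decides */
	printf("clamped: pause=%ld data_sim=%ld\n",
	       pause > 0 ? pause : 1, data_sim > 0 ? data_sim : 1);
	/* signed scores still rank the bigger consumer first */
	printf("signed:  pause=%ld data_sim=%ld\n", pause, data_sim);
	return 0;
}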

Link: http://lkml.kernel.org/r/1594309987-9919-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/tty/sysrq.c |    1 +
 fs/proc/base.c      |    7 ++++++-
 include/linux/oom.h |    4 ++--
 mm/memcontrol.c     |    1 +
 mm/oom_kill.c       |   19 ++++++++-----------
 mm/page_alloc.c     |    1 +
 6 files changed, 19 insertions(+), 14 deletions(-)

--- a/drivers/tty/sysrq.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/drivers/tty/sysrq.c
@@ -382,6 +382,7 @@ static void moom_callback(struct work_st
 		.memcg = NULL,
 		.gfp_mask = gfp_mask,
 		.order = -1,
+		.chosen_points = LONG_MIN,
 	};
 
 	mutex_lock(&oom_lock);
--- a/fs/proc/base.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/fs/proc/base.c
@@ -551,8 +551,13 @@ static int proc_oom_score(struct seq_fil
 {
 	unsigned long totalpages = totalram_pages() + total_swap_pages;
 	unsigned long points = 0;
+	long badness;
 
-	points = oom_badness(task, totalpages) * 1000 / totalpages;
+	badness = oom_badness(task, totalpages);
+	if (badness != LONG_MIN) {
+		/* Let's keep the range of points as [0, 2000]. */
+		points = (1000 + badness * 1000 / (long)totalpages) * 2 / 3;
+	}
 	seq_printf(m, "%lu\n", points);
 
 	return 0;
--- a/include/linux/oom.h~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/include/linux/oom.h
@@ -48,7 +48,7 @@ struct oom_control {
 	/* Used by oom implementation, do not set */
 	unsigned long totalpages;
 	struct task_struct *chosen;
-	unsigned long chosen_points;
+	long chosen_points;
 
 	/* Used to print the constraint info. */
 	enum oom_constraint constraint;
@@ -107,7 +107,7 @@ static inline vm_fault_t check_stable_ad
 
 bool __oom_reap_task_mm(struct mm_struct *mm);
 
-extern unsigned long oom_badness(struct task_struct *p,
+long oom_badness(struct task_struct *p,
 		unsigned long totalpages);
 
 extern bool out_of_memory(struct oom_control *oc);
--- a/mm/memcontrol.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/memcontrol.c
@@ -1666,6 +1666,7 @@ static bool mem_cgroup_out_of_memory(str
 		.memcg = memcg,
 		.gfp_mask = gfp_mask,
 		.order = order,
+		.chosen_points = LONG_MIN,
 	};
 	bool ret;
 
--- a/mm/oom_kill.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/oom_kill.c
@@ -196,17 +196,17 @@ static bool is_dump_unreclaim_slabs(void
  * predictable as possible.  The goal is to return the highest value for the
  * task consuming the most memory to avoid subsequent oom failures.
  */
-unsigned long oom_badness(struct task_struct *p, unsigned long totalpages)
+long oom_badness(struct task_struct *p, unsigned long totalpages)
 {
 	long points;
 	long adj;
 
 	if (oom_unkillable_task(p))
-		return 0;
+		return LONG_MIN;
 
 	p = find_lock_task_mm(p);
 	if (!p)
-		return 0;
+		return LONG_MIN;
 
 	/*
 	 * Do not even consider tasks which are explicitly marked oom
@@ -218,7 +218,7 @@ unsigned long oom_badness(struct task_st
 			test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
 			in_vfork(p)) {
 		task_unlock(p);
-		return 0;
+		return LONG_MIN;
 	}
 
 	/*
@@ -233,11 +233,7 @@ unsigned long oom_badness(struct task_st
 	adj *= totalpages / 1000;
 	points += adj;
 
-	/*
-	 * Never return 0 for an eligible task regardless of the root bonus and
-	 * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
-	 */
-	return points > 0 ? points : 1;
+	return points;
 }
 
 static const char * const oom_constraint_text[] = {
@@ -336,12 +332,12 @@ static int oom_evaluate_task(struct task
 	 * killed first if it triggers an oom, then select it.
 	 */
 	if (oom_task_origin(task)) {
-		points = ULONG_MAX;
+		points = LONG_MAX;
 		goto select;
 	}
 
 	points = oom_badness(task, oc->totalpages);
-	if (!points || points < oc->chosen_points)
+	if (points == LONG_MIN || points < oc->chosen_points)
 		goto next;
 
 select:
@@ -1128,6 +1124,7 @@ void pagefault_out_of_memory(void)
 		.memcg = NULL,
 		.gfp_mask = 0,
 		.order = 0,
+		.chosen_points = LONG_MIN,
 	};
 
 	if (mem_cgroup_oom_synchronize(true))
--- a/mm/page_alloc.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/page_alloc.c
@@ -3915,6 +3915,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, un
 		.memcg = NULL,
 		.gfp_mask = gfp_mask,
 		.order = order,
+		.chosen_points = LONG_MIN,
 	};
 	struct page *page;
 
_

Patches currently in -mm which might be from laoar.shao@gmail.com are

mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-close-race-between-munmap-and-expand_upwards-downwards.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (104 preceding siblings ...)
  2020-07-10  0:15 ` + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch " Andrew Morton
@ 2020-07-10  0:23 ` Andrew Morton
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch " Andrew Morton
                   ` (126 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:23 UTC (permalink / raw)
  To: jannh, kirill.shutemov, mm-commits, oleg, stable, vbabka, willy,
	yang.shi


The patch titled
     Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
has been added to the -mm tree.  Its filename is
     mm-close-race-between-munmap-and-expand_upwards-downwards.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()

VMAs with the VM_GROWSDOWN or VM_GROWSUP flag set can change their size
under mmap_read_lock().  This can lead to a race with __do_munmap():

	Thread A			Thread B
__do_munmap()
  detach_vmas_to_be_unmapped()
  mmap_write_downgrade()
				expand_downwards()
				  vma->vm_start = address;
				  // The VMA now overlaps with
				  // VMAs detached by the Thread A
				// page fault populates expanded part
				// of the VMA
  unmap_region()
    // Zaps pagetables partly
    // populated by Thread B

Similar race exists for expand_upwards().

The fix is to avoid downgrading mmap_lock in __do_munmap() if the detached
VMAs are next to a VM_GROWSDOWN or VM_GROWSUP VMA.
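
To make the window concrete, here is a hypothetical, heavily simplified
reproducer sketch (timing-dependent, error handling omitted; it
illustrates the sequence above rather than reliably triggering it):

	#include <pthread.h>
	#include <sys/mman.h>

	#define LEN (16 * 4096)

	static char *stack;	/* the MAP_GROWSDOWN VMA (thread B's side) */
	static char *low;	/* the region unmapped by "thread A" */

	static void *thread_b(void *arg)
	{
		stack[-1] = 1;	/* fault below the VMA: expand_downwards() */
		return NULL;
	}

	int main(void)
	{
		pthread_t t;

		low = mmap(NULL, 2 * LEN, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		/* place a GROWSDOWN mapping directly above the region to unmap */
		stack = mmap(low + LEN, LEN, PROT_READ | PROT_WRITE,
			     MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN | MAP_FIXED,
			     -1, 0);

		pthread_create(&t, NULL, thread_b, NULL);
		munmap(low, LEN);	/* detached VMA borders the GROWSDOWN VMA */
		pthread_join(&t, NULL);
		return 0;
	}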

Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards
+++ a/mm/mmap.c
@@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struc
  * Create a list of vma's touched by the unmap, removing them from the mm's
  * vma list as we go..
  */
-static void
+static bool
 detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct vm_area_struct *prev, unsigned long end)
 {
@@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_str
 
 	/* Kill the cache */
 	vmacache_invalidate(mm);
+
+	/*
+	 * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or
+	 * VM_GROWSUP VMA. Such VMAs can change their size under
+	 * down_read(mmap_sem) and collide with the VMA we are about to unmap.
+	 */
+	if (vma && (vma->vm_flags & VM_GROWSDOWN))
+		return false;
+	if (prev && (prev->vm_flags & VM_GROWSUP))
+		return false;
+	return true;
 }
 
 /*
@@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, un
 	}
 
 	/* Detach vmas from rbtree */
-	detach_vmas_to_be_unmapped(mm, vma, prev, end);
+	if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
+		downgrade = false;
 
 	if (downgrade)
 		mmap_write_downgrade(mm);
_

Patches currently in -mm which might be from kirill.shutemov@linux.intel.com are

mm-close-race-between-munmap-and-expand_upwards-downwards.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (105 preceding siblings ...)
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
@ 2020-07-10  0:23 ` Andrew Morton
  2020-07-10  0:27 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch " Andrew Morton
                   ` (125 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:23 UTC (permalink / raw)
  To: akpm, jannh, kirill.shutemov, mm-commits, oleg, vbabka, willy, yang.shi


The patch titled
     Subject: mm-close-race-between-munmap-and-expand_upwards-downwards-fix
has been added to the -mm tree.  Its filename is
     mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-close-race-between-munmap-and-expand_upwards-downwards-fix

s/mmap_sem/mmap_lock/ in comment

Cc: Jann Horn <jannh@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards-fix
+++ a/mm/mmap.c
@@ -2647,9 +2647,9 @@ detach_vmas_to_be_unmapped(struct mm_str
 	vmacache_invalidate(mm);
 
 	/*
-	 * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or
+	 * Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
 	 * VM_GROWSUP VMA. Such VMAs can change their size under
-	 * down_read(mmap_sem) and collide with the VMA we are about to unmap.
+	 * down_read(mmap_lock) and collide with the VMA we are about to unmap.
 	 */
 	if (vma && (vma->vm_flags & VM_GROWSDOWN))
 		return false;
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (106 preceding siblings ...)
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch " Andrew Morton
@ 2020-07-10  0:27 ` Andrew Morton
  2020-07-10  0:33 ` + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
                   ` (124 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:27 UTC (permalink / raw)
  To: anshuman.khandual, daniel.m.jordan, hughd, jhubbard, mm-commits,
	n-horiguchi, rdunlap, willy, ziy


The patch titled
     Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix-2
has been added to the -mm tree.  Its filename is
     mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Zi Yan <ziy@nvidia.com>
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix-2

- Renamed THP_MIGRATION_FAILURE to THP_MIGRATION_FAIL per John
- Dropped all conditional 'if' blocks in migrate_pages() per Andrew and John
- Updated the migration events documentation per John
- Renamed the thp_n_pages variable to nr_subpages to avoid an expected
  merge conflict
- Moved all new THP vmstat events under CONFIG_MIGRATION (a sketch for
  reading the renamed counters follows below)
- Updated the Cc list with Documentation/ and tracing related addresses
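
A minimal sketch for reading the renamed counters (assumes a kernel with
this patch applied and CONFIG_MIGRATION=y):

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[128];
		FILE *f = fopen("/proc/vmstat", "r");

		if (!f)
			return 1;
		/* print the pgmigrate_* and thp_migration_* counters */
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "pgmigrate_", 10) ||
			    !strncmp(line, "thp_migration_", 14))
				fputs(line, stdout);
		fclose(f);
		return 0;
	}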

Link: http://lkml.kernel.org/r/C5E3C65C-8253-4638-9D3C-71A61858BB8B@nvidia.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/page_migration.rst |   40 +++++++++++++++-----------
 include/linux/vm_event_item.h       |    6 +--
 mm/migrate.c                        |   25 ++++++----------
 mm/vmstat.c                         |    6 +--
 4 files changed, 40 insertions(+), 37 deletions(-)

--- a/Documentation/vm/page_migration.rst~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/Documentation/vm/page_migration.rst
@@ -253,24 +253,32 @@ which are function pointers of struct ad
      PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
      for own purpose.
 
-Quantifying Migration
+Monitoring Migration
 =====================
-Following events can be used to quantify page migration.
 
-1. PGMIGRATE_SUCCESS       /* Normal page migration success */
-2. PGMIGRATE_FAIL          /* Normal page migration failure */
-3. THP_MIGRATION_SUCCESS   /* Transparent huge page migration success */
-4. THP_MIGRATION_FAILURE   /* Transparent huge page migration failure */
-5. THP_MIGRATION_SPLIT     /* Transparent huge page got split, retried */
-
-THP_MIGRATION_SUCCESS is when THP is migrated successfully without getting
-split into it's subpages. THP_MIGRATION_FAILURE is when THP could neither
-be migrated nor be split. THP_MIGRATION_SPLIT is when THP could not
-just be migrated as is but instead get split into it's subpages and later
-retried as normal pages. THP events would also update normal page migration
-statistics PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE. These events will help
-in quantifying and analyzing various THP migration events including both
-success and failure cases.
+The following events (counters) can be used to monitor page migration.
+
+1. PGMIGRATE_SUCCESS: Normal page migration success. Each count means that a
+   page was migrated. If the page was a non-THP page, then this counter is
+   increased by one. If the page was a THP, then this counter is increased by
+   the number of THP subpages. For example, migration of a single 2MB THP that
+   has 4KB-size base pages (subpages) will cause this counter to increase by
+   512.
+
+2. PGMIGRATE_FAIL: Normal page migration failure. Same counting rules as for
+   _SUCCESS, above: this will be increased by the number of subpages, if it was
+   a THP.
+
+3. THP_MIGRATION_SUCCESS: A THP was migrated without being split.
+
+4. THP_MIGRATION_FAIL: A THP could neither be migrated nor split.
+
+5. THP_MIGRATION_SPLIT: A THP was migrated, but not as such: first, the THP had
+   to be split. After splitting, a migration retry was used for its subpages.
+
+THP_MIGRATION_* events also update the appropriate PGMIGRATE_SUCCESS or
+PGMIGRATE_FAIL events. For example, a THP migration failure will cause both
+THP_MIGRATION_FAIL and PGMIGRATE_FAIL to increase.
 
 Christoph Lameter, May 8, 2006.
 Minchan Kim, Mar 28, 2016.
--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/include/linux/vm_event_item.h
@@ -56,6 +56,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 #endif
 #ifdef CONFIG_MIGRATION
 		PGMIGRATE_SUCCESS, PGMIGRATE_FAIL,
+		THP_MIGRATION_SUCCESS,
+		THP_MIGRATION_FAIL,
+		THP_MIGRATION_SPLIT,
 #endif
 #ifdef CONFIG_COMPACTION
 		COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED,
@@ -95,9 +98,6 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 		THP_ZERO_PAGE_ALLOC_FAILED,
 		THP_SWPOUT,
 		THP_SWPOUT_FALLBACK,
-		THP_MIGRATION_SUCCESS,
-		THP_MIGRATION_FAILURE,
-		THP_MIGRATION_SPLIT,
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 		BALLOON_INFLATE,
--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/mm/migrate.c
@@ -1429,7 +1429,7 @@ int migrate_pages(struct list_head *from
 	struct page *page;
 	struct page *page2;
 	int swapwrite = current->flags & PF_SWAPWRITE;
-	int rc, thp_n_pages;
+	int rc, nr_subpages;
 
 	if (!swapwrite)
 		current->flags |= PF_SWAPWRITE;
@@ -1446,7 +1446,7 @@ retry:
 			 * during migration.
 			 */
 			is_thp = PageTransHuge(page);
-			thp_n_pages = thp_nr_pages(page);
+			nr_subpages = thp_nr_pages(page);
 			cond_resched();
 
 			if (PageHuge(page))
@@ -1483,7 +1483,7 @@ retry:
 				}
 				if (is_thp) {
 					nr_thp_failed++;
-					nr_failed += thp_n_pages;
+					nr_failed += nr_subpages;
 					goto out;
 				}
 				nr_failed++;
@@ -1498,7 +1498,7 @@ retry:
 			case MIGRATEPAGE_SUCCESS:
 				if (is_thp) {
 					nr_thp_succeeded++;
-					nr_succeeded += thp_n_pages;
+					nr_succeeded += nr_subpages;
 					break;
 				}
 				nr_succeeded++;
@@ -1512,7 +1512,7 @@ retry:
 				 */
 				if (is_thp) {
 					nr_thp_failed++;
-					nr_failed += thp_n_pages;
+					nr_failed += nr_subpages;
 					break;
 				}
 				nr_failed++;
@@ -1524,16 +1524,11 @@ retry:
 	nr_thp_failed += thp_retry;
 	rc = nr_failed;
 out:
-	if (nr_succeeded)
-		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
-	if (nr_failed)
-		count_vm_events(PGMIGRATE_FAIL, nr_failed);
-	if (nr_thp_succeeded)
-		count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
-	if (nr_thp_failed)
-		count_vm_events(THP_MIGRATION_FAILURE, nr_thp_failed);
-	if (nr_thp_split)
-		count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
+	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
+	count_vm_events(PGMIGRATE_FAIL, nr_failed);
+	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
+	count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
+	count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
 	trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
 			       nr_thp_failed, nr_thp_split, mode, reason);
 
--- a/mm/vmstat.c~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/mm/vmstat.c
@@ -1274,6 +1274,9 @@ const char * const vmstat_text[] = {
 #ifdef CONFIG_MIGRATION
 	"pgmigrate_success",
 	"pgmigrate_fail",
+	"thp_migration_success",
+	"thp_migration_fail",
+	"thp_migration_split",
 #endif
 #ifdef CONFIG_COMPACTION
 	"compact_migrate_scanned",
@@ -1320,9 +1323,6 @@ const char * const vmstat_text[] = {
 	"thp_zero_page_alloc_failed",
 	"thp_swpout",
 	"thp_swpout_fallback",
-	"thp_migration_success",
-	"thp_migration_failure",
-	"thp_migration_split",
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 	"balloon_inflate",
_

Patches currently in -mm which might be from ziy@nvidia.com are

mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (107 preceding siblings ...)
  2020-07-10  0:27 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch " Andrew Morton
@ 2020-07-10  0:33 ` Andrew Morton
  2020-07-10  0:33 ` + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
                   ` (123 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:33 UTC (permalink / raw)
  To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
	geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
	krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
	ysato


The patch titled
     Subject: iomap: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree.  Its filename is
     iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: iomap: constify ioreadX() iomem argument (as in generic implementation)

Patch series "iomap: Constify ioreadX() iomem argument", v3.

The ioread8/16/32() helpers and friends have an inconsistent interface
among the architectures: some take the address as a pointer to const,
some do not.

It seems there is nothing really stopping all of them from taking a
pointer to const.


This patch (of 4):

The ioreadX() and ioreadX_rep() helpers have an inconsistent interface.  On
some architectures the void __iomem * address argument is a pointer to
const, on some it is not.

Implementations of ioreadX() do not modify the memory under the address,
so they can be converted to a "const" version for const-safety and
consistency among architectures.
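
As an illustration of the const-safety gained (a hypothetical helper, not
part of the series), a driver can now keep a read-only register window
behind a const pointer:

	/* would not compile against the old non-const ioread32() prototype */
	static u32 sum_regs(const void __iomem *base, unsigned int n)
	{
		u32 sum = 0;
		unsigned int i;

		for (i = 0; i < n; i++)
			sum += ioread32(base + i * 4);
		return sum;
	}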

Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org
Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/include/asm/core_apecs.h   |    6 +-
 arch/alpha/include/asm/core_cia.h     |    6 +-
 arch/alpha/include/asm/core_lca.h     |    6 +-
 arch/alpha/include/asm/core_marvel.h  |    4 -
 arch/alpha/include/asm/core_mcpcia.h  |    6 +-
 arch/alpha/include/asm/core_t2.h      |    2 
 arch/alpha/include/asm/io.h           |   12 ++--
 arch/alpha/include/asm/io_trivial.h   |   16 ++---
 arch/alpha/include/asm/jensen.h       |    2 
 arch/alpha/include/asm/machvec.h      |    6 +-
 arch/alpha/kernel/core_marvel.c       |    2 
 arch/alpha/kernel/io.c                |   12 ++--
 arch/parisc/include/asm/io.h          |    4 -
 arch/parisc/lib/iomap.c               |   72 ++++++++++++------------
 arch/powerpc/kernel/iomap.c           |   28 ++++-----
 arch/sh/kernel/iomap.c                |   22 +++----
 include/asm-generic/iomap.h           |   28 ++++-----
 include/linux/io-64-nonatomic-hi-lo.h |    4 -
 include/linux/io-64-nonatomic-lo-hi.h |    4 -
 lib/iomap.c                           |   30 +++++-----
 20 files changed, 136 insertions(+), 136 deletions(-)

--- a/arch/alpha/include/asm/core_apecs.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_apecs.h
@@ -384,7 +384,7 @@ struct el_apecs_procdata
 		}						\
 	} while (0)
 
-__EXTERN_INLINE unsigned int apecs_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread8(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -420,7 +420,7 @@ __EXTERN_INLINE void apecs_iowrite8(u8 b
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int apecs_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread16(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -456,7 +456,7 @@ __EXTERN_INLINE void apecs_iowrite16(u16
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int apecs_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread32(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	if (addr < APECS_DENSE_MEM)
--- a/arch/alpha/include/asm/core_cia.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_cia.h
@@ -342,7 +342,7 @@ struct el_CIA_sysdata_mcheck {
 #define vuip	volatile unsigned int __force *
 #define vulp	volatile unsigned long __force *
 
-__EXTERN_INLINE unsigned int cia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread8(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -374,7 +374,7 @@ __EXTERN_INLINE void cia_iowrite8(u8 b,
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int cia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread16(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -404,7 +404,7 @@ __EXTERN_INLINE void cia_iowrite16(u16 b
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int cia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread32(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	if (addr < CIA_DENSE_MEM)
--- a/arch/alpha/include/asm/core_lca.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_lca.h
@@ -230,7 +230,7 @@ union el_lca {
 	} while (0)
 
 
-__EXTERN_INLINE unsigned int lca_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread8(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -266,7 +266,7 @@ __EXTERN_INLINE void lca_iowrite8(u8 b,
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int lca_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread16(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	unsigned long result, base_and_type;
@@ -302,7 +302,7 @@ __EXTERN_INLINE void lca_iowrite16(u16 b
 	*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int lca_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread32(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	if (addr < LCA_DENSE_MEM)
--- a/arch/alpha/include/asm/core_marvel.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_marvel.h
@@ -332,10 +332,10 @@ struct io7 {
 #define vucp	volatile unsigned char __force *
 #define vusp	volatile unsigned short __force *
 
-extern unsigned int marvel_ioread8(void __iomem *);
+extern unsigned int marvel_ioread8(const void __iomem *);
 extern void marvel_iowrite8(u8 b, void __iomem *);
 
-__EXTERN_INLINE unsigned int marvel_ioread16(void __iomem *addr)
+__EXTERN_INLINE unsigned int marvel_ioread16(const void __iomem *addr)
 {
 	return __kernel_ldwu(*(vusp)addr);
 }
--- a/arch/alpha/include/asm/core_mcpcia.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_mcpcia.h
@@ -267,7 +267,7 @@ extern inline int __mcpcia_is_mmio(unsig
 	return (addr & 0x80000000UL) == 0;
 }
 
-__EXTERN_INLINE unsigned int mcpcia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread8(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
 	unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -291,7 +291,7 @@ __EXTERN_INLINE void mcpcia_iowrite8(u8
 	*(vuip) ((addr << 5) + hose + 0x00) = w;
 }
 
-__EXTERN_INLINE unsigned int mcpcia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread16(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
 	unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -315,7 +315,7 @@ __EXTERN_INLINE void mcpcia_iowrite16(u1
 	*(vuip) ((addr << 5) + hose + 0x08) = w;
 }
 
-__EXTERN_INLINE unsigned int mcpcia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread32(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long)xaddr;
 
--- a/arch/alpha/include/asm/core_t2.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_t2.h
@@ -572,7 +572,7 @@ __EXTERN_INLINE int t2_is_mmio(const vol
    it doesn't make sense to merge the pio and mmio routines.  */
 
 #define IOPORT(OS, NS)							\
-__EXTERN_INLINE unsigned int t2_ioread##NS(void __iomem *xaddr)		\
+__EXTERN_INLINE unsigned int t2_ioread##NS(const void __iomem *xaddr)		\
 {									\
 	if (t2_is_mmio(xaddr))						\
 		return t2_read##OS(xaddr);				\
--- a/arch/alpha/include/asm/io.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/io.h
@@ -150,9 +150,9 @@ static inline void generic_##NAME(TYPE b
 	alpha_mv.mv_##NAME(b, addr);					\
 }
 
-REMAP1(unsigned int, ioread8, /**/)
-REMAP1(unsigned int, ioread16, /**/)
-REMAP1(unsigned int, ioread32, /**/)
+REMAP1(unsigned int, ioread8, const)
+REMAP1(unsigned int, ioread16, const)
+REMAP1(unsigned int, ioread32, const)
 REMAP1(u8, readb, const volatile)
 REMAP1(u16, readw, const volatile)
 REMAP1(u32, readl, const volatile)
@@ -307,7 +307,7 @@ static inline int __is_mmio(const volati
  */
 
 #if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
-extern inline unsigned int ioread8(void __iomem *addr)
+extern inline unsigned int ioread8(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
@@ -316,7 +316,7 @@ extern inline unsigned int ioread8(void
 	return ret;
 }
 
-extern inline unsigned int ioread16(void __iomem *addr)
+extern inline unsigned int ioread16(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
@@ -359,7 +359,7 @@ extern inline void outw(u16 b, unsigned
 #endif
 
 #if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
-extern inline unsigned int ioread32(void __iomem *addr)
+extern inline unsigned int ioread32(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
--- a/arch/alpha/include/asm/io_trivial.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/io_trivial.h
@@ -7,15 +7,15 @@
 
 #if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
 __EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread8)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread8)(const void __iomem *a)
 {
-	return __kernel_ldbu(*(volatile u8 __force *)a);
+	return __kernel_ldbu(*(const volatile u8 __force *)a);
 }
 
 __EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread16)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread16)(const void __iomem *a)
 {
-	return __kernel_ldwu(*(volatile u16 __force *)a);
+	return __kernel_ldwu(*(const volatile u16 __force *)a);
 }
 
 __EXTERN_INLINE void
@@ -33,9 +33,9 @@ IO_CONCAT(__IO_PREFIX,iowrite16)(u16 b,
 
 #if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
 __EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread32)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread32)(const void __iomem *a)
 {
-	return *(volatile u32 __force *)a;
+	return *(const volatile u32 __force *)a;
 }
 
 __EXTERN_INLINE void
@@ -73,14 +73,14 @@ IO_CONCAT(__IO_PREFIX,writew)(u16 b, vol
 __EXTERN_INLINE u8
 IO_CONCAT(__IO_PREFIX,readb)(const volatile void __iomem *a)
 {
-	void __iomem *addr = (void __iomem *)a;
+	const void __iomem *addr = (const void __iomem *)a;
 	return IO_CONCAT(__IO_PREFIX,ioread8)(addr);
 }
 
 __EXTERN_INLINE u16
 IO_CONCAT(__IO_PREFIX,readw)(const volatile void __iomem *a)
 {
-	void __iomem *addr = (void __iomem *)a;
+	const void __iomem *addr = (const void __iomem *)a;
 	return IO_CONCAT(__IO_PREFIX,ioread16)(addr);
 }
 
--- a/arch/alpha/include/asm/jensen.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/jensen.h
@@ -305,7 +305,7 @@ __EXTERN_INLINE int jensen_is_mmio(const
    that it doesn't make sense to merge them.  */
 
 #define IOPORT(OS, NS)							\
-__EXTERN_INLINE unsigned int jensen_ioread##NS(void __iomem *xaddr)	\
+__EXTERN_INLINE unsigned int jensen_ioread##NS(const void __iomem *xaddr)	\
 {									\
 	if (jensen_is_mmio(xaddr))					\
 		return jensen_read##OS(xaddr - 0x100000000ul);		\
--- a/arch/alpha/include/asm/machvec.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/machvec.h
@@ -46,9 +46,9 @@ struct alpha_machine_vector
 	void (*mv_pci_tbi)(struct pci_controller *hose,
 			   dma_addr_t start, dma_addr_t end);
 
-	unsigned int (*mv_ioread8)(void __iomem *);
-	unsigned int (*mv_ioread16)(void __iomem *);
-	unsigned int (*mv_ioread32)(void __iomem *);
+	unsigned int (*mv_ioread8)(const void __iomem *);
+	unsigned int (*mv_ioread16)(const void __iomem *);
+	unsigned int (*mv_ioread32)(const void __iomem *);
 
 	void (*mv_iowrite8)(u8, void __iomem *);
 	void (*mv_iowrite16)(u16, void __iomem *);
--- a/arch/alpha/kernel/core_marvel.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/kernel/core_marvel.c
@@ -806,7 +806,7 @@ void __iomem *marvel_ioportmap (unsigned
 }
 
 unsigned int
-marvel_ioread8(void __iomem *xaddr)
+marvel_ioread8(const void __iomem *xaddr)
 {
 	unsigned long addr = (unsigned long) xaddr;
 	if (__marvel_is_port_kbd(addr))
--- a/arch/alpha/kernel/io.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/kernel/io.c
@@ -14,7 +14,7 @@
    "generic", which bumps through the machine vector.  */
 
 unsigned int
-ioread8(void __iomem *addr)
+ioread8(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
@@ -23,7 +23,7 @@ ioread8(void __iomem *addr)
 	return ret;
 }
 
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
@@ -32,7 +32,7 @@ unsigned int ioread16(void __iomem *addr
 	return ret;
 }
 
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
 {
 	unsigned int ret;
 	mb();
@@ -257,7 +257,7 @@ EXPORT_SYMBOL(readq_relaxed);
 /*
  * Read COUNT 8-bit bytes from port PORT into memory starting at SRC.
  */
-void ioread8_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *port, void *dst, unsigned long count)
 {
 	while ((unsigned long)dst & 0x3) {
 		if (!count)
@@ -300,7 +300,7 @@ EXPORT_SYMBOL(insb);
  * the interfaces seems to be slow: just using the inlined version
  * of the inw() breaks things.
  */
-void ioread16_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *port, void *dst, unsigned long count)
 {
 	if (unlikely((unsigned long)dst & 0x3)) {
 		if (!count)
@@ -340,7 +340,7 @@ EXPORT_SYMBOL(insw);
  * but the interfaces seems to be slow: just using the inlined version
  * of the inl() breaks things.
  */
-void ioread32_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *port, void *dst, unsigned long count)
 {
 	if (unlikely((unsigned long)dst & 0x3)) {
 		while (count--) {
--- a/arch/parisc/include/asm/io.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/parisc/include/asm/io.h
@@ -303,8 +303,8 @@ extern void outsl (unsigned long port, c
 #define ioread64be ioread64be
 #define iowrite64 iowrite64
 #define iowrite64be iowrite64be
-extern u64 ioread64(void __iomem *addr);
-extern u64 ioread64be(void __iomem *addr);
+extern u64 ioread64(const void __iomem *addr);
+extern u64 ioread64be(const void __iomem *addr);
 extern void iowrite64(u64 val, void __iomem *addr);
 extern void iowrite64be(u64 val, void __iomem *addr);
 
--- a/arch/parisc/lib/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/parisc/lib/iomap.c
@@ -43,13 +43,13 @@
 #endif
 
 struct iomap_ops {
-	unsigned int (*read8)(void __iomem *);
-	unsigned int (*read16)(void __iomem *);
-	unsigned int (*read16be)(void __iomem *);
-	unsigned int (*read32)(void __iomem *);
-	unsigned int (*read32be)(void __iomem *);
-	u64 (*read64)(void __iomem *);
-	u64 (*read64be)(void __iomem *);
+	unsigned int (*read8)(const void __iomem *);
+	unsigned int (*read16)(const void __iomem *);
+	unsigned int (*read16be)(const void __iomem *);
+	unsigned int (*read32)(const void __iomem *);
+	unsigned int (*read32be)(const void __iomem *);
+	u64 (*read64)(const void __iomem *);
+	u64 (*read64be)(const void __iomem *);
 	void (*write8)(u8, void __iomem *);
 	void (*write16)(u16, void __iomem *);
 	void (*write16be)(u16, void __iomem *);
@@ -57,9 +57,9 @@ struct iomap_ops {
 	void (*write32be)(u32, void __iomem *);
 	void (*write64)(u64, void __iomem *);
 	void (*write64be)(u64, void __iomem *);
-	void (*read8r)(void __iomem *, void *, unsigned long);
-	void (*read16r)(void __iomem *, void *, unsigned long);
-	void (*read32r)(void __iomem *, void *, unsigned long);
+	void (*read8r)(const void __iomem *, void *, unsigned long);
+	void (*read16r)(const void __iomem *, void *, unsigned long);
+	void (*read32r)(const void __iomem *, void *, unsigned long);
 	void (*write8r)(void __iomem *, const void *, unsigned long);
 	void (*write16r)(void __iomem *, const void *, unsigned long);
 	void (*write32r)(void __iomem *, const void *, unsigned long);
@@ -69,17 +69,17 @@ struct iomap_ops {
 
 #define ADDR2PORT(addr) ((unsigned long __force)(addr) & 0xffffff)
 
-static unsigned int ioport_read8(void __iomem *addr)
+static unsigned int ioport_read8(const void __iomem *addr)
 {
 	return inb(ADDR2PORT(addr));
 }
 
-static unsigned int ioport_read16(void __iomem *addr)
+static unsigned int ioport_read16(const void __iomem *addr)
 {
 	return inw(ADDR2PORT(addr));
 }
 
-static unsigned int ioport_read32(void __iomem *addr)
+static unsigned int ioport_read32(const void __iomem *addr)
 {
 	return inl(ADDR2PORT(addr));
 }
@@ -99,17 +99,17 @@ static void ioport_write32(u32 datum, vo
 	outl(datum, ADDR2PORT(addr));
 }
 
-static void ioport_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read8r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	insb(ADDR2PORT(addr), dst, count);
 }
 
-static void ioport_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read16r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	insw(ADDR2PORT(addr), dst, count);
 }
 
-static void ioport_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read32r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	insl(ADDR2PORT(addr), dst, count);
 }
@@ -150,37 +150,37 @@ static const struct iomap_ops ioport_ops
 
 /* Legacy I/O memory ops */
 
-static unsigned int iomem_read8(void __iomem *addr)
+static unsigned int iomem_read8(const void __iomem *addr)
 {
 	return readb(addr);
 }
 
-static unsigned int iomem_read16(void __iomem *addr)
+static unsigned int iomem_read16(const void __iomem *addr)
 {
 	return readw(addr);
 }
 
-static unsigned int iomem_read16be(void __iomem *addr)
+static unsigned int iomem_read16be(const void __iomem *addr)
 {
 	return __raw_readw(addr);
 }
 
-static unsigned int iomem_read32(void __iomem *addr)
+static unsigned int iomem_read32(const void __iomem *addr)
 {
 	return readl(addr);
 }
 
-static unsigned int iomem_read32be(void __iomem *addr)
+static unsigned int iomem_read32be(const void __iomem *addr)
 {
 	return __raw_readl(addr);
 }
 
-static u64 iomem_read64(void __iomem *addr)
+static u64 iomem_read64(const void __iomem *addr)
 {
 	return readq(addr);
 }
 
-static u64 iomem_read64be(void __iomem *addr)
+static u64 iomem_read64be(const void __iomem *addr)
 {
 	return __raw_readq(addr);
 }
@@ -220,7 +220,7 @@ static void iomem_write64be(u64 datum, v
 	__raw_writel(datum, addr);
 }
 
-static void iomem_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read8r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	while (count--) {
 		*(u8 *)dst = __raw_readb(addr);
@@ -228,7 +228,7 @@ static void iomem_read8r(void __iomem *a
 	}
 }
 
-static void iomem_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read16r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	while (count--) {
 		*(u16 *)dst = __raw_readw(addr);
@@ -236,7 +236,7 @@ static void iomem_read16r(void __iomem *
 	}
 }
 
-static void iomem_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read32r(const void __iomem *addr, void *dst, unsigned long count)
 {
 	while (count--) {
 		*(u32 *)dst = __raw_readl(addr);
@@ -297,49 +297,49 @@ static const struct iomap_ops *iomap_ops
 };
 
 
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read8(addr);
 	return *((u8 *)addr);
 }
 
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read16(addr);
 	return le16_to_cpup((u16 *)addr);
 }
 
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read16be(addr);
 	return *((u16 *)addr);
 }
 
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read32(addr);
 	return le32_to_cpup((u32 *)addr);
 }
 
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read32be(addr);
 	return *((u32 *)addr);
 }
 
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read64(addr);
 	return le64_to_cpup((u64 *)addr);
 }
 
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
 {
 	if (unlikely(INDIRECT_ADDR(addr)))
 		return iomap_ops[ADDR_TO_REGION(addr)]->read64be(addr);
@@ -411,7 +411,7 @@ void iowrite64be(u64 datum, void __iomem
 
 /* Repeating interfaces */
 
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	if (unlikely(INDIRECT_ADDR(addr))) {
 		iomap_ops[ADDR_TO_REGION(addr)]->read8r(addr, dst, count);
@@ -423,7 +423,7 @@ void ioread8_rep(void __iomem *addr, voi
 	}
 }
 
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	if (unlikely(INDIRECT_ADDR(addr))) {
 		iomap_ops[ADDR_TO_REGION(addr)]->read16r(addr, dst, count);
@@ -435,7 +435,7 @@ void ioread16_rep(void __iomem *addr, vo
 	}
 }
 
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	if (unlikely(INDIRECT_ADDR(addr))) {
 		iomap_ops[ADDR_TO_REGION(addr)]->read32r(addr, dst, count);
--- a/arch/powerpc/kernel/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/powerpc/kernel/iomap.c
@@ -15,23 +15,23 @@
  * Here comes the ppc64 implementation of the IOMAP 
  * interfaces.
  */
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
 {
 	return readb(addr);
 }
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
 {
 	return readw(addr);
 }
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
 {
 	return readw_be(addr);
 }
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
 {
 	return readl(addr);
 }
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
 {
 	return readl_be(addr);
 }
@@ -41,27 +41,27 @@ EXPORT_SYMBOL(ioread16be);
 EXPORT_SYMBOL(ioread32);
 EXPORT_SYMBOL(ioread32be);
 #ifdef __powerpc64__
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
 {
 	return readq(addr);
 }
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
 {
 	return readq(addr);
 }
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
 {
 	return readq(addr);
 }
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
 {
 	return readq_be(addr);
 }
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
 {
 	return readq_be(addr);
 }
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
 {
 	return readq_be(addr);
 }
@@ -139,15 +139,15 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
  * FIXME! We could make these do EEH handling if we really
  * wanted. Not clear if we do.
  */
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	readsb(addr, dst, count);
 }
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	readsw(addr, dst, count);
 }
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	readsl(addr, dst, count);
 }
--- a/arch/sh/kernel/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/sh/kernel/iomap.c
@@ -8,31 +8,31 @@
 #include <linux/module.h>
 #include <linux/io.h>
 
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
 {
 	return readb(addr);
 }
 EXPORT_SYMBOL(ioread8);
 
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
 {
 	return readw(addr);
 }
 EXPORT_SYMBOL(ioread16);
 
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
 {
 	return be16_to_cpu(__raw_readw(addr));
 }
 EXPORT_SYMBOL(ioread16be);
 
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
 {
 	return readl(addr);
 }
 EXPORT_SYMBOL(ioread32);
 
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
 {
 	return be32_to_cpu(__raw_readl(addr));
 }
@@ -74,7 +74,7 @@ EXPORT_SYMBOL(iowrite32be);
  * convert to CPU byte order. We write in "IO byte
  * order" (we also don't have IO barriers).
  */
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
 {
 	while (--count >= 0) {
 		u8 data = __raw_readb(addr);
@@ -83,7 +83,7 @@ static inline void mmio_insb(void __iome
 	}
 }
 
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
 {
 	while (--count >= 0) {
 		u16 data = __raw_readw(addr);
@@ -92,7 +92,7 @@ static inline void mmio_insw(void __iome
 	}
 }
 
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
 {
 	while (--count >= 0) {
 		u32 data = __raw_readl(addr);
@@ -125,19 +125,19 @@ static inline void mmio_outsl(void __iom
 	}
 }
 
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	mmio_insb(addr, dst, count);
 }
 EXPORT_SYMBOL(ioread8_rep);
 
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	mmio_insw(addr, dst, count);
 }
 EXPORT_SYMBOL(ioread16_rep);
 
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	mmio_insl(addr, dst, count);
 }
--- a/include/asm-generic/iomap.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/asm-generic/iomap.h
@@ -26,14 +26,14 @@
  * in the low address range. Architectures for which this is not
  * true can't use this generic implementation.
  */
-extern unsigned int ioread8(void __iomem *);
-extern unsigned int ioread16(void __iomem *);
-extern unsigned int ioread16be(void __iomem *);
-extern unsigned int ioread32(void __iomem *);
-extern unsigned int ioread32be(void __iomem *);
+extern unsigned int ioread8(const void __iomem *);
+extern unsigned int ioread16(const void __iomem *);
+extern unsigned int ioread16be(const void __iomem *);
+extern unsigned int ioread32(const void __iomem *);
+extern unsigned int ioread32be(const void __iomem *);
 #ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+extern u64 ioread64(const void __iomem *);
+extern u64 ioread64be(const void __iomem *);
 #endif
 
 #ifdef readq
@@ -41,10 +41,10 @@ extern u64 ioread64be(void __iomem *);
 #define ioread64_hi_lo ioread64_hi_lo
 #define ioread64be_lo_hi ioread64be_lo_hi
 #define ioread64be_hi_lo ioread64be_hi_lo
-extern u64 ioread64_lo_hi(void __iomem *addr);
-extern u64 ioread64_hi_lo(void __iomem *addr);
-extern u64 ioread64be_lo_hi(void __iomem *addr);
-extern u64 ioread64be_hi_lo(void __iomem *addr);
+extern u64 ioread64_lo_hi(const void __iomem *addr);
+extern u64 ioread64_hi_lo(const void __iomem *addr);
+extern u64 ioread64be_lo_hi(const void __iomem *addr);
+extern u64 ioread64be_hi_lo(const void __iomem *addr);
 #endif
 
 extern void iowrite8(u8, void __iomem *);
@@ -79,9 +79,9 @@ extern void iowrite64be_hi_lo(u64 val, v
  * memory across multiple ports, use "memcpy_toio()"
  * and friends.
  */
-extern void ioread8_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread16_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread32_rep(void __iomem *port, void *buf, unsigned long count);
+extern void ioread8_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread16_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread32_rep(const void __iomem *port, void *buf, unsigned long count);
 
 extern void iowrite8_rep(void __iomem *port, const void *buf, unsigned long count);
 extern void iowrite16_rep(void __iomem *port, const void *buf, unsigned long count);
--- a/include/linux/io-64-nonatomic-hi-lo.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/linux/io-64-nonatomic-hi-lo.h
@@ -57,7 +57,7 @@ static inline void hi_lo_writeq_relaxed(
 
 #ifndef ioread64_hi_lo
 #define ioread64_hi_lo ioread64_hi_lo
-static inline u64 ioread64_hi_lo(void __iomem *addr)
+static inline u64 ioread64_hi_lo(const void __iomem *addr)
 {
 	u32 low, high;
 
@@ -79,7 +79,7 @@ static inline void iowrite64_hi_lo(u64 v
 
 #ifndef ioread64be_hi_lo
 #define ioread64be_hi_lo ioread64be_hi_lo
-static inline u64 ioread64be_hi_lo(void __iomem *addr)
+static inline u64 ioread64be_hi_lo(const void __iomem *addr)
 {
 	u32 low, high;
 
--- a/include/linux/io-64-nonatomic-lo-hi.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/linux/io-64-nonatomic-lo-hi.h
@@ -57,7 +57,7 @@ static inline void lo_hi_writeq_relaxed(
 
 #ifndef ioread64_lo_hi
 #define ioread64_lo_hi ioread64_lo_hi
-static inline u64 ioread64_lo_hi(void __iomem *addr)
+static inline u64 ioread64_lo_hi(const void __iomem *addr)
 {
 	u32 low, high;
 
@@ -79,7 +79,7 @@ static inline void iowrite64_lo_hi(u64 v
 
 #ifndef ioread64be_lo_hi
 #define ioread64be_lo_hi ioread64be_lo_hi
-static inline u64 ioread64be_lo_hi(void __iomem *addr)
+static inline u64 ioread64be_lo_hi(const void __iomem *addr)
 {
 	u32 low, high;
 
--- a/lib/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/lib/iomap.c
@@ -70,27 +70,27 @@ static void bad_io_access(unsigned long
 #define mmio_read64be(addr) swab64(readq(addr))
 #endif
 
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
 {
 	IO_COND(addr, return inb(port), return readb(addr));
 	return 0xff;
 }
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
 {
 	IO_COND(addr, return inw(port), return readw(addr));
 	return 0xffff;
 }
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read16be(port), return mmio_read16be(addr));
 	return 0xffff;
 }
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
 {
 	IO_COND(addr, return inl(port), return readl(addr));
 	return 0xffffffff;
 }
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read32be(port), return mmio_read32be(addr));
 	return 0xffffffff;
@@ -142,26 +142,26 @@ static u64 pio_read64be_hi_lo(unsigned l
 	return lo | (hi << 32);
 }
 
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
 	return 0xffffffffffffffffULL;
 }
 
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64_hi_lo(port), return readq(addr));
 	return 0xffffffffffffffffULL;
 }
 
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64be_lo_hi(port),
 		return mmio_read64be(addr));
 	return 0xffffffffffffffffULL;
 }
 
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64be_hi_lo(port),
 		return mmio_read64be(addr));
@@ -275,7 +275,7 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
  * order" (we also don't have IO barriers).
  */
 #ifndef mmio_insb
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
 {
 	while (--count >= 0) {
 		u8 data = __raw_readb(addr);
@@ -283,7 +283,7 @@ static inline void mmio_insb(void __iome
 		dst++;
 	}
 }
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
 {
 	while (--count >= 0) {
 		u16 data = __raw_readw(addr);
@@ -291,7 +291,7 @@ static inline void mmio_insw(void __iome
 		dst++;
 	}
 }
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
 {
 	while (--count >= 0) {
 		u32 data = __raw_readl(addr);
@@ -325,15 +325,15 @@ static inline void mmio_outsl(void __iom
 }
 #endif
 
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insb(port,dst,count), mmio_insb(addr, dst, count));
 }
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insw(port,dst,count), mmio_insw(addr, dst, count));
 }
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insl(port,dst,count), mmio_insl(addr, dst, count));
 }
_

Patches currently in -mm which might be from krzk@kernel.org are

iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (108 preceding siblings ...)
  2020-07-10  0:33 ` + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10  0:33 ` Andrew Morton
  2020-07-10  0:33 ` + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
                   ` (122 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:33 UTC (permalink / raw)
  To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
	geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
	krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
	ysato


The patch titled
     Subject: rtl818x: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree.  Its filename is
     rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: rtl818x: constify ioreadX() iomem argument (as in generic implementation)

The ioreadX() helpers have inconsistent interface.  On some architectures
void *__iomem address argument is a pointer to const, on some not.

Implementations of ioreadX() do not modify the memory under the address so
they can be converted to a "const" version for const-safety and
consistency among architectures.
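
As a rough illustration of the const-safety argument (a sketch added here,
not part of the patch; mmio_dump() is a made-up helper), a caller that only
holds a pointer to const I/O memory can use the constified ioread32(), but
would not compile against the old non-const prototype:

	static void mmio_dump(const void __iomem *base, int n)
	{
		int i;

		/* ioread32() only reads, so a const address suffices */
		for (i = 0; i < n; i++)
			pr_info("reg %d: %#x\n", i, ioread32(base + i * 4));
	}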

Link: http://lkml.kernel.org/r/20200709072837.5869-3-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h~rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
@@ -150,17 +150,17 @@ void rtl8180_write_phy(struct ieee80211_
 void rtl8180_set_anaparam(struct rtl8180_priv *priv, u32 anaparam);
 void rtl8180_set_anaparam2(struct rtl8180_priv *priv, u32 anaparam2);
 
-static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, u8 __iomem *addr)
+static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, const u8 __iomem *addr)
 {
 	return ioread8(addr);
 }
 
-static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, __le16 __iomem *addr)
+static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, const __le16 __iomem *addr)
 {
 	return ioread16(addr);
 }
 
-static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, __le32 __iomem *addr)
+static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, const __le32 __iomem *addr)
 {
 	return ioread32(addr);
 }
_

Patches currently in -mm which might be from krzk@kernel.org are

iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (109 preceding siblings ...)
  2020-07-10  0:33 ` + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10  0:33 ` Andrew Morton
  2020-07-10  0:33 ` + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
                   ` (121 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:33 UTC (permalink / raw)
  To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
	geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
	krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
	ysato


The patch titled
     Subject: ntb: intel: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree.  Its filename is
     ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: ntb: intel: constify ioreadX() iomem argument (as in generic implementation)

The ioreadX() helpers have inconsistent interface.  On some architectures
void *__iomem address argument is a pointer to const, on some not.

Implementations of ioreadX() do not modify the memory under the address so
they can be converted to a "const" version for const-safety and
consistency among architectures.

Link: http://lkml.kernel.org/r/20200709072837.5869-4-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/ntb/hw/intel/ntb_hw_gen1.c  |    2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.h  |    2 +-
 drivers/ntb/hw/intel/ntb_hw_intel.h |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -1205,7 +1205,7 @@ int intel_ntb_peer_spad_write(struct ntb
 			       ndev->peer_reg->spad);
 }
 
-static u64 xeon_db_ioread(void __iomem *mmio)
+static u64 xeon_db_ioread(const void __iomem *mmio)
 {
 	return (u64)ioread16(mmio);
 }
--- a/drivers/ntb/hw/intel/ntb_hw_gen3.h~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_gen3.h
@@ -91,7 +91,7 @@
 #define GEN3_DB_TOTAL_SHIFT		33
 #define GEN3_SPAD_COUNT			16
 
-static inline u64 gen3_db_ioread(void __iomem *mmio)
+static inline u64 gen3_db_ioread(const void __iomem *mmio)
 {
 	return ioread64(mmio);
 }
--- a/drivers/ntb/hw/intel/ntb_hw_intel.h~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_intel.h
@@ -103,7 +103,7 @@ struct intel_ntb_dev;
 struct intel_ntb_reg {
 	int (*poll_link)(struct intel_ntb_dev *ndev);
 	int (*link_is_up)(struct intel_ntb_dev *ndev);
-	u64 (*db_ioread)(void __iomem *mmio);
+	u64 (*db_ioread)(const void __iomem *mmio);
 	void (*db_iowrite)(u64 db_bits, void __iomem *mmio);
 	unsigned long			ntb_ctl;
 	resource_size_t			db_size;
_

Patches currently in -mm which might be from krzk@kernel.org are

iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (110 preceding siblings ...)
  2020-07-10  0:33 ` + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10  0:33 ` Andrew Morton
  2020-07-10  0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
                   ` (120 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:33 UTC (permalink / raw)
  To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
	geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
	krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
	ysato


The patch titled
     Subject: virtio: pci: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree.  Its filename is
     virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: virtio: pci: constify ioreadX() iomem argument (as in generic implementation)

The ioreadX() helpers have inconsistent interface.  On some architectures
void *__iomem address argument is a pointer to const, on some not.

Implementations of ioreadX() do not modify the memory under the address so
they can be converted to a "const" version for const-safety and
consistency among architectures.

Link: http://lkml.kernel.org/r/20200709072837.5869-5-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/virtio/virtio_pci_modern.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/virtio/virtio_pci_modern.c~virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/virtio/virtio_pci_modern.c
@@ -27,16 +27,16 @@
  * method, i.e. 32-bit accesses for 32-bit fields, 16-bit accesses
  * for 16-bit fields and 8-bit accesses for 8-bit fields.
  */
-static inline u8 vp_ioread8(u8 __iomem *addr)
+static inline u8 vp_ioread8(const u8 __iomem *addr)
 {
 	return ioread8(addr);
 }
-static inline u16 vp_ioread16 (__le16 __iomem *addr)
+static inline u16 vp_ioread16 (const __le16 __iomem *addr)
 {
 	return ioread16(addr);
 }
 
-static inline u32 vp_ioread32(__le32 __iomem *addr)
+static inline u32 vp_ioread32(const __le32 __iomem *addr)
 {
 	return ioread32(addr);
 }
_

Patches currently in -mm which might be from krzk@kernel.org are

iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + doc-mm-sync-up-oom_score_adj-documentation.patch added to -mm tree
  2020-07-10  0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
@ 2020-07-10  0:36   ` Andrew Morton
  0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:36 UTC (permalink / raw)
  To: corbet, laoar.shao, mhocko, mm-commits, rientjes


The patch titled
     Subject: doc, mm: sync up oom_score_adj documentation
has been added to the -mm tree.  Its filename is
     doc-mm-sync-up-oom_score_adj-documentation.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: doc, mm: sync up oom_score_adj documentation

There are at least two notes in the oom section.  The 3% discount for root
processes is gone since d46078b28889 ("mm, oom: remove 3% bonus for
CAP_SYS_ADMIN processes").

Likewise children of the selected oom victim are not sacrificed since
bbbe48029720 ("mm, oom: remove 'prefer children over parent' heuristic")

Drop both of them.

Link: http://lkml.kernel.org/r/20200709062603.18480-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/filesystems/proc.rst |    8 --------
 1 file changed, 8 deletions(-)

--- a/Documentation/filesystems/proc.rst~doc-mm-sync-up-oom_score_adj-documentation
+++ a/Documentation/filesystems/proc.rst
@@ -1634,9 +1634,6 @@ may allocate from based on an estimation
 For example, if a task is using all allowed memory, its badness score will be
 1000.  If it is using half of its allowed memory, its score will be 500.
 
-There is an additional factor included in the badness score: the current memory
-and swap usage is discounted by 3% for root processes.
-
 The amount of "allowed" memory depends on the context in which the oom killer
 was called.  If it is due to the memory assigned to the allocating task's cpuset
 being exhausted, the allowed memory represents the set of mems assigned to that
@@ -1672,11 +1669,6 @@ The value of /proc/<pid>/oom_score_adj m
 value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
 requires CAP_SYS_RESOURCE.
 
-Caveat: when a parent task is selected, the oom killer will sacrifice any first
-generation children with separate address spaces instead, if possible.  This
-avoids servers and important system daemons from being killed and loses the
-minimal amount of work.
-
 
 3.2 /proc/<pid>/oom_score - Display current oom-killer score
 -------------------------------------------------------------
_

Patches currently in -mm which might be from mhocko@suse.com are

doc-mm-sync-up-oom_score_adj-documentation.patch
doc-mm-clarify-proc-pid-oom_score-value-range.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + doc-mm-clarify-proc-pid-oom_score-value-range.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (112 preceding siblings ...)
  2020-07-10  0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
@ 2020-07-10  0:36 ` Andrew Morton
  2020-07-10  0:38 ` [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from " Andrew Morton
                   ` (118 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:36 UTC (permalink / raw)
  To: corbet, laoar.shao, mhocko, mm-commits, rientjes


The patch titled
     Subject: doc, mm: clarify /proc/<pid>/oom_score value range
has been added to the -mm tree.  Its filename is
     doc-mm-clarify-proc-pid-oom_score-value-range.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/doc-mm-clarify-proc-pid-oom_score-value-range.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/doc-mm-clarify-proc-pid-oom_score-value-range.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: doc, mm: clarify /proc/<pid>/oom_score value range

The exported value includes oom_score_adj so the range is not [0, 1000] as
described in the previous section but rather [0, 2000].  Mention that fact
explicitly.
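
For example (illustrative numbers, not taken from the patch): a task with a
badness score of 500 and oom_score_adj of +800 exports an oom_score of 1300,
while an oom_score_adj of -1000 clamps the exported value to 0, hence the
effective range [0, 2000].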

Link: http://lkml.kernel.org/r/20200709062603.18480-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/filesystems/proc.rst |    3 +++
 1 file changed, 3 insertions(+)

--- a/Documentation/filesystems/proc.rst~doc-mm-clarify-proc-pid-oom_score-value-range
+++ a/Documentation/filesystems/proc.rst
@@ -1673,6 +1673,9 @@ requires CAP_SYS_RESOURCE.
 3.2 /proc/<pid>/oom_score - Display current oom-killer score
 -------------------------------------------------------------
 
+Please note that the exported value includes oom_score_adj so it is effectively
+in range [0,2000].
+
 This file can be used to check the current score used by the oom-killer is for
 any given <pid>. Use it together with /proc/<pid>/oom_score_adj to tune which
 process should be killed in an out-of-memory situation.
_

Patches currently in -mm which might be from mhocko@suse.com are

doc-mm-sync-up-oom_score_adj-documentation.patch
doc-mm-clarify-proc-pid-oom_score-value-range.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (113 preceding siblings ...)
  2020-07-10  0:36 ` + doc-mm-clarify-proc-pid-oom_score-value-range.patch " Andrew Morton
@ 2020-07-10  0:38 ` Andrew Morton
  2020-07-10  0:38 ` [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch " Andrew Morton
                   ` (117 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:38 UTC (permalink / raw)
  To: andi.kleen, dave.hansen, feng.tang, hannes, keescook, mgorman,
	mhocko, mm-commits, tim.c.chen, willy, ying.huang


The patch titled
     Subject: proc/meminfo: avoid open coded reading of vm_committed_as
has been removed from the -mm tree.  Its filename was
     proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: proc/meminfo: avoid open coded reading of vm_committed_as

Patch series "make vm_committed_as_batch aware of vm overcommit policy", v5.

When checking a performance change for the will-it-scale scalability mmap
test [1], we found very high lock contention on the spinlock of the percpu
counter 'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]         [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

Actually this heavy lock contention is not always necessary.  The
'vm_committed_as' needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch number
for the percpu counter.

So keep the 'batch' number unchanged for the strict OVERCOMMIT_NEVER policy,
and enlarge it for the not-so-strict OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS
policies.
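
To make the batch/accuracy trade-off concrete, here is a toy userspace model
(all names are illustrative; the kernel side is percpu_counter_add_batch()
with a real per-cpu delta): each CPU accumulates locally and only takes the
shared lock once its local delta reaches 'batch', so a larger batch means
fewer lock acquisitions at the cost of a worst-case read error of
batch * nr_cpus:

	#include <stdio.h>
	#include <stdlib.h>

	static long shared;		/* lock-protected in the real code */
	static long local_delta;	/* one instance per CPU in reality */

	static void counter_add_batch(long amount, long batch)
	{
		local_delta += amount;
		if (labs(local_delta) >= batch) {
			shared += local_delta;	/* the contended step */
			local_delta = 0;
		}
	}

	int main(void)
	{
		int i;

		for (i = 0; i < 1000; i++)
			counter_add_batch(1, 32);
		printf("shared=%ld, unflushed=%ld\n", shared, local_delta);
		return 0;
	}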

Benchmark with the same testcase in [1] shows 53% improvement on an 8C/16T
desktop, and 2097% (20X) on a 4S/72C/144T server.  And for that case,
whether it shows improvements depends on if the test mmap size is bigger
than the batch number computed.

We tested 10+ platforms in 0day (server, desktop and laptop).  If we lift
it to 64X, 80%+ platforms show improvements, and for 16X lift, 1/3 of the
platforms will show improvements.

And generally it should help the mmap/unmap usage, as Michal Hocko
mentioned:

: I believe that there are non-synthetic workloads which would benefit
: from a larger batch. E.g. large in memory databases which do large
: mmaps during startups from multiple threads.

Note: There are some style complaints from checkpatch for patch 3, as the
sysctl handler declaration follows the same format as its sibling functions.

[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

patch1: a cleanup for /proc/meminfo
patch2: a preparation patch which also improve the accuracy of
        vm_memory_committed
patch3: main change


This patch (of 3):

Use the existing vm_memory_committed() instead, which is also convenient
for future changes.

Link: http://lkml.kernel.org/r/1592725000-73486-1-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-2-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/meminfo.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/proc/meminfo.c~proc-meminfo-avoid-open-coded-reading-of-vm_committed_as
+++ a/fs/proc/meminfo.c
@@ -41,7 +41,7 @@ static int meminfo_proc_show(struct seq_
 
 	si_meminfo(&i);
 	si_swapinfo(&i);
-	committed = percpu_counter_read_positive(&vm_committed_as);
+	committed = vm_memory_committed();
 
 	cached = global_node_page_state(NR_FILE_PAGES) -
 			total_swapcache_pages() - i.bufferram;
_

Patches currently in -mm which might be from feng.tang@intel.com are

mm-utilc-make-vm_memory_committed-more-accurate.patch
mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (114 preceding siblings ...)
  2020-07-10  0:38 ` [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from " Andrew Morton
@ 2020-07-10  0:38 ` Andrew Morton
  2020-07-10  0:38 ` [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch " Andrew Morton
                   ` (116 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:38 UTC (permalink / raw)
  To: andi.kleen, dave.hansen, feng.tang, haiyangz, hannes, kys,
	mgorman, mhocko, mm-commits, tim.c.chen, willy, ying.huang


The patch titled
     Subject: mm/util.c: make vm_memory_committed() more accurate
has been removed from the -mm tree.  Its filename was
     mm-utilc-make-vm_memory_committed-more-accurate.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: mm/util.c: make vm_memory_committed() more accurate

percpu_counter_sum_positive() will provide more accurate info.

As with percpu_counter_read_positive(), in the worst case the deviation could
be 'batch * nr_cpus', which is totalram_pages/256 for now, and will be
more when the batch gets enlarged.
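
As a worked example (machine size assumed purely for illustration): on a
64GB box with 4KB pages totalram_pages is 16M, so totalram_pages/256 is 64K
pages, i.e. the cheap read could be off by up to 256MB of commitment, which
the sum-based read avoids by adding up every CPU's unflushed delta.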

Its time cost is about 800 nanoseconds on a 2C/4T platform and 2~3
microseconds on a 2S/36C/72T Skylake server in the normal case, and in the
worst case, where vm_committed_as's spinlock is under severe contention, it
costs 30~40 microseconds on the 2S/36C/72T Skylake server, which should be
fine for its only two users: /proc/meminfo and the HyperV balloon driver's
once-per-second status trace.

Link: http://lkml.kernel.org/r/1592725000-73486-3-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com> # for /proc/meminfo
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/util.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/util.c~mm-utilc-make-vm_memory_committed-more-accurate
+++ a/mm/util.c
@@ -787,10 +787,15 @@ struct percpu_counter vm_committed_as __
  * balancing memory across competing virtual machines that are hosted.
  * Several metrics drive this policy engine including the guest reported
  * memory commitment.
+ *
+ * The time cost of this is very low for small platforms, and for big
+ * platform like a 2S/36C/72T Skylake server, in worst case where
+ * vm_committed_as's spinlock is under severe contention, the time cost
+ * could be about 30~40 microseconds.
  */
 unsigned long vm_memory_committed(void)
 {
-	return percpu_counter_read_positive(&vm_committed_as);
+	return percpu_counter_sum_positive(&vm_committed_as);
 }
 EXPORT_SYMBOL_GPL(vm_memory_committed);
 
_

Patches currently in -mm which might be from feng.tang@intel.com are

mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (115 preceding siblings ...)
  2020-07-10  0:38 ` [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch " Andrew Morton
@ 2020-07-10  0:38 ` Andrew Morton
  2020-07-10  4:00 ` mmotm 2020-07-09-21-00 uploaded Andrew Morton
                   ` (115 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  0:38 UTC (permalink / raw)
  To: andi.kleen, dave.hansen, feng.tang, hannes, keescook, mgorman,
	mhocko, mm-commits, tim.c.chen, willy, ying.huang


The patch titled
     Subject: mm: adjust vm_committed_as_batch according to vm overcommit policy
has been removed from the -mm tree.  Its filename was
     mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: mm: adjust vm_committed_as_batch according to vm overcommit policy

When checking a performance change for the will-it-scale scalability mmap
test [1], we found very high lock contention on the spinlock of the percpu
counter 'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]         [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

Actually this heavy lock contention is not always necessary.  The
'vm_committed_as' needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch number
for the percpu counter.

So keep the 'batch' number unchanged for the strict OVERCOMMIT_NEVER policy,
and lift it to 64X for the OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS policies.
Also add a sysctl handler to adjust it when the policy is reconfigured.
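
So, for example, switching the policy at runtime with
"echo 2 > /proc/sys/vm/overcommit_memory" (OVERCOMMIT_NEVER) now also
shrinks vm_committed_as_batch back to the precise value on that same write.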

Benchmark with the same testcase in [1] shows 53% improvement on a 8C/16T
desktop, and 2097%(20X) on a 4S/72C/144T server.  We tested with test
platforms in 0day (server, desktop and laptop), and 80%+ platforms shows
improvements with that test.  And whether it shows improvements depends on
if the test mmap size is bigger than the batch number computed.

And if the lift is 16X, 1/3 of the platforms will show improvements,
though it should help the mmap/unmap usage generally, as Michal Hocko
mentioned:

: I believe that there are non-synthetic workloads which would benefit from
: a larger batch.  E.g.  large in memory databases which do large mmaps
: during startups from multiple threads.

[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

Link: http://lkml.kernel.org/r/1589611660-89854-4-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-4-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h   |    2 ++
 include/linux/mman.h |    4 ++++
 kernel/sysctl.c      |    2 +-
 mm/mm_init.c         |   16 +++++++++++++---
 mm/util.c            |   12 ++++++++++++
 5 files changed, 32 insertions(+), 4 deletions(-)

--- a/include/linux/mman.h~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/include/linux/mman.h
@@ -57,8 +57,12 @@ extern struct percpu_counter vm_committe
 
 #ifdef CONFIG_SMP
 extern s32 vm_committed_as_batch;
+extern void mm_compute_batch(void);
 #else
 #define vm_committed_as_batch 0
+static inline void mm_compute_batch(void)
+{
+}
 #endif
 
 unsigned long vm_memory_committed(void);
--- a/include/linux/mm.h~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/include/linux/mm.h
@@ -206,6 +206,8 @@ int overcommit_ratio_handler(struct ctl_
 		loff_t *);
 int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
+int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
+		loff_t *);
 
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
--- a/kernel/sysctl.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/kernel/sysctl.c
@@ -2650,7 +2650,7 @@ static struct ctl_table vm_table[] = {
 		.data		= &sysctl_overcommit_memory,
 		.maxlen		= sizeof(sysctl_overcommit_memory),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= overcommit_policy_handler,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
--- a/mm/mm_init.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/mm/mm_init.c
@@ -13,6 +13,7 @@
 #include <linux/memory.h>
 #include <linux/notifier.h>
 #include <linux/sched.h>
+#include <linux/mman.h>
 #include "internal.h"
 
 #ifdef CONFIG_DEBUG_MEMORY_INIT
@@ -144,14 +145,23 @@ EXPORT_SYMBOL_GPL(mm_kobj);
 #ifdef CONFIG_SMP
 s32 vm_committed_as_batch = 32;
 
-static void __meminit mm_compute_batch(void)
+void mm_compute_batch(void)
 {
 	u64 memsized_batch;
 	s32 nr = num_present_cpus();
 	s32 batch = max_t(s32, nr*2, 32);
+	unsigned long ram_pages = totalram_pages();
 
-	/* batch size set to 0.4% of (total memory/#cpus), or max int32 */
-	memsized_batch = min_t(u64, (totalram_pages()/nr)/256, 0x7fffffff);
+	/*
+	 * For policy of OVERCOMMIT_NEVER, set batch size to 0.4%
+	 * of (total memory/#cpus), and lift it to 25% for other
+	 * policies to ease the possible lock contention for percpu_counter
+	 * vm_committed_as, while the max limit is INT_MAX
+	 */
+	if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
+		memsized_batch = min_t(u64, ram_pages/nr/256, INT_MAX);
+	else
+		memsized_batch = min_t(u64, ram_pages/nr/4, INT_MAX);
 
 	vm_committed_as_batch = max_t(s32, memsized_batch, batch);
 }
--- a/mm/util.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/mm/util.c
@@ -746,6 +746,18 @@ int overcommit_ratio_handler(struct ctl_
 	return ret;
 }
 
+int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
+		size_t *lenp, loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret == 0 && write)
+		mm_compute_batch();
+
+	return ret;
+}
+
 int overcommit_kbytes_handler(struct ctl_table *table, int write, void *buffer,
 		size_t *lenp, loff_t *ppos)
 {
_

Patches currently in -mm which might be from feng.tang@intel.com are

^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-09-21-00 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (116 preceding siblings ...)
  2020-07-10  0:38 ` [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch " Andrew Morton
@ 2020-07-10  4:00 ` Andrew Morton
  2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree Andrew Morton
                   ` (114 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10  4:00 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-09-21-00 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* kbuild-move-wtype-limits-to-w=2.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-dump-compound-page-information-on-a-second-line.patch
* mm-print-head-flags-in-dump_page.patch
* mm-switch-dump_page-to-get_kernel_nofault.patch
* mm-print-the-inode-number-in-dump_page.patch
* mm-print-hashed-address-of-struct-page.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
* doc-mm-sync-up-oom_score_adj-documentation.patch
* doc-mm-clarify-proc-pid-oom_score-value-range.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes.patch
* mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (117 preceding siblings ...)
  2020-07-10  4:00 ` mmotm 2020-07-09-21-00 uploaded Andrew Morton
@ 2020-07-10 23:27 ` Andrew Morton
  2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
                   ` (113 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:27 UTC (permalink / raw)
  To: guro, jonathan.cameron, mike.kravetz, mm-commits, rppt,
	song.bao.hua, stable


The patch titled
     Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been removed from the -mm tree.  Its filename was
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled

hugetlb_cma[0] can be NULL for various reasons; for example, node 0 may
have no memory, so a NULL hugetlb_cma[0] doesn't necessarily mean CMA is
not enabled.  Gigantic pages might have been reserved on other nodes.

Mike Kravetz said:

: Based on the code changes, I believe the following could happen:
: - Someone uses 'hugetlb_cma=' kernel command line parameter to reserve
:   CMA for gigantic pages.
: - The system topology is such that no memory is on node 0.  Therefore,
:   no CMA can be reserved for gigantic pages on node 0.  CMA is reserved
:   on other nodes.
: - The user also specifies a number of gigantic pages to pre-allocate on
:   the command line with hugepagesz=<gigantic_page_size> hugepages=<N>
: - The routine which allocates gigantic pages from the bootmem allocator
:   will not detect CMA has been reserved as there is no memory on node 0.
:   Therefore, pages will be pre-allocated from bootmem allocator as well
:   as reserved in CMA.
: 
: This double allocation (bootmem and CMA) is the worst case scenario.  Not
: sure if this is what Barry saw, and I suspect this would rarely happen.
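
For example (hypothetical sizes), the worst case above could be hit on a
machine whose node 0 has no memory with a command line like:

    hugetlb_cma=2G hugepagesz=1G hugepages=2

Before this patch, the gigantic pages would then both be reserved in CMA
on the other nodes and pre-allocated from the bootmem allocator.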

Link: http://lkml.kernel.org/r/20200707040204.30132-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable
+++ a/mm/hugetlb.c
@@ -2546,6 +2546,20 @@ static void __init gather_bootmem_preall
 	}
 }
 
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+	int node;
+
+	for_each_online_node(node) {
+		if (hugetlb_cma[node])
+			return true;
+	}
+#endif
+
+	return false;
+}
+
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
@@ -2571,7 +2585,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_enabled()) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (118 preceding siblings ...)
  2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree Andrew Morton
@ 2020-07-10 23:29 ` Andrew Morton
  2020-07-10 23:32 ` + proc-sysctl-make-protected_-world-readable.patch " Andrew Morton
                   ` (112 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:29 UTC (permalink / raw)
  To: guro, jonathan.cameron, mike.kravetz, mm-commits, song.bao.hua, stable


The patch titled
     Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been added to the -mm tree.  Its filename is
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled

hugetlb_cma[0] can be NULL for various reasons; for example, node 0 may
have no memory, so a NULL hugetlb_cma[0] doesn't necessarily mean CMA is
not enabled.  Gigantic pages might have been reserved on other nodes.
This patch fixes the possible double reservation and CMA leak.

Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled
+++ a/mm/hugetlb.c
@@ -46,6 +46,7 @@ unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 
 static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size __initdata;
 
 /*
  * Minimum page order among possible hugepage sizes, set to a proper value
@@ -2571,7 +2572,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_size) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}
@@ -5654,7 +5655,6 @@ void move_hugetlb_state(struct page *old
 }
 
 #ifdef CONFIG_CMA
-static unsigned long hugetlb_cma_size __initdata;
 static bool cma_reserve_called __initdata;
 
 static int __init cmdline_parse_hugetlb_cma(char *p)
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + proc-sysctl-make-protected_-world-readable.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (119 preceding siblings ...)
  2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
@ 2020-07-10 23:32 ` Andrew Morton
  2020-07-10 23:32 ` [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from " Andrew Morton
                   ` (111 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:32 UTC (permalink / raw)
  To: jpitti, keescook, mcgrof, mingo, mm-commits, viro, yzaikin


The patch titled
     Subject: proc/sysctl: make protected_* world readable
has been added to the -mm tree.  Its filename is
     proc-sysctl-make-protected_-world-readable.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/proc-sysctl-make-protected_-world-readable.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/proc-sysctl-make-protected_-world-readable.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Julius Hemanth Pitti <jpitti@cisco.com>
Subject: proc/sysctl: make protected_* world readable

protected_* files have 600 permissions, which prevents non-superusers
from reading them.

Containers like "AWS greengrass" refuse to launch unless
protected_hardlinks and protected_symlinks are set.  When containers
like these run with "userns-remap" or "--user", mapping the container's
root to a non-superuser on the host, they fail to run because read
access to these files is denied.

As these protections are hardly a secret and do not pose any security
risk, make them world readable.

Though the above greengrass use case needs read access only to the
protected_hardlinks and protected_symlinks files, set all protected_*
files to 644 to keep consistency.
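
With this change, an unprivileged user (for example inside such a
container) can verify the settings; illustrative session:

    $ sysctl fs.protected_symlinks fs.protected_hardlinks
    fs.protected_symlinks = 1
    fs.protected_hardlinks = 1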

Link: http://lkml.kernel.org/r/20200709235115.56954-1-jpitti@cisco.com
Fixes: 800179c9b8a1 ("fs: add link restrictions")
Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/sysctl.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/kernel/sysctl.c~proc-sysctl-make-protected_-world-readable
+++ a/kernel/sysctl.c
@@ -3232,7 +3232,7 @@ static struct ctl_table fs_table[] = {
 		.procname	= "protected_symlinks",
 		.data		= &sysctl_protected_symlinks,
 		.maxlen		= sizeof(int),
-		.mode		= 0600,
+		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE,
@@ -3241,7 +3241,7 @@ static struct ctl_table fs_table[] = {
 		.procname	= "protected_hardlinks",
 		.data		= &sysctl_protected_hardlinks,
 		.maxlen		= sizeof(int),
-		.mode		= 0600,
+		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE,
@@ -3250,7 +3250,7 @@ static struct ctl_table fs_table[] = {
 		.procname	= "protected_fifos",
 		.data		= &sysctl_protected_fifos,
 		.maxlen		= sizeof(int),
-		.mode		= 0600,
+		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
@@ -3259,7 +3259,7 @@ static struct ctl_table fs_table[] = {
 		.procname	= "protected_regular",
 		.data		= &sysctl_protected_regular,
 		.maxlen		= sizeof(int),
-		.mode		= 0600,
+		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
_

Patches currently in -mm which might be from jpitti@cisco.com are

proc-sysctl-make-protected_-world-readable.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (120 preceding siblings ...)
  2020-07-10 23:32 ` + proc-sysctl-make-protected_-world-readable.patch " Andrew Morton
@ 2020-07-10 23:32 ` Andrew Morton
  2020-07-10 23:35 ` + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to " Andrew Morton
                   ` (110 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:32 UTC (permalink / raw)
  To: hdanton, mhocko, mm-commits, oleksiy.avramchenko, rostedt, urezki, willy


The patch titled
     Subject: mm/vmalloc.c: add an error message if two areas overlap
has been removed from the -mm tree.  Its filename was
     mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: mm/vmalloc.c: add an error message if two areas overlap

Before triggering a BUG() it would be useful to understand how the two
areas overlap with each other.  Print the start/end addresses of both
VAs along with their pointers.

For example, if both ranges are identical it could indicate a double
free.

Link: http://lkml.kernel.org/r/20200710194443.2984-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/vmalloc.c~mm-vmallocc-add-an-error-message-if-two-areas-overlap
+++ a/mm/vmalloc.c
@@ -550,8 +550,13 @@ find_va_links(struct vmap_area *va,
 		else if (va->va_end > tmp_va->va_start &&
 				va->va_start >= tmp_va->va_end)
 			link = &(*link)->rb_right;
-		else
+		else {
+			pr_err("Overlaps: 0x%px(0x%lx-0x%lx), 0x%px(0x%lx-0x%lx)\n",
+				va, va->va_start, va->va_end, tmp_va,
+				tmp_va->va_start, tmp_va->va_end);
+
 			BUG();
+		}
 	} while (*link);
 
 	*parent = &tmp_va->rb_node;
_

Patches currently in -mm which might be from urezki@gmail.com are

mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
mm-vmalloc-switch-to-propagate-callback.patch
mm-vmalloc-update-the-header-about-kva-rework.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (121 preceding siblings ...)
  2020-07-10 23:32 ` [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from " Andrew Morton
@ 2020-07-10 23:35 ` Andrew Morton
  2020-07-14  0:19 ` + mm-vmscan-consistent-update-to-pgrefill.patch " Andrew Morton
                   ` (109 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:35 UTC (permalink / raw)
  To: alex.bou9, gustavoars, keescook, mm-commits, mporter


The patch titled
     Subject: rapidio/rio_mport_cdev: Use array_size() helper in copy_{from,to}_user()
has been added to the -mm tree.  Its filename is
     rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Subject: rapidio/rio_mport_cdev: Use array_size() helper in copy_{from,to}_user()

Use the array_size() helper instead of the open-coded multiplication in
copy_{from,to}_user().  These sorts of multiplication factors need to be
wrapped in array_size().

This issue was found with the help of Coccinelle, then audited and fixed
manually.
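
As background: array_size() (include/linux/overflow.h) saturates at
SIZE_MAX when the multiplication would overflow, so the copy fails
cleanly instead of operating on a wrapped-around, too-small size.  A
minimal sketch of the pattern (ubuf is a hypothetical user pointer, not
taken from this driver):

	size_t bytes = array_size(sizeof(*transfer), transaction.count);

	/* On overflow bytes == SIZE_MAX, so copy_from_user() fails
	 * rather than copying a truncated length. */
	if (copy_from_user(transfer, ubuf, bytes))
		return -EFAULT;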

Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Link: http://lkml.kernel.org/r/20200616183050.GA31840@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/rapidio/devices/rio_mport_cdev.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -981,7 +981,7 @@ static int rio_mport_transfer_ioctl(stru
 
 	if (unlikely(copy_from_user(transfer,
 				    (void __user *)(uintptr_t)transaction.block,
-				    transaction.count * sizeof(*transfer)))) {
+				    array_size(sizeof(*transfer), transaction.count)))) {
 		ret = -EFAULT;
 		goto out_free;
 	}
@@ -994,7 +994,7 @@ static int rio_mport_transfer_ioctl(stru
 
 	if (unlikely(copy_to_user((void __user *)(uintptr_t)transaction.block,
 				  transfer,
-				  transaction.count * sizeof(*transfer))))
+				  array_size(sizeof(*transfer), transaction.count))))
 		ret = -EFAULT;
 
 out_free:
_

Patches currently in -mm which might be from gustavoars@kernel.org are

rapidio-rio_mport_cdev-use-struct_size-helper.patch
rapidio-use-struct_size-helper.patch
rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmscan-consistent-update-to-pgrefill.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (122 preceding siblings ...)
  2020-07-10 23:35 ` + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to " Andrew Morton
@ 2020-07-14  0:19 ` Andrew Morton
  2020-07-14  0:24 ` + mm-handle-page-mapping-better-in-dump_page-fix.patch " Andrew Morton
                   ` (108 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  0:19 UTC (permalink / raw)
  To: chris, guro, hannes, laoar.shao, mhocko, mm-commits, shakeelb


The patch titled
     Subject: mm: vmscan: consistent update to pgrefill
has been added to the -mm tree.  Its filename is
     mm-vmscan-consistent-update-to-pgrefill.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-consistent-update-to-pgrefill.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-consistent-update-to-pgrefill.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Shakeel Butt <shakeelb@google.com>
Subject: mm: vmscan: consistent update to pgrefill

The vmstat pgrefill counter is useful together with the pgscan and
pgsteal stats for measuring reclaim efficiency.  However, vmstat's
pgrefill is not updated consistently at the system level: it gets
updated for both global and memcg reclaim, whereas pgscan and pgsteal
are updated only for global reclaim.  So, update pgrefill only for
global reclaim as well.  Anyone interested in stats covering both
system-level and memcg-level reclaim should consult the root memcg's
memory.stat instead of /proc/vmstat.
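
For example (illustrative values; assumes cgroup2 mounted at
/sys/fs/cgroup), after this change:

    # grep pgrefill /proc/vmstat
    pgrefill 123456                    <- global reclaim only
    # grep pgrefill /sys/fs/cgroup/memory.stat
    pgrefill 234567                    <- global plus memcg reclaim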

Link: http://lkml.kernel.org/r/20200711011459.1159929-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/vmscan.c~mm-vmscan-consistent-update-to-pgrefill
+++ a/mm/vmscan.c
@@ -2030,7 +2030,8 @@ static void shrink_active_list(unsigned
 
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
 
-	__count_vm_events(PGREFILL, nr_scanned);
+	if (!cgroup_reclaim(sc))
+		__count_vm_events(PGREFILL, nr_scanned);
 	__count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned);
 
 	spin_unlock_irq(&pgdat->lru_lock);
_

Patches currently in -mm which might be from shakeelb@google.com are

mm-memcontrol-account-kernel-stack-per-node.patch
mm-vmscan-consistent-update-to-pgrefill.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-handle-page-mapping-better-in-dump_page-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (123 preceding siblings ...)
  2020-07-14  0:19 ` + mm-vmscan-consistent-update-to-pgrefill.patch " Andrew Morton
@ 2020-07-14  0:24 ` Andrew Morton
  2020-07-14  0:31 ` + tmpfs-per-superblock-i_ino-support.patch " Andrew Morton
                   ` (107 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  0:24 UTC (permalink / raw)
  To: akpm, jhubbard, kirill, mm-commits, rppt, vbabka,
	william.kucharski, willy


The patch titled
     Subject: mm-handle-page-mapping-better-in-dump_page-fix
has been added to the -mm tree.  Its filename is
     mm-handle-page-mapping-better-in-dump_page-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-handle-page-mapping-better-in-dump_page-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-handle-page-mapping-better-in-dump_page-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-handle-page-mapping-better-in-dump_page-fix

augmented code comment from John

Link: http://lkml.kernel.org/r/15cff11a-6762-8a6a-3f0e-dd227280cd6f@nvidia.com
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/mm/debug.c~mm-handle-page-mapping-better-in-dump_page-fix
+++ a/mm/debug.c
@@ -69,7 +69,13 @@ void __dump_page(struct page *page, cons
 	}
 
 	if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
-		/* Corrupt page, cannot call page_mapping */
+		/*
+		 * Corrupt page, so we cannot call page_mapping. Instead, do a
+		 * safe subset of the steps that page_mapping() does. Caution:
+		 * this will be misleading for tail pages, PageSwapCache pages,
+		 * and potentially other situations. (See the page_mapping()
+		 * implementation for what's missing here.)
+		 */
 		unsigned long tmp = (unsigned long)page->mapping;
 
 		if (tmp & PAGE_MAPPING_ANON)
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + tmpfs-per-superblock-i_ino-support.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (124 preceding siblings ...)
  2020-07-14  0:24 ` + mm-handle-page-mapping-better-in-dump_page-fix.patch " Andrew Morton
@ 2020-07-14  0:31 ` Andrew Morton
  2020-07-14  0:31 ` + tmpfs-support-64-bit-inums-per-sb.patch " Andrew Morton
                   ` (106 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  0:31 UTC (permalink / raw)
  To: amir73il, chris, hannes, hughd, jlayton, mm-commits, tj, viro, willy


The patch titled
     Subject: tmpfs: per-superblock i_ino support
has been added to the -mm tree.  Its filename is
     tmpfs-per-superblock-i_ino-support.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/tmpfs-per-superblock-i_ino-support.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-per-superblock-i_ino-support.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: tmpfs: per-superblock i_ino support

Patch series "tmpfs: inode: Reduce risk of inum overflow", v7.

In Facebook production we are seeing heavy i_ino wraparounds on tmpfs.  On
affected tiers, in excess of 10% of hosts show multiple files with
different content and the same inode number, with some servers even having
as many as 150 duplicated inode numbers with differing file content.

This causes actual, tangible problems in production.  For example, we have
complaints from those working on remote caches that their application is
reporting cache corruptions because it uses (device, inodenum) to
establish the identity of a particular cache object, but because it's not
unique any more, the application refuses to continue and reports cache
corruption.  Even worse, sometimes applications may not even detect the
corruption but may continue anyway, causing phantom and hard to debug
behaviour.

In general, userspace applications expect that (device, inodenum) should
be enough to uniquely point to one inode, which seems fair enough.  One
might also need to check the generation, but in this case:

1. That's not currently exposed to userspace
   (ioctl(...FS_IOC_GETVERSION...) returns ENOTTY on tmpfs);
2. Even with generation, there shouldn't be two live inodes with the
   same inode number on one device.

In order to mitigate this, we take a two-pronged approach:

1. Moving inum generation from being global to per-sb for tmpfs. This
   itself allows some reduction in i_ino churn. This works on both 64-
   and 32- bit machines.
2. Adding inode{64,32} for tmpfs. This fix is supported on machines with
   64-bit ino_t only: we allow users to mount tmpfs with a new inode64
   option that uses the full width of ino_t, or CONFIG_TMPFS_INODE64.

You can see how this compares to previous related patches which didn't
implement this per-superblock:

- https://patchwork.kernel.org/patch/11254001/
- https://patchwork.kernel.org/patch/11023915/


This patch (of 2):

get_next_ino has a number of problems:

- It uses and returns a uint, which is susceptible to overflow if a
  lot of volatile inodes that use get_next_ino are created.
- It's global, with no specificity per-sb or even per-filesystem. This
  means it's not that difficult to cause inode number wraparounds on a
  single device, which can result in having multiple distinct inodes
  with the same inode number.

This patch adds a per-superblock counter that mitigates the second case. 
This design also allows us to later have a specific i_ino size per-device,
for example, allowing users to choose whether to use 32- or 64-bit inodes
for each tmpfs mount.  This is implemented in the next commit.

For internal shmem mounts which may be less tolerant to spinlock delays,
we implement a percpu batching scheme which only takes the stat_lock at
each batch boundary.
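
A minimal sketch of that batching scheme, with a hypothetical function
name and the zero-inum handling omitted (the real code is in
shmem_reserve_inode() in the diff below):

	#define SHMEM_INO_BATCH 1024

	static ino_t batched_ino_alloc(struct shmem_sb_info *sbinfo)
	{
		ino_t *next = per_cpu_ptr(sbinfo->ino_batch, get_cpu());
		ino_t ino = *next;

		/* Batch exhausted (or first use on this cpu): refill a
		 * whole batch under the per-sb stat_lock. */
		if (ino % SHMEM_INO_BATCH == 0) {
			spin_lock(&sbinfo->stat_lock);
			ino = sbinfo->next_ino;
			sbinfo->next_ino += SHMEM_INO_BATCH;
			spin_unlock(&sbinfo->stat_lock);
		}
		*next = ino + 1;
		put_cpu();
		return ino;
	}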

Link: http://lkml.kernel.org/r/cover.1594661218.git.chris@chrisdown.name
Link: http://lkml.kernel.org/r/1986b9d63b986f08ec07a4aa4b2275e718e47d8a.1594661218.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/fs.h       |   15 ++++++++
 include/linux/shmem_fs.h |    2 +
 mm/shmem.c               |   66 ++++++++++++++++++++++++++++++++++---
 3 files changed, 78 insertions(+), 5 deletions(-)

--- a/include/linux/fs.h~tmpfs-per-superblock-i_ino-support
+++ a/include/linux/fs.h
@@ -3098,6 +3098,21 @@ extern void discard_new_inode(struct ino
 extern unsigned int get_next_ino(void);
 extern void evict_inodes(struct super_block *sb);
 
+/*
+ * Userspace may rely on the inode number being non-zero. For example, glibc
+ * simply ignores files with zero i_ino in unlink() and other places.
+ *
+ * As an additional complication, if userspace was compiled with
+ * _FILE_OFFSET_BITS=32 on a 64-bit kernel we'll only end up reading out the
+ * lower 32 bits, so we need to check that those aren't zero explicitly. With
+ * _FILE_OFFSET_BITS=64, this may cause some harmless false-negatives, but
+ * better safe than sorry.
+ */
+static inline bool is_zero_ino(ino_t ino)
+{
+	return (u32)ino == 0;
+}
+
 extern void __iget(struct inode * inode);
 extern void iget_failed(struct inode *);
 extern void clear_inode(struct inode *);
--- a/include/linux/shmem_fs.h~tmpfs-per-superblock-i_ino-support
+++ a/include/linux/shmem_fs.h
@@ -36,6 +36,8 @@ struct shmem_sb_info {
 	unsigned char huge;	    /* Whether to try for hugepages */
 	kuid_t uid;		    /* Mount uid for root directory */
 	kgid_t gid;		    /* Mount gid for root directory */
+	ino_t next_ino;		    /* The next per-sb inode number to use */
+	ino_t __percpu *ino_batch;  /* The next per-cpu inode number to use */
 	struct mempolicy *mpol;     /* default memory policy for mappings */
 	spinlock_t shrinklist_lock;   /* Protects shrinklist */
 	struct list_head shrinklist;  /* List of shinkable inodes */
--- a/mm/shmem.c~tmpfs-per-superblock-i_ino-support
+++ a/mm/shmem.c
@@ -260,18 +260,67 @@ bool vma_is_shmem(struct vm_area_struct
 static LIST_HEAD(shmem_swaplist);
 static DEFINE_MUTEX(shmem_swaplist_mutex);
 
-static int shmem_reserve_inode(struct super_block *sb)
+/*
+ * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and
+ * produces a novel ino for the newly allocated inode.
+ *
+ * It may also be called when making a hard link to permit the space needed by
+ * each dentry. However, in that case, no new inode number is needed since that
+ * internally draws from another pool of inode numbers (currently global
+ * get_next_ino()). This case is indicated by passing NULL as inop.
+ */
+#define SHMEM_INO_BATCH 1024
+static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
-	if (sbinfo->max_inodes) {
+	ino_t ino;
+
+	if (!(sb->s_flags & SB_KERNMOUNT)) {
 		spin_lock(&sbinfo->stat_lock);
 		if (!sbinfo->free_inodes) {
 			spin_unlock(&sbinfo->stat_lock);
 			return -ENOSPC;
 		}
 		sbinfo->free_inodes--;
+		if (inop) {
+			ino = sbinfo->next_ino++;
+			if (unlikely(is_zero_ino(ino)))
+				ino = sbinfo->next_ino++;
+			if (unlikely(ino > UINT_MAX)) {
+				/*
+				 * Emulate get_next_ino uint wraparound for
+				 * compatibility
+				 */
+				ino = 1;
+			}
+			*inop = ino;
+		}
 		spin_unlock(&sbinfo->stat_lock);
+	} else if (inop) {
+		/*
+		 * __shmem_file_setup, one of our callers, is lock-free: it
+		 * doesn't hold stat_lock in shmem_reserve_inode since
+		 * max_inodes is always 0, and is called from potentially
+		 * unknown contexts. As such, use a per-cpu batched allocator
+		 * which doesn't require the per-sb stat_lock unless we are at
+		 * the batch boundary.
+		 */
+		ino_t *next_ino;
+		next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu());
+		ino = *next_ino;
+		if (unlikely(ino % SHMEM_INO_BATCH == 0)) {
+			spin_lock(&sbinfo->stat_lock);
+			ino = sbinfo->next_ino;
+			sbinfo->next_ino += SHMEM_INO_BATCH;
+			spin_unlock(&sbinfo->stat_lock);
+			if (unlikely(is_zero_ino(ino)))
+				ino++;
+		}
+		*inop = ino;
+		*next_ino = ++ino;
+		put_cpu();
 	}
+
 	return 0;
 }
 
@@ -2222,13 +2271,14 @@ static struct inode *shmem_get_inode(str
 	struct inode *inode;
 	struct shmem_inode_info *info;
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+	ino_t ino;
 
-	if (shmem_reserve_inode(sb))
+	if (shmem_reserve_inode(sb, &ino))
 		return NULL;
 
 	inode = new_inode(sb);
 	if (inode) {
-		inode->i_ino = get_next_ino();
+		inode->i_ino = ino;
 		inode_init_owner(inode, dir, mode);
 		inode->i_blocks = 0;
 		inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
@@ -2932,7 +2982,7 @@ static int shmem_link(struct dentry *old
 	 * first link must skip that, to get the accounting right.
 	 */
 	if (inode->i_nlink) {
-		ret = shmem_reserve_inode(inode->i_sb);
+		ret = shmem_reserve_inode(inode->i_sb, NULL);
 		if (ret)
 			goto out;
 	}
@@ -3584,6 +3634,7 @@ static void shmem_put_super(struct super
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 
+	free_percpu(sbinfo->ino_batch);
 	percpu_counter_destroy(&sbinfo->used_blocks);
 	mpol_put(sbinfo->mpol);
 	kfree(sbinfo);
@@ -3626,6 +3677,11 @@ static int shmem_fill_super(struct super
 #endif
 	sbinfo->max_blocks = ctx->blocks;
 	sbinfo->free_inodes = sbinfo->max_inodes = ctx->inodes;
+	if (sb->s_flags & SB_KERNMOUNT) {
+		sbinfo->ino_batch = alloc_percpu(ino_t);
+		if (!sbinfo->ino_batch)
+			goto failed;
+	}
 	sbinfo->uid = ctx->uid;
 	sbinfo->gid = ctx->gid;
 	sbinfo->mode = ctx->mode;
_

Patches currently in -mm which might be from chris@chrisdown.name are

tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + tmpfs-support-64-bit-inums-per-sb.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (125 preceding siblings ...)
  2020-07-14  0:31 ` + tmpfs-per-superblock-i_ino-support.patch " Andrew Morton
@ 2020-07-14  0:31 ` Andrew Morton
  2020-07-14  0:50 ` + mm-thp-replace-http-links-with-https-ones.patch " Andrew Morton
                   ` (105 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  0:31 UTC (permalink / raw)
  To: amir73il, chris, hannes, hughd, jlayton, mm-commits, tj, viro, willy


The patch titled
     Subject: tmpfs: support 64-bit inums per-sb
has been added to the -mm tree.  Its filename is
     tmpfs-support-64-bit-inums-per-sb.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/tmpfs-support-64-bit-inums-per-sb.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-support-64-bit-inums-per-sb.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: tmpfs: support 64-bit inums per-sb

The default is still set to inode32 for backwards compatibility, but
system administrators can opt in to the new 64-bit inode numbers by
either:

1. Passing inode64 on the command line when mounting, or
2. Configuring the kernel with CONFIG_TMPFS_INODE64=y

The inode64 and inode32 names are used based on existing precedent from
XFS.
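
Usage then looks like (illustrative):

    # mount -t tmpfs -o size=1G,inode64 tmpfs /mnt
    # mount | grep ' /mnt '
    tmpfs on /mnt type tmpfs (rw,relatime,size=1048576k,inode64)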

Link: http://lkml.kernel.org/r/8b23758d0c66b5e2263e08baf9c4b6a7565cbd8f.1594661218.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/filesystems/tmpfs.rst |   11 ++++
 fs/Kconfig                          |   15 ++++++
 include/linux/shmem_fs.h            |    1 
 mm/shmem.c                          |   65 +++++++++++++++++++++++++-
 4 files changed, 90 insertions(+), 2 deletions(-)

--- a/Documentation/filesystems/tmpfs.rst~tmpfs-support-64-bit-inums-per-sb
+++ a/Documentation/filesystems/tmpfs.rst
@@ -150,6 +150,15 @@ These options do not have any effect on
 parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem.
 
 
+tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode
+numbers:
+
+inode64   Use 64-bit inode numbers
+inode32   Use 32-bit inode numbers
+
+On 64-bit, the default is set by CONFIG_TMPFS_INODE64. On 32-bit, inode64 is
+not legal and will produce an error at mount time.
+
 So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs'
 will give you tmpfs instance on /mytmpfs which can allocate 10GB
 RAM/SWAP in 10240 inodes and it is only accessible by root.
@@ -161,3 +170,5 @@ RAM/SWAP in 10240 inodes and it is only
    Hugh Dickins, 4 June 2007
 :Updated:
    KOSAKI Motohiro, 16 Mar 2010
+:Updated:
+   Chris Down, 13 July 2020
--- a/fs/Kconfig~tmpfs-support-64-bit-inums-per-sb
+++ a/fs/Kconfig
@@ -201,6 +201,21 @@ config TMPFS_XATTR
 
 	  If unsure, say N.
 
+config TMPFS_INODE64
+	bool "Use 64-bit ino_t by default in tmpfs"
+	depends on TMPFS && 64BIT
+	default n
+	help
+	  tmpfs has historically used only inode numbers as wide as an unsigned
+	  int. In some cases this can cause wraparound, potentially resulting in
+	  multiple files with the same inode number on a single device. This option
+	  makes tmpfs use the full width of ino_t by default, similarly to the
+	  inode64 mount option.
+
+	  To override this default, use the inode32 or inode64 mount options.
+
+	  If unsure, say N.
+
 config HUGETLBFS
 	bool "HugeTLB file system support"
 	depends on X86 || IA64 || SPARC64 || (S390 && 64BIT) || \
--- a/include/linux/shmem_fs.h~tmpfs-support-64-bit-inums-per-sb
+++ a/include/linux/shmem_fs.h
@@ -36,6 +36,7 @@ struct shmem_sb_info {
 	unsigned char huge;	    /* Whether to try for hugepages */
 	kuid_t uid;		    /* Mount uid for root directory */
 	kgid_t gid;		    /* Mount gid for root directory */
+	bool full_inums;	    /* If i_ino should be uint or ino_t */
 	ino_t next_ino;		    /* The next per-sb inode number to use */
 	ino_t __percpu *ino_batch;  /* The next per-cpu inode number to use */
 	struct mempolicy *mpol;     /* default memory policy for mappings */
--- a/mm/shmem.c~tmpfs-support-64-bit-inums-per-sb
+++ a/mm/shmem.c
@@ -114,11 +114,13 @@ struct shmem_options {
 	kuid_t uid;
 	kgid_t gid;
 	umode_t mode;
+	bool full_inums;
 	int huge;
 	int seen;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
+#define SHMEM_SEEN_INUMS 8
 };
 
 #ifdef CONFIG_TMPFS
@@ -286,12 +288,17 @@ static int shmem_reserve_inode(struct su
 			ino = sbinfo->next_ino++;
 			if (unlikely(is_zero_ino(ino)))
 				ino = sbinfo->next_ino++;
-			if (unlikely(ino > UINT_MAX)) {
+			if (unlikely(!sbinfo->full_inums &&
+				     ino > UINT_MAX)) {
 				/*
 				 * Emulate get_next_ino uint wraparound for
 				 * compatibility
 				 */
-				ino = 1;
+				if (IS_ENABLED(CONFIG_64BIT))
+					pr_warn("%s: inode number overflow on device %d, consider using inode64 mount option\n",
+						__func__, MINOR(sb->s_dev));
+				sbinfo->next_ino = 1;
+				ino = sbinfo->next_ino++;
 			}
 			*inop = ino;
 		}
@@ -304,6 +311,10 @@ static int shmem_reserve_inode(struct su
 		 * unknown contexts. As such, use a per-cpu batched allocator
 		 * which doesn't require the per-sb stat_lock unless we are at
 		 * the batch boundary.
+		 *
+		 * We don't need to worry about inode{32,64} since SB_KERNMOUNT
+		 * shmem mounts are not exposed to userspace, so we don't need
+		 * to worry about things like glibc compatibility.
 		 */
 		ino_t *next_ino;
 		next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu());
@@ -3397,6 +3408,8 @@ enum shmem_param {
 	Opt_nr_inodes,
 	Opt_size,
 	Opt_uid,
+	Opt_inode32,
+	Opt_inode64,
 };
 
 static const struct constant_table shmem_param_enums_huge[] = {
@@ -3416,6 +3429,8 @@ const struct fs_parameter_spec shmem_fs_
 	fsparam_string("nr_inodes",	Opt_nr_inodes),
 	fsparam_string("size",		Opt_size),
 	fsparam_u32   ("uid",		Opt_uid),
+	fsparam_flag  ("inode32",	Opt_inode32),
+	fsparam_flag  ("inode64",	Opt_inode64),
 	{}
 };
 
@@ -3487,6 +3502,18 @@ static int shmem_parse_one(struct fs_con
 			break;
 		}
 		goto unsupported_parameter;
+	case Opt_inode32:
+		ctx->full_inums = false;
+		ctx->seen |= SHMEM_SEEN_INUMS;
+		break;
+	case Opt_inode64:
+		if (sizeof(ino_t) < 8) {
+			return invalfc(fc,
+				       "Cannot use inode64 with <64bit inums in kernel\n");
+		}
+		ctx->full_inums = true;
+		ctx->seen |= SHMEM_SEEN_INUMS;
+		break;
 	}
 	return 0;
 
@@ -3578,8 +3605,16 @@ static int shmem_reconfigure(struct fs_c
 		}
 	}
 
+	if ((ctx->seen & SHMEM_SEEN_INUMS) && !ctx->full_inums &&
+	    sbinfo->next_ino > UINT_MAX) {
+		err = "Current inum too high to switch to 32-bit inums";
+		goto out;
+	}
+
 	if (ctx->seen & SHMEM_SEEN_HUGE)
 		sbinfo->huge = ctx->huge;
+	if (ctx->seen & SHMEM_SEEN_INUMS)
+		sbinfo->full_inums = ctx->full_inums;
 	if (ctx->seen & SHMEM_SEEN_BLOCKS)
 		sbinfo->max_blocks  = ctx->blocks;
 	if (ctx->seen & SHMEM_SEEN_INODES) {
@@ -3619,6 +3654,29 @@ static int shmem_show_options(struct seq
 	if (!gid_eq(sbinfo->gid, GLOBAL_ROOT_GID))
 		seq_printf(seq, ",gid=%u",
 				from_kgid_munged(&init_user_ns, sbinfo->gid));
+
+	/*
+	 * Showing inode{64,32} might be useful even if it's the system default,
+	 * since then people don't have to resort to checking both here and
+	 * /proc/config.gz to confirm 64-bit inums were successfully applied
+	 * (which may not even exist if IKCONFIG_PROC isn't enabled).
+	 *
+	 * We hide it when inode64 isn't the default and we are using 32-bit
+	 * inodes, since that probably just means the feature isn't even under
+	 * consideration.
+	 *
+	 * As such:
+	 *
+	 *                     +-----------------+-----------------+
+	 *                     | TMPFS_INODE64=y | TMPFS_INODE64=n |
+	 *  +------------------+-----------------+-----------------+
+	 *  | full_inums=true  | show            | show            |
+	 *  | full_inums=false | show            | hide            |
+	 *  +------------------+-----------------+-----------------+
+	 *
+	 */
+	if (IS_ENABLED(CONFIG_TMPFS_INODE64) || sbinfo->full_inums)
+		seq_printf(seq, ",inode%d", (sbinfo->full_inums ? 64 : 32));
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	/* Rightly or wrongly, show huge mount option unmasked by shmem_huge */
 	if (sbinfo->huge)
@@ -3667,6 +3725,8 @@ static int shmem_fill_super(struct super
 			ctx->blocks = shmem_default_max_blocks();
 		if (!(ctx->seen & SHMEM_SEEN_INODES))
 			ctx->inodes = shmem_default_max_inodes();
+		if (!(ctx->seen & SHMEM_SEEN_INUMS))
+			ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64);
 	} else {
 		sb->s_flags |= SB_NOUSER;
 	}
@@ -3684,6 +3744,7 @@ static int shmem_fill_super(struct super
 	}
 	sbinfo->uid = ctx->uid;
 	sbinfo->gid = ctx->gid;
+	sbinfo->full_inums = ctx->full_inums;
 	sbinfo->mode = ctx->mode;
 	sbinfo->huge = ctx->huge;
 	sbinfo->mpol = ctx->mpol;
_

Patches currently in -mm which might be from chris@chrisdown.name are

tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-thp-replace-http-links-with-https-ones.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (126 preceding siblings ...)
  2020-07-14  0:31 ` + tmpfs-support-64-bit-inums-per-sb.patch " Andrew Morton
@ 2020-07-14  0:50 ` Andrew Morton
  2020-07-14  1:00 ` + mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch " Andrew Morton
                   ` (104 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  0:50 UTC (permalink / raw)
  To: akpm, grandmaster, mm-commits


The patch titled
     Subject: mm: thp: replace HTTP links with HTTPS ones
has been added to the -mm tree.  Its filename is
     mm-thp-replace-http-links-with-https-ones.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-replace-http-links-with-https-ones.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-thp-replace-http-links-with-https-ones.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Alexander A. Klimov" <grandmaster@al2klimov.de>
Subject: mm: thp: replace HTTP links with HTTPS ones

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# 	
]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Link: http://lkml.kernel.org/r/20200713164345.36088-1-grandmaster@al2klimov.de
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/huge_memory.c~mm-thp-replace-http-links-with-https-ones
+++ a/mm/huge_memory.c
@@ -2065,7 +2065,7 @@ static void __split_huge_pmd_locked(stru
 	 * free), userland could trigger a small page size TLB miss on the
 	 * small sized TLB while the hugepage TLB entry is still established in
 	 * the huge TLB. Some CPU doesn't like that.
-	 * See http://support.amd.com/us/Processor_TechDocs/41322.pdf, Erratum
+	 * See https://support.amd.com/us/Processor_TechDocs/41322.pdf, Erratum
 	 * 383 on page 93. Intel should be safe but is also warns that it's
 	 * only safe if the permission and cache attributes of the two entries
 	 * loaded in the two TLB is identical (which should be the case here).
_

Patches currently in -mm which might be from grandmaster@al2klimov.de are

mm-thp-replace-http-links-with-https-ones.patch
vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (127 preceding siblings ...)
  2020-07-14  0:50 ` + mm-thp-replace-http-links-with-https-ones.patch " Andrew Morton
@ 2020-07-14  1:00 ` Andrew Morton
  2020-07-14  1:00 ` + mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch " Andrew Morton
                   ` (103 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:00 UTC (permalink / raw)
  To: chris, guro, hannes, mhocko, mm-commits, shakeelb, tj


The patch titled
     Subject: mm, memcg: reclaim more aggressively before high allocator throttling
has been added to the -mm tree.  Its filename is
     mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: mm, memcg: reclaim more aggressively before high allocator throttling

Patch series "mm, memcg: reclaim harder before high throttling", v2.


This patch (of 2):

In Facebook production, we've seen cases where cgroups have been put into
allocator throttling even when they appear to have a lot of slack file
caches which should be trivially reclaimable.

Looking more closely, the problem is that we only try a single cgroup
reclaim walk for each return to usermode before calculating whether or
not we should throttle.  For cgroups with a rapidly growing amount of
file caches, this single attempt doesn't produce enough reclaim pressure
before the task enters allocator throttling.

As an example, we see that threads in an affected cgroup are stuck in
allocator throttling:

    # for i in $(cat cgroup.threads); do
    >     grep over_high "/proc/$i/stack"
    > done
    [<0>] mem_cgroup_handle_over_high+0x10b/0x150
    [<0>] mem_cgroup_handle_over_high+0x10b/0x150
    [<0>] mem_cgroup_handle_over_high+0x10b/0x150

...however, there is no I/O pressure reported by PSI, despite a lot of
slack file pages:

    # cat memory.pressure
    some avg10=78.50 avg60=84.99 avg300=84.53 total=5702440903
    full avg10=78.50 avg60=84.99 avg300=84.53 total=5702116959
    # cat io.pressure
    some avg10=0.00 avg60=0.00 avg300=0.00 total=78051391
    full avg10=0.00 avg60=0.00 avg300=0.00 total=78049640
    # grep _file memory.stat
    inactive_file 1370939392
    active_file 661635072

This patch changes the behaviour to retry reclaim either until the current
task goes below the 10ms grace period, or we are making no reclaim
progress at all.  In the latter case, we enter reclaim throttling as
before.
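
In rough pseudocode, the new control flow is (simplified from the diff
below; helper names approximate):

	retry:
		nr_reclaimed = reclaim_high(memcg, nr_pages, GFP_KERNEL);
		penalty = calculate_high_delay(...);
		if (penalty <= grace_period)		/* ~10ms */
			return;				/* no throttling */
		if (nr_reclaimed || nr_retries--)
			goto retry;			/* keep reclaiming */
		schedule_timeout_killable(penalty);	/* throttle */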

To a user, there's no intuitive reason for the reclaim behaviour to differ
from hitting memory.high as part of a new allocation, as opposed to
hitting memory.high because someone lowered its value.  As such this also
brings an added benefit: it unifies the reclaim behaviour between the two.

There's precedent for this behaviour: we already do reclaim retries when
writing to memory.{high,max}, in max reclaim, and in the page allocator
itself.

Link: http://lkml.kernel.org/r/cover.1594640214.git.chris@chrisdown.name
Link: http://lkml.kernel.org/r/a4e23b59e9ef499b575ae73a8120ee089b7d3373.1594640214.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   42 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling
+++ a/mm/memcontrol.c
@@ -73,6 +73,7 @@ EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
 
+/* The number of times we should retry reclaim failures before giving up. */
 #define MEM_CGROUP_RECLAIM_RETRIES	5
 
 /* Socket memory accounting disabled? */
@@ -2365,18 +2366,23 @@ static int memcg_hotplug_cpu_dead(unsign
 	return 0;
 }
 
-static void reclaim_high(struct mem_cgroup *memcg,
-			 unsigned int nr_pages,
-			 gfp_t gfp_mask)
+static unsigned long reclaim_high(struct mem_cgroup *memcg,
+				  unsigned int nr_pages,
+				  gfp_t gfp_mask)
 {
+	unsigned long nr_reclaimed = 0;
+
 	do {
 		if (page_counter_read(&memcg->memory) <=
 		    READ_ONCE(memcg->memory.high))
 			continue;
 		memcg_memory_event(memcg, MEMCG_HIGH);
-		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
+		nr_reclaimed += try_to_free_mem_cgroup_pages(memcg, nr_pages,
+							     gfp_mask, true);
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
 		 !mem_cgroup_is_root(memcg));
+
+	return nr_reclaimed;
 }
 
 static void high_work_func(struct work_struct *work)
@@ -2532,16 +2538,32 @@ void mem_cgroup_handle_over_high(void)
 {
 	unsigned long penalty_jiffies;
 	unsigned long pflags;
+	unsigned long nr_reclaimed;
 	unsigned int nr_pages = current->memcg_nr_pages_over_high;
+	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
 	struct mem_cgroup *memcg;
+	bool in_retry = false;
 
 	if (likely(!nr_pages))
 		return;
 
 	memcg = get_mem_cgroup_from_mm(current->mm);
-	reclaim_high(memcg, nr_pages, GFP_KERNEL);
 	current->memcg_nr_pages_over_high = 0;
 
+retry_reclaim:
+	/*
+	 * The allocating task should reclaim at least the batch size, but for
+	 * subsequent retries we only want to do what's necessary to prevent oom
+	 * or breaching resource isolation.
+	 *
+	 * This is distinct from memory.max or page allocator behaviour because
+	 * memory.high is currently batched, whereas memory.max and the page
+	 * allocator run every time an allocation is made.
+	 */
+	nr_reclaimed = reclaim_high(memcg,
+				    in_retry ? SWAP_CLUSTER_MAX : nr_pages,
+				    GFP_KERNEL);
+
 	/*
 	 * memory.high is breached and reclaim is unable to keep up. Throttle
 	 * allocators proactively to slow down excessive growth.
@@ -2569,6 +2591,16 @@ void mem_cgroup_handle_over_high(void)
 		goto out;
 
 	/*
+	 * If reclaim is making forward progress but we're still over
+	 * memory.high, we want to encourage that rather than doing allocator
+	 * throttling.
+	 */
+	if (nr_reclaimed || nr_retries--) {
+		in_retry = true;
+		goto retry_reclaim;
+	}
+
+	/*
 	 * If we exit early, we're guaranteed to die (since
 	 * schedule_timeout_killable sets TASK_KILLABLE). This means we don't
 	 * need to account for any ill-begotten jiffies to pay them off later.
_

Patches currently in -mm which might be from chris@chrisdown.name are

tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch
mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (128 preceding siblings ...)
  2020-07-14  1:00 ` + mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch " Andrew Morton
@ 2020-07-14  1:00 ` Andrew Morton
  2020-07-14  1:03 ` + mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch " Andrew Morton
                   ` (102 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:00 UTC (permalink / raw)
  To: chris, guro, hannes, mhocko, mm-commits, shakeelb, tj


The patch titled
     Subject: mm, memcg: unify reclaim retry limits with page allocator
has been added to the -mm tree.  Its filename is
     mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: mm, memcg: unify reclaim retry limits with page allocator

Reclaim retries have been set to 5 since the beginning of time in
commit 66e1707bc346 ("Memory controller: add per cgroup LRU and
reclaim").  However, we now have a generally agreed-upon standard for
page reclaim: MAX_RECLAIM_RETRIES (currently 16), added many years later
in commit 0a0337e0d1d1 ("mm, oom: rework oom detection").

In the absence of a compelling reason to declare an OOM earlier in memcg
context than page allocator context, it seems reasonable to supplant
MEM_CGROUP_RECLAIM_RETRIES with MAX_RECLAIM_RETRIES, making the page
allocator and memcg internals more similar in semantics when reclaim
fails to produce results, avoiding premature OOMs or throttling.

Link: http://lkml.kernel.org/r/da557856c9c7654308eaff4eedc1952a95e8df5f.1594640214.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-unify-reclaim-retry-limits-with-page-allocator
+++ a/mm/memcontrol.c
@@ -73,9 +73,6 @@ EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
 
-/* The number of times we should retry reclaim failures before giving up. */
-#define MEM_CGROUP_RECLAIM_RETRIES	5
-
 /* Socket memory accounting disabled? */
 static bool cgroup_memory_nosocket;
 
@@ -2540,7 +2537,7 @@ void mem_cgroup_handle_over_high(void)
 	unsigned long pflags;
 	unsigned long nr_reclaimed;
 	unsigned int nr_pages = current->memcg_nr_pages_over_high;
-	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+	int nr_retries = MAX_RECLAIM_RETRIES;
 	struct mem_cgroup *memcg;
 	bool in_retry = false;
 
@@ -2617,7 +2614,7 @@ static int try_charge(struct mem_cgroup
 		      unsigned int nr_pages)
 {
 	unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
-	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+	int nr_retries = MAX_RECLAIM_RETRIES;
 	struct mem_cgroup *mem_over_limit;
 	struct page_counter *counter;
 	unsigned long nr_reclaimed;
@@ -2736,7 +2733,7 @@ retry:
 		       get_order(nr_pages * PAGE_SIZE));
 	switch (oom_status) {
 	case OOM_SUCCESS:
-		nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+		nr_retries = MAX_RECLAIM_RETRIES;
 		goto retry;
 	case OOM_FAILED:
 		goto force;
@@ -3396,7 +3393,7 @@ static inline bool memcg_has_children(st
  */
 static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 {
-	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+	int nr_retries = MAX_RECLAIM_RETRIES;
 
 	/* we call try-to-free pages for make this cgroup empty */
 	lru_add_drain_all();
@@ -6225,7 +6222,7 @@ static ssize_t memory_high_write(struct
 				 char *buf, size_t nbytes, loff_t off)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
-	unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
+	unsigned int nr_retries = MAX_RECLAIM_RETRIES;
 	bool drained = false;
 	unsigned long high;
 	int err;
@@ -6273,7 +6270,7 @@ static ssize_t memory_max_write(struct k
 				char *buf, size_t nbytes, loff_t off)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
-	unsigned int nr_reclaims = MEM_CGROUP_RECLAIM_RETRIES;
+	unsigned int nr_reclaims = MAX_RECLAIM_RETRIES;
 	bool drained = false;
 	unsigned long max;
 	int err;
_

Patches currently in -mm which might be from chris@chrisdown.name are

tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch
mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (129 preceding siblings ...)
  2020-07-14  1:00 ` + mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch " Andrew Morton
@ 2020-07-14  1:03 ` Andrew Morton
  2020-07-14  1:03 ` + mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch " Andrew Morton
                   ` (101 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:03 UTC (permalink / raw)
  To: chris, guro, hannes, laoar.shao, mhocko, mm-commits


The patch titled
     Subject: mm, memcg: avoid stale protection values when cgroup is above protection
has been added to the -mm tree.  Its filename is
     mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yafang Shao <laoar.shao@gmail.com>
Subject: mm, memcg: avoid stale protection values when cgroup is above protection

Patch series "mm, memcg: memory.{low,min} reclaim fix & cleanup", v4.

This series contains a fix for an edge case in my earlier protection
calculation patches, and a patch to make the area overall a little more
robust, to hopefully help avoid this in the future.


This patch (of 2):

A cgroup can have both memory protection and a memory limit to isolate it
from its siblings in both directions - for example, to prevent it from
being shrunk below 2G under high pressure from outside, but also from
growing beyond 4G under low pressure.

Commit 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
implemented proportional scan pressure so that multiple siblings in excess
of their protection settings don't get reclaimed equally but instead in
accordance to their unprotected portion.

During limit reclaim, this proportionality shouldn't apply of course:
there is no competition, all pressure is from within the cgroup and should
be applied as such.  Reclaim should operate at full efficiency.

However, mem_cgroup_protected() never expected anybody to look at the
effective protection values when it indicated that the cgroup is above its
protection.  As a result, a query during limit reclaim may return stale
protection values that were calculated by a previous reclaim cycle in
which the cgroup did have siblings.

When this happens, reclaim is unnecessarily hesitant and potentially slow
to meet the desired limit.  In theory this could lead to premature OOM
kills, although it's not obvious this has occurred in practice.

Work around the problem by special-casing reclaim roots in
mem_cgroup_protection.  These memcgs never participate in reclaim
protection because the reclaim is internal.

We have to ignore effective protection values for reclaim roots because
mem_cgroup_protected might be called from racing reclaim contexts with
different roots.  The calculation relies on a root -> leaf tree traversal,
so the top-down reclaim protection invariants should hold.  The only
exception is the reclaim root, which should have its effective protection
set to 0, but that would be problematic for the following setup:

 Let's have global and A's reclaim in parallel:
  |
  A (low=2G, usage = 3G, max = 3G, children_low_usage = 1.5G)
  |\
  | C (low = 1G, usage = 2.5G)
  B (low = 1G, usage = 0.5G)

 for A reclaim we have
 B.elow = B.low
 C.elow = C.low

 For the global reclaim
 A.elow = A.low
 B.elow = min(B.usage, B.low) because children_low_usage <= A.elow
 C.elow = min(C.usage, C.low)

 With the effective values resetting we have A reclaim
 A.elow = 0
 B.elow = B.low
 C.elow = C.low

 and global reclaim could see the above and then
 B.elow = C.elow = 0 because children_low_usage > A.elow

Which means that protected memcgs would get reclaimed.

In future we would like to make mem_cgroup_protected more robust against
racing reclaim contexts but that is likely more complex solution than this
simple workaround.

[hannes@cmpxchg.org - large part of the changelog]
[mhocko@suse.com - workaround explanation]
[chris@chrisdown.name - retitle]
Link: http://lkml.kernel.org/r/cover.1594638158.git.chris@chrisdown.name
Link: http://lkml.kernel.org/r/044fb8ecffd001c7905d27c0c2ad998069fdc396.1594638158.git.chris@chrisdown.name
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |   42 +++++++++++++++++++++++++++++++++--
 mm/memcontrol.c            |    8 ++++++
 mm/vmscan.c                |    3 +-
 3 files changed, 50 insertions(+), 3 deletions(-)

--- a/include/linux/memcontrol.h~mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection
+++ a/include/linux/memcontrol.h
@@ -363,12 +363,49 @@ static inline bool mem_cgroup_disabled(v
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	if (mem_cgroup_disabled())
 		return 0;
 
+	/*
+	 * There is no reclaim protection applied to a targeted reclaim.
+	 * We are special casing this specific case here because
+	 * mem_cgroup_protected calculation is not robust enough to keep
+	 * the protection invariant for calculated effective values for
+	 * parallel reclaimers with different reclaim target. This is
+	 * especially a problem for tail memcgs (as they have pages on LRU)
+	 * which would want to have effective values 0 for targeted reclaim
+	 * but a different value for external reclaim.
+	 *
+	 * Example
+	 * Let's have global and A's reclaim in parallel:
+	 *  |
+	 *  A (low=2G, usage = 3G, max = 3G, children_low_usage = 1.5G)
+	 *  |\
+	 *  | C (low = 1G, usage = 2.5G)
+	 *  B (low = 1G, usage = 0.5G)
+	 *
+	 * For the global reclaim
+	 * A.elow = A.low
+	 * B.elow = min(B.usage, B.low) because children_low_usage <= A.elow
+	 * C.elow = min(C.usage, C.low)
+	 *
+	 * With the effective values resetting we have A reclaim
+	 * A.elow = 0
+	 * B.elow = B.low
+	 * C.elow = C.low
+	 *
+	 * If the global reclaim races with A's reclaim then
+	 * B.elow = C.elow = 0 because children_low_usage > A.elow)
+	 * is possible and reclaiming B would be violating the protection.
+	 *
+	 */
+	if (root == memcg)
+		return 0;
+
 	if (in_low_reclaim)
 		return READ_ONCE(memcg->memory.emin);
 
@@ -899,7 +936,8 @@ static inline void memcg_memory_event_mm
 {
 }
 
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
+						  struct mem_cgroup *memcg,
 						  bool in_low_reclaim)
 {
 	return 0;
--- a/mm/memcontrol.c~mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection
+++ a/mm/memcontrol.c
@@ -6595,6 +6595,14 @@ enum mem_cgroup_protection mem_cgroup_pr
 
 	if (!root)
 		root = root_mem_cgroup;
+
+	/*
+	 * Effective values of the reclaim targets are ignored so they
+	 * can be stale. Have a look at mem_cgroup_protection for more
+	 * details.
+	 * TODO: calculation should be more robust so that we do not need
+	 * that special casing.
+	 */
 	if (memcg == root)
 		return MEMCG_PROT_NONE;
 
--- a/mm/vmscan.c~mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection
+++ a/mm/vmscan.c
@@ -2331,7 +2331,8 @@ out:
 		unsigned long protection;
 
 		lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
-		protection = mem_cgroup_protection(memcg,
+		protection = mem_cgroup_protection(sc->target_mem_cgroup,
+						   memcg,
 						   sc->memcg_low_reclaim);
 
 		if (protection) {
_

Patches currently in -mm which might be from laoar.shao@gmail.com are

mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
mm-oom-make-the-calculation-of-oom-badness-more-accurate-v3.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (130 preceding siblings ...)
  2020-07-14  1:03 ` + mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch " Andrew Morton
@ 2020-07-14  1:03 ` Andrew Morton
  2020-07-14  1:10 ` + scripts-deprecated_terms-sync-with-inclusive-terms.patch " Andrew Morton
                   ` (100 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:03 UTC (permalink / raw)
  To: chris, guro, hannes, laoar.shao, mhocko, mm-commits


The patch titled
     Subject: mm, memcg: decouple e{low,min} state mutations from protection checks
has been added to the -mm tree.  Its filename is
     mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: mm, memcg: decouple e{low,min} state mutations from protection checks

mem_cgroup_protected is currently used both to set the effective low and
min values and to return a mem_cgroup_protection based on the result.  As
a user, this can be a little unexpected: it appears to be a simple
predicate function, if not for the big warning in the comment above about
the order in which it must be executed.

This change separates the state mutations from the actual protection
checks, which makes it more obvious where we need to be careful mutating
internal state, and where we are simply checking and don't need to worry
about that.
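
A minimal sketch of the new calling convention in the reclaim path (this
mirrors the mm/vmscan.c hunk below):

	mem_cgroup_calculate_protection(target_memcg, memcg);

	if (mem_cgroup_below_min(memcg)) {
		/* hard protection: skip this memcg entirely */
		continue;
	} else if (mem_cgroup_below_low(memcg)) {
		/* soft protection: respect it unless reclaim would fail */
		if (!sc->memcg_low_reclaim) {
			sc->memcg_low_skipped = 1;
			continue;
		}
		memcg_memory_event(memcg, MEMCG_LOW);
	}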

[mhocko@suse.com - don't check protection on root memcgs]
Link: http://lkml.kernel.org/r/ff3f915097fcee9f6d7041c084ef92d16aaeb56a.1594638158.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |   53 +++++++++++++++++++++++++++--------
 mm/memcontrol.c            |   28 ++++--------------
 mm/vmscan.c                |   17 ++---------
 3 files changed, 53 insertions(+), 45 deletions(-)

--- a/include/linux/memcontrol.h~mm-memcg-decouple-elowmin-state-mutations-from-protection-checks
+++ a/include/linux/memcontrol.h
@@ -55,12 +55,6 @@ enum memcg_memory_event {
 	MEMCG_NR_MEMORY_EVENTS,
 };
 
-enum mem_cgroup_protection {
-	MEMCG_PROT_NONE,
-	MEMCG_PROT_LOW,
-	MEMCG_PROT_MIN,
-};
-
 struct mem_cgroup_reclaim_cookie {
 	pg_data_t *pgdat;
 	unsigned int generation;
@@ -413,8 +407,36 @@ static inline unsigned long mem_cgroup_p
 		   READ_ONCE(memcg->memory.elow));
 }
 
-enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
-						struct mem_cgroup *memcg);
+void mem_cgroup_calculate_protection(struct mem_cgroup *root,
+				     struct mem_cgroup *memcg);
+
+static inline bool mem_cgroup_supports_protection(struct mem_cgroup *memcg)
+{
+	/*
+	 * The root memcg doesn't account charges, and doesn't support
+	 * protection.
+	 */
+	return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg);
+
+}
+
+static inline bool mem_cgroup_below_low(struct mem_cgroup *memcg)
+{
+	if (!mem_cgroup_supports_protection(memcg))
+		return false;
+
+	return READ_ONCE(memcg->memory.elow) >=
+		page_counter_read(&memcg->memory);
+}
+
+static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
+{
+	if (!mem_cgroup_supports_protection(memcg))
+		return false;
+
+	return READ_ONCE(memcg->memory.emin) >=
+		page_counter_read(&memcg->memory);
+}
 
 int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask);
 
@@ -943,10 +965,19 @@ static inline unsigned long mem_cgroup_p
 	return 0;
 }
 
-static inline enum mem_cgroup_protection mem_cgroup_protected(
-	struct mem_cgroup *root, struct mem_cgroup *memcg)
+static inline void mem_cgroup_calculate_protection(struct mem_cgroup *root,
+						   struct mem_cgroup *memcg)
+{
+}
+
+static inline bool mem_cgroup_below_low(struct mem_cgroup *memcg)
+{
+	return false;
+}
+
+static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 {
-	return MEMCG_PROT_NONE;
+	return false;
 }
 
 static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm,
--- a/mm/memcontrol.c~mm-memcg-decouple-elowmin-state-mutations-from-protection-checks
+++ a/mm/memcontrol.c
@@ -6577,21 +6577,15 @@ static unsigned long effective_protectio
  *
  * WARNING: This function is not stateless! It can only be used as part
  *          of a top-down tree iteration, not for isolated queries.
- *
- * Returns one of the following:
- *   MEMCG_PROT_NONE: cgroup memory is not protected
- *   MEMCG_PROT_LOW: cgroup memory is protected as long there is
- *     an unprotected supply of reclaimable memory from other cgroups.
- *   MEMCG_PROT_MIN: cgroup memory is protected
  */
-enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
-						struct mem_cgroup *memcg)
+void mem_cgroup_calculate_protection(struct mem_cgroup *root,
+				     struct mem_cgroup *memcg)
 {
 	unsigned long usage, parent_usage;
 	struct mem_cgroup *parent;
 
 	if (mem_cgroup_disabled())
-		return MEMCG_PROT_NONE;
+		return;
 
 	if (!root)
 		root = root_mem_cgroup;
@@ -6604,21 +6598,21 @@ enum mem_cgroup_protection mem_cgroup_pr
 	 * that special casing.
 	 */
 	if (memcg == root)
-		return MEMCG_PROT_NONE;
+		return;
 
 	usage = page_counter_read(&memcg->memory);
 	if (!usage)
-		return MEMCG_PROT_NONE;
+		return;
 
 	parent = parent_mem_cgroup(memcg);
 	/* No parent means a non-hierarchical mode on v1 memcg */
 	if (!parent)
-		return MEMCG_PROT_NONE;
+		return;
 
 	if (parent == root) {
 		memcg->memory.emin = READ_ONCE(memcg->memory.min);
 		memcg->memory.elow = READ_ONCE(memcg->memory.low);
-		goto out;
+		return;
 	}
 
 	parent_usage = page_counter_read(&parent->memory);
@@ -6632,14 +6626,6 @@ enum mem_cgroup_protection mem_cgroup_pr
 			READ_ONCE(memcg->memory.low),
 			READ_ONCE(parent->memory.elow),
 			atomic_long_read(&parent->memory.children_low_usage)));
-
-out:
-	if (usage <= memcg->memory.emin)
-		return MEMCG_PROT_MIN;
-	else if (usage <= memcg->memory.elow)
-		return MEMCG_PROT_LOW;
-	else
-		return MEMCG_PROT_NONE;
 }
 
 /**
--- a/mm/vmscan.c~mm-memcg-decouple-elowmin-state-mutations-from-protection-checks
+++ a/mm/vmscan.c
@@ -2620,14 +2620,15 @@ static void shrink_node_memcgs(pg_data_t
 		unsigned long reclaimed;
 		unsigned long scanned;
 
-		switch (mem_cgroup_protected(target_memcg, memcg)) {
-		case MEMCG_PROT_MIN:
+		mem_cgroup_calculate_protection(target_memcg, memcg);
+
+		if (mem_cgroup_below_min(memcg)) {
 			/*
 			 * Hard protection.
 			 * If there is no reclaimable memory, OOM.
 			 */
 			continue;
-		case MEMCG_PROT_LOW:
+		} else if (mem_cgroup_below_low(memcg)) {
 			/*
 			 * Soft protection.
 			 * Respect the protection only as long as
@@ -2639,16 +2640,6 @@ static void shrink_node_memcgs(pg_data_t
 				continue;
 			}
 			memcg_memory_event(memcg, MEMCG_LOW);
-			break;
-		case MEMCG_PROT_NONE:
-			/*
-			 * All protection thresholds breached. We may
-			 * still choose to vary the scan pressure
-			 * applied based on by how much the cgroup in
-			 * question has exceeded its protection
-			 * thresholds (see get_scan_count).
-			 */
-			break;
 		}
 
 		reclaimed = sc->nr_reclaimed;
_

Patches currently in -mm which might be from chris@chrisdown.name are

tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch
mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch
mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + scripts-deprecated_terms-sync-with-inclusive-terms.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (131 preceding siblings ...)
  2020-07-14  1:03 ` + mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch " Andrew Morton
@ 2020-07-14  1:10 ` Andrew Morton
  2020-07-14  1:21 ` + mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
                   ` (99 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:10 UTC (permalink / raw)
  To: apw, colin.king, corbet, dan.j.williams, gregkh, joe, jslaby,
	mishi, mm-commits, sj38.park, sjpark, skhan


The patch titled
     Subject: scripts/deprecated_terms: sync with inclusive terms
has been added to the -mm tree.  Its filename is
     scripts-deprecated_terms-sync-with-inclusive-terms.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/scripts-deprecated_terms-sync-with-inclusive-terms.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/scripts-deprecated_terms-sync-with-inclusive-terms.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: SeongJae Park <sjpark@amazon.de>
Subject: scripts/deprecated_terms: sync with inclusive terms

# NOTE: this patch is based on next/master, as it is a follow-up to
# commit 0d8b43e5876a ("scripts/deprecated_terms: recommend
# denylist/allowlist instead of blacklist/whitelist"), which is merged in
# the -next tree only.

Commit a5f526ecb075 ("CodingStyle: Inclusive Terminology") introduced
more terms to be deprecated, along with more alternatives.  This commit
updates 'deprecated_terms.txt' to sync with it.

Link: http://lkml.kernel.org/r/20200713071912.24432-1-sjpark@amazon.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: SeongJae Park <sj38.park@gmail.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mishi Choudhary <mishi@linux.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/deprecated_terms.txt |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/scripts/deprecated_terms.txt~scripts-deprecated_terms-sync-with-inclusive-terms
+++ a/scripts/deprecated_terms.txt
@@ -3,5 +3,8 @@
 # The format of each line is:
 # deprecated||suggested
 #
-blacklist||denylist
-whitelist||allowlist
+blacklist||(denylist|blocklist)
+# For other alternatives of 'slave', Please refer to
+# Documentation/process/coding-style.rst
+slave||(secondary|target|...)
+whitelist||(allowlist|passlist)
_

Patches currently in -mm which might be from sjpark@amazon.de are

checkpatch-support-deprecated-terms-checking.patch
scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
scripts-deprecated_terms-sync-with-inclusive-terms.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-page_isolation-prefer-the-node-of-the-source-page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (132 preceding siblings ...)
  2020-07-14  1:10 ` + scripts-deprecated_terms-sync-with-inclusive-terms.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
                   ` (98 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/page_isolation: prefer the node of the source page
has been added to the -mm tree.  Its filename is
     mm-page_isolation-prefer-the-node-of-the-source-page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_isolation-prefer-the-node-of-the-source-page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_isolation-prefer-the-node-of-the-source-page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_isolation: prefer the node of the source page

Patch series "clean-up the migration target allocation functions", v5.


This patch (of 9):

For locality, it's better to migrate the page to the same node rather than
the node of the current caller's cpu.

Link: http://lkml.kernel.org/r/1594622517-20681-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1594622517-20681-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_isolation.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/page_isolation.c~mm-page_isolation-prefer-the-node-of-the-source-page
+++ a/mm/page_isolation.c
@@ -309,5 +309,7 @@ int test_pages_isolated(unsigned long st
 
 struct page *alloc_migrate_target(struct page *page, unsigned long private)
 {
-	return new_page_nodemask(page, numa_node_id(), &node_states[N_MEMORY]);
+	int nid = page_to_nid(page);
+
+	return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-move-migration-helper-from-h-to-c.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (133 preceding siblings ...)
  2020-07-14  1:21 ` + mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
                   ` (97 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/migrate: move migration helper from .h to .c
has been added to the -mm tree.  Its filename is
     mm-migrate-move-migration-helper-from-h-to-c.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-move-migration-helper-from-h-to-c.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-move-migration-helper-from-h-to-c.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: move migration helper from .h to .c

It's not a performance-sensitive function, so move it to .c.  This is a
preparation step for a future change.

Link: http://lkml.kernel.org/r/1594622517-20681-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/migrate.h |   33 +++++----------------------------
 mm/migrate.c            |   29 +++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 28 deletions(-)

--- a/include/linux/migrate.h~mm-migrate-move-migration-helper-from-h-to-c
+++ a/include/linux/migrate.h
@@ -31,34 +31,6 @@ enum migrate_reason {
 /* In mm/debug.c; also keep sync with include/trace/events/migrate.h */
 extern const char *migrate_reason_names[MR_TYPES];
 
-static inline struct page *new_page_nodemask(struct page *page,
-				int preferred_nid, nodemask_t *nodemask)
-{
-	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
-	unsigned int order = 0;
-	struct page *new_page = NULL;
-
-	if (PageHuge(page))
-		return alloc_huge_page_nodemask(page_hstate(compound_head(page)),
-				preferred_nid, nodemask);
-
-	if (PageTransHuge(page)) {
-		gfp_mask |= GFP_TRANSHUGE;
-		order = HPAGE_PMD_ORDER;
-	}
-
-	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
-		gfp_mask |= __GFP_HIGHMEM;
-
-	new_page = __alloc_pages_nodemask(gfp_mask, order,
-				preferred_nid, nodemask);
-
-	if (new_page && PageTransHuge(new_page))
-		prep_transhuge_page(new_page);
-
-	return new_page;
-}
-
 #ifdef CONFIG_MIGRATION
 
 extern void putback_movable_pages(struct list_head *l);
@@ -67,6 +39,8 @@ extern int migrate_page(struct address_s
 			enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
+extern struct page *new_page_nodemask(struct page *page,
+		int preferred_nid, nodemask_t *nodemask);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
@@ -85,6 +59,9 @@ static inline int migrate_pages(struct l
 		free_page_t free, unsigned long private, enum migrate_mode mode,
 		int reason)
 	{ return -ENOSYS; }
+static inline struct page *new_page_nodemask(struct page *page,
+		int preferred_nid, nodemask_t *nodemask)
+	{ return NULL; }
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	{ return -EBUSY; }
 
--- a/mm/migrate.c~mm-migrate-move-migration-helper-from-h-to-c
+++ a/mm/migrate.c
@@ -1534,6 +1534,35 @@ out:
 	return rc;
 }
 
+struct page *new_page_nodemask(struct page *page,
+				int preferred_nid, nodemask_t *nodemask)
+{
+	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+	unsigned int order = 0;
+	struct page *new_page = NULL;
+
+	if (PageHuge(page))
+		return alloc_huge_page_nodemask(
+				page_hstate(compound_head(page)),
+				preferred_nid, nodemask);
+
+	if (PageTransHuge(page)) {
+		gfp_mask |= GFP_TRANSHUGE;
+		order = HPAGE_PMD_ORDER;
+	}
+
+	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	new_page = __alloc_pages_nodemask(gfp_mask, order,
+				preferred_nid, nodemask);
+
+	if (new_page && PageTransHuge(new_page))
+		prep_transhuge_page(new_page);
+
+	return new_page;
+}
+
 #ifdef CONFIG_NUMA
 
 static int store_status(int __user *status, int start, int value, int nr)
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hugetlb-unify-migration-callbacks.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (134 preceding siblings ...)
  2020-07-14  1:21 ` + mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch " Andrew Morton
                   ` (96 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/hugetlb: unify migration callbacks
has been added to the -mm tree.  Its filename is
     mm-hugetlb-unify-migration-callbacks.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-unify-migration-callbacks.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-unify-migration-callbacks.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/hugetlb: unify migration callbacks

There is no difference between the two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except the
__GFP_THISNODE handling.  It's redundant to have two almost identical
functions just to handle this flag, so this patch removes one by
introducing a new argument, gfp_mask, to alloc_huge_page_nodemask().

With the gfp_mask argument introduced, it is the caller's job to provide
the correct gfp_mask, so every callsite of alloc_huge_page_nodemask() is
changed to provide one.

Note that it's safe to remove the node id check in alloc_huge_page_node()
since no caller passes NUMA_NO_NODE as a node id.
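
As a sketch, a caller that previously used alloc_huge_page_node() for a
node-bound allocation now builds the mask itself (this mirrors the
mm/mempolicy.c hunk below):

	struct hstate *h = page_hstate(compound_head(page));
	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;

	return alloc_huge_page_nodemask(h, node, NULL, gfp_mask);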

Link: http://lkml.kernel.org/r/1594622517-20681-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |   26 ++++++++++++++++++--------
 mm/hugetlb.c            |   35 ++---------------------------------
 mm/mempolicy.c          |   10 ++++++----
 mm/migrate.c            |   11 +++++++----
 4 files changed, 33 insertions(+), 49 deletions(-)

--- a/include/linux/hugetlb.h~mm-hugetlb-unify-migration-callbacks
+++ a/include/linux/hugetlb.h
@@ -10,6 +10,7 @@
 #include <linux/list.h>
 #include <linux/kref.h>
 #include <linux/pgtable.h>
+#include <linux/gfp.h>
 
 struct ctl_table;
 struct user_struct;
@@ -504,9 +505,8 @@ struct huge_bootmem_page {
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
-struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-				nodemask_t *nmask);
+				nodemask_t *nmask, gfp_t gfp_mask);
 struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
 struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
@@ -692,6 +692,15 @@ static inline bool hugepage_movable_supp
 	return true;
 }
 
+/* Movability of hugepages depends on migration support. */
+static inline gfp_t htlb_alloc_mask(struct hstate *h)
+{
+	if (hugepage_movable_supported(h))
+		return GFP_HIGHUSER_MOVABLE;
+	else
+		return GFP_HIGHUSER;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					   struct mm_struct *mm, pte_t *pte)
 {
@@ -759,13 +768,9 @@ static inline struct page *alloc_huge_pa
 	return NULL;
 }
 
-static inline struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
-	return NULL;
-}
-
 static inline struct page *
-alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, nodemask_t *nmask)
+alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
+			nodemask_t *nmask, gfp_t gfp_mask)
 {
 	return NULL;
 }
@@ -878,6 +883,11 @@ static inline bool hugepage_movable_supp
 	return false;
 }
 
+static inline gfp_t htlb_alloc_mask(struct hstate *h)
+{
+	return 0;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					   struct mm_struct *mm, pte_t *pte)
 {
--- a/mm/hugetlb.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/hugetlb.c
@@ -1091,15 +1091,6 @@ retry_cpuset:
 	return NULL;
 }
 
-/* Movability of hugepages depends on migration support. */
-static inline gfp_t htlb_alloc_mask(struct hstate *h)
-{
-	if (hugepage_movable_supported(h))
-		return GFP_HIGHUSER_MOVABLE;
-	else
-		return GFP_HIGHUSER;
-}
-
 static struct page *dequeue_huge_page_vma(struct hstate *h,
 				struct vm_area_struct *vma,
 				unsigned long address, int avoid_reserve,
@@ -1981,31 +1972,9 @@ struct page *alloc_buddy_huge_page_with_
 }
 
 /* page migration callback function */
-struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
-	gfp_t gfp_mask = htlb_alloc_mask(h);
-	struct page *page = NULL;
-
-	if (nid != NUMA_NO_NODE)
-		gfp_mask |= __GFP_THISNODE;
-
-	spin_lock(&hugetlb_lock);
-	if (h->free_huge_pages - h->resv_huge_pages > 0)
-		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
-	spin_unlock(&hugetlb_lock);
-
-	if (!page)
-		page = alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
-
-	return page;
-}
-
-/* page migration callback function */
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
-		nodemask_t *nmask)
+		nodemask_t *nmask, gfp_t gfp_mask)
 {
-	gfp_t gfp_mask = htlb_alloc_mask(h);
-
 	spin_lock(&hugetlb_lock);
 	if (h->free_huge_pages - h->resv_huge_pages > 0) {
 		struct page *page;
@@ -2033,7 +2002,7 @@ struct page *alloc_huge_page_vma(struct
 
 	gfp_mask = htlb_alloc_mask(h);
 	node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = alloc_huge_page_nodemask(h, node, nodemask);
+	page = alloc_huge_page_nodemask(h, node, nodemask, gfp_mask);
 	mpol_cond_put(mpol);
 
 	return page;
--- a/mm/mempolicy.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/mempolicy.c
@@ -1068,10 +1068,12 @@ static int migrate_page_add(struct page
 /* page allocation callback for NUMA node migration */
 struct page *alloc_new_node_page(struct page *page, unsigned long node)
 {
-	if (PageHuge(page))
-		return alloc_huge_page_node(page_hstate(compound_head(page)),
-					node);
-	else if (PageTransHuge(page)) {
+	if (PageHuge(page)) {
+		struct hstate *h = page_hstate(compound_head(page));
+		gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
+
+		return alloc_huge_page_nodemask(h, node, NULL, gfp_mask);
+	} else if (PageTransHuge(page)) {
 		struct page *thp;
 
 		thp = alloc_pages_node(node,
--- a/mm/migrate.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/migrate.c
@@ -1541,10 +1541,13 @@ struct page *new_page_nodemask(struct pa
 	unsigned int order = 0;
 	struct page *new_page = NULL;
 
-	if (PageHuge(page))
-		return alloc_huge_page_nodemask(
-				page_hstate(compound_head(page)),
-				preferred_nid, nodemask);
+	if (PageHuge(page)) {
+		struct hstate *h = page_hstate(compound_head(page));
+
+		gfp_mask = htlb_alloc_mask(h);
+		return alloc_huge_page_nodemask(h, preferred_nid,
+						nodemask, gfp_mask);
+	}
 
 	if (PageTransHuge(page)) {
 		gfp_mask |= GFP_TRANSHUGE;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (135 preceding siblings ...)
  2020-07-14  1:21 ` + mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch " Andrew Morton
                   ` (95 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations
has been added to the -mm tree.  Its filename is
     mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations

new_page_nodemask is a migration callback and it tries to use common gfp
flags for the target page allocation whether it is a base page or a THP.
The latter only adds GFP_TRANSHUGE to the given mask.  This results in the
allocation being slightly more aggressive than necessary because the
resulting gfp mask will also contain __GFP_RECLAIM_KSWAPD.  THP
allocations usually exclude this flag to reduce over-eager background
reclaim during a high THP allocation load, which has been seen during
large mmap initialization.  There is no indication that this is a problem
for migration as well, but theoretically the same might happen when
migrating large mappings to a different node.  Make the migration callback
consistent with regular THP allocations.
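
In gfp-mask terms, the change amounts to the following (a condensed view
of the hunk below):

	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;

	if (PageTransHuge(page)) {
		/* drop the __GFP_RECLAIM bits (kswapd + direct reclaim)... */
		gfp_mask &= ~__GFP_RECLAIM;
		/* ...so GFP_TRANSHUGE adds only the reclaim behaviour THP wants */
		gfp_mask |= GFP_TRANSHUGE;
	}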

Link: http://lkml.kernel.org/r/1594622517-20681-5-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/mm/migrate.c~mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations
+++ a/mm/migrate.c
@@ -1550,6 +1550,11 @@ struct page *new_page_nodemask(struct pa
 	}
 
 	if (PageTransHuge(page)) {
+		/*
+		 * clear __GFP_RECALIM to make the migration callback
+		 * consistent with regular THP allocations.
+		 */
+		gfp_mask &= ~__GFP_RECLAIM;
 		gfp_mask |= GFP_TRANSHUGE;
 		order = HPAGE_PMD_ORDER;
 	}
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (136 preceding siblings ...)
  2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
                   ` (94 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: akpm, iamjoonsoo.kim, mhocko, mm-commits, vbabka


The patch titled
     Subject: mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix
has been added to the -mm tree.  Its filename is
     mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix

fix comment typo, per Vlastimil

Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/migrate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/migrate.c~mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix
+++ a/mm/migrate.c
@@ -1551,7 +1551,7 @@ struct page *new_page_nodemask(struct pa
 
 	if (PageTransHuge(page)) {
 		/*
-		 * clear __GFP_RECALIM to make the migration callback
+		 * clear __GFP_RECLAIM to make the migration callback
 		 * consistent with regular THP allocations.
 		 */
 		gfp_mask &= ~__GFP_RECLAIM;
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-migrate-make-a-standard-migration-target-allocation-function.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (137 preceding siblings ...)
  2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
                   ` (93 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/migrate: introduce a standard migration target allocation function
has been added to the -mm tree.  Its filename is
     mm-migrate-make-a-standard-migration-target-allocation-function.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-make-a-standard-migration-target-allocation-function.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-make-a-standard-migration-target-allocation-function.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: introduce a standard migration target allocation function

There are some similar functions for migration target allocation.  Since
there is no fundamental difference, it's better to keep just one rather
than keeping all variants.  This patch implements a base migration target
allocation function.  In the following patches, the variants will be
converted to use this function.

Changes should be mechanical but, unfortunately, there are some
differences.  First, some callers' nodemask is assigned to NULL, since a
NULL nodemask will be considered as all available nodes, that is,
&node_states[N_MEMORY].  Second, for hugetlb page allocation, gfp_mask is
redefined as the regular hugetlb allocation gfp_mask plus __GFP_THISNODE
if the user-provided gfp_mask has it.  This is because a future caller of
this function requires this node constraint to be set.  Lastly, if the
provided nodeid is NUMA_NO_NODE, nodeid is set to the node where the
migration source lives.  This helps to remove simple wrappers for setting
up the nodeid.

Note that PageHighmem() call in previous function is changed to open-code
"is_highmem_idx()" since it provides more readability.

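For orientation, the sketch below is illustrative only and not part of the
patch (example_alloc is a hypothetical name); it shows how a caller uses
the new interface.  The field values mirror the hunks that follow, and the
control block travels through migrate_pages()'s opaque 'private' argument:

	static struct page *example_alloc(struct page *page)
	{
		struct migration_target_control mtc = {
			.nid = NUMA_NO_NODE,	/* resolved to the source page's node */
			.nmask = NULL,		/* NULL means &node_states[N_MEMORY] */
			.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
		};

		return alloc_migration_target(page, (unsigned long)&mtc);
	}
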
[akpm@linux-foundation.org: tweak patch title, per Vlastimil]
Link: http://lkml.kernel.org/r/1594622517-20681-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |   15 +++++++++++++++
 include/linux/migrate.h |    9 +++++----
 mm/internal.h           |    7 +++++++
 mm/memory-failure.c     |    7 +++++--
 mm/memory_hotplug.c     |   12 ++++++++----
 mm/migrate.c            |   26 ++++++++++++++++----------
 mm/page_isolation.c     |    7 +++++--
 7 files changed, 61 insertions(+), 22 deletions(-)

--- a/include/linux/hugetlb.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/include/linux/hugetlb.h
@@ -701,6 +701,16 @@ static inline gfp_t htlb_alloc_mask(stru
 		return GFP_HIGHUSER;
 }
 
+static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
+{
+	gfp_t modified_mask = htlb_alloc_mask(h);
+
+	/* Some callers might want to enforce node */
+	modified_mask |= (gfp_mask & __GFP_THISNODE);
+
+	return modified_mask;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					   struct mm_struct *mm, pte_t *pte)
 {
@@ -887,6 +897,11 @@ static inline gfp_t htlb_alloc_mask(stru
 {
 	return 0;
 }
+
+static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
+{
+	return 0;
+}
 
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					   struct mm_struct *mm, pte_t *pte)
--- a/include/linux/migrate.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/include/linux/migrate.h
@@ -10,6 +10,8 @@
 typedef struct page *new_page_t(struct page *page, unsigned long private);
 typedef void free_page_t(struct page *page, unsigned long private);
 
+struct migration_target_control;
+
 /*
  * Return values from addresss_space_operations.migratepage():
  * - negative errno on page migration failure;
@@ -39,8 +41,7 @@ extern int migrate_page(struct address_s
 			enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
-extern struct page *new_page_nodemask(struct page *page,
-		int preferred_nid, nodemask_t *nodemask);
+extern struct page *alloc_migration_target(struct page *page, unsigned long private);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
@@ -59,8 +60,8 @@ static inline int migrate_pages(struct l
 		free_page_t free, unsigned long private, enum migrate_mode mode,
 		int reason)
 	{ return -ENOSYS; }
-static inline struct page *new_page_nodemask(struct page *page,
-		int preferred_nid, nodemask_t *nodemask)
+static inline struct page *alloc_migration_target(struct page *page,
+		unsigned long private)
 	{ return NULL; }
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	{ return -EBUSY; }
--- a/mm/internal.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/internal.h
@@ -614,4 +614,11 @@ static inline bool is_migrate_highatomic
 
 void setup_zone_pageset(struct zone *zone);
 extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
+
+struct migration_target_control {
+	int nid;		/* preferred node id */
+	nodemask_t *nmask;
+	gfp_t gfp_mask;
+};
+
 #endif	/* __MM_INTERNAL_H */
--- a/mm/memory-failure.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory-failure.c
@@ -1679,9 +1679,12 @@ EXPORT_SYMBOL(unpoison_memory);
 
 static struct page *new_page(struct page *p, unsigned long private)
 {
-	int nid = page_to_nid(p);
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(p),
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
-	return new_page_nodemask(p, nid, &node_states[N_MEMORY]);
+	return alloc_migration_target(p, (unsigned long)&mtc);
 }
 
 /*
--- a/mm/memory_hotplug.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory_hotplug.c
@@ -1277,19 +1277,23 @@ found:
 
 static struct page *new_node_page(struct page *page, unsigned long private)
 {
-	int nid = page_to_nid(page);
 	nodemask_t nmask = node_states[N_MEMORY];
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(page),
+		.nmask = &nmask,
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
 	/*
 	 * try to allocate from a different node but reuse this node if there
 	 * are no other online nodes to be used (e.g. we are offlining a part
 	 * of the only existing node)
 	 */
-	node_clear(nid, nmask);
+	node_clear(mtc.nid, nmask);
 	if (nodes_empty(nmask))
-		node_set(nid, nmask);
+		node_set(mtc.nid, nmask);
 
-	return new_page_nodemask(page, nid, &nmask);
+	return alloc_migration_target(page, (unsigned long)&mtc);
 }
 
 static int
--- a/mm/migrate.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/migrate.c
@@ -1534,19 +1534,26 @@ out:
 	return rc;
 }
 
-struct page *new_page_nodemask(struct page *page,
-				int preferred_nid, nodemask_t *nodemask)
+struct page *alloc_migration_target(struct page *page, unsigned long private)
 {
-	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+	struct migration_target_control *mtc;
+	gfp_t gfp_mask;
 	unsigned int order = 0;
 	struct page *new_page = NULL;
+	int nid;
+	int zidx;
+
+	mtc = (struct migration_target_control *)private;
+	gfp_mask = mtc->gfp_mask;
+	nid = mtc->nid;
+	if (nid == NUMA_NO_NODE)
+		nid = page_to_nid(page);
 
 	if (PageHuge(page)) {
 		struct hstate *h = page_hstate(compound_head(page));
 
-		gfp_mask = htlb_alloc_mask(h);
-		return alloc_huge_page_nodemask(h, preferred_nid,
-						nodemask, gfp_mask);
+		gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
+		return alloc_huge_page_nodemask(h, nid, mtc->nmask, gfp_mask);
 	}
 
 	if (PageTransHuge(page)) {
@@ -1558,12 +1565,11 @@ struct page *new_page_nodemask(struct pa
 		gfp_mask |= GFP_TRANSHUGE;
 		order = HPAGE_PMD_ORDER;
 	}
-
-	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+	zidx = zone_idx(page_zone(page));
+	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
 		gfp_mask |= __GFP_HIGHMEM;
 
-	new_page = __alloc_pages_nodemask(gfp_mask, order,
-				preferred_nid, nodemask);
+	new_page = __alloc_pages_nodemask(gfp_mask, order, nid, mtc->nmask);
 
 	if (new_page && PageTransHuge(new_page))
 		prep_transhuge_page(new_page);
--- a/mm/page_isolation.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/page_isolation.c
@@ -309,7 +309,10 @@ int test_pages_isolated(unsigned long st
 
 struct page *alloc_migrate_target(struct page *page, unsigned long private)
 {
-	int nid = page_to_nid(page);
+	struct migration_target_control mtc = {
+		.nid = page_to_nid(page),
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
-	return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
+	return alloc_migration_target(page, (unsigned long)&mtc);
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (138 preceding siblings ...)
  2020-07-14  1:21 ` + mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
                   ` (92 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/mempolicy: use a standard migration target allocation callback
has been added to the -mm tree.  Its filename is
     mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/mempolicy: use a standard migration target allocation callback

There is a well-defined migration target allocation callback.  Use it.  The
removed alloc_new_node_page() duplicated the hugetlb and THP special cases
that alloc_migration_target() already handles, and the node binding is
preserved by passing __GFP_THISNODE through the control block.

Link: http://lkml.kernel.org/r/1594622517-20681-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/internal.h  |    1 -
 mm/mempolicy.c |   31 ++++++-------------------------
 mm/migrate.c   |    8 ++++++--
 3 files changed, 12 insertions(+), 28 deletions(-)

--- a/mm/internal.h~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/internal.h
@@ -613,7 +613,6 @@ static inline bool is_migrate_highatomic
 }
 
 void setup_zone_pageset(struct zone *zone);
-extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
 
 struct migration_target_control {
 	int nid;		/* preferred node id */
--- a/mm/mempolicy.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/mempolicy.c
@@ -1065,29 +1065,6 @@ static int migrate_page_add(struct page
 	return 0;
 }
 
-/* page allocation callback for NUMA node migration */
-struct page *alloc_new_node_page(struct page *page, unsigned long node)
-{
-	if (PageHuge(page)) {
-		struct hstate *h = page_hstate(compound_head(page));
-		gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
-
-		return alloc_huge_page_nodemask(h, node, NULL, gfp_mask);
-	} else if (PageTransHuge(page)) {
-		struct page *thp;
-
-		thp = alloc_pages_node(node,
-			(GFP_TRANSHUGE | __GFP_THISNODE),
-			HPAGE_PMD_ORDER);
-		if (!thp)
-			return NULL;
-		prep_transhuge_page(thp);
-		return thp;
-	} else
-		return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE |
-						    __GFP_THISNODE, 0);
-}
-
 /*
  * Migrate pages from one node to a target node.
  * Returns error or the number of pages not migrated.
@@ -1098,6 +1075,10 @@ static int migrate_to_node(struct mm_str
 	nodemask_t nmask;
 	LIST_HEAD(pagelist);
 	int err = 0;
+	struct migration_target_control mtc = {
+		.nid = dest,
+		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+	};
 
 	nodes_clear(nmask);
 	node_set(source, nmask);
@@ -1112,8 +1093,8 @@ static int migrate_to_node(struct mm_str
 			flags | MPOL_MF_DISCONTIG_OK, &pagelist);
 
 	if (!list_empty(&pagelist)) {
-		err = migrate_pages(&pagelist, alloc_new_node_page, NULL, dest,
-					MIGRATE_SYNC, MR_SYSCALL);
+		err = migrate_pages(&pagelist, alloc_migration_target, NULL,
+				(unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
 		if (err)
 			putback_movable_pages(&pagelist);
 	}
--- a/mm/migrate.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/migrate.c
@@ -1594,9 +1594,13 @@ static int do_move_pages_to_node(struct
 		struct list_head *pagelist, int node)
 {
 	int err;
+	struct migration_target_control mtc = {
+		.nid = node,
+		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+	};
 
-	err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
-			MIGRATE_SYNC, MR_SYSCALL);
+	err = migrate_pages(pagelist, alloc_migration_target, NULL,
+			(unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
 	if (err)
 		putback_movable_pages(pagelist);
 	return err;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (139 preceding siblings ...)
  2020-07-14  1:21 ` + mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:21 ` + mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
                   ` (91 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()
has been added to the -mm tree.  Its filename is
     mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()

There is a well-defined standard migration target callback.  Use it
directly.  The preferred node can be taken from cc->zone once, outside the
migration loop, because an alloc_contig range lies within a single zone.

Link: http://lkml.kernel.org/r/1594622517-20681-8-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c     |    8 ++++++--
 mm/page_isolation.c |   10 ----------
 2 files changed, 6 insertions(+), 12 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_alloc.c
@@ -8358,6 +8358,10 @@ static int __alloc_contig_migrate_range(
 	unsigned long pfn = start;
 	unsigned int tries = 0;
 	int ret = 0;
+	struct migration_target_control mtc = {
+		.nid = zone_to_nid(cc->zone),
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
 	migrate_prep();
 
@@ -8384,8 +8388,8 @@ static int __alloc_contig_migrate_range(
 							&cc->migratepages);
 		cc->nr_migratepages -= nr_reclaimed;
 
-		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
-				    NULL, 0, cc->mode, MR_CONTIG_RANGE);
+		ret = migrate_pages(&cc->migratepages, alloc_migration_target,
+				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
 	}
 	if (ret < 0) {
 		putback_movable_pages(&cc->migratepages);
--- a/mm/page_isolation.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_isolation.c
@@ -306,13 +306,3 @@ int test_pages_isolated(unsigned long st
 
 	return pfn < end_pfn ? -EBUSY : 0;
 }
-
-struct page *alloc_migrate_target(struct page *page, unsigned long private)
-{
-	struct migration_target_control mtc = {
-		.nid = page_to_nid(page),
-		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
-	};
-
-	return alloc_migration_target(page, (unsigned long)&mtc);
-}
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (140 preceding siblings ...)
  2020-07-14  1:21 ` + mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
@ 2020-07-14  1:21 ` Andrew Morton
  2020-07-14  1:22 ` + mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
                   ` (90 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:21 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/memory-failure: remove a wrapper for alloc_migration_target()
has been added to the -mm tree.  Its filename is
     mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/memory-failure: remove a wrapper for alloc_migration_target()

There is a well-defined standard migration target callback.  Use it
directly.  Note that the new control block sets .nid = NUMA_NO_NODE, which
alloc_migration_target() resolves to the node of the source page; this is
the same node the removed new_page() wrapper computed explicitly.

Link: http://lkml.kernel.org/r/1594622517-20681-9-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

--- a/mm/memory-failure.c~mm-memory-failure-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/memory-failure.c
@@ -1677,16 +1677,6 @@ int unpoison_memory(unsigned long pfn)
 }
 EXPORT_SYMBOL(unpoison_memory);
 
-static struct page *new_page(struct page *p, unsigned long private)
-{
-	struct migration_target_control mtc = {
-		.nid = page_to_nid(p),
-		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
-	};
-
-	return alloc_migration_target(p, (unsigned long)&mtc);
-}
-
 /*
  * Safely get reference count of an arbitrary page.
  * Returns 0 for a free page, -EIO for a zero refcount page
@@ -1793,6 +1783,10 @@ static int __soft_offline_page(struct pa
 	const char *msg_page[] = {"page", "hugepage"};
 	bool huge = PageHuge(page);
 	LIST_HEAD(pagelist);
+	struct migration_target_control mtc = {
+		.nid = NUMA_NO_NODE,
+		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+	};
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1829,8 +1823,8 @@ static int __soft_offline_page(struct pa
 	}
 
 	if (isolate_page(hpage, &pagelist)) {
-		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-					MIGRATE_SYNC, MR_MEMORY_FAILURE);
+		ret = migrate_pages(&pagelist, alloc_migration_target, NULL,
+			(unsigned long)&mtc, MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
 			bool release = !huge;
 
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (141 preceding siblings ...)
  2020-07-14  1:21 ` + mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
@ 2020-07-14  1:22 ` Andrew Morton
  2020-07-14  1:30 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch " Andrew Morton
                   ` (89 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:22 UTC (permalink / raw)
  To: guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz, mm-commits,
	n-horiguchi, vbabka


The patch titled
     Subject: mm/memory_hotplug: remove a wrapper for alloc_migration_target()
has been added to the -mm tree.  Its filename is
     mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/memory_hotplug: remove a wrapper for alloc_migration_target()

To calculate the correct node to which a page should be migrated for
hotplug, we need to check the node id of the page.  The wrapper around
alloc_migration_target() exists for this purpose.

However, Vlastimil points out that all migration source pages come from a
single node.  In this case, we don't need to check the node id for each
page, and we don't need to re-set the target nodemask for each page via
the wrapper.  Set up the migration_target_control once and use it for all
pages.
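
The node-fallback logic itself is unchanged; as a standalone sketch
(illustrative, not part of the patch), it reads:

	nodemask_t nmask = node_states[N_MEMORY];
	int nid = page_to_nid(page);	/* 'page': first page on the source list */

	node_clear(nid, nmask);		/* prefer any other node with memory */
	if (nodes_empty(nmask))		/* ...unless it is the only one left */
		node_set(nid, nmask);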

Link: http://lkml.kernel.org/r/1594622517-20681-10-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory_hotplug.c |   46 ++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/memory_hotplug.c
@@ -1275,27 +1275,6 @@ found:
 	return 0;
 }
 
-static struct page *new_node_page(struct page *page, unsigned long private)
-{
-	nodemask_t nmask = node_states[N_MEMORY];
-	struct migration_target_control mtc = {
-		.nid = page_to_nid(page),
-		.nmask = &nmask,
-		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
-	};
-
-	/*
-	 * try to allocate from a different node but reuse this node if there
-	 * are no other online nodes to be used (e.g. we are offlining a part
-	 * of the only existing node)
-	 */
-	node_clear(mtc.nid, nmask);
-	if (nodes_empty(nmask))
-		node_set(mtc.nid, nmask);
-
-	return alloc_migration_target(page, (unsigned long)&mtc);
-}
-
 static int
 do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 {
@@ -1355,9 +1334,28 @@ do_migrate_range(unsigned long start_pfn
 		put_page(page);
 	}
 	if (!list_empty(&source)) {
-		/* Allocate a new page from the nearest neighbor node */
-		ret = migrate_pages(&source, new_node_page, NULL, 0,
-					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
+		nodemask_t nmask = node_states[N_MEMORY];
+		struct migration_target_control mtc = {
+			.nmask = &nmask,
+			.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		};
+
+		/*
+		 * We have checked that the migration range is in a single zone
+		 * so we can use the nid of the first page for all the others.
+		 */
+		mtc.nid = page_to_nid(list_first_entry(&source, struct page, lru));
+
+		/*
+		 * try to allocate from a different node but reuse this node
+		 * if there are no other online nodes to be used (e.g. we are
+		 * offlining a part of the only existing node)
+		 */
+		node_clear(mtc.nid, nmask);
+		if (nodes_empty(nmask))
+			node_set(mtc.nid, nmask);
+		ret = migrate_pages(&source, alloc_migration_target, NULL,
+			(unsigned long)&mtc, MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
 		if (ret) {
 			list_for_each_entry(page, &source, lru) {
 				pr_warn("migrating pfn %lx failed ret:%d ",
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (142 preceding siblings ...)
  2020-07-14  1:22 ` + mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
@ 2020-07-14  1:30 ` Andrew Morton
  2020-07-14  1:30 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch " Andrew Morton
                   ` (88 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:30 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5
has been added to the -mm tree.  Its filename is
     mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5

drop RANDOM_ORVALUE from hugetlb_advanced_tests()

Link: http://lkml.kernel.org/r/1594610587-4172-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>	[arc]
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug_vm_pgtable.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5
+++ a/mm/debug_vm_pgtable.c
@@ -745,7 +745,7 @@ static void __init hugetlb_advanced_test
 {
 	struct page *page = pfn_to_page(pfn);
 	pte_t pte = ptep_get(ptep);
-	unsigned long paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
+	unsigned long paddr = __pfn_to_phys(pfn) & PMD_MASK;
 
 	pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));
 	set_huge_pte_at(mm, vaddr, ptep, pte);
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (143 preceding siblings ...)
  2020-07-14  1:30 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch " Andrew Morton
@ 2020-07-14  1:30 ` Andrew Morton
  2020-07-14  1:37 ` + mm-sparse-cleanup-the-code-surrounding-memory_present.patch " Andrew Morton
                   ` (87 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:30 UTC (permalink / raw)
  To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
	corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
	palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
	ziy


The patch titled
     Subject: documentation-mm-add-descriptions-for-arch-page-table-helpers-v5
has been added to the -mm tree.  Its filename is
     documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: documentation-mm-add-descriptions-for-arch-page-table-helpers-v5

fold in Mike's patch for the rst document, fix typos in the rst document

Link: http://lkml.kernel.org/r/1594610587-4172-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Mike Rapoport <rppt@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/arch_pgtable_helpers.rst |  232 ++++++++++----------
 1 file changed, 116 insertions(+), 116 deletions(-)

--- a/Documentation/vm/arch_pgtable_helpers.rst~documentation-mm-add-descriptions-for-arch-page-table-helpers-v5
+++ a/Documentation/vm/arch_pgtable_helpers.rst
@@ -17,242 +17,242 @@ test need to be in sync.
 PTE Page Table Helpers
 ======================
 
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_same                  | Tests whether both PTE entries are the same      |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_bad                   | Tests a non-table mapped PTE                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_present               | Tests a valid mapped PTE                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_young                 | Tests a young PTE                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_dirty                 | Tests a dirty PTE                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_write                 | Tests a writable PTE                             |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_special               | Tests a special PTE                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_protnone              | Tests a PROT_NONE PTE                            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_devmap                | Tests a ZONE_DEVICE mapped PTE                   |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_soft_dirty            | Tests a soft dirty PTE                           |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_swp_soft_dirty        | Tests a soft dirty swapped PTE                   |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkyoung               | Creates a young PTE                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkold                 | Creates an old PTE                               |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkdirty               | Creates a dirty PTE                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkclean               | Creates a clean PTE                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkwrite               | Creates a writable PTE                           |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkwrprotect           | Creates a write protected PTE                    |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkspecial             | Creates a special PTE                            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkdevmap              | Creates a ZONE_DEVICE mapped PTE                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mksoft_dirty          | Creates a soft dirty PTE                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_clear_soft_dirty      | Clears a soft dirty PTE                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_swp_mksoft_dirty      | Creates a soft dirty swapped PTE                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_swp_clear_soft_dirty  | Clears a soft dirty swapped PTE                  |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mknotpresent          | Invalidates a mapped PTE                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | ptep_get_and_clear        | Clears a PTE                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | ptep_get_and_clear_full   | Clears a PTE                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | ptep_test_and_clear_young | Clears young from a PTE                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | ptep_set_wrprotect        | Converts into a write protected PTE              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | ptep_set_access_flags     | Converts into a more permissive PTE              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 
 ======================
 PMD Page Table Helpers
 ======================
 
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_same                  | Tests whether both PMD entries are the same      |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_bad                   | Tests a non-table mapped PMD                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_leaf                  | Tests a leaf mapped PMD                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_huge                  | Tests a HugeTLB mapped PMD                       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_trans_huge            | Tests a Transparent Huge Page (THP) at PMD       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_present               | Tests a valid mapped PMD                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_young                 | Tests a young PMD                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_dirty                 | Tests a dirty PMD                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_write                 | Tests a writable PMD                             |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_special               | Tests a special PMD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_protnone              | Tests a PROT_NONE PMD                            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_devmap                | Tests a ZONE_DEVICE mapped PMD                   |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_soft_dirty            | Tests a soft dirty PMD                           |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_swp_soft_dirty        | Tests a soft dirty swapped PMD                   |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkyoung               | Creates a young PMD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkold                 | Creates an old PMD                               |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkdirty               | Creates a dirty PMD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkclean               | Creates a clean PMD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkwrite               | Creates a writable PMD                           |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkwrprotect           | Creates a write protected PMD                    |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkspecial             | Creates a special PMD                            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkdevmap              | Creates a ZONE_DEVICE mapped PMD                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mksoft_dirty          | Creates a soft dirty PMD                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_clear_soft_dirty      | Clears a soft dirty PMD                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_swp_mksoft_dirty      | Creates a soft dirty swapped PMD                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_swp_clear_soft_dirty  | Clears a soft dirty swapped PMD                  |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_mkinvalid             | Invalidates a mapped PMD [1]                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_set_huge              | Creates a PMD huge mapping                       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmd_clear_huge            | Clears a PMD huge mapping                        |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmdp_get_and_clear        | Clears a PMD                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmdp_get_and_clear_full   | Clears a PMD                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmdp_test_and_clear_young | Clears young from a PMD                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmdp_set_wrprotect        | Converts into a write protected PMD              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pmdp_set_access_flags     | Converts into a more permissive PMD              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 
 ======================
 PUD Page Table Helpers
 ======================
 
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_same                  | Tests whether both PUD entries are the same      |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_bad                   | Tests a non-table mapped PUD                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_leaf                  | Tests a leaf mapped PUD                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_huge                  | Tests a HugeTLB mapped PUD                       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_trans_huge            | Tests a Transparent Huge Page (THP) at PUD       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_present               | Tests a valid mapped PUD                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_young                 | Tests a young PUD                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_dirty                 | Tests a dirty PUD                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_write                 | Tests a writable PUD                             |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_devmap                | Tests a ZONE_DEVICE mapped PUD                   |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_mkyoung               | Creates a young PUD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_mkold                 | Creates an old PUD                               |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_mkdirty               | Creates a dirty PUD                              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_mkclean               | Creates a clean PUD                              |
---------------------------------------------------------------------------------
-| pud_mkwrite               | Creates a writable PMD                           |
---------------------------------------------------------------------------------
-| pud_mkwrprotect           | Creates a write protected PMD                    |
---------------------------------------------------------------------------------
-| pud_mkdevmap              | Creates a ZONE_DEVICE mapped PMD                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
+| pud_mkwrite               | Creates a writable PUD                           |
++---------------------------+--------------------------------------------------+
+| pud_mkwrprotect           | Creates a write protected PUD                    |
++---------------------------+--------------------------------------------------+
+| pud_mkdevmap              | Creates a ZONE_DEVICE mapped PUD                 |
++---------------------------+--------------------------------------------------+
 | pud_mkinvalid             | Invalidates a mapped PUD [1]                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_set_huge              | Creates a PUD huge mapping                       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pud_clear_huge            | Clears a PUD huge mapping                        |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pudp_get_and_clear        | Clears a PUD                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pudp_get_and_clear_full   | Clears a PUD                                     |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pudp_test_and_clear_young | Clears young from a PUD                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pudp_set_wrprotect        | Converts into a write protected PUD              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pudp_set_access_flags     | Converts into a more permissive PUD              |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 
 ==========================
 HugeTLB Page Table Helpers
 ==========================
 
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_huge                  | Tests a HugeTLB                                  |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | pte_mkhuge                | Creates a HugeTLB                                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_pte_dirty            | Tests a dirty HugeTLB                            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_pte_write            | Tests a writable HugeTLB                         |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_pte_mkdirty          | Creates a dirty HugeTLB                          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_pte_mkwrite          | Creates a writable HugeTLB                       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_pte_mkwrprotect      | Creates a write protected HugeTLB                |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_ptep_get_and_clear   | Clears a HugeTLB                                 |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_ptep_set_wrprotect   | Converts into a write protected HugeTLB          |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | huge_ptep_set_access_flags  | Converts into a more permissive HugeTLB        |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 
 ========================
 SWAP Page Table Helpers
 ========================
 
---------------------------------------------------------------------------------
-| __pte_to_swp_entry        | Creates a swapped entry (arch) from a mapepd PTE |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
+| __pte_to_swp_entry        | Creates a swapped entry (arch) from a mapped PTE |
++---------------------------+--------------------------------------------------+
 | __swp_to_pte_entry        | Creates a mapped PTE from a swapped entry (arch) |
---------------------------------------------------------------------------------
-| __pmd_to_swp_entry        | Creates a swapped entry (arch) from a mapepd PMD |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
+| __pmd_to_swp_entry        | Creates a swapped entry (arch) from a mapped PMD |
++---------------------------+--------------------------------------------------+
 | __swp_to_pmd_entry        | Creates a mapped PMD from a swapped entry (arch) |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | is_migration_entry        | Tests a migration (read or write) swapped entry  |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | is_write_migration_entry  | Tests a write migration swapped entry            |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | make_migration_entry_read | Converts into read migration swapped entry       |
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 | make_migration_entry      | Creates a migration swapped entry (read or write)|
---------------------------------------------------------------------------------
++---------------------------+--------------------------------------------------+
 
 [1] https://lore.kernel.org/linux-mm/20181017020930.GN30832@redhat.com/
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sparse-cleanup-the-code-surrounding-memory_present.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (144 preceding siblings ...)
  2020-07-14  1:30 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch " Andrew Morton
@ 2020-07-14  1:37 ` Andrew Morton
  2020-07-14  1:38 ` + const_structscheckpatch-add-regulator_ops.patch " Andrew Morton
                   ` (86 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:37 UTC (permalink / raw)
  To: mm-commits, rppt


The patch titled
     Subject: mm/sparse: cleanup the code surrounding memory_present()
has been added to the -mm tree.  Its filename is
     mm-sparse-cleanup-the-code-surrounding-memory_present.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparse-cleanup-the-code-surrounding-memory_present.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparse-cleanup-the-code-surrounding-memory_present.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mm/sparse: cleanup the code surrounding memory_present()

After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
functions that call memory_present() for each region in memblock.memory:
sparse_memory_present_with_active_regions() and memblocks_present().

Moreover, all architectures have a call to either of these functions
preceding the call to sparse_init(), and in most cases they are called
one after the other.

Mark the regions from memblock.memory as present during sparse_init() by
making sparse_init() call memblocks_present(), make the memblocks_present()
and memory_present() functions static, and remove the redundant
sparse_memory_present_with_active_regions() function.

Also remove the no-longer-required HAVE_MEMORY_PRESENT configuration option.
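
In code, the change distills to roughly the following (an illustrative
sketch only; the diff below is authoritative):

        /* before: every architecture paired the two calls itself */
        memblocks_present();    /* or sparse_memory_present_with_active_regions() */
        sparse_init();

        /* after: sparse_init() marks the present regions internally */
        sparse_init();          /* calls memblocks_present() first */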

Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/memory-model.rst |    7 ++-----
 arch/arm/mm/init.c                |    9 ++-------
 arch/arm64/mm/init.c              |    6 ++----
 arch/ia64/mm/discontig.c          |    1 -
 arch/microblaze/mm/init.c         |    3 ---
 arch/mips/kernel/setup.c          |    8 --------
 arch/mips/loongson64/numa.c       |    1 -
 arch/mips/sgi-ip27/ip27-memory.c  |    2 --
 arch/parisc/mm/init.c             |    5 -----
 arch/powerpc/mm/mem.c             |    2 --
 arch/powerpc/mm/numa.c            |    1 -
 arch/riscv/mm/init.c              |    1 -
 arch/s390/mm/init.c               |    1 -
 arch/sh/mm/init.c                 |    6 ------
 arch/sh/mm/numa.c                 |    3 ---
 arch/sparc/mm/init_64.c           |    1 -
 arch/x86/mm/init_32.c             |    2 --
 arch/x86/mm/init_64.c             |    1 -
 include/linux/mm.h                |    4 ----
 include/linux/mmzone.h            |   14 --------------
 mm/Kconfig                        |    6 +-----
 mm/page_alloc.c                   |   16 ----------------
 mm/sparse.c                       |   20 ++++++++++++--------
 23 files changed, 19 insertions(+), 101 deletions(-)

--- a/arch/arm64/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/arm64/mm/init.c
@@ -430,11 +430,9 @@ void __init bootmem_init(void)
 #endif
 
 	/*
-	 * Sparsemem tries to allocate bootmem in memory_present(), so must be
-	 * done after the fixed reservations.
+	 * sparse_init() tries to allocate memory from memblock, so must be
+	 * done after the fixed reservations
 	 */
-	memblocks_present();
-
 	sparse_init();
 	zone_sizes_init(min, max);
 
--- a/arch/arm/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/arm/mm/init.c
@@ -243,13 +243,8 @@ void __init bootmem_init(void)
 		      (phys_addr_t)max_low_pfn << PAGE_SHIFT);
 
 	/*
-	 * Sparsemem tries to allocate bootmem in memory_present(),
-	 * so must be done after the fixed reservations
-	 */
-	memblocks_present();
-
-	/*
-	 * sparse_init() needs the bootmem allocator up and running.
+	 * sparse_init() tries to allocate memory from memblock, so must be
+	 * done after the fixed reservations
 	 */
 	sparse_init();
 
--- a/arch/ia64/mm/discontig.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/ia64/mm/discontig.c
@@ -600,7 +600,6 @@ void __init paging_init(void)
 
 	max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
 
-	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 
 #ifdef CONFIG_VIRTUAL_MEM_MAP
--- a/arch/microblaze/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/microblaze/mm/init.c
@@ -172,9 +172,6 @@ void __init setup_memory(void)
 				  &memblock.memory, 0);
 	}
 
-	/* XXX need to clip this if using highmem? */
-	sparse_memory_present_with_active_regions(0);
-
 	paging_init();
 }
 
--- a/arch/mips/kernel/setup.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/mips/kernel/setup.c
@@ -371,14 +371,6 @@ static void __init bootmem_init(void)
 #endif
 	}
 
-
-	/*
-	 * In any case the added to the memblock memory regions
-	 * (highmem/lowmem, available/reserved, etc) are considered
-	 * as present, so inform sparsemem about them.
-	 */
-	memblocks_present();
-
 	/*
 	 * Reserve initrd memory if needed.
 	 */
--- a/arch/mips/loongson64/numa.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/mips/loongson64/numa.c
@@ -220,7 +220,6 @@ static __init void prom_meminit(void)
 			cpumask_clear(&__node_cpumask[node]);
 		}
 	}
-	memblocks_present();
 	max_low_pfn = PHYS_PFN(memblock_end_of_DRAM());
 
 	for (cpu = 0; cpu < loongson_sysconf.nr_cpus; cpu++) {
--- a/arch/mips/sgi-ip27/ip27-memory.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/mips/sgi-ip27/ip27-memory.c
@@ -402,8 +402,6 @@ void __init prom_meminit(void)
 		}
 		__node_data[node] = &null_node;
 	}
-
-	memblocks_present();
 }
 
 void __init prom_free_prom_memory(void)
--- a/arch/parisc/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/parisc/mm/init.c
@@ -689,11 +689,6 @@ void __init paging_init(void)
 	flush_cache_all_local(); /* start with known state */
 	flush_tlb_all_local(NULL);
 
-	/*
-	 * Mark all memblocks as present for sparsemem using
-	 * memory_present() and then initialize sparsemem.
-	 */
-	memblocks_present();
 	sparse_init();
 	parisc_bootmem_free();
 }
--- a/arch/powerpc/mm/mem.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/powerpc/mm/mem.c
@@ -183,8 +183,6 @@ void __init mem_topology_setup(void)
 
 void __init initmem_init(void)
 {
-	/* XXX need to clip this if using highmem? */
-	sparse_memory_present_with_active_regions(0);
 	sparse_init();
 }
 
--- a/arch/powerpc/mm/numa.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/powerpc/mm/numa.c
@@ -949,7 +949,6 @@ void __init initmem_init(void)
 
 		get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
 		setup_node_data(nid, start_pfn, end_pfn);
-		sparse_memory_present_with_active_regions(nid);
 	}
 
 	sparse_init();
--- a/arch/riscv/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/riscv/mm/init.c
@@ -520,7 +520,6 @@ void mark_rodata_ro(void)
 void __init paging_init(void)
 {
 	setup_vm_final();
-	memblocks_present();
 	sparse_init();
 	setup_zero_page();
 	zone_sizes_init();
--- a/arch/s390/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/s390/mm/init.c
@@ -115,7 +115,6 @@ void __init paging_init(void)
 	__load_psw_mask(psw.mask);
 	kasan_free_early_identity();
 
-	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 	zone_dma_bits = 31;
 	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
--- a/arch/sh/mm/init.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/sh/mm/init.c
@@ -241,12 +241,6 @@ static void __init do_init_bootmem(void)
 
 	plat_mem_setup();
 
-	for_each_memblock(memory, reg) {
-		int nid = memblock_get_region_node(reg);
-
-		memory_present(nid, memblock_region_memory_base_pfn(reg),
-			memblock_region_memory_end_pfn(reg));
-	}
 	sparse_init();
 }
 
--- a/arch/sh/mm/numa.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/sh/mm/numa.c
@@ -53,7 +53,4 @@ void __init setup_bootmem_node(int nid,
 
 	/* It's up */
 	node_set_online(nid);
-
-	/* Kick sparsemem */
-	sparse_memory_present_with_active_regions(nid);
 }
--- a/arch/sparc/mm/init_64.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/sparc/mm/init_64.c
@@ -1610,7 +1610,6 @@ static unsigned long __init bootmem_init
 
 	/* XXX cpu notifier XXX */
 
-	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 
 	return end_pfn;
--- a/arch/x86/mm/init_32.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/x86/mm/init_32.c
@@ -678,7 +678,6 @@ void __init initmem_init(void)
 #endif
 
 	memblock_set_node(0, PHYS_ADDR_MAX, &memblock.memory, 0);
-	sparse_memory_present_with_active_regions(0);
 
 #ifdef CONFIG_FLATMEM
 	max_mapnr = IS_ENABLED(CONFIG_HIGHMEM) ? highend_pfn : max_low_pfn;
@@ -718,7 +717,6 @@ void __init paging_init(void)
 	 * NOTE: at this point the bootmem allocator is fully available.
 	 */
 	olpc_dt_build_devicetree();
-	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 	zone_sizes_init();
 }
--- a/arch/x86/mm/init_64.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/arch/x86/mm/init_64.c
@@ -817,7 +817,6 @@ void __init initmem_init(void)
 
 void __init paging_init(void)
 {
-	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 
 	/*
--- a/Documentation/vm/memory-model.rst~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/Documentation/vm/memory-model.rst
@@ -141,11 +141,8 @@ sections:
   `mem_section` objects and the number of rows is calculated to fit
   all the memory sections.
 
-The architecture setup code should call :c:func:`memory_present` for
-each active memory range or use :c:func:`memblocks_present` or
-:c:func:`sparse_memory_present_with_active_regions` wrappers to
-initialize the memory sections. Next, the actual memory maps should be
-set up using :c:func:`sparse_init`.
+The architecture setup code should call sparse_init() to
+initialize the memory sections and the memory maps.
 
 With SPARSEMEM there are two possible ways to convert a PFN to the
 corresponding `struct page` - a "classic sparse" and "sparse
--- a/include/linux/mm.h~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/include/linux/mm.h
@@ -2372,9 +2372,6 @@ static inline unsigned long get_num_phys
  * for_each_valid_physical_page_range()
  * 	memblock_add_node(base, size, nid)
  * free_area_init(max_zone_pfns);
- *
- * sparse_memory_present_with_active_regions() calls memory_present() for
- * each range when SPARSEMEM is enabled.
  */
 void free_area_init(unsigned long *max_zone_pfn);
 unsigned long node_map_pfn_alignment(void);
@@ -2385,7 +2382,6 @@ extern unsigned long absent_pages_in_ran
 extern void get_pfn_range_for_nid(unsigned int nid,
 			unsigned long *start_pfn, unsigned long *end_pfn);
 extern unsigned long find_min_pfn_with_active_regions(void);
-extern void sparse_memory_present_with_active_regions(int nid);
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 static inline int early_pfn_to_nid(unsigned long pfn)
--- a/include/linux/mmzone.h~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/include/linux/mmzone.h
@@ -839,18 +839,6 @@ static inline struct pglist_data *lruvec
 
 extern unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx);
 
-#ifdef CONFIG_HAVE_MEMORY_PRESENT
-void memory_present(int nid, unsigned long start, unsigned long end);
-#else
-static inline void memory_present(int nid, unsigned long start, unsigned long end) {}
-#endif
-
-#if defined(CONFIG_SPARSEMEM)
-void memblocks_present(void);
-#else
-static inline void memblocks_present(void) {}
-#endif
-
 #ifdef CONFIG_HAVE_MEMORYLESS_NODES
 int local_memory_node(int node_id);
 #else
@@ -1407,8 +1395,6 @@ struct mminit_pfnnid_cache {
 #define early_pfn_valid(pfn)	(1)
 #endif
 
-void memory_present(int nid, unsigned long start, unsigned long end);
-
 /*
  * If it is possible to have holes within a MAX_ORDER_NR_PAGES, then we
  * need to check pfn validity within that MAX_ORDER_NR_PAGES block.
--- a/mm/Kconfig~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/mm/Kconfig
@@ -88,13 +88,9 @@ config NEED_MULTIPLE_NODES
 	def_bool y
 	depends on DISCONTIGMEM || NUMA
 
-config HAVE_MEMORY_PRESENT
-	def_bool y
-	depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
-
 #
 # SPARSEMEM_EXTREME (which is the default) does some bootmem
-# allocations when memory_present() is called.  If this cannot
+# allocations when sparse_init() is called.  If this cannot
 # be done on your architecture, select this option.  However,
 # statically allocating the mem_section[] array can potentially
 # consume vast quantities of .bss, so be careful.
--- a/mm/page_alloc.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/mm/page_alloc.c
@@ -6325,22 +6325,6 @@ void __meminit init_currently_empty_zone
 }
 
 /**
- * sparse_memory_present_with_active_regions - Call memory_present for each active range
- * @nid: The node to call memory_present for. If MAX_NUMNODES, all nodes will be used.
- *
- * If an architecture guarantees that all ranges registered contain no holes and may
- * be freed, this function may be used instead of calling memory_present() manually.
- */
-void __init sparse_memory_present_with_active_regions(int nid)
-{
-	unsigned long start_pfn, end_pfn;
-	int i, this_nid;
-
-	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
-		memory_present(this_nid, start_pfn, end_pfn);
-}
-
-/**
  * get_pfn_range_for_nid - Return the start and end page frames for a node
  * @nid: The nid to return the range for. If MAX_NUMNODES, the min and max PFN are returned.
  * @start_pfn: Passed by reference. On return, it will have the node start_pfn.
--- a/mm/sparse.c~mm-sparse-cleanup-the-code-surrounding-memory_present
+++ a/mm/sparse.c
@@ -249,7 +249,7 @@ void __init subsection_map_init(unsigned
 #endif
 
 /* Record a memory area against a node. */
-void __init memory_present(int nid, unsigned long start, unsigned long end)
+static void __init memory_present(int nid, unsigned long start, unsigned long end)
 {
 	unsigned long pfn;
 
@@ -285,11 +285,11 @@ void __init memory_present(int nid, unsi
 }
 
 /*
- * Mark all memblocks as present using memory_present(). This is a
- * convenience function that is useful for a number of arches
- * to mark all of the systems memory as present during initialization.
+ * Mark all memblocks as present using memory_present().
+ * This is a convenience function that is useful to mark all of the system's
+ * memory as present during initialization.
  */
-void __init memblocks_present(void)
+static void __init memblocks_present(void)
 {
 	struct memblock_region *reg;
 
@@ -574,9 +574,13 @@ failed:
  */
 void __init sparse_init(void)
 {
-	unsigned long pnum_begin = first_present_section_nr();
-	int nid_begin = sparse_early_nid(__nr_to_section(pnum_begin));
-	unsigned long pnum_end, map_count = 1;
+	unsigned long pnum_end, pnum_begin, map_count = 1;
+	int nid_begin;
+
+	memblocks_present();
+
+	pnum_begin = first_present_section_nr();
+	nid_begin = sparse_early_nid(__nr_to_section(pnum_begin));
 
 	/* Setup pageblock_order for HUGETLB_PAGE_SIZE_VARIABLE */
 	set_pageblock_order();
_

Patches currently in -mm which might be from rppt@linux.ibm.com are

mailmap-add-entry-for-mike-rapoport.patch
mm-remove-unneeded-includes-of-asm-pgalloch.patch
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-sparse-cleanup-the-code-surrounding-memory_present.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + const_structscheckpatch-add-regulator_ops.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (145 preceding siblings ...)
  2020-07-14  1:37 ` + mm-sparse-cleanup-the-code-surrounding-memory_present.patch " Andrew Morton
@ 2020-07-14  1:38 ` Andrew Morton
  2020-07-14  1:40 ` + fat-fix-fat_ra_init-for-data-clusters-==-0.patch " Andrew Morton
                   ` (85 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:38 UTC (permalink / raw)
  To: bleung, broonie, enric.balletbo, groeck, joe, lgirdwood,
	mm-commits, pihsun, rikard.falkeborn


The patch titled
     Subject: const_structs.checkpatch: add regulator_ops
has been added to the -mm tree.  Its filename is
     const_structscheckpatch-add-regulator_ops.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/const_structscheckpatch-add-regulator_ops.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/const_structscheckpatch-add-regulator_ops.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joe Perches <joe@perches.com>
Subject: const_structs.checkpatch: add regulator_ops

Add regulator_ops to the list of structs expected to be const.

Link: http://lkml.kernel.org/r/dab1ba1aa03a8236933cfb7a28937efb0b808f13.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Pi-Hsun Shih <pihsun@chromium.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/const_structs.checkpatch |    1 +
 1 file changed, 1 insertion(+)

--- a/scripts/const_structs.checkpatch~const_structscheckpatch-add-regulator_ops
+++ a/scripts/const_structs.checkpatch
@@ -44,6 +44,7 @@ platform_hibernation_ops
 platform_suspend_ops
 proto_ops
 regmap_access_table
+regulator_ops
 rpc_pipe_ops
 rtc_class_ops
 sd_desc
_

Patches currently in -mm which might be from joe@perches.com are

checkpatch-test-git_dir-changes.patch
const_structscheckpatch-add-regulator_ops.patch
checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
checkpatch-add-fix-option-for-assign_in_if.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fat-fix-fat_ra_init-for-data-clusters-==-0.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (146 preceding siblings ...)
  2020-07-14  1:38 ` + const_structscheckpatch-add-regulator_ops.patch " Andrew Morton
@ 2020-07-14  1:40 ` Andrew Morton
  2020-07-14  1:41 ` + mm-vmallocc-remove-bug-from-the-find_va_links.patch " Andrew Morton
                   ` (84 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:40 UTC (permalink / raw)
  To: hirofumi, mm-commits


The patch titled
     Subject: fat: fix fat_ra_init() for data clusters == 0
has been added to the -mm tree.  Its filename is
     fat-fix-fat_ra_init-for-data-clusters-==-0.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fat-fix-fat_ra_init-for-data-clusters-%3D%3D-0.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fat-fix-fat_ra_init-for-data-clusters-%3D%3D-0.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: fat: fix fat_ra_init() for data clusters == 0

If data clusters == 0, fat_ra_init() calls ->ent_blocknr() for a cluster
beyond ->max_clusters.

Check the limit before initialization to suppress the warning.

Link: http://lkml.kernel.org/r/87mu462sv4.fsf@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: syzbot+756199124937b31a9b7e@syzkaller.appspotmail.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fat/fatent.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/fat/fatent.c~fat-fix-fat_ra_init-for-data-clusters-==-0
+++ a/fs/fat/fatent.c
@@ -657,6 +657,9 @@ static void fat_ra_init(struct super_blo
 	unsigned long ra_pages = sb->s_bdi->ra_pages;
 	unsigned int reada_blocks;
 
+	if (fatent->entry >= ent_limit)
+		return;
+
 	if (ra_pages > sb->s_bdi->io_pages)
 		ra_pages = rounddown(ra_pages, sb->s_bdi->io_pages);
 	reada_blocks = ra_pages << (PAGE_SHIFT - sb->s_blocksize_bits + 1);
_

Patches currently in -mm which might be from hirofumi@mail.parknet.co.jp are

fat-fix-fat_ra_init-for-data-clusters-==-0.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmallocc-remove-bug-from-the-find_va_links.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (147 preceding siblings ...)
  2020-07-14  1:40 ` + fat-fix-fat_ra_init-for-data-clusters-==-0.patch " Andrew Morton
@ 2020-07-14  1:41 ` Andrew Morton
  2020-07-14  2:49 ` mmotm 2020-07-13-19-49 uploaded Andrew Morton
                   ` (83 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  1:41 UTC (permalink / raw)
  To: hdanton, mhocko, mm-commits, oleksiy.avramchenko, rostedt, urezki, willy


The patch titled
     Subject: mm/vmalloc.c: remove BUG() from the find_va_links()
has been added to the -mm tree.  Its filename is
     mm-vmallocc-remove-bug-from-the-find_va_links.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmallocc-remove-bug-from-the-find_va_links.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmallocc-remove-bug-from-the-find_va_links.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: mm/vmalloc.c: remove BUG() from the find_va_links()

Get rid of the BUG() macro, which should be used only when a critical
situation happens and the system is not able to function anymore.

Replace it with a WARN() instead, and dump some extra information about the
start/end addresses of both VAs which overlap.  Such overlap data can help
to figure out what happened, making further analysis easier.  For example,
if both areas are identical it could mean a double free.

The recovery process consists of declining all further steps regarding the
insertion of the conflicting overlapping range.  In that sense find_va_links()
can now return NULL, so its return value has to be checked by callers.

A side effect of this process is that it can leak memory, but that is better
than just killing a machine for no good reason.  Apart from that, debugging
can be done on a live system.
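
As a minimal sketch of the caller-side pattern this change introduces
(illustrative only; the diff below is the authoritative version):

        link = find_va_links(va, root, NULL, &parent);
        if (!link)
                return NULL;    /* overlap detected, WARN() already issued */

        /* proceed with link_va() only on the non-NULL path */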

Link: http://lkml.kernel.org/r/20200711104531.12242-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |   41 ++++++++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 9 deletions(-)

--- a/mm/vmalloc.c~mm-vmallocc-remove-bug-from-the-find_va_links
+++ a/mm/vmalloc.c
@@ -512,6 +512,10 @@ static struct vmap_area *__find_vmap_are
 /*
  * This function returns back addresses of parent node
  * and its left or right link for further processing.
+ *
+ * Otherwise NULL is returned. In that case all further
+ * steps regarding inserting of the conflicting overlapping
+ * range have to be declined; it is treated as a bug.
  */
 static __always_inline struct rb_node **
 find_va_links(struct vmap_area *va,
@@ -550,8 +554,12 @@ find_va_links(struct vmap_area *va,
 		else if (va->va_end > tmp_va->va_start &&
 				va->va_start >= tmp_va->va_end)
 			link = &(*link)->rb_right;
-		else
-			BUG();
+		else {
+			WARN(1, "vmalloc bug: 0x%lx-0x%lx overlaps with 0x%lx-0x%lx\n",
+				va->va_start, va->va_end, tmp_va->va_start, tmp_va->va_end);
+
+			return NULL;
+		}
 	} while (*link);
 
 	*parent = &tmp_va->rb_node;
@@ -697,7 +705,8 @@ insert_vmap_area(struct vmap_area *va,
 	struct rb_node *parent;
 
 	link = find_va_links(va, root, NULL, &parent);
-	link_va(va, root, parent, link, head);
+	if (link)
+		link_va(va, root, parent, link, head);
 }
 
 static void
@@ -713,8 +722,10 @@ insert_vmap_area_augment(struct vmap_are
 	else
 		link = find_va_links(va, root, NULL, &parent);
 
-	link_va(va, root, parent, link, head);
-	augment_tree_propagate_from(va);
+	if (link) {
+		link_va(va, root, parent, link, head);
+		augment_tree_propagate_from(va);
+	}
 }
 
 /*
@@ -722,6 +733,11 @@ insert_vmap_area_augment(struct vmap_are
  * and next free blocks. If coalesce is not done a new
  * free area is inserted. If VA has been merged, it is
  * freed.
+ *
+ * Please note, it can return NULL in case of overlapping
+ * ranges, following a WARN() report. Despite this being
+ * buggy behaviour, the system can stay alive and keep
+ * going.
  */
 static __always_inline struct vmap_area *
 merge_or_add_vmap_area(struct vmap_area *va,
@@ -738,6 +754,8 @@ merge_or_add_vmap_area(struct vmap_area
 	 * inserted, unless it is merged with its sibling/siblings.
 	 */
 	link = find_va_links(va, root, NULL, &parent);
+	if (!link)
+		return NULL;
 
 	/*
 	 * Get next node of VA to check if merging can be done.
@@ -1346,6 +1364,9 @@ static bool __purge_vmap_area_lazy(unsig
 		va = merge_or_add_vmap_area(va, &free_vmap_area_root,
 					    &free_vmap_area_list);
 
+		if (!va)
+			continue;
+
 		if (is_vmalloc_or_module_addr((void *)orig_start))
 			kasan_release_vmalloc(orig_start, orig_end,
 					      va->va_start, va->va_end);
@@ -3330,8 +3351,9 @@ recovery:
 		orig_end = vas[area]->va_end;
 		va = merge_or_add_vmap_area(vas[area], &free_vmap_area_root,
 					    &free_vmap_area_list);
-		kasan_release_vmalloc(orig_start, orig_end,
-				      va->va_start, va->va_end);
+		if (va)
+			kasan_release_vmalloc(orig_start, orig_end,
+				va->va_start, va->va_end);
 		vas[area] = NULL;
 	}
 
@@ -3379,8 +3401,9 @@ err_free_shadow:
 		orig_end = vas[area]->va_end;
 		va = merge_or_add_vmap_area(vas[area], &free_vmap_area_root,
 					    &free_vmap_area_list);
-		kasan_release_vmalloc(orig_start, orig_end,
-				      va->va_start, va->va_end);
+		if (va)
+			kasan_release_vmalloc(orig_start, orig_end,
+				va->va_start, va->va_end);
 		vas[area] = NULL;
 		kfree(vms[area]);
 	}
_

Patches currently in -mm which might be from urezki@gmail.com are

mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
mm-vmalloc-switch-to-propagate-callback.patch
mm-vmalloc-update-the-header-about-kva-rework.patch
mm-vmallocc-remove-bug-from-the-find_va_links.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-13-19-49 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (148 preceding siblings ...)
  2020-07-14  1:41 ` + mm-vmallocc-remove-bug-from-the-find_va_links.patch " Andrew Morton
@ 2020-07-14  2:49 ` Andrew Morton
  2020-07-16  0:41 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch added to -mm tree Andrew Morton
                   ` (82 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14  2:49 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-13-19-49 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series
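
As a rough sketch of that workflow (the tarball URL and the unpack layout
are assumptions based on the links above; adjust paths as needed):

    cd linux-5.8-rc5                  # a clean Linus release tree
    wget http://www.ozlabs.org/~akpm/mmotm/broken-out.tar.gz
    tar xzf broken-out.tar.gz         # assumed to unpack patches + series into broken-out/
    ln -s broken-out patches          # put the series where quilt looks by default
    quilt push -a                     # apply the whole series in order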

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.
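
Purely as a hypothetical illustration, the datestamp content for this
snapshot might look like this (whether the version follows on the same
line or on the next is not specified above):

    2020-07-13-19-49
    v5.8-rc5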

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc5:
(patches marked "*" will be included in linux-next)

* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* const_structscheckpatch-add-regulator_ops.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-handle-page-mapping-better-in-dump_page-fix.patch
* mm-dump-compound-page-information-on-a-second-line.patch
* mm-print-head-flags-in-dump_page.patch
* mm-switch-dump_page-to-get_kernel_nofault.patch
* mm-print-the-inode-number-in-dump_page.patch
* mm-print-hashed-address-of-struct-page.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* tmpfs-per-superblock-i_ino-support.patch
* tmpfs-support-64-bit-inums-per-sb.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
* mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
* mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch
* mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
* mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* percpu_counter-add-percpu_counter_sync.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
* mm-mremap-calculate-extent-in-one-place.patch
* mm-mremap-start-addresses-are-properly-aligned.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* mm-sparse-cleanup-the-code-surrounding-memory_present.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* mm-vmallocc-remove-bug-from-the-find_va_links.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* rcu-kasan-record-and-print-call_rcu-call-stack-v8.patch
* kasan-record-and-print-the-free-track.patch
* kasan-record-and-print-the-free-track-v8.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
* mm-thp-replace-http-links-with-https-ones.patch
* mm-vmscanc-fixed-typo.patch
* mm-vmscan-consistent-update-to-pgrefill.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate-v3.patch
* doc-mm-sync-up-oom_score_adj-documentation.patch
* doc-mm-clarify-proc-pid-oom_score-value-range.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes-v2.patch
* mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* mm-memory_hotplug-introduce-default-dummy-memory_add_physaddr_to_nid.patch
* mm-memory_hotplug-fix-unpaired-mem_hotplug_begin-done.patch
* syscalls-use-uaccess_kernel-in-addr_limit_user_check.patch
* nds32-use-uaccess_kernel-in-show_regs.patch
* riscv-include-asm-pgtableh-in-asm-uaccessh.patch
* uaccess-remove-segment_eq.patch
* uaccess-add-force_uaccess_beginend-helpers.patch
* exec-use-force_uaccess_begin-during-exec-and-exit.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* proc-sysctl-make-protected_-world-readable.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fat-fix-fat_ra_init-for-data-clusters-==-0.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
* ipc-uninline-functions.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-page_isolation-prefer-the-node-of-the-source-page.patch
* mm-migrate-move-migration-helper-from-h-to-c.patch
* mm-hugetlb-unify-migration-callbacks.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
* mm-migrate-make-a-standard-migration-target-allocation-function.patch
* mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
* mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch
* scripts-deprecated_terms-sync-with-inclusive-terms.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code-fix.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (149 preceding siblings ...)
  2020-07-14  2:49 ` mmotm 2020-07-13-19-49 uploaded Andrew Morton
@ 2020-07-16  0:41 ` Andrew Morton
  2020-07-16  0:42 ` + fs-ufs-avoid-potential-u32-multiplication-overflow.patch " Andrew Morton
                   ` (81 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16  0:41 UTC (permalink / raw)
  To: catalin.marinas, hannes, hdanton, hughd, josef, kirill.shutemov,
	mm-commits, will.deacon, willy, xuyu, yang.shi


The patch titled
     Subject: mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2
has been added to the -mm tree.  Its filename is
     mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yang Shi <yang.shi@linux.alibaba.com>
Subject: mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2

Incorporate comments from Will Deacon and update the commit log per discussion.

Link: http://lkml.kernel.org/r/1594848990-55657-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: 89b15332af7c ("mm: drop mmap_sem before calling balance_dirty_pages() in write fault")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Xu Yu <xuyu@linux.alibaba.com>
Debugged-by: Xu Yu <xuyu@linux.alibaba.com>
Tested-by: Xu Yu <xuyu@linux.alibaba.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/memory.c~mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2
+++ a/mm/memory.c
@@ -4243,11 +4243,12 @@ static vm_fault_t handle_pte_fault(struc
 			return do_wp_page(vmf);
 	}
 
-	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vmf->flags & FAULT_FLAG_TRIED))
-		entry = pte_mkdirty(entry);
-	else if (vmf->flags & FAULT_FLAG_TRIED)
+	if (vmf->flags & FAULT_FLAG_TRIED)
 		goto unlock;
 
+	if (vmf->flags & FAULT_FLAG_WRITE)
+		entry = pte_mkdirty(entry);
+
 	entry = pte_mkyoung(entry);
 	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
 				vmf->flags & FAULT_FLAG_WRITE)) {
_

Patches currently in -mm which might be from yang.shi@linux.alibaba.com are

mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch
mm-filemap-clear-idle-flag-for-writes.patch
mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
mm-thp-remove-debug_cow-switch.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fs-ufs-avoid-potential-u32-multiplication-overflow.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (150 preceding siblings ...)
  2020-07-16  0:41 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch added to -mm tree Andrew Morton
@ 2020-07-16  0:42 ` Andrew Morton
  2020-07-16  0:50 ` + x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch " Andrew Morton
                   ` (80 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16  0:42 UTC (permalink / raw)
  To: adobriyan, colin.king, dushistov, mm-commits


The patch titled
     Subject: fs/ufs: avoid potential u32 multiplication overflow
has been added to the -mm tree.  Its filename is
     fs-ufs-avoid-potential-u32-multiplication-overflow.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-ufs-avoid-potential-u32-multiplication-overflow.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-ufs-avoid-potential-u32-multiplication-overflow.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Colin Ian King <colin.king@canonical.com>
Subject: fs/ufs: avoid potential u32 multiplication overflow

The 64 bit ino is being compared to the product of two u32 values;
however, the multiplication is performed as a 32 bit multiply, so there
is a potential for overflow.  To be fully safe, cast uspi->s_ncg to a
u64 so that a 64 bit multiplication occurs, avoiding any chance of
overflow.
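
For illustration, a minimal user-space sketch of the failure mode (the
values below are hypothetical, chosen only so the 32 bit product wraps):

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t s_ncg = 70000, s_ipg = 70000;

		/* 32 bit multiply: 4900000000 wraps mod 2^32 to 605032704. */
		uint64_t bad_limit = s_ncg * s_ipg;
		/* Casting one operand forces a 64 bit multiply, as the fix does. */
		uint64_t good_limit = (uint64_t)s_ncg * s_ipg;
		/* A valid inode number just below the true limit ... */
		uint64_t ino = 4800000000ULL;

		/* ... is wrongly rejected by the unfixed check. */
		printf("unfixed: %s, fixed: %s\n",
		       ino > bad_limit ? "-ESTALE" : "ok",
		       ino > good_limit ? "-ESTALE" : "ok");
		return 0;
	}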

Addresses-Coverity: ("Unintentional integer overflow")
Link: http://lkml.kernel.org/r/20200715170355.1081713-1-colin.king@canonical.com
Fixes: f3e2a520f5fb ("ufs: NFS support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ufs/super.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ufs/super.c~fs-ufs-avoid-potential-u32-multiplication-overflow
+++ a/fs/ufs/super.c
@@ -101,7 +101,7 @@ static struct inode *ufs_nfs_get_inode(s
 	struct ufs_sb_private_info *uspi = UFS_SB(sb)->s_uspi;
 	struct inode *inode;
 
-	if (ino < UFS_ROOTINO || ino > uspi->s_ncg * uspi->s_ipg)
+	if (ino < UFS_ROOTINO || ino > (u64)uspi->s_ncg * uspi->s_ipg)
 		return ERR_PTR(-ESTALE);
 
 	inode = ufs_iget(sb, ino);
_

Patches currently in -mm which might be from colin.king@canonical.com are

fs-ufs-avoid-potential-u32-multiplication-overflow.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (151 preceding siblings ...)
  2020-07-16  0:42 ` + fs-ufs-avoid-potential-u32-multiplication-overflow.patch " Andrew Morton
@ 2020-07-16  0:50 ` Andrew Morton
  2020-07-16 21:28 ` + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch " Andrew Morton
                   ` (79 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16  0:50 UTC (permalink / raw)
  To: daniel.m.jordan, dave.hansen, david, hpa, luto, mhocko, mingo,
	mm-commits, pasha.tatashin, peterz, steven.sistare, tglx


The patch titled
     Subject: x86-mm-use-max-memory-block-size-on-bare-metal-v3
has been added to the -mm tree.  Its filename is
     x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: x86-mm-use-max-memory-block-size-on-bare-metal-v3

Add a more accurate hypervisor check.  Someone kindly pointed me to
commit 517c3ba00916 ("x86/speculation/mds: Apply more accurate check on
hypervisor platform"); v2 had the same issue.

Link: http://lkml.kernel.org/r/20200714205450.945834-1-daniel.m.jordan@oracle.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/mm/init_64.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/mm/init_64.c~x86-mm-use-max-memory-block-size-on-bare-metal-v3
+++ a/arch/x86/mm/init_64.c
@@ -54,7 +54,6 @@
 #include <asm/uv/uv.h>
 #include <asm/setup.h>
 #include <asm/ftrace.h>
-#include <asm/hypervisor.h>
 
 #include "mm_internal.h"
 
@@ -1410,7 +1409,7 @@ static unsigned long probe_memory_block_
 	 * Use max block size to minimize overhead on bare metal, where
 	 * alignment for memory hotplug isn't a concern.
 	 */
-	if (hypervisor_is_type(X86_HYPER_NATIVE)) {
+	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
 		bz = MAX_BLOCK_SIZE;
 		goto done;
 	}
_

Patches currently in -mm which might be from daniel.m.jordan@oracle.com are

x86-mm-use-max-memory-block-size-on-bare-metal.patch
x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (152 preceding siblings ...)
  2020-07-16  0:50 ` + x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch " Andrew Morton
@ 2020-07-16 21:28 ` Andrew Morton
  2020-07-16 21:45 ` + mmhwpoison-cleanup-unused-pagehuge-check.patch " Andrew Morton
                   ` (78 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:28 UTC (permalink / raw)
  To: cl, guro, iamjoonsoo.kim, mm-commits, penberg, rientjes,
	shakeelb, songmuchun, stable, vbabka


The patch titled
     Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
has been added to the -mm tree.  Its filename is
     mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy

If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying.  If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed, which
results in a memory leak when the memcg is destroyed.  The following
steps reproduce the problem.

  1) Use kmem_cache_create() to create a new kmem_cache named A.
  2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
     so the refcount of B is just increased.
  3) Use kmem_cache_destroy() to destroy the kmem_cache A; this just
     decreases B's refcount but marks B as dying.
  4) Create a new memory cgroup and allocate memory from the
     kmem_cache B.  This leads to the creation of a non-root
     kmem_cache for that allocation.
  5) When the memory cgroup created in step 4) is destroyed, the
     non-root kmem_cache can never be destroyed.

Repeating steps 4) and 5) leaks more memory each time, so mark the
root kmem_cache as dying only when its refcount reaches zero.  A
hypothetical reproducer for steps 1)-3) is sketched below.
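
For illustration only, here is a minimal kernel-module sketch of steps
1)-3), assuming a pre-existing mergeable cache B with a matching object
size (the module and cache names are hypothetical, not from the patch):

	#include <linux/module.h>
	#include <linux/slab.h>

	static int __init leak_repro_init(void)
	{
		struct kmem_cache *a;

		/*
		 * 1) Create cache A.  2) With a matching size and no
		 * anti-merge flags, the slab allocator may merge A with an
		 * existing cache B, in which case A is just an alias and
		 * B's refcount is bumped instead.
		 */
		a = kmem_cache_create("repro-A", 192, 0, 0, NULL);
		if (!a)
			return -ENOMEM;

		/*
		 * 3) Destroying the alias drops B's refcount back to 1, but
		 * the buggy code marked B as dying anyway, so per-memcg
		 * children of B created afterwards (step 4) could never be
		 * destroyed.
		 */
		kmem_cache_destroy(a);
		return 0;
	}
	module_init(leak_repro_init);

	MODULE_LICENSE("GPL");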

Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab_common.c |   35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

--- a/mm/slab_common.c~mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy
+++ a/mm/slab_common.c
@@ -326,6 +326,14 @@ int slab_unmergeable(struct kmem_cache *
 	if (s->refcount < 0)
 		return 1;
 
+#ifdef CONFIG_MEMCG_KMEM
+	/*
+	 * Skip the dying kmem_cache.
+	 */
+	if (s->memcg_params.dying)
+		return 1;
+#endif
+
 	return 0;
 }
 
@@ -886,12 +894,15 @@ static int shutdown_memcg_caches(struct
 	return 0;
 }
 
-static void flush_memcg_workqueue(struct kmem_cache *s)
+static void memcg_set_kmem_cache_dying(struct kmem_cache *s)
 {
 	spin_lock_irq(&memcg_kmem_wq_lock);
 	s->memcg_params.dying = true;
 	spin_unlock_irq(&memcg_kmem_wq_lock);
+}
 
+static void flush_memcg_workqueue(struct kmem_cache *s)
+{
 	/*
 	 * SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
 	 * sure all registered rcu callbacks have been invoked.
@@ -923,10 +934,6 @@ static inline int shutdown_memcg_caches(
 {
 	return 0;
 }
-
-static inline void flush_memcg_workqueue(struct kmem_cache *s)
-{
-}
 #endif /* CONFIG_MEMCG_KMEM */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
@@ -944,8 +951,6 @@ void kmem_cache_destroy(struct kmem_cach
 	if (unlikely(!s))
 		return;
 
-	flush_memcg_workqueue(s);
-
 	get_online_cpus();
 	get_online_mems();
 
@@ -955,6 +960,22 @@ void kmem_cache_destroy(struct kmem_cach
 	if (s->refcount)
 		goto out_unlock;
 
+#ifdef CONFIG_MEMCG_KMEM
+	memcg_set_kmem_cache_dying(s);
+
+	mutex_unlock(&slab_mutex);
+
+	put_online_mems();
+	put_online_cpus();
+
+	flush_memcg_workqueue(s);
+
+	get_online_cpus();
+	get_online_mems();
+
+	mutex_lock(&slab_mutex);
+#endif
+
 	err = shutdown_memcg_caches(s);
 	if (!err)
 		err = shutdown_cache(s);
_

Patches currently in -mm which might be from songmuchun@bytedance.com are

mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-cleanup-unused-pagehuge-check.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (153 preceding siblings ...)
  2020-07-16 21:28 ` + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch " Andrew Morton
@ 2020-07-16 21:45 ` Andrew Morton
  2020-07-16 21:45 ` + mm-hwpoison-remove-recalculating-hpage.patch " Andrew Morton
                   ` (77 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:45 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,hwpoison: cleanup unused PageHuge() check
has been added to the -mm tree.  Its filename is
     mmhwpoison-cleanup-unused-pagehuge-check.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-cleanup-unused-pagehuge-check.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-cleanup-unused-pagehuge-check.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm,hwpoison: cleanup unused PageHuge() check

Patch series "Hwpoison soft-offline rework", v4.

This patchset was initially based on Naoya's hwpoison rework [1], so
thanks to him for the initial work.  I would also like to thank Naoya
for testing the patchset off-line and reporting any issues he found,
which was quite helpful.

This patchset aims to fix some issues lying in soft-offline handling,
but it also takes the chance to go a few steps further and perform
cleanups and some refactoring as well.


 - Motivation:

   A customer and I were facing an issue where processes were killed
   after some of their pages had been soft-offlined.  This should not
   happen when soft-offlining, as it is meant to be non-disruptive.  I
   was able to reproduce the issue by stressing the memory while
   soft-offlining pages in the meantime.

   After debugging the issue, I saw that the problem was that pages
   were returned to user-space after having been properly offlined.
   So, when those pages were faulted in, the fault handler returned
   VM_FAULT_HWPOISON all the way down to the arch handler, and it
   simply killed the process.

   After further analysis, it became clear that the problem was that
   when kcompactd kicked in to migrate pages over, the compaction_alloc
   callback was handing poisoned pages to the migrate routine.

   All this could happen because isolate_freepages_block and
   fast_isolate_freepages only check whether the page is PageBuddy,
   and since 1) poisoned pages can be part of a higher-order page and
   2) poisoned pages are also PageBuddy, they can sneak in easily.

   I also saw some other problems with swap pages, but I suspected
   them to be the same sort of problem, so I did not follow that trace.

   The above refers to soft-offline.  But I also saw problems with
   hard-offline, especially hugetlb corruption, and some other weird
   stuff.  (I could paste the logs.)

   The full explanation referring to the soft-offline case can be found at [2].

 - Approach:

   The approach taken is to contain those pages and never let them hit
   either the pcplists or the buddy freelists.  Only when they are
   completely out of reach do we flag them as poisoned.

   A full explanation of this can be found in patch#11 and patch#12

 - Outcome:

   With this patchset, I no longer see the issues with soft-offline.

[1] https://lore.kernel.org/linux-mm/1541746035-13408-1-git-send-email-n-horiguchi@ah.jp.nec.com/
[2] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u


This patch (of 15):

Drop the PageHuge() check: memory_failure() forks into
memory_failure_hugetlb() for hugetlb pages, so the check can never be
true at this point.
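
For context, the earlier dispatch in memory_failure() looks roughly like
this (a paraphrase of the surrounding kernel code, not part of this
patch), which is why the PageHuge() branch removed below can never be
taken:

	p = pfn_to_page(pfn);
	if (PageHuge(p))
		return memory_failure_hugetlb(pfn, flags);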

Link: http://lkml.kernel.org/r/20200716123810.25292-1-osalvador@suse.de
Link: http://lkml.kernel.org/r/20200716123810.25292-2-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-cleanup-unused-pagehuge-check
+++ a/mm/memory-failure.c
@@ -1382,10 +1382,7 @@ int memory_failure(unsigned long pfn, in
 	 * page_remove_rmap() in try_to_unmap_one(). So to determine page status
 	 * correctly, we save a copy of the page flags at this time.
 	 */
-	if (PageHuge(p))
-		page_flags = hpage->flags;
-	else
-		page_flags = p->flags;
+	page_flags = p->flags;
 
 	/*
 	 * unpoison always clear PG_hwpoison inside page lock
_

Patches currently in -mm which might be from n-horiguchi@ah.jp.nec.com are

mmhwpoison-cleanup-unused-pagehuge-check.patch
mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
mmhwpoison-remove-mf_count_increased.patch
mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hwpoison-remove-recalculating-hpage.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (154 preceding siblings ...)
  2020-07-16 21:45 ` + mmhwpoison-cleanup-unused-pagehuge-check.patch " Andrew Morton
@ 2020-07-16 21:45 ` Andrew Morton
  2020-07-16 21:45 ` + mmmadvise-call-soft_offline_page-without-mf_count_increased.patch " Andrew Morton
                   ` (76 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:45 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm, hwpoison: remove recalculating hpage
has been added to the -mm tree.  Its filename is
     mm-hwpoison-remove-recalculating-hpage.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hwpoison-remove-recalculating-hpage.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hwpoison-remove-recalculating-hpage.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: mm, hwpoison: remove recalculating hpage

hpage is never used after try_to_split_thp_page() in memory_failure(),
so we don't have to keep it up to date.  Let's not recalculate/use hpage.

Link: http://lkml.kernel.org/r/20200716123810.25292-3-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/mm/memory-failure.c~mm-hwpoison-remove-recalculating-hpage
+++ a/mm/memory-failure.c
@@ -1342,7 +1342,6 @@ int memory_failure(unsigned long pfn, in
 		}
 		unlock_page(p);
 		VM_BUG_ON_PAGE(!page_count(p), p);
-		hpage = compound_head(p);
 	}
 
 	/*
@@ -1414,11 +1413,8 @@ int memory_failure(unsigned long pfn, in
 	/*
 	 * Now take care of user space mappings.
 	 * Abort on fail: __delete_from_page_cache() assumes unmapped page.
-	 *
-	 * When the raw error page is thp tail page, hpage points to the raw
-	 * page after thp split.
 	 */
-	if (!hwpoison_user_mappings(p, pfn, flags, &hpage)) {
+	if (!hwpoison_user_mappings(p, pfn, flags, &p)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;
 		goto out;
_

Patches currently in -mm which might be from naoya.horiguchi@nec.com are

mm-hwpoison-remove-recalculating-hpage.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmmadvise-call-soft_offline_page-without-mf_count_increased.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (155 preceding siblings ...)
  2020-07-16 21:45 ` + mm-hwpoison-remove-recalculating-hpage.patch " Andrew Morton
@ 2020-07-16 21:45 ` Andrew Morton
  2020-07-16 21:45 ` + mmmadvise-refactor-madvise_inject_error.patch " Andrew Morton
                   ` (75 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:45 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED
has been added to the -mm tree.  Its filename is
     mmmadvise-call-soft_offline_page-without-mf_count_increased.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmmadvise-call-soft_offline_page-without-mf_count_increased.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

The call to get_user_pages_fast() is only there to get the pointer to
the struct page of a given address; pinning it is the memory-poisoning
handler's job, so drop the refcount grabbed by get_user_pages_fast().

Link: http://lkml.kernel.org/r/20200716123810.25292-4-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/madvise.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

--- a/mm/madvise.c~mmmadvise-call-soft_offline_page-without-mf_count_increased
+++ a/mm/madvise.c
@@ -893,16 +893,24 @@ static int madvise_inject_error(int beha
 		 */
 		size = page_size(compound_head(page));
 
-		if (PageHWPoison(page)) {
-			put_page(page);
+		/*
+		 * The get_user_pages_fast() is just to get the pfn of the
+		 * given address, and the refcount has nothing to do with
+		 * what we try to test, so it should be released immediately.
+		 * This is racy but it's intended because the real hardware
+		 * errors could happen at any moment and memory error handlers
+		 * must properly handle the race.
+		 */
+		put_page(page);
+
+		if (PageHWPoison(page))
 			continue;
-		}
 
 		if (behavior == MADV_SOFT_OFFLINE) {
 			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
 					pfn, start);
 
-			ret = soft_offline_page(pfn, MF_COUNT_INCREASED);
+			ret = soft_offline_page(pfn, 0);
 			if (ret)
 				return ret;
 			continue;
@@ -910,14 +918,6 @@ static int madvise_inject_error(int beha
 
 		pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
 				pfn, start);
-
-		/*
-		 * Drop the page reference taken by get_user_pages_fast(). In
-		 * the absence of MF_COUNT_INCREASED the memory_failure()
-		 * routine is responsible for pinning the page to prevent it
-		 * from being released back to the page allocator.
-		 */
-		put_page(page);
 		ret = memory_failure(pfn, 0);
 		if (ret)
 			return ret;
_

Patches currently in -mm which might be from n-horiguchi@ah.jp.nec.com are

mmhwpoison-cleanup-unused-pagehuge-check.patch
mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
mmhwpoison-remove-mf_count_increased.patch
mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmmadvise-refactor-madvise_inject_error.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (156 preceding siblings ...)
  2020-07-16 21:45 ` + mmmadvise-call-soft_offline_page-without-mf_count_increased.patch " Andrew Morton
@ 2020-07-16 21:45 ` Andrew Morton
  2020-07-16 21:45 ` + mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch " Andrew Morton
                   ` (74 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:45 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,madvise: Refactor madvise_inject_error
has been added to the -mm tree.  Its filename is
     mmmadvise-refactor-madvise_inject_error.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmmadvise-refactor-madvise_inject_error.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmmadvise-refactor-madvise_inject_error.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,madvise: Refactor madvise_inject_error

Make a proper if-else condition for {hard,soft}-offline.

Link: http://lkml.kernel.org/r/20200716123810.25292-5-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/madvise.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

--- a/mm/madvise.c~mmmadvise-refactor-madvise_inject_error
+++ a/mm/madvise.c
@@ -869,7 +869,6 @@ static long madvise_remove(struct vm_are
 static int madvise_inject_error(int behavior,
 		unsigned long start, unsigned long end)
 {
-	struct page *page;
 	struct zone *zone;
 	unsigned long size;
 
@@ -879,6 +878,7 @@ static int madvise_inject_error(int beha
 
 	for (; start < end; start += size) {
 		unsigned long pfn;
+		struct page *page;
 		int ret;
 
 		ret = get_user_pages_fast(start, 1, 0, &page);
@@ -908,17 +908,15 @@ static int madvise_inject_error(int beha
 
 		if (behavior == MADV_SOFT_OFFLINE) {
 			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
-					pfn, start);
+				 pfn, start);
 
 			ret = soft_offline_page(pfn, 0);
-			if (ret)
-				return ret;
-			continue;
+		} else {
+			pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
+				 pfn, start);
+			ret = memory_failure(pfn, 0);
 		}
 
-		pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
-				pfn, start);
-		ret = memory_failure(pfn, 0);
 		if (ret)
 			return ret;
 	}
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (157 preceding siblings ...)
  2020-07-16 21:45 ` + mmmadvise-refactor-madvise_inject_error.patch " Andrew Morton
@ 2020-07-16 21:45 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch " Andrew Morton
                   ` (73 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:45 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,hwpoison-inject: don't pin for hwpoison_filter
has been added to the -mm tree.  Its filename is
     mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm,hwpoison-inject: don't pin for hwpoison_filter

The other memory error injection interface, debugfs:hwpoison/corrupt-pfn,
also takes a bogus refcount just for hwpoison_filter().  Not pinning the
page is justified here because this interface only does a coarse filter,
expecting memory_failure() to redo the check reliably.

Link: http://lkml.kernel.org/r/20200716123810.25292-6-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hwpoison-inject.c |   18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

--- a/mm/hwpoison-inject.c~mmhwpoison-inject-dont-pin-for-hwpoison_filter
+++ a/mm/hwpoison-inject.c
@@ -26,11 +26,6 @@ static int hwpoison_inject(void *data, u
 
 	p = pfn_to_page(pfn);
 	hpage = compound_head(p);
-	/*
-	 * This implies unable to support free buddy pages.
-	 */
-	if (!get_hwpoison_page(p))
-		return 0;
 
 	if (!hwpoison_filter_enable)
 		goto inject;
@@ -40,23 +35,20 @@ static int hwpoison_inject(void *data, u
 	 * This implies unable to support non-LRU pages.
 	 */
 	if (!PageLRU(hpage) && !PageHuge(p))
-		goto put_out;
+		return 0;
 
 	/*
-	 * do a racy check with elevated page count, to make sure PG_hwpoison
-	 * will only be set for the targeted owner (or on a free page).
+	 * do a racy check to make sure PG_hwpoison will only be set for
+	 * the targeted owner (or on a free page).
 	 * memory_failure() will redo the check reliably inside page lock.
 	 */
 	err = hwpoison_filter(hpage);
 	if (err)
-		goto put_out;
+		return 0;
 
 inject:
 	pr_info("Injecting memory failure at pfn %#lx\n", pfn);
-	return memory_failure(pfn, MF_COUNT_INCREASED);
-put_out:
-	put_hwpoison_page(p);
-	return 0;
+	return memory_failure(pfn, 0);
 }
 
 static int hwpoison_unpoison(void *data, u64 val)
_

Patches currently in -mm which might be from n-horiguchi@ah.jp.nec.com are

mmhwpoison-cleanup-unused-pagehuge-check.patch
mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
mmhwpoison-remove-mf_count_increased.patch
mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (158 preceding siblings ...)
  2020-07-16 21:45 ` + mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-kill-put_hwpoison_page.patch " Andrew Morton
                   ` (72 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: Un-export get_hwpoison_page and make it static
has been added to the -mm tree.  Its filename is
     mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: Un-export get_hwpoison_page and make it static

Since get_hwpoison_page is only used in memory-failure code now,
let us un-export it and make it private to that code.

Link: http://lkml.kernel.org/r/20200716123810.25292-7-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h  |    1 -
 mm/memory-failure.c |    3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

--- a/include/linux/mm.h~mmhwpoison-un-export-get_hwpoison_page-and-make-it-static
+++ a/include/linux/mm.h
@@ -2992,7 +2992,6 @@ extern int memory_failure(unsigned long
 extern void memory_failure_queue(unsigned long pfn, int flags);
 extern void memory_failure_queue_kick(int cpu);
 extern int unpoison_memory(unsigned long pfn);
-extern int get_hwpoison_page(struct page *page);
 #define put_hwpoison_page(page)	put_page(page)
 extern int sysctl_memory_failure_early_kill;
 extern int sysctl_memory_failure_recovery;
--- a/mm/memory-failure.c~mmhwpoison-un-export-get_hwpoison_page-and-make-it-static
+++ a/mm/memory-failure.c
@@ -925,7 +925,7 @@ static int page_action(struct page_state
  * Return: return 0 if failed to grab the refcount, otherwise true (some
  * non-zero value.)
  */
-int get_hwpoison_page(struct page *page)
+static int get_hwpoison_page(struct page *page)
 {
 	struct page *head = compound_head(page);
 
@@ -954,7 +954,6 @@ int get_hwpoison_page(struct page *page)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(get_hwpoison_page);
 
 /*
  * Do all that is necessary to remove user space mappings. Unmap
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-kill-put_hwpoison_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (159 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-remove-mf_count_increased.patch " Andrew Morton
                   ` (71 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: kill put_hwpoison_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-kill-put_hwpoison_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-kill-put_hwpoison_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-kill-put_hwpoison_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: kill put_hwpoison_page

After commit 4e41a30c6d50 ("mm: hwpoison: adjust for new thp refcounting"),
put_hwpoison_page() got reduced to a bare put_page().
Let us just use put_page() instead.

Link: http://lkml.kernel.org/r/20200716123810.25292-8-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h  |    1 -
 mm/memory-failure.c |   30 +++++++++++++++---------------
 2 files changed, 15 insertions(+), 16 deletions(-)

--- a/include/linux/mm.h~mmhwpoison-kill-put_hwpoison_page
+++ a/include/linux/mm.h
@@ -2992,7 +2992,6 @@ extern int memory_failure(unsigned long
 extern void memory_failure_queue(unsigned long pfn, int flags);
 extern void memory_failure_queue_kick(int cpu);
 extern int unpoison_memory(unsigned long pfn);
-#define put_hwpoison_page(page)	put_page(page)
 extern int sysctl_memory_failure_early_kill;
 extern int sysctl_memory_failure_recovery;
 extern void shake_page(struct page *p, int access);
--- a/mm/memory-failure.c~mmhwpoison-kill-put_hwpoison_page
+++ a/mm/memory-failure.c
@@ -1144,7 +1144,7 @@ static int memory_failure_hugetlb(unsign
 		pr_err("Memory failure: %#lx: just unpoisoned\n", pfn);
 		num_poisoned_pages_dec();
 		unlock_page(head);
-		put_hwpoison_page(head);
+		put_page(head);
 		return 0;
 	}
 
@@ -1336,7 +1336,7 @@ int memory_failure(unsigned long pfn, in
 					pfn);
 			if (TestClearPageHWPoison(p))
 				num_poisoned_pages_dec();
-			put_hwpoison_page(p);
+			put_page(p);
 			return -EBUSY;
 		}
 		unlock_page(p);
@@ -1389,14 +1389,14 @@ int memory_failure(unsigned long pfn, in
 		pr_err("Memory failure: %#lx: just unpoisoned\n", pfn);
 		num_poisoned_pages_dec();
 		unlock_page(p);
-		put_hwpoison_page(p);
+		put_page(p);
 		return 0;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
 			num_poisoned_pages_dec();
 		unlock_page(p);
-		put_hwpoison_page(p);
+		put_page(p);
 		return 0;
 	}
 
@@ -1630,9 +1630,9 @@ int unpoison_memory(unsigned long pfn)
 	}
 	unlock_page(page);
 
-	put_hwpoison_page(page);
+	put_page(page);
 	if (freeit && !(pfn == my_zero_pfn(0) && page_count(p) == 1))
-		put_hwpoison_page(page);
+		put_page(page);
 
 	return 0;
 }
@@ -1690,7 +1690,7 @@ static int get_any_page(struct page *pag
 		/*
 		 * Try to free it.
 		 */
-		put_hwpoison_page(page);
+		put_page(page);
 		shake_page(page, 1);
 
 		/*
@@ -1699,7 +1699,7 @@ static int get_any_page(struct page *pag
 		ret = __get_any_page(page, pfn, 0);
 		if (ret == 1 && !PageLRU(page)) {
 			/* Drop page reference which is from __get_any_page() */
-			put_hwpoison_page(page);
+			put_page(page);
 			pr_info("soft_offline: %#lx: unknown non LRU page type %lx (%pGp)\n",
 				pfn, page->flags, &page->flags);
 			return -EIO;
@@ -1722,7 +1722,7 @@ static int soft_offline_huge_page(struct
 	lock_page(hpage);
 	if (PageHWPoison(hpage)) {
 		unlock_page(hpage);
-		put_hwpoison_page(hpage);
+		put_page(hpage);
 		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
 		return -EBUSY;
 	}
@@ -1733,7 +1733,7 @@ static int soft_offline_huge_page(struct
 	 * get_any_page() and isolate_huge_page() takes a refcount each,
 	 * so need to drop one here.
 	 */
-	put_hwpoison_page(hpage);
+	put_page(hpage);
 	if (!ret) {
 		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
 		return -EBUSY;
@@ -1782,7 +1782,7 @@ static int __soft_offline_page(struct pa
 	wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
-		put_hwpoison_page(page);
+		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
@@ -1797,7 +1797,7 @@ static int __soft_offline_page(struct pa
 	 * would need to fix isolation locking first.
 	 */
 	if (ret == 1) {
-		put_hwpoison_page(page);
+		put_page(page);
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
 		SetPageHWPoison(page);
 		num_poisoned_pages_inc();
@@ -1817,7 +1817,7 @@ static int __soft_offline_page(struct pa
 	 * Drop page reference which is came from get_any_page()
 	 * successful isolate_lru_page() already took another one.
 	 */
-	put_hwpoison_page(page);
+	put_page(page);
 	if (!ret) {
 		LIST_HEAD(pagelist);
 		/*
@@ -1861,7 +1861,7 @@ static int soft_offline_in_use_page(stru
 				pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
 			else
 				pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
-			put_hwpoison_page(page);
+			put_page(page);
 			return -EBUSY;
 		}
 		unlock_page(page);
@@ -1934,7 +1934,7 @@ int soft_offline_page(unsigned long pfn,
 	if (PageHWPoison(page)) {
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		if (flags & MF_COUNT_INCREASED)
-			put_hwpoison_page(page);
+			put_page(page);
 		return -EBUSY;
 	}
 
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-remove-mf_count_increased.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (160 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-kill-put_hwpoison_page.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch " Andrew Morton
                   ` (70 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,hwpoison: remove MF_COUNT_INCREASED
has been added to the -mm tree.  Its filename is
     mmhwpoison-remove-mf_count_increased.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-remove-mf_count_increased.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-remove-mf_count_increased.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm,hwpoison: remove MF_COUNT_INCREASED

Now there's no user of MF_COUNT_INCREASED, so we can safely remove it from
all calling points.

Link: http://lkml.kernel.org/r/20200716123810.25292-9-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h  |    7 +++----
 mm/memory-failure.c |   14 +++-----------
 2 files changed, 6 insertions(+), 15 deletions(-)

--- a/include/linux/mm.h~mmhwpoison-remove-mf_count_increased
+++ a/include/linux/mm.h
@@ -2983,10 +2983,9 @@ void register_page_bootmem_memmap(unsign
 				  unsigned long nr_pages);
 
 enum mf_flags {
-	MF_COUNT_INCREASED = 1 << 0,
-	MF_ACTION_REQUIRED = 1 << 1,
-	MF_MUST_KILL = 1 << 2,
-	MF_SOFT_OFFLINE = 1 << 3,
+	MF_ACTION_REQUIRED = 1 << 0,
+	MF_MUST_KILL = 1 << 1,
+	MF_SOFT_OFFLINE = 1 << 2,
 };
 extern int memory_failure(unsigned long pfn, int flags);
 extern void memory_failure_queue(unsigned long pfn, int flags);
--- a/mm/memory-failure.c~mmhwpoison-remove-mf_count_increased
+++ a/mm/memory-failure.c
@@ -1118,7 +1118,7 @@ static int memory_failure_hugetlb(unsign
 
 	num_poisoned_pages_inc();
 
-	if (!(flags & MF_COUNT_INCREASED) && !get_hwpoison_page(p)) {
+	if (!get_hwpoison_page(p)) {
 		/*
 		 * Check "filter hit" and "race with other subpage."
 		 */
@@ -1314,7 +1314,7 @@ int memory_failure(unsigned long pfn, in
 	 * In fact it's dangerous to directly bump up page count from 0,
 	 * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
 	 */
-	if (!(flags & MF_COUNT_INCREASED) && !get_hwpoison_page(p)) {
+	if (!get_hwpoison_page(p)) {
 		if (is_free_buddy_page(p)) {
 			action_result(pfn, MF_MSG_BUDDY, MF_DELAYED);
 			return 0;
@@ -1354,10 +1354,7 @@ int memory_failure(unsigned long pfn, in
 	shake_page(p, 0);
 	/* shake_page could have turned it free. */
 	if (!PageLRU(p) && is_free_buddy_page(p)) {
-		if (flags & MF_COUNT_INCREASED)
-			action_result(pfn, MF_MSG_BUDDY, MF_DELAYED);
-		else
-			action_result(pfn, MF_MSG_BUDDY_2ND, MF_DELAYED);
+		action_result(pfn, MF_MSG_BUDDY_2ND, MF_DELAYED);
 		return 0;
 	}
 
@@ -1655,9 +1652,6 @@ static int __get_any_page(struct page *p
 {
 	int ret;
 
-	if (flags & MF_COUNT_INCREASED)
-		return 1;
-
 	/*
 	 * When the target page is a free hugepage, just remove it
 	 * from free hugepage list.
@@ -1933,8 +1927,6 @@ int soft_offline_page(unsigned long pfn,
 
 	if (PageHWPoison(page)) {
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
-		if (flags & MF_COUNT_INCREASED)
-			put_page(page);
 		return -EBUSY;
 	}
 
_

Patches currently in -mm which might be from n-horiguchi@ah.jp.nec.com are

mmhwpoison-cleanup-unused-pagehuge-check.patch
mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
mmhwpoison-remove-mf_count_increased.patch
mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (161 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-remove-mf_count_increased.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch " Andrew Morton
                   ` (69 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,hwpoison: remove flag argument from soft offline functions
has been added to the -mm tree.  Its filename is
     mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm,hwpoison: remove flag argument from soft offline functions

The argument @flag no longer affects the behavior of soft_offline_page()
and its variants, so let's remove it from all of them.

Link: http://lkml.kernel.org/r/20200716123810.25292-10-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/base/memory.c |    2 +-
 include/linux/mm.h    |    2 +-
 mm/madvise.c          |    2 +-
 mm/memory-failure.c   |   27 +++++++++++++--------------
 4 files changed, 16 insertions(+), 17 deletions(-)

--- a/drivers/base/memory.c~mmhwpoison-remove-flag-argument-from-soft-offline-functions
+++ a/drivers/base/memory.c
@@ -463,7 +463,7 @@ static ssize_t soft_offline_page_store(s
 	if (kstrtoull(buf, 0, &pfn) < 0)
 		return -EINVAL;
 	pfn >>= PAGE_SHIFT;
-	ret = soft_offline_page(pfn, 0);
+	ret = soft_offline_page(pfn);
 	return ret == 0 ? count : ret;
 }
 
--- a/include/linux/mm.h~mmhwpoison-remove-flag-argument-from-soft-offline-functions
+++ a/include/linux/mm.h
@@ -2995,7 +2995,7 @@ extern int sysctl_memory_failure_early_k
 extern int sysctl_memory_failure_recovery;
 extern void shake_page(struct page *p, int access);
 extern atomic_long_t num_poisoned_pages __read_mostly;
-extern int soft_offline_page(unsigned long pfn, int flags);
+extern int soft_offline_page(unsigned long pfn);
 
 
 /*
--- a/mm/madvise.c~mmhwpoison-remove-flag-argument-from-soft-offline-functions
+++ a/mm/madvise.c
@@ -910,7 +910,7 @@ static int madvise_inject_error(int beha
 			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
 				 pfn, start);
 
-			ret = soft_offline_page(pfn, 0);
+			ret = soft_offline_page(pfn);
 		} else {
 			pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
 				 pfn, start);
--- a/mm/memory-failure.c~mmhwpoison-remove-flag-argument-from-soft-offline-functions
+++ a/mm/memory-failure.c
@@ -1502,7 +1502,7 @@ static void memory_failure_work_func(str
 		if (!gotten)
 			break;
 		if (entry.flags & MF_SOFT_OFFLINE)
-			soft_offline_page(entry.pfn, entry.flags);
+			soft_offline_page(entry.pfn);
 		else
 			memory_failure(entry.pfn, entry.flags);
 	}
@@ -1648,7 +1648,7 @@ static struct page *new_page(struct page
  * that is not free, and 1 for any other page type.
  * For 1 the page is returned with increased page count, otherwise not.
  */
-static int __get_any_page(struct page *p, unsigned long pfn, int flags)
+static int __get_any_page(struct page *p, unsigned long pfn)
 {
 	int ret;
 
@@ -1675,9 +1675,9 @@ static int __get_any_page(struct page *p
 	return ret;
 }
 
-static int get_any_page(struct page *page, unsigned long pfn, int flags)
+static int get_any_page(struct page *page, unsigned long pfn)
 {
-	int ret = __get_any_page(page, pfn, flags);
+	int ret = __get_any_page(page, pfn);
 
 	if (ret == 1 && !PageHuge(page) &&
 	    !PageLRU(page) && !__PageMovable(page)) {
@@ -1690,7 +1690,7 @@ static int get_any_page(struct page *pag
 		/*
 		 * Did it turn free?
 		 */
-		ret = __get_any_page(page, pfn, 0);
+		ret = __get_any_page(page, pfn);
 		if (ret == 1 && !PageLRU(page)) {
 			/* Drop page reference which is from __get_any_page() */
 			put_page(page);
@@ -1702,7 +1702,7 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page, int flags)
+static int soft_offline_huge_page(struct page *page)
 {
 	int ret;
 	unsigned long pfn = page_to_pfn(page);
@@ -1761,7 +1761,7 @@ static int soft_offline_huge_page(struct
 	return ret;
 }
 
-static int __soft_offline_page(struct page *page, int flags)
+static int __soft_offline_page(struct page *page)
 {
 	int ret;
 	unsigned long pfn = page_to_pfn(page);
@@ -1841,7 +1841,7 @@ static int __soft_offline_page(struct pa
 	return ret;
 }
 
-static int soft_offline_in_use_page(struct page *page, int flags)
+static int soft_offline_in_use_page(struct page *page)
 {
 	int ret;
 	int mt;
@@ -1871,9 +1871,9 @@ static int soft_offline_in_use_page(stru
 	mt = get_pageblock_migratetype(page);
 	set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 	if (PageHuge(page))
-		ret = soft_offline_huge_page(page, flags);
+		ret = soft_offline_huge_page(page);
 	else
-		ret = __soft_offline_page(page, flags);
+		ret = __soft_offline_page(page);
 	set_pageblock_migratetype(page, mt);
 	return ret;
 }
@@ -1894,7 +1894,6 @@ static int soft_offline_free_page(struct
 /**
  * soft_offline_page - Soft offline a page.
  * @pfn: pfn to soft-offline
- * @flags: flags. Same as memory_failure().
  *
  * Returns 0 on success, otherwise negated errno.
  *
@@ -1913,7 +1912,7 @@ static int soft_offline_free_page(struct
  * This is not a 100% solution for all memory, but tries to be
  * ``good enough'' for the majority of memory.
  */
-int soft_offline_page(unsigned long pfn, int flags)
+int soft_offline_page(unsigned long pfn)
 {
 	int ret;
 	struct page *page;
@@ -1931,11 +1930,11 @@ int soft_offline_page(unsigned long pfn,
 	}
 
 	get_online_mems();
-	ret = get_any_page(page, pfn, flags);
+	ret = get_any_page(page, pfn);
 	put_online_mems();
 
 	if (ret > 0)
-		ret = soft_offline_in_use_page(page, flags);
+		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
 		ret = soft_offline_free_page(page);
 
_

Patches currently in -mm which might be from n-horiguchi@ah.jp.nec.com are

mmhwpoison-cleanup-unused-pagehuge-check.patch
mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
mmhwpoison-remove-mf_count_increased.patch
mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (162 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-free-pages.patch " Andrew Morton
                   ` (68 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: Unify THP handling for hard and soft offline
has been added to the -mm tree.  Its filename is
     mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: Unify THP handling for hard and soft offline

Place the THP page handling in a helper and use it from both the hard and
soft-offline machinery, so we get rid of some duplicated code.

Link: http://lkml.kernel.org/r/20200716123810.25292-11-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   48 +++++++++++++++++++-----------------------
 1 file changed, 22 insertions(+), 26 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-unify-thp-handling-for-hard-and-soft-offline
+++ a/mm/memory-failure.c
@@ -1103,6 +1103,25 @@ static int identify_page_state(unsigned
 	return page_action(ps, p, pfn);
 }
 
+static int try_to_split_thp_page(struct page *page, const char *msg)
+{
+	lock_page(page);
+	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
+		unsigned long pfn = page_to_pfn(page);
+
+		unlock_page(page);
+		if (!PageAnon(page))
+			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
+		else
+			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
+		put_page(page);
+		return -EBUSY;
+	}
+	unlock_page(page);
+
+	return 0;
+}
+
 static int memory_failure_hugetlb(unsigned long pfn, int flags)
 {
 	struct page *p = pfn_to_page(pfn);
@@ -1325,21 +1344,8 @@ int memory_failure(unsigned long pfn, in
 	}
 
 	if (PageTransHuge(hpage)) {
-		lock_page(p);
-		if (!PageAnon(p) || unlikely(split_huge_page(p))) {
-			unlock_page(p);
-			if (!PageAnon(p))
-				pr_err("Memory failure: %#lx: non anonymous thp\n",
-					pfn);
-			else
-				pr_err("Memory failure: %#lx: thp split failed\n",
-					pfn);
-			if (TestClearPageHWPoison(p))
-				num_poisoned_pages_dec();
-			put_page(p);
+		if (try_to_split_thp_page(p, "Memory Failure") < 0)
 			return -EBUSY;
-		}
-		unlock_page(p);
 		VM_BUG_ON_PAGE(!page_count(p), p);
 	}
 
@@ -1847,19 +1853,9 @@ static int soft_offline_in_use_page(stru
 	int mt;
 	struct page *hpage = compound_head(page);
 
-	if (!PageHuge(page) && PageTransHuge(hpage)) {
-		lock_page(page);
-		if (!PageAnon(page) || unlikely(split_huge_page(page))) {
-			unlock_page(page);
-			if (!PageAnon(page))
-				pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
-			else
-				pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
-			put_page(page);
+	if (!PageHuge(page) && PageTransHuge(hpage))
+		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
-		}
-		unlock_page(page);
-	}
 
 	/*
 	 * Setting MIGRATE_ISOLATE here ensures that the page will be linked
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-rework-soft-offline-for-free-pages.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (163 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-in-use-pages.patch " Andrew Morton
                   ` (67 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: rework soft offline for free pages
has been added to the -mm tree.  Its filename is
     mmhwpoison-rework-soft-offline-for-free-pages.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-rework-soft-offline-for-free-pages.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-rework-soft-offline-for-free-pages.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: rework soft offline for free pages

When trying to soft-offline a free page, we first need to take it off the
buddy allocator.  Once we know it is out of reach, we can safely flag it
as poisoned.

take_page_off_buddy will be used to take a page meant to be poisoned off
the buddy allocator.  take_page_off_buddy calls break_down_buddy_pages,
which splits a higher-order page in case our page belongs to one.

Once the page is under our control, we call page_handle_poison to set it
as poisoned and grab a refcount on it.
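
To make the halving walk concrete, here is a small self-contained
userspace model (illustrative only: split_towards_target and the pfn
printing are inventions of this sketch; the kernel code operates on
struct page and real freelists under zone->lock):

#include <stdio.h>

/*
 * Toy model of break_down_buddy_pages(): split an order-`high` block,
 * at each step handing back the half that does not contain `target`,
 * so that only the target pfn ends up off the (modelled) freelists.
 */
static void split_towards_target(unsigned long base, unsigned long target,
				 int low, int high)
{
	unsigned long size = 1UL << high;

	while (high > low) {
		high--;
		size >>= 1;

		if (target >= base + size) {
			/* target sits in the upper half: free the lower one */
			printf("free pfn %lu at order %d\n", base, high);
			base += size;
		} else {
			/* target sits in the lower half: free the upper one */
			printf("free pfn %lu at order %d\n", base + size, high);
		}
	}
	printf("pfn %lu taken off the buddy\n", base);
}

int main(void)
{
	/* poison pfn 5 inside an order-4 (16-page) buddy block */
	split_towards_target(0, 5, 0, 4);
	return 0;
}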

Link: http://lkml.kernel.org/r/20200716123810.25292-12-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-flags.h |    1 
 mm/memory-failure.c        |   17 ++++++--
 mm/page_alloc.c            |   68 +++++++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 5 deletions(-)

--- a/include/linux/page-flags.h~mmhwpoison-rework-soft-offline-for-free-pages
+++ a/include/linux/page-flags.h
@@ -422,6 +422,7 @@ PAGEFLAG_FALSE(Uncached)
 PAGEFLAG(HWPoison, hwpoison, PF_ANY)
 TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
+extern bool take_page_off_buddy(struct page *page);
 extern bool set_hwpoison_free_buddy_page(struct page *page);
 #else
 PAGEFLAG_FALSE(HWPoison)
--- a/mm/memory-failure.c~mmhwpoison-rework-soft-offline-for-free-pages
+++ a/mm/memory-failure.c
@@ -65,6 +65,13 @@ int sysctl_memory_failure_recovery __rea
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
+static void page_handle_poison(struct page *page)
+{
+	SetPageHWPoison(page);
+	page_ref_inc(page);
+	num_poisoned_pages_inc();
+}
+
 #if defined(CONFIG_HWPOISON_INJECT) || defined(CONFIG_HWPOISON_INJECT_MODULE)
 
 u32 hwpoison_filter_enable = 0;
@@ -1876,14 +1883,14 @@ static int soft_offline_in_use_page(stru
 
 static int soft_offline_free_page(struct page *page)
 {
+	int rc = -EBUSY;
-	int rc = dissolve_free_huge_page(page);
 
-	if (!rc) {
-		if (set_hwpoison_free_buddy_page(page))
-			num_poisoned_pages_inc();
-		else
-			rc = -EBUSY;
+	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
+		page_handle_poison(page);
+		rc = 0;
 	}
+
 	return rc;
 }
 
--- a/mm/page_alloc.c~mmhwpoison-rework-soft-offline-for-free-pages
+++ a/mm/page_alloc.c
@@ -8762,6 +8762,74 @@ bool is_free_buddy_page(struct page *pag
 
 #ifdef CONFIG_MEMORY_FAILURE
 /*
+ * Break down a higher-order page in sub-pages, and keep our target out of
+ * buddy allocator.
+ */
+static void break_down_buddy_pages(struct zone *zone, struct page *page,
+				   struct page *target, int low, int high,
+				   int migratetype)
+{
+	unsigned long size = 1 << high;
+	struct page *current_buddy, *next_page;
+
+	while (high > low) {
+		high--;
+		size >>= 1;
+
+		if (target >= &page[size]) {
+			next_page = page + size;
+			current_buddy = page;
+		} else {
+			next_page = page;
+			current_buddy = page + size;
+		}
+
+		if (set_page_guard(zone, current_buddy, high, migratetype))
+			continue;
+
+		if (current_buddy != target) {
+			add_to_free_list(current_buddy, zone, high, migratetype);
+			set_page_order(current_buddy, high);
+			page = next_page;
+		}
+	}
+}
+
+/*
+ * Take a page that will be marked as poisoned off the buddy allocator.
+ */
+bool take_page_off_buddy(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long pfn = page_to_pfn(page);
+	unsigned long flags;
+	unsigned int order;
+	bool ret = false;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	for (order = 0; order < MAX_ORDER; order++) {
+		struct page *page_head = page - (pfn & ((1 << order) - 1));
+		int buddy_order = page_order(page_head);
+
+		if (PageBuddy(page_head) && buddy_order >= order) {
+			unsigned long pfn_head = page_to_pfn(page_head);
+			int migratetype = get_pfnblock_migratetype(page_head,
+								   pfn_head);
+
+			del_page_from_free_list(page_head, zone, buddy_order);
+			break_down_buddy_pages(zone, page_head, page, 0,
+						buddy_order, migratetype);
+			ret = true;
+			break;
+		}
+		if (page_count(page_head) > 0)
+			break;
+	}
+	spin_unlock_irqrestore(&zone->lock, flags);
+	return ret;
+}
+
+/*
  * Set PG_hwpoison flag if a given page is confirmed to be a free page.  This
  * test is performed under the zone lock to prevent a race against page
  * allocation.
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-rework-soft-offline-for-in-use-pages.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (164 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-free-pages.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch " Andrew Morton
                   ` (66 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: rework soft offline for in-use pages
has been added to the -mm tree.  Its filename is
     mmhwpoison-rework-soft-offline-for-in-use-pages.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-rework-soft-offline-for-in-use-pages.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-rework-soft-offline-for-in-use-pages.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: rework soft offline for in-use pages

This patch changes the way we set and handle in-use poisoned pages.  Until
now, poisoned pages were released to the buddy allocator, trusting that
the checks taking place prior to delivering the page to its end user would
act as a safety net and skip that page.

This has proved to be wrong, as there are pfn walkers out there, like
compaction, that only care that the page is PageBuddy and sits in a
freelist.

Although compaction might not be the only such user, having poisoned pages
in the buddy allocator seems a bad idea, as we should only have free pages
that are ready and meant to be used as such.

Before explaining the approach taken, let us break down the kinds of pages
we can soft offline.

- Anonymous THP (after the split, they end up being 4K pages)
- Hugetlb
- Order-0 pages (that can be either migrated or invalidated)

* Normal pages (order-0 and anon-THP)

  - If they are clean and unmapped page cache pages, we invalidate
    them by means of invalidate_inode_page().
  - If they are mapped/dirty, we do the isolate-and-migrate dance.

  Either way, we do not call put_page directly from those paths.
  Instead, we keep the page and send it to page_handle_poison to perform
  the right handling.

  Among other things, page_handle_poison() sets the HWPoison flag and does
  the last put_page.
  This call to put_page is mainly there to be able to call
  __page_cache_release, since this function is not exported.

  Down the chain, we placed a check for HWPoison page in
  free_pages_prepare, that just skips any poisoned page, so those pages
  do not end up either in a pcplist or in buddy-freelist.

  After that, we set the refcount on the page to 1 and we increment
  the poisoned pages counter.

  We could do as we do for free pages:
  1) wait until the page hits buddy's freelists
  2) take it off
  3) flag it

  The problem is that we could race with an allocation: by the time we
  want to take the page off the buddy, it may already have been allocated,
  so we cannot soft-offline it.
  This is not fatal of course, but it is better if we can close the race,
  and closing it does not require a lot of code.

* Hugetlb pages

  - We isolate-and-migrate them

  There is no magic in here, we just isolate and migrate them.
  A new set of internal functions has been added to flag a hugetlb page as
  poisoned: SetPageHugePoisoned(), PageHugePoisoned() and
  ClearPageHugePoisoned().
  This allows us to flag the page when we migrate it, back in
  move_hugetlb_state().
  Later on we check whether the page is poisoned in __free_huge_page,
  and we bail out in that case before sending the page to e.g. the active
  free list.
  This gives us full control of the page, and we can handle it in
  page_handle_poison().

  In other words, we do not allow migrated hugepages to get back to the
  freelists.

  Since the page now has no user and has been migrated, we can call
  dissolve_free_huge_page, which will end up calling update_and_free_page.
  In update_and_free_page(), we check whether the page is poisoned.
  If it is, we handle it as we handle gigantic pages, i.e. we break the
  page down into order-0 pages and free them one by one.
  Doing so allows free_pages_prepare to skip poisoned pages.

Because of the way we now handle in-use pages, we no longer need the
put-as-isolation-migratetype dance that kept poisoned pages from ending
up in pcplists.
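
As a rough illustration of the invariant described above, a toy
userspace sketch (struct toy_page and the toy_* helpers are hypothetical
stand-ins, not kernel API): once a page carries the poison flag, the
free path refuses to hand it to any freelist.

#include <stdbool.h>
#include <stdio.h>

/* Toy model: poisoned pages never re-enter the allocator. */
struct toy_page {
	bool hwpoison;
	int refcount;	/* pinned at 1 once poisoned */
};

/* returns false when the page must be skipped instead of freed */
static bool toy_free_pages_prepare(struct toy_page *page)
{
	if (page->hwpoison)
		return false;	/* neither pcplist nor buddy freelist */
	return true;
}

int main(void)
{
	struct toy_page p = { .hwpoison = true, .refcount = 1 };

	if (!toy_free_pages_prepare(&p))
		printf("poisoned page kept out of the allocator\n");
	return 0;
}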

Link: http://lkml.kernel.org/r/20200716123810.25292-13-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-flags.h |    5 --
 mm/hugetlb.c               |   60 ++++++++++++++++++++++++++++++-----
 mm/memory-failure.c        |   53 ++++++++++++------------------
 mm/migrate.c               |   11 +-----
 mm/page_alloc.c            |   38 +++++-----------------
 5 files changed, 86 insertions(+), 81 deletions(-)

--- a/include/linux/page-flags.h~mmhwpoison-rework-soft-offline-for-in-use-pages
+++ a/include/linux/page-flags.h
@@ -423,13 +423,8 @@ PAGEFLAG(HWPoison, hwpoison, PF_ANY)
 TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
 extern bool take_page_off_buddy(struct page *page);
-extern bool set_hwpoison_free_buddy_page(struct page *page);
 #else
 PAGEFLAG_FALSE(HWPoison)
-static inline bool set_hwpoison_free_buddy_page(struct page *page)
-{
-	return 0;
-}
 #define __PG_HWPOISON 0
 #endif
 
--- a/mm/hugetlb.c~mmhwpoison-rework-soft-offline-for-in-use-pages
+++ a/mm/hugetlb.c
@@ -29,6 +29,7 @@
 #include <linux/numa.h>
 #include <linux/llist.h>
 #include <linux/cma.h>
+#include <linux/migrate.h>
 
 #include <asm/page.h>
 #include <asm/pgalloc.h>
@@ -1210,9 +1211,26 @@ static int hstate_next_node_to_free(stru
 		((node = hstate_next_node_to_free(hs, mask)) || 1);	\
 		nr_nodes--)
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static void destroy_compound_gigantic_page(struct page *page,
-					unsigned int order)
+static inline bool PageHugePoisoned(struct page *page)
+{
+	if (!PageHuge(page))
+		return false;
+
+	return (unsigned long)page[3].mapping == -1U;
+}
+
+static inline void SetPageHugePoisoned(struct page *page)
+{
+	page[3].mapping = (void *)-1U;
+}
+
+static inline void ClearPageHugePoisoned(struct page *page)
+{
+	page[3].mapping = NULL;
+}
+
+static void destroy_compound_gigantic_page(struct hstate *h, struct page *page,
+					   unsigned int order)
 {
 	int i;
 	int nr_pages = 1 << order;
@@ -1223,14 +1241,19 @@ static void destroy_compound_gigantic_pa
 		atomic_set(compound_pincount_ptr(page), 0);
 
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
+		if (!hstate_is_gigantic(h))
+			p->mapping = NULL;
 		clear_compound_head(p);
 		set_page_refcounted(p);
 	}
 
+	if (PageHugePoisoned(page))
+		ClearPageHugePoisoned(page);
 	set_compound_order(page, 0);
 	__ClearPageHead(page);
 }
 
+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
 static void free_gigantic_page(struct page *page, unsigned int order)
 {
 	/*
@@ -1285,13 +1308,16 @@ static struct page *alloc_gigantic_page(
 	return NULL;
 }
 static inline void free_gigantic_page(struct page *page, unsigned int order) { }
-static inline void destroy_compound_gigantic_page(struct page *page,
-						unsigned int order) { }
+static inline void destroy_compound_gigantic_page(struct hstate *h,
+						  struct page *page,
+						  unsigned int order) { }
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
 {
 	int i;
+	bool poisoned = PageHugePoisoned(page);
+	unsigned int order = huge_page_order(h);
 
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
@@ -1314,11 +1340,21 @@ static void update_and_free_page(struct
 		 * we might block in free_gigantic_page().
 		 */
 		spin_unlock(&hugetlb_lock);
-		destroy_compound_gigantic_page(page, huge_page_order(h));
-		free_gigantic_page(page, huge_page_order(h));
+		destroy_compound_gigantic_page(h, page, order);
+		free_gigantic_page(page, order);
 		spin_lock(&hugetlb_lock);
 	} else {
-		__free_pages(page, huge_page_order(h));
+		if (unlikely(poisoned)) {
+			/*
+			 * If the hugepage is poisoned, do as we do for
+			 * gigantic pages and free the pages as order-0.
+			 * free_pages_prepare will skip over the poisoned ones.
+			 */
+			destroy_compound_gigantic_page(h, page, order);
+			free_contig_range(page_to_pfn(page), 1 << order);
+		} else {
+			__free_pages(page, huge_page_order(h));
+		}
 	}
 }
 
@@ -1428,6 +1464,11 @@ static void __free_huge_page(struct page
 	if (restore_reserve)
 		h->resv_huge_pages++;
 
+	if (PageHugePoisoned(page)) {
+		spin_unlock(&hugetlb_lock);
+		return;
+	}
+
 	if (PageHugeTemporary(page)) {
 		list_del(&page->lru);
 		ClearPageHugeTemporary(page);
@@ -5629,6 +5670,9 @@ void move_hugetlb_state(struct page *old
 	hugetlb_cgroup_migrate(oldpage, newpage);
 	set_page_owner_migrate_reason(newpage, reason);
 
+	if (reason == MR_MEMORY_FAILURE)
+		SetPageHugePoisoned(oldpage);
+
 	/*
 	 * transfer temporary state of the new huge page. This is
 	 * reverse to other transitions because the newpage is going to
--- a/mm/memory-failure.c~mmhwpoison-rework-soft-offline-for-in-use-pages
+++ a/mm/memory-failure.c
@@ -65,9 +65,17 @@ int sysctl_memory_failure_recovery __rea
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
-static void page_handle_poison(struct page *page)
+static void page_handle_poison(struct page *page, bool release, bool set_flag,
+			       bool huge_flag)
 {
-	SetPageHWPoison(page);
+	if (set_flag)
+		SetPageHWPoison(page);
+
+	if (huge_flag)
+		dissolve_free_huge_page(page);
+	else if (release)
+		put_page(page);
+
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
 }
@@ -1717,7 +1725,7 @@ static int get_any_page(struct page *pag
 
 static int soft_offline_huge_page(struct page *page)
 {
-	int ret;
+	int ret = -EBUSY;
 	unsigned long pfn = page_to_pfn(page);
 	struct page *hpage = compound_head(page);
 	LIST_HEAD(pagelist);
@@ -1757,19 +1765,12 @@ static int soft_offline_huge_page(struct
 			ret = -EIO;
 	} else {
 		/*
-		 * We set PG_hwpoison only when the migration source hugepage
-		 * was successfully dissolved, because otherwise hwpoisoned
-		 * hugepage remains on free hugepage list, then userspace will
-		 * find it as SIGBUS by allocation failure. That's not expected
-		 * in soft-offlining.
+		 * At this point the page cannot be in-use since we do not
+		 * let the page go back to the hugetlb freelists.
+		 * In that case we just need to dissolve it.
+		 * page_handle_poison will take care of it.
 		 */
-		ret = dissolve_free_huge_page(page);
-		if (!ret) {
-			if (set_hwpoison_free_buddy_page(page))
-				num_poisoned_pages_inc();
-			else
-				ret = -EBUSY;
-		}
+		page_handle_poison(page, true, true, true);
 	}
 	return ret;
 }
@@ -1804,10 +1805,8 @@ static int __soft_offline_page(struct pa
 	 * would need to fix isolation locking first.
 	 */
 	if (ret == 1) {
-		put_page(page);
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		SetPageHWPoison(page);
-		num_poisoned_pages_inc();
+		page_handle_poison(page, true, true, false);
 		return 0;
 	}
 
@@ -1838,7 +1837,9 @@ static int __soft_offline_page(struct pa
 		list_add(&page->lru, &pagelist);
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
-		if (ret) {
+		if (!ret) {
+			page_handle_poison(page, true, true, false);
+		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
@@ -1857,37 +1858,25 @@ static int __soft_offline_page(struct pa
 static int soft_offline_in_use_page(struct page *page)
 {
 	int ret;
-	int mt;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
 
-	/*
-	 * Setting MIGRATE_ISOLATE here ensures that the page will be linked
-	 * to free list immediately (not via pcplist) when released after
-	 * successful page migration. Otherwise we can't guarantee that the
-	 * page is really free after put_page() returns, so
-	 * set_hwpoison_free_buddy_page() highly likely fails.
-	 */
-	mt = get_pageblock_migratetype(page);
-	set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 	if (PageHuge(page))
 		ret = soft_offline_huge_page(page);
 	else
 		ret = __soft_offline_page(page);
-	set_pageblock_migratetype(page, mt);
 	return ret;
 }
 
 static int soft_offline_free_page(struct page *page)
 {
 	int rc = -EBUSY;
 
 	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page);
+		page_handle_poison(page, false, true, false);
 		rc = 0;
 	}
 
--- a/mm/migrate.c~mmhwpoison-rework-soft-offline-for-in-use-pages
+++ a/mm/migrate.c
@@ -1222,16 +1222,11 @@ out:
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		put_page(page);
-		if (reason == MR_MEMORY_FAILURE) {
+		if (reason != MR_MEMORY_FAILURE)
 			/*
-			 * Set PG_HWPoison on just freed page
-			 * intentionally. Although it's rather weird,
-			 * it's how HWPoison flag works at the moment.
+			 * We handle poisoned pages in page_handle_poison.
 			 */
-			if (set_hwpoison_free_buddy_page(page))
-				num_poisoned_pages_inc();
-		}
+			put_page(page);
 	} else {
 		if (rc != -EAGAIN) {
 			if (likely(!__PageMovable(page))) {
--- a/mm/page_alloc.c~mmhwpoison-rework-soft-offline-for-in-use-pages
+++ a/mm/page_alloc.c
@@ -1175,6 +1175,16 @@ static __always_inline bool free_pages_p
 
 	trace_mm_page_free(page, order);
 
+	if (unlikely(PageHWPoison(page)) && !order) {
+		/*
+		 * Untie memcg state and reset page's owner
+		 */
+		if (memcg_kmem_enabled() && PageKmemcg(page))
+			__memcg_kmem_uncharge_page(page, order);
+		reset_page_owner(page, order);
+		return false;
+	}
+
 	/*
 	 * Check tail pages before head page information is cleared to
 	 * avoid checking PageCompound for order-0 pages.
@@ -8828,32 +8838,4 @@ bool take_page_off_buddy(struct page *pa
 	spin_unlock_irqrestore(&zone->lock, flags);
 	return ret;
 }
-
-/*
- * Set PG_hwpoison flag if a given page is confirmed to be a free page.  This
- * test is performed under the zone lock to prevent a race against page
- * allocation.
- */
-bool set_hwpoison_free_buddy_page(struct page *page)
-{
-	struct zone *zone = page_zone(page);
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long flags;
-	unsigned int order;
-	bool hwpoisoned = false;
-
-	spin_lock_irqsave(&zone->lock, flags);
-	for (order = 0; order < MAX_ORDER; order++) {
-		struct page *page_head = page - (pfn & ((1 << order) - 1));
-
-		if (PageBuddy(page_head) && page_order(page_head) >= order) {
-			if (!TestSetPageHWPoison(page))
-				hwpoisoned = true;
-			break;
-		}
-	}
-	spin_unlock_irqrestore(&zone->lock, flags);
-
-	return hwpoisoned;
-}
 #endif
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (165 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-in-use-pages.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch " Andrew Morton
                   ` (65 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page lets us get rid of
quite a bit of duplicated code, and makes the code much easier to follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Note that we move the put_page() block to the beginning of
page_handle_poison(), together with a drain_all_pages(), in order to make
sure that the target page is actually freed and sent to a free list, so
that take_page_off_buddy() works properly.
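
A toy userspace sketch of that ordering constraint (the toy_* names are
hypothetical; this is not kernel code): take_page_off_buddy() can only
claim a page that has already reached a freelist, so the last reference
must be dropped, and the pcplists drained, first.

#include <stdbool.h>
#include <stdio.h>

struct toy_page {
	int refcount;
	bool on_freelist;
};

static void toy_put_page(struct toy_page *p)
{
	if (--p->refcount == 0)
		p->on_freelist = true;	/* freed and drained to the buddy */
}

static bool toy_take_page_off_buddy(struct toy_page *p)
{
	if (!p->on_freelist)
		return false;		/* still in use: nothing to claim */
	p->on_freelist = false;
	return true;
}

int main(void)
{
	struct toy_page p = { .refcount = 1, .on_freelist = false };

	toy_put_page(&p);	/* drop the last reference first ... */
	/* ... and only then can the page be claimed */
	printf("claimed: %d\n", toy_take_page_off_buddy(&p));
	return 0;
}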

Link: http://lkml.kernel.org/r/20200716123810.25292-14-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |  141 +++++++++++++++---------------------------
 1 file changed, 52 insertions(+), 89 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page
+++ a/mm/memory-failure.c
@@ -1723,62 +1723,50 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page)
+static bool isolate_page(struct page *page, struct list_head *pagelist)
 {
-	int ret = -EBUSY;
-	unsigned long pfn = page_to_pfn(page);
-	struct page *hpage = compound_head(page);
-	LIST_HEAD(pagelist);
+	bool isolated = false;
+	bool lru = PageLRU(page);
 
-	/*
-	 * This double-check of PageHWPoison is to avoid the race with
-	 * memory_failure(). See also comment in __soft_offline_page().
-	 */
-	lock_page(hpage);
-	if (PageHWPoison(hpage)) {
-		unlock_page(hpage);
-		put_page(hpage);
-		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
-		return -EBUSY;
+	if (PageHuge(page)) {
+		isolated = isolate_huge_page(page, pagelist);
+	} else {
+		if (lru)
+			isolated = !isolate_lru_page(page);
+		else
+			isolated = !isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+
+		if (isolated)
+			list_add(&page->lru, pagelist);
 	}
-	unlock_page(hpage);
 
-	ret = isolate_huge_page(hpage, &pagelist);
+	if (isolated && lru)
+		inc_node_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_lru(page));
+
 	/*
-	 * get_any_page() and isolate_huge_page() takes a refcount each,
-	 * so need to drop one here.
+	 * If we succeed in isolating the page, we grabbed another refcount on
+	 * the page, so we can safely drop the one we got from get_any_page().
+	 * If we failed to isolate the page, it means that we cannot go further
+	 * and we will return an error, so drop the reference we got from
+	 * get_any_page() as well.
 	 */
-	put_page(hpage);
-	if (!ret) {
-		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
-		return -EBUSY;
-	}
-
-	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
-	if (ret) {
-		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
-			pfn, ret, page->flags, &page->flags);
-		if (!list_empty(&pagelist))
-			putback_movable_pages(&pagelist);
-		if (ret > 0)
-			ret = -EIO;
-	} else {
-		/*
-		 * At this point the page cannot be in-use since we do not
-		 * let the page to go back to hugetlb freelists.
-		 * In that case we just need to dissolve it.
-		 * page_handle_poison will take care of it.
-		 */
-		page_handle_poison(page, true, true, true);
-	}
-	return ret;
+	put_page(page);
+	return isolated;
 }
 
+/*
+ * __soft_offline_page handles hugetlb-pages and non-hugetlb pages.
+ * If the page is a non-dirty unmapped page-cache page, it is simply invalidated.
+ */
 static int __soft_offline_page(struct page *page)
 {
-	int ret;
+	int ret = 0;
 	unsigned long pfn = page_to_pfn(page);
+	struct page *hpage = compound_head(page);
+	const char *msg_page[] = {"page", "hugepage"};
+	bool huge = PageHuge(page);
+	LIST_HEAD(pagelist);
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1787,88 +1775,63 @@ static int __soft_offline_page(struct pa
 	 * so there's no race between soft_offline_page() and memory_failure().
 	 */
 	lock_page(page);
-	wait_on_page_writeback(page);
+	if (!PageHuge(page))
+		wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	/*
-	 * Try to invalidate first. This should work for
-	 * non dirty unmapped page cache pages.
-	 */
-	ret = invalidate_inode_page(page);
+
+	if (!PageHuge(page))
+		/*
+		 * Try to invalidate first. This should work for
+		 * non dirty unmapped page cache pages.
+		 */
+		ret = invalidate_inode_page(page);
 	unlock_page(page);
+
 	/*
 	 * RED-PEN would be better to keep it isolated here, but we
 	 * would need to fix isolation locking first.
 	 */
 	if (ret == 1) {
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		page_handle_poison(page, true, true, false);
+		page_handle_poison(page, false, true, false);
 		return 0;
 	}
 
-	/*
-	 * Simple invalidation didn't work.
-	 * Try to migrate to a new page instead. migrate.c
-	 * handles a large number of cases for us.
-	 */
-	if (PageLRU(page))
-		ret = isolate_lru_page(page);
-	else
-		ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
-	/*
-	 * Drop page reference which is came from get_any_page()
-	 * successful isolate_lru_page() already took another one.
-	 */
-	put_page(page);
-	if (!ret) {
-		LIST_HEAD(pagelist);
-		/*
-		 * After isolated lru page, the PageLRU will be cleared,
-		 * so use !__PageMovable instead for LRU page's mapping
-		 * cannot have PAGE_MAPPING_MOVABLE.
-		 */
-		if (!__PageMovable(page))
-			inc_node_page_state(page, NR_ISOLATED_ANON +
-						page_is_file_lru(page));
-		list_add(&page->lru, &pagelist);
-		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
+	if (isolate_page(hpage, &pagelist)) {
+		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
-			page_handle_poison(page, true, true, false);
+			page_handle_poison(page, true, true, huge);
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: migration failed %d, type %lx (%pGp)\n",
-				pfn, ret, page->flags, &page->flags);
+			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
+				 pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
 				ret = -EIO;
 		}
 	} else {
-		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
+			 pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
 	}
 	return ret;
 }
 
 static int soft_offline_in_use_page(struct page *page)
 {
-	int ret;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
 
-	if (PageHuge(page))
-		ret = soft_offline_huge_page(page);
-	else
-		ret = __soft_offline_page(page);
-	return ret;
+	return __soft_offline_page(page);
 }
 
 static int soft_offline_free_page(struct page *page)
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (166 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 21:46 ` + mmhwpoison-introduce-mf_msg_unsplit_thp.patch " Andrew Morton
                   ` (64 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: return 0 if the page is already poisoned in soft-offline
has been added to the -mm tree.  Its filename is
     mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: return 0 if the page is already poisoned in soft-offline

Currently, there is an inconsistency when calling soft-offline from
different paths on a page that is already poisoned.

1) madvise:

        madvise_inject_error skips any poisoned page and continues
        the loop.
        If that was the only page to madvise, it returns 0.

2) /sys/devices/system/memory/:

        When calling soft_offline_page_store()->soft_offline_page(),
        we return -EBUSY in case the page is already poisoned.
        This is inconsistent with a) the above example and b)
        memory_failure, where we return 0 if the page was poisoned.

Fix this by dropping the PageHWPoison() check in madvise_inject_error, and
let soft_offline_page return 0 if it finds the page already poisoned.

Please note that this represents a user-API change: the error returned
when calling soft_offline_page_store()->soft_offline_page() will now be
different.
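
A minimal userspace sketch of the unified convention (toy_soft_offline_page
is a hypothetical stand-in for the code paths touched below): an
already-poisoned page now yields 0 on every entry path.

#include <stdio.h>

/* Toy model of the return convention after this patch. */
static int toy_soft_offline_page(int already_poisoned)
{
	if (already_poisoned)
		return 0;	/* used to be -EBUSY on the sysfs path */
	/* ... real offlining work would happen here ... */
	return 0;
}

int main(void)
{
	/* madvise- and sysfs-triggered callers now agree */
	printf("rc = %d\n", toy_soft_offline_page(1));
	return 0;
}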

Link: http://lkml.kernel.org/r/20200716123810.25292-15-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/madvise.c        |    3 ---
 mm/memory-failure.c |    4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

--- a/mm/madvise.c~mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline
+++ a/mm/madvise.c
@@ -903,9 +903,6 @@ static int madvise_inject_error(int beha
 		 */
 		put_page(page);
 
-		if (PageHWPoison(page))
-			continue;
-
 		if (behavior == MADV_SOFT_OFFLINE) {
 			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
 				 pfn, start);
--- a/mm/memory-failure.c~mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline
+++ a/mm/memory-failure.c
@@ -1781,7 +1781,7 @@ static int __soft_offline_page(struct pa
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
-		return -EBUSY;
+		return 0;
 	}
 
 	if (!PageHuge(page))
@@ -1881,7 +1881,7 @@ int soft_offline_page(unsigned long pfn)
 
 	if (PageHWPoison(page)) {
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
-		return -EBUSY;
+		return 0;
 	}
 
 	get_online_mems();
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-introduce-mf_msg_unsplit_thp.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (167 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch " Andrew Morton
@ 2020-07-16 21:46 ` Andrew Morton
  2020-07-16 22:51 ` + linux-sched-mmh-drop-duplicated-words-in-comments.patch " Andrew Morton
                   ` (63 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 21:46 UTC (permalink / raw)
  To: aneesh.kumar, dave.hansen, david, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, naoya.horiguchi, osalvador, osalvador,
	tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: introduce MF_MSG_UNSPLIT_THP
has been added to the -mm tree.  Its filename is
     mmhwpoison-introduce-mf_msg_unsplit_thp.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-introduce-mf_msg_unsplit_thp.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-introduce-mf_msg_unsplit_thp.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: introduce MF_MSG_UNSPLIT_THP

memory_failure() is supposed to call action_result() when it handles
a memory error event, but one case is missing.  Let's add it.

I found that include/ras/ras_event.h leaves some other MF_MSG_* values
undefined, so this patch also adds them.

Link: http://lkml.kernel.org/r/20200716123810.25292-16-osalvador@suse.de
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h      |    1 +
 include/ras/ras_event.h |    3 +++
 mm/memory-failure.c     |    5 ++++-
 3 files changed, 8 insertions(+), 1 deletion(-)

--- a/include/linux/mm.h~mmhwpoison-introduce-mf_msg_unsplit_thp
+++ a/include/linux/mm.h
@@ -3030,6 +3030,7 @@ enum mf_action_page_type {
 	MF_MSG_BUDDY,
 	MF_MSG_BUDDY_2ND,
 	MF_MSG_DAX,
+	MF_MSG_UNSPLIT_THP,
 	MF_MSG_UNKNOWN,
 };
 
--- a/include/ras/ras_event.h~mmhwpoison-introduce-mf_msg_unsplit_thp
+++ a/include/ras/ras_event.h
@@ -361,6 +361,7 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_POISONED_HUGE, "huge page already hardware poisoned" )	\
 	EM ( MF_MSG_HUGE, "huge page" )					\
 	EM ( MF_MSG_FREE_HUGE, "free huge page" )			\
+	EM ( MF_MSG_NON_PMD_HUGE, "non-pmd-sized huge page" )		\
 	EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" )		\
 	EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" )		\
 	EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" )		\
@@ -373,6 +374,8 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_TRUNCATED_LRU, "already truncated LRU page" )	\
 	EM ( MF_MSG_BUDDY, "free buddy page" )				\
 	EM ( MF_MSG_BUDDY_2ND, "free buddy page (2nd try)" )		\
+	EM ( MF_MSG_DAX, "dax page" )					\
+	EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" )			\
 	EMe ( MF_MSG_UNKNOWN, "unknown page" )
 
 /*
--- a/mm/memory-failure.c~mmhwpoison-introduce-mf_msg_unsplit_thp
+++ a/mm/memory-failure.c
@@ -569,6 +569,7 @@ static const char * const action_page_ty
 	[MF_MSG_BUDDY]			= "free buddy page",
 	[MF_MSG_BUDDY_2ND]		= "free buddy page (2nd try)",
 	[MF_MSG_DAX]			= "dax page",
+	[MF_MSG_UNSPLIT_THP]		= "unsplit thp",
 	[MF_MSG_UNKNOWN]		= "unknown page",
 };
 
@@ -1359,8 +1360,10 @@ int memory_failure(unsigned long pfn, in
 	}
 
 	if (PageTransHuge(hpage)) {
-		if (try_to_split_thp_page(p, "Memory Failure") < 0)
+		if (try_to_split_thp_page(p, "Memory Failure") < 0) {
+			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
 			return -EBUSY;
+		}
 		VM_BUG_ON_PAGE(!page_count(p), p);
 	}
 
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-introduce-mf_msg_unsplit_thp.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + linux-sched-mmh-drop-duplicated-words-in-comments.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (168 preceding siblings ...)
  2020-07-16 21:46 ` + mmhwpoison-introduce-mf_msg_unsplit_thp.patch " Andrew Morton
@ 2020-07-16 22:51 ` Andrew Morton
  2020-07-16 22:51 ` + mm-drop-duplicated-words-in-linux-pgtableh.patch " Andrew Morton
                   ` (62 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 22:51 UTC (permalink / raw)
  To: mingo, mm-commits, peterz, rdunlap, sjpark


The patch titled
     Subject: linux/sched/mm.h: drop duplicated words in comments
has been added to the -mm tree.  Its filename is
     linux-sched-mmh-drop-duplicated-words-in-comments.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/linux-sched-mmh-drop-duplicated-words-in-comments.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/linux-sched-mmh-drop-duplicated-words-in-comments.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: linux/sched/mm.h: drop duplicated words in comments

Drop doubled words "to" and "that".

Link: http://lkml.kernel.org/r/927ea8d8-3f6c-9b65-4c2b-63ab4bd59ef1@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sched/mm.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/sched/mm.h~linux-sched-mmh-drop-duplicated-words-in-comments
+++ a/include/linux/sched/mm.h
@@ -23,7 +23,7 @@ extern struct mm_struct *mm_alloc(void);
  * will still exist later on and mmget_not_zero() has to be used before
  * accessing it.
  *
- * This is a preferred way to to pin @mm for a longer/unbounded amount
+ * This is a preferred way to pin @mm for a longer/unbounded amount
  * of time.
  *
  * Use mmdrop() to release the reference acquired by mmgrab().
@@ -236,7 +236,7 @@ static inline unsigned int memalloc_noio
  * @flags: Flags to restore.
  *
  * Ends the implicit GFP_NOIO scope started by memalloc_noio_save function.
- * Always make sure that that the given flags is the return value from the
+ * Always make sure that the given flags is the return value from the
  * pairing memalloc_noio_save call.
  */
 static inline void memalloc_noio_restore(unsigned int flags)
@@ -267,7 +267,7 @@ static inline unsigned int memalloc_nofs
  * @flags: Flags to restore.
  *
  * Ends the implicit GFP_NOFS scope started by memalloc_nofs_save function.
- * Always make sure that that the given flags is the return value from the
+ * Always make sure that the given flags is the return value from the
  * pairing memalloc_nofs_save call.
  */
 static inline void memalloc_nofs_restore(unsigned int flags)
_

Patches currently in -mm which might be from rdunlap@infradead.org are

linux-sched-mmh-drop-duplicated-words-in-comments.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-drop-duplicated-words-in-linux-pgtableh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (169 preceding siblings ...)
  2020-07-16 22:51 ` + linux-sched-mmh-drop-duplicated-words-in-comments.patch " Andrew Morton
@ 2020-07-16 22:51 ` Andrew Morton
  2020-07-16 22:52 ` + mm-drop-duplicated-words-in-linux-mmh.patch " Andrew Morton
                   ` (61 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 22:51 UTC (permalink / raw)
  To: mm-commits, rdunlap, sjpark


The patch titled
     Subject: mm: drop duplicated words in <linux/pgtable.h>
has been added to the -mm tree.  Its filename is
     mm-drop-duplicated-words-in-linux-pgtableh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-drop-duplicated-words-in-linux-pgtableh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-drop-duplicated-words-in-linux-pgtableh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: mm: drop duplicated words in <linux/pgtable.h>

Drop the doubled words "used" and "by".

Drop the repeated acronym "TLB" and make several other fixes around it.
(capital letters, spellos)

Link: http://lkml.kernel.org/r/2bb6e13e-44df-4920-52d9-4d3539945f73@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/pgtable.h |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/include/linux/pgtable.h~mm-drop-duplicated-words-in-linux-pgtableh
+++ a/include/linux/pgtable.h
@@ -838,7 +838,7 @@ static inline void ptep_modify_prot_comm
 
 /*
  * No-op macros that just return the current protection value. Defined here
- * because these macros can be used used even if CONFIG_MMU is not defined.
+ * because these macros can be used even if CONFIG_MMU is not defined.
  */
 #ifndef pgprot_encrypted
 #define pgprot_encrypted(prot)	(prot)
@@ -1231,7 +1231,7 @@ static inline int pmd_trans_unstable(pmd
  * Technically a PTE can be PROTNONE even when not doing NUMA balancing but
  * the only case the kernel cares is for NUMA balancing and is only ever set
  * when the VMA is accessible. For PROT_NONE VMAs, the PTEs are not marked
- * _PAGE_PROTNONE so by by default, implement the helper as "always no". It
+ * _PAGE_PROTNONE so by default, implement the helper as "always no". It
  * is the responsibility of the caller to distinguish between PROT_NONE
  * protections and NUMA hinting fault protections.
  */
@@ -1315,10 +1315,10 @@ static inline int pmd_free_pte_page(pmd_
 /*
  * ARCHes with special requirements for evicting THP backing TLB entries can
  * implement this. Otherwise also, it can help optimize normal TLB flush in
- * THP regime. stock flush_tlb_range() typically has optimization to nuke the
- * entire TLB TLB if flush span is greater than a threshold, which will
- * likely be true for a single huge page. Thus a single thp flush will
- * invalidate the entire TLB which is not desitable.
+ * THP regime. Stock flush_tlb_range() typically has optimization to nuke the
+ * entire TLB if flush span is greater than a threshold, which will
+ * likely be true for a single huge page. Thus a single THP flush will
+ * invalidate the entire TLB which is not desirable.
  * e.g. see arch/arc: flush_pmd_tlb_range
  */
 #define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
_

Patches currently in -mm which might be from rdunlap@infradead.org are

linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-drop-duplicated-words-in-linux-mmh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (170 preceding siblings ...)
  2020-07-16 22:51 ` + mm-drop-duplicated-words-in-linux-pgtableh.patch " Andrew Morton
@ 2020-07-16 22:52 ` Andrew Morton
  2020-07-16 22:52 ` + autofs-fix-doubled-word.patch " Andrew Morton
                   ` (60 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 22:52 UTC (permalink / raw)
  To: mm-commits, rdunlap, sjpark


The patch titled
     Subject: mm: drop duplicated words in <linux/mm.h>
has been added to the -mm tree.  Its filename is
     mm-drop-duplicated-words-in-linux-mmh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-drop-duplicated-words-in-linux-mmh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-drop-duplicated-words-in-linux-mmh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: mm: drop duplicated words in <linux/mm.h>

Drop the doubled words "to" and "the".

Link: http://lkml.kernel.org/r/d9fae8d6-0d60-4d52-9385-3199ee98de49@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/mm.h~mm-drop-duplicated-words-in-linux-mmh
+++ a/include/linux/mm.h
@@ -482,7 +482,7 @@ static inline bool fault_flag_allow_retr
 	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }
 
 /*
- * vm_fault is filled by the the pagefault handler and passed to the vma's
+ * vm_fault is filled by the pagefault handler and passed to the vma's
  * ->fault function. The vma's ->fault is responsible for returning a bitmask
  * of VM_FAULT_xxx flags that give details about how the fault was handled.
  *
@@ -2601,7 +2601,7 @@ extern unsigned long stack_guard_gap;
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
 extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
 
-/* CONFIG_STACK_GROWSUP still needs to to grow downwards at some places */
+/* CONFIG_STACK_GROWSUP still needs to grow downwards at some places */
 extern int expand_downwards(struct vm_area_struct *vma,
 		unsigned long address);
 #if VM_GROWSUP
_

Patches currently in -mm which might be from rdunlap@infradead.org are

linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + autofs-fix-doubled-word.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (171 preceding siblings ...)
  2020-07-16 22:52 ` + mm-drop-duplicated-words-in-linux-mmh.patch " Andrew Morton
@ 2020-07-16 22:52 ` Andrew Morton
  2020-07-16 23:08 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch " Andrew Morton
                   ` (59 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 22:52 UTC (permalink / raw)
  To: mm-commits, raven, rdunlap


The patch titled
     Subject: autofs: fix doubled word
has been added to the -mm tree.  Its filename is
     autofs-fix-doubled-word.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/autofs-fix-doubled-word.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-doubled-word.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: autofs: fix doubled word

Change doubled word "is" to "it is".

Link: http://lkml.kernel.org/r/5a82befd-40f8-8dc0-3498-cbc0436cad9b@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/uapi/linux/auto_dev-ioctl.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/uapi/linux/auto_dev-ioctl.h~autofs-fix-doubled-word
+++ a/include/uapi/linux/auto_dev-ioctl.h
@@ -82,7 +82,7 @@ struct args_ismountpoint {
 /*
  * All the ioctls use this structure.
  * When sending a path size must account for the total length
- * of the chunk of memory otherwise is is the size of the
+ * of the chunk of memory otherwise it is the size of the
  * structure.
  */
 
_

Patches currently in -mm which might be from rdunlap@infradead.org are

linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (172 preceding siblings ...)
  2020-07-16 22:52 ` + autofs-fix-doubled-word.patch " Andrew Morton
@ 2020-07-16 23:08 ` Andrew Morton
  2020-07-16 23:09 ` Andrew Morton
                   ` (58 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:08 UTC (permalink / raw)
  To: guro, hannes, hughd, mhocko, mm-commits


The patch titled
     Subject: mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
has been added to the -mm tree.  Its filename is
     mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings

I've noticed a number of warnings like "vmstat_refresh: nr_free_cma -5" or
"vmstat_refresh: nr_zone_write_pending -11" on our production hosts.  The
number of these warnings was relatively low and stable, so it didn't
look like we were systematically leaking the counters.  The corresponding
vmstat counters also looked sane.

These warnings are generated by the vmstat_refresh() function, which
assumes that atomic zone and numa counters can't go below zero.  However,
on an SMP machine it's not quite right: due to per-cpu caching it can in
theory be as low as -(zone threshold) * NR_CPUs.

For instance, let's say all cma pages are in use and NR_FREE_CMA_PAGES
reached 0.  Then we've reclaimed a small number of cma pages on each CPU
except CPU0, so that most percpu NR_FREE_CMA_PAGES counters are slightly
positive (the atomic counter is still 0).  Then somebody on CPU0 consumes
all these pages.  The number of pages can easily exceed the threshold and
a negative value will be committed to the atomic counter.

To fix the problem and avoid generating false warnings, let's just relax
the condition and warn only if the value is less than minus the maximum
theoretically possible drift value, which is 125 * the number of online
CPUs.  It will still allow us to catch systematic leaks, but will not generate bogus
warnings.
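
A minimal sketch of the resulting check, condensed from the diff below
(illustrative only):

	/*
	 * Per-cpu deltas are bounded by the zone threshold (at most 125),
	 * so the global atomic counter can transiently lag the true value
	 * by up to 125 * num_online_cpus(); e.g. 64 CPUs * 125 = 8000.
	 */
	long max_drift = num_online_cpus() * MAX_THRESHOLD;

	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
		long val = atomic_long_read(&vm_zone_stat[i]);

		/* only a value below the worst-case drift indicates a leak */
		if (val < -max_drift)
			pr_warn("%s: %s %ld\n",
				__func__, zone_stat_name(i), val);
	}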

Link: http://lkml.kernel.org/r/20200714173920.3319063-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/sysctl/vm.rst |    4 +-
 mm/vmstat.c                             |   30 +++++++++++++---------
 2 files changed, 21 insertions(+), 13 deletions(-)

--- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings
+++ a/Documentation/admin-guide/sysctl/vm.rst
@@ -822,8 +822,8 @@ e.g. cat /proc/sys/vm/stat_refresh /proc
 
 As a side-effect, it also checks for negative totals (elsewhere reported
 as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
-(At time of writing, a few stats are known sometimes to be found negative,
-with no ill effects: errors and warnings on these stats are suppressed.)
+(On a SMP machine some stats can temporarily become negative, with no ill
+effects: errors and warnings on these stats are suppressed.)
 
 
 numa_stat
--- a/mm/vmstat.c~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings
+++ a/mm/vmstat.c
@@ -169,6 +169,8 @@ EXPORT_SYMBOL(vm_node_stat);
 
 #ifdef CONFIG_SMP
 
+#define MAX_THRESHOLD 125
+
 int calculate_pressure_threshold(struct zone *zone)
 {
 	int threshold;
@@ -186,11 +188,9 @@ int calculate_pressure_threshold(struct
 	threshold = max(1, (int)(watermark_distance / num_online_cpus()));
 
 	/*
-	 * Maximum threshold is 125
+	 * Threshold is capped by MAX_THRESHOLD
 	 */
-	threshold = min(125, threshold);
-
-	return threshold;
+	return min(MAX_THRESHOLD, threshold);
 }
 
 int calculate_normal_threshold(struct zone *zone)
@@ -610,6 +610,9 @@ void dec_node_page_state(struct page *pa
 }
 EXPORT_SYMBOL(dec_node_page_state);
 #else
+
+#define MAX_THRESHOLD 0
+
 /*
  * Use interrupt disable to serialize counter updates
  */
@@ -1810,7 +1813,7 @@ static void refresh_vm_stats(struct work
 int vmstat_refresh(struct ctl_table *table, int write,
 		   void *buffer, size_t *lenp, loff_t *ppos)
 {
-	long val;
+	long val, max_drift;
 	int err;
 	int i;
 
@@ -1821,17 +1824,22 @@ int vmstat_refresh(struct ctl_table *tab
 	 * pages, immediately after running a test.  /proc/sys/vm/stat_refresh,
 	 * which can equally be echo'ed to or cat'ted from (by root),
 	 * can be used to update the stats just before reading them.
-	 *
-	 * Oh, and since global_zone_page_state() etc. are so careful to hide
-	 * transiently negative values, report an error here if any of
-	 * the stats is negative, so we know to go looking for imbalance.
 	 */
 	err = schedule_on_each_cpu(refresh_vm_stats);
 	if (err)
 		return err;
+
+	/*
+	 * Since global_zone_page_state() etc. are so careful to hide
+	 * transiently negative values, report an error here if any of
+	 * the stats is negative and are less than the maximum drift value,
+	 * so we know to go looking for imbalance.
+	 */
+	max_drift = num_online_cpus() * MAX_THRESHOLD;
+
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_zone_stat[i]);
-		if (val < 0) {
+		if (val < -max_drift) {
 			pr_warn("%s: %s %ld\n",
 				__func__, zone_stat_name(i), val);
 			err = -EINVAL;
@@ -1840,7 +1848,7 @@ int vmstat_refresh(struct ctl_table *tab
 #ifdef CONFIG_NUMA
 	for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_numa_stat[i]);
-		if (val < 0) {
+		if (val < -max_drift) {
 			pr_warn("%s: %s %ld\n",
 				__func__, numa_stat_name(i), val);
 			err = -EINVAL;
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (173 preceding siblings ...)
  2020-07-16 23:08 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch " Andrew Morton
@ 2020-07-16 23:09 ` Andrew Morton
  2020-07-16 23:28 ` + memcg-oom-check-memcg-margin-for-parallel-oom.patch " Andrew Morton
                   ` (57 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:09 UTC (permalink / raw)
  To: guro, hannes, hughd, mhocko, mm-commits, vbabka


The patch titled
     Subject: mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
has been added to the -mm tree.  Its filename is
     mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings

I've noticed a number of warnings like "vmstat_refresh: nr_free_cma -5" or
"vmstat_refresh: nr_zone_write_pending -11" on our production hosts.  The
number of these warnings was relatively low and stable, so it didn't
look like we were systematically leaking the counters.  The corresponding
vmstat counters also looked sane.

These warnings are generated by the vmstat_refresh() function, which
assumes that atomic zone and numa counters can't go below zero.  However,
on an SMP machine it's not quite right: due to per-cpu caching it can in
theory be as low as -(zone threshold) * NR_CPUs.

For instance, let's say all cma pages are in use and NR_FREE_CMA_PAGES
reached 0.  Then we've reclaimed a small number of cma pages on each CPU
except CPU0, so that most percpu NR_FREE_CMA_PAGES counters are slightly
positive (the atomic counter is still 0).  Then somebody on CPU0 consumes
all these pages.  The number of pages can easily exceed the threshold and
a negative value will be committed to the atomic counter.

To fix the problem and avoid generating false warnings, let's just relax
the condition and warn only if the value is less than minus the maximum
theoretically possible drift value, which is 125 * the number of online
CPUs.  It will still allow us to catch systematic leaks, but will not generate bogus
warnings.

Link: http://lkml.kernel.org/r/20200714173920.3319063-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/sysctl/vm.rst |    4 +-
 mm/vmstat.c                             |   30 +++++++++++++---------
 2 files changed, 21 insertions(+), 13 deletions(-)

--- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings
+++ a/Documentation/admin-guide/sysctl/vm.rst
@@ -822,8 +822,8 @@ e.g. cat /proc/sys/vm/stat_refresh /proc
 
 As a side-effect, it also checks for negative totals (elsewhere reported
 as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
-(At time of writing, a few stats are known sometimes to be found negative,
-with no ill effects: errors and warnings on these stats are suppressed.)
+(On a SMP machine some stats can temporarily become negative, with no ill
+effects: errors and warnings on these stats are suppressed.)
 
 
 numa_stat
--- a/mm/vmstat.c~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings
+++ a/mm/vmstat.c
@@ -169,6 +169,8 @@ EXPORT_SYMBOL(vm_node_stat);
 
 #ifdef CONFIG_SMP
 
+#define MAX_THRESHOLD 125
+
 int calculate_pressure_threshold(struct zone *zone)
 {
 	int threshold;
@@ -186,11 +188,9 @@ int calculate_pressure_threshold(struct
 	threshold = max(1, (int)(watermark_distance / num_online_cpus()));
 
 	/*
-	 * Maximum threshold is 125
+	 * Threshold is capped by MAX_THRESHOLD
 	 */
-	threshold = min(125, threshold);
-
-	return threshold;
+	return min(MAX_THRESHOLD, threshold);
 }
 
 int calculate_normal_threshold(struct zone *zone)
@@ -610,6 +610,9 @@ void dec_node_page_state(struct page *pa
 }
 EXPORT_SYMBOL(dec_node_page_state);
 #else
+
+#define MAX_THRESHOLD 0
+
 /*
  * Use interrupt disable to serialize counter updates
  */
@@ -1810,7 +1813,7 @@ static void refresh_vm_stats(struct work
 int vmstat_refresh(struct ctl_table *table, int write,
 		   void *buffer, size_t *lenp, loff_t *ppos)
 {
-	long val;
+	long val, max_drift;
 	int err;
 	int i;
 
@@ -1821,17 +1824,22 @@ int vmstat_refresh(struct ctl_table *tab
 	 * pages, immediately after running a test.  /proc/sys/vm/stat_refresh,
 	 * which can equally be echo'ed to or cat'ted from (by root),
 	 * can be used to update the stats just before reading them.
-	 *
-	 * Oh, and since global_zone_page_state() etc. are so careful to hide
-	 * transiently negative values, report an error here if any of
-	 * the stats is negative, so we know to go looking for imbalance.
 	 */
 	err = schedule_on_each_cpu(refresh_vm_stats);
 	if (err)
 		return err;
+
+	/*
+	 * Since global_zone_page_state() etc. are so careful to hide
+	 * transiently negative values, report an error here if any of
+	 * the stats is negative and are less than the maximum drift value,
+	 * so we know to go looking for imbalance.
+	 */
+	max_drift = num_online_cpus() * MAX_THRESHOLD;
+
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_zone_stat[i]);
-		if (val < 0) {
+		if (val < -max_drift) {
 			pr_warn("%s: %s %ld\n",
 				__func__, zone_stat_name(i), val);
 			err = -EINVAL;
@@ -1840,7 +1848,7 @@ int vmstat_refresh(struct ctl_table *tab
 #ifdef CONFIG_NUMA
 	for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++) {
 		val = atomic_long_read(&vm_numa_stat[i]);
-		if (val < 0) {
+		if (val < -max_drift) {
 			pr_warn("%s: %s %ld\n",
 				__func__, numa_stat_name(i), val);
 			err = -EINVAL;
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + memcg-oom-check-memcg-margin-for-parallel-oom.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (174 preceding siblings ...)
  2020-07-16 23:09 ` Andrew Morton
@ 2020-07-16 23:28 ` Andrew Morton
  2020-07-16 23:32 ` + mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch " Andrew Morton
                   ` (56 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:28 UTC (permalink / raw)
  To: chris, hannes, laoar.shao, mhocko, mhocko, mm-commits,
	penguin-kernel, rientjes


The patch titled
     Subject: memcg, oom: check memcg margin for parallel oom
has been added to the -mm tree.  Its filename is
     memcg-oom-check-memcg-margin-for-parallel-oom.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcg-oom-check-memcg-margin-for-parallel-oom.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcg-oom-check-memcg-margin-for-parallel-oom.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yafang Shao <laoar.shao@gmail.com>
Subject: memcg, oom: check memcg margin for parallel oom

Memcg oom killer invocation is synchronized by the global oom_lock and
tasks are sleeping on the lock while somebody is selecting the victim or
potentially racing with the oom_reaper releasing the victim's memory.
This can result in a pointless oom killer invocation because a waiter
might be racing with the oom_reaper:

        P1              oom_reaper              P2
                        oom_reap_task           mutex_lock(oom_lock)
                                                out_of_memory # no victim because we have one already
                        __oom_reap_task_mm      mutex_unlock(oom_lock)
 mutex_lock(oom_lock)
                        set MMF_OOM_SKIP
 select_bad_process
 # finds a new victim

The page allocator prevents this race by trying to allocate after the
lock has been acquired (in __alloc_pages_may_oom), which acts as a
last-minute check.  Moreover, the page allocator doesn't block on the
oom_lock at all and simply retries the whole reclaim process.

The memcg oom killer should do the same last-minute check.  Call
mem_cgroup_margin() to do that.  A trylock on the oom_lock could be done
as well, but that doesn't seem to be necessary at this stage.
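
Condensed to its core (a sketch of the pattern, not the exact diff),
the last-minute check looks like this:

	if (mutex_lock_killable(&oom_lock))
		return true;

	/*
	 * Somebody may have freed memory while we were sleeping on
	 * oom_lock; if the margin now covers this charge, skip the
	 * oom killer entirely.
	 */
	if (mem_cgroup_margin(memcg) >= (1 << order))
		goto unlock;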

[mhocko@kernel.org: commit log]
Link: http://lkml.kernel.org/r/1594735034-19190-1-git-send-email-laoar.shao@gmail.com
Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/mm/memcontrol.c~memcg-oom-check-memcg-margin-for-parallel-oom
+++ a/mm/memcontrol.c
@@ -1665,15 +1665,21 @@ static bool mem_cgroup_out_of_memory(str
 		.gfp_mask = gfp_mask,
 		.order = order,
 	};
-	bool ret;
+	bool ret = true;
 
 	if (mutex_lock_killable(&oom_lock))
 		return true;
+
+	if (mem_cgroup_margin(memcg) >= (1 << order))
+		goto unlock;
+
 	/*
 	 * A few threads which were not waiting at mutex_lock_killable() can
 	 * fail to bail out. Therefore, check again after holding oom_lock.
 	 */
 	ret = should_force_charge() || out_of_memory(&oc);
+
+unlock:
 	mutex_unlock(&oom_lock);
 	return ret;
 }
_

Patches currently in -mm which might be from laoar.shao@gmail.com are

mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
memcg-oom-check-memcg-margin-for-parallel-oom.patch
mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
mm-oom-make-the-calculation-of-oom-badness-more-accurate-v3.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (175 preceding siblings ...)
  2020-07-16 23:28 ` + memcg-oom-check-memcg-margin-for-parallel-oom.patch " Andrew Morton
@ 2020-07-16 23:32 ` Andrew Morton
  2020-07-16 23:42 ` + ipc-shmc-remove-the-superfluous-break.patch " Andrew Morton
                   ` (55 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:32 UTC (permalink / raw)
  To: cuibixuan, guro, mm-commits, sfr


The patch titled
     Subject: mm/percpu: fix 'defined but not used' warning
has been added to the -mm tree.  Its filename is
     mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Bixuan Cui <cuibixuan@huawei.com>
Subject: mm/percpu: fix 'defined but not used' warning

Gcc reports the following warning when CONFIG_MEMCG_KMEM is not set:

mm/percpu-internal.h:145:29: warning: 'pcpu_chunk_type' defined
but not used [-Wunused-function]
 static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk)
                             ^~~~~~~~~~~~~~~

Add 'inline' to pcpu_chunk_type(), pcpu_is_memcg_chunk() and
pcpu_chunk_list() to clear the warning.
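
A minimal standalone example of the warning and the fix (hypothetical
header and function names, not from the patch):

	/* foo.h */

	/* A plain 'static' definition in a header triggers
	 * -Wunused-function in every translation unit that includes
	 * the header but never calls the function: */
	static int helper_plain(void)
	{
		return 0;
	}

	/* 'static inline' tells gcc the definition may legitimately
	 * go unused, so no warning is emitted: */
	static inline int helper_inline(void)
	{
		return 0;
	}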

Link: http://lkml.kernel.org/r/6d41b939-a741-b521-a7a2-e7296ec16219@huawei.com
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/percpu-internal.h |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/mm/percpu-internal.h~mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2
+++ a/mm/percpu-internal.h
@@ -129,31 +129,31 @@ static inline int pcpu_chunk_map_bits(st
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk)
+static inline enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk)
 {
 	if (chunk->obj_cgroups)
 		return PCPU_CHUNK_MEMCG;
 	return PCPU_CHUNK_ROOT;
 }
 
-static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type)
+static inline bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type)
 {
 	return chunk_type == PCPU_CHUNK_MEMCG;
 }
 
 #else
-static enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk)
+static inline enum pcpu_chunk_type pcpu_chunk_type(struct pcpu_chunk *chunk)
 {
 	return PCPU_CHUNK_ROOT;
 }
 
-static bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type)
+static inline bool pcpu_is_memcg_chunk(enum pcpu_chunk_type chunk_type)
 {
 	return false;
 }
 #endif
 
-static struct list_head *pcpu_chunk_list(enum pcpu_chunk_type chunk_type)
+static inline struct list_head *pcpu_chunk_list(enum pcpu_chunk_type chunk_type)
 {
 	return &pcpu_chunk_lists[pcpu_nr_slots *
 				 pcpu_is_memcg_chunk(chunk_type)];
_

Patches currently in -mm which might be from cuibixuan@huawei.com are

mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + ipc-shmc-remove-the-superfluous-break.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (176 preceding siblings ...)
  2020-07-16 23:32 ` + mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch " Andrew Morton
@ 2020-07-16 23:42 ` Andrew Morton
  2020-07-16 23:52 ` + mm-thp-replace-http-links-with-https-ones-fix.patch " Andrew Morton
                   ` (54 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:42 UTC (permalink / raw)
  To: akpm, liao.pingfang, mm-commits, wang.yi59


The patch titled
     Subject: ipc/shm.c: Remove the superfluous break
has been added to the -mm tree.  Its filename is
     ipc-shmc-remove-the-superfluous-break.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ipc-shmc-remove-the-superfluous-break.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ipc-shmc-remove-the-superfluous-break.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Liao Pingfang <liao.pingfang@zte.com.cn>
Subject: ipc/shm.c: Remove the superfluous break

Remove the superfluous break, as there is a 'return' before it.
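
For illustration, the dead statement in context (simplified from the
diff below):

	switch (cmd) {
	case SHM_LOCK:
	case SHM_UNLOCK:
		return shmctl_do_lock(ns, shmid, cmd);
		break;	/* never reached: the return above already exits */
	default:
		return -EINVAL;
	}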

Link: http://lkml.kernel.org/r/1594724361-11525-1-git-send-email-wang.yi59@zte.com.cn
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/shm.c |    1 -
 1 file changed, 1 deletion(-)

--- a/ipc/shm.c~ipc-shmc-remove-the-superfluous-break
+++ a/ipc/shm.c
@@ -1380,7 +1380,6 @@ static long compat_ksys_shmctl(int shmid
 	case SHM_LOCK:
 	case SHM_UNLOCK:
 		return shmctl_do_lock(ns, shmid, cmd);
-		break;
 	default:
 		return -EINVAL;
 	}
_

Patches currently in -mm which might be from liao.pingfang@zte.com.cn are

ipc-shmc-remove-the-superfluous-break.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-thp-replace-http-links-with-https-ones-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (177 preceding siblings ...)
  2020-07-16 23:42 ` + ipc-shmc-remove-the-superfluous-break.patch " Andrew Morton
@ 2020-07-16 23:52 ` Andrew Morton
  2020-07-17  0:01 ` + scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch " Andrew Morton
                   ` (53 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-16 23:52 UTC (permalink / raw)
  To: akpm, grandmaster, mm-commits, vbabka


The patch titled
     Subject: mm-thp-replace-http-links-with-https-ones-fix
has been added to the -mm tree.  Its filename is
     mm-thp-replace-http-links-with-https-ones-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-replace-http-links-with-https-ones-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-thp-replace-http-links-with-https-ones-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-thp-replace-http-links-with-https-ones-fix

fix amd.com URL, per Vlastimil

Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/huge_memory.c~mm-thp-replace-http-links-with-https-ones-fix
+++ a/mm/huge_memory.c
@@ -2065,8 +2065,8 @@ static void __split_huge_pmd_locked(stru
 	 * free), userland could trigger a small page size TLB miss on the
 	 * small sized TLB while the hugepage TLB entry is still established in
 	 * the huge TLB. Some CPU doesn't like that.
-	 * See https://support.amd.com/us/Processor_TechDocs/41322.pdf, Erratum
-	 * 383 on page 93. Intel should be safe but is also warns that it's
+	 * See http://support.amd.com/TechDocs/41322_10h_Rev_Gd.pdf, Erratum
+	 * 383 on page 105. Intel should be safe but is also warns that it's
 	 * only safe if the permission and cache attributes of the two entries
 	 * loaded in the two TLB is identical (which should be the case here).
 	 * But it is generally safer to never allow small and huge TLB entries
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (178 preceding siblings ...)
  2020-07-16 23:52 ` + mm-thp-replace-http-links-with-https-ones-fix.patch " Andrew Morton
@ 2020-07-17  0:01 ` Andrew Morton
  2020-07-17  1:53 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
                   ` (52 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17  0:01 UTC (permalink / raw)
  To: colin.king, mm-commits


The patch titled
     Subject: scripts/spelling.txt: add more spellings to spelling.txt
has been added to the -mm tree.  Its filename is
     scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Colin Ian King <colin.king@canonical.com>
Subject: scripts/spelling.txt: add more spellings to spelling.txt

Here are some of the more common spelling mistakes and typos that I've
found and fixed up in the kernel since April 2020.

Link: http://lkml.kernel.org/r/20200714092837.173796-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

--- a/scripts/spelling.txt~scripts-spellingtxt-add-more-spellings-to-spellingtxt
+++ a/scripts/spelling.txt
@@ -149,6 +149,7 @@ arbitary||arbitrary
 architechture||architecture
 arguement||argument
 arguements||arguments
+arithmatic||arithmetic
 aritmetic||arithmetic
 arne't||aren't
 arraival||arrival
@@ -454,6 +455,7 @@ destorys||destroys
 destroied||destroyed
 detabase||database
 deteced||detected
+detectt||detect
 develope||develop
 developement||development
 developped||developed
@@ -545,6 +547,7 @@ entires||entries
 entites||entities
 entrys||entries
 enocded||encoded
+enought||enough
 enterily||entirely
 enviroiment||environment
 enviroment||environment
@@ -556,11 +559,14 @@ equivelant||equivalent
 equivilant||equivalent
 eror||error
 errorr||error
+errror||error
 estbalishment||establishment
 etsablishment||establishment
 etsbalishment||establishment
+evalution||evaluation
 excecutable||executable
 exceded||exceeded
+exceds||exceeds
 exceeed||exceed
 excellant||excellent
 execeeded||exceeded
@@ -583,6 +589,7 @@ explictly||explicitly
 expresion||expression
 exprimental||experimental
 extened||extended
+exteneded||extended
 extensability||extensibility
 extention||extension
 extenstion||extension
@@ -610,10 +617,12 @@ feautures||features
 fetaure||feature
 fetaures||features
 fileystem||filesystem
+fimrware||firmware
 fimware||firmware
 firmare||firmware
 firmaware||firmware
 firware||firmware
+firwmare||firmware
 finanize||finalize
 findn||find
 finilizes||finalizes
@@ -661,6 +670,7 @@ globel||global
 grabing||grabbing
 grahical||graphical
 grahpical||graphical
+granularty||granularity
 grapic||graphic
 grranted||granted
 guage||gauge
@@ -906,6 +916,7 @@ miximum||maximum
 mmnemonic||mnemonic
 mnay||many
 modfiy||modify
+modifer||modifier
 modulues||modules
 momery||memory
 memomry||memory
@@ -915,6 +926,7 @@ monochromo||monochrome
 monocrome||monochrome
 mopdule||module
 mroe||more
+multipler||multiplier
 mulitplied||multiplied
 multidimensionnal||multidimensional
 multipe||multiple
@@ -952,6 +964,7 @@ occassionally||occasionally
 occationally||occasionally
 occurance||occurrence
 occurances||occurrences
+occurd||occurred
 occured||occurred
 occurence||occurrence
 occure||occurred
@@ -1058,6 +1071,7 @@ precission||precision
 preemptable||preemptible
 prefered||preferred
 prefferably||preferably
+prefitler||prefilter
 premption||preemption
 prepaired||prepared
 preperation||preparation
@@ -1101,6 +1115,7 @@ pronunce||pronounce
 propery||property
 propigate||propagate
 propigation||propagation
+propogation||propagation
 propogate||propagate
 prosess||process
 protable||portable
@@ -1316,6 +1331,7 @@ sturcture||structure
 subdirectoires||subdirectories
 suble||subtle
 substract||subtract
+submited||submitted
 submition||submission
 suceed||succeed
 succesfully||successfully
@@ -1324,6 +1340,7 @@ successed||succeeded
 successfull||successful
 successfuly||successfully
 sucessfully||successfully
+sucessful||successful
 sucess||success
 superflous||superfluous
 superseeded||superseded
@@ -1409,6 +1426,7 @@ transormed||transformed
 trasfer||transfer
 trasmission||transmission
 treshold||threshold
+triggerd||triggered
 trigerred||triggered
 trigerring||triggering
 trun||turn
@@ -1421,6 +1439,7 @@ uknown||unknown
 usccess||success
 usupported||unsupported
 uncommited||uncommitted
+uncompatible||incompatible
 unconditionaly||unconditionally
 undeflow||underflow
 underun||underrun
_

Patches currently in -mm which might be from colin.king@canonical.com are

scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch
fs-ufs-avoid-potential-u32-multiplication-overflow.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (179 preceding siblings ...)
  2020-07-17  0:01 ` + scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch " Andrew Morton
@ 2020-07-17  1:53 ` Andrew Morton
  2020-07-17  4:06 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch " Andrew Morton
                   ` (51 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17  1:53 UTC (permalink / raw)
  To: adrien+dev, akpm, bernd.amend, drosen, groeck, hch, mm-commits,
	phillip, pliard


The patch titled
     Subject: revert "squashfs: migrate from ll_rw_block usage to BIO"
has been added to the -mm tree.  Its filename is
     revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: revert "squashfs: migrate from ll_rw_block usage to BIO"

Revert 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
due to a regression reported by Bernd Amend.

Link: http://lkml.kernel.org/r/CAF31+H5ZB7zn73obrc5svLzgfsTnyYe5TKvr7-6atUOqrRY+2w@mail.gmail.com
Reported-by: Bernd Amend <bernd.amend@gmail.com>
Cc: Philippe Liard <pliard@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/block.c                     |  273 ++++++++++------------
 fs/squashfs/decompressor.h              |    5 
 fs/squashfs/decompressor_multi.c        |    9 
 fs/squashfs/decompressor_multi_percpu.c |    6 
 fs/squashfs/decompressor_single.c       |    9 
 fs/squashfs/lz4_wrapper.c               |   17 -
 fs/squashfs/lzo_wrapper.c               |   17 -
 fs/squashfs/squashfs.h                  |    4 
 fs/squashfs/xz_wrapper.c                |   51 +---
 fs/squashfs/zlib_wrapper.c              |   63 ++---
 fs/squashfs/zstd_wrapper.c              |   62 ++--
 11 files changed, 237 insertions(+), 279 deletions(-)

--- a/fs/squashfs/block.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/block.c
@@ -13,7 +13,6 @@
  * datablocks and metadata blocks.
  */
 
-#include <linux/blkdev.h>
 #include <linux/fs.h>
 #include <linux/vfs.h>
 #include <linux/slab.h>
@@ -28,104 +27,45 @@
 #include "page_actor.h"
 
 /*
- * Returns the amount of bytes copied to the page actor.
+ * Read the metadata block length, this is stored in the first two
+ * bytes of the metadata block.
  */
-static int copy_bio_to_actor(struct bio *bio,
-			     struct squashfs_page_actor *actor,
-			     int offset, int req_length)
-{
-	void *actor_addr = squashfs_first_page(actor);
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int copied_bytes = 0;
-	int actor_offset = 0;
-
-	if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all)))
-		return 0;
-
-	while (copied_bytes < req_length) {
-		int bytes_to_copy = min_t(int, bvec->bv_len - offset,
-					  PAGE_SIZE - actor_offset);
-
-		bytes_to_copy = min_t(int, bytes_to_copy,
-				      req_length - copied_bytes);
-		memcpy(actor_addr + actor_offset,
-		       page_address(bvec->bv_page) + bvec->bv_offset + offset,
-		       bytes_to_copy);
-
-		actor_offset += bytes_to_copy;
-		copied_bytes += bytes_to_copy;
-		offset += bytes_to_copy;
-
-		if (actor_offset >= PAGE_SIZE) {
-			actor_addr = squashfs_next_page(actor);
-			if (!actor_addr)
-				break;
-			actor_offset = 0;
-		}
-		if (offset >= bvec->bv_len) {
-			if (!bio_next_segment(bio, &iter_all))
-				break;
-			offset = 0;
-		}
-	}
-	squashfs_finish_page(actor);
-	return copied_bytes;
-}
-
-static int squashfs_bio_read(struct super_block *sb, u64 index, int length,
-			     struct bio **biop, int *block_offset)
+static struct buffer_head *get_block_length(struct super_block *sb,
+			u64 *cur_index, int *offset, int *length)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
-	const u64 read_start = round_down(index, msblk->devblksize);
-	const sector_t block = read_start >> msblk->devblksize_log2;
-	const u64 read_end = round_up(index + length, msblk->devblksize);
-	const sector_t block_end = read_end >> msblk->devblksize_log2;
-	int offset = read_start - round_down(index, PAGE_SIZE);
-	int total_len = (block_end - block) << msblk->devblksize_log2;
-	const int page_count = DIV_ROUND_UP(total_len + offset, PAGE_SIZE);
-	int error, i;
-	struct bio *bio;
-
-	bio = bio_alloc(GFP_NOIO, page_count);
-	if (!bio)
-		return -ENOMEM;
+	struct buffer_head *bh;
 
-	bio_set_dev(bio, sb->s_bdev);
-	bio->bi_opf = READ;
-	bio->bi_iter.bi_sector = block * (msblk->devblksize >> SECTOR_SHIFT);
-
-	for (i = 0; i < page_count; ++i) {
-		unsigned int len =
-			min_t(unsigned int, PAGE_SIZE - offset, total_len);
-		struct page *page = alloc_page(GFP_NOIO);
-
-		if (!page) {
-			error = -ENOMEM;
-			goto out_free_bio;
-		}
-		if (!bio_add_page(bio, page, len, offset)) {
-			error = -EIO;
-			goto out_free_bio;
+	bh = sb_bread(sb, *cur_index);
+	if (bh == NULL)
+		return NULL;
+
+	if (msblk->devblksize - *offset == 1) {
+		*length = (unsigned char) bh->b_data[*offset];
+		put_bh(bh);
+		bh = sb_bread(sb, ++(*cur_index));
+		if (bh == NULL)
+			return NULL;
+		*length |= (unsigned char) bh->b_data[0] << 8;
+		*offset = 1;
+	} else {
+		*length = (unsigned char) bh->b_data[*offset] |
+			(unsigned char) bh->b_data[*offset + 1] << 8;
+		*offset += 2;
+
+		if (*offset == msblk->devblksize) {
+			put_bh(bh);
+			bh = sb_bread(sb, ++(*cur_index));
+			if (bh == NULL)
+				return NULL;
+			*offset = 0;
 		}
-		offset = 0;
-		total_len -= len;
 	}
 
-	error = submit_bio_wait(bio);
-	if (error)
-		goto out_free_bio;
-
-	*biop = bio;
-	*block_offset = index & ((1 << msblk->devblksize_log2) - 1);
-	return 0;
-
-out_free_bio:
-	bio_free_pages(bio);
-	bio_put(bio);
-	return error;
+	return bh;
 }
 
+
 /*
  * Read and decompress a metadata block or datablock.  Length is non-zero
  * if a datablock is being read (the size is stored elsewhere in the
@@ -136,88 +76,129 @@ out_free_bio:
  * algorithms).
  */
 int squashfs_read_data(struct super_block *sb, u64 index, int length,
-		       u64 *next_index, struct squashfs_page_actor *output)
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
-	struct bio *bio = NULL;
-	int compressed;
-	int res;
-	int offset;
+	struct buffer_head **bh;
+	int offset = index & ((1 << msblk->devblksize_log2) - 1);
+	u64 cur_index = index >> msblk->devblksize_log2;
+	int bytes, compressed, b = 0, k = 0, avail, i;
+
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
+		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
+	if (bh == NULL)
+		return -ENOMEM;
 
 	if (length) {
 		/*
 		 * Datablock.
 		 */
+		bytes = -offset;
 		compressed = SQUASHFS_COMPRESSED_BLOCK(length);
 		length = SQUASHFS_COMPRESSED_SIZE_BLOCK(length);
+		if (next_index)
+			*next_index = index + length;
+
 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
 			index, compressed ? "" : "un", length, output->length);
+
+		if (length < 0 || length > output->length ||
+				(index + length) > msblk->bytes_used)
+			goto read_failure;
+
+		for (b = 0; bytes < length; b++, cur_index++) {
+			bh[b] = sb_getblk(sb, cur_index);
+			if (bh[b] == NULL)
+				goto block_release;
+			bytes += msblk->devblksize;
+		}
+		ll_rw_block(REQ_OP_READ, 0, b, bh);
 	} else {
 		/*
 		 * Metadata block.
 		 */
-		const u8 *data;
-		struct bvec_iter_all iter_all = {};
-		struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-
-		if (index + 2 > msblk->bytes_used) {
-			res = -EIO;
-			goto out;
-		}
-		res = squashfs_bio_read(sb, index, 2, &bio, &offset);
-		if (res)
-			goto out;
-
-		if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
-			res = -EIO;
-			goto out_free_bio;
-		}
-		/* Extract the length of the metadata block */
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		length = data[offset];
-		if (offset <= bvec->bv_len - 1) {
-			length |= data[offset + 1] << 8;
-		} else {
-			if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
-				res = -EIO;
-				goto out_free_bio;
-			}
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
-			length |= data[0] << 8;
-		}
-		bio_free_pages(bio);
-		bio_put(bio);
+		if ((index + 2) > msblk->bytes_used)
+			goto read_failure;
 
+		bh[0] = get_block_length(sb, &cur_index, &offset, &length);
+		if (bh[0] == NULL)
+			goto read_failure;
+		b = 1;
+
+		bytes = msblk->devblksize - offset;
 		compressed = SQUASHFS_COMPRESSED(length);
 		length = SQUASHFS_COMPRESSED_SIZE(length);
-		index += 2;
+		if (next_index)
+			*next_index = index + length + 2;
 
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
-		      compressed ? "" : "un", length);
+				compressed ? "" : "un", length);
+
+		if (length < 0 || length > output->length ||
+					(index + length) > msblk->bytes_used)
+			goto block_release;
+
+		for (; bytes < length; b++) {
+			bh[b] = sb_getblk(sb, ++cur_index);
+			if (bh[b] == NULL)
+				goto block_release;
+			bytes += msblk->devblksize;
+		}
+		ll_rw_block(REQ_OP_READ, 0, b - 1, bh + 1);
 	}
-	if (next_index)
-		*next_index = index + length;
 
-	res = squashfs_bio_read(sb, index, length, &bio, &offset);
-	if (res)
-		goto out;
+	for (i = 0; i < b; i++) {
+		wait_on_buffer(bh[i]);
+		if (!buffer_uptodate(bh[i]))
+			goto block_release;
+	}
 
 	if (compressed) {
-		if (!msblk->stream) {
-			res = -EIO;
-			goto out_free_bio;
-		}
-		res = squashfs_decompress(msblk, bio, offset, length, output);
+		if (!msblk->stream)
+			goto read_failure;
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
+		if (length < 0)
+			goto read_failure;
 	} else {
-		res = copy_bio_to_actor(bio, output, offset, length);
+		/*
+		 * Block is uncompressed.
+		 */
+		int in, pg_offset = 0;
+		void *data = squashfs_first_page(output);
+
+		for (bytes = length; k < b; k++) {
+			in = min(bytes, msblk->devblksize - offset);
+			bytes -= in;
+			while (in) {
+				if (pg_offset == PAGE_SIZE) {
+					data = squashfs_next_page(output);
+					pg_offset = 0;
+				}
+				avail = min_t(int, in, PAGE_SIZE -
+						pg_offset);
+				memcpy(data + pg_offset, bh[k]->b_data + offset,
+						avail);
+				in -= avail;
+				pg_offset += avail;
+				offset += avail;
+			}
+			offset = 0;
+			put_bh(bh[k]);
+		}
+		squashfs_finish_page(output);
 	}
 
-out_free_bio:
-	bio_free_pages(bio);
-	bio_put(bio);
-out:
-	if (res < 0)
-		ERROR("Failed to read block 0x%llx: %d\n", index, res);
+	kfree(bh);
+	return length;
 
-	return res;
+block_release:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+read_failure:
+	ERROR("squashfs_read_data failed to read block 0x%llx\n",
+					(unsigned long long) index);
+	kfree(bh);
+	return -EIO;
 }
--- a/fs/squashfs/decompressor.h~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor.h
@@ -10,14 +10,13 @@
  * decompressor.h
  */
 
-#include <linux/bio.h>
-
 struct squashfs_decompressor {
 	void	*(*init)(struct squashfs_sb_info *, void *);
 	void	*(*comp_opts)(struct squashfs_sb_info *, void *, int);
 	void	(*free)(void *);
 	int	(*decompress)(struct squashfs_sb_info *, void *,
-		struct bio *, int, int, struct squashfs_page_actor *);
+		struct buffer_head **, int, int, int,
+		struct squashfs_page_actor *);
 	int	id;
 	char	*name;
 	int	supported;
--- a/fs/squashfs/decompressor_multi.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_multi.c
@@ -6,7 +6,7 @@
 #include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/sched.h>
 #include <linux/wait.h>
 #include <linux/cpumask.h>
@@ -180,15 +180,14 @@ wait:
 }
 
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-			int offset, int length,
-			struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	int res;
 	struct squashfs_stream *stream = msblk->stream;
 	struct decomp_stream *decomp_stream = get_decomp_stream(msblk, stream);
 	res = msblk->decompressor->decompress(msblk, decomp_stream->stream,
-		bio, offset, length, output);
+		bh, b, offset, length, output);
 	put_decomp_stream(decomp_stream, stream);
 	if (res < 0)
 		ERROR("%s decompression failed, data probably corrupt\n",
--- a/fs/squashfs/decompressor_multi_percpu.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_multi_percpu.c
@@ -75,8 +75,8 @@ void squashfs_decompressor_destroy(struc
 	}
 }
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-	int offset, int length, struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	struct squashfs_stream *stream;
 	int res;
@@ -84,7 +84,7 @@ int squashfs_decompress(struct squashfs_
 	local_lock(&msblk->stream->lock);
 	stream = this_cpu_ptr(msblk->stream);
 
-	res = msblk->decompressor->decompress(msblk, stream->stream, bio,
+	res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
 					      offset, length, output);
 
 	local_unlock(&msblk->stream->lock);
--- a/fs/squashfs/decompressor_single.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_single.c
@@ -7,7 +7,7 @@
 #include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
@@ -59,15 +59,14 @@ void squashfs_decompressor_destroy(struc
 	}
 }
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-			int offset, int length,
-			struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	int res;
 	struct squashfs_stream *stream = msblk->stream;
 
 	mutex_lock(&stream->mutex);
-	res = msblk->decompressor->decompress(msblk, stream->stream, bio,
+	res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
 		offset, length, output);
 	mutex_unlock(&stream->mutex);
 
--- a/fs/squashfs/lz4_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/lz4_wrapper.c
@@ -4,7 +4,7 @@
  * Phillip Lougher <phillip@squashfs.org.uk>
  */
 
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -89,23 +89,20 @@ static void lz4_free(void *strm)
 
 
 static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 	struct squashfs_lz4 *stream = strm;
 	void *buff = stream->input, *data;
-	int bytes = length, res;
+	int avail, i, bytes = length, res;
 
-	while (bio_next_segment(bio, &iter_all)) {
-		int avail = min(bytes, ((int)bvec->bv_len) - offset);
-
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		memcpy(buff, data + offset, avail);
+	for (i = 0; i < b; i++) {
+		avail = min(bytes, msblk->devblksize - offset);
+		memcpy(buff, bh[i]->b_data + offset, avail);
 		buff += avail;
 		bytes -= avail;
 		offset = 0;
+		put_bh(bh[i]);
 	}
 
 	res = LZ4_decompress_safe(stream->input, stream->output,
--- a/fs/squashfs/lzo_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/lzo_wrapper.c
@@ -9,7 +9,7 @@
  */
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/lzo.h>
@@ -63,24 +63,21 @@ static void lzo_free(void *strm)
 
 
 static int lzo_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 	struct squashfs_lzo *stream = strm;
 	void *buff = stream->input, *data;
-	int bytes = length, res;
+	int avail, i, bytes = length, res;
 	size_t out_len = output->length;
 
-	while (bio_next_segment(bio, &iter_all)) {
-		int avail = min(bytes, ((int)bvec->bv_len) - offset);
-
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		memcpy(buff, data + offset, avail);
+	for (i = 0; i < b; i++) {
+		avail = min(bytes, msblk->devblksize - offset);
+		memcpy(buff, bh[i]->b_data + offset, avail);
 		buff += avail;
 		bytes -= avail;
 		offset = 0;
+		put_bh(bh[i]);
 	}
 
 	res = lzo1x_decompress_safe(stream->input, (size_t)length,
--- a/fs/squashfs/squashfs.h~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/squashfs.h
@@ -40,8 +40,8 @@ extern void *squashfs_decompressor_setup
 /* decompressor_xxx.c */
 extern void *squashfs_decompressor_create(struct squashfs_sb_info *, void *);
 extern void squashfs_decompressor_destroy(struct squashfs_sb_info *);
-extern int squashfs_decompress(struct squashfs_sb_info *, struct bio *,
-				int, int, struct squashfs_page_actor *);
+extern int squashfs_decompress(struct squashfs_sb_info *, struct buffer_head **,
+	int, int, int, struct squashfs_page_actor *);
 extern int squashfs_max_decompressors(void);
 
 /* export.c */
--- a/fs/squashfs/xz_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/xz_wrapper.c
@@ -10,7 +10,7 @@
 
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/xz.h>
 #include <linux/bitops.h>
@@ -117,12 +117,11 @@ static void squashfs_xz_free(void *strm)
 
 
 static int squashfs_xz_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int total = 0, error = 0;
+	enum xz_ret xz_err;
+	int avail, total = 0, k = 0;
 	struct squashfs_xz *stream = strm;
 
 	xz_dec_reset(stream->state);
@@ -132,23 +131,11 @@ static int squashfs_xz_uncompress(struct
 	stream->buf.out_size = PAGE_SIZE;
 	stream->buf.out = squashfs_first_page(output);
 
-	for (;;) {
-		enum xz_ret xz_err;
-
-		if (stream->buf.in_pos == stream->buf.in_size) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				/* XZ_STREAM_END must be reached. */
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
+	do {
+		if (stream->buf.in_pos == stream->buf.in_size && k < b) {
+			avail = min(length, msblk->devblksize - offset);
 			length -= avail;
-			stream->buf.in = data + offset;
+			stream->buf.in = bh[k]->b_data + offset;
 			stream->buf.in_size = avail;
 			stream->buf.in_pos = 0;
 			offset = 0;
@@ -163,17 +150,23 @@ static int squashfs_xz_uncompress(struct
 		}
 
 		xz_err = xz_dec_run(stream->state, &stream->buf);
-		if (xz_err == XZ_STREAM_END)
-			break;
-		if (xz_err != XZ_OK) {
-			error = -EIO;
-			break;
-		}
-	}
+
+		if (stream->buf.in_pos == stream->buf.in_size && k < b)
+			put_bh(bh[k++]);
+	} while (xz_err == XZ_OK);
 
 	squashfs_finish_page(output);
 
-	return error ? error : total + stream->buf.out_pos;
+	if (xz_err != XZ_STREAM_END || k < b)
+		goto out;
+
+	return total + stream->buf.out_pos;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_xz_comp_ops = {
--- a/fs/squashfs/zlib_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/zlib_wrapper.c
@@ -10,7 +10,7 @@
 
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/zlib.h>
 #include <linux/vmalloc.h>
@@ -50,35 +50,21 @@ static void zlib_free(void *strm)
 
 
 static int zlib_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int zlib_init = 0, error = 0;
+	int zlib_err, zlib_init = 0, k = 0;
 	z_stream *stream = strm;
 
 	stream->avail_out = PAGE_SIZE;
 	stream->next_out = squashfs_first_page(output);
 	stream->avail_in = 0;
 
-	for (;;) {
-		int zlib_err;
-
-		if (stream->avail_in == 0) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				/* Z_STREAM_END must be reached. */
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
+	do {
+		if (stream->avail_in == 0 && k < b) {
+			int avail = min(length, msblk->devblksize - offset);
 			length -= avail;
-			stream->next_in = data + offset;
+			stream->next_in = bh[k]->b_data + offset;
 			stream->avail_in = avail;
 			offset = 0;
 		}
@@ -92,28 +78,37 @@ static int zlib_uncompress(struct squash
 		if (!zlib_init) {
 			zlib_err = zlib_inflateInit(stream);
 			if (zlib_err != Z_OK) {
-				error = -EIO;
-				break;
+				squashfs_finish_page(output);
+				goto out;
 			}
 			zlib_init = 1;
 		}
 
 		zlib_err = zlib_inflate(stream, Z_SYNC_FLUSH);
-		if (zlib_err == Z_STREAM_END)
-			break;
-		if (zlib_err != Z_OK) {
-			error = -EIO;
-			break;
-		}
-	}
+
+		if (stream->avail_in == 0 && k < b)
+			put_bh(bh[k++]);
+	} while (zlib_err == Z_OK);
 
 	squashfs_finish_page(output);
 
-	if (!error)
-		if (zlib_inflateEnd(stream) != Z_OK)
-			error = -EIO;
+	if (zlib_err != Z_STREAM_END)
+		goto out;
+
+	zlib_err = zlib_inflateEnd(stream);
+	if (zlib_err != Z_OK)
+		goto out;
+
+	if (k < b)
+		goto out;
+
+	return stream->total_out;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
 
-	return error ? error : stream->total_out;
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_zlib_comp_ops = {
--- a/fs/squashfs/zstd_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/zstd_wrapper.c
@@ -9,7 +9,7 @@
  */
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/zstd.h>
 #include <linux/vmalloc.h>
@@ -59,44 +59,33 @@ static void zstd_free(void *strm)
 
 
 static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
 	struct workspace *wksp = strm;
 	ZSTD_DStream *stream;
 	size_t total_out = 0;
-	int error = 0;
+	size_t zstd_err;
+	int k = 0;
 	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
 	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
 	stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
 
 	if (!stream) {
 		ERROR("Failed to initialize zstd decompressor\n");
-		return -EIO;
+		goto out;
 	}
 
 	out_buf.size = PAGE_SIZE;
 	out_buf.dst = squashfs_first_page(output);
 
-	for (;;) {
-		size_t zstd_err;
+	do {
+		if (in_buf.pos == in_buf.size && k < b) {
+			int avail = min(length, msblk->devblksize - offset);
 
-		if (in_buf.pos == in_buf.size) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
 			length -= avail;
-			in_buf.src = data + offset;
+			in_buf.src = bh[k]->b_data + offset;
 			in_buf.size = avail;
 			in_buf.pos = 0;
 			offset = 0;
@@ -108,8 +97,8 @@ static int zstd_uncompress(struct squash
 				/* Shouldn't run out of pages
 				 * before stream is done.
 				 */
-				error = -EIO;
-				break;
+				squashfs_finish_page(output);
+				goto out;
 			}
 			out_buf.pos = 0;
 			out_buf.size = PAGE_SIZE;
@@ -118,20 +107,29 @@ static int zstd_uncompress(struct squash
 		total_out -= out_buf.pos;
 		zstd_err = ZSTD_decompressStream(stream, &out_buf, &in_buf);
 		total_out += out_buf.pos; /* add the additional data produced */
-		if (zstd_err == 0)
-			break;
 
-		if (ZSTD_isError(zstd_err)) {
-			ERROR("zstd decompression error: %d\n",
-					(int)ZSTD_getErrorCode(zstd_err));
-			error = -EIO;
-			break;
-		}
-	}
+		if (in_buf.pos == in_buf.size && k < b)
+			put_bh(bh[k++]);
+	} while (zstd_err != 0 && !ZSTD_isError(zstd_err));
 
 	squashfs_finish_page(output);
 
-	return error ? error : total_out;
+	if (ZSTD_isError(zstd_err)) {
+		ERROR("zstd decompression error: %d\n",
+				(int)ZSTD_getErrorCode(zstd_err));
+		goto out;
+	}
+
+	if (k < b)
+		goto out;
+
+	return (int)total_out;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_zstd_comp_ops = {
_
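
A note for readers following the restored code: the one subtle piece in the
revert is get_block_length(), which reassembles a two-byte little-endian
length field that can straddle a device block boundary.  Below is a minimal
userspace sketch of just that decode (hypothetical code, not part of the
patch; a single flat buffer stands in for the two consecutive sb_bread()
blocks, and decode_metadata_length() is an invented name):

	#include <stdio.h>

	/*
	 * buf[] stands in for two consecutive device blocks of blksize
	 * bytes each (the kernel reads the second block with another
	 * sb_bread()).  *offset is the byte position of the length field
	 * in the first block and is advanced past it, mirroring the
	 * pointer-in/pointer-out style of get_block_length() above.
	 */
	static int decode_metadata_length(const unsigned char *buf,
					  int blksize, int *offset)
	{
		int length;

		if (blksize - *offset == 1) {
			/* low byte is the last byte of block one ... */
			length = buf[*offset];
			/* ... high byte is the first byte of block two */
			length |= buf[blksize] << 8;
			*offset = blksize + 1;
		} else {
			length = buf[*offset] | (buf[*offset + 1] << 8);
			*offset += 2;
		}
		return length;
	}

	int main(void)
	{
		/* 0x1234 stored little-endian at offset 3 of a 4-byte "block" */
		unsigned char buf[8] = { 0, 0, 0, 0x34, 0x12, 0, 0, 0 };
		int offset = 3;

		printf("length=0x%x\n",
		       decode_metadata_length(buf, 4, &offset));
		return 0;
	}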

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (180 preceding siblings ...)
  2020-07-17  1:53 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
@ 2020-07-17  4:06 ` Andrew Morton
  2020-07-17  5:53 ` mmotm 2020-07-16-22-52 uploaded Andrew Morton
                   ` (50 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17  4:06 UTC (permalink / raw)
  To: akpm, mm-commits


The patch titled
     Subject: revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix
has been added to the -mm tree.  Its filename is
     revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/super.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/squashfs/super.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix
+++ a/fs/squashfs/super.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/magic.h>
 #include <linux/xattr.h>
+#include <linux/blk_types.h>
 
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch
revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-16-22-52 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (181 preceding siblings ...)
  2020-07-17  4:06 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch " Andrew Morton
@ 2020-07-17  5:53 ` Andrew Morton
  2020-07-17 20:18 ` [folded-merged] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch removed from -mm tree Andrew Morton
                   ` (49 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17  5:53 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-16-22-52 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random, hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.
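
That marker convention is easy to check mechanically.  Here is a throwaway
userspace sketch (not a supported tool; it assumes the
#NEXT_PATCHES_START/#NEXT_PATCHES_END markers begin a line, possibly with
trailing annotations) that prints the linux-next subset of a series file:

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[1024];
		int in_next = 0;
		FILE *f = fopen("series", "r");

		if (!f) {
			perror("series");
			return 1;
		}
		while (fgets(line, sizeof(line), f)) {
			/* prefix match, so "#NEXT_PATCHES_START mm" also hits */
			if (strncmp(line, "#NEXT_PATCHES_START", 19) == 0)
				in_next = 1;
			else if (strncmp(line, "#NEXT_PATCHES_END", 17) == 0)
				in_next = 0;
			else if (in_next && line[0] != '#' && line[0] != '\n')
				fputs(line, stdout);
		}
		fclose(f);
		return 0;
	}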


A full copy of the kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc5:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
* mailmap-add-entry-for-mike-rapoport.patch
* revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
* revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* const_structscheckpatch-add-regulator_ops.patch
* scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-handle-page-mapping-better-in-dump_page-fix.patch
* mm-dump-compound-page-information-on-a-second-line.patch
* mm-print-head-flags-in-dump_page.patch
* mm-switch-dump_page-to-get_kernel_nofault.patch
* mm-print-the-inode-number-in-dump_page.patch
* mm-print-hashed-address-of-struct-page.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* tmpfs-per-superblock-i_ino-support.patch
* tmpfs-support-64-bit-inums-per-sb.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
* mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
* mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch
* mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
* mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch
* memcg-oom-check-memcg-margin-for-parallel-oom.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* percpu_counter-add-percpu_counter_sync.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
* mm-mremap-calculate-extent-in-one-place.patch
* mm-mremap-start-addresses-are-properly-aligned.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* mm-sparse-cleanup-the-code-surrounding-memory_present.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* mm-vmallocc-remove-bug-from-the-find_va_links.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* rcu-kasan-record-and-print-call_rcu-call-stack-v8.patch
* kasan-record-and-print-the-free-track.patch
* kasan-record-and-print-the-free-track-v8.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
* mm-thp-replace-http-links-with-https-ones.patch
* mm-thp-replace-http-links-with-https-ones-fix.patch
* mm-vmscanc-fixed-typo.patch
* mm-vmscan-consistent-update-to-pgrefill.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate-v3.patch
* doc-mm-sync-up-oom_score_adj-documentation.patch
* doc-mm-clarify-proc-pid-oom_score-value-range.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes-v2.patch
* mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch
* mm-memory_hotplug-introduce-default-dummy-memory_add_physaddr_to_nid.patch
* mm-memory_hotplug-fix-unpaired-mem_hotplug_begin-done.patch
* linux-sched-mmh-drop-duplicated-words-in-comments.patch
* mm-drop-duplicated-words-in-linux-pgtableh.patch
* mm-drop-duplicated-words-in-linux-mmh.patch
* syscalls-use-uaccess_kernel-in-addr_limit_user_check.patch
* nds32-use-uaccess_kernel-in-show_regs.patch
* riscv-include-asm-pgtableh-in-asm-uaccessh.patch
* uaccess-remove-segment_eq.patch
* uaccess-add-force_uaccess_beginend-helpers.patch
* exec-use-force_uaccess_begin-during-exec-and-exit.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* proc-sysctl-make-protected_-world-readable.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* autofs-fix-doubled-word.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fs-ufs-avoid-potential-u32-multiplication-overflow.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fat-fix-fat_ra_init-for-data-clusters-==-0.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
* ipc-uninline-functions.patch
* ipc-shmc-remove-the-superfluous-break.patch
  linux-next.patch
  linux-next-rejects.patch
  linux-next-git-rejects.patch
* mm-page_isolation-prefer-the-node-of-the-source-page.patch
* mm-migrate-move-migration-helper-from-h-to-c.patch
* mm-hugetlb-unify-migration-callbacks.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
* mm-migrate-make-a-standard-migration-target-allocation-function.patch
* mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
* mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch
* scripts-deprecated_terms-sync-with-inclusive-terms.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code-fix.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* [folded-merged] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (182 preceding siblings ...)
  2020-07-17  5:53 ` mmotm 2020-07-16-22-52 uploaded Andrew Morton
@ 2020-07-17 20:18 ` Andrew Morton
  2020-07-17 20:18 ` [obsolete] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
                   ` (48 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 20:18 UTC (permalink / raw)
  To: akpm, mm-commits


The patch titled
     Subject: revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix
has been removed from the -mm tree.  Its filename was
     revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch

This patch was dropped because it was folded into revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/super.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/squashfs/super.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix
+++ a/fs/squashfs/super.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/magic.h>
 #include <linux/xattr.h>
+#include <linux/blk_types.h>
 
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* [obsolete] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (183 preceding siblings ...)
  2020-07-17 20:18 ` [folded-merged] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch removed from -mm tree Andrew Morton
@ 2020-07-17 20:18 ` Andrew Morton
  2020-07-17 20:20 ` + squashfs-fix-length-field-overlap-check-in-metadata-reading.patch added to " Andrew Morton
                   ` (47 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 20:18 UTC (permalink / raw)
  To: adrien+dev, akpm, bernd.amend, drosen, groeck, hch, mm-commits,
	phillip, pliard


The patch titled
     Subject: revert "squashfs: migrate from ll_rw_block usage to BIO"
has been removed from the -mm tree.  Its filename was
     revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch

This patch was dropped because it is obsolete

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: revert "squashfs: migrate from ll_rw_block usage to BIO"

Revert 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
due to a regression reported by Bernd Amend.

Link: http://lkml.kernel.org/r/CAF31+H5ZB7zn73obrc5svLzgfsTnyYe5TKvr7-6atUOqrRY+2w@mail.gmail.com
Reported-by: Bernd Amend <bernd.amend@gmail.com>
Cc: Philippe Liard <pliard@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/block.c                     |  273 ++++++++++------------
 fs/squashfs/decompressor.h              |    5 
 fs/squashfs/decompressor_multi.c        |    9 
 fs/squashfs/decompressor_multi_percpu.c |    6 
 fs/squashfs/decompressor_single.c       |    9 
 fs/squashfs/lz4_wrapper.c               |   17 -
 fs/squashfs/lzo_wrapper.c               |   17 -
 fs/squashfs/squashfs.h                  |    4 
 fs/squashfs/super.c                     |    1 
 fs/squashfs/xz_wrapper.c                |   51 +---
 fs/squashfs/zlib_wrapper.c              |   63 ++---
 fs/squashfs/zstd_wrapper.c              |   62 ++--
 12 files changed, 238 insertions(+), 279 deletions(-)

--- a/fs/squashfs/block.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/block.c
@@ -13,7 +13,6 @@
  * datablocks and metadata blocks.
  */
 
-#include <linux/blkdev.h>
 #include <linux/fs.h>
 #include <linux/vfs.h>
 #include <linux/slab.h>
@@ -28,104 +27,45 @@
 #include "page_actor.h"
 
 /*
- * Returns the amount of bytes copied to the page actor.
+ * Read the metadata block length, this is stored in the first two
+ * bytes of the metadata block.
  */
-static int copy_bio_to_actor(struct bio *bio,
-			     struct squashfs_page_actor *actor,
-			     int offset, int req_length)
-{
-	void *actor_addr = squashfs_first_page(actor);
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int copied_bytes = 0;
-	int actor_offset = 0;
-
-	if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all)))
-		return 0;
-
-	while (copied_bytes < req_length) {
-		int bytes_to_copy = min_t(int, bvec->bv_len - offset,
-					  PAGE_SIZE - actor_offset);
-
-		bytes_to_copy = min_t(int, bytes_to_copy,
-				      req_length - copied_bytes);
-		memcpy(actor_addr + actor_offset,
-		       page_address(bvec->bv_page) + bvec->bv_offset + offset,
-		       bytes_to_copy);
-
-		actor_offset += bytes_to_copy;
-		copied_bytes += bytes_to_copy;
-		offset += bytes_to_copy;
-
-		if (actor_offset >= PAGE_SIZE) {
-			actor_addr = squashfs_next_page(actor);
-			if (!actor_addr)
-				break;
-			actor_offset = 0;
-		}
-		if (offset >= bvec->bv_len) {
-			if (!bio_next_segment(bio, &iter_all))
-				break;
-			offset = 0;
-		}
-	}
-	squashfs_finish_page(actor);
-	return copied_bytes;
-}
-
-static int squashfs_bio_read(struct super_block *sb, u64 index, int length,
-			     struct bio **biop, int *block_offset)
+static struct buffer_head *get_block_length(struct super_block *sb,
+			u64 *cur_index, int *offset, int *length)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
-	const u64 read_start = round_down(index, msblk->devblksize);
-	const sector_t block = read_start >> msblk->devblksize_log2;
-	const u64 read_end = round_up(index + length, msblk->devblksize);
-	const sector_t block_end = read_end >> msblk->devblksize_log2;
-	int offset = read_start - round_down(index, PAGE_SIZE);
-	int total_len = (block_end - block) << msblk->devblksize_log2;
-	const int page_count = DIV_ROUND_UP(total_len + offset, PAGE_SIZE);
-	int error, i;
-	struct bio *bio;
-
-	bio = bio_alloc(GFP_NOIO, page_count);
-	if (!bio)
-		return -ENOMEM;
+	struct buffer_head *bh;
 
-	bio_set_dev(bio, sb->s_bdev);
-	bio->bi_opf = READ;
-	bio->bi_iter.bi_sector = block * (msblk->devblksize >> SECTOR_SHIFT);
-
-	for (i = 0; i < page_count; ++i) {
-		unsigned int len =
-			min_t(unsigned int, PAGE_SIZE - offset, total_len);
-		struct page *page = alloc_page(GFP_NOIO);
-
-		if (!page) {
-			error = -ENOMEM;
-			goto out_free_bio;
-		}
-		if (!bio_add_page(bio, page, len, offset)) {
-			error = -EIO;
-			goto out_free_bio;
+	bh = sb_bread(sb, *cur_index);
+	if (bh == NULL)
+		return NULL;
+
+	if (msblk->devblksize - *offset == 1) {
+		*length = (unsigned char) bh->b_data[*offset];
+		put_bh(bh);
+		bh = sb_bread(sb, ++(*cur_index));
+		if (bh == NULL)
+			return NULL;
+		*length |= (unsigned char) bh->b_data[0] << 8;
+		*offset = 1;
+	} else {
+		*length = (unsigned char) bh->b_data[*offset] |
+			(unsigned char) bh->b_data[*offset + 1] << 8;
+		*offset += 2;
+
+		if (*offset == msblk->devblksize) {
+			put_bh(bh);
+			bh = sb_bread(sb, ++(*cur_index));
+			if (bh == NULL)
+				return NULL;
+			*offset = 0;
 		}
-		offset = 0;
-		total_len -= len;
 	}
 
-	error = submit_bio_wait(bio);
-	if (error)
-		goto out_free_bio;
-
-	*biop = bio;
-	*block_offset = index & ((1 << msblk->devblksize_log2) - 1);
-	return 0;
-
-out_free_bio:
-	bio_free_pages(bio);
-	bio_put(bio);
-	return error;
+	return bh;
 }
 
+
 /*
  * Read and decompress a metadata block or datablock.  Length is non-zero
  * if a datablock is being read (the size is stored elsewhere in the
@@ -136,88 +76,129 @@ out_free_bio:
  * algorithms).
  */
 int squashfs_read_data(struct super_block *sb, u64 index, int length,
-		       u64 *next_index, struct squashfs_page_actor *output)
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
-	struct bio *bio = NULL;
-	int compressed;
-	int res;
-	int offset;
+	struct buffer_head **bh;
+	int offset = index & ((1 << msblk->devblksize_log2) - 1);
+	u64 cur_index = index >> msblk->devblksize_log2;
+	int bytes, compressed, b = 0, k = 0, avail, i;
+
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
+		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
+	if (bh == NULL)
+		return -ENOMEM;
 
 	if (length) {
 		/*
 		 * Datablock.
 		 */
+		bytes = -offset;
 		compressed = SQUASHFS_COMPRESSED_BLOCK(length);
 		length = SQUASHFS_COMPRESSED_SIZE_BLOCK(length);
+		if (next_index)
+			*next_index = index + length;
+
 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
 			index, compressed ? "" : "un", length, output->length);
+
+		if (length < 0 || length > output->length ||
+				(index + length) > msblk->bytes_used)
+			goto read_failure;
+
+		for (b = 0; bytes < length; b++, cur_index++) {
+			bh[b] = sb_getblk(sb, cur_index);
+			if (bh[b] == NULL)
+				goto block_release;
+			bytes += msblk->devblksize;
+		}
+		ll_rw_block(REQ_OP_READ, 0, b, bh);
 	} else {
 		/*
 		 * Metadata block.
 		 */
-		const u8 *data;
-		struct bvec_iter_all iter_all = {};
-		struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-
-		if (index + 2 > msblk->bytes_used) {
-			res = -EIO;
-			goto out;
-		}
-		res = squashfs_bio_read(sb, index, 2, &bio, &offset);
-		if (res)
-			goto out;
-
-		if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
-			res = -EIO;
-			goto out_free_bio;
-		}
-		/* Extract the length of the metadata block */
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		length = data[offset];
-		if (offset <= bvec->bv_len - 1) {
-			length |= data[offset + 1] << 8;
-		} else {
-			if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
-				res = -EIO;
-				goto out_free_bio;
-			}
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
-			length |= data[0] << 8;
-		}
-		bio_free_pages(bio);
-		bio_put(bio);
+		if ((index + 2) > msblk->bytes_used)
+			goto read_failure;
 
+		bh[0] = get_block_length(sb, &cur_index, &offset, &length);
+		if (bh[0] == NULL)
+			goto read_failure;
+		b = 1;
+
+		bytes = msblk->devblksize - offset;
 		compressed = SQUASHFS_COMPRESSED(length);
 		length = SQUASHFS_COMPRESSED_SIZE(length);
-		index += 2;
+		if (next_index)
+			*next_index = index + length + 2;
 
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
-		      compressed ? "" : "un", length);
+				compressed ? "" : "un", length);
+
+		if (length < 0 || length > output->length ||
+					(index + length) > msblk->bytes_used)
+			goto block_release;
+
+		for (; bytes < length; b++) {
+			bh[b] = sb_getblk(sb, ++cur_index);
+			if (bh[b] == NULL)
+				goto block_release;
+			bytes += msblk->devblksize;
+		}
+		ll_rw_block(REQ_OP_READ, 0, b - 1, bh + 1);
 	}
-	if (next_index)
-		*next_index = index + length;
 
-	res = squashfs_bio_read(sb, index, length, &bio, &offset);
-	if (res)
-		goto out;
+	for (i = 0; i < b; i++) {
+		wait_on_buffer(bh[i]);
+		if (!buffer_uptodate(bh[i]))
+			goto block_release;
+	}
 
 	if (compressed) {
-		if (!msblk->stream) {
-			res = -EIO;
-			goto out_free_bio;
-		}
-		res = squashfs_decompress(msblk, bio, offset, length, output);
+		if (!msblk->stream)
+			goto read_failure;
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
+		if (length < 0)
+			goto read_failure;
 	} else {
-		res = copy_bio_to_actor(bio, output, offset, length);
+		/*
+		 * Block is uncompressed.
+		 */
+		int in, pg_offset = 0;
+		void *data = squashfs_first_page(output);
+
+		for (bytes = length; k < b; k++) {
+			in = min(bytes, msblk->devblksize - offset);
+			bytes -= in;
+			while (in) {
+				if (pg_offset == PAGE_SIZE) {
+					data = squashfs_next_page(output);
+					pg_offset = 0;
+				}
+				avail = min_t(int, in, PAGE_SIZE -
+						pg_offset);
+				memcpy(data + pg_offset, bh[k]->b_data + offset,
+						avail);
+				in -= avail;
+				pg_offset += avail;
+				offset += avail;
+			}
+			offset = 0;
+			put_bh(bh[k]);
+		}
+		squashfs_finish_page(output);
 	}
 
-out_free_bio:
-	bio_free_pages(bio);
-	bio_put(bio);
-out:
-	if (res < 0)
-		ERROR("Failed to read block 0x%llx: %d\n", index, res);
+	kfree(bh);
+	return length;
 
-	return res;
+block_release:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+read_failure:
+	ERROR("squashfs_read_data failed to read block 0x%llx\n",
+					(unsigned long long) index);
+	kfree(bh);
+	return -EIO;
 }
--- a/fs/squashfs/decompressor.h~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor.h
@@ -10,14 +10,13 @@
  * decompressor.h
  */
 
-#include <linux/bio.h>
-
 struct squashfs_decompressor {
 	void	*(*init)(struct squashfs_sb_info *, void *);
 	void	*(*comp_opts)(struct squashfs_sb_info *, void *, int);
 	void	(*free)(void *);
 	int	(*decompress)(struct squashfs_sb_info *, void *,
-		struct bio *, int, int, struct squashfs_page_actor *);
+		struct buffer_head **, int, int, int,
+		struct squashfs_page_actor *);
 	int	id;
 	char	*name;
 	int	supported;
--- a/fs/squashfs/decompressor_multi.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_multi.c
@@ -6,7 +6,7 @@
 #include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/sched.h>
 #include <linux/wait.h>
 #include <linux/cpumask.h>
@@ -180,15 +180,14 @@ wait:
 }
 
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-			int offset, int length,
-			struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	int res;
 	struct squashfs_stream *stream = msblk->stream;
 	struct decomp_stream *decomp_stream = get_decomp_stream(msblk, stream);
 	res = msblk->decompressor->decompress(msblk, decomp_stream->stream,
-		bio, offset, length, output);
+		bh, b, offset, length, output);
 	put_decomp_stream(decomp_stream, stream);
 	if (res < 0)
 		ERROR("%s decompression failed, data probably corrupt\n",
--- a/fs/squashfs/decompressor_multi_percpu.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_multi_percpu.c
@@ -75,8 +75,8 @@ void squashfs_decompressor_destroy(struc
 	}
 }
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-	int offset, int length, struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	struct squashfs_stream *stream;
 	int res;
@@ -84,7 +84,7 @@ int squashfs_decompress(struct squashfs_
 	local_lock(&msblk->stream->lock);
 	stream = this_cpu_ptr(msblk->stream);
 
-	res = msblk->decompressor->decompress(msblk, stream->stream, bio,
+	res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
 					      offset, length, output);
 
 	local_unlock(&msblk->stream->lock);
--- a/fs/squashfs/decompressor_single.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/decompressor_single.c
@@ -7,7 +7,7 @@
 #include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
@@ -59,15 +59,14 @@ void squashfs_decompressor_destroy(struc
 	}
 }
 
-int squashfs_decompress(struct squashfs_sb_info *msblk, struct bio *bio,
-			int offset, int length,
-			struct squashfs_page_actor *output)
+int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
+	int b, int offset, int length, struct squashfs_page_actor *output)
 {
 	int res;
 	struct squashfs_stream *stream = msblk->stream;
 
 	mutex_lock(&stream->mutex);
-	res = msblk->decompressor->decompress(msblk, stream->stream, bio,
+	res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
 		offset, length, output);
 	mutex_unlock(&stream->mutex);
 
--- a/fs/squashfs/lz4_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/lz4_wrapper.c
@@ -4,7 +4,7 @@
  * Phillip Lougher <phillip@squashfs.org.uk>
  */
 
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -89,23 +89,20 @@ static void lz4_free(void *strm)
 
 
 static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 	struct squashfs_lz4 *stream = strm;
 	void *buff = stream->input, *data;
-	int bytes = length, res;
+	int avail, i, bytes = length, res;
 
-	while (bio_next_segment(bio, &iter_all)) {
-		int avail = min(bytes, ((int)bvec->bv_len) - offset);
-
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		memcpy(buff, data + offset, avail);
+	for (i = 0; i < b; i++) {
+		avail = min(bytes, msblk->devblksize - offset);
+		memcpy(buff, bh[i]->b_data + offset, avail);
 		buff += avail;
 		bytes -= avail;
 		offset = 0;
+		put_bh(bh[i]);
 	}
 
 	res = LZ4_decompress_safe(stream->input, stream->output,
--- a/fs/squashfs/lzo_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/lzo_wrapper.c
@@ -9,7 +9,7 @@
  */
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/lzo.h>
@@ -63,24 +63,21 @@ static void lzo_free(void *strm)
 
 
 static int lzo_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 	struct squashfs_lzo *stream = strm;
 	void *buff = stream->input, *data;
-	int bytes = length, res;
+	int avail, i, bytes = length, res;
 	size_t out_len = output->length;
 
-	while (bio_next_segment(bio, &iter_all)) {
-		int avail = min(bytes, ((int)bvec->bv_len) - offset);
-
-		data = page_address(bvec->bv_page) + bvec->bv_offset;
-		memcpy(buff, data + offset, avail);
+	for (i = 0; i < b; i++) {
+		avail = min(bytes, msblk->devblksize - offset);
+		memcpy(buff, bh[i]->b_data + offset, avail);
 		buff += avail;
 		bytes -= avail;
 		offset = 0;
+		put_bh(bh[i]);
 	}
 
 	res = lzo1x_decompress_safe(stream->input, (size_t)length,
--- a/fs/squashfs/squashfs.h~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/squashfs.h
@@ -40,8 +40,8 @@ extern void *squashfs_decompressor_setup
 /* decompressor_xxx.c */
 extern void *squashfs_decompressor_create(struct squashfs_sb_info *, void *);
 extern void squashfs_decompressor_destroy(struct squashfs_sb_info *);
-extern int squashfs_decompress(struct squashfs_sb_info *, struct bio *,
-				int, int, struct squashfs_page_actor *);
+extern int squashfs_decompress(struct squashfs_sb_info *, struct buffer_head **,
+	int, int, int, struct squashfs_page_actor *);
 extern int squashfs_max_decompressors(void);
 
 /* export.c */
--- a/fs/squashfs/xz_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/xz_wrapper.c
@@ -10,7 +10,7 @@
 
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/xz.h>
 #include <linux/bitops.h>
@@ -117,12 +117,11 @@ static void squashfs_xz_free(void *strm)
 
 
 static int squashfs_xz_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int total = 0, error = 0;
+	enum xz_ret xz_err;
+	int avail, total = 0, k = 0;
 	struct squashfs_xz *stream = strm;
 
 	xz_dec_reset(stream->state);
@@ -132,23 +131,11 @@ static int squashfs_xz_uncompress(struct
 	stream->buf.out_size = PAGE_SIZE;
 	stream->buf.out = squashfs_first_page(output);
 
-	for (;;) {
-		enum xz_ret xz_err;
-
-		if (stream->buf.in_pos == stream->buf.in_size) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				/* XZ_STREAM_END must be reached. */
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
+	do {
+		if (stream->buf.in_pos == stream->buf.in_size && k < b) {
+			avail = min(length, msblk->devblksize - offset);
 			length -= avail;
-			stream->buf.in = data + offset;
+			stream->buf.in = bh[k]->b_data + offset;
 			stream->buf.in_size = avail;
 			stream->buf.in_pos = 0;
 			offset = 0;
@@ -163,17 +150,23 @@ static int squashfs_xz_uncompress(struct
 		}
 
 		xz_err = xz_dec_run(stream->state, &stream->buf);
-		if (xz_err == XZ_STREAM_END)
-			break;
-		if (xz_err != XZ_OK) {
-			error = -EIO;
-			break;
-		}
-	}
+
+		if (stream->buf.in_pos == stream->buf.in_size && k < b)
+			put_bh(bh[k++]);
+	} while (xz_err == XZ_OK);
 
 	squashfs_finish_page(output);
 
-	return error ? error : total + stream->buf.out_pos;
+	if (xz_err != XZ_STREAM_END || k < b)
+		goto out;
+
+	return total + stream->buf.out_pos;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_xz_comp_ops = {
--- a/fs/squashfs/zlib_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/zlib_wrapper.c
@@ -10,7 +10,7 @@
 
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/zlib.h>
 #include <linux/vmalloc.h>
@@ -50,35 +50,21 @@ static void zlib_free(void *strm)
 
 
 static int zlib_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
-	int zlib_init = 0, error = 0;
+	int zlib_err, zlib_init = 0, k = 0;
 	z_stream *stream = strm;
 
 	stream->avail_out = PAGE_SIZE;
 	stream->next_out = squashfs_first_page(output);
 	stream->avail_in = 0;
 
-	for (;;) {
-		int zlib_err;
-
-		if (stream->avail_in == 0) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				/* Z_STREAM_END must be reached. */
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
+	do {
+		if (stream->avail_in == 0 && k < b) {
+			int avail = min(length, msblk->devblksize - offset);
 			length -= avail;
-			stream->next_in = data + offset;
+			stream->next_in = bh[k]->b_data + offset;
 			stream->avail_in = avail;
 			offset = 0;
 		}
@@ -92,28 +78,37 @@ static int zlib_uncompress(struct squash
 		if (!zlib_init) {
 			zlib_err = zlib_inflateInit(stream);
 			if (zlib_err != Z_OK) {
-				error = -EIO;
-				break;
+				squashfs_finish_page(output);
+				goto out;
 			}
 			zlib_init = 1;
 		}
 
 		zlib_err = zlib_inflate(stream, Z_SYNC_FLUSH);
-		if (zlib_err == Z_STREAM_END)
-			break;
-		if (zlib_err != Z_OK) {
-			error = -EIO;
-			break;
-		}
-	}
+
+		if (stream->avail_in == 0 && k < b)
+			put_bh(bh[k++]);
+	} while (zlib_err == Z_OK);
 
 	squashfs_finish_page(output);
 
-	if (!error)
-		if (zlib_inflateEnd(stream) != Z_OK)
-			error = -EIO;
+	if (zlib_err != Z_STREAM_END)
+		goto out;
+
+	zlib_err = zlib_inflateEnd(stream);
+	if (zlib_err != Z_OK)
+		goto out;
+
+	if (k < b)
+		goto out;
+
+	return stream->total_out;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
 
-	return error ? error : stream->total_out;
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_zlib_comp_ops = {
--- a/fs/squashfs/zstd_wrapper.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/zstd_wrapper.c
@@ -9,7 +9,7 @@
  */
 
 #include <linux/mutex.h>
-#include <linux/bio.h>
+#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/zstd.h>
 #include <linux/vmalloc.h>
@@ -59,44 +59,33 @@ static void zstd_free(void *strm)
 
 
 static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
-	struct bio *bio, int offset, int length,
+	struct buffer_head **bh, int b, int offset, int length,
 	struct squashfs_page_actor *output)
 {
 	struct workspace *wksp = strm;
 	ZSTD_DStream *stream;
 	size_t total_out = 0;
-	int error = 0;
+	size_t zstd_err;
+	int k = 0;
 	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
 	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
-	struct bvec_iter_all iter_all = {};
-	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
 	stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
 
 	if (!stream) {
 		ERROR("Failed to initialize zstd decompressor\n");
-		return -EIO;
+		goto out;
 	}
 
 	out_buf.size = PAGE_SIZE;
 	out_buf.dst = squashfs_first_page(output);
 
-	for (;;) {
-		size_t zstd_err;
+	do {
+		if (in_buf.pos == in_buf.size && k < b) {
+			int avail = min(length, msblk->devblksize - offset);
 
-		if (in_buf.pos == in_buf.size) {
-			const void *data;
-			int avail;
-
-			if (!bio_next_segment(bio, &iter_all)) {
-				error = -EIO;
-				break;
-			}
-
-			avail = min(length, ((int)bvec->bv_len) - offset);
-			data = page_address(bvec->bv_page) + bvec->bv_offset;
 			length -= avail;
-			in_buf.src = data + offset;
+			in_buf.src = bh[k]->b_data + offset;
 			in_buf.size = avail;
 			in_buf.pos = 0;
 			offset = 0;
@@ -108,8 +97,8 @@ static int zstd_uncompress(struct squash
 				/* Shouldn't run out of pages
 				 * before stream is done.
 				 */
-				error = -EIO;
-				break;
+				squashfs_finish_page(output);
+				goto out;
 			}
 			out_buf.pos = 0;
 			out_buf.size = PAGE_SIZE;
@@ -118,20 +107,29 @@ static int zstd_uncompress(struct squash
 		total_out -= out_buf.pos;
 		zstd_err = ZSTD_decompressStream(stream, &out_buf, &in_buf);
 		total_out += out_buf.pos; /* add the additional data produced */
-		if (zstd_err == 0)
-			break;
 
-		if (ZSTD_isError(zstd_err)) {
-			ERROR("zstd decompression error: %d\n",
-					(int)ZSTD_getErrorCode(zstd_err));
-			error = -EIO;
-			break;
-		}
-	}
+		if (in_buf.pos == in_buf.size && k < b)
+			put_bh(bh[k++]);
+	} while (zstd_err != 0 && !ZSTD_isError(zstd_err));
 
 	squashfs_finish_page(output);
 
-	return error ? error : total_out;
+	if (ZSTD_isError(zstd_err)) {
+		ERROR("zstd decompression error: %d\n",
+				(int)ZSTD_getErrorCode(zstd_err));
+		goto out;
+	}
+
+	if (k < b)
+		goto out;
+
+	return (int)total_out;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+	return -EIO;
 }
 
 const struct squashfs_decompressor squashfs_zstd_comp_ops = {
--- a/fs/squashfs/super.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
+++ a/fs/squashfs/super.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/magic.h>
 #include <linux/xattr.h>
+#include <linux/blk_types.h>
 
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + squashfs-fix-length-field-overlap-check-in-metadata-reading.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (184 preceding siblings ...)
  2020-07-17 20:18 ` [obsolete] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
@ 2020-07-17 20:20 ` Andrew Morton
  2020-07-17 20:35 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch " Andrew Morton
                   ` (46 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 20:20 UTC (permalink / raw)
  To: adrien+dev, bernd.amend, drosen, groeck, hch, mm-commits, phillip


The patch titled
     Subject: squashfs: fix length field overlap check in metadata reading
has been added to the -mm tree.  Its filename is
     squashfs-fix-length-field-overlap-check-in-metadata-reading.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/squashfs-fix-length-field-overlap-check-in-metadata-reading.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/squashfs-fix-length-field-overlap-check-in-metadata-reading.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Phillip Lougher <phillip@squashfs.org.uk>
Subject: squashfs: fix length field overlap check in metadata reading

This is a regression introduced by the "migrate from ll_rw_block usage to
BIO" patch.

Squashfs packs structures on byte boundaries, and because of that the length
field of a metadata block may not lie entirely within the current block.  The
new code rewrote that logic and introduced a faulty check for this edge case.
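
As a concrete sketch of the edge case (block and offset values assumed for
illustration, not taken from the patch): suppose bvec->bv_len is 1024 and the
metadata block starts at offset 1023, so only the low byte of the 2-byte
length field sits in the current segment:

	length = data[1023];	/* low byte: last valid index in this segment */
	/*
	 * Old check: "offset <= bvec->bv_len - 1" is 1023 <= 1023, true,
	 * so the code read data[1024] -- one byte past the segment --
	 * instead of taking the bio_next_segment() path.  The corrected
	 * check "offset < bvec->bv_len - 1" routes this case to the next
	 * segment, where byte 0 supplies the high byte.
	 */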

Link: http://lkml.kernel.org/r/20200717195536.16069-1-phillip@squashfs.org.uk
Fixes: 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: Bernd Amend <bernd.amend@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/block.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/squashfs/block.c~squashfs-fix-length-field-overlap-check-in-metadata-reading
+++ a/fs/squashfs/block.c
@@ -175,7 +175,7 @@ int squashfs_read_data(struct super_bloc
 		/* Extract the length of the metadata block */
 		data = page_address(bvec->bv_page) + bvec->bv_offset;
 		length = data[offset];
-		if (offset <= bvec->bv_len - 1) {
+		if (offset < bvec->bv_len - 1) {
 			length |= data[offset + 1] << 8;
 		} else {
 			if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
_

Patches currently in -mm which might be from phillip@squashfs.org.uk are

squashfs-fix-length-field-overlap-check-in-metadata-reading.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (185 preceding siblings ...)
  2020-07-17 20:20 ` + squashfs-fix-length-field-overlap-check-in-metadata-reading.patch added to " Andrew Morton
@ 2020-07-17 20:35 ` Andrew Morton
  2020-07-17 20:49 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch " Andrew Morton
                   ` (45 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 20:35 UTC (permalink / raw)
  To: akpm, guro, jonathan.cameron, mike.kravetz, mm-commits, song.bao.hua


The patch titled
     Subject: mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix
has been added to the -mm tree.  Its filename is
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix

fix CONFIG_CMA=n warning

mm/hugetlb.c:48:20: warning: hugetlb_cma defined but not used [-Wunused-variable]
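
A reduced sketch of how the warning arises (illustrative only, not the actual
hugetlb code): a file-scope static whose only users are guarded by #ifdef
CONFIG_CMA is left defined but unreferenced when CONFIG_CMA=n, so the
definition itself must be guarded as well:

	static int cma_areas[4];			/* always compiled in */

	#ifdef CONFIG_CMA
	static void use_cma(void) { cma_areas[0] = 1; }	/* only user */
	#endif
	/* CONFIG_CMA=n: warning: 'cma_areas' defined but not used */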

Cc: Barry Song <song.bao.hua@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix
+++ a/mm/hugetlb.c
@@ -45,7 +45,9 @@ int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 
+#ifdef CONFIG_CMA
 static struct cma *hugetlb_cma[MAX_NUMNODES];
+#endif
 static unsigned long hugetlb_cma_size __initdata;
 
 /*
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
linux-next-git-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (186 preceding siblings ...)
  2020-07-17 20:35 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch " Andrew Morton
@ 2020-07-17 20:49 ` Andrew Morton
  2020-07-17 21:11 ` + mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch " Andrew Morton
                   ` (44 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 20:49 UTC (permalink / raw)
  To: guro, mm-commits, sfr


The patch titled
     Subject: mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix
has been added to the -mm tree.  Its filename is
     mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix

fix warnings

Link: http://lkml.kernel.org/r/20200717174705.GA55916@carbon.DHCP.thefacebook.com
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmstat.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/vmstat.c~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix
+++ a/mm/vmstat.c
@@ -168,9 +168,12 @@ EXPORT_SYMBOL(vm_numa_stat);
 EXPORT_SYMBOL(vm_node_stat);
 
 #ifdef CONFIG_SMP
-
 #define MAX_THRESHOLD 125
+#else
+#define MAX_THRESHOLD 0
+#endif
 
+#ifdef CONFIG_SMP
 int calculate_pressure_threshold(struct zone *zone)
 {
 	int threshold;
@@ -611,8 +614,6 @@ void dec_node_page_state(struct page *pa
 EXPORT_SYMBOL(dec_node_page_state);
 #else
 
-#define MAX_THRESHOLD 0
-
 /*
  * Use interrupt disable to serialize counter updates
  */
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (187 preceding siblings ...)
  2020-07-17 20:49 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch " Andrew Morton
@ 2020-07-17 21:11 ` Andrew Morton
  2020-07-17 21:11 ` + riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch " Andrew Morton
                   ` (43 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:11 UTC (permalink / raw)
  To: mm-commits, palmerdabbelt, penberg, rientjes, rppt, thomas.lendacky


The patch titled
     Subject: mm: pgtable: make generic pgprot_* macros available for no-MMU
has been added to the -mm tree.  Its filename is
     mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Pekka Enberg <penberg@kernel.org>
Subject: mm: pgtable: make generic pgprot_* macros available for no-MMU

The <linux/pgtable.h> header defines some generic pgprot_*
implementations, but they are only available when CONFIG_MMU is enabled. 
The RISC-V architecture, for example, therefore defines some of these
pgprot_* macros itself for the no-MMU case.

Let's make the generic pgprot_* macros available even for no-MMU builds so we
can remove the RISC-V specific definitions.

Compile-tested with x86 defconfig, and riscv defconfig and !MMU defconfig.
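
For reference, the generic fallbacks being moved are identity or aliasing
macros, so with no arch-specific override a no-MMU build now reduces the
following to no-ops (a sketch; macro names as in the patch):

	prot = pgprot_noncached(prot);		/* generic: returns prot */
	prot = pgprot_writecombine(prot);	/* aliases pgprot_noncached */
	prot = pgprot_device(prot);		/* aliases pgprot_noncached */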

Link: http://lkml.kernel.org/r/20200715053340.576300-1-penberg@gmail.com
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Suggested-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/pgtable.h |   71 +++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 34 deletions(-)

--- a/include/linux/pgtable.h~mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu
+++ a/include/linux/pgtable.h
@@ -647,40 +647,6 @@ static inline int arch_unmap_one(struct
 #define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
 #endif
 
-#ifndef pgprot_nx
-#define pgprot_nx(prot)	(prot)
-#endif
-
-#ifndef pgprot_noncached
-#define pgprot_noncached(prot)	(prot)
-#endif
-
-#ifndef pgprot_writecombine
-#define pgprot_writecombine pgprot_noncached
-#endif
-
-#ifndef pgprot_writethrough
-#define pgprot_writethrough pgprot_noncached
-#endif
-
-#ifndef pgprot_device
-#define pgprot_device pgprot_noncached
-#endif
-
-#ifndef pgprot_modify
-#define pgprot_modify pgprot_modify
-static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
-{
-	if (pgprot_val(oldprot) == pgprot_val(pgprot_noncached(oldprot)))
-		newprot = pgprot_noncached(newprot);
-	if (pgprot_val(oldprot) == pgprot_val(pgprot_writecombine(oldprot)))
-		newprot = pgprot_writecombine(newprot);
-	if (pgprot_val(oldprot) == pgprot_val(pgprot_device(oldprot)))
-		newprot = pgprot_device(newprot);
-	return newprot;
-}
-#endif
-
 /*
  * When walking page tables, get the address of the next boundary,
  * or the end address of the range if that comes earlier.  Although no
@@ -840,6 +806,43 @@ static inline void ptep_modify_prot_comm
  * No-op macros that just return the current protection value. Defined here
  * because these macros can be used used even if CONFIG_MMU is not defined.
  */
+
+#ifndef pgprot_nx
+#define pgprot_nx(prot)	(prot)
+#endif
+
+#ifndef pgprot_noncached
+#define pgprot_noncached(prot)	(prot)
+#endif
+
+#ifndef pgprot_writecombine
+#define pgprot_writecombine pgprot_noncached
+#endif
+
+#ifndef pgprot_writethrough
+#define pgprot_writethrough pgprot_noncached
+#endif
+
+#ifndef pgprot_device
+#define pgprot_device pgprot_noncached
+#endif
+
+#ifdef CONFIG_MMU
+#ifndef pgprot_modify
+#define pgprot_modify pgprot_modify
+static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
+{
+	if (pgprot_val(oldprot) == pgprot_val(pgprot_noncached(oldprot)))
+		newprot = pgprot_noncached(newprot);
+	if (pgprot_val(oldprot) == pgprot_val(pgprot_writecombine(oldprot)))
+		newprot = pgprot_writecombine(newprot);
+	if (pgprot_val(oldprot) == pgprot_val(pgprot_device(oldprot)))
+		newprot = pgprot_device(newprot);
+	return newprot;
+}
+#endif
+#endif /* CONFIG_MMU */
+
 #ifndef pgprot_encrypted
 #define pgprot_encrypted(prot)	(prot)
 #endif
_

Patches currently in -mm which might be from penberg@kernel.org are

mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch
riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (188 preceding siblings ...)
  2020-07-17 21:11 ` + mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch " Andrew Morton
@ 2020-07-17 21:11 ` Andrew Morton
  2020-07-17 21:42 ` + uaccess-add-force_uaccess_beginend-helpers-v2.patch " Andrew Morton
                   ` (42 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:11 UTC (permalink / raw)
  To: mm-commits, palmerdabbelt, penberg, rientjes, rppt, thomas.lendacky


The patch titled
     Subject: riscv: use generic pgprot_* macros from <linux/pgtable.h>
has been added to the -mm tree.  Its filename is
     riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Pekka Enberg <penberg@kernel.org>
Subject: riscv: use generic pgprot_* macros from <linux/pgtable.h>

The <linux/pgtable.h> header now defines generic pgprot_* macros also for
the no-MMU configuration, so let's use them.

Link: http://lkml.kernel.org/r/20200715053340.576300-2-penberg@gmail.com
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/riscv/include/asm/mmio.h |    6 ------
 1 file changed, 6 deletions(-)

--- a/arch/riscv/include/asm/mmio.h~riscv-use-generic-pgprot_-macros-from-linux-pgtableh
+++ a/arch/riscv/include/asm/mmio.h
@@ -14,12 +14,6 @@
 #include <linux/types.h>
 #include <asm/mmiowb.h>
 
-#ifndef CONFIG_MMU
-#define pgprot_noncached(x)	(x)
-#define pgprot_writecombine(x)	(x)
-#define pgprot_device(x)	(x)
-#endif /* CONFIG_MMU */
-
 /* Generic IO read/write.  These perform native-endian accesses. */
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
_

Patches currently in -mm which might be from penberg@kernel.org are

mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch
riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + uaccess-add-force_uaccess_beginend-helpers-v2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (189 preceding siblings ...)
  2020-07-17 21:11 ` + riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch " Andrew Morton
@ 2020-07-17 21:42 ` Andrew Morton
  2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch " Andrew Morton
                   ` (41 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:42 UTC (permalink / raw)
  To: geert, green.hu, hch, mark.rutland, mm-commits


The patch titled
     Subject: uaccess-add-force_uaccess_beginend-helpers-v2
has been added to the -mm tree.  Its filename is
     uaccess-add-force_uaccess_beginend-helpers-v2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/uaccess-add-force_uaccess_beginend-helpers-v2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/uaccess-add-force_uaccess_beginend-helpers-v2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Christoph Hellwig <hch@lst.de>
Subject: uaccess-add-force_uaccess_beginend-helpers-v2

drop two incorrect hunks, fix a commit log typo

Link: http://lkml.kernel.org/r/20200714105505.935079-6-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/m68k/include/asm/tlbflush.h |    6 +++---
 arch/sh/kernel/traps_32.c        |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/m68k/include/asm/tlbflush.h~uaccess-add-force_uaccess_beginend-helpers-v2
+++ a/arch/m68k/include/asm/tlbflush.h
@@ -13,13 +13,13 @@ static inline void flush_tlb_kernel_page
 	if (CPU_IS_COLDFIRE) {
 		mmu_write(MMUOR, MMUOR_CNL);
 	} else if (CPU_IS_040_OR_060) {
-		mm_segment_t old_fs = force_uaccess_begin();
-
+		mm_segment_t old_fs = get_fs();
+		set_fs(KERNEL_DS);
 		__asm__ __volatile__(".chip 68040\n\t"
 				     "pflush (%0)\n\t"
 				     ".chip 68k"
 				     : : "a" (addr));
-		force_uaccess_end(old_fs);
+		set_fs(old_fs);
 	} else if (CPU_IS_020_OR_030)
 		__asm__ __volatile__("pflush #4,#4,(%0)" : : "a" (addr));
 }
--- a/arch/sh/kernel/traps_32.c~uaccess-add-force_uaccess_beginend-helpers-v2
+++ a/arch/sh/kernel/traps_32.c
@@ -538,13 +538,13 @@ uspace_segv:
 		if (regs->pc & 1)
 			die("unaligned program counter", regs, error_code);
 
-		oldfs = force_uaccess_begin();
+		set_fs(KERNEL_DS);
 		if (copy_from_user(&instruction, (void __user *)(regs->pc),
 				   sizeof(instruction))) {
 			/* Argh. Fault on the instruction itself.
 			   This should never happen non-SMP
 			*/
-			force_uaccess_end(oldfs);
+			set_fs(oldfs);
 			die("insn faulting in do_address_error", regs, 0);
 		}
 
@@ -552,7 +552,7 @@ uspace_segv:
 
 		handle_unaligned_access(instruction, regs, &user_mem_access,
 					0, address);
-		force_uaccess_end(oldfs);
+		set_fs(oldfs);
 	}
 }
 
_

Patches currently in -mm which might be from hch@lst.de are

syscalls-use-uaccess_kernel-in-addr_limit_user_check.patch
nds32-use-uaccess_kernel-in-show_regs.patch
riscv-include-asm-pgtableh-in-asm-uaccessh.patch
uaccess-remove-segment_eq.patch
uaccess-add-force_uaccess_beginend-helpers.patch
uaccess-add-force_uaccess_beginend-helpers-v2.patch
exec-use-force_uaccess_begin-during-exec-and-exit.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (190 preceding siblings ...)
  2020-07-17 21:42 ` + uaccess-add-force_uaccess_beginend-helpers-v2.patch " Andrew Morton
@ 2020-07-17 21:59 ` Andrew Morton
  2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch " Andrew Morton
                   ` (40 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:59 UTC (permalink / raw)
  To: anshuman.khandual, benh, bp, catalin.marinas, corbet,
	dan.j.williams, dave.hansen, david, fenghua.yu, hpa, hsinyi,
	justin.he, kirill.shutemov, luto, mark.rutland, mhocko, mingo,
	mm-commits, mpe, palmer, pasha.tatashin, paul.walmsley, paulus,
	peterz, robin.murphy, rppt, steve.capper, tglx, tony.luck, will,
	willy, yuzhao


The patch titled
     Subject: mm/sparsemem: enable vmem_altmap support in vmemmap_populate_basepages()
has been added to the -mm tree.  Its filename is
     mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/sparsemem: enable vmem_altmap support in vmemmap_populate_basepages()

Patch series "arm64: Enable vmemmap mapping from device memory", v4.

This series enables vmemmap backing memory allocation from device memory
ranges on arm64.  But before that, it enables vmemmap_populate_basepages()
and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
allocation requests.


This patch (of 3):

vmemmap_populate_basepages() is used across platforms to allocate backing
memory for vmemmap mapping.  This is used as a standard default choice or
as a fallback when the intended huge page allocation fails.  This just
creates the entire vmemmap mapping with base pages (PAGE_SIZE).

On arm64 platforms, vmemmap_populate_basepages() is called instead of the
platform specific vmemmap_populate() when ARM64_SWAPPER_USES_SECTION_MAPS
is not enabled as in case for ARM64_16K_PAGES and ARM64_64K_PAGES configs.

At present vmemmap_populate_basepages() does not support allocating from
driver defined struct vmem_altmap while trying to create vmemmap mapping
for a device memory range.  It prevents ARM64_16K_PAGES and
ARM64_64K_PAGES configs on arm64 from supporting device memory with
vmem_altmap requests.

This enables vmem_altmap support in vmemmap_populate_basepages(), unlocking
device memory allocation for vmemmap mappings on arm64 platforms with 16K or
64K base page configs.

Each architecture should evaluate and decide whether to opt in to device
memory based base page allocation through vmemmap_populate_basepages().
Hence let's keep it disabled on all architectures in order to preserve the
existing semantics.  A subsequent patch enables it on arm64.
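
The resulting call patterns (a sketch of the two cases; only the NULL form is
used by in-tree callers until the arm64 patch later in this series):

vmemmap_populate_basepages(start, end, node, NULL)   /* allocate from system RAM */
vmemmap_populate_basepages(start, end, node, altmap) /* allocate from altmap */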

Link: http://lkml.kernel.org/r/1594004178-8861-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1594004178-8861-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Jia He <justin.he@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/mmu.c      |    2 +-
 arch/ia64/mm/discontig.c |    2 +-
 arch/riscv/mm/init.c     |    2 +-
 arch/x86/mm/init_64.c    |    6 +++---
 include/linux/mm.h       |    5 +++--
 mm/sparse-vmemmap.c      |   16 +++++++++++-----
 6 files changed, 20 insertions(+), 13 deletions(-)

--- a/arch/arm64/mm/mmu.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/arch/arm64/mm/mmu.c
@@ -1070,7 +1070,7 @@ static void free_empty_tables(unsigned l
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_basepages(start, end, node);
+	return vmemmap_populate_basepages(start, end, node, NULL);
 }
 #else	/* !ARM64_SWAPPER_USES_SECTION_MAPS */
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
--- a/arch/ia64/mm/discontig.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/arch/ia64/mm/discontig.c
@@ -655,7 +655,7 @@ void arch_refresh_nodedata(int update_no
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_basepages(start, end, node);
+	return vmemmap_populate_basepages(start, end, node, NULL);
 }
 
 void vmemmap_free(unsigned long start, unsigned long end,
--- a/arch/riscv/mm/init.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/arch/riscv/mm/init.c
@@ -530,6 +530,6 @@ void __init paging_init(void)
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 			       struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_basepages(start, end, node);
+	return vmemmap_populate_basepages(start, end, node, NULL);
 }
 #endif
--- a/arch/x86/mm/init_64.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/arch/x86/mm/init_64.c
@@ -1493,7 +1493,7 @@ static int __meminit vmemmap_populate_hu
 			vmemmap_verify((pte_t *)pmd, node, addr, next);
 			continue;
 		}
-		if (vmemmap_populate_basepages(addr, next, node))
+		if (vmemmap_populate_basepages(addr, next, node, NULL))
 			return -ENOMEM;
 	}
 	return 0;
@@ -1505,7 +1505,7 @@ int __meminit vmemmap_populate(unsigned
 	int err;
 
 	if (end - start < PAGES_PER_SECTION * sizeof(struct page))
-		err = vmemmap_populate_basepages(start, end, node);
+		err = vmemmap_populate_basepages(start, end, node, NULL);
 	else if (boot_cpu_has(X86_FEATURE_PSE))
 		err = vmemmap_populate_hugepages(start, end, node, altmap);
 	else if (altmap) {
@@ -1513,7 +1513,7 @@ int __meminit vmemmap_populate(unsigned
 				__func__);
 		err = -ENOMEM;
 	} else
-		err = vmemmap_populate_basepages(start, end, node);
+		err = vmemmap_populate_basepages(start, end, node, NULL);
 	if (!err)
 		sync_global_pgds(start, end - 1);
 	return err;
--- a/include/linux/mm.h~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/include/linux/mm.h
@@ -2968,14 +2968,15 @@ pgd_t *vmemmap_pgd_populate(unsigned lon
 p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
-pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node);
+pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
+			    struct vmem_altmap *altmap);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node);
 void *altmap_alloc_block_buf(unsigned long size, struct vmem_altmap *altmap);
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
-			       int node);
+			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap);
 void vmemmap_populate_print_last(void);
--- a/mm/sparse-vmemmap.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages
+++ a/mm/sparse-vmemmap.c
@@ -139,12 +139,18 @@ void __meminit vmemmap_verify(pte_t *pte
 			start, end - 1);
 }
 
-pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node)
+pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
+				       struct vmem_altmap *altmap)
 {
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(*pte)) {
 		pte_t entry;
-		void *p = vmemmap_alloc_block_buf(PAGE_SIZE, node);
+		void *p;
+
+		if (altmap)
+			p = altmap_alloc_block_buf(PAGE_SIZE, altmap);
+		else
+			p = vmemmap_alloc_block_buf(PAGE_SIZE, node);
 		if (!p)
 			return NULL;
 		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
@@ -212,8 +218,8 @@ pgd_t * __meminit vmemmap_pgd_populate(u
 	return pgd;
 }
 
-int __meminit vmemmap_populate_basepages(unsigned long start,
-					 unsigned long end, int node)
+int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
+					 int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr = start;
 	pgd_t *pgd;
@@ -235,7 +241,7 @@ int __meminit vmemmap_populate_basepages
 		pmd = vmemmap_pmd_populate(pud, addr, node);
 		if (!pmd)
 			return -ENOMEM;
-		pte = vmemmap_pte_populate(pmd, addr, node);
+		pte = vmemmap_pte_populate(pmd, addr, node, altmap);
 		if (!pte)
 			return -ENOMEM;
 		vmemmap_verify(pte, node, addr, addr + PAGE_SIZE);
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch
arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (191 preceding siblings ...)
  2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch " Andrew Morton
@ 2020-07-17 21:59 ` Andrew Morton
  2020-07-17 21:59 ` + arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch " Andrew Morton
                   ` (39 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:59 UTC (permalink / raw)
  To: anshuman.khandual, benh, bp, catalin.marinas, corbet,
	dan.j.williams, dave.hansen, david, fenghua.yu, hpa, hsinyi,
	justin.he, kirill.shutemov, luto, mark.rutland, mhocko, mingo,
	mm-commits, mpe, palmer, pasha.tatashin, paul.walmsley, paulus,
	peterz, robin.murphy, rppt, steve.capper, tglx, tony.luck, will,
	willy, yuzhao


The patch titled
     Subject: mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf()
has been added to the -mm tree.  Its filename is
     mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf()

There are many instances where vmemmap allocation is switched between
regular memory and device memory just based on whether an altmap is
available or not.  vmemmap_alloc_block_buf() is used on various platforms
to allocate vmemmap mappings.  Let's also enable it to handle altmap based
device memory allocation along with the existing regular memory
allocations.  This will help in avoiding the altmap based allocation
switch in many places.  To summarize, there are two different ways to call
vmemmap_alloc_block_buf():

vmemmap_alloc_block_buf(size, node, NULL)   /* Allocate from system RAM */
vmemmap_alloc_block_buf(size, node, altmap) /* Allocate from altmap */

This converts altmap_alloc_block_buf() into a static function, drops its
entry from the header and updates Documentation/vm/memory-model.rst.
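
In effect, the caller-side altmap switch collapses into a single call.  A
sketch of the change as seen from a caller, mirroring the x86 hunk below:

	/* Before: each caller open-coded the backing memory switch. */
	if (altmap)
		p = altmap_alloc_block_buf(PMD_SIZE, altmap);
	else
		p = vmemmap_alloc_block_buf(PMD_SIZE, node);

	/* After: pass the altmap (or NULL) and let the helper decide. */
	p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);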

Link: http://lkml.kernel.org/r/1594004178-8861-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/memory-model.rst |    2 +-
 arch/arm64/mm/mmu.c               |    2 +-
 arch/powerpc/mm/init_64.c         |    4 ++--
 arch/x86/mm/init_64.c             |    5 +----
 include/linux/mm.h                |    4 ++--
 mm/sparse-vmemmap.c               |   28 +++++++++++++---------------
 6 files changed, 20 insertions(+), 25 deletions(-)

--- a/arch/arm64/mm/mmu.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/arch/arm64/mm/mmu.c
@@ -1102,7 +1102,7 @@ int __meminit vmemmap_populate(unsigned
 		if (pmd_none(READ_ONCE(*pmdp))) {
 			void *p = NULL;
 
-			p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node, NULL);
 			if (!p)
 				return -ENOMEM;
 
--- a/arch/powerpc/mm/init_64.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/arch/powerpc/mm/init_64.c
@@ -225,12 +225,12 @@ int __meminit vmemmap_populate(unsigned
 		 * fall back to system memory if the altmap allocation fail.
 		 */
 		if (altmap && !altmap_cross_boundary(altmap, start, page_size)) {
-			p = altmap_alloc_block_buf(page_size, altmap);
+			p = vmemmap_alloc_block_buf(page_size, node, altmap);
 			if (!p)
 				pr_debug("altmap block allocation failed, falling back to system memory");
 		}
 		if (!p)
-			p = vmemmap_alloc_block_buf(page_size, node);
+			p = vmemmap_alloc_block_buf(page_size, node, NULL);
 		if (!p)
 			return -ENOMEM;
 
--- a/arch/x86/mm/init_64.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/arch/x86/mm/init_64.c
@@ -1463,10 +1463,7 @@ static int __meminit vmemmap_populate_hu
 		if (pmd_none(*pmd)) {
 			void *p;
 
-			if (altmap)
-				p = altmap_alloc_block_buf(PMD_SIZE, altmap);
-			else
-				p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
 			if (p) {
 				pte_t entry;
 
--- a/Documentation/vm/memory-model.rst~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/Documentation/vm/memory-model.rst
@@ -178,7 +178,7 @@ for persistent memory devices in pre-all
 devices. This storage is represented with :c:type:`struct vmem_altmap`
 that is eventually passed to vmemmap_populate() through a long chain
 of function calls. The vmemmap_populate() implementation may use the
-`vmem_altmap` along with :c:func:`altmap_alloc_block_buf` helper to
+`vmem_altmap` along with :c:func:`vmemmap_alloc_block_buf` helper to
 allocate memory map on the persistent memory device.
 
 ZONE_DEVICE
--- a/include/linux/mm.h~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/include/linux/mm.h
@@ -2972,8 +2972,8 @@ pte_t *vmemmap_pte_populate(pmd_t *pmd,
 			    struct vmem_altmap *altmap);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
-void *vmemmap_alloc_block_buf(unsigned long size, int node);
-void *altmap_alloc_block_buf(unsigned long size, struct vmem_altmap *altmap);
+void *vmemmap_alloc_block_buf(unsigned long size, int node,
+			      struct vmem_altmap *altmap);
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
 			       int node, struct vmem_altmap *altmap);
--- a/mm/sparse-vmemmap.c~mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf
+++ a/mm/sparse-vmemmap.c
@@ -69,11 +69,19 @@ void * __meminit vmemmap_alloc_block(uns
 				__pa(MAX_DMA_ADDRESS));
 }
 
+static void * __meminit altmap_alloc_block_buf(unsigned long size,
+					       struct vmem_altmap *altmap);
+
 /* need to make sure size is all the same during early stage */
-void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node)
+void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node,
+					 struct vmem_altmap *altmap)
 {
-	void *ptr = sparse_buffer_alloc(size);
+	void *ptr;
+
+	if (altmap)
+		return altmap_alloc_block_buf(size, altmap);
 
+	ptr = sparse_buffer_alloc(size);
 	if (!ptr)
 		ptr = vmemmap_alloc_block(size, node);
 	return ptr;
@@ -94,15 +102,8 @@ static unsigned long __meminit vmem_altm
 	return 0;
 }
 
-/**
- * altmap_alloc_block_buf - allocate pages from the device page map
- * @altmap:	device page map
- * @size:	size (in bytes) of the allocation
- *
- * Allocations are aligned to the size of the request.
- */
-void * __meminit altmap_alloc_block_buf(unsigned long size,
-		struct vmem_altmap *altmap)
+static void * __meminit altmap_alloc_block_buf(unsigned long size,
+					       struct vmem_altmap *altmap)
 {
 	unsigned long pfn, nr_pfns, nr_align;
 
@@ -147,10 +148,7 @@ pte_t * __meminit vmemmap_pte_populate(p
 		pte_t entry;
 		void *p;
 
-		if (altmap)
-			p = altmap_alloc_block_buf(PAGE_SIZE, altmap);
-		else
-			p = vmemmap_alloc_block_buf(PAGE_SIZE, node);
+		p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
 		if (!p)
 			return NULL;
 		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch
arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (192 preceding siblings ...)
  2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch " Andrew Morton
@ 2020-07-17 21:59 ` Andrew Morton
  2020-07-17 22:00 ` + ocfs2-fix-remounting-needed-after-setfacl-command.patch " Andrew Morton
                   ` (38 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 21:59 UTC (permalink / raw)
  To: anshuman.khandual, benh, bp, catalin.marinas, corbet,
	dan.j.williams, dave.hansen, david, fenghua.yu, hpa, hsinyi,
	justin.he, kirill.shutemov, luto, mark.rutland, mhocko, mingo,
	mm-commits, mpe, palmer, pasha.tatashin, paul.walmsley, paulus,
	peterz, robin.murphy, rppt, steve.capper, tglx, tony.luck, will,
	willy, yuzhao


The patch titled
     Subject: arm64/mm: enable vmem_altmap support for vmemmap mappings
has been added to the -mm tree.  Its filename is
     arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: arm64/mm: enable vmem_altmap support for vmemmap mappings

Device memory ranges, when hot added into ZONE_DEVICE, might require
their vmemmap mapping's backing memory to be allocated from their own
range instead of consuming system memory.  This prevents large system
memory usage for potentially large device memory ranges.  The device
driver communicates this request via the vmem_altmap structure.  The
architecture needs to take this request into account while creating and
tearing down vmemmap mappings.

This enables vmem_altmap support in vmemmap_populate() and vmemmap_free(),
including vmemmap_populate_basepages() which is used for the
ARM64_16K_PAGES and ARM64_64K_PAGES configs.
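
To put the savings in perspective: with 64-byte struct pages on 4K base
pages the vmemmap consumes 1/64th of the mapped range, i.e. roughly 16GB
of system RAM for a 1TB device (illustrative numbers).  For the base-page
configs the populate side of this patch reduces to forwarding the
caller's altmap instead of hardcoding NULL, as in this sketch of the hunk
below:

	int __meminit vmemmap_populate(unsigned long start, unsigned long end,
				       int node, struct vmem_altmap *altmap)
	{
		/* base pages only: let the core allocator honour the altmap */
		return vmemmap_populate_basepages(start, end, node, altmap);
	}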

Link: http://lkml.kernel.org/r/1594004178-8861-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jia He <justin.he@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/mmu.c |   58 +++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 20 deletions(-)

--- a/arch/arm64/mm/mmu.c~arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings
+++ a/arch/arm64/mm/mmu.c
@@ -761,15 +761,20 @@ int kern_addr_valid(unsigned long addr)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-static void free_hotplug_page_range(struct page *page, size_t size)
+static void free_hotplug_page_range(struct page *page, size_t size,
+				    struct vmem_altmap *altmap)
 {
-	WARN_ON(PageReserved(page));
-	free_pages((unsigned long)page_address(page), get_order(size));
+	if (altmap) {
+		vmem_altmap_free(altmap, size >> PAGE_SHIFT);
+	} else {
+		WARN_ON(PageReserved(page));
+		free_pages((unsigned long)page_address(page), get_order(size));
+	}
 }
 
 static void free_hotplug_pgtable_page(struct page *page)
 {
-	free_hotplug_page_range(page, PAGE_SIZE);
+	free_hotplug_page_range(page, PAGE_SIZE, NULL);
 }
 
 static bool pgtable_range_aligned(unsigned long start, unsigned long end,
@@ -792,7 +797,8 @@ static bool pgtable_range_aligned(unsign
 }
 
 static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
-				    unsigned long end, bool free_mapped)
+				    unsigned long end, bool free_mapped,
+				    struct vmem_altmap *altmap)
 {
 	pte_t *ptep, pte;
 
@@ -806,12 +812,14 @@ static void unmap_hotplug_pte_range(pmd_
 		pte_clear(&init_mm, addr, ptep);
 		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 		if (free_mapped)
-			free_hotplug_page_range(pte_page(pte), PAGE_SIZE);
+			free_hotplug_page_range(pte_page(pte),
+						PAGE_SIZE, altmap);
 	} while (addr += PAGE_SIZE, addr < end);
 }
 
 static void unmap_hotplug_pmd_range(pud_t *pudp, unsigned long addr,
-				    unsigned long end, bool free_mapped)
+				    unsigned long end, bool free_mapped,
+				    struct vmem_altmap *altmap)
 {
 	unsigned long next;
 	pmd_t *pmdp, pmd;
@@ -834,16 +842,17 @@ static void unmap_hotplug_pmd_range(pud_
 			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 			if (free_mapped)
 				free_hotplug_page_range(pmd_page(pmd),
-							PMD_SIZE);
+							PMD_SIZE, altmap);
 			continue;
 		}
 		WARN_ON(!pmd_table(pmd));
-		unmap_hotplug_pte_range(pmdp, addr, next, free_mapped);
+		unmap_hotplug_pte_range(pmdp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
 }
 
 static void unmap_hotplug_pud_range(p4d_t *p4dp, unsigned long addr,
-				    unsigned long end, bool free_mapped)
+				    unsigned long end, bool free_mapped,
+				    struct vmem_altmap *altmap)
 {
 	unsigned long next;
 	pud_t *pudp, pud;
@@ -866,16 +875,17 @@ static void unmap_hotplug_pud_range(p4d_
 			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 			if (free_mapped)
 				free_hotplug_page_range(pud_page(pud),
-							PUD_SIZE);
+							PUD_SIZE, altmap);
 			continue;
 		}
 		WARN_ON(!pud_table(pud));
-		unmap_hotplug_pmd_range(pudp, addr, next, free_mapped);
+		unmap_hotplug_pmd_range(pudp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
 }
 
 static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr,
-				    unsigned long end, bool free_mapped)
+				    unsigned long end, bool free_mapped,
+				    struct vmem_altmap *altmap)
 {
 	unsigned long next;
 	p4d_t *p4dp, p4d;
@@ -888,16 +898,24 @@ static void unmap_hotplug_p4d_range(pgd_
 			continue;
 
 		WARN_ON(!p4d_present(p4d));
-		unmap_hotplug_pud_range(p4dp, addr, next, free_mapped);
+		unmap_hotplug_pud_range(p4dp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
 }
 
 static void unmap_hotplug_range(unsigned long addr, unsigned long end,
-				bool free_mapped)
+				bool free_mapped, struct vmem_altmap *altmap)
 {
 	unsigned long next;
 	pgd_t *pgdp, pgd;
 
+	/*
+	 * altmap can only be used as vmemmap mapping backing memory.
+	 * In case the backing memory itself is not being freed, then
+	 * altmap is irrelevant. Warn about this inconsistency when
+	 * encountered.
+	 */
+	WARN_ON(!free_mapped && altmap);
+
 	do {
 		next = pgd_addr_end(addr, end);
 		pgdp = pgd_offset_k(addr);
@@ -906,7 +924,7 @@ static void unmap_hotplug_range(unsigned
 			continue;
 
 		WARN_ON(!pgd_present(pgd));
-		unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped);
+		unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
 }
 
@@ -1070,7 +1088,7 @@ static void free_empty_tables(unsigned l
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_basepages(start, end, node, NULL);
+	return vmemmap_populate_basepages(start, end, node, altmap);
 }
 #else	/* !ARM64_SWAPPER_USES_SECTION_MAPS */
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
@@ -1102,7 +1120,7 @@ int __meminit vmemmap_populate(unsigned
 		if (pmd_none(READ_ONCE(*pmdp))) {
 			void *p = NULL;
 
-			p = vmemmap_alloc_block_buf(PMD_SIZE, node, NULL);
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
 			if (!p)
 				return -ENOMEM;
 
@@ -1120,7 +1138,7 @@ void vmemmap_free(unsigned long start, u
 #ifdef CONFIG_MEMORY_HOTPLUG
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
 
-	unmap_hotplug_range(start, end, true);
+	unmap_hotplug_range(start, end, true, altmap);
 	free_empty_tables(start, end, VMEMMAP_START, VMEMMAP_END);
 #endif
 }
@@ -1411,7 +1429,7 @@ static void __remove_pgd_mapping(pgd_t *
 	WARN_ON(pgdir != init_mm.pgd);
 	WARN_ON((start < PAGE_OFFSET) || (end > PAGE_END));
 
-	unmap_hotplug_range(start, end, false);
+	unmap_hotplug_range(start, end, false, NULL);
 	free_empty_tables(start, end, PAGE_OFFSET, PAGE_END);
 }
 
_

Patches currently in -mm which might be from anshuman.khandual@arm.com are

mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch
mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch
arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + ocfs2-fix-remounting-needed-after-setfacl-command.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (193 preceding siblings ...)
  2020-07-17 21:59 ` + arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch " Andrew Morton
@ 2020-07-17 22:00 ` Andrew Morton
  2020-07-17 23:03 ` + mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch " Andrew Morton
                   ` (37 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 22:00 UTC (permalink / raw)
  To: gechangwei, ghe, jiangqi903, jlbec, junxiao.bi, mark, mm-commits,
	piaojun


The patch titled
     Subject: ocfs2: fix remounting needed after setfacl command
has been added to the -mm tree.  Its filename is
     ocfs2-fix-remounting-needed-after-setfacl-command.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-fix-remounting-needed-after-setfacl-command.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-fix-remounting-needed-after-setfacl-command.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Gang He <ghe@suse.com>
Subject: ocfs2: fix remounting needed after setfacl command

When using the setfacl command to change a file's ACL, the user cannot
get the latest ACL information for the file via the getfacl command until
the file system is remounted.

e.g.
setfacl -m u:ivan:rw /ocfs2/ivan
getfacl /ocfs2/ivan
getfacl: Removing leading '/' from absolute path names
file: ocfs2/ivan
owner: root
group: root
user::rw-
group::r--
mask::r--
other::r--

The latest ACL record ("u:ivan:rw") is not returned by the getfacl
command until remounting.  This is because ocfs2_set_acl() updates the
on-disk xattr but never refreshes the cached ACL on the inode, so getfacl
keeps reading the stale cached entry.  Fix it by updating the cached ACL
once the xattr write succeeds.

Link: http://lkml.kernel.org/r/20200717023751.9922-1-ghe@suse.com
Signed-off-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/acl.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/ocfs2/acl.c~ocfs2-fix-remounting-needed-after-setfacl-command
+++ a/fs/ocfs2/acl.c
@@ -256,6 +256,8 @@ static int ocfs2_set_acl(handle_t *handl
 		ret = ocfs2_xattr_set(inode, name_index, "", value, size, 0);
 
 	kfree(value);
+	if (!ret)
+		set_cached_acl(inode, type, acl);
 
 	return ret;
 }
_

Patches currently in -mm which might be from ghe@suse.com are

ocfs2-fix-remounting-needed-after-setfacl-command.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (194 preceding siblings ...)
  2020-07-17 22:00 ` + ocfs2-fix-remounting-needed-after-setfacl-command.patch " Andrew Morton
@ 2020-07-17 23:03 ` Andrew Morton
  2020-07-20 22:55 ` + scripts-decode_stacktrace-strip-basepath-from-all-paths.patch " Andrew Morton
                   ` (36 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-17 23:03 UTC (permalink / raw)
  To: guro, mm-commits, naresh.kamboju, sfr


The patch titled
     Subject: mm: slab/memcg: fix build on MIPS
has been added to the -mm tree.  Its filename is
     mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: slab/memcg: fix build on MIPS

Naresh reported that linux-next build is broken on MIPS.  The problem is
reproducible using gcc 8 and 9, but not 10.

make -sk KBUILD_BUILD_USER=TuxBuild -C/linux -j16 ARCH=mips
CROSS_COMPILE=mips-linux-gnu- HOSTCC=gcc CC="sccache
mips-linux-gnu-gcc" O=build
../mm/slub.c: In function `slab_alloc.constprop':
../mm/slub.c:2897:30: error: inlining failed in call to always_inline
`slab_alloc.constprop': recursive inlining
 2897 | static __always_inline void *slab_alloc(struct kmem_cache *s,
      |                              ^~~~~~~~~~
../mm/slub.c:2905:14: note: called from here
 2905 |  void *ret = slab_alloc(s, gfpflags, _RET_IP_);
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../mm/slub.c: In function `sysfs_slab_alias':
../mm/slub.c:2897:30: error: inlining failed in call to always_inline
`slab_alloc.constprop': recursive inlining
 2897 | static __always_inline void *slab_alloc(struct kmem_cache *s,
      |                              ^~~~~~~~~~
../mm/slub.c:2905:14: note: called from here
 2905 |  void *ret = slab_alloc(s, gfpflags, _RET_IP_);
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../mm/slub.c: In function `sysfs_slab_add':
../mm/slub.c:2897:30: error: inlining failed in call to always_inline
`slab_alloc.constprop': recursive inlining
 2897 | static __always_inline void *slab_alloc(struct kmem_cache *s,
      |                              ^~~~~~~~~~
../mm/slub.c:2905:14: note: called from here
 2905 |  void *ret = slab_alloc(s, gfpflags, _RET_IP_);
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The problem was introduced by commit "mm: memcg/slab: use a single set of
kmem_caches for all allocations", which added an allocation of the space
for the obj_cgroup vector into the slab post hook and created recursive
inlining.

The easiest way to fix this is to move memcg_alloc_page_obj_cgroups() to
memcontrol.c and make it a generic (not static inline) function.  This
breaks the inlining recursion and fixes the build.
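
The failure is easy to reproduce outside the kernel.  A reduced userspace
illustration (not the kernel code path; any __always_inline function that
ends up inside its own inline expansion makes gcc bail out the same way):

	/* inline-recursion.c: "gcc -c inline-recursion.c" fails with an
	 * "inlining failed in call to always_inline 'fact': recursive
	 * inlining" error like the one quoted above.
	 */
	#define __always_inline inline __attribute__((always_inline))

	static __always_inline unsigned long fact(unsigned long n)
	{
		/* the self-call can never be fully inlined */
		return n < 2 ? 1 : n * fact(n - 1);
	}

	unsigned long fact5(void)
	{
		return fact(5);
	}

In mm/slub.c the cycle is indirect: the post hook allocates with
kcalloc_node(), which eventually recurses back into slab_alloc(); moving
the hook out of line breaks that cycle.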

Link: http://lkml.kernel.org/r/20200717214810.3733082-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   20 ++++++++++++++++++++
 mm/slab.h       |   21 ++-------------------
 2 files changed, 22 insertions(+), 19 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix
+++ a/mm/memcontrol.c
@@ -2800,6 +2800,26 @@ static void commit_charge(struct page *p
 }
 
 #ifdef CONFIG_MEMCG_KMEM
+int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
+				 gfp_t gfp)
+{
+	unsigned int objects = objs_per_slab_page(s, page);
+	void *vec;
+
+	vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp,
+			   page_to_nid(page));
+	if (!vec)
+		return -ENOMEM;
+
+	if (cmpxchg(&page->obj_cgroups, NULL,
+		    (struct obj_cgroup **) ((unsigned long)vec | 0x1UL)))
+		kfree(vec);
+	else
+		kmemleak_not_leak(vec);
+
+	return 0;
+}
+
 /*
  * Returns a pointer to the memory cgroup to which the kernel object is charged.
  *
--- a/mm/slab.h~mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix
+++ a/mm/slab.h
@@ -257,25 +257,8 @@ static inline bool page_has_obj_cgroups(
 	return ((unsigned long)page->obj_cgroups & 0x1UL);
 }
 
-static inline int memcg_alloc_page_obj_cgroups(struct page *page,
-					       struct kmem_cache *s, gfp_t gfp)
-{
-	unsigned int objects = objs_per_slab_page(s, page);
-	void *vec;
-
-	vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp,
-			   page_to_nid(page));
-	if (!vec)
-		return -ENOMEM;
-
-	if (cmpxchg(&page->obj_cgroups, NULL,
-		    (struct obj_cgroup **) ((unsigned long)vec | 0x1UL)))
-		kfree(vec);
-	else
-		kmemleak_not_leak(vec);
-
-	return 0;
-}
+int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
+				 gfp_t gfp);
 
 static inline void memcg_free_page_obj_cgroups(struct page *page)
 {
_

Patches currently in -mm which might be from guro@fb.com are

mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + scripts-decode_stacktrace-strip-basepath-from-all-paths.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (195 preceding siblings ...)
  2020-07-17 23:03 ` + mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch " Andrew Morton
@ 2020-07-20 22:55 ` Andrew Morton
  2020-07-20 23:03 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch " Andrew Morton
                   ` (35 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-20 22:55 UTC (permalink / raw)
  To: drinkcat, mm-commits, pihsun, sashal, shik, swboyd


The patch titled
     Subject: scripts/decode_stacktrace: strip basepath from all paths
has been added to the -mm tree.  Its filename is
     scripts-decode_stacktrace-strip-basepath-from-all-paths.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/scripts-decode_stacktrace-strip-basepath-from-all-paths.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/scripts-decode_stacktrace-strip-basepath-from-all-paths.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Pi-Hsun Shih <pihsun@chromium.org>
Subject: scripts/decode_stacktrace: strip basepath from all paths

Currently the basepath is removed only from the beginning of the string.
When the symbol is inlined and addr2line produces multiple lines of
output, only the first line has the basepath removed.

Change to remove the basepath prefix from all lines.
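
For example, addr2line may emit one line per inlined frame (paths
illustrative):

	/src/kernel/mm/foo.c:123
	/src/kernel/include/linux/bar.h:45

With the old code only the first line lost the /src/kernel/ prefix; the
read loop below strips it from every line before the lines are joined.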

Link: http://lkml.kernel.org/r/20200720082709.252805-1-pihsun@chromium.org
Fixes: 31013836a71e ("scripts/decode_stacktrace: match basepath using shell prefix operator, not regex")
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Signed-off-by: Shik Chen <shik@chromium.org>
Co-developed-by: Shik Chen <shik@chromium.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/decode_stacktrace.sh |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/scripts/decode_stacktrace.sh~scripts-decode_stacktrace-strip-basepath-from-all-paths
+++ a/scripts/decode_stacktrace.sh
@@ -87,8 +87,8 @@ parse_symbol() {
 		return
 	fi
 
-	# Strip out the base of the path
-	code=${code#$basepath/}
+	# Strip out the base of the path on each line
+	code=$(while read -r line; do echo "${line#$basepath/}"; done <<< "$code")
 
 	# In the case of inlines, move everything to same line
 	code=${code//$'\n'/' '}
_

Patches currently in -mm which might be from pihsun@chromium.org are

scripts-decode_stacktrace-strip-basepath-from-all-paths.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (196 preceding siblings ...)
  2020-07-20 22:55 ` + scripts-decode_stacktrace-strip-basepath-from-all-paths.patch " Andrew Morton
@ 2020-07-20 23:03 ` Andrew Morton
  2020-07-20 23:26 ` + mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch " Andrew Morton
                   ` (34 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-20 23:03 UTC (permalink / raw)
  To: akpm, guro, mhocko, mm-commits, sfr


The patch titled
     Subject: mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2
has been added to the -mm tree.  Its filename is
     mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2

comment MAX_THRESHOLD, per Michal

Cc: Roman Gushchin <guro@fb.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmstat.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/vmstat.c~mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2
+++ a/mm/vmstat.c
@@ -167,6 +167,7 @@ EXPORT_SYMBOL(vm_zone_stat);
 EXPORT_SYMBOL(vm_numa_stat);
 EXPORT_SYMBOL(vm_node_stat);
 
+/* Maximum sync threshold for per-cpu vmstat counters */
 #ifdef CONFIG_SMP
 #define MAX_THRESHOLD 125
 #else
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
mmhwpoison-rework-soft-offline-for-in-use-pages-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
linux-next-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (197 preceding siblings ...)
  2020-07-20 23:03 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch " Andrew Morton
@ 2020-07-20 23:26 ` Andrew Morton
  2020-07-20 23:31 ` + ocfs2-suballoch-delete-a-duplicated-word.patch " Andrew Morton
                   ` (33 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-20 23:26 UTC (permalink / raw)
  To: akpm, mm-commits, tangyizhou


The patch titled
     Subject: mm/gup.c: Fix the comment of return value for populate_vma_page_range()
has been added to the -mm tree.  Its filename is
     mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Tang Yizhou <tangyizhou@huawei.com>
Subject: mm/gup.c: Fix the comment of return value for populate_vma_page_range()

The return value of populate_vma_page_range() is consistent with
__get_user_pages(), so make the comment describing its return value
consistent as well.

Link: http://lkml.kernel.org/r/20200720034303.29920-1-tangyizhou@huawei.com
Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/gup.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/gup.c~mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range
+++ a/mm/gup.c
@@ -1404,7 +1404,8 @@ retry:
  *
  * This takes care of mlocking the pages too if VM_LOCKED is set.
  *
- * return 0 on success, negative error code on error.
+ * Return either number of pages pinned in the vma, or a negative error
+ * code on error.
  *
  * vma->vm_mm->mmap_lock must be held.
  *
_

Patches currently in -mm which might be from tangyizhou@huawei.com are

mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + ocfs2-suballoch-delete-a-duplicated-word.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (198 preceding siblings ...)
  2020-07-20 23:26 ` + mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch " Andrew Morton
@ 2020-07-20 23:31 ` Andrew Morton
  2020-07-21  0:26 ` + ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch " Andrew Morton
                   ` (32 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-20 23:31 UTC (permalink / raw)
  To: jlbec, joseph.qi, mark, mm-commits, rdunlap


The patch titled
     Subject: ocfs2: suballoc.h: delete a duplicated word
has been added to the -mm tree.  Its filename is
     ocfs2-suballoch-delete-a-duplicated-word.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-suballoch-delete-a-duplicated-word.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-suballoch-delete-a-duplicated-word.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: ocfs2: suballoc.h: delete a duplicated word

Drop the repeated word "is" in a comment.

Link: http://lkml.kernel.org/r/20200720001421.28823-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/suballoc.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ocfs2/suballoc.h~ocfs2-suballoch-delete-a-duplicated-word
+++ a/fs/ocfs2/suballoc.h
@@ -40,7 +40,7 @@ struct ocfs2_alloc_context {
 
 	u64    ac_last_group;
 	u64    ac_max_block;  /* Highest block number to allocate. 0 is
-				 is the same as ~0 - unlimited */
+				 the same as ~0 - unlimited */
 
 	int    ac_find_loc_only;  /* hack for reflink operation ordering */
 	struct ocfs2_suballoc_result *ac_find_loc_priv; /* */
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (199 preceding siblings ...)
  2020-07-20 23:31 ` + ocfs2-suballoch-delete-a-duplicated-word.patch " Andrew Morton
@ 2020-07-21  0:26 ` Andrew Morton
  2020-07-21  0:27 ` + highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch " Andrew Morton
                   ` (31 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:26 UTC (permalink / raw)
  To: anton, luca.stefani.ge1, michalechner92, mm-commits,
	natechancellor, ndesaulniers


The patch titled
     Subject: ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type
has been added to the -mm tree.  Its filename is
     ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Luca Stefani <luca.stefani.ge1@gmail.com>
Subject: ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type

Clang's Control Flow Integrity (CFI) is a security mechanism that can
help prevent JOP chains; it is deployed extensively in downstream kernels
used in Android.

Its deployment is hindered by mismatches in function signatures.  For
this case, we make the callbacks match their intended function signatures
and cast the parameters within them, rather than casting the callback
when it is passed as a parameter.
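
Reduced to a sketch (struct my_key is hypothetical; the real callbacks
are in the hunks below):

	typedef int (*test_t)(struct inode *, void *);

	/* Bad: a cast hides the prototype mismatch, and the indirect
	 * call through test_t now fails the CFI signature check:
	 *
	 *	int my_test(struct inode *vi, struct my_key *key);
	 *	...
	 *	vi = ilookup5(sb, ino, (test_t)my_test, key);
	 */

	/* Good: match the generic signature, cast inside the body. */
	int my_test(struct inode *vi, void *data)
	{
		struct my_key *key = data;

		return vi->i_ino == key->ino;
	}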

When running `mount -t ntfs ...` we observe the following trace:

Call trace:
__cfi_check_fail+0x1c/0x24
name_to_dev_t+0x0/0x404
iget5_locked+0x594/0x5e8
ntfs_fill_super+0xbfc/0x43ec
mount_bdev+0x30c/0x3cc
ntfs_mount+0x18/0x24
mount_fs+0x1b0/0x380
vfs_kern_mount+0x90/0x398
do_mount+0x5d8/0x1a10
SyS_mount+0x108/0x144
el0_svc_naked+0x34/0x38

Link: http://lkml.kernel.org/r/20200718112513.533800-1-luca.stefani.ge1@gmail.com
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Tested-by: freak07 <michalechner92@googlemail.com>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ntfs/dir.c   |    2 +-
 fs/ntfs/inode.c |   27 ++++++++++++++-------------
 fs/ntfs/inode.h |    4 +---
 fs/ntfs/mft.c   |    4 ++--
 4 files changed, 18 insertions(+), 19 deletions(-)

--- a/fs/ntfs/dir.c~ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type
+++ a/fs/ntfs/dir.c
@@ -1503,7 +1503,7 @@ static int ntfs_dir_fsync(struct file *f
 	na.type = AT_BITMAP;
 	na.name = I30;
 	na.name_len = 4;
-	bmp_vi = ilookup5(vi->i_sb, vi->i_ino, (test_t)ntfs_test_inode, &na);
+	bmp_vi = ilookup5(vi->i_sb, vi->i_ino, ntfs_test_inode, &na);
 	if (bmp_vi) {
  		write_inode_now(bmp_vi, !datasync);
 		iput(bmp_vi);
--- a/fs/ntfs/inode.c~ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type
+++ a/fs/ntfs/inode.c
@@ -30,10 +30,10 @@
 /**
  * ntfs_test_inode - compare two (possibly fake) inodes for equality
  * @vi:		vfs inode which to test
- * @na:		ntfs attribute which is being tested with
+ * @data:	data which is being tested with
  *
  * Compare the ntfs attribute embedded in the ntfs specific part of the vfs
- * inode @vi for equality with the ntfs attribute @na.
+ * inode @vi for equality with the ntfs attribute @data.
  *
  * If searching for the normal file/directory inode, set @na->type to AT_UNUSED.
  * @na->name and @na->name_len are then ignored.
@@ -43,8 +43,9 @@
  * NOTE: This function runs with the inode_hash_lock spin lock held so it is not
  * allowed to sleep.
  */
-int ntfs_test_inode(struct inode *vi, ntfs_attr *na)
+int ntfs_test_inode(struct inode *vi, void *data)
 {
+	ntfs_attr *na = (ntfs_attr *)data;
 	ntfs_inode *ni;
 
 	if (vi->i_ino != na->mft_no)
@@ -72,9 +73,9 @@ int ntfs_test_inode(struct inode *vi, nt
 /**
  * ntfs_init_locked_inode - initialize an inode
  * @vi:		vfs inode to initialize
- * @na:		ntfs attribute which to initialize @vi to
+ * @data:	data which to initialize @vi to
  *
- * Initialize the vfs inode @vi with the values from the ntfs attribute @na in
+ * Initialize the vfs inode @vi with the values from the ntfs attribute @data in
  * order to enable ntfs_test_inode() to do its work.
  *
  * If initializing the normal file/directory inode, set @na->type to AT_UNUSED.
@@ -87,8 +88,9 @@ int ntfs_test_inode(struct inode *vi, nt
  * NOTE: This function runs with the inode->i_lock spin lock held so it is not
  * allowed to sleep. (Hence the GFP_ATOMIC allocation.)
  */
-static int ntfs_init_locked_inode(struct inode *vi, ntfs_attr *na)
+static int ntfs_init_locked_inode(struct inode *vi, void *data)
 {
+	ntfs_attr *na = (ntfs_attr *)data;
 	ntfs_inode *ni = NTFS_I(vi);
 
 	vi->i_ino = na->mft_no;
@@ -131,7 +133,6 @@ static int ntfs_init_locked_inode(struct
 	return 0;
 }
 
-typedef int (*set_t)(struct inode *, void *);
 static int ntfs_read_locked_inode(struct inode *vi);
 static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi);
 static int ntfs_read_locked_index_inode(struct inode *base_vi,
@@ -164,8 +165,8 @@ struct inode *ntfs_iget(struct super_blo
 	na.name = NULL;
 	na.name_len = 0;
 
-	vi = iget5_locked(sb, mft_no, (test_t)ntfs_test_inode,
-			(set_t)ntfs_init_locked_inode, &na);
+	vi = iget5_locked(sb, mft_no, ntfs_test_inode,
+			ntfs_init_locked_inode, &na);
 	if (unlikely(!vi))
 		return ERR_PTR(-ENOMEM);
 
@@ -225,8 +226,8 @@ struct inode *ntfs_attr_iget(struct inod
 	na.name = name;
 	na.name_len = name_len;
 
-	vi = iget5_locked(base_vi->i_sb, na.mft_no, (test_t)ntfs_test_inode,
-			(set_t)ntfs_init_locked_inode, &na);
+	vi = iget5_locked(base_vi->i_sb, na.mft_no, ntfs_test_inode,
+			ntfs_init_locked_inode, &na);
 	if (unlikely(!vi))
 		return ERR_PTR(-ENOMEM);
 
@@ -280,8 +281,8 @@ struct inode *ntfs_index_iget(struct ino
 	na.name = name;
 	na.name_len = name_len;
 
-	vi = iget5_locked(base_vi->i_sb, na.mft_no, (test_t)ntfs_test_inode,
-			(set_t)ntfs_init_locked_inode, &na);
+	vi = iget5_locked(base_vi->i_sb, na.mft_no, ntfs_test_inode,
+			ntfs_init_locked_inode, &na);
 	if (unlikely(!vi))
 		return ERR_PTR(-ENOMEM);
 
--- a/fs/ntfs/inode.h~ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type
+++ a/fs/ntfs/inode.h
@@ -253,9 +253,7 @@ typedef struct {
 	ATTR_TYPE type;
 } ntfs_attr;
 
-typedef int (*test_t)(struct inode *, void *);
-
-extern int ntfs_test_inode(struct inode *vi, ntfs_attr *na);
+extern int ntfs_test_inode(struct inode *vi, void *data);
 
 extern struct inode *ntfs_iget(struct super_block *sb, unsigned long mft_no);
 extern struct inode *ntfs_attr_iget(struct inode *base_vi, ATTR_TYPE type,
--- a/fs/ntfs/mft.c~ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type
+++ a/fs/ntfs/mft.c
@@ -958,7 +958,7 @@ bool ntfs_may_write_mft_record(ntfs_volu
 		 * dirty code path of the inode dirty code path when writing
 		 * $MFT occurs.
 		 */
-		vi = ilookup5_nowait(sb, mft_no, (test_t)ntfs_test_inode, &na);
+		vi = ilookup5_nowait(sb, mft_no, ntfs_test_inode, &na);
 	}
 	if (vi) {
 		ntfs_debug("Base inode 0x%lx is in icache.", mft_no);
@@ -1019,7 +1019,7 @@ bool ntfs_may_write_mft_record(ntfs_volu
 		vi = igrab(mft_vi);
 		BUG_ON(vi != mft_vi);
 	} else
-		vi = ilookup5_nowait(sb, na.mft_no, (test_t)ntfs_test_inode,
+		vi = ilookup5_nowait(sb, na.mft_no, ntfs_test_inode,
 				&na);
 	if (!vi) {
 		/*
_

Patches currently in -mm which might be from luca.stefani.ge1@gmail.com are

ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (200 preceding siblings ...)
  2020-07-21  0:26 ` + ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch " Andrew Morton
@ 2020-07-21  0:27 ` Andrew Morton
  2020-07-21  0:28 ` + clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
                   ` (30 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:27 UTC (permalink / raw)
  To: mm-commits, rdunlap


The patch titled
     Subject: include/linux/highmem.h: fix duplicated words in a comment
has been added to the -mm tree.  Its filename is
     highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/highmem.h: fix duplicated words in a comment

Change the doubled word "is" in a comment to "it is".

Link: http://lkml.kernel.org/r/ad605959-0083-4794-8d31-6b073300dd6f@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/highmem.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/highmem.h~highmem-linux-highmemh-fix-duplicated-words-in-a-comment
+++ a/include/linux/highmem.h
@@ -73,7 +73,7 @@ static inline void kunmap(struct page *p
  * no global lock is needed and because the kmap code must perform a global TLB
  * invalidation when the kmap pool wraps.
  *
- * However when holding an atomic kmap is is not legal to sleep, so atomic
+ * However when holding an atomic kmap it is not legal to sleep, so atomic
  * kmaps are appropriate for short, tight code paths only.
  *
  * The use of kmap_atomic/kunmap_atomic is discouraged - kmap/kunmap
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (201 preceding siblings ...)
  2020-07-21  0:27 ` + highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch " Andrew Morton
@ 2020-07-21  0:28 ` Andrew Morton
  2020-07-21  0:30 ` + linux-exportfsh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
                   ` (29 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:28 UTC (permalink / raw)
  To: mm-commits, natechancellor, rdunlap


The patch titled
     Subject: include/linux/compiler-clang.h: drop duplicated word in a comment
has been added to the -mm tree.  Its filename is
     clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/compiler-clang.h: drop duplicated word in a comment

Drop the doubled word "the" in a comment.

Link: http://lkml.kernel.org/r/6a18c301-3505-742f-4dd7-0f38d0e537b9@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/compiler-clang.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/compiler-clang.h~clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment
+++ a/include/linux/compiler-clang.h
@@ -42,7 +42,7 @@
 #endif
 
 /*
- * Not all versions of clang implement the the type-generic versions
+ * Not all versions of clang implement the type-generic versions
  * of the builtin overflow checkers. Fortunately, clang implements
  * __has_builtin allowing us to avoid awkward version
  * checks. Unfortunately, we don't know which version of gcc clang
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + linux-exportfsh-drop-duplicated-word-in-a-comment.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (202 preceding siblings ...)
  2020-07-21  0:28 ` + clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
@ 2020-07-21  0:30 ` Andrew Morton
  2020-07-21  0:30 ` + linux-async_txh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
                   ` (28 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:30 UTC (permalink / raw)
  To: akpm, mm-commits, rdunlap, viro


The patch titled
     Subject: include/linux/exportfs.h: drop duplicated word in a comment
has been added to the -mm tree.  Its filename is
     linux-exportfsh-drop-duplicated-word-in-a-comment.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/linux-exportfsh-drop-duplicated-word-in-a-comment.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/linux-exportfsh-drop-duplicated-word-in-a-comment.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/exportfs.h: drop duplicated word in a comment

Drop the doubled word "a" in a comment.

Link: http://lkml.kernel.org/r/c61b707a-8fd8-5b1b-aab0-679122881543@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/exportfs.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/exportfs.h~linux-exportfsh-drop-duplicated-word-in-a-comment
+++ a/include/linux/exportfs.h
@@ -178,7 +178,7 @@ struct fid {
  * get_name:
  *    @get_name should find a name for the given @child in the given @parent
  *    directory.  The name should be stored in the @name (with the
- *    understanding that it is already pointing to a a %NAME_MAX+1 sized
+ *    understanding that it is already pointing to a %NAME_MAX+1 sized
  *    buffer.   get_name() should return %0 on success, a negative error code
  *    or error.  @get_name will be called without @parent->i_mutex held.
  *
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
linux-exportfsh-drop-duplicated-word-in-a-comment.patch
linux-async_txh-drop-duplicated-word-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + linux-async_txh-drop-duplicated-word-in-a-comment.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (203 preceding siblings ...)
  2020-07-21  0:30 ` + linux-exportfsh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
@ 2020-07-21  0:30 ` Andrew Morton
  2020-07-21  0:31 ` + frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch " Andrew Morton
                   ` (27 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:30 UTC (permalink / raw)
  To: dan.j.williams, mm-commits, rdunlap


The patch titled
     Subject: include/linux/async_tx.h: drop duplicated word in a comment
has been added to the -mm tree.  Its filename is
     linux-async_txh-drop-duplicated-word-in-a-comment.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/linux-async_txh-drop-duplicated-word-in-a-comment.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/linux-async_txh-drop-duplicated-word-in-a-comment.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/async_tx.h: drop duplicated word in a comment

Drop the doubled word "the" in a comment.

Link: http://lkml.kernel.org/r/e85802f7-8f48-8b4c-29b3-ea237a2c7ae9@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/async_tx.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/async_tx.h~linux-async_txh-drop-duplicated-word-in-a-comment
+++ a/include/linux/async_tx.h
@@ -36,7 +36,7 @@ struct dma_chan_ref {
 /**
  * async_tx_flags - modifiers for the async_* calls
  * @ASYNC_TX_XOR_ZERO_DST: this flag must be used for xor operations where the
- * the destination address is not a source.  The asynchronous case handles this
+ * destination address is not a source.  The asynchronous case handles this
  * implicitly, the synchronous case needs to zero the destination block.
  * @ASYNC_TX_XOR_DROP_DST: this flag must be used if the destination address is
  * also one of the source addresses.  In the synchronous case the destination
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
linux-exportfsh-drop-duplicated-word-in-a-comment.patch
linux-async_txh-drop-duplicated-word-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (204 preceding siblings ...)
  2020-07-21  0:30 ` + linux-async_txh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
@ 2020-07-21  0:31 ` Andrew Morton
  2020-07-21  0:33 ` + memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch " Andrew Morton
                   ` (26 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:31 UTC (permalink / raw)
  To: akpm, konrad.wilk, mm-commits, rdunlap


The patch titled
     Subject: include/linux/frontswap.h: drop duplicated word in a comment
has been added to the -mm tree.  Its filename is
     frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/frontswap.h: drop duplicated word in a comment

Drop the doubled word "in" in a comment.

Link: http://lkml.kernel.org/r/3af7ed91-ad62-8445-40a4-9e07a64b9523@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/frontswap.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/frontswap.h~frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment
+++ a/include/linux/frontswap.h
@@ -10,7 +10,7 @@
 /*
  * Return code to denote that requested number of
  * frontswap pages are unused(moved to page cache).
- * Used in in shmem_unuse and try_to_unuse.
+ * Used in shmem_unuse and try_to_unuse.
  */
 #define FRONTSWAP_PAGES_UNUSED	2
 
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
linux-exportfsh-drop-duplicated-word-in-a-comment.patch
linux-async_txh-drop-duplicated-word-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (205 preceding siblings ...)
  2020-07-21  0:31 ` + frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch " Andrew Morton
@ 2020-07-21  0:33 ` Andrew Morton
  2020-07-21  0:34 ` + xz-drop-duplicated-word-in-linux-xzh.patch " Andrew Morton
                   ` (25 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:33 UTC (permalink / raw)
  To: chris, hannes, mhocko, mm-commits, rdunlap, vdavydov.dev


The patch titled
     Subject: include/linux/memcontrol.h: drop duplicate word and fix spello
has been added to the -mm tree.  Its filename is
     memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/memcontrol.h: drop duplicate word and fix spello

Drop the doubled word "for" in a comment.
Fix spello of "incremented".

Link: http://lkml.kernel.org/r/b04aa2e4-7c95-12f0-599d-43d07fb28134@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/memcontrol.h~memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh
+++ a/include/linux/memcontrol.h
@@ -72,8 +72,8 @@ struct mem_cgroup_id {
 
 /*
  * Per memcg event counter is incremented at every pagein/pageout. With THP,
- * it will be incremated by the number of pages. This counter is used for
- * for trigger some periodic events. This is straightforward and better
+ * it will be incremented by the number of pages. This counter is used
+ * to trigger some periodic events. This is straightforward and better
  * than using jiffies etc. to handle periodic memcg event.
  */
 enum mem_cgroup_events_target {
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch
memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
linux-exportfsh-drop-duplicated-word-in-a-comment.patch
linux-async_txh-drop-duplicated-word-in-a-comment.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + xz-drop-duplicated-word-in-linux-xzh.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (206 preceding siblings ...)
  2020-07-21  0:33 ` + memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch " Andrew Morton
@ 2020-07-21  0:34 ` Andrew Morton
  2020-07-21  2:07 ` mmotm 2020-07-20-19-06 uploaded Andrew Morton
                   ` (24 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  0:34 UTC (permalink / raw)
  To: lasse.collin, mm-commits, rdunlap


The patch titled
     Subject: include/linux/xz.h: drop duplicated word
has been added to the -mm tree.  Its filename is
     xz-drop-duplicated-word-in-linux-xzh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/xz-drop-duplicated-word-in-linux-xzh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/xz-drop-duplicated-word-in-linux-xzh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap@infradead.org>
Subject: include/linux/xz.h: drop duplicated word

Drop the doubled word "than" in a comment.

Link: http://lkml.kernel.org/r/05ebba7a-c1e4-01ae-fc7b-15c081b33f3e@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/xz.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/xz.h~xz-drop-duplicated-word-in-linux-xzh
+++ a/include/linux/xz.h
@@ -28,7 +28,7 @@
  * enum xz_mode - Operation mode
  *
  * @XZ_SINGLE:              Single-call mode. This uses less RAM than
- *                          than multi-call modes, because the LZMA2
+ *                          multi-call modes, because the LZMA2
  *                          dictionary doesn't need to be allocated as
  *                          part of the decoder state. All required data
  *                          structures are allocated at initialization,
_

Patches currently in -mm which might be from rdunlap@infradead.org are

ocfs2-suballoch-delete-a-duplicated-word.patch
linux-sched-mmh-drop-duplicated-words-in-comments.patch
mm-drop-duplicated-words-in-linux-pgtableh.patch
mm-drop-duplicated-words-in-linux-mmh.patch
highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch
memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch
clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
linux-exportfsh-drop-duplicated-word-in-a-comment.patch
linux-async_txh-drop-duplicated-word-in-a-comment.patch
xz-drop-duplicated-word-in-linux-xzh.patch
autofs-fix-doubled-word.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* mmotm 2020-07-20-19-06 uploaded
  2020-07-03 22:14 incoming Andrew Morton
                   ` (207 preceding siblings ...)
  2020-07-21  0:34 ` + xz-drop-duplicated-word-in-linux-xzh.patch " Andrew Morton
@ 2020-07-21  2:07 ` Andrew Morton
  2020-07-21 20:49 ` + fork-silence-a-false-postive-warning-in-__mmdrop.patch added to -mm tree Andrew Morton
                   ` (23 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21  2:07 UTC (permalink / raw)
  To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
	mhocko, mm-commits, sfr

The mm-of-the-moment snapshot 2020-07-20-19-06 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random, hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

	https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

	https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.8-rc6:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
* mailmap-add-entry-for-mike-rapoport.patch
* squashfs-fix-length-field-overlap-check-in-metadata-reading.patch
* scripts-decode_stacktrace-strip-basepath-from-all-paths.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* const_structscheckpatch-add-regulator_ops.patch
* scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch
* ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch
* ocfs2-fix-remounting-needed-after-setfacl-command.patch
* ocfs2-suballoch-delete-a-duplicated-word.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-handle-page-mapping-better-in-dump_page-fix.patch
* mm-dump-compound-page-information-on-a-second-line.patch
* mm-print-head-flags-in-dump_page.patch
* mm-switch-dump_page-to-get_kernel_nofault.patch
* mm-print-the-inode-number-in-dump_page.patch
* mm-print-hashed-address-of-struct-page.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* tmpfs-per-superblock-i_ino-support.patch
* tmpfs-support-64-bit-inums-per-sb.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
* mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch
* mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch
* mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch
* mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch
* memcg-oom-check-memcg-margin-for-parallel-oom.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* percpu_counter-add-percpu_counter_sync.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch
* riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch
* mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch
* mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch
* arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch
* mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
* mm-mremap-calculate-extent-in-one-place.patch
* mm-mremap-start-addresses-are-properly-aligned.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* mm-sparse-cleanup-the-code-surrounding-memory_present.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* mm-vmallocc-remove-bug-from-the-find_va_links.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* rcu-kasan-record-and-print-call_rcu-call-stack-v8.patch
* kasan-record-and-print-the-free-track.patch
* kasan-record-and-print-the-free-track-v8.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
* mm-thp-replace-http-links-with-https-ones.patch
* mm-thp-replace-http-links-with-https-ones-fix.patch
* mm-vmscanc-fixed-typo.patch
* mm-vmscan-consistent-update-to-pgrefill.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate-v3.patch
* doc-mm-sync-up-oom_score_adj-documentation.patch
* doc-mm-clarify-proc-pid-oom_score-value-range.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes-v2.patch
* mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch
* mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch
* mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch
* mm-memory_hotplug-introduce-default-dummy-memory_add_physaddr_to_nid.patch
* mm-memory_hotplug-fix-unpaired-mem_hotplug_begin-done.patch
* linux-sched-mmh-drop-duplicated-words-in-comments.patch
* mm-drop-duplicated-words-in-linux-pgtableh.patch
* mm-drop-duplicated-words-in-linux-mmh.patch
* highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch
* frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch
* memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch
* syscalls-use-uaccess_kernel-in-addr_limit_user_check.patch
* nds32-use-uaccess_kernel-in-show_regs.patch
* riscv-include-asm-pgtableh-in-asm-uaccessh.patch
* uaccess-remove-segment_eq.patch
* uaccess-add-force_uaccess_beginend-helpers.patch
* uaccess-add-force_uaccess_beginend-helpers-v2.patch
* exec-use-force_uaccess_begin-during-exec-and-exit.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* proc-sysctl-make-protected_-world-readable.patch
* clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch
* linux-exportfsh-drop-duplicated-word-in-a-comment.patch
* linux-async_txh-drop-duplicated-word-in-a-comment.patch
* xz-drop-duplicated-word-in-linux-xzh.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* autofs-fix-doubled-word.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fs-ufs-avoid-potential-u32-multiplication-overflow.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fat-fix-fat_ra_init-for-data-clusters-==-0.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
* ipc-uninline-functions.patch
* ipc-shmc-remove-the-superfluous-break.patch
  linux-next.patch
  linux-next-rejects.patch
* mm-page_isolation-prefer-the-node-of-the-source-page.patch
* mm-migrate-move-migration-helper-from-h-to-c.patch
* mm-hugetlb-unify-migration-callbacks.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
* mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
* mm-migrate-make-a-standard-migration-target-allocation-function.patch
* mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
* mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
* mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch
* scripts-deprecated_terms-sync-with-inclusive-terms.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code-fix.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
  make-sure-nobodys-leaking-resources.patch
  releasing-resources-with-children.patch
  mutex-subsystem-synchro-test-module.patch
  kernel-forkc-export-kernel_thread-to-modules.patch
  workaround-for-a-pci-restoring-bug.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

* + fork-silence-a-false-postive-warning-in-__mmdrop.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (208 preceding siblings ...)
  2020-07-21  2:07 ` mmotm 2020-07-20-19-06 uploaded Andrew Morton
@ 2020-07-21 20:49 ` Andrew Morton
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure.patch " Andrew Morton
                   ` (22 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 20:49 UTC (permalink / raw)
  To: cai, mm-commits, mpe, peterz, stable


The patch titled
     Subject: fork: silence a false positive warning in __mmdrop
has been added to the -mm tree.  Its filename is
     fork-silence-a-false-postive-warning-in-__mmdrop.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fork-silence-a-false-postive-warning-in-__mmdrop.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fork-silence-a-false-postive-warning-in-__mmdrop.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Qian Cai <cai@lca.pw>
Subject: fork: silence a false positive warning in __mmdrop

commit bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
delayed the assignment

idle->active_mm = &init_mm;

from idle_task_exit() into finish_cpu(), which results in a false
positive from the warning originally added in commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").

 WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
 __mmdrop+0x230/0x2c0
 do_exit+0x424/0xfa0
 Call Trace:
 do_exit+0x424/0xfa0
 do_group_exit+0x64/0xd0
 sys_exit_group+0x24/0x30
 system_call_exception+0x108/0x1d0
 system_call_common+0xf0/0x278

Link: http://lkml.kernel.org/r/20200604150344.1796-1-cai@lca.pw
Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai <cai@lca.pw>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/fork.c |    1 -
 1 file changed, 1 deletion(-)

--- a/kernel/fork.c~fork-silence-a-false-postive-warning-in-__mmdrop
+++ a/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
 {
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
-	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	mmu_notifier_subscriptions_destroy(mm);
_

Patches currently in -mm which might be from cai@lca.pw are

fork-silence-a-false-postive-warning-in-__mmdrop.patch
mm-page_alloc-silence-a-kasan-false-positive.patch
mm-kmemleak-silence-kcsan-splats-in-checksum.patch
mm-frontswap-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races-v2.patch
mm-swap_state-mark-various-intentional-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races-v2.patch
mm-page_counter-fix-various-data-races-at-memsw.patch
mm-memcontrol-fix-a-data-race-in-scan-count.patch
mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
mm-mempool-fix-a-data-race-in-mempool_free.patch
mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
mm-annotate-a-data-race-in-page_zonenum.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + io-mapping-indicate-mapping-failure.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (209 preceding siblings ...)
  2020-07-21 20:49 ` + fork-silence-a-false-postive-warning-in-__mmdrop.patch added to -mm tree Andrew Morton
@ 2020-07-21 20:57 ` Andrew Morton
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure-fix.patch " Andrew Morton
                   ` (21 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 20:57 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, chris, michael.j.ruhl, mm-commits, rppt, stable


The patch titled
     Subject: io-mapping: indicate mapping failure
has been added to the -mm tree.  Its filename is
     io-mapping-indicate-mapping-failure.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/io-mapping-indicate-mapping-failure.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/io-mapping-indicate-mapping-failure.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Subject: io-mapping: indicate mapping failure

The !ATOMIC_IOMAP version of io_mapping_init_wc() will always return
success, even when the ioremap fails.

Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error, this is unexpected.

During a device probe in which the ioremap failed, the crash can look
like this:

BUG: unable to handle page fault for address: 0000000000210000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 177 Comm:
 RIP: 0010:fill_page_dma [i915]
  gen8_ppgtt_create [i915]
  i915_ppgtt_create [i915]
  intel_gt_init [i915]
  i915_gem_init [i915]
  i915_driver_probe [i915]
  pci_device_probe
  really_probe
  driver_probe_device

The remap failure occurred much earlier in the probe.  If it had
been propagated, the driver would have exited with an error.

Return NULL on ioremap failure.

Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/io-mapping.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure
+++ a/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *io
 	iomap->prot = pgprot_noncached(PAGE_KERNEL);
 #endif
 
-	return iomap;
+	return iomap->iomem ? iomap : NULL;
 }
 
 static inline void
_
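
As a caller-side sketch of why this matters during probe (the function
and error value below are illustrative, not taken from the i915 driver):

  #include <linux/io-mapping.h>

  static int example_probe_wc(struct io_mapping *iomap,
  			    resource_size_t base, unsigned long size)
  {
  	/*
  	 * With this fix a failed ioremap is visible as a NULL return,
  	 * instead of a struct io_mapping carrying a bad ->iomem pointer.
  	 */
  	if (!io_mapping_init_wc(iomap, base, size))
  		return -ENOMEM;	/* propagate the failure out of probe */

  	return 0;
  }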

Patches currently in -mm which might be from michael.j.ruhl@intel.com are

io-mapping-indicate-mapping-failure.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + io-mapping-indicate-mapping-failure-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (210 preceding siblings ...)
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure.patch " Andrew Morton
@ 2020-07-21 20:57 ` Andrew Morton
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch " Andrew Morton
                   ` (20 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 20:57 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, chris, michael.j.ruhl, mm-commits, rppt


The patch titled
     Subject: io-mapping-indicate-mapping-failure-fix
has been added to the -mm tree.  Its filename is
     io-mapping-indicate-mapping-failure-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/io-mapping-indicate-mapping-failure-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/io-mapping-indicate-mapping-failure-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: io-mapping-indicate-mapping-failure-fix

detect ioremap_wc() errors earlier

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/io-mapping.h |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
+++ a/include/linux/io-mapping.h
@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
 		   resource_size_t base,
 		   unsigned long size)
 {
+	iomap->iomem = ioremap_wc(base, size);
+	if (!iomap->iomem)
+		return NULL;
+
 	iomap->base = base;
 	iomap->size = size;
-	iomap->iomem = ioremap_wc(base, size);
 #if defined(pgprot_noncached_wc) /* archs can't agree on a name ... */
 	iomap->prot = pgprot_noncached_wc(PAGE_KERNEL);
 #elif defined(pgprot_writecombine)
@@ -118,7 +121,7 @@ io_mapping_init_wc(struct io_mapping *io
 	iomap->prot = pgprot_noncached(PAGE_KERNEL);
 #endif
 
-	return iomap->iomem ? iomap : NULL;
+	return iomap;
 }
 
 static inline void
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
io-mapping-indicate-mapping-failure-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
mmhwpoison-rework-soft-offline-for-in-use-pages-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
linux-next-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (211 preceding siblings ...)
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure-fix.patch " Andrew Morton
@ 2020-07-21 21:06 ` Andrew Morton
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch " Andrew Morton
                   ` (19 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 21:06 UTC (permalink / raw)
  To: axboe, hch, jannh, keescook, luto, mathieu.desnoyers, mm-commits,
	npiggin, peterz, stable, will


The patch titled
     Subject: mm: fix kthread_use_mm() vs TLB invalidate
has been added to the -mm tree.  Its filename is
     mm-fix-kthread_use_mm-vs-tlb-invalidate.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Zijlstra <peterz@infradead.org>
Subject: mm: fix kthread_use_mm() vs TLB invalidate

For SMP systems using IPI-based TLB invalidation, looking at
current->active_mm is entirely reasonable.  This then presents the
following race condition:

  CPU0			CPU1

  flush_tlb_mm(mm)	use_mm(mm)
    <send-IPI>
			  tsk->active_mm = mm;
			  <IPI>
			    if (tsk->active_mm == mm)
			      // flush TLBs
			  </IPI>
			  switch_mm(old_mm,mm,tsk);

Here it is possible that the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI landed before we actually switched.

Avoid this by disabling IRQs across changing ->active_mm and switch_mm().

[ This might be harmless for various architecture-specific reasons, but
best not to leave the door open at all.  ]
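
A sketch of the receiving side of such an IPI shows where the check in
the diagram lives (names here are illustrative; the real handlers are
architecture-specific):

  #include <linux/sched.h>
  #include <linux/mm_types.h>
  #include <asm/tlbflush.h>

  static void flush_tlb_ipi(void *info)	/* runs in IPI context */
  {
  	struct mm_struct *mm = info;

  	/*
  	 * If ->active_mm already points at @mm but switch_mm() has not
  	 * run yet, this flushes on behalf of @mm while the CPU still
  	 * holds @old_mm's translations -- the race described above.
  	 */
  	if (current->active_mm == mm)
  		local_flush_tlb();
  }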

Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass.net
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kthread.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate
+++ a/kernel/kthread.c
@@ -1239,13 +1239,15 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
 		mmgrab(mm);
 		tsk->active_mm = mm;
 	}
 	tsk->mm = mm;
-	switch_mm(active_mm, mm, tsk);
+	switch_mm_irqs_off(active_mm, mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 #ifdef finish_arch_post_lock_switch
 	finish_arch_post_lock_switch();
@@ -1274,9 +1276,11 @@ void kthread_unuse_mm(struct mm_struct *
 
 	task_lock(tsk);
 	sync_mm_rss(mm);
+	local_irq_disable();
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
_

Patches currently in -mm which might be from peterz@infradead.org are

mm-fix-kthread_use_mm-vs-tlb-invalidate.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (212 preceding siblings ...)
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch " Andrew Morton
@ 2020-07-21 21:06 ` Andrew Morton
  2020-07-21 21:18 ` + kernel-add-a-kernel_wait-helper.patch " Andrew Morton
                   ` (18 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 21:06 UTC (permalink / raw)
  To: akpm, mm-commits, peterz


The patch titled
     Subject: mm-fix-kthread_use_mm-vs-tlb-invalidate-fix
has been added to the -mm tree.  Its filename is
     mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-fix-kthread_use_mm-vs-tlb-invalidate-fix

add comment

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kthread.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate-fix
+++ a/kernel/kthread.c
@@ -1239,6 +1239,7 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	/* Hold off tlb flush IPIs while switching mm's */
 	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
_

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
io-mapping-indicate-mapping-failure-fix.patch
mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
mmhwpoison-rework-soft-offline-for-in-use-pages-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
linux-next-rejects.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + kernel-add-a-kernel_wait-helper.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (213 preceding siblings ...)
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch " Andrew Morton
@ 2020-07-21 21:18 ` Andrew Morton
  2020-07-21 21:20 ` + maintainers-add-kcov-section.patch " Andrew Morton
                   ` (17 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 21:18 UTC (permalink / raw)
  To: akpm, ebiederm, hch, mcgrof, mm-commits


The patch titled
     Subject: kernel: add a kernel_wait helper
has been added to the -mm tree.  Its filename is
     kernel-add-a-kernel_wait-helper.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kernel-add-a-kernel_wait-helper.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kernel-add-a-kernel_wait-helper.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Christoph Hellwig <hch@lst.de>
Subject: kernel: add a kernel_wait helper

Add a helper that waits for a pid and stores the exit status in the
passed-in kernel pointer.  Use it to fix the usage of kernel_wait4()
in call_usermodehelper_exec_sync(), which only happens to work due to
the implicit set_fs(KERNEL_DS) for kernel threads.
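
A sketch of the intended call pattern from a kernel thread follows;
my_helper_fn and arg are placeholders, not code from this patch:

    pid_t pid = kernel_thread(my_helper_fn, arg, SIGCHLD);
    int status = 0;

    /* do_wait() returns the reaped pid (> 0) on success */
    if (pid >= 0 && kernel_wait(pid, &status) > 0)
            pr_debug("helper exited, status %d\n", status);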

Link: http://lkml.kernel.org/r/20200721130449.5008-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sched/task.h |    1 +
 kernel/exit.c              |   16 ++++++++++++++++
 kernel/umh.c               |   29 ++++-------------------------
 3 files changed, 21 insertions(+), 25 deletions(-)

--- a/include/linux/sched/task.h~kernel-add-a-kernel_wait-helper
+++ a/include/linux/sched/task.h
@@ -102,6 +102,7 @@ struct task_struct *fork_idle(int);
 struct mm_struct *copy_init_mm(void);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
+int kernel_wait(pid_t pid, int *stat);
 
 extern void free_task(struct task_struct *tsk);
 
--- a/kernel/exit.c~kernel-add-a-kernel_wait-helper
+++ a/kernel/exit.c
@@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user
 	return ret;
 }
 
+int kernel_wait(pid_t pid, int *stat)
+{
+	struct wait_opts wo = {
+		.wo_type	= PIDTYPE_PID,
+		.wo_pid		= find_get_pid(pid),
+		.wo_flags	= WEXITED,
+	};
+	int ret;
+
+	ret = do_wait(&wo);
+	if (ret > 0 && wo.wo_stat)
+		*stat = wo.wo_stat;
+	put_pid(wo.wo_pid);
+	return ret;
+}
+
 SYSCALL_DEFINE4(wait4, pid_t, upid, int __user *, stat_addr,
 		int, options, struct rusage __user *, ru)
 {
--- a/kernel/umh.c~kernel-add-a-kernel_wait-helper
+++ a/kernel/umh.c
@@ -130,37 +130,16 @@ static void call_usermodehelper_exec_syn
 {
 	pid_t pid;
 
-	/* If SIGCLD is ignored kernel_wait4 won't populate the status. */
+	/* If SIGCLD is ignored do_wait won't populate the status. */
 	kernel_sigaction(SIGCHLD, SIG_DFL);
 	pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
-	if (pid < 0) {
+	if (pid < 0)
 		sub_info->retval = pid;
-	} else {
-		int ret = -ECHILD;
-		/*
-		 * Normally it is bogus to call wait4() from in-kernel because
-		 * wait4() wants to write the exit code to a userspace address.
-		 * But call_usermodehelper_exec_sync() always runs as kernel
-		 * thread (workqueue) and put_user() to a kernel address works
-		 * OK for kernel threads, due to their having an mm_segment_t
-		 * which spans the entire address space.
-		 *
-		 * Thus the __user pointer cast is valid here.
-		 */
-		kernel_wait4(pid, (int __user *)&ret, 0, NULL);
-
-		/*
-		 * If ret is 0, either call_usermodehelper_exec_async failed and
-		 * the real error code is already in sub_info->retval or
-		 * sub_info->retval is 0 anyway, so don't mess with it then.
-		 */
-		if (ret)
-			sub_info->retval = ret;
-	}
+	else
+		kernel_wait(pid, &sub_info->retval);
 
 	/* Restore default kernel sig handler */
 	kernel_sigaction(SIGCHLD, SIG_IGN);
-
 	umh_complete(sub_info);
 }
 
_

Patches currently in -mm which might be from hch@lst.de are

syscalls-use-uaccess_kernel-in-addr_limit_user_check.patch
nds32-use-uaccess_kernel-in-show_regs.patch
riscv-include-asm-pgtableh-in-asm-uaccessh.patch
uaccess-remove-segment_eq.patch
uaccess-add-force_uaccess_beginend-helpers.patch
uaccess-add-force_uaccess_beginend-helpers-v2.patch
exec-use-force_uaccess_begin-during-exec-and-exit.patch
kernel-add-a-kernel_wait-helper.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + maintainers-add-kcov-section.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (214 preceding siblings ...)
  2020-07-21 21:18 ` + kernel-add-a-kernel_wait-helper.patch " Andrew Morton
@ 2020-07-21 21:20 ` Andrew Morton
  2020-07-21 21:21 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch " Andrew Morton
                   ` (16 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 21:20 UTC (permalink / raw)
  To: andreyknvl, dvyukov, elver, glider, mm-commits


The patch titled
     Subject: MAINTAINERS: add KCOV section
has been added to the -mm tree.  Its filename is
     maintainers-add-kcov-section.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/maintainers-add-kcov-section.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/maintainers-add-kcov-section.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrey Konovalov <andreyknvl@google.com>
Subject: MAINTAINERS: add KCOV section

Add a MAINTAINERS section for KCOV, linking it to the kasan-dev@
mailing list.

Link: http://lkml.kernel.org/r/5fa344db7ac4af2213049e5656c0f43d6ecaa379.1595331682.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/MAINTAINERS~maintainers-add-kcov-section
+++ a/MAINTAINERS
@@ -9305,6 +9305,17 @@ F:	Documentation/kbuild/kconfig*
 F:	scripts/Kconfig.include
 F:	scripts/kconfig/
 
+KCOV
+R:	Dmitry Vyukov <dvyukov@google.com>
+R:	Andrey Konovalov <andreyknvl@google.com>
+L:	kasan-dev@googlegroups.com
+S:	Maintained
+F:	Documentation/dev-tools/kcov.rst
+F:	include/linux/kcov.h
+F:	include/uapi/linux/kcov.h
+F:	kernel/kcov.c
+F:	scripts/Makefile.kcov
+
 KCSAN
 M:	Marco Elver <elver@google.com>
 R:	Dmitry Vyukov <dvyukov@google.com>
_

Patches currently in -mm which might be from andreyknvl@google.com are

maintainers-add-kcov-section.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (215 preceding siblings ...)
  2020-07-21 21:20 ` + maintainers-add-kcov-section.patch " Andrew Morton
@ 2020-07-21 21:21 ` Andrew Morton
  2020-07-24  0:26 ` + scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch " Andrew Morton
                   ` (15 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-21 21:21 UTC (permalink / raw)
  To: mike.kravetz, mm-commits, sfr


The patch titled
     Subject: mm/hugetlb: better checks before using hugetlb_cma
has been added to the -mm tree.  Its filename is
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Stephen Rothwell <sfr@canb.auug.org.au>
Subject: mm/hugetlb: better checks before using hugetlb_cma
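
hugetlb_cma[] is only defined when CONFIG_CMA is enabled, so test with
#ifdef CONFIG_CMA rather than IS_ENABLED(CONFIG_CMA); the latter still
leaves compile-time references to the array in !CMA builds.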

Link: http://lkml.kernel.org/r/20200721205716.6dbaa56b@canb.auug.org.au
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix
+++ a/mm/hugetlb.c
@@ -1238,9 +1238,10 @@ static void free_gigantic_page(struct pa
 	 * If the page isn't allocated using the cma allocator,
 	 * cma_release() returns false.
 	 */
-	if (IS_ENABLED(CONFIG_CMA) &&
-	    cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+#ifdef CONFIG_CMA
+	if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
 		return;
+#endif
 
 	free_contig_range(page_to_pfn(page), 1 << order);
 }
@@ -1251,7 +1252,8 @@ static struct page *alloc_gigantic_page(
 {
 	unsigned long nr_pages = 1UL << huge_page_order(h);
 
-	if (IS_ENABLED(CONFIG_CMA)) {
+#ifdef CONFIG_CMA
+	{
 		struct page *page;
 		int node;
 
@@ -1265,6 +1267,7 @@ static struct page *alloc_gigantic_page(
 				return page;
 		}
 	}
+#endif
 
 	return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
 }
_

Patches currently in -mm which might be from sfr@canb.auug.org.au are

mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (216 preceding siblings ...)
  2020-07-21 21:21 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch " Andrew Morton
@ 2020-07-24  0:26 ` Andrew Morton
  2020-07-24  0:47 ` + mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch " Andrew Morton
                   ` (14 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:26 UTC (permalink / raw)
  To: jan.kiszka, kbingham, mm-commits, sgarzare


The patch titled
     Subject: scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
has been added to the -mm tree.  Its filename is
     scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Stefano Garzarella <sgarzare@redhat.com>
Subject: scripts/gdb: fix lx-symbols 'gdb.error' while loading modules

Commit ed66f991bb19 ("module: Refactor section attr into bin
attribute") removed the 'name' field from 'struct module_sect_attr',
triggering the following error when invoking lx-symbols:

  (gdb) lx-symbols
  loading vmlinux
  scanning for modules in linux/build
  loading @0xffffffffc014f000: linux/build/drivers/net/tun.ko
  Python Exception <class 'gdb.error'> There is no member named name.:
  Error occurred in Python: There is no member named name.

This patch fixes the issue by taking the section name from the
'struct attribute' embedded in the new 'struct bin_attribute'.
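
For reference, after that commit the structure looks roughly like this
(simplified), which is why the name must now be reached through the
'battr' member:

    struct module_sect_attr {
            struct bin_attribute battr;  /* battr.attr.name holds the section name */
            unsigned long address;
    };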

Link: http://lkml.kernel.org/r/20200722102239.313231-1-sgarzare@redhat.com
Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Kieran Bingham <kbingham@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/gdb/linux/symbols.py |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/gdb/linux/symbols.py~scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules
+++ a/scripts/gdb/linux/symbols.py
@@ -96,7 +96,7 @@ lx-symbols command."""
             return ""
         attrs = sect_attrs['attrs']
         section_name_to_address = {
-            attrs[n]['name'].string(): attrs[n]['address']
+            attrs[n]['battr']['attr']['name'].string(): attrs[n]['address']
             for n in range(int(sect_attrs['nsections']))}
         args = []
         for section_name in [".data", ".data..read_mostly", ".rodata", ".bss",
_

Patches currently in -mm which might be from sgarzare@redhat.com are

scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (217 preceding siblings ...)
  2020-07-24  0:26 ` + scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:47 ` + mm-vmscan-protect-the-workingset-on-anonymous-lru.patch " Andrew Morton
                   ` (13 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/vmscan: make active/inactive ratio as 1:1 for anon lru
has been added to the -mm tree.  Its filename is
     mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/vmscan: make active/inactive ratio as 1:1 for anon lru

Patch series "workingset protection/detection on the anonymous LRU list", v7.


* PROBLEM
In the current implementation, a newly created or swapped-in anonymous
page starts on the active list.  Growing the active list triggers
rebalancing of the active/inactive lists, so old pages on the active
list are demoted to the inactive list.  Hence, hot pages on the active
list aren't protected at all.

The following is an example of this situation.

Assume there are 50 hot pages on the active list and the system can
hold 100 pages in total.  Numbers denote the number of pages on the
active/inactive lists (active | inactive); (h) stands for hot pages
and (uo) for used-once pages.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

As we can see, the hot pages are swapped out, which will cause swap-ins later.

* SOLUTION
Since this is what we want to avoid, this patchset implements
workingset protection.  As with the file LRU list, a newly created or
swapped-in anonymous page starts on the inactive list and, also as
with the file LRU list, is promoted once it is referenced enough.
This simple modification changes the above example as follows.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(h) | 50(uo)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(uo)

The hot pages remain on the active list. :)

* EXPERIMENT
I tested this scenario on my test bed and confirmed that this problem
occurs with the current implementation.  I also checked that it is
fixed by this patchset.


* SUBJECT
workingset detection

* PROBLEM
The later part of the patchset implements workingset detection for the
anonymous LRU list.  There is a corner case where workingset
protection can cause thrashing; if we can avoid that thrashing through
workingset detection, we get better performance.

The following is an example of thrashing caused by workingset protection.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (will be hot) pages
50(h) | 50(wh)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

4. workload: 50 (will be hot) pages
50(h) | 50(wh), swap-in 50(wh)

5. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

6. repeat 4, 5

Without workingset detection, this kind of workload can never be
promoted, and the thrashing continues forever.

* SOLUTION
Therefore, this patchset implements workingset detection.  All the
infrastructure for workingset detection already exists, so there is
not much work to do.  First, extend the workingset detection code to
deal with the anonymous LRU list.  Then, make the swap cache handle
the exceptional (shadow) entries.  Lastly, install/retrieve the shadow
value into/from the swap cache and check the refault distance, as
sketched below.
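
For reference, a simplified sketch of the existing refault distance
test in mm/workingset.c that gets extended to anon pages:

    /*
     * eviction: non-resident age when the page was evicted, as
     * packed into the shadow entry; refault: that age at refault.
     */
    refault_distance = (refault - eviction) & EVICTION_MASK;
    if (refault_distance <= workingset_size)
            SetPageActive(page);    /* refaulted within the workingset */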

* EXPERIMENT
I made a test program that imitates the above scenario and confirmed
that the problem exists.  Then I checked that this patchset fixes it.

My test setup is a virtual machine with 8 cpus and 6100MB of memory,
but the amount of memory that the test program can actually use is
about 280MB.  This is because the system uses a large ram-backed swap
and a large ramdisk to capture the trace.

The test scenario is as below.

1. allocate cold memory (512MB)
2. allocate hot-1 memory (96MB)
3. activate hot-1 memory (96MB)
4. allocate another hot-2 memory (96MB)
5. access cold memory (128MB)
6. access hot-2 memory (96MB)
7. repeat 5, 6

Since the hot-1 memory (96MB) is on the active list, the inactive list
can contain roughly 190MB of pages.  The hot-2 memory's re-access
interval (96+128MB) is more than 190MB, so without workingset
detection it cannot be promoted and swap-in/out happens repeatedly.
With this patchset, workingset detection works and promotion happens,
so fewer swap-ins/outs occur.

Here is the result (average of 5 runs):

type    swap-in   swap-out
base    863240    989945
patch   681565    809273

As we can see, the patched kernel does fewer swap-ins/outs.

* OVERALL TEST (ebizzy using modified random function)
ebizzy is a test program in which the main thread allocates lots of
memory and child threads access it randomly for a given time.
Swap-ins will happen if the allocated memory is larger than the system
memory.

A random function following the zipf distribution is used to create
hot/cold memory.  The hot/cold ratio is controlled by the zipf
parameter: if the parameter is high, hot memory is accessed much more
often than cold memory; if it is low, the number of accesses to each
area is similar.  I used various parameters in order to show the
effect of the patchset on workloads with various hot/cold ratios.

My test setup is a virtual machine with 8 cpus, 1024 MB memory and 5120 MB
ram swap.

The result format is as follows.

param: 1-1024.0-0.1
- 1 (number of threads)
- 1024.0 (allocated memory size, MB)
- 0.1 (zipf distribution alpha;
  0.1 behaves roughly like uniform random access,
  1.3 means a small portion of memory is hot and the rest is cold)

pswpin: smaller is better
std: standard deviation
improvement: negative is better

* single thread
           param        pswpin       std       improvement
      base 1-1024.0-0.1 14101983.40   79441.19
      prot 1-1024.0-0.1 14065875.80  136413.01  (   -0.26 )
    detect 1-1024.0-0.1 13910435.60  100804.82  (   -1.36 )
      base 1-1024.0-0.7 7998368.80   43469.32
      prot 1-1024.0-0.7 7622245.80   88318.74  (   -4.70 )
    detect 1-1024.0-0.7 7618515.20   59742.07  (   -4.75 )
      base 1-1024.0-1.3 1017400.80   38756.30
      prot 1-1024.0-1.3  940464.60   29310.69  (   -7.56 )
    detect 1-1024.0-1.3  945511.40   24579.52  (   -7.07 )
      base 1-1280.0-0.1 22895541.40   50016.08
      prot 1-1280.0-0.1 22860305.40   51952.37  (   -0.15 )
    detect 1-1280.0-0.1 22705565.20   93380.35  (   -0.83 )
      base 1-1280.0-0.7 13717645.60   46250.65
      prot 1-1280.0-0.7 12935355.80   64754.43  (   -5.70 )
    detect 1-1280.0-0.7 13040232.00   63304.00  (   -4.94 )
      base 1-1280.0-1.3 1654251.40    4159.68
      prot 1-1280.0-1.3 1522680.60   33673.50  (   -7.95 )
    detect 1-1280.0-1.3 1599207.00   70327.89  (   -3.33 )
      base 1-1536.0-0.1 31621775.40   31156.28
      prot 1-1536.0-0.1 31540355.20   62241.36  (   -0.26 )
    detect 1-1536.0-0.1 31420056.00  123831.27  (   -0.64 )
      base 1-1536.0-0.7 19620760.60   60937.60
      prot 1-1536.0-0.7 18337839.60   56102.58  (   -6.54 )
    detect 1-1536.0-0.7 18599128.00   75289.48  (   -5.21 )
      base 1-1536.0-1.3 2378142.40   20994.43
      prot 1-1536.0-1.3 2166260.60   48455.46  (   -8.91 )
    detect 1-1536.0-1.3 2183762.20   16883.24  (   -8.17 )
      base 1-1792.0-0.1 40259714.80   90750.70
      prot 1-1792.0-0.1 40053917.20   64509.47  (   -0.51 )
    detect 1-1792.0-0.1 39949736.40  104989.64  (   -0.77 )
      base 1-1792.0-0.7 25704884.40   69429.68
      prot 1-1792.0-0.7 23937389.00   79945.60  (   -6.88 )
    detect 1-1792.0-0.7 24271902.00   35044.30  (   -5.57 )
      base 1-1792.0-1.3 3129497.00   32731.86
      prot 1-1792.0-1.3 2796994.40   19017.26  (  -10.62 )
    detect 1-1792.0-1.3 2886840.40   33938.82  (   -7.75 )
      base 1-2048.0-0.1 48746924.40   50863.88
      prot 1-2048.0-0.1 48631954.40   24537.30  (   -0.24 )
    detect 1-2048.0-0.1 48509419.80   27085.34  (   -0.49 )
      base 1-2048.0-0.7 32046424.40   78624.22
      prot 1-2048.0-0.7 29764182.20   86002.26  (   -7.12 )
    detect 1-2048.0-0.7 30250315.80  101282.14  (   -5.60 )
      base 1-2048.0-1.3 3916723.60   24048.55
      prot 1-2048.0-1.3 3490781.60   33292.61  (  -10.87 )
    detect 1-2048.0-1.3 3585002.20   44942.04  (   -8.47 )

* multi thread
           param        pswpin       std       improvement
      base 8-1024.0-0.1 16219822.60  329474.01
      prot 8-1024.0-0.1 15959494.00  654597.45  (   -1.61 )
    detect 8-1024.0-0.1 15773790.80  502275.25  (   -2.75 )
      base 8-1024.0-0.7 9174107.80  537619.33
      prot 8-1024.0-0.7 8571915.00  385230.08  (   -6.56 )
    detect 8-1024.0-0.7 8489484.20  364683.00  (   -7.46 )
      base 8-1024.0-1.3 1108495.60   83555.98
      prot 8-1024.0-1.3 1038906.20   63465.20  (   -6.28 )
    detect 8-1024.0-1.3  941817.80   32648.80  (  -15.04 )
      base 8-1280.0-0.1 25776114.20  450480.45
      prot 8-1280.0-0.1 25430847.00  465627.07  (   -1.34 )
    detect 8-1280.0-0.1 25282555.00  465666.55  (   -1.91 )
      base 8-1280.0-0.7 15218968.00  702007.69
      prot 8-1280.0-0.7 13957947.80  492643.86  (   -8.29 )
    detect 8-1280.0-0.7 14158331.20  238656.02  (   -6.97 )
      base 8-1280.0-1.3 1792482.80   30512.90
      prot 8-1280.0-1.3 1577686.40   34002.62  (  -11.98 )
    detect 8-1280.0-1.3 1556133.00   22944.79  (  -13.19 )
      base 8-1536.0-0.1 33923761.40  575455.85
      prot 8-1536.0-0.1 32715766.20  300633.51  (   -3.56 )
    detect 8-1536.0-0.1 33158477.40  117764.51  (   -2.26 )
      base 8-1536.0-0.7 20628907.80  303851.34
      prot 8-1536.0-0.7 19329511.20  341719.31  (   -6.30 )
    detect 8-1536.0-0.7 20013934.00  385358.66  (   -2.98 )
      base 8-1536.0-1.3 2588106.40  130769.20
      prot 8-1536.0-1.3 2275222.40   89637.06  (  -12.09 )
    detect 8-1536.0-1.3 2365008.40  124412.55  (   -8.62 )
      base 8-1792.0-0.1 43328279.20  946469.12
      prot 8-1792.0-0.1 41481980.80  525690.89  (   -4.26 )
    detect 8-1792.0-0.1 41713944.60  406798.93  (   -3.73 )
      base 8-1792.0-0.7 27155647.40  536253.57
      prot 8-1792.0-0.7 24989406.80  502734.52  (   -7.98 )
    detect 8-1792.0-0.7 25524806.40  263237.87  (   -6.01 )
      base 8-1792.0-1.3 3260372.80  137907.92
      prot 8-1792.0-1.3 2879187.80   63597.26  (  -11.69 )
    detect 8-1792.0-1.3 2892962.20   33229.13  (  -11.27 )
      base 8-2048.0-0.1 50583989.80  710121.48
      prot 8-2048.0-0.1 49599984.40  228782.42  (   -1.95 )
    detect 8-2048.0-0.1 50578596.00  660971.66  (   -0.01 )
      base 8-2048.0-0.7 33765479.60  812659.55
      prot 8-2048.0-0.7 30767021.20  462907.24  (   -8.88 )
    detect 8-2048.0-0.7 32213068.80  211884.24  (   -4.60 )
      base 8-2048.0-1.3 3941675.80   28436.45
      prot 8-2048.0-1.3 3538742.40   76856.08  (  -10.22 )
    detect 8-2048.0-1.3 3579397.80   58630.95  (   -9.19 )

As we can see, all cases show improvement; in particular, the test
cases with zipf parameter 1.3 show larger improvements.  This means
that the more pronounced the hot/cold tendency of the anon pages, the
better this patchset works.


This patch (of 6):

The current implementation of LRU management for anonymous pages has
some problems.  The most important one is that it doesn't protect the
workingset, that is, the pages on the active LRU list.  Although this
problem will be fixed by the following patches, some preparation is
required first, and that is what this patch does.

What the following patches do is implement workingset protection:
newly created or swapped-in pages will start their lifetime on the
inactive list.  If the inactive list is too small, those pages have
too little chance of being referenced and can never become part of
the workingset.

In order to give newly added anonymous or swapped-in pages enough
chance to be referenced again, this patch makes the active/inactive
anon LRU ratio 1:1.

This is just a temporary measure.  A later patch in the series
introduces workingset detection for the anonymous LRU, which will be
used to better decide whether pages should start on the active or the
inactive list; afterwards this patch is effectively reverted.
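
As a concrete example of the heuristic being changed below: for a 4GB
file LRU, gb = 4 and inactive_ratio = int_sqrt(10 * 4) = 6, so the
inactive list is only considered low below a roughly 1:6
inactive:active split; for the anon LRU, this patch forces
inactive_ratio = 1, i.e. a 1:1 split.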

Link: http://lkml.kernel.org/r/1595490560-15117-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1595490560-15117-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c~mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru
+++ a/mm/vmscan.c
@@ -2208,7 +2208,7 @@ static bool inactive_is_low(struct lruve
 	active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
-	if (gb)
+	if (gb && is_file_lru(inactive_lru))
 		inactive_ratio = int_sqrt(10 * gb);
 	else
 		inactive_ratio = 1;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmscan-protect-the-workingset-on-anonymous-lru.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (218 preceding siblings ...)
  2020-07-24  0:47 ` + mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:47 ` + mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch " Andrew Morton
                   ` (12 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/vmscan: protect the workingset on anonymous LRU
has been added to the -mm tree.  Its filename is
     mm-vmscan-protect-the-workingset-on-anonymous-lru.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-protect-the-workingset-on-anonymous-lru.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/vmscan: protect the workingset on anonymous LRU

In the current implementation, a newly created or swapped-in anonymous
page starts on the active list.  Growing the active list triggers
rebalancing of the active/inactive lists, so old pages on the active
list are demoted to the inactive list.  Hence, pages on the active
list aren't protected at all.

The following is an example of this situation.

Assume there are 50 hot pages on the active list.  Numbers denote the
number of pages on the active/inactive lists (active | inactive).

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

This patch fixes this issue.  As with the file LRU, newly created or
swapped-in anonymous pages are inserted on the inactive list and are
promoted to the active list once they are referenced enough.  This
simple modification changes the above example as follows.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(h) | 50(uo)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(uo)

As you can see, the hot pages on the active list are now protected.

Note that this implementation has a drawback: a page cannot be
promoted and will be swapped out if its re-access interval is greater
than the size of the inactive list but less than the total
(active + inactive) size.  To solve this potential issue, a following
patch will apply workingset detection similar to the one already
applied to the file LRU.

Link: http://lkml.kernel.org/r/1595490560-15117-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/swap.h    |    2 +-
 kernel/events/uprobes.c |    2 +-
 mm/huge_memory.c        |    2 +-
 mm/khugepaged.c         |    2 +-
 mm/memory.c             |    9 ++++-----
 mm/migrate.c            |    2 +-
 mm/swap.c               |   13 +++++++------
 mm/swapfile.c           |    2 +-
 mm/userfaultfd.c        |    2 +-
 mm/vmscan.c             |    4 +---
 10 files changed, 19 insertions(+), 21 deletions(-)

--- a/include/linux/swap.h~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/include/linux/swap.h
@@ -352,7 +352,7 @@ extern void deactivate_page(struct page
 extern void mark_page_lazyfree(struct page *page);
 extern void swap_setup(void);
 
-extern void lru_cache_add_active_or_unevictable(struct page *page,
+extern void lru_cache_add_inactive_or_unevictable(struct page *page,
 						struct vm_area_struct *vma);
 
 /* linux/mm/vmscan.c */
--- a/kernel/events/uprobes.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/kernel/events/uprobes.c
@@ -184,7 +184,7 @@ static int __replace_page(struct vm_area
 	if (new_page) {
 		get_page(new_page);
 		page_add_new_anon_rmap(new_page, vma, addr, false);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		lru_cache_add_inactive_or_unevictable(new_page, vma);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
 		dec_mm_counter(mm, MM_ANONPAGES);
--- a/mm/huge_memory.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/huge_memory.c
@@ -640,7 +640,7 @@ static vm_fault_t __do_huge_pmd_anonymou
 		entry = mk_huge_pmd(page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 		page_add_new_anon_rmap(page, vma, haddr, true);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
--- a/mm/khugepaged.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/khugepaged.c
@@ -1173,7 +1173,7 @@ static void collapse_huge_page(struct mm
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
 	page_add_new_anon_rmap(new_page, vma, address, true);
-	lru_cache_add_active_or_unevictable(new_page, vma);
+	lru_cache_add_inactive_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
--- a/mm/memory.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/memory.c
@@ -2715,7 +2715,7 @@ static vm_fault_t wp_page_copy(struct vm
 		 */
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
 		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		lru_cache_add_inactive_or_unevictable(new_page, vma);
 		/*
 		 * We call the notify macro here because, when using secondary
 		 * mmu page tables (such as kvm shadow page tables), we want the
@@ -3266,10 +3266,9 @@ vm_fault_t do_swap_page(struct vm_fault
 	/* ksm created a completely new copy */
 	if (unlikely(page != swapcache && swapcache)) {
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
-		activate_page(page);
 	}
 
 	swap_free(entry);
@@ -3414,7 +3413,7 @@ static vm_fault_t do_anonymous_page(stru
 
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
-	lru_cache_add_active_or_unevictable(page, vma);
+	lru_cache_add_inactive_or_unevictable(page, vma);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
@@ -3672,7 +3671,7 @@ vm_fault_t alloc_set_pte(struct vm_fault
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
--- a/mm/migrate.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/migrate.c
@@ -2822,7 +2822,7 @@ static void migrate_vma_insert_page(stru
 	inc_mm_counter(mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, addr, false);
 	if (!is_zone_device_page(page))
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	get_page(page);
 
 	if (flush) {
--- a/mm/swap.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/swap.c
@@ -476,23 +476,24 @@ void lru_cache_add(struct page *page)
 EXPORT_SYMBOL(lru_cache_add);
 
 /**
- * lru_cache_add_active_or_unevictable
+ * lru_cache_add_inactive_or_unevictable
  * @page:  the page to be added to LRU
  * @vma:   vma in which page is mapped for determining reclaimability
  *
- * Place @page on the active or unevictable LRU list, depending on its
+ * Place @page on the inactive or unevictable LRU list, depending on its
  * evictability.  Note that if the page is not evictable, it goes
  * directly back onto it's zone's unevictable list, it does NOT use a
  * per cpu pagevec.
  */
-void lru_cache_add_active_or_unevictable(struct page *page,
+void lru_cache_add_inactive_or_unevictable(struct page *page,
 					 struct vm_area_struct *vma)
 {
+	bool unevictable;
+
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
-	if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
-		SetPageActive(page);
-	else if (!TestSetPageMlocked(page)) {
+	unevictable = (vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) == VM_LOCKED;
+	if (unlikely(unevictable) && !TestSetPageMlocked(page)) {
 		/*
 		 * We use the irq-unsafe __mod_zone_page_stat because this
 		 * counter is not modified from interrupt context, and the pte
--- a/mm/swapfile.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/swapfile.c
@@ -1915,7 +1915,7 @@ static int unuse_pte(struct vm_area_stru
 		page_add_anon_rmap(page, vma, addr, false);
 	} else { /* ksm created a completely new copy */
 		page_add_new_anon_rmap(page, vma, addr, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	}
 	swap_free(entry);
 	/*
--- a/mm/userfaultfd.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/userfaultfd.c
@@ -123,7 +123,7 @@ static int mcopy_atomic_pte(struct mm_st
 
 	inc_mm_counter(dst_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, dst_vma, dst_addr, false);
-	lru_cache_add_active_or_unevictable(page, dst_vma);
+	lru_cache_add_inactive_or_unevictable(page, dst_vma);
 
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
 
--- a/mm/vmscan.c~mm-vmscan-protect-the-workingset-on-anonymous-lru
+++ a/mm/vmscan.c
@@ -998,8 +998,6 @@ static enum page_references page_check_r
 		return PAGEREF_RECLAIM;
 
 	if (referenced_ptes) {
-		if (PageSwapBacked(page))
-			return PAGEREF_ACTIVATE;
 		/*
 		 * All mapped pages start out with page table
 		 * references from the instantiating fault, so we need
@@ -1022,7 +1020,7 @@ static enum page_references page_check_r
 		/*
 		 * Activate file-backed executable pages after first usage.
 		 */
-		if (vm_flags & VM_EXEC)
+		if ((vm_flags & VM_EXEC) && !PageSwapBacked(page))
 			return PAGEREF_ACTIVATE;
 
 		return PAGEREF_KEEP;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (219 preceding siblings ...)
  2020-07-24  0:47 ` + mm-vmscan-protect-the-workingset-on-anonymous-lru.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:47 ` + mm-swapcache-support-to-handle-the-shadow-entries.patch " Andrew Morton
                   ` (11 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/workingset: prepare the workingset detection infrastructure for anon LRU
has been added to the -mm tree.  Its filename is
     mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/workingset: prepare the workingset detection infrastructure for anon LRU

To prepare for workingset detection on the anon LRU, this patch splits
the workingset event counters for refault, activate and restore into
anon and file variants, and likewise splits the refaults counter in
struct lruvec.
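
The new counters are laid out in anon/file pairs so that a boolean can
index them; a sketch of the idiom used throughout the patch:

    /* file == 0 selects the anon counter, file == 1 the file one */
    bool file = page_is_file_lru(page);

    inc_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file);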

Link: http://lkml.kernel.org/r/1595490560-15117-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h |   16 +++++++++++-----
 mm/memcontrol.c        |   16 +++++++++++-----
 mm/vmscan.c            |   15 ++++++++++-----
 mm/vmstat.c            |    9 ++++++---
 mm/workingset.c        |    8 +++++---
 5 files changed, 43 insertions(+), 21 deletions(-)

--- a/include/linux/mmzone.h~mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru
+++ a/include/linux/mmzone.h
@@ -173,9 +173,15 @@ enum node_stat_item {
 	NR_ISOLATED_ANON,	/* Temporary isolated pages from anon lru */
 	NR_ISOLATED_FILE,	/* Temporary isolated pages from file lru */
 	WORKINGSET_NODES,
-	WORKINGSET_REFAULT,
-	WORKINGSET_ACTIVATE,
-	WORKINGSET_RESTORE,
+	WORKINGSET_REFAULT_BASE,
+	WORKINGSET_REFAULT_ANON = WORKINGSET_REFAULT_BASE,
+	WORKINGSET_REFAULT_FILE,
+	WORKINGSET_ACTIVATE_BASE,
+	WORKINGSET_ACTIVATE_ANON = WORKINGSET_ACTIVATE_BASE,
+	WORKINGSET_ACTIVATE_FILE,
+	WORKINGSET_RESTORE_BASE,
+	WORKINGSET_RESTORE_ANON = WORKINGSET_RESTORE_BASE,
+	WORKINGSET_RESTORE_FILE,
 	WORKINGSET_NODERECLAIM,
 	NR_ANON_MAPPED,	/* Mapped anonymous pages */
 	NR_FILE_MAPPED,	/* pagecache pages mapped into pagetables.
@@ -277,8 +283,8 @@ struct lruvec {
 	unsigned long			file_cost;
 	/* Non-resident age, driven by LRU movement */
 	atomic_long_t			nonresident_age;
-	/* Refaults at the time of last reclaim cycle */
-	unsigned long			refaults;
+	/* Refaults at the time of last reclaim cycle, anon=0, file=1 */
+	unsigned long			refaults[2];
 	/* Various lruvec state flags (enum lruvec_flags) */
 	unsigned long			flags;
 #ifdef CONFIG_MEMCG
--- a/mm/memcontrol.c~mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru
+++ a/mm/memcontrol.c
@@ -1530,12 +1530,18 @@ static char *memory_stat_format(struct m
 	seq_buf_printf(&s, "%s %lu\n", vm_event_name(PGMAJFAULT),
 		       memcg_events(memcg, PGMAJFAULT));
 
-	seq_buf_printf(&s, "workingset_refault %lu\n",
-		       memcg_page_state(memcg, WORKINGSET_REFAULT));
-	seq_buf_printf(&s, "workingset_activate %lu\n",
-		       memcg_page_state(memcg, WORKINGSET_ACTIVATE));
+	seq_buf_printf(&s, "workingset_refault_anon %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_REFAULT_ANON));
+	seq_buf_printf(&s, "workingset_refault_file %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_REFAULT_FILE));
+	seq_buf_printf(&s, "workingset_activate_anon %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_ACTIVATE_ANON));
+	seq_buf_printf(&s, "workingset_activate_file %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_ACTIVATE_FILE));
 	seq_buf_printf(&s, "workingset_restore %lu\n",
-		       memcg_page_state(memcg, WORKINGSET_RESTORE));
+		       memcg_page_state(memcg, WORKINGSET_RESTORE_ANON));
+	seq_buf_printf(&s, "workingset_restore %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_RESTORE_FILE));
 	seq_buf_printf(&s, "workingset_nodereclaim %lu\n",
 		       memcg_page_state(memcg, WORKINGSET_NODERECLAIM));
 
--- a/mm/vmscan.c~mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru
+++ a/mm/vmscan.c
@@ -2683,7 +2683,10 @@ again:
 	if (!sc->force_deactivate) {
 		unsigned long refaults;
 
-		if (inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
+		refaults = lruvec_page_state(target_lruvec,
+				WORKINGSET_ACTIVATE_ANON);
+		if (refaults != target_lruvec->refaults[0] ||
+			inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
 			sc->may_deactivate |= DEACTIVATE_ANON;
 		else
 			sc->may_deactivate &= ~DEACTIVATE_ANON;
@@ -2694,8 +2697,8 @@ again:
 		 * rid of any stale active pages quickly.
 		 */
 		refaults = lruvec_page_state(target_lruvec,
-					     WORKINGSET_ACTIVATE);
-		if (refaults != target_lruvec->refaults ||
+				WORKINGSET_ACTIVATE_FILE);
+		if (refaults != target_lruvec->refaults[1] ||
 		    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
 			sc->may_deactivate |= DEACTIVATE_FILE;
 		else
@@ -2972,8 +2975,10 @@ static void snapshot_refaults(struct mem
 	unsigned long refaults;
 
 	target_lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
-	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE);
-	target_lruvec->refaults = refaults;
+	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_ANON);
+	target_lruvec->refaults[0] = refaults;
+	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_FILE);
+	target_lruvec->refaults[1] = refaults;
 }
 
 /*
--- a/mm/vmstat.c~mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru
+++ a/mm/vmstat.c
@@ -1167,9 +1167,12 @@ const char * const vmstat_text[] = {
 	"nr_isolated_anon",
 	"nr_isolated_file",
 	"workingset_nodes",
-	"workingset_refault",
-	"workingset_activate",
-	"workingset_restore",
+	"workingset_refault_anon",
+	"workingset_refault_file",
+	"workingset_activate_anon",
+	"workingset_activate_file",
+	"workingset_restore_anon",
+	"workingset_restore_file",
 	"workingset_nodereclaim",
 	"nr_anon_pages",
 	"nr_mapped",
--- a/mm/workingset.c~mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru
+++ a/mm/workingset.c
@@ -6,6 +6,7 @@
  */
 
 #include <linux/memcontrol.h>
+#include <linux/mm_inline.h>
 #include <linux/writeback.h>
 #include <linux/shmem_fs.h>
 #include <linux/pagemap.h>
@@ -280,6 +281,7 @@ void *workingset_eviction(struct page *p
  */
 void workingset_refault(struct page *page, void *shadow)
 {
+	bool file = page_is_file_lru(page);
 	struct mem_cgroup *eviction_memcg;
 	struct lruvec *eviction_lruvec;
 	unsigned long refault_distance;
@@ -346,7 +348,7 @@ void workingset_refault(struct page *pag
 	memcg = page_memcg(page);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
-	inc_lruvec_state(lruvec, WORKINGSET_REFAULT);
+	inc_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file);
 
 	/*
 	 * Compare the distance to the existing workingset size. We
@@ -366,7 +368,7 @@ void workingset_refault(struct page *pag
 
 	SetPageActive(page);
 	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
-	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
+	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + file);
 
 	/* Page was active prior to eviction */
 	if (workingset) {
@@ -375,7 +377,7 @@ void workingset_refault(struct page *pag
 		spin_lock_irq(&page_pgdat(page)->lru_lock);
 		lru_note_cost_page(page);
 		spin_unlock_irq(&page_pgdat(page)->lru_lock);
-		inc_lruvec_state(lruvec, WORKINGSET_RESTORE);
+		inc_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file);
 	}
 out:
 	rcu_read_unlock();
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-swapcache-support-to-handle-the-shadow-entries.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (220 preceding siblings ...)
  2020-07-24  0:47 ` + mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:47 ` + mm-swap-implement-workingset-detection-for-anonymous-lru.patch " Andrew Morton
                   ` (10 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/swapcache: support to handle the shadow entries
has been added to the -mm tree.  Its filename is
     mm-swapcache-support-to-handle-the-shadow-entries.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swapcache-support-to-handle-the-shadow-entries.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swapcache-support-to-handle-the-shadow-entries.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/swapcache: support to handle the shadow entries

Workingset detection for anonymous pages will be implemented in the
following patch, and it requires storing shadow entries in the
swapcache.  This patch implements the infrastructure for storing
shadow entries in the swapcache.

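The key pattern, which the diff below implements, is that an evicted
page's slot in the swapcache XArray is not cleared but refilled with a
shadow entry (an XArray value entry), so lookups must distinguish the
two cases.  A minimal sketch of that pattern (illustrative only, not
code from this patch):

	void *shadow = NULL;
	void *entry = xas_load(&xas);

	if (xa_is_value(entry)) {
		/* shadow left behind by an earlier eviction */
		shadow = entry;
	} else if (entry) {
		/* a real struct page is still in the swapcache */
	}
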
Link: http://lkml.kernel.org/r/1595490560-15117-5-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/swap.h |   17 +++++++++---
 mm/shmem.c           |    3 +-
 mm/swap_state.c      |   57 ++++++++++++++++++++++++++++++++++++-----
 mm/swapfile.c        |    2 +
 mm/vmscan.c          |    2 -
 5 files changed, 69 insertions(+), 12 deletions(-)

--- a/include/linux/swap.h~mm-swapcache-support-to-handle-the-shadow-entries
+++ a/include/linux/swap.h
@@ -414,9 +414,13 @@ extern struct address_space *swapper_spa
 extern unsigned long total_swapcache_pages(void);
 extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
-extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
-extern void __delete_from_swap_cache(struct page *, swp_entry_t entry);
+extern int add_to_swap_cache(struct page *page, swp_entry_t entry,
+			gfp_t gfp, void **shadowp);
+extern void __delete_from_swap_cache(struct page *page,
+			swp_entry_t entry, void *shadow);
 extern void delete_from_swap_cache(struct page *);
+extern void clear_shadow_from_swap_cache(int type, unsigned long begin,
+				unsigned long end);
 extern void free_page_and_swap_cache(struct page *);
 extern void free_pages_and_swap_cache(struct page **, int);
 extern struct page *lookup_swap_cache(swp_entry_t entry,
@@ -570,13 +574,13 @@ static inline int add_to_swap(struct pag
 }
 
 static inline int add_to_swap_cache(struct page *page, swp_entry_t entry,
-							gfp_t gfp_mask)
+					gfp_t gfp_mask, void **shadowp)
 {
 	return -1;
 }
 
 static inline void __delete_from_swap_cache(struct page *page,
-							swp_entry_t entry)
+					swp_entry_t entry, void *shadow)
 {
 }
 
@@ -584,6 +588,11 @@ static inline void delete_from_swap_cach
 {
 }
 
+static inline void clear_shadow_from_swap_cache(int type, unsigned long begin,
+				unsigned long end)
+{
+}
+
 static inline int page_swapcount(struct page *page)
 {
 	return 0;
--- a/mm/shmem.c~mm-swapcache-support-to-handle-the-shadow-entries
+++ a/mm/shmem.c
@@ -1434,7 +1434,8 @@ static int shmem_writepage(struct page *
 		list_add(&info->swaplist, &shmem_swaplist);
 
 	if (add_to_swap_cache(page, swap,
-			__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
+			__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN,
+			NULL) == 0) {
 		spin_lock_irq(&info->lock);
 		shmem_recalc_inode(inode);
 		info->swapped++;
--- a/mm/swapfile.c~mm-swapcache-support-to-handle-the-shadow-entries
+++ a/mm/swapfile.c
@@ -696,6 +696,7 @@ static void add_to_avail_list(struct swa
 static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
 			    unsigned int nr_entries)
 {
+	unsigned long begin = offset;
 	unsigned long end = offset + nr_entries - 1;
 	void (*swap_slot_free_notify)(struct block_device *, unsigned long);
 
@@ -721,6 +722,7 @@ static void swap_range_free(struct swap_
 			swap_slot_free_notify(si->bdev, offset);
 		offset++;
 	}
+	clear_shadow_from_swap_cache(si->type, begin, end);
 }
 
 static void set_cluster_next(struct swap_info_struct *si, unsigned long next)
--- a/mm/swap_state.c~mm-swapcache-support-to-handle-the-shadow-entries
+++ a/mm/swap_state.c
@@ -110,12 +110,14 @@ void show_swap_cache_info(void)
  * add_to_swap_cache resembles add_to_page_cache_locked on swapper_space,
  * but sets SwapCache flag and private instead of mapping and index.
  */
-int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
+int add_to_swap_cache(struct page *page, swp_entry_t entry,
+			gfp_t gfp, void **shadowp)
 {
 	struct address_space *address_space = swap_address_space(entry);
 	pgoff_t idx = swp_offset(entry);
 	XA_STATE_ORDER(xas, &address_space->i_pages, idx, compound_order(page));
 	unsigned long i, nr = hpage_nr_pages(page);
+	void *old;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageSwapCache(page), page);
@@ -125,16 +127,25 @@ int add_to_swap_cache(struct page *page,
 	SetPageSwapCache(page);
 
 	do {
+		unsigned long nr_shadows = 0;
+
 		xas_lock_irq(&xas);
 		xas_create_range(&xas);
 		if (xas_error(&xas))
 			goto unlock;
 		for (i = 0; i < nr; i++) {
 			VM_BUG_ON_PAGE(xas.xa_index != idx + i, page);
+			old = xas_load(&xas);
+			if (xa_is_value(old)) {
+				nr_shadows++;
+				if (shadowp)
+					*shadowp = old;
+			}
 			set_page_private(page + i, entry.val + i);
 			xas_store(&xas, page);
 			xas_next(&xas);
 		}
+		address_space->nrexceptional -= nr_shadows;
 		address_space->nrpages += nr;
 		__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
 		ADD_CACHE_INFO(add_total, nr);
@@ -154,7 +165,8 @@ unlock:
  * This must be called only on pages that have
  * been verified to be in the swap cache.
  */
-void __delete_from_swap_cache(struct page *page, swp_entry_t entry)
+void __delete_from_swap_cache(struct page *page,
+			swp_entry_t entry, void *shadow)
 {
 	struct address_space *address_space = swap_address_space(entry);
 	int i, nr = hpage_nr_pages(page);
@@ -166,12 +178,14 @@ void __delete_from_swap_cache(struct pag
 	VM_BUG_ON_PAGE(PageWriteback(page), page);
 
 	for (i = 0; i < nr; i++) {
-		void *entry = xas_store(&xas, NULL);
+		void *entry = xas_store(&xas, shadow);
 		VM_BUG_ON_PAGE(entry != page, entry);
 		set_page_private(page + i, 0);
 		xas_next(&xas);
 	}
 	ClearPageSwapCache(page);
+	if (shadow)
+		address_space->nrexceptional += nr;
 	address_space->nrpages -= nr;
 	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
 	ADD_CACHE_INFO(del_total, nr);
@@ -208,7 +222,7 @@ int add_to_swap(struct page *page)
 	 * Add it to the swap cache.
 	 */
 	err = add_to_swap_cache(page, entry,
-			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
+			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN, NULL);
 	if (err)
 		/*
 		 * add_to_swap_cache() doesn't return -EEXIST, so we can safely
@@ -246,13 +260,44 @@ void delete_from_swap_cache(struct page
 	struct address_space *address_space = swap_address_space(entry);
 
 	xa_lock_irq(&address_space->i_pages);
-	__delete_from_swap_cache(page, entry);
+	__delete_from_swap_cache(page, entry, NULL);
 	xa_unlock_irq(&address_space->i_pages);
 
 	put_swap_page(page, entry);
 	page_ref_sub(page, hpage_nr_pages(page));
 }
 
+void clear_shadow_from_swap_cache(int type, unsigned long begin,
+				unsigned long end)
+{
+	unsigned long curr = begin;
+	void *old;
+
+	for (;;) {
+		unsigned long nr_shadows = 0;
+		swp_entry_t entry = swp_entry(type, curr);
+		struct address_space *address_space = swap_address_space(entry);
+		XA_STATE(xas, &address_space->i_pages, curr);
+
+		xa_lock_irq(&address_space->i_pages);
+		xas_for_each(&xas, old, end) {
+			if (!xa_is_value(old))
+				continue;
+			xas_store(&xas, NULL);
+			nr_shadows++;
+		}
+		address_space->nrexceptional -= nr_shadows;
+		xa_unlock_irq(&address_space->i_pages);
+
+		/* search the next swapcache until we meet end */
+		curr >>= SWAP_ADDRESS_SPACE_SHIFT;
+		curr++;
+		curr <<= SWAP_ADDRESS_SPACE_SHIFT;
+		if (curr > end)
+			break;
+	}
+}
+
 /* 
  * If we are the only user, then try to free up the swap cache. 
  * 
@@ -429,7 +474,7 @@ struct page *__read_swap_cache_async(swp
 	__SetPageSwapBacked(page);
 
 	/* May fail (-ENOMEM) if XArray node allocation failed. */
-	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK)) {
+	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, NULL)) {
 		put_swap_page(page, entry);
 		goto fail_unlock;
 	}
--- a/mm/vmscan.c~mm-swapcache-support-to-handle-the-shadow-entries
+++ a/mm/vmscan.c
@@ -896,7 +896,7 @@ static int __remove_mapping(struct addre
 	if (PageSwapCache(page)) {
 		swp_entry_t swap = { .val = page_private(page) };
 		mem_cgroup_swapout(page, swap);
-		__delete_from_swap_cache(page, swap);
+		__delete_from_swap_cache(page, swap, NULL);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
 		workingset_eviction(page, target_memcg);
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-swap-implement-workingset-detection-for-anonymous-lru.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (221 preceding siblings ...)
  2020-07-24  0:47 ` + mm-swapcache-support-to-handle-the-shadow-entries.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:47 ` + mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch " Andrew Morton
                   ` (9 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/swap: implement workingset detection for anonymous LRU
has been added to the -mm tree.  Its filename is
     mm-swap-implement-workingset-detection-for-anonymous-lru.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-implement-workingset-detection-for-anonymous-lru.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-implement-workingset-detection-for-anonymous-lru.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/swap: implement workingset detection for anonymous LRU

This patch implements workingset detection for the anonymous LRU.  All
the infrastructure was implemented by the previous patches, so this
patch just activates workingset detection by installing/retrieving the
shadow entry and adding the refault calculation.

Link: http://lkml.kernel.org/r/1595490560-15117-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/swap.h |    6 ++++++
 mm/memory.c          |   11 ++++-------
 mm/swap_state.c      |   23 ++++++++++++++++++-----
 mm/vmscan.c          |    7 ++++---
 mm/workingset.c      |   15 +++++++++++----
 5 files changed, 43 insertions(+), 19 deletions(-)

--- a/include/linux/swap.h~mm-swap-implement-workingset-detection-for-anonymous-lru
+++ a/include/linux/swap.h
@@ -414,6 +414,7 @@ extern struct address_space *swapper_spa
 extern unsigned long total_swapcache_pages(void);
 extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
+extern void *get_shadow_from_swap_cache(swp_entry_t entry);
 extern int add_to_swap_cache(struct page *page, swp_entry_t entry,
 			gfp_t gfp, void **shadowp);
 extern void __delete_from_swap_cache(struct page *page,
@@ -573,6 +574,11 @@ static inline int add_to_swap(struct pag
 	return 0;
 }
 
+static inline void *get_shadow_from_swap_cache(swp_entry_t entry)
+{
+	return NULL;
+}
+
 static inline int add_to_swap_cache(struct page *page, swp_entry_t entry,
 					gfp_t gfp_mask, void **shadowp)
 {
--- a/mm/memory.c~mm-swap-implement-workingset-detection-for-anonymous-lru
+++ a/mm/memory.c
@@ -3098,6 +3098,7 @@ vm_fault_t do_swap_page(struct vm_fault
 	int locked;
 	int exclusive = 0;
 	vm_fault_t ret = 0;
+	void *shadow = NULL;
 
 	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
 		goto out;
@@ -3149,13 +3150,9 @@ vm_fault_t do_swap_page(struct vm_fault
 					goto out_page;
 				}
 
-				/*
-				 * XXX: Move to lru_cache_add() when it
-				 * supports new vs putback
-				 */
-				spin_lock_irq(&page_pgdat(page)->lru_lock);
-				lru_note_cost_page(page);
-				spin_unlock_irq(&page_pgdat(page)->lru_lock);
+				shadow = get_shadow_from_swap_cache(entry);
+				if (shadow)
+					workingset_refault(page, shadow);
 
 				lru_cache_add(page);
 				swap_readpage(page, true);
--- a/mm/swap_state.c~mm-swap-implement-workingset-detection-for-anonymous-lru
+++ a/mm/swap_state.c
@@ -106,6 +106,20 @@ void show_swap_cache_info(void)
 	printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
 }
 
+void *get_shadow_from_swap_cache(swp_entry_t entry)
+{
+	struct address_space *address_space = swap_address_space(entry);
+	pgoff_t idx = swp_offset(entry);
+	struct page *page;
+
+	page = find_get_entry(address_space, idx);
+	if (xa_is_value(page))
+		return page;
+	if (page)
+		put_page(page);
+	return NULL;
+}
+
 /*
  * add_to_swap_cache resembles add_to_page_cache_locked on swapper_space,
  * but sets SwapCache flag and private instead of mapping and index.
@@ -406,6 +420,7 @@ struct page *__read_swap_cache_async(swp
 {
 	struct swap_info_struct *si;
 	struct page *page;
+	void *shadow = NULL;
 
 	*new_page_allocated = false;
 
@@ -474,7 +489,7 @@ struct page *__read_swap_cache_async(swp
 	__SetPageSwapBacked(page);
 
 	/* May fail (-ENOMEM) if XArray node allocation failed. */
-	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, NULL)) {
+	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow)) {
 		put_swap_page(page, entry);
 		goto fail_unlock;
 	}
@@ -484,10 +499,8 @@ struct page *__read_swap_cache_async(swp
 		goto fail_unlock;
 	}
 
-	/* XXX: Move to lru_cache_add() when it supports new vs putback */
-	spin_lock_irq(&page_pgdat(page)->lru_lock);
-	lru_note_cost_page(page);
-	spin_unlock_irq(&page_pgdat(page)->lru_lock);
+	if (shadow)
+		workingset_refault(page, shadow);
 
 	/* Caller will initiate read into locked page */
 	SetPageWorkingset(page);
--- a/mm/vmscan.c~mm-swap-implement-workingset-detection-for-anonymous-lru
+++ a/mm/vmscan.c
@@ -854,6 +854,7 @@ static int __remove_mapping(struct addre
 {
 	unsigned long flags;
 	int refcount;
+	void *shadow = NULL;
 
 	BUG_ON(!PageLocked(page));
 	BUG_ON(mapping != page_mapping(page));
@@ -896,13 +897,13 @@ static int __remove_mapping(struct addre
 	if (PageSwapCache(page)) {
 		swp_entry_t swap = { .val = page_private(page) };
 		mem_cgroup_swapout(page, swap);
-		__delete_from_swap_cache(page, swap, NULL);
+		if (reclaimed && !mapping_exiting(mapping))
+			shadow = workingset_eviction(page, target_memcg);
+		__delete_from_swap_cache(page, swap, shadow);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
-		workingset_eviction(page, target_memcg);
 	} else {
 		void (*freepage)(struct page *);
-		void *shadow = NULL;
 
 		freepage = mapping->a_ops->freepage;
 		/*
--- a/mm/workingset.c~mm-swap-implement-workingset-detection-for-anonymous-lru
+++ a/mm/workingset.c
@@ -353,15 +353,22 @@ void workingset_refault(struct page *pag
 	/*
 	 * Compare the distance to the existing workingset size. We
 	 * don't activate pages that couldn't stay resident even if
-	 * all the memory was available to the page cache. Whether
-	 * cache can compete with anon or not depends on having swap.
+	 * all the memory was available to the workingset. Whether
+	 * workingset competition needs to consider anon or not depends
+	 * on having swap.
 	 */
 	workingset_size = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
-	if (mem_cgroup_get_nr_swap_pages(memcg) > 0) {
+	if (!file) {
 		workingset_size += lruvec_page_state(eviction_lruvec,
-						     NR_INACTIVE_ANON);
+						     NR_INACTIVE_FILE);
+	}
+	if (mem_cgroup_get_nr_swap_pages(memcg) > 0) {
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_ACTIVE_ANON);
+		if (file) {
+			workingset_size += lruvec_page_state(eviction_lruvec,
+						     NR_INACTIVE_ANON);
+		}
 	}
 	if (refault_distance > workingset_size)
 		goto out;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (222 preceding siblings ...)
  2020-07-24  0:47 ` + mm-swap-implement-workingset-detection-for-anonymous-lru.patch " Andrew Morton
@ 2020-07-24  0:47 ` Andrew Morton
  2020-07-24  0:57 ` + makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch " Andrew Morton
                   ` (8 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:47 UTC (permalink / raw)
  To: hannes, hughd, iamjoonsoo.kim, mgorman, mhocko, minchan,
	mm-commits, vbabka, willy


The patch titled
     Subject: mm/vmscan: restore active/inactive ratio for anonymous LRU
has been added to the -mm tree.  Its filename is
     mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/vmscan: restore active/inactive ratio for anonymous LRU

Now that workingset detection is implemented for the anonymous LRU, we
no longer need a large inactive list to detect frequently accessed
pages before they are reclaimed.  This effectively reverts the
temporary measure put in by the commit "mm/vmscan: make active/inactive
ratio as 1:1 for anon lru".

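As a worked example of the restored formula: for a 16GB anonymous LRU,
inactive_ratio = int_sqrt(10 * 16) = 12, so the inactive list is only
considered too small once the active list grows past roughly 12 times
its size.
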
Link: http://lkml.kernel.org/r/1595490560-15117-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c~mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru
+++ a/mm/vmscan.c
@@ -2207,7 +2207,7 @@ static bool inactive_is_low(struct lruve
 	active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
-	if (gb && is_file_lru(inactive_lru))
+	if (gb)
 		inactive_ratio = int_sqrt(10 * gb);
 	else
 		inactive_ratio = 1;
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (223 preceding siblings ...)
  2020-07-24  0:47 ` + mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch " Andrew Morton
@ 2020-07-24  0:57 ` Andrew Morton
  2020-07-24  1:09 ` + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch " Andrew Morton
                   ` (7 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  0:57 UTC (permalink / raw)
  To: andi.kleen, andriy.shevchenko, feng.tang, masahiroy, michal.lkml,
	mm-commits, ying.huang


The patch titled
     Subject: ./Makefile: add debug option to enable function aligned on 32 bytes
has been added to the -mm tree.  Its filename is
     makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: ./Makefile: add debug option to enable function aligned on 32 bytes

Recently 0day reported many strange performance changes (regressions or
improvements) in which there was no obvious relation between the
culprit commit and the benchmark at first look, causing people to
suspect that the test itself was wrong.

Upon further checking, many of these cases are caused by a change to
the alignment of kernel text or data: since the whole text/data of the
kernel is linked together, a change in one domain may affect the
alignment of other domains.

gcc has an option, '-falign-functions=n', to force function alignment,
and with that option enabled some of those performance changes go away,
like [1][2][3].

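To illustrate what the flag does, here is a minimal userspace sketch
(not part of this patch); build it with and without
-falign-functions=32 and compare:

	#include <stdio.h>

	static void probe(void)
	{
	}

	int main(void)
	{
		/* with -falign-functions=32 the low 5 address bits are 0 */
		printf("probe at %p, 32-byte aligned: %d\n",
		       (void *)probe,
		       ((unsigned long)probe & 31UL) == 0);
		return 0;
	}
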
Add this option so that developers and 0day can easily find performance
bumps caused by text alignment changes, as tracking these strange bumps
is quite time consuming.  It can't help in other cases, such as the
data alignment changes in [4].

Following is some size data for v5.7 kernel built with a RHEL config used
in 0day:

    text      data      bss	 dec	   filename
  19738771  13292906  5554236  38585913	 vmlinux.noalign
  19758591  13297002  5529660  38585253	 vmlinux.align32

Raw vmlinux size in bytes:

	v5.7		v5.7+align32
	253950832	254018000	+0.02%

Some benchmark data, most of them have no big change:

  * hackbench:		[ -1.8%,  +0.5%]

  * fsmark:		[ -3.2%,  +3.4%]  # ext4/xfs/btrfs

  * kbuild:		[ -2.0%,  +0.9%]

  * will-it-scale:	[ -0.5%,  +1.8%]  # mmap1/pagefault3

  * netperf:
    - TCP_CRR		[+16.6%, +97.4%]
    - TCP_RR		[-18.5%,  -1.8%]
    - TCP_STREAM	[ -1.1%,  +1.9%]

[1] https://lore.kernel.org/lkml/20200114085637.GA29297@shao2-debian/
[2] https://lore.kernel.org/lkml/20200330011254.GA14393@feng-iot/
[3] https://lore.kernel.org/lkml/1d98d1f0-fe84-6df7-f5bd-f4cb2cdb7f45@intel.com/
[4] https://lore.kernel.org/lkml/20200205123216.GO12867@shao2-debian/

Link: http://lkml.kernel.org/r/1595475001-90945-1-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Makefile          |    4 ++++
 lib/Kconfig.debug |   11 +++++++++++
 2 files changed, 15 insertions(+)

--- a/lib/Kconfig.debug~makefile-add-debug-option-to-enable-function-aligned-on-32-bytes
+++ a/lib/Kconfig.debug
@@ -365,6 +365,17 @@ config SECTION_MISMATCH_WARN_ONLY
 
 	  If unsure, say Y.
 
+config DEBUG_FORCE_FUNCTION_ALIGN_32B
+	bool "Force all function address 32B aligned" if EXPERT
+	help
+	  There are cases where a commit from one domain changes the function
+	  address alignment of other domains, causing magic performance
+	  bumps (regressions or improvements). Enabling this option helps to
+	  verify whether a bump is caused by function alignment changes, though
+	  it will slightly increase the kernel size and affect icache usage.
+
+	  It is mainly for debug and performance tuning use.
+
 #
 # Select this config option from the architecture Kconfig, if it
 # is preferred to always offer frame pointers as a config
--- a/Makefile~makefile-add-debug-option-to-enable-function-aligned-on-32-bytes
+++ a/Makefile
@@ -886,6 +886,10 @@ KBUILD_CFLAGS	+= $(CC_FLAGS_SCS)
 export CC_FLAGS_SCS
 endif
 
+ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
+KBUILD_CFLAGS += -falign-functions=32
+endif
+
 # arch Makefile may override CC so keep this after arch Makefile is included
 NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
 
_

Patches currently in -mm which might be from feng.tang@intel.com are

proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
mm-utilc-make-vm_memory_committed-more-accurate.patch
percpu_counter-add-percpu_counter_sync.patch
mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (224 preceding siblings ...)
  2020-07-24  0:57 ` + makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch " Andrew Morton
@ 2020-07-24  1:09 ` Andrew Morton
  2020-07-24  2:12 ` + panic-make-print_oops_end_marker-static.patch " Andrew Morton
                   ` (6 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  1:09 UTC (permalink / raw)
  To: aneesh.kumar, guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, stable, vbabka


The patch titled
     Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
has been added to the -mm tree.  Its filename is
     mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: js1304@gmail.com
Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs


Currently, the memalloc_nocma_{save/restore} APIs, which prevent
allocation from the CMA area, are implemented using
current_gfp_context().  However, there are two problems with this
implementation.

First, it doesn't work for the allocation fastpath.  The fastpath uses
the original gfp_mask, since current_gfp_context() was introduced to
control reclaim and is only applied on the slowpath.  So the CMA area
can be allocated through the allocation fastpath even when the
memalloc_nocma_{save/restore} APIs are used.  Currently there is just
one user of these APIs, and it has a fallback method to prevent an
actual problem.

Second, clearing __GFP_MOVABLE in current_gfp_context() has the side
effect of excluding ZONE_MOVABLE memory from the allocation target.

To fix these problems, this patch changes the implementation to exclude
the CMA area in page allocation via the alloc_flags.  alloc_flags is
mainly used to control allocation, so it is a good fit for excluding
the CMA area.

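For reference, the APIs themselves are used like this (a minimal usage
sketch, not taken from this patch):

	unsigned int flags;
	struct page *page;

	/* no allocation in this section may be served from CMA */
	flags = memalloc_nocma_save();
	page = alloc_page(GFP_HIGHUSER_MOVABLE);
	memalloc_nocma_restore(flags);

With this change the guarantee holds on both the fast and slow paths,
since ALLOC_CMA is never set while PF_MEMALLOC_NOCMA is in effect.
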
Link: http://lkml.kernel.org/r/1595468942-29687-1-git-send-email-iamjoonsoo.kim@lge.com
Fixes: d7fefcc8de91 ("mm/cma: add PF flag to force non cma alloc")
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sched/mm.h |    8 +-------
 mm/page_alloc.c          |   31 +++++++++++++++++++++----------
 2 files changed, 22 insertions(+), 17 deletions(-)

--- a/include/linux/sched/mm.h~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/include/linux/sched/mm.h
@@ -177,12 +177,10 @@ static inline bool in_vfork(struct task_
  * Applies per-task gfp context to the given allocation flags.
  * PF_MEMALLOC_NOIO implies GFP_NOIO
  * PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
  */
 static inline gfp_t current_gfp_context(gfp_t flags)
 {
-	if (unlikely(current->flags &
-		     (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+	if (unlikely(current->flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
 		/*
 		 * NOIO implies both NOIO and NOFS and it is a weaker context
 		 * so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(
 			flags &= ~(__GFP_IO | __GFP_FS);
 		else if (current->flags & PF_MEMALLOC_NOFS)
 			flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
-		if (current->flags & PF_MEMALLOC_NOCMA)
-			flags &= ~__GFP_MOVABLE;
-#endif
 	}
 	return flags;
 }
--- a/mm/page_alloc.c~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/mm/page_alloc.c
@@ -2790,7 +2790,7 @@ __rmqueue(struct zone *zone, unsigned in
 	 * allocating from CMA when over half of the zone's free memory
 	 * is in the CMA area.
 	 */
-	if (migratetype == MIGRATE_MOVABLE &&
+	if (alloc_flags & ALLOC_CMA &&
 	    zone_page_state(zone, NR_FREE_CMA_PAGES) >
 	    zone_page_state(zone, NR_FREE_PAGES) / 2) {
 		page = __rmqueue_cma_fallback(zone, order);
@@ -2801,7 +2801,7 @@ __rmqueue(struct zone *zone, unsigned in
 retry:
 	page = __rmqueue_smallest(zone, order, migratetype);
 	if (unlikely(!page)) {
-		if (migratetype == MIGRATE_MOVABLE)
+		if (alloc_flags & ALLOC_CMA)
 			page = __rmqueue_cma_fallback(zone, order);
 
 		if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3671,6 +3671,20 @@ alloc_flags_nofragment(struct zone *zone
 	return alloc_flags;
 }
 
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+					unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+	unsigned int pflags = current->flags;
+
+	if (!(pflags & PF_MEMALLOC_NOCMA) &&
+			gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+		alloc_flags |= ALLOC_CMA;
+
+#endif
+	return alloc_flags;
+}
+
 /*
  * get_page_from_freelist goes through the zonelist trying to allocate
  * a page.
@@ -4316,10 +4330,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	} else if (unlikely(rt_task(current)) && !in_interrupt())
 		alloc_flags |= ALLOC_HARDER;
 
-#ifdef CONFIG_CMA
-	if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-		alloc_flags |= ALLOC_CMA;
-#endif
+	alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
 	return alloc_flags;
 }
 
@@ -4620,7 +4632,7 @@ retry:
 
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
-		alloc_flags = reserve_flags;
+		alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
 
 	/*
 	 * Reset the nodemask and zonelist iterators if memory policies can be
@@ -4697,7 +4709,7 @@ retry:
 
 	/* Avoid allocations with no watermarks from looping endlessly */
 	if (tsk_is_oom_victim(current) &&
-	    (alloc_flags == ALLOC_OOM ||
+	    (alloc_flags & ALLOC_OOM ||
 	     (gfp_mask & __GFP_NOMEMALLOC)))
 		goto nopage;
 
@@ -4785,8 +4797,7 @@ static inline bool prepare_alloc_pages(g
 	if (should_fail_alloc_page(gfp_mask, order))
 		return false;
 
-	if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
-		*alloc_flags |= ALLOC_CMA;
+	*alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
 
 	return true;
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + panic-make-print_oops_end_marker-static.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (225 preceding siblings ...)
  2020-07-24  1:09 ` + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch " Andrew Morton
@ 2020-07-24  2:12 ` Andrew Morton
  2020-07-24  2:20 ` + lib-kconfigdebug-make-test_lockup-depend-on-module.patch " Andrew Morton
                   ` (5 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:12 UTC (permalink / raw)
  To: huyue2, keescook, mm-commits


The patch titled
     Subject: panic: make print_oops_end_marker() static
has been added to the -mm tree.  Its filename is
     panic-make-print_oops_end_marker-static.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/panic-make-print_oops_end_marker-static.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/panic-make-print_oops_end_marker-static.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yue Hu <huyue2@yulong.com>
Subject: panic: make print_oops_end_marker() static

Since print_oops_end_marker() is not used outside kernel/panic.c, make
it static and remove its declaration from kernel.h at the same time.

Link: http://lkml.kernel.org/r/20200724011516.12756-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kernel.h |    1 -
 kernel/panic.c         |    2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

--- a/include/linux/kernel.h~panic-make-print_oops_end_marker-static
+++ a/include/linux/kernel.h
@@ -321,7 +321,6 @@ void panic(const char *fmt, ...) __noret
 void nmi_panic(struct pt_regs *regs, const char *msg);
 extern void oops_enter(void);
 extern void oops_exit(void);
-void print_oops_end_marker(void);
 extern int oops_may_print(void);
 void do_exit(long error_code) __noreturn;
 void complete_and_exit(struct completion *, long) __noreturn;
--- a/kernel/panic.c~panic-make-print_oops_end_marker-static
+++ a/kernel/panic.c
@@ -551,7 +551,7 @@ static int init_oops_id(void)
 }
 late_initcall(init_oops_id);
 
-void print_oops_end_marker(void)
+static void print_oops_end_marker(void)
 {
 	init_oops_id();
 	pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
_

Patches currently in -mm which might be from huyue2@yulong.com are

panic-make-print_oops_end_marker-static.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + lib-kconfigdebug-make-test_lockup-depend-on-module.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (226 preceding siblings ...)
  2020-07-24  2:12 ` + panic-make-print_oops_end_marker-static.patch " Andrew Morton
@ 2020-07-24  2:20 ` Andrew Morton
  2020-07-24  2:20 ` + lib-test_lockupc-fix-return-value-of-test_lockup_init.patch " Andrew Morton
                   ` (4 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:20 UTC (permalink / raw)
  To: keescook, khlebnikov, linux, mm-commits, yangtiezhu


The patch titled
     Subject: lib/Kconfig.debug: make TEST_LOCKUP depend on module
has been added to the -mm tree.  Its filename is
     lib-kconfigdebug-make-test_lockup-depend-on-module.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/lib-kconfigdebug-make-test_lockup-depend-on-module.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/lib-kconfigdebug-make-test_lockup-depend-on-module.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Tiezhu Yang <yangtiezhu@loongson.cn>
Subject: lib/Kconfig.debug: make TEST_LOCKUP depend on module

Since test_lockup is a test module for generating lockups, it is better
to limit TEST_LOCKUP to being built as a module (=m) or disabled (=n),
because the module parameters cannot be used when CONFIG_TEST_LOCKUP=y.

Link: http://lkml.kernel.org/r/1595555407-29875-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/Kconfig.debug |    1 +
 1 file changed, 1 insertion(+)

--- a/lib/Kconfig.debug~lib-kconfigdebug-make-test_lockup-depend-on-module
+++ a/lib/Kconfig.debug
@@ -1046,6 +1046,7 @@ config WQ_WATCHDOG
 
 config TEST_LOCKUP
 	tristate "Test module to generate lockups"
+	depends on m
 	help
 	  This builds the "test_lockup" module that helps to make sure
 	  that watchdogs and lockup detectors are working properly.
_

Patches currently in -mm which might be from yangtiezhu@loongson.cn are

lib-kconfigdebug-make-test_lockup-depend-on-module.patch
lib-test_lockupc-fix-return-value-of-test_lockup_init.patch
selftests-kmod-use-variable-name-in-kmod_test_0001.patch
kmod-remove-redundant-be-an-in-the-comment.patch
test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
kernel-panicc-make-oops_may_print-return-bool.patch
lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + lib-test_lockupc-fix-return-value-of-test_lockup_init.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (227 preceding siblings ...)
  2020-07-24  2:20 ` + lib-kconfigdebug-make-test_lockup-depend-on-module.patch " Andrew Morton
@ 2020-07-24  2:20 ` Andrew Morton
  2020-07-24  2:25 ` [merged] sh-add-missing-export_symbol-for-__delay.patch removed from " Andrew Morton
                   ` (3 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:20 UTC (permalink / raw)
  To: keescook, khlebnikov, linux, mm-commits, yangtiezhu


The patch titled
     Subject: lib/test_lockup.c: fix return value of test_lockup_init()
has been added to the -mm tree.  Its filename is
     lib-test_lockupc-fix-return-value-of-test_lockup_init.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/lib-test_lockupc-fix-return-value-of-test_lockup_init.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/lib-test_lockupc-fix-return-value-of-test_lockup_init.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Tiezhu Yang <yangtiezhu@loongson.cn>
Subject: lib/test_lockup.c: fix return value of test_lockup_init()

Since filp_open() returns an error pointer, we should use IS_ERR() to
check the return value and, on failure, return PTR_ERR() so that the
actual error is propagated instead of always -EINVAL.

E.g. without this patch:

[root@localhost loongson]# ls no_such_file
ls: cannot access no_such_file: No such file or directory
[root@localhost loongson]# modprobe test_lockup file_path=no_such_file lock_sb_umount time_secs=60 state=S
modprobe: ERROR: could not insert 'test_lockup': Invalid argument
[root@localhost loongson]# dmesg | tail -1
[  126.100596] test_lockup: cannot find file_path

With this patch:

[root@localhost loongson]# ls no_such_file
ls: cannot access no_such_file: No such file or directory
[root@localhost loongson]# modprobe test_lockup file_path=no_such_file lock_sb_umount time_secs=60 state=S
modprobe: ERROR: could not insert 'test_lockup': Unknown symbol in module, or unknown parameter (see dmesg)
[root@localhost loongson]# dmesg | tail -1
[   95.134362] test_lockup: failed to open no_such_file: -2

Link: http://lkml.kernel.org/r/1595555407-29875-2-git-send-email-yangtiezhu@loongson.cn
Fixes: aecd42df6d39 ("lib/test_lockup.c: add parameters for locking generic vfs locks")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_lockup.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/lib/test_lockup.c~lib-test_lockupc-fix-return-value-of-test_lockup_init
+++ a/lib/test_lockup.c
@@ -512,8 +512,8 @@ static int __init test_lockup_init(void)
 	if (test_file_path[0]) {
 		test_file = filp_open(test_file_path, O_RDONLY, 0);
 		if (IS_ERR(test_file)) {
-			pr_err("cannot find file_path\n");
-			return -EINVAL;
+			pr_err("failed to open %s: %ld\n", test_file_path, PTR_ERR(test_file));
+			return PTR_ERR(test_file);
 		}
 		test_inode = file_inode(test_file);
 	} else if (test_lock_inode ||
_

Patches currently in -mm which might be from yangtiezhu@loongson.cn are

lib-kconfigdebug-make-test_lockup-depend-on-module.patch
lib-test_lockupc-fix-return-value-of-test_lockup_init.patch
selftests-kmod-use-variable-name-in-kmod_test_0001.patch
kmod-remove-redundant-be-an-in-the-comment.patch
test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
kernel-panicc-make-oops_may_print-return-bool.patch
lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* [merged] sh-add-missing-export_symbol-for-__delay.patch removed from -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (228 preceding siblings ...)
  2020-07-24  2:20 ` + lib-test_lockupc-fix-return-value-of-test_lockup_init.patch " Andrew Morton
@ 2020-07-24  2:25 ` Andrew Morton
  2020-07-24  2:50 ` + revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch added to " Andrew Morton
                   ` (2 subsequent siblings)
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:25 UTC (permalink / raw)
  To: amodra, bin.meng, chenzhou10, dalias, geert+renesas, glaubitz,
	krzk, kuninori.morimoto.gx, linux, mm-commits, romain.naour, sam,
	ysato


The patch titled
     Subject: sh: add missing EXPORT_SYMBOL() for __delay
has been removed from the -mm tree.  Its filename was
     sh-add-missing-export_symbol-for-__delay.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: morimoto <kuninori.morimoto.gx@renesas.com>
Subject: sh: add missing EXPORT_SYMBOL() for __delay

__delay() is used from kernel modules.  We need an EXPORT_SYMBOL(),
otherwise we get a build error:

ERROR: "__delay" [drivers/net/phy/mdio-cavium.ko] undefined!

Link: https://marc.info/?l=linux-kernel&m=157611811927852
Signed-off-by: morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Alan Modra <amodra@gmail.com>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Chen Zhou <chenzhou10@huawei.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Romain Naour <romain.naour@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sh/lib/delay.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/sh/lib/delay.c~sh-add-missing-export_symbol-for-__delay
+++ a/arch/sh/lib/delay.c
@@ -41,6 +41,7 @@ inline void __const_udelay(unsigned long
 		: "macl", "mach");
 	__delay(++xloops);
 }
+EXPORT_SYMBOL(__delay);
 
 void __udelay(unsigned long usecs)
 {
_

Patches currently in -mm which might be from kuninori.morimoto.gx@renesas.com are

sh-clkfwk-remove-r8-r16-r32.patch
sh-use-generic-strncpy.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (229 preceding siblings ...)
  2020-07-24  2:25 ` [merged] sh-add-missing-export_symbol-for-__delay.patch removed from " Andrew Morton
@ 2020-07-24  2:50 ` Andrew Morton
  2020-07-24  2:53 ` + khugepaged-fix-null-pointer-dereference-due-to-race.patch " Andrew Morton
  2020-07-24  3:01 ` + mm-mmap-merge-vma-after-call_mmap-if-possible.patch " Andrew Morton
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:50 UTC (permalink / raw)
  To: akpm, arnd, bp, dave.hansen, david, dennis, guro, hpa, joel,
	jroedel, laoar.shao, lpf.vector, luto, mhocko, mingo, mm-commits,
	naresh.kamboju, oleksiy.avramchenko, peterz, rostedt, rppt, sfr,
	shakeelb, tglx, urezki, willy


The patch titled
     Subject: revert "Revert "mm/vmalloc: modify struct vmap_area to reduce its size""
has been added to the -mm tree.  Its filename is
     revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: revert "Revert "mm/vmalloc: modify struct vmap_area to reduce its size""

Revert linux-next's bdbfb1d52d5e5 ("Revert "mm/vmalloc: modify struct
vmap_area to reduce its size"").

Numerous reports of kernel crashes due to this.  We can't figure out what
it's for or why it's in -next.

Link: http://lkml.kernel.org/r/20200722144650.GA19628@pc636
Link: http://lkml.kernel.org/r/CA+G9fYuj3bHUMz8XQztbmTgF0c5+rZ5-FkUjFyvEftej2jLT+Q@mail.gmail.com
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Pengfei Li <lpf.vector@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Roman Gushchin <guro@fb.com>
Cc: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/vmalloc.h |   20 +++++++++++++-------
 mm/vmalloc.c            |   24 ++++++++++--------------
 2 files changed, 23 insertions(+), 21 deletions(-)

--- a/include/linux/vmalloc.h~revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size
+++ a/include/linux/vmalloc.h
@@ -67,15 +67,21 @@ struct vmap_area {
 	unsigned long va_start;
 	unsigned long va_end;
 
-	/*
-	 * Largest available free size in subtree.
-	 */
-	unsigned long subtree_max_size;
-	unsigned long flags;
 	struct rb_node rb_node;         /* address sorted rbtree */
 	struct list_head list;          /* address sorted list */
-	struct llist_node purge_list;    /* "lazy purge" list */
-	struct vm_struct *vm;
+
+	/*
+	 * The following three variables can be packed, because
+	 * a vmap_area object is always one of the three states:
+	 *    1) in "free" tree (root is free_vmap_area_root)
+	 *    2) in "busy" tree (root is vmap_area_root)
+	 *    3) in purge list  (head is vmap_purge_list)
+	 */
+	union {
+		unsigned long subtree_max_size; /* in "free" tree */
+		struct vm_struct *vm;           /* in "busy" tree */
+		struct llist_node purge_list;   /* in purge list */
+	};
 };
 
 /*
--- a/mm/vmalloc.c~revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size
+++ a/mm/vmalloc.c
@@ -408,7 +408,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_VM_AREA	0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
 static DEFINE_SPINLOCK(free_vmap_area_lock);
@@ -1220,7 +1219,7 @@ retry:
 
 	va->va_start = addr;
 	va->va_end = addr + size;
-	va->flags = 0;
+	va->vm = NULL;
 
 
 	spin_lock(&vmap_area_lock);
@@ -1995,7 +1994,6 @@ void __init vmalloc_init(void)
 		if (WARN_ON_ONCE(!va))
 			continue;
 
-		va->flags = VM_VM_AREA;
 		va->va_start = (unsigned long)tmp->addr;
 		va->va_end = va->va_start + tmp->size;
 		va->vm = tmp;
@@ -2040,7 +2038,6 @@ static void setup_vmalloc_vm(struct vm_s
 			      unsigned long flags, const void *caller)
 {
 	spin_lock(&vmap_area_lock);
-	va->flags |= VM_VM_AREA;
 	setup_vmalloc_vm_locked(vm, va, flags, caller);
 	spin_unlock(&vmap_area_lock);
 }
@@ -2141,10 +2138,10 @@ struct vm_struct *find_vm_area(const voi
 	struct vmap_area *va;
 
 	va = find_vmap_area((unsigned long)addr);
-	if (va && va->flags & VM_VM_AREA)
-		return va->vm;
+	if (!va)
+		return NULL;
 
-	return NULL;
+	return va->vm;
 }
 
 /**
@@ -2165,11 +2162,10 @@ struct vm_struct *remove_vm_area(const v
 
 	spin_lock(&vmap_area_lock);
 	va = __find_vmap_area((unsigned long)addr);
-	if (va && va->flags & VM_VM_AREA) {
+	if (va && va->vm) {
 		struct vm_struct *vm = va->vm;
 
 		va->vm = NULL;
-		va->flags &= ~VM_VM_AREA;
 		spin_unlock(&vmap_area_lock);
 
 		kasan_free_shadow(vm);
@@ -2835,7 +2831,7 @@ long vread(char *buf, char *addr, unsign
 		if (!count)
 			break;
 
-		if (!(va->flags & VM_VM_AREA))
+		if (!va->vm)
 			continue;
 
 		vm = va->vm;
@@ -2915,7 +2911,7 @@ long vwrite(char *buf, char *addr, unsig
 		if (!count)
 			break;
 
-		if (!(va->flags & VM_VM_AREA))
+		if (!va->vm)
 			continue;
 
 		vm = va->vm;
@@ -3506,10 +3502,10 @@ static int s_show(struct seq_file *m, vo
 	va = list_entry(p, struct vmap_area, list);
 
 	/*
-	 * s_show can encounter race with remove_vm_area, !VM_VM_AREA on
-	 * behalf of vmap area is being tear down or vm_map_ram allocation.
+	 * s_show can encounter race with remove_vm_area, !vm on behalf
+	 * of vmap area is being tear down or vm_map_ram allocation.
 	 */
-	if (!(va->flags & VM_VM_AREA)) {
+	if (!va->vm) {
 		seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
 			(void *)va->va_start, (void *)va->va_end,
 			va->va_end - va->va_start);
_
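
To see why the union shrinks the structure: the three members belong to
mutually exclusive states, so they can share a single slot.  A minimal
userspace sketch of the idea (illustrative only; fake_llist_node stands
in for the kernel's llist_node and is an assumption, not the kernel
type):

/* Minimal sketch, not kernel code: three mutually exclusive states
 * share storage, so the struct pays for one member instead of three.
 */
#include <stdio.h>

struct fake_llist_node { struct fake_llist_node *next; };

struct area_unpacked {
	unsigned long subtree_max_size;
	void *vm;
	struct fake_llist_node purge_list;
};

struct area_packed {
	union {
		unsigned long subtree_max_size;    /* in "free" tree */
		void *vm;                          /* in "busy" tree */
		struct fake_llist_node purge_list; /* in purge list  */
	};
};

int main(void)
{
	printf("unpacked: %zu bytes, packed: %zu bytes\n",
	       sizeof(struct area_unpacked),
	       sizeof(struct area_packed));
	/* Prints "unpacked: 24 bytes, packed: 8 bytes" on LP64. */
	return 0;
}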

Patches currently in -mm which might be from akpm@linux-foundation.org are

mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch
io-mapping-indicate-mapping-failure-fix.patch
mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-thp-replace-http-links-with-https-ones-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
mmhwpoison-rework-soft-offline-for-in-use-pages-fix.patch
mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch
linux-next-rejects.patch
revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + khugepaged-fix-null-pointer-dereference-due-to-race.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (230 preceding siblings ...)
  2020-07-24  2:50 ` + revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch added to " Andrew Morton
@ 2020-07-24  2:53 ` Andrew Morton
  2020-07-24  3:01 ` + mm-mmap-merge-vma-after-call_mmap-if-possible.patch " Andrew Morton
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  2:53 UTC (permalink / raw)
  To: david, kirill.shutemov, mm-commits, stable, yang.shi


The patch titled
     Subject: khugepaged: fix null-pointer dereference due to race
has been added to the -mm tree.  Its filename is
     khugepaged-fix-null-pointer-dereference-due-to-race.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/khugepaged-fix-null-pointer-dereference-due-to-race.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/khugepaged-fix-null-pointer-dereference-due-to-race.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: khugepaged: fix null-pointer dereference due to race

khugepaged has to drop the mmap lock several times while collapsing a
page.  The situation can change while the lock is dropped, so we need to
re-validate that the VMA is still in place and that the PMD is still
subject to collapse.

But we miss one corner case: while collapsing anonymous pages, the VMA
could be replaced with a file VMA.  If the file VMA doesn't have any
private pages we get a NULL pointer dereference:

	general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
	KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
	anon_vma_lock_write include/linux/rmap.h:120 [inline]
	collapse_huge_page mm/khugepaged.c:1110 [inline]
	khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
	khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
	khugepaged_do_scan mm/khugepaged.c:2193 [inline]
	khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238

The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate().  The helper is only used for collapsing
anonymous pages.

Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: syzbot+ed318e8b790ca72c5ad0@syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/khugepaged.c~khugepaged-fix-null-pointer-dereference-due-to-race
+++ a/mm/khugepaged.c
@@ -958,6 +958,9 @@ static int hugepage_vma_revalidate(struc
 		return SCAN_ADDRESS_RANGE;
 	if (!hugepage_vma_check(vma, vma->vm_flags))
 		return SCAN_VMA_CHECK;
+	/* Anon VMA expected */
+	if (!vma->anon_vma || vma->vm_ops)
+		return SCAN_VMA_CHECK;
 	return 0;
 }
 
_
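
To make the window concrete, here is a sketch of the interleaving that
the added check closes (an illustrative timeline reconstructed from the
changelog above, not from an actual trace):

/*
 * khugepaged                             another thread
 * ----------                             --------------
 * khugepaged_scan_mm_slot()
 *   ...drops the mmap lock...
 *                                        munmap() the anonymous range
 *                                        mmap() a file over that range
 * collapse_huge_page()
 *   hugepage_vma_revalidate()            <- passed before this fix
 *   anon_vma_lock_write(vma->anon_vma)   <- vma is now a file VMA with
 *                                           no private pages, so
 *                                           anon_vma is NULL: crash
 */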

Patches currently in -mm which might be from kirill.shutemov@linux.intel.com are

mm-close-race-between-munmap-and-expand_upwards-downwards.patch
khugepaged-fix-null-pointer-dereference-due-to-race.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mm-mmap-merge-vma-after-call_mmap-if-possible.patch added to -mm tree
  2020-07-03 22:14 incoming Andrew Morton
                   ` (231 preceding siblings ...)
  2020-07-24  2:53 ` + khugepaged-fix-null-pointer-dereference-due-to-race.patch " Andrew Morton
@ 2020-07-24  3:01 ` Andrew Morton
  232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-24  3:01 UTC (permalink / raw)
  To: akpm, linmiaohe, louhongxiang, mm-commits


The patch titled
     Subject: mm: mmap: merge vma after call_mmap() if possible
has been added to the -mm tree.  Its filename is
     mm-mmap-merge-vma-after-call_mmap-if-possible.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mmap-merge-vma-after-call_mmap-if-possible.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mmap-merge-vma-after-call_mmap-if-possible.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm: mmap: merge vma after call_mmap() if possible

The vm_flags may change after call_mmap() because drivers may set some
flags for their own purposes.  As a result, we can fail to merge with the
adjacent VMA due to the differing vm_flags, since userspace has no way to
pass in the driver-set flags up front.  Try to merge the VMA again after
call_mmap() to fix this issue.

Link: http://lkml.kernel.org/r/1594954065-23733-1-git-send-email-linmiaohe@huawei.com
Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |   22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

--- a/mm/mmap.c~mm-mmap-merge-vma-after-call_mmap-if-possible
+++ a/mm/mmap.c
@@ -1690,7 +1690,7 @@ unsigned long mmap_region(struct file *f
 		struct list_head *uf)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma, *prev;
+	struct vm_area_struct *vma, *prev, *merge;
 	int error;
 	struct rb_node **rb_link, *rb_parent;
 	unsigned long charged = 0;
@@ -1774,6 +1774,25 @@ unsigned long mmap_region(struct file *f
 		if (error)
 			goto unmap_and_free_vma;
 
+		/* If vm_flags changed after call_mmap(), we should try to merge the
+		 * vma again, as we may succeed this time.
+		 */
+		if (unlikely(vm_flags != vma->vm_flags && prev)) {
+			merge = vma_merge(mm, prev, vma->vm_start, vma->vm_end, vma->vm_flags,
+				NULL, vma->vm_file, vma->vm_pgoff, NULL, NULL_VM_UFFD_CTX);
+			if (merge) {
+				fput(file);
+				vm_area_free(vma);
+				vma = merge;
+				/* Update vm_flags and possibly addr to pick up the change. We don't
+				 * warn here if addr changed, as the vma is not yet linked by vma_link().
+				 */
+				addr = vma->vm_start;
+				vm_flags = vma->vm_flags;
+				goto unmap_writable;
+			}
+		}
+
 		/* Can addr have changed??
 		 *
 		 * Answer: Yes, several device drivers can do it in their
@@ -1796,6 +1815,7 @@ unsigned long mmap_region(struct file *f
 	vma_link(mm, vma, prev, rb_link, rb_parent);
 	/* Once vma denies write, undo our temporary denial count */
 	if (file) {
+unmap_writable:
 		if (vm_flags & VM_SHARED)
 			mapping_unmap_writable(file->f_mapping);
 		if (vm_flags & VM_DENYWRITE)
_
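
For context on "drivers may set some flags": a typical ->mmap handler
follows the pattern sketched below.  demo_mmap() is hypothetical, but
the flag twiddling is the common driver behavior the changelog refers
to; the sketch assumes the usual kernel headers (linux/fs.h, linux/mm.h):

/* Hypothetical driver ->mmap handler.  The vm_flags seen by the first
 * vma_merge() attempt in mmap_region() predate these additions, which
 * is why a second merge attempt after call_mmap() can succeed.
 */
static int demo_mmap(struct file *file, struct vm_area_struct *vma)
{
	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
	return 0;
}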

Patches currently in -mm which might be from linmiaohe@huawei.com are

mm-mmap-merge-vma-after-call_mmap-if-possible.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree
@ 2020-09-22 17:00 akpm
  0 siblings, 0 replies; 247+ messages in thread
From: akpm @ 2020-09-22 17:00 UTC (permalink / raw)
  To: mm-commits, zeil, tony.luck, osalvador, naoya.horiguchi,
	mike.kravetz, mhocko, david, dave.hansen, cai, aris,
	aneesh.kumar, aneesh.kumar, osalvador


The patch titled
     Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page lets us get rid
of quite a lot of duplicated code, and makes the code much easier to
follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Link: https://lkml.kernel.org/r/20200922135650.1634-11-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |  182 ++++++++++++++++++------------------------
 1 file changed, 82 insertions(+), 100 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page
+++ a/mm/memory-failure.c
@@ -65,13 +65,31 @@ int sysctl_memory_failure_recovery __rea
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
-static void page_handle_poison(struct page *page, bool release)
+static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
+	if (hugepage_or_freepage) {
+		/*
+		 * Doing this check for free pages is also fine since dissolve_free_huge_page
+		 * returns 0 for non-hugetlb pages as well.
+		 */
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+			/*
+			 * We could fail to take the target page off the buddy
+			 * list, for example due to a racy page allocation, but
+			 * that's acceptable because the soft-offlined page is
+			 * not broken and if someone really wants to use it,
+			 * they should take it.
+			 */
+			return false;
+	}
+
 	SetPageHWPoison(page);
 	if (release)
 		put_page(page);
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
+
+	return true;
 }
 
 #if defined(CONFIG_HWPOISON_INJECT) || defined(CONFIG_HWPOISON_INJECT_MODULE)
@@ -1725,63 +1743,51 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page, int flags)
+static bool isolate_page(struct page *page, struct list_head *pagelist)
 {
-	int ret;
-	unsigned long pfn = page_to_pfn(page);
-	struct page *hpage = compound_head(page);
-	LIST_HEAD(pagelist);
+	bool isolated = false;
+	bool lru = PageLRU(page);
 
-	/*
-	 * This double-check of PageHWPoison is to avoid the race with
-	 * memory_failure(). See also comment in __soft_offline_page().
-	 */
-	lock_page(hpage);
-	if (PageHWPoison(hpage)) {
-		unlock_page(hpage);
-		put_page(hpage);
-		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
-		return -EBUSY;
+	if (PageHuge(page)) {
+		isolated = isolate_huge_page(page, pagelist);
+	} else {
+		if (lru)
+			isolated = !isolate_lru_page(page);
+		else
+			isolated = !isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+
+		if (isolated)
+			list_add(&page->lru, pagelist);
 	}
-	unlock_page(hpage);
 
-	ret = isolate_huge_page(hpage, &pagelist);
+	if (isolated && lru)
+		inc_node_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_lru(page));
+
 	/*
-	 * get_any_page() and isolate_huge_page() takes a refcount each,
-	 * so need to drop one here.
+	 * If we succeed in isolating the page, we grabbed another refcount
+	 * on the page, so we can safely drop the one we got from
+	 * get_any_page(). If we failed to isolate the page, it means that
+	 * we cannot go further and we will return an error, so drop the
+	 * reference we got from get_any_page() as well.
 	 */
-	put_page(hpage);
-	if (!ret) {
-		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
-		return -EBUSY;
-	}
-
-	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
-	if (ret) {
-		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
-			pfn, ret, page->flags, &page->flags);
-		if (!list_empty(&pagelist))
-			putback_movable_pages(&pagelist);
-		if (ret > 0)
-			ret = -EIO;
-	} else {
-		/*
-		 * We set PG_hwpoison only when we were able to take the page
-		 * off the buddy.
-		 */
-		if (!dissolve_free_huge_page(page) && take_page_off_buddy(page))
-			page_handle_poison(page, false);
-		else
-			ret = -EBUSY;
-	}
-	return ret;
+	put_page(page);
+	return isolated;
 }
 
-static int __soft_offline_page(struct page *page, int flags)
+/*
+ * __soft_offline_page handles hugetlb pages and non-hugetlb pages.
+ * If the page is a non-dirty unmapped page-cache page, it is simply
+ * invalidated.  If the page is mapped, the contents are migrated over.
+ */
+static int __soft_offline_page(struct page *page)
 {
-	int ret;
+	int ret = 0;
 	unsigned long pfn = page_to_pfn(page);
+	struct page *hpage = compound_head(page);
+	char const *msg_page[] = {"page", "hugepage"};
+	bool huge = PageHuge(page);
+	LIST_HEAD(pagelist);
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1790,98 +1796,74 @@ static int __soft_offline_page(struct pa
 	 * so there's no race between soft_offline_page() and memory_failure().
 	 */
 	lock_page(page);
-	wait_on_page_writeback(page);
+	if (!PageHuge(page))
+		wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	/*
-	 * Try to invalidate first. This should work for
-	 * non dirty unmapped page cache pages.
-	 */
-	ret = invalidate_inode_page(page);
+
+	if (!PageHuge(page))
+		/*
+		 * Try to invalidate first. This should work for
+		 * non dirty unmapped page cache pages.
+		 */
+		ret = invalidate_inode_page(page);
 	unlock_page(page);
+
 	/*
 	 * RED-PEN would be better to keep it isolated here, but we
 	 * would need to fix isolation locking first.
 	 */
-	if (ret == 1) {
+	if (ret) {
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		page_handle_poison(page, true);
+		page_handle_poison(page, false, true);
 		return 0;
 	}
 
-	/*
-	 * Simple invalidation didn't work.
-	 * Try to migrate to a new page instead. migrate.c
-	 * handles a large number of cases for us.
-	 */
-	if (PageLRU(page))
-		ret = isolate_lru_page(page);
-	else
-		ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
-	/*
-	 * Drop page reference which is came from get_any_page()
-	 * successful isolate_lru_page() already took another one.
-	 */
-	put_page(page);
-	if (!ret) {
-		LIST_HEAD(pagelist);
-		/*
-		 * After isolated lru page, the PageLRU will be cleared,
-		 * so use !__PageMovable instead for LRU page's mapping
-		 * cannot have PAGE_MAPPING_MOVABLE.
-		 */
-		if (!__PageMovable(page))
-			inc_node_page_state(page, NR_ISOLATED_ANON +
-						page_is_file_lru(page));
-		list_add(&page->lru, &pagelist);
+	if (isolate_page(hpage, &pagelist)) {
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
-			page_handle_poison(page, true);
+			bool release = !huge;
+
+			if (!page_handle_poison(page, huge, release))
+				ret = -EBUSY;
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: migration failed %d, type %lx (%pGp)\n",
-				pfn, ret, page->flags, &page->flags);
+			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
+				pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
 				ret = -EIO;
 		}
 	} else {
-		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
+			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
+		ret = -EBUSY;
 	}
 	return ret;
 }
 
-static int soft_offline_in_use_page(struct page *page, int flags)
+static int soft_offline_in_use_page(struct page *page)
 {
-	int ret;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
-
-	if (PageHuge(page))
-		ret = soft_offline_huge_page(page, flags);
-	else
-		ret = __soft_offline_page(page, flags);
-	return ret;
+	return __soft_offline_page(page);
 }
 
 static int soft_offline_free_page(struct page *page)
 {
-	int rc = -EBUSY;
+	int rc = 0;
 
-	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page, false);
-		rc = 0;
-	}
+	if (!page_handle_poison(page, true, false))
+		rc = -EBUSY;
 
 	return rc;
 }
@@ -1932,7 +1914,7 @@ int soft_offline_page(unsigned long pfn,
 	put_online_mems();
 
 	if (ret > 0)
-		ret = soft_offline_in_use_page(page, flags);
+		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
 		ret = soft_offline_free_page(page);
 
_
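
Collected from the three call sites in the diff above, the refactored
helper's two boolean arguments encode three distinct situations:

/* Call matrix for page_handle_poison(page, hugepage_or_freepage, release)
 * as used by this version of the patch:
 *
 *   invalidated page-cache page:  page_handle_poison(page, false, true);
 *   migrated in-use page:         page_handle_poison(page, huge, !huge);
 *   free page or free hugepage:   page_handle_poison(page, true, false);
 *
 * hugepage_or_freepage selects the dissolve/take-off-buddy handling;
 * release drops the reference the caller still holds on the page.
 */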

Patches currently in -mm which might be from osalvador@suse.de are

mmhwpoison-unexport-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-refactor-madvise_inject_error.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
mmhwpoison-try-to-narrow-window-race-for-free-pages.patch
mmhwpoison-take-free-pages-off-the-buddy-freelists.patch
mmhwpoison-drain-pcplists-before-bailing-out-for-non-buddy-zero-refcount-page.patch
mmhwpoison-drop-unneeded-pcplist-draining.patch
mmhwpoison-remove-stale-code.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree
@ 2020-08-07  1:07 akpm
  0 siblings, 0 replies; 247+ messages in thread
From: akpm @ 2020-08-07  1:07 UTC (permalink / raw)
  To: aneesh.kumar, aneesh.kumar, cai, dave.hansen, david, mhocko,
	mike.kravetz, mm-commits, naoya.horiguchi, osalvador, tony.luck,
	zeil


The patch titled
     Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page lets us get rid
of quite a lot of duplicated code, and makes the code much easier to
follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Note that the put_page() block is moved to the beginning of
page_handle_poison(), together with drain_all_pages(), in order to make
sure that the target page is actually freed and sent back to the free
lists, so that take_page_off_buddy() works properly.

Link: http://lkml.kernel.org/r/20200806184923.7007-10-nao.horiguchi@gmail.com
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |  181 +++++++++++++++++++-----------------------
 1 file changed, 82 insertions(+), 99 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page
+++ a/mm/memory-failure.c
@@ -65,15 +65,33 @@ int sysctl_memory_failure_recovery __rea
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
-static void page_handle_poison(struct page *page, bool release)
+static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
 	if (release) {
 		put_page(page);
 		drain_all_pages(page_zone(page));
 	}
+
+	if (hugepage_or_freepage) {
+		/*
+		 * Doing this check for free pages is also fine since dissolve_free_huge_page
+		 * returns 0 for non-hugetlb pages as well.
+		 */
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+			/*
+			 * We could fail to take the target page off the buddy
+			 * list, for example due to a racy page allocation, but
+			 * that's acceptable because the soft-offlined page is
+			 * not broken and if someone really wants to use it,
+			 * they should take it.
+			 */
+			return false;
+	}
+
 	SetPageHWPoison(page);
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
+	return true;
 }
 
 #if defined(CONFIG_HWPOISON_INJECT) || defined(CONFIG_HWPOISON_INJECT_MODULE)
@@ -1725,63 +1743,53 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page, int flags)
+static bool isolate_page(struct page *page, struct list_head *pagelist)
 {
-	int ret;
-	unsigned long pfn = page_to_pfn(page);
-	struct page *hpage = compound_head(page);
-	LIST_HEAD(pagelist);
+	bool isolated = false;
+	bool lru = PageLRU(page);
+
+	if (PageHuge(page)) {
+		isolated = isolate_huge_page(page, pagelist);
+	} else {
+		if (lru)
+			isolated = !isolate_lru_page(page);
+		else
+			isolated = !isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+
+		if (isolated)
+			list_add(&page->lru, pagelist);
 
-	/*
-	 * This double-check of PageHWPoison is to avoid the race with
-	 * memory_failure(). See also comment in __soft_offline_page().
-	 */
-	lock_page(hpage);
-	if (PageHWPoison(hpage)) {
-		unlock_page(hpage);
-		put_page(hpage);
-		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	unlock_page(hpage);
 
-	ret = isolate_huge_page(hpage, &pagelist);
+	if (isolated && lru)
+		inc_node_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_lru(page));
+
 	/*
-	 * get_any_page() and isolate_huge_page() takes a refcount each,
-	 * so need to drop one here.
+	 * If we succeed in isolating the page, we grabbed another refcount
+	 * on the page, so we can safely drop the one we got from
+	 * get_any_page(). If we failed to isolate the page, it means that
+	 * we cannot go further and we will return an error, so drop the
+	 * reference we got from get_any_page() as well.
 	 */
-	put_page(hpage);
-	if (!ret) {
-		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
-		return -EBUSY;
-	}
-
-	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
-	if (ret) {
-		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
-			pfn, ret, page->flags, &page->flags);
-		if (!list_empty(&pagelist))
-			putback_movable_pages(&pagelist);
-		if (ret > 0)
-			ret = -EIO;
-	} else {
-		/*
-		 * We set PG_hwpoison only when we were able to take the page
-		 * off the buddy.
-		 */
-		if (!dissolve_free_huge_page(page) && take_page_off_buddy(page))
-			page_handle_poison(page, false);
-		else
-			ret = -EBUSY;
-	}
-	return ret;
+	put_page(page);
+	return isolated;
 }
 
-static int __soft_offline_page(struct page *page, int flags)
+/*
+ * __soft_offline_page handles hugetlb pages and non-hugetlb pages.
+ * If the page is a non-dirty unmapped page-cache page, it is simply
+ * invalidated.  If the page is mapped, the contents are migrated over.
+ */
+static int __soft_offline_page(struct page *page)
 {
-	int ret;
+	int ret = 0;
 	unsigned long pfn = page_to_pfn(page);
+	struct page *hpage = compound_head(page);
+	char const *msg_page[] = {"page", "hugepage"};
+	bool huge = PageHuge(page);
+	LIST_HEAD(pagelist);
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1790,98 +1798,73 @@ static int __soft_offline_page(struct pa
 	 * so there's no race between soft_offline_page() and memory_failure().
 	 */
 	lock_page(page);
-	wait_on_page_writeback(page);
+	if (!PageHuge(page))
+		wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	/*
-	 * Try to invalidate first. This should work for
-	 * non dirty unmapped page cache pages.
-	 */
-	ret = invalidate_inode_page(page);
+
+	if (!PageHuge(page))
+		/*
+		 * Try to invalidate first. This should work for
+		 * non dirty unmapped page cache pages.
+		 */
+		ret = invalidate_inode_page(page);
 	unlock_page(page);
+
 	/*
 	 * RED-PEN would be better to keep it isolated here, but we
 	 * would need to fix isolation locking first.
 	 */
-	if (ret == 1) {
+	if (ret) {
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		page_handle_poison(page, true);
+		page_handle_poison(page, false, true);
 		return 0;
 	}
 
-	/*
-	 * Simple invalidation didn't work.
-	 * Try to migrate to a new page instead. migrate.c
-	 * handles a large number of cases for us.
-	 */
-	if (PageLRU(page))
-		ret = isolate_lru_page(page);
-	else
-		ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
-	/*
-	 * Drop page reference which is came from get_any_page()
-	 * successful isolate_lru_page() already took another one.
-	 */
-	put_page(page);
-	if (!ret) {
-		LIST_HEAD(pagelist);
-		/*
-		 * After isolated lru page, the PageLRU will be cleared,
-		 * so use !__PageMovable instead for LRU page's mapping
-		 * cannot have PAGE_MAPPING_MOVABLE.
-		 */
-		if (!__PageMovable(page))
-			inc_node_page_state(page, NR_ISOLATED_ANON +
-						page_is_file_lru(page));
-		list_add(&page->lru, &pagelist);
+	if (isolate_page(hpage, &pagelist)) {
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
-			page_handle_poison(page, true);
+			bool release = !huge;
+
+			if (!page_handle_poison(page, true, release))
+				ret = -EBUSY;
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: migration failed %d, type %lx (%pGp)\n",
-				pfn, ret, page->flags, &page->flags);
+			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
+				pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
 				ret = -EIO;
 		}
 	} else {
-		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
+			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
 	}
 	return ret;
 }
 
-static int soft_offline_in_use_page(struct page *page, int flags)
+static int soft_offline_in_use_page(struct page *page)
 {
-	int ret;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
-
-	if (PageHuge(page))
-		ret = soft_offline_huge_page(page, flags);
-	else
-		ret = __soft_offline_page(page, flags);
-	return ret;
+	return __soft_offline_page(page);
 }
 
 static int soft_offline_free_page(struct page *page)
 {
-	int rc = -EBUSY;
+	int rc = 0;
 
-	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page, false);
-		rc = 0;
-	}
+	if (!page_handle_poison(page, true, false))
+		rc = -EBUSY;
 
 	return rc;
 }
@@ -1932,7 +1915,7 @@ int soft_offline_page(unsigned long pfn,
 	put_online_mems();
 
 	if (ret > 0)
-		ret = soft_offline_in_use_page(page, flags);
+		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
 		ret = soft_offline_free_page(page);
 
_
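
The ordering described in the "Note that ..." paragraph above matters
because of the per-CPU page lists; a simplified sketch of the sequence
inside the reworked page_handle_poison() (illustrative, condensed from
the diff):

/* Simplified from page_handle_poison() above. */
put_page(page);                   /* may free into a per-CPU list (pcplist) */
drain_all_pages(page_zone(page)); /* flush pcplists back to the buddy       */
/* ... */
take_page_off_buddy(page);        /* only searches the buddy freelists      */
/* Without the drain, the just-freed page could sit on a pcplist where
 * take_page_off_buddy() cannot find it.
 */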

Patches currently in -mm which might be from osalvador@suse.de are

mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree
  2020-07-24  4:14 incoming Andrew Morton
@ 2020-07-31 20:06 ` Andrew Morton
  0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-31 20:06 UTC (permalink / raw)
  To: aneesh.kumar, aneesh.kumar, cai, dave.hansen, david, mhocko,
	mike.kravetz, mm-commits, n-horiguchi, naoya.horiguchi,
	osalvador, osalvador, tony.luck, zeil


The patch titled
     Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page lets us get rid
of quite a lot of duplicated code, and makes the code much easier to
follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Note that the put_page() block is moved to the beginning of
page_handle_poison(), together with drain_all_pages(), in order to make
sure that the target page is actually freed and sent back to the free
lists, so that take_page_off_buddy() works properly.

Link: http://lkml.kernel.org/r/20200731122112.11263-14-nao.horiguchi@gmail.com
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |  174 ++++++++++++++++++------------------------
 1 file changed, 77 insertions(+), 97 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page
+++ a/mm/memory-failure.c
@@ -65,15 +65,33 @@ int sysctl_memory_failure_recovery __rea
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
-static void page_handle_poison(struct page *page, bool release)
+static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
 	if (release) {
 		put_page(page);
 		drain_all_pages(page_zone(page));
 	}
+
+	if (hugepage_or_freepage) {
+		/*
+		 * Doing this check for free pages is also fine since dissolve_free_huge_page
+		 * returns 0 for non-hugetlb pages as well.
+		 */
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+			/*
+			 * We could fail to take the target page off the buddy
+			 * list, for example due to a racy page allocation, but
+			 * that's acceptable because the soft-offlined page is
+			 * not broken and if someone really wants to use it,
+			 * they should take it.
+			 */
+			return false;
+	}
+
 	SetPageHWPoison(page);
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
+	return true;
 }
 
 #if defined(CONFIG_HWPOISON_INJECT) || defined(CONFIG_HWPOISON_INJECT_MODULE)
@@ -1719,63 +1737,51 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page)
+static bool isolate_page(struct page *page, struct list_head *pagelist)
 {
-	int ret;
-	unsigned long pfn = page_to_pfn(page);
-	struct page *hpage = compound_head(page);
-	LIST_HEAD(pagelist);
+	bool isolated = false;
+	bool lru = PageLRU(page);
 
-	/*
-	 * This double-check of PageHWPoison is to avoid the race with
-	 * memory_failure(). See also comment in __soft_offline_page().
-	 */
-	lock_page(hpage);
-	if (PageHWPoison(hpage)) {
-		unlock_page(hpage);
-		put_page(hpage);
-		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
-		return -EBUSY;
+	if (PageHuge(page)) {
+		isolated = isolate_huge_page(page, pagelist);
+	} else {
+		if (lru)
+			isolated = !isolate_lru_page(page);
+		else
+			isolated = !isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+
+		if (isolated)
+			list_add(&page->lru, pagelist);
 	}
-	unlock_page(hpage);
 
-	ret = isolate_huge_page(hpage, &pagelist);
+	if (isolated && lru)
+		inc_node_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_lru(page));
+
 	/*
-	 * get_any_page() and isolate_huge_page() takes a refcount each,
-	 * so need to drop one here.
+	 * If we succeed in isolating the page, we grabbed another refcount
+	 * on the page, so we can safely drop the one we got from
+	 * get_any_page(). If we failed to isolate the page, it means that
+	 * we cannot go further and we will return an error, so drop the
+	 * reference we got from get_any_page() as well.
 	 */
-	put_page(hpage);
-	if (!ret) {
-		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
-		return -EBUSY;
-	}
-
-	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
-	if (ret) {
-		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
-			pfn, ret, page->flags, &page->flags);
-		if (!list_empty(&pagelist))
-			putback_movable_pages(&pagelist);
-		if (ret > 0)
-			ret = -EIO;
-	} else {
-		/*
-		 * We set PG_hwpoison only when we were able to take the page
-		 * off the buddy.
-		 */
-		if (!dissolve_free_huge_page(page) && take_page_off_buddy(page))
-			page_handle_poison(page, false);
-		else
-			ret = -EBUSY;
-	}
-	return ret;
+	put_page(page);
+	return isolated;
 }
 
+/*
+ * __soft_offline_page handles hugetlb pages and non-hugetlb pages.
+ * If the page is a non-dirty unmapped page-cache page, it is simply
+ * invalidated.  If the page is mapped, the contents are migrated over.
+ */
 static int __soft_offline_page(struct page *page)
 {
-	int ret;
+	int ret = 0;
 	unsigned long pfn = page_to_pfn(page);
+	struct page *hpage = compound_head(page);
+	char const *msg_page[] = {"page", "hugepage"};
+	bool huge = PageHuge(page);
+	LIST_HEAD(pagelist);
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1784,98 +1790,72 @@ static int __soft_offline_page(struct pa
 	 * so there's no race between soft_offline_page() and memory_failure().
 	 */
 	lock_page(page);
-	wait_on_page_writeback(page);
+	if (!PageHuge(page))
+		wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	/*
-	 * Try to invalidate first. This should work for
-	 * non dirty unmapped page cache pages.
-	 */
-	ret = invalidate_inode_page(page);
+
+	if (!PageHuge(page))
+		/*
+		 * Try to invalidate first. This should work for
+		 * non dirty unmapped page cache pages.
+		 */
+		ret = invalidate_inode_page(page);
 	unlock_page(page);
 	/*
 	 * RED-PEN would be better to keep it isolated here, but we
 	 * would need to fix isolation locking first.
 	 */
-	if (ret == 1) {
+	if (ret) {
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		page_handle_poison(page, true);
+		page_handle_poison(page, false, true);
 		return 0;
 	}
 
-	/*
-	 * Simple invalidation didn't work.
-	 * Try to migrate to a new page instead. migrate.c
-	 * handles a large number of cases for us.
-	 */
-	if (PageLRU(page))
-		ret = isolate_lru_page(page);
-	else
-		ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
-	/*
-	 * Drop page reference which is came from get_any_page()
-	 * successful isolate_lru_page() already took another one.
-	 */
-	put_page(page);
-	if (!ret) {
-		LIST_HEAD(pagelist);
-		/*
-		 * After isolated lru page, the PageLRU will be cleared,
-		 * so use !__PageMovable instead for LRU page's mapping
-		 * cannot have PAGE_MAPPING_MOVABLE.
-		 */
-		if (!__PageMovable(page))
-			inc_node_page_state(page, NR_ISOLATED_ANON +
-						page_is_file_lru(page));
-		list_add(&page->lru, &pagelist);
+	if (isolate_page(hpage, &pagelist)) {
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
-			page_handle_poison(page, true);
+			bool release = !huge;
+
+			if (!page_handle_poison(page, true, release))
+				ret = -EBUSY;
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: migration failed %d, type %lx (%pGp)\n",
-				pfn, ret, page->flags, &page->flags);
+			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
+				pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
 				ret = -EIO;
 		}
 	} else {
-		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
+			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
 	}
 	return ret;
 }
 
 static int soft_offline_in_use_page(struct page *page)
 {
-	int ret;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
-
-	if (PageHuge(page))
-		ret = soft_offline_huge_page(page);
-	else
-		ret = __soft_offline_page(page);
-	return ret;
+	return __soft_offline_page(page);
 }
 
 static int soft_offline_free_page(struct page *page)
 {
-	int rc = -EBUSY;
+	int rc = 0;
 
-	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page, false);
-		rc = 0;
-	}
+	if (!page_handle_poison(page, true, false))
+		rc = -EBUSY;
 
 	return rc;
 }
_
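
One detail of this hunk that is easy to miss: the invalidation check
changes from (ret == 1) to (ret).  A short sketch of why the two are
equivalent here:

/* invalidate_inode_page() returns 1 on success and 0 on failure.  For
 * hugetlb pages the call is now skipped entirely, so ret keeps its
 * initial value of 0 and the branch is not taken; for regular pages
 * any non-zero ret still means "successfully invalidated".
 */
int ret = 0;

if (!PageHuge(page))
	ret = invalidate_inode_page(page);

if (ret)	/* was: if (ret == 1) */
	pr_info("soft_offline: %#lx: invalidated\n", pfn);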

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch


^ permalink raw reply	[flat|nested] 247+ messages in thread

* + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree
@ 2020-06-24 19:19 akpm
  0 siblings, 0 replies; 247+ messages in thread
From: akpm @ 2020-06-24 19:19 UTC (permalink / raw)
  To: mm-commits, zeil, tony.luck, naoya.horiguchi, mike.kravetz,
	mhocko, david, dave.hansen, aneesh.kumar, aneesh.kumar,
	osalvador


The patch titled
     Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
has been added to the -mm tree.  Its filename is
     mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Oscar Salvador <osalvador@suse.de>
Subject: mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page lets us get rid
of quite a lot of duplicated code, and makes the code much easier to
follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Note that the put_page() block is moved to the beginning of
page_handle_poison(), together with drain_all_pages(), in order to make
sure that the target page is actually freed and sent back to the free
lists, so that take_page_off_buddy() works properly.

Link: http://lkml.kernel.org/r/20200624150137.7052-14-nao.horiguchi@gmail.com
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |  185 +++++++++++++++++++-----------------------
 1 file changed, 86 insertions(+), 99 deletions(-)

--- a/mm/memory-failure.c~mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page
+++ a/mm/memory-failure.c
@@ -78,14 +78,36 @@ EXPORT_SYMBOL_GPL(hwpoison_filter_dev_mi
 EXPORT_SYMBOL_GPL(hwpoison_filter_flags_mask);
 EXPORT_SYMBOL_GPL(hwpoison_filter_flags_value);
 
-static void page_handle_poison(struct page *page, bool release)
+static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
+	if (release) {
+		put_page(page);
+		drain_all_pages(page_zone(page));
+	}
+
+	if (hugepage_or_freepage) {
+		/*
+		 * Doing this check for free pages is also fine since dissolve_free_huge_page
+		 * returns 0 for non-hugetlb pages as well.
+		 */
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+		/*
+		 * The hugetlb page can end up being enqueued back into
+		 * the freelists by means of:
+		 * unmap_and_move_huge_page
+		 *  putback_active_hugepage
+		 *   put_page->free_huge_page
+		 *    enqueue_huge_page
+		 * If this happens, we might lose the race against an allocation.
+		 */
+			return false;
+	}
 
 	SetPageHWPoison(page);
-	if (release)
-		put_page(page);
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
+
+	return true;
 }
 
 static int hwpoison_filter_dev(struct page *p)
@@ -1718,63 +1740,52 @@ static int get_any_page(struct page *pag
 	return ret;
 }
 
-static int soft_offline_huge_page(struct page *page)
+static bool isolate_page(struct page *page, struct list_head *pagelist)
 {
-	int ret;
-	unsigned long pfn = page_to_pfn(page);
-	struct page *hpage = compound_head(page);
-	LIST_HEAD(pagelist);
+	bool isolated = false;
+	bool lru = PageLRU(page);
+
+	if (PageHuge(page)) {
+		isolated = isolate_huge_page(page, pagelist);
+	} else {
+		if (lru)
+			isolated = !isolate_lru_page(page);
+		else
+			isolated = !isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+
+		if (isolated)
+			list_add(&page->lru, pagelist);
 
-	/*
-	 * This double-check of PageHWPoison is to avoid the race with
-	 * memory_failure(). See also comment in __soft_offline_page().
-	 */
-	lock_page(hpage);
-	if (PageHWPoison(hpage)) {
-		unlock_page(hpage);
-		put_page(hpage);
-		pr_info("soft offline: %#lx hugepage already poisoned\n", pfn);
-		return -EBUSY;
 	}
-	unlock_page(hpage);
 
-	ret = isolate_huge_page(hpage, &pagelist);
+	if (isolated && lru)
+		inc_node_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_lru(page));
+
 	/*
-	 * get_any_page() and isolate_huge_page() takes a refcount each,
-	 * so need to drop one here.
+	 * If we succeed in isolating the page, we grabbed another refcount
+	 * on the page, so we can safely drop the one we got from
+	 * get_any_page(). If we failed to isolate the page, it means that
+	 * we cannot go further and we will return an error, so drop the
+	 * reference we got from get_any_page() as well.
 	 */
-	put_page(hpage);
-	if (!ret) {
-		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
-		return -EBUSY;
-	}
-
-	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
-	if (ret) {
-		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
-			pfn, ret, page->flags, &page->flags);
-		if (!list_empty(&pagelist))
-			putback_movable_pages(&pagelist);
-		if (ret > 0)
-			ret = -EIO;
-	} else {
-		/*
-		 * We set PG_hwpoison only when we were able to take the page
-		 * off the buddy.
-		 */
-		if (!dissolve_free_huge_page(page) && take_page_off_buddy(page))
-			page_handle_poison(page, false);
-		else
-			ret = -EBUSY;
-	}
-	return ret;
+	put_page(page);
+	return isolated;
 }
 
+/*
+ * __soft_offline_page handles hugetlb pages and non-hugetlb pages.
+ * If the page is a non-dirty unmapped page-cache page, it is simply
+ * invalidated.  If the page is mapped, the contents are migrated over.
+ */
 static int __soft_offline_page(struct page *page)
 {
-	int ret;
+	int ret = 0;
 	unsigned long pfn = page_to_pfn(page);
+	struct page *hpage = compound_head(page);
+	const char *msg_page[] = {"page", "hugepage"};
+	bool huge = PageHuge(page);
+	LIST_HEAD(pagelist);
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1783,98 +1794,74 @@ static int __soft_offline_page(struct pa
 	 * so there's no race between soft_offline_page() and memory_failure().
 	 */
 	lock_page(page);
-	wait_on_page_writeback(page);
+	if (!PageHuge(page))
+		wait_on_page_writeback(page);
 	if (PageHWPoison(page)) {
 		unlock_page(page);
 		put_page(page);
 		pr_info("soft offline: %#lx page already poisoned\n", pfn);
 		return -EBUSY;
 	}
-	/*
-	 * Try to invalidate first. This should work for
-	 * non dirty unmapped page cache pages.
-	 */
-	ret = invalidate_inode_page(page);
+
+	if (!PageHuge(page))
+		/*
+		 * Try to invalidate first. This should work for
+		 * non dirty unmapped page cache pages.
+		 */
+		ret = invalidate_inode_page(page);
 	unlock_page(page);
+
 	/*
 	 * RED-PEN would be better to keep it isolated here, but we
 	 * would need to fix isolation locking first.
 	 */
-	if (ret == 1) {
+	if (ret) {
 		pr_info("soft_offline: %#lx: invalidated\n", pfn);
-		page_handle_poison(page, true);
+		page_handle_poison(page, false, true);
 		return 0;
 	}
 
-	/*
-	 * Simple invalidation didn't work.
-	 * Try to migrate to a new page instead. migrate.c
-	 * handles a large number of cases for us.
-	 */
-	if (PageLRU(page))
-		ret = isolate_lru_page(page);
-	else
-		ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
-	/*
-	 * Drop page reference which is came from get_any_page()
-	 * successful isolate_lru_page() already took another one.
-	 */
-	put_page(page);
-	if (!ret) {
-		LIST_HEAD(pagelist);
-		/*
-		 * After isolated lru page, the PageLRU will be cleared,
-		 * so use !__PageMovable instead for LRU page's mapping
-		 * cannot have PAGE_MAPPING_MOVABLE.
-		 */
-		if (!__PageMovable(page))
-			inc_node_page_state(page, NR_ISOLATED_ANON +
-						page_is_file_lru(page));
-		list_add(&page->lru, &pagelist);
+	if (isolate_page(hpage, &pagelist)) {
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (!ret) {
-			page_handle_poison(page, true);
+			bool release = !huge;
+
+			if (!page_handle_poison(page, true, release))
+				ret = -EBUSY;
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: migration failed %d, type %lx (%pGp)\n",
-				pfn, ret, page->flags, &page->flags);
+
+			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
+				pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
 				ret = -EIO;
 		}
 	} else {
-		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
+			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
 	}
 	return ret;
 }
 
 static int soft_offline_in_use_page(struct page *page)
 {
-	int ret;
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
 		if (try_to_split_thp_page(page, "soft offline") < 0)
 			return -EBUSY;
-
-	if (PageHuge(page))
-		ret = soft_offline_huge_page(page);
-	else
-		ret = __soft_offline_page(page);
-	return ret;
+	return __soft_offline_page(page);
 }
 
 static int soft_offline_free_page(struct page *page)
 {
-	int rc = -EBUSY;
+	int rc = 0;
 
-	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
-		page_handle_poison(page, false);
-		rc = 0;
-	}
+	if (!page_handle_poison(page, true, false))
+		rc = -EBUSY;
 
 	return rc;
 }
_

Patches currently in -mm which might be from osalvador@suse.de are

mmmadvise-refactor-madvise_inject_error.patch
mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
mmhwpoison-kill-put_hwpoison_page.patch
mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
mmhwpoison-rework-soft-offline-for-free-pages.patch
mmhwpoison-rework-soft-offline-for-in-use-pages.patch
mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch

^ permalink raw reply	[flat|nested] 247+ messages in thread

end of thread

Thread overview: 247+ messages
2020-07-03 22:14 incoming Andrew Morton
2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
2020-07-03 22:15 ` [patch 2/5] samples/vfs: avoid warning in statx override Andrew Morton
2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
2020-07-03 22:15 ` [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers Andrew Morton
2020-07-03 22:15 ` [patch 5/5] mm/page_alloc: fix documentation error Andrew Morton
2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
2020-07-06 22:41   ` Andrew Morton
2020-07-06 22:46 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch " Andrew Morton
2020-07-06 22:49 ` + lib-test_bitops-do-the-full-test-during-module-init.patch " Andrew Morton
2020-07-06 23:03 ` [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from " Andrew Morton
2020-07-06 23:03 ` [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
2020-07-06 23:03 ` [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
2020-07-06 23:04   ` Andrew Morton
2020-07-06 23:15 ` + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to " Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch " Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch " Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch " Andrew Morton
2020-07-06 23:28 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch " Andrew Morton
2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
2020-07-06 23:33   ` Andrew Morton
2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
2020-07-06 23:34   ` Andrew Morton
2020-07-06 23:34 ` [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from " Andrew Morton
2020-07-06 23:36 ` + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to " Andrew Morton
2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch " Andrew Morton
2020-07-06 23:52 ` [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from " Andrew Morton
2020-07-06 23:53 ` + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to " Andrew Morton
2020-07-07  1:53 ` mmotm 2020-07-06-18-53 uploaded Andrew Morton
2020-07-07 19:17 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree Andrew Morton
2020-07-07 19:20 ` + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch " Andrew Morton
2020-07-07 19:20 ` + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch " Andrew Morton
2020-07-07 19:20 ` + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-set-s_maxbytes-correctly.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-fix-block-limit-check-for-v1-filesystems.patch " Andrew Morton
2020-07-07 19:25 ` + fs-minix-remove-expected-error-message-in-block_to_path.patch " Andrew Morton
2020-07-07 19:27 ` + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch " Andrew Morton
2020-07-07 19:28 ` + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch " Andrew Morton
2020-07-07 19:36 ` + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch " Andrew Morton
2020-07-07 19:37 ` + lib-test_lockupc-make-symbol-test_works-static.patch " Andrew Morton
2020-07-07 19:39 ` [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
2020-07-07 19:47   ` Andrew Morton
2020-07-07 19:56 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to " Andrew Morton
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from " Andrew Morton
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch " Andrew Morton
2020-07-07 20:12 ` [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch " Andrew Morton
2020-07-07 20:13 ` + mm-vmstat-add-events-for-thp-migration-without-split.patch added to " Andrew Morton
2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
2020-07-08 21:48 ` + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch " Andrew Morton
2020-07-08 21:50 ` + kbuild-move-wtype-limits-to-w=2.patch " Andrew Morton
2020-07-08 22:17 ` [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from " Andrew Morton
2020-07-08 22:20 ` + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to " Andrew Morton
2020-07-08 22:25 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch " Andrew Morton
2020-07-08 23:12 ` + mailmap-add-entry-for-mike-rapoport.patch " Andrew Morton
2020-07-08 23:16 ` + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
2020-07-08 23:16 ` + mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
2020-07-08 23:16   ` Andrew Morton
2020-07-08 23:16 ` + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch " Andrew Morton
2020-07-08 23:41 ` + mm-swap-simplify-alloc_swap_slot_cache.patch " Andrew Morton
2020-07-08 23:41 ` + mm-swap-simplify-enable_swap_slots_cache.patch " Andrew Morton
2020-07-08 23:41 ` + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch " Andrew Morton
2020-07-09  0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
2020-07-09  0:06   ` Andrew Morton
2020-07-09  0:06 ` + mm-alpha-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-arc-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-arm-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-arm64-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-csky-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-hexagon-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-ia64-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-m68k-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-microblaze-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:06 ` + mm-mips-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-nds32-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-nios2-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-openrisc-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-parisc-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-powerpc-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-riscv-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-s390-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-sh-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-sparc32-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-sparc64-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-x86-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-xtensa-use-general-page-fault-accounting.patch " Andrew Morton
2020-07-09  0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
2020-07-09  0:07   ` Andrew Morton
2020-07-09  0:07 ` + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch " Andrew Morton
2020-07-09  2:04 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch " Andrew Morton
2020-07-09  2:29 ` mmotm 2020-07-08-19-28 uploaded Andrew Morton
2020-07-09  2:29 ` Andrew Morton
2020-07-09 23:09 ` + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree Andrew Morton
2020-07-09 23:09 ` + mm-dump-compound-page-information-on-a-second-line.patch " Andrew Morton
2020-07-09 23:09 ` + mm-print-head-flags-in-dump_page.patch " Andrew Morton
2020-07-09 23:09 ` + mm-switch-dump_page-to-get_kernel_nofault.patch " Andrew Morton
2020-07-09 23:09 ` + mm-print-the-inode-number-in-dump_page.patch " Andrew Morton
2020-07-09 23:09 ` + mm-print-hashed-address-of-struct-page.patch " Andrew Morton
2020-07-09 23:10 ` + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch " Andrew Morton
2020-07-09 23:46 ` + mm-migrate-optimize-migrate_vma_setup-for-holes.patch " Andrew Morton
2020-07-09 23:46 ` + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch " Andrew Morton
2020-07-10  0:15 ` + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch " Andrew Morton
2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch " Andrew Morton
2020-07-10  0:27 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch " Andrew Morton
2020-07-10  0:33 ` + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
2020-07-10  0:33 ` + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
2020-07-10  0:33 ` + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
2020-07-10  0:33 ` + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
2020-07-10  0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
2020-07-10  0:36   ` Andrew Morton
2020-07-10  0:36 ` + doc-mm-clarify-proc-pid-oom_score-value-range.patch " Andrew Morton
2020-07-10  0:38 ` [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from " Andrew Morton
2020-07-10  0:38 ` [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch " Andrew Morton
2020-07-10  0:38 ` [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch " Andrew Morton
2020-07-10  4:00 ` mmotm 2020-07-09-21-00 uploaded Andrew Morton
2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree Andrew Morton
2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
2020-07-10 23:32 ` + proc-sysctl-make-protected_-world-readable.patch " Andrew Morton
2020-07-10 23:32 ` [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from " Andrew Morton
2020-07-10 23:35 ` + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to " Andrew Morton
2020-07-14  0:19 ` + mm-vmscan-consistent-update-to-pgrefill.patch " Andrew Morton
2020-07-14  0:24 ` + mm-handle-page-mapping-better-in-dump_page-fix.patch " Andrew Morton
2020-07-14  0:31 ` + tmpfs-per-superblock-i_ino-support.patch " Andrew Morton
2020-07-14  0:31 ` + tmpfs-support-64-bit-inums-per-sb.patch " Andrew Morton
2020-07-14  0:50 ` + mm-thp-replace-http-links-with-https-ones.patch " Andrew Morton
2020-07-14  1:00 ` + mm-memcg-reclaim-more-aggressively-before-high-allocator-throttling.patch " Andrew Morton
2020-07-14  1:00 ` + mm-memcg-unify-reclaim-retry-limits-with-page-allocator.patch " Andrew Morton
2020-07-14  1:03 ` + mm-memcg-avoid-stale-protection-values-when-cgroup-is-above-protection.patch " Andrew Morton
2020-07-14  1:03 ` + mm-memcg-decouple-elowmin-state-mutations-from-protection-checks.patch " Andrew Morton
2020-07-14  1:10 ` + scripts-deprecated_terms-sync-with-inclusive-terms.patch " Andrew Morton
2020-07-14  1:21 ` + mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
2020-07-14  1:21 ` + mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
2020-07-14  1:21 ` + mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch " Andrew Morton
2020-07-14  1:21 ` + mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations-fix.patch " Andrew Morton
2020-07-14  1:21 ` + mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
2020-07-14  1:21 ` + mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
2020-07-14  1:21 ` + mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
2020-07-14  1:21 ` + mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
2020-07-14  1:22 ` + mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
2020-07-14  1:30 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers-v5.patch " Andrew Morton
2020-07-14  1:30 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers-v5.patch " Andrew Morton
2020-07-14  1:37 ` + mm-sparse-cleanup-the-code-surrounding-memory_present.patch " Andrew Morton
2020-07-14  1:38 ` + const_structscheckpatch-add-regulator_ops.patch " Andrew Morton
2020-07-14  1:40 ` + fat-fix-fat_ra_init-for-data-clusters-==-0.patch " Andrew Morton
2020-07-14  1:41 ` + mm-vmallocc-remove-bug-from-the-find_va_links.patch " Andrew Morton
2020-07-14  2:49 ` mmotm 2020-07-13-19-49 uploaded Andrew Morton
2020-07-16  0:41 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault-v2.patch added to -mm tree Andrew Morton
2020-07-16  0:42 ` + fs-ufs-avoid-potential-u32-multiplication-overflow.patch " Andrew Morton
2020-07-16  0:50 ` + x86-mm-use-max-memory-block-size-on-bare-metal-v3.patch " Andrew Morton
2020-07-16 21:28 ` + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch " Andrew Morton
2020-07-16 21:45 ` + mmhwpoison-cleanup-unused-pagehuge-check.patch " Andrew Morton
2020-07-16 21:45 ` + mm-hwpoison-remove-recalculating-hpage.patch " Andrew Morton
2020-07-16 21:45 ` + mmmadvise-call-soft_offline_page-without-mf_count_increased.patch " Andrew Morton
2020-07-16 21:45 ` + mmmadvise-refactor-madvise_inject_error.patch " Andrew Morton
2020-07-16 21:45 ` + mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-kill-put_hwpoison_page.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-remove-mf_count_increased.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-free-pages.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-rework-soft-offline-for-in-use-pages.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch " Andrew Morton
2020-07-16 21:46 ` + mmhwpoison-introduce-mf_msg_unsplit_thp.patch " Andrew Morton
2020-07-16 22:51 ` + linux-sched-mmh-drop-duplicated-words-in-comments.patch " Andrew Morton
2020-07-16 22:51 ` + mm-drop-duplicated-words-in-linux-pgtableh.patch " Andrew Morton
2020-07-16 22:52 ` + mm-drop-duplicated-words-in-linux-mmh.patch " Andrew Morton
2020-07-16 22:52 ` + autofs-fix-doubled-word.patch " Andrew Morton
2020-07-16 23:08 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings.patch " Andrew Morton
2020-07-16 23:09 ` Andrew Morton
2020-07-16 23:28 ` + memcg-oom-check-memcg-margin-for-parallel-oom.patch " Andrew Morton
2020-07-16 23:32 ` + mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-2.patch " Andrew Morton
2020-07-16 23:42 ` + ipc-shmc-remove-the-superfluous-break.patch " Andrew Morton
2020-07-16 23:52 ` + mm-thp-replace-http-links-with-https-ones-fix.patch " Andrew Morton
2020-07-17  0:01 ` + scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch " Andrew Morton
2020-07-17  1:53 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
2020-07-17  4:06 ` + revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch " Andrew Morton
2020-07-17  5:53 ` mmotm 2020-07-16-22-52 uploaded Andrew Morton
2020-07-17 20:18 ` [folded-merged] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio-fix.patch removed from -mm tree Andrew Morton
2020-07-17 20:18 ` [obsolete] revert-squashfs-migrate-from-ll_rw_block-usage-to-bio.patch " Andrew Morton
2020-07-17 20:20 ` + squashfs-fix-length-field-overlap-check-in-metadata-reading.patch added to " Andrew Morton
2020-07-17 20:35 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix.patch " Andrew Morton
2020-07-17 20:49 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix.patch " Andrew Morton
2020-07-17 21:11 ` + mm-pgtable-make-generic-pgprot_-macros-available-for-no-mmu.patch " Andrew Morton
2020-07-17 21:11 ` + riscv-use-generic-pgprot_-macros-from-linux-pgtableh.patch " Andrew Morton
2020-07-17 21:42 ` + uaccess-add-force_uaccess_beginend-helpers-v2.patch " Andrew Morton
2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_populate_basepages.patch " Andrew Morton
2020-07-17 21:59 ` + mm-sparsemem-enable-vmem_altmap-support-in-vmemmap_alloc_block_buf.patch " Andrew Morton
2020-07-17 21:59 ` + arm64-mm-enable-vmem_altmap-support-for-vmemmap-mappings.patch " Andrew Morton
2020-07-17 22:00 ` + ocfs2-fix-remounting-needed-after-setfacl-command.patch " Andrew Morton
2020-07-17 23:03 ` + mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations-fix.patch " Andrew Morton
2020-07-20 22:55 ` + scripts-decode_stacktrace-strip-basepath-from-all-paths.patch " Andrew Morton
2020-07-20 23:03 ` + mm-vmstat-fix-proc-sys-vm-stat_refresh-generating-false-warnings-fix-2.patch " Andrew Morton
2020-07-20 23:26 ` + mm-gupc-fix-the-comment-of-return-value-for-populate_vma_page_range.patch " Andrew Morton
2020-07-20 23:31 ` + ocfs2-suballoch-delete-a-duplicated-word.patch " Andrew Morton
2020-07-21  0:26 ` + ntfs-fix-ntfs_test_inode-and-ntfs_init_locked_inode-function-type.patch " Andrew Morton
2020-07-21  0:27 ` + highmem-linux-highmemh-fix-duplicated-words-in-a-comment.patch " Andrew Morton
2020-07-21  0:28 ` + clang-linux-compiler-clangh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
2020-07-21  0:30 ` + linux-exportfsh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
2020-07-21  0:30 ` + linux-async_txh-drop-duplicated-word-in-a-comment.patch " Andrew Morton
2020-07-21  0:31 ` + frontswap-linux-frontswaph-drop-duplicated-word-in-a-comment.patch " Andrew Morton
2020-07-21  0:33 ` + memcontrol-drop-duplicate-word-and-fix-spello-in-linux-memcontrolh.patch " Andrew Morton
2020-07-21  0:34 ` + xz-drop-duplicated-word-in-linux-xzh.patch " Andrew Morton
2020-07-21  2:07 ` mmotm 2020-07-20-19-06 uploaded Andrew Morton
2020-07-21 20:49 ` + fork-silence-a-false-postive-warning-in-__mmdrop.patch added to -mm tree Andrew Morton
2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure.patch " Andrew Morton
2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure-fix.patch " Andrew Morton
2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch " Andrew Morton
2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate-fix.patch " Andrew Morton
2020-07-21 21:18 ` + kernel-add-a-kernel_wait-helper.patch " Andrew Morton
2020-07-21 21:20 ` + maintainers-add-kcov-section.patch " Andrew Morton
2020-07-21 21:21 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled-fix-fix.patch " Andrew Morton
2020-07-24  0:26 ` + scripts-gdb-fix-lx-symbols-gdberror-while-loading-modules.patch " Andrew Morton
2020-07-24  0:47 ` + mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch " Andrew Morton
2020-07-24  0:47 ` + mm-vmscan-protect-the-workingset-on-anonymous-lru.patch " Andrew Morton
2020-07-24  0:47 ` + mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch " Andrew Morton
2020-07-24  0:47 ` + mm-swapcache-support-to-handle-the-shadow-entries.patch " Andrew Morton
2020-07-24  0:47 ` + mm-swap-implement-workingset-detection-for-anonymous-lru.patch " Andrew Morton
2020-07-24  0:47 ` + mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch " Andrew Morton
2020-07-24  0:57 ` + makefile-add-debug-option-to-enable-function-aligned-on-32-bytes.patch " Andrew Morton
2020-07-24  1:09 ` + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch " Andrew Morton
2020-07-24  2:12 ` + panic-make-print_oops_end_marker-static.patch " Andrew Morton
2020-07-24  2:20 ` + lib-kconfigdebug-make-test_lockup-depend-on-module.patch " Andrew Morton
2020-07-24  2:20 ` + lib-test_lockupc-fix-return-value-of-test_lockup_init.patch " Andrew Morton
2020-07-24  2:25 ` [merged] sh-add-missing-export_symbol-for-__delay.patch removed from " Andrew Morton
2020-07-24  2:50 ` + revert-revert-mm-vmalloc-modify-struct-vmap_area-to-reduce-its-size.patch added to " Andrew Morton
2020-07-24  2:53 ` + khugepaged-fix-null-pointer-dereference-due-to-race.patch " Andrew Morton
2020-07-24  3:01 ` + mm-mmap-merge-vma-after-call_mmap-if-possible.patch " Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2020-09-22 17:00 + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch " akpm
2020-08-07  1:07 akpm
2020-07-24  4:14 incoming Andrew Morton
2020-07-31 20:06 ` + mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch added to -mm tree Andrew Morton
2020-06-24 19:19 akpm

This is a public inbox; see the mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as the URLs for its NNTP newsgroup(s).