* [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation
2020-07-03 22:14 incoming Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
2020-07-03 22:15 ` [patch 2/5] samples/vfs: avoid warning in statx override Andrew Morton
` (231 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
To: akpm, kirill.shutemov, linux-mm, mhocko, mike.kravetz,
mm-commits, stable, torvalds, willy
From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: mm/hugetlb.c: fix pages per hugetlb calculation
The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page. hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.
Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error. The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration. In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE. And, the vma containing the huge page would also need
to be smaller than PMD_SIZE.
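For illustration, a minimal userspace sketch (not kernel code) of the
arithmetic difference between the two helpers; the sizes below are
assumptions chosen to mimic a configuration with a sub-PMD hugetlb size:

	/*
	 * Sketch only: hpage_nr_pages() effectively assumes every huge page
	 * is PMD-sized, while pages_per_huge_page() uses the hstate's real
	 * order.  Sizes are illustrative (4 KB base pages, 2 MB PMD huge
	 * pages, hypothetical 64 KB hugetlb size).
	 */
	#include <stdio.h>

	#define BASE_PAGE_SIZE	(4UL << 10)
	#define PMD_HUGE_SIZE	(2UL << 20)
	#define CONT_PTE_SIZE	(64UL << 10)	/* sub-PMD hugetlb size */

	#define HPAGE_PMD_NR	(PMD_HUGE_SIZE / BASE_PAGE_SIZE)

	static unsigned long nr_pages_thp_style(unsigned long huge_size)
	{
		(void)huge_size;
		return HPAGE_PMD_NR;		/* 512, regardless of real size */
	}

	static unsigned long nr_pages_hstate_style(unsigned long huge_size)
	{
		return huge_size / BASE_PAGE_SIZE;	/* 16 for a 64 KB page */
	}

	int main(void)
	{
		printf("THP-style count:    %lu\n", nr_pages_thp_style(CONT_PTE_SIZE));
		printf("hstate-style count: %lu\n", nr_pages_hstate_style(CONT_PTE_SIZE));
		/* the overestimated count pushes pgoff_end far past the vma */
		return 0;
	}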
Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/hugetlb.c~hugetlb-fix-pages-per-hugetlb-calculation
+++ a/mm/hugetlb.c
@@ -1593,7 +1593,7 @@ static struct address_space *_get_hugetl
/* Use first found vma */
pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + hpage_nr_pages(hpage) - 1;
+ pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
pgoff_start, pgoff_end) {
struct vm_area_struct *vma = avc->vma;
_
^ permalink raw reply [flat|nested] 247+ messages in thread
* [patch 2/5] samples/vfs: avoid warning in statx override
2020-07-03 22:14 incoming Andrew Morton
2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
` (230 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
To: akpm, dhowells, keescook, linux-mm, mm-commits, mszeredi, torvalds, viro
From: Kees Cook <keescook@chromium.org>
Subject: samples/vfs: avoid warning in statx override
Something changed recently to uncover this warning:
samples/vfs/test-statx.c:24:15: warning: `struct foo' declared inside parameter list will not be visible outside of this definition or declaration
24 | #define statx foo
| ^~~
This is due to the use of "struct statx" (here, "struct foo") in a function
prototype argument list before it has been defined:
int
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
(int __dirfd, const char *__restrict __path, int __flags,
unsigned int __mask, struct
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
foo
# 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
*__restrict __buf)
__attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 5)));
Add explicit struct declarations before the #include to avoid the warning.
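For reference, a standalone sketch of the same pattern with hypothetical
names (not the kernel sample itself), showing why a file-scope forward
declaration silences the warning:

	/*
	 * Without the forward declaration, "struct foo" is first seen inside
	 * a prototype's parameter list, so its scope is limited to that
	 * prototype and GCC warns that it "will not be visible outside of
	 * this definition or declaration".
	 */
	struct foo;			/* forward declaration at file scope */

	int fill_foo(struct foo *out);	/* now refers to the file-scope type */

	struct foo {
		int value;
	};

	int fill_foo(struct foo *out)
	{
		out->value = 42;
		return 0;
	}

	int main(void)
	{
		struct foo f;

		fill_foo(&f);
		return f.value == 42 ? 0 : 1;
	}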
Link: http://lkml.kernel.org/r/202006282213.C516EA6@keescook
Fixes: f1b5618e013a ("vfs: Add a sample program for the new mount API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
samples/vfs/test-statx.c | 2 ++
1 file changed, 2 insertions(+)
--- a/samples/vfs/test-statx.c~samples-vfs-avoid-warning-in-statx-override
+++ a/samples/vfs/test-statx.c
@@ -23,6 +23,8 @@
#include <linux/fcntl.h>
#define statx foo
#define statx_timestamp foo_timestamp
+struct statx;
+struct statx_timestamp;
#include <sys/stat.h>
#undef statx
#undef statx_timestamp
_
^ permalink raw reply [flat|nested] 247+ messages in thread
* [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak
2020-07-03 22:14 incoming Andrew Morton
2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
2020-07-03 22:15 ` [patch 2/5] samples/vfs: avoid warning in statx override Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
2020-07-03 22:15 ` [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers Andrew Morton
` (229 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
To: akpm, andreas.schaufler, aslan, guro, Jonathan.Cameron, js1304,
linux-mm, mhocko, mike.kravetz, mm-commits, riel, robin.murphy,
song.bao.hua, stable, torvalds
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/cma.c: use exact_nid true to fix possible per-numa cma leak
Calling cma_declare_contiguous_nid() with false exact_nid for per-numa
reservation can easily cause a cma leak and various confusion. For example,
mm/hugetlb.c tries to reserve per-numa cma for gigantic pages, but it can
easily leak cma and confuse users when the system has memoryless nodes.
In case the system has 4 numa nodes and only node0 has memory, if we set
hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for the 4
different numa nodes. Since exact_nid=false in the current code, all 4 numa
nodes will get cma successfully from node0, but hugetlb_cma[1] to
hugetlb_cma[3] will never be available to hugepages as mm/hugetlb.c will
only allocate memory from hugetlb_cma[0].
In case the system has 4 numa nodes where only nodes 0 and 2 have memory and
the other nodes have no memory, if we set hugetlb_cma=4G in bootargs,
mm/hugetlb.c will get 4 cma areas for the 4 different numa nodes. Since
exact_nid=false in the current code, all 4 numa nodes will get cma
successfully from node 0 or 2, but hugetlb_cma[1] and hugetlb_cma[3] will
never be available to hugepages as mm/hugetlb.c will only allocate memory
from hugetlb_cma[0] and hugetlb_cma[2]. This causes a permanent leak of the
cma areas which are supposed to be used by the memoryless nodes.
Of course we can work around the issue by letting mm/hugetlb.c scan all cma
areas in alloc_gigantic_page() even when node_mask includes node0 only; that
way, when node_mask includes node0 only, we can still get pages from
hugetlb_cma[1] to hugetlb_cma[3]. But this will cause a kernel crash in
free_gigantic_page() when it tries to free the page via:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)
On the other hand, exact_nid=false doesn't consider numa distance, so it
might not be that useful to leverage cma areas on remote nodes. I feel it is
much simpler to make exact_nid true and make everything clear. After that,
memoryless nodes won't be able to reserve per-numa CMA from other nodes
which have memory.
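To make the failure mode concrete, here is a hedged userspace sketch
(made-up helpers, not the real memblock/cma code) of how exact_nid=false
lets a memoryless node "succeed" by falling back to another node, while
exact_nid=true simply fails:

	/*
	 * node_has_memory[] is a made-up stand-in for the memory map: only
	 * node 0 has memory, as in the first example above.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	#define NR_NODES 4

	static const bool node_has_memory[NR_NODES] = { true, false, false, false };

	/* Returns the node the reservation really landed on, or -1 on failure. */
	static int reserve_cma_sketch(int nid, bool exact_nid)
	{
		if (node_has_memory[nid])
			return nid;
		if (!exact_nid) {
			for (int n = 0; n < NR_NODES; n++)	/* fall back to any node */
				if (node_has_memory[n])
					return n;
		}
		return -1;	/* exact_nid == true: memoryless nodes simply fail */
	}

	int main(void)
	{
		for (int nid = 0; nid < NR_NODES; nid++)
			printf("node %d: exact_nid=false -> node %d, exact_nid=true -> %d\n",
			       nid, reserve_cma_sketch(nid, false),
			       reserve_cma_sketch(nid, true));
		return 0;
	}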
Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/cma.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/cma.c~mm-cmac-use-exact_nid-true-to-fix-possible-per-numa-cma-leak
+++ a/mm/cma.c
@@ -339,13 +339,13 @@ int __init cma_declare_contiguous_nid(ph
*/
if (base < highmem_start && limit > highmem_start) {
addr = memblock_alloc_range_nid(size, alignment,
- highmem_start, limit, nid, false);
+ highmem_start, limit, nid, true);
limit = highmem_start;
}
if (!addr) {
addr = memblock_alloc_range_nid(size, alignment, base,
- limit, nid, false);
+ limit, nid, true);
if (!addr) {
ret = -ENOMEM;
goto err;
_
^ permalink raw reply [flat|nested] 247+ messages in thread
* [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers
2020-07-03 22:14 incoming Andrew Morton
` (2 preceding siblings ...)
2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
2020-07-03 22:15 ` [patch 5/5] mm/page_alloc: fix documentation error Andrew Morton
` (228 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
To: akpm, ardb, hch, linux-mm, mm-commits, torvalds
From: Christoph Hellwig <hch@lst.de>
Subject: vmalloc: fix the owner argument for the new __vmalloc_node_range callers
Fix the recently added new __vmalloc_node_range callers to pass the
correct values as the owner for display in /proc/vmallocinfo.
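For context, the final parameter of __vmalloc_node_range() is a caller
address that /proc/vmallocinfo later resolves to a symbol name. A hedged
sketch of the idiom (an illustrative helper, not one of the call sites fixed
below):

	#include <linux/vmalloc.h>
	#include <linux/mm.h>

	/*
	 * The last argument is "const void *caller".  __func__ is a string
	 * literal that only happens to convert to a pointer, so it produces a
	 * meaningless owner entry; __builtin_return_address(0) records the
	 * real caller's code address, which vmallocinfo prints as a symbol.
	 */
	static void *alloc_example_page(void)
	{
		return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
				GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
				__builtin_return_address(0));
	}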
Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de
Fixes: 800e26b81311 ("x86/hyperv: allocate the hypercall page with only read and execute bits")
Fixes: 10d5e97c1bf8 ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page")
Fixes: 7a0e27b2a0ce ("mm: remove vmalloc_exec")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/arm64/kernel/probes/kprobes.c | 2 +-
arch/x86/hyperv/hv_init.c | 3 ++-
kernel/module.c | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
--- a/arch/arm64/kernel/probes/kprobes.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/arch/arm64/kernel/probes/kprobes.c
@@ -122,7 +122,7 @@ void *alloc_insn_page(void)
{
return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
- NUMA_NO_NODE, __func__);
+ NUMA_NO_NODE, __builtin_return_address(0));
}
/* arm kprobe: install breakpoint in text */
--- a/arch/x86/hyperv/hv_init.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/arch/x86/hyperv/hv_init.c
@@ -377,7 +377,8 @@ void __init hyperv_init(void)
hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX,
- VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, __func__);
+ VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
+ __builtin_return_address(0));
if (hv_hypercall_pg == NULL) {
wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
goto remove_cpuhp_state;
--- a/kernel/module.c~vmalloc-fix-the-owner-argument-for-the-new-__vmalloc_node_range-callers
+++ a/kernel/module.c
@@ -2785,7 +2785,7 @@ void * __weak module_alloc(unsigned long
{
return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
- NUMA_NO_NODE, __func__);
+ NUMA_NO_NODE, __builtin_return_address(0));
}
bool __weak module_init_section(const char *name)
_
^ permalink raw reply [flat|nested] 247+ messages in thread
* [patch 5/5] mm/page_alloc: fix documentation error
2020-07-03 22:14 incoming Andrew Morton
` (3 preceding siblings ...)
2020-07-03 22:15 ` [patch 4/5] vmalloc: fix the owner argument for the new __vmalloc_node_range callers Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
` (227 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC (permalink / raw)
To: akpm, aquini, fdangelo, jsavitz, linux-mm, mm-commits, torvalds, willy
From: Joel Savitz <jsavitz@redhat.com>
Subject: mm/page_alloc: fix documentation error
When I increased the upper bound of the min_free_kbytes value in
ee8eb9a5fe863 ("mm/page_alloc: increase default min_free_kbytes bound") I
forgot to tweak the above comment to reflect the new value. This patch
fixes that mistake.
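For context, the heuristic the comment documents is roughly
min_free_kbytes = 4 * sqrt(lowmem_kbytes), clamped between 128 kB and the
new 256 MB ceiling. A small userspace sketch of that calculation (not the
kernel function itself; link with -lm):

	#include <math.h>
	#include <stdio.h>

	static unsigned long min_free_kbytes_sketch(unsigned long lowmem_kbytes)
	{
		unsigned long v = (unsigned long)(4 * sqrt((double)lowmem_kbytes));

		if (v < 128)
			v = 128;		/* small machines: 128 kB floor */
		if (v > 256 * 1024)
			v = 256 * 1024;		/* large machines: 256 MB ceiling */
		return v;
	}

	int main(void)
	{
		/* e.g. 16 GB of lowmem -> 4 * sqrt(16777216) = 16384 kB = 16 MB */
		printf("%lu\n", min_free_kbytes_sketch(16UL << 20));
		return 0;
	}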
Link: http://lkml.kernel.org/r/20200624221236.29560-1-jsavitz@redhat.com
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Fabrizio D'Angelo <fdangelo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-fix-documentation-error
+++ a/mm/page_alloc.c
@@ -7832,7 +7832,7 @@ void setup_per_zone_wmarks(void)
* Initialise min_free_kbytes.
*
* For small machines we want it small (128k min). For large machines
- * we want it large (64MB max). But it is not linear, because network
+ * we want it large (256MB max). But it is not linear, because network
* bandwidth does not increase linearly with machine size. We use
*
* min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
_
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (4 preceding siblings ...)
2020-07-03 22:15 ` [patch 5/5] mm/page_alloc: fix documentation error Andrew Morton
@ 2020-07-06 22:41 ` Andrew Morton
2020-07-06 22:41 ` Andrew Morton
2020-07-06 22:46 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch " Andrew Morton
` (226 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:41 UTC (permalink / raw)
To: aryabinin, dvyukov, glider, mark.rutland, mm-commits, vincenzo.frascino
The patch titled
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
has been added to the -mm tree. Its filename is
kasan-remove-kasan_unpoison_stack_above_sp_to.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
kasan_unpoison_stack_above_sp_to() is defined in kasan code but never
used. The function was introduced as part of the commit:
commit 9f7d416c36124667 ("kprobes: Unpoison stack in jprobe_return() for KASAN")
... where it was necessary because x86's jprobe_return() would leave
stale shadow on the stack, and was an oddity in that regard.
Since then, jprobes were removed entirely, and as of commit:
commit 80006dbee674f9fa ("kprobes/x86: Remove jprobe implementation")
... there have been no callers of this function.
Remove the declaration and the implementation.
Link: http://lkml.kernel.org/r/20200706143505.23299-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/kasan.h | 2 --
mm/kasan/common.c | 15 ---------------
2 files changed, 17 deletions(-)
--- a/include/linux/kasan.h~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/include/linux/kasan.h
@@ -38,7 +38,6 @@ extern void kasan_disable_current(void);
void kasan_unpoison_shadow(const void *address, size_t size);
void kasan_unpoison_task_stack(struct task_struct *task);
-void kasan_unpoison_stack_above_sp_to(const void *watermark);
void kasan_alloc_pages(struct page *page, unsigned int order);
void kasan_free_pages(struct page *page, unsigned int order);
@@ -101,7 +100,6 @@ void kasan_restore_multi_shot(bool enabl
static inline void kasan_unpoison_shadow(const void *address, size_t size) {}
static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
-static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {}
static inline void kasan_enable_current(void) {}
static inline void kasan_disable_current(void) {}
--- a/mm/kasan/common.c~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/mm/kasan/common.c
@@ -180,21 +180,6 @@ asmlinkage void kasan_unpoison_task_stac
kasan_unpoison_shadow(base, watermark - base);
}
-/*
- * Clear all poison for the region between the current SP and a provided
- * watermark value, as is sometimes required prior to hand-crafted asm function
- * returns in the middle of functions.
- */
-void kasan_unpoison_stack_above_sp_to(const void *watermark)
-{
- const void *sp = __builtin_frame_address(0);
- size_t size = watermark - sp;
-
- if (WARN_ON(sp > watermark))
- return;
- kasan_unpoison_shadow(sp, size);
-}
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree
2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
@ 2020-07-06 22:41 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:41 UTC (permalink / raw)
To: aryabinin, dvyukov, glider, mark.rutland, mm-commits, vincenzo.frascino
The patch titled
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
has been added to the -mm tree. Its filename is
kasan-remove-kasan_unpoison_stack_above_sp_to.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kasan-remove-kasan_unpoison_stack_above_sp_to.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
Subject: kasan: remove kasan_unpoison_stack_above_sp_to()
kasan_unpoison_stack_above_sp_to() is defined in kasan code but never
used. The function was introduced as part of the commit:
commit 9f7d416c36124667 ("kprobes: Unpoison stack in jprobe_return() for KASAN")
... where it was necessary because x86's jprobe_return() would leave
stale shadow on the stack, and was an oddity in that regard.
Since then, jprobes were removed entirely, and as of commit:
commit 80006dbee674f9fa ("kprobes/x86: Remove jprobe implementation")
... there have been no callers of this function.
Remove the declaration and the implementation.
Link: http://lkml.kernel.org/r/20200706143505.23299-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/kasan.h | 2 --
mm/kasan/common.c | 15 ---------------
2 files changed, 17 deletions(-)
--- a/include/linux/kasan.h~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/include/linux/kasan.h
@@ -38,7 +38,6 @@ extern void kasan_disable_current(void);
void kasan_unpoison_shadow(const void *address, size_t size);
void kasan_unpoison_task_stack(struct task_struct *task);
-void kasan_unpoison_stack_above_sp_to(const void *watermark);
void kasan_alloc_pages(struct page *page, unsigned int order);
void kasan_free_pages(struct page *page, unsigned int order);
@@ -101,7 +100,6 @@ void kasan_restore_multi_shot(bool enabl
static inline void kasan_unpoison_shadow(const void *address, size_t size) {}
static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
-static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {}
static inline void kasan_enable_current(void) {}
static inline void kasan_disable_current(void) {}
--- a/mm/kasan/common.c~kasan-remove-kasan_unpoison_stack_above_sp_to
+++ a/mm/kasan/common.c
@@ -180,21 +180,6 @@ asmlinkage void kasan_unpoison_task_stac
kasan_unpoison_shadow(base, watermark - base);
}
-/*
- * Clear all poison for the region between the current SP and a provided
- * watermark value, as is sometimes required prior to hand-crafted asm function
- * returns in the middle of functions.
- */
-void kasan_unpoison_stack_above_sp_to(const void *watermark)
-{
- const void *sp = __builtin_frame_address(0);
- size_t size = watermark - sp;
-
- if (WARN_ON(sp > watermark))
- return;
- kasan_unpoison_shadow(sp, size);
-}
-
void kasan_alloc_pages(struct page *page, unsigned int order)
{
u8 tag;
_
Patches currently in -mm which might be from vincenzo.frascino@arm.com are
kasan-remove-kasan_unpoison_stack_above_sp_to.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (5 preceding siblings ...)
2020-07-06 22:41 ` + kasan-remove-kasan_unpoison_stack_above_sp_to.patch added to -mm tree Andrew Morton
@ 2020-07-06 22:46 ` Andrew Morton
2020-07-06 22:49 ` + lib-test_bitops-do-the-full-test-during-module-init.patch " Andrew Morton
` (225 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:46 UTC (permalink / raw)
To: andreyknvl, aryabinin, dvyukov, glider, matthias.bgg, mm-commits,
walter-zh.wu
The patch titled
Subject: lib/test_kasan.c: fix KASAN unit tests for tag-based KASAN
has been added to the -mm tree. Its filename is
kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Walter Wu <walter-zh.wu@mediatek.com>
Subject: lib/test_kasan.c: fix KASAN unit tests for tag-based KASAN
With tag-based KASAN, the KASAN unit tests do not detect the out-of-bounds
memory accesses they provoke. They need to be fixed.
With tag-based KASAN, the state of each 16 aligned bytes of memory is
encoded in one shadow byte, and that shadow value is the tag of the pointer.
The tests therefore need to access the next shadow granule, whose shadow
value does not equal the tag of the pointer, so that tag-based KASAN will
detect the out-of-bounds memory access.
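A small userspace sketch of the arithmetic behind the new offset
(OOB_TAG_OFF from the hunk below; the 16-byte granule is the tag-based KASAN
shadow granularity described above, and 123 is an allocation size used by
the kmalloc OOB tests):

	#include <stdio.h>

	#define KASAN_GRANULE	16
	#define OOB_TAG_OFF	13	/* offset the patch adds for tag-based KASAN */

	int main(void)
	{
		unsigned long size = 123;	/* example allocation size */

		/* ptr[size] can land in the object's own last granule, which
		 * carries the matching tag; ptr[size + OOB_TAG_OFF] crosses
		 * into the next granule, whose tag differs, so a report fires. */
		printf("last granule of object:  %lu\n", (size - 1) / KASAN_GRANULE);
		printf("ptr[size]             -> granule %lu\n", size / KASAN_GRANULE);
		printf("ptr[size+OOB_TAG_OFF] -> granule %lu\n",
		       (size + OOB_TAG_OFF) / KASAN_GRANULE);
		return 0;
	}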
Link: http://lkml.kernel.org/r/20200706115039.16750-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
lib/test_kasan.c | 47 ++++++++++++++++++++++++++++-----------------
1 file changed, 30 insertions(+), 17 deletions(-)
--- a/lib/test_kasan.c~kasan-fix-kasan-unit-tests-for-tag-based-kasan
+++ a/lib/test_kasan.c
@@ -23,6 +23,8 @@
#include <asm/page.h>
+#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 13)
+
/*
* We assign some test results to these globals to make sure the tests
* are not eliminated as dead code.
@@ -48,7 +50,8 @@ static noinline void __init kmalloc_oob_
return;
}
- ptr[size] = 'x';
+ ptr[size + OOB_TAG_OFF] = 'x';
+
kfree(ptr);
}
@@ -100,7 +103,8 @@ static noinline void __init kmalloc_page
return;
}
- ptr[size] = 0;
+ ptr[size + OOB_TAG_OFF] = 0;
+
kfree(ptr);
}
@@ -170,7 +174,8 @@ static noinline void __init kmalloc_oob_
return;
}
- ptr2[size2] = 'x';
+ ptr2[size2 + OOB_TAG_OFF] = 'x';
+
kfree(ptr2);
}
@@ -188,7 +193,9 @@ static noinline void __init kmalloc_oob_
kfree(ptr1);
return;
}
- ptr2[size2] = 'x';
+
+ ptr2[size2 + OOB_TAG_OFF] = 'x';
+
kfree(ptr2);
}
@@ -224,7 +231,8 @@ static noinline void __init kmalloc_oob_
return;
}
- memset(ptr+7, 0, 2);
+ memset(ptr + 7 + OOB_TAG_OFF, 0, 2);
+
kfree(ptr);
}
@@ -240,7 +248,8 @@ static noinline void __init kmalloc_oob_
return;
}
- memset(ptr+5, 0, 4);
+ memset(ptr + 5 + OOB_TAG_OFF, 0, 4);
+
kfree(ptr);
}
@@ -257,7 +266,8 @@ static noinline void __init kmalloc_oob_
return;
}
- memset(ptr+1, 0, 8);
+ memset(ptr + 1 + OOB_TAG_OFF, 0, 8);
+
kfree(ptr);
}
@@ -273,7 +283,8 @@ static noinline void __init kmalloc_oob_
return;
}
- memset(ptr+1, 0, 16);
+ memset(ptr + 1 + OOB_TAG_OFF, 0, 16);
+
kfree(ptr);
}
@@ -289,7 +300,8 @@ static noinline void __init kmalloc_oob_
return;
}
- memset(ptr, 0, size+5);
+ memset(ptr, 0, size + 5 + OOB_TAG_OFF);
+
kfree(ptr);
}
@@ -423,7 +435,8 @@ static noinline void __init kmem_cache_o
return;
}
- *p = p[size];
+ *p = p[size + OOB_TAG_OFF];
+
kmem_cache_free(cache, p);
kmem_cache_destroy(cache);
}
@@ -520,25 +533,25 @@ static noinline void __init copy_user_te
}
pr_info("out-of-bounds in copy_from_user()\n");
- unused = copy_from_user(kmem, usermem, size + 1);
+ unused = copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in copy_to_user()\n");
- unused = copy_to_user(usermem, kmem, size + 1);
+ unused = copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in __copy_from_user()\n");
- unused = __copy_from_user(kmem, usermem, size + 1);
+ unused = __copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in __copy_to_user()\n");
- unused = __copy_to_user(usermem, kmem, size + 1);
+ unused = __copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in __copy_from_user_inatomic()\n");
- unused = __copy_from_user_inatomic(kmem, usermem, size + 1);
+ unused = __copy_from_user_inatomic(kmem, usermem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in __copy_to_user_inatomic()\n");
- unused = __copy_to_user_inatomic(usermem, kmem, size + 1);
+ unused = __copy_to_user_inatomic(usermem, kmem, size + 1 + OOB_TAG_OFF);
pr_info("out-of-bounds in strncpy_from_user()\n");
- unused = strncpy_from_user(kmem, usermem, size + 1);
+ unused = strncpy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF);
vm_munmap((unsigned long)usermem, PAGE_SIZE);
kfree(kmem);
_
Patches currently in -mm which might be from walter-zh.wu@mediatek.com are
kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
rcu-kasan-record-and-print-call_rcu-call-stack.patch
kasan-record-and-print-the-free-track.patch
kasan-add-tests-for-call_rcu-stack-recording.patch
kasan-update-documentation-for-generic-kasan.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + lib-test_bitops-do-the-full-test-during-module-init.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (6 preceding siblings ...)
2020-07-06 22:46 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch " Andrew Morton
@ 2020-07-06 22:49 ` Andrew Morton
2020-07-06 23:03 ` [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from " Andrew Morton
` (224 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 22:49 UTC (permalink / raw)
To: andriy.shevchenko, geert, jesse.brandeburg, mm-commits, richard.weiyang
The patch titled
Subject: lib/test_bitops: do the full test during module init
has been added to the -mm tree. Its filename is
lib-test_bitops-do-the-full-test-during-module-init.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/lib-test_bitops-do-the-full-test-during-module-init.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/lib-test_bitops-do-the-full-test-during-module-init.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Geert Uytterhoeven <geert@linux-m68k.org>
Subject: lib/test_bitops: do the full test during module init
Currently, the bitops test consists of two parts: one part is executed
during module load, the second part during module unload. This is
cumbersome for the user, as he has to perform two steps to execute all
tests, and is different from most (all?) other tests.
Merge the two parts, so both are executed during module load.
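The result follows the usual shape of a self-contained test module: do all
the work at module_init() time and leave module_exit() empty so the module
can still be unloaded. A minimal hedged skeleton of that pattern (not the
actual lib/test_bitops.c code):

	#include <linux/init.h>
	#include <linux/module.h>
	#include <linux/printk.h>

	static int __init bitops_example_init(void)
	{
		pr_info("Starting bitops test\n");
		/* ... set bits, verify, clear bits, verify again ... */
		pr_info("Completed bitops test\n");
		return 0;	/* keep the module loaded so it can be rmmod'ed */
	}

	static void __exit bitops_example_exit(void)
	{
		/* nothing to do; everything ran at load time */
	}

	module_init(bitops_example_init);
	module_exit(bitops_example_exit);
	MODULE_LICENSE("GPL");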
Link: http://lkml.kernel.org/r/20200706112900.7097-1-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
lib/test_bitops.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
--- a/lib/test_bitops.c~lib-test_bitops-do-the-full-test-during-module-init
+++ a/lib/test_bitops.c
@@ -52,9 +52,9 @@ static unsigned long order_comb_long[][2
static int __init test_bitops_startup(void)
{
- int i;
+ int i, bit_set;
- pr_warn("Loaded test module\n");
+ pr_info("Starting bitops test\n");
set_bit(BITOPS_4, g_bitmap);
set_bit(BITOPS_7, g_bitmap);
set_bit(BITOPS_11, g_bitmap);
@@ -81,12 +81,8 @@ static int __init test_bitops_startup(vo
order_comb_long[i][0]);
}
#endif
- return 0;
-}
-static void __exit test_bitops_unstartup(void)
-{
- int bit_set;
+ barrier();
clear_bit(BITOPS_4, g_bitmap);
clear_bit(BITOPS_7, g_bitmap);
@@ -98,7 +94,13 @@ static void __exit test_bitops_unstartup
if (bit_set != BITOPS_LAST)
pr_err("ERROR: FOUND SET BIT %d\n", bit_set);
- pr_warn("Unloaded test module\n");
+ pr_info("Completed bitops test\n");
+
+ return 0;
+}
+
+static void __exit test_bitops_unstartup(void)
+{
}
module_init(test_bitops_startup);
_
Patches currently in -mm which might be from geert@linux-m68k.org are
lib-test_bitops-do-the-full-test-during-module-init.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (7 preceding siblings ...)
2020-07-06 22:49 ` + lib-test_bitops-do-the-full-test-during-module-init.patch " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
2020-07-06 23:03 ` [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
` (223 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, walken, willy, yang.shi
The patch titled
Subject: mm/mremap: format the check in move_normal_pmd() same as move_huge_pmd()
has been removed from the -mm tree. Its filename was
mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch
This patch was dropped because it was nacked
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: format the check in move_normal_pmd() same as move_huge_pmd()
Patch series "mm/mremap: cleanup move_page_tables() a little", v2.
move_page_tables() tries to move page tables by PMD or by PTE.
The root cause is that, to move a PMD, both the old and new ranges must be
PMD aligned. But the current code calculates the old range and the new range
separately, which leads to some redundant checks and calculations.
This cleanup tries to consolidate the range check in one place to reduce
some extra range handling.
This patch (of 4):
No functional change; just improve the readability and prepare for the
following cleanups.
Link: http://lkml.kernel.org/r/20200626135216.24314-1-richard.weiyang@linux.alibaba.com
Link: http://lkml.kernel.org/r/20200626135216.24314-2-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mremap.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/mremap.c~mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd
+++ a/mm/mremap.c
@@ -200,8 +200,9 @@ static bool move_normal_pmd(struct vm_ar
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
- || old_end - old_addr < PMD_SIZE)
+ if ((old_addr & ~PMD_MASK) ||
+ (new_addr & ~PMD_MASK) ||
+ old_end - old_addr < PMD_SIZE)
return false;
/*
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (8 preceding siblings ...)
2020-07-06 23:03 ` [nacked] mm-mremap-format-the-check-in-move_normal_pmd-same-as-move_huge_pmd.patch removed from " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
2020-07-06 23:03 ` [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
` (222 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, walken, willy, yang.shi
The patch titled
Subject: mm/mremap: it is sure to have enough space when extent meets requirement
has been removed from the -mm tree. Its filename was
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: it is sure to have enough space when extent meets requirement
old_end is passed to these two functions to check whether there is enough
space to do the move, but this check has already been done before invoking
these functions.
These two functions are only invoked when extent meets the requirement, and
the caller performs this check before invoking them:
if (extent > old_end - old_addr)
	extent = old_end - old_addr;
This implies (old_end - old_addr) won't fail the check in these two
functions.
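Spelled out, a sketch of the reasoning using the names from
move_page_tables() (not a compilable excerpt):

	extent = next - old_addr;
	if (extent > old_end - old_addr)
		extent = old_end - old_addr;	/* extent <= old_end - old_addr */
	...
	if (extent == PMD_SIZE) {
		/*
		 * Here PMD_SIZE == extent <= old_end - old_addr, so the
		 * removed "old_end - old_addr < PMD_SIZE" test inside
		 * move_normal_pmd() could never fire.
		 */
		moved = move_normal_pmd(vma, old_addr, new_addr,
					old_pmd, new_pmd);
	}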
Link: http://lkml.kernel.org/r/20200626135216.24314-3-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/huge_mm.h | 2 +-
mm/huge_memory.c | 7 ++-----
mm/mremap.c | 11 ++++-------
3 files changed, 7 insertions(+), 13 deletions(-)
--- a/include/linux/huge_mm.h~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/include/linux/huge_mm.h
@@ -42,7 +42,7 @@ extern int mincore_huge_pmd(struct vm_ar
unsigned long addr, unsigned long end,
unsigned char *vec);
extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
+ unsigned long new_addr,
pmd_t *old_pmd, pmd_t *new_pmd);
extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
unsigned long addr, pgprot_t newprot,
--- a/mm/huge_memory.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/huge_memory.c
@@ -1722,17 +1722,14 @@ static pmd_t move_soft_dirty_pmd(pmd_t p
}
bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
- pmd_t *old_pmd, pmd_t *new_pmd)
+ unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
spinlock_t *old_ptl, *new_ptl;
pmd_t pmd;
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) ||
- (new_addr & ~HPAGE_PMD_MASK) ||
- old_end - old_addr < HPAGE_PMD_SIZE)
+ if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
return false;
/*
--- a/mm/mremap.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/mremap.c
@@ -193,16 +193,13 @@ static void move_ptes(struct vm_area_str
#ifdef CONFIG_HAVE_MOVE_PMD
static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
- pmd_t *old_pmd, pmd_t *new_pmd)
+ unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
spinlock_t *old_ptl, *new_ptl;
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) ||
- (new_addr & ~PMD_MASK) ||
- old_end - old_addr < PMD_SIZE)
+ if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
return false;
/*
@@ -274,7 +271,7 @@ unsigned long move_page_tables(struct vm
if (need_rmap_locks)
take_rmap_locks(vma);
moved = move_huge_pmd(vma, old_addr, new_addr,
- old_end, old_pmd, new_pmd);
+ old_pmd, new_pmd);
if (need_rmap_locks)
drop_rmap_locks(vma);
if (moved)
@@ -294,7 +291,7 @@ unsigned long move_page_tables(struct vm
if (need_rmap_locks)
take_rmap_locks(vma);
moved = move_normal_pmd(vma, old_addr, new_addr,
- old_end, old_pmd, new_pmd);
+ old_pmd, new_pmd);
if (need_rmap_locks)
drop_rmap_locks(vma);
if (moved)
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (9 preceding siblings ...)
2020-07-06 23:03 ` [to-be-updated] mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
@ 2020-07-06 23:03 ` Andrew Morton
2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
` (221 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:03 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, walken, willy, yang.shi
The patch titled
Subject: mm/mremap: calculate extent in one place
has been removed from the -mm tree. Its filename was
mm-mremap-calculate-extent-in-one-place.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: calculate extent in one place
Page tables are moved at PMD granularity. This requires both the source and
destination ranges to meet the alignment requirement.
The current code works well since move_huge_pmd() and move_normal_pmd()
check old_addr and new_addr again and fall back to move_ptes() if either of
them is not aligned.
Instead of calculating the extent separately, it is better to calculate it
in one place, so we know up front whether it is worth trying to move a pmd.
By doing so, the logic seems a little clearer.
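Put together, the consolidated calculation clamps the step by three limits
before any PMD move is attempted; a hedged sketch of the patched loop body
(simplified, names as in move_page_tables()):

	/* clamp extent by the next source PMD boundary, by the end of the
	 * source range, and by the next destination PMD boundary, all in one
	 * place, before deciding whether a PMD-level move is possible */
	next = (old_addr + PMD_SIZE) & PMD_MASK;
	extent = next - old_addr;
	if (extent > old_end - old_addr)
		extent = old_end - old_addr;
	next = (new_addr + PMD_SIZE) & PMD_MASK;
	if (extent > next - new_addr)
		extent = next - new_addr;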
Link: http://lkml.kernel.org/r/20200626135216.24314-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mremap.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/mremap.c~mm-mremap-calculate-extent-in-one-place
+++ a/mm/mremap.c
@@ -258,6 +258,9 @@ unsigned long move_page_tables(struct vm
extent = next - old_addr;
if (extent > old_end - old_addr)
extent = old_end - old_addr;
+ next = (new_addr + PMD_SIZE) & PMD_MASK;
+ if (extent > next - new_addr)
+ extent = next - new_addr;
old_pmd = get_old_pmd(vma->vm_mm, old_addr);
if (!old_pmd)
continue;
@@ -301,9 +304,6 @@ unsigned long move_page_tables(struct vm
if (pte_alloc(new_vma->vm_mm, new_pmd))
break;
- next = (new_addr + PMD_SIZE) & PMD_MASK;
- if (extent > next - new_addr)
- extent = next - new_addr;
move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma,
new_pmd, new_addr, need_rmap_locks);
}
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-start-addresses-are-properly-aligned.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (10 preceding siblings ...)
2020-07-06 23:03 ` [to-be-updated] mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
@ 2020-07-06 23:04 ` Andrew Morton
2020-07-06 23:04 ` Andrew Morton
2020-07-06 23:15 ` + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to " Andrew Morton
` (220 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:04 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, walken, willy, yang.shi
The patch titled
Subject: mm/mremap: start addresses are properly aligned
has been removed from the -mm tree. Its filename was
mm-mremap-start-addresses-are-properly-aligned.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned
After the previous cleanup, extent is the minimal step for both source and
destination. This means that when extent is HPAGE_PMD_SIZE or PMD_SIZE,
old_addr and new_addr are properly aligned as well.
Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.
Link: http://lkml.kernel.org/r/20200626135216.24314-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 ---
mm/mremap.c | 3 ---
2 files changed, 6 deletions(-)
--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
- return false;
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch removed from -mm tree
2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-06 23:04 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:04 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, walken, willy, yang.shi
The patch titled
Subject: mm/mremap: start addresses are properly aligned
has been removed from the -mm tree. Its filename was
mm-mremap-start-addresses-are-properly-aligned.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned
After the previous cleanup, extent is the minimal step for both source and
destination. This means that when extent is HPAGE_PMD_SIZE or PMD_SIZE,
old_addr and new_addr are properly aligned as well.
Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.
Link: http://lkml.kernel.org/r/20200626135216.24314-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 ---
mm/mremap.c | 3 ---
2 files changed, 6 deletions(-)
--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (11 preceding siblings ...)
2020-07-06 23:04 ` [to-be-updated] mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-06 23:15 ` Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch " Andrew Morton
` (219 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:15 UTC (permalink / raw)
To: david, mm-commits, penberg, songmuchun
The patch titled
Subject: mm/page_alloc.c: skip setting nodemask when we are in interrupt
has been added to the -mm tree. Its filename is
mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun@bytedance.com>
Subject: mm/page_alloc.c: skip setting nodemask when we are in interrupt
When we are in interrupt context, the allocation is unrelated to the current
task's context. If we use the current task's mems_allowed, we can fail to
allocate pages in the fast path and fall back to the slow-path memory
allocation when the current node (i.e. the node in the current task's
mems_allowed) does not have enough memory to allocate from. In this case it
slows down memory allocation in interrupt context. So we can skip setting
the nodemask and allow any node to allocate memory, so that the fast-path
allocation can succeed.
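A hedged restatement of the resulting check, with the intent spelled out
(simplified from prepare_alloc_pages(); see the hunk below):

	if (cpusets_enabled()) {
		*alloc_mask |= __GFP_HARDWALL;
		/*
		 * Only bind the allocation to the task's mems_allowed when we
		 * actually run in task context; in interrupt context leave
		 * ac->nodemask NULL so the fast path is not restricted to the
		 * task's nodes.
		 */
		if (!in_interrupt() && !ac->nodemask)
			ac->nodemask = &cpuset_current_mems_allowed;
		else
			*alloc_flags |= ALLOC_CPUSET;
	}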
Link: http://lkml.kernel.org/r/20200706025921.53683-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt
+++ a/mm/page_alloc.c
@@ -4790,7 +4790,11 @@ static inline bool prepare_alloc_pages(g
if (cpusets_enabled()) {
*alloc_mask |= __GFP_HARDWALL;
- if (!ac->nodemask)
+ /*
+ * When we are in the interrupt context, it is irrelevant
+ * to the current task context. It means that any node ok.
+ */
+ if (!in_interrupt() && !ac->nodemask)
ac->nodemask = &cpuset_current_mems_allowed;
else
*alloc_flags |= ALLOC_CPUSET;
_
Patches currently in -mm which might be from songmuchun@bytedance.com are
mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (12 preceding siblings ...)
2020-07-06 23:15 ` + mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch added to " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch " Andrew Morton
` (218 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
ziy
The patch titled
Subject: mm/debug_vm_pgtable: add tests validating arch helpers for core MM features
has been added to the -mm tree. Its filename is
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add tests validating arch helpers for core MM features
Patch series "mm/debug_vm_pgtable: Add some more tests", v4.
This series adds some more arch page table helper validation tests related
to core and advanced memory functions. It also creates documentation listing
the expected semantics for all page table helpers, as suggested by Mike
Rapoport previously
(https://lkml.org/lkml/2020/1/30/40).
There are many TRANSPARENT_HUGEPAGE and ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD
ifdefs scattered across the test. But consolidating all the fallback
stubs is not very straightforward because
ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD is not explicitly dependent on
ARCH_HAS_TRANSPARENT_HUGEPAGE.
Tested on arm64 and x86 platforms, but only build tested on all other
platforms enabled through ARCH_HAS_DEBUG_VM_PGTABLE, i.e. powerpc, arc and
s390. The following failure on arm64, mentioned previously, still exists.
It will be fixed by the upcoming series enabling THP migration on arm64.
WARNING .... mm/debug_vm_pgtable.c:860 debug_vm_pgtable+0x940/0xa54
WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))))
This patch (of 4):
This adds new tests validating arch page table helpers for the following
core memory features. These tests create and test specific mapping types
at various page table levels.
1. SPECIAL mapping
2. PROTNONE mapping
3. DEVMAP mapping
4. SOFTDIRTY mapping
5. SWAP mapping
6. MIGRATION mapping
7. HUGETLB mapping
8. THP mapping
Link: http://lkml.kernel.org/r/1593996516-7186-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1593996516-7186-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug_vm_pgtable.c | 302 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 301 insertions(+), 1 deletion(-)
--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features
+++ a/mm/debug_vm_pgtable.c
@@ -282,6 +282,278 @@ static void __init pmd_populate_tests(st
WARN_ON(pmd_bad(pmd));
}
+static void __init pte_special_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL))
+ return;
+
+ WARN_ON(!pte_special(pte_mkspecial(pte)));
+}
+
+static void __init pte_protnone_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+ return;
+
+ WARN_ON(!pte_protnone(pte));
+ WARN_ON(!pte_present(pte));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_protnone_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
+
+ if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+ return;
+
+ WARN_ON(!pmd_protnone(pmd));
+ WARN_ON(!pmd_present(pmd));
+}
+#else /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_protnone_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
+static void __init pte_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ WARN_ON(!pmd_devmap(pmd_mkdevmap(pmd)));
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot)
+{
+ pud_t pud = pfn_pud(pfn, prot);
+
+ WARN_ON(!pud_devmap(pud_mkdevmap(pud)));
+}
+#else /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#else /* CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#else
+static void __init pte_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_devmap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
+
+static void __init pte_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ return;
+
+ WARN_ON(!pte_soft_dirty(pte_mksoft_dirty(pte)));
+ WARN_ON(pte_soft_dirty(pte_clear_soft_dirty(pte)));
+}
+
+static void __init pte_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ return;
+
+ WARN_ON(!pte_swp_soft_dirty(pte_swp_mksoft_dirty(pte)));
+ WARN_ON(pte_swp_soft_dirty(pte_swp_clear_soft_dirty(pte)));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ return;
+
+ WARN_ON(!pmd_soft_dirty(pmd_mksoft_dirty(pmd)));
+ WARN_ON(pmd_soft_dirty(pmd_clear_soft_dirty(pmd)));
+}
+
+static void __init pmd_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) ||
+ !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
+ return;
+
+ WARN_ON(!pmd_swp_soft_dirty(pmd_swp_mksoft_dirty(pmd)));
+ WARN_ON(pmd_swp_soft_dirty(pmd_swp_clear_soft_dirty(pmd)));
+}
+#else /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_soft_dirty_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_swap_soft_dirty_tests(unsigned long pfn, pgprot_t prot)
+{
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+static void __init pte_swap_tests(unsigned long pfn, pgprot_t prot)
+{
+ swp_entry_t swp;
+ pte_t pte;
+
+ pte = pfn_pte(pfn, prot);
+ swp = __pte_to_swp_entry(pte);
+ pte = __swp_entry_to_pte(swp);
+ WARN_ON(pfn != pte_pfn(pte));
+}
+
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot)
+{
+ swp_entry_t swp;
+ pmd_t pmd;
+
+ pmd = pfn_pmd(pfn, prot);
+ swp = __pmd_to_swp_entry(pmd);
+ pmd = __swp_entry_to_pmd(swp);
+ WARN_ON(pfn != pmd_pfn(pmd));
+}
+#else /* !CONFIG_ARCH_ENABLE_THP_MIGRATION */
+static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
+
+static void __init swap_migration_tests(void)
+{
+ struct page *page;
+ swp_entry_t swp;
+
+ if (!IS_ENABLED(CONFIG_MIGRATION))
+ return;
+ /*
+ * swap_migration_tests() requires a dedicated page as it needs to
+ * be locked before creating a migration entry from it. Locking the
+ * page that actually maps kernel text ('start_kernel') can be really
+ * problematic. Let's allocate a dedicated page explicitly for this
+ * purpose that will be freed subsequently.
+ */
+ page = alloc_page(GFP_KERNEL);
+ if (!page) {
+ pr_err("page allocation failed\n");
+ return;
+ }
+
+ /*
+ * make_migration_entry() expects given page to be
+ * locked, otherwise it stumbles upon a BUG_ON().
+ */
+ __SetPageLocked(page);
+ swp = make_migration_entry(page, 1);
+ WARN_ON(!is_migration_entry(swp));
+ WARN_ON(!is_write_migration_entry(swp));
+
+ make_migration_entry_read(&swp);
+ WARN_ON(!is_migration_entry(swp));
+ WARN_ON(is_write_migration_entry(swp));
+
+ swp = make_migration_entry(page, 0);
+ WARN_ON(!is_migration_entry(swp));
+ WARN_ON(is_write_migration_entry(swp));
+ __ClearPageLocked(page);
+ __free_page(page);
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+ struct page *page;
+ pte_t pte;
+
+ /*
+ * Accessing the page associated with the pfn is safe here,
+ * as it was previously derived from a real kernel symbol.
+ */
+ page = pfn_to_page(pfn);
+ pte = mk_huge_pte(page, prot);
+
+ WARN_ON(!huge_pte_dirty(huge_pte_mkdirty(pte)));
+ WARN_ON(!huge_pte_write(huge_pte_mkwrite(huge_pte_wrprotect(pte))));
+ WARN_ON(huge_pte_write(huge_pte_wrprotect(huge_pte_mkwrite(pte))));
+
+#ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
+ pte = pfn_pte(pfn, prot);
+
+ WARN_ON(!pte_huge(pte_mkhuge(pte)));
+#endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
+}
+#else /* !CONFIG_HUGETLB_PAGE */
+static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HUGETLB_PAGE */
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd;
+
+ if (!has_transparent_hugepage())
+ return;
+
+ /*
+ * pmd_trans_huge() and pmd_present() must return positive after
+ * MMU invalidation with pmd_mkinvalid(). This behavior is an
+ * optimization for transparent huge page. pmd_trans_huge() must
+ * be true if pmd_page() returns a valid THP to avoid taking the
+ * pmd_lock when others walk over non transhuge pmds (i.e. there
+ * are no THP allocated). Especially when splitting a THP and
+ * removing the present bit from the pmd, pmd_trans_huge() still
+ * needs to return true. pmd_present() should be true whenever
+ * pmd_trans_huge() returns true.
+ */
+ pmd = pfn_pmd(pfn, prot);
+ WARN_ON(!pmd_trans_huge(pmd_mkhuge(pmd)));
+
+#ifndef __HAVE_ARCH_PMDP_INVALIDATE
+ WARN_ON(!pmd_trans_huge(pmd_mkinvalid(pmd_mkhuge(pmd))));
+ WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))));
+#endif /* __HAVE_ARCH_PMDP_INVALIDATE */
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot)
+{
+ pud_t pud;
+
+ if (!has_transparent_hugepage())
+ return;
+
+ pud = pfn_pud(pfn, prot);
+ WARN_ON(!pud_trans_huge(pud_mkhuge(pud)));
+
+ /*
+ * pud_mkinvalid() has been dropped for now. Enable back
+ * these tests when it comes back with a modified pud_present().
+ *
+ * WARN_ON(!pud_trans_huge(pud_mkinvalid(pud_mkhuge(pud))));
+ * WARN_ON(!pud_present(pud_mkinvalid(pud_mkhuge(pud))));
+ */
+}
+#else /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#else /* !CONFIG_TRANSPARENT_HUGEPAGE */
+static void __init pmd_thp_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_thp_tests(unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
static unsigned long __init get_random_vaddr(void)
{
unsigned long random_vaddr, random_pages, total_user_pages;
@@ -303,7 +575,7 @@ static int __init debug_vm_pgtable(void)
pmd_t *pmdp, *saved_pmdp, pmd;
pte_t *ptep;
pgtable_t saved_ptep;
- pgprot_t prot;
+ pgprot_t prot, protnone;
phys_addr_t paddr;
unsigned long vaddr, pte_aligned, pmd_aligned;
unsigned long pud_aligned, p4d_aligned, pgd_aligned;
@@ -319,6 +591,12 @@ static int __init debug_vm_pgtable(void)
}
/*
+ * __P000 (or even __S000) will help create page table entries with
+ * PROT_NONE permission as required for pxx_protnone_tests().
+ */
+ protnone = __P000;
+
+ /*
* PFN for mapping at PTE level is determined from a standard kernel
* text symbol. But pfns for higher page table levels are derived by
* masking lower bits of this real pfn. These derived pfns might not
@@ -373,6 +651,28 @@ static int __init debug_vm_pgtable(void)
p4d_populate_tests(mm, p4dp, saved_pudp);
pgd_populate_tests(mm, pgdp, saved_p4dp);
+ pte_special_tests(pte_aligned, prot);
+ pte_protnone_tests(pte_aligned, protnone);
+ pmd_protnone_tests(pmd_aligned, protnone);
+
+ pte_devmap_tests(pte_aligned, prot);
+ pmd_devmap_tests(pmd_aligned, prot);
+ pud_devmap_tests(pud_aligned, prot);
+
+ pte_soft_dirty_tests(pte_aligned, prot);
+ pmd_soft_dirty_tests(pmd_aligned, prot);
+ pte_swap_soft_dirty_tests(pte_aligned, prot);
+ pmd_swap_soft_dirty_tests(pmd_aligned, prot);
+
+ pte_swap_tests(pte_aligned, prot);
+ pmd_swap_tests(pmd_aligned, prot);
+
+ swap_migration_tests();
+ hugetlb_basic_tests(pte_aligned, prot);
+
+ pmd_thp_tests(pmd_aligned, prot);
+ pud_thp_tests(pud_aligned, prot);
+
p4d_free(mm, saved_p4dp);
pud_free(mm, saved_pudp);
pmd_free(mm, saved_pmdp);
_
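The basic tests added above all follow the same round-trip shape: construct
an entry with a pxx_mk*() style helper, then check the matching predicate.
A minimal userspace sketch of that shape, with an invented 64-bit mock_pte
and made-up flag bits standing in for the kernel's pte_t and arch bits:
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
typedef uint64_t mock_pte;
#define MOCK_SPECIAL    (1ULL << 0)   /* stand-in for the special bit    */
#define MOCK_SOFT_DIRTY (1ULL << 1)   /* stand-in for the soft dirty bit */
static mock_pte mk_special(mock_pte p)       { return p | MOCK_SPECIAL; }
static int      is_special(mock_pte p)       { return !!(p & MOCK_SPECIAL); }
static mock_pte mk_soft_dirty(mock_pte p)    { return p | MOCK_SOFT_DIRTY; }
static mock_pte clear_soft_dirty(mock_pte p) { return p & ~MOCK_SOFT_DIRTY; }
static int      is_soft_dirty(mock_pte p)    { return !!(p & MOCK_SOFT_DIRTY); }
int main(void)
{
	mock_pte pte = 0;
	/* Same shape as WARN_ON(!pte_special(pte_mkspecial(pte))) above. */
	assert(is_special(mk_special(pte)));
	/* Same shape as the soft dirty set/clear pair checked above. */
	assert(is_soft_dirty(mk_soft_dirty(pte)));
	assert(!is_soft_dirty(clear_soft_dirty(mk_soft_dirty(pte))));
	printf("mock round-trip checks passed\n");
	return 0;
}
The real tests differ only in using the arch helpers directly and reporting
via WARN_ON() rather than assert().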
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (13 preceding siblings ...)
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch " Andrew Morton
` (217 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
ziy
The patch titled
Subject: mm/debug_vm_pgtable: add tests validating advanced arch page table helpers
has been added to the -mm tree. Its filename is
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add tests validating advanced arch page table helpers
This adds new tests validating the following advanced arch page table
helpers. These tests create and test specific mapping types at various
page table levels.
1. pxxp_set_wrprotect()
2. pxxp_get_and_clear()
3. pxxp_set_access_flags()
4. pxxp_get_and_clear_full()
5. pxxp_test_and_clear_young()
6. pxx_leaf()
7. pxx_set_huge()
8. pxx_(clear|mk)_savedwrite()
9. huge_pxxp_xxx()
Link: http://lkml.kernel.org/r/1593996516-7186-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug_vm_pgtable.c | 312 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 312 insertions(+)
--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers
+++ a/mm/debug_vm_pgtable.c
@@ -21,6 +21,7 @@
#include <linux/module.h>
#include <linux/pfn_t.h>
#include <linux/printk.h>
+#include <linux/pgtable.h>
#include <linux/random.h>
#include <linux/spinlock.h>
#include <linux/swap.h>
@@ -28,6 +29,7 @@
#include <linux/start_kernel.h>
#include <linux/sched/mm.h>
#include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
@@ -55,6 +57,55 @@ static void __init pte_basic_tests(unsig
WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
}
+static void __init pte_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pte_t *ptep,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ pte = pfn_pte(pfn, prot);
+ set_pte_at(mm, vaddr, ptep, pte);
+ ptep_set_wrprotect(mm, vaddr, ptep);
+ pte = ptep_get(ptep);
+ WARN_ON(pte_write(pte));
+
+ pte = pfn_pte(pfn, prot);
+ set_pte_at(mm, vaddr, ptep, pte);
+ ptep_get_and_clear(mm, vaddr, ptep);
+ pte = ptep_get(ptep);
+ WARN_ON(!pte_none(pte));
+
+ pte = pfn_pte(pfn, prot);
+ pte = pte_wrprotect(pte);
+ pte = pte_mkclean(pte);
+ set_pte_at(mm, vaddr, ptep, pte);
+ pte = pte_mkwrite(pte);
+ pte = pte_mkdirty(pte);
+ ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
+ pte = ptep_get(ptep);
+ WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
+
+ pte = pfn_pte(pfn, prot);
+ set_pte_at(mm, vaddr, ptep, pte);
+ ptep_get_and_clear_full(mm, vaddr, ptep, 1);
+ pte = ptep_get(ptep);
+ WARN_ON(!pte_none(pte));
+
+ pte = pte_mkyoung(pte);
+ set_pte_at(mm, vaddr, ptep, pte);
+ ptep_test_and_clear_young(vma, vaddr, ptep);
+ pte = ptep_get(ptep);
+ WARN_ON(pte_young(pte));
+}
+
+static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
+{
+ pte_t pte = pfn_pte(pfn, prot);
+
+ WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
+ WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
+}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
{
@@ -77,6 +128,90 @@ static void __init pmd_basic_tests(unsig
WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
}
+static void __init pmd_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pmd_t *pmdp,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ if (!has_transparent_hugepage())
+ return;
+
+ /* Align the address wrt HPAGE_PMD_SIZE */
+ vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
+
+ pmd = pfn_pmd(pfn, prot);
+ set_pmd_at(mm, vaddr, pmdp, pmd);
+ pmdp_set_wrprotect(mm, vaddr, pmdp);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(pmd_write(pmd));
+
+ pmd = pfn_pmd(pfn, prot);
+ set_pmd_at(mm, vaddr, pmdp, pmd);
+ pmdp_huge_get_and_clear(mm, vaddr, pmdp);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(!pmd_none(pmd));
+
+ pmd = pfn_pmd(pfn, prot);
+ pmd = pmd_wrprotect(pmd);
+ pmd = pmd_mkclean(pmd);
+ set_pmd_at(mm, vaddr, pmdp, pmd);
+ pmd = pmd_mkwrite(pmd);
+ pmd = pmd_mkdirty(pmd);
+ pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
+
+ pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
+ set_pmd_at(mm, vaddr, pmdp, pmd);
+ pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(!pmd_none(pmd));
+
+ pmd = pmd_mkyoung(pmd);
+ set_pmd_at(mm, vaddr, pmdp, pmd);
+ pmdp_test_and_clear_young(vma, vaddr, pmdp);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(pmd_young(pmd));
+}
+
+static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ /*
+ * PMD based THP is a leaf entry.
+ */
+ pmd = pmd_mkhuge(pmd);
+ WARN_ON(!pmd_leaf(pmd));
+}
+
+static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd;
+
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+ return;
+ /*
+ * X86 defined pmd_set_huge() verifies that the given
+ * PMD is not a populated non-leaf entry.
+ */
+ WRITE_ONCE(*pmdp, __pmd(0));
+ WARN_ON(!pmd_set_huge(pmdp, __pfn_to_phys(pfn), prot));
+ WARN_ON(!pmd_clear_huge(pmdp));
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(!pmd_none(pmd));
+}
+
+static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
+{
+ pmd_t pmd = pfn_pmd(pfn, prot);
+
+ WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
+ WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
+}
+
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
{
@@ -100,12 +235,119 @@ static void __init pud_basic_tests(unsig
*/
WARN_ON(!pud_bad(pud_mkhuge(pud)));
}
+
+static void __init pud_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pud_t *pudp,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+ pud_t pud = pfn_pud(pfn, prot);
+
+ if (!has_transparent_hugepage())
+ return;
+
+ /* Align the address wrt HPAGE_PUD_SIZE */
+ vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
+
+ set_pud_at(mm, vaddr, pudp, pud);
+ pudp_set_wrprotect(mm, vaddr, pudp);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(pud_write(pud));
+
+#ifndef __PAGETABLE_PMD_FOLDED
+ pud = pfn_pud(pfn, prot);
+ set_pud_at(mm, vaddr, pudp, pud);
+ pudp_huge_get_and_clear(mm, vaddr, pudp);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!pud_none(pud));
+
+ pud = pfn_pud(pfn, prot);
+ set_pud_at(mm, vaddr, pudp, pud);
+ pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!pud_none(pud));
+#endif /* __PAGETABLE_PMD_FOLDED */
+ pud = pfn_pud(pfn, prot);
+ pud = pud_wrprotect(pud);
+ pud = pud_mkclean(pud);
+ set_pud_at(mm, vaddr, pudp, pud);
+ pud = pud_mkwrite(pud);
+ pud = pud_mkdirty(pud);
+ pudp_set_access_flags(vma, vaddr, pudp, pud, 1);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!(pud_write(pud) && pud_dirty(pud)));
+
+ pud = pud_mkyoung(pud);
+ set_pud_at(mm, vaddr, pudp, pud);
+ pudp_test_and_clear_young(vma, vaddr, pudp);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(pud_young(pud));
+}
+
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
+{
+ pud_t pud = pfn_pud(pfn, prot);
+
+ /*
+ * PUD based THP is a leaf entry.
+ */
+ pud = pud_mkhuge(pud);
+ WARN_ON(!pud_leaf(pud));
+}
+
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+ pud_t pud;
+
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+ return;
+ /*
+ * X86 defined pud_set_huge() verifies that the given
+ * PUD is not a populated non-leaf entry.
+ */
+ WRITE_ONCE(*pudp, __pud(0));
+ WARN_ON(!pud_set_huge(pudp, __pfn_to_phys(pfn), prot));
+ WARN_ON(!pud_clear_huge(pudp));
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!pud_none(pud));
+}
#else /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pud_t *pudp,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+}
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+}
#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
#else /* !CONFIG_TRANSPARENT_HUGEPAGE */
static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pmd_t *pmdp,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+}
+static void __init pud_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma, pud_t *pudp,
+ unsigned long pfn, unsigned long vaddr,
+ pgprot_t prot)
+{
+}
+static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
+{
+}
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
+{
+}
+static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot) { }
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot)
@@ -495,8 +737,56 @@ static void __init hugetlb_basic_tests(u
WARN_ON(!pte_huge(pte_mkhuge(pte)));
#endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
}
+
+static void __init hugetlb_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ pte_t *ptep, unsigned long pfn,
+ unsigned long vaddr, pgprot_t prot)
+{
+ struct page *page = pfn_to_page(pfn);
+ pte_t pte = ptep_get(ptep);
+ unsigned long paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
+
+ pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));
+ set_huge_pte_at(mm, vaddr, ptep, pte);
+ barrier();
+ WARN_ON(!pte_same(pte, huge_ptep_get(ptep)));
+ huge_pte_clear(mm, vaddr, ptep, PMD_SIZE);
+ pte = huge_ptep_get(ptep);
+ WARN_ON(!huge_pte_none(pte));
+
+ pte = mk_huge_pte(page, prot);
+ set_huge_pte_at(mm, vaddr, ptep, pte);
+ barrier();
+ huge_ptep_set_wrprotect(mm, vaddr, ptep);
+ pte = huge_ptep_get(ptep);
+ WARN_ON(huge_pte_write(pte));
+
+ pte = mk_huge_pte(page, prot);
+ set_huge_pte_at(mm, vaddr, ptep, pte);
+ barrier();
+ huge_ptep_get_and_clear(mm, vaddr, ptep);
+ pte = huge_ptep_get(ptep);
+ WARN_ON(!huge_pte_none(pte));
+
+ pte = mk_huge_pte(page, prot);
+ pte = huge_pte_wrprotect(pte);
+ set_huge_pte_at(mm, vaddr, ptep, pte);
+ barrier();
+ pte = huge_pte_mkwrite(pte);
+ pte = huge_pte_mkdirty(pte);
+ huge_ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
+ pte = huge_ptep_get(ptep);
+ WARN_ON(!(huge_pte_write(pte) && huge_pte_dirty(pte)));
+}
#else /* !CONFIG_HUGETLB_PAGE */
static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init hugetlb_advanced_tests(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ pte_t *ptep, unsigned long pfn,
+ unsigned long vaddr, pgprot_t prot)
+{
+}
#endif /* CONFIG_HUGETLB_PAGE */
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -568,6 +858,7 @@ static unsigned long __init get_random_v
static int __init debug_vm_pgtable(void)
{
+ struct vm_area_struct *vma;
struct mm_struct *mm;
pgd_t *pgdp;
p4d_t *p4dp, *saved_p4dp;
@@ -596,6 +887,12 @@ static int __init debug_vm_pgtable(void)
*/
protnone = __P000;
+ vma = vm_area_alloc(mm);
+ if (!vma) {
+ pr_err("vma allocation failed\n");
+ return 1;
+ }
+
/*
* PFN for mapping at PTE level is determined from a standard kernel
* text symbol. But pfns for higher page table levels are derived by
@@ -644,6 +941,20 @@ static int __init debug_vm_pgtable(void)
p4d_clear_tests(mm, p4dp);
pgd_clear_tests(mm, pgdp);
+ pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+ pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
+ pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
+ hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+
+ pmd_leaf_tests(pmd_aligned, prot);
+ pud_leaf_tests(pud_aligned, prot);
+
+ pmd_huge_tests(pmdp, pmd_aligned, prot);
+ pud_huge_tests(pudp, pud_aligned, prot);
+
+ pte_savedwrite_tests(pte_aligned, prot);
+ pmd_savedwrite_tests(pmd_aligned, prot);
+
pte_unmap_unlock(ptep, ptl);
pmd_populate_tests(mm, pmdp, saved_ptep);
@@ -678,6 +989,7 @@ static int __init debug_vm_pgtable(void)
pmd_free(mm, saved_pmdp);
pte_free(mm, saved_ptep);
+ vm_area_free(vma);
mm_dec_nr_puds(mm);
mm_dec_nr_pmds(mm);
mm_dec_nr_ptes(mm);
_
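Unlike the basic tests, the advanced tests above operate on entries
installed through a pointer (as set_pte_at()/set_pmd_at() do) and then
modified in place by ptep_/pmdp_ style helpers. A minimal userspace sketch
of that pattern; mock_pte, MOCK_WRITE and the mock_* helpers are invented
stand-ins, not the kernel API:
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
typedef uint64_t mock_pte;
#define MOCK_WRITE (1ULL << 0)        /* stand-in for the writable bit */
static void mock_set_pte_at(mock_pte *ptep, mock_pte pte) { *ptep = pte; }
/* Analogue of ptep_set_wrprotect(): clears the write bit in place. */
static void mock_ptep_set_wrprotect(mock_pte *ptep) { *ptep &= ~MOCK_WRITE; }
/* Analogue of ptep_get_and_clear(): returns the old entry, zeroes the slot. */
static mock_pte mock_ptep_get_and_clear(mock_pte *ptep)
{
	mock_pte old = *ptep;
	*ptep = 0;
	return old;
}
int main(void)
{
	mock_pte slot;
	/* Mirrors: set_pte_at(); ptep_set_wrprotect(); WARN_ON(pte_write()). */
	mock_set_pte_at(&slot, MOCK_WRITE);
	mock_ptep_set_wrprotect(&slot);
	assert(!(slot & MOCK_WRITE));
	/* Mirrors: set_pte_at(); ptep_get_and_clear(); WARN_ON(!pte_none()). */
	mock_set_pte_at(&slot, MOCK_WRITE);
	mock_ptep_get_and_clear(&slot);
	assert(slot == 0);
	printf("mock in-place checks passed\n");
	return 0;
}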
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (14 preceding siblings ...)
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
2020-07-06 23:28 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch " Andrew Morton
` (216 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
ziy
The patch titled
Subject: mm/debug_vm_pgtable: add debug prints for individual tests
has been added to the -mm tree. Its filename is
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/debug_vm_pgtable: add debug prints for individual tests
This adds debug print information that lists all tests being executed on a
given platform. With dynamic debug enabled, the following information will
be printed during boot. For compactness, both the time stamp and the
prefix (i.e. debug_vm_pgtable) have been dropped from this sample output.
[debug_vm_pgtable ]: Validating architecture page table helpers
[pte_basic_tests ]: Validating PTE basic
[pmd_basic_tests ]: Validating PMD basic
[p4d_basic_tests ]: Validating P4D basic
[pgd_basic_tests ]: Validating PGD basic
[pte_clear_tests ]: Validating PTE clear
[pmd_clear_tests ]: Validating PMD clear
[pte_advanced_tests ]: Validating PTE advanced
[pmd_advanced_tests ]: Validating PMD advanced
[hugetlb_advanced_tests]: Validating HugeTLB advanced
[pmd_leaf_tests ]: Validating PMD leaf
[pmd_huge_tests ]: Validating PMD huge
[pte_savedwrite_tests ]: Validating PTE saved write
[pmd_savedwrite_tests ]: Validating PMD saved write
[pmd_populate_tests ]: Validating PMD populate
[pte_special_tests ]: Validating PTE special
[pte_protnone_tests ]: Validating PTE protnone
[pmd_protnone_tests ]: Validating PMD protnone
[pte_devmap_tests ]: Validating PTE devmap
[pmd_devmap_tests ]: Validating PMD devmap
[pte_swap_tests ]: Validating PTE swap
[swap_migration_tests ]: Validating swap migration
[hugetlb_basic_tests ]: Validating HugeTLB basic
[pmd_thp_tests ]: Validating PMD based THP
Link: http://lkml.kernel.org/r/1593996516-7186-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug_vm_pgtable.c | 46 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 45 insertions(+), 1 deletion(-)
--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-add-debug-prints-for-individual-tests
+++ a/mm/debug_vm_pgtable.c
@@ -8,7 +8,7 @@
*
* Author: Anshuman Khandual <anshuman.khandual@arm.com>
*/
-#define pr_fmt(fmt) "debug_vm_pgtable: %s: " fmt, __func__
+#define pr_fmt(fmt) "debug_vm_pgtable: [%-25s]: " fmt, __func__
#include <linux/gfp.h>
#include <linux/highmem.h>
@@ -48,6 +48,7 @@ static void __init pte_basic_tests(unsig
{
pte_t pte = pfn_pte(pfn, prot);
+ pr_debug("Validating PTE basic\n");
WARN_ON(!pte_same(pte, pte));
WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
@@ -64,6 +65,7 @@ static void __init pte_advanced_tests(st
{
pte_t pte = pfn_pte(pfn, prot);
+ pr_debug("Validating PTE advanced\n");
pte = pfn_pte(pfn, prot);
set_pte_at(mm, vaddr, ptep, pte);
ptep_set_wrprotect(mm, vaddr, ptep);
@@ -103,6 +105,7 @@ static void __init pte_savedwrite_tests(
{
pte_t pte = pfn_pte(pfn, prot);
+ pr_debug("Validating PTE saved write\n");
WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
}
@@ -114,6 +117,7 @@ static void __init pmd_basic_tests(unsig
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PMD basic\n");
WARN_ON(!pmd_same(pmd, pmd));
WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
@@ -138,6 +142,7 @@ static void __init pmd_advanced_tests(st
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PMD advanced\n");
/* Align the address wrt HPAGE_PMD_SIZE */
vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
@@ -180,6 +185,7 @@ static void __init pmd_leaf_tests(unsign
{
pmd_t pmd = pfn_pmd(pfn, prot);
+ pr_debug("Validating PMD leaf\n");
/*
* PMD based THP is a leaf entry.
*/
@@ -193,6 +199,8 @@ static void __init pmd_huge_tests(pmd_t
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
return;
+
+ pr_debug("Validating PMD huge\n");
/*
* X86 defined pmd_set_huge() verifies that the given
* PMD is not a populated non-leaf entry.
@@ -208,6 +216,7 @@ static void __init pmd_savedwrite_tests(
{
pmd_t pmd = pfn_pmd(pfn, prot);
+ pr_debug("Validating PMD saved write\n");
WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
}
@@ -220,6 +229,7 @@ static void __init pud_basic_tests(unsig
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PUD basic\n");
WARN_ON(!pud_same(pud, pud));
WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
@@ -246,6 +256,7 @@ static void __init pud_advanced_tests(st
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PUD advanced\n");
/* Align the address wrt HPAGE_PUD_SIZE */
vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
@@ -288,6 +299,7 @@ static void __init pud_leaf_tests(unsign
{
pud_t pud = pfn_pud(pfn, prot);
+ pr_debug("Validating PUD leaf\n");
/*
* PUD based THP is a leaf entry.
*/
@@ -301,6 +313,8 @@ static void __init pud_huge_tests(pud_t
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
return;
+
+ pr_debug("Validating PUD huge\n");
/*
* X86 defined pud_set_huge() verifies that the given
* PUD is not a populated non-leaf entry.
@@ -354,6 +368,7 @@ static void __init p4d_basic_tests(unsig
{
p4d_t p4d;
+ pr_debug("Validating P4D basic\n");
memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
WARN_ON(!p4d_same(p4d, p4d));
}
@@ -362,6 +377,7 @@ static void __init pgd_basic_tests(unsig
{
pgd_t pgd;
+ pr_debug("Validating PGD basic\n");
memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
WARN_ON(!pgd_same(pgd, pgd));
}
@@ -374,6 +390,7 @@ static void __init pud_clear_tests(struc
if (mm_pmd_folded(mm))
return;
+ pr_debug("Validating PUD clear\n");
pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
WRITE_ONCE(*pudp, pud);
pud_clear(pudp);
@@ -388,6 +405,8 @@ static void __init pud_populate_tests(st
if (mm_pmd_folded(mm))
return;
+
+ pr_debug("Validating PUD populate\n");
/*
* This entry points to next level page table page.
* Hence this must not qualify as pud_bad().
@@ -414,6 +433,7 @@ static void __init p4d_clear_tests(struc
if (mm_pud_folded(mm))
return;
+ pr_debug("Validating P4D clear\n");
p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
WRITE_ONCE(*p4dp, p4d);
p4d_clear(p4dp);
@@ -429,6 +449,7 @@ static void __init p4d_populate_tests(st
if (mm_pud_folded(mm))
return;
+ pr_debug("Validating P4D populate\n");
/*
* This entry points to next level page table page.
* Hence this must not qualify as p4d_bad().
@@ -447,6 +468,7 @@ static void __init pgd_clear_tests(struc
if (mm_p4d_folded(mm))
return;
+ pr_debug("Validating PGD clear\n");
pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
WRITE_ONCE(*pgdp, pgd);
pgd_clear(pgdp);
@@ -462,6 +484,7 @@ static void __init pgd_populate_tests(st
if (mm_p4d_folded(mm))
return;
+ pr_debug("Validating PGD populate\n");
/*
* This entry points to next level page table page.
* Hence this must not qualify as pgd_bad().
@@ -490,6 +513,7 @@ static void __init pte_clear_tests(struc
{
pte_t pte = ptep_get(ptep);
+ pr_debug("Validating PTE clear\n");
pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
set_pte_at(mm, vaddr, ptep, pte);
barrier();
@@ -502,6 +526,7 @@ static void __init pmd_clear_tests(struc
{
pmd_t pmd = READ_ONCE(*pmdp);
+ pr_debug("Validating PMD clear\n");
pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
WRITE_ONCE(*pmdp, pmd);
pmd_clear(pmdp);
@@ -514,6 +539,7 @@ static void __init pmd_populate_tests(st
{
pmd_t pmd;
+ pr_debug("Validating PMD populate\n");
/*
* This entry points to next level page table page.
* Hence this must not qualify as pmd_bad().
@@ -531,6 +557,7 @@ static void __init pte_special_tests(uns
if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL))
return;
+ pr_debug("Validating PTE special\n");
WARN_ON(!pte_special(pte_mkspecial(pte)));
}
@@ -541,6 +568,7 @@ static void __init pte_protnone_tests(un
if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
return;
+ pr_debug("Validating PTE protnone\n");
WARN_ON(!pte_protnone(pte));
WARN_ON(!pte_present(pte));
}
@@ -553,6 +581,7 @@ static void __init pmd_protnone_tests(un
if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
return;
+ pr_debug("Validating PMD protnone\n");
WARN_ON(!pmd_protnone(pmd));
WARN_ON(!pmd_present(pmd));
}
@@ -565,6 +594,7 @@ static void __init pte_devmap_tests(unsi
{
pte_t pte = pfn_pte(pfn, prot);
+ pr_debug("Validating PTE devmap\n");
WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
}
@@ -573,6 +603,7 @@ static void __init pmd_devmap_tests(unsi
{
pmd_t pmd = pfn_pmd(pfn, prot);
+ pr_debug("Validating PMD devmap\n");
WARN_ON(!pmd_devmap(pmd_mkdevmap(pmd)));
}
@@ -581,6 +612,7 @@ static void __init pud_devmap_tests(unsi
{
pud_t pud = pfn_pud(pfn, prot);
+ pr_debug("Validating PUD devmap\n");
WARN_ON(!pud_devmap(pud_mkdevmap(pud)));
}
#else /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
@@ -603,6 +635,7 @@ static void __init pte_soft_dirty_tests(
if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
return;
+ pr_debug("Validating PTE soft dirty\n");
WARN_ON(!pte_soft_dirty(pte_mksoft_dirty(pte)));
WARN_ON(pte_soft_dirty(pte_clear_soft_dirty(pte)));
}
@@ -614,6 +647,7 @@ static void __init pte_swap_soft_dirty_t
if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
return;
+ pr_debug("Validating PTE swap soft dirty\n");
WARN_ON(!pte_swp_soft_dirty(pte_swp_mksoft_dirty(pte)));
WARN_ON(pte_swp_soft_dirty(pte_swp_clear_soft_dirty(pte)));
}
@@ -626,6 +660,7 @@ static void __init pmd_soft_dirty_tests(
if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
return;
+ pr_debug("Validating PMD soft dirty\n");
WARN_ON(!pmd_soft_dirty(pmd_mksoft_dirty(pmd)));
WARN_ON(pmd_soft_dirty(pmd_clear_soft_dirty(pmd)));
}
@@ -638,6 +673,7 @@ static void __init pmd_swap_soft_dirty_t
!IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
return;
+ pr_debug("Validating PMD swap soft dirty\n");
WARN_ON(!pmd_swp_soft_dirty(pmd_swp_mksoft_dirty(pmd)));
WARN_ON(pmd_swp_soft_dirty(pmd_swp_clear_soft_dirty(pmd)));
}
@@ -653,6 +689,7 @@ static void __init pte_swap_tests(unsign
swp_entry_t swp;
pte_t pte;
+ pr_debug("Validating PTE swap\n");
pte = pfn_pte(pfn, prot);
swp = __pte_to_swp_entry(pte);
pte = __swp_entry_to_pte(swp);
@@ -665,6 +702,7 @@ static void __init pmd_swap_tests(unsign
swp_entry_t swp;
pmd_t pmd;
+ pr_debug("Validating PMD swap\n");
pmd = pfn_pmd(pfn, prot);
swp = __pmd_to_swp_entry(pmd);
pmd = __swp_entry_to_pmd(swp);
@@ -681,6 +719,8 @@ static void __init swap_migration_tests(
if (!IS_ENABLED(CONFIG_MIGRATION))
return;
+
+ pr_debug("Validating swap migration\n");
/*
* swap_migration_tests() requires a dedicated page as it needs to
* be locked before creating a migration entry from it. Locking the
@@ -720,6 +760,7 @@ static void __init hugetlb_basic_tests(u
struct page *page;
pte_t pte;
+ pr_debug("Validating HugeTLB basic\n");
/*
* Accessing the page associated with the pfn is safe here,
* as it was previously derived from a real kernel symbol.
@@ -747,6 +788,7 @@ static void __init hugetlb_advanced_test
pte_t pte = ptep_get(ptep);
unsigned long paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
+ pr_debug("Validating HugeTLB advanced\n");
pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));
set_huge_pte_at(mm, vaddr, ptep, pte);
barrier();
@@ -797,6 +839,7 @@ static void __init pmd_thp_tests(unsigne
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PMD based THP\n");
/*
* pmd_trans_huge() and pmd_present() must return positive after
* MMU invalidation with pmd_mkinvalid(). This behavior is an
@@ -825,6 +868,7 @@ static void __init pud_thp_tests(unsigne
if (!has_transparent_hugepage())
return;
+ pr_debug("Validating PUD based THP\n");
pud = pfn_pud(pfn, prot);
WARN_ON(!pud_trans_huge(pud_mkhuge(pud)));
_
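These per-test prints use pr_debug(), so they are silent by default.
Assuming CONFIG_DYNAMIC_DEBUG=y, one way to see them is to enable the call
sites from the kernel command line before the test runs during boot, e.g.:
	dyndbg="file debug_vm_pgtable.c +p"
See Documentation/admin-guide/dynamic-debug-howto.rst for the query syntax.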
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (15 preceding siblings ...)
2020-07-06 23:28 ` + mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch " Andrew Morton
@ 2020-07-06 23:28 ` Andrew Morton
2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
` (215 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:28 UTC (permalink / raw)
To: anshuman.khandual, benh, borntraeger, bp, catalin.marinas,
corbet, gor, heiko.carstens, hpa, kirill, mingo, mm-commits, mpe,
palmer, paul.walmsley, paulus, rppt, rppt, tglx, vgupta, will,
ziy
The patch titled
Subject: Documentation/mm: Add descriptions for arch page table helpers
has been added to the -mm tree. Its filename is
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Documentation/mm: Add descriptions for arch page table helpers
This adds a description file for all arch page table helpers which is in
sync with the semantics being tested via CONFIG_DEBUG_VM_PGTABLE. Any
future change, either to these descriptions or to the debug tests, should
keep the two in sync.
Link: http://lkml.kernel.org/r/1593996516-7186-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Mike Rapoport <rppt@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/vm/arch_pgtable_helpers.rst | 258 ++++++++++++++++++++
mm/debug_vm_pgtable.c | 6
2 files changed, 264 insertions(+)
--- /dev/null
+++ a/Documentation/vm/arch_pgtable_helpers.rst
@@ -0,0 +1,258 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _arch_page_table_helpers:
+
+===============================
+Architecture Page Table Helpers
+===============================
+
+Generic MM expects architectures (with MMU) to provide helpers to create, access
+and modify page table entries at various levels for different memory functions.
+These page table helpers need to conform to common semantics across platforms.
+The following tables describe the expected semantics, which can also be tested
+during boot via the CONFIG_DEBUG_VM_PGTABLE option. All future changes here or in
+the debug test need to stay in sync.
+
+======================
+PTE Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pte_same | Tests whether both PTE entries are the same |
+--------------------------------------------------------------------------------
+| pte_bad | Tests a non-table mapped PTE |
+--------------------------------------------------------------------------------
+| pte_present | Tests a valid mapped PTE |
+--------------------------------------------------------------------------------
+| pte_young | Tests a young PTE |
+--------------------------------------------------------------------------------
+| pte_dirty | Tests a dirty PTE |
+--------------------------------------------------------------------------------
+| pte_write | Tests a writable PTE |
+--------------------------------------------------------------------------------
+| pte_special | Tests a special PTE |
+--------------------------------------------------------------------------------
+| pte_protnone | Tests a PROT_NONE PTE |
+--------------------------------------------------------------------------------
+| pte_devmap | Tests a ZONE_DEVICE mapped PTE |
+--------------------------------------------------------------------------------
+| pte_soft_dirty | Tests a soft dirty PTE |
+--------------------------------------------------------------------------------
+| pte_swp_soft_dirty | Tests a soft dirty swapped PTE |
+--------------------------------------------------------------------------------
+| pte_mkyoung | Creates a young PTE |
+--------------------------------------------------------------------------------
+| pte_mkold | Creates an old PTE |
+--------------------------------------------------------------------------------
+| pte_mkdirty | Creates a dirty PTE |
+--------------------------------------------------------------------------------
+| pte_mkclean | Creates a clean PTE |
+--------------------------------------------------------------------------------
+| pte_mkwrite | Creates a writable PTE |
+--------------------------------------------------------------------------------
+| pte_mkwrprotect | Creates a write protected PTE |
+--------------------------------------------------------------------------------
+| pte_mkspecial | Creates a special PTE |
+--------------------------------------------------------------------------------
+| pte_mkdevmap | Creates a ZONE_DEVICE mapped PTE |
+--------------------------------------------------------------------------------
+| pte_mksoft_dirty | Creates a soft dirty PTE |
+--------------------------------------------------------------------------------
+| pte_clear_soft_dirty | Clears a soft dirty PTE |
+--------------------------------------------------------------------------------
+| pte_swp_mksoft_dirty | Creates a soft dirty swapped PTE |
+--------------------------------------------------------------------------------
+| pte_swp_clear_soft_dirty | Clears a soft dirty swapped PTE |
+--------------------------------------------------------------------------------
+| pte_mknotpresent | Invalidates a mapped PTE |
+--------------------------------------------------------------------------------
+| ptep_get_and_clear | Clears a PTE |
+--------------------------------------------------------------------------------
+| ptep_get_and_clear_full | Clears a PTE |
+--------------------------------------------------------------------------------
+| ptep_test_and_clear_young | Clears young from a PTE |
+--------------------------------------------------------------------------------
+| ptep_set_wrprotect | Converts into a write protected PTE |
+--------------------------------------------------------------------------------
+| ptep_set_access_flags | Converts into a more permissive PTE |
+--------------------------------------------------------------------------------
+
+======================
+PMD Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pmd_same | Tests whether both PMD entries are the same |
+--------------------------------------------------------------------------------
+| pmd_bad | Tests a non-table mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_leaf | Tests a leaf mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_huge | Tests a HugeTLB mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_trans_huge | Tests a Transparent Huge Page (THP) at PMD |
+--------------------------------------------------------------------------------
+| pmd_present | Tests a valid mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_young | Tests a young PMD |
+--------------------------------------------------------------------------------
+| pmd_dirty | Tests a dirty PMD |
+--------------------------------------------------------------------------------
+| pmd_write | Tests a writable PMD |
+--------------------------------------------------------------------------------
+| pmd_special | Tests a special PMD |
+--------------------------------------------------------------------------------
+| pmd_protnone | Tests a PROT_NONE PMD |
+--------------------------------------------------------------------------------
+| pmd_devmap | Tests a ZONE_DEVICE mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_soft_dirty | Tests a soft dirty PMD |
+--------------------------------------------------------------------------------
+| pmd_swp_soft_dirty | Tests a soft dirty swapped PMD |
+--------------------------------------------------------------------------------
+| pmd_mkyoung | Creates a young PMD |
+--------------------------------------------------------------------------------
+| pmd_mkold | Creates an old PMD |
+--------------------------------------------------------------------------------
+| pmd_mkdirty | Creates a dirty PMD |
+--------------------------------------------------------------------------------
+| pmd_mkclean | Creates a clean PMD |
+--------------------------------------------------------------------------------
+| pmd_mkwrite | Creates a writable PMD |
+--------------------------------------------------------------------------------
+| pmd_mkwrprotect | Creates a write protected PMD |
+--------------------------------------------------------------------------------
+| pmd_mkspecial | Creates a special PMD |
+--------------------------------------------------------------------------------
+| pmd_mkdevmap | Creates a ZONE_DEVICE mapped PMD |
+--------------------------------------------------------------------------------
+| pmd_mksoft_dirty | Creates a soft dirty PMD |
+--------------------------------------------------------------------------------
+| pmd_clear_soft_dirty | Clears a soft dirty PMD |
+--------------------------------------------------------------------------------
+| pmd_swp_mksoft_dirty | Creates a soft dirty swapped PMD |
+--------------------------------------------------------------------------------
+| pmd_swp_clear_soft_dirty | Clears a soft dirty swapped PMD |
+--------------------------------------------------------------------------------
+| pmd_mkinvalid | Invalidates a mapped PMD [1] |
+--------------------------------------------------------------------------------
+| pmd_set_huge | Creates a PMD huge mapping |
+--------------------------------------------------------------------------------
+| pmd_clear_huge | Clears a PMD huge mapping |
+--------------------------------------------------------------------------------
+| pmdp_get_and_clear | Clears a PMD |
+--------------------------------------------------------------------------------
+| pmdp_get_and_clear_full | Clears a PMD |
+--------------------------------------------------------------------------------
+| pmdp_test_and_clear_young | Clears young from a PMD |
+--------------------------------------------------------------------------------
+| pmdp_set_wrprotect | Converts into a write protected PMD |
+--------------------------------------------------------------------------------
+| pmdp_set_access_flags | Converts into a more permissive PMD |
+--------------------------------------------------------------------------------
+
+======================
+PUD Page Table Helpers
+======================
+
+--------------------------------------------------------------------------------
+| pud_same | Tests whether both PUD entries are the same |
+--------------------------------------------------------------------------------
+| pud_bad | Tests a non-table mapped PUD |
+--------------------------------------------------------------------------------
+| pud_leaf | Tests a leaf mapped PUD |
+--------------------------------------------------------------------------------
+| pud_huge | Tests a HugeTLB mapped PUD |
+--------------------------------------------------------------------------------
+| pud_trans_huge | Tests a Transparent Huge Page (THP) at PUD |
+--------------------------------------------------------------------------------
+| pud_present | Tests a valid mapped PUD |
+--------------------------------------------------------------------------------
+| pud_young | Tests a young PUD |
+--------------------------------------------------------------------------------
+| pud_dirty | Tests a dirty PUD |
+--------------------------------------------------------------------------------
+| pud_write | Tests a writable PUD |
+--------------------------------------------------------------------------------
+| pud_devmap | Tests a ZONE_DEVICE mapped PUD |
+--------------------------------------------------------------------------------
+| pud_mkyoung | Creates a young PUD |
+--------------------------------------------------------------------------------
+| pud_mkold | Creates an old PUD |
+--------------------------------------------------------------------------------
+| pud_mkdirty | Creates a dirty PUD |
+--------------------------------------------------------------------------------
+| pud_mkclean | Creates a clean PUD |
+--------------------------------------------------------------------------------
+| pud_mkwrite | Creates a writable PUD |
+--------------------------------------------------------------------------------
+| pud_wrprotect | Creates a write protected PUD |
+--------------------------------------------------------------------------------
+| pud_mkdevmap | Creates a ZONE_DEVICE mapped PUD |
+--------------------------------------------------------------------------------
+| pud_mkinvalid | Invalidates a mapped PUD [1] |
+--------------------------------------------------------------------------------
+| pud_set_huge | Creates a PUD huge mapping |
+--------------------------------------------------------------------------------
+| pud_clear_huge | Clears a PUD huge mapping |
+--------------------------------------------------------------------------------
+| pudp_get_and_clear | Clears a PUD |
+--------------------------------------------------------------------------------
+| pudp_get_and_clear_full | Clears a PUD |
+--------------------------------------------------------------------------------
+| pudp_test_and_clear_young | Clears young from a PUD |
+--------------------------------------------------------------------------------
+| pudp_set_wrprotect | Converts into a write protected PUD |
+--------------------------------------------------------------------------------
+| pudp_set_access_flags | Converts into a more permissive PUD |
+--------------------------------------------------------------------------------
+
+==========================
+HugeTLB Page Table Helpers
+==========================
+
+--------------------------------------------------------------------------------
+| pte_huge | Tests a HugeTLB |
+--------------------------------------------------------------------------------
+| pte_mkhuge | Creates a HugeTLB |
+--------------------------------------------------------------------------------
+| huge_pte_dirty | Tests a dirty HugeTLB |
+--------------------------------------------------------------------------------
+| huge_pte_write | Tests a writable HugeTLB |
+--------------------------------------------------------------------------------
+| huge_pte_mkdirty | Creates a dirty HugeTLB |
+--------------------------------------------------------------------------------
+| huge_pte_mkwrite | Creates a writable HugeTLB |
+--------------------------------------------------------------------------------
+| huge_pte_wrprotect | Creates a write protected HugeTLB |
+--------------------------------------------------------------------------------
+| huge_ptep_get_and_clear | Clears a HugeTLB |
+--------------------------------------------------------------------------------
+| huge_ptep_set_wrprotect | Converts into a write protected HugeTLB |
+--------------------------------------------------------------------------------
+| huge_ptep_set_access_flags | Converts into a more permissive HugeTLB |
+--------------------------------------------------------------------------------
+
+========================
+SWAP Page Table Helpers
+========================
+
+--------------------------------------------------------------------------------
+| __pte_to_swp_entry | Creates a swapped entry (arch) from a mapped PTE |
+--------------------------------------------------------------------------------
+| __swp_entry_to_pte | Creates a mapped PTE from a swapped entry (arch) |
+--------------------------------------------------------------------------------
+| __pmd_to_swp_entry | Creates a swapped entry (arch) from a mapped PMD |
+--------------------------------------------------------------------------------
+| __swp_entry_to_pmd | Creates a mapped PMD from a swapped entry (arch) |
+--------------------------------------------------------------------------------
+| is_migration_entry | Tests a migration (read or write) swapped entry |
+--------------------------------------------------------------------------------
+| is_write_migration_entry | Tests a write migration swapped entry |
+--------------------------------------------------------------------------------
+| make_migration_entry_read | Converts into read migration swapped entry |
+--------------------------------------------------------------------------------
+| make_migration_entry | Creates a migration swapped entry (read or write)|
+--------------------------------------------------------------------------------
+
+[1] https://lore.kernel.org/linux-mm/20181017020930.GN30832@redhat.com/
--- a/mm/debug_vm_pgtable.c~documentation-mm-add-descriptions-for-arch-page-table-helpers
+++ a/mm/debug_vm_pgtable.c
@@ -31,6 +31,12 @@
#include <asm/pgalloc.h>
#include <asm/tlbflush.h>
+/*
+ * Please refer to Documentation/vm/arch_pgtable_helpers.rst for the semantics
+ * expectations that are being validated here. All future changes in here
+ * or the documentation need to be in sync.
+ */
+
#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
/*
_
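A minimal sketch (illustrative only, not part of the patch above) of how the
create/clear helper pairs from the tables can be exercised, in the spirit of
mm/debug_vm_pgtable.c. It assumes a kernel build context with
CONFIG_TRANSPARENT_HUGEPAGE and the v5.8-era single-argument pmd_mkwrite();
pmd_helper_sketch() is a hypothetical name, not an existing function.
#include <linux/mm.h>
static void __init pmd_helper_sketch(unsigned long pfn, pgprot_t prot)
{
	pmd_t pmd = pfn_pmd(pfn, prot);
	/* pmd_mkwrite creates a writable PMD, pmd_wrprotect undoes it */
	WARN_ON(!pmd_write(pmd_mkwrite(pmd_wrprotect(pmd))));
	WARN_ON(pmd_write(pmd_wrprotect(pmd_mkwrite(pmd))));
	/* pmd_mkdirty creates a dirty PMD, pmd_mkclean a clean one */
	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
	WARN_ON(pmd_dirty(pmd_mkclean(pmd_mkdirty(pmd))));
	/* pmd_mkyoung creates a young PMD, pmd_mkold an old one */
	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
	WARN_ON(pmd_young(pmd_mkold(pmd_mkyoung(pmd))));
}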
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (16 preceding siblings ...)
2020-07-06 23:28 ` + documentation-mm-add-descriptions-for-arch-page-table-helpers.patch " Andrew Morton
@ 2020-07-06 23:33 ` Andrew Morton
2020-07-06 23:33 ` Andrew Morton
2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
` (214 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:33 UTC (permalink / raw)
To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao
The patch titled
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been removed from the -mm tree. Its filename was
slub-drop-lockdep_assert_held-from-put_map.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
There is no point in using lockdep_assert_held() on a lock that is about to
be unlocked. It works only with lockdep and lockdep will complain if
spin_unlock() is used on a lock that has not been locked.
Remove superfluous lockdep_assert_held().
Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slub.c | 2 --
1 file changed, 2 deletions(-)
--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
static void put_map(unsigned long *map) __releases(&object_map_lock)
{
VM_BUG_ON(map != object_map);
- lockdep_assert_held(&object_map_lock);
^ permalink raw reply [flat|nested] 247+ messages in thread
* [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from -mm tree
2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
@ 2020-07-06 23:33 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:33 UTC (permalink / raw)
To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao
The patch titled
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been removed from the -mm tree. Its filename was
slub-drop-lockdep_assert_held-from-put_map.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
There is no point in using lockdep_assert_held() on a lock that is about to
be unlocked. It works only with lockdep and lockdep will complain if
spin_unlock() is used on a lock that has not been locked.
Remove superfluous lockdep_assert_held().
Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slub.c | 2 --
1 file changed, 2 deletions(-)
--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
static void put_map(unsigned long *map) __releases(&object_map_lock)
{
VM_BUG_ON(map != object_map);
- lockdep_assert_held(&object_map_lock);
-
spin_unlock(&object_map_lock);
}
_
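A minimal sketch (illustrative only, not kernel code) of the pattern the
changelog describes: with lockdep enabled, spin_unlock() on a lock the caller
does not hold already triggers a splat, so an assertion immediately before
the unlock adds nothing. The example_* names are made up.
#include <linux/spinlock.h>
static DEFINE_SPINLOCK(example_lock);
static void example_acquire(void) __acquires(&example_lock)
{
	spin_lock(&example_lock);
}
static void example_release(void) __releases(&example_lock)
{
	/*
	 * A lockdep_assert_held(&example_lock) here would be redundant:
	 * with CONFIG_PROVE_LOCKING the unlock below already complains if
	 * the lock is not held, and without lockdep the assert is a no-op.
	 */
	spin_unlock(&example_lock);
}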
Patches currently in -mm which might be from bigeasy@linutronix.de are
^ permalink raw reply [flat|nested] 247+ messages in thread
* + slub-drop-lockdep_assert_held-from-put_map.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (17 preceding siblings ...)
2020-07-06 23:33 ` [merged] slub-drop-lockdep_assert_held-from-put_map.patch removed from " Andrew Morton
@ 2020-07-06 23:34 ` Andrew Morton
2020-07-06 23:34 ` Andrew Morton
2020-07-06 23:34 ` [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from " Andrew Morton
` (213 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:34 UTC (permalink / raw)
To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao
The patch titled
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been added to the -mm tree. Its filename is
slub-drop-lockdep_assert_held-from-put_map.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
There is no point in using lockdep_assert_held() on a lock that is about to
be unlocked. It works only with lockdep and lockdep will complain if
spin_unlock() is used on a lock that has not been locked.
Remove superfluous lockdep_assert_held().
Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slub.c | 2 --
1 file changed, 2 deletions(-)
--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
static void put_map(unsigned long *map) __releases(&object_map_lock)
{
VM_BUG_ON(map != object_map);
- lockdep_assert_held(&object_map_lock);
^ permalink raw reply [flat|nested] 247+ messages in thread
* + slub-drop-lockdep_assert_held-from-put_map.patch added to -mm tree
2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
@ 2020-07-06 23:34 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:34 UTC (permalink / raw)
To: bigeasy, cl, iamjoonsoo.kim, mm-commits, penberg, rientjes, tglx, yuzhao
The patch titled
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
has been added to the -mm tree. Its filename is
slub-drop-lockdep_assert_held-from-put_map.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/slub-drop-lockdep_assert_held-from-put_map.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm/slub.c: drop lockdep_assert_held() from put_map()
There is no point in using lockdep_assert_held() on a lock that is about to
be unlocked. It works only with lockdep and lockdep will complain if
spin_unlock() is used on a lock that has not been locked.
Remove superfluous lockdep_assert_held().
Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slub.c | 2 --
1 file changed, 2 deletions(-)
--- a/mm/slub.c~slub-drop-lockdep_assert_held-from-put_map
+++ a/mm/slub.c
@@ -473,8 +473,6 @@ static unsigned long *get_map(struct kme
static void put_map(unsigned long *map) __releases(&object_map_lock)
{
VM_BUG_ON(map != object_map);
- lockdep_assert_held(&object_map_lock);
-
spin_unlock(&object_map_lock);
}
_
Patches currently in -mm which might be from bigeasy@linutronix.de are
slub-drop-lockdep_assert_held-from-put_map.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (18 preceding siblings ...)
2020-07-06 23:34 ` + slub-drop-lockdep_assert_held-from-put_map.patch added to " Andrew Morton
@ 2020-07-06 23:34 ` Andrew Morton
2020-07-06 23:36 ` + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to " Andrew Morton
` (212 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:34 UTC (permalink / raw)
To: corbet, koct9i, mm-commits
The patch titled
Subject: mailmap: add entry for obsolete email address
has been removed from the -mm tree. Its filename was
mailmap-add-entry-for-obsolete-email-address.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Konstantin Khlebnikov <koct9i@gmail.com>
Subject: mailmap: add entry for obsolete email address
Map old corporate email address @yandex-team.ru to stable private address.
Link: http://lkml.kernel.org/r/159360469186.24918.10108157093572183535.stgit@zurg
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
--- a/.mailmap~mailmap-add-entry-for-obsolete-email-address
+++ a/.mailmap
@@ -146,6 +146,7 @@ Kamil Konieczny <k.konieczny@samsung.com
Kay Sievers <kay.sievers@vrfy.org>
Kenneth W Chen <kenneth.w.chen@intel.com>
Konstantin Khlebnikov <koct9i@gmail.com> <k.khlebnikov@samsung.com>
+Konstantin Khlebnikov <koct9i@gmail.com> <khlebnikov@yandex-team.ru>
Koushik <raghavendra.koushik@neterion.com>
Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski@samsung.com>
Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski.k@gmail.com>
_
Patches currently in -mm which might be from koct9i@gmail.com are
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (19 preceding siblings ...)
2020-07-06 23:34 ` [merged] mailmap-add-entry-for-obsolete-email-address.patch removed from " Andrew Morton
@ 2020-07-06 23:36 ` Andrew Morton
2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch " Andrew Morton
` (211 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:36 UTC (permalink / raw)
To: akpm, mm-commits, thunder.leizhen
The patch titled
Subject: mm/mmap: optimize a branch judgment in ksys_mmap_pgoff()
has been added to the -mm tree. Its filename is
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/mmap: optimize a branch judgment in ksys_mmap_pgoff()
Look at the pseudo code below. The check "!is_file_hugepages(file)" at 3)
duplicates the one at 1), so an "else if" can avoid it. And the assignment
"retval = -EINVAL" at 2) is only needed by branch 3), because "retval" is
overwritten at 4) anyway.
No functional change, but it reduces the code size and reads more clearly.
Before:
text data bss dec hex filename
28733 1590 1 30324 7674 mm/mmap.o
After:
text data bss dec hex filename
28701 1590 1 30292 7654 mm/mmap.o
====pseudo code====:
if (!(flags & MAP_ANONYMOUS)) {
...
1) if (is_file_hugepages(file))
len = ALIGN(len, huge_page_size(hstate_file(file)));
2) retval = -EINVAL;
3) if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
goto out_fput;
} else if (flags & MAP_HUGETLB) {
...
}
...
4) retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
out_fput:
...
return retval;
Link: http://lkml.kernel.org/r/20200705080112.1405-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mmap.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/mmap.c~mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff
+++ a/mm/mmap.c
@@ -1562,11 +1562,12 @@ unsigned long ksys_mmap_pgoff(unsigned l
file = fget(fd);
if (!file)
return -EBADF;
- if (is_file_hugepages(file))
+ if (is_file_hugepages(file)) {
len = ALIGN(len, huge_page_size(hstate_file(file)));
- retval = -EINVAL;
- if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
+ } else if (unlikely(flags & MAP_HUGETLB)) {
+ retval = -EINVAL;
goto out_fput;
+ }
} else if (flags & MAP_HUGETLB) {
struct user_struct *user = NULL;
struct hstate *hs;
_
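For readability, here is roughly how the file-backed branch reads with the
hunk above applied. This is a reconstruction from the diff and the
changelog's pseudo code, following the same abbreviated style; the anonymous
MAP_HUGETLB path and the flag/offset handling around it are omitted.
	if (!(flags & MAP_ANONYMOUS)) {
		file = fget(fd);
		if (!file)
			return -EBADF;
		if (is_file_hugepages(file)) {
			/* hugetlbfs file: round len up to the huge page size */
			len = ALIGN(len, huge_page_size(hstate_file(file)));
		} else if (unlikely(flags & MAP_HUGETLB)) {
			/* MAP_HUGETLB on a non-hugetlbfs file is invalid */
			retval = -EINVAL;
			goto out_fput;
		}
	} else if (flags & MAP_HUGETLB) {
		/* anonymous MAP_HUGETLB: sets up a hugetlbfs file (omitted) */
	}
	...
	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
out_fput:
	if (file)
		fput(file);
	return retval;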
Patches currently in -mm which might be from thunder.leizhen@huawei.com are
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (20 preceding siblings ...)
2020-07-06 23:36 ` + mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch added to " Andrew Morton
@ 2020-07-06 23:50 ` Andrew Morton
2020-07-06 23:52 ` [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from " Andrew Morton
` (210 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:50 UTC (permalink / raw)
To: adilger, cgxu519, chris, dxu, gregkh, hughd, mm-commits, stable,
tj, viro
The patch titled
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
has been added to the -mm tree. Its filename is
vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Chengguang Xu <cgxu519@mykernel.net>
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of
kmalloc"), simple xattr entry is allocated with kvmalloc() instead of
kmalloc(), so we should release it with kvfree() instead of kfree().
Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net
Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc")
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Chris Down <chris@chrisdown.name>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org> [5.7]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/xattr.h | 3 ++-
mm/shmem.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/include/linux/xattr.h~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/include/linux/xattr.h
@@ -15,6 +15,7 @@
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/spinlock.h>
+#include <linux/mm.h>
#include <uapi/linux/xattr.h>
struct inode;
@@ -94,7 +95,7 @@ static inline void simple_xattrs_free(st
list_for_each_entry_safe(xattr, node, &xattrs->head, list) {
kfree(xattr->name);
- kfree(xattr);
+ kvfree(xattr);
}
}
--- a/mm/shmem.c~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/mm/shmem.c
@@ -3178,7 +3178,7 @@ static int shmem_initxattrs(struct inode
new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
GFP_KERNEL);
if (!new_xattr->name) {
- kfree(new_xattr);
+ kvfree(new_xattr);
return -ENOMEM;
}
_
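A minimal sketch of the pairing rule behind the fix, using a made-up
struct example_xattr (the real code uses struct simple_xattr): because
kvmalloc() may fall back to vmalloc() for large values, and kfree() cannot
free vmalloc addresses, anything allocated with kvmalloc() has to be
released with kvfree().
#include <linux/mm.h>
#include <linux/slab.h>
struct example_xattr {
	char *name;		/* kmalloc'ed separately */
	size_t size;
	char value[];		/* inline value, possibly large */
};
static struct example_xattr *example_xattr_alloc(size_t len)
{
	/* may come from the slab or, for large len, from vmalloc space */
	return kvmalloc(sizeof(struct example_xattr) + len, GFP_KERNEL);
}
static void example_xattr_free(struct example_xattr *xattr)
{
	kfree(xattr->name);	/* kmalloc'ed, so kfree() is correct */
	kvfree(xattr);		/* kvmalloc'ed, so it must be kvfree() */
}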
Patches currently in -mm which might be from cgxu519@mykernel.net are
vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
mm-shmem-fix-freeing-new_attr-in-shmem_initxattrs.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (21 preceding siblings ...)
2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch " Andrew Morton
@ 2020-07-06 23:52 ` Andrew Morton
2020-07-06 23:53 ` + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to " Andrew Morton
` (209 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:52 UTC (permalink / raw)
To: cl, iamjoonsoo.kim, lonuxli.64, mm-commits, penberg, rientjes, willy
The patch titled
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
has been removed from the -mm tree. Its filename was
mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Long Li <lonuxli.64@gmail.com>
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
kmalloc cannot allocate memory from HIGHMEM. Allocating large amounts of
memory currently bypasses the check and will simply leak the memory when
page_address() returns NULL. To fix this, factor the GFP_SLAB_BUG_MASK
check out of slab & slub, and call it from kmalloc_order() as well. In
order to make the code clear, the warning message is put in one place.
Link: http://lkml.kernel.org/r/20200701151645.GA26223@lilong
Signed-off-by: Long Li <lonuxli.64@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab.c | 10 +++-------
mm/slab.h | 1 +
mm/slab_common.c | 17 +++++++++++++++++
mm/slub.c | 9 ++-------
4 files changed, 23 insertions(+), 14 deletions(-)
--- a/mm/slab.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.c
@@ -2589,13 +2589,9 @@ static struct page *cache_grow_begin(str
* Be lazy and only check for valid flags here, keeping it out of the
* critical path in kmem_cache_alloc().
*/
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_invalid_flags(flags);
+
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
--- a/mm/slab_common.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab_common.c
@@ -26,6 +26,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/kmem.h>
+#include "internal.h"
+
#include "slab.h"
enum slab_state slab_state;
@@ -1311,6 +1313,18 @@ void __init create_kmalloc_caches(slab_f
}
#endif /* !CONFIG_SLOB */
+gfp_t kmalloc_invalid_flags(gfp_t flags)
+{
+ gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
+
+ flags &= ~GFP_SLAB_BUG_MASK;
+ pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
+ invalid_mask, &invalid_mask, flags, &flags);
+ dump_stack();
+
+ return flags;
+}
+
/*
* To avoid unnecessary overhead, we pass through large allocation requests
* directly to the page allocator. We use __GFP_COMP, because we will need to
@@ -1321,6 +1335,9 @@ void *kmalloc_order(size_t size, gfp_t f
void *ret = NULL;
struct page *page;
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_invalid_flags(flags);
+
flags |= __GFP_COMP;
page = alloc_pages(flags, order);
if (likely(page)) {
--- a/mm/slab.h~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.h
@@ -152,6 +152,7 @@ void create_kmalloc_caches(slab_flags_t)
struct kmem_cache *kmalloc_slab(size_t, gfp_t);
#endif
+gfp_t kmalloc_invalid_flags(gfp_t flags);
/* Functions provided by the slab allocators */
int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
--- a/mm/slub.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slub.c
@@ -1745,13 +1745,8 @@ out:
static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_invalid_flags(flags);
return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_
Patches currently in -mm which might be from lonuxli.64@gmail.com are
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (22 preceding siblings ...)
2020-07-06 23:52 ` [to-be-updated] mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch removed from " Andrew Morton
@ 2020-07-06 23:53 ` Andrew Morton
2020-07-07 1:53 ` mmotm 2020-07-06-18-53 uploaded Andrew Morton
` (208 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-06 23:53 UTC (permalink / raw)
To: cl, iamjoonsoo.kim, lonuxli.64, mm-commits, penberg, rientjes, willy
The patch titled
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
has been added to the -mm tree. Its filename is
mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Long Li <lonuxli.64@gmail.com>
Subject: mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
kmalloc cannot allocate memory from HIGHMEM. Allocating large amounts of
memory currently bypasses the check and will simply leak the memory when
page_address() returns NULL. To fix this, factor the GFP_SLAB_BUG_MASK
check out of slab & slub, and call it from kmalloc_order() as well. In
order to make the code clear, the warning message is put in one place.
Link: http://lkml.kernel.org/r/20200704035027.GA62481@lilong
Signed-off-by: Long Li <lonuxli.64@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab.c | 10 +++-------
mm/slab.h | 1 +
mm/slab_common.c | 17 +++++++++++++++++
mm/slub.c | 9 ++-------
4 files changed, 23 insertions(+), 14 deletions(-)
--- a/mm/slab.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.c
@@ -2589,13 +2589,9 @@ static struct page *cache_grow_begin(str
* Be lazy and only check for valid flags here, keeping it out of the
* critical path in kmem_cache_alloc().
*/
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_fix_flags(flags);
+
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
--- a/mm/slab_common.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab_common.c
@@ -26,6 +26,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/kmem.h>
+#include "internal.h"
+
#include "slab.h"
enum slab_state slab_state;
@@ -1311,6 +1313,18 @@ void __init create_kmalloc_caches(slab_f
}
#endif /* !CONFIG_SLOB */
+gfp_t kmalloc_fix_flags(gfp_t flags)
+{
+ gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
+
+ flags &= ~GFP_SLAB_BUG_MASK;
+ pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
+ invalid_mask, &invalid_mask, flags, &flags);
+ dump_stack();
+
+ return flags;
+}
+
/*
* To avoid unnecessary overhead, we pass through large allocation requests
* directly to the page allocator. We use __GFP_COMP, because we will need to
@@ -1321,6 +1335,9 @@ void *kmalloc_order(size_t size, gfp_t f
void *ret = NULL;
struct page *page;
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_fix_flags(flags);
+
flags |= __GFP_COMP;
page = alloc_pages(flags, order);
if (likely(page)) {
--- a/mm/slab.h~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slab.h
@@ -152,6 +152,7 @@ void create_kmalloc_caches(slab_flags_t)
struct kmem_cache *kmalloc_slab(size_t, gfp_t);
#endif
+gfp_t kmalloc_fix_flags(gfp_t flags);
/* Functions provided by the slab allocators */
int __kmem_cache_create(struct kmem_cache *, slab_flags_t flags);
--- a/mm/slub.c~mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order
+++ a/mm/slub.c
@@ -1745,13 +1745,8 @@ out:
static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ if (unlikely(flags & GFP_SLAB_BUG_MASK))
+ flags = kmalloc_fix_flags(flags);
return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_
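A minimal sketch of what the check guards against, using a deliberately
bogus call; example_bad_alloc() is a made-up name. GFP_SLAB_BUG_MASK covers
flags the slab allocators cannot honour, such as __GFP_HIGHMEM and
__GFP_DMA32, because kmalloc() hands back a linear-map address and highmem
pages have none. With the patch, the large-allocation path in
kmalloc_order() also strips such flags with a warning instead of silently
leaking the page.
#include <linux/slab.h>
#include <linux/gfp.h>
static void *example_bad_alloc(size_t size)
{
	/*
	 * Invalid: slab memory must be reachable through the linear map,
	 * so __GFP_HIGHMEM (part of GFP_SLAB_BUG_MASK) makes no sense here.
	 * After the patch, kmalloc_order() warns and strips the flag rather
	 * than allocating a page whose address it can never return.
	 */
	return kmalloc(size, GFP_KERNEL | __GFP_HIGHMEM);
}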
Patches currently in -mm which might be from lonuxli.64@gmail.com are
mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* mmotm 2020-07-06-18-53 uploaded
2020-07-03 22:14 incoming Andrew Morton
` (23 preceding siblings ...)
2020-07-06 23:53 ` + mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch added to " Andrew Morton
@ 2020-07-07 1:53 ` Andrew Morton
2020-07-07 19:17 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree Andrew Morton
` (207 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 1:53 UTC (permalink / raw)
To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
mhocko, mm-commits, sfr
The mm-of-the-moment snapshot 2020-07-06-18-53 has been uploaded to
http://www.ozlabs.org/~akpm/mmotm/
mmotm-readme.txt says
README for mm-of-the-moment:
http://www.ozlabs.org/~akpm/mmotm/
This is a snapshot of my -mm patch queue. Uploaded at random hopefully
more than once a week.
You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY). The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series
The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss. Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.
This tree is partially included in linux-next. To see which patches are
included in linux-next, consult the `series' file. Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.
A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release. Individual mmotm releases are tagged. The master branch always
points to the latest release, so it's constantly rebasing.
https://github.com/hnaz/linux-mm
The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree. It is updated more frequently
than mmotm, and is untested.
A git copy of this tree is also available at
https://github.com/hnaz/linux-mm
This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)
origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* checkpatch-test-git_dir-changes.patch
* kthread-work-could-not-be-queued-when-worker-being-destroyed.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-page_isolation-prefer-the-node-of-the-source-page.patch
* mm-migrate-move-migration-helper-from-h-to-c.patch
* mm-hugetlb-unify-migration-callbacks.patch
* mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
* mm-migrate-make-a-standard-migration-target-allocation-function.patch
* mm-gup-use-a-standard-migration-target-allocation-callback.patch
* mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
* mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* umh-fix-refcount-underflow-in-fork_usermode_blob.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
linux-next.patch
linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-remove-call-to-memset-after-dma_alloc_coherent.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
make-sure-nobodys-leaking-resources.patch
releasing-resources-with-children.patch
mutex-subsystem-synchro-test-module.patch
kernel-forkc-export-kernel_thread-to-modules.patch
workaround-for-a-pci-restoring-bug.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (24 preceding siblings ...)
2020-07-07 1:53 ` mmotm 2020-07-06-18-53 uploaded Andrew Morton
@ 2020-07-07 19:17 ` Andrew Morton
2020-07-07 19:20 ` + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch " Andrew Morton
` (206 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:17 UTC (permalink / raw)
To: catalin.marinas, hannes, hdanton, hughd, josef, kirill.shutemov,
mm-commits, will.deacon, willy, xuyu, yang.shi
The patch titled
Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault
has been added to the -mm tree. Its filename is
mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Yang Shi <yang.shi@linux.alibaba.com>
Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault
Recently we found a regression when running the will_it_scale/page_fault3
test on ARM64: over 70% down for the multi-process cases and over 20% down
for the multi-thread cases. It turns out the regression is caused by
commit 89b15332af7c0 ("mm: drop mmap_sem before calling
balance_dirty_pages() in write fault").
The test mmaps a memory-sized file and then writes to the mapping; this
makes all memory dirty and triggers dirty page throttling, at which point
that commit releases mmap_sem and retries the page fault. The retried page
fault sees correct PTEs installed by the first try, then updates access
flags and flushes TLBs. The regression is caused by the excessive TLB
flushes. x86 is unaffected since it does not need a TLB flush for an
access flag update.
The page fault would be retried due to:
1. Waiting for page readahead
2. Waiting for page swapped in
3. Waiting for dirty pages throttling
The first two cases don't have PTEs set up at all, so the retried page
fault installs the PTEs and never reaches this code. But the #3 case
usually has PTEs installed, so the retried page fault does reach the
access flag update. It is not necessary to update access flags for #3,
since a retried page fault is not a real "second access", so it is safe
to skip the access flag update for retried page faults.
With this fix the test result gets back to normal.
Link: http://lkml.kernel.org/r/1594148072-91273-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Xu Yu <xuyu@linux.alibaba.com>
Debugged-by: Xu Yu <xuyu@linux.alibaba.com>
Tested-by: Xu Yu <xuyu@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memory.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault
+++ a/mm/memory.c
@@ -4241,8 +4241,13 @@ static vm_fault_t handle_pte_fault(struc
if (vmf->flags & FAULT_FLAG_WRITE) {
if (!pte_write(entry))
return do_wp_page(vmf);
- entry = pte_mkdirty(entry);
}
+
+ if ((vmf->flags & FAULT_FLAG_WRITE) && !(vmf->flags & FAULT_FLAG_TRIED))
+ entry = pte_mkdirty(entry);
+ else if (vmf->flags & FAULT_FLAG_TRIED)
+ goto unlock;
+
entry = pte_mkyoung(entry);
if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
vmf->flags & FAULT_FLAG_WRITE)) {
_
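Putting the hunk above back into context, the tail of handle_pte_fault()
with this patch applied reads roughly as below (reconstructed from the
diff, not a verbatim copy of the file). A retried write fault keeps the
PTE as-is and jumps to unlock, avoiding the ptep_set_access_flags() path
and its TLB flush.
	if (vmf->flags & FAULT_FLAG_WRITE) {
		if (!pte_write(entry))
			return do_wp_page(vmf);
	}
	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vmf->flags & FAULT_FLAG_TRIED))
		entry = pte_mkdirty(entry);
	else if (vmf->flags & FAULT_FLAG_TRIED)
		goto unlock;	/* retried fault: PTE already usable, skip the update */
	entry = pte_mkyoung(entry);
	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
				  vmf->flags & FAULT_FLAG_WRITE)) {
		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
	}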
Patches currently in -mm which might be from yang.shi@linux.alibaba.com are
mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
mm-filemap-clear-idle-flag-for-writes.patch
mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
mm-thp-remove-debug_cow-switch.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (25 preceding siblings ...)
2020-07-07 19:17 ` + mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch added to -mm tree Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
2020-07-07 19:20 ` + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch " Andrew Morton
` (205 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
rientjes, shakeelb, vbabka
The patch titled
Subject: mm: memcg/slab: remove unused argument by charge_slab_page()
has been added to the -mm tree. Its filename is
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: memcg/slab: remove unused argument by charge_slab_page()
charge_slab_page() no longer uses its gfp argument, so remove it.
Link: http://lkml.kernel.org/r/20200707173612.124425-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab.c | 2 +-
mm/slab.h | 3 +--
mm/slub.c | 2 +-
3 files changed, 3 insertions(+), 4 deletions(-)
--- a/mm/slab.c~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slab.c
@@ -1379,7 +1379,7 @@ static struct page *kmem_getpages(struct
return NULL;
}
- charge_slab_page(page, flags, cachep->gfporder, cachep);
+ charge_slab_page(page, cachep->gfporder, cachep);
__SetPageSlab(page);
/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
if (sk_memalloc_socks() && page_is_pfmemalloc(page))
--- a/mm/slab.h~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slab.h
@@ -440,8 +440,7 @@ static inline struct kmem_cache *virt_to
return page->slab_cache;
}
-static __always_inline void charge_slab_page(struct page *page,
- gfp_t gfp, int order,
+static __always_inline void charge_slab_page(struct page *page, int order,
struct kmem_cache *s)
{
mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
--- a/mm/slub.c~mm-memcg-slab-remove-unused-argument-by-charge_slab_page
+++ a/mm/slub.c
@@ -1621,7 +1621,7 @@ static inline struct page *alloc_slab_pa
page = __alloc_pages_node(node, flags, order);
if (page)
- charge_slab_page(page, flags, order, s);
+ charge_slab_page(page, order, s);
return page;
}
_
Patches currently in -mm which might be from guro@fb.com are
mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (26 preceding siblings ...)
2020-07-07 19:20 ` + mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch " Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
2020-07-07 19:20 ` + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch " Andrew Morton
` (204 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
rientjes, shakeelb, vbabka
The patch titled
Subject: mm: slab: rename (un)charge_slab_page() to (un)account_slab_page()
has been added to the -mm tree. Its filename is
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: slab: rename (un)charge_slab_page() to (un)account_slab_page()
charge_slab_page() and uncharge_slab_page() are no longer related to memcg
charging and uncharging. To make their names less confusing, let's rename
them to account_slab_page() and unaccount_slab_page() respectively.
Link: http://lkml.kernel.org/r/20200707173612.124425-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab.c | 4 ++--
mm/slab.h | 8 ++++----
mm/slub.c | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)
--- a/mm/slab.c~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slab.c
@@ -1379,7 +1379,7 @@ static struct page *kmem_getpages(struct
return NULL;
}
- charge_slab_page(page, cachep->gfporder, cachep);
+ account_slab_page(page, cachep->gfporder, cachep);
__SetPageSlab(page);
/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
if (sk_memalloc_socks() && page_is_pfmemalloc(page))
@@ -1403,7 +1403,7 @@ static void kmem_freepages(struct kmem_c
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += 1 << order;
- uncharge_slab_page(page, order, cachep);
+ unaccount_slab_page(page, order, cachep);
__free_pages(page, order);
}
--- a/mm/slab.h~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slab.h
@@ -440,15 +440,15 @@ static inline struct kmem_cache *virt_to
return page->slab_cache;
}
-static __always_inline void charge_slab_page(struct page *page, int order,
- struct kmem_cache *s)
+static __always_inline void account_slab_page(struct page *page, int order,
+ struct kmem_cache *s)
{
mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
PAGE_SIZE << order);
}
-static __always_inline void uncharge_slab_page(struct page *page, int order,
- struct kmem_cache *s)
+static __always_inline void unaccount_slab_page(struct page *page, int order,
+ struct kmem_cache *s)
{
if (memcg_kmem_enabled())
memcg_free_page_obj_cgroups(page);
--- a/mm/slub.c~mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page
+++ a/mm/slub.c
@@ -1621,7 +1621,7 @@ static inline struct page *alloc_slab_pa
page = __alloc_pages_node(node, flags, order);
if (page)
- charge_slab_page(page, order, s);
+ account_slab_page(page, order, s);
return page;
}
@@ -1844,7 +1844,7 @@ static void __free_slab(struct kmem_cach
page->mapping = NULL;
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += pages;
- uncharge_slab_page(page, order, s);
+ unaccount_slab_page(page, order, s);
__free_pages(page, order);
}
_
Patches currently in -mm which might be from guro@fb.com are
mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (27 preceding siblings ...)
2020-07-07 19:20 ` + mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch " Andrew Morton
@ 2020-07-07 19:20 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
` (203 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:20 UTC (permalink / raw)
To: cl, guro, hannes, iamjoonsoo.kim, mhocko, mm-commits, penberg,
rientjes, shakeelb, vbabka
The patch titled
Subject: mm: kmem: switch to static_branch_likely() in memcg_kmem_enabled()
has been added to the -mm tree. Its filename is
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: kmem: switch to static_branch_likely() in memcg_kmem_enabled()
Currently memcg_kmem_enabled() is optimized for kernel memory accounting
being off. It was that way for a long time, arguably because kernel memory
accounting was initially an opt-in feature. However, it is now on by default
on both cgroup v1 and cgroup v2, and it is on for all cgroups. So let's
switch over to static_branch_likely() to reflect this fact.
There is unlikely to be a significant performance difference, as the cost of
a memory allocation and its accounting significantly exceeds the cost of a
jump. However, the conversion makes the code read more logically.
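As a generic illustration (not code from this patch; the key name is
hypothetical), a static key that is almost always enabled can be declared and
tested like this, with static_branch_likely() laying out the enabled path as
the straight-line code:

#include <linux/jump_label.h>

/* Hypothetical key: off by default, switched on once during init. */
static DEFINE_STATIC_KEY_FALSE(example_key);

static inline bool example_enabled(void)
{
	/*
	 * Optimized for the key being on: the enabled case falls
	 * through, the disabled case takes a patched jump.
	 */
	return static_branch_likely(&example_key);
}

static void example_init(void)
{
	static_branch_enable(&example_key);
}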
Link: http://lkml.kernel.org/r/20200707173612.124425-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/memcontrol.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/memcontrol.h~mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled
+++ a/include/linux/memcontrol.h
@@ -1456,7 +1456,7 @@ void memcg_put_cache_ids(void);
static inline bool memcg_kmem_enabled(void)
{
- return static_branch_unlikely(&memcg_kmem_enabled_key);
+ return static_branch_likely(&memcg_kmem_enabled_key);
}
static inline bool memcg_kmem_bypass(void)
_
Patches currently in -mm which might be from guro@fb.com are
mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-check-return-value-of-sb_getblk.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (28 preceding siblings ...)
2020-07-07 19:20 ` + mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
` (202 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, stable, viro
The patch titled
Subject: fs/minix: check return value of sb_getblk()
has been added to the -mm tree. Its filename is
fs-minix-check-return-value-of-sb_getblk.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-check-return-value-of-sb_getblk.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-check-return-value-of-sb_getblk.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: check return value of sb_getblk()
Patch series "fs/minix: fix syzbot bugs and set s_maxbytes".
This series fixes all syzbot bugs in the minix filesystem:
KASAN: null-ptr-deref Write in get_block
KASAN: use-after-free Write in get_block
KASAN: use-after-free Read in get_block
WARNING in inc_nlink
KMSAN: uninit-value in get_block
WARNING in drop_nlink
It also fixes the minix filesystem to set s_maxbytes correctly, so that
userspace sees the correct behavior when exceeding the max file size.
This patch (of 6):
sb_getblk() can fail, so check its return value.
This fixes a NULL pointer dereference.
Originally from Qiujun Huang.
Link: http://lkml.kernel.org/r/20200628060846.682158-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200628060846.682158-2-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: syzbot+4a88b2b9dc280f47baf4@syzkaller.appspotmail.com
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/itree_common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/minix/itree_common.c~fs-minix-check-return-value-of-sb_getblk
+++ a/fs/minix/itree_common.c
@@ -75,6 +75,7 @@ static int alloc_branch(struct inode *in
int n = 0;
int i;
int parent = minix_new_block(inode);
+ int err = -ENOSPC;
branch[0].key = cpu_to_block(parent);
if (parent) for (n = 1; n < num; n++) {
@@ -85,6 +86,11 @@ static int alloc_branch(struct inode *in
break;
branch[n].key = cpu_to_block(nr);
bh = sb_getblk(inode->i_sb, parent);
+ if (!bh) {
+ minix_free_block(inode, nr);
+ err = -ENOMEM;
+ break;
+ }
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);
branch[n].bh = bh;
@@ -103,7 +109,7 @@ static int alloc_branch(struct inode *in
bforget(branch[i].bh);
for (i = 0; i < n; i++)
minix_free_block(inode, block_to_cpu(branch[i].key));
- return -ENOSPC;
+ return err;
}
static inline int splice_branch(struct inode *inode,
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-dont-allow-getting-deleted-inodes.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (29 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
` (201 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, stable, viro
The patch titled
Subject: fs/minix: don't allow getting deleted inodes
has been added to the -mm tree. Its filename is
fs-minix-dont-allow-getting-deleted-inodes.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: don't allow getting deleted inodes
If an inode has no links, we need to mark it bad rather than allowing it
to be accessed. This avoids WARNINGs in inc_nlink() and drop_nlink() when
doing directory operations on a fuzzed filesystem.
Link: http://lkml.kernel.org/r/20200628060846.682158-3-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+a9ac3de1b5de5fb10efc@syzkaller.appspotmail.com
Reported-by: syzbot+df958cf5688a96ad3287@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/inode.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/fs/minix/inode.c~fs-minix-dont-allow-getting-deleted-inodes
+++ a/fs/minix/inode.c
@@ -468,6 +468,13 @@ static struct inode *V1_minix_iget(struc
iget_failed(inode);
return ERR_PTR(-EIO);
}
+ if (raw_inode->i_nlinks == 0) {
+ printk("MINIX-fs: deleted inode referenced: %lu\n",
+ inode->i_ino);
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-ESTALE);
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
@@ -501,6 +508,13 @@ static struct inode *V2_minix_iget(struc
iget_failed(inode);
return ERR_PTR(-EIO);
}
+ if (raw_inode->i_nlinks == 0) {
+ printk("MINIX-fs: deleted inode referenced: %lu\n",
+ inode->i_ino);
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-ESTALE);
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-reject-too-large-maximum-file-size.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (30 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-set-s_maxbytes-correctly.patch " Andrew Morton
` (200 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, stable, viro
The patch titled
Subject: fs/minix: reject too-large maximum file size
has been added to the -mm tree. Its filename is
fs-minix-reject-too-large-maximum-file-size.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-reject-too-large-maximum-file-size.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-reject-too-large-maximum-file-size.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: reject too-large maximum file size
If the minix filesystem tries to map a very large logical block number to
its on-disk location, block_to_path() can return offsets that are too
large, causing out-of-bounds memory accesses when accessing indirect index
blocks. This should be prevented by the check against the maximum file
size, but this doesn't work because the maximum file size is read directly
from the on-disk superblock and isn't validated itself.
Fix this by validating the maximum file size at mount time.
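For reference, with the 1024-byte BLOCK_SIZE used here, the V1 limit checked
below works out to (7 + 512 + 512*512) * 1024 = 268,966,912 bytes, i.e. just
over 256 MiB: 7 direct blocks, one indirect block of 512 two-byte zone
pointers, and one double-indirect block.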
Link: http://lkml.kernel.org/r/20200628060846.682158-4-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+c7d9ec7a1a7272dd71b3@syzkaller.appspotmail.com
Reported-by: syzbot+3b7b03a0c28948054fb5@syzkaller.appspotmail.com
Reported-by: syzbot+6e056ee473568865f3e6@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/inode.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
--- a/fs/minix/inode.c~fs-minix-reject-too-large-maximum-file-size
+++ a/fs/minix/inode.c
@@ -150,6 +150,23 @@ static int minix_remount (struct super_b
return 0;
}
+static bool minix_check_superblock(struct minix_sb_info *sbi)
+{
+ if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
+ return false;
+
+ /*
+ * s_max_size must not exceed the block mapping limitation. This check
+ * is only needed for V1 filesystems, since V2/V3 support an extra level
+ * of indirect blocks which places the limit well above U32_MAX.
+ */
+ if (sbi->s_version == MINIX_V1 &&
+ sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+ return false;
+
+ return true;
+}
+
static int minix_fill_super(struct super_block *s, void *data, int silent)
{
struct buffer_head *bh;
@@ -228,11 +245,12 @@ static int minix_fill_super(struct super
} else
goto out_no_fs;
+ if (!minix_check_superblock(sbi))
+ goto out_illegal_sb;
+
/*
* Allocate the buffer map to keep the superblock small.
*/
- if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
- goto out_illegal_sb;
i = (sbi->s_imap_blocks + sbi->s_zmap_blocks) * sizeof(bh);
map = kzalloc(i, GFP_KERNEL);
if (!map)
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-set-s_maxbytes-correctly.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (31 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-fix-block-limit-check-for-v1-filesystems.patch " Andrew Morton
` (199 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, viro
The patch titled
Subject: fs/minix: set s_maxbytes correctly
has been added to the -mm tree. Its filename is
fs-minix-set-s_maxbytes-correctly.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-set-s_maxbytes-correctly.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-set-s_maxbytes-correctly.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: set s_maxbytes correctly
The minix filesystem leaves super_block::s_maxbytes at MAX_NON_LFS rather
than setting it to the actual filesystem-specific limit. This is broken
because it means userspace doesn't see the standard behavior like getting
EFBIG and SIGXFSZ when exceeding the maximum file size.
Fix this by setting s_maxbytes correctly.
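As a hedged illustration of the userspace-visible behavior this restores (the
mount point and offset are assumptions, and on 32-bit systems the program
would need -D_FILE_OFFSET_BITS=64), a write far past the filesystem limit
should now fail with EFBIG:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Assumes a minix filesystem is mounted at /mnt/minix. */
	int fd = open("/mnt/minix/big", O_RDWR | O_CREAT, 0644);
	char c = 0;

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* Seek far past any minix file size limit... */
	if (lseek(fd, (off_t)1 << 40, SEEK_SET) < 0) {
		perror("lseek");
		return 1;
	}
	/* ...and expect the write to be rejected with EFBIG. */
	if (write(fd, &c, 1) < 0 && errno == EFBIG)
		printf("got EFBIG as expected\n");
	close(fd);
	return 0;
}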
Link: http://lkml.kernel.org/r/20200628060846.682158-5-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/inode.c | 12 +++++++-----
fs/minix/itree_v1.c | 2 +-
fs/minix/itree_v2.c | 3 +--
fs/minix/minix.h | 1 -
4 files changed, 9 insertions(+), 9 deletions(-)
--- a/fs/minix/inode.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/inode.c
@@ -150,8 +150,10 @@ static int minix_remount (struct super_b
return 0;
}
-static bool minix_check_superblock(struct minix_sb_info *sbi)
+static bool minix_check_superblock(struct super_block *sb)
{
+ struct minix_sb_info *sbi = minix_sb(sb);
+
if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
return false;
@@ -161,7 +163,7 @@ static bool minix_check_superblock(struc
* of indirect blocks which places the limit well above U32_MAX.
*/
if (sbi->s_version == MINIX_V1 &&
- sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+ sb->s_maxbytes > (7 + 512 + 512*512) * BLOCK_SIZE)
return false;
return true;
@@ -202,7 +204,7 @@ static int minix_fill_super(struct super
sbi->s_zmap_blocks = ms->s_zmap_blocks;
sbi->s_firstdatazone = ms->s_firstdatazone;
sbi->s_log_zone_size = ms->s_log_zone_size;
- sbi->s_max_size = ms->s_max_size;
+ s->s_maxbytes = ms->s_max_size;
s->s_magic = ms->s_magic;
if (s->s_magic == MINIX_SUPER_MAGIC) {
sbi->s_version = MINIX_V1;
@@ -233,7 +235,7 @@ static int minix_fill_super(struct super
sbi->s_zmap_blocks = m3s->s_zmap_blocks;
sbi->s_firstdatazone = m3s->s_firstdatazone;
sbi->s_log_zone_size = m3s->s_log_zone_size;
- sbi->s_max_size = m3s->s_max_size;
+ s->s_maxbytes = m3s->s_max_size;
sbi->s_ninodes = m3s->s_ninodes;
sbi->s_nzones = m3s->s_zones;
sbi->s_dirsize = 64;
@@ -245,7 +247,7 @@ static int minix_fill_super(struct super
} else
goto out_no_fs;
- if (!minix_check_superblock(sbi))
+ if (!minix_check_superblock(s))
goto out_illegal_sb;
/*
--- a/fs/minix/itree_v1.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode *
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if (block >= (minix_sb(inode->i_sb)->s_max_size/BLOCK_SIZE)) {
+ } else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
--- a/fs/minix/itree_v2.c~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/itree_v2.c
@@ -32,8 +32,7 @@ static int block_to_path(struct inode *
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, sb->s_bdev);
- } else if ((u64)block * (u64)sb->s_blocksize >=
- minix_sb(sb)->s_max_size) {
+ } else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
--- a/fs/minix/minix.h~fs-minix-set-s_maxbytes-correctly
+++ a/fs/minix/minix.h
@@ -32,7 +32,6 @@ struct minix_sb_info {
unsigned long s_zmap_blocks;
unsigned long s_firstdatazone;
unsigned long s_log_zone_size;
- unsigned long s_max_size;
int s_dirsize;
int s_namelen;
struct buffer_head ** s_imap;
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-fix-block-limit-check-for-v1-filesystems.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (32 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-set-s_maxbytes-correctly.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:25 ` + fs-minix-remove-expected-error-message-in-block_to_path.patch " Andrew Morton
` (198 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, viro
The patch titled
Subject: fs/minix: fix block limit check for V1 filesystems
has been added to the -mm tree. Its filename is
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-fix-block-limit-check-for-v1-filesystems.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-fix-block-limit-check-for-v1-filesystems.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: fix block limit check for V1 filesystems
The minix filesystem reads its maximum file size from its on-disk
superblock. This value isn't necessarily a multiple of the block size.
When it's not, the V1 block mapping code doesn't allow mapping the last
possible block. Commit 6ed6a722f9ab ("minixfs: fix block limit check")
fixed this in the V2 mapping code. Fix it in the V1 mapping code too.
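A concrete example of the difference (illustrative numbers, assuming the
usual 1024-byte BLOCK_SIZE): with s_maxbytes == 2500, the old check rejects
block 2 because 2 >= 2500/1024 == 2, even though that block holds the file's
last valid bytes (offsets 2048-2499); the corrected check allows it because
2 * 1024 == 2048 < 2500.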
Link: http://lkml.kernel.org/r/20200628060846.682158-6-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/itree_v1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/minix/itree_v1.c~fs-minix-fix-block-limit-check-for-v1-filesystems
+++ a/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode *
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
+ } else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + fs-minix-remove-expected-error-message-in-block_to_path.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (33 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-fix-block-limit-check-for-v1-filesystems.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
2020-07-07 19:27 ` + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch " Andrew Morton
` (197 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC (permalink / raw)
To: anenbupt, ebiggers, mm-commits, viro
The patch titled
Subject: fs/minix: remove expected error message in block_to_path()
has been added to the -mm tree. Its filename is
fs-minix-remove-expected-error-message-in-block_to_path.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-remove-expected-error-message-in-block_to_path.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-remove-expected-error-message-in-block_to_path.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: remove expected error message in block_to_path()
When truncating a file to a size within the last allowed logical block,
block_to_path() is called with the *next* block. This exceeds the limit,
causing the "block %ld too big" error message to be printed.
This case isn't actually an error; there are just no more blocks past that
point. So, remove this error message.
Link: http://lkml.kernel.org/r/20200628060846.682158-7-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/minix/itree_v1.c | 12 ++++++------
fs/minix/itree_v2.c | 12 ++++++------
2 files changed, 12 insertions(+), 12 deletions(-)
--- a/fs/minix/itree_v1.c~fs-minix-remove-expected-error-message-in-block_to_path
+++ a/fs/minix/itree_v1.c
@@ -29,12 +29,12 @@ static int block_to_path(struct inode *
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
- if (printk_ratelimit())
- printk("MINIX-fs: block_to_path: "
- "block %ld too big on dev %pg\n",
- block, inode->i_sb->s_bdev);
- } else if (block < 7) {
+ return 0;
+ }
+ if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes)
+ return 0;
+
+ if (block < 7) {
offsets[n++] = block;
} else if ((block -= 7) < 512) {
offsets[n++] = 7;
--- a/fs/minix/itree_v2.c~fs-minix-remove-expected-error-message-in-block_to_path
+++ a/fs/minix/itree_v2.c
@@ -32,12 +32,12 @@ static int block_to_path(struct inode *
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, sb->s_bdev);
- } else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
- if (printk_ratelimit())
- printk("MINIX-fs: block_to_path: "
- "block %ld too big on dev %pg\n",
- block, sb->s_bdev);
- } else if (block < DIRCOUNT) {
+ return 0;
+ }
+ if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes)
+ return 0;
+
+ if (block < DIRCOUNT) {
offsets[n++] = block;
} else if ((block -= DIRCOUNT) < INDIRCOUNT(sb)) {
offsets[n++] = DIRCOUNT;
_
Patches currently in -mm which might be from ebiggers@google.com are
fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (34 preceding siblings ...)
2020-07-07 19:25 ` + fs-minix-remove-expected-error-message-in-block_to_path.patch " Andrew Morton
@ 2020-07-07 19:27 ` Andrew Morton
2020-07-07 19:28 ` + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch " Andrew Morton
` (196 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:27 UTC (permalink / raw)
To: david, jroedel, mm-commits, rppt
The patch titled
Subject: mm: vmalloc: remove redundant assignment in unmap_kernel_range_noflush()
has been added to the -mm tree. Its filename is
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mm: vmalloc: remove redundant assignment in unmap_kernel_range_noflush()
'addr' is set to 'start' and then a few lines afterwards 'start' is set to
'addr'. Remove the second assignment.
Link: http://lkml.kernel.org/r/20200707163226.374685-1-rppt@kernel.org
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/vmalloc.c | 1 -
1 file changed, 1 deletion(-)
--- a/mm/vmalloc.c~mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush
+++ a/mm/vmalloc.c
@@ -175,7 +175,6 @@ void unmap_kernel_range_noflush(unsigned
pgtbl_mod_mask mask = 0;
BUG_ON(addr >= end);
- start = addr;
pgd = pgd_offset_k(addr);
do {
next = pgd_addr_end(addr, end);
_
Patches currently in -mm which might be from rppt@linux.ibm.com are
mm-remove-unneeded-includes-of-asm-pgalloch.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (35 preceding siblings ...)
2020-07-07 19:27 ` + mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch " Andrew Morton
@ 2020-07-07 19:28 ` Andrew Morton
2020-07-07 19:36 ` + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch " Andrew Morton
` (195 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:28 UTC (permalink / raw)
To: bigeasy, colin.king, davem, ddstreet, herbert, lgoncalv,
mahipalreddy2006, mm-commits, sjenning, song.bao.hua,
vitaly.wool, wangzhou1
The patch titled
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
has been added to the -mm tree. Its filename is
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
Right now, all new ZIP drivers use the crypto_acomp APIs rather than the
legacy crypto_comp APIs. But zswap.c is still using the old APIs, which
means zswap won't be able to use any new zip drivers in the kernel.
This patch moves to the crypto_acomp APIs to fix the problem. On the other
hand, traditional compressors like lz4 and lzo have been wrapped into acomp
via the scomp backend, so platforms without async compressors can fall back
to acomp via the scomp backend.
This is probably the first real user of acomp, but perhaps not a good
example of how multiple acomp requests can be executed in parallel in one
acomp instance. frontswap loads and stores pages one at a time; it doesn't
have a queuing or buffering mechanism that would let multiple pages go
through frontswap simultaneously in one thread. However, this patch creates
multiple acomp instances, so multiple threads running on different CPUs can
do (de)compression in parallel, leveraging the power of multiple ZIP
hardware queues. This is also consistent with frontswap's page management
model.
On the other hand, the current zswap implementation has some per-CPU global
resources such as zswap_dstmem. So we create one acomp instance per CPU,
just as zswap previously created one comp instance per CPU.
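A condensed sketch of the per-CPU pattern the patch uses (error handling and
the zswap-specific plumbing are omitted, and example_compress() is a
hypothetical helper; see the full diff below for the real code):

#include <crypto/acompress.h>
#include <linux/highmem.h>
#include <linux/mutex.h>
#include <linux/scatterlist.h>

struct crypto_acomp_ctx {
	struct crypto_acomp *acomp;
	struct acomp_req *req;
	struct crypto_wait wait;
	u8 *dstmem;
	struct mutex mutex;
};

static int example_compress(struct crypto_acomp_ctx *ctx,
			    struct page *page, unsigned int *dlen)
{
	struct scatterlist input, output;
	int ret;

	mutex_lock(&ctx->mutex);
	sg_init_one(&input, kmap(page), PAGE_SIZE);
	sg_init_one(&output, ctx->dstmem, PAGE_SIZE * 2);
	acomp_request_set_params(ctx->req, &input, &output, PAGE_SIZE, *dlen);
	/*
	 * The request is asynchronous; crypto_wait_req() blocks until
	 * crypto_req_done() signals completion, or returns immediately
	 * when the backend is a synchronous scomp wrapper.
	 */
	ret = crypto_wait_req(crypto_acomp_compress(ctx->req), &ctx->wait);
	*dlen = ctx->req->dlen;
	kunmap(page);
	mutex_unlock(&ctx->mutex);
	return ret;
}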
Link: http://lkml.kernel.org/r/20200707125210.33256-1-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mahipal Challa <mahipalreddy2006@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/zswap.c | 177 ++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 134 insertions(+), 43 deletions(-)
--- a/mm/zswap.c~mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration
+++ a/mm/zswap.c
@@ -24,8 +24,10 @@
#include <linux/rbtree.h>
#include <linux/swap.h>
#include <linux/crypto.h>
+#include <linux/scatterlist.h>
#include <linux/mempool.h>
#include <linux/zpool.h>
+#include <crypto/acompress.h>
#include <linux/mm_types.h>
#include <linux/page-flags.h>
@@ -127,9 +129,17 @@ module_param_named(same_filled_pages_ena
* data structures
**********************************/
+struct crypto_acomp_ctx {
+ struct crypto_acomp *acomp;
+ struct acomp_req *req;
+ struct crypto_wait wait;
+ u8 *dstmem;
+ struct mutex mutex;
+};
+
struct zswap_pool {
struct zpool *zpool;
- struct crypto_comp * __percpu *tfm;
+ struct crypto_acomp_ctx * __percpu *acomp_ctx;
struct kref kref;
struct list_head list;
struct work_struct release_work;
@@ -415,30 +425,73 @@ static int zswap_dstmem_dead(unsigned in
static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
{
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
- struct crypto_comp *tfm;
+ struct crypto_acomp *acomp;
+ struct acomp_req *req;
+ struct crypto_acomp_ctx *acomp_ctx;
+ int ret;
- if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu)))
+ if (WARN_ON(*per_cpu_ptr(pool->acomp_ctx, cpu)))
return 0;
- tfm = crypto_alloc_comp(pool->tfm_name, 0, 0);
- if (IS_ERR_OR_NULL(tfm)) {
- pr_err("could not alloc crypto comp %s : %ld\n",
- pool->tfm_name, PTR_ERR(tfm));
+ acomp_ctx = kzalloc(sizeof(*acomp_ctx), GFP_KERNEL);
+ if (!acomp_ctx)
return -ENOMEM;
+
+ acomp = crypto_alloc_acomp(pool->tfm_name, 0, 0);
+ if (IS_ERR(acomp)) {
+ pr_err("could not alloc crypto acomp %s : %ld\n",
+ pool->tfm_name, PTR_ERR(acomp));
+ ret = PTR_ERR(acomp);
+ goto free_ctx;
+ }
+ acomp_ctx->acomp = acomp;
+
+ req = acomp_request_alloc(acomp_ctx->acomp);
+ if (!req) {
+ pr_err("could not alloc crypto acomp_request %s\n",
+ pool->tfm_name);
+ ret = -ENOMEM;
+ goto free_acomp;
}
- *per_cpu_ptr(pool->tfm, cpu) = tfm;
+ acomp_ctx->req = req;
+
+ mutex_init(&acomp_ctx->mutex);
+ crypto_init_wait(&acomp_ctx->wait);
+ /*
+ * if the backend of acomp is async zip, crypto_req_done() will wakeup
+ * crypto_wait_req(); if the backend of acomp is scomp, the callback
+ * won't be called, crypto_wait_req() will return without blocking.
+ */
+ acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &acomp_ctx->wait);
+
+ acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
+ *per_cpu_ptr(pool->acomp_ctx, cpu) = acomp_ctx;
+
return 0;
+
+free_acomp:
+ crypto_free_acomp(acomp_ctx->acomp);
+free_ctx:
+ kfree(acomp_ctx);
+ return ret;
}
static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
- struct crypto_comp *tfm;
+ struct crypto_acomp_ctx *acomp_ctx;
+
+ acomp_ctx = *per_cpu_ptr(pool->acomp_ctx, cpu);
+ if (!IS_ERR_OR_NULL(acomp_ctx)) {
+ if (!IS_ERR_OR_NULL(acomp_ctx->req))
+ acomp_request_free(acomp_ctx->req);
+ if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
+ crypto_free_acomp(acomp_ctx->acomp);
+ kfree(acomp_ctx);
+ }
+ *per_cpu_ptr(pool->acomp_ctx, cpu) = NULL;
- tfm = *per_cpu_ptr(pool->tfm, cpu);
- if (!IS_ERR_OR_NULL(tfm))
- crypto_free_comp(tfm);
- *per_cpu_ptr(pool->tfm, cpu) = NULL;
return 0;
}
@@ -561,8 +614,9 @@ static struct zswap_pool *zswap_pool_cre
pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
- pool->tfm = alloc_percpu(struct crypto_comp *);
- if (!pool->tfm) {
+
+ pool->acomp_ctx = alloc_percpu(struct crypto_acomp_ctx *);
+ if (!pool->acomp_ctx) {
pr_err("percpu alloc failed\n");
goto error;
}
@@ -585,7 +639,7 @@ static struct zswap_pool *zswap_pool_cre
return pool;
error:
- free_percpu(pool->tfm);
+ free_percpu(pool->acomp_ctx);
if (pool->zpool)
zpool_destroy_pool(pool->zpool);
kfree(pool);
@@ -596,14 +650,14 @@ static __init struct zswap_pool *__zswap
{
bool has_comp, has_zpool;
- has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+ has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
if (!has_comp && strcmp(zswap_compressor,
CONFIG_ZSWAP_COMPRESSOR_DEFAULT)) {
pr_err("compressor %s not available, using default %s\n",
zswap_compressor, CONFIG_ZSWAP_COMPRESSOR_DEFAULT);
param_free_charp(&zswap_compressor);
zswap_compressor = CONFIG_ZSWAP_COMPRESSOR_DEFAULT;
- has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+ has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
}
if (!has_comp) {
pr_err("default compressor %s not available\n",
@@ -639,7 +693,7 @@ static void zswap_pool_destroy(struct zs
zswap_pool_debug("destroying", pool);
cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
- free_percpu(pool->tfm);
+ free_percpu(pool->acomp_ctx);
zpool_destroy_pool(pool->zpool);
kfree(pool);
}
@@ -723,7 +777,7 @@ static int __zswap_param_set(const char
}
type = s;
} else if (!compressor) {
- if (!crypto_has_comp(s, 0, 0)) {
+ if (!crypto_has_acomp(s, 0, 0)) {
pr_err("compressor %s not available\n", s);
return -ENOENT;
}
@@ -774,7 +828,7 @@ static int __zswap_param_set(const char
* failed, maybe both compressor and zpool params were bad.
* Allow changing this param, so pool creation will succeed
* when the other param is changed. We already verified this
- * param is ok in the zpool_has_pool() or crypto_has_comp()
+ * param is ok in the zpool_has_pool() or crypto_has_acomp()
* checks above.
*/
ret = param_set_charp(s, kp);
@@ -876,7 +930,9 @@ static int zswap_writeback_entry(struct
pgoff_t offset;
struct zswap_entry *entry;
struct page *page;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
+
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -916,14 +972,21 @@ static int zswap_writeback_entry(struct
case ZSWAP_SWAPCACHE_NEW: /* page is locked */
/* decompress */
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
dlen = PAGE_SIZE;
src = (u8 *)zhdr + sizeof(struct zswap_header);
- dst = kmap_atomic(page);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- ret = crypto_comp_decompress(tfm, src, entry->length,
- dst, &dlen);
- put_cpu_ptr(entry->pool->tfm);
- kunmap_atomic(dst);
+ dst = kmap(page);
+
+ mutex_lock(&acomp_ctx->mutex);
+ sg_init_one(&input, src, entry->length);
+ sg_init_one(&output, dst, dlen);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+ ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+ dlen = acomp_ctx->req->dlen;
+ mutex_unlock(&acomp_ctx->mutex);
+
+ kunmap(page);
BUG_ON(ret);
BUG_ON(dlen != PAGE_SIZE);
@@ -1004,7 +1067,8 @@ static int zswap_frontswap_store(unsigne
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry, *dupentry;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
int ret;
unsigned int hlen, dlen = PAGE_SIZE;
unsigned long handle, value;
@@ -1074,12 +1138,32 @@ static int zswap_frontswap_store(unsigne
}
/* compress */
- dst = get_cpu_var(zswap_dstmem);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- src = kmap_atomic(page);
- ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
- kunmap_atomic(src);
- put_cpu_ptr(entry->pool->tfm);
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
+ mutex_lock(&acomp_ctx->mutex);
+
+ src = kmap(page);
+ dst = acomp_ctx->dstmem;
+ sg_init_one(&input, src, PAGE_SIZE);
+ /* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
+ sg_init_one(&output, dst, PAGE_SIZE * 2);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+ /*
+ * it maybe looks a little bit silly that we send an asynchronous request,
+ * then wait for its completion synchronously. This makes the process look
+ * synchronous in fact.
+ * Theoretically, acomp supports users send multiple acomp requests in one
+ * acomp instance, then get those requests done simultaneously. but in this
+ * case, frontswap actually does store and load page by page, there is no
+ * existing method to send the second page before the first page is done
+ * in one thread doing frontswap.
+ * but in different threads running on different cpu, we have different
+ * acomp instance, so multiple threads can do (de)compression in parallel.
+ */
+ ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+ dlen = acomp_ctx->req->dlen;
+ kunmap(page);
+
if (ret) {
ret = -EINVAL;
goto put_dstmem;
@@ -1103,7 +1187,7 @@ static int zswap_frontswap_store(unsigne
memcpy(buf, &zhdr, hlen);
memcpy(buf + hlen, dst, dlen);
zpool_unmap_handle(entry->pool->zpool, handle);
- put_cpu_var(zswap_dstmem);
+ mutex_unlock(&acomp_ctx->mutex);
/* populate entry */
entry->offset = offset;
@@ -1131,7 +1215,7 @@ insert_entry:
return 0;
put_dstmem:
- put_cpu_var(zswap_dstmem);
+ mutex_unlock(&acomp_ctx->mutex);
zswap_pool_put(entry->pool);
freepage:
zswap_entry_cache_free(entry);
@@ -1148,7 +1232,8 @@ static int zswap_frontswap_load(unsigned
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -1175,11 +1260,17 @@ static int zswap_frontswap_load(unsigned
src = zpool_map_handle(entry->pool->zpool, entry->handle, ZPOOL_MM_RO);
if (zpool_evictable(entry->pool->zpool))
src += sizeof(struct zswap_header);
- dst = kmap_atomic(page);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
- put_cpu_ptr(entry->pool->tfm);
- kunmap_atomic(dst);
+ dst = kmap(page);
+
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+ mutex_lock(&acomp_ctx->mutex);
+ sg_init_one(&input, src, entry->length);
+ sg_init_one(&output, dst, dlen);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+ ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+ mutex_unlock(&acomp_ctx->mutex);
+
+ kunmap(page);
zpool_unmap_handle(entry->pool->zpool, entry->handle);
BUG_ON(ret);
_
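As background for the hunks above: the acomp API is asynchronous, so the
patch submits a request and then sleeps in crypto_wait_req() until the
completion callback fires. A minimal sketch of that pattern follows; the
algorithm name, function name and calling convention are illustrative
assumptions, not taken from the patch (zswap picks its compressor from the
zswap.compressor parameter and keeps per-CPU contexts):

#include <crypto/acompress.h>
#include <linux/scatterlist.h>
#include <linux/err.h>

/* Sketch only: compress slen bytes at src into dst, synchronously. */
static int demo_acomp_compress(const char *alg, void *src, unsigned int slen,
			       void *dst, unsigned int *dlen)
{
	struct crypto_acomp *tfm;
	struct acomp_req *req;
	struct scatterlist sg_in, sg_out;
	DECLARE_CRYPTO_WAIT(wait);
	int ret;

	tfm = crypto_alloc_acomp(alg, 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	req = acomp_request_alloc(tfm);
	if (!req) {
		crypto_free_acomp(tfm);
		return -ENOMEM;
	}

	/* Route the async completion through crypto_req_done() so that
	 * crypto_wait_req() below can sleep until the request finishes. */
	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);

	sg_init_one(&sg_in, src, slen);
	sg_init_one(&sg_out, dst, *dlen);
	acomp_request_set_params(req, &sg_in, &sg_out, slen, *dlen);

	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
	if (!ret)
		*dlen = req->dlen;

	acomp_request_free(req);
	crypto_free_acomp(tfm);
	return ret;
}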
Patches currently in -mm which might be from song.bao.hua@hisilicon.com are
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (36 preceding siblings ...)
2020-07-07 19:28 ` + mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch " Andrew Morton
@ 2020-07-07 19:36 ` Andrew Morton
2020-07-07 19:37 ` + lib-test_lockupc-make-symbol-test_works-static.patch " Andrew Morton
` (194 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:36 UTC (permalink / raw)
To: mm-commits, pmladek, stamatis.iliass
The patch titled
Subject: kthread: remove incorrect comment in kthread_create_on_cpu()
has been added to the -mm tree. Its filename is
kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Ilias Stamatis <stamatis.iliass@gmail.com>
Subject: kthread: remove incorrect comment in kthread_create_on_cpu()
Originally kthread_create_on_cpu() parked and woke up the new thread.
However, since commit a65d40961dc7 ("kthread/smpboot: do not park in
kthread_create_on_cpu()") this is no longer the case. This patch removes
the comment that has been left behind and is now incorrect / stale.
Link: http://lkml.kernel.org/r/20200611135920.240551-1-stamatis.iliass@gmail.com
Fixes: a65d40961dc7 ("kthread/smpboot: do not park in kthread_create_on_cpu()")
Signed-off-by: Ilias Stamatis <stamatis.iliass@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/kthread.c | 1 -
1 file changed, 1 deletion(-)
--- a/kernel/kthread.c~kthread-remove-incorrect-comment-in-kthread_create_on_cpu
+++ a/kernel/kthread.c
@@ -478,7 +478,6 @@ EXPORT_SYMBOL(kthread_bind);
* to "name.*%u". Code fills in cpu number.
*
* Description: This helper function creates and names a kernel thread
- * The thread will be woken and put into park mode.
*/
struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
void *data, unsigned int cpu,
_
Patches currently in -mm which might be from stamatis.iliass@gmail.com are
kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + lib-test_lockupc-make-symbol-test_works-static.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (37 preceding siblings ...)
2020-07-07 19:36 ` + kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch " Andrew Morton
@ 2020-07-07 19:37 ` Andrew Morton
2020-07-07 19:39 ` [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from " Andrew Morton
` (193 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:37 UTC (permalink / raw)
To: hulkci, mm-commits, weiyongjun1
The patch titled
Subject: lib/test_lockup.c: make symbol 'test_works' static
has been added to the -mm tree. Its filename is
lib-test_lockupc-make-symbol-test_works-static.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/lib-test_lockupc-make-symbol-test_works-static.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/lib-test_lockupc-make-symbol-test_works-static.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yongjun <weiyongjun1@huawei.com>
Subject: lib/test_lockup.c: make symbol 'test_works' static
Fix sparse build warning:
lib/test_lockup.c:403:1: warning:
symbol '__pcpu_scope_test_works' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20200707112252.9047-1-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
lib/test_lockup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/lib/test_lockup.c~lib-test_lockupc-make-symbol-test_works-static
+++ a/lib/test_lockup.c
@@ -400,7 +400,7 @@ static void test_lockup(bool master)
test_unlock(master, true);
}
-DEFINE_PER_CPU(struct work_struct, test_works);
+static DEFINE_PER_CPU(struct work_struct, test_works);
static void test_work_fn(struct work_struct *work)
{
_
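More generally, this is the usual pattern for a file-local per-CPU work
item: giving the definition internal linkage is exactly what sparse asks
for here. A small illustrative sketch (names invented for the example, not
from the patch):

#include <linux/percpu.h>
#include <linux/workqueue.h>
#include <linux/cpu.h>
#include <linux/smp.h>
#include <linux/printk.h>

/* 'static' keeps the per-CPU symbol file-local, silencing the sparse
 * missing-declaration warning. */
static DEFINE_PER_CPU(struct work_struct, demo_works);

static void demo_work_fn(struct work_struct *work)
{
	pr_info("demo work ran on CPU %d\n", raw_smp_processor_id());
}

static void demo_queue_on_all_cpus(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct work_struct *w = per_cpu_ptr(&demo_works, cpu);

		INIT_WORK(w, demo_work_fn);
		schedule_work_on(cpu, w);
	}
}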
Patches currently in -mm which might be from weiyongjun1@huawei.com are
mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
lib-test_lockupc-make-symbol-test_works-static.patch
bits-add-tests-of-genmask-fix-2.patch
kcov-make-some-symbols-static.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (38 preceding siblings ...)
2020-07-07 19:37 ` + lib-test_lockupc-make-symbol-test_works-static.patch " Andrew Morton
@ 2020-07-07 19:39 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
` (192 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:39 UTC (permalink / raw)
To: ben.dooks, bfields, cl, mm-commits, peterz, pmladek, qiang.zhang, tj
The patch titled
Subject: kthread: work could not be queued when worker being destroyed
has been removed from the -mm tree. Its filename was
kthread-work-could-not-be-queued-when-worker-being-destroyed.patch
This patch was dropped because it had testing failures
------------------------------------------------------
From: Zhang Qiang <qiang.zhang@windriver.com>
Subject: kthread: work could not be queued when worker being destroyed
The "queuing_blocked" func should print warning message and returns true
when the worker being destroyed.
Before the work is put into the queue of the worker thread, the state of
the worker thread needs to be detected,because the worker thread may be in
the destruction state at this time.
Link: http://lkml.kernel.org/r/20200705013018.7375-1-qiang.zhang@windriver.com
Link: http://lkml.kernel.org/r/20200702070156.5862-1-qiang.zhang@windriver.com
Signed-off-by: Zhang Qiang <qiang.zhang@windriver.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Liang Chen <cl@rock-chips.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/kthread.c | 3 +++
1 file changed, 3 insertions(+)
--- a/kernel/kthread.c~kthread-work-could-not-be-queued-when-worker-being-destroyed
+++ a/kernel/kthread.c
@@ -814,6 +814,9 @@ static inline bool queuing_blocked(struc
{
lockdep_assert_held(&worker->lock);
+ if (WARN_ON(!worker->task))
+ return true;
+
return !list_empty(&work->node) || work->canceling;
}
_
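For context, the check above sits on the queueing path of the kthread
worker API; the normal life cycle that path serves looks roughly like the
following (an illustrative sketch, not part of the patch):

#include <linux/kthread.h>
#include <linux/err.h>
#include <linux/printk.h>

static void demo_work_fn(struct kthread_work *work)
{
	pr_info("kthread work executed\n");
}

static int demo_run_once(void)
{
	struct kthread_worker *worker;
	struct kthread_work work;

	worker = kthread_create_worker(0, "demo-kworker");
	if (IS_ERR(worker))
		return PTR_ERR(worker);

	kthread_init_work(&work, demo_work_fn);

	/* Queueing is only valid while the worker's task exists; the patch
	 * adds a WARN_ON for queueing against a worker whose task is gone,
	 * e.g. while it is being destroyed. */
	kthread_queue_work(worker, &work);
	kthread_flush_work(&work);

	kthread_destroy_worker(worker);
	return 0;
}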
Patches currently in -mm which might be from qiang.zhang@windriver.com are
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (39 preceding siblings ...)
2020-07-07 19:39 ` [failures] kthread-work-could-not-be-queued-when-worker-being-destroyed.patch removed from " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
` (191 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/page_isolation: prefer the node of the source page
has been removed from the -mm tree. Its filename was
mm-page_isolation-prefer-the-node-of-the-source-page.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_isolation: prefer the node of the source page
Patch series "clean-up the migration target allocation functions", v3.
This patchset cleans up the migration target allocation functions.
Contributions of this patchset are:
1. unify the two hugetlb allocation functions so that only one remains.
2. turn the external hugetlb allocation function into an internal one.
3. unify the three functions for migration target allocation.
This patch (of 8):
For locality, it's better to migrate the page to the node of the source
page rather than to the node of the current caller's CPU.
Link: http://lkml.kernel.org/r/1592892828-1934-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1592892828-1934-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_isolation.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/page_isolation.c~mm-page_isolation-prefer-the-node-of-the-source-page
+++ a/mm/page_isolation.c
@@ -309,5 +309,7 @@ int test_pages_isolated(unsigned long st
struct page *alloc_migrate_target(struct page *page, unsigned long private)
{
- return new_page_nodemask(page, numa_node_id(), &node_states[N_MEMORY]);
+ int nid = page_to_nid(page);
+
+ return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
}
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (40 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-page_isolation-prefer-the-node-of-the-source-page.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
` (190 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/migrate: move migration helper from .h to .c
has been removed from the -mm tree. Its filename was
mm-migrate-move-migration-helper-from-h-to-c.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: move migration helper from .h to .c
This is not a performance-sensitive function, so move it from the header
to a .c file. This is a preparation step for a future change.
Link: http://lkml.kernel.org/r/1592892828-1934-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/migrate.h | 33 +++++----------------------------
mm/migrate.c | 29 +++++++++++++++++++++++++++++
2 files changed, 34 insertions(+), 28 deletions(-)
--- a/include/linux/migrate.h~mm-migrate-move-migration-helper-from-h-to-c
+++ a/include/linux/migrate.h
@@ -31,34 +31,6 @@ enum migrate_reason {
/* In mm/debug.c; also keep sync with include/trace/events/migrate.h */
extern const char *migrate_reason_names[MR_TYPES];
-static inline struct page *new_page_nodemask(struct page *page,
- int preferred_nid, nodemask_t *nodemask)
-{
- gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
- unsigned int order = 0;
- struct page *new_page = NULL;
-
- if (PageHuge(page))
- return alloc_huge_page_nodemask(page_hstate(compound_head(page)),
- preferred_nid, nodemask);
-
- if (PageTransHuge(page)) {
- gfp_mask |= GFP_TRANSHUGE;
- order = HPAGE_PMD_ORDER;
- }
-
- if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
- gfp_mask |= __GFP_HIGHMEM;
-
- new_page = __alloc_pages_nodemask(gfp_mask, order,
- preferred_nid, nodemask);
-
- if (new_page && PageTransHuge(new_page))
- prep_transhuge_page(new_page);
-
- return new_page;
-}
-
#ifdef CONFIG_MIGRATION
extern void putback_movable_pages(struct list_head *l);
@@ -67,6 +39,8 @@ extern int migrate_page(struct address_s
enum migrate_mode mode);
extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
unsigned long private, enum migrate_mode mode, int reason);
+extern struct page *new_page_nodemask(struct page *page,
+ int preferred_nid, nodemask_t *nodemask);
extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
extern void putback_movable_page(struct page *page);
@@ -85,6 +59,9 @@ static inline int migrate_pages(struct l
free_page_t free, unsigned long private, enum migrate_mode mode,
int reason)
{ return -ENOSYS; }
+static inline struct page *new_page_nodemask(struct page *page,
+ int preferred_nid, nodemask_t *nodemask)
+ { return NULL; }
static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
{ return -EBUSY; }
--- a/mm/migrate.c~mm-migrate-move-migration-helper-from-h-to-c
+++ a/mm/migrate.c
@@ -1513,6 +1513,35 @@ out:
return rc;
}
+struct page *new_page_nodemask(struct page *page,
+ int preferred_nid, nodemask_t *nodemask)
+{
+ gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+ unsigned int order = 0;
+ struct page *new_page = NULL;
+
+ if (PageHuge(page))
+ return alloc_huge_page_nodemask(
+ page_hstate(compound_head(page)),
+ preferred_nid, nodemask);
+
+ if (PageTransHuge(page)) {
+ gfp_mask |= GFP_TRANSHUGE;
+ order = HPAGE_PMD_ORDER;
+ }
+
+ if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+ gfp_mask |= __GFP_HIGHMEM;
+
+ new_page = __alloc_pages_nodemask(gfp_mask, order,
+ preferred_nid, nodemask);
+
+ if (new_page && PageTransHuge(new_page))
+ prep_transhuge_page(new_page);
+
+ return new_page;
+}
+
#ifdef CONFIG_NUMA
static int store_status(int __user *status, int start, int value, int nr)
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-hugetlb-unify-migration-callbacks.patch
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (41 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-migrate-move-migration-helper-from-h-to-c.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch " Andrew Morton
` (189 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/hugetlb: unify migration callbacks
has been removed from the -mm tree. Its filename was
mm-hugetlb-unify-migration-callbacks.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/hugetlb: unify migration callbacks
There is no difference between the two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except for the
__GFP_THISNODE handling.
This patch adds a gfp_mask argument to alloc_huge_page_nodemask() and
replaces the call site of alloc_huge_page_node() with a call to
alloc_huge_page_nodemask(..., __GFP_THISNODE).
It is safe to drop the NUMA_NO_NODE check from alloc_huge_page_node()
since no caller passes NUMA_NO_NODE as the node id.
Link: http://lkml.kernel.org/r/1592892828-1934-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/hugetlb.h | 11 +++--------
mm/hugetlb.c | 26 +++-----------------------
mm/mempolicy.c | 9 +++++----
mm/migrate.c | 5 +++--
4 files changed, 14 insertions(+), 37 deletions(-)
--- a/include/linux/hugetlb.h~mm-hugetlb-unify-migration-callbacks
+++ a/include/linux/hugetlb.h
@@ -504,9 +504,8 @@ struct huge_bootmem_page {
struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
-struct page *alloc_huge_page_node(struct hstate *h, int nid);
struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask);
+ nodemask_t *nmask, gfp_t gfp_mask);
struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
unsigned long address);
struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
@@ -759,13 +758,9 @@ static inline struct page *alloc_huge_pa
return NULL;
}
-static inline struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
- return NULL;
-}
-
static inline struct page *
-alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, nodemask_t *nmask)
+alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
+ nodemask_t *nmask, gfp_t gfp_mask)
{
return NULL;
}
--- a/mm/hugetlb.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/hugetlb.c
@@ -1980,30 +1980,10 @@ struct page *alloc_buddy_huge_page_with_
}
/* page migration callback function */
-struct page *alloc_huge_page_node(struct hstate *h, int nid)
-{
- gfp_t gfp_mask = htlb_alloc_mask(h);
- struct page *page = NULL;
-
- if (nid != NUMA_NO_NODE)
- gfp_mask |= __GFP_THISNODE;
-
- spin_lock(&hugetlb_lock);
- if (h->free_huge_pages - h->resv_huge_pages > 0)
- page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
- spin_unlock(&hugetlb_lock);
-
- if (!page)
- page = alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
-
- return page;
-}
-
-/* page migration callback function */
struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask)
+ nodemask_t *nmask, gfp_t gfp_mask)
{
- gfp_t gfp_mask = htlb_alloc_mask(h);
+ gfp_mask |= htlb_alloc_mask(h);
spin_lock(&hugetlb_lock);
if (h->free_huge_pages - h->resv_huge_pages > 0) {
@@ -2032,7 +2012,7 @@ struct page *alloc_huge_page_vma(struct
gfp_mask = htlb_alloc_mask(h);
node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
- page = alloc_huge_page_nodemask(h, node, nodemask);
+ page = alloc_huge_page_nodemask(h, node, nodemask, 0);
mpol_cond_put(mpol);
return page;
--- a/mm/mempolicy.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/mempolicy.c
@@ -1068,10 +1068,11 @@ static int migrate_page_add(struct page
/* page allocation callback for NUMA node migration */
struct page *alloc_new_node_page(struct page *page, unsigned long node)
{
- if (PageHuge(page))
- return alloc_huge_page_node(page_hstate(compound_head(page)),
- node);
- else if (PageTransHuge(page)) {
+ if (PageHuge(page)) {
+ return alloc_huge_page_nodemask(
+ page_hstate(compound_head(page)), node,
+ NULL, __GFP_THISNODE);
+ } else if (PageTransHuge(page)) {
struct page *thp;
thp = alloc_pages_node(node,
--- a/mm/migrate.c~mm-hugetlb-unify-migration-callbacks
+++ a/mm/migrate.c
@@ -1520,10 +1520,11 @@ struct page *new_page_nodemask(struct pa
unsigned int order = 0;
struct page *new_page = NULL;
- if (PageHuge(page))
+ if (PageHuge(page)) {
return alloc_huge_page_nodemask(
page_hstate(compound_head(page)),
- preferred_nid, nodemask);
+ preferred_nid, nodemask, 0);
+ }
if (PageTransHuge(page)) {
gfp_mask |= GFP_TRANSHUGE;
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (42 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-unify-migration-callbacks.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
` (188 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/hugetlb: make hugetlb migration callback CMA aware
has been removed from the -mm tree. Its filename was
mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/hugetlb: make hugetlb migration callback CMA aware
new_non_cma_page() in gup.c, which tries to allocate a migration target
page, must allocate a new page that is not in a CMA area.
new_non_cma_page() implements this by clearing the __GFP_MOVABLE flag.
That works well for THP and normal pages, but not for hugetlb pages.
hugetlb page allocation consists of two steps: first, dequeue a page from
the pool; second, if no page is available on the queue, allocate one from
the page allocator.
new_non_cma_page() can control allocation from the page allocator by
specifying the right gfp flags. However, dequeueing could not be
controlled until now, so new_non_cma_page() skips dequeueing completely.
That is suboptimal since new_non_cma_page() cannot use hugetlb pages
already sitting on the queue, and this patch fixes that.
This patch makes the hugetlb dequeue function CMA aware and skips CMA
pages when the newly added skip_cma argument is passed as true.
Link: http://lkml.kernel.org/r/1592892828-1934-5-git-send-email-iamjoonsoo.kim@lge.com
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/hugetlb.h | 6 ++----
mm/gup.c | 3 ++-
mm/hugetlb.c | 31 ++++++++++++++++++++++---------
mm/mempolicy.c | 2 +-
mm/migrate.c | 2 +-
5 files changed, 28 insertions(+), 16 deletions(-)
--- a/include/linux/hugetlb.h~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/include/linux/hugetlb.h
@@ -505,11 +505,9 @@ struct huge_bootmem_page {
struct page *alloc_huge_page(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve);
struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask, gfp_t gfp_mask);
+ nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma);
struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
unsigned long address);
-struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
- int nid, nodemask_t *nmask);
int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
pgoff_t idx);
@@ -760,7 +758,7 @@ static inline struct page *alloc_huge_pa
static inline struct page *
alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask, gfp_t gfp_mask)
+ nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma)
{
return NULL;
}
--- a/mm/gup.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/gup.c
@@ -1630,11 +1630,12 @@ static struct page *new_non_cma_page(str
#ifdef CONFIG_HUGETLB_PAGE
if (PageHuge(page)) {
struct hstate *h = page_hstate(page);
+
/*
* We don't want to dequeue from the pool because pool pages will
* mostly be from the CMA region.
*/
- return alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
+ return alloc_huge_page_nodemask(h, nid, NULL, gfp_mask, true);
}
#endif
if (PageTransHuge(page)) {
--- a/mm/hugetlb.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/hugetlb.c
@@ -1034,13 +1034,18 @@ static void enqueue_huge_page(struct hst
h->free_huge_pages_node[nid]++;
}
-static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
+static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid, bool skip_cma)
{
struct page *page;
- list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
+ list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
+ if (skip_cma && is_migrate_cma_page(page))
+ continue;
+
if (!PageHWPoison(page))
break;
+ }
+
/*
* if 'non-isolated free hugepage' not found on the list,
* the allocation fails.
@@ -1055,7 +1060,7 @@ static struct page *dequeue_huge_page_no
}
static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
- nodemask_t *nmask)
+ nodemask_t *nmask, bool skip_cma)
{
unsigned int cpuset_mems_cookie;
struct zonelist *zonelist;
@@ -1080,7 +1085,7 @@ retry_cpuset:
continue;
node = zone_to_nid(zone);
- page = dequeue_huge_page_node_exact(h, node);
+ page = dequeue_huge_page_node_exact(h, node, skip_cma);
if (page)
return page;
}
@@ -1125,7 +1130,7 @@ static struct page *dequeue_huge_page_vm
gfp_mask = htlb_alloc_mask(h);
nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
- page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+ page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask, false);
if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
SetPagePrivate(page);
h->resv_huge_pages--;
@@ -1938,7 +1943,7 @@ out_unlock:
return page;
}
-struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
+static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
int nid, nodemask_t *nmask)
{
struct page *page;
@@ -1981,7 +1986,7 @@ struct page *alloc_buddy_huge_page_with_
/* page migration callback function */
struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask, gfp_t gfp_mask)
+ nodemask_t *nmask, gfp_t gfp_mask, bool skip_cma)
{
gfp_mask |= htlb_alloc_mask(h);
@@ -1989,7 +1994,8 @@ struct page *alloc_huge_page_nodemask(st
if (h->free_huge_pages - h->resv_huge_pages > 0) {
struct page *page;
- page = dequeue_huge_page_nodemask(h, gfp_mask, preferred_nid, nmask);
+ page = dequeue_huge_page_nodemask(h, gfp_mask,
+ preferred_nid, nmask, skip_cma);
if (page) {
spin_unlock(&hugetlb_lock);
return page;
@@ -1997,6 +2003,13 @@ struct page *alloc_huge_page_nodemask(st
}
spin_unlock(&hugetlb_lock);
+ /*
+ * To skip the memory on CMA area, we need to clear __GFP_MOVABLE.
+ * Clearing __GFP_MOVABLE at the top of this function would also skip
+ * the proper allocation candidates for dequeue so clearing it here.
+ */
+ if (skip_cma)
+ gfp_mask &= ~__GFP_MOVABLE;
return alloc_migrate_huge_page(h, gfp_mask, preferred_nid, nmask);
}
@@ -2012,7 +2025,7 @@ struct page *alloc_huge_page_vma(struct
gfp_mask = htlb_alloc_mask(h);
node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
- page = alloc_huge_page_nodemask(h, node, nodemask, 0);
+ page = alloc_huge_page_nodemask(h, node, nodemask, 0, false);
mpol_cond_put(mpol);
return page;
--- a/mm/mempolicy.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/mempolicy.c
@@ -1071,7 +1071,7 @@ struct page *alloc_new_node_page(struct
if (PageHuge(page)) {
return alloc_huge_page_nodemask(
page_hstate(compound_head(page)), node,
- NULL, __GFP_THISNODE);
+ NULL, __GFP_THISNODE, false);
} else if (PageTransHuge(page)) {
struct page *thp;
--- a/mm/migrate.c~mm-hugetlb-make-hugetlb-migration-callback-cma-aware
+++ a/mm/migrate.c
@@ -1523,7 +1523,7 @@ struct page *new_page_nodemask(struct pa
if (PageHuge(page)) {
return alloc_huge_page_nodemask(
page_hstate(compound_head(page)),
- preferred_nid, nodemask, 0);
+ preferred_nid, nodemask, 0, false);
}
if (PageTransHuge(page)) {
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (43 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-hugetlb-make-hugetlb-migration-callback-cma-aware.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
` (187 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/migrate: make a standard migration target allocation function
has been removed from the -mm tree. Its filename was
mm-migrate-make-a-standard-migration-target-allocation-function.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/migrate: make a standard migration target allocation function
There are several similar functions for migration target allocation.
Since there is no fundamental difference between them, it's better to
keep just one rather than keeping all the variants. This patch implements
the base migration target allocation function; the following patches
convert the variants to use it.
Note that the PageHighMem() call in the previous function is replaced
with an open-coded is_highmem_idx() check, which is more readable.
Link: http://lkml.kernel.org/r/1592892828-1934-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/migrate.h | 5 +++--
mm/internal.h | 7 +++++++
mm/memory-failure.c | 8 ++++++--
mm/memory_hotplug.c | 14 +++++++++-----
mm/migrate.c | 21 +++++++++++++--------
mm/page_isolation.c | 8 ++++++--
6 files changed, 44 insertions(+), 19 deletions(-)
--- a/include/linux/migrate.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/include/linux/migrate.h
@@ -10,6 +10,8 @@
typedef struct page *new_page_t(struct page *page, unsigned long private);
typedef void free_page_t(struct page *page, unsigned long private);
+struct migration_target_control;
+
/*
* Return values from addresss_space_operations.migratepage():
* - negative errno on page migration failure;
@@ -39,8 +41,7 @@ extern int migrate_page(struct address_s
enum migrate_mode mode);
extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
unsigned long private, enum migrate_mode mode, int reason);
-extern struct page *new_page_nodemask(struct page *page,
- int preferred_nid, nodemask_t *nodemask);
+extern struct page *alloc_migration_target(struct page *page, unsigned long private);
extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
extern void putback_movable_page(struct page *page);
--- a/mm/internal.h~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/internal.h
@@ -614,4 +614,11 @@ static inline bool is_migrate_highatomic
void setup_zone_pageset(struct zone *zone);
extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
+
+struct migration_target_control {
+ int nid; /* preferred node id */
+ nodemask_t *nmask;
+ gfp_t gfp_mask;
+};
+
#endif /* __MM_INTERNAL_H */
--- a/mm/memory-failure.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory-failure.c
@@ -1648,9 +1648,13 @@ EXPORT_SYMBOL(unpoison_memory);
static struct page *new_page(struct page *p, unsigned long private)
{
- int nid = page_to_nid(p);
+ struct migration_target_control mtc = {
+ .nid = page_to_nid(p),
+ .nmask = &node_states[N_MEMORY],
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+ };
- return new_page_nodemask(p, nid, &node_states[N_MEMORY]);
+ return alloc_migration_target(p, (unsigned long)&mtc);
}
/*
--- a/mm/memory_hotplug.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/memory_hotplug.c
@@ -1267,19 +1267,23 @@ found:
static struct page *new_node_page(struct page *page, unsigned long private)
{
- int nid = page_to_nid(page);
nodemask_t nmask = node_states[N_MEMORY];
+ struct migration_target_control mtc = {
+ .nid = page_to_nid(page),
+ .nmask = &nmask,
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+ };
/*
* try to allocate from a different node but reuse this node if there
* are no other online nodes to be used (e.g. we are offlining a part
* of the only existing node)
*/
- node_clear(nid, nmask);
- if (nodes_empty(nmask))
- node_set(nid, nmask);
+ node_clear(mtc.nid, *mtc.nmask);
+ if (nodes_empty(*mtc.nmask))
+ node_set(mtc.nid, *mtc.nmask);
- return new_page_nodemask(page, nid, &nmask);
+ return alloc_migration_target(page, (unsigned long)&mtc);
}
static int
--- a/mm/migrate.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/migrate.c
@@ -1513,29 +1513,34 @@ out:
return rc;
}
-struct page *new_page_nodemask(struct page *page,
- int preferred_nid, nodemask_t *nodemask)
+struct page *alloc_migration_target(struct page *page, unsigned long private)
{
- gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL;
+ struct migration_target_control *mtc;
+ gfp_t gfp_mask;
unsigned int order = 0;
struct page *new_page = NULL;
+ int zidx;
+
+ mtc = (struct migration_target_control *)private;
+ gfp_mask = mtc->gfp_mask;
if (PageHuge(page)) {
return alloc_huge_page_nodemask(
- page_hstate(compound_head(page)),
- preferred_nid, nodemask, 0, false);
+ page_hstate(compound_head(page)), mtc->nid,
+ mtc->nmask, gfp_mask, false);
}
if (PageTransHuge(page)) {
+ gfp_mask &= ~__GFP_RECLAIM;
gfp_mask |= GFP_TRANSHUGE;
order = HPAGE_PMD_ORDER;
}
-
- if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
+ zidx = zone_idx(page_zone(page));
+ if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
gfp_mask |= __GFP_HIGHMEM;
new_page = __alloc_pages_nodemask(gfp_mask, order,
- preferred_nid, nodemask);
+ mtc->nid, mtc->nmask);
if (new_page && PageTransHuge(new_page))
prep_transhuge_page(new_page);
--- a/mm/page_isolation.c~mm-migrate-make-a-standard-migration-target-allocation-function
+++ a/mm/page_isolation.c
@@ -309,7 +309,11 @@ int test_pages_isolated(unsigned long st
struct page *alloc_migrate_target(struct page *page, unsigned long private)
{
- int nid = page_to_nid(page);
+ struct migration_target_control mtc = {
+ .nid = page_to_nid(page),
+ .nmask = &node_states[N_MEMORY],
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+ };
- return new_page_nodemask(page, nid, &node_states[N_MEMORY]);
+ return alloc_migration_target(page, (unsigned long)&mtc);
}
_
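The intended caller pattern, as the later conversions in this series show,
is an on-stack migration_target_control handed to migrate_pages() via the
private argument. A rough sketch assembled from the hunks above (node and
gfp choices vary per call site, and migration_target_control lives in
mm/internal.h, so this only builds inside mm/):

#include <linux/migrate.h>
#include <linux/mm.h>
#include <linux/nodemask.h>
#include "internal.h"

/* Sketch only: migrate the already-isolated pages on @pagelist towards
 * the node of @page. */
static int demo_migrate_isolated(struct page *page, struct list_head *pagelist)
{
	struct migration_target_control mtc = {
		.nid = page_to_nid(page),
		.nmask = &node_states[N_MEMORY],
		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
	};
	int err;

	err = migrate_pages(pagelist, alloc_migration_target, NULL,
			    (unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE);
	if (err)
		putback_movable_pages(pagelist);
	return err;
}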
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-gup-use-a-standard-migration-target-allocation-callback.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (44 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-migrate-make-a-standard-migration-target-allocation-function.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
` (186 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/gup: use a standard migration target allocation callback
has been removed from the -mm tree. Its filename was
mm-gup-use-a-standard-migration-target-allocation-callback.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/gup: use a standard migration target allocation callback
There is a well-defined migration target allocation callback. It is
mostly the same as new_non_cma_page() except for the handling of CMA
pages.
This patch adds CMA handling to the standard migration target allocation
callback and uses it in gup.c.
Link: http://lkml.kernel.org/r/1592892828-1934-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/gup.c | 57 ++++++------------------------------------------
mm/internal.h | 1
mm/migrate.c | 4 ++-
3 files changed, 12 insertions(+), 50 deletions(-)
--- a/mm/gup.c~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/gup.c
@@ -1608,56 +1608,15 @@ static bool check_dax_vmas(struct vm_are
}
#ifdef CONFIG_CMA
-static struct page *new_non_cma_page(struct page *page, unsigned long private)
+static struct page *alloc_migration_target_non_cma(struct page *page, unsigned long private)
{
- /*
- * We want to make sure we allocate the new page from the same node
- * as the source page.
- */
- int nid = page_to_nid(page);
- /*
- * Trying to allocate a page for migration. Ignore allocation
- * failure warnings. We don't force __GFP_THISNODE here because
- * this node here is the node where we have CMA reservation and
- * in some case these nodes will have really less non movable
- * allocation memory.
- */
- gfp_t gfp_mask = GFP_USER | __GFP_NOWARN;
-
- if (PageHighMem(page))
- gfp_mask |= __GFP_HIGHMEM;
-
-#ifdef CONFIG_HUGETLB_PAGE
- if (PageHuge(page)) {
- struct hstate *h = page_hstate(page);
+ struct migration_target_control mtc = {
+ .nid = page_to_nid(page),
+ .gfp_mask = GFP_USER | __GFP_NOWARN,
+ .skip_cma = true,
+ };
- /*
- * We don't want to dequeue from the pool because pool pages will
- * mostly be from the CMA region.
- */
- return alloc_huge_page_nodemask(h, nid, NULL, gfp_mask, true);
- }
-#endif
- if (PageTransHuge(page)) {
- struct page *thp;
- /*
- * ignore allocation failure warnings
- */
- gfp_t thp_gfpmask = GFP_TRANSHUGE | __GFP_NOWARN;
-
- /*
- * Remove the movable mask so that we don't allocate from
- * CMA area again.
- */
- thp_gfpmask &= ~__GFP_MOVABLE;
- thp = __alloc_pages_node(nid, thp_gfpmask, HPAGE_PMD_ORDER);
- if (!thp)
- return NULL;
- prep_transhuge_page(thp);
- return thp;
- }
-
- return __alloc_pages_node(nid, gfp_mask, 0);
+ return alloc_migration_target(page, (unsigned long)&mtc);
}
static long check_and_migrate_cma_pages(struct task_struct *tsk,
@@ -1719,7 +1678,7 @@ check_again:
for (i = 0; i < nr_pages; i++)
put_page(pages[i]);
- if (migrate_pages(&cma_page_list, new_non_cma_page,
+ if (migrate_pages(&cma_page_list, alloc_migration_target_non_cma,
NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
/*
* some of the pages failed migration. Do get_user_pages
--- a/mm/internal.h~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/internal.h
@@ -619,6 +619,7 @@ struct migration_target_control {
int nid; /* preferred node id */
nodemask_t *nmask;
gfp_t gfp_mask;
+ bool skip_cma;
};
#endif /* __MM_INTERNAL_H */
--- a/mm/migrate.c~mm-gup-use-a-standard-migration-target-allocation-callback
+++ a/mm/migrate.c
@@ -1527,7 +1527,7 @@ struct page *alloc_migration_target(stru
if (PageHuge(page)) {
return alloc_huge_page_nodemask(
page_hstate(compound_head(page)), mtc->nid,
- mtc->nmask, gfp_mask, false);
+ mtc->nmask, gfp_mask, mtc->skip_cma);
}
if (PageTransHuge(page)) {
@@ -1538,6 +1538,8 @@ struct page *alloc_migration_target(stru
zidx = zone_idx(page_zone(page));
if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
gfp_mask |= __GFP_HIGHMEM;
+ if (mtc->skip_cma)
+ gfp_mask &= ~__GFP_MOVABLE;
new_page = __alloc_pages_nodemask(gfp_mask, order,
mtc->nid, mtc->nmask);
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (45 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-gup-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
` (185 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/mempolicy: use a standard migration target allocation callback
has been removed from the -mm tree. Its filename was
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/mempolicy: use a standard migration target allocation callback
There is a well-defined migration target allocation callback. Use it.
Link: http://lkml.kernel.org/r/1592892828-1934-8-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/internal.h | 1 -
mm/mempolicy.c | 30 ++++++------------------------
mm/migrate.c | 8 ++++++--
3 files changed, 12 insertions(+), 27 deletions(-)
--- a/mm/internal.h~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/internal.h
@@ -613,7 +613,6 @@ static inline bool is_migrate_highatomic
}
void setup_zone_pageset(struct zone *zone);
-extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
struct migration_target_control {
int nid; /* preferred node id */
--- a/mm/mempolicy.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/mempolicy.c
@@ -1065,28 +1065,6 @@ static int migrate_page_add(struct page
return 0;
}
-/* page allocation callback for NUMA node migration */
-struct page *alloc_new_node_page(struct page *page, unsigned long node)
-{
- if (PageHuge(page)) {
- return alloc_huge_page_nodemask(
- page_hstate(compound_head(page)), node,
- NULL, __GFP_THISNODE, false);
- } else if (PageTransHuge(page)) {
- struct page *thp;
-
- thp = alloc_pages_node(node,
- (GFP_TRANSHUGE | __GFP_THISNODE),
- HPAGE_PMD_ORDER);
- if (!thp)
- return NULL;
- prep_transhuge_page(thp);
- return thp;
- } else
- return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE |
- __GFP_THISNODE, 0);
-}
-
/*
* Migrate pages from one node to a target node.
* Returns error or the number of pages not migrated.
@@ -1097,6 +1075,10 @@ static int migrate_to_node(struct mm_str
nodemask_t nmask;
LIST_HEAD(pagelist);
int err = 0;
+ struct migration_target_control mtc = {
+ .nid = dest,
+ .gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+ };
nodes_clear(nmask);
node_set(source, nmask);
@@ -1111,8 +1093,8 @@ static int migrate_to_node(struct mm_str
flags | MPOL_MF_DISCONTIG_OK, &pagelist);
if (!list_empty(&pagelist)) {
- err = migrate_pages(&pagelist, alloc_new_node_page, NULL, dest,
- MIGRATE_SYNC, MR_SYSCALL);
+ err = migrate_pages(&pagelist, alloc_migration_target, NULL,
+ (unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
if (err)
putback_movable_pages(&pagelist);
}
--- a/mm/migrate.c~mm-mempolicy-use-a-standard-migration-target-allocation-callback
+++ a/mm/migrate.c
@@ -1567,9 +1567,13 @@ static int do_move_pages_to_node(struct
struct list_head *pagelist, int node)
{
int err;
+ struct migration_target_control mtc = {
+ .nid = node,
+ .gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+ };
- err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
- MIGRATE_SYNC, MR_SYSCALL);
+ err = migrate_pages(pagelist, alloc_migration_target, NULL,
+ (unsigned long)&mtc, MIGRATE_SYNC, MR_SYSCALL);
if (err)
putback_movable_pages(pagelist);
return err;
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (46 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch " Andrew Morton
@ 2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:47 ` Andrew Morton
2020-07-07 19:56 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to " Andrew Morton
` (184 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:47 UTC (permalink / raw)
To: guro, hch, iamjoonsoo.kim, mgorman, mhocko, mike.kravetz,
mm-commits, n-horiguchi, vbabka
The patch titled
Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()
has been removed from the -mm tree. Its filename was
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: mm/page_alloc: remove a wrapper for alloc_migration_target()
There is a well-defined standard migration target callback. Use it
directly.
Link: http://lkml.kernel.org/r/1592892828-1934-9-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 9 +++++++--
mm/page_isolation.c | 11 -----------
2 files changed, 7 insertions(+), 13 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_alloc.c
@@ -8354,6 +8354,11 @@ static int __alloc_contig_migrate_range(
unsigned long pfn = start;
unsigned int tries = 0;
int ret = 0;
+ struct migration_target_control mtc = {
+ .nid = zone_to_nid(cc->zone),
+ .nmask = &node_states[N_MEMORY],
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+ };
migrate_prep();
@@ -8380,8 +8385,8 @@ static int __alloc_contig_migrate_range(
&cc->migratepages);
cc->nr_migratepages -= nr_reclaimed;
- ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
- NULL, 0, cc->mode, MR_CONTIG_RANGE);
+ ret = migrate_pages(&cc->migratepages, alloc_migration_target,
+ NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
}
if (ret < 0) {
putback_movable_pages(&cc->migratepages);
--- a/mm/page_isolation.c~mm-page_alloc-remove-a-wrapper-for-alloc_migration_target
+++ a/mm/page_isolation.c
@@ -306,14 +306,3 @@ int test_pages_isolated(unsigned long st
return pfn < end_pfn ? -EBUSY : 0;
}
-
-struct page *alloc_migrate_target(struct page *page, unsigned long private)
-{
- struct migration_target_control mtc = {
- .nid = page_to_nid(page),
- .nmask = &node_states[N_MEMORY],
- .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
- };
-
- return alloc_migration_target(page, (unsigned long)&mtc);
-}
_
Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are
^ permalink raw reply	[flat|nested] 247+ messages in thread
* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (47 preceding siblings ...)
2020-07-07 19:47 ` [to-be-updated] mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch " Andrew Morton
@ 2020-07-07 19:56 ` Andrew Morton
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from " Andrew Morton
` (183 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 19:56 UTC (permalink / raw)
To: guro, jonathan.cameron, mike.kravetz, mm-commits, rppt, song.bao.hua
The patch titled
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been added to the -mm tree. Its filename is
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enable
hugetlb_cma[0] can be NULL for various reasons, for example, node0 may have
no memory. So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is not
enabled; gigantic pages might have been reserved on other nodes.
Link: http://lkml.kernel.org/r/20200707040204.30132-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable
+++ a/mm/hugetlb.c
@@ -2547,6 +2547,20 @@ static void __init gather_bootmem_preall
}
}
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+ int node;
+
+ for_each_online_node(node) {
+ if (hugetlb_cma[node])
+ return true;
+ }
+#endif
+
+ return false;
+}
+
static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
{
unsigned long i;
@@ -2572,7 +2586,7 @@ static void __init hugetlb_hstate_alloc_
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+ if (hugetlb_cma_enabled()) {
pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
break;
}
_
Patches currently in -mm which might be from song.bao.hua@hisilicon.com are
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (48 preceding siblings ...)
2020-07-07 19:56 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch added to " Andrew Morton
@ 2020-07-07 20:11 ` Andrew Morton
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch " Andrew Morton
` (182 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:11 UTC (permalink / raw)
To: anshuman.khandual, hughd, mm-commits
The patch titled
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix
has been removed from the -mm tree. Its filename was
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch
This patch was dropped because it was folded into mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix
Fix 5.7-rc6-mm1 page migration crash in unmap_and_move(): when the
page to be migrated has been freed from under us, that is considered
a MIGRATEPAGE_SUCCESS, but no newpage has been allocated (and I don't
think it would ever need to be counted as a successful THP migration).
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005210643340.482@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix
+++ a/mm/migrate.c
@@ -1245,7 +1245,7 @@ out:
* we want to retry.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
- if (PageTransHuge(newpage))
+ if (newpage && PageTransHuge(newpage))
thp_migration_success(true);
put_page(page);
if (reason == MR_MEMORY_FAILURE) {
_
Patches currently in -mm which might be from hughd@google.com are
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (49 preceding siblings ...)
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch removed from " Andrew Morton
@ 2020-07-07 20:11 ` Andrew Morton
2020-07-07 20:12 ` [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch " Andrew Morton
` (181 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:11 UTC (permalink / raw)
To: anshuman.khandual, hughd, jhubbard, mm-commits, n-horiguchi, ziy
The patch titled
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update
has been removed from the -mm tree. Its filename was
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch
This patch was dropped because it was folded into mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update
rename thp_migration_success() to thp_pmd_migration_success() per John
Link: http://lkml.kernel.org/r/1590118444-21601-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/migrate.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update
+++ a/mm/migrate.c
@@ -1172,7 +1172,7 @@ out:
#endif
#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
-static inline void thp_migration_success(bool success)
+static inline void thp_pmd_migration_success(bool success)
{
if (success)
count_vm_event(THP_PMD_MIGRATION_SUCCESS);
@@ -1180,7 +1180,9 @@ static inline void thp_migration_success
count_vm_event(THP_PMD_MIGRATION_FAILURE);
}
#else
-static inline void thp_migration_success(bool success) { }
+static inline void thp_pmd_migration_success(bool success)
+{
+}
#endif
/*
@@ -1245,8 +1247,14 @@ out:
* we want to retry.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
+ /*
+ * When the page to be migrated has been freed from under
+ * us, that is considered a MIGRATEPAGE_SUCCESS, but no
+ * newpage has been allocated. It should not be counted
+ * as a successful THP migration.
+ */
if (newpage && PageTransHuge(newpage))
- thp_migration_success(true);
+ thp_pmd_migration_success(true);
put_page(page);
if (reason == MR_MEMORY_FAILURE) {
/*
@@ -1489,7 +1497,7 @@ retry:
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
- thp_migration_success(false);
+ thp_pmd_migration_success(false);
goto retry;
}
}
_
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (50 preceding siblings ...)
2020-07-07 20:11 ` [folded-merged] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-update.patch " Andrew Morton
@ 2020-07-07 20:12 ` Andrew Morton
2020-07-07 20:13 ` + mm-vmstat-add-events-for-thp-migration-without-split.patch added to " Andrew Morton
` (180 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:12 UTC (permalink / raw)
To: aarcange, anshuman.khandual, cai, daniel.m.jordan, hannes, hughd,
jhubbard, kirill.shutemov, mhocko, mm-commits, n-horiguchi,
yang.shi, ziy
The patch titled
Subject: mm/vmstat: add events for PMD based THP migration without split
has been removed from the -mm tree. Its filename was
mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vmstat: add events for PMD based THP migration without split
This adds the following two new VM events which will help in validating
PMD based THP migration without split. Statistics reported through these
events will help in performance debugging.
1. THP_PMD_MIGRATION_SUCCESS
2. THP_PMD_MIGRATION_FAILURE
[hughd@google.com: fix page migration crash in unmap_and_move()]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005210643340.482@eggly.anvils
[anshuman.khandual@arm.com: rename thp_migration_success() to thp_pmd_migration_success() per John]
Link: http://lkml.kernel.org/r/1590118444-21601-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1589784156-28831-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/vm_event_item.h | 4 ++++
mm/migrate.c | 23 +++++++++++++++++++++++
mm/vmstat.c | 4 ++++
3 files changed, 31 insertions(+)
--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/include/linux/vm_event_item.h
@@ -95,6 +95,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
THP_ZERO_PAGE_ALLOC_FAILED,
THP_SWPOUT,
THP_SWPOUT_FALLBACK,
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ THP_PMD_MIGRATION_SUCCESS,
+ THP_PMD_MIGRATION_FAILURE,
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
BALLOON_INFLATE,
--- a/mm/migrate.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/mm/migrate.c
@@ -1171,6 +1171,20 @@ out:
#define ICE_noinline
#endif
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline void thp_pmd_migration_success(bool success)
+{
+ if (success)
+ count_vm_event(THP_PMD_MIGRATION_SUCCESS);
+ else
+ count_vm_event(THP_PMD_MIGRATION_FAILURE);
+}
+#else
+static inline void thp_pmd_migration_success(bool success)
+{
+}
+#endif
+
/*
* Obtain the lock on page, remove all ptes and migrate the page
* to the newly allocated page in newpage.
@@ -1233,6 +1247,14 @@ out:
* we want to retry.
*/
if (rc == MIGRATEPAGE_SUCCESS) {
+ /*
+ * When the page to be migrated has been freed from under
+ * us, that is considered a MIGRATEPAGE_SUCCESS, but no
+ * newpage has been allocated. It should not be counted
+ * as a successful THP migration.
+ */
+ if (newpage && PageTransHuge(newpage))
+ thp_pmd_migration_success(true);
put_page(page);
if (reason == MR_MEMORY_FAILURE) {
/*
@@ -1475,6 +1497,7 @@ retry:
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
+ thp_pmd_migration_success(false);
goto retry;
}
}
--- a/mm/vmstat.c~mm-vmstat-add-events-for-pmd-based-thp-migration-without-split
+++ a/mm/vmstat.c
@@ -1320,6 +1320,10 @@ const char * const vmstat_text[] = {
"thp_zero_page_alloc_failed",
"thp_swpout",
"thp_swpout_fallback",
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ "thp_pmd_migration_success",
+ "thp_pmd_migration_failure",
+#endif
#endif
#ifdef CONFIG_MEMORY_BALLOON
"balloon_inflate",
_
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-vmstat-add-events-for-thp-migration-without-split.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (51 preceding siblings ...)
2020-07-07 20:12 ` [to-be-updated] mm-vmstat-add-events-for-pmd-based-thp-migration-without-split.patch " Andrew Morton
@ 2020-07-07 20:13 ` Andrew Morton
2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
` (179 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 20:13 UTC (permalink / raw)
To: anshuman.khandual, daniel.m.jordan, hughd, jhubbard, mm-commits,
n-horiguchi, willy, ziy
The patch titled
Subject: mm/vmstat: add events for THP migration without split
has been added to the -mm tree. Its filename is
mm-vmstat-add-events-for-thp-migration-without-split.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vmstat: add events for THP migration without split
Add the following new vmstat events which will help in validating THP
migration without split. Statistics reported through these new VM events
will help in performance debugging.
1. THP_MIGRATION_SUCCESS
2. THP_MIGRATION_FAILURE
3. THP_MIGRATION_SPLIT
In addition, these new events also update the normal page migration statistics
appropriately via PGMIGRATE_SUCCESS and PGMIGRATE_FAIL. While here, this
updates the existing trace event 'mm_migrate_pages' to accommodate the now
available THP statistics.
Link: http://lkml.kernel.org/r/1594080415-27924-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/vm/page_migration.rst | 19 ++++++++++
include/linux/vm_event_item.h | 3 +
include/trace/events/migrate.h | 17 +++++++--
mm/migrate.c | 49 +++++++++++++++++++++++---
mm/vmstat.c | 3 +
5 files changed, 84 insertions(+), 7 deletions(-)
--- a/Documentation/vm/page_migration.rst~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/Documentation/vm/page_migration.rst
@@ -253,5 +253,24 @@ which are function pointers of struct ad
PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
for own purpose.
+Quantifying Migration
+=====================
+Following events can be used to quantify page migration.
+
+1. PGMIGRATE_SUCCESS /* Normal page migration success */
+2. PGMIGRATE_FAIL /* Normal page migration failure */
+3. THP_MIGRATION_SUCCESS /* Transparent huge page migration success */
+4. THP_MIGRATION_FAILURE /* Transparent huge page migration failure */
+5. THP_MIGRATION_SPLIT /* Transparent huge page got split, retried */
+
+THP_MIGRATION_SUCCESS is when THP is migrated successfully without getting
+split into it's subpages. THP_MIGRATION_FAILURE is when THP could neither
+be migrated nor be split. THP_MIGRATION_SPLIT is when THP could not
+just be migrated as is but instead get split into it's subpages and later
+retried as normal pages. THP events would also update normal page migration
+statistics PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE. These events will help
+in quantifying and analyzing various THP migration events including both
+success and failure cases.
+
Christoph Lameter, May 8, 2006.
Minchan Kim, Mar 28, 2016.
--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/include/linux/vm_event_item.h
@@ -95,6 +95,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
THP_ZERO_PAGE_ALLOC_FAILED,
THP_SWPOUT,
THP_SWPOUT_FALLBACK,
+ THP_MIGRATION_SUCCESS,
+ THP_MIGRATION_FAILURE,
+ THP_MIGRATION_SPLIT,
#endif
#ifdef CONFIG_MEMORY_BALLOON
BALLOON_INFLATE,
--- a/include/trace/events/migrate.h~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/include/trace/events/migrate.h
@@ -46,13 +46,18 @@ MIGRATE_REASON
TRACE_EVENT(mm_migrate_pages,
TP_PROTO(unsigned long succeeded, unsigned long failed,
- enum migrate_mode mode, int reason),
+ unsigned long thp_succeeded, unsigned long thp_failed,
+ unsigned long thp_split, enum migrate_mode mode, int reason),
- TP_ARGS(succeeded, failed, mode, reason),
+ TP_ARGS(succeeded, failed, thp_succeeded, thp_failed,
+ thp_split, mode, reason),
TP_STRUCT__entry(
__field( unsigned long, succeeded)
__field( unsigned long, failed)
+ __field( unsigned long, thp_succeeded)
+ __field( unsigned long, thp_failed)
+ __field( unsigned long, thp_split)
__field( enum migrate_mode, mode)
__field( int, reason)
),
@@ -60,13 +65,19 @@ TRACE_EVENT(mm_migrate_pages,
TP_fast_assign(
__entry->succeeded = succeeded;
__entry->failed = failed;
+ __entry->thp_succeeded = thp_succeeded;
+ __entry->thp_failed = thp_failed;
+ __entry->thp_split = thp_split;
__entry->mode = mode;
__entry->reason = reason;
),
- TP_printk("nr_succeeded=%lu nr_failed=%lu mode=%s reason=%s",
+ TP_printk("nr_succeeded=%lu nr_failed=%lu nr_thp_succeeded=%lu nr_thp_failed=%lu nr_thp_split=%lu mode=%s reason=%s",
__entry->succeeded,
__entry->failed,
+ __entry->thp_succeeded,
+ __entry->thp_failed,
+ __entry->thp_split,
__print_symbolic(__entry->mode, MIGRATE_MODE),
__print_symbolic(__entry->reason, MIGRATE_REASON))
);
--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/mm/migrate.c
@@ -1429,22 +1429,35 @@ int migrate_pages(struct list_head *from
enum migrate_mode mode, int reason)
{
int retry = 1;
+ int thp_retry = 1;
int nr_failed = 0;
int nr_succeeded = 0;
+ int nr_thp_succeeded = 0;
+ int nr_thp_failed = 0;
+ int nr_thp_split = 0;
int pass = 0;
+ bool is_thp = false;
struct page *page;
struct page *page2;
int swapwrite = current->flags & PF_SWAPWRITE;
- int rc;
+ int rc, thp_nr_pages;
if (!swapwrite)
current->flags |= PF_SWAPWRITE;
- for(pass = 0; pass < 10 && retry; pass++) {
+ for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
retry = 0;
+ thp_retry = 0;
list_for_each_entry_safe(page, page2, from, lru) {
retry:
+ /*
+ * THP statistics is based on the source huge page.
+ * Capture required information that might get lost
+ * during migration.
+ */
+ is_thp = PageTransHuge(page);
+ thp_nr_pages = hpage_nr_pages(page);
cond_resched();
if (PageHuge(page))
@@ -1475,15 +1488,30 @@ retry:
unlock_page(page);
if (!rc) {
list_safe_reset_next(page, page2, lru);
+ nr_thp_split++;
goto retry;
}
}
+ if (is_thp) {
+ nr_thp_failed++;
+ nr_failed += thp_nr_pages;
+ goto out;
+ }
nr_failed++;
goto out;
case -EAGAIN:
+ if (is_thp) {
+ thp_retry++;
+ break;
+ }
retry++;
break;
case MIGRATEPAGE_SUCCESS:
+ if (is_thp) {
+ nr_thp_succeeded++;
+ nr_succeeded += thp_nr_pages;
+ break;
+ }
nr_succeeded++;
break;
default:
@@ -1493,19 +1521,32 @@ retry:
* removed from migration page list and not
* retried in the next outer loop.
*/
+ if (is_thp) {
+ nr_thp_failed++;
+ nr_failed += thp_nr_pages;
+ break;
+ }
nr_failed++;
break;
}
}
}
- nr_failed += retry;
+ nr_failed += retry + thp_retry;
+ nr_thp_failed += thp_retry;
rc = nr_failed;
out:
if (nr_succeeded)
count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
if (nr_failed)
count_vm_events(PGMIGRATE_FAIL, nr_failed);
- trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
+ if (nr_thp_succeeded)
+ count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
+ if (nr_thp_failed)
+ count_vm_events(THP_MIGRATION_FAILURE, nr_thp_failed);
+ if (nr_thp_split)
+ count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
+ trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
+ nr_thp_failed, nr_thp_split, mode, reason);
if (!swapwrite)
current->flags &= ~PF_SWAPWRITE;
--- a/mm/vmstat.c~mm-vmstat-add-events-for-thp-migration-without-split
+++ a/mm/vmstat.c
@@ -1320,6 +1320,9 @@ const char * const vmstat_text[] = {
"thp_zero_page_alloc_failed",
"thp_swpout",
"thp_swpout_fallback",
+ "thp_migration_success",
+ "thp_migration_failure",
+ "thp_migration_split",
#endif
#ifdef CONFIG_MEMORY_BALLOON
"balloon_inflate",
_
Patches currently in -mm which might be from anshuman.khandual@arm.com are
mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
mm-vmstat-add-events-for-thp-migration-without-split.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (52 preceding siblings ...)
2020-07-07 20:13 ` + mm-vmstat-add-events-for-thp-migration-without-split.patch added to " Andrew Morton
@ 2020-07-07 22:18 ` Andrew Morton
2020-07-08 21:48 ` + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch " Andrew Morton
` (178 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-07 22:18 UTC (permalink / raw)
To: alex.shi, hannes, hughd, mhocko, mm-commits, shakeelb, stable
The patch titled
Subject: mm/memcg: fix refcount error while moving and swapping
has been added to the -mm tree. Its filename is
mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm/memcg: fix refcount error while moving and swapping
It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.
This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing the id refcount down to 0 (which should only be possible when
offlining).
Just skip that optimization: do that part of the accounting immediately.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c~mm-memcg-fix-refcount-error-while-moving-and-swapping
+++ a/mm/memcontrol.c
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
if (!mem_cgroup_is_root(mc.to))
page_counter_uncharge(&mc.to->memory, mc.moved_swap);
- mem_cgroup_id_get_many(mc.to, mc.moved_swap);
css_put_many(&mc.to->css, mc.moved_swap);
mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put: /* get_mctgt_type() gets the page
ent = target.ent;
if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
mc.precharge--;
- /* we fixup refcnts and charges later. */
+ mem_cgroup_id_get_many(mc.to, 1);
+ /* we fixup other refcnts and charges later. */
mc.moved_swap++;
}
break;
_
Patches currently in -mm which might be from hughd@google.com are
mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (53 preceding siblings ...)
2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
@ 2020-07-08 21:48 ` Andrew Morton
2020-07-08 21:50 ` + kbuild-move-wtype-limits-to-w=2.patch " Andrew Morton
` (177 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 21:48 UTC (permalink / raw)
To: grandmaster, hirofumi, mm-commits
The patch titled
Subject: VFAT/FAT/MSDOS FILESYSTEM: Replace HTTP links with HTTPS ones
has been added to the -mm tree. Its filename is
vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Alexander A. Klimov" <grandmaster@al2klimov.de>
Subject: VFAT/FAT/MSDOS FILESYSTEM: Replace HTTP links with HTTPS ones
Rationale:
Reduces the attack surface for MITM against kernel developers opening the
links, as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.
Link: http://lkml.kernel.org/r/20200708200409.22293-1-grandmaster@al2klimov.de
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/fat/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fat/Kconfig~vfat-fat-msdos-filesystem-replace-http-links-with-https-ones
+++ a/fs/fat/Kconfig
@@ -41,7 +41,7 @@ config MSDOS_FS
they are compressed; to access compressed MSDOS partitions under
Linux, you can either use the DOS emulator DOSEMU, described in the
DOSEMU-HOWTO, available from
- <http://www.tldp.org/docs.html#howto>, or try dmsdosfs in
+ <https://www.tldp.org/docs.html#howto>, or try dmsdosfs in
<ftp://ibiblio.org/pub/Linux/system/filesystems/dosfs/>. If you
intend to use dosemu with a non-compressed MSDOS partition, say Y
here) and MSDOS floppies. This means that file access becomes
_
Patches currently in -mm which might be from grandmaster@al2klimov.de are
vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kbuild-move-wtype-limits-to-w=2.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (54 preceding siblings ...)
2020-07-08 21:48 ` + vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch " Andrew Morton
@ 2020-07-08 21:50 ` Andrew Morton
2020-07-08 22:17 ` [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from " Andrew Morton
` (176 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 21:50 UTC (permalink / raw)
To: andy.shevchenko, arnd, emil.l.velikov, geert, keescook,
linus.walleij, michal.lkml, mm-commits, rikard.falkeborn,
syednwaris, vilhelm.gray, yamada.masahiro
The patch titled
Subject: kbuild: move -Wtype-limits to W=2
has been added to the -mm tree. Its filename is
kbuild-move-wtype-limits-to-w=2.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kbuild-move-wtype-limits-to-w%3D2.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kbuild-move-wtype-limits-to-w%3D2.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Subject: kbuild: move -Wtype-limits to W=2
-Wtype-limits is included in -Wextra, which is added at W=1. It warns
(among other things) that a comparison of an unsigned variable with `< 0`
is always false. This causes noisy warnings, especially when the comparison
is used in macros, hence the option is more suitable for W=2.
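As an illustration (this example is not from the patch), here is a minimal
user-space sketch of the kind of macro use that triggers the warning even
though the code is correct for signed arguments:

/* Build with: gcc -Wextra -c wtype-limits-demo.c
 * (-Wextra enables -Wtype-limits, i.e. what the kernel gets at W=1)
 */
#include <stddef.h>

/* A clamping macro intended to work for both signed and unsigned types. */
#define CLAMP_TO_ZERO(x)	((x) < 0 ? 0 : (x))

size_t sanitize(size_t nr)
{
	/*
	 * nr is unsigned, so "(nr) < 0" is always false; -Wtype-limits warns
	 * about the comparison even though the macro itself is not buggy.
	 */
	return CLAMP_TO_ZERO(nr);
}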
Link: http://lkml.kernel.org/r/20200708190756.16810-1-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Syed Nayyar Waris <syednwaris@gmail.com>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
scripts/Makefile.extrawarn | 2 ++
1 file changed, 2 insertions(+)
--- a/scripts/Makefile.extrawarn~kbuild-move-wtype-limits-to-w=2
+++ a/scripts/Makefile.extrawarn
@@ -35,6 +35,7 @@ KBUILD_CFLAGS += $(call cc-option, -Wstr
# The following turn off the warnings enabled by -Wextra
KBUILD_CFLAGS += -Wno-missing-field-initializers
KBUILD_CFLAGS += -Wno-sign-compare
+KBUILD_CFLAGS += -Wno-type-limits
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
@@ -66,6 +67,7 @@ KBUILD_CFLAGS += -Wshadow
KBUILD_CFLAGS += $(call cc-option, -Wlogical-op)
KBUILD_CFLAGS += -Wmissing-field-initializers
KBUILD_CFLAGS += -Wsign-compare
+KBUILD_CFLAGS += -Wtype-limits
KBUILD_CFLAGS += $(call cc-option, -Wmaybe-uninitialized)
KBUILD_CFLAGS += $(call cc-option, -Wunused-macros)
_
Patches currently in -mm which might be from rikard.falkeborn@gmail.com are
kbuild-move-wtype-limits-to-w=2.patch
bits-add-tests-of-genmask.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (55 preceding siblings ...)
2020-07-08 21:50 ` + kbuild-move-wtype-limits-to-w=2.patch " Andrew Morton
@ 2020-07-08 22:17 ` Andrew Morton
2020-07-08 22:20 ` + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to " Andrew Morton
` (175 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:17 UTC (permalink / raw)
To: bigeasy, colin.king, davem, ddstreet, herbert, lgoncalv,
mahipalreddy2006, mm-commits, sjenning, song.bao.hua,
vitaly.wool, wangzhou1
The patch titled
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
has been removed from the -mm tree. Its filename was
mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/zswap: move to use crypto_acomp API for hardware acceleration
Right now, all new ZIP drivers use the crypto_acomp APIs rather than the
legacy crypto_comp APIs, but zswap.c is still using the old APIs. That
means zswap won't be able to use any new ZIP drivers in the kernel.
This patch moves zswap to the crypto_acomp APIs to fix the problem. On the
other hand, traditional compressors like lz4, lzo etc. have been wrapped
into acomp via the scomp backend, so platforms without async compressors
can fall back to acomp via the scomp backend.
zswap is probably the first real user of acomp, but perhaps not a good
example to demonstrate how multiple acomp requests can be executed in
parallel in one acomp instance. frontswap loads and stores pages one page
at a time; it doesn't have a queuing or buffering mechanism to let
multiple pages go through frontswap simultaneously in one thread. However,
this patch creates multiple acomp instances, so multiple threads running
on different CPUs can do (de)compression in parallel, leveraging the power
of multiple ZIP hardware queues. This is also consistent with frontswap's
page management model.
On the other hand, the current zswap implementation has some per-cpu
global resources such as zswap_dstmem, so we create one acomp instance per
CPU, just as zswap previously created one comp instance per CPU.
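For readers unfamiliar with the acomp interface, the following is a
condensed, illustrative sketch (not part of the patch) of the calling
pattern described above: allocate a transform and request, then submit a
request and wait for it synchronously. The helper name and error handling
are made up for illustration; the real per-CPU setup and scatterlist
handling are in the diff below.

#include <linux/err.h>
#include <linux/errno.h>
#include <linux/mm.h>		/* PAGE_SIZE */
#include <linux/crypto.h>
#include <linux/scatterlist.h>
#include <crypto/acompress.h>

/* Illustrative only: compress one page synchronously via the acomp API. */
static int sketch_compress_page(const char *alg, void *src, void *dst,
				unsigned int *dlen)
{
	struct crypto_acomp *acomp;
	struct acomp_req *req;
	struct crypto_wait wait;
	struct scatterlist input, output;
	int ret;

	acomp = crypto_alloc_acomp(alg, 0, 0);
	if (IS_ERR(acomp))
		return PTR_ERR(acomp);

	req = acomp_request_alloc(acomp);
	if (!req) {
		crypto_free_acomp(acomp);
		return -ENOMEM;
	}

	crypto_init_wait(&wait);
	/*
	 * An async backend completes via crypto_req_done(); an scomp-wrapped
	 * compressor returns inline and crypto_wait_req() does not block.
	 */
	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);

	sg_init_one(&input, src, PAGE_SIZE);
	sg_init_one(&output, dst, *dlen);
	acomp_request_set_params(req, &input, &output, PAGE_SIZE, *dlen);

	/* submit the (possibly async) request, then wait for it to finish */
	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
	if (!ret)
		*dlen = req->dlen;

	acomp_request_free(req);
	crypto_free_acomp(acomp);
	return ret;
}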
Link: http://lkml.kernel.org/r/20200707125210.33256-1-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mahipal Challa <mahipalreddy2006@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/zswap.c | 177 ++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 134 insertions(+), 43 deletions(-)
--- a/mm/zswap.c~mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration
+++ a/mm/zswap.c
@@ -24,8 +24,10 @@
#include <linux/rbtree.h>
#include <linux/swap.h>
#include <linux/crypto.h>
+#include <linux/scatterlist.h>
#include <linux/mempool.h>
#include <linux/zpool.h>
+#include <crypto/acompress.h>
#include <linux/mm_types.h>
#include <linux/page-flags.h>
@@ -127,9 +129,17 @@ module_param_named(same_filled_pages_ena
* data structures
**********************************/
+struct crypto_acomp_ctx {
+ struct crypto_acomp *acomp;
+ struct acomp_req *req;
+ struct crypto_wait wait;
+ u8 *dstmem;
+ struct mutex mutex;
+};
+
struct zswap_pool {
struct zpool *zpool;
- struct crypto_comp * __percpu *tfm;
+ struct crypto_acomp_ctx * __percpu *acomp_ctx;
struct kref kref;
struct list_head list;
struct work_struct release_work;
@@ -415,30 +425,73 @@ static int zswap_dstmem_dead(unsigned in
static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
{
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
- struct crypto_comp *tfm;
+ struct crypto_acomp *acomp;
+ struct acomp_req *req;
+ struct crypto_acomp_ctx *acomp_ctx;
+ int ret;
- if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu)))
+ if (WARN_ON(*per_cpu_ptr(pool->acomp_ctx, cpu)))
return 0;
- tfm = crypto_alloc_comp(pool->tfm_name, 0, 0);
- if (IS_ERR_OR_NULL(tfm)) {
- pr_err("could not alloc crypto comp %s : %ld\n",
- pool->tfm_name, PTR_ERR(tfm));
+ acomp_ctx = kzalloc(sizeof(*acomp_ctx), GFP_KERNEL);
+ if (!acomp_ctx)
return -ENOMEM;
+
+ acomp = crypto_alloc_acomp(pool->tfm_name, 0, 0);
+ if (IS_ERR(acomp)) {
+ pr_err("could not alloc crypto acomp %s : %ld\n",
+ pool->tfm_name, PTR_ERR(acomp));
+ ret = PTR_ERR(acomp);
+ goto free_ctx;
+ }
+ acomp_ctx->acomp = acomp;
+
+ req = acomp_request_alloc(acomp_ctx->acomp);
+ if (!req) {
+ pr_err("could not alloc crypto acomp_request %s\n",
+ pool->tfm_name);
+ ret = -ENOMEM;
+ goto free_acomp;
}
- *per_cpu_ptr(pool->tfm, cpu) = tfm;
+ acomp_ctx->req = req;
+
+ mutex_init(&acomp_ctx->mutex);
+ crypto_init_wait(&acomp_ctx->wait);
+ /*
+ * if the backend of acomp is async zip, crypto_req_done() will wakeup
+ * crypto_wait_req(); if the backend of acomp is scomp, the callback
+ * won't be called, crypto_wait_req() will return without blocking.
+ */
+ acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &acomp_ctx->wait);
+
+ acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
+ *per_cpu_ptr(pool->acomp_ctx, cpu) = acomp_ctx;
+
return 0;
+
+free_acomp:
+ crypto_free_acomp(acomp_ctx->acomp);
+free_ctx:
+ kfree(acomp_ctx);
+ return ret;
}
static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
- struct crypto_comp *tfm;
+ struct crypto_acomp_ctx *acomp_ctx;
+
+ acomp_ctx = *per_cpu_ptr(pool->acomp_ctx, cpu);
+ if (!IS_ERR_OR_NULL(acomp_ctx)) {
+ if (!IS_ERR_OR_NULL(acomp_ctx->req))
+ acomp_request_free(acomp_ctx->req);
+ if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
+ crypto_free_acomp(acomp_ctx->acomp);
+ kfree(acomp_ctx);
+ }
+ *per_cpu_ptr(pool->acomp_ctx, cpu) = NULL;
- tfm = *per_cpu_ptr(pool->tfm, cpu);
- if (!IS_ERR_OR_NULL(tfm))
- crypto_free_comp(tfm);
- *per_cpu_ptr(pool->tfm, cpu) = NULL;
return 0;
}
@@ -561,8 +614,9 @@ static struct zswap_pool *zswap_pool_cre
pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
- pool->tfm = alloc_percpu(struct crypto_comp *);
- if (!pool->tfm) {
+
+ pool->acomp_ctx = alloc_percpu(struct crypto_acomp_ctx *);
+ if (!pool->acomp_ctx) {
pr_err("percpu alloc failed\n");
goto error;
}
@@ -585,7 +639,7 @@ static struct zswap_pool *zswap_pool_cre
return pool;
error:
- free_percpu(pool->tfm);
+ free_percpu(pool->acomp_ctx);
if (pool->zpool)
zpool_destroy_pool(pool->zpool);
kfree(pool);
@@ -596,14 +650,14 @@ static __init struct zswap_pool *__zswap
{
bool has_comp, has_zpool;
- has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+ has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
if (!has_comp && strcmp(zswap_compressor,
CONFIG_ZSWAP_COMPRESSOR_DEFAULT)) {
pr_err("compressor %s not available, using default %s\n",
zswap_compressor, CONFIG_ZSWAP_COMPRESSOR_DEFAULT);
param_free_charp(&zswap_compressor);
zswap_compressor = CONFIG_ZSWAP_COMPRESSOR_DEFAULT;
- has_comp = crypto_has_comp(zswap_compressor, 0, 0);
+ has_comp = crypto_has_acomp(zswap_compressor, 0, 0);
}
if (!has_comp) {
pr_err("default compressor %s not available\n",
@@ -639,7 +693,7 @@ static void zswap_pool_destroy(struct zs
zswap_pool_debug("destroying", pool);
cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
- free_percpu(pool->tfm);
+ free_percpu(pool->acomp_ctx);
zpool_destroy_pool(pool->zpool);
kfree(pool);
}
@@ -723,7 +777,7 @@ static int __zswap_param_set(const char
}
type = s;
} else if (!compressor) {
- if (!crypto_has_comp(s, 0, 0)) {
+ if (!crypto_has_acomp(s, 0, 0)) {
pr_err("compressor %s not available\n", s);
return -ENOENT;
}
@@ -774,7 +828,7 @@ static int __zswap_param_set(const char
* failed, maybe both compressor and zpool params were bad.
* Allow changing this param, so pool creation will succeed
* when the other param is changed. We already verified this
- * param is ok in the zpool_has_pool() or crypto_has_comp()
+ * param is ok in the zpool_has_pool() or crypto_has_acomp()
* checks above.
*/
ret = param_set_charp(s, kp);
@@ -876,7 +930,9 @@ static int zswap_writeback_entry(struct
pgoff_t offset;
struct zswap_entry *entry;
struct page *page;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
+
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -916,14 +972,21 @@ static int zswap_writeback_entry(struct
case ZSWAP_SWAPCACHE_NEW: /* page is locked */
/* decompress */
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
dlen = PAGE_SIZE;
src = (u8 *)zhdr + sizeof(struct zswap_header);
- dst = kmap_atomic(page);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- ret = crypto_comp_decompress(tfm, src, entry->length,
- dst, &dlen);
- put_cpu_ptr(entry->pool->tfm);
- kunmap_atomic(dst);
+ dst = kmap(page);
+
+ mutex_lock(&acomp_ctx->mutex);
+ sg_init_one(&input, src, entry->length);
+ sg_init_one(&output, dst, dlen);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+ ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+ dlen = acomp_ctx->req->dlen;
+ mutex_unlock(&acomp_ctx->mutex);
+
+ kunmap(page);
BUG_ON(ret);
BUG_ON(dlen != PAGE_SIZE);
@@ -1004,7 +1067,8 @@ static int zswap_frontswap_store(unsigne
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry, *dupentry;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
int ret;
unsigned int hlen, dlen = PAGE_SIZE;
unsigned long handle, value;
@@ -1074,12 +1138,32 @@ static int zswap_frontswap_store(unsigne
}
/* compress */
- dst = get_cpu_var(zswap_dstmem);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- src = kmap_atomic(page);
- ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
- kunmap_atomic(src);
- put_cpu_ptr(entry->pool->tfm);
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+
+ mutex_lock(&acomp_ctx->mutex);
+
+ src = kmap(page);
+ dst = acomp_ctx->dstmem;
+ sg_init_one(&input, src, PAGE_SIZE);
+ /* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
+ sg_init_one(&output, dst, PAGE_SIZE * 2);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+ /*
+ * it maybe looks a little bit silly that we send an asynchronous request,
+ * then wait for its completion synchronously. This makes the process look
+ * synchronous in fact.
+ * Theoretically, acomp supports users send multiple acomp requests in one
+ * acomp instance, then get those requests done simultaneously. but in this
+ * case, frontswap actually does store and load page by page, there is no
+ * existing method to send the second page before the first page is done
+ * in one thread doing frontswap.
+ * but in different threads running on different cpu, we have different
+ * acomp instance, so multiple threads can do (de)compression in parallel.
+ */
+ ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+ dlen = acomp_ctx->req->dlen;
+ kunmap(page);
+
if (ret) {
ret = -EINVAL;
goto put_dstmem;
@@ -1103,7 +1187,7 @@ static int zswap_frontswap_store(unsigne
memcpy(buf, &zhdr, hlen);
memcpy(buf + hlen, dst, dlen);
zpool_unmap_handle(entry->pool->zpool, handle);
- put_cpu_var(zswap_dstmem);
+ mutex_unlock(&acomp_ctx->mutex);
/* populate entry */
entry->offset = offset;
@@ -1131,7 +1215,7 @@ insert_entry:
return 0;
put_dstmem:
- put_cpu_var(zswap_dstmem);
+ mutex_unlock(&acomp_ctx->mutex);
zswap_pool_put(entry->pool);
freepage:
zswap_entry_cache_free(entry);
@@ -1148,7 +1232,8 @@ static int zswap_frontswap_load(unsigned
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry;
- struct crypto_comp *tfm;
+ struct scatterlist input, output;
+ struct crypto_acomp_ctx *acomp_ctx;
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -1175,11 +1260,17 @@ static int zswap_frontswap_load(unsigned
src = zpool_map_handle(entry->pool->zpool, entry->handle, ZPOOL_MM_RO);
if (zpool_evictable(entry->pool->zpool))
src += sizeof(struct zswap_header);
- dst = kmap_atomic(page);
- tfm = *get_cpu_ptr(entry->pool->tfm);
- ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
- put_cpu_ptr(entry->pool->tfm);
- kunmap_atomic(dst);
+ dst = kmap(page);
+
+ acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
+ mutex_lock(&acomp_ctx->mutex);
+ sg_init_one(&input, src, entry->length);
+ sg_init_one(&output, dst, dlen);
+ acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
+ ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+ mutex_unlock(&acomp_ctx->mutex);
+
+ kunmap(page);
zpool_unmap_handle(entry->pool->zpool, entry->handle);
BUG_ON(ret);
_
Patches currently in -mm which might be from song.bao.hua@hisilicon.com are
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (56 preceding siblings ...)
2020-07-08 22:17 ` [to-be-updated] mm-zswap-move-to-use-crypto_acomp-api-for-hardware-acceleration.patch removed from " Andrew Morton
@ 2020-07-08 22:20 ` Andrew Morton
2020-07-08 22:25 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch " Andrew Morton
` (174 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:20 UTC (permalink / raw)
To: lkp, mm-commits, rppt, sfr
The patch titled
Subject: powerpc: fix compilation warning caused by missing include of asm/pgalloc.h
has been added to the -mm tree. Its filename is
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: powerpc: fix compilation warning caused by missing include of asm/pgalloc.h
Recent rework of asm/pgalloc.h caused a compilation warning reported by
kbuild bot:
All warnings (new ones prefixed by >>):
>> arch/powerpc/mm/nohash/tlb.c:409:6: warning: no previous prototype for
>> 'tlb_flush_pgtable' [-Wmissing-prototypes]
409 | void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address)
| ^~~~~~~~~~~~~~~~~
Add the missing include of asm/pgalloc.h to arch/powerpc/mm/nohash/tlb.c to
make the tlb_flush_pgtable() prototype visible there.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/powerpc/mm/nohash/tlb.c | 1 +
1 file changed, 1 insertion(+)
--- a/arch/powerpc/mm/nohash/tlb.c~mm-remove-unneeded-includes-of-asm-pgalloch-fix
+++ a/arch/powerpc/mm/nohash/tlb.c
@@ -34,6 +34,7 @@
#include <linux/of_fdt.h>
#include <linux/hugetlb.h>
+#include <asm/pgalloc.h>
#include <asm/tlbflush.h>
#include <asm/tlb.h>
#include <asm/code-patching.h>
_
Patches currently in -mm which might be from rppt@linux.ibm.com are
mm-remove-unneeded-includes-of-asm-pgalloch.patch
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (57 preceding siblings ...)
2020-07-08 22:20 ` + mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch added to " Andrew Morton
@ 2020-07-08 22:25 ` Andrew Morton
2020-07-08 23:12 ` + mailmap-add-entry-for-mike-rapoport.patch " Andrew Morton
` (173 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 22:25 UTC (permalink / raw)
To: andreyknvl, aryabinin, dvyukov, glider, matthias.bgg, mm-commits,
walter-zh.wu
The patch titled
Subject: kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4
has been added to the -mm tree. Its filename is
kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Walter Wu <walter-zh.wu@mediatek.com>
Subject: kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4
use KASAN_SHADOW_SCALE_SIZE instead of 13
Link: http://lkml.kernel.org/r/20200708132524.11688-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
lib/test_kasan.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/lib/test_kasan.c~kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4
+++ a/lib/test_kasan.c
@@ -23,7 +23,9 @@
#include <asm/page.h>
-#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 13)
+#include "../mm/kasan/kasan.h"
+
+#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : KASAN_SHADOW_SCALE_SIZE)
/*
* We assign some test results to these globals to make sure the tests
_
Patches currently in -mm which might be from walter-zh.wu@mediatek.com are
kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
rcu-kasan-record-and-print-call_rcu-call-stack.patch
kasan-record-and-print-the-free-track.patch
kasan-add-tests-for-call_rcu-stack-recording.patch
kasan-update-documentation-for-generic-kasan.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mailmap-add-entry-for-mike-rapoport.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (58 preceding siblings ...)
2020-07-08 22:25 ` + kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch " Andrew Morton
@ 2020-07-08 23:12 ` Andrew Morton
2020-07-08 23:16 ` + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
` (172 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:12 UTC (permalink / raw)
To: mm-commits, rppt
The patch titled
Subject: mailmap: add entry for Mike Rapoport
has been added to the -mm tree. Its filename is
mailmap-add-entry-for-mike-rapoport.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mailmap-add-entry-for-mike-rapoport.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mailmap-add-entry-for-mike-rapoport.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Rapoport <rppt@linux.ibm.com>
Subject: mailmap: add entry for Mike Rapoport
Add an entry to connect my email addresses.
Link: http://lkml.kernel.org/r/20200708095414.12275-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
.mailmap | 3 +++
1 file changed, 3 insertions(+)
--- a/.mailmap~mailmap-add-entry-for-mike-rapoport
+++ a/.mailmap
@@ -193,6 +193,9 @@ Maxime Ripard <mripard@kernel.org> <maxi
Mayuresh Janorkar <mayur@ti.com>
Michael Buesch <m@bues.ch>
Michel Dänzer <michel@tungstengraphics.com>
+Mike Rapoport <rppt@kernel.org> <mike@compulab.co.il>
+Mike Rapoport <rppt@kernel.org> <mike.rapoport@gmail.com>
+Mike Rapoport <rppt@kernel.org> <rppt@linux.ibm.com>
Miodrag Dinic <miodrag.dinic@mips.com> <miodrag.dinic@imgtec.com>
Miquel Raynal <miquel.raynal@bootlin.com> <miquel.raynal@free-electrons.com>
Mitesh shah <mshah@teja.com>
_
Patches currently in -mm which might be from rppt@linux.ibm.com are
mailmap-add-entry-for-mike-rapoport.patch
mm-remove-unneeded-includes-of-asm-pgalloch.patch
mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
opeinrisc-switch-to-generic-version-of-pte-allocation.patch
xtensa-switch-to-generic-version-of-pte-allocation.patch
asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
asm-generic-pgalloc-provide-generic-pgd_free.patch
mm-move-lib-ioremapc-to-mm.patch
mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (59 preceding siblings ...)
2020-07-08 23:12 ` + mailmap-add-entry-for-mike-rapoport.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
2020-07-08 23:16 ` + mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
` (171 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, willy, yang.shi
The patch titled
Subject: mm/mremap: it is sure to have enough space when extent meets requirement
has been added to the -mm tree. Its filename is
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: it is sure to have enough space when extent meets requirement
Patch series "mm/mremap: cleanup move_page_tables() a little".
move_page_tables() tries to move page tables by PMD or by PTE.
The root issue is that when it tries to move a PMD, both the old and the new
range must be PMD aligned. But the current code calculates the old range and
the new range separately, which leads to some redundant checks and calculations.
This cleanup consolidates the range check in one place to reduce the extra
range handling.
This patch (of 4):
old_end is passed to these two functions to check whether there is enough
space to do the move, but this check is already done before invoking these
functions.
These two functions are only invoked when the extent meets the requirement,
and there is one check before invoking them:
if (extent > old_end - old_addr)
extent = old_end - old_addr;
This implies that (old_end - old_addr) can never fail the check inside these
two functions.
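As a self-contained illustration of why the removed check can never fire, here
is a minimal userspace sketch (not kernel code; PMD_SIZE is assumed to be 2MB
as on x86-64, and clamp_extent() mimics only the old-side clamping that
move_page_tables() performs before calling the helpers):

#include <assert.h>
#include <stdio.h>

#define PMD_SIZE	(1UL << 21)
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Mimic the clamping move_page_tables() performs before calling the helpers. */
static unsigned long clamp_extent(unsigned long old_addr, unsigned long old_end)
{
	unsigned long next = (old_addr + PMD_SIZE) & PMD_MASK;
	unsigned long extent = next - old_addr;

	if (extent > old_end - old_addr)
		extent = old_end - old_addr;
	return extent;
}

int main(void)
{
	unsigned long old_addr = 5 * PMD_SIZE;
	unsigned long old_end = old_addr + 3 * PMD_SIZE;
	unsigned long extent = clamp_extent(old_addr, old_end);

	/*
	 * The helpers are only called when extent covers a whole PMD;
	 * extent <= old_end - old_addr always holds after the clamp,
	 * so "old_end - old_addr < PMD_SIZE" can never be true there.
	 */
	if (extent == PMD_SIZE)
		assert(old_end - old_addr >= PMD_SIZE);
	printf("extent = %#lx\n", extent);
	return 0;
}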
Link: http://lkml.kernel.org/r/20200708095028.41706-1-richard.weiyang@linux.alibaba.com
Link: http://lkml.kernel.org/r/20200708095028.41706-2-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/huge_mm.h | 2 +-
mm/huge_memory.c | 7 ++-----
mm/mremap.c | 10 ++++------
3 files changed, 7 insertions(+), 12 deletions(-)
--- a/include/linux/huge_mm.h~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/include/linux/huge_mm.h
@@ -42,7 +42,7 @@ extern int mincore_huge_pmd(struct vm_ar
unsigned long addr, unsigned long end,
unsigned char *vec);
extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
+ unsigned long new_addr,
pmd_t *old_pmd, pmd_t *new_pmd);
extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
unsigned long addr, pgprot_t newprot,
--- a/mm/huge_memory.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/huge_memory.c
@@ -1722,17 +1722,14 @@ static pmd_t move_soft_dirty_pmd(pmd_t p
}
bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
- pmd_t *old_pmd, pmd_t *new_pmd)
+ unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
spinlock_t *old_ptl, *new_ptl;
pmd_t pmd;
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) ||
- (new_addr & ~HPAGE_PMD_MASK) ||
- old_end - old_addr < HPAGE_PMD_SIZE)
+ if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
return false;
/*
--- a/mm/mremap.c~mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement
+++ a/mm/mremap.c
@@ -193,15 +193,13 @@ static void move_ptes(struct vm_area_str
#ifdef CONFIG_HAVE_MOVE_PMD
static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
- unsigned long new_addr, unsigned long old_end,
- pmd_t *old_pmd, pmd_t *new_pmd)
+ unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
spinlock_t *old_ptl, *new_ptl;
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
- || old_end - old_addr < PMD_SIZE)
+ if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
return false;
/*
@@ -273,7 +271,7 @@ unsigned long move_page_tables(struct vm
if (need_rmap_locks)
take_rmap_locks(vma);
moved = move_huge_pmd(vma, old_addr, new_addr,
- old_end, old_pmd, new_pmd);
+ old_pmd, new_pmd);
if (need_rmap_locks)
drop_rmap_locks(vma);
if (moved)
@@ -293,7 +291,7 @@ unsigned long move_page_tables(struct vm
if (need_rmap_locks)
take_rmap_locks(vma);
moved = move_normal_pmd(vma, old_addr, new_addr,
- old_end, old_pmd, new_pmd);
+ old_pmd, new_pmd);
if (need_rmap_locks)
drop_rmap_locks(vma);
if (moved)
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mremap-calculate-extent-in-one-place.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (60 preceding siblings ...)
2020-07-08 23:16 ` + mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
` (170 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, willy, yang.shi
The patch titled
Subject: mm/mremap: calculate extent in one place
has been added to the -mm tree. Its filename is
mm-mremap-calculate-extent-in-one-place.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-calculate-extent-in-one-place.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-calculate-extent-in-one-place.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: calculate extent in one place
Page tables are moved at PMD granularity, which requires that both the source
and the destination range meet the alignment requirement.
The current code works because move_huge_pmd() and move_normal_pmd() check
old_addr and new_addr again and fall back to move_ptes() if either of them is
not aligned.
Instead of calculating the extent separately for each side, it is better to
calculate it in one place, so we know up front when it is not worth trying a
PMD move. This also makes the logic a little clearer.
Link: http://lkml.kernel.org/r/20200708095028.41706-3-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mremap.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/mremap.c~mm-mremap-calculate-extent-in-one-place
+++ a/mm/mremap.c
@@ -258,6 +258,9 @@ unsigned long move_page_tables(struct vm
extent = next - old_addr;
if (extent > old_end - old_addr)
extent = old_end - old_addr;
+ next = (new_addr + PMD_SIZE) & PMD_MASK;
+ if (extent > next - new_addr)
+ extent = next - new_addr;
old_pmd = get_old_pmd(vma->vm_mm, old_addr);
if (!old_pmd)
continue;
@@ -301,9 +304,6 @@ unsigned long move_page_tables(struct vm
if (pte_alloc(new_vma->vm_mm, new_pmd))
break;
- next = (new_addr + PMD_SIZE) & PMD_MASK;
- if (extent > next - new_addr)
- extent = next - new_addr;
move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma,
new_pmd, new_addr, need_rmap_locks);
}
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mremap-start-addresses-are-properly-aligned.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (61 preceding siblings ...)
2020-07-08 23:16 ` + mm-mremap-calculate-extent-in-one-place.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
2020-07-08 23:16 ` Andrew Morton
2020-07-08 23:16 ` + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch " Andrew Morton
` (169 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, willy, yang.shi
The patch titled
Subject: mm/mremap: start addresses are properly aligned
has been added to the -mm tree. Its filename is
mm-mremap-start-addresses-are-properly-aligned.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned
After the previous cleanup, extent is the minimal step for both the source
and the destination. This means that when extent is HPAGE_PMD_SIZE or
PMD_SIZE, old_addr and new_addr are properly aligned as well.
Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.
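A small userspace sketch of that reasoning (hypothetical helper name;
PMD_SIZE assumed to be 2MB as on x86-64; dist_to_boundary() reproduces the
"(addr + PMD_SIZE) & PMD_MASK" arithmetic used in move_page_tables()):

#include <assert.h>

#define PMD_SIZE	(1UL << 21)
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Distance from addr to the next PMD boundary; equals PMD_SIZE iff aligned. */
static unsigned long dist_to_boundary(unsigned long addr)
{
	return ((addr + PMD_SIZE) & PMD_MASK) - addr;
}

int main(void)
{
	/* Scan a few (mis)aligned source/destination address pairs. */
	for (unsigned long off_old = 0; off_old < PMD_SIZE; off_old += 0x1000)
		for (unsigned long off_new = 0; off_new < PMD_SIZE; off_new += 0x10000) {
			unsigned long old_addr = 0x40000000UL + off_old;
			unsigned long new_addr = 0x80000000UL + off_new;
			unsigned long a = dist_to_boundary(old_addr);
			unsigned long b = dist_to_boundary(new_addr);
			unsigned long extent = a < b ? a : b;

			/* extent == PMD_SIZE forces both addresses to be aligned. */
			if (extent == PMD_SIZE) {
				assert(!(old_addr & ~PMD_MASK));
				assert(!(new_addr & ~PMD_MASK));
			}
		}
	return 0;
}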
Link: http://lkml.kernel.org/r/20200708095028.41706-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 ---
mm/mremap.c | 3 ---
2 files changed, 6 deletions(-)
--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
- return false;
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mremap-start-addresses-are-properly-aligned.patch added to -mm tree
2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, willy, yang.shi
The patch titled
Subject: mm/mremap: start addresses are properly aligned
has been added to the -mm tree. Its filename is
mm-mremap-start-addresses-are-properly-aligned.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-start-addresses-are-properly-aligned.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: start addresses are properly aligned
After the previous cleanup, extent is the minimal step for both the source
and the destination. This means that when extent is HPAGE_PMD_SIZE or
PMD_SIZE, old_addr and new_addr are properly aligned as well.
Since these two functions are only invoked from move_page_tables(), it is
safe to remove the check now.
Link: http://lkml.kernel.org/r/20200708095028.41706-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 ---
mm/mremap.c | 3 ---
2 files changed, 6 deletions(-)
--- a/mm/huge_memory.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/huge_memory.c
@@ -1729,9 +1729,6 @@ bool move_huge_pmd(struct vm_area_struct
struct mm_struct *mm = vma->vm_mm;
bool force_flush = false;
- if ((old_addr & ~HPAGE_PMD_MASK) || (new_addr & ~HPAGE_PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
--- a/mm/mremap.c~mm-mremap-start-addresses-are-properly-aligned
+++ a/mm/mremap.c
@@ -199,9 +199,6 @@ static bool move_normal_pmd(struct vm_ar
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
- if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK))
- return false;
-
/*
* The destination pmd shouldn't be established, free_pgtables()
* should have release it.
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (62 preceding siblings ...)
2020-07-08 23:16 ` + mm-mremap-start-addresses-are-properly-aligned.patch " Andrew Morton
@ 2020-07-08 23:16 ` Andrew Morton
2020-07-08 23:41 ` + mm-swap-simplify-alloc_swap_slot_cache.patch " Andrew Morton
` (168 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:16 UTC (permalink / raw)
To: aneesh.kumar, anshuman.khandual, digetx, kirill.shutemov,
mm-commits, peterx, richard.weiyang, sean.j.christopherson,
thellstrom, thomas_os, vbabka, willy, yang.shi
The patch titled
Subject: mm/mremap: use pmd_addr_end to simplify the calculate of extent
has been added to the -mm tree. Its filename is
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wei Yang <richard.weiyang@linux.alibaba.com>
Subject: mm/mremap: use pmd_addr_end to simplify the calculate of extent
The purpose of this code is to calculate the smaller extent of the old and
the new range. Let's leverage pmd_addr_end() to do the calculation.
Hopefully this makes the code easier to read.
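For reference, pmd_addr_end() returns either the next PMD boundary after addr
or end, whichever comes first. A minimal userspace sketch of the new extent
calculation (the helper mirrors the generic macro definition; PMD_SIZE is
assumed to be 2MB, and the addresses are made up for illustration):

#include <stdio.h>

#define PMD_SIZE	(1UL << 21)
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Userspace rendition of the generic pmd_addr_end() macro. */
static unsigned long pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

int main(void)
{
	/* Made-up, differently misaligned source and destination ranges. */
	unsigned long len      = 0x300000;
	unsigned long old_addr = 0x40001000, old_end = old_addr + len;
	unsigned long new_addr = 0x801ff000, new_end = new_addr + len;
	unsigned long old_next = pmd_addr_end(old_addr, old_end);
	unsigned long new_next = pmd_addr_end(new_addr, new_end);
	unsigned long a = old_next - old_addr;
	unsigned long b = new_next - new_addr;
	unsigned long extent = a < b ? a : b;

	/* One min() replaces the three separate clamps removed by the patch. */
	printf("step this iteration: %#lx\n", extent);
	return 0;
}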
Link: http://lkml.kernel.org/r/20200708095028.41706-5-richard.weiyang@linux.alibaba.com
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mremap.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
--- a/mm/mremap.c~mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent
+++ a/mm/mremap.c
@@ -237,11 +237,12 @@ unsigned long move_page_tables(struct vm
unsigned long new_addr, unsigned long len,
bool need_rmap_locks)
{
- unsigned long extent, next, old_end;
+ unsigned long extent, old_next, new_next, old_end, new_end;
struct mmu_notifier_range range;
pmd_t *old_pmd, *new_pmd;
old_end = old_addr + len;
+ new_end = new_addr + len;
flush_cache_range(vma, old_addr, old_end);
mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
@@ -250,14 +251,11 @@ unsigned long move_page_tables(struct vm
for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
cond_resched();
- next = (old_addr + PMD_SIZE) & PMD_MASK;
- /* even if next overflowed, extent below will be ok */
- extent = next - old_addr;
- if (extent > old_end - old_addr)
- extent = old_end - old_addr;
- next = (new_addr + PMD_SIZE) & PMD_MASK;
- if (extent > next - new_addr)
- extent = next - new_addr;
+
+ old_next = pmd_addr_end(old_addr, old_end);
+ new_next = pmd_addr_end(new_addr, new_end);
+ extent = min(old_next - old_addr, new_next - new_addr);
+
old_pmd = get_old_pmd(vma->vm_mm, old_addr);
if (!old_pmd)
continue;
_
Patches currently in -mm which might be from richard.weiyang@linux.alibaba.com are
mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
mm-mremap-calculate-extent-in-one-place.patch
mm-mremap-start-addresses-are-properly-aligned.patch
mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
mm-sparse-never-partially-remove-memmap-for-early-section.patch
mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
mm-page_allocc-simplify-pageblock-bitmap-access.patch
mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
mm-page_alloc-fallbacks-at-most-has-3-elements.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-swap-simplify-alloc_swap_slot_cache.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (63 preceding siblings ...)
2020-07-08 23:16 ` + mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
2020-07-08 23:41 ` + mm-swap-simplify-enable_swap_slots_cache.patch " Andrew Morton
` (167 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
To: mm-commits, thunder.leizhen, tim.c.chen
The patch titled
Subject: mm/swap_slots.c: simplify alloc_swap_slot_cache()
has been added to the -mm tree. Its filename is
mm-swap-simplify-alloc_swap_slot_cache.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-simplify-alloc_swap_slot_cache.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-simplify-alloc_swap_slot_cache.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: simplify alloc_swap_slot_cache()
Patch series "clean up some functions in mm/swap_slots.c".
When I studied the code of mm/swap_slots.c, I found some places that can be
improved.
This patch (of 3):
Both "slots" and "slots_ret" are only need to be freed when cache already
allocated. Make them closer, seems more clear.
No functional change.
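The shape of the change, as a hypothetical userspace analogue (the function
and variable names here are invented for illustration; the real code also
holds swap_slots_cache_mutex and operates on per-CPU data): free the freshly
allocated buffers right where we learn the cache already exists, instead of
NULLing the locals and funnelling everything through a shared "out:" label.

#include <stdlib.h>

struct cache {
	void *slots;
	void *slots_ret;
};

/* Install newly allocated buffers, or discard them if already present. */
static int install_slots(struct cache *c, void *slots, void *slots_ret)
{
	if (c->slots || c->slots_ret) {
		/* cache already allocated: the new buffers are not needed */
		free(slots);
		free(slots_ret);
		return 0;
	}
	c->slots = slots;
	c->slots_ret = slots_ret;
	return 0;
}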
Link: http://lkml.kernel.org/r/20200430061143.450-1-thunder.leizhen@huawei.com
Link: http://lkml.kernel.org/r/20200430061143.450-2-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/swap_slots.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/mm/swap_slots.c~mm-swap-simplify-alloc_swap_slot_cache
+++ a/mm/swap_slots.c
@@ -136,9 +136,16 @@ static int alloc_swap_slot_cache(unsigne
mutex_lock(&swap_slots_cache_mutex);
cache = &per_cpu(swp_slots, cpu);
- if (cache->slots || cache->slots_ret)
+ if (cache->slots || cache->slots_ret) {
/* cache already allocated */
- goto out;
+ mutex_unlock(&swap_slots_cache_mutex);
+
+ kvfree(slots);
+ kvfree(slots_ret);
+
+ return 0;
+ }
+
if (!cache->lock_initialized) {
mutex_init(&cache->alloc_lock);
spin_lock_init(&cache->free_lock);
@@ -155,15 +162,8 @@ static int alloc_swap_slot_cache(unsigne
*/
mb();
cache->slots = slots;
- slots = NULL;
cache->slots_ret = slots_ret;
- slots_ret = NULL;
-out:
mutex_unlock(&swap_slots_cache_mutex);
- if (slots)
- kvfree(slots);
- if (slots_ret)
- kvfree(slots_ret);
return 0;
}
_
Patches currently in -mm which might be from thunder.leizhen@huawei.com are
mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-swap-simplify-enable_swap_slots_cache.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (64 preceding siblings ...)
2020-07-08 23:41 ` + mm-swap-simplify-alloc_swap_slot_cache.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
2020-07-08 23:41 ` + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch " Andrew Morton
` (166 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
To: mm-commits, thunder.leizhen, tim.c.chen
The patch titled
Subject: mm/swap_slots.c: simplify enable_swap_slots_cache()
has been added to the -mm tree. Its filename is
mm-swap-simplify-enable_swap_slots_cache.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-simplify-enable_swap_slots_cache.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-simplify-enable_swap_slots_cache.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: simplify enable_swap_slots_cache()
Whether swap_slot_cache_initialized is true or false,
__reenable_swap_slots_cache() is always called. To make this explicit, leave
only one call to __reenable_swap_slots_cache(). This also makes it clearer
what extra work needs to be done when swap_slot_cache_initialized is false.
No functional change.
Link: http://lkml.kernel.org/r/20200430061143.450-3-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/swap_slots.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
--- a/mm/swap_slots.c~mm-swap-simplify-enable_swap_slots_cache
+++ a/mm/swap_slots.c
@@ -240,21 +240,19 @@ static int free_slot_cache(unsigned int
int enable_swap_slots_cache(void)
{
- int ret = 0;
-
mutex_lock(&swap_slots_cache_enable_mutex);
- if (swap_slot_cache_initialized) {
- __reenable_swap_slots_cache();
- goto out_unlock;
- }
+ if (!swap_slot_cache_initialized) {
+ int ret;
- ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "swap_slots_cache",
- alloc_swap_slot_cache, free_slot_cache);
- if (WARN_ONCE(ret < 0, "Cache allocation failed (%s), operating "
- "without swap slots cache.\n", __func__))
- goto out_unlock;
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "swap_slots_cache",
+ alloc_swap_slot_cache, free_slot_cache);
+ if (WARN_ONCE(ret < 0, "Cache allocation failed (%s), operating "
+ "without swap slots cache.\n", __func__))
+ goto out_unlock;
+
+ swap_slot_cache_initialized = true;
+ }
- swap_slot_cache_initialized = true;
__reenable_swap_slots_cache();
out_unlock:
mutex_unlock(&swap_slots_cache_enable_mutex);
_
Patches currently in -mm which might be from thunder.leizhen@huawei.com are
mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (65 preceding siblings ...)
2020-07-08 23:41 ` + mm-swap-simplify-enable_swap_slots_cache.patch " Andrew Morton
@ 2020-07-08 23:41 ` Andrew Morton
2020-07-09 0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
` (165 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-08 23:41 UTC (permalink / raw)
To: mm-commits, thunder.leizhen, tim.c.chen
The patch titled
Subject: mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized
has been added to the -mm tree. Its filename is
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Zhen Lei <thunder.leizhen@huawei.com>
Subject: mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized
swap_slot_cache_enabled can only become true in enable_swap_slots_cache(),
and only after swap_slot_cache_initialized has been set to true. That means
that whenever swap_slot_cache_enabled is true, swap_slot_cache_initialized is
true as well.
So the condition:
"swap_slot_cache_enabled && swap_slot_cache_initialized"
can be reduced to "swap_slot_cache_enabled".
And by De Morgan's law:
"!swap_slot_cache_enabled || !swap_slot_cache_initialized"
is equal to "!(swap_slot_cache_enabled && swap_slot_cache_initialized)".
So there is no functional change.
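The implication can be spot-checked mechanically; a tiny self-contained
sketch (plain C, nothing kernel-specific) that walks every state allowed by
the invariant "enabled implies initialized":

#include <assert.h>
#include <stdbool.h>

int main(void)
{
	for (int e = 0; e < 2; e++) {
		for (int i = 0; i < 2; i++) {
			bool enabled = e, initialized = i;

			/* De Morgan: holds unconditionally. */
			assert((!enabled || !initialized) ==
			       !(enabled && initialized));

			/* State ruled out by enable_swap_slots_cache(). */
			if (enabled && !initialized)
				continue;

			/* Under the invariant, the shortened test agrees. */
			assert((enabled && initialized) == enabled);
		}
	}
	return 0;
}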
Link: http://lkml.kernel.org/r/20200430061143.450-4-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/swap_slots.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/swap_slots.c~mm-swap-remove-redundant-check-for-swap_slot_cache_initialized
+++ a/mm/swap_slots.c
@@ -46,8 +46,7 @@ static void __drain_swap_slots_cache(uns
static void deactivate_swap_slots_cache(void);
static void reactivate_swap_slots_cache(void);
-#define use_swap_slot_cache (swap_slot_cache_active && \
- swap_slot_cache_enabled && swap_slot_cache_initialized)
+#define use_swap_slot_cache (swap_slot_cache_active && swap_slot_cache_enabled)
#define SLOTS_CACHE 0x1
#define SLOTS_CACHE_RET 0x2
@@ -94,7 +93,7 @@ static bool check_cache_active(void)
{
long pages;
- if (!swap_slot_cache_enabled || !swap_slot_cache_initialized)
+ if (!swap_slot_cache_enabled)
return false;
pages = get_nr_swap_pages();
_
Patches currently in -mm which might be from thunder.leizhen@huawei.com are
mm-swap-simplify-alloc_swap_slot_cache.patch
mm-swap-simplify-enable_swap_slots_cache.patch
mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-do-page-fault-accounting-in-handle_mm_fault.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (66 preceding siblings ...)
2020-07-08 23:41 ` + mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-alpha-use-general-page-fault-accounting.patch " Andrew Morton
` (164 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
paul.walmsley
The patch titled
Subject: mm: do page fault accounting in handle_mm_fault
has been added to the -mm tree. Its filename is
mm-do-page-fault-accounting-in-handle_mm_fault.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the page fault accounting cleanup series. It originates from
Gerald Schaefer's report a week ago of incorrect page fault accounting for
retried page faults after commit 4064b9827063 ("mm: allow VM_FAULT_RETRY for
multiple times"):
https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
What this series did:
- Correct page fault accounting: we account a page fault (no matter
whether it comes from #PF handling, gup, or anything else) only with the
attempt that completed the fault. For example, page fault retries should
not be counted in the page fault counters. The same applies to the
perf events.
- Unify the definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
event is used in an ad hoc way across different archs.
Case (1): for many archs it is counted at the entry of the page fault
handler, so it also covers e.g. erroneous faults.
Case (2): for some other archs, it is only counted when the page
fault is resolved successfully.
Case (3): there are still quite a few archs that have not enabled
this perf event at all.
Since this series touches nearly all the archs, we unify this
perf event to always follow case (1), which is the one that makes the most
sense. And since the accounting moves into handle_mm_fault(), the
other two MAJ/MIN perf events are taken care of naturally.
- Unify the definition of "major fault": the definition is slightly
changed when used for accounting (it is no longer simply
VM_FAULT_MAJOR). More information in patch 1.
- Always account the page fault to the task that triggered it. This
does not matter much for #PF handling, but it does for gup. More
information on this in patch 25.
Patchset layout:
Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
Patch 24: Enable the new accounting for the remaining outliers (gup, iommu, etc.)
Patch 25: Cleanup GUP task_struct pointer since it's not needed any more
This patch (of 25):
This is a preparation patch to move page fault accounting into the
generic code in handle_mm_fault(). This includes both the per-task
maj_flt/min_flt counters and the major/minor page fault perf events. To
do this, a pt_regs pointer is passed into handle_mm_fault().
PERF_COUNT_SW_PAGE_FAULTS should still be kept in the per-arch page fault
handlers.
So far, every pt_regs pointer passed into handle_mm_fault() is NULL, which
means this patch should have no intended functional change.
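For orientation, the per-arch follow-up patches (2-23 in this series) then
switch each handler from passing NULL to passing its own trap frame, at which
point the open-coded maj_flt/min_flt and perf updates in that handler can be
dropped. A sketch of that call-site change (regs stands for the pt_regs
pointer the arch fault handler already has; this illustrates the intent
described above, not a hunk from this patch):

	/* this patch: accounting still lives in the arch handler */
	fault = handle_mm_fault(vma, address, flags, NULL);

	/* after the per-arch conversion: let handle_mm_fault() account */
	fault = handle_mm_fault(vma, address, flags, regs);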
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/alpha/mm/fault.c | 2 -
arch/arc/mm/fault.c | 2 -
arch/arm/mm/fault.c | 2 -
arch/arm64/mm/fault.c | 2 -
arch/csky/mm/fault.c | 3 +
arch/hexagon/mm/vm_fault.c | 2 -
arch/ia64/mm/fault.c | 2 -
arch/m68k/mm/fault.c | 2 -
arch/microblaze/mm/fault.c | 2 -
arch/mips/mm/fault.c | 2 -
arch/nds32/mm/fault.c | 2 -
arch/nios2/mm/fault.c | 2 -
arch/openrisc/mm/fault.c | 2 -
arch/parisc/mm/fault.c | 2 -
arch/powerpc/mm/copro_fault.c | 2 -
arch/powerpc/mm/fault.c | 2 -
arch/riscv/mm/fault.c | 2 -
arch/s390/mm/fault.c | 2 -
arch/sh/mm/fault.c | 2 -
arch/sparc/mm/fault_32.c | 4 +-
arch/sparc/mm/fault_64.c | 2 -
arch/um/kernel/trap.c | 2 -
arch/x86/mm/fault.c | 2 -
arch/xtensa/mm/fault.c | 2 -
drivers/iommu/amd/iommu_v2.c | 2 -
drivers/iommu/intel/svm.c | 3 +
include/linux/mm.h | 7 ++-
mm/gup.c | 4 +-
mm/hmm.c | 3 +
mm/ksm.c | 3 +
mm/memory.c | 64 +++++++++++++++++++++++++++++++-
31 files changed, 103 insertions(+), 34 deletions(-)
--- a/arch/alpha/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/alpha/mm/fault.c
@@ -148,7 +148,7 @@ retry:
/* If for any reason at all we couldn't handle the fault,
make sure we exit gracefully rather than endlessly redo
the fault. */
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/arc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arc/mm/fault.c
@@ -130,7 +130,7 @@ retry:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
--- a/arch/arm64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm64/mm/fault.c
@@ -428,7 +428,7 @@ static vm_fault_t __do_page_fault(struct
*/
if (!(vma->vm_flags & vm_flags))
return VM_FAULT_BADACCESS;
- return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags);
+ return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
}
static bool is_el0_instruction_abort(unsigned int esr)
--- a/arch/arm/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm/mm/fault.c
@@ -224,7 +224,7 @@ good_area:
goto out;
}
- return handle_mm_fault(vma, addr & PAGE_MASK, flags);
+ return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
check_stack:
/* Don't allow expansion below FIRST_USER_ADDRESS */
--- a/arch/csky/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/csky/mm/fault.c
@@ -150,7 +150,8 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0);
+ fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
+ NULL);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
--- a/arch/hexagon/mm/vm_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/hexagon/mm/vm_fault.c
@@ -88,7 +88,7 @@ good_area:
break;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/ia64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/ia64/mm/fault.c
@@ -143,7 +143,7 @@ retry:
* sure we exit gracefully rather than endlessly redo the
* fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/m68k/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/m68k/mm/fault.c
@@ -134,7 +134,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
pr_debug("handle_mm_fault returns %x\n", fault);
if (fault_signal_pending(fault, regs))
--- a/arch/microblaze/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/microblaze/mm/fault.c
@@ -214,7 +214,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/mips/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/mips/mm/fault.c
@@ -152,7 +152,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/nds32/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nds32/mm/fault.c
@@ -206,7 +206,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags);
+ fault = handle_mm_fault(vma, addr, flags, NULL);
/*
* If we need to retry but a fatal signal is pending, handle the
--- a/arch/nios2/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nios2/mm/fault.c
@@ -131,7 +131,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/openrisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/openrisc/mm/fault.c
@@ -159,7 +159,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/parisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/parisc/mm/fault.c
@@ -302,7 +302,7 @@ good_area:
* fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/powerpc/mm/copro_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/copro_fault.c
@@ -64,7 +64,7 @@ int copro_handle_mm_fault(struct mm_stru
}
ret = 0;
- *flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0);
+ *flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0, NULL);
if (unlikely(*flt & VM_FAULT_ERROR)) {
if (*flt & VM_FAULT_OOM) {
ret = -ENOMEM;
--- a/arch/powerpc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
major |= fault & VM_FAULT_MAJOR;
--- a/arch/riscv/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags);
+ fault = handle_mm_fault(vma, addr, flags, NULL);
/*
* If we need to retry but a fatal signal is pending, handle the
--- a/arch/s390/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ retry:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs)) {
fault = VM_FAULT_SIGNAL;
if (flags & FAULT_FLAG_RETRY_NOWAIT)
--- a/arch/sh/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sh/mm/fault.c
@@ -482,7 +482,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
if (mm_fault_error(regs, error_code, address, fault))
--- a/arch/sparc/mm/fault_32.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
@@ -410,7 +410,7 @@ good_area:
if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
goto bad_area;
}
- switch (handle_mm_fault(vma, address, flags)) {
+ switch (handle_mm_fault(vma, address, flags, NULL)) {
case VM_FAULT_SIGBUS:
case VM_FAULT_OOM:
goto do_sigbus;
--- a/arch/sparc/mm/fault_64.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ good_area:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
goto exit_exception;
--- a/arch/um/kernel/trap.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/um/kernel/trap.c
@@ -71,7 +71,7 @@ good_area:
do {
vm_fault_t fault;
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
goto out_nosemaphore;
--- a/arch/x86/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/x86/mm/fault.c
@@ -1291,7 +1291,7 @@ good_area:
* userland). The return to userland is identified whenever
* FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
major |= fault & VM_FAULT_MAJOR;
/* Quick path to respond to signals */
--- a/arch/xtensa/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/xtensa/mm/fault.c
@@ -107,7 +107,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/drivers/iommu/amd/iommu_v2.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/amd/iommu_v2.c
@@ -495,7 +495,7 @@ static void do_fault(struct work_struct
if (access_error(vma, fault))
goto out;
- ret = handle_mm_fault(vma, address, flags);
+ ret = handle_mm_fault(vma, address, flags, NULL);
out:
mmap_read_unlock(mm);
--- a/drivers/iommu/intel/svm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/intel/svm.c
@@ -872,7 +872,8 @@ static irqreturn_t prq_event_thread(int
goto invalid;
ret = handle_mm_fault(vma, address,
- req->wr_req ? FAULT_FLAG_WRITE : 0);
+ req->wr_req ? FAULT_FLAG_WRITE : 0,
+ NULL);
if (ret & VM_FAULT_ERROR)
goto invalid;
--- a/include/linux/mm.h~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/include/linux/mm.h
@@ -38,6 +38,7 @@ struct file_ra_state;
struct user_struct;
struct writeback_control;
struct bdi_writeback;
+struct pt_regs;
void init_mm_internals(void);
@@ -1650,7 +1651,8 @@ int invalidate_inode_page(struct page *p
#ifdef CONFIG_MMU
extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
- unsigned long address, unsigned int flags);
+ unsigned long address, unsigned int flags,
+ struct pt_regs *regs);
extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked);
@@ -1660,7 +1662,8 @@ void unmap_mapping_range(struct address_
loff_t const holebegin, loff_t const holelen, int even_cows);
#else
static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
- unsigned long address, unsigned int flags)
+ unsigned long address, unsigned int flags,
+ struct pt_regs *regs)
{
/* should never happen if there's no MMU */
BUG();
--- a/mm/gup.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/gup.c
@@ -884,7 +884,7 @@ static int faultin_page(struct task_stru
fault_flags |= FAULT_FLAG_TRIED;
}
- ret = handle_mm_fault(vma, address, fault_flags);
+ ret = handle_mm_fault(vma, address, fault_flags, NULL);
if (ret & VM_FAULT_ERROR) {
int err = vm_fault_to_errno(ret, *flags);
@@ -1238,7 +1238,7 @@ retry:
fatal_signal_pending(current))
return -EINTR;
- ret = handle_mm_fault(vma, address, fault_flags);
+ ret = handle_mm_fault(vma, address, fault_flags, NULL);
major |= ret & VM_FAULT_MAJOR;
if (ret & VM_FAULT_ERROR) {
int err = vm_fault_to_errno(ret, 0);
--- a/mm/hmm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/hmm.c
@@ -75,7 +75,8 @@ static int hmm_vma_fault(unsigned long a
}
for (; addr < end; addr += PAGE_SIZE)
- if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
+ if (handle_mm_fault(vma, addr, fault_flags, NULL) &
+ VM_FAULT_ERROR)
return -EFAULT;
return -EBUSY;
}
--- a/mm/ksm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/ksm.c
@@ -480,7 +480,8 @@ static int break_ksm(struct vm_area_stru
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
- FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
+ FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE,
+ NULL);
else
ret = VM_FAULT_WRITE;
put_page(page);
--- a/mm/memory.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/memory.c
@@ -71,6 +71,8 @@
#include <linux/dax.h>
#include <linux/oom.h>
#include <linux/numa.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
#include <trace/events/kmem.h>
@@ -4365,6 +4367,64 @@ retry_pud:
return handle_pte_fault(&vmf);
}
+/**
+ * mm_account_fault - Do page fault accountings
+ *
+ * @regs: the pt_regs struct pointer. When set to NULL, will skip accounting
+ * of perf event counters, but we'll still do the per-task accounting to
+ * the task who triggered this page fault.
+ * @address: the faulted address.
+ * @flags: the fault flags.
+ * @ret: the fault retcode.
+ *
+ * This will take care of most of the page fault accountings. Meanwhile, it
+ * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
+ * updates. However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
+ * still be in per-arch page fault handlers at the entry of page fault.
+ */
+static inline void mm_account_fault(struct pt_regs *regs,
+ unsigned long address, unsigned int flags,
+ vm_fault_t ret)
+{
+ bool major;
+
+ /*
+ * We don't do accounting for some specific faults:
+ *
+ * - Unsuccessful faults (e.g. when the address wasn't valid). That
+ * includes arch_vma_access_permitted() failing before reaching here.
+ * So this is not a "this many hardware page faults" counter. We
+ * should use the hw profiling for that.
+ *
+ * - Incomplete faults (VM_FAULT_RETRY). They will only be counted
+ * once they're completed.
+ */
+ if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
+ return;
+
+ /*
+ * We define the fault as a major fault when the final successful fault
+ * is VM_FAULT_MAJOR, or if it retried (which implies that we couldn't
+ * handle it immediately previously).
+ */
+ major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+
+ /*
+ * If the fault is done for GUP, regs will be NULL, and we will skip
+ * the fault accounting.
+ */
+ if (!regs)
+ return;
+
+ if (major) {
+ current->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+ } else {
+ current->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
+ }
+}
+
/*
* By the time we get here, we already hold the mm semaphore
*
@@ -4372,7 +4432,7 @@ retry_pud:
* return value. See filemap_fault() and __lock_page_or_retry().
*/
vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
- unsigned int flags)
+ unsigned int flags, struct pt_regs *regs)
{
vm_fault_t ret;
@@ -4413,6 +4473,8 @@ vm_fault_t handle_mm_fault(struct vm_are
mem_cgroup_oom_synchronize(false);
}
+ mm_account_fault(regs, address, flags, ret);
+
return ret;
}
EXPORT_SYMBOL_GPL(handle_mm_fault);
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-do-page-fault-accounting-in-handle_mm_fault.patch added to -mm tree
2020-07-09 0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
paul.walmsley, paulus, penberg, peterx, peterz, rth, shorne,
stefan.kristiansson, tglx, tony.luck, torvalds, tsbogend, vgupta,
will, ysato
The patch titled
Subject: mm: do page fault accounting in handle_mm_fault
has been added to the -mm tree. Its filename is
mm-do-page-fault-accounting-in-handle_mm_fault.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-do-page-fault-accounting-in-handle_mm_fault.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the page fault accounting cleanup series. It originates
from Gerald Schaefer's report a week ago about incorrect page fault
accounting for retried page faults after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):
https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
What this series does:
- Correct page fault accounting: we account a page fault (no matter
whether it comes from #PF handling, gup, or anything else) only once,
with the attempt that completes the fault. For example, page fault
retries should not be counted in the page fault counters. The same
applies to the perf events.
- Unify the definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
event is used in an ad hoc way across different archs.
Case (1): for many archs it is raised at the entry of the page fault
handler, so that it also covers e.g. erroneous faults.
Case (2): for some other archs, it is only accounted when the page
fault is resolved successfully.
Case (3): there are still quite a few archs that have not enabled
this perf event at all.
Since this series touches nearly all the archs, we unify this perf
event to always follow case (1), which is the one that makes the most
sense. And since the accounting is moved into handle_mm_fault(), the
other two MAJ/MIN perf events are taken care of naturally.
- Unify definition of "major faults": the definition of "major
fault" is slightly changed when used in accounting (not
VM_FAULT_MAJOR). More information in patch 1.
- Always account the page fault onto the one that triggered the page
fault. This does not matter much for #PF handlings, but mostly for
gup. More information on this in patch 25.
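As a rough summary, the centralized accounting introduced by patch 1
behaves like the sketch below (simplified from the mm/memory.c hunk
further down, which is authoritative; the function name here is only for
illustration):

static void mm_account_fault_sketch(struct pt_regs *regs,
				    unsigned long address,
				    unsigned int flags, vm_fault_t ret)
{
	bool major;

	/* Only completed faults are accounted; errors and retries are not. */
	if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
		return;

	/* "Major" = the final attempt was major, or the fault was retried. */
	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);

	/* Callers like gup pass regs == NULL: skip the accounting then. */
	if (!regs)
		return;

	if (major) {
		current->maj_flt++;
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
	} else {
		current->min_flt++;
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
	}
}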
Patchset layout:
Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
Patch 24: Enable the new accounting for the remaining callers (gup, iommu, etc.)
Patch 25: Clean up the GUP task_struct pointer since it is not needed any more
This patch (of 25):
This is a preparation patch to move page fault accounting into the
general code in handle_mm_fault(). This includes both the per-task
maj_flt/min_flt counters and the major/minor page fault perf events. To
do this, the pt_regs pointer is passed into handle_mm_fault().
PERF_COUNT_SW_PAGE_FAULTS should still be kept in the per-arch page fault
handlers.
So far, every pt_regs pointer passed into handle_mm_fault() is NULL, which
means this patch should have no intended functional change.
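Concretely, the prototype becomes the one below, with two intended call
styles once the rest of the series lands (illustrative fragments, not new
code in this patch):

vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
			   unsigned int flags, struct pt_regs *regs);

/* arch page fault handlers (later patches): pass regs so the per-task
 * counters and the MAJ/MIN perf events are updated in handle_mm_fault() */
fault = handle_mm_fault(vma, address, flags, regs);

/* gup and similar callers: regs == NULL, so that accounting is skipped */
ret = handle_mm_fault(vma, address, fault_flags, NULL);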
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/alpha/mm/fault.c | 2 -
arch/arc/mm/fault.c | 2 -
arch/arm/mm/fault.c | 2 -
arch/arm64/mm/fault.c | 2 -
arch/csky/mm/fault.c | 3 +
arch/hexagon/mm/vm_fault.c | 2 -
arch/ia64/mm/fault.c | 2 -
arch/m68k/mm/fault.c | 2 -
arch/microblaze/mm/fault.c | 2 -
arch/mips/mm/fault.c | 2 -
arch/nds32/mm/fault.c | 2 -
arch/nios2/mm/fault.c | 2 -
arch/openrisc/mm/fault.c | 2 -
arch/parisc/mm/fault.c | 2 -
arch/powerpc/mm/copro_fault.c | 2 -
arch/powerpc/mm/fault.c | 2 -
arch/riscv/mm/fault.c | 2 -
arch/s390/mm/fault.c | 2 -
arch/sh/mm/fault.c | 2 -
arch/sparc/mm/fault_32.c | 4 +-
arch/sparc/mm/fault_64.c | 2 -
arch/um/kernel/trap.c | 2 -
arch/x86/mm/fault.c | 2 -
arch/xtensa/mm/fault.c | 2 -
drivers/iommu/amd/iommu_v2.c | 2 -
drivers/iommu/intel/svm.c | 3 +
include/linux/mm.h | 7 ++-
mm/gup.c | 4 +-
mm/hmm.c | 3 +
mm/ksm.c | 3 +
mm/memory.c | 64 +++++++++++++++++++++++++++++++-
31 files changed, 103 insertions(+), 34 deletions(-)
--- a/arch/alpha/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/alpha/mm/fault.c
@@ -148,7 +148,7 @@ retry:
/* If for any reason at all we couldn't handle the fault,
make sure we exit gracefully rather than endlessly redo
the fault. */
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/arc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arc/mm/fault.c
@@ -130,7 +130,7 @@ retry:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
--- a/arch/arm64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm64/mm/fault.c
@@ -428,7 +428,7 @@ static vm_fault_t __do_page_fault(struct
*/
if (!(vma->vm_flags & vm_flags))
return VM_FAULT_BADACCESS;
- return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags);
+ return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
}
static bool is_el0_instruction_abort(unsigned int esr)
--- a/arch/arm/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/arm/mm/fault.c
@@ -224,7 +224,7 @@ good_area:
goto out;
}
- return handle_mm_fault(vma, addr & PAGE_MASK, flags);
+ return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
check_stack:
/* Don't allow expansion below FIRST_USER_ADDRESS */
--- a/arch/csky/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/csky/mm/fault.c
@@ -150,7 +150,8 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0);
+ fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
+ NULL);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
--- a/arch/hexagon/mm/vm_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/hexagon/mm/vm_fault.c
@@ -88,7 +88,7 @@ good_area:
break;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/ia64/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/ia64/mm/fault.c
@@ -143,7 +143,7 @@ retry:
* sure we exit gracefully rather than endlessly redo the
* fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/m68k/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/m68k/mm/fault.c
@@ -134,7 +134,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
pr_debug("handle_mm_fault returns %x\n", fault);
if (fault_signal_pending(fault, regs))
--- a/arch/microblaze/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/microblaze/mm/fault.c
@@ -214,7 +214,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/mips/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/mips/mm/fault.c
@@ -152,7 +152,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/nds32/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nds32/mm/fault.c
@@ -206,7 +206,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags);
+ fault = handle_mm_fault(vma, addr, flags, NULL);
/*
* If we need to retry but a fatal signal is pending, handle the
--- a/arch/nios2/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/nios2/mm/fault.c
@@ -131,7 +131,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/openrisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/openrisc/mm/fault.c
@@ -159,7 +159,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/parisc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/parisc/mm/fault.c
@@ -302,7 +302,7 @@ good_area:
* fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/arch/powerpc/mm/copro_fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/copro_fault.c
@@ -64,7 +64,7 @@ int copro_handle_mm_fault(struct mm_stru
}
ret = 0;
- *flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0);
+ *flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0, NULL);
if (unlikely(*flt & VM_FAULT_ERROR)) {
if (*flt & VM_FAULT_OOM) {
ret = -ENOMEM;
--- a/arch/powerpc/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
major |= fault & VM_FAULT_MAJOR;
--- a/arch/riscv/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags);
+ fault = handle_mm_fault(vma, addr, flags, NULL);
/*
* If we need to retry but a fatal signal is pending, handle the
--- a/arch/s390/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ retry:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs)) {
fault = VM_FAULT_SIGNAL;
if (flags & FAULT_FLAG_RETRY_NOWAIT)
--- a/arch/sh/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sh/mm/fault.c
@@ -482,7 +482,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
if (mm_fault_error(regs, error_code, address, fault))
--- a/arch/sparc/mm/fault_32.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
@@ -410,7 +410,7 @@ good_area:
if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
goto bad_area;
}
- switch (handle_mm_fault(vma, address, flags)) {
+ switch (handle_mm_fault(vma, address, flags, NULL)) {
case VM_FAULT_SIGBUS:
case VM_FAULT_OOM:
goto do_sigbus;
--- a/arch/sparc/mm/fault_64.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ good_area:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
goto exit_exception;
--- a/arch/um/kernel/trap.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/um/kernel/trap.c
@@ -71,7 +71,7 @@ good_area:
do {
vm_fault_t fault;
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
goto out_nosemaphore;
--- a/arch/x86/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/x86/mm/fault.c
@@ -1291,7 +1291,7 @@ good_area:
* userland). The return to userland is identified whenever
* FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
major |= fault & VM_FAULT_MAJOR;
/* Quick path to respond to signals */
--- a/arch/xtensa/mm/fault.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/arch/xtensa/mm/fault.c
@@ -107,7 +107,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags);
+ fault = handle_mm_fault(vma, address, flags, NULL);
if (fault_signal_pending(fault, regs))
return;
--- a/drivers/iommu/amd/iommu_v2.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/amd/iommu_v2.c
@@ -495,7 +495,7 @@ static void do_fault(struct work_struct
if (access_error(vma, fault))
goto out;
- ret = handle_mm_fault(vma, address, flags);
+ ret = handle_mm_fault(vma, address, flags, NULL);
out:
mmap_read_unlock(mm);
--- a/drivers/iommu/intel/svm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/drivers/iommu/intel/svm.c
@@ -872,7 +872,8 @@ static irqreturn_t prq_event_thread(int
goto invalid;
ret = handle_mm_fault(vma, address,
- req->wr_req ? FAULT_FLAG_WRITE : 0);
+ req->wr_req ? FAULT_FLAG_WRITE : 0,
+ NULL);
if (ret & VM_FAULT_ERROR)
goto invalid;
--- a/include/linux/mm.h~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/include/linux/mm.h
@@ -38,6 +38,7 @@ struct file_ra_state;
struct user_struct;
struct writeback_control;
struct bdi_writeback;
+struct pt_regs;
void init_mm_internals(void);
@@ -1650,7 +1651,8 @@ int invalidate_inode_page(struct page *p
#ifdef CONFIG_MMU
extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
- unsigned long address, unsigned int flags);
+ unsigned long address, unsigned int flags,
+ struct pt_regs *regs);
extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked);
@@ -1660,7 +1662,8 @@ void unmap_mapping_range(struct address_
loff_t const holebegin, loff_t const holelen, int even_cows);
#else
static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
- unsigned long address, unsigned int flags)
+ unsigned long address, unsigned int flags,
+ struct pt_regs *regs)
{
/* should never happen if there's no MMU */
BUG();
--- a/mm/gup.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/gup.c
@@ -884,7 +884,7 @@ static int faultin_page(struct task_stru
fault_flags |= FAULT_FLAG_TRIED;
}
- ret = handle_mm_fault(vma, address, fault_flags);
+ ret = handle_mm_fault(vma, address, fault_flags, NULL);
if (ret & VM_FAULT_ERROR) {
int err = vm_fault_to_errno(ret, *flags);
@@ -1238,7 +1238,7 @@ retry:
fatal_signal_pending(current))
return -EINTR;
- ret = handle_mm_fault(vma, address, fault_flags);
+ ret = handle_mm_fault(vma, address, fault_flags, NULL);
major |= ret & VM_FAULT_MAJOR;
if (ret & VM_FAULT_ERROR) {
int err = vm_fault_to_errno(ret, 0);
--- a/mm/hmm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/hmm.c
@@ -75,7 +75,8 @@ static int hmm_vma_fault(unsigned long a
}
for (; addr < end; addr += PAGE_SIZE)
- if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
+ if (handle_mm_fault(vma, addr, fault_flags, NULL) &
+ VM_FAULT_ERROR)
return -EFAULT;
return -EBUSY;
}
--- a/mm/ksm.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/ksm.c
@@ -480,7 +480,8 @@ static int break_ksm(struct vm_area_stru
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
- FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
+ FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE,
+ NULL);
else
ret = VM_FAULT_WRITE;
put_page(page);
--- a/mm/memory.c~mm-do-page-fault-accounting-in-handle_mm_fault
+++ a/mm/memory.c
@@ -71,6 +71,8 @@
#include <linux/dax.h>
#include <linux/oom.h>
#include <linux/numa.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
#include <trace/events/kmem.h>
@@ -4365,6 +4367,64 @@ retry_pud:
return handle_pte_fault(&vmf);
}
+/**
+ * mm_account_fault - Do page fault accountings
+ *
+ * @regs: the pt_regs struct pointer. When set to NULL, will skip accounting
+ * of perf event counters, but we'll still do the per-task accounting to
+ * the task who triggered this page fault.
+ * @address: the faulted address.
+ * @flags: the fault flags.
+ * @ret: the fault retcode.
+ *
+ * This will take care of most of the page fault accountings. Meanwhile, it
+ * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
+ * updates. However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
+ * still be in per-arch page fault handlers at the entry of page fault.
+ */
+static inline void mm_account_fault(struct pt_regs *regs,
+ unsigned long address, unsigned int flags,
+ vm_fault_t ret)
+{
+ bool major;
+
+ /*
+ * We don't do accounting for some specific faults:
+ *
+ * - Unsuccessful faults (e.g. when the address wasn't valid). That
+ * includes arch_vma_access_permitted() failing before reaching here.
+ * So this is not a "this many hardware page faults" counter. We
+ * should use the hw profiling for that.
+ *
+ * - Incomplete faults (VM_FAULT_RETRY). They will only be counted
+ * once they're completed.
+ */
+ if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
+ return;
+
+ /*
+ * We define the fault as a major fault when the final successful fault
+ * is VM_FAULT_MAJOR, or if it retried (which implies that we couldn't
+ * handle it immediately previously).
+ */
+ major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+
+ /*
+ * If the fault is done for GUP, regs will be NULL, and we will skip
+ * the fault accounting.
+ */
+ if (!regs)
+ return;
+
+ if (major) {
+ current->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+ } else {
+ current->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
+ }
+}
+
/*
* By the time we get here, we already hold the mm semaphore
*
@@ -4372,7 +4432,7 @@ retry_pud:
* return value. See filemap_fault() and __lock_page_or_retry().
*/
vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
- unsigned int flags)
+ unsigned int flags, struct pt_regs *regs)
{
vm_fault_t ret;
@@ -4413,6 +4473,8 @@ vm_fault_t handle_mm_fault(struct vm_are
mem_cgroup_oom_synchronize(false);
}
+ mm_account_fault(regs, address, flags, ret);
+
return ret;
}
EXPORT_SYMBOL_GPL(handle_mm_fault);
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-alpha-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (67 preceding siblings ...)
2020-07-09 0:06 ` + mm-do-page-fault-accounting-in-handle_mm_fault.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-arc-use-general-page-fault-accounting.patch " Andrew Morton
` (163 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: ink, mattst88, mm-commits, peterx, rth
The patch titled
Subject: mm/alpha: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-alpha-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-alpha-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-alpha-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/alpha: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now
handled in handle_mm_fault().
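After this conversion the handler follows the shape sketched below (a
condensed, illustrative fragment; the hunks that follow are authoritative,
and the same pattern repeats in the remaining per-arch patches):

	if (user_mode(regs))
		flags |= FAULT_FLAG_USER;
	/* counted once per fault, even if the lookup below is retried */
	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
	mmap_read_lock(mm);
	vma = find_vma(mm, address);
	/* ... vma checks elided ... */
	fault = handle_mm_fault(vma, address, flags, regs);
	/* maj_flt/min_flt and the MAJ/MIN perf events are updated in there,
	   so the old per-arch major/minor bookkeeping goes away */
	if (fault & VM_FAULT_RETRY) {
		/* on retry the fault path has dropped the mmap lock for us */
		flags |= FAULT_FLAG_TRIED;
		goto retry;
	}
	mmap_read_unlock(mm);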
Link: http://lkml.kernel.org/r/20200707225021.200906-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/alpha/mm/fault.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
--- a/arch/alpha/mm/fault.c~mm-alpha-use-general-page-fault-accounting
+++ a/arch/alpha/mm/fault.c
@@ -25,6 +25,7 @@
#include <linux/interrupt.h>
#include <linux/extable.h>
#include <linux/uaccess.h>
+#include <linux/perf_event.h>
extern void die_if_kernel(char *,struct pt_regs *,long, unsigned long *);
@@ -116,6 +117,7 @@ do_page_fault(unsigned long address, uns
#endif
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
@@ -148,7 +150,7 @@ retry:
/* If for any reason at all we couldn't handle the fault,
make sure we exit gracefully rather than endlessly redo
the fault. */
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -164,10 +166,6 @@ retry:
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-arc-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (68 preceding siblings ...)
2020-07-09 0:06 ` + mm-alpha-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-arm-use-general-page-fault-accounting.patch " Andrew Morton
` (162 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: mm-commits, peterx, vgupta
The patch titled
Subject: mm/arc: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-arc-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-arc-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-arc-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arc: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried.
Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries, by moving it before taking mmap_sem.
Link: http://lkml.kernel.org/r/20200707225021.200906-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/arc/mm/fault.c | 18 +++---------------
1 file changed, 3 insertions(+), 15 deletions(-)
--- a/arch/arc/mm/fault.c~mm-arc-use-general-page-fault-accounting
+++ a/arch/arc/mm/fault.c
@@ -105,6 +105,7 @@ void do_page_fault(unsigned long address
if (write)
flags |= FAULT_FLAG_WRITE;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
@@ -130,7 +131,7 @@ retry:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
@@ -155,22 +156,9 @@ bad_area:
* Major/minor page fault accounting
* (in case of retry we only land here once)
*/
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
-
- if (likely(!(fault & VM_FAULT_ERROR))) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
- regs, address);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
- regs, address);
- }
-
+ if (likely(!(fault & VM_FAULT_ERROR)))
/* Normal return path: fault Handled Gracefully */
return;
- }
if (!user_mode(regs))
goto no_context;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-arm-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (69 preceding siblings ...)
2020-07-09 0:06 ` + mm-arc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-arm64-use-general-page-fault-accounting.patch " Andrew Morton
` (161 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: linux, mm-commits, peterx, will
The patch titled
Subject: mm/arm: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-arm-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-arm-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-arm-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arm: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried. To do this, we need to pass
the pt_regs pointer into __do_page_fault().
Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries, by moving it before taking mmap_sem.
Link: http://lkml.kernel.org/r/20200707225021.200906-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/arm/mm/fault.c | 25 ++++++-------------------
1 file changed, 6 insertions(+), 19 deletions(-)
--- a/arch/arm/mm/fault.c~mm-arm-use-general-page-fault-accounting
+++ a/arch/arm/mm/fault.c
@@ -202,7 +202,8 @@ static inline bool access_error(unsigned
static vm_fault_t __kprobes
__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
- unsigned int flags, struct task_struct *tsk)
+ unsigned int flags, struct task_struct *tsk,
+ struct pt_regs *regs)
{
struct vm_area_struct *vma;
vm_fault_t fault;
@@ -224,7 +225,7 @@ good_area:
goto out;
}
- return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
+ return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
check_stack:
/* Don't allow expansion below FIRST_USER_ADDRESS */
@@ -266,6 +267,8 @@ do_page_fault(unsigned long addr, unsign
if ((fsr & FSR_WRITE) && !(fsr & FSR_CM))
flags |= FAULT_FLAG_WRITE;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
/*
* As per x86, we may deadlock here. However, since the kernel only
* validly references user space from well defined areas of the code,
@@ -290,7 +293,7 @@ retry:
#endif
}
- fault = __do_page_fault(mm, addr, fsr, flags, tsk);
+ fault = __do_page_fault(mm, addr, fsr, flags, tsk, regs);
/* If we need to retry but a fatal signal is pending, handle the
* signal first. We do not need to release the mmap_lock because
@@ -302,23 +305,7 @@ retry:
return 0;
}
- /*
- * Major/minor page fault accounting is only done on the
- * initial attempt. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at that point.
- */
-
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
if (!(fault & VM_FAULT_ERROR) && flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
- regs, addr);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
- regs, addr);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
goto retry;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-arm64-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (70 preceding siblings ...)
2020-07-09 0:06 ` + mm-arm-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-csky-use-general-page-fault-accounting.patch " Andrew Morton
` (160 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: catalin.marinas, mm-commits, peterx, will
The patch titled
Subject: mm/arm64: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-arm64-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-arm64-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-arm64-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/arm64: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried. To do this, we pass the pt_regs
pointer into __do_page_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/arm64/mm/fault.c | 29 ++++++-----------------------
1 file changed, 6 insertions(+), 23 deletions(-)
--- a/arch/arm64/mm/fault.c~mm-arm64-use-general-page-fault-accounting
+++ a/arch/arm64/mm/fault.c
@@ -404,7 +404,8 @@ static void do_bad_area(unsigned long ad
#define VM_FAULT_BADACCESS 0x020000
static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
- unsigned int mm_flags, unsigned long vm_flags)
+ unsigned int mm_flags, unsigned long vm_flags,
+ struct pt_regs *regs)
{
struct vm_area_struct *vma = find_vma(mm, addr);
@@ -428,7 +429,7 @@ static vm_fault_t __do_page_fault(struct
*/
if (!(vma->vm_flags & vm_flags))
return VM_FAULT_BADACCESS;
- return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
+ return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, regs);
}
static bool is_el0_instruction_abort(unsigned int esr)
@@ -450,7 +451,7 @@ static int __kprobes do_page_fault(unsig
{
const struct fault_info *inf;
struct mm_struct *mm = current->mm;
- vm_fault_t fault, major = 0;
+ vm_fault_t fault;
unsigned long vm_flags = VM_ACCESS_FLAGS;
unsigned int mm_flags = FAULT_FLAG_DEFAULT;
@@ -516,8 +517,7 @@ retry:
#endif
}
- fault = __do_page_fault(mm, addr, mm_flags, vm_flags);
- major |= fault & VM_FAULT_MAJOR;
+ fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
@@ -538,25 +538,8 @@ retry:
* Handle the "normal" (no error) case first.
*/
if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
- VM_FAULT_BADACCESS)))) {
- /*
- * Major/minor page fault accounting is only done
- * once. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at
- * that point.
- */
- if (major) {
- current->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
- addr);
- } else {
- current->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
- addr);
- }
-
+ VM_FAULT_BADACCESS))))
return 0;
- }
/*
* If we are in kernel mode at this point, we have no context to
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-csky-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (71 preceding siblings ...)
2020-07-09 0:06 ` + mm-arm64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-hexagon-use-general-page-fault-accounting.patch " Andrew Morton
` (159 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: guoren, mm-commits, peterx
The patch titled
Subject: mm/csky: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-csky-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-csky-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-csky-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/csky: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried.
Link: http://lkml.kernel.org/r/20200707225021.200906-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/csky/mm/fault.c | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
--- a/arch/csky/mm/fault.c~mm-csky-use-general-page-fault-accounting
+++ a/arch/csky/mm/fault.c
@@ -151,7 +151,7 @@ good_area:
* the fault.
*/
fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
- NULL);
+ regs);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
@@ -161,16 +161,6 @@ good_area:
goto bad_area;
BUG();
}
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
- address);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
- address);
- }
-
mmap_read_unlock(mm);
return;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-hexagon-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (72 preceding siblings ...)
2020-07-09 0:06 ` + mm-csky-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-ia64-use-general-page-fault-accounting.patch " Andrew Morton
` (158 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: bcain, mm-commits, peterx
The patch titled
Subject: mm/hexagon: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-hexagon-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hexagon-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hexagon-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/hexagon: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now
handled in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-8-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Brian Cain <bcain@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/hexagon/mm/vm_fault.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/arch/hexagon/mm/vm_fault.c~mm-hexagon-use-general-page-fault-accounting
+++ a/arch/hexagon/mm/vm_fault.c
@@ -18,6 +18,7 @@
#include <linux/signal.h>
#include <linux/extable.h>
#include <linux/hardirq.h>
+#include <linux/perf_event.h>
/*
* Decode of hardware exception sends us to one of several
@@ -53,6 +54,8 @@ void do_page_fault(unsigned long address
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
@@ -88,7 +91,7 @@ good_area:
break;
}
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -96,10 +99,6 @@ good_area:
/* The most common case -- we are done. */
if (likely(!(fault & VM_FAULT_ERROR))) {
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
goto retry;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-ia64-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (73 preceding siblings ...)
2020-07-09 0:06 ` + mm-hexagon-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-m68k-use-general-page-fault-accounting.patch " Andrew Morton
` (157 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: mm-commits, peterx, tony.luck
The patch titled
Subject: mm/ia64: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-ia64-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-ia64-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-ia64-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/ia64: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). This naturally solves the issue of multiple page fault
accounting when a page fault is retried.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now
handled in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/ia64/mm/fault.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/arch/ia64/mm/fault.c~mm-ia64-use-general-page-fault-accounting
+++ a/arch/ia64/mm/fault.c
@@ -14,6 +14,7 @@
#include <linux/kdebug.h>
#include <linux/prefetch.h>
#include <linux/uaccess.h>
+#include <linux/perf_event.h>
#include <asm/processor.h>
#include <asm/exception.h>
@@ -105,6 +106,8 @@ ia64_do_page_fault (unsigned long addres
flags |= FAULT_FLAG_USER;
if (mask & VM_WRITE)
flags |= FAULT_FLAG_WRITE;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
@@ -143,7 +146,7 @@ retry:
* sure we exit gracefully rather than endlessly redo the
* fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -166,10 +169,6 @@ retry:
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-m68k-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (74 preceding siblings ...)
2020-07-09 0:06 ` + mm-ia64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-microblaze-use-general-page-fault-accounting.patch " Andrew Morton
` (156 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: geert, mm-commits, peterx
The patch titled
Subject: mm/m68k: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-m68k-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-m68k-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-m68k-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/m68k: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now done
in handle_mm_fault().
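For context, the accounting removed from each architecture below is now
concentrated in a small helper that handle_mm_fault() runs once the fault
has completed (introduced by the first patch of this series,
mm-do-page-fault-accounting-in-handle_mm_fault.patch). The following is a
simplified, illustrative sketch of that helper, not the exact code; the
function name and comments are reconstructed here:

static inline void mm_account_fault(struct pt_regs *regs,
				    unsigned long address,
				    unsigned int flags, vm_fault_t ret)
{
	bool major;

	/*
	 * Failed faults and retries are not counted; only the attempt that
	 * finally completes the fault is accounted, which is what avoids
	 * double counting when a page fault retry happens.
	 */
	if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
		return;

	/*
	 * A fault that needed a retry is treated as major even if the
	 * final attempt itself did not return VM_FAULT_MAJOR.
	 */
	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);

	if (major)
		current->maj_flt++;
	else
		current->min_flt++;

	/* gup passes regs == NULL: task counters only, no perf events. */
	if (!regs)
		return;

	if (major)
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
	else
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
}

current->maj_flt and current->min_flt are what userspace later sees as
ru_majflt/ru_minflt from getrusage() and in /proc/<pid>/stat, so the
per-arch conversions mostly move where the counting happens; the main
user-visible change is that retried faults are no longer counted twice.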
Link: http://lkml.kernel.org/r/20200707225021.200906-10-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/m68k/mm/fault.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
--- a/arch/m68k/mm/fault.c~mm-m68k-use-general-page-fault-accounting
+++ a/arch/m68k/mm/fault.c
@@ -12,6 +12,7 @@
#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/uaccess.h>
+#include <linux/perf_event.h>
#include <asm/setup.h>
#include <asm/traps.h>
@@ -84,6 +85,8 @@ int do_page_fault(struct pt_regs *regs,
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
@@ -134,7 +137,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
pr_debug("handle_mm_fault returns %x\n", fault);
if (fault_signal_pending(fault, regs))
@@ -150,16 +153,7 @@ good_area:
BUG();
}
- /*
- * Major/minor page fault accounting is only done on the
- * initial attempt. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at that point.
- */
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-microblaze-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (75 preceding siblings ...)
2020-07-09 0:06 ` + mm-m68k-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:06 ` + mm-mips-use-general-page-fault-accounting.patch " Andrew Morton
` (155 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: mm-commits, monstr, peterx
The patch titled
Subject: mm/microblaze: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-microblaze-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-microblaze-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-microblaze-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/microblaze: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now done
in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-11-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/microblaze/mm/fault.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/arch/microblaze/mm/fault.c~mm-microblaze-use-general-page-fault-accounting
+++ a/arch/microblaze/mm/fault.c
@@ -28,6 +28,7 @@
#include <linux/mman.h>
#include <linux/mm.h>
#include <linux/interrupt.h>
+#include <linux/perf_event.h>
#include <asm/page.h>
#include <asm/mmu.h>
@@ -121,6 +122,8 @@ void do_page_fault(struct pt_regs *regs,
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
/* When running in the kernel we expect faults to occur only to
* addresses in user space. All other faults represent errors in the
* kernel and should generate an OOPS. Unfortunately, in the case of an
@@ -214,7 +217,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -230,10 +233,6 @@ good_area:
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (unlikely(fault & VM_FAULT_MAJOR))
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-mips-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (76 preceding siblings ...)
2020-07-09 0:06 ` + mm-microblaze-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:06 ` Andrew Morton
2020-07-09 0:07 ` + mm-nds32-use-general-page-fault-accounting.patch " Andrew Morton
` (154 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:06 UTC (permalink / raw)
To: mm-commits, peterx, tsbogend
The patch titled
Subject: mm/mips: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-mips-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-mips-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-mips-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/mips: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries by moving it before mmap_sem is taken, so that it is counted
only once per fault rather than once per retry.
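The reason the event has to move is that everything after the retry label
can run more than once for a single logical fault. As a minimal,
architecture-neutral skeleton of a converted handler (illustrative only,
not any one architecture's exact code):

	if (user_mode(regs))
		flags |= FAULT_FLAG_USER;

	/* Counted once per logical fault, before any retry can happen. */
	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
	mmap_read_lock(mm);
	vma = find_vma(mm, address);
	/* ... vma and access checks, goto bad_area on failure ... */

	/*
	 * Passing regs lets the generic code account maj_flt/min_flt and
	 * emit the MAJ/MIN perf events exactly once, when the fault
	 * finally completes.
	 */
	fault = handle_mm_fault(vma, address, flags, regs);
	if (fault_signal_pending(fault, regs))
		return;

	if (flags & FAULT_FLAG_ALLOW_RETRY) {
		if (fault & VM_FAULT_RETRY) {
			flags |= FAULT_FLAG_TRIED;
			/* mmap lock already dropped by handle_mm_fault() */
			goto retry;
		}
	}
	mmap_read_unlock(mm);

With the old placement after handle_mm_fault(), a fault that was retried
went through the perf_sw_event() call once per attempt instead of once
per fault.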
Link: http://lkml.kernel.org/r/20200707225021.200906-12-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/mips/mm/fault.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
--- a/arch/mips/mm/fault.c~mm-mips-use-general-page-fault-accounting
+++ a/arch/mips/mm/fault.c
@@ -96,6 +96,8 @@ static void __kprobes __do_page_fault(st
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
@@ -152,12 +154,11 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
@@ -168,15 +169,6 @@ good_area:
BUG();
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
- regs, address);
- tsk->maj_flt++;
- } else {
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
- regs, address);
- tsk->min_flt++;
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-nds32-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (77 preceding siblings ...)
2020-07-09 0:06 ` + mm-mips-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-nios2-use-general-page-fault-accounting.patch " Andrew Morton
` (153 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: deanbo422, green.hu, mm-commits, nickhu, peterx
The patch titled
Subject: mm/nds32: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-nds32-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-nds32-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-nds32-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/nds32: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Fix the PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault
retries by moving it before mmap_sem is taken, so that it is counted
only once per fault rather than once per retry.
Link: http://lkml.kernel.org/r/20200707225021.200906-13-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/nds32/mm/fault.c | 19 +++----------------
1 file changed, 3 insertions(+), 16 deletions(-)
--- a/arch/nds32/mm/fault.c~mm-nds32-use-general-page-fault-accounting
+++ a/arch/nds32/mm/fault.c
@@ -121,6 +121,8 @@ void do_page_fault(unsigned long entry,
if (unlikely(faulthandler_disabled() || !mm))
goto no_context;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
/*
* As per x86, we may deadlock here. However, since the kernel only
* validly references user space from well defined areas of the code,
@@ -206,7 +208,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags, NULL);
+ fault = handle_mm_fault(vma, addr, flags, regs);
/*
* If we need to retry but a fatal signal is pending, handle the
@@ -228,22 +230,7 @@ good_area:
goto bad_area;
}
- /*
- * Major/minor page fault accounting is only done on the initial
- * attempt. If we go through a retry, it is extremely likely that the
- * page will be found in page cache at that point.
- */
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
- 1, regs, addr);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
- 1, regs, addr);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-nios2-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (78 preceding siblings ...)
2020-07-09 0:07 ` + mm-nds32-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-openrisc-use-general-page-fault-accounting.patch " Andrew Morton
` (152 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: ley.foon.tan, mm-commits, peterx
The patch titled
Subject: mm/nios2: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-nios2-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-nios2-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-nios2-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/nios2: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now done
in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-14-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/nios2/mm/fault.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
--- a/arch/nios2/mm/fault.c~mm-nios2-use-general-page-fault-accounting
+++ a/arch/nios2/mm/fault.c
@@ -24,6 +24,7 @@
#include <linux/mm.h>
#include <linux/extable.h>
#include <linux/uaccess.h>
+#include <linux/perf_event.h>
#include <asm/mmu_context.h>
#include <asm/traps.h>
@@ -83,6 +84,8 @@ asmlinkage void do_page_fault(struct pt_
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
if (!mmap_read_trylock(mm)) {
if (!user_mode(regs) && !search_exception_tables(regs->ea))
goto bad_area_nosemaphore;
@@ -131,7 +134,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -146,16 +149,7 @@ good_area:
BUG();
}
- /*
- * Major/minor page fault accounting is only done on the
- * initial attempt. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at that point.
- */
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-openrisc-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (79 preceding siblings ...)
2020-07-09 0:07 ` + mm-nios2-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-parisc-use-general-page-fault-accounting.patch " Andrew Morton
` (151 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: jonas, mm-commits, peterx, shorne, stefan.kristiansson
The patch titled
Subject: mm/openrisc: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-openrisc-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-openrisc-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-openrisc-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/openrisc: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now done
in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-15-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Stafford Horne <shorne@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/openrisc/mm/fault.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--- a/arch/openrisc/mm/fault.c~mm-openrisc-use-general-page-fault-accounting
+++ a/arch/openrisc/mm/fault.c
@@ -15,6 +15,7 @@
#include <linux/interrupt.h>
#include <linux/extable.h>
#include <linux/sched/signal.h>
+#include <linux/perf_event.h>
#include <linux/uaccess.h>
#include <asm/siginfo.h>
@@ -103,6 +104,8 @@ asmlinkage void do_page_fault(struct pt_
if (in_interrupt() || !mm)
goto no_context;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
@@ -159,7 +162,7 @@ good_area:
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -176,10 +179,6 @@ good_area:
if (flags & FAULT_FLAG_ALLOW_RETRY) {
/*RGD modeled on Cris */
- if (fault & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-parisc-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (80 preceding siblings ...)
2020-07-09 0:07 ` + mm-openrisc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-powerpc-use-general-page-fault-accounting.patch " Andrew Morton
` (150 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: deller, James.Bottomley, mm-commits, peterx
The patch titled
Subject: mm/parisc: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-parisc-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-parisc-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-parisc-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/parisc: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Add the missing PERF_COUNT_SW_PAGE_FAULTS perf event too. Note that the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) are now done
in handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-16-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/parisc/mm/fault.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
--- a/arch/parisc/mm/fault.c~mm-parisc-use-general-page-fault-accounting
+++ a/arch/parisc/mm/fault.c
@@ -18,6 +18,7 @@
#include <linux/extable.h>
#include <linux/uaccess.h>
#include <linux/hugetlb.h>
+#include <linux/perf_event.h>
#include <asm/traps.h>
@@ -281,6 +282,7 @@ void do_page_fault(struct pt_regs *regs,
acc_type = parisc_acctyp(code, regs->iir);
if (acc_type & VM_WRITE)
flags |= FAULT_FLAG_WRITE;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma_prev(mm, address, &prev_vma);
@@ -302,7 +304,7 @@ good_area:
* fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -323,10 +325,6 @@ good_area:
BUG();
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
/*
* No need to mmap_read_unlock(mm) as we would
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-powerpc-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (81 preceding siblings ...)
2020-07-09 0:07 ` + mm-parisc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-riscv-use-general-page-fault-accounting.patch " Andrew Morton
` (149 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: benh, mm-commits, mpe, paulus, peterx
The patch titled
Subject: mm/powerpc: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-powerpc-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-powerpc-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-powerpc-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/powerpc: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/powerpc/mm/fault.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
--- a/arch/powerpc/mm/fault.c~mm-powerpc-use-general-page-fault-accounting
+++ a/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
major |= fault & VM_FAULT_MAJOR;
@@ -633,14 +633,9 @@ good_area:
/*
* Major/minor page fault accounting.
*/
- if (major) {
- current->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+ if (major)
cmo_account_page_fault();
- } else {
- current->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
- }
+
return 0;
}
NOKPROBE_SYMBOL(__do_page_fault);
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-riscv-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (82 preceding siblings ...)
2020-07-09 0:07 ` + mm-powerpc-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-s390-use-general-page-fault-accounting.patch " Andrew Morton
` (148 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: aou, mm-commits, palmer, paul.walmsley, penberg, peterx
The patch titled
Subject: mm/riscv: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-riscv-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-riscv-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-riscv-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/riscv: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Link: http://lkml.kernel.org/r/20200707225021.200906-18-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/riscv/mm/fault.c | 16 +---------------
1 file changed, 1 insertion(+), 15 deletions(-)
--- a/arch/riscv/mm/fault.c~mm-riscv-use-general-page-fault-accounting
+++ a/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, addr, flags, NULL);
+ fault = handle_mm_fault(vma, addr, flags, regs);
/*
* If we need to retry but a fatal signal is pending, handle the
@@ -127,21 +127,7 @@ good_area:
BUG();
}
- /*
- * Major/minor page fault accounting is only done on the
- * initial attempt. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at that point.
- */
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
- 1, regs, addr);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
- 1, regs, addr);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-s390-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (83 preceding siblings ...)
2020-07-09 0:07 ` + mm-riscv-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-sh-use-general-page-fault-accounting.patch " Andrew Morton
` (147 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: agordeev, borntraeger, gerald.schaefer, gor, heiko.carstens,
mm-commits, peterx
The patch titled
Subject: mm/s390: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-s390-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-s390-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-s390-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/s390: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/s390/mm/fault.c | 16 +---------------
1 file changed, 1 insertion(+), 15 deletions(-)
--- a/arch/s390/mm/fault.c~mm-s390-use-general-page-fault-accounting
+++ a/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ retry:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs)) {
fault = VM_FAULT_SIGNAL;
if (flags & FAULT_FLAG_RETRY_NOWAIT)
@@ -488,21 +488,7 @@ retry:
if (unlikely(fault & VM_FAULT_ERROR))
goto out_up;
- /*
- * Major/minor page fault accounting is only done on the
- * initial attempt. If we go through a retry, it is extremely
- * likely that the page will be found in page cache at that point.
- */
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
- regs, address);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
- regs, address);
- }
if (fault & VM_FAULT_RETRY) {
if (IS_ENABLED(CONFIG_PGSTE) && gmap &&
(flags & FAULT_FLAG_RETRY_NOWAIT)) {
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-sh-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (84 preceding siblings ...)
2020-07-09 0:07 ` + mm-s390-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-sparc32-use-general-page-fault-accounting.patch " Andrew Morton
` (146 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: dalias, mm-commits, peterx, ysato
The patch titled
Subject: mm/sh: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-sh-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-sh-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-sh-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sh: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Link: http://lkml.kernel.org/r/20200707225021.200906-20-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/sh/mm/fault.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
--- a/arch/sh/mm/fault.c~mm-sh-use-general-page-fault-accounting
+++ a/arch/sh/mm/fault.c
@@ -482,22 +482,13 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
if (mm_fault_error(regs, error_code, address, fault))
return;
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
- regs, address);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
- regs, address);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-sparc32-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (85 preceding siblings ...)
2020-07-09 0:07 ` + mm-sh-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-sparc64-use-general-page-fault-accounting.patch " Andrew Morton
` (145 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: davem, mm-commits, peterx
The patch titled
Subject: mm/sparc32: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-sparc32-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-sparc32-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparc32-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sparc32: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Link: http://lkml.kernel.org/r/20200707225021.200906-21-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/sparc/mm/fault_32.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
--- a/arch/sparc/mm/fault_32.c~mm-sparc32-use-general-page-fault-accounting
+++ a/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -250,15 +250,6 @@ good_area:
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- current->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
- 1, regs, address);
- } else {
- current->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
- 1, regs, address);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-sparc64-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (86 preceding siblings ...)
2020-07-09 0:07 ` + mm-sparc32-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-x86-use-general-page-fault-accounting.patch " Andrew Morton
` (144 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: davem, mm-commits, peterx
The patch titled
Subject: mm/sparc64: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-sparc64-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-sparc64-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparc64-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/sparc64: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Link: http://lkml.kernel.org/r/20200707225021.200906-22-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/sparc/mm/fault_64.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
--- a/arch/sparc/mm/fault_64.c~mm-sparc64-use-general-page-fault-accounting
+++ a/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ good_area:
goto bad_area;
}
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
goto exit_exception;
@@ -438,15 +438,6 @@ good_area:
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR) {
- current->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
- 1, regs, address);
- } else {
- current->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
- 1, regs, address);
- }
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-x86-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (87 preceding siblings ...)
2020-07-09 0:07 ` + mm-sparc64-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-xtensa-use-general-page-fault-accounting.patch " Andrew Morton
` (143 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: bp, dave.hansen, hpa, luto, mingo, mm-commits, peterx, peterz, tglx
The patch titled
Subject: mm/x86: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-x86-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-x86-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-x86-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/x86: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().
Link: http://lkml.kernel.org/r/20200707225021.200906-23-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86/mm/fault.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
--- a/arch/x86/mm/fault.c~mm-x86-use-general-page-fault-accounting
+++ a/arch/x86/mm/fault.c
@@ -1139,7 +1139,7 @@ void do_user_addr_fault(struct pt_regs *
struct vm_area_struct *vma;
struct task_struct *tsk;
struct mm_struct *mm;
- vm_fault_t fault, major = 0;
+ vm_fault_t fault;
unsigned int flags = FAULT_FLAG_DEFAULT;
tsk = current;
@@ -1291,8 +1291,7 @@ good_area:
* userland). The return to userland is identified whenever
* FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
- major |= fault & VM_FAULT_MAJOR;
+ fault = handle_mm_fault(vma, address, flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
@@ -1319,18 +1318,6 @@ good_area:
return;
}
- /*
- * Major/minor page fault accounting. If any of the events
- * returned VM_FAULT_MAJOR, we account it as a major fault.
- */
- if (major) {
- tsk->maj_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
- } else {
- tsk->min_flt++;
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
- }
-
check_v8086_mode(regs, address, tsk);
}
NOKPROBE_SYMBOL(do_user_addr_fault);
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-xtensa-use-general-page-fault-accounting.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (88 preceding siblings ...)
2020-07-09 0:07 ` + mm-x86-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
` (142 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: chris, jcmvbkbc, mm-commits, peterx
The patch titled
Subject: mm/xtensa: use general page fault accounting
has been added to the -mm tree. Its filename is
mm-xtensa-use-general-page-fault-accounting.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-xtensa-use-general-page-fault-accounting.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-xtensa-use-general-page-fault-accounting.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/xtensa: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solves the issue of multiple page fault
accounting when a page fault retry happens.
Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because this is
now also done in handle_mm_fault().
Move the PERF_COUNT_SW_PAGE_FAULTS event earlier, before taking mmap_sem for
the fault, so that it matches the rest of the archs.
Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/xtensa/mm/fault.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
--- a/arch/xtensa/mm/fault.c~mm-xtensa-use-general-page-fault-accounting
+++ a/arch/xtensa/mm/fault.c
@@ -72,6 +72,9 @@ void do_page_fault(struct pt_regs *regs)
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
@@ -107,7 +110,7 @@ good_area:
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
- fault = handle_mm_fault(vma, address, flags, NULL);
+ fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs))
return;
@@ -122,10 +125,6 @@ good_area:
BUG();
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
@@ -139,12 +138,6 @@ good_area:
}
mmap_read_unlock(mm);
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
- if (flags & VM_FAULT_MAJOR)
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
- else
- perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-
return;
/* Something tried to access memory that isn't in our memory map..
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (89 preceding siblings ...)
2020-07-09 0:07 ` + mm-xtensa-use-general-page-fault-accounting.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` Andrew Morton
2020-07-09 0:07 ` + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch " Andrew Morton
` (141 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
paul.walmsley
The patch titled
Subject: mm: clean up the last pieces of page fault accountings
has been added to the -mm tree. Its filename is
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: clean up the last pieces of page fault accountings
Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():
arch/powerpc/mm/copro_fault.c: copro_handle_mm_fault
arch/sparc/mm/fault_32.c: force_user_fault
arch/um/kernel/trap.c: handle_page_fault
mm/gup.c: faultin_page
fixup_user_fault
mm/hmm.c: hmm_vma_fault
mm/ksm.c: break_ksm
Some of them have the issue of duplicated accounting for page fault
retries. Some of them didn't do the accounting at all.
This patch cleans all these up by letting handle_mm_fault() do the per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accounting). With that, we can safely remove all the outliers now.
There's another functional change in that we now account the page faults
to the caller of gup, rather than the task_struct that was passed into the
gup code. More information on this can be found at [1].
After this patch, the following should never be touched again outside
handle_mm_fault():
- task_struct.[maj|min]_flt
- PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]
[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/
Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/powerpc/mm/copro_fault.c | 5 -----
arch/um/kernel/trap.c | 4 ----
mm/gup.c | 13 -------------
mm/memory.c | 17 ++++++++++-------
4 files changed, 10 insertions(+), 29 deletions(-)
--- a/arch/powerpc/mm/copro_fault.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/powerpc/mm/copro_fault.c
@@ -76,11 +76,6 @@ int copro_handle_mm_fault(struct mm_stru
BUG();
}
- if (*flt & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
-
out_unlock:
mmap_read_unlock(mm);
return ret;
--- a/arch/um/kernel/trap.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/um/kernel/trap.c
@@ -88,10 +88,6 @@ good_area:
BUG();
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
--- a/mm/gup.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/gup.c
@@ -893,13 +893,6 @@ static int faultin_page(struct task_stru
BUG();
}
- if (tsk) {
- if (ret & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
- }
-
if (ret & VM_FAULT_RETRY) {
if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
*locked = 0;
@@ -1255,12 +1248,6 @@ retry:
goto retry;
}
- if (tsk) {
- if (major)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
- }
return 0;
}
EXPORT_SYMBOL_GPL(fixup_user_fault);
--- a/mm/memory.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/memory.c
@@ -4409,20 +4409,23 @@ static inline void mm_account_fault(stru
*/
major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+ if (major)
+ current->maj_flt++;
+ else
+ current->min_flt++;
+
/*
- * If the fault is done for GUP, regs will be NULL, and we will skip
- * the fault accounting.
+ * If the fault is done for GUP, regs will be NULL. We only do the
+ * accounting for the per thread fault counters who triggered the
+ * fault, and we skip the perf event updates.
*/
if (!regs)
return;
- if (major) {
- current->maj_flt++;
+ if (major)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
- } else {
- current->min_flt++;
+ else
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
- }
}
/*
_
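Assembled from the mm/memory.c hunks above (and with its early bail-out
checks for error/retry results trimmed), the accounting helper ends up
reading roughly as follows; this is an illustrative reconstruction of the
patched code, not an independent implementation:

static inline void mm_account_fault(struct pt_regs *regs,
				    unsigned long address,
				    unsigned int flags, vm_fault_t ret)
{
	bool major;

	/* (early returns for error/retry results elided in this sketch) */

	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);

	/* The per-task counters are now bumped unconditionally ... */
	if (major)
		current->maj_flt++;
	else
		current->min_flt++;

	/*
	 * ... while the perf events still need regs, so GUP-driven faults
	 * (regs == NULL) skip only this part.
	 */
	if (!regs)
		return;

	if (major)
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
	else
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
}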
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch added to -mm tree
2020-07-09 0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: agordeev, aou, bcain, benh, borntraeger, bp, catalin.marinas,
chris, dalias, dave.hansen, davem, deanbo422, deller, geert,
gerald.schaefer, gor, green.hu, guoren, heiko.carstens, hpa, ink,
James.Bottomley, jcmvbkbc, jhubbard, jonas, ley.foon.tan, linux,
luto, mattst88, mingo, mm-commits, monstr, mpe, nickhu, palmer,
paul.walmsley, paulus, penberg, peterx, peterz, rth, shorne,
stefan.kristiansson, tglx, tony.luck, tsbogend, vgupta, will,
ysato
The patch titled
Subject: mm: clean up the last pieces of page fault accountings
has been added to the -mm tree. Its filename is
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm: clean up the last pieces of page fault accountings
Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():
arch/powerpc/mm/copro_fault.c: copro_handle_mm_fault
arch/sparc/mm/fault_32.c: force_user_fault
arch/um/kernel/trap.c: handle_page_fault
mm/gup.c: faultin_page
fixup_user_fault
mm/hmm.c: hmm_vma_fault
mm/ksm.c: break_ksm
Some of them have the issue of duplicated accounting for page fault
retries. Some of them didn't do the accounting at all.
This patch cleans all these up by letting handle_mm_fault() do the per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accounting). With that, we can safely remove all the outliers now.
There's another functional change in that we now account the page faults
to the caller of gup, rather than the task_struct that was passed into the
gup code. More information on this can be found at [1].
After this patch, the following should never be touched again outside
handle_mm_fault():
- task_struct.[maj|min]_flt
- PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]
[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/
Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/powerpc/mm/copro_fault.c | 5 -----
arch/um/kernel/trap.c | 4 ----
mm/gup.c | 13 -------------
mm/memory.c | 17 ++++++++++-------
4 files changed, 10 insertions(+), 29 deletions(-)
--- a/arch/powerpc/mm/copro_fault.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/powerpc/mm/copro_fault.c
@@ -76,11 +76,6 @@ int copro_handle_mm_fault(struct mm_stru
BUG();
}
- if (*flt & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
-
out_unlock:
mmap_read_unlock(mm);
return ret;
--- a/arch/um/kernel/trap.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/um/kernel/trap.c
@@ -88,10 +88,6 @@ good_area:
BUG();
}
if (flags & FAULT_FLAG_ALLOW_RETRY) {
- if (fault & VM_FAULT_MAJOR)
- current->maj_flt++;
- else
- current->min_flt++;
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
--- a/mm/gup.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/gup.c
@@ -893,13 +893,6 @@ static int faultin_page(struct task_stru
BUG();
}
- if (tsk) {
- if (ret & VM_FAULT_MAJOR)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
- }
-
if (ret & VM_FAULT_RETRY) {
if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
*locked = 0;
@@ -1255,12 +1248,6 @@ retry:
goto retry;
}
- if (tsk) {
- if (major)
- tsk->maj_flt++;
- else
- tsk->min_flt++;
- }
return 0;
}
EXPORT_SYMBOL_GPL(fixup_user_fault);
--- a/mm/memory.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/memory.c
@@ -4409,20 +4409,23 @@ static inline void mm_account_fault(stru
*/
major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+ if (major)
+ current->maj_flt++;
+ else
+ current->min_flt++;
+
/*
- * If the fault is done for GUP, regs will be NULL, and we will skip
- * the fault accounting.
+ * If the fault is done for GUP, regs will be NULL. We only do the
+ * accounting for the per thread fault counters who triggered the
+ * fault, and we skip the perf event updates.
*/
if (!regs)
return;
- if (major) {
- current->maj_flt++;
+ if (major)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
- } else {
- current->min_flt++;
+ else
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
- }
}
/*
_
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (90 preceding siblings ...)
2020-07-09 0:07 ` + mm-clean-up-the-last-pieces-of-page-fault-accountings.patch " Andrew Morton
@ 2020-07-09 0:07 ` Andrew Morton
2020-07-09 2:04 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch " Andrew Morton
` (140 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 0:07 UTC (permalink / raw)
To: jhubbard, mm-commits, peterx
The patch titled
Subject: mm/gup: remove task_struct pointer for all gup code
has been added to the -mm tree. Its filename is
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Xu <peterx@redhat.com>
Subject: mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more. Remove that parameter from the whole gup
stack.
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/arc/kernel/process.c | 2
arch/s390/kvm/interrupt.c | 2
arch/s390/kvm/kvm-s390.c | 2
arch/s390/kvm/priv.c | 8 -
arch/s390/mm/gmap.c | 4
drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 2
drivers/infiniband/core/umem_odp.c | 2
drivers/vfio/vfio_iommu_type1.c | 4
fs/exec.c | 2
include/linux/mm.h | 9 -
kernel/events/uprobes.c | 6 -
kernel/futex.c | 2
mm/gup.c | 101 +++++++-----------
mm/memory.c | 2
mm/process_vm_access.c | 2
security/tomoyo/domain.c | 2
virt/kvm/async_pf.c | 2
virt/kvm/kvm_main.c | 2
18 files changed, 69 insertions(+), 87 deletions(-)
--- a/arch/arc/kernel/process.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/arc/kernel/process.c
@@ -91,7 +91,7 @@ fault:
goto fail;
mmap_read_lock(current->mm);
- ret = fixup_user_fault(current, current->mm, (unsigned long) uaddr,
+ ret = fixup_user_fault(current->mm, (unsigned long) uaddr,
FAULT_FLAG_WRITE, NULL);
mmap_read_unlock(current->mm);
--- a/arch/s390/kvm/interrupt.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/interrupt.c
@@ -2768,7 +2768,7 @@ static struct page *get_map_page(struct
struct page *page = NULL;
mmap_read_lock(kvm->mm);
- get_user_pages_remote(NULL, kvm->mm, uaddr, 1, FOLL_WRITE,
+ get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE,
&page, NULL, NULL);
mmap_read_unlock(kvm->mm);
return page;
--- a/arch/s390/kvm/kvm-s390.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/kvm-s390.c
@@ -1891,7 +1891,7 @@ static long kvm_s390_set_skeys(struct kv
r = set_guest_storage_key(current->mm, hva, keys[i], 0);
if (r) {
- r = fixup_user_fault(current, current->mm, hva,
+ r = fixup_user_fault(current->mm, hva,
FAULT_FLAG_WRITE, &unlocked);
if (r)
break;
--- a/arch/s390/kvm/priv.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/kvm/priv.c
@@ -273,7 +273,7 @@ retry:
rc = get_guest_storage_key(current->mm, vmaddr, &key);
if (rc) {
- rc = fixup_user_fault(current, current->mm, vmaddr,
+ rc = fixup_user_fault(current->mm, vmaddr,
FAULT_FLAG_WRITE, &unlocked);
if (!rc) {
mmap_read_unlock(current->mm);
@@ -319,7 +319,7 @@ retry:
mmap_read_lock(current->mm);
rc = reset_guest_reference_bit(current->mm, vmaddr);
if (rc < 0) {
- rc = fixup_user_fault(current, current->mm, vmaddr,
+ rc = fixup_user_fault(current->mm, vmaddr,
FAULT_FLAG_WRITE, &unlocked);
if (!rc) {
mmap_read_unlock(current->mm);
@@ -390,7 +390,7 @@ static int handle_sske(struct kvm_vcpu *
m3 & SSKE_MC);
if (rc < 0) {
- rc = fixup_user_fault(current, current->mm, vmaddr,
+ rc = fixup_user_fault(current->mm, vmaddr,
FAULT_FLAG_WRITE, &unlocked);
rc = !rc ? -EAGAIN : rc;
}
@@ -1094,7 +1094,7 @@ static int handle_pfmf(struct kvm_vcpu *
rc = cond_set_guest_storage_key(current->mm, vmaddr,
key, NULL, nq, mr, mc);
if (rc < 0) {
- rc = fixup_user_fault(current, current->mm, vmaddr,
+ rc = fixup_user_fault(current->mm, vmaddr,
FAULT_FLAG_WRITE, &unlocked);
rc = !rc ? -EAGAIN : rc;
}
--- a/arch/s390/mm/gmap.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/arch/s390/mm/gmap.c
@@ -649,7 +649,7 @@ retry:
rc = vmaddr;
goto out_up;
}
- if (fixup_user_fault(current, gmap->mm, vmaddr, fault_flags,
+ if (fixup_user_fault(gmap->mm, vmaddr, fault_flags,
&unlocked)) {
rc = -EFAULT;
goto out_up;
@@ -879,7 +879,7 @@ static int gmap_pte_op_fixup(struct gmap
BUG_ON(gmap_is_shadow(gmap));
fault_flags = (prot == PROT_WRITE) ? FAULT_FLAG_WRITE : 0;
- if (fixup_user_fault(current, mm, vmaddr, fault_flags, &unlocked))
+ if (fixup_user_fault(mm, vmaddr, fault_flags, &unlocked))
return -EFAULT;
if (unlocked)
/* lost mmap_lock, caller has to retry __gmap_translate */
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -472,7 +472,7 @@ __i915_gem_userptr_get_pages_worker(stru
locked = 1;
}
ret = pin_user_pages_remote
- (work->task, mm,
+ (mm,
obj->userptr.ptr + pinned * PAGE_SIZE,
npages - pinned,
flags,
--- a/drivers/infiniband/core/umem_odp.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/infiniband/core/umem_odp.c
@@ -437,7 +437,7 @@ int ib_umem_odp_map_dma_pages(struct ib_
* complex (and doesn't gain us much performance in most use
* cases).
*/
- npages = get_user_pages_remote(owning_process, owning_mm,
+ npages = get_user_pages_remote(owning_mm,
user_virt, gup_num_pages,
flags, local_page_list, NULL, NULL);
mmap_read_unlock(owning_mm);
--- a/drivers/vfio/vfio_iommu_type1.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/drivers/vfio/vfio_iommu_type1.c
@@ -425,7 +425,7 @@ static int follow_fault_pfn(struct vm_ar
if (ret) {
bool unlocked = false;
- ret = fixup_user_fault(NULL, mm, vaddr,
+ ret = fixup_user_fault(mm, vaddr,
FAULT_FLAG_REMOTE |
(write_fault ? FAULT_FLAG_WRITE : 0),
&unlocked);
@@ -453,7 +453,7 @@ static int vaddr_get_pfn(struct mm_struc
flags |= FOLL_WRITE;
mmap_read_lock(mm);
- ret = pin_user_pages_remote(NULL, mm, vaddr, 1, flags | FOLL_LONGTERM,
+ ret = pin_user_pages_remote(mm, vaddr, 1, flags | FOLL_LONGTERM,
page, NULL, NULL);
if (ret == 1) {
*pfn = page_to_pfn(page[0]);
--- a/fs/exec.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/fs/exec.c
@@ -215,7 +215,7 @@ static struct page *get_arg_page(struct
* We are doing an exec(). 'current' is the process
* doing the exec and bprm->mm is the new process's mm.
*/
- ret = get_user_pages_remote(current, bprm->mm, pos, 1, gup_flags,
+ ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
&page, NULL, NULL);
if (ret <= 0)
return NULL;
--- a/include/linux/mm.h~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/include/linux/mm.h
@@ -1653,7 +1653,7 @@ int invalidate_inode_page(struct page *p
extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
unsigned long address, unsigned int flags,
struct pt_regs *regs);
-extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
+extern int fixup_user_fault(struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked);
void unmap_mapping_pages(struct address_space *mapping,
@@ -1669,8 +1669,7 @@ static inline vm_fault_t handle_mm_fault
BUG();
return VM_FAULT_SIGBUS;
}
-static inline int fixup_user_fault(struct task_struct *tsk,
- struct mm_struct *mm, unsigned long address,
+static inline int fixup_user_fault(struct mm_struct *mm, unsigned long address,
unsigned int fault_flags, bool *unlocked)
{
/* should never happen if there's no MMU */
@@ -1696,11 +1695,11 @@ extern int access_remote_vm(struct mm_st
extern int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
unsigned long addr, void *buf, int len, unsigned int gup_flags);
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked);
-long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long pin_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked);
--- a/kernel/events/uprobes.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/kernel/events/uprobes.c
@@ -376,7 +376,7 @@ __update_ref_ctr(struct mm_struct *mm, u
if (!vaddr || !d)
return -EINVAL;
- ret = get_user_pages_remote(NULL, mm, vaddr, 1,
+ ret = get_user_pages_remote(mm, vaddr, 1,
FOLL_WRITE, &page, &vma, NULL);
if (unlikely(ret <= 0)) {
/*
@@ -477,7 +477,7 @@ retry:
if (is_register)
gup_flags |= FOLL_SPLIT_PMD;
/* Read the page with vaddr into memory */
- ret = get_user_pages_remote(NULL, mm, vaddr, 1, gup_flags,
+ ret = get_user_pages_remote(mm, vaddr, 1, gup_flags,
&old_page, &vma, NULL);
if (ret <= 0)
return ret;
@@ -2029,7 +2029,7 @@ static int is_trap_at_addr(struct mm_str
* but we treat this as a 'remote' access since it is
* essentially a kernel access to the memory.
*/
- result = get_user_pages_remote(NULL, mm, vaddr, 1, FOLL_FORCE, &page,
+ result = get_user_pages_remote(mm, vaddr, 1, FOLL_FORCE, &page,
NULL, NULL);
if (result < 0)
return result;
--- a/kernel/futex.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/kernel/futex.c
@@ -699,7 +699,7 @@ static int fault_in_user_writeable(u32 _
int ret;
mmap_read_lock(mm);
- ret = fixup_user_fault(current, mm, (unsigned long)uaddr,
+ ret = fixup_user_fault(mm, (unsigned long)uaddr,
FAULT_FLAG_WRITE, NULL);
mmap_read_unlock(mm);
--- a/mm/gup.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/gup.c
@@ -859,7 +859,7 @@ unmap:
* does not include FOLL_NOWAIT, the mmap_lock may be released. If it
* is, *@locked will be set to 0 and -EBUSY returned.
*/
-static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
+static int faultin_page(struct vm_area_struct *vma,
unsigned long address, unsigned int *flags, int *locked)
{
unsigned int fault_flags = 0;
@@ -962,7 +962,6 @@ static int check_vma_flags(struct vm_are
/**
* __get_user_pages() - pin user pages in memory
- * @tsk: task_struct of target task
* @mm: mm_struct of target mm
* @start: starting user address
* @nr_pages: number of pages from start to pin
@@ -1021,7 +1020,7 @@ static int check_vma_flags(struct vm_are
* instead of __get_user_pages. __get_user_pages should be used only if
* you need some special @gup_flags.
*/
-static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
+static long __get_user_pages(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -1103,8 +1102,7 @@ retry:
page = follow_page_mask(vma, start, foll_flags, &ctx);
if (!page) {
- ret = faultin_page(tsk, vma, start, &foll_flags,
- locked);
+ ret = faultin_page(vma, start, &foll_flags, locked);
switch (ret) {
case 0:
goto retry;
@@ -1178,8 +1176,6 @@ static bool vma_permits_fault(struct vm_
/**
* fixup_user_fault() - manually resolve a user page fault
- * @tsk: the task_struct to use for page fault accounting, or
- * NULL if faults are not to be recorded.
* @mm: mm_struct of target mm
* @address: user address
* @fault_flags:flags to pass down to handle_mm_fault()
@@ -1207,7 +1203,7 @@ static bool vma_permits_fault(struct vm_
* This function will not return with an unlocked mmap_lock. So it has not the
* same semantics wrt the @mm->mmap_lock as does filemap_fault().
*/
-int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
+int fixup_user_fault(struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked)
{
@@ -1256,8 +1252,7 @@ EXPORT_SYMBOL_GPL(fixup_user_fault);
* Please note that this function, unlike __get_user_pages will not
* return 0 for nr_pages > 0 without FOLL_NOWAIT
*/
-static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
- struct mm_struct *mm,
+static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
unsigned long start,
unsigned long nr_pages,
struct page **pages,
@@ -1290,7 +1285,7 @@ static __always_inline long __get_user_p
pages_done = 0;
lock_dropped = false;
for (;;) {
- ret = __get_user_pages(tsk, mm, start, nr_pages, flags, pages,
+ ret = __get_user_pages(mm, start, nr_pages, flags, pages,
vmas, locked);
if (!locked)
/* VM_FAULT_RETRY couldn't trigger, bypass */
@@ -1350,7 +1345,7 @@ retry:
}
*locked = 1;
- ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
+ ret = __get_user_pages(mm, start, 1, flags | FOLL_TRIED,
pages, NULL, locked);
if (!*locked) {
/* Continue to retry until we succeeded */
@@ -1436,7 +1431,7 @@ long populate_vma_page_range(struct vm_a
* We made sure addr is within a VMA, so the following will
* not result in a stack expansion that recurses back here.
*/
- return __get_user_pages(current, mm, start, nr_pages, gup_flags,
+ return __get_user_pages(mm, start, nr_pages, gup_flags,
NULL, NULL, locked);
}
@@ -1520,7 +1515,7 @@ struct page *get_dump_page(unsigned long
struct vm_area_struct *vma;
struct page *page;
- if (__get_user_pages(current, current->mm, addr, 1,
+ if (__get_user_pages(current->mm, addr, 1,
FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma,
NULL) < 1)
return NULL;
@@ -1529,8 +1524,7 @@ struct page *get_dump_page(unsigned long
}
#endif /* CONFIG_ELF_CORE */
#else /* CONFIG_MMU */
-static long __get_user_pages_locked(struct task_struct *tsk,
- struct mm_struct *mm, unsigned long start,
+static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
unsigned long nr_pages, struct page **pages,
struct vm_area_struct **vmas, int *locked,
unsigned int foll_flags)
@@ -1646,8 +1640,7 @@ static struct page *new_non_cma_page(str
return __alloc_pages_node(nid, gfp_mask, 0);
}
-static long check_and_migrate_cma_pages(struct task_struct *tsk,
- struct mm_struct *mm,
+static long check_and_migrate_cma_pages(struct mm_struct *mm,
unsigned long start,
unsigned long nr_pages,
struct page **pages,
@@ -1721,7 +1714,7 @@ check_again:
* again migrating any new CMA pages which we failed to isolate
* earlier.
*/
- ret = __get_user_pages_locked(tsk, mm, start, nr_pages,
+ ret = __get_user_pages_locked(mm, start, nr_pages,
pages, vmas, NULL,
gup_flags);
@@ -1735,8 +1728,7 @@ check_again:
return ret;
}
#else
-static long check_and_migrate_cma_pages(struct task_struct *tsk,
- struct mm_struct *mm,
+static long check_and_migrate_cma_pages(struct mm_struct *mm,
unsigned long start,
unsigned long nr_pages,
struct page **pages,
@@ -1751,8 +1743,7 @@ static long check_and_migrate_cma_pages(
* __gup_longterm_locked() is a wrapper for __get_user_pages_locked which
* allows us to process the FOLL_LONGTERM flag.
*/
-static long __gup_longterm_locked(struct task_struct *tsk,
- struct mm_struct *mm,
+static long __gup_longterm_locked(struct mm_struct *mm,
unsigned long start,
unsigned long nr_pages,
struct page **pages,
@@ -1777,7 +1768,7 @@ static long __gup_longterm_locked(struct
flags = memalloc_nocma_save();
}
- rc = __get_user_pages_locked(tsk, mm, start, nr_pages, pages,
+ rc = __get_user_pages_locked(mm, start, nr_pages, pages,
vmas_tmp, NULL, gup_flags);
if (gup_flags & FOLL_LONGTERM) {
@@ -1792,7 +1783,7 @@ static long __gup_longterm_locked(struct
goto out;
}
- rc = check_and_migrate_cma_pages(tsk, mm, start, rc, pages,
+ rc = check_and_migrate_cma_pages(mm, start, rc, pages,
vmas_tmp, gup_flags);
}
@@ -1802,22 +1793,20 @@ out:
return rc;
}
#else /* !CONFIG_FS_DAX && !CONFIG_CMA */
-static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
- struct mm_struct *mm,
+static __always_inline long __gup_longterm_locked(struct mm_struct *mm,
unsigned long start,
unsigned long nr_pages,
struct page **pages,
struct vm_area_struct **vmas,
unsigned int flags)
{
- return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+ return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
NULL, flags);
}
#endif /* CONFIG_FS_DAX || CONFIG_CMA */
#ifdef CONFIG_MMU
-static long __get_user_pages_remote(struct task_struct *tsk,
- struct mm_struct *mm,
+static long __get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -1836,20 +1825,18 @@ static long __get_user_pages_remote(stru
* This will check the vmas (even if our vmas arg is NULL)
* and return -ENOTSUPP if DAX isn't allowed in this case:
*/
- return __gup_longterm_locked(tsk, mm, start, nr_pages, pages,
+ return __gup_longterm_locked(mm, start, nr_pages, pages,
vmas, gup_flags | FOLL_TOUCH |
FOLL_REMOTE);
}
- return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+ return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
locked,
gup_flags | FOLL_TOUCH | FOLL_REMOTE);
}
/**
* get_user_pages_remote() - pin user pages in memory
- * @tsk: the task_struct to use for page fault accounting, or
- * NULL if faults are not to be recorded.
* @mm: mm_struct of target mm
* @start: starting user address
* @nr_pages: number of pages from start to pin
@@ -1908,7 +1895,7 @@ static long __get_user_pages_remote(stru
* should use get_user_pages_remote because it cannot pass
* FAULT_FLAG_ALLOW_RETRY to handle_mm_fault.
*/
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -1920,13 +1907,13 @@ long get_user_pages_remote(struct task_s
if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
return -EINVAL;
- return __get_user_pages_remote(tsk, mm, start, nr_pages, gup_flags,
+ return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
pages, vmas, locked);
}
EXPORT_SYMBOL(get_user_pages_remote);
#else /* CONFIG_MMU */
-long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -1934,8 +1921,7 @@ long get_user_pages_remote(struct task_s
return 0;
}
-static long __get_user_pages_remote(struct task_struct *tsk,
- struct mm_struct *mm,
+static long __get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -1955,11 +1941,10 @@ static long __get_user_pages_remote(stru
* @vmas: array of pointers to vmas corresponding to each page.
* Or NULL if the caller does not require them.
*
- * This is the same as get_user_pages_remote(), just with a
- * less-flexible calling convention where we assume that the task
- * and mm being operated on are the current task's and don't allow
- * passing of a locked parameter. We also obviously don't pass
- * FOLL_REMOTE in here.
+ * This is the same as get_user_pages_remote(), just with a less-flexible
+ * calling convention where we assume that the mm being operated on belongs to
+ * the current task, and doesn't allow passing of a locked parameter. We also
+ * obviously don't pass FOLL_REMOTE in here.
*/
long get_user_pages(unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
@@ -1972,7 +1957,7 @@ long get_user_pages(unsigned long start,
if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
return -EINVAL;
- return __gup_longterm_locked(current, current->mm, start, nr_pages,
+ return __gup_longterm_locked(current->mm, start, nr_pages,
pages, vmas, gup_flags | FOLL_TOUCH);
}
EXPORT_SYMBOL(get_user_pages);
@@ -1982,7 +1967,7 @@ EXPORT_SYMBOL(get_user_pages);
*
* mmap_read_lock(mm);
* do_something()
- * get_user_pages(tsk, mm, ..., pages, NULL);
+ * get_user_pages(mm, ..., pages, NULL);
* mmap_read_unlock(mm);
*
* to:
@@ -1990,7 +1975,7 @@ EXPORT_SYMBOL(get_user_pages);
* int locked = 1;
* mmap_read_lock(mm);
* do_something()
- * get_user_pages_locked(tsk, mm, ..., pages, &locked);
+ * get_user_pages_locked(mm, ..., pages, &locked);
* if (locked)
* mmap_read_unlock(mm);
*
@@ -2028,7 +2013,7 @@ long get_user_pages_locked(unsigned long
if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
return -EINVAL;
- return __get_user_pages_locked(current, current->mm, start, nr_pages,
+ return __get_user_pages_locked(current->mm, start, nr_pages,
pages, NULL, locked,
gup_flags | FOLL_TOUCH);
}
@@ -2038,12 +2023,12 @@ EXPORT_SYMBOL(get_user_pages_locked);
* get_user_pages_unlocked() is suitable to replace the form:
*
* mmap_read_lock(mm);
- * get_user_pages(tsk, mm, ..., pages, NULL);
+ * get_user_pages(mm, ..., pages, NULL);
* mmap_read_unlock(mm);
*
* with:
*
- * get_user_pages_unlocked(tsk, mm, ..., pages);
+ * get_user_pages_unlocked(mm, ..., pages);
*
* It is functionally equivalent to get_user_pages_fast so
* get_user_pages_fast should be used instead if specific gup_flags
@@ -2066,7 +2051,7 @@ long get_user_pages_unlocked(unsigned lo
return -EINVAL;
mmap_read_lock(mm);
- ret = __get_user_pages_locked(current, mm, start, nr_pages, pages, NULL,
+ ret = __get_user_pages_locked(mm, start, nr_pages, pages, NULL,
&locked, gup_flags | FOLL_TOUCH);
if (locked)
mmap_read_unlock(mm);
@@ -2711,7 +2696,7 @@ static int __gup_longterm_unlocked(unsig
*/
if (gup_flags & FOLL_LONGTERM) {
mmap_read_lock(current->mm);
- ret = __gup_longterm_locked(current, current->mm,
+ ret = __gup_longterm_locked(current->mm,
start, nr_pages,
pages, NULL, gup_flags);
mmap_read_unlock(current->mm);
@@ -2954,10 +2939,8 @@ int pin_user_pages_fast_only(unsigned lo
EXPORT_SYMBOL_GPL(pin_user_pages_fast_only);
/**
- * pin_user_pages_remote() - pin pages of a remote process (task != current)
+ * pin_user_pages_remote() - pin pages of a remote process
*
- * @tsk: the task_struct to use for page fault accounting, or
- * NULL if faults are not to be recorded.
* @mm: mm_struct of target mm
* @start: starting user address
* @nr_pages: number of pages from start to pin
@@ -2978,7 +2961,7 @@ EXPORT_SYMBOL_GPL(pin_user_pages_fast_on
* FOLL_PIN means that the pages must be released via unpin_user_page(). Please
* see Documentation/core-api/pin_user_pages.rst for details.
*/
-long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+long pin_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
struct vm_area_struct **vmas, int *locked)
@@ -2988,7 +2971,7 @@ long pin_user_pages_remote(struct task_s
return -EINVAL;
gup_flags |= FOLL_PIN;
- return __get_user_pages_remote(tsk, mm, start, nr_pages, gup_flags,
+ return __get_user_pages_remote(mm, start, nr_pages, gup_flags,
pages, vmas, locked);
}
EXPORT_SYMBOL(pin_user_pages_remote);
@@ -3020,7 +3003,7 @@ long pin_user_pages(unsigned long start,
return -EINVAL;
gup_flags |= FOLL_PIN;
- return __gup_longterm_locked(current, current->mm, start, nr_pages,
+ return __gup_longterm_locked(current->mm, start, nr_pages,
pages, vmas, gup_flags);
}
EXPORT_SYMBOL(pin_user_pages);
@@ -3065,7 +3048,7 @@ long pin_user_pages_locked(unsigned long
return -EINVAL;
gup_flags |= FOLL_PIN;
- return __get_user_pages_locked(current, current->mm, start, nr_pages,
+ return __get_user_pages_locked(current->mm, start, nr_pages,
pages, NULL, locked,
gup_flags | FOLL_TOUCH);
}
--- a/mm/memory.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/memory.c
@@ -4751,7 +4751,7 @@ int __access_remote_vm(struct task_struc
void *maddr;
struct page *page = NULL;
- ret = get_user_pages_remote(tsk, mm, addr, 1,
+ ret = get_user_pages_remote(mm, addr, 1,
gup_flags, &page, &vma, NULL);
if (ret <= 0) {
#ifndef CONFIG_HAVE_IOREMAP_PROT
--- a/mm/process_vm_access.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/mm/process_vm_access.c
@@ -105,7 +105,7 @@ static int process_vm_rw_single_vec(unsi
* current/current->mm
*/
mmap_read_lock(mm);
- pinned_pages = pin_user_pages_remote(task, mm, pa, pinned_pages,
+ pinned_pages = pin_user_pages_remote(mm, pa, pinned_pages,
flags, process_pages,
NULL, &locked);
if (locked)
--- a/security/tomoyo/domain.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/security/tomoyo/domain.c
@@ -914,7 +914,7 @@ bool tomoyo_dump_page(struct linux_binpr
* (represented by bprm). 'current' is the process doing
* the execve().
*/
- if (get_user_pages_remote(current, bprm->mm, pos, 1,
+ if (get_user_pages_remote(bprm->mm, pos, 1,
FOLL_FORCE, &page, NULL, NULL) <= 0)
return false;
#else
--- a/virt/kvm/async_pf.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/virt/kvm/async_pf.c
@@ -61,7 +61,7 @@ static void async_pf_execute(struct work
* access remotely.
*/
mmap_read_lock(mm);
- get_user_pages_remote(NULL, mm, addr, 1, FOLL_WRITE, NULL, NULL,
+ get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, NULL,
&locked);
if (locked)
mmap_read_unlock(mm);
--- a/virt/kvm/kvm_main.c~mm-gup-remove-task_struct-pointer-for-all-gup-code
+++ a/virt/kvm/kvm_main.c
@@ -1830,7 +1830,7 @@ static int hva_to_pfn_remapped(struct vm
* not call the fault handler, so do it here.
*/
bool unlocked = false;
- r = fixup_user_fault(current, current->mm, addr,
+ r = fixup_user_fault(current->mm, addr,
(write_fault ? FAULT_FLAG_WRITE : 0),
&unlocked);
if (unlocked)
_
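The caller-side effect of the signature change can be seen in the
conversions above; as a hedged before/after fragment (mirroring, for
example, the futex and uprobes hunks, with mm, addr, start and page
standing in for whatever the real call sites use):

/* Before: a task_struct (often just NULL or current) had to be threaded through. */
ret = fixup_user_fault(current, mm, addr, FAULT_FLAG_WRITE, &unlocked);
npages = get_user_pages_remote(NULL, mm, start, 1, FOLL_WRITE,
			       &page, NULL, NULL);

/* After: only the mm is passed; fault accounting goes to whoever calls gup. */
ret = fixup_user_fault(mm, addr, FAULT_FLAG_WRITE, &unlocked);
npages = get_user_pages_remote(mm, start, 1, FOLL_WRITE,
			       &page, NULL, NULL);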
Patches currently in -mm which might be from peterx@redhat.com are
mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (91 preceding siblings ...)
2020-07-09 0:07 ` + mm-gup-remove-task_struct-pointer-for-all-gup-code.patch " Andrew Morton
@ 2020-07-09 2:04 ` Andrew Morton
2020-07-09 2:29 ` mmotm 2020-07-08-19-28 uploaded Andrew Morton
` (139 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 2:04 UTC (permalink / raw)
To: akpm, anshuman.khandual, daniel.m.jordan, hughd, jhubbard,
mm-commits, n-horiguchi, willy, ziy
The patch titled
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix
has been added to the -mm tree. Its filename is
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix
s/hpage_nr_pages/thp_nr_pages/ due to "mm: replace hpage_nr_pages with
thp_nr_pages".
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split-fix
+++ a/mm/migrate.c
@@ -1446,7 +1446,7 @@ retry:
* during migration.
*/
is_thp = PageTransHuge(page);
- thp_nr_pages = hpage_nr_pages(page);
+ thp_nr_pages = thp_nr_pages(page);
cond_resched();
if (PageHuge(page))
_
Patches currently in -mm which might be from akpm@linux-foundation.org are
mm.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* mmotm 2020-07-08-19-28 uploaded
2020-07-03 22:14 incoming Andrew Morton
` (92 preceding siblings ...)
2020-07-09 2:04 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix.patch " Andrew Morton
@ 2020-07-09 2:29 ` Andrew Morton
2020-07-09 2:29 ` Andrew Morton
` (138 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 2:29 UTC (permalink / raw)
To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
mhocko, mm-commits, sfr
The mm-of-the-moment snapshot 2020-07-08-19-28 has been uploaded to
http://www.ozlabs.org/~akpm/mmotm/
mmotm-readme.txt says
README for mm-of-the-moment:
http://www.ozlabs.org/~akpm/mmotm/
This is a snapshot of my -mm patch queue. Uploaded at random hopefully
more than once a week.
You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY). The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series
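As an illustration (not part of the original README), one common way to apply the
snapshot with quilt, assuming a clean 5.8-rc4 tree with broken-out.tar.gz downloaded
alongside it, is roughly:

	# hypothetical paths; adjust to wherever the tarball was downloaded
	cd linux-5.8-rc4
	tar xzf ../broken-out.tar.gz	# yields broken-out/ with the patches and series file
	ln -s broken-out patches	# let quilt find patches/series
	quilt push -a			# apply the whole series in order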
The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss. Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.
This tree is partially included in linux-next. To see which patches are
included in linux-next, consult the `series' file. Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.
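Purely to illustrate those markers (the patch names below are hypothetical, not
real series entries), the relevant fragment of the series file has the form:

	#NEXT_PATCHES_START
	some-fix-destined-for-linux-next.patch
	another-fix-destined-for-linux-next.patch
	#NEXT_PATCHES_END
	a-patch-held-back-from-linux-next.patch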
A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release. Individual mmotm releases are tagged. The master branch always
points to the latest release, so it's constantly rebasing.
https://github.com/hnaz/linux-mm
The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree. It is updated more frequently
than mmotm, and is untested.
A git copy of this tree is also available at
https://github.com/hnaz/linux-mm
This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)
origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* kbuild-move-wtype-limits-to-w=2.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
* mm-utilc-make-vm_memory_committed-more-accurate.patch
* mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-mremap-it-is-sure-to-have-enough-space-when-extent-meets-requirement.patch
* mm-mremap-calculate-extent-in-one-place.patch
* mm-mremap-start-addresses-are-properly-aligned.patch
* mm-mremap-use-pmd_addr_end-to-simplify-the-calculate-of-extent.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* umh-fix-refcount-underflow-in-fork_usermode_blob.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
linux-next.patch
linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-remove-call-to-memset-after-dma_alloc_coherent.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
make-sure-nobodys-leaking-resources.patch
releasing-resources-with-children.patch
mutex-subsystem-synchro-test-module.patch
kernel-forkc-export-kernel_thread-to-modules.patch
workaround-for-a-pci-restoring-bug.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (94 preceding siblings ...)
2020-07-09 2:29 ` Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:09 ` + mm-dump-compound-page-information-on-a-second-line.patch " Andrew Morton
` (136 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: handle page->mapping better in dump_page
has been added to the -mm tree. Its filename is
mm-handle-page-mapping-better-in-dump_page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-handle-page-mapping-better-in-dump_page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-handle-page-mapping-better-in-dump_page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: handle page->mapping better in dump_page
Patch series "Improvements for dump_page()", v2.
Here's a sample dump of a pagecache tail page with all of the patches
applied:
page:000000006d1c49ca refcount:6 mapcount:0 mapping:00000000136b8d90 index:0x109 pfn:0x6c645
head:000000008bd38076 order:2 compound_mapcount:0 compound_pincount:0
aops:xfs_address_space_operations ino:800042 dentry name:"fd"
flags: 0x4000000000012014(uptodate|lru|private|head)
raw: 4000000000000000 ffffd46ac1b19101 ffffffff00000202 dead000000000004
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
head: 4000000000012014 ffffd46ac1b1bbc8 ffffd46ac1b1bc08 ffff91976f659560
head: 0000000000000108 ffff919773220680 00000006ffffffff 0000000000000000
page dumped because: testing
This patch (of 6):
If we can't call page_mapping() to get the page mapping, handle the
anon/ksm/movable bits correctly.
Link: http://lkml.kernel.org/r/20200709202117.7216-1-willy@infradead.org
Link: http://lkml.kernel.org/r/20200709202117.7216-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/mm/debug.c~mm-handle-page-mapping-better-in-dump_page
+++ a/mm/debug.c
@@ -70,7 +70,12 @@ void __dump_page(struct page *page, cons
if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
/* Corrupt page, cannot call page_mapping */
- mapping = page->mapping;
+ unsigned long tmp = (unsigned long)page->mapping;
+
+ if (tmp & PAGE_MAPPING_ANON)
+ mapping = NULL;
+ else
+ mapping = (void *)(tmp & ~PAGE_MAPPING_FLAGS);
head = page;
compound = false;
} else {
_
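For background on the masking above: the low bits of page->mapping encode what kind
of mapping the page has. The constants involved (values as defined in
include/linux/page-flags.h; shown here only for illustration):

	#define PAGE_MAPPING_ANON	0x1
	#define PAGE_MAPPING_MOVABLE	0x2
	#define PAGE_MAPPING_KSM	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
	#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

A pagecache page has neither bit set, so clearing PAGE_MAPPING_FLAGS from the raw
value recovers the struct address_space pointer, which is exactly what the else
branch above does.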
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-dump-compound-page-information-on-a-second-line.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (95 preceding siblings ...)
2020-07-09 23:09 ` + mm-handle-page-mapping-better-in-dump_page.patch added to -mm tree Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:09 ` + mm-print-head-flags-in-dump_page.patch " Andrew Morton
` (135 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: dump compound page information on a second line
has been added to the -mm tree. Its filename is
mm-dump-compound-page-information-on-a-second-line.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-dump-compound-page-information-on-a-second-line.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-dump-compound-page-information-on-a-second-line.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: dump compound page information on a second line
Simplify both the implementation and the output by splitting all the
compound page information onto a second line.
Link: http://lkml.kernel.org/r/20200709202117.7216-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
--- a/mm/debug.c~mm-dump-compound-page-information-on-a-second-line
+++ a/mm/debug.c
@@ -89,27 +89,21 @@ void __dump_page(struct page *page, cons
*/
mapcount = PageSlab(head) ? 0 : page_mapcount(page);
- if (compound)
+ pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
+ page, page_ref_count(head), mapcount, mapping,
+ page_to_pgoff(page));
+ if (compound) {
if (hpage_pincount_available(page)) {
- pr_warn("page:%px refcount:%d mapcount:%d mapping:%p "
- "index:%#lx head:%px order:%u "
- "compound_mapcount:%d compound_pincount:%d\n",
- page, page_ref_count(head), mapcount,
- mapping, page_to_pgoff(page), head,
- compound_order(head), compound_mapcount(page),
- compound_pincount(page));
+ pr_warn("head:%px order:%u compound_mapcount:%d compound_pincount:%d\n",
+ head, compound_order(head),
+ compound_mapcount(head),
+ compound_pincount(head));
} else {
- pr_warn("page:%px refcount:%d mapcount:%d mapping:%p "
- "index:%#lx head:%px order:%u "
- "compound_mapcount:%d\n",
- page, page_ref_count(head), mapcount,
- mapping, page_to_pgoff(page), head,
- compound_order(head), compound_mapcount(page));
+ pr_warn("head:%px order:%u compound_mapcount:%d\n",
+ head, compound_order(head),
+ compound_mapcount(head));
}
- else
- pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
- page, page_ref_count(page), mapcount,
- mapping, page_to_pgoff(page));
+ }
if (PageKsm(page))
type = "ksm ";
else if (PageAnon(page))
_
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-print-head-flags-in-dump_page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (96 preceding siblings ...)
2020-07-09 23:09 ` + mm-dump-compound-page-information-on-a-second-line.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:09 ` + mm-switch-dump_page-to-get_kernel_nofault.patch " Andrew Morton
` (134 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: print head flags in dump_page
has been added to the -mm tree. Its filename is
mm-print-head-flags-in-dump_page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-print-head-flags-in-dump_page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-head-flags-in-dump_page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print head flags in dump_page
Tail page flags contain very little useful information. Print the head
page's flags instead. While the flags will contain "head" for tail pages,
this should not be too confusing as the previous line starts with the word
"head:" and so the flags should be interpreted as belonging to the head
page.
Link: http://lkml.kernel.org/r/20200709202117.7216-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/debug.c~mm-print-head-flags-in-dump_page
+++ a/mm/debug.c
@@ -162,7 +162,7 @@ void __dump_page(struct page *page, cons
out_mapping:
BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
- pr_warn("%sflags: %#lx(%pGp)%s\n", type, page->flags, &page->flags,
+ pr_warn("%sflags: %#lx(%pGp)%s\n", type, head->flags, &head->flags,
page_cma ? " CMA" : "");
hex_only:
_
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-switch-dump_page-to-get_kernel_nofault.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (97 preceding siblings ...)
2020-07-09 23:09 ` + mm-print-head-flags-in-dump_page.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:09 ` + mm-print-the-inode-number-in-dump_page.patch " Andrew Morton
` (133 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: switch dump_page to get_kernel_nofault
has been added to the -mm tree. Its filename is
mm-switch-dump_page-to-get_kernel_nofault.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-switch-dump_page-to-get_kernel_nofault.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-switch-dump_page-to-get_kernel_nofault.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: switch dump_page to get_kernel_nofault
This is simpler to use than copy_from_kernel_nofault(). Also make some of
the related error messages less verbose.
Link: http://lkml.kernel.org/r/20200709202117.7216-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 36 ++++++++++++++++--------------------
1 file changed, 16 insertions(+), 20 deletions(-)
--- a/mm/debug.c~mm-switch-dump_page-to-get_kernel_nofault
+++ a/mm/debug.c
@@ -109,54 +109,50 @@ void __dump_page(struct page *page, cons
else if (PageAnon(page))
type = "anon ";
else if (mapping) {
- const struct inode *host;
+ struct inode *host;
const struct address_space_operations *a_ops;
- const struct hlist_node *dentry_first;
- const struct dentry *dentry_ptr;
+ struct hlist_node *dentry_first;
+ struct dentry *dentry_ptr;
struct dentry dentry;
/*
* mapping can be invalid pointer and we don't want to crash
* accessing it, so probe everything depending on it carefully
*/
- if (copy_from_kernel_nofault(&host, &mapping->host,
- sizeof(struct inode *)) ||
- copy_from_kernel_nofault(&a_ops, &mapping->a_ops,
- sizeof(struct address_space_operations *))) {
- pr_warn("failed to read mapping->host or a_ops, mapping not a valid kernel address?\n");
+ if (get_kernel_nofault(host, &mapping->host) ||
+ get_kernel_nofault(a_ops, &mapping->a_ops)) {
+ pr_warn("failed to read mapping contents, not a valid kernel address?\n");
goto out_mapping;
}
if (!host) {
- pr_warn("mapping->a_ops:%ps\n", a_ops);
+ pr_warn("aops:%ps\n", a_ops);
goto out_mapping;
}
- if (copy_from_kernel_nofault(&dentry_first,
- &host->i_dentry.first, sizeof(struct hlist_node *))) {
- pr_warn("mapping->a_ops:%ps with invalid mapping->host inode address %px\n",
- a_ops, host);
+ if (get_kernel_nofault(dentry_first, &host->i_dentry.first)) {
+ pr_warn("aops:%ps with invalid host inode %px\n",
+ a_ops, host);
goto out_mapping;
}
if (!dentry_first) {
- pr_warn("mapping->a_ops:%ps\n", a_ops);
+ pr_warn("aops:%ps\n", a_ops);
goto out_mapping;
}
dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
- if (copy_from_kernel_nofault(&dentry, dentry_ptr,
- sizeof(struct dentry))) {
- pr_warn("mapping->aops:%ps with invalid mapping->host->i_dentry.first %px\n",
- a_ops, dentry_ptr);
+ if (get_kernel_nofault(dentry, dentry_ptr)) {
+ pr_warn("aops:%ps with invalid dentry %px\n", a_ops,
+ dentry_ptr);
} else {
/*
* if dentry is corrupted, the %pd handler may still
* crash, but it's unlikely that we reach here with a
* corrupted struct page
*/
- pr_warn("mapping->aops:%ps dentry name:\"%pd\"\n",
- a_ops, &dentry);
+ pr_warn("aops:%ps dentry name:\"%pd\"\n", a_ops,
+ &dentry);
}
}
out_mapping:
_
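To make the simplification concrete, a minimal illustrative sketch of the two
calling conventions (both return 0 on success and a negative errno if the kernel
address cannot be read; the size is inferred from the destination with the new
helper):

	struct inode *host;
	int err;

	/* old: the size must be spelled out explicitly */
	err = copy_from_kernel_nofault(&host, &mapping->host, sizeof(struct inode *));

	/* new: the size comes from the type of 'host' */
	err = get_kernel_nofault(host, &mapping->host);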
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-print-the-inode-number-in-dump_page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (98 preceding siblings ...)
2020-07-09 23:09 ` + mm-switch-dump_page-to-get_kernel_nofault.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:09 ` + mm-print-hashed-address-of-struct-page.patch " Andrew Morton
` (132 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: print the inode number in dump_page
has been added to the -mm tree. Its filename is
mm-print-the-inode-number-in-dump_page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-print-the-inode-number-in-dump_page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-the-inode-number-in-dump_page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print the inode number in dump_page
The inode number helps correlate this page with debug messages elsewhere
in the kernel.
Link: http://lkml.kernel.org/r/20200709202117.7216-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/debug.c~mm-print-the-inode-number-in-dump_page
+++ a/mm/debug.c
@@ -137,7 +137,7 @@ void __dump_page(struct page *page, cons
}
if (!dentry_first) {
- pr_warn("aops:%ps\n", a_ops);
+ pr_warn("aops:%ps ino:%lx\n", a_ops, host->i_ino);
goto out_mapping;
}
@@ -151,8 +151,8 @@ void __dump_page(struct page *page, cons
* crash, but it's unlikely that we reach here with a
* corrupted struct page
*/
- pr_warn("aops:%ps dentry name:\"%pd\"\n", a_ops,
- &dentry);
+ pr_warn("aops:%ps ino:%lx dentry name:\"%pd\"\n",
+ a_ops, host->i_ino, &dentry);
}
}
out_mapping:
_
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-print-hashed-address-of-struct-page.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (99 preceding siblings ...)
2020-07-09 23:09 ` + mm-print-the-inode-number-in-dump_page.patch " Andrew Morton
@ 2020-07-09 23:09 ` Andrew Morton
2020-07-09 23:10 ` + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch " Andrew Morton
` (131 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:09 UTC (permalink / raw)
To: jhubbard, kirill, mm-commits, rppt, vbabka, william.kucharski, willy
The patch titled
Subject: mm/debug: print hashed address of struct page
has been added to the -mm tree. Its filename is
mm-print-hashed-address-of-struct-page.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-print-hashed-address-of-struct-page.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-print-hashed-address-of-struct-page.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/debug: print hashed address of struct page
The actual address of the struct page isn't particularly helpful, while
the hashed address helps match it up with other messages elsewhere. Also
add the PFN that the page refers to, in order to help diagnose problems
where the page is improperly aligned for its purpose.
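With the hashed %p and the new pfn field, the first line of a dump looks
roughly like this (the values are hypothetical; the format is taken from
the pr_warn() in the hunk below):

	page:000000005cbc4c8c refcount:1 mapcount:0 mapping:00000000f51b0c35 index:0x0 pfn:0x13b45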
Link: http://lkml.kernel.org/r/20200709202117.7216-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/debug.c~mm-print-hashed-address-of-struct-page
+++ a/mm/debug.c
@@ -89,17 +89,17 @@ void __dump_page(struct page *page, cons
*/
mapcount = PageSlab(head) ? 0 : page_mapcount(page);
- pr_warn("page:%px refcount:%d mapcount:%d mapping:%p index:%#lx\n",
+ pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
page, page_ref_count(head), mapcount, mapping,
- page_to_pgoff(page));
+ page_to_pgoff(page), page_to_pfn(page));
if (compound) {
if (hpage_pincount_available(page)) {
- pr_warn("head:%px order:%u compound_mapcount:%d compound_pincount:%d\n",
+ pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
head, compound_order(head),
compound_mapcount(head),
compound_pincount(head));
} else {
- pr_warn("head:%px order:%u compound_mapcount:%d\n",
+ pr_warn("head:%p order:%u compound_mapcount:%d\n",
head, compound_order(head),
compound_mapcount(head));
}
_
Patches currently in -mm which might be from willy@infradead.org are
mm-handle-page-mapping-better-in-dump_page.patch
mm-dump-compound-page-information-on-a-second-line.patch
mm-print-head-flags-in-dump_page.patch
mm-switch-dump_page-to-get_kernel_nofault.patch
mm-print-the-inode-number-in-dump_page.patch
mm-print-hashed-address-of-struct-page.patch
vmalloc-convert-to-xarray.patch
mm-store-compound_nr-as-well-as-compound_order.patch
mm-move-page-flags-include-to-top-of-file.patch
mm-add-thp_order.patch
mm-add-thp_size.patch
mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
mm-add-thp_head.patch
mm-introduce-offset_in_thp.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (100 preceding siblings ...)
2020-07-09 23:09 ` + mm-print-hashed-address-of-struct-page.patch " Andrew Morton
@ 2020-07-09 23:10 ` Andrew Morton
2020-07-09 23:46 ` + mm-migrate-optimize-migrate_vma_setup-for-holes.patch " Andrew Morton
` (130 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:10 UTC (permalink / raw)
To: chris, domas, guro, hannes, mhocko, mm-commits, shakeelb, tj
The patch titled
Subject: mm: memcontrol: avoid workload stalls when lowering memory.high
has been added to the -mm tree. Its filename is
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Roman Gushchin <guro@fb.com>
Subject: mm: memcontrol: avoid workload stalls when lowering memory.high
The memory.high limit is implemented in a way such that the kernel
penalizes all threads which allocate memory over the limit. Forcing all
threads into synchronous reclaim and adding some artificial delays slows
down memory consumption and potentially gives userspace oom
handlers/resource control agents some time to react.
It works nicely if the memory usage is hitting the limit from below;
however, it works sub-optimally if a user adjusts memory.high to a value
way below the current memory usage. It basically forces all workload
threads (doing any memory allocations) into synchronous reclaim and sleep.
This makes the workload completely unresponsive for a long period of time
and can also lead to system-wide contention on lru locks. It can happen
even if the workload is not actually tight on memory and has, for example,
a ton of cold pagecache.
In the current implementation, writing to memory.high causes an atomic
update of the page counter's high value followed by an attempt to reclaim
enough memory to fit into the new limit. To fix the problem described
above, all we need to do is change the order of execution: try to push the
memory usage under the limit first, and only then set the new high limit.
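As a rough sketch of the reordered flow (condensed from the
memory_high_write() hunk below; the reclaim loop details and retry
bookkeeping are elided):

	for (;;) {
		unsigned long nr_pages = page_counter_read(&memcg->memory);

		if (nr_pages <= high)
			break;
		/* drain stocks, try_to_free_mem_cgroup_pages(), bounded retries */
	}

	/* only after usage has been pushed towards the new limit */
	page_counter_set_high(&memcg->memory, high);

	return nbytes;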
Link: http://lkml.kernel.org/r/20200709194718.189231-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Domas Mituzas <domas@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c~mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh
+++ a/mm/memcontrol.c
@@ -6203,8 +6203,6 @@ static ssize_t memory_high_write(struct
if (err)
return err;
- page_counter_set_high(&memcg->memory, high);
-
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
unsigned long reclaimed;
@@ -6228,6 +6226,8 @@ static ssize_t memory_high_write(struct
break;
}
+ page_counter_set_high(&memcg->memory, high);
+
return nbytes;
}
_
Patches currently in -mm which might be from guro@fb.com are
mm-kmem-make-memcg_kmem_enabled-irreversible.patch
mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
mm-memcg-prepare-for-byte-sized-vmstat-items.patch
mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
mm-slub-implement-slub-version-of-obj_to_index.patch
mm-memcg-slab-obj_cgroup-api.patch
mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
mm-memcg-slab-deprecate-memorykmemslabinfo.patch
mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
mm-memcg-slab-simplify-memcg-cache-creation.patch
mm-memcg-slab-remove-memcg_kmem_get_cache.patch
mm-memcg-slab-deprecate-slab_root_caches.patch
mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
kselftests-cgroup-add-kernel-memory-accounting-tests.patch
tools-cgroup-add-memcg_slabinfopy-tool.patch
percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
kselftests-cgroup-add-perpcu-memory-accounting-test.patch
mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-migrate-optimize-migrate_vma_setup-for-holes.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (101 preceding siblings ...)
2020-07-09 23:10 ` + mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch " Andrew Morton
@ 2020-07-09 23:46 ` Andrew Morton
2020-07-09 23:46 ` + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch " Andrew Morton
` (129 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:46 UTC (permalink / raw)
To: bharata, hch, jgg, jglisse, jhubbard, mm-commits, rcampbell, shuah
The patch titled
Subject: mm/migrate: optimize migrate_vma_setup() for holes
has been added to the -mm tree. Its filename is
mm-migrate-optimize-migrate_vma_setup-for-holes.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-optimize-migrate_vma_setup-for-holes.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-optimize-migrate_vma_setup-for-holes.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Ralph Campbell <rcampbell@nvidia.com>
Subject: mm/migrate: optimize migrate_vma_setup() for holes
Patch series "mm/migrate: optimize migrate_vma_setup() for holes".
A simple optimization for migrate_vma_*() when the source vma is not an
anonymous vma and a new test case to exercise it.
This patch (of 2):
When migrating system memory to device private memory, if the source
address range is a valid VMA range and there is no memory or a zero page,
the source PFN array is marked as valid but with no PFN.
This lets the device driver allocate private memory and clear it, then
insert the new device private struct page into the CPU's page tables when
migrate_vma_pages() is called. migrate_vma_pages() only inserts the new
page if the VMA is an anonymous range.
There is no point in telling the device driver to allocate device private
memory and then not migrate the page. Instead, mark the source PFN array
entries as not migrating to avoid this overhead.
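As a minimal sketch of the resulting hole handling (mirroring the
migrate_vma_collect_hole() hunk below), MIGRATE_PFN_MIGRATE is only set
when the walked VMA is anonymous, so holes in other mappings are reported
as not migrating:

	/* decide once per walked hole range */
	unsigned long flags = vma_is_anonymous(walk->vma) ? MIGRATE_PFN_MIGRATE : 0;

	for (addr = start; addr < end; addr += PAGE_SIZE) {
		migrate->src[migrate->npages] = flags;	/* valid entry, no PFN */
		migrate->dst[migrate->npages] = 0;
		migrate->npages++;
		migrate->cpages++;
	}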
Link: http://lkml.kernel.org/r/20200709165711.26584-1-rcampbell@nvidia.com
Link: http://lkml.kernel.org/r/20200709165711.26584-2-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Bharata B Rao" <bharata@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/migrate.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/migrate.c~mm-migrate-optimize-migrate_vma_setup-for-holes
+++ a/mm/migrate.c
@@ -2167,9 +2167,13 @@ static int migrate_vma_collect_hole(unsi
{
struct migrate_vma *migrate = walk->private;
unsigned long addr;
+ unsigned long flags;
+
+ /* Only allow populating anonymous memory. */
+ flags = vma_is_anonymous(walk->vma) ? MIGRATE_PFN_MIGRATE : 0;
for (addr = start; addr < end; addr += PAGE_SIZE) {
- migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE;
+ migrate->src[migrate->npages] = flags;
migrate->dst[migrate->npages] = 0;
migrate->npages++;
migrate->cpages++;
_
Patches currently in -mm which might be from rcampbell@nvidia.com are
mm-remove-redundant-check-non_swap_entry.patch
mm-migrate-optimize-migrate_vma_setup-for-holes.patch
mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (102 preceding siblings ...)
2020-07-09 23:46 ` + mm-migrate-optimize-migrate_vma_setup-for-holes.patch " Andrew Morton
@ 2020-07-09 23:46 ` Andrew Morton
2020-07-10 0:15 ` + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch " Andrew Morton
` (128 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-09 23:46 UTC (permalink / raw)
To: bharata, hch, jgg, jglisse, jhubbard, mm-commits, rcampbell, shuah
The patch titled
Subject: mm/migrate: add migrate-shared test for migrate_vma_*()
has been added to the -mm tree. Its filename is
mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Ralph Campbell <rcampbell@nvidia.com>
Subject: mm/migrate: add migrate-shared test for migrate_vma_*()
Add a migrate_vma_*() selftest for mmap(MAP_SHARED) to verify that
!vma_is_anonymous() ranges won't be migrated.
Link: http://lkml.kernel.org/r/20200709165711.26584-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Bharata B Rao" <bharata@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
tools/testing/selftests/vm/hmm-tests.c | 35 +++++++++++++++++++++++
1 file changed, 35 insertions(+)
--- a/tools/testing/selftests/vm/hmm-tests.c~mm-migrate-add-migrate-shared-test-for-migrate_vma_
+++ a/tools/testing/selftests/vm/hmm-tests.c
@@ -932,6 +932,41 @@ TEST_F(hmm, migrate_fault)
}
/*
+ * Migrate anonymous shared memory to device private memory.
+ */
+TEST_F(hmm, migrate_shared)
+{
+ struct hmm_buffer *buffer;
+ unsigned long npages;
+ unsigned long size;
+ int ret;
+
+ npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift;
+ ASSERT_NE(npages, 0);
+ size = npages << self->page_shift;
+
+ buffer = malloc(sizeof(*buffer));
+ ASSERT_NE(buffer, NULL);
+
+ buffer->fd = -1;
+ buffer->size = size;
+ buffer->mirror = malloc(size);
+ ASSERT_NE(buffer->mirror, NULL);
+
+ buffer->ptr = mmap(NULL, size,
+ PROT_READ | PROT_WRITE,
+ MAP_SHARED | MAP_ANONYMOUS,
+ buffer->fd, 0);
+ ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+ /* Migrate memory to device. */
+ ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+ ASSERT_EQ(ret, -ENOENT);
+
+ hmm_buffer_free(buffer);
+}
+
+/*
* Try to migrate various memory types to device private memory.
*/
TEST_F(hmm2, migrate_mixed)
_
Patches currently in -mm which might be from rcampbell@nvidia.com are
mm-remove-redundant-check-non_swap_entry.patch
mm-migrate-optimize-migrate_vma_setup-for-holes.patch
mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (103 preceding siblings ...)
2020-07-09 23:46 ` + mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch " Andrew Morton
@ 2020-07-10 0:15 ` Andrew Morton
2020-07-10 0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
` (127 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:15 UTC (permalink / raw)
To: laoar.shao, mhocko, mm-commits, rientjes
The patch titled
Subject: mm, oom: make the calculation of oom badness more accurate
has been added to the -mm tree. Its filename is
mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Yafang Shao <laoar.shao@gmail.com>
Subject: mm, oom: make the calculation of oom badness more accurate
Recently we found an issue in our production environment: when a memcg
oom is triggered, the oom killer doesn't choose the process with the
largest resident memory but instead chooses the first scanned process.
Note that all processes in this memcg have the same oom_score_adj, so the
oom killer should choose the process with the largest resident memory.
Below is part of the oom info, which is enough to analyze this issue.
[7516987.983223] memory: usage 16777216kB, limit 16777216kB, failcnt 52843037
[7516987.983224] memory+swap: usage 16777216kB, limit 9007199254740988kB, failcnt 0
[7516987.983225] kmem: usage 301464kB, limit 9007199254740988kB, failcnt 0
[...]
[7516987.983293] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[7516987.983510] [ 5740] 0 5740 257 1 32768 0 -998 pause
[7516987.983574] [58804] 0 58804 4594 771 81920 0 -998 entry_point.bas
[7516987.983577] [58908] 0 58908 7089 689 98304 0 -998 cron
[7516987.983580] [58910] 0 58910 16235 5576 163840 0 -998 supervisord
[7516987.983590] [59620] 0 59620 18074 1395 188416 0 -998 sshd
[7516987.983594] [59622] 0 59622 18680 6679 188416 0 -998 python
[7516987.983598] [59624] 0 59624 1859266 5161 548864 0 -998 odin-agent
[7516987.983600] [59625] 0 59625 707223 9248 983040 0 -998 filebeat
[7516987.983604] [59627] 0 59627 416433 64239 774144 0 -998 odin-log-agent
[7516987.983607] [59631] 0 59631 180671 15012 385024 0 -998 python3
[7516987.983612] [61396] 0 61396 791287 3189 352256 0 -998 client
[7516987.983615] [61641] 0 61641 1844642 29089 946176 0 -998 client
[7516987.983765] [ 9236] 0 9236 2642 467 53248 0 -998 php_scanner
[7516987.983911] [42898] 0 42898 15543 838 167936 0 -998 su
[7516987.983915] [42900] 1000 42900 3673 867 77824 0 -998 exec_script_vr2
[7516987.983918] [42925] 1000 42925 36475 19033 335872 0 -998 python
[7516987.983921] [57146] 1000 57146 3673 848 73728 0 -998 exec_script_J2p
[7516987.983925] [57195] 1000 57195 186359 22958 491520 0 -998 python2
[7516987.983928] [58376] 1000 58376 275764 14402 290816 0 -998 rosmaster
[7516987.983931] [58395] 1000 58395 155166 4449 245760 0 -998 rosout
[7516987.983935] [58406] 1000 58406 18285584 3967322 37101568 0 -998 data_sim
[7516987.984221] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=3aa16c9482ae3a6f6b78bda68a55d32c87c99b985e0f11331cddf05af6c4d753,mems_allowed=0-1,oom_memcg=/kubepods/podf1c273d3-9b36-11ea-b3df-246e9693c184,task_memcg=/kubepods/podf1c273d3-9b36-11ea-b3df-246e9693c184/1f246a3eeea8f70bf91141eeaf1805346a666e225f823906485ea0b6c37dfc3d,task=pause,pid=5740,uid=0
[7516987.984254] Memory cgroup out of memory: Killed process 5740 (pause) total-vm:1028kB, anon-rss:4kB, file-rss:0kB, shmem-rss:0kB
[7516988.092344] oom_reaper: reaped process 5740 (pause), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
We can see that the first scanned process, 5740 (pause), was killed, even
though its rss is only one page. That is because, when we calculate the
oom badness in oom_badness(), we always ignore negative points and convert
all of them to 1. Since the oom_score_adj of all the processes in this
targeted memcg has the same value of -998, the points of these processes
are all negative. As a result, the first scanned process is killed.
The oom_score_adj (-998) in this memcg is set by kubelet, because it is a
Guaranteed pod, which has a higher priority and should be protected from
being killed by a system oom.
To fix this issue, we should make the calculation of the oom badness more
accurate. We can achieve that by converting chosen_points from 'unsigned
long' to 'long'.
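To make the new /proc/<pid>/oom_score mapping concrete, here is a small
self-contained sketch of the arithmetic added to proc_oom_score() below
(the totalpages value is hypothetical): LONG_MIN and -totalpages both
report 0, a badness of 0 reports 666, and +totalpages reports 1333, which
keeps the reported score within [0, 2000].

	#include <limits.h>
	#include <stdio.h>

	/* illustrative userspace model of the proc_oom_score() calculation */
	static long oom_score(long badness, long totalpages)
	{
		if (badness == LONG_MIN)
			return 0;
		/* keep the range of points as [0, 2000] */
		return (1000 + badness * 1000 / totalpages) * 2 / 3;
	}

	int main(void)
	{
		long totalpages = 1L << 20;	/* hypothetical: 4GB of 4kB pages */

		printf("%ld %ld %ld %ld\n",
		       oom_score(LONG_MIN, totalpages),		/* 0 */
		       oom_score(-totalpages, totalpages),	/* 0 */
		       oom_score(0, totalpages),		/* 666 */
		       oom_score(totalpages, totalpages));	/* 1333 */
		return 0;
	}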
Link: http://lkml.kernel.org/r/1594309987-9919-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/tty/sysrq.c | 1 +
fs/proc/base.c | 7 ++++++-
include/linux/oom.h | 4 ++--
mm/memcontrol.c | 1 +
mm/oom_kill.c | 19 ++++++++-----------
mm/page_alloc.c | 1 +
6 files changed, 19 insertions(+), 14 deletions(-)
--- a/drivers/tty/sysrq.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/drivers/tty/sysrq.c
@@ -382,6 +382,7 @@ static void moom_callback(struct work_st
.memcg = NULL,
.gfp_mask = gfp_mask,
.order = -1,
+ .chosen_points = LONG_MIN,
};
mutex_lock(&oom_lock);
--- a/fs/proc/base.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/fs/proc/base.c
@@ -551,8 +551,13 @@ static int proc_oom_score(struct seq_fil
{
unsigned long totalpages = totalram_pages() + total_swap_pages;
unsigned long points = 0;
+ long badness;
- points = oom_badness(task, totalpages) * 1000 / totalpages;
+ badness = oom_badness(task, totalpages);
+ if (badness != LONG_MIN) {
+ /* Let's keep the range of points as [0, 2000]. */
+ points = (1000 + badness * 1000 / (long)totalpages) * 2 / 3;
+ }
seq_printf(m, "%lu\n", points);
return 0;
--- a/include/linux/oom.h~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/include/linux/oom.h
@@ -48,7 +48,7 @@ struct oom_control {
/* Used by oom implementation, do not set */
unsigned long totalpages;
struct task_struct *chosen;
- unsigned long chosen_points;
+ long chosen_points;
/* Used to print the constraint info. */
enum oom_constraint constraint;
@@ -107,7 +107,7 @@ static inline vm_fault_t check_stable_ad
bool __oom_reap_task_mm(struct mm_struct *mm);
-extern unsigned long oom_badness(struct task_struct *p,
+long oom_badness(struct task_struct *p,
unsigned long totalpages);
extern bool out_of_memory(struct oom_control *oc);
--- a/mm/memcontrol.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/memcontrol.c
@@ -1666,6 +1666,7 @@ static bool mem_cgroup_out_of_memory(str
.memcg = memcg,
.gfp_mask = gfp_mask,
.order = order,
+ .chosen_points = LONG_MIN,
};
bool ret;
--- a/mm/oom_kill.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/oom_kill.c
@@ -196,17 +196,17 @@ static bool is_dump_unreclaim_slabs(void
* predictable as possible. The goal is to return the highest value for the
* task consuming the most memory to avoid subsequent oom failures.
*/
-unsigned long oom_badness(struct task_struct *p, unsigned long totalpages)
+long oom_badness(struct task_struct *p, unsigned long totalpages)
{
long points;
long adj;
if (oom_unkillable_task(p))
- return 0;
+ return LONG_MIN;
p = find_lock_task_mm(p);
if (!p)
- return 0;
+ return LONG_MIN;
/*
* Do not even consider tasks which are explicitly marked oom
@@ -218,7 +218,7 @@ unsigned long oom_badness(struct task_st
test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
in_vfork(p)) {
task_unlock(p);
- return 0;
+ return LONG_MIN;
}
/*
@@ -233,11 +233,7 @@ unsigned long oom_badness(struct task_st
adj *= totalpages / 1000;
points += adj;
- /*
- * Never return 0 for an eligible task regardless of the root bonus and
- * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
- */
- return points > 0 ? points : 1;
+ return points;
}
static const char * const oom_constraint_text[] = {
@@ -336,12 +332,12 @@ static int oom_evaluate_task(struct task
* killed first if it triggers an oom, then select it.
*/
if (oom_task_origin(task)) {
- points = ULONG_MAX;
+ points = LONG_MAX;
goto select;
}
points = oom_badness(task, oc->totalpages);
- if (!points || points < oc->chosen_points)
+ if (points == LONG_MIN || points < oc->chosen_points)
goto next;
select:
@@ -1128,6 +1124,7 @@ void pagefault_out_of_memory(void)
.memcg = NULL,
.gfp_mask = 0,
.order = 0,
+ .chosen_points = LONG_MIN,
};
if (mem_cgroup_oom_synchronize(true))
--- a/mm/page_alloc.c~mm-oom-make-the-calculation-of-oom-badness-more-accurate
+++ a/mm/page_alloc.c
@@ -3915,6 +3915,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, un
.memcg = NULL,
.gfp_mask = gfp_mask,
.order = order,
+ .chosen_points = LONG_MIN,
};
struct page *page;
_
Patches currently in -mm which might be from laoar.shao@gmail.com are
mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-close-race-between-munmap-and-expand_upwards-downwards.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (104 preceding siblings ...)
2020-07-10 0:15 ` + mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch " Andrew Morton
@ 2020-07-10 0:23 ` Andrew Morton
2020-07-10 0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch " Andrew Morton
` (126 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:23 UTC (permalink / raw)
To: jannh, kirill.shutemov, mm-commits, oleg, stable, vbabka, willy,
yang.shi
The patch titled
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
has been added to the -mm tree. Its filename is
mm-close-race-between-munmap-and-expand_upwards-downwards.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
VMAs with the VM_GROWSDOWN or VM_GROWSUP flag set can change their size
under mmap_read_lock(). This can lead to a race with __do_munmap():
Thread A Thread B
__do_munmap()
detach_vmas_to_be_unmapped()
mmap_write_downgrade()
expand_downwards()
vma->vm_start = address;
// The VMA now overlaps with
// VMAs detached by the Thread A
// page fault populates expanded part
// of the VMA
unmap_region()
// Zaps pagetables partly
// populated by Thread B
A similar race exists for expand_upwards().
The fix is to avoid downgrading mmap_lock in __do_munmap() if the detached
VMAs are next to a VM_GROWSDOWN or VM_GROWSUP VMA.
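A minimal sketch of the fix (condensed from the mm/mmap.c hunks below):
detach_vmas_to_be_unmapped() now reports whether downgrading is safe, and
__do_munmap() keeps mmap_lock held for write when it is not:

	/* in detach_vmas_to_be_unmapped(), after the VMAs are detached: */
	if (vma && (vma->vm_flags & VM_GROWSDOWN))
		return false;	/* next VMA may grow down into the gap */
	if (prev && (prev->vm_flags & VM_GROWSUP))
		return false;	/* previous VMA may grow up into the gap */
	return true;

	/* in __do_munmap(): */
	if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
		downgrade = false;
	if (downgrade)
		mmap_write_downgrade(mm);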
Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mmap.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards
+++ a/mm/mmap.c
@@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struc
* Create a list of vma's touched by the unmap, removing them from the mm's
* vma list as we go..
*/
-static void
+static bool
detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev, unsigned long end)
{
@@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_str
/* Kill the cache */
vmacache_invalidate(mm);
+
+ /*
+ * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or
+ * VM_GROWSUP VMA. Such VMAs can change their size under
+ * down_read(mmap_sem) and collide with the VMA we are about to unmap.
+ */
+ if (vma && (vma->vm_flags & VM_GROWSDOWN))
+ return false;
+ if (prev && (prev->vm_flags & VM_GROWSUP))
+ return false;
+ return true;
}
/*
@@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, un
}
/* Detach vmas from rbtree */
- detach_vmas_to_be_unmapped(mm, vma, prev, end);
+ if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
+ downgrade = false;
if (downgrade)
mmap_write_downgrade(mm);
_
Patches currently in -mm which might be from kirill.shutemov@linux.intel.com are
mm-close-race-between-munmap-and-expand_upwards-downwards.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (105 preceding siblings ...)
2020-07-10 0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
@ 2020-07-10 0:23 ` Andrew Morton
2020-07-10 0:27 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch " Andrew Morton
` (125 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:23 UTC (permalink / raw)
To: akpm, jannh, kirill.shutemov, mm-commits, oleg, vbabka, willy, yang.shi
The patch titled
Subject: mm-close-race-between-munmap-and-expand_upwards-downwards-fix
has been added to the -mm tree. Its filename is
mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-close-race-between-munmap-and-expand_upwards-downwards-fix
s/mmap_sem/mmap_lock/ in comment
Cc: Jann Horn <jannh@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mmap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards-fix
+++ a/mm/mmap.c
@@ -2647,9 +2647,9 @@ detach_vmas_to_be_unmapped(struct mm_str
vmacache_invalidate(mm);
/*
- * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or
+ * Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
* VM_GROWSUP VMA. Such VMAs can change their size under
- * down_read(mmap_sem) and collide with the VMA we are about to unmap.
+ * down_read(mmap_lock) and collide with the VMA we are about to unmap.
*/
if (vma && (vma->vm_flags & VM_GROWSDOWN))
return false;
_
Patches currently in -mm which might be from akpm@linux-foundation.org are
mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (106 preceding siblings ...)
2020-07-10 0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch " Andrew Morton
@ 2020-07-10 0:27 ` Andrew Morton
2020-07-10 0:33 ` + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
` (124 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:27 UTC (permalink / raw)
To: anshuman.khandual, daniel.m.jordan, hughd, jhubbard, mm-commits,
n-horiguchi, rdunlap, willy, ziy
The patch titled
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix-2
has been added to the -mm tree. Its filename is
mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Zi Yan <ziy@nvidia.com>
Subject: mm-vmstat-add-events-for-thp-migration-without-split-fix-2
- Renamed THP_MIGRATION_FAILURE to THP_MIGRATION_FAIL per John
- Dropped all conditional 'if' blocks in migrate_pages() per Andrew and John
- Updated the migration events documentation per John
- Renamed the thp_nr_pages variable to nr_subpages to avoid an expected merge conflict
- Moved all new THP vmstat events under CONFIG_MIGRATION
- Updated the Cc list with Documentation/ and tracing related addresses
Link: http://lkml.kernel.org/r/C5E3C65C-8253-4638-9D3C-71A61858BB8B@nvidia.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/vm/page_migration.rst | 40 +++++++++++++++-----------
include/linux/vm_event_item.h | 6 +--
mm/migrate.c | 25 ++++++----------
mm/vmstat.c | 6 +--
4 files changed, 40 insertions(+), 37 deletions(-)
--- a/Documentation/vm/page_migration.rst~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/Documentation/vm/page_migration.rst
@@ -253,24 +253,32 @@ which are function pointers of struct ad
PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
for own purpose.
-Quantifying Migration
+Monitoring Migration
=====================
-Following events can be used to quantify page migration.
-1. PGMIGRATE_SUCCESS /* Normal page migration success */
-2. PGMIGRATE_FAIL /* Normal page migration failure */
-3. THP_MIGRATION_SUCCESS /* Transparent huge page migration success */
-4. THP_MIGRATION_FAILURE /* Transparent huge page migration failure */
-5. THP_MIGRATION_SPLIT /* Transparent huge page got split, retried */
-
-THP_MIGRATION_SUCCESS is when THP is migrated successfully without getting
-split into it's subpages. THP_MIGRATION_FAILURE is when THP could neither
-be migrated nor be split. THP_MIGRATION_SPLIT is when THP could not
-just be migrated as is but instead get split into it's subpages and later
-retried as normal pages. THP events would also update normal page migration
-statistics PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE. These events will help
-in quantifying and analyzing various THP migration events including both
-success and failure cases.
+The following events (counters) can be used to monitor page migration.
+
+1. PGMIGRATE_SUCCESS: Normal page migration success. Each count means that a
+ page was migrated. If the page was a non-THP page, then this counter is
+ increased by one. If the page was a THP, then this counter is increased by
+ the number of THP subpages. For example, migration of a single 2MB THP that
+ has 4KB-size base pages (subpages) will cause this counter to increase by
+ 512.
+
+2. PGMIGRATE_FAIL: Normal page migration failure. Same counting rules as for
+ _SUCCESS, above: this will be increased by the number of subpages, if it was
+ a THP.
+
+3. THP_MIGRATION_SUCCESS: A THP was migrated without being split.
+
+4. THP_MIGRATION_FAIL: A THP could not be migrated nor it could be split.
+
+5. THP_MIGRATION_SPLIT: A THP was migrated, but not as such: first, the THP had
+ to be split. After splitting, a migration retry was used for it's sub-pages.
+
+THP_MIGRATION_* events also update the appropriate PGMIGRATE_SUCCESS or
+PGMIGRATE_FAIL events. For example, a THP migration failure will cause both
+THP_MIGRATION_FAIL and PGMIGRATE_FAIL to increase.
Christoph Lameter, May 8, 2006.
Minchan Kim, Mar 28, 2016.
--- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/include/linux/vm_event_item.h
@@ -56,6 +56,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
#endif
#ifdef CONFIG_MIGRATION
PGMIGRATE_SUCCESS, PGMIGRATE_FAIL,
+ THP_MIGRATION_SUCCESS,
+ THP_MIGRATION_FAIL,
+ THP_MIGRATION_SPLIT,
#endif
#ifdef CONFIG_COMPACTION
COMPACTMIGRATE_SCANNED, COMPACTFREE_SCANNED,
@@ -95,9 +98,6 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
THP_ZERO_PAGE_ALLOC_FAILED,
THP_SWPOUT,
THP_SWPOUT_FALLBACK,
- THP_MIGRATION_SUCCESS,
- THP_MIGRATION_FAILURE,
- THP_MIGRATION_SPLIT,
#endif
#ifdef CONFIG_MEMORY_BALLOON
BALLOON_INFLATE,
--- a/mm/migrate.c~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/mm/migrate.c
@@ -1429,7 +1429,7 @@ int migrate_pages(struct list_head *from
struct page *page;
struct page *page2;
int swapwrite = current->flags & PF_SWAPWRITE;
- int rc, thp_n_pages;
+ int rc, nr_subpages;
if (!swapwrite)
current->flags |= PF_SWAPWRITE;
@@ -1446,7 +1446,7 @@ retry:
* during migration.
*/
is_thp = PageTransHuge(page);
- thp_n_pages = thp_nr_pages(page);
+ nr_subpages = thp_nr_pages(page);
cond_resched();
if (PageHuge(page))
@@ -1483,7 +1483,7 @@ retry:
}
if (is_thp) {
nr_thp_failed++;
- nr_failed += thp_n_pages;
+ nr_failed += nr_subpages;
goto out;
}
nr_failed++;
@@ -1498,7 +1498,7 @@ retry:
case MIGRATEPAGE_SUCCESS:
if (is_thp) {
nr_thp_succeeded++;
- nr_succeeded += thp_n_pages;
+ nr_succeeded += nr_subpages;
break;
}
nr_succeeded++;
@@ -1512,7 +1512,7 @@ retry:
*/
if (is_thp) {
nr_thp_failed++;
- nr_failed += thp_n_pages;
+ nr_failed += nr_subpages;
break;
}
nr_failed++;
@@ -1524,16 +1524,11 @@ retry:
nr_thp_failed += thp_retry;
rc = nr_failed;
out:
- if (nr_succeeded)
- count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
- if (nr_failed)
- count_vm_events(PGMIGRATE_FAIL, nr_failed);
- if (nr_thp_succeeded)
- count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
- if (nr_thp_failed)
- count_vm_events(THP_MIGRATION_FAILURE, nr_thp_failed);
- if (nr_thp_split)
- count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
+ count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
+ count_vm_events(PGMIGRATE_FAIL, nr_failed);
+ count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
+ count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
+ count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
nr_thp_failed, nr_thp_split, mode, reason);
--- a/mm/vmstat.c~mm-vmstat-add-events-for-thp-migration-without-split-fix-2
+++ a/mm/vmstat.c
@@ -1274,6 +1274,9 @@ const char * const vmstat_text[] = {
#ifdef CONFIG_MIGRATION
"pgmigrate_success",
"pgmigrate_fail",
+ "thp_migration_success",
+ "thp_migration_fail",
+ "thp_migration_split",
#endif
#ifdef CONFIG_COMPACTION
"compact_migrate_scanned",
@@ -1320,9 +1323,6 @@ const char * const vmstat_text[] = {
"thp_zero_page_alloc_failed",
"thp_swpout",
"thp_swpout_fallback",
- "thp_migration_success",
- "thp_migration_failure",
- "thp_migration_split",
#endif
#ifdef CONFIG_MEMORY_BALLOON
"balloon_inflate",
_
Patches currently in -mm which might be from ziy@nvidia.com are
mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (107 preceding siblings ...)
2020-07-10 0:27 ` + mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch " Andrew Morton
@ 2020-07-10 0:33 ` Andrew Morton
2020-07-10 0:33 ` + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
` (123 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:33 UTC (permalink / raw)
To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
ysato
The patch titled
Subject: iomap: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree. Its filename is
iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: iomap: constify ioreadX() iomem argument (as in generic implementation)
Patch series "iomap: Constify ioreadX() iomem argument", v3.
The ioread8/16/32() helpers and others have an inconsistent interface
across architectures: some take the address as const, some do not.
It seems there is nothing really stopping all of them from taking a
pointer to const.
This patch (of 4):
The ioreadX() and ioreadX_rep() helpers have an inconsistent interface.
On some architectures the void __iomem * address argument is a pointer to
const, on others it is not.
Implementations of ioreadX() do not modify the memory under the address,
so they can be converted to a "const" version for const-safety and
consistency among architectures.
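For example, with this series the generic prototypes become
const-qualified, so a caller holding a const __iomem mapping no longer
needs a cast (read_status() and the register offset are illustrative
only):

	extern unsigned int ioread8(const void __iomem *);
	extern unsigned int ioread16(const void __iomem *);
	extern unsigned int ioread32(const void __iomem *);

	/* hypothetical caller with a read-only register mapping */
	static u32 read_status(const void __iomem *regs)
	{
		return ioread32(regs + 0x04);
	}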
Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org
Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/alpha/include/asm/core_apecs.h | 6 +-
arch/alpha/include/asm/core_cia.h | 6 +-
arch/alpha/include/asm/core_lca.h | 6 +-
arch/alpha/include/asm/core_marvel.h | 4 -
arch/alpha/include/asm/core_mcpcia.h | 6 +-
arch/alpha/include/asm/core_t2.h | 2
arch/alpha/include/asm/io.h | 12 ++--
arch/alpha/include/asm/io_trivial.h | 16 ++---
arch/alpha/include/asm/jensen.h | 2
arch/alpha/include/asm/machvec.h | 6 +-
arch/alpha/kernel/core_marvel.c | 2
arch/alpha/kernel/io.c | 12 ++--
arch/parisc/include/asm/io.h | 4 -
arch/parisc/lib/iomap.c | 72 ++++++++++++------------
arch/powerpc/kernel/iomap.c | 28 ++++-----
arch/sh/kernel/iomap.c | 22 +++----
include/asm-generic/iomap.h | 28 ++++-----
include/linux/io-64-nonatomic-hi-lo.h | 4 -
include/linux/io-64-nonatomic-lo-hi.h | 4 -
lib/iomap.c | 30 +++++-----
20 files changed, 136 insertions(+), 136 deletions(-)
--- a/arch/alpha/include/asm/core_apecs.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_apecs.h
@@ -384,7 +384,7 @@ struct el_apecs_procdata
} \
} while (0)
-__EXTERN_INLINE unsigned int apecs_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -420,7 +420,7 @@ __EXTERN_INLINE void apecs_iowrite8(u8 b
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int apecs_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -456,7 +456,7 @@ __EXTERN_INLINE void apecs_iowrite16(u16
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int apecs_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < APECS_DENSE_MEM)
--- a/arch/alpha/include/asm/core_cia.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_cia.h
@@ -342,7 +342,7 @@ struct el_CIA_sysdata_mcheck {
#define vuip volatile unsigned int __force *
#define vulp volatile unsigned long __force *
-__EXTERN_INLINE unsigned int cia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -374,7 +374,7 @@ __EXTERN_INLINE void cia_iowrite8(u8 b,
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int cia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -404,7 +404,7 @@ __EXTERN_INLINE void cia_iowrite16(u16 b
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int cia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < CIA_DENSE_MEM)
--- a/arch/alpha/include/asm/core_lca.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_lca.h
@@ -230,7 +230,7 @@ union el_lca {
} while (0)
-__EXTERN_INLINE unsigned int lca_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -266,7 +266,7 @@ __EXTERN_INLINE void lca_iowrite8(u8 b,
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int lca_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -302,7 +302,7 @@ __EXTERN_INLINE void lca_iowrite16(u16 b
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int lca_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < LCA_DENSE_MEM)
--- a/arch/alpha/include/asm/core_marvel.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_marvel.h
@@ -332,10 +332,10 @@ struct io7 {
#define vucp volatile unsigned char __force *
#define vusp volatile unsigned short __force *
-extern unsigned int marvel_ioread8(void __iomem *);
+extern unsigned int marvel_ioread8(const void __iomem *);
extern void marvel_iowrite8(u8 b, void __iomem *);
-__EXTERN_INLINE unsigned int marvel_ioread16(void __iomem *addr)
+__EXTERN_INLINE unsigned int marvel_ioread16(const void __iomem *addr)
{
return __kernel_ldwu(*(vusp)addr);
}
--- a/arch/alpha/include/asm/core_mcpcia.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_mcpcia.h
@@ -267,7 +267,7 @@ extern inline int __mcpcia_is_mmio(unsig
return (addr & 0x80000000UL) == 0;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -291,7 +291,7 @@ __EXTERN_INLINE void mcpcia_iowrite8(u8
*(vuip) ((addr << 5) + hose + 0x00) = w;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -315,7 +315,7 @@ __EXTERN_INLINE void mcpcia_iowrite16(u1
*(vuip) ((addr << 5) + hose + 0x08) = w;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr;
--- a/arch/alpha/include/asm/core_t2.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/core_t2.h
@@ -572,7 +572,7 @@ __EXTERN_INLINE int t2_is_mmio(const vol
it doesn't make sense to merge the pio and mmio routines. */
#define IOPORT(OS, NS) \
-__EXTERN_INLINE unsigned int t2_ioread##NS(void __iomem *xaddr) \
+__EXTERN_INLINE unsigned int t2_ioread##NS(const void __iomem *xaddr) \
{ \
if (t2_is_mmio(xaddr)) \
return t2_read##OS(xaddr); \
--- a/arch/alpha/include/asm/io.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/io.h
@@ -150,9 +150,9 @@ static inline void generic_##NAME(TYPE b
alpha_mv.mv_##NAME(b, addr); \
}
-REMAP1(unsigned int, ioread8, /**/)
-REMAP1(unsigned int, ioread16, /**/)
-REMAP1(unsigned int, ioread32, /**/)
+REMAP1(unsigned int, ioread8, const)
+REMAP1(unsigned int, ioread16, const)
+REMAP1(unsigned int, ioread32, const)
REMAP1(u8, readb, const volatile)
REMAP1(u16, readw, const volatile)
REMAP1(u32, readl, const volatile)
@@ -307,7 +307,7 @@ static inline int __is_mmio(const volati
*/
#if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
-extern inline unsigned int ioread8(void __iomem *addr)
+extern inline unsigned int ioread8(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -316,7 +316,7 @@ extern inline unsigned int ioread8(void
return ret;
}
-extern inline unsigned int ioread16(void __iomem *addr)
+extern inline unsigned int ioread16(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -359,7 +359,7 @@ extern inline void outw(u16 b, unsigned
#endif
#if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
-extern inline unsigned int ioread32(void __iomem *addr)
+extern inline unsigned int ioread32(const void __iomem *addr)
{
unsigned int ret;
mb();
--- a/arch/alpha/include/asm/io_trivial.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/io_trivial.h
@@ -7,15 +7,15 @@
#if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread8)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread8)(const void __iomem *a)
{
- return __kernel_ldbu(*(volatile u8 __force *)a);
+ return __kernel_ldbu(*(const volatile u8 __force *)a);
}
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread16)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread16)(const void __iomem *a)
{
- return __kernel_ldwu(*(volatile u16 __force *)a);
+ return __kernel_ldwu(*(const volatile u16 __force *)a);
}
__EXTERN_INLINE void
@@ -33,9 +33,9 @@ IO_CONCAT(__IO_PREFIX,iowrite16)(u16 b,
#if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread32)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread32)(const void __iomem *a)
{
- return *(volatile u32 __force *)a;
+ return *(const volatile u32 __force *)a;
}
__EXTERN_INLINE void
@@ -73,14 +73,14 @@ IO_CONCAT(__IO_PREFIX,writew)(u16 b, vol
__EXTERN_INLINE u8
IO_CONCAT(__IO_PREFIX,readb)(const volatile void __iomem *a)
{
- void __iomem *addr = (void __iomem *)a;
+ const void __iomem *addr = (const void __iomem *)a;
return IO_CONCAT(__IO_PREFIX,ioread8)(addr);
}
__EXTERN_INLINE u16
IO_CONCAT(__IO_PREFIX,readw)(const volatile void __iomem *a)
{
- void __iomem *addr = (void __iomem *)a;
+ const void __iomem *addr = (const void __iomem *)a;
return IO_CONCAT(__IO_PREFIX,ioread16)(addr);
}
--- a/arch/alpha/include/asm/jensen.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/jensen.h
@@ -305,7 +305,7 @@ __EXTERN_INLINE int jensen_is_mmio(const
that it doesn't make sense to merge them. */
#define IOPORT(OS, NS) \
-__EXTERN_INLINE unsigned int jensen_ioread##NS(void __iomem *xaddr) \
+__EXTERN_INLINE unsigned int jensen_ioread##NS(const void __iomem *xaddr) \
{ \
if (jensen_is_mmio(xaddr)) \
return jensen_read##OS(xaddr - 0x100000000ul); \
--- a/arch/alpha/include/asm/machvec.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/include/asm/machvec.h
@@ -46,9 +46,9 @@ struct alpha_machine_vector
void (*mv_pci_tbi)(struct pci_controller *hose,
dma_addr_t start, dma_addr_t end);
- unsigned int (*mv_ioread8)(void __iomem *);
- unsigned int (*mv_ioread16)(void __iomem *);
- unsigned int (*mv_ioread32)(void __iomem *);
+ unsigned int (*mv_ioread8)(const void __iomem *);
+ unsigned int (*mv_ioread16)(const void __iomem *);
+ unsigned int (*mv_ioread32)(const void __iomem *);
void (*mv_iowrite8)(u8, void __iomem *);
void (*mv_iowrite16)(u16, void __iomem *);
--- a/arch/alpha/kernel/core_marvel.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/kernel/core_marvel.c
@@ -806,7 +806,7 @@ void __iomem *marvel_ioportmap (unsigned
}
unsigned int
-marvel_ioread8(void __iomem *xaddr)
+marvel_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (__marvel_is_port_kbd(addr))
--- a/arch/alpha/kernel/io.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/alpha/kernel/io.c
@@ -14,7 +14,7 @@
"generic", which bumps through the machine vector. */
unsigned int
-ioread8(void __iomem *addr)
+ioread8(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -23,7 +23,7 @@ ioread8(void __iomem *addr)
return ret;
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -32,7 +32,7 @@ unsigned int ioread16(void __iomem *addr
return ret;
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -257,7 +257,7 @@ EXPORT_SYMBOL(readq_relaxed);
/*
* Read COUNT 8-bit bytes from port PORT into memory starting at SRC.
*/
-void ioread8_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *port, void *dst, unsigned long count)
{
while ((unsigned long)dst & 0x3) {
if (!count)
@@ -300,7 +300,7 @@ EXPORT_SYMBOL(insb);
* the interfaces seems to be slow: just using the inlined version
* of the inw() breaks things.
*/
-void ioread16_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *port, void *dst, unsigned long count)
{
if (unlikely((unsigned long)dst & 0x3)) {
if (!count)
@@ -340,7 +340,7 @@ EXPORT_SYMBOL(insw);
* but the interfaces seems to be slow: just using the inlined version
* of the inl() breaks things.
*/
-void ioread32_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *port, void *dst, unsigned long count)
{
if (unlikely((unsigned long)dst & 0x3)) {
while (count--) {
--- a/arch/parisc/include/asm/io.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/parisc/include/asm/io.h
@@ -303,8 +303,8 @@ extern void outsl (unsigned long port, c
#define ioread64be ioread64be
#define iowrite64 iowrite64
#define iowrite64be iowrite64be
-extern u64 ioread64(void __iomem *addr);
-extern u64 ioread64be(void __iomem *addr);
+extern u64 ioread64(const void __iomem *addr);
+extern u64 ioread64be(const void __iomem *addr);
extern void iowrite64(u64 val, void __iomem *addr);
extern void iowrite64be(u64 val, void __iomem *addr);
--- a/arch/parisc/lib/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/parisc/lib/iomap.c
@@ -43,13 +43,13 @@
#endif
struct iomap_ops {
- unsigned int (*read8)(void __iomem *);
- unsigned int (*read16)(void __iomem *);
- unsigned int (*read16be)(void __iomem *);
- unsigned int (*read32)(void __iomem *);
- unsigned int (*read32be)(void __iomem *);
- u64 (*read64)(void __iomem *);
- u64 (*read64be)(void __iomem *);
+ unsigned int (*read8)(const void __iomem *);
+ unsigned int (*read16)(const void __iomem *);
+ unsigned int (*read16be)(const void __iomem *);
+ unsigned int (*read32)(const void __iomem *);
+ unsigned int (*read32be)(const void __iomem *);
+ u64 (*read64)(const void __iomem *);
+ u64 (*read64be)(const void __iomem *);
void (*write8)(u8, void __iomem *);
void (*write16)(u16, void __iomem *);
void (*write16be)(u16, void __iomem *);
@@ -57,9 +57,9 @@ struct iomap_ops {
void (*write32be)(u32, void __iomem *);
void (*write64)(u64, void __iomem *);
void (*write64be)(u64, void __iomem *);
- void (*read8r)(void __iomem *, void *, unsigned long);
- void (*read16r)(void __iomem *, void *, unsigned long);
- void (*read32r)(void __iomem *, void *, unsigned long);
+ void (*read8r)(const void __iomem *, void *, unsigned long);
+ void (*read16r)(const void __iomem *, void *, unsigned long);
+ void (*read32r)(const void __iomem *, void *, unsigned long);
void (*write8r)(void __iomem *, const void *, unsigned long);
void (*write16r)(void __iomem *, const void *, unsigned long);
void (*write32r)(void __iomem *, const void *, unsigned long);
@@ -69,17 +69,17 @@ struct iomap_ops {
#define ADDR2PORT(addr) ((unsigned long __force)(addr) & 0xffffff)
-static unsigned int ioport_read8(void __iomem *addr)
+static unsigned int ioport_read8(const void __iomem *addr)
{
return inb(ADDR2PORT(addr));
}
-static unsigned int ioport_read16(void __iomem *addr)
+static unsigned int ioport_read16(const void __iomem *addr)
{
return inw(ADDR2PORT(addr));
}
-static unsigned int ioport_read32(void __iomem *addr)
+static unsigned int ioport_read32(const void __iomem *addr)
{
return inl(ADDR2PORT(addr));
}
@@ -99,17 +99,17 @@ static void ioport_write32(u32 datum, vo
outl(datum, ADDR2PORT(addr));
}
-static void ioport_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read8r(const void __iomem *addr, void *dst, unsigned long count)
{
insb(ADDR2PORT(addr), dst, count);
}
-static void ioport_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read16r(const void __iomem *addr, void *dst, unsigned long count)
{
insw(ADDR2PORT(addr), dst, count);
}
-static void ioport_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read32r(const void __iomem *addr, void *dst, unsigned long count)
{
insl(ADDR2PORT(addr), dst, count);
}
@@ -150,37 +150,37 @@ static const struct iomap_ops ioport_ops
/* Legacy I/O memory ops */
-static unsigned int iomem_read8(void __iomem *addr)
+static unsigned int iomem_read8(const void __iomem *addr)
{
return readb(addr);
}
-static unsigned int iomem_read16(void __iomem *addr)
+static unsigned int iomem_read16(const void __iomem *addr)
{
return readw(addr);
}
-static unsigned int iomem_read16be(void __iomem *addr)
+static unsigned int iomem_read16be(const void __iomem *addr)
{
return __raw_readw(addr);
}
-static unsigned int iomem_read32(void __iomem *addr)
+static unsigned int iomem_read32(const void __iomem *addr)
{
return readl(addr);
}
-static unsigned int iomem_read32be(void __iomem *addr)
+static unsigned int iomem_read32be(const void __iomem *addr)
{
return __raw_readl(addr);
}
-static u64 iomem_read64(void __iomem *addr)
+static u64 iomem_read64(const void __iomem *addr)
{
return readq(addr);
}
-static u64 iomem_read64be(void __iomem *addr)
+static u64 iomem_read64be(const void __iomem *addr)
{
return __raw_readq(addr);
}
@@ -220,7 +220,7 @@ static void iomem_write64be(u64 datum, v
__raw_writel(datum, addr);
}
-static void iomem_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read8r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u8 *)dst = __raw_readb(addr);
@@ -228,7 +228,7 @@ static void iomem_read8r(void __iomem *a
}
}
-static void iomem_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read16r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u16 *)dst = __raw_readw(addr);
@@ -236,7 +236,7 @@ static void iomem_read16r(void __iomem *
}
}
-static void iomem_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read32r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u32 *)dst = __raw_readl(addr);
@@ -297,49 +297,49 @@ static const struct iomap_ops *iomap_ops
};
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read8(addr);
return *((u8 *)addr);
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read16(addr);
return le16_to_cpup((u16 *)addr);
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read16be(addr);
return *((u16 *)addr);
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read32(addr);
return le32_to_cpup((u32 *)addr);
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read32be(addr);
return *((u32 *)addr);
}
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read64(addr);
return le64_to_cpup((u64 *)addr);
}
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read64be(addr);
@@ -411,7 +411,7 @@ void iowrite64be(u64 datum, void __iomem
/* Repeating interfaces */
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read8r(addr, dst, count);
@@ -423,7 +423,7 @@ void ioread8_rep(void __iomem *addr, voi
}
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read16r(addr, dst, count);
@@ -435,7 +435,7 @@ void ioread16_rep(void __iomem *addr, vo
}
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read32r(addr, dst, count);
--- a/arch/powerpc/kernel/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/powerpc/kernel/iomap.c
@@ -15,23 +15,23 @@
* Here comes the ppc64 implementation of the IOMAP
* interfaces.
*/
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
return readb(addr);
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
return readw(addr);
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
return readw_be(addr);
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
return readl(addr);
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
return readl_be(addr);
}
@@ -41,27 +41,27 @@ EXPORT_SYMBOL(ioread16be);
EXPORT_SYMBOL(ioread32);
EXPORT_SYMBOL(ioread32be);
#ifdef __powerpc64__
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
{
return readq_be(addr);
}
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
{
return readq_be(addr);
}
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
{
return readq_be(addr);
}
@@ -139,15 +139,15 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
* FIXME! We could make these do EEH handling if we really
* wanted. Not clear if we do.
*/
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsb(addr, dst, count);
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsw(addr, dst, count);
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsl(addr, dst, count);
}
--- a/arch/sh/kernel/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/arch/sh/kernel/iomap.c
@@ -8,31 +8,31 @@
#include <linux/module.h>
#include <linux/io.h>
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
return readb(addr);
}
EXPORT_SYMBOL(ioread8);
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
return readw(addr);
}
EXPORT_SYMBOL(ioread16);
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
return be16_to_cpu(__raw_readw(addr));
}
EXPORT_SYMBOL(ioread16be);
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
return readl(addr);
}
EXPORT_SYMBOL(ioread32);
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
return be32_to_cpu(__raw_readl(addr));
}
@@ -74,7 +74,7 @@ EXPORT_SYMBOL(iowrite32be);
* convert to CPU byte order. We write in "IO byte
* order" (we also don't have IO barriers).
*/
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
{
while (--count >= 0) {
u8 data = __raw_readb(addr);
@@ -83,7 +83,7 @@ static inline void mmio_insb(void __iome
}
}
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
{
while (--count >= 0) {
u16 data = __raw_readw(addr);
@@ -92,7 +92,7 @@ static inline void mmio_insw(void __iome
}
}
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
{
while (--count >= 0) {
u32 data = __raw_readl(addr);
@@ -125,19 +125,19 @@ static inline void mmio_outsl(void __iom
}
}
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insb(addr, dst, count);
}
EXPORT_SYMBOL(ioread8_rep);
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insw(addr, dst, count);
}
EXPORT_SYMBOL(ioread16_rep);
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insl(addr, dst, count);
}
--- a/include/asm-generic/iomap.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/asm-generic/iomap.h
@@ -26,14 +26,14 @@
* in the low address range. Architectures for which this is not
* true can't use this generic implementation.
*/
-extern unsigned int ioread8(void __iomem *);
-extern unsigned int ioread16(void __iomem *);
-extern unsigned int ioread16be(void __iomem *);
-extern unsigned int ioread32(void __iomem *);
-extern unsigned int ioread32be(void __iomem *);
+extern unsigned int ioread8(const void __iomem *);
+extern unsigned int ioread16(const void __iomem *);
+extern unsigned int ioread16be(const void __iomem *);
+extern unsigned int ioread32(const void __iomem *);
+extern unsigned int ioread32be(const void __iomem *);
#ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+extern u64 ioread64(const void __iomem *);
+extern u64 ioread64be(const void __iomem *);
#endif
#ifdef readq
@@ -41,10 +41,10 @@ extern u64 ioread64be(void __iomem *);
#define ioread64_hi_lo ioread64_hi_lo
#define ioread64be_lo_hi ioread64be_lo_hi
#define ioread64be_hi_lo ioread64be_hi_lo
-extern u64 ioread64_lo_hi(void __iomem *addr);
-extern u64 ioread64_hi_lo(void __iomem *addr);
-extern u64 ioread64be_lo_hi(void __iomem *addr);
-extern u64 ioread64be_hi_lo(void __iomem *addr);
+extern u64 ioread64_lo_hi(const void __iomem *addr);
+extern u64 ioread64_hi_lo(const void __iomem *addr);
+extern u64 ioread64be_lo_hi(const void __iomem *addr);
+extern u64 ioread64be_hi_lo(const void __iomem *addr);
#endif
extern void iowrite8(u8, void __iomem *);
@@ -79,9 +79,9 @@ extern void iowrite64be_hi_lo(u64 val, v
* memory across multiple ports, use "memcpy_toio()"
* and friends.
*/
-extern void ioread8_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread16_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread32_rep(void __iomem *port, void *buf, unsigned long count);
+extern void ioread8_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread16_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread32_rep(const void __iomem *port, void *buf, unsigned long count);
extern void iowrite8_rep(void __iomem *port, const void *buf, unsigned long count);
extern void iowrite16_rep(void __iomem *port, const void *buf, unsigned long count);
--- a/include/linux/io-64-nonatomic-hi-lo.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/linux/io-64-nonatomic-hi-lo.h
@@ -57,7 +57,7 @@ static inline void hi_lo_writeq_relaxed(
#ifndef ioread64_hi_lo
#define ioread64_hi_lo ioread64_hi_lo
-static inline u64 ioread64_hi_lo(void __iomem *addr)
+static inline u64 ioread64_hi_lo(const void __iomem *addr)
{
u32 low, high;
@@ -79,7 +79,7 @@ static inline void iowrite64_hi_lo(u64 v
#ifndef ioread64be_hi_lo
#define ioread64be_hi_lo ioread64be_hi_lo
-static inline u64 ioread64be_hi_lo(void __iomem *addr)
+static inline u64 ioread64be_hi_lo(const void __iomem *addr)
{
u32 low, high;
--- a/include/linux/io-64-nonatomic-lo-hi.h~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/include/linux/io-64-nonatomic-lo-hi.h
@@ -57,7 +57,7 @@ static inline void lo_hi_writeq_relaxed(
#ifndef ioread64_lo_hi
#define ioread64_lo_hi ioread64_lo_hi
-static inline u64 ioread64_lo_hi(void __iomem *addr)
+static inline u64 ioread64_lo_hi(const void __iomem *addr)
{
u32 low, high;
@@ -79,7 +79,7 @@ static inline void iowrite64_lo_hi(u64 v
#ifndef ioread64be_lo_hi
#define ioread64be_lo_hi ioread64be_lo_hi
-static inline u64 ioread64be_lo_hi(void __iomem *addr)
+static inline u64 ioread64be_lo_hi(const void __iomem *addr)
{
u32 low, high;
--- a/lib/iomap.c~iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/lib/iomap.c
@@ -70,27 +70,27 @@ static void bad_io_access(unsigned long
#define mmio_read64be(addr) swab64(readq(addr))
#endif
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
IO_COND(addr, return inb(port), return readb(addr));
return 0xff;
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
IO_COND(addr, return inw(port), return readw(addr));
return 0xffff;
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
IO_COND(addr, return pio_read16be(port), return mmio_read16be(addr));
return 0xffff;
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
IO_COND(addr, return inl(port), return readl(addr));
return 0xffffffff;
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
IO_COND(addr, return pio_read32be(port), return mmio_read32be(addr));
return 0xffffffff;
@@ -142,26 +142,26 @@ static u64 pio_read64be_hi_lo(unsigned l
return lo | (hi << 32);
}
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
{
IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
{
IO_COND(addr, return pio_read64_hi_lo(port), return readq(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
{
IO_COND(addr, return pio_read64be_lo_hi(port),
return mmio_read64be(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
{
IO_COND(addr, return pio_read64be_hi_lo(port),
return mmio_read64be(addr));
@@ -275,7 +275,7 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
* order" (we also don't have IO barriers).
*/
#ifndef mmio_insb
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
{
while (--count >= 0) {
u8 data = __raw_readb(addr);
@@ -283,7 +283,7 @@ static inline void mmio_insb(void __iome
dst++;
}
}
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
{
while (--count >= 0) {
u16 data = __raw_readw(addr);
@@ -291,7 +291,7 @@ static inline void mmio_insw(void __iome
dst++;
}
}
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
{
while (--count >= 0) {
u32 data = __raw_readl(addr);
@@ -325,15 +325,15 @@ static inline void mmio_outsl(void __iom
}
#endif
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insb(port,dst,count), mmio_insb(addr, dst, count));
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insw(port,dst,count), mmio_insw(addr, dst, count));
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insl(port,dst,count), mmio_insl(addr, dst, count));
}
_
Patches currently in -mm which might be from krzk@kernel.org are
iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (108 preceding siblings ...)
2020-07-10 0:33 ` + iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10 0:33 ` Andrew Morton
2020-07-10 0:33 ` + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
` (122 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:33 UTC (permalink / raw)
To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
ysato
The patch titled
Subject: rtl818x: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree. Its filename is
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: rtl818x: constify ioreadX() iomem argument (as in generic implementation)
The ioreadX() helpers have an inconsistent interface: on some architectures
the void __iomem * address argument is a pointer to const, on others it is
not. Implementations of ioreadX() do not modify the memory under the
address, so they can be converted to a "const" version for const-safety and
consistency among architectures.
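As an aside for illustration only (not part of the patch; the helper name is
made up), a minimal sketch of what the constified prototype buys a caller:

#include <linux/io.h>
#include <linux/types.h>

/* A read-only status helper can mark its own pointer const and still call
 * ioread32() without casts, now that ioread32() accepts a const pointer. */
static u32 example_read_status(const void __iomem *status_reg)
{
	return ioread32(status_reg);
}

Before the conversion, passing status_reg to ioread32() would have triggered
a discarded-qualifiers warning unless the const was cast away.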
Link: http://lkml.kernel.org/r/20200709072837.5869-3-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h~rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
@@ -150,17 +150,17 @@ void rtl8180_write_phy(struct ieee80211_
void rtl8180_set_anaparam(struct rtl8180_priv *priv, u32 anaparam);
void rtl8180_set_anaparam2(struct rtl8180_priv *priv, u32 anaparam2);
-static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, u8 __iomem *addr)
+static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, const u8 __iomem *addr)
{
return ioread8(addr);
}
-static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, __le16 __iomem *addr)
+static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, const __le16 __iomem *addr)
{
return ioread16(addr);
}
-static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, __le32 __iomem *addr)
+static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, const __le32 __iomem *addr)
{
return ioread32(addr);
}
_
Patches currently in -mm which might be from krzk@kernel.org are
iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (109 preceding siblings ...)
2020-07-10 0:33 ` + rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10 0:33 ` Andrew Morton
2020-07-10 0:33 ` + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
` (121 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:33 UTC (permalink / raw)
To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
ysato
The patch titled
Subject: ntb: intel: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree. Its filename is
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: ntb: intel: constify ioreadX() iomem argument (as in generic implementation)
The ioreadX() helpers have an inconsistent interface: on some architectures
the void __iomem * address argument is a pointer to const, on others it is
not. Implementations of ioreadX() do not modify the memory under the
address, so they can be converted to a "const" version for const-safety and
consistency among architectures.
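Purely as an illustration (the names below are hypothetical), the same
const-ness lets an ops table declare its read hook with a const __iomem
argument and fill it straight from an ioreadX()-based wrapper, mirroring the
db_ioread change in the hunks below:

#include <linux/io.h>
#include <linux/types.h>

struct example_db_ops {
	u64 (*db_read)(const void __iomem *mmio);
};

/* Wrapper built on the constified ioread16(); its type now matches the
 * const-qualified function pointer above without any cast. */
static u64 example_db_read16(const void __iomem *mmio)
{
	return (u64)ioread16(mmio);
}

static const struct example_db_ops example_ops = {
	.db_read = example_db_read16,
};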
Link: http://lkml.kernel.org/r/20200709072837.5869-4-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/ntb/hw/intel/ntb_hw_gen1.c | 2 +-
drivers/ntb/hw/intel/ntb_hw_gen3.h | 2 +-
drivers/ntb/hw/intel/ntb_hw_intel.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -1205,7 +1205,7 @@ int intel_ntb_peer_spad_write(struct ntb
ndev->peer_reg->spad);
}
-static u64 xeon_db_ioread(void __iomem *mmio)
+static u64 xeon_db_ioread(const void __iomem *mmio)
{
return (u64)ioread16(mmio);
}
--- a/drivers/ntb/hw/intel/ntb_hw_gen3.h~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_gen3.h
@@ -91,7 +91,7 @@
#define GEN3_DB_TOTAL_SHIFT 33
#define GEN3_SPAD_COUNT 16
-static inline u64 gen3_db_ioread(void __iomem *mmio)
+static inline u64 gen3_db_ioread(const void __iomem *mmio)
{
return ioread64(mmio);
}
--- a/drivers/ntb/hw/intel/ntb_hw_intel.h~ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/ntb/hw/intel/ntb_hw_intel.h
@@ -103,7 +103,7 @@ struct intel_ntb_dev;
struct intel_ntb_reg {
int (*poll_link)(struct intel_ntb_dev *ndev);
int (*link_is_up)(struct intel_ntb_dev *ndev);
- u64 (*db_ioread)(void __iomem *mmio);
+ u64 (*db_ioread)(const void __iomem *mmio);
void (*db_iowrite)(u64 db_bits, void __iomem *mmio);
unsigned long ntb_ctl;
resource_size_t db_size;
_
Patches currently in -mm which might be from krzk@kernel.org are
iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (110 preceding siblings ...)
2020-07-10 0:33 ` + ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10 0:33 ` Andrew Morton
2020-07-10 0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
` (120 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:33 UTC (permalink / raw)
To: allenbh, arnd, benh, dalias, dave.jiang, davem, deller,
geert+renesas, geert, ink, James.Bottomley, jasowang, jdmason,
krzk, kuba, kvalo, mattst88, mm-commits, mpe, mst, paulus, rth,
ysato
The patch titled
Subject: virtio: pci: constify ioreadX() iomem argument (as in generic implementation)
has been added to the -mm tree. Its filename is
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Krzysztof Kozlowski <krzk@kernel.org>
Subject: virtio: pci: constify ioreadX() iomem argument (as in generic implementation)
The ioreadX() helpers have an inconsistent interface: on some architectures
the void __iomem * address argument is a pointer to const, on others it is
not. Implementations of ioreadX() do not modify the memory under the
address, so they can be converted to a "const" version for const-safety and
consistency among architectures.
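For illustration only (the struct and field below are invented), const
propagates naturally from a read-only register block to the typed helper:

struct example_cfg {
	__le16 queue_count;	/* device-written field, read-only for the driver */
};

/* &cfg->queue_count has type "const __le16 __iomem *", which matches the
 * constified vp_ioread16() prototype from the hunk below. */
static u16 example_queue_count(const struct example_cfg __iomem *cfg)
{
	return vp_ioread16(&cfg->queue_count);
}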
Link: http://lkml.kernel.org/r/20200709072837.5869-5-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/virtio/virtio_pci_modern.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/virtio/virtio_pci_modern.c~virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation
+++ a/drivers/virtio/virtio_pci_modern.c
@@ -27,16 +27,16 @@
* method, i.e. 32-bit accesses for 32-bit fields, 16-bit accesses
* for 16-bit fields and 8-bit accesses for 8-bit fields.
*/
-static inline u8 vp_ioread8(u8 __iomem *addr)
+static inline u8 vp_ioread8(const u8 __iomem *addr)
{
return ioread8(addr);
}
-static inline u16 vp_ioread16 (__le16 __iomem *addr)
+static inline u16 vp_ioread16 (const __le16 __iomem *addr)
{
return ioread16(addr);
}
-static inline u32 vp_ioread32(__le32 __iomem *addr)
+static inline u32 vp_ioread32(const __le32 __iomem *addr)
{
return ioread32(addr);
}
_
Patches currently in -mm which might be from krzk@kernel.org are
iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + doc-mm-sync-up-oom_score_adj-documentation.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (111 preceding siblings ...)
2020-07-10 0:33 ` + virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch " Andrew Morton
@ 2020-07-10 0:36 ` Andrew Morton
2020-07-10 0:36 ` Andrew Morton
2020-07-10 0:36 ` + doc-mm-clarify-proc-pid-oom_score-value-range.patch " Andrew Morton
` (119 subsequent siblings)
232 siblings, 1 reply; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:36 UTC (permalink / raw)
To: corbet, laoar.shao, mhocko, mm-commits, rientjes
The patch titled
Subject: doc, mm: sync up oom_score_adj documentation
has been added to the -mm tree. Its filename is
doc-mm-sync-up-oom_score_adj-documentation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: doc, mm: sync up oom_score_adj documentation
The oom section still contains at least two stale notes. The 3% discount for
root processes is gone since d46078b28889 ("mm, oom: remove 3% bonus for
CAP_SYS_ADMIN processes").
Likewise, children of the selected oom victim are no longer sacrificed since
bbbe48029720 ("mm, oom: remove 'prefer children over parent' heuristic").
Drop both of them.
Link: http://lkml.kernel.org/r/20200709062603.18480-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/filesystems/proc.rst | 8 --------
1 file changed, 8 deletions(-)
--- a/Documentation/filesystems/proc.rst~doc-mm-sync-up-oom_score_adj-documentation
+++ a/Documentation/filesystems/proc.rst
@@ -1634,9 +1634,6 @@ may allocate from based on an estimation
For example, if a task is using all allowed memory, its badness score will be
1000. If it is using half of its allowed memory, its score will be 500.
-There is an additional factor included in the badness score: the current memory
-and swap usage is discounted by 3% for root processes.
-
The amount of "allowed" memory depends on the context in which the oom killer
was called. If it is due to the memory assigned to the allocating task's cpuset
being exhausted, the allowed memory represents the set of mems assigned to that
@@ -1672,11 +1669,6 @@ The value of /proc/<pid>/oom_score_adj m
value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
requires CAP_SYS_RESOURCE.
-Caveat: when a parent task is selected, the oom killer will sacrifice any first
-generation children with separate address spaces instead, if possible. This
-avoids servers and important system daemons from being killed and loses the
-minimal amount of work.
^ permalink raw reply [flat|nested] 247+ messages in thread
* + doc-mm-sync-up-oom_score_adj-documentation.patch added to -mm tree
2020-07-10 0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
@ 2020-07-10 0:36 ` Andrew Morton
0 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:36 UTC (permalink / raw)
To: corbet, laoar.shao, mhocko, mm-commits, rientjes
The patch titled
Subject: doc, mm: sync up oom_score_adj documentation
has been added to the -mm tree. Its filename is
doc-mm-sync-up-oom_score_adj-documentation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/doc-mm-sync-up-oom_score_adj-documentation.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: doc, mm: sync up oom_score_adj documentation
The oom section still contains at least two stale notes. The 3% discount for
root processes is gone since d46078b28889 ("mm, oom: remove 3% bonus for
CAP_SYS_ADMIN processes").
Likewise, children of the selected oom victim are no longer sacrificed since
bbbe48029720 ("mm, oom: remove 'prefer children over parent' heuristic").
Drop both of them.
Link: http://lkml.kernel.org/r/20200709062603.18480-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/filesystems/proc.rst | 8 --------
1 file changed, 8 deletions(-)
--- a/Documentation/filesystems/proc.rst~doc-mm-sync-up-oom_score_adj-documentation
+++ a/Documentation/filesystems/proc.rst
@@ -1634,9 +1634,6 @@ may allocate from based on an estimation
For example, if a task is using all allowed memory, its badness score will be
1000. If it is using half of its allowed memory, its score will be 500.
-There is an additional factor included in the badness score: the current memory
-and swap usage is discounted by 3% for root processes.
-
The amount of "allowed" memory depends on the context in which the oom killer
was called. If it is due to the memory assigned to the allocating task's cpuset
being exhausted, the allowed memory represents the set of mems assigned to that
@@ -1672,11 +1669,6 @@ The value of /proc/<pid>/oom_score_adj m
value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
requires CAP_SYS_RESOURCE.
-Caveat: when a parent task is selected, the oom killer will sacrifice any first
-generation children with separate address spaces instead, if possible. This
-avoids servers and important system daemons from being killed and loses the
-minimal amount of work.
-
3.2 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------
_
Patches currently in -mm which might be from mhocko@suse.com are
doc-mm-sync-up-oom_score_adj-documentation.patch
doc-mm-clarify-proc-pid-oom_score-value-range.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + doc-mm-clarify-proc-pid-oom_score-value-range.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (112 preceding siblings ...)
2020-07-10 0:36 ` + doc-mm-sync-up-oom_score_adj-documentation.patch " Andrew Morton
@ 2020-07-10 0:36 ` Andrew Morton
2020-07-10 0:38 ` [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from " Andrew Morton
` (118 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:36 UTC (permalink / raw)
To: corbet, laoar.shao, mhocko, mm-commits, rientjes
The patch titled
Subject: doc, mm: clarify /proc/<pid>/oom_score value range
has been added to the -mm tree. Its filename is
doc-mm-clarify-proc-pid-oom_score-value-range.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/doc-mm-clarify-proc-pid-oom_score-value-range.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/doc-mm-clarify-proc-pid-oom_score-value-range.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: doc, mm: clarify /proc/<pid>/oom_score value range
The exported value includes oom_score_adj, so the range is not [0, 1000] as
described in the previous section but rather [0, 2000]. Mention that fact
explicitly.
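A rough model of why the range becomes [0, 2000]; this is a simplification
for illustration, not the kernel's exact oom_badness() arithmetic:

/* badness is 0..1000 (share of allowed memory in use, in permille) and
 * oom_score_adj is -1000..1000; the sum is clamped at 0, giving 0..2000. */
static int effective_oom_score(int badness, int oom_score_adj)
{
	int score = badness + oom_score_adj;

	return score < 0 ? 0 : score;
}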
Link: http://lkml.kernel.org/r/20200709062603.18480-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/filesystems/proc.rst | 3 +++
1 file changed, 3 insertions(+)
--- a/Documentation/filesystems/proc.rst~doc-mm-clarify-proc-pid-oom_score-value-range
+++ a/Documentation/filesystems/proc.rst
@@ -1673,6 +1673,9 @@ requires CAP_SYS_RESOURCE.
3.2 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------
+Please note that the exported value includes oom_score_adj so it is effectively
+in range [0,2000].
+
This file can be used to check the current score used by the oom-killer is for
any given <pid>. Use it together with /proc/<pid>/oom_score_adj to tune which
process should be killed in an out-of-memory situation.
_
Patches currently in -mm which might be from mhocko@suse.com are
doc-mm-sync-up-oom_score_adj-documentation.patch
doc-mm-clarify-proc-pid-oom_score-value-range.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (113 preceding siblings ...)
2020-07-10 0:36 ` + doc-mm-clarify-proc-pid-oom_score-value-range.patch " Andrew Morton
@ 2020-07-10 0:38 ` Andrew Morton
2020-07-10 0:38 ` [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch " Andrew Morton
` (117 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:38 UTC (permalink / raw)
To: andi.kleen, dave.hansen, feng.tang, hannes, keescook, mgorman,
mhocko, mm-commits, tim.c.chen, willy, ying.huang
The patch titled
Subject: proc/meminfo: avoid open coded reading of vm_committed_as
has been removed from the -mm tree. Its filename was
proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: proc/meminfo: avoid open coded reading of vm_committed_as
Patch series "make vm_committed_as_batch aware of vm overcommit policy", v5.
When checking a performance change for the will-it-scale scalability mmap
test [1], we found very high lock contention on the spinlock of the percpu
counter 'vm_committed_as':
94.14% 0.35% [kernel.kallsyms] [k] _raw_spin_lock_irqsave
48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;
This heavy lock contention is not always necessary: 'vm_committed_as' only
needs to be very precise when the strict OVERCOMMIT_NEVER policy is set,
which requires a rather small batch number for the percpu counter.
So keep the batch number unchanged for the strict OVERCOMMIT_NEVER policy,
and enlarge it for the not-so-strict OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS
policies.
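The idea, as a hedged sketch (not the final code; the real mm_compute_batch()
in the last patch also scales the batch with memory size):

/* A small batch keeps the global counter precise for OVERCOMMIT_NEVER;
 * a much larger batch cuts spinlock traffic when the policy tolerates
 * some drift in the count. */
static s32 example_overcommit_batch(int policy)
{
	s32 batch = max_t(s32, num_present_cpus() * 2, 32);

	if (policy == OVERCOMMIT_NEVER)
		return batch;

	return batch * 64;	/* OVERCOMMIT_ALWAYS / OVERCOMMIT_GUESS */
}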
A benchmark with the same test case as in [1] shows a 53% improvement on an
8C/16T desktop and a 2097% (20X) improvement on a 4S/72C/144T server. For
that case, whether an improvement shows up depends on whether the test's
mmap size exceeds the computed batch number.
We tested 10+ platforms in 0day (server, desktop and laptop). With a 64X
lift, 80%+ of the platforms show improvements; with a 16X lift, 1/3 of the
platforms do.
Generally this should help mmap/unmap usage, as Michal Hocko mentioned:
: I believe that there are non-synthetic workloads which would benefit
: from a larger batch. E.g. large in memory databases which do large
: mmaps during startups from multiple threads.
Note: checkpatch raises some style complaints about patch 3, as the sysctl
handler declaration follows the format of its sibling functions.
[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/
patch1: a cleanup for /proc/meminfo
patch2: a preparation patch which also improves the accuracy of
vm_memory_committed
patch3: main change
This patch (of 3):
Use the existing vm_memory_committed() instead, which is also convenient
for future changes.
Link: http://lkml.kernel.org/r/1592725000-73486-1-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-2-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/proc/meminfo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/proc/meminfo.c~proc-meminfo-avoid-open-coded-reading-of-vm_committed_as
+++ a/fs/proc/meminfo.c
@@ -41,7 +41,7 @@ static int meminfo_proc_show(struct seq_
si_meminfo(&i);
si_swapinfo(&i);
- committed = percpu_counter_read_positive(&vm_committed_as);
+ committed = vm_memory_committed();
cached = global_node_page_state(NR_FILE_PAGES) -
total_swapcache_pages() - i.bufferram;
_
Patches currently in -mm which might be from feng.tang@intel.com are
mm-utilc-make-vm_memory_committed-more-accurate.patch
mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (114 preceding siblings ...)
2020-07-10 0:38 ` [to-be-updated] proc-meminfo-avoid-open-coded-reading-of-vm_committed_as.patch removed from " Andrew Morton
@ 2020-07-10 0:38 ` Andrew Morton
2020-07-10 0:38 ` [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch " Andrew Morton
` (116 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:38 UTC (permalink / raw)
To: andi.kleen, dave.hansen, feng.tang, haiyangz, hannes, kys,
mgorman, mhocko, mm-commits, tim.c.chen, willy, ying.huang
The patch titled
Subject: mm/util.c: make vm_memory_committed() more accurate
has been removed from the -mm tree. Its filename was
mm-utilc-make-vm_memory_committed-more-accurate.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: mm/util.c: make vm_memory_committed() more accurate
percpu_counter_sum_positive() provides more accurate information.
With percpu_counter_read_positive(), in the worst case the deviation can be
'batch * nr_cpus', which is totalram_pages/256 for now and will grow when
the batch gets enlarged.
The time cost of the sum is about 800 nanoseconds on a 2C/4T platform and
2~3 microseconds on a 2S/36C/72T Skylake server in the normal case; in the
worst case, where vm_committed_as's spinlock is under severe contention, it
costs 30~40 microseconds on the 2S/36C/72T Skylake server. That should be
fine for its only two users: /proc/meminfo and the Hyper-V balloon driver's
once-per-second status trace.
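The trade-off between the two accessors, in a short illustrative sketch:

/* percpu_counter_read_positive() returns the cached central count: fast,
 * but off by up to batch * nr_cpus in the worst case.
 * percpu_counter_sum_positive() folds in every CPU's local delta under the
 * counter's lock: exact, but slower. */
static unsigned long committed_pages(bool need_exact)
{
	return need_exact ? percpu_counter_sum_positive(&vm_committed_as)
			  : percpu_counter_read_positive(&vm_committed_as);
}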
Link: http://lkml.kernel.org/r/1592725000-73486-3-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com> # for /proc/meminfo
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/util.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/mm/util.c~mm-utilc-make-vm_memory_committed-more-accurate
+++ a/mm/util.c
@@ -787,10 +787,15 @@ struct percpu_counter vm_committed_as __
* balancing memory across competing virtual machines that are hosted.
* Several metrics drive this policy engine including the guest reported
* memory commitment.
+ *
+ * The time cost of this is very low for small platforms, and for big
+ * platform like a 2S/36C/72T Skylake server, in worst case where
+ * vm_committed_as's spinlock is under severe contention, the time cost
+ * could be about 30~40 microseconds.
*/
unsigned long vm_memory_committed(void)
{
- return percpu_counter_read_positive(&vm_committed_as);
+ return percpu_counter_sum_positive(&vm_committed_as);
}
EXPORT_SYMBOL_GPL(vm_memory_committed);
_
Patches currently in -mm which might be from feng.tang@intel.com are
mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (115 preceding siblings ...)
2020-07-10 0:38 ` [to-be-updated] mm-utilc-make-vm_memory_committed-more-accurate.patch " Andrew Morton
@ 2020-07-10 0:38 ` Andrew Morton
2020-07-10 4:00 ` mmotm 2020-07-09-21-00 uploaded Andrew Morton
` (115 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 0:38 UTC (permalink / raw)
To: andi.kleen, dave.hansen, feng.tang, hannes, keescook, mgorman,
mhocko, mm-commits, tim.c.chen, willy, ying.huang
The patch titled
Subject: mm: adjust vm_committed_as_batch according to vm overcommit policy
has been removed from the -mm tree. Its filename was
mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Feng Tang <feng.tang@intel.com>
Subject: mm: adjust vm_committed_as_batch according to vm overcommit policy
When checking a performance change for the will-it-scale scalability mmap
test [1], we found very high lock contention on the spinlock of the percpu
counter 'vm_committed_as':
94.14% 0.35% [kernel.kallsyms] [k] _raw_spin_lock_irqsave
48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;
Actually this heavy lock contention is not always necessary. The
'vm_committed_as' needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch number
for the percpu counter.
So keep the 'batch' number unchanged for the strict OVERCOMMIT_NEVER policy,
and lift it to 64X for the OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS policies.
Also add a sysctl handler so the batch is recomputed when the policy is
reconfigured.
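For context (not part of this patch): a percpu counter only takes its shared
spinlock when a CPU's local delta reaches the batch size, which is why a
larger batch trades counter precision for less lock contention. A minimal
self-contained sketch of that idea, using pthreads rather than the kernel's
percpu_counter API:

/*
 * Illustrative sketch only (not kernel code): each CPU accumulates a local
 * delta and only takes the shared lock when the delta crosses 'batch', so a
 * larger batch means fewer lock acquisitions at the cost of a fuzzier total.
 */
#include <pthread.h>

#define NR_CPUS_SKETCH 256

struct sketch_counter {
        pthread_mutex_t lock;
        long long total;                        /* shared, protected by lock */
        long long local[NR_CPUS_SKETCH];        /* one delta per CPU/thread */
};

static void sketch_add(struct sketch_counter *c, int cpu, long long amount,
                       long long batch)
{
        long long v = c->local[cpu] + amount;

        if (v >= batch || v <= -batch) {
                pthread_mutex_lock(&c->lock);   /* slow, contended path */
                c->total += v;
                pthread_mutex_unlock(&c->lock);
                c->local[cpu] = 0;
        } else {
                c->local[cpu] = v;              /* lock-free fast path */
        }
}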
A benchmark with the same testcase in [1] shows a 53% improvement on an
8C/16T desktop and 2097% (20X) on a 4S/72C/144T server. We tested on the
0day test platforms (server, desktop and laptop), and 80%+ of them show
improvements with that test; whether a platform improves depends on whether
the test's mmap size is bigger than the computed batch number.
If the lift were only 16X, 1/3 of the platforms would show improvements,
though it should still help mmap/unmap usage generally, as Michal Hocko
mentioned:
: I believe that there are non-synthetic workloads which would benefit from
: a larger batch. E.g. large in memory databases which do large mmaps
: during startups from multiple threads.
[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/
Link: http://lkml.kernel.org/r/1589611660-89854-4-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-4-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/mm.h | 2 ++
include/linux/mman.h | 4 ++++
kernel/sysctl.c | 2 +-
mm/mm_init.c | 16 +++++++++++++---
mm/util.c | 12 ++++++++++++
5 files changed, 32 insertions(+), 4 deletions(-)
--- a/include/linux/mman.h~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/include/linux/mman.h
@@ -57,8 +57,12 @@ extern struct percpu_counter vm_committe
#ifdef CONFIG_SMP
extern s32 vm_committed_as_batch;
+extern void mm_compute_batch(void);
#else
#define vm_committed_as_batch 0
+static inline void mm_compute_batch(void)
+{
+}
#endif
unsigned long vm_memory_committed(void);
--- a/include/linux/mm.h~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/include/linux/mm.h
@@ -206,6 +206,8 @@ int overcommit_ratio_handler(struct ctl_
loff_t *);
int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *,
loff_t *);
+int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
+ loff_t *);
#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
--- a/kernel/sysctl.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/kernel/sysctl.c
@@ -2650,7 +2650,7 @@ static struct ctl_table vm_table[] = {
.data = &sysctl_overcommit_memory,
.maxlen = sizeof(sysctl_overcommit_memory),
.mode = 0644,
- .proc_handler = proc_dointvec_minmax,
+ .proc_handler = overcommit_policy_handler,
.extra1 = SYSCTL_ZERO,
.extra2 = &two,
},
--- a/mm/mm_init.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/mm/mm_init.c
@@ -13,6 +13,7 @@
#include <linux/memory.h>
#include <linux/notifier.h>
#include <linux/sched.h>
+#include <linux/mman.h>
#include "internal.h"
#ifdef CONFIG_DEBUG_MEMORY_INIT
@@ -144,14 +145,23 @@ EXPORT_SYMBOL_GPL(mm_kobj);
#ifdef CONFIG_SMP
s32 vm_committed_as_batch = 32;
-static void __meminit mm_compute_batch(void)
+void mm_compute_batch(void)
{
u64 memsized_batch;
s32 nr = num_present_cpus();
s32 batch = max_t(s32, nr*2, 32);
+ unsigned long ram_pages = totalram_pages();
- /* batch size set to 0.4% of (total memory/#cpus), or max int32 */
- memsized_batch = min_t(u64, (totalram_pages()/nr)/256, 0x7fffffff);
+ /*
+ * For policy of OVERCOMMIT_NEVER, set batch size to 0.4%
+ * of (total memory/#cpus), and lift it to 25% for other
+ * policies to easy the possible lock contention for percpu_counter
+ * vm_committed_as, while the max limit is INT_MAX
+ */
+ if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
+ memsized_batch = min_t(u64, ram_pages/nr/256, INT_MAX);
+ else
+ memsized_batch = min_t(u64, ram_pages/nr/4, INT_MAX);
vm_committed_as_batch = max_t(s32, memsized_batch, batch);
}
--- a/mm/util.c~mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy
+++ a/mm/util.c
@@ -746,6 +746,18 @@ int overcommit_ratio_handler(struct ctl_
return ret;
}
+int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
+ size_t *lenp, loff_t *ppos)
+{
+ int ret;
+
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ if (ret == 0 && write)
+ mm_compute_batch();
+
+ return ret;
+}
+
int overcommit_kbytes_handler(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos)
{
_
Patches currently in -mm which might be from feng.tang@intel.com are
^ permalink raw reply [flat|nested] 247+ messages in thread
* mmotm 2020-07-09-21-00 uploaded
2020-07-03 22:14 incoming Andrew Morton
` (116 preceding siblings ...)
2020-07-10 0:38 ` [to-be-updated] mm-adjust-vm_committed_as_batch-according-to-vm-overcommit-policy.patch " Andrew Morton
@ 2020-07-10 4:00 ` Andrew Morton
2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree Andrew Morton
` (114 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 4:00 UTC (permalink / raw)
To: broonie, linux-fsdevel, linux-kernel, linux-mm, linux-next,
mhocko, mm-commits, sfr
The mm-of-the-moment snapshot 2020-07-09-21-00 has been uploaded to
http://www.ozlabs.org/~akpm/mmotm/
mmotm-readme.txt says
README for mm-of-the-moment:
http://www.ozlabs.org/~akpm/mmotm/
This is a snapshot of my -mm patch queue. Uploaded at random hopefully
more than once a week.
You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY). The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series
The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss. Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.
This tree is partially included in linux-next. To see which patches are
included in linux-next, consult the `series' file. Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.
A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release. Individual mmotm releases are tagged. The master branch always
points to the latest release, so it's constantly rebasing.
https://github.com/hnaz/linux-mm
The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree. It is updated more frequently
than mmotm, and is untested.
A git copy of this tree is also available at
https://github.com/hnaz/linux-mm
This mmotm tree contains the following patches against 5.8-rc4:
(patches marked "*" will be included in linux-next)
origin.patch
* mm-shuffle-dont-move-pages-between-zones-and-dont-read-garbage-memmaps.patch
* mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards.patch
* mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
* vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
* mm-initialize-return-of-vm_insert_pages.patch
* mm-memcontrol-fix-oops-inside-mem_cgroup_get_nr_swap_pages.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
* mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
* mailmap-add-entry-for-mike-rapoport.patch
* checkpatch-test-git_dir-changes.patch
* kthread-remove-incorrect-comment-in-kthread_create_on_cpu.patch
* kbuild-move-wtype-limits-to-w=2.patch
* scripts-tagssh-collect-compiled-source-precisely.patch
* scripts-tagssh-collect-compiled-source-precisely-v2.patch
* bloat-o-meter-support-comparing-library-archives.patch
* scripts-decode_stacktrace-skip-missing-symbols.patch
* scripts-decode_stacktrace-guess-basepath-if-not-specified.patch
* scripts-decode_stacktrace-guess-path-to-modules.patch
* scripts-decode_stacktrace-guess-path-to-vmlinux-by-release-name.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ocfs2-change-slot-number-type-s16-to-u16.patch
* ramfs-support-o_tmpfile.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
mm.patch
* mm-treewide-rename-kzfree-to-kfree_sensitive.patch
* mm-ksize-should-silently-accept-a-null-pointer.patch
* mm-expand-config_slab_freelist_hardened-to-include-slab.patch
* slab-add-naive-detection-of-double-free.patch
* slab-add-naive-detection-of-double-free-fix.patch
* mm-slab-check-gfp_slab_bug_mask-before-alloc_pages-in-kmalloc_order.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
* mm-slub-extend-slub_debug-syntax-for-multiple-blocks-fix.patch
* mm-slub-make-some-slub_debug-related-attributes-read-only.patch
* mm-slub-remove-runtime-allocation-order-changes.patch
* mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
* mm-slub-make-reclaim_account-attribute-read-only.patch
* mm-slub-introduce-static-key-for-slub_debug.patch
* mm-slub-introduce-kmem_cache_debug_flags.patch
* mm-slub-introduce-kmem_cache_debug_flags-fix.patch
* mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
* mm-slab-slub-move-and-improve-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
* mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
* slub-drop-lockdep_assert_held-from-put_map.patch
* mm-kcsan-instrument-slab-slub-free-with-assert_exclusive_access.patch
* mm-debug_vm_pgtable-add-tests-validating-arch-helpers-for-core-mm-features.patch
* mm-debug_vm_pgtable-add-tests-validating-advanced-arch-page-table-helpers.patch
* mm-debug_vm_pgtable-add-debug-prints-for-individual-tests.patch
* documentation-mm-add-descriptions-for-arch-page-table-helpers.patch
* mm-handle-page-mapping-better-in-dump_page.patch
* mm-dump-compound-page-information-on-a-second-line.patch
* mm-print-head-flags-in-dump_page.patch
* mm-switch-dump_page-to-get_kernel_nofault.patch
* mm-print-the-inode-number-in-dump_page.patch
* mm-print-hashed-address-of-struct-page.patch
* mm-filemap-clear-idle-flag-for-writes.patch
* mm-filemap-add-missing-fgp_-flags-in-kerneldoc-comment-for-pagecache_get_page.patch
* mm-swap-simplify-alloc_swap_slot_cache.patch
* mm-swap-simplify-enable_swap_slots_cache.patch
* mm-swap-remove-redundant-check-for-swap_slot_cache_initialized.patch
* mm-kmem-make-memcg_kmem_enabled-irreversible.patch
* mm-memcg-factor-out-memcg-and-lruvec-level-changes-out-of-__mod_lruvec_state.patch
* mm-memcg-prepare-for-byte-sized-vmstat-items.patch
* mm-memcg-convert-vmstat-slab-counters-to-bytes.patch
* mm-slub-implement-slub-version-of-obj_to_index.patch
* mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
* mm-memcg-slab-obj_cgroup-api.patch
* mm-memcg-slab-allocate-obj_cgroups-for-non-root-slab-pages.patch
* mm-memcg-slab-save-obj_cgroup-for-non-root-slab-objects.patch
* mm-memcg-slab-charge-individual-slab-objects-instead-of-pages.patch
* mm-memcg-slab-deprecate-memorykmemslabinfo.patch
* mm-memcg-slab-move-memcg_kmem_bypass-to-memcontrolh.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
* mm-memcg-slab-simplify-memcg-cache-creation.patch
* mm-memcg-slab-remove-memcg_kmem_get_cache.patch
* mm-memcg-slab-deprecate-slab_root_caches.patch
* mm-memcg-slab-remove-redundant-check-in-memcg_accumulate_slabinfo.patch
* mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-allocations.patch
* kselftests-cgroup-add-kernel-memory-accounting-tests.patch
* tools-cgroup-add-memcg_slabinfopy-tool.patch
* percpu-return-number-of-released-bytes-from-pcpu_free_area.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
* mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics.patch
* mm-memcg-percpu-per-memcg-percpu-memory-statistics-v3.patch
* mm-memcg-charge-memcg-percpu-memory-to-the-parent-cgroup.patch
* kselftests-cgroup-add-perpcu-memory-accounting-test.patch
* mm-memcontrol-account-kernel-stack-per-node.patch
* mm-memcg-slab-remove-unused-argument-by-charge_slab_page.patch
* mm-slab-rename-uncharge_slab_page-to-unaccount_slab_page.patch
* mm-kmem-switch-to-static_branch_likely-in-memcg_kmem_enabled.patch
* mm-memcontrol-avoid-workload-stalls-when-lowering-memoryhigh.patch
* mm-remove-redundant-check-non_swap_entry.patch
* mm-memoryc-make-remap_pfn_range-reject-unaligned-addr.patch
* mm-remove-unneeded-includes-of-asm-pgalloch.patch
* mm-remove-unneeded-includes-of-asm-pgalloch-fix.patch
* opeinrisc-switch-to-generic-version-of-pte-allocation.patch
* xtensa-switch-to-generic-version-of-pte-allocation.patch
* asm-generic-pgalloc-provide-generic-pmd_alloc_one-and-pmd_free_one.patch
* asm-generic-pgalloc-provide-generic-pud_alloc_one-and-pud_free_one.patch
* asm-generic-pgalloc-provide-generic-pgd_free.patch
* mm-move-lib-ioremapc-to-mm.patch
* mm-move-pd_alloc_track-to-separate-header-file.patch
* mm-mmap-fix-the-adjusted-length-error.patch
* mm-mmap-optimize-a-branch-judgment-in-ksys_mmap_pgoff.patch
* mm-do-page-fault-accounting-in-handle_mm_fault.patch
* mm-alpha-use-general-page-fault-accounting.patch
* mm-arc-use-general-page-fault-accounting.patch
* mm-arm-use-general-page-fault-accounting.patch
* mm-arm64-use-general-page-fault-accounting.patch
* mm-csky-use-general-page-fault-accounting.patch
* mm-hexagon-use-general-page-fault-accounting.patch
* mm-ia64-use-general-page-fault-accounting.patch
* mm-m68k-use-general-page-fault-accounting.patch
* mm-microblaze-use-general-page-fault-accounting.patch
* mm-mips-use-general-page-fault-accounting.patch
* mm-nds32-use-general-page-fault-accounting.patch
* mm-nios2-use-general-page-fault-accounting.patch
* mm-openrisc-use-general-page-fault-accounting.patch
* mm-parisc-use-general-page-fault-accounting.patch
* mm-powerpc-use-general-page-fault-accounting.patch
* mm-riscv-use-general-page-fault-accounting.patch
* mm-s390-use-general-page-fault-accounting.patch
* mm-sh-use-general-page-fault-accounting.patch
* mm-sparc32-use-general-page-fault-accounting.patch
* mm-sparc64-use-general-page-fault-accounting.patch
* mm-x86-use-general-page-fault-accounting.patch
* mm-xtensa-use-general-page-fault-accounting.patch
* mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
* mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
* mm-sparse-never-partially-remove-memmap-for-early-section.patch
* mm-sparse-only-sub-section-aligned-range-would-be-populated.patch
* vmalloc-convert-to-xarray.patch
* mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
* mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
* mm-vmalloc-switch-to-propagate-callback.patch
* mm-vmalloc-update-the-header-about-kva-rework.patch
* mm-vmalloc-remove-redundant-asignmnet-in-unmap_kernel_range_noflush.patch
* kasan-improve-and-simplify-kconfigkasan.patch
* kasan-update-required-compiler-versions-in-documentation.patch
* rcu-kasan-record-and-print-call_rcu-call-stack.patch
* kasan-record-and-print-the-free-track.patch
* kasan-add-tests-for-call_rcu-stack-recording.patch
* kasan-update-documentation-for-generic-kasan.patch
* kasan-remove-kasan_unpoison_stack_above_sp_to.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan.patch
* kasan-fix-kasan-unit-tests-for-tag-based-kasan-v4.patch
* mm-page_alloc-use-unlikely-in-task_capc.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast.patch
* page_alloc-consider-highatomic-reserve-in-watermark-fast-v5.patch
* mm-page_alloc-skip-waternark_boost-for-atomic-order-0-allocations.patch
* mm-page_alloc-skip-watermark_boost-for-atomic-order-0-allocations-fix.patch
* mm-drop-vm_total_pages.patch
* mm-page_alloc-drop-nr_free_pagecache_pages.patch
* mm-memory_hotplug-document-why-shuffle_zone-is-relevant.patch
* mm-shuffle-remove-dynamic-reconfiguration.patch
* powerpc-numa-set-numa_node-for-all-possible-cpus.patch
* powerpc-numa-prefer-node-id-queried-from-vphn.patch
* mm-page_alloc-keep-memoryless-cpuless-node-0-offline.patch
* mm-page_allocc-replace-the-definition-of-nr_migratetype_bits-with-pb_migratetype_bits.patch
* mm-page_allocc-extract-the-common-part-in-pfn_to_bitidx.patch
* mm-page_allocc-simplify-pageblock-bitmap-access.patch
* mm-page_allocc-remove-unnecessary-end_bitidx-for-_pfnblock_flags_mask.patch
* mm-page_alloc-silence-a-kasan-false-positive.patch
* mm-page_alloc-fallbacks-at-most-has-3-elements.patch
* mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
* mm-huge_memoryc-update-tlb-entry-if-pmd-is-changed.patch
* mips-do-not-call-flush_tlb_all-when-setting-pmd-entry.patch
* mm-vmscanc-fixed-typo.patch
* mm-proactive-compaction.patch
* mm-proactive-compaction-fix.patch
* mm-use-unsigned-types-for-fragmentation-score.patch
* mm-oom-make-the-calculation-of-oom-badness-more-accurate.patch
* doc-mm-sync-up-oom_score_adj-documentation.patch
* doc-mm-clarify-proc-pid-oom_score-value-range.patch
* hugetlbfs-prevent-filesystem-stacking-of-hugetlbfs.patch
* mm-migrate-optimize-migrate_vma_setup-for-holes.patch
* mm-migrate-add-migrate-shared-test-for-migrate_vma_.patch
* mm-thp-remove-debug_cow-switch.patch
* mm-store-compound_nr-as-well-as-compound_order.patch
* mm-move-page-flags-include-to-top-of-file.patch
* mm-add-thp_order.patch
* mm-add-thp_size.patch
* mm-replace-hpage_nr_pages-with-thp_nr_pages.patch
* mm-add-thp_head.patch
* mm-introduce-offset_in_thp.patch
* mm-vmstat-add-events-for-thp-migration-without-split.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
* mm-vmstat-add-events-for-thp-migration-without-split-fix-2.patch
* mm-cma-fix-null-pointer-dereference-when-cma-could-not-be-activated.patch
* mm-cma-fix-the-name-of-cma-areas.patch
* mm-cma-fix-the-name-of-cma-areas-fix.patch
* mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
* mmhwpoison-cleanup-unused-pagehuge-check.patch
* mm-hwpoison-remove-recalculating-hpage.patch
* mmmadvise-call-soft_offline_page-without-mf_count_increased.patch
* mmmadvise-refactor-madvise_inject_error.patch
* mmhwpoison-inject-dont-pin-for-hwpoison_filter.patch
* mmhwpoison-un-export-get_hwpoison_page-and-make-it-static.patch
* mmhwpoison-kill-put_hwpoison_page.patch
* mmhwpoison-remove-mf_count_increased.patch
* mmhwpoison-remove-flag-argument-from-soft-offline-functions.patch
* mmhwpoison-unify-thp-handling-for-hard-and-soft-offline.patch
* mmhwpoison-rework-soft-offline-for-free-pages.patch
* mmhwpoison-rework-soft-offline-for-free-pages-fix.patch
* mmhwpoison-rework-soft-offline-for-in-use-pages.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix.patch
* mmhwpoison-refactor-soft_offline_huge_page-and-__soft_offline_page-fix-fix.patch
* mmhwpoison-return-0-if-the-page-is-already-poisoned-in-soft-offline.patch
* mmhwpoison-introduce-mf_msg_unsplit_thp.patch
* sched-mm-optimize-current_gfp_context.patch
* x86-mm-use-max-memory-block-size-on-bare-metal.patch
* info-task-hung-in-generic_file_write_iter.patch
* info-task-hung-in-generic_file_write-fix.patch
* kernel-hung_taskc-monitor-killed-tasks.patch
* fix-annotation-of-ioreadwrite1632be.patch
* sparse-group-the-defines-by-functionality.patch
* bitmap-fix-bitmap_cut-for-partial-overlapping-case.patch
* bitmap-add-test-for-bitmap_cut.patch
* lib-generic-radix-treec-remove-unneeded-__rcu.patch
* lib-test_bitops-do-the-full-test-during-module-init.patch
* lib-optimize-cpumask_local_spread.patch
* lib-test_lockupc-make-symbol-test_works-static.patch
* iomap-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* rtl818x-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* ntb-intel-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* virtio-pci-constify-ioreadx-iomem-argument-as-in-generic-implementation.patch
* bits-add-tests-of-genmask.patch
* bits-add-tests-of-genmask-fix.patch
* bits-add-tests-of-genmask-fix-2.patch
* checkpatch-add-test-for-possible-misuse-of-is_enabled-without-config_.patch
* checkpatch-support-deprecated-terms-checking.patch
* scripts-deprecated_terms-recommend-denylist-allowlist-instead-of-blacklist-whitelist.patch
* checkpatch-add-fix-option-for-assign_in_if.patch
* checkpatch-fix-const_struct-when-const_structscheckpatch-is-missing.patch
* fs-minix-check-return-value-of-sb_getblk.patch
* fs-minix-dont-allow-getting-deleted-inodes.patch
* fs-minix-reject-too-large-maximum-file-size.patch
* fs-minix-set-s_maxbytes-correctly.patch
* fs-minix-fix-block-limit-check-for-v1-filesystems.patch
* fs-minix-remove-expected-error-message-in-block_to_path.patch
* fatfs-switch-write_lock-to-read_lock-in-fat_ioctl_get_attributes.patch
* vfat-fat-msdos-filesystem-replace-http-links-with-https-ones.patch
* fs-signalfdc-fix-inconsistent-return-codes-for-signalfd4.patch
* selftests-kmod-use-variable-name-in-kmod_test_0001.patch
* kmod-remove-redundant-be-an-in-the-comment.patch
* test_kmod-avoid-potential-double-free-in-trigger_config_run_type.patch
* coredump-add-%f-for-executable-filename.patch
* exec-change-uselib2-is_sreg-failure-to-eacces.patch
* exec-move-s_isreg-check-earlier.patch
* exec-move-path_noexec-check-earlier.patch
* kdump-append-kernel-build-id-string-to-vmcoreinfo.patch
* rapidio-rio_mport_cdev-use-struct_size-helper.patch
* rapidio-use-struct_size-helper.patch
* kernel-panicc-make-oops_may_print-return-bool.patch
* lib-kconfigdebug-fix-typo-in-the-help-text-of-config_panic_timeout.patch
* aio-simplify-read_events.patch
* kcov-unconditionally-add-fno-stack-protector-to-compiler-options.patch
* kcov-make-some-symbols-static.patch
linux-next.patch
linux-next-rejects.patch
* mm-madvise-pass-task-and-mm-to-do_madvise.patch
* pid-move-pidfd_get_pid-to-pidc.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
* mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
* mm-madvise-check-fatal-signal-pending-of-target-process.patch
* all-arch-remove-system-call-sys_sysctl.patch
* all-arch-remove-system-call-sys_sysctl-fix.patch
* mm-kmemleak-silence-kcsan-splats-in-checksum.patch
* mm-frontswap-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races.patch
* mm-page_io-mark-various-intentional-data-races-v2.patch
* mm-swap_state-mark-various-intentional-data-races.patch
* mm-filemap-fix-a-data-race-in-filemap_fault.patch
* mm-swapfile-fix-and-annotate-various-data-races.patch
* mm-swapfile-fix-and-annotate-various-data-races-v2.patch
* mm-page_counter-fix-various-data-races-at-memsw.patch
* mm-memcontrol-fix-a-data-race-in-scan-count.patch
* mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
* mm-mempool-fix-a-data-race-in-mempool_free.patch
* mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
* mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
* mm-annotate-a-data-race-in-page_zonenum.patch
* include-asm-generic-vmlinuxldsh-align-ro_after_init.patch
* sh-clkfwk-remove-r8-r16-r32.patch
* sh-use-generic-strncpy.patch
* sh-add-missing-export_symbol-for-__delay.patch
make-sure-nobodys-leaking-resources.patch
releasing-resources-with-children.patch
mutex-subsystem-synchro-test-module.patch
kernel-forkc-export-kernel_thread-to-modules.patch
workaround-for-a-pci-restoring-bug.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (117 preceding siblings ...)
2020-07-10 4:00 ` mmotm 2020-07-09-21-00 uploaded Andrew Morton
@ 2020-07-10 23:27 ` Andrew Morton
2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
` (113 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:27 UTC (permalink / raw)
To: guro, jonathan.cameron, mike.kravetz, mm-commits, rppt,
song.bao.hua, stable
The patch titled
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been removed from the -mm tree. Its filename was
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
hugetlb_cma[0] can be NULL for various reasons; for example, node 0 may have
no memory. So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is not
enabled: gigantic pages might have been reserved on other nodes.
Mike Kravetz said:
: Based on the code changes, I believe the following could happen:
: - Someone uses 'hugetlb_cma=' kernel command line parameter to reserve
: CMA for gigantic pages.
: - The system topology is such that no memory is on node 0. Therefore,
: no CMA can be reserved for gigantic pages on node 0. CMA is reserved
: on other nodes.
: - The user also specifies a number of gigantic pages to pre-allocate on
: the command line with hugepagesz=<gigantic_page_size> hugepages=<N>
: - The routine which allocates gigantic pages from the bootmem allocator
: will not detect CMA has been reserved as there is no memory on node 0.
: Therefore, pages will be pre-allocated from bootmem allocator as well
: as reserved in CMA.
:
: This double allocation (bootmem and CMA) is the worst case scenario. Not
: sure if this is what Barry saw, and I suspect this would rarely happen.
Link: http://lkml.kernel.org/r/20200707040204.30132-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable
+++ a/mm/hugetlb.c
@@ -2546,6 +2546,20 @@ static void __init gather_bootmem_preall
}
}
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+ int node;
+
+ for_each_online_node(node) {
+ if (hugetlb_cma[node])
+ return true;
+ }
+#endif
+
+ return false;
+}
+
static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
{
unsigned long i;
@@ -2571,7 +2585,7 @@ static void __init hugetlb_hstate_alloc_
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+ if (hugetlb_cma_enabled()) {
pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
break;
}
_
Patches currently in -mm which might be from song.bao.hua@hisilicon.com are
mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (118 preceding siblings ...)
2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree Andrew Morton
@ 2020-07-10 23:29 ` Andrew Morton
2020-07-10 23:32 ` + proc-sysctl-make-protected_-world-readable.patch " Andrew Morton
` (112 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:29 UTC (permalink / raw)
To: guro, jonathan.cameron, mike.kravetz, mm-commits, song.bao.hua, stable
The patch titled
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been added to the -mm tree. Its filename is
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
hugetlb_cma[0] can be NULL for various reasons; for example, node 0 may have
no memory. So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is not
enabled: gigantic pages might have been reserved on other nodes. This patch
fixes a possible double reservation and CMA leak.
Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled
+++ a/mm/hugetlb.c
@@ -46,6 +46,7 @@ unsigned int default_hstate_idx;
struct hstate hstates[HUGE_MAX_HSTATE];
static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size __initdata;
/*
* Minimum page order among possible hugepage sizes, set to a proper value
@@ -2571,7 +2572,7 @@ static void __init hugetlb_hstate_alloc_
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+ if (hugetlb_cma_size) {
pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
break;
}
@@ -5654,7 +5655,6 @@ void move_hugetlb_state(struct page *old
}
#ifdef CONFIG_CMA
-static unsigned long hugetlb_cma_size __initdata;
static bool cma_reserve_called __initdata;
static int __init cmdline_parse_hugetlb_cma(char *p)
_
Patches currently in -mm which might be from song.bao.hua@hisilicon.com are
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + proc-sysctl-make-protected_-world-readable.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (119 preceding siblings ...)
2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
@ 2020-07-10 23:32 ` Andrew Morton
2020-07-10 23:32 ` [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from " Andrew Morton
` (111 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:32 UTC (permalink / raw)
To: jpitti, keescook, mcgrof, mingo, mm-commits, viro, yzaikin
The patch titled
Subject: proc/sysctl: make protected_* world readable
has been added to the -mm tree. Its filename is
proc-sysctl-make-protected_-world-readable.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/proc-sysctl-make-protected_-world-readable.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/proc-sysctl-make-protected_-world-readable.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Julius Hemanth Pitti <jpitti@cisco.com>
Subject: proc/sysctl: make protected_* world readable
The protected_* files have 0600 permissions, which prevents non-superusers
from reading them.
Containers like "AWS greengrass" refuse to launch unless protected_hardlinks
and protected_symlinks are set. When such containers run with "userns-remap"
or "--user" mapping the container's root to a non-superuser on the host,
they fail to start because read access to these files is denied.
As these protections are hardly a secret and do not pose any security risk,
make them world readable.
Though the greengrass use case above only needs read access to the
protected_hardlinks and protected_symlinks files, set all other protected_*
files to 0644 as well for consistency.
Link: http://lkml.kernel.org/r/20200709235115.56954-1-jpitti@cisco.com
Fixes: 800179c9b8a1 ("fs: add link restrictions")
Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/sysctl.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/kernel/sysctl.c~proc-sysctl-make-protected_-world-readable
+++ a/kernel/sysctl.c
@@ -3232,7 +3232,7 @@ static struct ctl_table fs_table[] = {
.procname = "protected_symlinks",
.data = &sysctl_protected_symlinks,
.maxlen = sizeof(int),
- .mode = 0600,
+ .mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
@@ -3241,7 +3241,7 @@ static struct ctl_table fs_table[] = {
.procname = "protected_hardlinks",
.data = &sysctl_protected_hardlinks,
.maxlen = sizeof(int),
- .mode = 0600,
+ .mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
@@ -3250,7 +3250,7 @@ static struct ctl_table fs_table[] = {
.procname = "protected_fifos",
.data = &sysctl_protected_fifos,
.maxlen = sizeof(int),
- .mode = 0600,
+ .mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = &two,
@@ -3259,7 +3259,7 @@ static struct ctl_table fs_table[] = {
.procname = "protected_regular",
.data = &sysctl_protected_regular,
.maxlen = sizeof(int),
- .mode = 0600,
+ .mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = &two,
_
Patches currently in -mm which might be from jpitti@cisco.com are
proc-sysctl-make-protected_-world-readable.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (120 preceding siblings ...)
2020-07-10 23:32 ` + proc-sysctl-make-protected_-world-readable.patch " Andrew Morton
@ 2020-07-10 23:32 ` Andrew Morton
2020-07-10 23:35 ` + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to " Andrew Morton
` (110 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:32 UTC (permalink / raw)
To: hdanton, mhocko, mm-commits, oleksiy.avramchenko, rostedt, urezki, willy
The patch titled
Subject: mm/vmalloc.c: add an error message if two areas overlap
has been removed from the -mm tree. Its filename was
mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: mm/vmalloc.c: add an error message if two areas overlap
Before triggering a BUG() it would be useful to understand how the two areas
overlap. Print the addresses of both VAs together with their start/end
ranges. For example, if both are identical it could mean a double free.
Link: http://lkml.kernel.org/r/20200710194443.2984-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/vmalloc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/mm/vmalloc.c~mm-vmallocc-add-an-error-message-if-two-areas-overlap
+++ a/mm/vmalloc.c
@@ -550,8 +550,13 @@ find_va_links(struct vmap_area *va,
else if (va->va_end > tmp_va->va_start &&
va->va_start >= tmp_va->va_end)
link = &(*link)->rb_right;
- else
+ else {
+ pr_err("Overlaps: 0x%px(0x%lx-0x%lx), 0x%px(0x%lx-0x%lx)\n",
+ va, va->va_start, va->va_end, tmp_va,
+ tmp_va->va_start, tmp_va->va_end);
+
BUG();
+ }
} while (*link);
*parent = &tmp_va->rb_node;
_
Patches currently in -mm which might be from urezki@gmail.com are
mm-vmalloc-simplify-merge_or_add_vmap_area-func.patch
mm-vmalloc-simplify-augment_tree_propagate_check-func.patch
mm-vmalloc-switch-to-propagate-callback.patch
mm-vmalloc-update-the-header-about-kva-rework.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (121 preceding siblings ...)
2020-07-10 23:32 ` [to-be-updated] mm-vmallocc-add-an-error-message-if-two-areas-overlap.patch removed from " Andrew Morton
@ 2020-07-10 23:35 ` Andrew Morton
2020-07-14 0:19 ` + mm-vmscan-consistent-update-to-pgrefill.patch " Andrew Morton
` (109 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-10 23:35 UTC (permalink / raw)
To: alex.bou9, gustavoars, keescook, mm-commits, mporter
The patch titled
Subject: rapidio/rio_mport_cdev: Use array_size() helper in copy_{from,to}_user()
has been added to the -mm tree. Its filename is
rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Subject: rapidio/rio_mport_cdev: Use array_size() helper in copy_{from,to}_user()
Use the array_size() helper instead of the open-coded multiplication in
copy_{from,to}_user(). These sorts of multiplication factors need to be
wrapped in array_size().
This issue was found with the help of Coccinelle and audited and fixed
manually.
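For context, a minimal sketch of the difference (not from the patch;
array_size() comes from the kernel's <linux/overflow.h> and saturates to
SIZE_MAX when the multiplication would overflow):

#include <linux/overflow.h>

/*
 * An open-coded multiplication can wrap around and silently shrink the
 * length passed to copy_{from,to}_user(); array_size() saturates to
 * SIZE_MAX instead, so the oversized copy fails and the caller returns
 * -EFAULT rather than operating on a truncated buffer.
 */
size_t unsafe_len = transaction.count * sizeof(*transfer);           /* may wrap */
size_t safe_len = array_size(sizeof(*transfer), transaction.count);  /* SIZE_MAX on overflow */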
Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Link: http://lkml.kernel.org/r/20200616183050.GA31840@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -981,7 +981,7 @@ static int rio_mport_transfer_ioctl(stru
if (unlikely(copy_from_user(transfer,
(void __user *)(uintptr_t)transaction.block,
- transaction.count * sizeof(*transfer)))) {
+ array_size(sizeof(*transfer), transaction.count)))) {
ret = -EFAULT;
goto out_free;
}
@@ -994,7 +994,7 @@ static int rio_mport_transfer_ioctl(stru
if (unlikely(copy_to_user((void __user *)(uintptr_t)transaction.block,
transfer,
- transaction.count * sizeof(*transfer))))
+ array_size(sizeof(*transfer), transaction.count))))
ret = -EFAULT;
out_free:
_
Patches currently in -mm which might be from gustavoars@kernel.org are
rapidio-rio_mport_cdev-use-struct_size-helper.patch
rapidio-use-struct_size-helper.patch
rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-vmscan-consistent-update-to-pgrefill.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (122 preceding siblings ...)
2020-07-10 23:35 ` + rapidio-rio_mport_cdev-use-array_size-helper-in-copy_fromto_user.patch added to " Andrew Morton
@ 2020-07-14 0:19 ` Andrew Morton
2020-07-14 0:24 ` + mm-handle-page-mapping-better-in-dump_page-fix.patch " Andrew Morton
` (108 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14 0:19 UTC (permalink / raw)
To: chris, guro, hannes, laoar.shao, mhocko, mm-commits, shakeelb
The patch titled
Subject: mm: vmscan: consistent update to pgrefill
has been added to the -mm tree. Its filename is
mm-vmscan-consistent-update-to-pgrefill.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-consistent-update-to-pgrefill.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-consistent-update-to-pgrefill.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Shakeel Butt <shakeelb@google.com>
Subject: mm: vmscan: consistent update to pgrefill
The vmstat pgrefill counter is useful together with the pgscan and pgsteal
stats to measure reclaim efficiency. However, vmstat's pgrefill is not
updated consistently at the system level: it gets updated for both global
and memcg reclaim, whereas pgscan and pgsteal are updated only for global
reclaim. So, update pgrefill only for global reclaim as well. If someone is
interested in stats covering both system-level and memcg-level reclaim, they
should consult the root memcg's memory.stat instead of /proc/vmstat.
Link: http://lkml.kernel.org/r/20200711011459.1159929-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/vmscan.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-vmscan-consistent-update-to-pgrefill
+++ a/mm/vmscan.c
@@ -2030,7 +2030,8 @@ static void shrink_active_list(unsigned
__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
- __count_vm_events(PGREFILL, nr_scanned);
+ if (!cgroup_reclaim(sc))
+ __count_vm_events(PGREFILL, nr_scanned);
__count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned);
spin_unlock_irq(&pgdat->lru_lock);
_
Patches currently in -mm which might be from shakeelb@google.com are
mm-memcontrol-account-kernel-stack-per-node.patch
mm-vmscan-consistent-update-to-pgrefill.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + mm-handle-page-mapping-better-in-dump_page-fix.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (123 preceding siblings ...)
2020-07-14 0:19 ` + mm-vmscan-consistent-update-to-pgrefill.patch " Andrew Morton
@ 2020-07-14 0:24 ` Andrew Morton
2020-07-14 0:31 ` + tmpfs-per-superblock-i_ino-support.patch " Andrew Morton
` (107 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14 0:24 UTC (permalink / raw)
To: akpm, jhubbard, kirill, mm-commits, rppt, vbabka,
william.kucharski, willy
The patch titled
Subject: mm-handle-page-mapping-better-in-dump_page-fix
has been added to the -mm tree. Its filename is
mm-handle-page-mapping-better-in-dump_page-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-handle-page-mapping-better-in-dump_page-fix.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-handle-page-mapping-better-in-dump_page-fix.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-handle-page-mapping-better-in-dump_page-fix
augmented code comment from John
Link: http://lkml.kernel.org/r/15cff11a-6762-8a6a-3f0e-dd227280cd6f@nvidia.com
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/debug.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/mm/debug.c~mm-handle-page-mapping-better-in-dump_page-fix
+++ a/mm/debug.c
@@ -69,7 +69,13 @@ void __dump_page(struct page *page, cons
}
if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
- /* Corrupt page, cannot call page_mapping */
+ /*
+ * Corrupt page, so we cannot call page_mapping. Instead, do a
+ * safe subset of the steps that page_mapping() does. Caution:
+ * this will be misleading for tail pages, PageSwapCache pages,
+ * and potentially other situations. (See the page_mapping()
+ * implementation for what's missing here.)
+ */
unsigned long tmp = (unsigned long)page->mapping;
if (tmp & PAGE_MAPPING_ANON)
_
Patches currently in -mm which might be from akpm@linux-foundation.org are
mm-close-race-between-munmap-and-expand_upwards-downwards-fix.patch
mm.patch
mm-handle-page-mapping-better-in-dump_page-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix.patch
mm-memcg-percpu-account-percpu-memory-to-memory-cgroups-fix-fix.patch
mm-vmstat-add-events-for-thp-migration-without-split-fix.patch
linux-next-rejects.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix.patch
mm-madvise-introduce-process_madvise-syscall-an-external-memory-hinting-api-fix-2.patch
kernel-forkc-export-kernel_thread-to-modules.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + tmpfs-per-superblock-i_ino-support.patch added to -mm tree
2020-07-03 22:14 incoming Andrew Morton
` (124 preceding siblings ...)
2020-07-14 0:24 ` + mm-handle-page-mapping-better-in-dump_page-fix.patch " Andrew Morton
@ 2020-07-14 0:31 ` Andrew Morton
2020-07-14 0:31 ` + tmpfs-support-64-bit-inums-per-sb.patch " Andrew Morton
` (106 subsequent siblings)
232 siblings, 0 replies; 247+ messages in thread
From: Andrew Morton @ 2020-07-14 0:31 UTC (permalink / raw)
To: amir73il, chris, hannes, hughd, jlayton, mm-commits, tj, viro, willy
The patch titled
Subject: tmpfs: per-superblock i_ino support
has been added to the -mm tree. Its filename is
tmpfs-per-superblock-i_ino-support.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/tmpfs-per-superblock-i_ino-support.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-per-superblock-i_ino-support.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: tmpfs: per-superblock i_ino support
Patch series "tmpfs: inode: Reduce risk of inum overflow", v7.
In Facebook production we are seeing heavy i_ino wraparounds on tmpfs. On
affected tiers, in excess of 10% of hosts show multiple files with
different content and the same inode number, with some servers even having
as many as 150 duplicated inode numbers with differing file content.
This causes actual, tangible problems in production. For example, we have
complaints from those working on remote caches that their application
reports cache corruption: it uses (device, inodenum) to establish the
identity of a particular cache object, and because that pair is no longer
unique, the application refuses to continue. Even worse, sometimes
applications may not detect the corruption at all and continue anyway,
causing phantom and hard-to-debug behaviour.
In general, userspace applications expect that (device, inodenum) should be
enough to uniquely point to one inode, which seems fair enough (see the
sketch after the list below). One might also need to check the generation,
but in this case:
1. That's not currently exposed to userspace
(ioctl(...FS_IOC_GETVERSION...) returns ENOTTY on tmpfs);
2. Even with generation, there shouldn't be two live inodes with the
same inode number on one device.
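For illustration (not part of the patch), this is the sort of
(device, inode number) identity check that userspace caches typically rely
on, and that silently breaks once inode numbers are reused:

#include <stdbool.h>
#include <sys/stat.h>

/*
 * Two stat results are assumed to name the same object iff the
 * (st_dev, st_ino) pair matches.  After an i_ino wraparound, two live
 * files with different content can pass this check.
 */
static bool same_object(const struct stat *a, const struct stat *b)
{
        return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}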
In order to mitigate this, we take a two-pronged approach:
1. Moving inum generation from being global to per-sb for tmpfs. This
itself allows some reduction in i_ino churn. This works on both 64-
and 32- bit machines.
2. Adding inode{64,32} for tmpfs. This fix is supported on machines with
64-bit ino_t only: we allow users to mount tmpfs with a new inode64
option that uses the full width of ino_t, or CONFIG_TMPFS_INODE64.
You can see how this compares to previous related patches which didn't
implement this per-superblock:
- https://patchwork.kernel.org/patch/11254001/
- https://patchwork.kernel.org/patch/11023915/
This patch (of 2):
get_next_ino has a number of problems:
- It uses and returns a uint, which is susceptible to overflow if a lot
of volatile inodes that use get_next_ino are created.
- It's global, with no specificity per-sb or even per-filesystem. This
means it's not that difficult to cause inode number wraparounds on a
single device, which can result in having multiple distinct inodes
with the same inode number.
This patch adds a per-superblock counter that mitigates the second case.
This design also allows us to later have a specific i_ino size per-device,
for example, allowing users to choose whether to use 32- or 64-bit inodes
for each tmpfs mount. This is implemented in the next commit.
For internal shmem mounts, which may be less tolerant of spinlock delays, we
implement a percpu batching scheme which only takes the stat_lock at each
batch boundary.
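A rough worked example (not from the changelog, numbers only illustrative of
the batching below): with the batch size of 1024 used in the patch,
allocating 1,000,000 inode numbers on an internal mount takes the stat_lock
roughly 1,000,000 / 1024 ~= 977 times (plus at most one extra grab per CPU),
instead of once per allocation.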
Link: http://lkml.kernel.org/r/cover.1594661218.git.chris@chrisdown.name
Link: http://lkml.kernel.org/r/1986b9d63b986f08ec07a4aa4b2275e718e47d8a.1594661218.git.chris@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/fs.h | 15 ++++++++
include/linux/shmem_fs.h | 2 +
mm/shmem.c | 66 ++++++++++++++++++++++++++++++++++---
3 files changed, 78 insertions(+), 5 deletions(-)
--- a/include/linux/fs.h~tmpfs-per-superblock-i_ino-support
+++ a/include/linux/fs.h
@@ -3098,6 +3098,21 @@ extern void discard_new_inode(struct ino
extern unsigned int get_next_ino(void);
extern void evict_inodes(struct super_block *sb);
+/*
+ * Userspace may rely on the the inode number being non-zero. For example, glibc
+ * simply ignores files with zero i_ino in unlink() and other places.
+ *
+ * As an additional complication, if userspace was compiled with
+ * _FILE_OFFSET_BITS=32 on a 64-bit kernel we'll only end up reading out the
+ * lower 32 bits, so we need to check that those aren't zero explicitly. With
+ * _FILE_OFFSET_BITS=64, this may cause some harmless false-negatives, but
+ * better safe than sorry.
+ */
+static inline bool is_zero_ino(ino_t ino)
+{
+ return (u32)ino == 0;
+}
+
extern void __iget(struct inode * inode);
extern void iget_failed(struct inode *);
extern void clear_inode(struct inode *);
--- a/include/linux/shmem_fs.h~tmpfs-per-superblock-i_ino-support
+++ a/include/linux/shmem_fs.h
@@ -36,6 +36,8 @@ struct shmem_sb_info {
unsigned char huge; /* Whether to try for hugepages */
kuid_t uid; /* Mount uid for root directory */
kgid_t gid; /* Mount gid for root directory */
+ ino_t next_ino; /* The next per-sb inode number to use */
+ ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */
struct mempolicy *mpol; /* default memory policy for mappings */
spinlock_t shrinklist_lock; /* Protects shrinklist */
struct list_head shrinklist; /* List of shinkable inodes */
--- a/mm/shmem.c~tmpfs-per-superblock-i_ino-support
+++ a/mm/shmem.c
@@ -260,18 +260,67 @@ bool vma_is_shmem(struct vm_area_struct
static LIST_HEAD(shmem_swaplist);
static DEFINE_MUTEX(shmem_swaplist_mutex);
-static int shmem_reserve_inode(struct super_block *sb)
+/*
+ * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and
+ * produces a novel ino for the newly allocated inode.
+ *
+ * It may also be called when making a hard link to permit the space needed by
+ * each dentry. However, in that case, no new inode number is needed since that
+ * internally draws from another pool of inode numbers (currently global
+ * get_next_ino()). This case is indicated by passing NULL as inop.
+ */
+#define SHMEM_INO_BATCH 1024
+static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
{
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
- if (sbinfo->max_inodes) {
+ ino_t ino;
+
+ if (!(sb->s_flags & SB_KERNMOUNT)) {
spin_lock(&sbinfo->stat_lock);
if (!sbinfo->free_inodes) {
spin_unlock(&sbinfo->stat_lock);
return -ENOSPC;
}
sbinfo->free_inodes--;
+ if (inop) {
+ ino = sbinfo->next_ino++;
+ if (unlikely(is_zero_ino(ino)))
+ ino = sbinfo->next_ino++;
+ if (unlikely(ino > UINT_MAX)) {
+ /*
+ * Emulate get_next_ino uint wraparound for
+ * compatibility
+ */
+ ino = 1;
+ }
+ *inop = ino;
+ }
spin_unlock(&sbinfo->stat_lock);
+ } else if (inop) {
+ /*
+ * __shmem_file_setup, one of our callers, is lock-free: it
+ * doesn't hold stat_lock in shmem_reserve_inode since
+ * max_inodes is always 0, and is called from potentially
+ * unknown contexts. As such, use a per-cpu batched allocator
+ * which doesn't require the per-sb stat_lock unless we are at
+ * the batch boundary.
+ */
+ ino_t *next_ino;
+ next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu());
+ ino = *next_ino;
+ if (unlikely(ino % SHMEM_INO_BATCH == 0)) {
+ spin_lock(&sbinfo->stat_lock);
+ ino = sbinfo->next_ino;
+ sbinfo->next_ino += SHMEM_INO_BATCH;
+ spin_unlock(&sbinfo->stat_lock);
+ if (unlikely(is_zero_ino(ino)))
+ ino++;
+ }
+ *inop = ino;
+ *next_ino = ++ino;
+ put_cpu();
}
+
return 0;
}
@@ -2222,13 +2271,14 @@ static struct inode *shmem_get_inode(str
struct inode *inode;
struct shmem_inode_info *info;
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+ ino_t ino;
- if (shmem_reserve_inode(sb))
+ if (shmem_reserve_inode(sb, &ino))
return NULL;
inode = new_inode(sb);
if (inode) {
- inode->i_ino = get_next_ino();
+ inode->i_ino = ino;
inode_init_owner(inode, dir, mode);
inode->i_blocks = 0;
inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
@@ -2932,7 +2982,7 @@ static int shmem_link(struct dentry *old
* first link must skip that, to get the accounting right.
*/
if (inode->i_nlink) {
- ret = shmem_reserve_inode(inode->i_sb);
+ ret = shmem_reserve_inode(inode->i_sb, NULL);
if (ret)
goto out;
}
@@ -3584,6 +3634,7 @@ static void shmem_put_super(struct super
{
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+ free_percpu(sbinfo->ino_batch);
percpu_counter_destroy(&sbinfo->used_blocks);
mpol_put(sbinfo->mpol);
kfree(sbinfo);
@@ -3626,6 +3677,11 @@ static int shmem_fill_super(struct super
#endif
sbinfo->max_blocks = ctx->blocks;
sbinfo->free_inodes = sbinfo->max_inodes = ctx->inodes;
+ if (sb->s_flags & SB_KERNMOUNT) {
+ sbinfo->ino_batch = alloc_percpu(ino_t);
+ if (!sbinfo->ino_batch)
+ goto failed;
+ }
sbinfo->uid = ctx->uid;
sbinfo->gid = ctx->gid;
sbinfo->mode = ctx->mode;
_
Patches currently in -mm which might be from chris@chrisdown.name are
tmpfs-per-superblock-i_ino-support.patch
tmpfs-support-64-bit-inums-per-sb.patch
^ permalink raw reply [flat|nested] 247+ messages in thread
* + tmpfs-support-64-bit-inums-per-