* [patch 1/9] kfence: fix memory leak when cat kfence objects
2021-12-25 5:11 incoming Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions Andrew Morton
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, dvyukov, elver, glider, hulkci, libaokun1, linux-mm,
mm-commits, torvalds, wangkefeng.wang, yukuai3
From: Baokun Li <libaokun1@huawei.com>
Subject: kfence: fix memory leak when cat kfence objects
Hulk robot reported a kmemleak problem:
-----------------------------------------------------------------------
unreferenced object 0xffff93d1d8cc02e8 (size 248):
comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
hex dump (first 32 bytes):
00 40 85 19 d4 93 ff ff 00 10 00 00 00 00 00 00 .@..............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000db5610b3>] seq_open+0x2a/0x80
[<00000000d66ac99d>] full_proxy_open+0x167/0x1e0
[<00000000d58ef917>] do_dentry_open+0x1e1/0x3a0
[<0000000016c91867>] path_openat+0x961/0xa20
[<00000000909c9564>] do_filp_open+0xae/0x120
[<0000000059c761e6>] do_sys_openat2+0x216/0x2f0
[<00000000b7a7b239>] do_sys_open+0x57/0x80
[<00000000e559d671>] do_syscall_64+0x33/0x40
[<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff93d419854000 (size 4096):
comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
hex dump (first 32 bytes):
6b 66 65 6e 63 65 2d 23 32 35 30 3a 20 30 78 30 kfence-#250: 0x0
30 30 30 30 30 30 30 37 35 34 62 64 61 31 32 2d 0000000754bda12-
backtrace:
[<000000008162c6f2>] seq_read_iter+0x313/0x440
[<0000000020b1b3e3>] seq_read+0x14b/0x1a0
[<00000000af248fbc>] full_proxy_read+0x56/0x80
[<00000000f97679d1>] vfs_read+0xa5/0x1b0
[<000000000ed8a36f>] ksys_read+0xa0/0xf0
[<00000000e559d671>] do_syscall_64+0x33/0x40
[<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
-----------------------------------------------------------------------
I find that we can easily reproduce this problem with the following
commands:
`cat /sys/kernel/debug/kfence/objects`
`echo scan > /sys/kernel/debug/kmemleak`
`cat /sys/kernel/debug/kmemleak`
The leaked memory is allocated in the stack below:
----------------------------------
do_syscall_64
do_sys_open
do_dentry_open
full_proxy_open
seq_open ---> alloc seq_file
vfs_read
full_proxy_read
seq_read
seq_read_iter
traverse ---> alloc seq_buf
----------------------------------
And it should have been released in the following process:
----------------------------------
do_syscall_64
syscall_exit_to_user_mode
exit_to_user_mode_prepare
task_work_run
____fput
__fput
full_proxy_release ---> free here
----------------------------------
However, kfence's file_operations for the objects file does not set a
.release callback, so nothing frees the seq_file and its buffer when the
file is closed. As a result, a memory leak occurs. Fix this by setting
.release to seq_release.
Link: https://lkml.kernel.org/r/20211206133628.2822545-1-libaokun1@huawei.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/kfence/core.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/kfence/core.c~kfence-fix-memory-leak-when-cat-kfence-objects
+++ a/mm/kfence/core.c
@@ -683,6 +683,7 @@ static const struct file_operations obje
.open = open_objects,
.read = seq_read,
.llseek = seq_lseek,
+ .release = seq_release,
};
static int __init kfence_debugfs_init(void)
_
^ permalink raw reply [flat|nested] 10+ messages in thread
* [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions
2021-12-25 5:11 incoming Andrew Morton
2021-12-25 5:12 ` [patch 1/9] kfence: fix memory leak when cat kfence objects Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning Andrew Morton
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: aarcange, akpm, arbn, linux-mm, mgorman, mhocko, mm-commits,
rientjes, stable, torvalds
From: Andrey Ryabinin <arbn@yandex-team.com>
Subject: mm: mempolicy: fix THP allocations escaping mempolicy restrictions
alloc_pages_vma() may try to allocate THP page on the local NUMA node
first:
page = __alloc_pages_node(hpage_node,
gfp | __GFP_THISNODE | __GFP_NORETRY, order);
And if the allocation fails it retries allowing remote memory:
if (!page && (gfp & __GFP_DIRECT_RECLAIM))
page = __alloc_pages_node(hpage_node,
gfp, order);
However, this retry allocation completely ignores the memory policy
nodemask, allowing the allocation to escape the policy's restrictions.
The bug seems to have first appeared in commit ac5b2c18911f
("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings").
It disappeared later in commit 89c83fb539f9
("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
and reappeared in a slightly different form in commit 76e654cc91bb
("mm, page_alloc: allow hugepage fallback to remote nodes when madvised").
Fix this by passing the correct nodemask to the __alloc_pages() call.
The demonstration/reproducer of the problem:
$ mount -oremount,size=4G,huge=always /dev/shm/
$ echo always > /sys/kernel/mm/transparent_hugepage/defrag
$ cat mbind_thp.c
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <numaif.h>
#define SIZE (2ULL << 30)
int main(int argc, char **argv)
{
int fd;
unsigned long long i;
char *addr;
pid_t pid;
char buf[100];
unsigned long nodemask = 1;
fd = open("/dev/shm/test", O_RDWR|O_CREAT, 0600);
assert(fd >= 0);
assert(ftruncate(fd, SIZE) == 0);
addr = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
MAP_SHARED, fd, 0);
assert(mbind(addr, SIZE, MPOL_BIND, &nodemask, 2, MPOL_MF_STRICT|MPOL_MF_MOVE)==0);
for (i = 0; i < SIZE; i+=4096) {
addr[i] = 1;
}
pid = getpid();
snprintf(buf, sizeof(buf), "grep shm /proc/%d/numa_maps", pid);
system(buf);
sleep(10000);
return 0;
}
$ gcc mbind_thp.c -o mbind_thp -lnuma
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 1918 MB
node 0 free: 1595 MB
node 1 cpus: 1 3
node 1 size: 2014 MB
node 1 free: 1731 MB
node distances:
node 0 1
0: 10 20
1: 20 10
$ rm -f /dev/shm/test; taskset -c 0 ./mbind_thp
7fd970a00000 bind:0 file=/dev/shm/test dirty=524288 active=0 N0=396800 N1=127488 kernelpagesize_kB=4
Link: https://lkml.kernel.org/r/20211208165343.22349-1-arbn@yandex-team.com
Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/mempolicy.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-fix-thp-allocations-escaping-mempolicy-restrictions
+++ a/mm/mempolicy.c
@@ -2140,8 +2140,7 @@ struct page *alloc_pages_vma(gfp_t gfp,
* memory with both reclaim and compact as well.
*/
if (!page && (gfp & __GFP_DIRECT_RECLAIM))
- page = __alloc_pages_node(hpage_node,
- gfp, order);
+ page = __alloc_pages(gfp, order, hpage_node, nmask);
goto out;
}
_
* [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning
2021-12-25 5:11 incoming Andrew Morton
2021-12-25 5:12 ` [patch 1/9] kfence: fix memory leak when cat kfence objects Andrew Morton
2021-12-25 5:12 ` [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 4/9] MAINTAINERS: mark more list instances as moderated Andrew Morton
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: ahalaney, akpm, bhe, linux-mm, mm-commits, prudo, torvalds
From: Philipp Rudo <prudo@redhat.com>
Subject: kernel/crash_core: suppress unknown crashkernel parameter warning
When booting with crashkernel= on the kernel command line a warning
similar to
[ 0.038294] Kernel command line: ro console=ttyS0 crashkernel=256M
[ 0.038353] Unknown kernel command line parameters "crashkernel=256M", will be passed to user space.
is printed. This happens because crashkernel= is parsed independently of
the kernel parameter handling mechanism, so the code in init/main.c
doesn't know that crashkernel= is a valid kernel parameter and prints this
incorrect warning. Suppress the warning by adding a dummy early_param
handler for crashkernel=.
Link: https://lkml.kernel.org/r/20211208133443.6867-1-prudo@redhat.com
Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/crash_core.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/kernel/crash_core.c~kernel-crash_core-suppress-unknown-crashkernel-parameter-warning
+++ a/kernel/crash_core.c
@@ -6,6 +6,7 @@
#include <linux/buildid.h>
#include <linux/crash_core.h>
+#include <linux/init.h>
#include <linux/utsname.h>
#include <linux/vmalloc.h>
@@ -295,6 +296,16 @@ int __init parse_crashkernel_low(char *c
"crashkernel=", suffix_tbl[SUFFIX_LOW]);
}
+/*
+ * Add a dummy early_param handler to mark crashkernel= as a known command line
+ * parameter and suppress incorrect warnings in init/main.c.
+ */
+static int __init parse_crashkernel_dummy(char *arg)
+{
+ return 0;
+}
+early_param("crashkernel", parse_crashkernel_dummy);
+
Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
void *data, size_t data_len)
{
_
* [patch 4/9] MAINTAINERS: mark more list instances as moderated
2021-12-25 5:11 incoming Andrew Morton
` (2 preceding siblings ...)
2021-12-25 5:12 ` [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path Andrew Morton
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, alexandre.belloni, conor.culhane, jianjun.wang, linux-mm,
miquel.raynal, mm-commits, rdunlap, ryder.lee, torvalds
From: Randy Dunlap <rdunlap@infradead.org>
Subject: MAINTAINERS: mark more list instances as moderated
Some lists that are moderated are not marked as moderated consistently, so
mark them all as moderated.
Link: https://lkml.kernel.org/r/20211209001330.18558-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Conor Culhane <conor.culhane@silvaco.com>
Cc: Ryder Lee <ryder.lee@mediatek.com>
Cc: Jianjun Wang <jianjun.wang@mediatek.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
MAINTAINERS | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/MAINTAINERS~maintainers-mark-more-list-instances-as-moderated
+++ a/MAINTAINERS
@@ -14845,7 +14845,7 @@ PCIE DRIVER FOR MEDIATEK
M: Ryder Lee <ryder.lee@mediatek.com>
M: Jianjun Wang <jianjun.wang@mediatek.com>
L: linux-pci@vger.kernel.org
-L: linux-mediatek@lists.infradead.org
+L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
S: Supported
F: Documentation/devicetree/bindings/pci/mediatek*
F: drivers/pci/controller/*mediatek*
@@ -17423,7 +17423,7 @@ F: drivers/video/fbdev/sm712*
SILVACO I3C DUAL-ROLE MASTER
M: Miquel Raynal <miquel.raynal@bootlin.com>
M: Conor Culhane <conor.culhane@silvaco.com>
-L: linux-i3c@lists.infradead.org
+L: linux-i3c@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml
F: drivers/i3c/master/svc-i3c-master.c
_
* [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path
2021-12-25 5:11 incoming Andrew Morton
` (3 preceding siblings ...)
2021-12-25 5:12 ` [patch 4/9] MAINTAINERS: mark more list instances as moderated Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative() Andrew Morton
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, linux-mm, luofei, mike.kravetz, mm-commits,
naoya.horiguchi, stable, torvalds
From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: mm, hwpoison: fix condition in free hugetlb page path
When a memory error hits a tail page of a free hugepage,
__page_handle_poison() is expected to be called to isolate the error in
4kB units, but it is not called because of an outdated if-condition in
memory_failure_hugetlb(). This loses the chance to isolate the error at
the finer granularity, which is not optimal. Drop the condition.
This "(p != head && TestSetPageHWPoison(head))" condition is based on the
old semantics of PageHWPoison on hugepages (where the PG_hwpoison flag was
set on the subpage), so it's not necessary any more. Now that PG_hwpoison
is set on the head page for hugepages, concurrent error events on
different subpages in a single hugepage are prevented by
TestSetPageHWPoison(head) at the beginning of memory_failure_hugetlb().
So dropping the condition should not reopen the race window originally
mentioned in commit b985194c8c0a ("hwpoison, hugetlb:
lock_page/unlock_page does not match for handling a free hugepage").
[naoya.horiguchi@linux.dev: fix "HardwareCorrupted" counter]
Link: https://lkml.kernel.org/r/20211220084851.GA1460264@u2004
Link: https://lkml.kernel.org/r/20211210110208.879740-1-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Fei Luo <luofei@unicloud.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org> [5.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memory-failure.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
--- a/mm/memory-failure.c~mm-hwpoison-fix-condition-in-free-hugetlb-page-path
+++ a/mm/memory-failure.c
@@ -1470,17 +1470,12 @@ static int memory_failure_hugetlb(unsign
if (!(flags & MF_COUNT_INCREASED)) {
res = get_hwpoison_page(p, flags);
if (!res) {
- /*
- * Check "filter hit" and "race with other subpage."
- */
lock_page(head);
- if (PageHWPoison(head)) {
- if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
- || (p != head && TestSetPageHWPoison(head))) {
+ if (hwpoison_filter(p)) {
+ if (TestClearPageHWPoison(head))
num_poisoned_pages_dec();
- unlock_page(head);
- return 0;
- }
+ unlock_page(head);
+ return 0;
}
unlock_page(head);
res = MF_FAILED;
_
* [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative()
2021-12-25 5:11 incoming Andrew Morton
` (4 preceding siblings ...)
2021-12-25 5:12 ` [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid Andrew Morton
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, hch, hughd, kirill.shutemov, linux-mm, mm-commits, rppt,
torvalds, vbabka, william.kucharski, willy
From: Hugh Dickins <hughd@google.com>
Subject: mm: delete unsafe BUG from page_cache_add_speculative()
It is not easily reproducible, but on 5.16-rc I have several times hit the
VM_BUG_ON_PAGE(PageTail(page), page) in page_cache_add_speculative():
usually from filemap_get_read_batch() for an ext4 read, yesterday from
next_uptodate_page() from filemap_map_pages() for a shmem fault.
That BUG used to be placed where page_ref_add_unless() had succeeded, but
now it is placed before folio_ref_add_unless() is attempted: that is not
safe, since it is only the acquired reference which makes the page safe
from racing THP collapse or split.
We could keep the BUG, checking PageTail only when folio_ref_try_add_rcu()
has succeeded; but I don't think it adds much value - just delete it.
Link: https://lkml.kernel.org/r/8b98fc6f-3439-8614-c3f3-945c659a1aba@google.com
Fixes: 020853b6f5ea ("mm: Add folio_try_get_rcu()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/pagemap.h | 1 -
1 file changed, 1 deletion(-)
--- a/include/linux/pagemap.h~mm-delete-unsafe-bug-from-page_cache_add_speculative
+++ a/include/linux/pagemap.h
@@ -285,7 +285,6 @@ static inline struct inode *folio_inode(
static inline bool page_cache_add_speculative(struct page *page, int count)
{
- VM_BUG_ON_PAGE(PageTail(page), page);
return folio_ref_try_add_rcu((struct folio *)page, count);
}
_
* [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid
2021-12-25 5:11 incoming Andrew Morton
` (5 preceding siblings ...)
2021-12-25 5:12 ` [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative() Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock Andrew Morton
2021-12-25 5:12 ` [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page() Andrew Morton
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, danielmicay, keescook, levente, linux-mm, mm-commits,
thibaut.sautereau, torvalds
From: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Subject: mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid
The second parameter of alloc_pages_exact_nid() is the one indicating the
size of the memory pointed to by the returned pointer, so the function must
be annotated with __alloc_size(2) rather than __alloc_size(1).
Link: https://lkml.kernel.org/r/YbjEgwhn4bGblp//@coeus
Fixes: abd58f38dfb4 ("mm/page_alloc: add __alloc_size attributes for better bounds checking")
Signed-off-by: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Levente Polyak <levente@leventepolyak.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/gfp.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/gfp.h~mm-page_alloc-fix-__alloc_size-attribute-for-alloc_pages_exact_nid
+++ a/include/linux/gfp.h
@@ -624,7 +624,7 @@ extern unsigned long get_zeroed_page(gfp
void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
void free_pages_exact(void *virt, size_t size);
-__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(1);
+__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(2);
#define __get_free_page(gfp_mask) \
__get_free_pages((gfp_mask), 0)
_
* [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock
2021-12-25 5:11 incoming Andrew Morton
` (6 preceding siblings ...)
2021-12-25 5:12 ` [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
2021-12-25 5:12 ` [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page() Andrew Morton
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, linux-mm, mm-commits, sangwoob, sj, stable, torvalds
From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: protect targets destructions with kdamond_lock
The DAMON debugfs interface iterates over the current monitoring targets in
'dbgfs_target_ids_read()' while holding the corresponding 'kdamond_lock'.
However, it also destroys the monitoring targets in
'dbgfs_before_terminate()' without holding the lock. This can result in a
use-after-free bug. This commit avoids the race by protecting the
destruction with the corresponding 'kdamond_lock'.
Link: https://lkml.kernel.org/r/20211221094447.2241-1-sj@kernel.org
Reported-by: Sangwoo Bae <sangwoob@amazon.com>
Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/damon/dbgfs.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-protect-targets-destructions-with-kdamond_lock
+++ a/mm/damon/dbgfs.c
@@ -650,10 +650,12 @@ static void dbgfs_before_terminate(struc
if (!targetid_is_pid(ctx))
return;
+ mutex_lock(&ctx->kdamond_lock);
damon_for_each_target_safe(t, next, ctx) {
put_pid((struct pid *)t->id);
damon_destroy_target(t);
}
+ mutex_unlock(&ctx->kdamond_lock);
}
static struct damon_ctx *dbgfs_new_ctx(void)
_
* [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
2021-12-25 5:11 incoming Andrew Morton
` (7 preceding siblings ...)
2021-12-25 5:12 ` [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock Andrew Morton
@ 2021-12-25 5:12 ` Andrew Morton
8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25 5:12 UTC (permalink / raw)
To: akpm, hulkci, linux-mm, liushixin2, mm-commits, naoya.horiguchi,
osalvador, stable, torvalds
From: Liu Shixin <liushixin2@huawei.com>
Subject: mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
Hulk Robot reported a panic in put_page_testzero() when testing madvise()
with MADV_SOFT_OFFLINE. The BUG() is triggered when retrying
get_any_page(). This is because the MF_COUNT_INCREASED flag is kept for the
second try even though the refcount is not increased.
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:737!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 5 PID: 2135 Comm: sshd Tainted: G B 5.16.0-rc6-dirty #373
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:release_pages+0x53f/0x840
Code: 0c 01 4c 8d 60 ff e9 5b fb ff ff 48 c7 c6 d8 97 0c b3 4c 89 e7 48 83 05 0e 7b 3c 0c 01 e8 89 3d 04 00 48 83 05 11 7b 3c 0c 01 <0f> 0b 48 83 05 0f 7b 3c 0c 01 48 83 05 0f 7b 3c 0c 01 48 83 05f
RSP: 0018:ffffc900015a7bc0 EFLAGS: 00010002
RAX: 000000000000003e RBX: ffffffffbace04c8 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
RBP: ffff88817b9acd50 R08: 0000000000000000 R09: c0000000ffefffff
R10: 0000000000000001 R11: ffffc900015a79b0 R12: ffffea0005e1c900
R13: ffffea0005e1de88 R14: 000000000000001f R15: ffff888100071000
FS: 0000000000000000(0000) GS:ffff88842fb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f305e8de3d4 CR3: 000000017bb6f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
free_pages_and_swap_cache+0x64/0x80
tlb_flush_mmu+0x6f/0x220
unmap_page_range+0xe6c/0x12c0
unmap_single_vma+0x90/0x170
unmap_vmas+0xc4/0x180
exit_mmap+0xde/0x3a0
mmput+0xa3/0x250
do_exit+0x564/0x1470
do_group_exit+0x3b/0x100
__do_sys_exit_group+0x13/0x20
__x64_sys_exit_group+0x16/0x20
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f30625401d9
Code: Unable to access opcode bytes at RIP 0x7f30625401af.
RSP: 002b:00007ffe391b0c88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f30625401d9
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 00007f306283d838 R08: 000000000000003c R09: 00000000000000e7
R10: fffffffffffffe30 R11: 0000000000000246 R12: 00007f306283d838
R13: 00007f3062842e80 R14: 0000000000000000 R15: ffffaa4fb7932430
</TASK>
Modules linked in:
---[ end trace e99579b570fe0649 ]---
RIP: 0010:release_pages+0x53f/0x840
Code: 0c 01 4c 8d 60 ff e9 5b fb ff ff 48 c7 c6 d8 97 0c b3 4c 89 e7 48 83 05 0e 7b 3c 0c 01 e8 89 3d 04 00 48 83 05 11 7b 3c 0c 01 <0f> 0b 48 83 05 0f 7b 3c 0c 01 48 83 05 0f 7b 3c 0c 01 48 83 05f
RSP: 0018:ffffc900015a7bc0 EFLAGS: 00010002
RAX: 000000000000003e RBX: ffffffffbace04c8 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
RBP: ffff88817b9acd50 R08: 0000000000000000 R09: c0000000ffefffff
R10: 0000000000000001 R11: ffffc900015a79b0 R12: ffffea0005e1c900
R13: ffffea0005e1de88 R14: 000000000000001f R15: ffff888100071000
FS: 0000000000000000(0000) GS:ffff88842fb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f305e8de3d4 CR3: 000000017bb6f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Link: https://lkml.kernel.org/r/20211221074908.3910286-1-liushixin2@huawei.com
Fixes: b94e02822deb ("mm,hwpoison: try to narrow window race for free pages")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/memory-failure.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/memory-failure.c~mm-hwpoison-clear-mf_count_increased-before-retrying-get_any_page
+++ a/mm/memory-failure.c
@@ -2234,6 +2234,7 @@ retry:
} else if (ret == 0) {
if (soft_offline_free_page(page) && try_again) {
try_again = false;
+ flags &= ~MF_COUNT_INCREASED;
goto retry;
}
}
_