linux-mm.kvack.org archive mirror
* incoming
@ 2021-12-25  5:11 Andrew Morton
  2021-12-25  5:12 ` [patch 1/9] kfence: fix memory leak when cat kfence objects Andrew Morton
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:11 UTC
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

9 patches, based on bc491fb12513e79702c6f936c838f792b5389129.

Subsystems affected by this patch series:

  mm/kfence
  mm/mempolicy
  core-kernel
  MAINTAINERS
  mm/memory-failure
  mm/pagemap
  mm/pagealloc
  mm/damon
  mm/memory-failure

Subsystem: mm/kfence

    Baokun Li <libaokun1@huawei.com>:
      kfence: fix memory leak when cat kfence objects

Subsystem: mm/mempolicy

    Andrey Ryabinin <arbn@yandex-team.com>:
      mm: mempolicy: fix THP allocations escaping mempolicy restrictions

Subsystem: core-kernel

    Philipp Rudo <prudo@redhat.com>:
      kernel/crash_core: suppress unknown crashkernel parameter warning

Subsystem: MAINTAINERS

    Randy Dunlap <rdunlap@infradead.org>:
      MAINTAINERS: mark more list instances as moderated

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm, hwpoison: fix condition in free hugetlb page path

Subsystem: mm/pagemap

    Hugh Dickins <hughd@google.com>:
      mm: delete unsafe BUG from page_cache_add_speculative()

Subsystem: mm/pagealloc

    Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>:
      mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
      mm/damon/dbgfs: protect targets destructions with kdamond_lock

Subsystem: mm/memory-failure

    Liu Shixin <liushixin2@huawei.com>:
      mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()

 MAINTAINERS             |    4 ++--
 include/linux/gfp.h     |    2 +-
 include/linux/pagemap.h |    1 -
 kernel/crash_core.c     |   11 +++++++++++
 mm/damon/dbgfs.c        |    2 ++
 mm/kfence/core.c        |    1 +
 mm/memory-failure.c     |   14 +++++---------
 mm/mempolicy.c          |    3 +--
 8 files changed, 23 insertions(+), 15 deletions(-)




* [patch 1/9] kfence: fix memory leak when cat kfence objects
  2021-12-25  5:11 incoming Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions Andrew Morton
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, dvyukov, elver, glider, hulkci, libaokun1, linux-mm,
	mm-commits, torvalds, wangkefeng.wang, yukuai3

From: Baokun Li <libaokun1@huawei.com>
Subject: kfence: fix memory leak when cat kfence objects

Hulk robot reported a kmemleak problem:
-----------------------------------------------------------------------
unreferenced object 0xffff93d1d8cc02e8 (size 248):
  comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
  hex dump (first 32 bytes):
    00 40 85 19 d4 93 ff ff 00 10 00 00 00 00 00 00  .@..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000db5610b3>] seq_open+0x2a/0x80
    [<00000000d66ac99d>] full_proxy_open+0x167/0x1e0
    [<00000000d58ef917>] do_dentry_open+0x1e1/0x3a0
    [<0000000016c91867>] path_openat+0x961/0xa20
    [<00000000909c9564>] do_filp_open+0xae/0x120
    [<0000000059c761e6>] do_sys_openat2+0x216/0x2f0
    [<00000000b7a7b239>] do_sys_open+0x57/0x80
    [<00000000e559d671>] do_syscall_64+0x33/0x40
    [<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff93d419854000 (size 4096):
  comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
  hex dump (first 32 bytes):
    6b 66 65 6e 63 65 2d 23 32 35 30 3a 20 30 78 30  kfence-#250: 0x0
    30 30 30 30 30 30 30 37 35 34 62 64 61 31 32 2d  0000000754bda12-
  backtrace:
    [<000000008162c6f2>] seq_read_iter+0x313/0x440
    [<0000000020b1b3e3>] seq_read+0x14b/0x1a0
    [<00000000af248fbc>] full_proxy_read+0x56/0x80
    [<00000000f97679d1>] vfs_read+0xa5/0x1b0
    [<000000000ed8a36f>] ksys_read+0xa0/0xf0
    [<00000000e559d671>] do_syscall_64+0x33/0x40
    [<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
-----------------------------------------------------------------------

I find that we can easily reproduce this problem with the following
commands:
	`cat /sys/kernel/debug/kfence/objects`
	`echo scan > /sys/kernel/debug/kmemleak`
	`cat /sys/kernel/debug/kmemleak`

The leaked memory is allocated in the stack below:
----------------------------------
do_syscall_64
  do_sys_open
    do_dentry_open
      full_proxy_open
        seq_open            ---> alloc seq_file
  vfs_read
    full_proxy_read
      seq_read
        seq_read_iter
          traverse          ---> alloc seq_buf
----------------------------------

And it should have been released in the following process:
----------------------------------
do_syscall_64
  syscall_exit_to_user_mode
    exit_to_user_mode_prepare
      task_work_run
        ____fput
          __fput
            full_proxy_release  ---> free here
----------------------------------

However, the file_operations that kfence registers for this debugfs file
do not set a release callback, so the seq_file and its buffer are never
freed and the memory leaks.  Fix this by setting .release to
seq_release().

Link: https://lkml.kernel.org/r/20211206133628.2822545-1-libaokun1@huawei.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/kfence/core.c~kfence-fix-memory-leak-when-cat-kfence-objects
+++ a/mm/kfence/core.c
@@ -683,6 +683,7 @@ static const struct file_operations obje
 	.open = open_objects,
 	.read = seq_read,
 	.llseek = seq_lseek,
+	.release = seq_release,
 };
 
 static int __init kfence_debugfs_init(void)
_



* [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions
  2021-12-25  5:11 incoming Andrew Morton
  2021-12-25  5:12 ` [patch 1/9] kfence: fix memory leak when cat kfence objects Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning Andrew Morton
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: aarcange, akpm, arbn, linux-mm, mgorman, mhocko, mm-commits,
	rientjes, stable, torvalds

From: Andrey Ryabinin <arbn@yandex-team.com>
Subject: mm: mempolicy: fix THP allocations escaping mempolicy restrictions

alloc_pages_vma() may try to allocate a THP page on the local NUMA node
first:

	page = __alloc_pages_node(hpage_node,
		gfp | __GFP_THISNODE | __GFP_NORETRY, order);

If that allocation fails, it retries and allows remote memory:

	if (!page && (gfp & __GFP_DIRECT_RECLAIM))
		page = __alloc_pages_node(hpage_node,
					gfp, order);

However, this retry completely ignores the memory policy nodemask, which
allows the allocation to escape the policy's restrictions.

The bug seems to have first appeared in commit ac5b2c18911f
 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings").
It disappeared in commit 89c83fb539f9
 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
and reappeared in a slightly different form in commit 76e654cc91bb
 ("mm, page_alloc: allow hugepage fallback to remote nodes when madvised").

Fix this by passing the correct nodemask to the __alloc_pages() call.

The demonstration/reproducer of the problem:
 $ mount -oremount,size=4G,huge=always /dev/shm/
 $ echo always > /sys/kernel/mm/transparent_hugepage/defrag
 $ cat mbind_thp.c
 #include <unistd.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <assert.h>
 #include <stdlib.h>
 #include <stdio.h>
 #include <numaif.h>

 #define SIZE (2ULL << 30)
 int main(int argc, char **argv)
 {
   int fd;
   unsigned long long i;
   char *addr;
   pid_t pid;
   char buf[100];
   unsigned long nodemask = 1;

   fd = open("/dev/shm/test", O_RDWR|O_CREAT, 0600);
   assert(fd >= 0);
   assert(ftruncate(fd, SIZE) == 0);

   addr = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                        MAP_SHARED, fd, 0);

   assert(mbind(addr, SIZE, MPOL_BIND, &nodemask, 2, MPOL_MF_STRICT|MPOL_MF_MOVE)==0);
   for (i = 0; i < SIZE; i+=4096) {
     addr[i] = 1;
   }
   pid = getpid();
   snprintf(buf, sizeof(buf), "grep shm /proc/%d/numa_maps", pid);
   system(buf);
   sleep(10000);

   return 0;
 }
 $ gcc mbind_thp.c -o mbind_thp -lnuma
 $ numactl -H
 available: 2 nodes (0-1)
 node 0 cpus: 0 2
 node 0 size: 1918 MB
 node 0 free: 1595 MB
 node 1 cpus: 1 3
 node 1 size: 2014 MB
 node 1 free: 1731 MB
 node distances:
 node   0   1
   0:  10  20
   1:  20  10
 $ rm -f /dev/shm/test; taskset -c 0 ./mbind_thp
 7fd970a00000 bind:0 file=/dev/shm/test dirty=524288 active=0 N0=396800 N1=127488 kernelpagesize_kB=4

Link: https://lkml.kernel.org/r/20211208165343.22349-1-arbn@yandex-team.com
Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mempolicy.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/mempolicy.c~mm-mempolicy-fix-thp-allocations-escaping-mempolicy-restrictions
+++ a/mm/mempolicy.c
@@ -2140,8 +2140,7 @@ struct page *alloc_pages_vma(gfp_t gfp,
 			 * memory with both reclaim and compact as well.
 			 */
 			if (!page && (gfp & __GFP_DIRECT_RECLAIM))
-				page = __alloc_pages_node(hpage_node,
-								gfp, order);
+				page = __alloc_pages(gfp, order, hpage_node, nmask);
 
 			goto out;
 		}
_



* [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning
  2021-12-25  5:11 incoming Andrew Morton
  2021-12-25  5:12 ` [patch 1/9] kfence: fix memory leak when cat kfence objects Andrew Morton
  2021-12-25  5:12 ` [patch 2/9] mm: mempolicy: fix THP allocations escaping mempolicy restrictions Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 4/9] MAINTAINERS: mark more list instances as moderated Andrew Morton
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: ahalaney, akpm, bhe, linux-mm, mm-commits, prudo, torvalds

From: Philipp Rudo <prudo@redhat.com>
Subject: kernel/crash_core: suppress unknown crashkernel parameter warning

When booting with crashkernel= on the kernel command line a warning
similar to

[    0.038294] Kernel command line: ro console=ttyS0 crashkernel=256M
[    0.038353] Unknown kernel command line parameters "crashkernel=256M", will be passed to user space.

is printed.  This happens because crashkernel= is parsed independently of
the kernel parameter handling mechanism, so the code in init/main.c
doesn't know that crashkernel= is a valid kernel parameter and prints this
incorrect warning.  Suppress the warning by adding a dummy early_param
handler for crashkernel=.

Link: https://lkml.kernel.org/r/20211208133443.6867-1-prudo@redhat.com
Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/crash_core.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/kernel/crash_core.c~kernel-crash_core-suppress-unknown-crashkernel-parameter-warning
+++ a/kernel/crash_core.c
@@ -6,6 +6,7 @@
 
 #include <linux/buildid.h>
 #include <linux/crash_core.h>
+#include <linux/init.h>
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
 
@@ -295,6 +296,16 @@ int __init parse_crashkernel_low(char *c
 				"crashkernel=", suffix_tbl[SUFFIX_LOW]);
 }
 
+/*
+ * Add a dummy early_param handler to mark crashkernel= as a known command line
+ * parameter and suppress incorrect warnings in init/main.c.
+ */
+static int __init parse_crashkernel_dummy(char *arg)
+{
+	return 0;
+}
+early_param("crashkernel", parse_crashkernel_dummy);
+
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
 			  void *data, size_t data_len)
 {
_



* [patch 4/9] MAINTAINERS: mark more list instances as moderated
  2021-12-25  5:11 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-12-25  5:12 ` [patch 3/9] kernel/crash_core: suppress unknown crashkernel parameter warning Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path Andrew Morton
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, alexandre.belloni, conor.culhane, jianjun.wang, linux-mm,
	miquel.raynal, mm-commits, rdunlap, ryder.lee, torvalds

From: Randy Dunlap <rdunlap@infradead.org>
Subject: MAINTAINERS: mark more list instances as moderated

Some lists that are moderated are not marked as moderated consistently, so
mark them all as moderated.

Link: https://lkml.kernel.org/r/20211209001330.18558-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Conor Culhane <conor.culhane@silvaco.com>
Cc: Ryder Lee <ryder.lee@mediatek.com>
Cc: Jianjun Wang <jianjun.wang@mediatek.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/MAINTAINERS~maintainers-mark-more-list-instances-as-moderated
+++ a/MAINTAINERS
@@ -14845,7 +14845,7 @@ PCIE DRIVER FOR MEDIATEK
 M:	Ryder Lee <ryder.lee@mediatek.com>
 M:	Jianjun Wang <jianjun.wang@mediatek.com>
 L:	linux-pci@vger.kernel.org
-L:	linux-mediatek@lists.infradead.org
+L:	linux-mediatek@lists.infradead.org (moderated for non-subscribers)
 S:	Supported
 F:	Documentation/devicetree/bindings/pci/mediatek*
 F:	drivers/pci/controller/*mediatek*
@@ -17423,7 +17423,7 @@ F:	drivers/video/fbdev/sm712*
 SILVACO I3C DUAL-ROLE MASTER
 M:	Miquel Raynal <miquel.raynal@bootlin.com>
 M:	Conor Culhane <conor.culhane@silvaco.com>
-L:	linux-i3c@lists.infradead.org
+L:	linux-i3c@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
 F:	Documentation/devicetree/bindings/i3c/silvaco,i3c-master.yaml
 F:	drivers/i3c/master/svc-i3c-master.c
_



* [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path
  2021-12-25  5:11 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-12-25  5:12 ` [patch 4/9] MAINTAINERS: mark more list instances as moderated Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative() Andrew Morton
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, linux-mm, luofei, mike.kravetz, mm-commits,
	naoya.horiguchi, stable, torvalds

From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: mm, hwpoison: fix condition in free hugetlb page path

When a memory error hits a tail page of a free hugepage,
__page_handle_poison() is expected to be called to isolate the error in
4kB units, but it is not called because of an outdated if-condition in
memory_failure_hugetlb().  This loses the chance to isolate the error at
the finer granularity, which is not optimal.  Drop the condition.

The "(p != head && TestSetPageHWPoison(head))" condition is based on the
old semantics of PageHWPoison on hugepages (where the PG_hwpoison flag was
set on the subpage), so it is no longer necessary.  Now that PG_hwpoison is
set on the head page for hugepages, concurrent error events on
different subpages of a single hugepage are prevented by
TestSetPageHWPoison(head) at the beginning of memory_failure_hugetlb().
So dropping the condition should not reopen the race window originally
mentioned in commit b985194c8c0a ("hwpoison, hugetlb:
lock_page/unlock_page does not match for handling a free hugepage").

[naoya.horiguchi@linux.dev: fix "HardwareCorrupted" counter]
  Link: https://lkml.kernel.org/r/20211220084851.GA1460264@u2004
Link: https://lkml.kernel.org/r/20211210110208.879740-1-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Fei Luo <luofei@unicloud.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>	[5.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

--- a/mm/memory-failure.c~mm-hwpoison-fix-condition-in-free-hugetlb-page-path
+++ a/mm/memory-failure.c
@@ -1470,17 +1470,12 @@ static int memory_failure_hugetlb(unsign
 	if (!(flags & MF_COUNT_INCREASED)) {
 		res = get_hwpoison_page(p, flags);
 		if (!res) {
-			/*
-			 * Check "filter hit" and "race with other subpage."
-			 */
 			lock_page(head);
-			if (PageHWPoison(head)) {
-				if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
-				    || (p != head && TestSetPageHWPoison(head))) {
+			if (hwpoison_filter(p)) {
+				if (TestClearPageHWPoison(head))
 					num_poisoned_pages_dec();
-					unlock_page(head);
-					return 0;
-				}
+				unlock_page(head);
+				return 0;
 			}
 			unlock_page(head);
 			res = MF_FAILED;
_



* [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative()
  2021-12-25  5:11 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-12-25  5:12 ` [patch 5/9] mm, hwpoison: fix condition in free hugetlb page path Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid Andrew Morton
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, hch, hughd, kirill.shutemov, linux-mm, mm-commits, rppt,
	torvalds, vbabka, william.kucharski, willy

From: Hugh Dickins <hughd@google.com>
Subject: mm: delete unsafe BUG from page_cache_add_speculative()

It is not easily reproducible, but on 5.16-rc I have several times hit the
VM_BUG_ON_PAGE(PageTail(page), page) in page_cache_add_speculative():
usually from filemap_get_read_batch() for an ext4 read, yesterday from
next_uptodate_page() from filemap_map_pages() for a shmem fault.

That BUG used to be placed where page_ref_add_unless() had succeeded, but
now it is placed before folio_ref_add_unless() is attempted: that is not
safe, since it is only the acquired reference which makes the page safe
from racing THP collapse or split.

We could keep the BUG, checking PageTail only when folio_ref_try_add_rcu()
has succeeded; but I don't think it adds much value - just delete it.

Link: https://lkml.kernel.org/r/8b98fc6f-3439-8614-c3f3-945c659a1aba@google.com
Fixes: 020853b6f5ea ("mm: Add folio_try_get_rcu()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/pagemap.h |    1 -
 1 file changed, 1 deletion(-)

--- a/include/linux/pagemap.h~mm-delete-unsafe-bug-from-page_cache_add_speculative
+++ a/include/linux/pagemap.h
@@ -285,7 +285,6 @@ static inline struct inode *folio_inode(
 
 static inline bool page_cache_add_speculative(struct page *page, int count)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	return folio_ref_try_add_rcu((struct folio *)page, count);
 }
 
_



* [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid
  2021-12-25  5:11 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-12-25  5:12 ` [patch 6/9] mm: delete unsafe BUG from page_cache_add_speculative() Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock Andrew Morton
  2021-12-25  5:12 ` [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page() Andrew Morton
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, danielmicay, keescook, levente, linux-mm, mm-commits,
	thibaut.sautereau, torvalds

From: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Subject: mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid

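The second parameter of alloc_pages_exact_nid() is the one that gives the
size of the memory pointed to by the returned pointer, but the
__alloc_size() annotation named the first parameter (the node id).  Point
the attribute at the second parameter.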

Link: https://lkml.kernel.org/r/YbjEgwhn4bGblp//@coeus
Fixes: abd58f38dfb4 ("mm/page_alloc: add __alloc_size attributes for better bounds checking")
Signed-off-by: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Levente Polyak <levente@leventepolyak.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/gfp.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/gfp.h~mm-page_alloc-fix-__alloc_size-attribute-for-alloc_pages_exact_nid
+++ a/include/linux/gfp.h
@@ -624,7 +624,7 @@ extern unsigned long get_zeroed_page(gfp
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
 void free_pages_exact(void *virt, size_t size);
-__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(1);
+__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(2);
 
 #define __get_free_page(gfp_mask) \
 		__get_free_pages((gfp_mask), 0)
_



* [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock
  2021-12-25  5:11 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-12-25  5:12 ` [patch 7/9] mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  2021-12-25  5:12 ` [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page() Andrew Morton
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, linux-mm, mm-commits, sangwoob, sj, stable, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: protect targets destructions with kdamond_lock

The DAMON debugfs interface iterates the current monitoring targets in
'dbgfs_target_ids_read()' while holding the corresponding 'kdamond_lock'.
However, it also destroys the monitoring targets in
'dbgfs_before_terminate()' without holding the lock.  This can result in a
use-after-free bug.  This commit avoids the race by protecting the
destruction with the corresponding 'kdamond_lock'.

Link: https://lkml.kernel.org/r/20211221094447.2241-1-sj@kernel.org
Reported-by: Sangwoo Bae <sangwoob@amazon.com>
Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-protect-targets-destructions-with-kdamond_lock
+++ a/mm/damon/dbgfs.c
@@ -650,10 +650,12 @@ static void dbgfs_before_terminate(struc
 	if (!targetid_is_pid(ctx))
 		return;
 
+	mutex_lock(&ctx->kdamond_lock);
 	damon_for_each_target_safe(t, next, ctx) {
 		put_pid((struct pid *)t->id);
 		damon_destroy_target(t);
 	}
+	mutex_unlock(&ctx->kdamond_lock);
 }
 
 static struct damon_ctx *dbgfs_new_ctx(void)
_



* [patch 9/9] mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
  2021-12-25  5:11 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-12-25  5:12 ` [patch 8/9] mm/damon/dbgfs: protect targets destructions with kdamond_lock Andrew Morton
@ 2021-12-25  5:12 ` Andrew Morton
  8 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2021-12-25  5:12 UTC
  To: akpm, hulkci, linux-mm, liushixin2, mm-commits, naoya.horiguchi,
	osalvador, stable, torvalds

From: Liu Shixin <liushixin2@huawei.com>
Subject: mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()

Hulk Robot reported a panic in put_page_testzero() when testing madvise()
with MADV_SOFT_OFFLINE.  The BUG() is triggered when retrying
get_any_page().  This is because the MF_COUNT_INCREASED flag is kept for the
second try even though the refcount has not been increased.

 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
 ------------[ cut here ]------------
 kernel BUG at include/linux/mm.h:737!
 invalid opcode: 0000 [#1] PREEMPT SMP
 CPU: 5 PID: 2135 Comm: sshd Tainted: G    B             5.16.0-rc6-dirty #373
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 RIP: 0010:release_pages+0x53f/0x840
 Code: 0c 01 4c 8d 60 ff e9 5b fb ff ff 48 c7 c6 d8 97 0c b3 4c 89 e7 48 83 05 0e 7b 3c 0c 01 e8 89 3d 04 00 48 83 05 11 7b 3c 0c 01 <0f> 0b 48 83 05 0f 7b 3c 0c 01 48 83 05 0f 7b 3c 0c 01 48 83 05f
 RSP: 0018:ffffc900015a7bc0 EFLAGS: 00010002
 RAX: 000000000000003e RBX: ffffffffbace04c8 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
 RBP: ffff88817b9acd50 R08: 0000000000000000 R09: c0000000ffefffff
 R10: 0000000000000001 R11: ffffc900015a79b0 R12: ffffea0005e1c900
 R13: ffffea0005e1de88 R14: 000000000000001f R15: ffff888100071000
 FS:  0000000000000000(0000) GS:ffff88842fb40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f305e8de3d4 CR3: 000000017bb6f000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  free_pages_and_swap_cache+0x64/0x80
  tlb_flush_mmu+0x6f/0x220
  unmap_page_range+0xe6c/0x12c0
  unmap_single_vma+0x90/0x170
  unmap_vmas+0xc4/0x180
  exit_mmap+0xde/0x3a0
  mmput+0xa3/0x250
  do_exit+0x564/0x1470
  do_group_exit+0x3b/0x100
  __do_sys_exit_group+0x13/0x20
  __x64_sys_exit_group+0x16/0x20
  do_syscall_64+0x34/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f30625401d9
 Code: Unable to access opcode bytes at RIP 0x7f30625401af.
 RSP: 002b:00007ffe391b0c88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f30625401d9
 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
 RBP: 00007f306283d838 R08: 000000000000003c R09: 00000000000000e7
 R10: fffffffffffffe30 R11: 0000000000000246 R12: 00007f306283d838
 R13: 00007f3062842e80 R14: 0000000000000000 R15: ffffaa4fb7932430
  </TASK>
 Modules linked in:
 ---[ end trace e99579b570fe0649 ]---
 RIP: 0010:release_pages+0x53f/0x840
 Code: 0c 01 4c 8d 60 ff e9 5b fb ff ff 48 c7 c6 d8 97 0c b3 4c 89 e7 48 83 05 0e 7b 3c 0c 01 e8 89 3d 04 00 48 83 05 11 7b 3c 0c 01 <0f> 0b 48 83 05 0f 7b 3c 0c 01 48 83 05 0f 7b 3c 0c 01 48 83 05f
 RSP: 0018:ffffc900015a7bc0 EFLAGS: 00010002
 RAX: 000000000000003e RBX: ffffffffbace04c8 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
 RBP: ffff88817b9acd50 R08: 0000000000000000 R09: c0000000ffefffff
 R10: 0000000000000001 R11: ffffc900015a79b0 R12: ffffea0005e1c900
 R13: ffffea0005e1de88 R14: 000000000000001f R15: ffff888100071000
 FS:  0000000000000000(0000) GS:ffff88842fb40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f305e8de3d4 CR3: 000000017bb6f000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Link: https://lkml.kernel.org/r/20211221074908.3910286-1-liushixin2@huawei.com
Fixes: b94e02822deb ("mm,hwpoison: try to narrow window race for free pages")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/memory-failure.c~mm-hwpoison-clear-mf_count_increased-before-retrying-get_any_page
+++ a/mm/memory-failure.c
@@ -2234,6 +2234,7 @@ retry:
 	} else if (ret == 0) {
 		if (soft_offline_free_page(page) && try_again) {
 			try_again = false;
+			flags &= ~MF_COUNT_INCREASED;
 			goto retry;
 		}
 	}
_

