linux-mm.kvack.org archive mirror
* incoming
@ 2020-08-21  0:41 Andrew Morton
  2020-08-21  0:41 ` [patch 01/11] mailmap: add Andi Kleen Andrew Morton
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:41 UTC
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

11 patches, based on 7eac66d0456fe12a462e5c14c68e97c7460989da.

Subsystems affected by this patch series:

  misc
  mm/hugetlb
  mm/vmalloc
  mm/misc
  romfs
  relay
  uprobes
  squashfs
  mm/cma
  mm/pagealloc

Subsystem: misc

    Nick Desaulniers <ndesaulniers@google.com>:
      mailmap: add Andi Kleen

Subsystem: mm/hugetlb

    Xu Wang <vulab@iscas.ac.cn>:
      hugetlb_cgroup: convert comma to semicolon

    Hugh Dickins <hughd@google.com>:
      khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()

Subsystem: mm/vmalloc

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/vunmap: add cond_resched() in vunmap_pmd_range

Subsystem: mm/misc

    Leon Romanovsky <leonro@nvidia.com>:
      mm/rodata_test.c: fix missing function declaration

Subsystem: romfs

    Jann Horn <jannh@google.com>:
      romfs: fix uninitialized memory leak in romfs_dev_read()

Subsystem: relay

    Wei Yongjun <weiyongjun1@huawei.com>:
      kernel/relay.c: fix memleak on destroy relay channel

Subsystem: uprobes

    Hugh Dickins <hughd@google.com>:
      uprobes: __replace_page() avoid BUG in munlock_vma_page()

Subsystem: squashfs

    Phillip Lougher <phillip@squashfs.org.uk>:
      squashfs: avoid bio_alloc() failure with 1Mbyte blocks

Subsystem: mm/cma

    Doug Berger <opendmb@gmail.com>:
      mm: include CMA pages in lowmem_reserve at boot

Subsystem: mm/pagealloc

    Charan Teja Reddy <charante@codeaurora.org>:
      mm, page_alloc: fix core hung in free_pcppages_bulk()

 .mailmap                |    1 +
 fs/romfs/storage.c      |    4 +---
 fs/squashfs/block.c     |    6 +++++-
 kernel/events/uprobes.c |    2 +-
 kernel/relay.c          |    1 +
 mm/hugetlb_cgroup.c     |    4 ++--
 mm/khugepaged.c         |    2 +-
 mm/page_alloc.c         |    7 ++++++-
 mm/rodata_test.c        |    1 +
 mm/vmalloc.c            |    2 ++
 10 files changed, 21 insertions(+), 9 deletions(-)




* [patch 01/11] mailmap: add Andi Kleen
  2020-08-21  0:41 incoming Andrew Morton
@ 2020-08-21  0:41 ` Andrew Morton
  2020-08-21  0:41 ` [patch 02/11] hugetlb_cgroup: convert comma to semicolon Andrew Morton
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:41 UTC
  To: ak, akpm, corbet, keescook, linux-mm, mm-commits, ndesaulniers,
	qperret, torvalds

From: Nick Desaulniers <ndesaulniers@google.com>
Subject: mailmap: add Andi Kleen

I keep getting bounces from the suse.de address.

Link: http://lkml.kernel.org/r/20200818203214.659955-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Quentin Perret <qperret@qperret.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 .mailmap |    1 +
 1 file changed, 1 insertion(+)

--- a/.mailmap~mailmap-add-andi-kleen
+++ a/.mailmap
@@ -32,6 +32,7 @@ Alex Shi <alex.shi@linux.alibaba.com> <a
 Alex Shi <alex.shi@linux.alibaba.com> <alex.shi@linaro.org>
 Al Viro <viro@ftp.linux.org.uk>
 Al Viro <viro@zenIV.linux.org.uk>
+Andi Kleen <ak@linux.intel.com> <ak@suse.de>
 Andi Shyti <andi@etezian.org> <andi.shyti@samsung.com>
 Andreas Herrmann <aherrman@de.ibm.com>
 Andrew Morton <akpm@linux-foundation.org>
_



* [patch 02/11] hugetlb_cgroup: convert comma to semicolon
  2020-08-21  0:41 incoming Andrew Morton
  2020-08-21  0:41 ` [patch 01/11] mailmap: add Andi Kleen Andrew Morton
@ 2020-08-21  0:41 ` Andrew Morton
  2020-08-21  0:42 ` [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() Andrew Morton
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:41 UTC
  To: akpm, gscrivan, linux-mm, mm-commits, tj, torvalds, vulab

From: Xu Wang <vulab@iscas.ac.cn>
Subject: hugetlb_cgroup: convert comma to semicolon

Replace a comma between expression statements with a semicolon.
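
As a rough user-space illustration of why the comma was harmless here yet
still worth fixing, consider the sketch below.  The struct and function
names are hypothetical stand-ins, not the kernel's struct cftype; the
point is only that the comma operator fuses both assignments into a single
expression statement, which changes meaning as soon as a brace-less "if"
is placed in front of it.

```c
#include <assert.h>	/* for the checks mentioned in the usage note */

/* Hypothetical stand-in for the two cftype fields touched by the patch. */
struct demo_cft {
	unsigned long file_offset;
	unsigned int flags;
};

/* Comma form: both assignments make up ONE expression statement. */
static void init_with_comma(struct demo_cft *cft)
{
	cft->file_offset = 16,
	cft->flags = 1;
}

/* Semicolon form: two independent statements. */
static void init_with_semicolon(struct demo_cft *cft)
{
	cft->file_offset = 16;
	cft->flags = 1;
}

/*
 * Where the difference bites: under a brace-less "if", the comma drags
 * the second assignment into the conditional.
 */
static unsigned int flags_after_guarded_comma(int cond)
{
	struct demo_cft cft = { 0, 0 };

	if (cond)
		cft.file_offset = 16,
		cft.flags = 1;	/* only runs when cond is true */
	return cft.flags;
}
```

Both init helpers leave the struct in the same state, which is why the
original code worked; flags_after_guarded_comma(0) returns 0 because the
flags assignment is swallowed by the condition.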

Link: http://lkml.kernel.org/r/20200818064333.21759-1-vulab@iscas.ac.cn
Fixes: faced7e0806cf4 ("mm: hugetlb controller for cgroups v2")
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Cc: Tejun Heo <tj@kernel.org>
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb_cgroup.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/hugetlb_cgroup.c~hugetlb_cgroup-convert-comma-to-semicolon
+++ a/mm/hugetlb_cgroup.c
@@ -655,7 +655,7 @@ static void __init __hugetlb_cgroup_file
 	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.events", buf);
 	cft->private = MEMFILE_PRIVATE(idx, 0);
 	cft->seq_show = hugetlb_events_show;
-	cft->file_offset = offsetof(struct hugetlb_cgroup, events_file[idx]),
+	cft->file_offset = offsetof(struct hugetlb_cgroup, events_file[idx]);
 	cft->flags = CFTYPE_NOT_ON_ROOT;
 
 	/* Add the events.local file */
@@ -664,7 +664,7 @@ static void __init __hugetlb_cgroup_file
 	cft->private = MEMFILE_PRIVATE(idx, 0);
 	cft->seq_show = hugetlb_events_local_show;
 	cft->file_offset = offsetof(struct hugetlb_cgroup,
-				    events_local_file[idx]),
+				    events_local_file[idx]);
 	cft->flags = CFTYPE_NOT_ON_ROOT;
 
 	/* NULL terminate the last cft */
_



* [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
  2020-08-21  0:41 incoming Andrew Morton
  2020-08-21  0:41 ` [patch 01/11] mailmap: add Andi Kleen Andrew Morton
  2020-08-21  0:41 ` [patch 02/11] hugetlb_cgroup: convert comma to semicolon Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range Andrew Morton
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: aarcange, akpm, edumazet, hughd, kirill.shutemov, linux-mm,
	mike.kravetz, mm-commits, shy828301, songliubraving, stable,
	syzkaller, torvalds

From: Hugh Dickins <hughd@google.com>
Subject: khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()

syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in
__khugepaged_enter(): yes, when one thread is about to dump core, has set
core_state, and is waiting for others, another might do something calling
__khugepaged_enter(), which now crashes because I lumped the core_state
test (known as "mmget_still_valid") into khugepaged_test_exit().  I still
think it's best to lump them together, so just in this exceptional case,
check mm->mm_users directly instead of khugepaged_test_exit().

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils
Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/khugepaged.c~khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter
+++ a/mm/khugepaged.c
@@ -466,7 +466,7 @@ int __khugepaged_enter(struct mm_struct
 		return -ENOMEM;
 
 	/* __khugepaged_exit() must not run from under us */
-	VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
+	VM_BUG_ON_MM(atomic_read(&mm->mm_users) == 0, mm);
 	if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
 		free_mm_slot(mm_slot);
 		return 0;
_



* [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range
  2020-08-21  0:41 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2020-08-21  0:42 ` [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 05/11] mm/rodata_test.c: fix missing function declaration Andrew Morton
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, aneesh.kumar, harish, linux-mm, mm-commits, stable, torvalds

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: mm/vunmap: add cond_resched() in vunmap_pmd_range

As in zap_pte_range(), add cond_resched() so that we avoid soft lockups
like the one reported below.  On a non-preemptible kernel with a large I/O
map region (such as the one we get when using persistent memory in sector
mode), unmapping the namespace can trigger the following soft lockup.

[22724.027334] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [ndctl:50777]
 NIP [c0000000000dc224] plpar_hcall+0x38/0x58
 LR [c0000000000d8898] pSeries_lpar_hpte_invalidate+0x68/0xb0
 Call Trace:
 [c0000004e87a7780] [c0000004fb197c00] 0xc0000004fb197c00 (unreliable)
 [c0000004e87a7810] [c00000000007f4e4] flush_hash_page+0x114/0x200
 [c0000004e87a7890] [c0000000000833cc] hpte_need_flush+0x2dc/0x540
 [c0000004e87a7950] [c0000000003f5798] vunmap_page_range+0x538/0x6f0
 [c0000004e87a7a70] [c0000000003f76d0] free_unmap_vmap_area+0x30/0x70
 [c0000004e87a7aa0] [c0000000003f7a6c] remove_vm_area+0xfc/0x140
 [c0000004e87a7ad0] [c0000000003f7dd8] __vunmap+0x68/0x270
 [c0000004e87a7b50] [c000000000079de4] __iounmap.part.0+0x34/0x60
 [c0000004e87a7bb0] [c000000000376394] memunmap+0x54/0x70
 [c0000004e87a7bd0] [c000000000881d7c] release_nodes+0x28c/0x300
 [c0000004e87a7c40] [c00000000087a65c] device_release_driver_internal+0x16c/0x280
 [c0000004e87a7c80] [c000000000876fc4] unbind_store+0x124/0x170
 [c0000004e87a7cd0] [c000000000875be4] drv_attr_store+0x44/0x60
 [c0000004e87a7cf0] [c00000000057c734] sysfs_kf_write+0x64/0x90
 [c0000004e87a7d10] [c00000000057bc10] kernfs_fop_write+0x1b0/0x290
 [c0000004e87a7d60] [c000000000488e6c] __vfs_write+0x3c/0x70
 [c0000004e87a7d80] [c00000000048c868] vfs_write+0xd8/0x260
 [c0000004e87a7dd0] [c00000000048ccac] ksys_write+0xdc/0x130
 [c0000004e87a7e20] [c00000000000b588] system_call+0x5c/0x70

Link: http://lkml.kernel.org/r/20200807075933.310240-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Harish Sriram <harish@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/vmalloc.c~mm-vunmap-add-cond_resched-in-vunmap_pmd_range
+++ a/mm/vmalloc.c
@@ -104,6 +104,8 @@ static void vunmap_pmd_range(pud_t *pud,
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
 		vunmap_pte_range(pmd, addr, next, mask);
+
+		cond_resched();
 	} while (pmd++, addr = next, addr != end);
 }
 
_



* [patch 05/11] mm/rodata_test.c: fix missing function declaration
  2020-08-21  0:41 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2020-08-21  0:42 ` [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read() Andrew Morton
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, anshuman.khandual, leonro, linux-mm, mm-commits, torvalds

From: Leon Romanovsky <leonro@nvidia.com>
Subject: mm/rodata_test.c: fix missing function declaration

The compilation with CONFIG_DEBUG_RODATA_TEST set produces the following
warning due to the missing include.

 mm/rodata_test.c:15:6: warning: no previous prototype for 'rodata_test' [-Wmissing-prototypes]
    15 | void rodata_test(void)
      |      ^~~~~~~~~~~

Link: https://lkml.kernel.org/r/20200819080026.918134-1-leon@kernel.org
Fixes: 2959a5f726f6 ("mm: add arch-independent testcases for RODATA")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/rodata_test.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/rodata_test.c~mm-fix-missing-function-declaration
+++ a/mm/rodata_test.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt) "rodata_test: " fmt
 
+#include <linux/rodata_test.h>
 #include <linux/uaccess.h>
 #include <asm/sections.h>
 
_



* [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read()
  2020-08-21  0:41 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2020-08-21  0:42 ` [patch 05/11] mm/rodata_test.c: fix missing function declaration Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel Andrew Morton
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, dhowells, gregkh, jannh, linux-mm, mm-commits, stable, torvalds

From: Jann Horn <jannh@google.com>
Subject: romfs: fix uninitialized memory leak in romfs_dev_read()

romfs has a superblock field that limits the size of the filesystem; data
beyond that limit is never accessed.

romfs_dev_read() fetches a caller-supplied number of bytes from the
backing device.  It returns 0 on success or an error code on failure;
therefore, its API can't represent short reads, it's all-or-nothing.

However, when romfs_dev_read() detects that the requested operation would
cross the filesystem size limit, it currently silently truncates the
requested number of bytes.  This means, for example, that when the content
of a file with size 0x1000 starts one byte before the filesystem size
limit, ->readpage() will only fill a single byte of the supplied page
while leaving the rest uninitialized, leaking that uninitialized memory to
userspace.

Fix it by returning an error code instead of truncating the read when the
requested read operation would go beyond the end of the filesystem.
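
The all-or-nothing check can be modelled in user space roughly as follows.
This is a sketch only: read_fits() is a hypothetical helper, and its
"limit" parameter plays the role of the value romfs_maxsize(sb) returns in
the real code.

```c
#include <assert.h>	/* for the checks mentioned in the usage note */
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the bounds check: return 0 only if the whole
 * range [pos, pos + buflen) lies inside the filesystem, else -EIO.
 */
static int read_fits(size_t pos, size_t buflen, size_t limit)
{
	/* "limit - pos" cannot underflow: pos < limit is tested first. */
	if (pos >= limit || buflen > limit - pos)
		return -EIO;
	return 0;
}
```

With a limit of 0x2000, the case from the description (a 0x1000-byte read
starting at 0x1fff) is now rejected with -EIO rather than shortened to a
single byte, while a read of 0x1000 bytes at 0x1000 still succeeds.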

Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com
Fixes: da4458bda237 ("NOMMU: Make it possible for RomFS to use MTD devices directly")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/romfs/storage.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/fs/romfs/storage.c~romfs-fix-uninitialized-memory-leak-in-romfs_dev_read
+++ a/fs/romfs/storage.c
@@ -217,10 +217,8 @@ int romfs_dev_read(struct super_block *s
 	size_t limit;
 
 	limit = romfs_maxsize(sb);
-	if (pos >= limit)
+	if (pos >= limit || buflen > limit - pos)
 		return -EIO;
-	if (buflen > limit - pos)
-		buflen = limit - pos;
 
 #ifdef CONFIG_ROMFS_ON_MTD
 	if (sb->s_mtd)
_



* [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel
  2020-08-21  0:41 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2020-08-21  0:42 ` [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read() Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page() Andrew Morton
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akash.goel, akpm, chris, dja, hulkci, linux-mm, mm-commits, mpe,
	rientjes, stable, tglx, torvalds, viro, walken, weiyongjun1

From: Wei Yongjun <weiyongjun1@huawei.com>
Subject: kernel/relay.c: fix memleak on destroy relay channel

kmemleak reports a memory leak as follows:

unreferenced object 0x607ee4e5f948 (size 8):
comm "syz-executor.1", pid 2098, jiffies 4295031601 (age 288.468s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<00000000ca1de2fa>] relay_open kernel/relay.c:583 [inline]
[<00000000ca1de2fa>] relay_open+0xb6/0x970 kernel/relay.c:563
[<0000000038ae5a4b>] do_blk_trace_setup+0x4a8/0xb20 kernel/trace/blktrace.c:557
[<00000000d5e778e9>] __blk_trace_setup+0xb6/0x150 kernel/trace/blktrace.c:597
[<0000000038fdf803>] blk_trace_ioctl+0x146/0x280 kernel/trace/blktrace.c:738
[<00000000ce25a0ca>] blkdev_ioctl+0xb2/0x6a0 block/ioctl.c:613
[<00000000579e47e0>] block_ioctl+0xe5/0x120 fs/block_dev.c:1871
[<00000000b1588c11>] vfs_ioctl fs/ioctl.c:48 [inline]
[<00000000b1588c11>] __do_sys_ioctl fs/ioctl.c:753 [inline]
[<00000000b1588c11>] __se_sys_ioctl fs/ioctl.c:739 [inline]
[<00000000b1588c11>] __x64_sys_ioctl+0x170/0x1ce fs/ioctl.c:739
[<0000000088fc9942>] do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
[<000000004f6dd57a>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

'chan->buf' is allocated in relay_open() by alloc_percpu() but is not
freed when the relay channel is destroyed.  Fix it by adding free_percpu()
before returning from relay_destroy_channel().
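
The ownership rule behind the fix can be sketched in user space as below.
Everything here is an illustrative assumption: calloc()/free() stand in
for alloc_percpu()/free_percpu() and kfree(), and struct chan_model,
chan_open() and chan_destroy() are hypothetical names, not the relay API.
A counter of live allocations makes the leak observable.

```c
#include <assert.h>	/* for the checks mentioned in the usage note */
#include <stdlib.h>

static int live_allocs;	/* allocations not yet freed */

static void *tracked_calloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct chan_model {
	void **buf;	/* stands in for the per-CPU chan->buf */
};

/* Two allocations: the channel itself and its buffer pointer array. */
static struct chan_model *chan_open(size_t nr_cpus)
{
	struct chan_model *chan = tracked_calloc(1, sizeof(*chan));

	if (!chan)
		return NULL;
	chan->buf = tracked_calloc(nr_cpus, sizeof(*chan->buf));
	if (!chan->buf) {
		tracked_free(chan);
		return NULL;
	}
	return chan;
}

/* Destruction must release both; freeing only "chan" leaks "buf". */
static void chan_destroy(struct chan_model *chan)
{
	tracked_free(chan->buf);	/* the step the patch adds */
	tracked_free(chan);
}
```

After chan_open() the counter is 2, and only a chan_destroy() that frees
the buffer array as well brings it back to 0.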

Link: http://lkml.kernel.org/r/20200817122826.48518-1-weiyongjun1@huawei.com
Fixes: 017c59c042d0 ("relay: Use per CPU constructs for the relay channel buffer pointers")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Akash Goel <akash.goel@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/relay.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/relay.c~kernel-relayc-fix-memleak-on-destroy-relay-channel
+++ a/kernel/relay.c
@@ -197,6 +197,7 @@ free_buf:
 static void relay_destroy_channel(struct kref *kref)
 {
 	struct rchan *chan = container_of(kref, struct rchan, kref);
+	free_percpu(chan->buf);
 	kfree(chan);
 }
 
_



* [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page()
  2020-08-21  0:41 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2020-08-21  0:42 ` [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks Andrew Morton
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits, oleg,
	songliubraving, srikar, stable, syzkaller, torvalds

From: Hugh Dickins <hughd@google.com>
Subject: uprobes: __replace_page() avoid BUG in munlock_vma_page()

syzbot crashed on the VM_BUG_ON_PAGE(PageTail) in munlock_vma_page(), when
called from uprobes __replace_page().  Which of many ways to fix it? 
Settled on not calling when PageCompound (since Head and Tail are equals
in this context, PageCompound the usual check in uprobes.c, and the prior
use of FOLL_SPLIT_PMD will have cleared PageMlocked already).

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008161338360.20413@eggly.anvils
Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/events/uprobes.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/events/uprobes.c~uprobes-__replace_page-avoid-bug-in-munlock_vma_page
+++ a/kernel/events/uprobes.c
@@ -205,7 +205,7 @@ static int __replace_page(struct vm_area
 		try_to_free_swap(old_page);
 	page_vma_mapped_walk_done(&pvmw);
 
-	if (vma->vm_flags & VM_LOCKED)
+	if ((vma->vm_flags & VM_LOCKED) && !PageCompound(old_page))
 		munlock_vma_page(old_page);
 	put_page(old_page);
 
_



* [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks
  2020-08-21  0:41 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2020-08-21  0:42 ` [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page() Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 10/11] mm: include CMA pages in lowmem_reserve at boot Andrew Morton
  2020-08-21  0:42 ` [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk() Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: adrien+dev, akpm, drosen, groeck, hch, linux-mm, mm-commits,
	nicolas.prochazka, phillip, pliard, shimada, stable, torvalds

From: Phillip Lougher <phillip@squashfs.org.uk>
Subject: squashfs: avoid bio_alloc() failure with 1Mbyte blocks

This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".

bio_alloc() is limited to 256 pages (1 Mbyte).  This can cause a failure
when reading 1 Mbyte block filesystems.  The problem is that a datablock
can be fully (or almost) uncompressed, requiring 256 pages, but, because
blocks are not aligned to page boundaries, it may require 257 pages to
read.

bio_kmalloc() can handle 1024 pages, so use it for the edge condition.
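
The 257-page case is plain arithmetic and can be checked with a small
user-space helper.  This is a sketch under stated assumptions:
pages_spanned() is a hypothetical function, and the 4096-byte page size
used in the usage note is assumed from the "256 pages (1 Mbyte)" figure.

```c
#include <assert.h>	/* for the checks mentioned in the usage note */
#include <stddef.h>

/*
 * Number of pages touched by the byte range [offset, offset + length).
 * Assumes length > 0 and page_size > 0.
 */
static size_t pages_spanned(size_t offset, size_t length, size_t page_size)
{
	size_t first = offset / page_size;
	size_t last = (offset + length - 1) / page_size;

	return last - first + 1;
}
```

A page-aligned 1 Mbyte block gives pages_spanned(0, 1 << 20, 4096) == 256,
whereas the same block starting 512 bytes into a page gives 257, one more
than the 256-page limit.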

Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.uk
Fixes: 93e72b3c612a ("squashfs: migrate from ll_rw_block usage to BIO")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: Nicolas Prochazka <nicolas.prochazka@gmail.com>
Reported-by: Tomoatsu Shimada <shimada@walbrix.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: Philippe Liard <pliard@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Daniel Rosenberg <drosen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/squashfs/block.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/squashfs/block.c~squashfs-avoid-bio_alloc-failure-with-1mbyte-blocks
+++ a/fs/squashfs/block.c
@@ -87,7 +87,11 @@ static int squashfs_bio_read(struct supe
 	int error, i;
 	struct bio *bio;
 
-	bio = bio_alloc(GFP_NOIO, page_count);
+	if (page_count <= BIO_MAX_PAGES)
+		bio = bio_alloc(GFP_NOIO, page_count);
+	else
+		bio = bio_kmalloc(GFP_NOIO, page_count);
+
 	if (!bio)
 		return -ENOMEM;
 
_



* [patch 10/11] mm: include CMA pages in lowmem_reserve at boot
  2020-08-21  0:41 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2020-08-21  0:42 ` [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  2020-08-21  0:42 ` [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk() Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, jbaron, kirill.shutemov, linux-mm, mhocko, mm-commits,
	opendmb, rientjes, stable, torvalds

From: Doug Berger <opendmb@gmail.com>
Subject: mm: include CMA pages in lowmem_reserve at boot

The lowmem_reserve arrays provide a means of applying pressure against
allocations from lower zones that were targeted at higher zones.  Their
values are a function of the number of pages managed by higher zones and
are assigned by a call to the setup_per_zone_lowmem_reserve() function.

The function is initially called at boot time by the function
init_per_zone_wmark_min() and may be called later by accesses of the
/proc/sys/vm/lowmem_reserve_ratio sysctl file.

The function init_per_zone_wmark_min() was moved up from a module_init to
a core_initcall to resolve a sequencing issue with khugepaged. 
Unfortunately this created a sequencing issue with CMA page accounting.

The CMA pages are added to the managed page count of a zone when
cma_init_reserved_areas() is called at boot also as a core_initcall.  This
makes it uncertain whether the CMA pages will be added to the managed page
counts of their zones before or after the call to
init_per_zone_wmark_min() as it becomes dependent on link order.  With the
current link order the pages are added to the managed count after the
lowmem_reserve arrays are initialized at boot.

This means the lowmem_reserve values at boot may be lower than the values
used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
ratio values are unchanged.

In many cases the difference is not significant, but, for example,
an ARM platform with 1GB of memory and the following memory layout

[    0.000000] cma: Reserved 256 MiB at 0x0000000030000000
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
[    0.000000]   Normal   empty
[    0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]

would result in 0 lowmem_reserve for the DMA zone.  This would allow
userspace to deplete the DMA zone easily.  Funnily enough

$ cat /proc/sys/vm/lowmem_reserve_ratio

would fix up the situation because it forces setup_per_zone_lowmem_reserve
as a side effect.

This commit breaks the link order dependency by invoking
init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages
have the chance to be properly accounted in their zone(s) and the
lowmem_reserve arrays receive consistent values.

Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-include-cma-pages-in-lowmem_reserve-at-boot
+++ a/mm/page_alloc.c
@@ -7888,7 +7888,7 @@ int __meminit init_per_zone_wmark_min(vo
 
 	return 0;
 }
-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)
 
 /*
  * min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so
_



* [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-21  0:41 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2020-08-21  0:42 ` [patch 10/11] mm: include CMA pages in lowmem_reserve at boot Andrew Morton
@ 2020-08-21  0:42 ` Andrew Morton
  10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21  0:42 UTC
  To: akpm, charante, david, linux-mm, mhocko, mm-commits, rientjes,
	stable, torvalds, vbabka, vinmenon

From: Charan Teja Reddy <charante@codeaurora.org>
Subject: mm, page_alloc: fix core hung in free_pcppages_bulk()

The following race is observed with repeated onlining and offlining of
memory blocks in the movable zone, with a delay between two successive
online operations.

P1						P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to default
values, i.e., pcp->high = 0 &
pcp->batch = 1.

					Allocate the pages from the
					movable zone.

Try to online the second memory
block in the movable zone; it has
entered online_pages() but has yet
to call zone_pcp_update().
					This process enters the exit
					path and thus tries to release
					the order-0 pages to the pcp
					lists through
					free_unref_page_commit().
					As pcp->high = 0, pcp->count = 1,
					proceed to call the function
					free_pcppages_bulk().
Update the pcp values thus the
new pcp values are like, say,
pcp->high = 378, pcp->batch = 63.
					Read the pcp's batch value using
					READ_ONCE() and pass the same to
					free_pcppages_bulk(), pcp values
					passed here are, batch = 63,
					count = 1.

					Since the number of pages in
					the pcp lists is less than
					->batch, it gets stuck in the
					while (list_empty(list)) loop
					with interrupts disabled, thus
					a core hang.

Avoid this by ensuring free_pcppages_bulk() is called with proper count of
pcp list pages.
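
The effect of the clamp can be seen in a much-simplified user-space model.
This is only a sketch: struct pcp_model and drain() are hypothetical, the
lists are reduced to page counters, and the kernel's real loop also
involves batch_free and migrate types.  What the model keeps is the
inner "skip empty lists" loop that never terminates when asked for more
pages than exist.

```c
#include <assert.h>	/* for the checks mentioned in the usage note */

#define NR_LISTS 3

struct pcp_model {
	int count;		/* total pages across all lists */
	int lists[NR_LISTS];	/* pages on each list */
};

/* Free up to "count" pages round-robin; returns the number freed. */
static int drain(struct pcp_model *pcp, int count)
{
	int freed = 0;
	int idx = 0;

	/* The clamp: never try to free more pages than are present. */
	if (count > pcp->count)
		count = pcp->count;

	while (count) {
		/*
		 * Skip empty lists. This terminates only because
		 * count <= pcp->count guarantees a non-empty list.
		 */
		while (pcp->lists[idx] == 0)
			idx = (idx + 1) % NR_LISTS;

		pcp->lists[idx]--;
		pcp->count--;
		count--;
		freed++;
		idx = (idx + 1) % NR_LISTS;
	}
	return freed;
}
```

With one page present and a stale request for batch = 63, as in the race
above, drain() frees exactly one page and returns; without the clamp the
inner loop would spin once the lists are empty.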

The mentioned race is somewhat easily reproducible without [1] because
pcps are not updated for the first memory block online and thus there is
a large enough race window for P2 between alloc+free and the pcp struct
values update through onlining of the second memory block.

With [1], the race still exists but it is very narrow as we update the pcp
struct values for the first memory block online itself.

This is not limited to the movable zone, it could also happen in cases
with the normal zone (e.g., hotplug to a node that only has DMA memory, or
no other memory yet).

[1]: https://patchwork.kernel.org/patch/11696389/

Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org> [2.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/mm/page_alloc.c~mm-page_alloc-fix-core-hung-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1302,6 +1302,11 @@ static void free_pcppages_bulk(struct zo
 	struct page *page, *tmp;
 	LIST_HEAD(head);
 
+	/*
+	 * Ensure proper count is passed which otherwise would stuck in the
+	 * below while (list_empty(list)) loop.
+	 */
+	count = min(pcp->count, count);
 	while (count) {
 		struct list_head *list;
 
_

