* [patch 01/11] mailmap: add Andi Kleen
2020-08-21 0:41 incoming Andrew Morton
@ 2020-08-21 0:41 ` Andrew Morton
2020-08-21 0:41 ` [patch 02/11] hugetlb_cgroup: convert comma to semicolon Andrew Morton
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:41 UTC (permalink / raw)
To: ak, akpm, corbet, keescook, linux-mm, mm-commits, ndesaulniers,
qperret, torvalds
From: Nick Desaulniers <ndesaulniers@google.com>
Subject: mailmap: add Andi Kleen
I keep getting bounce-backs from the suse.de address.
Link: http://lkml.kernel.org/r/20200818203214.659955-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Quentin Perret <qperret@qperret.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
--- a/.mailmap~mailmap-add-andi-kleen
+++ a/.mailmap
@@ -32,6 +32,7 @@ Alex Shi <alex.shi@linux.alibaba.com> <a
Alex Shi <alex.shi@linux.alibaba.com> <alex.shi@linaro.org>
Al Viro <viro@ftp.linux.org.uk>
Al Viro <viro@zenIV.linux.org.uk>
+Andi Kleen <ak@linux.intel.com> <ak@suse.de>
Andi Shyti <andi@etezian.org> <andi.shyti@samsung.com>
Andreas Herrmann <aherrman@de.ibm.com>
Andrew Morton <akpm@linux-foundation.org>
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 02/11] hugetlb_cgroup: convert comma to semicolon
2020-08-21 0:41 incoming Andrew Morton
2020-08-21 0:41 ` [patch 01/11] mailmap: add Andi Kleen Andrew Morton
@ 2020-08-21 0:41 ` Andrew Morton
2020-08-21 0:42 ` [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() Andrew Morton
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:41 UTC (permalink / raw)
To: akpm, gscrivan, linux-mm, mm-commits, tj, torvalds, vulab
From: Xu Wang <vulab@iscas.ac.cn>
Subject: hugetlb_cgroup: convert comma to semicolon
Replace a comma between expression statements by a semicolon.
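For illustration, here is a small standalone C sketch (hypothetical code, not taken from the kernel) of why a stray comma between what should be two statements is fragile even when it happens to behave identically:

#include <stdio.h>

/*
 * Hypothetical demo: a comma joins two assignments into ONE expression
 * statement.  In straight-line code the effect is the same as two
 * statements, but the single statement changes how surrounding control
 * flow applies.
 */
struct cfg {
	long offset;
	unsigned int flags;
};

int main(void)
{
	struct cfg a = { 0 }, b = { 0 };

	a.offset = 42, a.flags = 0x1;	/* comma operator: one statement */
	b.offset = 42;			/* intended form: two statements */
	b.flags = 0x1;

	/* With an unbraced `if`, the difference shows up. */
	if (0)
		a.offset = 7, a.flags = 0x2;	/* neither assignment runs */
	if (0)
		b.offset = 7;
	b.flags = 0x2;				/* always runs */

	printf("a: %ld 0x%x\n", a.offset, a.flags);	/* a: 42 0x1 */
	printf("b: %ld 0x%x\n", b.offset, b.flags);	/* b: 42 0x2 */
	return 0;
}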
Link: http://lkml.kernel.org/r/20200818064333.21759-1-vulab@iscas.ac.cn
Fixes: faced7e0806cf4 ("mm: hugetlb controller for cgroups v2")
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Cc: Tejun Heo <tj@kernel.org>
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb_cgroup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/hugetlb_cgroup.c~hugetlb_cgroup-convert-comma-to-semicolon
+++ a/mm/hugetlb_cgroup.c
@@ -655,7 +655,7 @@ static void __init __hugetlb_cgroup_file
snprintf(cft->name, MAX_CFTYPE_NAME, "%s.events", buf);
cft->private = MEMFILE_PRIVATE(idx, 0);
cft->seq_show = hugetlb_events_show;
- cft->file_offset = offsetof(struct hugetlb_cgroup, events_file[idx]),
+ cft->file_offset = offsetof(struct hugetlb_cgroup, events_file[idx]);
cft->flags = CFTYPE_NOT_ON_ROOT;
/* Add the events.local file */
@@ -664,7 +664,7 @@ static void __init __hugetlb_cgroup_file
cft->private = MEMFILE_PRIVATE(idx, 0);
cft->seq_show = hugetlb_events_local_show;
cft->file_offset = offsetof(struct hugetlb_cgroup,
- events_local_file[idx]),
+ events_local_file[idx]);
cft->flags = CFTYPE_NOT_ON_ROOT;
/* NULL terminate the last cft */
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
2020-08-21 0:41 incoming Andrew Morton
2020-08-21 0:41 ` [patch 01/11] mailmap: add Andi Kleen Andrew Morton
2020-08-21 0:41 ` [patch 02/11] hugetlb_cgroup: convert comma to semicolon Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range Andrew Morton
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: aarcange, akpm, edumazet, hughd, kirill.shutemov, linux-mm,
mike.kravetz, mm-commits, shy828301, songliubraving, stable,
syzkaller, torvalds
From: Hugh Dickins <hughd@google.com>
Subject: khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in
__khugepaged_enter(): yes, when one thread is about to dump core, has set
core_state, and is waiting for others, another might do something calling
__khugepaged_enter(), which now crashes because I lumped the core_state
test (known as "mmget_still_valid") into khugepaged_test_exit(). I still
think it's best to lump them together, so just in this exceptional case,
check mm->mm_users directly instead of khugepaged_test_exit().
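To make the race easier to follow, here is a standalone approximation (stub types and simplified helpers; the exact kernel definitions differ) of why the old assertion fires during a core dump while the new one does not:

#include <stdbool.h>
#include <stdio.h>

struct mm_struct {
	int mm_users;		/* stand-in for atomic_t mm_users */
	void *core_state;	/* non-NULL while a core dump is in progress */
};

static bool mmget_still_valid(struct mm_struct *mm)
{
	return mm->core_state == NULL;
}

static bool khugepaged_test_exit(struct mm_struct *mm)
{
	/* true when the mm is exiting OR merely dumping core */
	return mm->mm_users == 0 || !mmget_still_valid(mm);
}

int main(void)
{
	/* another thread still holds the mm, but a core dump has started */
	struct mm_struct mm = { .mm_users = 2, .core_state = &mm };

	/* Old assertion condition: true, so VM_BUG_ON_MM would fire. */
	printf("khugepaged_test_exit() = %d\n", khugepaged_test_exit(&mm));
	/* New assertion condition: only checks the mm is genuinely alive. */
	printf("mm_users == 0        = %d\n", mm.mm_users == 0);
	return 0;
}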
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils
Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/khugepaged.c~khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter
+++ a/mm/khugepaged.c
@@ -466,7 +466,7 @@ int __khugepaged_enter(struct mm_struct
return -ENOMEM;
/* __khugepaged_exit() must not run from under us */
- VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
+ VM_BUG_ON_MM(atomic_read(&mm->mm_users) == 0, mm);
if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
free_mm_slot(mm_slot);
return 0;
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range
2020-08-21 0:41 incoming Andrew Morton
` (2 preceding siblings ...)
2020-08-21 0:42 ` [patch 03/11] khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 05/11] mm/rodata_test.c: fix missing function declaration Andrew Morton
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, aneesh.kumar, harish, linux-mm, mm-commits, stable, torvalds
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: mm/vunmap: add cond_resched() in vunmap_pmd_range
Like zap_pte_range(), add a cond_resched() so that we can avoid softlockups such as the one reported below. On a non-preemptible kernel with a large I/O map region (like the one we get when using persistent memory in sector mode), an unmap of the namespace can produce the softlockup below.
[22724.027334] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [ndctl:50777]
NIP [c0000000000dc224] plpar_hcall+0x38/0x58
LR [c0000000000d8898] pSeries_lpar_hpte_invalidate+0x68/0xb0
Call Trace:
[c0000004e87a7780] [c0000004fb197c00] 0xc0000004fb197c00 (unreliable)
[c0000004e87a7810] [c00000000007f4e4] flush_hash_page+0x114/0x200
[c0000004e87a7890] [c0000000000833cc] hpte_need_flush+0x2dc/0x540
[c0000004e87a7950] [c0000000003f5798] vunmap_page_range+0x538/0x6f0
[c0000004e87a7a70] [c0000000003f76d0] free_unmap_vmap_area+0x30/0x70
[c0000004e87a7aa0] [c0000000003f7a6c] remove_vm_area+0xfc/0x140
[c0000004e87a7ad0] [c0000000003f7dd8] __vunmap+0x68/0x270
[c0000004e87a7b50] [c000000000079de4] __iounmap.part.0+0x34/0x60
[c0000004e87a7bb0] [c000000000376394] memunmap+0x54/0x70
[c0000004e87a7bd0] [c000000000881d7c] release_nodes+0x28c/0x300
[c0000004e87a7c40] [c00000000087a65c] device_release_driver_internal+0x16c/0x280
[c0000004e87a7c80] [c000000000876fc4] unbind_store+0x124/0x170
[c0000004e87a7cd0] [c000000000875be4] drv_attr_store+0x44/0x60
[c0000004e87a7cf0] [c00000000057c734] sysfs_kf_write+0x64/0x90
[c0000004e87a7d10] [c00000000057bc10] kernfs_fop_write+0x1b0/0x290
[c0000004e87a7d60] [c000000000488e6c] __vfs_write+0x3c/0x70
[c0000004e87a7d80] [c00000000048c868] vfs_write+0xd8/0x260
[c0000004e87a7dd0] [c00000000048ccac] ksys_write+0xdc/0x130
[c0000004e87a7e20] [c00000000000b588] system_call+0x5c/0x70
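To give a feel for the scale involved, here is a small standalone back-of-the-envelope calculation (all sizes are illustrative assumptions, not taken from the report above): a large I/O mapping means tens of thousands of PMD iterations and millions of PTE teardowns in one walk, which is what the per-PMD cond_resched() breaks up.

#include <stdio.h>

/*
 * Illustrative arithmetic only (assumed 4 KiB pages and 2 MiB per PMD;
 * the actual geometry on the pSeries machine above differs).
 */
int main(void)
{
	unsigned long long region  = 128ULL << 30;	/* assume a 128 GiB namespace */
	unsigned long long page    = 4096;		/* assumed page size */
	unsigned long long per_pmd = 2ULL << 20;	/* assumed PMD coverage */

	printf("PTEs to tear down: %llu\n", region / page);	/* 33554432 */
	printf("PMD iterations   : %llu\n", region / per_pmd);	/* 65536 */
	/* Without a cond_resched() in the PMD loop, all of this runs with no
	 * scheduling point, so the soft-lockup watchdog fires. */
	return 0;
}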
Link: http://lkml.kernel.org/r/20200807075933.310240-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Harish Sriram <harish@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/vmalloc.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/vmalloc.c~mm-vunmap-add-cond_resched-in-vunmap_pmd_range
+++ a/mm/vmalloc.c
@@ -104,6 +104,8 @@ static void vunmap_pmd_range(pud_t *pud,
if (pmd_none_or_clear_bad(pmd))
continue;
vunmap_pte_range(pmd, addr, next, mask);
+
+ cond_resched();
} while (pmd++, addr = next, addr != end);
}
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 05/11] mm/rodata_test.c: fix missing function declaration
2020-08-21 0:41 incoming Andrew Morton
` (3 preceding siblings ...)
2020-08-21 0:42 ` [patch 04/11] mm/vunmap: add cond_resched() in vunmap_pmd_range Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read() Andrew Morton
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, anshuman.khandual, leonro, linux-mm, mm-commits, torvalds
From: Leon Romanovsky <leonro@nvidia.com>
Subject: mm/rodata_test.c: fix missing function declaration
The compilation with CONFIG_DEBUG_RODATA_TEST set produces the following
warning due to the missing include.
mm/rodata_test.c:15:6: warning: no previous prototype for 'rodata_test' [-Wmissing-prototypes]
15 | void rodata_test(void)
| ^~~~~~~~~~~
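A minimal out-of-tree reproduction of the warning and of the fix pattern (hypothetical file names, not the kernel build):

/* rodata_decl.h -- hypothetical header declaring the interface */
void rodata_test(void);

/* rodata_impl.c -- compiled with -Wmissing-prototypes.  Without the
 * include below, gcc emits "no previous prototype for 'rodata_test'".
 * Including the header that declares the function (as the patch below
 * does with <linux/rodata_test.h>) silences the warning and lets the
 * compiler check that declaration and definition agree.
 */
#include "rodata_decl.h"

void rodata_test(void)
{
	/* test body elided */
}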
Link: https://lkml.kernel.org/r/20200819080026.918134-1-leon@kernel.org
Fixes: 2959a5f726f6 ("mm: add arch-independent testcases for RODATA")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/rodata_test.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/rodata_test.c~mm-fix-missing-function-declaration
+++ a/mm/rodata_test.c
@@ -7,6 +7,7 @@
*/
#define pr_fmt(fmt) "rodata_test: " fmt
+#include <linux/rodata_test.h>
#include <linux/uaccess.h>
#include <asm/sections.h>
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read()
2020-08-21 0:41 incoming Andrew Morton
` (4 preceding siblings ...)
2020-08-21 0:42 ` [patch 05/11] mm/rodata_test.c: fix missing function declaration Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel Andrew Morton
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, dhowells, gregkh, jannh, linux-mm, mm-commits, stable, torvalds
From: Jann Horn <jannh@google.com>
Subject: romfs: fix uninitialized memory leak in romfs_dev_read()
romfs has a superblock field that limits the size of the filesystem; data
beyond that limit is never accessed.
romfs_dev_read() fetches a caller-supplied number of bytes from the
backing device. It returns 0 on success or an error code on failure;
therefore, its API can't represent short reads; it's all-or-nothing.
However, when romfs_dev_read() detects that the requested operation would
cross the filesystem size limit, it currently silently truncates the
requested number of bytes. This means, for example, that when the content of
a file with size 0x1000 starts one byte before the filesystem size limit,
->readpage() will only fill a single byte of the supplied page while
leaving the rest uninitialized, leaking that uninitialized memory to
userspace.
Fix it by returning an error code instead of truncating the read when the
requested read operation would go beyond the end of the filesystem.
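A standalone sketch of the bounds test (hypothetical helper, not the romfs code itself) showing both behaviours; writing the comparison as "buflen > limit - pos" after checking "pos < limit" also avoids the overflow that "pos + buflen > limit" could hit with a huge buflen:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Reject any read that would cross `limit` instead of shortening it. */
static bool read_in_bounds(size_t pos, size_t buflen, size_t limit)
{
	if (pos >= limit || buflen > limit - pos)
		return false;	/* would read past the image: -EIO in the patch */
	return true;
}

int main(void)
{
	size_t limit = 4096;

	printf("%d\n", read_in_bounds(4095, 1, limit));    /* 1: last byte is fine */
	printf("%d\n", read_in_bounds(4095, 4096, limit)); /* 0: old code truncated this to 1 byte */
	printf("%d\n", read_in_bounds(4096, 1, limit));    /* 0: starts at/after the limit */
	return 0;
}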
Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com
Fixes: da4458bda237 ("NOMMU: Make it possible for RomFS to use MTD devices directly")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/romfs/storage.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/romfs/storage.c~romfs-fix-uninitialized-memory-leak-in-romfs_dev_read
+++ a/fs/romfs/storage.c
@@ -217,10 +217,8 @@ int romfs_dev_read(struct super_block *s
size_t limit;
limit = romfs_maxsize(sb);
- if (pos >= limit)
+ if (pos >= limit || buflen > limit - pos)
return -EIO;
- if (buflen > limit - pos)
- buflen = limit - pos;
#ifdef CONFIG_ROMFS_ON_MTD
if (sb->s_mtd)
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel
2020-08-21 0:41 incoming Andrew Morton
` (5 preceding siblings ...)
2020-08-21 0:42 ` [patch 06/11] romfs: fix uninitialized memory leak in romfs_dev_read() Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page() Andrew Morton
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akash.goel, akpm, chris, dja, hulkci, linux-mm, mm-commits, mpe,
rientjes, stable, tglx, torvalds, viro, walken, weiyongjun1
From: Wei Yongjun <weiyongjun1@huawei.com>
Subject: kernel/relay.c: fix memleak on destroy relay channel
kmemleak reports a memory leak as follows:
unreferenced object 0x607ee4e5f948 (size 8):
comm "syz-executor.1", pid 2098, jiffies 4295031601 (age 288.468s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<00000000ca1de2fa>] relay_open kernel/relay.c:583 [inline]
[<00000000ca1de2fa>] relay_open+0xb6/0x970 kernel/relay.c:563
[<0000000038ae5a4b>] do_blk_trace_setup+0x4a8/0xb20 kernel/trace/blktrace.c:557
[<00000000d5e778e9>] __blk_trace_setup+0xb6/0x150 kernel/trace/blktrace.c:597
[<0000000038fdf803>] blk_trace_ioctl+0x146/0x280 kernel/trace/blktrace.c:738
[<00000000ce25a0ca>] blkdev_ioctl+0xb2/0x6a0 block/ioctl.c:613
[<00000000579e47e0>] block_ioctl+0xe5/0x120 fs/block_dev.c:1871
[<00000000b1588c11>] vfs_ioctl fs/ioctl.c:48 [inline]
[<00000000b1588c11>] __do_sys_ioctl fs/ioctl.c:753 [inline]
[<00000000b1588c11>] __se_sys_ioctl fs/ioctl.c:739 [inline]
[<00000000b1588c11>] __x64_sys_ioctl+0x170/0x1ce fs/ioctl.c:739
[<0000000088fc9942>] do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
[<000000004f6dd57a>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
'chan->buf' is allocated in relay_open() by alloc_percpu() but is never
freed when the relay channel is destroyed. Fix it by adding free_percpu()
before returning from relay_destroy_channel().
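As a reminder of the pairing this relies on, a short sketch (simplified, based on the description above rather than the full relay code) of the allocation and the now-matching release:

/* allocation side, as described above for relay_open() */
chan->buf = alloc_percpu(struct rchan_buf *);

/* release side, in relay_destroy_channel(): every alloc_percpu() needs a
 * matching free_percpu(); without it the per-CPU array of buffer pointers
 * itself leaks -- the 8-byte object in the kmemleak report above. */
free_percpu(chan->buf);
kfree(chan);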
Link: http://lkml.kernel.org/r/20200817122826.48518-1-weiyongjun1@huawei.com
Fixes: 017c59c042d0 ("relay: Use per CPU constructs for the relay channel buffer pointers")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Akash Goel <akash.goel@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/relay.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/relay.c~kernel-relayc-fix-memleak-on-destroy-relay-channel
+++ a/kernel/relay.c
@@ -197,6 +197,7 @@ free_buf:
static void relay_destroy_channel(struct kref *kref)
{
struct rchan *chan = container_of(kref, struct rchan, kref);
+ free_percpu(chan->buf);
kfree(chan);
}
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page()
2020-08-21 0:41 incoming Andrew Morton
` (6 preceding siblings ...)
2020-08-21 0:42 ` [patch 07/11] kernel/relay.c: fix memleak on destroy relay channel Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks Andrew Morton
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits, oleg,
songliubraving, srikar, stable, syzkaller, torvalds
From: Hugh Dickins <hughd@google.com>
Subject: uprobes: __replace_page() avoid BUG in munlock_vma_page()
syzbot crashed on the VM_BUG_ON_PAGE(PageTail) in munlock_vma_page(), when
called from uprobes __replace_page(). Which of many ways to fix it?
Settled on not calling when PageCompound (since Head and Tail are equals
in this context, PageCompound the usual check in uprobes.c, and the prior
use of FOLL_SPLIT_PMD will have cleared PageMlocked already).
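A rough restatement of that reasoning in code-comment form (simplified; the page-flag details are paraphrased, not quoted from the source):

/*
 * PageCompound(page) is true for both the head and the tail pages of a
 * compound page, while munlock_vma_page() asserts that it is never handed
 * a tail page.  Skipping the call for any compound page therefore keeps
 * __replace_page() away from that assertion, and matches the PageCompound
 * checks used elsewhere in uprobes.c.
 */
if ((vma->vm_flags & VM_LOCKED) && !PageCompound(old_page))
	munlock_vma_page(old_page);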
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008161338360.20413@eggly.anvils
Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/events/uprobes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/events/uprobes.c~uprobes-__replace_page-avoid-bug-in-munlock_vma_page
+++ a/kernel/events/uprobes.c
@@ -205,7 +205,7 @@ static int __replace_page(struct vm_area
try_to_free_swap(old_page);
page_vma_mapped_walk_done(&pvmw);
- if (vma->vm_flags & VM_LOCKED)
+ if ((vma->vm_flags & VM_LOCKED) && !PageCompound(old_page))
munlock_vma_page(old_page);
put_page(old_page);
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks
2020-08-21 0:41 incoming Andrew Morton
` (7 preceding siblings ...)
2020-08-21 0:42 ` [patch 08/11] uprobes: __replace_page() avoid BUG in munlock_vma_page() Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 10/11] mm: include CMA pages in lowmem_reserve at boot Andrew Morton
2020-08-21 0:42 ` [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk() Andrew Morton
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: adrien+dev, akpm, drosen, groeck, hch, linux-mm, mm-commits,
nicolas.prochazka, phillip, pliard, shimada, stable, torvalds
From: Phillip Lougher <phillip@squashfs.org.uk>
Subject: squashfs: avoid bio_alloc() failure with 1Mbyte blocks
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure
when reading 1 Mbyte block filesystems. The problem is that a datablock can
be fully (or almost fully) uncompressed, requiring 256 pages, but, because
blocks are not aligned to page boundaries, it may need 257 pages to read.
bio_kmalloc() can handle 1024 pages, so use it for this edge condition.
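The 257-page figure comes from a block straddling a page boundary; a small standalone calculation (illustrative offset, user-space only):

#include <stdio.h>

int main(void)
{
	unsigned long page_size  = 4096;
	unsigned long block_size = 1024 * 1024;	/* 1 Mbyte squashfs block */
	unsigned long offset     = 512;		/* block starts mid-page */

	unsigned long pages = (offset + block_size + page_size - 1) / page_size;

	/* 256 pages when page-aligned, 257 when the block straddles page
	 * boundaries, which exceeds the 256-page limit of bio_alloc(). */
	printf("pages needed: %lu\n", pages);	/* prints 257 */
	return 0;
}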
Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.uk
Fixes: 93e72b3c612a ("squashfs: migrate from ll_rw_block usage to BIO")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: Nicolas Prochazka <nicolas.prochazka@gmail.com>
Reported-by: Tomoatsu Shimada <shimada@walbrix.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: Philippe Liard <pliard@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Daniel Rosenberg <drosen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/squashfs/block.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/squashfs/block.c~squashfs-avoid-bio_alloc-failure-with-1mbyte-blocks
+++ a/fs/squashfs/block.c
@@ -87,7 +87,11 @@ static int squashfs_bio_read(struct supe
int error, i;
struct bio *bio;
- bio = bio_alloc(GFP_NOIO, page_count);
+ if (page_count <= BIO_MAX_PAGES)
+ bio = bio_alloc(GFP_NOIO, page_count);
+ else
+ bio = bio_kmalloc(GFP_NOIO, page_count);
+
if (!bio)
return -ENOMEM;
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 10/11] mm: include CMA pages in lowmem_reserve at boot
2020-08-21 0:41 incoming Andrew Morton
` (8 preceding siblings ...)
2020-08-21 0:42 ` [patch 09/11] squashfs: avoid bio_alloc() failure with 1Mbyte blocks Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
2020-08-21 0:42 ` [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk() Andrew Morton
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, jbaron, kirill.shutemov, linux-mm, mhocko, mm-commits,
opendmb, rientjes, stable, torvalds
From: Doug Berger <opendmb@gmail.com>
Subject: mm: include CMA pages in lowmem_reserve at boot
The lowmem_reserve arrays provide a means of applying pressure against
allocations from lower zones that were targeted at higher zones. Its
values are a function of the number of pages managed by higher zones and
are assigned by a call to the setup_per_zone_lowmem_reserve() function.
The function is initially called at boot time by the function
init_per_zone_wmark_min() and may be called later by accesses of the
/proc/sys/vm/lowmem_reserve_ratio sysctl file.
The function init_per_zone_wmark_min() was moved up from a module_init to
a core_initcall to resolve a sequencing issue with khugepaged.
Unfortunately this created a sequencing issue with CMA page accounting.
The CMA pages are added to the managed page count of a zone when
cma_init_reserved_areas() is called at boot also as a core_initcall. This
makes it uncertain whether the CMA pages will be added to the managed page
counts of their zones before or after the call to
init_per_zone_wmark_min() as it becomes dependent on link order. With the
current link order the pages are added to the managed count after the
lowmem_reserve arrays are initialized at boot.
This means the lowmem_reserve values at boot may be lower than the values
used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
ratio values are unchanged.
In many cases the difference is not significant, but for example
an ARM platform with 1GB of memory and the following memory layout
[ 0.000000] cma: Reserved 256 MiB at 0x0000000030000000
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000000000-0x000000002fffffff]
[ 0.000000] Normal empty
[ 0.000000] HighMem [mem 0x0000000030000000-0x000000003fffffff]
would result in 0 lowmem_reserve for the DMA zone. This would allow
userspace to deplete the DMA zone easily. Funnily enough
$ cat /proc/sys/vm/lowmem_reserve_ratio
would fix up the situation because it forces setup_per_zone_lowmem_reserve
as a side effect.
This commit breaks the link order dependency by invoking
init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages
have the chance to be properly accounted in their zone(s) and allowing the
lowmem_reserve arrays to receive consistent values.
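A rough, standalone approximation of the reserve calculation for the example above (simplified formula; the ratio of 256 is an assumed default for the DMA zone, and the real setup_per_zone_lowmem_reserve() walks all zone pairs):

#include <stdio.h>

/*
 * Roughly: dma->lowmem_reserve[highmem] ~= managed_pages(highmem) / ratio.
 * If the 256 MiB of CMA pages have not yet been added to HighMem's managed
 * count when this runs at boot, the reserve computes to 0.
 */
int main(void)
{
	unsigned long highmem_managed_early = 0;			/* CMA not yet accounted */
	unsigned long highmem_managed_late  = (256UL << 20) / 4096;	/* 65536 pages */
	unsigned long ratio = 256;					/* assumed DMA ratio */

	printf("boot-time reserve : %lu pages\n", highmem_managed_early / ratio); /* 0 */
	printf("recomputed reserve: %lu pages\n", highmem_managed_late / ratio);  /* 256 */
	return 0;
}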
Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-include-cma-pages-in-lowmem_reserve-at-boot
+++ a/mm/page_alloc.c
@@ -7888,7 +7888,7 @@ int __meminit init_per_zone_wmark_min(vo
return 0;
}
-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)
/*
* min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so
_
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch 11/11] mm, page_alloc: fix core hung in free_pcppages_bulk()
2020-08-21 0:41 incoming Andrew Morton
` (9 preceding siblings ...)
2020-08-21 0:42 ` [patch 10/11] mm: include CMA pages in lowmem_reserve at boot Andrew Morton
@ 2020-08-21 0:42 ` Andrew Morton
10 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2020-08-21 0:42 UTC (permalink / raw)
To: akpm, charante, david, linux-mm, mhocko, mm-commits, rientjes,
stable, torvalds, vbabka, vinmenon
From: Charan Teja Reddy <charante@codeaurora.org>
Subject: mm, page_alloc: fix core hung in free_pcppages_bulk()
The following race is observed with repeated online and offline of memory
blocks in the movable zone, with a delay between two successive onlines.
P1: Online the first memory block in the movable zone.  The pcp struct
    values are initialized to their defaults, i.e., pcp->high = 0 and
    pcp->batch = 1.

P2: Allocate pages from the movable zone.

P1: Try to online the second memory block in the movable zone; it has
    entered online_pages() but has not yet called zone_pcp_update().

P2: This process enters the exit path and tries to release its order-0
    pages to the pcp lists through free_unref_page_commit().

P2: As pcp->high = 0 and pcp->count = 1, it proceeds to call
    free_pcppages_bulk().

P1: Update the pcp values; the new values are, say, pcp->high = 378 and
    pcp->batch = 63.

P2: Read the pcp's batch value using READ_ONCE() and pass it to
    free_pcppages_bulk(); the values passed here are batch = 63,
    count = 1.

P2: Since the number of pages on the pcp lists is less than ->batch, it
    gets stuck in the while (list_empty(list)) loop with interrupts
    disabled, and the core hangs.
Avoid this by ensuring free_pcppages_bulk() is called with a proper count of
pcp list pages.
The mentioned race is somewhat easily reproducible without [1] because the
pcp's are not updated for the first memory block onlined, so there is a
large enough race window for P2 between the alloc+free and the pcp struct
values update through the onlining of the second memory block.
With [1], the race still exists but is very narrow, as we update the pcp
struct values for the first memory block online itself.
This is not limited to the movable zone; it could also happen in cases with
the normal zone (e.g., hotplug to a node that only has DMA memory, or no
other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/
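A standalone simulation of the hang (stub lists and a spin guard; the real free_pcppages_bulk() additionally batches frees per migratetype, which is omitted here):

#include <stdio.h>

#define MIGRATE_PCPTYPES 3

int main(void)
{
	int pages_on_list[MIGRATE_PCPTYPES] = { 1, 0, 0 };	/* pcp->count == 1 */
	int count = 63;						/* caller passed ->batch */
	int migratetype = 0;
	unsigned long spins = 0;

	while (count) {
		/* hunt for the next non-empty list, as the kernel loop does */
		while (pages_on_list[migratetype] == 0) {
			migratetype = (migratetype + 1) % MIGRATE_PCPTYPES;
			if (++spins > 1000000) {
				printf("stuck: count=%d but all lists are empty\n",
				       count);
				return 1;	/* the real code spins with IRQs off */
			}
		}
		pages_on_list[migratetype]--;	/* "free" one page to the buddy */
		count--;
	}
	printf("done\n");	/* reached once count = min(pcp->count, count) */
	return 0;
}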
Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org> [2.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c~mm-page_alloc-fix-core-hung-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1302,6 +1302,11 @@ static void free_pcppages_bulk(struct zo
struct page *page, *tmp;
LIST_HEAD(head);
+ /*
+ * Ensure proper count is passed which otherwise would stuck in the
+ * below while (list_empty(list)) loop.
+ */
+ count = min(pcp->count, count);
while (count) {
struct list_head *list;
_
^ permalink raw reply [flat|nested] 12+ messages in thread