stable.vger.kernel.org archive mirror
* [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC
  To: akpm, kirill.shutemov, linux-mm, mhocko, mike.kravetz,
	mm-commits, stable, torvalds, willy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: mm/hugetlb.c: fix pages per hugetlb calculation

The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page.  hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.

Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error.  The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration.  In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE.  And, the vma containing the huge page would also need
to be smaller than PMD_SIZE.
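
As a rough illustration of the miscalculation, here is a userspace sketch
(not kernel code) of the two ways of counting base pages.  The constants
assume 4 KiB base pages and a 2 MiB PMD; the 64 KiB huge page used in the
note below is an assumed example of a size smaller than PMD_SIZE, and all
model_* names are hypothetical.

```c
#include <assert.h>

/* Assumed values for illustration: 4 KiB base pages, 2 MiB PMD. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_PMD_SHIFT    21
#define MODEL_HPAGE_PMD_NR (1UL << (MODEL_PMD_SHIFT - MODEL_PAGE_SHIFT))

/* Models the THP helper: the result ignores the actual hugetlb page size. */
unsigned long model_hpage_nr_pages(void)
{
	return MODEL_HPAGE_PMD_NR;
}

/* Models pages_per_huge_page(): derived from the huge page's own size. */
unsigned long model_pages_per_huge_page(unsigned int huge_page_shift)
{
	assert(huge_page_shift >= MODEL_PAGE_SHIFT);
	return 1UL << (huge_page_shift - MODEL_PAGE_SHIFT);
}

/* Last page offset covered by a huge page that starts at pgoff_start. */
unsigned long model_pgoff_end(unsigned long pgoff_start,
			      unsigned long nr_pages)
{
	return pgoff_start + nr_pages - 1;
}
```

Under these assumptions a 64 KiB huge page (shift 16) covers 16 base pages,
while the THP-style helper reports 512, so the interval searched in the
anon_vma tree would extend well past the end of the page.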

Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-fix-pages-per-hugetlb-calculation
+++ a/mm/hugetlb.c
@@ -1593,7 +1593,7 @@ static struct address_space *_get_hugetl
 
 	/* Use first found vma */
 	pgoff_start = page_to_pgoff(hpage);
-	pgoff_end = pgoff_start + hpage_nr_pages(hpage) - 1;
+	pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
 					pgoff_start, pgoff_end) {
 		struct vm_area_struct *vma = avc->vma;
_

* [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
  2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
@ 2020-07-03 22:15 ` Andrew Morton
  2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch added to -mm tree Andrew Morton
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-03 22:15 UTC
  To: akpm, andreas.schaufler, aslan, guro, Jonathan.Cameron, js1304,
	linux-mm, mhocko, mike.kravetz, mm-commits, riel, robin.murphy,
	song.bao.hua, stable, torvalds

From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/cma.c: use exact_nid true to fix possible per-numa cma leak

Calling cma_declare_contiguous_nid() with exact_nid=false for per-numa
reservation can easily cause a cma leak and various confusion.  For example,
mm/hugetlb.c tries to reserve per-numa cma for gigantic pages, but it
can easily leak cma and confuse users when the system has memoryless
nodes.

Suppose the system has 4 numa nodes and only numa node0 has memory.  If
we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for 4
different numa nodes.  Since exact_nid=false in the current code, all 4 numa
nodes will get cma successfully from node0, but hugetlb_cma[1 to 3] will
never be available to hugepage, as mm/hugetlb.c will only allocate memory
from hugetlb_cma[0].

Suppose instead the system has 4 numa nodes, numa node0 and node2 have
memory, and the other nodes have none.  If we set hugetlb_cma=4G in bootargs,
mm/hugetlb.c will get 4 cma areas for 4 different numa nodes.  Since
exact_nid=false in the current code, all 4 numa nodes will get cma
successfully from node0 or node2, but hugetlb_cma[1] and [3] will never be
available to hugepage, as mm/hugetlb.c will only allocate memory from
hugetlb_cma[0] and hugetlb_cma[2].  This causes a permanent leak of the cma
areas which are supposed to be used by the memoryless nodes.

Of course we can work around the issue by letting mm/hugetlb.c scan all cma
areas in alloc_gigantic_page() even when node_mask includes node0 only.  That
means when node_mask includes node0 only, we can get a page from
hugetlb_cma[1] to hugetlb_cma[3].  But this will cause a kernel crash in
free_gigantic_page() when it wants to free the page by:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)

On the other hand, exact_nid=false doesn't consider numa distance, so it
might not be that useful to leverage cma areas on remote nodes.  I feel it is
much simpler to make exact_nid true to make everything clear.  After that,
memoryless nodes won't be able to reserve per-numa CMA from other nodes
which have memory.
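
The two scenarios above can be sketched with a small userspace model.  This
is only an illustration under assumptions: the model_* helpers are
hypothetical, the fallback simply picks the first node that has memory, and
an area counts as leaked when it was reserved on behalf of a memoryless
node.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 4

/*
 * Hypothetical model of a per-node reservation.  Returns the node that
 * ends up backing the area requested for nid, or -1 if nothing is reserved.
 */
int model_reserve(const bool has_memory[MODEL_MAX_NODES], int nid,
		  bool exact_nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);

	if (has_memory[nid])
		return nid;
	if (exact_nid)
		return -1;	/* no fallback: a memoryless node gets nothing */
	for (int i = 0; i < MODEL_MAX_NODES; i++)
		if (has_memory[i])
			return i;	/* fallback: carved out of another node */
	return -1;
}

/*
 * Counts areas that were reserved for a memoryless node.  Per the
 * description above, such areas are never handed out to hugepages.
 */
int model_count_leaked(const bool has_memory[MODEL_MAX_NODES], bool exact_nid)
{
	int leaked = 0;

	for (int nid = 0; nid < MODEL_MAX_NODES; nid++) {
		int backing = model_reserve(has_memory, nid, exact_nid);

		if (backing >= 0 && !has_memory[nid])
			leaked++;
	}
	return leaked;
}
```

In this model the node0-only system leaks three areas and the node0+node2
system leaks two when exact_nid is false; with exact_nid true neither
leaks, because memoryless nodes reserve nothing.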

Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/cma.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/cma.c~mm-cmac-use-exact_nid-true-to-fix-possible-per-numa-cma-leak
+++ a/mm/cma.c
@@ -339,13 +339,13 @@ int __init cma_declare_contiguous_nid(ph
 		 */
 		if (base < highmem_start && limit > highmem_start) {
 			addr = memblock_alloc_range_nid(size, alignment,
-					highmem_start, limit, nid, false);
+					highmem_start, limit, nid, true);
 			limit = highmem_start;
 		}
 
 		if (!addr) {
 			addr = memblock_alloc_range_nid(size, alignment, base,
-					limit, nid, false);
+					limit, nid, true);
 			if (!addr) {
 				ret = -ENOMEM;
 				goto err;
_

* + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
  2020-07-03 22:15 ` [patch 1/5] mm/hugetlb.c: fix pages per hugetlb calculation Andrew Morton
  2020-07-03 22:15 ` [patch 3/5] mm/cma.c: use exact_nid true to fix possible per-numa cma leak Andrew Morton
@ 2020-07-06 23:50 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-06 23:50 UTC
  To: adilger, cgxu519, chris, dxu, gregkh, hughd, mm-commits, stable,
	tj, viro


The patch titled
     Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
has been added to the -mm tree.  Its filename is
     vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chengguang Xu <cgxu519@mykernel.net>
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way

After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of
kmalloc"), simple xattr entry is allocated with kvmalloc() instead of
kmalloc(), so we should release it with kvfree() instead of kfree().
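
Why the matching free routine matters can be sketched in userspace.  This is
a simplified model, not the kernel allocator: the size threshold, the header
tag and every model_* name are assumptions made for illustration.  The real
kvmalloc() tries kmalloc() first and may fall back to vmalloc(), and the
real kvfree() works for memory from either one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum model_backend { MODEL_SLAB, MODEL_VMALLOC };

/* Header recording which (pretend) allocator produced the block. */
union model_hdr {
	enum model_backend backend;
	max_align_t align;	/* keeps the payload suitably aligned */
};

#define MODEL_SLAB_LIMIT 4096	/* assumed threshold, for illustration only */

/* Models kvmalloc(): small requests are "slab", large ones "vmalloc". */
void *model_kvmalloc(size_t size)
{
	union model_hdr *hdr = malloc(sizeof(*hdr) + size);

	if (!hdr)
		return NULL;
	hdr->backend = size <= MODEL_SLAB_LIMIT ? MODEL_SLAB : MODEL_VMALLOC;
	return hdr + 1;
}

bool model_is_vmalloc(const void *p)
{
	return ((const union model_hdr *)p - 1)->backend == MODEL_VMALLOC;
}

/* Models kfree(): only valid for slab memory.  Returns false on misuse. */
bool model_kfree(void *p)
{
	if (model_is_vmalloc(p))
		return false;	/* wrong allocator: block is not released */
	free((union model_hdr *)p - 1);
	return true;
}

/* Models kvfree(): correct whichever backend produced the block. */
void model_kvfree(void *p)
{
	free((union model_hdr *)p - 1);
}
```

In the model, model_kfree() refuses a block that came from the vmalloc-like
path, which is the mismatch the patch removes by switching to kvfree().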

Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net
Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc")
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Chris Down <chris@chrisdown.name>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[5.7]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/xattr.h |    3 ++-
 mm/shmem.c            |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/include/linux/xattr.h~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/include/linux/xattr.h
@@ -15,6 +15,7 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
+#include <linux/mm.h>
 #include <uapi/linux/xattr.h>
 
 struct inode;
@@ -94,7 +95,7 @@ static inline void simple_xattrs_free(st
 
 	list_for_each_entry_safe(xattr, node, &xattrs->head, list) {
 		kfree(xattr->name);
-		kfree(xattr);
+		kvfree(xattr);
 	}
 }
 
--- a/mm/shmem.c~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/mm/shmem.c
@@ -3178,7 +3178,7 @@ static int shmem_initxattrs(struct inode
 		new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
 					  GFP_KERNEL);
 		if (!new_xattr->name) {
-			kfree(new_xattr);
+			kvfree(new_xattr);
 			return -ENOMEM;
 		}
 
_

Patches currently in -mm which might be from cgxu519@mykernel.net are

vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
mm-shmem-fix-freeing-new_attr-in-shmem_initxattrs.patch


* + fs-minix-check-return-value-of-sb_getblk.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (2 preceding siblings ...)
  2020-07-06 23:50 ` + vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch added to -mm tree Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: check return value of sb_getblk()
has been added to the -mm tree.  Its filename is
     fs-minix-check-return-value-of-sb_getblk.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-check-return-value-of-sb_getblk.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-check-return-value-of-sb_getblk.patch

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: check return value of sb_getblk()

Patch series "fs/minix: fix syzbot bugs and set s_maxbytes".

This series fixes all syzbot bugs in the minix filesystem:

	KASAN: null-ptr-deref Write in get_block
	KASAN: use-after-free Write in get_block
	KASAN: use-after-free Read in get_block
	WARNING in inc_nlink
	KMSAN: uninit-value in get_block
	WARNING in drop_nlink

It also fixes the minix filesystem to set s_maxbytes correctly, so that
userspace sees the correct behavior when exceeding the max file size.


This patch (of 6):

sb_getblk() can fail, so check its return value.

This fixes a NULL pointer dereference.

Originally from Qiujun Huang.

Link: http://lkml.kernel.org/r/20200628060846.682158-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200628060846.682158-2-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: syzbot+4a88b2b9dc280f47baf4@syzkaller.appspotmail.com
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/itree_common.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/fs/minix/itree_common.c~fs-minix-check-return-value-of-sb_getblk
+++ a/fs/minix/itree_common.c
@@ -75,6 +75,7 @@ static int alloc_branch(struct inode *in
 	int n = 0;
 	int i;
 	int parent = minix_new_block(inode);
+	int err = -ENOSPC;
 
 	branch[0].key = cpu_to_block(parent);
 	if (parent) for (n = 1; n < num; n++) {
@@ -85,6 +86,11 @@ static int alloc_branch(struct inode *in
 			break;
 		branch[n].key = cpu_to_block(nr);
 		bh = sb_getblk(inode->i_sb, parent);
+		if (!bh) {
+			minix_free_block(inode, nr);
+			err = -ENOMEM;
+			break;
+		}
 		lock_buffer(bh);
 		memset(bh->b_data, 0, bh->b_size);
 		branch[n].bh = bh;
@@ -103,7 +109,7 @@ static int alloc_branch(struct inode *in
 		bforget(branch[i].bh);
 	for (i = 0; i < n; i++)
 		minix_free_block(inode, block_to_cpu(branch[i].key));
-	return -ENOSPC;
+	return err;
 }
 
 static inline int splice_branch(struct inode *inode,
_

Patches currently in -mm which might be from ebiggers@google.com are

fs-minix-check-return-value-of-sb_getblk.patch
fs-minix-dont-allow-getting-deleted-inodes.patch
fs-minix-reject-too-large-maximum-file-size.patch
fs-minix-set-s_maxbytes-correctly.patch
fs-minix-fix-block-limit-check-for-v1-filesystems.patch
fs-minix-remove-expected-error-message-in-block_to_path.patch


* + fs-minix-dont-allow-getting-deleted-inodes.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (3 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-check-return-value-of-sb_getblk.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: don't allow getting deleted inodes
has been added to the -mm tree.  Its filename is
     fs-minix-dont-allow-getting-deleted-inodes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-dont-allow-getting-deleted-inodes.patch

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: don't allow getting deleted inodes

If an inode has no links, we need to mark it bad rather than allowing it
to be accessed.  This avoids WARNINGs in inc_nlink() and drop_nlink() when
doing directory operations on a fuzzed filesystem.

Link: http://lkml.kernel.org/r/20200628060846.682158-3-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+a9ac3de1b5de5fb10efc@syzkaller.appspotmail.com
Reported-by: syzbot+df958cf5688a96ad3287@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/inode.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/fs/minix/inode.c~fs-minix-dont-allow-getting-deleted-inodes
+++ a/fs/minix/inode.c
@@ -468,6 +468,13 @@ static struct inode *V1_minix_iget(struc
 		iget_failed(inode);
 		return ERR_PTR(-EIO);
 	}
+	if (raw_inode->i_nlinks == 0) {
+		printk("MINIX-fs: deleted inode referenced: %lu\n",
+		       inode->i_ino);
+		brelse(bh);
+		iget_failed(inode);
+		return ERR_PTR(-ESTALE);
+	}
 	inode->i_mode = raw_inode->i_mode;
 	i_uid_write(inode, raw_inode->i_uid);
 	i_gid_write(inode, raw_inode->i_gid);
@@ -501,6 +508,13 @@ static struct inode *V2_minix_iget(struc
 		iget_failed(inode);
 		return ERR_PTR(-EIO);
 	}
+	if (raw_inode->i_nlinks == 0) {
+		printk("MINIX-fs: deleted inode referenced: %lu\n",
+		       inode->i_ino);
+		brelse(bh);
+		iget_failed(inode);
+		return ERR_PTR(-ESTALE);
+	}
 	inode->i_mode = raw_inode->i_mode;
 	i_uid_write(inode, raw_inode->i_uid);
 	i_gid_write(inode, raw_inode->i_gid);
_

* + fs-minix-reject-too-large-maximum-file-size.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (4 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-dont-allow-getting-deleted-inodes.patch " Andrew Morton
@ 2020-07-07 19:25 ` Andrew Morton
  2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-07 19:25 UTC
  To: anenbupt, ebiggers, mm-commits, stable, viro


The patch titled
     Subject: fs/minix: reject too-large maximum file size
has been added to the -mm tree.  Its filename is
     fs-minix-reject-too-large-maximum-file-size.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-minix-reject-too-large-maximum-file-size.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-minix-reject-too-large-maximum-file-size.patch

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/minix: reject too-large maximum file size

If the minix filesystem tries to map a very large logical block number to
its on-disk location, block_to_path() can return offsets that are too
large, causing out-of-bounds memory accesses when accessing indirect index
blocks.  This should be prevented by the check against the maximum file
size, but this doesn't work because the maximum file size is read directly
from the on-disk superblock and isn't validated itself.

Fix this by validating the maximum file size at mount time.
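
The V1 limit used by the check works out as follows.  This is a userspace
sketch with hypothetical model_* names; it assumes BLOCK_SIZE is 1024 and
mirrors the expression (7 + 512 + 512*512) * BLOCK_SIZE from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BLOCK_SIZE 1024u	/* assumed value of BLOCK_SIZE */

/* Largest byte size the V1 block map can address, per the patch's check. */
uint64_t model_v1_max_bytes(void)
{
	uint64_t blocks = 7 + 512 + 512 * 512;	/* 262663 blocks */

	return blocks * MODEL_BLOCK_SIZE;
}

/* Models the s_max_size part of the superblock validation. */
bool model_check_max_size(int version, uint64_t s_max_size)
{
	assert(version >= 1 && version <= 3);

	/* Only V1 is checked; V2/V3 have an extra level of indirection. */
	if (version == 1 && s_max_size > model_v1_max_bytes())
		return false;
	return true;
}
```

With these assumptions the V1 ceiling is 262663 * 1024 = 268966912 bytes, so
a V1 superblock advertising anything larger would be rejected at mount time.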

Link: http://lkml.kernel.org/r/20200628060846.682158-4-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+c7d9ec7a1a7272dd71b3@syzkaller.appspotmail.com
Reported-by: syzbot+3b7b03a0c28948054fb5@syzkaller.appspotmail.com
Reported-by: syzbot+6e056ee473568865f3e6@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/minix/inode.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

--- a/fs/minix/inode.c~fs-minix-reject-too-large-maximum-file-size
+++ a/fs/minix/inode.c
@@ -150,6 +150,23 @@ static int minix_remount (struct super_b
 	return 0;
 }
 
+static bool minix_check_superblock(struct minix_sb_info *sbi)
+{
+	if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
+		return false;
+
+	/*
+	 * s_max_size must not exceed the block mapping limitation.  This check
+	 * is only needed for V1 filesystems, since V2/V3 support an extra level
+	 * of indirect blocks which places the limit well above U32_MAX.
+	 */
+	if (sbi->s_version == MINIX_V1 &&
+	    sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+		return false;
+
+	return true;
+}
+
 static int minix_fill_super(struct super_block *s, void *data, int silent)
 {
 	struct buffer_head *bh;
@@ -228,11 +245,12 @@ static int minix_fill_super(struct super
 	} else
 		goto out_no_fs;
 
+	if (!minix_check_superblock(sbi))
+		goto out_illegal_sb;
+
 	/*
 	 * Allocate the buffer map to keep the superblock small.
 	 */
-	if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
-		goto out_illegal_sb;
 	i = (sbi->s_imap_blocks + sbi->s_zmap_blocks) * sizeof(bh);
 	map = kzalloc(i, GFP_KERNEL);
 	if (!map)
_

* + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (5 preceding siblings ...)
  2020-07-07 19:25 ` + fs-minix-reject-too-large-maximum-file-size.patch " Andrew Morton
@ 2020-07-07 22:18 ` Andrew Morton
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-07 22:18 UTC
  To: alex.shi, hannes, hughd, mhocko, mm-commits, shakeelb, stable


The patch titled
     Subject: mm/memcg: fix refcount error while moving and swapping
has been added to the -mm tree.  Its filename is
     mm-memcg-fix-refcount-error-while-moving-and-swapping.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-refcount-error-while-moving-and-swapping.patch

------------------------------------------------------
From: Hugh Dickins <hughd@google.com>
Subject: mm/memcg: fix refcount error while moving and swapping

It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.

This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down to 0 (which should only be possible when
offlining).

Just skip that optimization: do that part of the accounting immediately.
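
The ordering problem can be modelled in userspace for a single swap entry.
This is a deliberately simplified sketch: the model_* names are
hypothetical, the starting count of 1 stands for the reference held while
the memcg is online, and the "concurrent" free is just run in sequence.

```c
#include <assert.h>
#include <stdbool.h>

struct model_memcg {
	int id_ref;
};

/* Taking references on a count of zero is the "impossible" case. */
bool model_id_get_many(struct model_memcg *memcg, int n)
{
	if (memcg->id_ref == 0)
		return false;
	memcg->id_ref += n;
	return true;
}

void model_id_put(struct model_memcg *memcg)
{
	assert(memcg->id_ref > 0);
	memcg->id_ref--;
}

/* Returns true if the id refcount stays consistent through the move. */
bool model_move_one_swap_entry(bool deferred)
{
	struct model_memcg to = { .id_ref = 1 };
	bool ok = true;

	/* The swap entry's id has just been switched over to "to". */
	if (!deferred)
		ok = model_id_get_many(&to, 1);

	/* Swap is freed before the task scan finishes: drop the entry's ref. */
	model_id_put(&to);

	/* Old behaviour: the reference is only taken when the move completes. */
	if (deferred)
		ok = model_id_get_many(&to, 1);

	return ok && to.id_ref == 1;
}
```

In the model, the deferred variant reaches zero before its reference is
taken, while the immediate variant keeps the count at 1 throughout.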

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-fix-refcount-error-while-moving-and-swapping
+++ a/mm/memcontrol.c
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
 
-		mem_cgroup_id_get_many(mc.to, mc.moved_swap);
 		css_put_many(&mc.to->css, mc.moved_swap);
 
 		mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put:			/* get_mctgt_type() gets the page
 			ent = target.ent;
 			if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
 				mc.precharge--;
-				/* we fixup refcnts and charges later. */
+				mem_cgroup_id_get_many(mc.to, 1);
+				/* we fixup other refcnts and charges later. */
 				mc.moved_swap++;
 			}
 			break;
_

Patches currently in -mm which might be from hughd@google.com are

mm-memcg-fix-refcount-error-while-moving-and-swapping.patch


* + mm-close-race-between-munmap-and-expand_upwards-downwards.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (6 preceding siblings ...)
  2020-07-07 22:18 ` + mm-memcg-fix-refcount-error-while-moving-and-swapping.patch " Andrew Morton
@ 2020-07-10  0:23 ` Andrew Morton
  2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from " Andrew Morton
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-10  0:23 UTC
  To: jannh, kirill.shutemov, mm-commits, oleg, stable, vbabka, willy,
	yang.shi


The patch titled
     Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
has been added to the -mm tree.  Its filename is
     mm-close-race-between-munmap-and-expand_upwards-downwards.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-close-race-between-munmap-and-expand_upwards-downwards.patch

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()

VMAs with the VM_GROWSDOWN or VM_GROWSUP flag set can change their size
under mmap_read_lock().  This can lead to a race with __do_munmap():

	Thread A			Thread B
__do_munmap()
  detach_vmas_to_be_unmapped()
  mmap_write_downgrade()
				expand_downwards()
				  vma->vm_start = address;
				  // The VMA now overlaps with
				  // VMAs detached by the Thread A
				// page fault populates expanded part
				// of the VMA
  unmap_region()
    // Zaps pagetables partly
    // populated by Thread B

A similar race exists for expand_upwards().

The fix is to avoid downgrading mmap_lock in __do_munmap() if the detached
VMAs are next to a VM_GROWSDOWN or VM_GROWSUP VMA.
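
The decision can be sketched as a small standalone predicate.  This is a
userspace model only: the flag values and model_* names are hypothetical,
and it assumes "next" is the VMA following the detached range and "prev"
the one preceding it, either of which may be absent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the kernel's VM_* constants differ. */
#define MODEL_VM_GROWSDOWN 0x1u
#define MODEL_VM_GROWSUP   0x2u

struct model_vma {
	unsigned int vm_flags;
};

/*
 * Returns true if the write lock may be downgraded before unmapping,
 * i.e. no neighbour can grow into the detached range under the read lock.
 */
bool model_may_downgrade(const struct model_vma *next,
			 const struct model_vma *prev)
{
	/* A stack-like VMA above the range could expand downwards into it. */
	if (next && (next->vm_flags & MODEL_VM_GROWSDOWN))
		return false;
	/* A VMA below the range could expand upwards into it. */
	if (prev && (prev->vm_flags & MODEL_VM_GROWSUP))
		return false;
	return true;
}
```

Only the neighbour that can grow towards the detached range matters: a
VM_GROWSDOWN VMA below the range, or a VM_GROWSUP VMA above it, grows away
from it and does not block the downgrade in this model.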

Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards
+++ a/mm/mmap.c
@@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struc
  * Create a list of vma's touched by the unmap, removing them from the mm's
  * vma list as we go..
  */
-static void
+static bool
 detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct vm_area_struct *prev, unsigned long end)
 {
@@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_str
 
 	/* Kill the cache */
 	vmacache_invalidate(mm);
+
+	/*
+	 * Do not downgrade mmap_sem if we are next to VM_GROWSDOWN or
+	 * VM_GROWSUP VMA. Such VMAs can change their size under
+	 * down_read(mmap_sem) and collide with the VMA we are about to unmap.
+	 */
+	if (vma && (vma->vm_flags & VM_GROWSDOWN))
+		return false;
+	if (prev && (prev->vm_flags & VM_GROWSUP))
+		return false;
+	return true;
 }
 
 /*
@@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, un
 	}
 
 	/* Detach vmas from rbtree */
-	detach_vmas_to_be_unmapped(mm, vma, prev, end);
+	if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
+		downgrade = false;
 
 	if (downgrade)
 		mmap_write_downgrade(mm);
_

Patches currently in -mm which might be from kirill.shutemov@linux.intel.com are

mm-close-race-between-munmap-and-expand_upwards-downwards.patch


* [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (7 preceding siblings ...)
  2020-07-10  0:23 ` + mm-close-race-between-munmap-and-expand_upwards-downwards.patch " Andrew Morton
@ 2020-07-10 23:27 ` Andrew Morton
  2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-10 23:27 UTC
  To: guro, jonathan.cameron, mike.kravetz, mm-commits, rppt,
	song.bao.hua, stable


The patch titled
     Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been removed from the -mm tree.  Its filename was
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled

hugetlb_cma[0] can be NULL for various reasons, for example when node 0 has
no memory.  So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is not
enabled; gigantic pages might have been reserved on other nodes.

Mike Kravetz said:

: Based on the code changes, I believe the following could happen:
: - Someone uses 'hugetlb_cma=' kernel command line parameter to reserve
:   CMA for gigantic pages.
: - The system topology is such that no memory is on node 0.  Therefore,
:   no CMA can be reserved for gigantic pages on node 0.  CMA is reserved
:   on other nodes.
: - The user also specifies a number of gigantic pages to pre-allocate on
:   the command line with hugepagesz=<gigantic_page_size> hugepages=<N>
: - The routine which allocates gigantic pages from the bootmem allocator
:   will not detect CMA has been reserved as there is no memory on node 0.
:   Therefore, pages will be pre-allocated from bootmem allocator as well
:   as reserved in CMA.
: 
: This double allocation (bootmem and CMA) is the worst case scenario.  Not
: sure if this is what Barry saw, and I suspect this would rarely happen.
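
The scenario above comes down to testing only the first slot of a per-node
array.  A minimal userspace sketch of the difference, assuming a toy
four-node system; MAX_NODES, struct cma_area and both helper names are
made up for illustration and are not the kernel's definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4	/* illustrative, not the kernel's MAX_NUMNODES */

struct cma_area { unsigned long nr_pages; };	/* stand-in for struct cma */

/* Node-0-only check: misses reservations made on other nodes. */
static bool cma_enabled_node0_only(struct cma_area *areas[MAX_NODES])
{
	return areas[0] != NULL;
}

/* Scan every node, so a memoryless node 0 cannot hide a reservation. */
static bool cma_enabled_any_node(struct cma_area *areas[MAX_NODES])
{
	for (int node = 0; node < MAX_NODES; node++) {
		if (areas[node])
			return true;
	}
	return false;
}
```

With an area present only on node 1, the first helper reports "not
enabled" while the second reports "enabled", which is the gap that lets
the bootmem pre-allocation proceed alongside the CMA reservation.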

Link: http://lkml.kernel.org/r/20200707040204.30132-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable
+++ a/mm/hugetlb.c
@@ -2546,6 +2546,20 @@ static void __init gather_bootmem_preall
 	}
 }
 
+bool __init hugetlb_cma_enabled(void)
+{
+#ifdef CONFIG_CMA
+	int node;
+
+	for_each_online_node(node) {
+		if (hugetlb_cma[node])
+			return true;
+	}
+#endif
+
+	return false;
+}
+
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
@@ -2571,7 +2585,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_enabled()) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch


* + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (8 preceding siblings ...)
  2020-07-10 23:27 ` [to-be-updated] mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enable.patch removed from " Andrew Morton
@ 2020-07-10 23:29 ` Andrew Morton
  2020-07-16 21:28 ` + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch " Andrew Morton
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-10 23:29 UTC (permalink / raw)
  To: guro, jonathan.cameron, mike.kravetz, mm-commits, song.bao.hua, stable


The patch titled
     Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been added to the -mm tree.  Its filename is
     mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Barry Song <song.bao.hua@hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled

hugetlb_cma[0] can be NULL for various reasons, for example when node 0 has
no memory.  So a NULL hugetlb_cma[0] doesn't necessarily mean CMA is not
enabled; gigantic pages might have been reserved on other nodes.  This
patch fixes a possible double reservation and CMA leak.

Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled
+++ a/mm/hugetlb.c
@@ -46,6 +46,7 @@ unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 
 static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size __initdata;
 
 /*
  * Minimum page order among possible hugepage sizes, set to a proper value
@@ -2571,7 +2572,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+			if (hugetlb_cma_size) {
 				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 				break;
 			}
@@ -5654,7 +5655,6 @@ void move_hugetlb_state(struct page *old
 }
 
 #ifdef CONFIG_CMA
-static unsigned long hugetlb_cma_size __initdata;
 static bool cma_reserve_called __initdata;
 
 static int __init cmdline_parse_hugetlb_cma(char *p)
_

Patches currently in -mm which might be from song.bao.hua@hisilicon.com are

mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch


* + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (9 preceding siblings ...)
  2020-07-10 23:29 ` + mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch added to " Andrew Morton
@ 2020-07-16 21:28 ` Andrew Morton
  2020-07-21 20:49 ` + fork-silence-a-false-postive-warning-in-__mmdrop.patch " Andrew Morton
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-16 21:28 UTC (permalink / raw)
  To: cl, guro, iamjoonsoo.kim, mm-commits, penberg, rientjes,
	shakeelb, songmuchun, stable, vbabka


The patch titled
     Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
has been added to the -mm tree.  Its filename is
     mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch

------------------------------------------------------
From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy

If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying.  If we mark the root kmem_cache as dying
incorrectly, the non-root kmem_cache can never be destroyed, which results
in a memory leak when the memcg is destroyed.  The following steps
reproduce the problem.

  1) Use kmem_cache_create() to create a new kmem_cache named A.
  2) Coincidentally, kmem_cache A is an alias for kmem_cache B,
     so only the refcount of B is increased.
  3) Use kmem_cache_destroy() to destroy kmem_cache A.  This only
     decreases B's refcount, but it also marks B as dying.
  4) Create a new memory cgroup and allocate memory from kmem_cache
     B.  This creates a non-root kmem_cache for the allocation.
  5) When the memory cgroup created in step 4) is destroyed, the
     non-root kmem_cache can never be destroyed.

Repeating steps 4) and 5) leaks more memory each time.  So mark the root
kmem_cache as dying only when its refcount reaches zero.
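
The ordering problem in step 3) can be sketched in userspace as follows.
This is only a toy model under stated assumptions: struct toy_cache and
both destroy helpers are hypothetical names, and the real code also
involves locking and workqueue flushing that is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a root kmem_cache; not the kernel's struct. */
struct toy_cache {
	int refcount;	/* number of users, including aliases */
	bool dying;	/* set when the cache is being torn down */
};

/* Problematic ordering: mark dying before looking at the refcount. */
static void toy_destroy_buggy(struct toy_cache *s)
{
	s->dying = true;
	s->refcount--;
	if (s->refcount)
		return;		/* still in use, yet already marked dying */
	/* real teardown would happen here */
}

/* Intended ordering: only the last reference marks the cache dying. */
static void toy_destroy_fixed(struct toy_cache *s)
{
	s->refcount--;
	if (s->refcount)
		return;		/* aliases remain, leave the cache alone */
	s->dying = true;
	/* real teardown would happen here */
}
```

Starting from a refcount of two, the first helper leaves a live cache
flagged as dying, while the second sets the flag only on the final
destroy.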

Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab_common.c |   35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

--- a/mm/slab_common.c~mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy
+++ a/mm/slab_common.c
@@ -326,6 +326,14 @@ int slab_unmergeable(struct kmem_cache *
 	if (s->refcount < 0)
 		return 1;
 
+#ifdef CONFIG_MEMCG_KMEM
+	/*
+	 * Skip the dying kmem_cache.
+	 */
+	if (s->memcg_params.dying)
+		return 1;
+#endif
+
 	return 0;
 }
 
@@ -886,12 +894,15 @@ static int shutdown_memcg_caches(struct
 	return 0;
 }
 
-static void flush_memcg_workqueue(struct kmem_cache *s)
+static void memcg_set_kmem_cache_dying(struct kmem_cache *s)
 {
 	spin_lock_irq(&memcg_kmem_wq_lock);
 	s->memcg_params.dying = true;
 	spin_unlock_irq(&memcg_kmem_wq_lock);
+}
 
+static void flush_memcg_workqueue(struct kmem_cache *s)
+{
 	/*
 	 * SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
 	 * sure all registered rcu callbacks have been invoked.
@@ -923,10 +934,6 @@ static inline int shutdown_memcg_caches(
 {
 	return 0;
 }
-
-static inline void flush_memcg_workqueue(struct kmem_cache *s)
-{
-}
 #endif /* CONFIG_MEMCG_KMEM */
 
 void slab_kmem_cache_release(struct kmem_cache *s)
@@ -944,8 +951,6 @@ void kmem_cache_destroy(struct kmem_cach
 	if (unlikely(!s))
 		return;
 
-	flush_memcg_workqueue(s);
-
 	get_online_cpus();
 	get_online_mems();
 
@@ -955,6 +960,22 @@ void kmem_cache_destroy(struct kmem_cach
 	if (s->refcount)
 		goto out_unlock;
 
+#ifdef CONFIG_MEMCG_KMEM
+	memcg_set_kmem_cache_dying(s);
+
+	mutex_unlock(&slab_mutex);
+
+	put_online_mems();
+	put_online_cpus();
+
+	flush_memcg_workqueue(s);
+
+	get_online_cpus();
+	get_online_mems();
+
+	mutex_lock(&slab_mutex);
+#endif
+
 	err = shutdown_memcg_caches(s);
 	if (!err)
 		err = shutdown_cache(s);
_

Patches currently in -mm which might be from songmuchun@bytedance.com are

mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch


* + fork-silence-a-false-postive-warning-in-__mmdrop.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (10 preceding siblings ...)
  2020-07-16 21:28 ` + mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch " Andrew Morton
@ 2020-07-21 20:49 ` Andrew Morton
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure.patch " Andrew Morton
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-21 20:49 UTC (permalink / raw)
  To: cai, mm-commits, mpe, peterz, stable


The patch titled
     Subject: fork: silence a false postive warning in __mmdrop
has been added to the -mm tree.  Its filename is
     fork-silence-a-false-postive-warning-in-__mmdrop.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fork-silence-a-false-postive-warning-in-__mmdrop.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fork-silence-a-false-postive-warning-in-__mmdrop.patch

------------------------------------------------------
From: Qian Cai <cai@lca.pw>
Subject: fork: silence a false postive warning in __mmdrop

Commit bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
moved the assignment

idle->active_mm = &init_mm;

from idle_task_exit() into finish_cpu(), which results in a false
positive from the warning introduced by commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").

 WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
 __mmdrop+0x230/0x2c0
 do_exit+0x424/0xfa0
 Call Trace:
 do_exit+0x424/0xfa0
 do_group_exit+0x64/0xd0
 sys_exit_group+0x24/0x30
 system_call_exception+0x108/0x1d0
 system_call_common+0xf0/0x278

Link: http://lkml.kernel.org/r/20200604150344.1796-1-cai@lca.pw
Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai <cai@lca.pw>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/fork.c |    1 -
 1 file changed, 1 deletion(-)

--- a/kernel/fork.c~fork-silence-a-false-postive-warning-in-__mmdrop
+++ a/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
 {
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
-	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	mmu_notifier_subscriptions_destroy(mm);
_

Patches currently in -mm which might be from cai@lca.pw are

fork-silence-a-false-postive-warning-in-__mmdrop.patch
mm-page_alloc-silence-a-kasan-false-positive.patch
mm-kmemleak-silence-kcsan-splats-in-checksum.patch
mm-frontswap-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races-v2.patch
mm-swap_state-mark-various-intentional-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races-v2.patch
mm-page_counter-fix-various-data-races-at-memsw.patch
mm-memcontrol-fix-a-data-race-in-scan-count.patch
mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
mm-mempool-fix-a-data-race-in-mempool_free.patch
mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
mm-annotate-a-data-race-in-page_zonenum.patch


* + io-mapping-indicate-mapping-failure.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (11 preceding siblings ...)
  2020-07-21 20:49 ` + fork-silence-a-false-postive-warning-in-__mmdrop.patch " Andrew Morton
@ 2020-07-21 20:57 ` Andrew Morton
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch " Andrew Morton
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-21 20:57 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, chris, michael.j.ruhl, mm-commits, rppt, stable


The patch titled
     Subject: io-mapping: indicate mapping failure
has been added to the -mm tree.  Its filename is
     io-mapping-indicate-mapping-failure.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/io-mapping-indicate-mapping-failure.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/io-mapping-indicate-mapping-failure.patch

------------------------------------------------------
From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Subject: io-mapping: indicate mapping failure

The !ATOMIC_IOMAP version of io_mapping_init_wc will always return success,
even when the ioremap fails.

Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error, this is unexpected.

During a device probe, where the ioremap failed, a crash can look
like this:

BUG: unable to handle page fault for address: 0000000000210000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 177 Comm:
 RIP: 0010:fill_page_dma [i915]
  gen8_ppgtt_create [i915]
  i915_ppgtt_create [i915]
  intel_gt_init [i915]
  i915_gem_init [i915]
  i915_driver_probe [i915]
  pci_device_probe
  really_probe
  driver_probe_device

The remap failure occurred much earlier in the probe.  If it had
been propagated, the driver would have exited with an error.

Return NULL on ioremap failure.
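
The pattern can be sketched in userspace like this.  It is a toy model
only: struct toy_io_mapping, toy_remap() and toy_io_mapping_init() are
hypothetical names, and the should_fail argument exists purely to
simulate a failed remap.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct io_mapping; only the field the fix inspects. */
struct toy_io_mapping {
	void *iomem;
};

/* Hypothetical remap helper: returns NULL to simulate a failed ioremap. */
static void *toy_remap(void *base, int should_fail)
{
	return should_fail ? NULL : base;
}

/* Report failure to the caller instead of handing back an unusable mapping. */
static struct toy_io_mapping *
toy_io_mapping_init(struct toy_io_mapping *iomap, void *base, int should_fail)
{
	iomap->iomem = toy_remap(base, should_fail);
	return iomap->iomem ? iomap : NULL;
}
```

A caller that already checks the return value for NULL then bails out at
init time, rather than faulting later when the missing mapping is first
used.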

Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/io-mapping.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure
+++ a/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *io
 	iomap->prot = pgprot_noncached(PAGE_KERNEL);
 #endif
 
-	return iomap;
+	return iomap->iomem ? iomap : NULL;
 }
 
 static inline void
_

Patches currently in -mm which might be from michael.j.ruhl@intel.com are

io-mapping-indicate-mapping-failure.patch


* + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (12 preceding siblings ...)
  2020-07-21 20:57 ` + io-mapping-indicate-mapping-failure.patch " Andrew Morton
@ 2020-07-21 21:06 ` Andrew Morton
  2020-07-24  1:09 ` + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch " Andrew Morton
  2020-07-24  2:53 ` + khugepaged-fix-null-pointer-dereference-due-to-race.patch " Andrew Morton
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-21 21:06 UTC (permalink / raw)
  To: axboe, hch, jannh, keescook, luto, mathieu.desnoyers, mm-commits,
	npiggin, peterz, stable, will


The patch titled
     Subject: mm: fix kthread_use_mm() vs TLB invalidate
has been added to the -mm tree.  Its filename is
     mm-fix-kthread_use_mm-vs-tlb-invalidate.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-kthread_use_mm-vs-tlb-invalidate.patch

------------------------------------------------------
From: Peter Zijlstra <peterz@infradead.org>
Subject: mm: fix kthread_use_mm() vs TLB invalidate

For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable.  This then presents the
following race condition:

  CPU0			CPU1

  flush_tlb_mm(mm)	use_mm(mm)
    <send-IPI>
			  tsk->active_mm = mm;
			  <IPI>
			    if (tsk->active_mm == mm)
			      // flush TLBs
			  </IPI>
			  switch_mm(old_mm,mm,tsk);

Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.

Avoid this by disabling IRQs across changing ->active_mm and switch_mm().

[ There are all sorts of architecture specific reasons this might be
harmless, but best not leave the door open at all.  ]
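
The interleaving can be replayed deterministically in a small userspace
model.  This is a sketch under heavy assumptions: struct toy_cpu and the
helpers below are invented for illustration, a single pending flag stands
in for the IPI, and the "worst moment" delivery point is hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; all names are illustrative, not kernel APIs. */
struct toy_cpu {
	int active_mm;		/* what the task claims to be using */
	int loaded_mm;		/* address space the hardware has loaded */
	bool irqs_enabled;
	bool ipi_pending;	/* a flush request for ipi_mm was sent */
	int ipi_mm;
	int flushed_mm;		/* address space that got flushed, -1 if none */
};

/* IPI handler: flush whatever is loaded if active_mm matches the request. */
static void toy_deliver_ipi(struct toy_cpu *cpu)
{
	if (!cpu->ipi_pending || !cpu->irqs_enabled)
		return;
	cpu->ipi_pending = false;
	if (cpu->active_mm == cpu->ipi_mm)
		cpu->flushed_mm = cpu->loaded_mm;
}

/* Adopt mm; without IRQs disabled there is a window between the two stores. */
static void toy_use_mm(struct toy_cpu *cpu, int mm, bool disable_irqs)
{
	if (disable_irqs)
		cpu->irqs_enabled = false;
	cpu->active_mm = mm;
	toy_deliver_ipi(cpu);	/* models the IPI arriving at the worst moment */
	cpu->loaded_mm = mm;	/* stands in for switch_mm() */
	if (disable_irqs)
		cpu->irqs_enabled = true;
	toy_deliver_ipi(cpu);	/* anything held back is delivered now */
}
```

With disable_irqs false, the handler runs while the old address space is
still loaded and flushes that one; with disable_irqs true, delivery is
held back until after the switch and the flush hits the new one.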

Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass.net
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kthread.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate
+++ a/kernel/kthread.c
@@ -1239,13 +1239,15 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
 		mmgrab(mm);
 		tsk->active_mm = mm;
 	}
 	tsk->mm = mm;
-	switch_mm(active_mm, mm, tsk);
+	switch_mm_irqs_off(active_mm, mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 #ifdef finish_arch_post_lock_switch
 	finish_arch_post_lock_switch();
@@ -1274,9 +1276,11 @@ void kthread_unuse_mm(struct mm_struct *
 
 	task_lock(tsk);
 	sync_mm_rss(mm);
+	local_irq_disable();
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
_

Patches currently in -mm which might be from peterz@infradead.org are

mm-fix-kthread_use_mm-vs-tlb-invalidate.patch


* + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (13 preceding siblings ...)
  2020-07-21 21:06 ` + mm-fix-kthread_use_mm-vs-tlb-invalidate.patch " Andrew Morton
@ 2020-07-24  1:09 ` Andrew Morton
  2020-07-24  2:53 ` + khugepaged-fix-null-pointer-dereference-due-to-race.patch " Andrew Morton
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-24  1:09 UTC (permalink / raw)
  To: aneesh.kumar, guro, hch, iamjoonsoo.kim, mhocko, mike.kravetz,
	mm-commits, n-horiguchi, stable, vbabka


The patch titled
     Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
has been added to the -mm tree.  Its filename is
     mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch

------------------------------------------------------
From: js1304@gmail.com
Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs


Currently, the memalloc_nocma_{save/restore} API, which keeps page
allocations out of the CMA area, is implemented using
current_gfp_context().  However, there are two problems with this
implementation.

First, it doesn't work for the allocation fastpath.  The fastpath uses the
original gfp_mask, since current_gfp_context() was introduced to control
reclaim and is applied only on the slowpath.  So pages from the CMA area
can be allocated through the fastpath even when the
memalloc_nocma_{save/restore} APIs are used.  Currently, there is just
one user of these APIs and it has a fallback method that prevents an
actual problem.

Second, clearing __GFP_MOVABLE in current_gfp_context() has the side effect
of excluding ZONE_MOVABLE memory as an allocation target.

To fix these problems, this patch changes how the CMA area is excluded
from page allocation.  The main point of the change is to use
alloc_flags, which mainly controls allocation behaviour and so is a good
fit for excluding the CMA area.
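
The difference between the two approaches can be sketched in userspace as
below.  The TOY_* constants use made-up bit values and the helper names
are hypothetical; only the shape of the decision is meant to match the
description above.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel's real constants differ. */
#define TOY_GFP_MOVABLE		0x1u
#define TOY_PF_NOCMA		0x1u
#define TOY_ALLOC_CMA		0x1u

/* Old approach: strip the movable bit from the gfp mask of a no-CMA task. */
static unsigned int toy_gfp_context_old(unsigned int gfp, unsigned int pflags)
{
	if (pflags & TOY_PF_NOCMA)
		gfp &= ~TOY_GFP_MOVABLE;
	return gfp;
}

/* New approach: leave gfp alone and decide CMA eligibility in alloc_flags. */
static unsigned int toy_alloc_flags_new(unsigned int gfp, unsigned int pflags,
					unsigned int alloc_flags)
{
	if (!(pflags & TOY_PF_NOCMA) && (gfp & TOY_GFP_MOVABLE))
		alloc_flags |= TOY_ALLOC_CMA;
	return alloc_flags;
}
```

In the old helper a no-CMA task loses the movable bit altogether, so the
request no longer looks movable to later code; in the new helper the gfp
mask is untouched and only the CMA permission is withheld.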

Link: http://lkml.kernel.org/r/1595468942-29687-1-git-send-email-iamjoonsoo.kim@lge.com
Fixes: d7fefcc8de91 ("mm/cma: add PF flag to force non cma alloc")
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/sched/mm.h |    8 +-------
 mm/page_alloc.c          |   31 +++++++++++++++++++++----------
 2 files changed, 22 insertions(+), 17 deletions(-)

--- a/include/linux/sched/mm.h~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/include/linux/sched/mm.h
@@ -177,12 +177,10 @@ static inline bool in_vfork(struct task_
  * Applies per-task gfp context to the given allocation flags.
  * PF_MEMALLOC_NOIO implies GFP_NOIO
  * PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
  */
 static inline gfp_t current_gfp_context(gfp_t flags)
 {
-	if (unlikely(current->flags &
-		     (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+	if (unlikely(current->flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
 		/*
 		 * NOIO implies both NOIO and NOFS and it is a weaker context
 		 * so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(
 			flags &= ~(__GFP_IO | __GFP_FS);
 		else if (current->flags & PF_MEMALLOC_NOFS)
 			flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
-		if (current->flags & PF_MEMALLOC_NOCMA)
-			flags &= ~__GFP_MOVABLE;
-#endif
 	}
 	return flags;
 }
--- a/mm/page_alloc.c~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/mm/page_alloc.c
@@ -2790,7 +2790,7 @@ __rmqueue(struct zone *zone, unsigned in
 	 * allocating from CMA when over half of the zone's free memory
 	 * is in the CMA area.
 	 */
-	if (migratetype == MIGRATE_MOVABLE &&
+	if (alloc_flags & ALLOC_CMA &&
 	    zone_page_state(zone, NR_FREE_CMA_PAGES) >
 	    zone_page_state(zone, NR_FREE_PAGES) / 2) {
 		page = __rmqueue_cma_fallback(zone, order);
@@ -2801,7 +2801,7 @@ __rmqueue(struct zone *zone, unsigned in
 retry:
 	page = __rmqueue_smallest(zone, order, migratetype);
 	if (unlikely(!page)) {
-		if (migratetype == MIGRATE_MOVABLE)
+		if (alloc_flags & ALLOC_CMA)
 			page = __rmqueue_cma_fallback(zone, order);
 
 		if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3671,6 +3671,20 @@ alloc_flags_nofragment(struct zone *zone
 	return alloc_flags;
 }
 
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+					unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+	unsigned int pflags = current->flags;
+
+	if (!(pflags & PF_MEMALLOC_NOCMA) &&
+			gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+		alloc_flags |= ALLOC_CMA;
+
+#endif
+	return alloc_flags;
+}
+
 /*
  * get_page_from_freelist goes through the zonelist trying to allocate
  * a page.
@@ -4316,10 +4330,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	} else if (unlikely(rt_task(current)) && !in_interrupt())
 		alloc_flags |= ALLOC_HARDER;
 
-#ifdef CONFIG_CMA
-	if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
-		alloc_flags |= ALLOC_CMA;
-#endif
+	alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
 	return alloc_flags;
 }
 
@@ -4620,7 +4632,7 @@ retry:
 
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
-		alloc_flags = reserve_flags;
+		alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
 
 	/*
 	 * Reset the nodemask and zonelist iterators if memory policies can be
@@ -4697,7 +4709,7 @@ retry:
 
 	/* Avoid allocations with no watermarks from looping endlessly */
 	if (tsk_is_oom_victim(current) &&
-	    (alloc_flags == ALLOC_OOM ||
+	    (alloc_flags & ALLOC_OOM ||
 	     (gfp_mask & __GFP_NOMEMALLOC)))
 		goto nopage;
 
@@ -4785,8 +4797,7 @@ static inline bool prepare_alloc_pages(g
 	if (should_fail_alloc_page(gfp_mask, order))
 		return false;
 
-	if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
-		*alloc_flags |= ALLOC_CMA;
+	*alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
 
 	return true;
 }
_

Patches currently in -mm which might be from iamjoonsoo.kim@lge.com are

mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch



* + khugepaged-fix-null-pointer-dereference-due-to-race.patch added to -mm tree
       [not found] <20200703151445.b6a0cfee402c7c5c4651f1b1@linux-foundation.org>
                   ` (14 preceding siblings ...)
  2020-07-24  1:09 ` + mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch " Andrew Morton
@ 2020-07-24  2:53 ` Andrew Morton
  15 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-07-24  2:53 UTC (permalink / raw)
  To: david, kirill.shutemov, mm-commits, stable, yang.shi


The patch titled
     Subject: khugepaged: fix null-pointer dereference due to race
has been added to the -mm tree.  Its filename is
     khugepaged-fix-null-pointer-dereference-due-to-race.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/khugepaged-fix-null-pointer-dereference-due-to-race.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/khugepaged-fix-null-pointer-dereference-due-to-race.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: khugepaged: fix null-pointer dereference due to race

khugepaged has to drop the mmap lock several times while collapsing a
page.  The situation can change while the lock is dropped, so we need to
re-validate that the VMA is still in place and the PMD is still eligible
for collapse.

But we miss one corner case: while collapsing anonymous pages, the VMA
could be replaced with a file VMA.  If the file VMA doesn't have any
private pages, we get a NULL pointer dereference:

	general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
	KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
	anon_vma_lock_write include/linux/rmap.h:120 [inline]
	collapse_huge_page mm/khugepaged.c:1110 [inline]
	khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
	khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
	khugepaged_do_scan mm/khugepaged.c:2193 [inline]
	khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238

The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate().  The helper is only used for collapsing
anonymous pages.
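
As a rough userspace illustration of the check described above (not the
kernel code itself), the condition can be modelled with stand-in types.
The struct definitions, the enum values and the function name
revalidate_anon_vma() below are assumptions made for this sketch; only
the tested condition, !vma->anon_vma || vma->vm_ops, is taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the two fields
 * that the new check reads are modelled here. */
struct anon_vma { int unused; };
struct vm_operations_struct { int unused; };

struct vm_area_struct {
	struct anon_vma *anon_vma;                 /* NULL if no anonymous pages were ever mapped */
	const struct vm_operations_struct *vm_ops; /* NULL for a purely anonymous VMA */
};

/* Simplified result codes; the numeric values are illustrative only. */
enum scan_result { SCAN_OK = 0, SCAN_VMA_CHECK = 1 };

/* Mirrors the condition added by the patch: reject any VMA that has no
 * anon_vma or that has vm_ops set, before anon_vma is dereferenced. */
int revalidate_anon_vma(const struct vm_area_struct *vma)
{
	if (!vma->anon_vma || vma->vm_ops)
		return SCAN_VMA_CHECK;
	return SCAN_OK;
}

/* Example: a file VMA without private pages is rejected up front. */
void example_usage(void)
{
	struct vm_operations_struct ops = { 0 };
	struct vm_area_struct file_vma = { NULL, &ops };

	assert(revalidate_anon_vma(&file_vma) == SCAN_VMA_CHECK);
}
```

In this model, a VMA that was swapped for a file mapping while the lock
was dropped fails the check, so the caller bails out instead of taking
a lock through a NULL anon_vma.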

Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: syzbot+ed318e8b790ca72c5ad0@syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/khugepaged.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/khugepaged.c~khugepaged-fix-null-pointer-dereference-due-to-race
+++ a/mm/khugepaged.c
@@ -958,6 +958,9 @@ static int hugepage_vma_revalidate(struc
 		return SCAN_ADDRESS_RANGE;
 	if (!hugepage_vma_check(vma, vma->vm_flags))
 		return SCAN_VMA_CHECK;
+	/* Anon VMA expected */
+	if (!vma->anon_vma || vma->vm_ops)
+		return SCAN_VMA_CHECK;
 	return 0;
 }
 
_

Patches currently in -mm which might be from kirill.shutemov@linux.intel.com are

mm-close-race-between-munmap-and-expand_upwards-downwards.patch
khugepaged-fix-null-pointer-dereference-due-to-race.patch



