* incoming
@ 2021-08-13 23:53 Andrew Morton
  2021-08-13 23:54 ` [patch 1/7] kasan, kmemleak: reset tags when scanning block Andrew Morton
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

7 patches, based on f8e6dfc64f6135d1b6c5215c14cd30b9b60a0008.

Subsystems affected by this patch series:

  mm/kasan
  mm/slub
  mm/madvise
  mm/memcg
  lib

Subsystem: mm/kasan

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
    Patch series "kasan, slub: reset tag when printing address", v3:
      kasan, kmemleak: reset tags when scanning block
      kasan, slub: reset tag when printing address

Subsystem: mm/slub

    Shakeel Butt <shakeelb@google.com>:
      slub: fix kmalloc_pagealloc_invalid_free unit test

    Vlastimil Babka <vbabka@suse.cz>:
      mm: slub: fix slub_debug disabling for list of slabs

Subsystem: mm/madvise

    David Hildenbrand <david@redhat.com>:
      mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)

Subsystem: mm/memcg

    Waiman Long <longman@redhat.com>:
      mm/memcg: fix incorrect flushing of lruvec data in obj_stock

Subsystem: lib

    Liang Wang <wangliang101@huawei.com>:
      lib: use PFN_PHYS() in devmem_is_allowed()

 lib/devmem_is_allowed.c |    2 +-
 mm/gup.c                |    7 +++++--
 mm/kmemleak.c           |    6 +++---
 mm/madvise.c            |    4 +++-
 mm/memcontrol.c         |    6 ++++--
 mm/slub.c               |   25 ++++++++++++++-----------
 6 files changed, 30 insertions(+), 20 deletions(-)



* [patch 1/7] kasan, kmemleak: reset tags when scanning block
  2021-08-13 23:53 incoming Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 2/7] kasan, slub: reset tag when printing address Andrew Morton
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, chinwen.chang, elver, glider,
	Kuan-Ying.Lee, linux-mm, mm-commits, nicholas.tang, ryabinin.a.a,
	torvalds

From: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Subject: kasan, kmemleak: reset tags when scanning block

Patch series "kasan, slub: reset tag when printing address", v3.

With hardware tag-based KASAN enabled, we reset the tag when we access
metadata to avoid false alarms.


This patch (of 2):

Kmemleak needs to scan kernel memory to check for memory leaks.  With
hardware tag-based KASAN enabled, when it scans an invalid slab area and
dereferences it, the issue below occurs.

Hardware tag-based KASAN doesn't use compiler instrumentation, so we
cannot use kasan_disable_current() to ignore the tag check.

Based on the below report, there are 11 0xf7 granules, which amounts to
176 bytes, and the object is allocated from the kmalloc-256 cache.  So
when kmemleak accesses the remaining 80 bytes (256 - 176), it causes
faults, as those are marked with KASAN_KMALLOC_REDZONE ==
KASAN_TAG_INVALID == 0xfe.

Thus, we reset tags before accessing metadata to avoid false positives.
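
Conceptually (a userspace sketch, not the kernel's implementation; the
names TAG_SHIFT, TAG_KERNEL and reset_tag_sketch are invented here),
resetting the tag rewrites the top byte of the pointer to the match-all
kernel tag, so the subsequent access cannot trip a tag-check fault:

  #include <stdint.h>

  #define TAG_SHIFT	56
  #define TAG_KERNEL	0xffULL			/* match-all tag */

  static inline void *reset_tag_sketch(const void *addr)
  {
          uint64_t a = (uint64_t)(uintptr_t)addr;

          a &= ~(0xffULL << TAG_SHIFT);	/* clear the pointer tag */
          a |= TAG_KERNEL << TAG_SHIFT;	/* force the 0xff kernel tag */
          return (void *)(uintptr_t)a;
  }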

[  151.905804] ==================================================================
[  151.907120] BUG: KASAN: out-of-bounds in scan_block+0x58/0x170
[  151.908773] Read at addr f7ff0000c0074eb0 by task kmemleak/138
[  151.909656] Pointer tag: [f7], memory tag: [fe]
[  151.910195]
[  151.910876] CPU: 7 PID: 138 Comm: kmemleak Not tainted 5.14.0-rc2-00001-g8cae8cd89f05-dirty #134
[  151.912085] Hardware name: linux,dummy-virt (DT)
[  151.912868] Call trace:
[  151.913211]  dump_backtrace+0x0/0x1b0
[  151.913796]  show_stack+0x1c/0x30
[  151.914248]  dump_stack_lvl+0x68/0x84
[  151.914778]  print_address_description+0x7c/0x2b4
[  151.915340]  kasan_report+0x138/0x38c
[  151.915804]  __do_kernel_fault+0x190/0x1c4
[  151.916386]  do_tag_check_fault+0x78/0x90
[  151.916856]  do_mem_abort+0x44/0xb4
[  151.917308]  el1_abort+0x40/0x60
[  151.917754]  el1h_64_sync_handler+0xb4/0xd0
[  151.918270]  el1h_64_sync+0x78/0x7c
[  151.918714]  scan_block+0x58/0x170
[  151.919157]  scan_gray_list+0xdc/0x1a0
[  151.919626]  kmemleak_scan+0x2ac/0x560
[  151.920129]  kmemleak_scan_thread+0xb0/0xe0
[  151.920635]  kthread+0x154/0x160
[  151.921115]  ret_from_fork+0x10/0x18
[  151.921717]
[  151.922077] Allocated by task 0:
[  151.922523]  kasan_save_stack+0x2c/0x60
[  151.923099]  __kasan_kmalloc+0xec/0x104
[  151.923502]  __kmalloc+0x224/0x3c4
[  151.924172]  __register_sysctl_paths+0x200/0x290
[  151.924709]  register_sysctl_table+0x2c/0x40
[  151.925175]  sysctl_init+0x20/0x34
[  151.925665]  proc_sys_init+0x3c/0x48
[  151.926136]  proc_root_init+0x80/0x9c
[  151.926547]  start_kernel+0x648/0x6a4
[  151.926987]  __primary_switched+0xc0/0xc8
[  151.927557]
[  151.927994] Freed by task 0:
[  151.928340]  kasan_save_stack+0x2c/0x60
[  151.928766]  kasan_set_track+0x2c/0x40
[  151.929173]  kasan_set_free_info+0x44/0x54
[  151.929568]  ____kasan_slab_free.constprop.0+0x150/0x1b0
[  151.930063]  __kasan_slab_free+0x14/0x20
[  151.930449]  slab_free_freelist_hook+0xa4/0x1fc
[  151.930924]  kfree+0x1e8/0x30c
[  151.931285]  put_fs_context+0x124/0x220
[  151.931731]  vfs_kern_mount.part.0+0x60/0xd4
[  151.932280]  kern_mount+0x24/0x4c
[  151.932686]  bdev_cache_init+0x70/0x9c
[  151.933122]  vfs_caches_init+0xdc/0xf4
[  151.933578]  start_kernel+0x638/0x6a4
[  151.934014]  __primary_switched+0xc0/0xc8
[  151.934478]
[  151.934757] The buggy address belongs to the object at ffff0000c0074e00
[  151.934757]  which belongs to the cache kmalloc-256 of size 256
[  151.935744] The buggy address is located 176 bytes inside of
[  151.935744]  256-byte region [ffff0000c0074e00, ffff0000c0074f00)
[  151.936702] The buggy address belongs to the page:
[  151.937378] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100074
[  151.938682] head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
[  151.939440] flags: 0xbfffc0000010200(slab|head|node=0|zone=2|lastcpupid=0xffff|kasantag=0x0)
[  151.940886] raw: 0bfffc0000010200 0000000000000000 dead000000000122 f5ff0000c0002300
[  151.941634] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[  151.942353] page dumped because: kasan: bad access detected
[  151.942923]
[  151.943214] Memory state around the buggy address:
[  151.943896]  ffff0000c0074c00: f0 f0 f0 f0 f0 f0 f0 f0 f0 fe fe fe fe fe fe fe
[  151.944857]  ffff0000c0074d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[  151.945892] >ffff0000c0074e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 fe fe fe fe fe
[  151.946407]                                                     ^
[  151.946939]  ffff0000c0074f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[  151.947445]  ffff0000c0075000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  151.947999] ==================================================================
[  151.948524] Disabling lock debugging due to kernel taint
[  156.434569] kmemleak: 181 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Link: https://lkml.kernel.org/r/20210804090957.12393-1-Kuan-Ying.Lee@mediatek.com
Link: https://lkml.kernel.org/r/20210804090957.12393-2-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kmemleak.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/kmemleak.c~kasan-kmemleak-reset-tags-when-scanning-block
+++ a/mm/kmemleak.c
@@ -290,7 +290,7 @@ static void hex_dump_object(struct seq_f
 	warn_or_seq_printf(seq, "  hex dump (first %zu bytes):\n", len);
 	kasan_disable_current();
 	warn_or_seq_hex_dump(seq, DUMP_PREFIX_NONE, HEX_ROW_SIZE,
-			     HEX_GROUP_SIZE, ptr, len, HEX_ASCII);
+			     HEX_GROUP_SIZE, kasan_reset_tag((void *)ptr), len, HEX_ASCII);
 	kasan_enable_current();
 }
 
@@ -1171,7 +1171,7 @@ static bool update_checksum(struct kmeml
 
 	kasan_disable_current();
 	kcsan_disable_current();
-	object->checksum = crc32(0, (void *)object->pointer, object->size);
+	object->checksum = crc32(0, kasan_reset_tag((void *)object->pointer), object->size);
 	kasan_enable_current();
 	kcsan_enable_current();
 
@@ -1246,7 +1246,7 @@ static void scan_block(void *_start, voi
 			break;
 
 		kasan_disable_current();
-		pointer = *ptr;
+		pointer = *(unsigned long *)kasan_reset_tag((void *)ptr);
 		kasan_enable_current();
 
 		untagged_ptr = (unsigned long)kasan_reset_tag((void *)pointer);
_


* [patch 2/7] kasan, slub: reset tag when printing address
  2021-08-13 23:53 incoming Andrew Morton
  2021-08-13 23:54 ` [patch 1/7] kasan, kmemleak: reset tags when scanning block Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 3/7] slub: fix kmalloc_pagealloc_invalid_free unit test Andrew Morton
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, chinwen.chang, elver, glider,
	Kuan-Ying.Lee, linux-mm, mm-commits, nicholas.tang, ryabinin.a.a,
	torvalds

From: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Subject: kasan, slub: reset tag when printing address

The address still includes the tag when it is printed.  With hardware
tag-based KASAN enabled, this leads to a false-positive KASAN report when
we access the metadata.

Reset the tag of the dumped address before we access the metadata (the
previous code reset the tag on the log-prefix string instead of on the
address being dumped).

Link: https://lkml.kernel.org/r/20210804090957.12393-3-Kuan-Ying.Lee@mediatek.com
Fixes: aa1ef4d7b3f6 ("kasan, mm: reset tags when accessing metadata")
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/slub.c~kasan-slub-reset-tag-when-printing-address
+++ a/mm/slub.c
@@ -576,8 +576,8 @@ static void print_section(char *level, c
 			  unsigned int length)
 {
 	metadata_access_enable();
-	print_hex_dump(level, kasan_reset_tag(text), DUMP_PREFIX_ADDRESS,
-			16, 1, addr, length, 1);
+	print_hex_dump(level, text, DUMP_PREFIX_ADDRESS,
+			16, 1, kasan_reset_tag((void *)addr), length, 1);
 	metadata_access_disable();
 }
 
_


* [patch 3/7] slub: fix kmalloc_pagealloc_invalid_free unit test
  2021-08-13 23:53 incoming Andrew Morton
  2021-08-13 23:54 ` [patch 1/7] kasan, kmemleak: reset tags when scanning block Andrew Morton
  2021-08-13 23:54 ` [patch 2/7] kasan, slub: reset tag when printing address Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 4/7] mm: slub: fix slub_debug disabling for list of slabs Andrew Morton
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, cl, guro, iamjoonsoo.kim, linux-mm, mhocko, mm-commits,
	nathan, penberg, rientjes, shakeelb, songmuchun, torvalds,
	vbabka

From: Shakeel Butt <shakeelb@google.com>
Subject: slub: fix kmalloc_pagealloc_invalid_free unit test

The unit test kmalloc_pagealloc_invalid_free makes sure that, for a
higher-order SLUB allocation which goes to the page allocator, the free is
called with the correct address, i.e. the virtual address of the head
page.

Commit f227f0faf63b ("slub: fix unreclaimable slab stat for bulk free")
unified the free code paths for page-allocator-based SLUB allocations,
but instead of using the address passed by the caller, it extracted the
address from the page, making the unit test kmalloc_pagealloc_invalid_free
moot.  Fix this by using the address passed by the caller.

Should we fix this?  I think yes, because developers expect KASAN to catch
these types of programming bugs.
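
For context, the KUnit test roughly does the following (a sketch of
lib/test_kasan.c, not a verbatim copy): allocate above
KMALLOC_MAX_CACHE_SIZE so SLUB falls back to the page allocator, then free
a pointer one byte past the returned address and expect KASAN to flag the
invalid free.

  size_t size = KMALLOC_MAX_CACHE_SIZE + 10;
  char *ptr = kmalloc(size, GFP_KERNEL);

  KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
  /* only reports an invalid free if kfree() really sees 'ptr + 1' */
  KUNIT_EXPECT_KASAN_FAIL(test, kfree(ptr + 1));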

Link: https://lkml.kernel.org/r/20210802180819.1110165-1-shakeelb@google.com
Fixes: f227f0faf63b ("slub: fix unreclaimable slab stat for bulk free")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/slub.c~slub-fix-kmalloc_pagealloc_invalid_free-unit-test
+++ a/mm/slub.c
@@ -3236,12 +3236,12 @@ struct detached_freelist {
 	struct kmem_cache *s;
 };
 
-static inline void free_nonslab_page(struct page *page)
+static inline void free_nonslab_page(struct page *page, void *object)
 {
 	unsigned int order = compound_order(page);
 
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
-	kfree_hook(page_address(page));
+	kfree_hook(object);
 	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, -(PAGE_SIZE << order));
 	__free_pages(page, order);
 }
@@ -3282,7 +3282,7 @@ int build_detached_freelist(struct kmem_
 	if (!s) {
 		/* Handle kalloc'ed objects */
 		if (unlikely(!PageSlab(page))) {
-			free_nonslab_page(page);
+			free_nonslab_page(page, object);
 			p[size] = NULL; /* mark object processed */
 			return size;
 		}
@@ -4258,7 +4258,7 @@ void kfree(const void *x)
 
 	page = virt_to_head_page(x);
 	if (unlikely(!PageSlab(page))) {
-		free_nonslab_page(page);
+		free_nonslab_page(page, object);
 		return;
 	}
 	slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
_


* [patch 4/7] mm: slub: fix slub_debug disabling for list of slabs
  2021-08-13 23:53 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-08-13 23:54 ` [patch 3/7] slub: fix kmalloc_pagealloc_invalid_free unit test Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 5/7] mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE) Andrew Morton
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, cl, iamjoonsoo.kim, linux-mm, mm-commits, penberg,
	rientjes, torvalds, vbabka, vinmenon, vjitta

From: Vlastimil Babka <vbabka@suse.cz>
Subject: mm: slub: fix slub_debug disabling for list of slabs

Vijayanand Jitta reports:

  Consider the scenario where CONFIG_SLUB_DEBUG_ON is set
  and we want to disable slub_debug for a few slabs.
  Using the boot parameter with the slub_debug=-,slab_name
  syntax doesn't work as expected, i.e. it does not only
  disable debugging for the specified list of slabs;
  instead it disables debugging for all slabs, which is wrong.

This patch fixes it by delaying the moment when the global slub_debug
flags variable is updated.  In case "slub_debug=-,slab_name" has been
passed, the global flags remain as initialized (depending on whether
CONFIG_SLUB_DEBUG_ON is enabled or disabled) and are not simply reset to 0.
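
For example (the cache name here is only an illustration), on a kernel
built with CONFIG_SLUB_DEBUG_ON=y, booting with

  slub_debug=-,dentry

should now disable debugging only for the dentry cache, with every other
cache keeping the default full debugging, instead of silently turning
debugging off for all slabs.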

Link: https://lkml.kernel.org/r/8a3d992a-473a-467b-28a0-4ad2ff60ab82@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: Vijayanand Jitta <vjitta@codeaurora.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

--- a/mm/slub.c~mm-slub-fix-slub_debug-disablement-for-list-of-slabs
+++ a/mm/slub.c
@@ -1400,12 +1400,13 @@ check_slabs:
 static int __init setup_slub_debug(char *str)
 {
 	slab_flags_t flags;
+	slab_flags_t global_flags;
 	char *saved_str;
 	char *slab_list;
 	bool global_slub_debug_changed = false;
 	bool slab_list_specified = false;
 
-	slub_debug = DEBUG_DEFAULT_FLAGS;
+	global_flags = DEBUG_DEFAULT_FLAGS;
 	if (*str++ != '=' || !*str)
 		/*
 		 * No options specified. Switch on full debugging.
@@ -1417,7 +1418,7 @@ static int __init setup_slub_debug(char
 		str = parse_slub_debug_flags(str, &flags, &slab_list, true);
 
 		if (!slab_list) {
-			slub_debug = flags;
+			global_flags = flags;
 			global_slub_debug_changed = true;
 		} else {
 			slab_list_specified = true;
@@ -1426,16 +1427,18 @@ static int __init setup_slub_debug(char
 
 	/*
 	 * For backwards compatibility, a single list of flags with list of
-	 * slabs means debugging is only enabled for those slabs, so the global
-	 * slub_debug should be 0. We can extended that to multiple lists as
+	 * slabs means debugging is only changed for those slabs, so the global
+	 * slub_debug should be unchanged (0 or DEBUG_DEFAULT_FLAGS, depending
+	 * on CONFIG_SLUB_DEBUG_ON). We can extended that to multiple lists as
 	 * long as there is no option specifying flags without a slab list.
 	 */
 	if (slab_list_specified) {
 		if (!global_slub_debug_changed)
-			slub_debug = 0;
+			global_flags = slub_debug;
 		slub_debug_string = saved_str;
 	}
 out:
+	slub_debug = global_flags;
 	if (slub_debug != 0 || slub_debug_string)
 		static_branch_enable(&slub_debug_enabled);
 	else
_


* [patch 5/7] mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)
  2021-08-13 23:53 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-08-13 23:54 ` [patch 4/7] mm: slub: fix slub_debug disabling for list of slabs Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 6/7] mm/memcg: fix incorrect flushing of lruvec data in obj_stock Andrew Morton
  2021-08-13 23:54 ` [patch 7/7] lib: use PFN_PHYS() in devmem_is_allowed() Andrew Morton
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: aarcange, akpm, arnd, chris, dave.hansen, david, deller,
	eike-kernel, hughd, ink, James.Bottomley, jannh, jcmvbkbc, jgg,
	kirill.shutemov, linux-mm, linuxram, mattst88, mhocko,
	mike.kravetz, minchan, mm-commits, mst, osalvador, peterx, riel,
	rth, shuah, torvalds, tsbogend, vbabka, willy

From: David Hildenbrand <david@redhat.com>
Subject: mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)

While doing some extended tests and polishing the man page update for
MADV_POPULATE_(READ|WRITE), I realized that we also end up converting
SIGBUS (via -EFAULT) to -EINVAL, making it look like yet another madvise()
user error.

We want to report as -EINVAL only problematic mappings and permission
problems that the user could have known about.

Let's not convert -EFAULT arising due to SIGBUS (or SIGSEGV) to -EINVAL,
but instead indicate -EFAULT to user space.  While we could also convert
it to -ENOMEM, using -EFAULT looks more helpful when user space might want
to troubleshoot what's going wrong: MADV_POPULATE_(READ|WRITE) is not yet
part of a final Linux release, so we can still adjust the behavior.
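
A minimal userspace sketch of the resulting behaviour (assuming the
MADV_POPULATE_READ value of 22 from this series' uapi header; the helper
name is invented):

  #include <errno.h>
  #include <stdio.h>
  #include <sys/mman.h>

  #ifndef MADV_POPULATE_READ
  #define MADV_POPULATE_READ	22		/* uapi value assumed here */
  #endif

  /* Prefault a mapping for reading and show how the two errors now differ. */
  static void try_populate(void *addr, size_t len)
  {
          if (madvise(addr, len, MADV_POPULATE_READ) == 0)
                  return;
          if (errno == EFAULT)		/* SIGBUS/SIGSEGV while prefaulting */
                  fprintf(stderr, "populate faulted inside the range\n");
          else if (errno == EINVAL)	/* incompatible mapping/permissions */
                  fprintf(stderr, "mapping or permissions incompatible\n");
          else
                  perror("madvise(MADV_POPULATE_READ)");
  }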

Link: https://lkml.kernel.org/r/20210726154932.102880-1-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/gup.c     |    7 +++++--
 mm/madvise.c |    4 +++-
 2 files changed, 8 insertions(+), 3 deletions(-)

--- a/mm/gup.c~mm-madvise-report-sigbus-as-efault-for-madv_populate_readwrite
+++ a/mm/gup.c
@@ -1558,9 +1558,12 @@ long faultin_vma_page_range(struct vm_ar
 		gup_flags |= FOLL_WRITE;
 
 	/*
-	 * See check_vma_flags(): Will return -EFAULT on incompatible mappings
-	 * or with insufficient permissions.
+	 * We want to report -EINVAL instead of -EFAULT for any permission
+	 * problems or incompatible mappings.
 	 */
+	if (check_vma_flags(vma, gup_flags))
+		return -EINVAL;
+
 	return __get_user_pages(mm, start, nr_pages, gup_flags,
 				NULL, NULL, locked);
 }
--- a/mm/madvise.c~mm-madvise-report-sigbus-as-efault-for-madv_populate_readwrite
+++ a/mm/madvise.c
@@ -862,10 +862,12 @@ static long madvise_populate(struct vm_a
 			switch (pages) {
 			case -EINTR:
 				return -EINTR;
-			case -EFAULT: /* Incompatible mappings / permissions. */
+			case -EINVAL: /* Incompatible mappings / permissions. */
 				return -EINVAL;
 			case -EHWPOISON:
 				return -EHWPOISON;
+			case -EFAULT: /* VM_FAULT_SIGBUS or VM_FAULT_SIGSEGV */
+				return -EFAULT;
 			default:
 				pr_warn_once("%s: unhandled return value: %ld\n",
 					     __func__, pages);
_


* [patch 6/7] mm/memcg: fix incorrect flushing of lruvec data in obj_stock
  2021-08-13 23:53 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-08-13 23:54 ` [patch 5/7] mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE) Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  2021-08-13 23:54 ` [patch 7/7] lib: use PFN_PHYS() in devmem_is_allowed() Andrew Morton
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, alex.shi, chris, cl, guro, hannes, iamjoonsoo.kim,
	laoar.shao, linux-mm, longman, mhocko, mm-commits, msys.mizuma,
	penberg, richard.weiyang, rientjes, shakeelb, songmuchun, tj,
	torvalds, vbabka, vdavydov.dev, willy, zhengjun.xing

From: Waiman Long <longman@redhat.com>
Subject: mm/memcg: fix incorrect flushing of lruvec data in obj_stock

When mod_objcg_state() is called with a pgdat that is different from the
one cached in the obj_stock, the old lruvec data cached in obj_stock is
flushed out.  Unfortunately, it was being flushed to the new pgdat, so the
data ended up on the wrong node.  This screws up the slab data reported in
/sys/devices/system/node/node*/meminfo.

Fix that by flushing the data to the cached pgdat instead.
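
A simplified, hypothetical userspace model of the per-CPU caching pattern
(every name below is invented for illustration; the real logic lives in
mod_objcg_state()):

  struct node_stat { long nr_slab_reclaimable_b; };

  struct stock_model {
          struct node_stat *cached_node;
          long nr_slab_reclaimable_b;
  };

  static void mod_state_model(struct stock_model *stock,
                              struct node_stat *node, long delta)
  {
          if (stock->cached_node && stock->cached_node != node) {
                  /* flush to the node the delta was accumulated for */
                  stock->cached_node->nr_slab_reclaimable_b +=
                          stock->nr_slab_reclaimable_b;
                  stock->nr_slab_reclaimable_b = 0;
          }
          stock->cached_node = node;
          stock->nr_slab_reclaimable_b += delta;
  }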

Link: https://lkml.kernel.org/r/20210802143834.30578-1-longman@redhat.com
Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-fix-incorrect-flushing-of-lruvec-data-in-obj_stock
+++ a/mm/memcontrol.c
@@ -3106,13 +3106,15 @@ void mod_objcg_state(struct obj_cgroup *
 		stock->cached_pgdat = pgdat;
 	} else if (stock->cached_pgdat != pgdat) {
 		/* Flush the existing cached vmstat data */
+		struct pglist_data *oldpg = stock->cached_pgdat;
+
 		if (stock->nr_slab_reclaimable_b) {
-			mod_objcg_mlstate(objcg, pgdat, NR_SLAB_RECLAIMABLE_B,
+			mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
 					  stock->nr_slab_reclaimable_b);
 			stock->nr_slab_reclaimable_b = 0;
 		}
 		if (stock->nr_slab_unreclaimable_b) {
-			mod_objcg_mlstate(objcg, pgdat, NR_SLAB_UNRECLAIMABLE_B,
+			mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
 					  stock->nr_slab_unreclaimable_b);
 			stock->nr_slab_unreclaimable_b = 0;
 		}
_


* [patch 7/7] lib: use PFN_PHYS() in devmem_is_allowed()
  2021-08-13 23:53 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-08-13 23:54 ` [patch 6/7] mm/memcg: fix incorrect flushing of lruvec data in obj_stock Andrew Morton
@ 2021-08-13 23:54 ` Andrew Morton
  6 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2021-08-13 23:54 UTC (permalink / raw)
  To: akpm, gregkh, linux-mm, linux, mcgrof, mm-commits, nixiaoming,
	palmerdabbelt, stable, torvalds, wangkefeng.wang, wangliang101

From: Liang Wang <wangliang101@huawei.com>
Subject: lib: use PFN_PHYS() in devmem_is_allowed()

On 32-bit systems with more than 32 bits of physical address space, the
physical address may exceed 32 bits.  Use PFN_PHYS() in
devmem_is_allowed(), or the physical address may overflow and be
truncated.

We found this bug when mapping a high address through the devmem tool:
when CONFIG_STRICT_DEVMEM is enabled on ARM with ARM_LPAE and devmem is
used to map a high address that is not in the iomem address range, an
unexpected error indicating no permission is returned.

This bug was initially introduced in v2.6.37, and the function was moved
to lib in v5.11.
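
A userspace sketch of the truncation (assuming 4 KiB pages, a 32-bit
unsigned long and a 64-bit phys_addr_t, as with ARM LPAE; PFN_PHYS() casts
to phys_addr_t before shifting):

  #include <stdint.h>
  #include <stdio.h>

  #define PAGE_SHIFT	12			/* assuming 4 KiB pages */

  int main(void)
  {
          uint32_t pfn = 0x140000;		/* physical address 5 GiB */
          uint32_t old_way = pfn << PAGE_SHIFT;	/* wraps, yields 1 GiB */
          uint64_t fixed = (uint64_t)pfn << PAGE_SHIFT;	/* PFN_PHYS(): 5 GiB */

          printf("old=%#x PFN_PHYS=%#llx\n", old_way,
                 (unsigned long long)fixed);
          return 0;
  }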

Link: https://lkml.kernel.org/r/20210731025057.78825-1-wangliang101@huawei.com
Fixes: 087aaffcdf9c ("ARM: implement CONFIG_STRICT_DEVMEM by disabling access to RAM via /dev/mem")
Fixes: 527701eda5f1 ("lib: Add a generic version of devmem_is_allowed()")
Signed-off-by: Liang Wang <wangliang101@huawei.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Liang Wang <wangliang101@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: <stable@vger.kernel.org>	[2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/devmem_is_allowed.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/devmem_is_allowed.c~lib-use-pfn_phys-in-devmem_is_allowed
+++ a/lib/devmem_is_allowed.c
@@ -19,7 +19,7 @@
  */
 int devmem_is_allowed(unsigned long pfn)
 {
-	if (iomem_is_exclusive(pfn << PAGE_SHIFT))
+	if (iomem_is_exclusive(PFN_PHYS(pfn)))
 		return 0;
 	if (!page_is_ram(pfn))
 		return 1;
_


