linux-mm.kvack.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: aarcange@redhat.com, akpm@linux-foundation.org,
	axelrasmussen@google.com, linux-mm@kvack.org, liwan@redhat.com,
	liwang@redhat.com, mm-commits@vger.kernel.org,
	nadav.amit@gmail.com, peterx@redhat.com, stable@vger.kernel.org,
	torvalds@linux-foundation.org
Subject: [patch 01/19] mm/userfaultfd: selftests: fix memory corruption with thp enabled
Date: Mon, 18 Oct 2021 15:15:22 -0700
Message-ID: <20211018221522.wVUlfJFIe%akpm@linux-foundation.org>
In-Reply-To: <20211018151438.f2246e2656c041b6753a8bdd@linux-foundation.org>

From: Peter Xu <peterx@redhat.com>
Subject: mm/userfaultfd: selftests: fix memory corruption with thp enabled

In RHEL's gating selftests we've encountered memory corruption in the uffd
event test even with an upstream kernel:

        # ./userfaultfd anon 128 4
        nr_pages: 32768, nr_pages_per_cpu: 32768
        bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
        bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
        bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
        bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
        testing uffd-wp with pagemap (pgsize=4096): done
        testing uffd-wp with pagemap (pgsize=2097152): done
        testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
        ERROR: faulting process failed (errno=0, line=1117)

It can be easily reproduced when THP is globally enabled, which is the
default for RHEL.

It's also known to be a side effect of commit 0db282ba2c12 ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which is
IMHO right in itself to use mmap() to make sure the addresses will be
untagged even on arm.

The problem is that for each test we allocate buffers using two
allocate_area() calls.  We assumed these two buffers won't affect each
other; however, they can, because mmap() may place the two buffers next
to each other and, since they have the same VMA flags, merge them into
one VMA.
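
As a rough illustration of how one might check for such a merge from
userspace, the sketch below looks up the VMA that covers an address by
scanning /proc/self/maps.  The helper names vma_of() and same_vma() are
made up for this example and are not part of the selftest; whether two
separate mmap() calls actually end up in one VMA depends on where the
kernel places them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: find the VMA that contains addr by scanning
 * /proc/self/maps.  Returns 0 and fills start/end on success, -1 if
 * the address is not mapped or the file cannot be read.
 */
static int vma_of(const void *addr, uintptr_t *start, uintptr_t *end)
{
	FILE *f = fopen("/proc/self/maps", "r");
	char line[8192];
	uintptr_t a = (uintptr_t)addr, s, e;
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* Each entry starts with "start-end" in hex. */
		if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &s, &e) != 2)
			continue;
		if (a >= s && a < e) {
			*start = s;
			*end = e;
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}

/*
 * Hypothetical helper: 1 if a and b are covered by the same VMA,
 * 0 if they are in different VMAs, -1 on lookup failure.
 */
static int same_vma(const void *a, const void *b)
{
	uintptr_t s1, e1, s2, e2;

	if (vma_of(a, &s1, &e1) || vma_of(b, &s2, &e2))
		return -1;
	return s1 == s2 && e1 == e2;
}
```

Calling same_vma(area_src, area_dst) right after the two allocations
would return 1 in the merged case this patch is about.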

It isn't a big problem if THP is not enabled, but when THP is
aggressively enabled, initializing the src buffer can accidentally set up
part of the dest buffer too, when a shared THP overlaps the two regions.
Some of the dest buffer then can't be trapped by userfaultfd missing
mode, which causes the memory corruption described above.
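
One way to observe this effect from userspace, sketched here under the
assumption that mincore() is available and reports THP-backed anonymous
pages as resident, is to count resident pages in a range.  The helper
name resident_pages() is invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the pages in [addr, addr + len) that
 * mincore() reports as resident.  addr must be page-aligned.
 * Returns -1 on error.
 */
static long resident_pages(void *addr, size_t len)
{
	long psize = sysconf(_SC_PAGESIZE);
	size_t n = (len + psize - 1) / psize, i;
	unsigned char *vec;
	long count = 0;

	vec = malloc(n);
	if (!vec)
		return -1;
	if (mincore(addr, len, vec)) {
		free(vec);
		return -1;
	}
	/* Only the least significant bit of each byte is defined. */
	for (i = 0; i < n; i++)
		count += vec[i] & 1;
	free(vec);
	return count;
}
```

In the failing scenario, writing only to area_src and then calling
resident_pages() on area_dst could return a nonzero count, which means
those dest pages would no longer raise missing-mode userfaults.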

To fix it, do release_pages() on the dest buffer after initializing the
src buffer.
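
For private anonymous memory, dropping pages this way can be sketched
with madvise(MADV_DONTNEED).  This is only an illustration: the helper
name release_anon_pages() is hypothetical, and it assumes the anonymous
release_pages() op behaves like a plain MADV_DONTNEED over the area.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical stand-in for release_pages() on private anonymous
 * memory: drop every page in [area, area + len) so that the next
 * access faults again.  Returns 0 on success, -1 on error.
 */
static int release_anon_pages(void *area, size_t len)
{
	return madvise(area, len, MADV_DONTNEED);
}
```

After this call, a private anonymous range reads back as zero-filled
and, if registered with userfaultfd in missing mode, faults on first
access again.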

Since the previous two release_pages() calls come right after
uffd_test_ctx_clear(), which unmaps all the buffers anyway (which is
stronger than releasing pages, as unmap also tears down pgtables), drop
them as they don't do anything useful.

We could mark the Fixes tag upon 0db282ba2c12 as it's reported to only
happen there; however, the real "Fixes" IMHO should be 8ba6e8640844, as
before that commit we always did an explicit release_pages() before
registration of uffd, and 8ba6e8640844 changed that logic by adding an
extra unmap/map, after which we didn't release the pages at the right
place.  Meanwhile I don't have a solid clue anyway on whether
posix_memalign() could always avoid triggering this bug, hence it's safer
to attach this fix to commit 8ba6e8640844.

Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e8640844 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/userfaultfd.c |   23 ++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~mm-userfaultfd-selftests-fix-memory-corruption-with-thp-enabled
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -414,9 +414,6 @@ static void uffd_test_ctx_init_ext(uint6
 	uffd_test_ops->allocate_area((void **)&area_src);
 	uffd_test_ops->allocate_area((void **)&area_dst);
 
-	uffd_test_ops->release_pages(area_src);
-	uffd_test_ops->release_pages(area_dst);
-
 	userfaultfd_open(features);
 
 	count_verify = malloc(nr_pages * sizeof(unsigned long long));
@@ -437,6 +434,26 @@ static void uffd_test_ctx_init_ext(uint6
 		*(area_count(area_src, nr) + 1) = 1;
 	}
 
+	/*
+	 * After initialization of area_src, we must explicitly release pages
+	 * for area_dst to make sure it's fully empty.  Otherwise we could have
+	 * some area_dst pages be erroneously initialized with zero pages,
+	 * hence we could hit memory corruption later in the test.
+	 *
+	 * One example is when THP is globally enabled, above allocate_area()
+	 * calls could have the two areas merged into a single VMA (as they
+	 * will have the same VMA flags so they're mergeable).  When we
+	 * initialize the area_src above, it's possible that some part of
+	 * area_dst could have been faulted in via one huge THP that will be
+	 * shared between area_src and area_dst.  As a result, some of
+	 * area_dst might not be trapped by missing userfaults.
+	 *
+	 * This release_pages() will guarantee even if that happened, we'll
+	 * proactively split the thp and drop any accidentally initialized
+	 * pages within area_dst.
+	 */
+	uffd_test_ops->release_pages(area_dst);
+
 	pipefd = malloc(sizeof(int) * nr_cpus * 2);
 	if (!pipefd)
 		err("pipefd");
_

