All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Moyer <jmoyer@redhat.com>
To: Jia He <justin.he@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Kirill A. Shutemov <kirill.shutemov@linux.intel.com>,
	linux-mm@kvack.org
Subject: bug: data corruption introduced by commit 83d116c53058 ("mm: fix double page fault on arm64 if PTE_AF is cleared")
Date: Mon, 10 Feb 2020 17:51:50 -0500	[thread overview]
Message-ID: <x49tv3yys1l.fsf@segfault.boston.devel.redhat.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 4149 bytes --]

Hi,

While running xfstests test generic/437, I noticed that the following
WARN_ON_ONCE inside cow_user_page() was triggered:

	/*
	 * This really shouldn't fail, because the page is there
	 * in the page tables. But it might just be unreadable,
	 * in which case we just give up and fill the result with
	 * zeroes.
	 */
	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
		/*
		 * Give a warn in case there can be some obscure
		 * use-case
		 */
		WARN_ON_ONCE(1);
		clear_page(kaddr);
	}

Just clearing the page, in this case, will result in data corruption.  I
think the assumption that the copy fails because the memory is
inaccessible may be wrong.  In this instance, it may be the other thread
issuing the madvise call?

I reverted the commit in question, and the data corruption is gone.

Below is the (ppc64) stack trace (this will reproduce on x86 as well).
I've attached the reproducer, which is a modified version of the
xfs test.  You'll need to setup a file system on pmem, and mount it with
-o dax.  Then issue "./t_mmap_cow_race /mnt/pmem/foo".

Any help tracking this down is appreciated.

Thanks!
Jeff

[ 3690.894950] run fstests generic/437 at 2020-02-10 13:40:37
[ 3691.173277] ------------[ cut here ]------------
[ 3691.173302] WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
[ 3691.173308] Modules linked in: ext4 mbcache jbd2 loop rfkill ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp i2c_dev rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core dax_pmem_compat device_dax nd_pmem nd_btt dax_pmem_core of_pmem libnvdimm ses enclosure ipmi_powernv ipmi_devintf sg ibmpowernv xts ipmi_msghandler opal_prd vmx_crypto leds_powernv powernv_op_panel uio_pdrv_genirq uio xfs libcrc32c sd_mod t10_pi mlx5_core ipr mpt3sas libata raid_class tg3 scsi_transport_sas mlxfw dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
[ 3691.173363] CPU: 76 PID: 51024 Comm: t_mmap_cow_race Not tainted 5.5.0+ #4
[ 3691.173369] NIP:  c0000000003df6e0 LR: c0000000003df42c CTR: 0000000000007f10
[ 3691.173375] REGS: c000002ec66fb830 TRAP: 0700   Not tainted  (5.5.0+)
[ 3691.173379] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24022244  XER: 20040000
[ 3691.173388] CFAR: c0000000003df458 IRQMASK: 0 
               GPR00: 0000000000000000 c000002ec66fbac0 c000000001746a00 0000000000007f10 
               GPR04: 00007fff976080f0 0000000000007f10 00000000000080f0 0000000000000000 
               GPR08: 0000000000000000 c000000000000000 c000002de046a000 0000000000000030 
               GPR12: 0000000000000040 c000002ffffa9800 0000000014a70278 00007fff981f0000 
               GPR16: 00007fff98f24410 00007fff98f24420 00007fff97600000 0000000014a70270 
               GPR20: 0000000010000fc0 c000002fa974ebf0 00007fff97600000 c0000000017fd9e8 
               GPR24: c0000000017fd960 c000000001775e98 c000002eabcc5a00 c000002fc3b40000 
               GPR28: 0000000000000000 c000002fa974ebf0 c00c00000bf0ed00 c000002ec66fbc10 
[ 3691.173430] NIP [c0000000003df6e0] wp_page_copy+0xc40/0xd50
[ 3691.173434] LR [c0000000003df42c] wp_page_copy+0x98c/0xd50
[ 3691.173438] Call Trace:
[ 3691.173442] [c000002ec66fbac0] [c0000000003df42c] wp_page_copy+0x98c/0xd50 (unreliable)
[ 3691.173448] [c000002ec66fbb80] [c0000000003e3448] do_wp_page+0xd8/0xad0
[ 3691.173454] [c000002ec66fbbd0] [c0000000003e6248] __handle_mm_fault+0x748/0x1b90
[ 3691.173460] [c000002ec66fbcd0] [c0000000003e77b0] handle_mm_fault+0x120/0x1f0
[ 3691.173466] [c000002ec66fbd10] [c000000000086b60] __do_page_fault+0x240/0xd70
[ 3691.173473] [c000002ec66fbde0] [c0000000000876c8] do_page_fault+0x38/0xd0
[ 3691.173480] [c000002ec66fbe20] [c00000000000a888] handle_page_fault+0x10/0x30
[ 3691.173484] Instruction dump:
[ 3691.173488] 38800003 4bfffaf8 60000000 60000000 7f83e378 4800d585 60000000 4bfffcb8 
[ 3691.173496] 7f83e378 4bfa72b5 60000000 4bfffc8c <0fe00000> 3d42ffeb 394acb48 812a0008 
[ 3691.173503] ---[ end trace a8dffbf7f73e8243 ]---


[-- Attachment #2: t_mmap_cow_race.c --]
[-- Type: text/plain, Size: 2494 bytes --]

// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2017 Intel Corporation. */
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MiB(a) ((a)*1024*1024)
#define NUM_THREADS 2

void err_exit(char *op)
{
	fprintf(stderr, "%s: %s\n", op, strerror(errno));
	exit(1);
}

void worker_fn(void *ptr)
{
	char *data = (char *)ptr;
	volatile int a;
	volatile int d;
	int i, err;

	for (i = 0; i < 10; i++) {
		a = data[0];
		data[0] = a;

		err = madvise(data, MiB(2), MADV_DONTNEED);
		if (err < 0)
			err_exit("madvise");
		for (i = 0; i < MiB(2); i++) {
			if ((d = data[i]) != 1) {
				fprintf(stderr, "Data corruption!\n");
				fprintf(stderr, "data[%d] = %d\n", i, d);
				exit(1);
			}
		}

		/* Mix up the thread timings to encourage the race. */
		err = usleep(rand() % 100);
		if (err < 0)
			err_exit("usleep");
	}
}

int main(int argc, char *argv[])
{
	pthread_t thread[NUM_THREADS];
	int i, j, fd, err;
	char *data;

	if (argc < 2) {
		printf("Usage: %s <file>\n", basename(argv[0]));
		exit(0);
	}

	fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR);
	if (fd < 0)
		err_exit("fd");

	/* This allows us to map a huge page. */
	ftruncate(fd, 0);
	ftruncate(fd, MiB(2));

	/*
	 * First we set up a shared mapping.  Our write will (hopefully) get
	 * the filesystem to give us a 2MiB huge page DAX mapping.  We will
	 * then use this 2MiB page for our private mapping race.
	 */
	data = mmap(NULL, MiB(2), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (data == MAP_FAILED)
		err_exit("shared mmap");

	memset(data, 1, MiB(2));

	err = munmap(data, MiB(2));
	if (err < 0)
		err_exit("shared munmap");

	for (i = 0; i < 500; i++) {
		data = mmap(NULL, MiB(2), PROT_READ|PROT_WRITE, MAP_PRIVATE,
				fd, 0);
		if (data == MAP_FAILED)
			err_exit("private mmap");

		for (j = 0; j < NUM_THREADS; j++) {
			err = pthread_create(&thread[j], NULL,
					(void*)&worker_fn, data);
			if (err)
				err_exit("pthread_create");
		}

		for (j = 0; j < NUM_THREADS; j++) {
			err = pthread_join(thread[j], NULL);
			if (err)
				err_exit("pthread_join");
		}

		err = munmap(data, MiB(2));
		if (err < 0)
			err_exit("private munmap");
	}

	err = close(fd);
	if (err < 0)
		err_exit("close");

	return 0;
}

             reply	other threads:[~2020-02-10 22:52 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-10 22:51 Jeff Moyer [this message]
2020-02-11  4:17 ` bug: data corruption introduced by commit 83d116c53058 ("mm: fix double page fault on arm64 if PTE_AF is cleared") Justin He
2020-02-11  4:29   ` Justin He
2020-02-11 16:44     ` Jeff Moyer
2020-02-11 17:33       ` Kirill A. Shutemov
2020-02-11 17:55         ` Jeff Moyer
2020-02-11 21:44           ` Kirill A. Shutemov
2020-02-11 22:01             ` Jeff Moyer
2020-02-11 22:15               ` Kirill A. Shutemov
2020-02-11 14:51 ` Kirill A. Shutemov
2020-02-11 16:27   ` Jeff Moyer
2020-02-11 22:40     ` Kirill A. Shutemov
2020-02-12 14:22       ` Jeff Moyer
2020-02-13 12:14         ` Kirill A. Shutemov
2020-02-14 21:07           ` Jeff Moyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=x49tv3yys1l.fsf@segfault.boston.devel.redhat.com \
    --to=jmoyer@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=justin.he@arm.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.