From: Jeff Moyer
To: Jia He
Cc: Catalin Marinas, Kirill A.
 Shutemov, linux-mm@kvack.org
Subject: bug: data corruption introduced by commit 83d116c53058 ("mm: fix double page fault on arm64 if PTE_AF is cleared")
Date: Mon, 10 Feb 2020 17:51:50 -0500
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain

Hi,

While running xfstests test generic/437, I noticed that the following
WARN_ON_ONCE inside cow_user_page() was triggered:

	/*
	 * This really shouldn't fail, because the page is there
	 * in the page tables. But it might just be unreadable,
	 * in which case we just give up and fill the result with
	 * zeroes.
	 */
	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
		/*
		 * Give a warn in case there can be some obscure
		 * use-case
		 */
		WARN_ON_ONCE(1);
		clear_page(kaddr);
	}

Just clearing the page in this case results in data corruption. I
think the assumption that the copy fails because the memory is
inaccessible may be wrong. In this instance, could the cause be the
other thread issuing the madvise call?

I reverted the commit in question, and the data corruption is gone.

Below is the (ppc64) stack trace (this will reproduce on x86 as well).
I've attached the reproducer, which is a modified version of the xfs
test. You'll need to set up a file system on pmem and mount it with
-o dax. Then run "./t_mmap_cow_race /mnt/pmem/foo".

Any help tracking this down is appreciated.

Thanks!
Jeff

[ 3690.894950] run fstests generic/437 at 2020-02-10 13:40:37
[ 3691.173277] ------------[ cut here ]------------
[ 3691.173302] WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
[ 3691.173308] Modules linked in: ext4 mbcache jbd2 loop rfkill ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp i2c_dev rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core dax_pmem_compat device_dax nd_pmem nd_btt dax_pmem_core of_pmem libnvdimm ses enclosure ipmi_powernv ipmi_devintf sg ibmpowernv xts ipmi_msghandler opal_prd vmx_crypto leds_powernv powernv_op_panel uio_pdrv_genirq uio xfs libcrc32c sd_mod t10_pi mlx5_core ipr mpt3sas libata raid_class tg3 scsi_transport_sas mlxfw dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
[ 3691.173363] CPU: 76 PID: 51024 Comm: t_mmap_cow_race Not tainted 5.5.0+ #4
[ 3691.173369] NIP: c0000000003df6e0 LR: c0000000003df42c CTR: 0000000000007f10
[ 3691.173375] REGS: c000002ec66fb830 TRAP: 0700 Not tainted (5.5.0+)
[ 3691.173379] MSR: 9000000002029033 CR: 24022244 XER: 20040000
[ 3691.173388] CFAR: c0000000003df458 IRQMASK: 0
GPR00: 0000000000000000 c000002ec66fbac0 c000000001746a00 0000000000007f10
GPR04: 00007fff976080f0 0000000000007f10 00000000000080f0 0000000000000000
GPR08: 0000000000000000 c000000000000000 c000002de046a000 0000000000000030
GPR12: 0000000000000040 c000002ffffa9800 0000000014a70278 00007fff981f0000
GPR16: 00007fff98f24410 00007fff98f24420 00007fff97600000 0000000014a70270
GPR20: 0000000010000fc0 c000002fa974ebf0 00007fff97600000 c0000000017fd9e8
GPR24: c0000000017fd960 c000000001775e98 c000002eabcc5a00 c000002fc3b40000
GPR28: 0000000000000000 c000002fa974ebf0 c00c00000bf0ed00 c000002ec66fbc10
[ 3691.173430] NIP [c0000000003df6e0] wp_page_copy+0xc40/0xd50
[ 3691.173434] LR [c0000000003df42c] wp_page_copy+0x98c/0xd50
[ 3691.173438] Call Trace:
[ 3691.173442] [c000002ec66fbac0] [c0000000003df42c] wp_page_copy+0x98c/0xd50 (unreliable)
[ 3691.173448] [c000002ec66fbb80] [c0000000003e3448] do_wp_page+0xd8/0xad0
[ 3691.173454] [c000002ec66fbbd0] [c0000000003e6248] __handle_mm_fault+0x748/0x1b90
[ 3691.173460] [c000002ec66fbcd0] [c0000000003e77b0] handle_mm_fault+0x120/0x1f0
[ 3691.173466] [c000002ec66fbd10] [c000000000086b60] __do_page_fault+0x240/0xd70
[ 3691.173473] [c000002ec66fbde0] [c0000000000876c8] do_page_fault+0x38/0xd0
[ 3691.173480] [c000002ec66fbe20] [c00000000000a888] handle_page_fault+0x10/0x30
[ 3691.173484] Instruction dump:
[ 3691.173488] 38800003 4bfffaf8 60000000 60000000 7f83e378 4800d585 60000000 4bfffcb8
[ 3691.173496] 7f83e378 4bfa72b5 60000000 4bfffc8c <0fe00000> 3d42ffeb 394acb48 812a0008
[ 3691.173503] ---[ end trace a8dffbf7f73e8243 ]---

--=-=-=
Content-Type: text/plain
Content-Disposition: attachment; filename=t_mmap_cow_race.c

// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2017 Intel Corporation. */
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MiB(a) ((a)*1024*1024)
#define NUM_THREADS 2

void err_exit(char *op)
{
	fprintf(stderr, "%s: %s\n", op, strerror(errno));
	exit(1);
}

void *worker_fn(void *ptr)
{
	char *data = (char *)ptr;
	volatile int a;
	volatile int d;
	int i, j, err;

	for (i = 0; i < 10; i++) {
		/* Read then write back byte 0 to force a CoW fault. */
		a = data[0];
		data[0] = a;

		err = madvise(data, MiB(2), MADV_DONTNEED);
		if (err < 0)
			err_exit("madvise");

		/* Separate index, so the outer loop still runs 10 times. */
		for (j = 0; j < MiB(2); j++) {
			if ((d = data[j]) != 1) {
				fprintf(stderr, "Data corruption!\n");
				fprintf(stderr, "data[%d] = %d\n", j, d);
				exit(1);
			}
		}

		/* Mix up the thread timings to encourage the race. */
		err = usleep(rand() % 100);
		if (err < 0)
			err_exit("usleep");
	}

	return NULL;
}

int main(int argc, char *argv[])
{
	pthread_t thread[NUM_THREADS];
	int i, j, fd, err;
	char *data;

	if (argc < 2) {
		printf("Usage: %s <file>\n", basename(argv[0]));
		exit(0);
	}

	fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR);
	if (fd < 0)
		err_exit("fd");

	/* This allows us to map a huge page. */
	ftruncate(fd, 0);
	ftruncate(fd, MiB(2));

	/*
	 * First we set up a shared mapping.  Our write will (hopefully) get
	 * the filesystem to give us a 2MiB huge page DAX mapping.  We will
	 * then use this 2MiB page for our private mapping race.
	 */
	data = mmap(NULL, MiB(2), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (data == MAP_FAILED)
		err_exit("shared mmap");

	memset(data, 1, MiB(2));

	err = munmap(data, MiB(2));
	if (err < 0)
		err_exit("shared munmap");

	for (i = 0; i < 500; i++) {
		data = mmap(NULL, MiB(2), PROT_READ|PROT_WRITE, MAP_PRIVATE,
				fd, 0);
		if (data == MAP_FAILED)
			err_exit("private mmap");

		for (j = 0; j < NUM_THREADS; j++) {
			err = pthread_create(&thread[j], NULL,
					worker_fn, data);
			if (err)
				err_exit("pthread_create");
		}

		for (j = 0; j < NUM_THREADS; j++) {
			err = pthread_join(thread[j], NULL);
			if (err)
				err_exit("pthread_join");
		}

		err = munmap(data, MiB(2));
		if (err < 0)
			err_exit("private munmap");
	}

	err = close(fd);
	if (err < 0)
		err_exit("close");

	return 0;
}
--=-=-=--
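
The reproducer treats any byte other than 1 as corruption because, on
Linux, MADV_DONTNEED on a MAP_PRIVATE file mapping drops the private
(copied-on-write) pages, so the next access should fault the file's
contents back in. The following is only a minimal single-threaded
sketch of that expectation, not part of the attached test and not a
reproducer for the race. It assumes Linux and a writable /tmp backed
by an ordinary (non-DAX) file system; the function name
dontneed_restores_file_data() and the scratch path are hypothetical.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a private mapping shows the file's bytes again after
 * MADV_DONTNEED, 0 if it does not, and -1 on a setup error.
 */
int dontneed_restores_file_data(void)
{
	char path[] = "/tmp/cow_demo_XXXXXX";	/* assumed scratch location */
	long psz = sysconf(_SC_PAGESIZE);
	int ret = -1;
	char *buf, *p;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* Fill one page of the file with 1s, as the reproducer does. */
	buf = (char *)malloc(psz);
	if (!buf)
		goto out_close;
	memset(buf, 1, psz);
	if (pwrite(fd, buf, psz, 0) != psz)
		goto out_free;

	p = (char *)mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_PRIVATE,
			 fd, 0);
	if (p == MAP_FAILED)
		goto out_free;

	/* Write fault: the kernel gives this mapping a private copy. */
	p[0] = 2;

	/* Drop the private copy; the next read faults in the file data. */
	if (madvise(p, psz, MADV_DONTNEED) == 0)
		ret = (p[0] == 1);

	munmap(p, psz);
out_free:
	free(buf);
out_close:
	close(fd);
	return ret;
}
```

Called from a small main(), the helper is expected to return 1 on
Linux. A page zeroed by the clear_page() fallback quoted in the report
would instead read back as 0, which is what the reproducer's
"Data corruption!" check is there to catch.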