From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC665C433FE for ; Thu, 3 Dec 2020 19:57:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 03EB6221F4 for ; Thu, 3 Dec 2020 19:57:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 03EB6221F4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 2A2B86B0068; Thu, 3 Dec 2020 14:57:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2542F6B006C; Thu, 3 Dec 2020 14:57:52 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 143B36B006E; Thu, 3 Dec 2020 14:57:52 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0091.hostedemail.com [216.40.44.91]) by kanga.kvack.org (Postfix) with ESMTP id EE3A96B0068 for ; Thu, 3 Dec 2020 14:57:51 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id B478D180AD80F for ; Thu, 3 Dec 2020 19:57:51 +0000 (UTC) X-FDA: 77553031542.14.brake76_1712ea9273be Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin14.hostedemail.com (Postfix) with ESMTP id 99D5018229837 for ; Thu, 3 Dec 2020 19:57:51 +0000 (UTC) X-HE-Tag: brake76_1712ea9273be X-Filterd-Recvd-Size: 5948 Received: from mail-pj1-f68.google.com (mail-pj1-f68.google.com [209.85.216.68]) by imf23.hostedemail.com (Postfix) with ESMTP for ; Thu, 3 Dec 2020 19:57:51 +0000 (UTC) Received: by mail-pj1-f68.google.com with SMTP id m5so1751199pjv.5 for ; Thu, 03 Dec 2020 11:57:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=YC21jCFH+T6bNg6mCpBEH9LA7IBalHFyi6DVV8g6ygE=; b=iwveHThaR7rt4USbVY0nmfrQEqKcB5GNjW4CuEbwkbrKtXvn8+kZlnHkeTT8WXxyVV vRjljhf0KbTbds56ZoE8/GYDeqF/QoX1gRppxubEr/h/QEQew6Qa7tIUawzRb11lnxTK ogjxmvpDf8q1g9zpHo6qi+IzNJXWvN2qvGSzPoeaqXrNeRk2T638+1J4aJl94/eBK6nh FtZ5I1McTKYsYOfScqLPkzZWr3AaNxKsCKCFz5JlO6+8cMk4v5sM/kO4QkF6G10SoC+G DbjKMFadvOBS9rioprm24JiOz8EtyC8bLfng+lKQ/pGXWc6B+YQD+Jwj38AN+Oq+0kfe vNOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=YC21jCFH+T6bNg6mCpBEH9LA7IBalHFyi6DVV8g6ygE=; b=lLA2jewtLXg+mvw+7tba26nKXR8/opNrEwNPqNhew0cqnIH2eEd+R0TvB++hNqIB9c rgN28UWqxL/5yoEiskULKkdILH215+ICCi66vKtusEOOrD8flnJ+0XuaSEyt1MQ+zgK9 gKDrKiDlxSNxdbbMqu3rBAihyR9bP0p374kXQV9ehaTwr/i8gW4zb2k6DsMCSKKax+O/ DqZlWbjqWbaRpqkwb/JYfeFpw3mjEyXIim6xu0r5JeF63yTeG8BiclVeShJxOfOMHIJ3 gpPGoDiHZfRvi55t/ejEEj3iq+ZDiw0t+uYZ933UkYltQVE76tEqdUCdSJhYHDuWm0hU l5GA== X-Gm-Message-State: AOAM530Emu8px/KKv0jl6uZPezCp3VfPgaJ3SI2ED8fIslWH/GWL+4lC flxmu898PR8lwV4hGPWZAyo= X-Google-Smtp-Source: ABdhPJz4ovh5ZU/Ytt6GUaKhyiwpEaFicj/3m3dV+EckHn4762woSgYaU6s851Mpxa3iOuBsOHZTcw== X-Received: by 2002:a17:90a:1b29:: with SMTP id q38mr652392pjq.223.1607025469914; Thu, 03 Dec 2020 11:57:49 -0800 (PST) Received: from ?IPv6:2601:647:4700:9b2:e462:2bde:2e8b:f9df? ([2601:647:4700:9b2:e462:2bde:2e8b:f9df]) by smtp.gmail.com with ESMTPSA id j10sm201707pji.29.2020.12.03.11.57.47 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 03 Dec 2020 11:57:48 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: Re: [PATCH] userfaultfd: prevent non-cooperative events vs mcopy_atomic races From: Nadav Amit In-Reply-To: <1527061324-19949-1-git-send-email-rppt@linux.vnet.ibm.com> Date: Thu, 3 Dec 2020 11:57:46 -0800 Cc: Andrew Morton , linux-mm , lkml , Andrea Arcangeli , Mike Kravetz , Pavel Emelyanov , Andrei Vagin Content-Transfer-Encoding: quoted-printable Message-Id: <31DA12CC-E9CC-497D-A2EE-B83549D95CE8@gmail.com> References: <1527061324-19949-1-git-send-email-rppt@linux.vnet.ibm.com> To: Mike Rapoport X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hello Mike, Regarding your (old) patch: > On May 23, 2018, at 12:42 AM, Mike Rapoport = wrote: >=20 > If a process monitored with userfaultfd changes it's memory mappings = or > forks() at the same time as uffd monitor fills the process memory with > UFFDIO_COPY, the actual creation of page table entries and copying of = the > data in mcopy_atomic may happen either before of after the memory = mapping > modifications and there is no way for the uffd monitor to maintain > consistent view of the process memory layout. >=20 > For instance, let's consider fork() running in parallel with > userfaultfd_copy(): >=20 > process | uffd monitor > ---------------------------------+------------------------------ > fork() | userfaultfd_copy() > ... | ... > dup_mmap() | down_read(mmap_sem) > down_write(mmap_sem) | /* create PTEs, copy data */ > dup_uffd() | up_read(mmap_sem) > copy_page_range() | > up_write(mmap_sem) | > dup_uffd_complete() | > /* notify monitor */ | >=20 > If the userfaultfd_copy() takes the mmap_sem first, the new page(s) = will be > present by the time copy_page_range() is called and they will appear = in the > child's memory mappings. However, if the fork() is the first to take = the > mmap_sem, the new pages won't be mapped in the child's address space. >=20 > Since userfaultfd monitor has no way to determine what was the order, = let's > disallow userfaultfd_copy in parallel with the non-cooperative events. = In > such case we return -EAGAIN and the uffd monitor can understand that > userfaultfd_copy() clashed with a non-cooperative event and take an > appropriate action. I am struggling to understand this patch and would appreciate your assistance. Specifically, I have two questions: 1. How can memory corruption occur? If the page is already mapped and = the handler =E2=80=9Cmistakenly" calls userfaultfd_copy(), wouldn't = mcopy_atomic_pte() return -EEXIST once it sees the PTE already exists? In such case, I = would presume that the handler should be able to recover gracefully by waking = the faulting thread. 2. How is memory ordering supposed to work here? IIUC, mmap_changing is = not protected by any lock and there are no memory barriers that are = associated with the assignment. Indeed, the code calls WRITE_ONCE()/READ_ONCE(), = but AFAIK this does not guarantee ordering with non-volatile reads/writes. Thanks, Nadav=