Date: Tue, 1 Dec 2020 17:30:33 -0500
From: Peter Xu
To: Matthew Wilcox
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
	Hugh Dickins, Andrea Arcangeli, Mike Rapoport, David Hildenbrand
Subject: Re: [PATCH v2] mm: Don't fault around userfaultfd-registered regions on reads
Message-ID: <20201201223033.GG3277@xz-x1>
References: <20201130230603.46187-1-peterx@redhat.com> <20201201125927.GB11935@casper.infradead.org>
In-Reply-To: <20201201125927.GB11935@casper.infradead.org>

On Tue, Dec 01, 2020 at 12:59:27PM +0000, Matthew Wilcox wrote:
> On Mon, Nov 30, 2020 at 06:06:03PM -0500, Peter Xu wrote:
> > Faulting around for reads is in most cases helpful for the performance so that
> > continuous memory accesses may avoid another trip of page fault.  However it
> > may not always work as expected.
> >
> > For example, userfaultfd registered regions may not be the best candidate for
> > pre-faults around the reads.
> >
> > For missing mode uffds, fault around does not help because if the page cache
> > existed, then the page should be there already.  If the page cache is not
> > there, nothing else we can do, either.  If the fault-around code is destined to
> > be helpless for userfault-missing vmas, then ideally we can skip it.
>
> This sounds like you're thinking of a file which has exactly one user.
> If there are multiple processes mapping the same file, then no, there's
> no reason to expect a page to be already present in the page table,
> just because it's present in the page cache.
>
> > For wr-protected mode uffds, erroneously faulting in the surrounding pages
> > could lead to threads accessing the pages without uffd server's awareness.
> > For example, when punching holes on uffd-wp registered shmem regions, we'll
> > first try to unmap all the pages before evicting the page cache but without
> > locking the page (please refer to shmem_fallocate(), where
> > unmap_mapping_range() is called before shmem_truncate_range()).  When
> > fault-around happens near a hole being punched, we might erroneously fault in
> > the "holes" right before they are punched.  Then there's a small window
> > before the page cache is finally dropped, and within it the page will be
> > writable again (NOTE: the uffd-wp protect information is totally lost due to
> > the pre-unmap in shmem_fallocate(), so the page can be writable within the
> > small window).  That's severe data loss.
>
> This still doesn't make sense.  If the page is Uptodate in the page
> cache, then userspace gets to access it.  If you don't want the page to
> be accessible, ClearPageUptodate().  read() can also access it if it's
> marked Uptodate.  A write fault on a page will call the filesystem's
> page_mkwrite() and you can block it there.

I still don't think page_mkwrite() could help here...  Though Andrea pointed
out a more important issue against swap cache (in the v1 thread [1]).
Indeed, if we have those figured out, maybe we'll also rethink this patch, and
it could then become optional; meanwhile, solving those seems to be required to
allow shmem swap in/out with uffd-wp, which I haven't yet tested.  As Hugh
pointed out, purely reusing _PAGE_SWP_UFFD_WP in the swap cache may not work
trivially since uffd-wp is per-pte rather than per-page, so I probably need to
think a bit more about how to do that...

I don't know whether a patch like this could still be good in the future.  For
now, let's drop this patch until we solve the rest of the puzzle.

My thanks to all the reviewers, and sorry for the noise!

NAK myself.

[1] https://lore.kernel.org/lkml/alpine.LSU.2.11.2012011250070.1582@eggly.anvils/T/#mef0716b38f4f5fc07b7542f2c11a07535ea31aad

-- 
Peter Xu