From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.9 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09D56C2BA1A for ; Tue, 7 Apr 2020 03:05:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C083620936 for ; Tue, 7 Apr 2020 03:05:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="I5JQVmn+" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C083620936 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 76FE28E0028; Mon, 6 Apr 2020 23:05:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7200F8E0001; Mon, 6 Apr 2020 23:05:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6363D8E0028; Mon, 6 Apr 2020 23:05:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0062.hostedemail.com [216.40.44.62]) by kanga.kvack.org (Postfix) with ESMTP id 4AD258E0001 for ; Mon, 6 Apr 2020 23:05:32 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0E9D699B4 for ; Tue, 7 Apr 2020 03:05:32 +0000 (UTC) X-FDA: 76679568504.13.clock01_6a79c0661c84d X-HE-Tag: clock01_6a79c0661c84d X-Filterd-Recvd-Size: 5339 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Tue, 7 Apr 2020 03:05:31 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 087FB21744; Tue, 7 Apr 2020 03:05:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1586228730; bh=j3MlzVqCfDRcoRYwFr8syMpH7f1dTzH9XDE+2W4G83o=; h=Date:From:To:Subject:In-Reply-To:From; b=I5JQVmn+cCBMAGDNhJ4L1wv5FhsXrGDug5OICSe0iipYtttK1to/CbnBpF5XKTTY4 C50QQCxEcvCygrv7Wz+SUvOoCbbyTSW1J1vP0fy0NnzBxmgnNmZxjiaHW6f7WtQpjj agdnjbKvySSnPGAmx93I5nd1Z7SK8ND55Lrbh178= Date: Mon, 06 Apr 2020 20:05:29 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, bgeffon@google.com, bobbypowers@gmail.com, cracauer@cons.org, david@redhat.com, dgilbert@redhat.com, dplotnikov@virtuozzo.com, gokhale2@llnl.gov, hannes@cmpxchg.org, hughd@google.com, jglisse@redhat.com, kirill@shutemov.name, linux-mm@kvack.org, mcfadden8@llnl.gov, mgorman@suse.de, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, peterx@redhat.com, riel@redhat.com, rppt@linux.vnet.ibm.com, shli@fb.com, torvalds@linux-foundation.org, xemul@openvz.org Subject: [patch 034/166] userfaultfd: wp: hook userfault handler to write protection fault Message-ID: <20200407030529.LNaWlZqz4%akpm@linux-foundation.org> In-Reply-To: <20200406200254.a69ebd9e08c4074e41ddebaf@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Andrea Arcangeli Subject: userfaultfd: wp: hook userfault handler to write protection fault There are several cases write protection fault happens. It could be a write to zero page, swaped page or userfault write protected page. When the fault happens, there is no way to know if userfault write protect the page before. Here we just blindly issue a userfault notification for vma with VM_UFFD_WP regardless if app write protects it yet. Application should be ready to handle such wp fault. In the swapin case, always swapin as readonly. This will cause false positive userfaults. We need to decide later if to eliminate them with a flag like soft-dirty in the swap entry (see _PAGE_SWP_SOFT_DIRTY). hugetlbfs wouldn't need to worry about swapouts but and tmpfs would be handled by a swap entry bit like anonymous memory. The main problem with no easy solution to eliminate the false positives, will be if/when userfaultfd is extended to real filesystem pagecache. When the pagecache is freed by reclaim we can't leave the radix tree pinned if the inode and in turn the radix tree is reclaimed as well. The estimation is that full accuracy and lack of false positives could be easily provided only to anonymous memory (as long as there's no fork or as long as MADV_DONTFORK is used on the userfaultfd anonymous range) tmpfs and hugetlbfs, it's most certainly worth to achieve it but in a later incremental patch. [peterx@redhat.com: don't conditionally drop FAULT_FLAG_WRITE in do_swap_page] Link: http://lkml.kernel.org/r/20200220163112.11409-3-peterx@redhat.com Signed-off-by: Andrea Arcangeli Signed-off-by: Peter Xu Reviewed-by: Mike Rapoport Reviewed-by: Jerome Glisse Cc: Shaohua Li Cc: Bobby Powers Cc: Brian Geffon Cc: David Hildenbrand Cc: Denis Plotnikov Cc: "Dr . David Alan Gilbert" Cc: Hugh Dickins Cc: Johannes Weiner Cc: "Kirill A . Shutemov" Cc: Martin Cracauer Cc: Marty McFadden Cc: Maya Gokhale Cc: Mel Gorman Cc: Mike Kravetz Cc: Pavel Emelyanov Cc: Rik van Riel Signed-off-by: Andrew Morton --- mm/memory.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) --- a/mm/memory.c~userfaultfd-wp-hook-userfault-handler-to-write-protection-fault +++ a/mm/memory.c @@ -2752,6 +2752,11 @@ static vm_fault_t do_wp_page(struct vm_f { struct vm_area_struct *vma = vmf->vma; + if (userfaultfd_wp(vma)) { + pte_unmap_unlock(vmf->pte, vmf->ptl); + return handle_userfault(vmf, VM_UFFD_WP); + } + vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte); if (!vmf->page) { /* @@ -3948,8 +3953,11 @@ static inline vm_fault_t create_huge_pmd /* `inline' is required to avoid gcc 4.1.2 build error */ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd) { - if (vma_is_anonymous(vmf->vma)) + if (vma_is_anonymous(vmf->vma)) { + if (userfaultfd_wp(vmf->vma)) + return handle_userfault(vmf, VM_UFFD_WP); return do_huge_pmd_wp_page(vmf, orig_pmd); + } if (vmf->vma->vm_ops->huge_fault) { vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD); _