From: Nadav Amit
To: Andrea Arcangeli
Cc: Peter Xu, Yu Zhao, Linus Torvalds, linux-mm, lkml, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable, Minchan Kim, Andy Lutomirski, Will Deacon, Peter Zijlstra
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
Date: Tue, 22 Dec 2020 12:19:49 -0800
References: <20201221172711.GE6640@xz-x1> <76B4F49B-ED61-47EA-9BE4-7F17A26B610D@gmail.com> <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com> <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com> <20201221223041.GL6640@xz-x1>

> On Dec 22, 2020, at 11:44 AM, Andrea Arcangeli wrote:
>
> On Mon, Dec 21, 2020 at 02:55:12PM -0800, Nadav Amit wrote:
>> wouldn’t mmap_write_downgrade() be executed before mprotect_fixup() (so
>
> I assume you mean "in" mprotect_fixup, after change_protection.
>
> If you would downgrade the mmap_lock to read there, then it'd severely
> slowdown the non contention case, if there's more than vma that needs
> change_protection.
>
> You'd need to throw away the prev->vm_next info and you'd need to do a
> new find_vma after droping the mmap_lock for reading and re-taking the
> mmap_lock for writing at every iteration of the loop.
>
> To do less harm to the non-contention case you could perhaps walk
> vma->vm_next and check if it's outside the mprotect range and only
> downgrade in such case.
> So let's assume we intend to optimize with
> mmap_write_downgrade only the last vma.

…

I read your response in detail and you make great points. To be fair, I
did not think it through and just tried to make the point that not taking
mmap_lock for write is an unrelated optimization.

So you make a great point that it is actually related and can only(?)
benefit uffd and arguably soft-dirty, to which I am going to add
mmap_write_lock().

Yet my confidence in doing the optimization that you suggested (keep using
mmap_read_lock()) as part of the bug fix is very low, and it has only
dropped further over the course of this discussion. So I do want to try,
in the future, to avoid the overheads I am introducing (sorry), but that
requires a systematic solution and some thought.

Perhaps any change to a PTE in a page-table should increase a page-table
generation that we would save in the page-table page-struct (next to the
PTL), and pte_same() would also compare the original and updated
"page-table generation" when it checks whether a PTE changed. Then, if a
PTE went through a not-writable -> writable -> not-writable cycle without
a TLB flush, this could be detected. Yet there is still the question of
how to detect pending TLB flushes at a finer granularity, to avoid
frequent unwarranted TLB flushes while a TLB flush is pending.

It all requires some thought, and the fact that soft-dirty appears to be
broken too indicates that bugs can go unnoticed for some time.

Sorry for being a chicken, but I would rather be safe than sorry.
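
To make the generation idea above concrete, here is a minimal userspace
sketch. It is not kernel code and not a proposed patch: struct pt_page,
pt_set_pte(), pt_read_pte(), pte_same_value(), pte_same_gen() and the
PTE_WRITE bit are all hypothetical names invented for this toy model, and
locking and TLB flushing are not modeled at all. It only shows why a
value-only comparison misses a not-writable -> writable -> not-writable
cycle while a value-plus-generation comparison does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_WRITE   0x2ULL  /* hypothetical writable bit for this toy model */
#define PTRS_PER_PT 8       /* toy table size, not a real architecture value */

/* Toy stand-in for a page-table page. The idea in the text would keep the
 * generation in the page-table's page-struct, next to the PTL. */
struct pt_page {
	uint64_t gen;               /* bumped on every PTE change */
	uint64_t pte[PTRS_PER_PT];
};

/* What a reader remembers about a PTE before it stops holding the lock. */
struct pte_snapshot {
	uint64_t val;
	uint64_t gen;
};

/* Every PTE update also advances the page-table generation. */
static void pt_set_pte(struct pt_page *pt, unsigned int idx, uint64_t val)
{
	assert(idx < PTRS_PER_PT);
	pt->pte[idx] = val;
	pt->gen++;
}

static struct pte_snapshot pt_read_pte(const struct pt_page *pt, unsigned int idx)
{
	assert(idx < PTRS_PER_PT);
	struct pte_snapshot snap = { .val = pt->pte[idx], .gen = pt->gen };
	return snap;
}

/* Value-only comparison: cannot see an A -> B -> A cycle. */
static bool pte_same_value(const struct pt_page *pt, unsigned int idx,
			   struct pte_snapshot old)
{
	return pt->pte[idx] == old.val;
}

/* Value-plus-generation comparison: any intervening change is visible. */
static bool pte_same_gen(const struct pt_page *pt, unsigned int idx,
			 struct pte_snapshot old)
{
	return pt->pte[idx] == old.val && pt->gen == old.gen;
}

/* Returns true when the generation check catches a write-protect cycle
 * that the value-only check reports as "unchanged". */
static bool demo_wp_cycle(void)
{
	struct pt_page pt = { 0 };
	struct pte_snapshot snap;

	pt_set_pte(&pt, 0, 0x1000);              /* mapped, not writable */
	snap = pt_read_pte(&pt, 0);

	pt_set_pte(&pt, 0, 0x1000 | PTE_WRITE);  /* made writable */
	pt_set_pte(&pt, 0, 0x1000);              /* write-protected again */

	return pte_same_value(&pt, 0, snap) && !pte_same_gen(&pt, 0, snap);
}
```

Note that with a single counter per page-table page, a change to any other
PTE in the same table also makes pte_same_gen() report a difference. That
is conservative (it can only cause extra retries, not missed changes), and
it does not address the open question of tracking pending TLB flushes at a
finer granularity.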