From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25C95C433DB for ; Tue, 22 Dec 2020 21:15:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9BA4A22AAF for ; Tue, 22 Dec 2020 21:15:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9BA4A22AAF Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 199B38D0011; Tue, 22 Dec 2020 16:15:03 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 14A968D000F; Tue, 22 Dec 2020 16:15:03 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 039D58D0011; Tue, 22 Dec 2020 16:15:03 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0245.hostedemail.com [216.40.44.245]) by kanga.kvack.org (Postfix) with ESMTP id E08DA8D000F for ; Tue, 22 Dec 2020 16:15:02 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A286E181AEF21 for ; Tue, 22 Dec 2020 21:15:02 +0000 (UTC) X-FDA: 77622173244.25.fish39_1a13b2027463 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin25.hostedemail.com (Postfix) with ESMTP id 7FEAD1804E3A8 for ; Tue, 22 Dec 2020 21:15:02 +0000 (UTC) X-HE-Tag: fish39_1a13b2027463 X-Filterd-Recvd-Size: 6446 Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) by imf32.hostedemail.com (Postfix) with ESMTP for ; Tue, 22 Dec 2020 21:15:01 +0000 (UTC) Received: by mail-pl1-f177.google.com with SMTP id s15so3636133plr.9 for ; Tue, 22 Dec 2020 13:15:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=A45ut6uIN2L7+l/sGdjppxZEAFQcj4+IojMqRdpCWPw=; b=c8jRMCmsalURJ1YVIlU5QMM+NG0JN1KFoToMjJlrbfR3mOzkizoWL55frFWy56GgR9 Ss5J6fLwYa/byWMZEU9OSBParGw4elM98HVdSlIuTBYF4WFAIvYeIxAR1rEl3eKomNnF MaOrAktl1qZMMvCi9A0uswxcspCSq8Dxfh2TMSR1S+YTeS07GfMOGP4SxdvHs6rgR/Vm kUzMpHbBtUonFb5SpPYc/0gZNG2k/gNyvT/zw0YK5bdTf3JTNq8jVCQkPTNnqH30vhzf ndnDmZJIh0w+lG9vL5yumEb37QNJYil1RDeQHlJIpinyfcVFk/MbHVTswNiddty+jWZ2 EXbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=A45ut6uIN2L7+l/sGdjppxZEAFQcj4+IojMqRdpCWPw=; b=dEnwAValBzej7pM5Ve8yuJKHYH+ks3+4n0VXBTIIwukXyDc5P/lvO6vccgga0ukY95 q3RToQlq39tdoHfejPXWhZ7u19UYXcTk47rz510ebewQZuKLrCWLVcJ+mvf75ESRvBI5 R7hVJgawuvWQ7t7ABO3m9g8zB3hfAXNxP6tfFUPxYO/9OhYm7eyMk7yk5I9nnr2mBnfu B94IlZDrs+WBV6zDxNr5SuEY+kV9nzxde+/Bcywtm6g8ps9sknYt7ts4d3V06h4n/rwj OSNED5NS8I0L0Qa/8r49QSIalD0HPpZXYpkh5WgWyOuU18VX/DNoTG12ribPR+9M/iuW V/ow== X-Gm-Message-State: AOAM533aSH8HaC8/rhvuUJbA3Yt72pK3Ujidlo39RyDKOoZDEZ90JwBr g6Ik5nJtl/IfB+T48ppGFlk= X-Google-Smtp-Source: ABdhPJzpHrQHlRj/syPSMFmEb6JKEwxRHbyaGwX8U+lpMwdYupT0/TDliy14n+a3fAs2h+1xxf1JNQ== X-Received: by 2002:a17:90a:d308:: with SMTP id p8mr23890678pju.110.1608671700816; Tue, 22 Dec 2020 13:15:00 -0800 (PST) Received: from ?IPv6:2601:647:4700:9b2:9423:6a08:cbd0:8220? ([2601:647:4700:9b2:9423:6a08:cbd0:8220]) by smtp.gmail.com with ESMTPSA id a8sm21078104pfo.209.2020.12.22.13.14.58 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 22 Dec 2020 13:14:59 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect From: Nadav Amit In-Reply-To: Date: Tue, 22 Dec 2020 13:14:58 -0800 Cc: Andy Lutomirski , Linus Torvalds , Peter Xu , Yu Zhao , linux-mm , lkml , Pavel Emelyanov , Mike Kravetz , Mike Rapoport , stable , Minchan Kim , Will Deacon , Peter Zijlstra Content-Transfer-Encoding: quoted-printable Message-Id: <55080C83-4047-4AA9-B6A3-785A1E4E17B1@gmail.com> References: <76B4F49B-ED61-47EA-9BE4-7F17A26B610D@gmail.com> <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com> <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com> <20201221223041.GL6640@xz-x1> To: Andrea Arcangeli X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: > On Dec 22, 2020, at 11:31 AM, Andrea Arcangeli = wrote: >=20 > =46rom 4ace4d1b53f5cb3b22a5c2dc33facc4150b112d6 Mon Sep 17 00:00:00 = 2001 > From: Andrea Arcangeli > Date: Tue, 22 Dec 2020 14:30:16 -0500 > Subject: [PATCH 1/1] mm: userfaultfd: avoid leaving stale TLB after > userfaultfd_writeprotect() >=20 > change_protection() is called by userfaultfd_writeprotect() with the > mmap_lock_read like in change_prot_numa(). >=20 > The page fault code in wp_copy_page() rightfully assumes if the CPU > issued a write fault and the write bit in the pagetable is not set, no > CPU can write to the page. That's wrong assumption after > userfaultfd_writeprotect(). That's also wrong assumption after > change_prot_numa() where the regular page fault code also would assume > that if the present bit is not set and the page fault is running, > there should be no stale TLB entry, but there is still. >=20 > So to stay safe, the page fault code must be prevented to run as long > as long as the TLB flush remains pending. That is already achieved by > the do_numa_page() path for change_prot_numa() and by the > userfaultfd_pte_wp() path for userfaultfd_writeprotect(). >=20 > The problem that needs fixing is that an un-wrprotect > (i.e. userfaultfd_writeprotect() with UFFDIO_WRITEPROTECT_MODE_WP not > set) could run in between the original wrprotect > (i.e. userfaultfd_writeprotect() with UFFDIO_WRITEPROTECT_MODE_WP set) > and wp_copy_page, while the TLB flush remains pending. I may need to read your patch more carefully, but in general I do not = like the approach. You are much more experienced than I am, but IMHO the TLB flushing logic needs to be further simplified and generalized and not = the other way around. The complexity is already too high. We have tlb_flush_batched and tlb_flush_pending, which I think should be (somehow) combined. tlb_gather_mmu() is designed for zapping, but can=E2=80=99t it be = modified to suit protection changes to avoid buggy use-cases (as the wrong use in clear_refs_write() ) ? So adding new userfaultfd specific code, which potentially does not = address all the interactions (now or the future), is concerning. In this regard, a similar problem to the one in userfaultfd (mmap_read_lock() while write-protecting) already exists with soft-dirty clearing, so any solution should also address the soft-dirty issue.