From: Linus Torvalds
Date: Tue, 22 Dec 2020 16:20:47 -0800
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
To: Yu Zhao
Cc: Andrea Arcangeli, Andy Lutomirski, Peter Xu, Nadav Amit, linux-mm, lkml, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable, Minchan Kim, Will Deacon, Peter Zijlstra
References: <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com> <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com> <20201221223041.GL6640@xz-x1>

On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds wrote:
>
> The rule is that the TLB flush has to be done before the page table
> lock is released.

I take that back. I guess it's ok as long as the mmap_sem is held for
writing. Then the TLB flush can be delayed until just before releasing
the mmap_sem. I think.

The stale TLB entries still mean that somebody else can write through
them in another thread, but as long as anybody who actually unmaps the
page (and frees it - think rmap etc) is being careful, mprotect()
itself can probably afford to be a bit laissez-faire.

So mprotect() itself should be ok, I think, because it takes things
for writing.

Even with the mmap_sem held for writing, truncate and friends can see
the read-only page table entries (because they can look things up
using the file i_mmap thing instead), but then they rely on the page
table lock and they'll also be careful if they then change that PTE
and will force their own TLB flushes.

So I think a pending TLB flush outside the page table lock is fine -
but once again only if you hold the mmap_sem for writing. Not for
reading, because then the page tables need to be synchronized with the
TLB so that other readers don't see the not-yet-synchronized state.

It once again looks like it's just userfaultfd that would trigger this
due to the read-lock on the mmap_sem. And mprotect() itself is fine.

Am I missing something?

But apparently Nadav sees problems even with that lock changed to a
write lock. Nadav?

                Linus
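
The reasoning above can be sketched as a toy user-space model. This is only an illustration of the argument as stated in the message, not kernel code: `ToyMM`, `other_walker_sees_stale_state` and `run_scenario` are hypothetical names, the "TLB" is a plain dictionary, and the effect of holding the mmap_sem for writing is reduced to the assumption that no other mmap_sem holder runs at all. The message itself is tentative on that point, so the model encodes the claim rather than verifying it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMM:
    """Toy address space: one PTE plus a per-CPU TLB cache of its write bit."""
    pte_writable: bool = True
    tlb: dict = field(default_factory=dict)  # cpu id -> cached write bit

    def cpu_write(self, cpu: int) -> bool:
        """Return True if the write goes through, False if it would fault."""
        if cpu not in self.tlb:              # TLB miss: walk the page table
            self.tlb[cpu] = self.pte_writable
        return self.tlb[cpu]                 # TLB hit: the cached bit decides

    def write_protect(self) -> None:
        """Clear the write bit in the page table only; the flush is deferred."""
        self.pte_writable = False

    def flush_tlb(self) -> None:
        """Drop every cached translation."""
        self.tlb.clear()


def other_walker_sees_stale_state(mm: ToyMM, lock_mode: str) -> bool:
    """Can a second mmap_sem holder observe PTE and TLB out of sync?

    Assumption of this sketch: with the lock held for writing, nobody
    else holding mmap_sem runs, so the answer is False by exclusion.
    With it held for reading, a concurrent reader may look at the PTE
    while stale TLB entries still exist.
    """
    if lock_mode == "write":
        return False
    return any(cached != mm.pte_writable for cached in mm.tlb.values())


def run_scenario(lock_mode: str) -> dict:
    mm = ToyMM()
    mm.cpu_write(cpu=1)                      # CPU 1 caches a writable entry
    mm.write_protect()                       # PTE is read-only, flush pending
    stale_write = mm.cpu_write(cpu=1)        # still succeeds via the stale entry
    exposed = other_walker_sees_stale_state(mm, lock_mode)
    mm.flush_tlb()                           # flush just before dropping the lock
    write_after_flush = mm.cpu_write(cpu=1)  # refills from the PTE and faults
    return {"stale_write": stale_write, "exposed": exposed,
            "write_after_flush": write_after_flush}
```

Calling `run_scenario("read")` and `run_scenario("write")` gives the same stale write in both cases, which matches the point that another thread can still write through the old entry; the two runs differ only in whether a concurrent mmap_sem holder is allowed to see the mismatch. Paths that bypass the mmap_sem (truncate via i_mmap, rmap) are deliberately left out of the model.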