Date: Mon, 21 Dec 2020 17:30:41 -0500
From: Peter Xu
To: Nadav Amit
Cc: Yu Zhao, Linus Torvalds, Andrea Arcangeli, linux-mm, lkml,
    Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable, Minchan Kim,
    Andy Lutomirski, Will Deacon, Peter Zijlstra
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
Message-ID: <20201221223041.GL6640@xz-x1>
References: <20201221172711.GE6640@xz-x1>
    <76B4F49B-ED61-47EA-9BE4-7F17A26B610D@gmail.com>
    <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com>
    <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com>
In-Reply-To: <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com>

On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote:
> BTW: In general, I think that you are right, and that changing of PTEs
> should not require taking mmap_lock for write. However, I am not sure
> cow_user_page() is not the only one that poses a problem and whether a more
> systematic solution is needed. If cow_user_pages() is the only problem, do
> you think it is possible to do the copying while holding the PTL? It works
> for normal-pages, but I am not sure whether special-pages pose special
> problems.
>
> Anyhow, this is an enhancement that we can try later.

AFAIU mprotect() is the only one that modifies ptes while holding the mmap
write lock.  NUMA balancing also uses only the mmap read lock when changing
pte protections.  My understanding is that mprotect() takes the write lock
only because it manipulates the address space itself (i.e. the vma layout),
not because it modifies the ptes.

At the pte level, it always seems to be the pgtable lock that serializes
things.

So, unless I'm severely mistaken, it looks perfectly legal to me for e.g. a
driver to modify ptes with the read lock of mmap_sem held, as long as the
pgtable lock is taken when doing so.  If a driver changed the ptes, changed
the content of the page, and then restored the ptes to their original
values, and all of this happened right after wp_page_copy() released the
pgtable lock but before wp_page_copy() retook the same lock, we could hit
the same issue: the copied page would end up containing corrupted data.  And
I don't know what to blame the driver for either, because it seems to be
following the rules exactly.

I believe switching to the write lock would solve the race here, because the
tlb flush would be guaranteed along the way, but I'm a bit worried that it's
not the best way to go.

Thanks,

-- 
Peter Xu
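
To make the window described above concrete, here is a toy user-space model
of the interleaving.  It is only a sketch and not kernel code: every name in
it (struct toy_mm, toy_driver_touch(), toy_run(), ...) is hypothetical, the
"pgtable lock" is a plain flag, and the concurrent driver is replayed
deterministically in the middle of the copy.  It also does not model stale
TLB entries, so it says nothing about whether copying under the PTL would be
sufficient in the real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 8

enum toy_result { TOY_RETRY, TOY_COPY_OK, TOY_COPY_TORN };

/* Hypothetical stand-ins for a pte and for the state the PTL protects. */
struct toy_pte { int pfn; bool write; };

struct toy_mm {
	struct toy_pte pte;
	bool ptl_held;                  /* models the pgtable lock */
	char pages[2][TOY_PAGE_SIZE];   /* pfn 0: old page, pfn 1: COW copy */
};

static void toy_ptl_lock(struct toy_mm *mm)
{
	assert(!mm->ptl_held);
	mm->ptl_held = true;
}

static void toy_ptl_unlock(struct toy_mm *mm)
{
	assert(mm->ptl_held);
	mm->ptl_held = false;
}

/* A "driver" that follows the rules: every pte update is under the PTL. */
static void toy_driver_touch(struct toy_mm *mm)
{
	struct toy_pte saved;

	toy_ptl_lock(mm);
	saved = mm->pte;
	mm->pte.write = true;                          /* change the pte */
	toy_ptl_unlock(mm);

	memset(mm->pages[saved.pfn], 'B', TOY_PAGE_SIZE); /* change the page */

	toy_ptl_lock(mm);
	mm->pte = saved;                               /* restore the pte */
	toy_ptl_unlock(mm);
}

static bool toy_page_uniform(const char *page)
{
	for (int i = 1; i < TOY_PAGE_SIZE; i++)
		if (page[i] != page[0])
			return false;
	return true;
}

/* Models the unlock / copy / relock / compare shape of a COW fault. */
enum toy_result toy_run(bool copy_under_ptl)
{
	struct toy_mm mm = { .pte = { .pfn = 0, .write = false } };
	struct toy_pte orig;
	const int half = TOY_PAGE_SIZE / 2;

	memset(mm.pages[0], 'A', TOY_PAGE_SIZE);

	toy_ptl_lock(&mm);
	orig = mm.pte;                  /* snapshot taken at fault time */
	if (!copy_under_ptl)
		toy_ptl_unlock(&mm);

	memcpy(mm.pages[1], mm.pages[0], half);
	if (!copy_under_ptl)
		toy_driver_touch(&mm);      /* only possible while the PTL is free */
	memcpy(mm.pages[1] + half, mm.pages[0] + half, TOY_PAGE_SIZE - half);

	if (!copy_under_ptl)
		toy_ptl_lock(&mm);
	if (mm.pte.pfn != orig.pfn || mm.pte.write != orig.write) {
		toy_ptl_unlock(&mm);
		return TOY_RETRY;           /* pte changed: the copy is discarded */
	}
	mm.pte = (struct toy_pte){ .pfn = 1, .write = true };
	toy_ptl_unlock(&mm);

	return toy_page_uniform(mm.pages[1]) ? TOY_COPY_OK : TOY_COPY_TORN;
}
```

In this model toy_run(false) returns TOY_COPY_TORN: the pte comparison
passes because the driver restored the pte, yet the installed copy mixes old
and new data.  toy_run(true) returns TOY_COPY_OK, since the driver cannot
take the PTL while the copy is in progress.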