From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin, x86@kernel.org
Subject: [RFC 03/20] mm/mprotect: do not flush on permission promotion
Date: Sat, 30 Jan 2021 16:11:15 -0800
Message-Id: <20210131001132.3368247-4-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Currently, unprotecting a memory region with mprotect() or userfaultfd
causes a TLB flush. At least on x86, no flush is needed when protection
is only promoted.

Add an arch-specific pte_may_need_flush(), which tells whether a TLB
flush is needed based on the old PTE and the new one, and implement it
for x86.

For x86, besides the simple rule that promoting PTE protection or
changing software bits does not require a flush, also add logic that
considers the dirty-bit: if the dirty-bit is clear and write-protect is
being set, no TLB flush is needed, since x86 sets the dirty-bit
atomically on write and rereads the PTE when the cached copy has the
bit clear.
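As a rough illustration (not part of the patch, and not the exact kernel
helper): the stand-alone user-space sketch below mimics only the core
decision for the common mprotect() write-protect case. may_need_flush()
and the PTE_* macros are made up for this example; the bit positions
follow the architectural x86 PTE layout.

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  #define PTE_PRESENT  (1ull << 0)   /* bit 0: present */
  #define PTE_RW       (1ull << 1)   /* bit 1: writable */
  #define PTE_ACCESSED (1ull << 5)   /* bit 5: accessed */
  #define PTE_DIRTY    (1ull << 6)   /* bit 6: dirty */

  static bool may_need_flush(uint64_t oldpte, uint64_t newpte)
  {
      uint64_t diff = oldpte ^ newpte;

      /* A non-present PTE cannot be cached in the TLB: nothing to flush. */
      if (!(oldpte & PTE_PRESENT))
          return false;

      /*
       * Clearing only RW (and/or the accessed bit) on a *clean* PTE is
       * safe without a flush: the CPU sets the dirty bit atomically on
       * write and rereads the PTE when the cached copy is clean.
       */
      if (!(diff & ~(PTE_RW | PTE_ACCESSED)) && !(oldpte & PTE_DIRTY))
          return false;

      return true;  /* anything else: be conservative and flush */
  }

  int main(void)
  {
      uint64_t clean = PTE_PRESENT | PTE_RW | PTE_ACCESSED;
      uint64_t dirty = clean | PTE_DIRTY;

      printf("write-protect clean PTE: flush=%d\n",
             may_need_flush(clean, clean & ~PTE_RW));   /* 0: no flush */
      printf("write-protect dirty PTE: flush=%d\n",
             may_need_flush(dirty, dirty & ~PTE_RW));   /* 1: flush */
      printf("present -> non-present:  flush=%d\n",
             may_need_flush(clean, 0));                 /* 1: flush */
      return 0;
  }

The real helper additionally ignores software bits, treats NX and the
global bit specially, and flushes on any PFN or unrecognized flag
change; the sketch only shows why write-protecting a clean PTE can skip
the flush.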
Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
---
 arch/x86/include/asm/tlbflush.h | 44 +++++++++++++++++++++++++++++++++
 include/asm-generic/tlb.h       |  4 +++
 mm/mprotect.c                   |  3 ++-
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 8c87a2e0b660..a617dc0a9b06 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -255,6 +255,50 @@ static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
 
 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
+static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte)
+{
+	const pteval_t ignore_mask = _PAGE_SOFTW1 | _PAGE_SOFTW2 |
+				     _PAGE_SOFTW3 | _PAGE_ACCESSED;
+	const pteval_t enable_mask = _PAGE_RW | _PAGE_DIRTY | _PAGE_GLOBAL;
+	pteval_t oldval = pte_val(oldpte);
+	pteval_t newval = pte_val(newpte);
+	pteval_t diff = oldval ^ newval;
+	pteval_t disable_mask = 0;
+
+	if (IS_ENABLED(CONFIG_X86_64) || IS_ENABLED(CONFIG_X86_PAE))
+		disable_mask = _PAGE_NX;
+
+	/* new is non-present: need only if old is present */
+	if (pte_none(newpte))
+		return !pte_none(oldpte);
+
+	/*
+	 * If, excluding the ignored bits, only RW and dirty are cleared and
+	 * the old PTE does not have the dirty-bit set, we can avoid a flush.
+	 * This is possible since the x86 architecture sets the dirty bit
+	 * atomically while it caches the PTE in the TLB.
+	 *
+	 * The condition considers any change to RW and dirty as not requiring
+	 * a flush if the old PTE is not dirty or not writable, both to
+	 * simplify the code and to cover the (unlikely) cases of changing the
+	 * dirty-bit of a write-protected PTE.
+	 */
+	if (!(diff & ~(_PAGE_RW | _PAGE_DIRTY | ignore_mask)) &&
+	    (!(pte_dirty(oldpte) || !pte_write(oldpte))))
+		return false;
+
+	/*
+	 * Any change of the PFN, or of any flag other than those considered
+	 * above, requires a flush (e.g., PAT, protection keys). To save
+	 * flushes we do not consider the access bit, as the kernel treats
+	 * it as best-effort.
+	 */
+	return diff & ((oldval & enable_mask) |
+		       (newval & disable_mask) |
+		       ~(enable_mask | disable_mask | ignore_mask));
+}
+#define pte_may_need_flush pte_may_need_flush
+
 #endif /* !MODULE */
 
 #endif /* _ASM_X86_TLBFLUSH_H */
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index eea113323468..c2deec0b6919 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -654,6 +654,10 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 	} while (0)
 #endif
 
+#ifndef pte_may_need_flush
+static inline bool pte_may_need_flush(pte_t oldpte, pte_t newpte) { return true; }
+#endif
+
 #endif /* CONFIG_MMU */
 
 #endif /* _ASM_GENERIC__TLB_H */
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 632d5a677d3f..b7473d2c9a1f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -139,7 +139,8 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 				ptent = pte_mkwrite(ptent);
 			}
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
-			tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
+			if (pte_may_need_flush(oldpte, ptent))
+				tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
 			pages++;
 		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
-- 
2.25.1