From: Nadav Amit 
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit , Andrea Arcangeli , Andrew Morton , Andy Lutomirski ,
	Dave Hansen , Peter Zijlstra , Thomas Gleixner , Will Deacon ,
	Yu Zhao , Nick Piggin , x86@kernel.org
Subject: [RFC 13/20] mm/tlb: introduce tlb_start_ptes() and tlb_end_ptes()
Date: Sat, 30 Jan 2021 16:11:25 -0800
Message-Id: <20210131001132.3368247-14-namit@vmware.com>
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit 

Introduce tlb_start_ptes() and tlb_end_ptes(), which are called before
and after PTEs are updated while TLB flushes are deferred. This will
later be used for fine-granularity detection of deferred TLB flushes.

In the meantime, move flush_tlb_batched_pending() into
tlb_start_ptes(). It was not called by wp_pte() and clean_record_pte()
in mapping_dirty_helpers, which might be a bug.

No additional functional change is intended.

Signed-off-by: Nadav Amit 
Cc: Andrea Arcangeli 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Dave Hansen 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: Yu Zhao 
Cc: Nick Piggin 
Cc: x86@kernel.org
---
 fs/proc/task_mmu.c         |  2 ++
 include/asm-generic/tlb.h  | 18 ++++++++++++++++++
 mm/madvise.c               |  6 ++++--
 mm/mapping_dirty_helpers.c | 15 +++++++++++++--
 mm/memory.c                |  2 ++
 mm/mprotect.c              |  3 ++-
 6 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4cd048ffa0f6..d0cce961fa5c 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1168,6 +1168,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
 		return 0;
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	tlb_start_ptes(&cp->tlb);
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
 		ptent = *pte;
 
@@ -1190,6 +1191,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
 		tlb_flush_pte_range(&cp->tlb, addr, PAGE_SIZE);
 		ClearPageReferenced(page);
 	}
+	tlb_end_ptes(&cp->tlb);
 	pte_unmap_unlock(pte - 1, ptl);
 	cond_resched();
 	return 0;
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 041be2ef4426..10690763090a 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -58,6 +58,11 @@
  *    Defaults to flushing at tlb_end_vma() to reset the range; helps when
  *    there's large holes between the VMAs.
  *
+ *  - tlb_start_ptes() / tlb_end_ptes(); mark the start / end of PTE changes.
+ *
+ *    Does internal accounting to allow fine(r) granularity checks for
+ *    pte_accessible() on certain configurations.
+ *
  *  - tlb_remove_table()
  *
  *    tlb_remove_table() is the basic primitive to free page-table directories
@@ -373,6 +378,10 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 		flush_tlb_range(tlb->vma, tlb->start, tlb->end);
 	}
 }
+#endif
+
+#if __is_defined(tlb_flush) || \
+	IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING)
 
 static inline void
 tlb_update_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
@@ -523,6 +532,15 @@ static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen)
 
 #endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */
 
+#define tlb_start_ptes(tlb)						\
+	do {								\
+		struct mmu_gather *_tlb = (tlb);			\
+									\
+		flush_tlb_batched_pending(_tlb->mm);			\
+	} while (0)
+
+static inline void tlb_end_ptes(struct mmu_gather *tlb) { }
+
 /*
  * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end,
  * and set corresponding cleared_*.
diff --git a/mm/madvise.c b/mm/madvise.c
index 0938fd3ad228..932c1c2eb9a3 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -392,7 +392,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 #endif
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-	flush_tlb_batched_pending(mm);
+	tlb_start_ptes(tlb);
 	arch_enter_lazy_mmu_mode();
 	for (; addr < end; pte++, addr += PAGE_SIZE) {
 		ptent = *pte;
@@ -468,6 +468,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 	}
 
 	arch_leave_lazy_mmu_mode();
+	tlb_end_ptes(tlb);
 	pte_unmap_unlock(orig_pte, ptl);
 	if (pageout)
 		reclaim_pages(&page_list);
@@ -588,7 +589,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
-	flush_tlb_batched_pending(mm);
+	tlb_start_ptes(tlb);
 	arch_enter_lazy_mmu_mode();
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
 		ptent = *pte;
@@ -692,6 +693,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		add_mm_counter(mm, MM_SWAPENTS, nr_swap);
 	}
 	arch_leave_lazy_mmu_mode();
+	tlb_end_ptes(tlb);
 	pte_unmap_unlock(orig_pte, ptl);
 	cond_resched();
 next:
diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c
index 2ce6cf431026..063419ade304 100644
--- a/mm/mapping_dirty_helpers.c
+++ b/mm/mapping_dirty_helpers.c
@@ -6,6 +6,8 @@
 #include
 #include
 
+#include "internal.h"
+
 /**
  * struct wp_walk - Private struct for pagetable walk callbacks
  * @range: Range for mmu notifiers
@@ -36,7 +38,10 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end,
 	pte_t ptent = *pte;
 
 	if (pte_write(ptent)) {
-		pte_t old_pte = ptep_modify_prot_start(walk->vma, addr, pte);
+		pte_t old_pte;
+
+		tlb_start_ptes(&wpwalk->tlb);
+		old_pte = ptep_modify_prot_start(walk->vma, addr, pte);
 
 		ptent = pte_wrprotect(old_pte);
 		ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent);
@@ -44,6 +49,7 @@ static int wp_pte(pte_t *pte, unsigned long addr, unsigned long end,
 
 		if (pte_may_need_flush(old_pte, ptent))
 			tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE);
+		tlb_end_ptes(&wpwalk->tlb);
 	}
 
 	return 0;
@@ -94,13 +100,18 @@ static int clean_record_pte(pte_t *pte, unsigned long addr,
 	if (pte_dirty(ptent)) {
 		pgoff_t pgoff = ((addr - walk->vma->vm_start) >> PAGE_SHIFT) +
 			walk->vma->vm_pgoff - cwalk->bitmap_pgoff;
-		pte_t old_pte = ptep_modify_prot_start(walk->vma, addr, pte);
+		pte_t old_pte;
+
+		tlb_start_ptes(&wpwalk->tlb);
+
+		old_pte = ptep_modify_prot_start(walk->vma, addr, pte);
 
 		ptent = pte_mkclean(old_pte);
 		ptep_modify_prot_commit(walk->vma, addr, pte, old_pte, ptent);
 
 		wpwalk->total++;
 		tlb_flush_pte_range(&wpwalk->tlb, addr, PAGE_SIZE);
+		tlb_end_ptes(&wpwalk->tlb);
 
 		__set_bit(pgoff, cwalk->bitmap);
 		cwalk->start = min(cwalk->start, pgoff);
diff --git a/mm/memory.c b/mm/memory.c
index 9e8576a83147..929a93c50d9a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1221,6 +1221,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	init_rss_vec(rss);
 	start_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 	pte = start_pte;
+	tlb_start_ptes(tlb);
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	do {
@@ -1314,6 +1315,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	add_mm_rss_vec(mm, rss);
 	arch_leave_lazy_mmu_mode();
 
+	tlb_end_ptes(tlb);
 	/* Do the actual TLB flush before dropping ptl */
 	if (force_flush)
 		tlb_flush_mmu_tlbonly(tlb);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b7473d2c9a1f..1258bbe42ee1 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -70,7 +70,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 	    atomic_read(&vma->vm_mm->mm_users) == 1)
 		target_node = numa_node_id();
 
-	flush_tlb_batched_pending(vma->vm_mm);
+	tlb_start_ptes(tlb);
 	arch_enter_lazy_mmu_mode();
 	do {
 		oldpte = *pte;
@@ -182,6 +182,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 		}
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
+	tlb_end_ptes(tlb);
 	pte_unmap_unlock(pte - 1, ptl);
 
 	return pages;
-- 
2.25.1
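
As a reference for reviewers, a minimal sketch of the call pattern these
helpers are meant to bracket, modeled on the madvise_free_pte_range()
hunk above. The function and its parameters are illustrative only and
are not part of the patch:

/*
 * Illustrative sketch only (hypothetical helper, not part of this
 * series): tlb_start_ptes()/tlb_end_ptes() bracket a PTE-update loop
 * whose TLB flushes are deferred into the mmu_gather.
 */
static void example_pte_update_range(struct mmu_gather *tlb,
				     struct vm_area_struct *vma,
				     pmd_t *pmd, unsigned long addr,
				     unsigned long end)
{
	pte_t *pte, *orig_pte;
	spinlock_t *ptl;

	tlb_change_page_size(tlb, PAGE_SIZE);
	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
	tlb_start_ptes(tlb);	/* replaces flush_tlb_batched_pending() */
	arch_enter_lazy_mmu_mode();
	for (; addr != end; pte++, addr += PAGE_SIZE) {
		/* ... read and modify *pte here ... */
		tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
	}
	arch_leave_lazy_mmu_mode();
	tlb_end_ptes(tlb);	/* no-op for now; accounting hook later */
	pte_unmap_unlock(orig_pte, ptl);
}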