Date: Wed, 07 Jul 2021 18:10:21 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com,
 christophe.leroy@csgroup.eu, hughd@google.com, joel@joelfernandes.org,
 kaleshsingh@google.com, kirill.shutemov@linux.intel.com,
 linux-mm@kvack.org, mm-commits@vger.kernel.org, mpe@ellerman.id.au,
 npiggin@gmail.com, sfr@canb.auug.org.au, torvalds@linux-foundation.org
Subject: [patch 53/54] powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache
Message-ID: <20210708011021.6PyMtPyxp%akpm@linux-foundation.org>
In-Reply-To: <20210707175950.eceddb86c6c555555d4730e2@linux-foundation.org>
User-Agent: s-nail v14.8.16
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache

flush_tlb_range is special in that we don't specify the page size used
for the translation.  Hence, when flushing the TLB, we flush the
translation cache for all possible page sizes.  The kernel also uses the
same interface when moving page tables around, and such a move requires
us to flush the page walk cache.

Instead of adding another interface to force a page walk cache flush,
update flush_tlb_range to flush the page walk cache if the range being
flushed covers more than a PMD.  A page table move will always involve
invalidating a range of at least PMD_SIZE.

Running a microbenchmark with mprotect and parallel memory access didn't
show any observable performance impact.

Link: https://lkml.kernel.org/r/20210616045735.374532-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |    2 
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c        |    8 +
 arch/powerpc/mm/book3s64/radix_tlb.c                |   44 ++++++----
 3 files changed, 36 insertions(+), 18 deletions(-)

--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h~powerpc-book3s64-mm-update-flush_tlb_range-to-flush-page-walk-cache
+++ a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -64,6 +64,8 @@ extern void radix__flush_hugetlb_tlb_ran
 					 unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
 					 unsigned long end, int psize);
+void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start,
+				      unsigned long end, int psize);
 extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 				       unsigned long end);
 extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c~powerpc-book3s64-mm-update-flush_tlb_range-to-flush-page-walk-cache
+++ a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -32,7 +32,13 @@ void radix__flush_hugetlb_tlb_range(stru
 	struct hstate *hstate = hstate_file(vma->vm_file);
 
 	psize = hstate_get_psize(hstate);
-	radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
+	/*
+	 * Flush the PWC even on a PUD_SIZE hugetlb invalidate, to keep this simple.
+	 */
+	if (end - start >= PUD_SIZE)
+		radix__flush_tlb_pwc_range_psize(vma->vm_mm, start, end, psize);
+	else
+		radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
 }
 
 /*
--- a/arch/powerpc/mm/book3s64/radix_tlb.c~powerpc-book3s64-mm-update-flush_tlb_range-to-flush-page-walk-cache
+++ a/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -1111,14 +1111,13 @@ static unsigned long tlb_local_single_pa
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 					    unsigned long start, unsigned long end)
-
 {
 	unsigned long pid;
 	unsigned int page_shift = mmu_psize_defs[mmu_virtual_psize].shift;
 	unsigned long page_size = 1UL << page_shift;
 	unsigned long nr_pages = (end - start) >> page_shift;
 	bool fullmm = (end == TLB_FLUSH_ALL);
-	bool flush_pid;
+	bool flush_pid, flush_pwc = false;
 	enum tlb_flush_type type;
 
 	pid = mm->context.id;
@@ -1137,8 +1136,16 @@ static inline void __radix__flush_tlb_ra
 		flush_pid = nr_pages > tlb_single_page_flush_ceiling;
 	else
 		flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
+	/*
+	 * A full PID flush already includes the PWC flush. If this is not a
+	 * full PID flush, check whether the range covers more than a PMD and
+	 * force a PWC flush; mremap() depends on this behaviour.
+	 */
+	if (!flush_pid && (end - start) >= PMD_SIZE)
+		flush_pwc = true;
 
 	if (!mmu_has_feature(MMU_FTR_GTSE) && type == FLUSH_TYPE_GLOBAL) {
+		unsigned long type = H_RPTI_TYPE_TLB;
 		unsigned long tgt = H_RPTI_TARGET_CMMU;
 		unsigned long pg_sizes = psize_to_rpti_pgsize(mmu_virtual_psize);
 
@@ -1146,19 +1153,20 @@ static inline void __radix__flush_tlb_ra
 			pg_sizes |= psize_to_rpti_pgsize(MMU_PAGE_2M);
 		if (atomic_read(&mm->context.copros) > 0)
 			tgt |= H_RPTI_TARGET_NMMU;
-		pseries_rpt_invalidate(pid, tgt, H_RPTI_TYPE_TLB, pg_sizes,
-				       start, end);
+		if (flush_pwc)
+			type |= H_RPTI_TYPE_PWC;
+		pseries_rpt_invalidate(pid, tgt, type, pg_sizes, start, end);
 	} else if (flush_pid) {
+		/*
+		 * We are flushing a range larger than PMD size; force a RIC_FLUSH_ALL.
+		 */
 		if (type == FLUSH_TYPE_LOCAL) {
-			_tlbiel_pid(pid, RIC_FLUSH_TLB);
+			_tlbiel_pid(pid, RIC_FLUSH_ALL);
 		} else {
 			if (cputlb_use_tlbie()) {
-				if (mm_needs_flush_escalation(mm))
-					_tlbie_pid(pid, RIC_FLUSH_ALL);
-				else
-					_tlbie_pid(pid, RIC_FLUSH_TLB);
+				_tlbie_pid(pid, RIC_FLUSH_ALL);
 			} else {
-				_tlbiel_pid_multicast(mm, pid, RIC_FLUSH_TLB);
+				_tlbiel_pid_multicast(mm, pid, RIC_FLUSH_ALL);
 			}
 		}
 	} else {
@@ -1174,6 +1182,9 @@ static inline void __radix__flush_tlb_ra
 
 		if (type == FLUSH_TYPE_LOCAL) {
 			asm volatile("ptesync": : :"memory");
+			if (flush_pwc)
+				/* For PWC, only one flush is needed */
+				__tlbiel_pid(pid, 0, RIC_FLUSH_PWC);
 			__tlbiel_va_range(start, end, pid, page_size, mmu_virtual_psize);
 			if (hflush)
 				__tlbiel_va_range(hstart, hend, pid,
@@ -1181,6 +1192,8 @@ static inline void __radix__flush_tlb_ra
 			ppc_after_tlbiel_barrier();
 		} else if (cputlb_use_tlbie()) {
 			asm volatile("ptesync": : :"memory");
+			if (flush_pwc)
+				__tlbie_pid(pid, RIC_FLUSH_PWC);
 			__tlbie_va_range(start, end, pid, page_size, mmu_virtual_psize);
 			if (hflush)
 				__tlbie_va_range(hstart, hend, pid,
@@ -1188,10 +1201,10 @@ static inline void __radix__flush_tlb_ra
 			asm volatile("eieio; tlbsync; ptesync": : :"memory");
 		} else {
 			_tlbiel_va_range_multicast(mm,
-					start, end, pid, page_size, mmu_virtual_psize, false);
+					start, end, pid, page_size, mmu_virtual_psize, flush_pwc);
 			if (hflush)
 				_tlbiel_va_range_multicast(mm,
-					hstart, hend, pid, PMD_SIZE, MMU_PAGE_2M, false);
+					hstart, hend, pid, PMD_SIZE, MMU_PAGE_2M, flush_pwc);
 		}
 	}
 out:
@@ -1265,9 +1278,6 @@ void radix__flush_all_lpid_guest(unsigne
 	_tlbie_lpid_guest(lpid, RIC_FLUSH_ALL);
 }
 
-static void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start,
-					     unsigned long end, int psize);
-
 void radix__tlb_flush(struct mmu_gather *tlb)
 {
 	int psize = 0;
@@ -1374,8 +1384,8 @@ void radix__flush_tlb_range_psize(struct
 	return __radix__flush_tlb_range_psize(mm, start, end, psize, false);
 }
 
-static void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start,
-				unsigned long end, int psize)
+void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start,
+				      unsigned long end, int psize)
 {
 	__radix__flush_tlb_range_psize(mm, start, end, psize, true);
 }
_
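
To make the new rule concrete, the sketch below mirrors the flush_pwc
decision in plain user-space C.  It is illustrative only, not kernel code:
PMD_SIZE is hard-coded to 2 MiB here purely for demonstration (the real
value depends on the kernel's page-table geometry), and
range_needs_pwc_flush() is a hypothetical helper standing in for the logic
added to __radix__flush_tlb_range().

	/* Illustrative stand-in for the kernel's decision; PMD_SIZE assumed. */
	#include <stdbool.h>
	#include <stdio.h>

	#define PMD_SIZE (2UL << 20)	/* assumption: 2 MiB, illustration only */

	/*
	 * A full PID flush (RIC_FLUSH_ALL) already clears the page walk cache,
	 * so the extra PWC flush is needed only for a sub-PID range flush that
	 * spans at least one PMD, the smallest range a page table move covers.
	 */
	static bool range_needs_pwc_flush(unsigned long start, unsigned long end,
					  bool flush_pid)
	{
		if (flush_pid)
			return false;
		return (end - start) >= PMD_SIZE;
	}

	int main(void)
	{
		/* 64K flush: no page table can have moved, so no PWC flush. */
		printf("64K range: %d\n",
		       range_needs_pwc_flush(0x100000, 0x110000, false));
		/* 4M flush: may be a page table move, so flush the PWC too. */
		printf("4M range:  %d\n",
		       range_needs_pwc_flush(0x100000, 0x500000, false));
		return 0;
	}

As the diff shows, the kernel version additionally folds the PWC flush into
RIC_FLUSH_ALL whenever a full PID flush is chosen anyway.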
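The mremap() dependency mentioned in the new comment can be exercised from
user space: a large, suitably aligned move lets the kernel relocate whole
page-table entries (move_page_tables() in mm/mremap.c) rather than copy
PTEs, which is exactly the case that now forces the PWC flush.  A hedged
example follows; the 8 MiB size is an arbitrary choice comfortably above
any PMD_SIZE, and the page-table move itself happens inside the kernel and
is not directly observable here.

	#define _GNU_SOURCE
	#include <sys/mman.h>
	#include <stdio.h>
	#include <string.h>

	#define LEN (8UL << 20)	/* 8 MiB: comfortably larger than PMD_SIZE */

	int main(void)
	{
		/* Source mapping, touched so page tables actually exist. */
		char *src = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (src == MAP_FAILED) { perror("mmap src"); return 1; }
		memset(src, 0xaa, LEN);

		/* Reserve a destination to force a move with MREMAP_FIXED. */
		char *hint = mmap(NULL, LEN, PROT_NONE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (hint == MAP_FAILED) { perror("mmap hint"); return 1; }

		/*
		 * The move: for aligned regions the kernel may move whole
		 * PMD/PUD entries, after which the stale page walk cache
		 * entries must be flushed along with the TLB range.
		 */
		char *dst = mremap(src, LEN, LEN,
				   MREMAP_MAYMOVE | MREMAP_FIXED, hint);
		if (dst == MAP_FAILED) { perror("mremap"); return 1; }

		printf("moved %p -> %p, dst[0] = 0x%02x\n",
		       (void *)src, (void *)dst, (unsigned char)dst[0]);
		munmap(dst, LEN);
		return 0;
	}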