From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82C21C43381 for ; Tue, 19 Feb 2019 10:34:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3AE9C2146E for ; Tue, 19 Feb 2019 10:34:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ZUS+Ui6o" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728908AbfBSKeM (ORCPT ); Tue, 19 Feb 2019 05:34:12 -0500 Received: from merlin.infradead.org ([205.233.59.134]:55738 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728056AbfBSKeJ (ORCPT ); Tue, 19 Feb 2019 05:34:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-Id:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=snNhn5zveF6dfV/zpZ0PO6Eh5zozpyzs0qjQhpCNol8=; b=ZUS+Ui6ozcYG4zjSGGs0zJGTlR MDJjRipekQBdKE9nJVKI5mG1id4d8nCwV+LiX1RawROqHIKsIwA7W9KgYpT9pFmEdcO7nLRJ4jdWp X7A7Kkfcg9K5k+pt05pGCgtWfLBSV9ck1tgUOTJQfGOkrNScQenZxXgIGSc8oHGfFXSBEwMEukF3u sTdvaPBXqCdsQxCjkmNjPcBpcy3QIcDXhQgWT7QX+3UegePh7l3wOetqF8uW5CMeOSj8F3Evl/drH OTwSTWV2XLKnawMQ+W69R+rPqkE9Z4VTFeMXKFI3VCT+o3mNjM0H4mkGA60/GUgiYAK7a8vW7jBDO qBeaOyWA==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=hirez.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gw2ho-0000dm-DP; Tue, 19 Feb 2019 10:32:52 +0000 Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 6E6102852059F; Tue, 19 Feb 2019 11:32:48 +0100 (CET) Message-Id: <20190219103233.564804918@infradead.org> User-Agent: quilt/0.65 Date: Tue, 19 Feb 2019 11:32:00 +0100 From: Peter Zijlstra To: will.deacon@arm.com, aneesh.kumar@linux.vnet.ibm.com, akpm@linux-foundation.org, npiggin@gmail.com Cc: linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, peterz@infradead.org, linux@armlinux.org.uk, heiko.carstens@de.ibm.com, riel@surriel.com, "David S. Miller" , Michal Simek , Helge Deller , Greentime Hu , Richard Henderson , Ley Foon Tan , Jonas Bonn , Mark Salter , Richard Kuo , Vineet Gupta , Paul Burton , Max Filippov , Guan Xuetao Subject: [PATCH v6 12/18] arch/tlb: Clean up simple architectures References: <20190219103148.192029670@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org For the architectures that do not implement their own tlb_flush() but do already use the generic mmu_gather, there are two options: 1) the platform has an efficient flush_tlb_range() and asm-generic/tlb.h doesn't need any overrides at all. 2) the platform lacks an efficient flush_tlb_range() and we select MMU_GATHER_NO_RANGE to minimize full invalidates. Convert all 'simple' architectures to one of these two forms. alpha: has no range invalidate -> 2 arc: already used flush_tlb_range() -> 1 c6x: has no range invalidate -> 2 hexagon: has an efficient flush_tlb_range() -> 1 (flush_tlb_mm() is in fact a full range invalidate, so no need to shoot down everything) m68k: has inefficient flush_tlb_range() -> 2 microblaze: has no flush_tlb_range() -> 2 mips: has efficient flush_tlb_range() -> 1 (even though it currently seems to use flush_tlb_mm()) nds32: already uses flush_tlb_range() -> 1 nios2: has inefficient flush_tlb_range() -> 2 (no limit on range iteration) openrisc: has inefficient flush_tlb_range() -> 2 (no limit on range iteration) parisc: already uses flush_tlb_range() -> 1 sparc32: already uses flush_tlb_range() -> 1 unicore32: has inefficient flush_tlb_range() -> 2 (no limit on range iteration) xtensa: has efficient flush_tlb_range() -> 1 Note this also fixes a bug in the existing code for a number platforms. Those platforms that did: tlb_end_vma() -> if (!full_mm) flush_tlb_*() tlb_flush -> if (full_mm) flush_tlb_mm() missed the case of shift_arg_pages(), which doesn't have @fullmm set, nor calls into tlb_*vma(), but still frees page-tables and thus needs an invalidate. The new code handles this by detecting a non-empty range, and either issuing the matching range invalidate or a full invalidate, depending on the capabilities. Cc: Nick Piggin Cc: "David S. Miller" Cc: Michal Simek Cc: Helge Deller Cc: Greentime Hu Cc: Richard Henderson Cc: Andrew Morton Cc: "Aneesh Kumar K.V" Cc: Will Deacon Cc: Ley Foon Tan Cc: Jonas Bonn Cc: Mark Salter Cc: Richard Kuo Cc: Vineet Gupta Cc: Paul Burton Cc: Max Filippov Cc: Guan Xuetao Signed-off-by: Peter Zijlstra (Intel) --- arch/alpha/Kconfig | 1 + arch/alpha/include/asm/tlb.h | 6 ------ arch/arc/include/asm/tlb.h | 23 ----------------------- arch/c6x/Kconfig | 1 + arch/c6x/include/asm/tlb.h | 2 -- arch/h8300/include/asm/tlb.h | 2 -- arch/hexagon/include/asm/tlb.h | 12 ------------ arch/m68k/Kconfig | 1 + arch/m68k/include/asm/tlb.h | 14 -------------- arch/microblaze/Kconfig | 1 + arch/microblaze/include/asm/tlb.h | 9 --------- arch/mips/include/asm/tlb.h | 8 -------- arch/nds32/include/asm/tlb.h | 10 ---------- arch/nios2/Kconfig | 1 + arch/nios2/include/asm/tlb.h | 8 ++++---- arch/openrisc/Kconfig | 1 + arch/openrisc/include/asm/tlb.h | 8 ++------ arch/parisc/include/asm/tlb.h | 13 ------------- arch/sparc/include/asm/tlb_32.h | 13 ------------- arch/unicore32/Kconfig | 1 + arch/unicore32/include/asm/tlb.h | 7 +++---- arch/xtensa/include/asm/tlb.h | 17 ----------------- 22 files changed, 16 insertions(+), 143 deletions(-) --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -36,6 +36,7 @@ config ALPHA select ODD_RT_SIGACTION select OLD_SIGSUSPEND select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67 + select MMU_GATHER_NO_RANGE help The Alpha is a 64-bit general-purpose processor designed and marketed by the Digital Equipment Corporation of blessed memory, --- a/arch/alpha/include/asm/tlb.h +++ b/arch/alpha/include/asm/tlb.h @@ -2,12 +2,6 @@ #ifndef _ALPHA_TLB_H #define _ALPHA_TLB_H -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, pte, addr) do { } while (0) - -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #define __pte_free_tlb(tlb, pte, address) pte_free((tlb)->mm, pte) --- a/arch/arc/include/asm/tlb.h +++ b/arch/arc/include/asm/tlb.h @@ -9,29 +9,6 @@ #ifndef _ASM_ARC_TLB_H #define _ASM_ARC_TLB_H -#define tlb_flush(tlb) \ -do { \ - if (tlb->fullmm) \ - flush_tlb_mm((tlb)->mm); \ -} while (0) - -/* - * This pair is called at time of munmap/exit to flush cache and TLB entries - * for mappings being torn down. - * 1) cache-flush part -implemented via tlb_start_vma( ) for VIPT aliasing D$ - * 2) tlb-flush part - implemted via tlb_end_vma( ) flushes the TLB range - * - * Note, read http://lkml.org/lkml/2004/1/15/6 - */ - -#define tlb_end_vma(tlb, vma) \ -do { \ - if (!tlb->fullmm) \ - flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ -} while (0) - -#define __tlb_remove_tlb_entry(tlb, ptep, address) - #include #include --- a/arch/c6x/Kconfig +++ b/arch/c6x/Kconfig @@ -19,6 +19,7 @@ config C6X select GENERIC_CLOCKEVENTS select MODULES_USE_ELF_RELA select ARCH_NO_COHERENT_DMA_MMAP + select MMU_GATHER_NO_RANGE if MMU config MMU def_bool n --- a/arch/c6x/include/asm/tlb.h +++ b/arch/c6x/include/asm/tlb.h @@ -2,8 +2,6 @@ #ifndef _ASM_C6X_TLB_H #define _ASM_C6X_TLB_H -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #endif /* _ASM_C6X_TLB_H */ --- a/arch/h8300/include/asm/tlb.h +++ b/arch/h8300/include/asm/tlb.h @@ -2,8 +2,6 @@ #ifndef __H8300_TLB_H__ #define __H8300_TLB_H__ -#define tlb_flush(tlb) do { } while (0) - #include #endif --- a/arch/hexagon/include/asm/tlb.h +++ b/arch/hexagon/include/asm/tlb.h @@ -22,18 +22,6 @@ #include #include -/* - * We don't need any special per-pte or per-vma handling... - */ -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) - -/* - * .. because we flush the whole mm when it fills up - */ -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #endif --- a/arch/m68k/Kconfig +++ b/arch/m68k/Kconfig @@ -27,6 +27,7 @@ config M68K select OLD_SIGSUSPEND3 select OLD_SIGACTION select ARCH_DISCARD_MEMBLOCK + select MMU_GATHER_NO_RANGE if MMU config CPU_BIG_ENDIAN def_bool y --- a/arch/m68k/include/asm/tlb.h +++ b/arch/m68k/include/asm/tlb.h @@ -2,20 +2,6 @@ #ifndef _M68K_TLB_H #define _M68K_TLB_H -/* - * m68k doesn't need any special per-pte or - * per-vma handling.. - */ -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) - -/* - * .. because we flush the whole mm when it - * fills up. - */ -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #endif /* _M68K_TLB_H */ --- a/arch/microblaze/Kconfig +++ b/arch/microblaze/Kconfig @@ -40,6 +40,7 @@ config MICROBLAZE select TRACING_SUPPORT select VIRT_TO_BUS select CPU_NO_EFFICIENT_FFS + select MMU_GATHER_NO_RANGE if MMU # Endianness selection choice --- a/arch/microblaze/include/asm/tlb.h +++ b/arch/microblaze/include/asm/tlb.h @@ -11,16 +11,7 @@ #ifndef _ASM_MICROBLAZE_TLB_H #define _ASM_MICROBLAZE_TLB_H -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include - -#ifdef CONFIG_MMU -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, pte, address) do { } while (0) -#endif - #include #endif /* _ASM_MICROBLAZE_TLB_H */ --- a/arch/mips/include/asm/tlb.h +++ b/arch/mips/include/asm/tlb.h @@ -5,14 +5,6 @@ #include #include -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) - -/* - * .. because we flush the whole mm when it fills up. - */ -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #define _UNIQUE_ENTRYHI(base, idx) \ (((base) + ((idx) << (PAGE_SHIFT + 1))) | \ (cpu_has_tlbinv ? MIPS_ENTRYHI_EHINV : 0)) --- a/arch/nds32/include/asm/tlb.h +++ b/arch/nds32/include/asm/tlb.h @@ -4,16 +4,6 @@ #ifndef __ASMNDS32_TLB_H #define __ASMNDS32_TLB_H -#define tlb_end_vma(tlb,vma) \ - do { \ - if(!tlb->fullmm) \ - flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ - } while (0) - -#define __tlb_remove_tlb_entry(tlb, pte, addr) do { } while (0) - -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #define __pte_free_tlb(tlb, pte, addr) pte_free((tlb)->mm, pte) --- a/arch/nios2/Kconfig +++ b/arch/nios2/Kconfig @@ -23,6 +23,7 @@ config NIOS2 select USB_ARCH_HAS_HCD if USB_SUPPORT select CPU_NO_EFFICIENT_FFS select ARCH_DISCARD_MEMBLOCK + select MMU_GATHER_NO_RANGE if MMU config GENERIC_CSUM def_bool y --- a/arch/nios2/include/asm/tlb.h +++ b/arch/nios2/include/asm/tlb.h @@ -11,12 +11,12 @@ #ifndef _ASM_NIOS2_TLB_H #define _ASM_NIOS2_TLB_H -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - extern void set_mmu_pid(unsigned long pid); -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) +/* + * NIOS32 does have flush_tlb_range(), but it lacks a limit and fallback to + * full mm invalidation. So use flush_tlb_mm() for everything. + */ #include #include --- a/arch/openrisc/Kconfig +++ b/arch/openrisc/Kconfig @@ -35,6 +35,7 @@ config OPENRISC select OMPIC if SMP select ARCH_WANT_FRAME_POINTERS select GENERIC_IRQ_MULTI_HANDLER + select MMU_GATHER_NO_RANGE if MMU config CPU_BIG_ENDIAN def_bool y --- a/arch/openrisc/include/asm/tlb.h +++ b/arch/openrisc/include/asm/tlb.h @@ -20,14 +20,10 @@ #define __ASM_OPENRISC_TLB_H__ /* - * or32 doesn't need any special per-pte or - * per-vma handling.. + * OpenRISC doesn't have an efficient flush_tlb_range() so use flush_tlb_mm() + * for everything. */ -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) #include #include --- a/arch/parisc/include/asm/tlb.h +++ b/arch/parisc/include/asm/tlb.h @@ -2,19 +2,6 @@ #ifndef _PARISC_TLB_H #define _PARISC_TLB_H -#define tlb_flush(tlb) \ -do { if ((tlb)->fullmm) \ - flush_tlb_mm((tlb)->mm);\ -} while (0) - -#define tlb_end_vma(tlb, vma) \ -do { if (!(tlb)->fullmm) \ - flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ -} while (0) - -#define __tlb_remove_tlb_entry(tlb, pte, address) \ - do { } while (0) - #include #define __pmd_free_tlb(tlb, pmd, addr) pmd_free((tlb)->mm, pmd) --- a/arch/sparc/include/asm/tlb_32.h +++ b/arch/sparc/include/asm/tlb_32.h @@ -2,19 +2,6 @@ #ifndef _SPARC_TLB_H #define _SPARC_TLB_H -#define tlb_end_vma(tlb, vma) \ -do { \ - flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ -} while (0) - -#define __tlb_remove_tlb_entry(tlb, pte, address) \ - do { } while (0) - -#define tlb_flush(tlb) \ -do { \ - flush_tlb_mm((tlb)->mm); \ -} while (0) - #include #endif /* _SPARC_TLB_H */ --- a/arch/unicore32/Kconfig +++ b/arch/unicore32/Kconfig @@ -20,6 +20,7 @@ config UNICORE32 select GENERIC_IOMAP select MODULES_USE_ELF_REL select NEED_DMA_MAP_STATE + select MMU_GATHER_NO_RANGE if MMU help UniCore-32 is 32-bit Instruction Set Architecture, including a series of low-power-consumption RISC chip --- a/arch/unicore32/include/asm/tlb.h +++ b/arch/unicore32/include/asm/tlb.h @@ -12,10 +12,9 @@ #ifndef __UNICORE_TLB_H__ #define __UNICORE_TLB_H__ -#define tlb_start_vma(tlb, vma) do { } while (0) -#define tlb_end_vma(tlb, vma) do { } while (0) -#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0) -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) +/* + * unicore32 lacks an efficient flush_tlb_range(), use flush_tlb_mm(). + */ #define __pte_free_tlb(tlb, pte, addr) \ do { \ --- a/arch/xtensa/include/asm/tlb.h +++ b/arch/xtensa/include/asm/tlb.h @@ -14,23 +14,6 @@ #include #include -#if (DCACHE_WAY_SIZE <= PAGE_SIZE) - -# define tlb_end_vma(tlb,vma) do { } while (0) - -#else - -# define tlb_end_vma(tlb, vma) \ - do { \ - if (!tlb->fullmm) \ - flush_tlb_range(vma, vma->vm_start, vma->vm_end); \ - } while(0) - -#endif - -#define __tlb_remove_tlb_entry(tlb,pte,addr) do { } while (0) -#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm) - #include #define __pte_free_tlb(tlb, pte, address) pte_free((tlb)->mm, pte)