* [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions @ 2020-06-01 14:47 Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 1/2] arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 Zhenyu Ye 0 siblings, 2 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-06-01 14:47 UTC (permalink / raw) To: catalin.marinas, will, suzuki.poulose, maz, steven.price, guohanjun, olof Cc: linux-arch, yezhenyu2, linux-kernel, xiexiangyou, zhangshaokun, linux-mm, arm, prime.zeng, kuhn.chenqun, linux-arm-kernel ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a range of input addresses. This series add support for this feature. -- ChangeList: v4: combine the __flush_tlb_range() and the __directly into the same function with a single loop for both. v3: rebase this series on Linux 5.7-rc1. v2: Link: https://lkml.org/lkml/2019/11/11/348 Zhenyu Ye (2): arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature arm64: tlb: Use the TLBI RANGE feature in arm64 arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/sysreg.h | 4 ++ arch/arm64/include/asm/tlbflush.h | 108 +++++++++++++++++++++++++++++- arch/arm64/kernel/cpufeature.c | 11 +++ 4 files changed, 124 insertions(+), 2 deletions(-) -- 2.19.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC PATCH v4 1/2] arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature 2020-06-01 14:47 [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions Zhenyu Ye @ 2020-06-01 14:47 ` Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 Zhenyu Ye 1 sibling, 0 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-06-01 14:47 UTC (permalink / raw) To: catalin.marinas, will, suzuki.poulose, maz, steven.price, guohanjun, olof Cc: linux-arch, yezhenyu2, linux-kernel, xiexiangyou, zhangshaokun, linux-mm, arm, prime.zeng, kuhn.chenqun, linux-arm-kernel ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a range of input addresses. This patch detect this feature. Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> --- arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/include/asm/sysreg.h | 4 ++++ arch/arm64/kernel/cpufeature.c | 11 +++++++++++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 8eb5a088ae65..950095a72617 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -61,7 +61,8 @@ #define ARM64_HAS_AMU_EXTN 51 #define ARM64_HAS_ADDRESS_AUTH 52 #define ARM64_HAS_GENERIC_AUTH 53 +#define ARM64_HAS_TLBI_RANGE 54 -#define ARM64_NCAPS 54 +#define ARM64_NCAPS 55 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index c4ac0ac25a00..4cc3efd633bc 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -594,6 +594,7 @@ /* id_aa64isar0 */ #define ID_AA64ISAR0_RNDR_SHIFT 60 +#define ID_AA64ISAR0_TLBI_RANGE_SHIFT 56 #define ID_AA64ISAR0_TS_SHIFT 52 #define ID_AA64ISAR0_FHM_SHIFT 48 #define ID_AA64ISAR0_DP_SHIFT 44 @@ -607,6 +608,9 @@ #define ID_AA64ISAR0_SHA1_SHIFT 8 #define ID_AA64ISAR0_AES_SHIFT 4 +#define ID_AA64ISAR0_TLBI_RANGE_NI 0x0 +#define ID_AA64ISAR0_TLBI_RANGE 0x2 + /* id_aa64isar1 */ #define ID_AA64ISAR1_I8MM_SHIFT 52 #define ID_AA64ISAR1_DGH_SHIFT 48 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9fac745aa7bb..31bcfd0722b5 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -124,6 +124,7 @@ static bool __system_matches_cap(unsigned int n); */ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_RNDR_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_TLBI_RANGE_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_TS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_FHM_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_DP_SHIFT, 4, 0), @@ -1779,6 +1780,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .min_field_value = 1, }, #endif + { + .desc = "TLB range maintenance instruction", + .capability = ARM64_HAS_TLBI_RANGE, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_cpuid_feature, + .sys_reg = SYS_ID_AA64ISAR0_EL1, + .field_pos = ID_AA64ISAR0_TLBI_RANGE_SHIFT, + .sign = FTR_UNSIGNED, + .min_field_value = ID_AA64ISAR0_TLBI_RANGE, + }, {}, }; -- 2.19.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-06-01 14:47 [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 1/2] arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature Zhenyu Ye @ 2020-06-01 14:47 ` Zhenyu Ye 2020-06-02 12:06 ` Zhenyu Ye 2020-07-07 17:36 ` Catalin Marinas 1 sibling, 2 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-06-01 14:47 UTC (permalink / raw) To: catalin.marinas, will, suzuki.poulose, maz, steven.price, guohanjun, olof Cc: linux-arch, yezhenyu2, linux-kernel, xiexiangyou, zhangshaokun, linux-mm, arm, prime.zeng, kuhn.chenqun, linux-arm-kernel Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range(). In this patch, we only use the TLBI RANGE feature if the stride == PAGE_SIZE, because when stride > PAGE_SIZE, usually only a small number of pages need to be flushed and classic tlbi intructions are more effective. We can also use 'end - start < threshold number' to decide which way to go, however, different hardware may have different thresholds, so I'm not sure if this is feasible. Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> --- arch/arm64/include/asm/tlbflush.h | 98 +++++++++++++++++++++++++++---- 1 file changed, 86 insertions(+), 12 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index bc3949064725..818f27c82024 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -50,6 +50,16 @@ __tlbi(op, (arg) | USER_ASID_FLAG); \ } while (0) +#define __tlbi_last_level(op1, op2, arg, last_level) do { \ + if (last_level) { \ + __tlbi(op1, arg); \ + __tlbi_user(op1, arg); \ + } else { \ + __tlbi(op2, arg); \ + __tlbi_user(op2, arg); \ + } \ +} while (0) + /* This macro creates a properly formatted VA operand for the TLBI */ #define __TLBI_VADDR(addr, asid) \ ({ \ @@ -59,6 +69,47 @@ __ta; \ }) +/* + * __TG defines translation granule of the system, which is decided by + * PAGE_SHIFT. Used by TTL. + * - 4KB : 1 + * - 16KB : 2 + * - 64KB : 3 + */ +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) + +/* + * This macro creates a properly formatted VA operand for the TLBI RANGE. + * The value bit assignments are: + * + * +----------+------+-------+-------+-------+----------------------+ + * | ASID | TG | SCALE | NUM | TTL | BADDR | + * +-----------------+-------+-------+-------+----------------------+ + * |63 48|47 46|45 44|43 39|38 37|36 0| + * + * The address range is determined by below formula: + * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE) + * + */ +#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \ + ({ \ + unsigned long __ta = (addr) >> PAGE_SHIFT; \ + __ta &= GENMASK_ULL(36, 0); \ + __ta |= (unsigned long)(ttl) << 37; \ + __ta |= (unsigned long)(num) << 39; \ + __ta |= (unsigned long)(scale) << 44; \ + __ta |= (unsigned long)(__TG) << 46; \ + __ta |= (unsigned long)(asid) << 48; \ + __ta; \ + }) + +/* This macro defines the range pages of the TLBI RANGE. */ +#define __TLBI_RANGE_SIZES(num, scale) ((num + 1) << (5 * scale + 1) << PAGE_SHIFT) + +#define TLB_RANGE_MASK_SHIFT 5 +#define TLB_RANGE_MASK GENMASK_ULL(TLB_RANGE_MASK_SHIFT - 1, 0) + + /* * TLB Invalidation * ================ @@ -181,32 +232,55 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end, unsigned long stride, bool last_level) { + int num = 0; + int scale = 0; unsigned long asid = ASID(vma->vm_mm); unsigned long addr; + unsigned long range_pages; start = round_down(start, stride); end = round_up(end, stride); + range_pages = (end - start) >> PAGE_SHIFT; if ((end - start) >= (MAX_TLBI_OPS * stride)) { flush_tlb_mm(vma->vm_mm); return; } - /* Convert the stride into units of 4k */ - stride >>= 12; + dsb(ishst); - start = __TLBI_VADDR(start, asid); - end = __TLBI_VADDR(end, asid); + /* + * The minimum size of TLB RANGE is 2 pages; + * Use normal TLB instruction to handle odd pages. + * If the stride != PAGE_SIZE, this will never happen. + */ + if (range_pages % 2 == 1) { + addr = __TLBI_VADDR(start, asid); + __tlbi_last_level(vale1is, vae1is, addr, last_level); + start += 1 << PAGE_SHIFT; + range_pages >>= 1; + } - dsb(ishst); - for (addr = start; addr < end; addr += stride) { - if (last_level) { - __tlbi(vale1is, addr); - __tlbi_user(vale1is, addr); - } else { - __tlbi(vae1is, addr); - __tlbi_user(vae1is, addr); + while (range_pages > 0) { + if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && + stride == PAGE_SIZE) { + num = (range_pages & TLB_RANGE_MASK) - 1; + if (num >= 0) { + addr = __TLBI_VADDR_RANGE(start, asid, scale, + num, 0); + __tlbi_last_level(rvale1is, rvae1is, addr, + last_level); + start += __TLBI_RANGE_SIZES(num, scale); + } + scale++; + range_pages >>= TLB_RANGE_MASK_SHIFT; + continue; } + + addr = __TLBI_VADDR(start, asid); + __tlbi_last_level(vale1is, vae1is, addr, last_level); + start += stride; + range_pages -= stride >> 12; } dsb(ish); } -- 2.19.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-06-01 14:47 ` [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 Zhenyu Ye @ 2020-06-02 12:06 ` Zhenyu Ye 2020-07-07 17:36 ` Catalin Marinas 1 sibling, 0 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-06-02 12:06 UTC (permalink / raw) To: catalin.marinas, will, suzuki.poulose, maz, steven.price, guohanjun, olof Cc: linux-arch, linux-kernel, xiexiangyou, zhangshaokun, linux-mm, arm, prime.zeng, kuhn.chenqun, linux-arm-kernel Hi all, Some optimizations to the codes: On 2020/6/1 22:47, Zhenyu Ye wrote: > - start = __TLBI_VADDR(start, asid); > - end = __TLBI_VADDR(end, asid); > + /* > + * The minimum size of TLB RANGE is 2 pages; > + * Use normal TLB instruction to handle odd pages. > + * If the stride != PAGE_SIZE, this will never happen. > + */ > + if (range_pages % 2 == 1) { > + addr = __TLBI_VADDR(start, asid); > + __tlbi_last_level(vale1is, vae1is, addr, last_level); > + start += 1 << PAGE_SHIFT; > + range_pages >>= 1; > + } > We flush a single page here, and below loop does the same thing if cpu not support TLB RANGE feature. So may we use a goto statement to simplify the code. > + while (range_pages > 0) { > + if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && > + stride == PAGE_SIZE) { > + num = (range_pages & TLB_RANGE_MASK) - 1; > + if (num >= 0) { > + addr = __TLBI_VADDR_RANGE(start, asid, scale, > + num, 0); > + __tlbi_last_level(rvale1is, rvae1is, addr, > + last_level); > + start += __TLBI_RANGE_SIZES(num, scale); > + } > + scale++; > + range_pages >>= TLB_RANGE_MASK_SHIFT; > + continue; > } > + > + addr = __TLBI_VADDR(start, asid); > + __tlbi_last_level(vale1is, vae1is, addr, last_level); > + start += stride; > + range_pages -= stride >> 12; > } > dsb(ish); > } > Just like: --8<--- if (range_pages %2 == 1) goto flush_single_tlb; while (range_pages > 0) { if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && stride == PAGE_SIZE) { num = ((range_pages >> 1) & TLB_RANGE_MASK) - 1; if (num >= 0) { addr = __TLBI_VADDR_RANGE(start, asid, scale, num, 0); __tlbi_last_level(rvale1is, rvae1is, addr, last_level); start += __TLBI_RANGE_SIZES(num, scale); } scale++; range_pages >>= TLB_RANGE_MASK_SHIFT; continue; } flush_single_tlb: addr = __TLBI_VADDR(start, asid); __tlbi_last_level(vale1is, vae1is, addr, last_level); start += stride; range_pages -= stride >> PAGE_SHIFT; } --8<--- _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-06-01 14:47 ` [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 Zhenyu Ye 2020-06-02 12:06 ` Zhenyu Ye @ 2020-07-07 17:36 ` Catalin Marinas 2020-07-07 17:43 ` Marc Zyngier 2020-07-08 9:01 ` Zhenyu Ye 1 sibling, 2 replies; 10+ messages in thread From: Catalin Marinas @ 2020-07-07 17:36 UTC (permalink / raw) To: Zhenyu Ye Cc: linux-arch, suzuki.poulose, maz, linux-kernel, xiexiangyou, steven.price, zhangshaokun, linux-mm, arm, prime.zeng, guohanjun, olof, kuhn.chenqun, will, linux-arm-kernel On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote: > @@ -59,6 +69,47 @@ > __ta; \ > }) > > +/* > + * __TG defines translation granule of the system, which is decided by > + * PAGE_SHIFT. Used by TTL. > + * - 4KB : 1 > + * - 16KB : 2 > + * - 64KB : 3 > + */ > +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) Nitpick: maybe something like __TLBI_TG to avoid clashes in case someone else defines a __TG macro. > @@ -181,32 +232,55 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, > unsigned long start, unsigned long end, > unsigned long stride, bool last_level) > { > + int num = 0; > + int scale = 0; > unsigned long asid = ASID(vma->vm_mm); > unsigned long addr; > + unsigned long range_pages; > > start = round_down(start, stride); > end = round_up(end, stride); > + range_pages = (end - start) >> PAGE_SHIFT; > > if ((end - start) >= (MAX_TLBI_OPS * stride)) { > flush_tlb_mm(vma->vm_mm); > return; > } > > - /* Convert the stride into units of 4k */ > - stride >>= 12; > + dsb(ishst); > > - start = __TLBI_VADDR(start, asid); > - end = __TLBI_VADDR(end, asid); > + /* > + * The minimum size of TLB RANGE is 2 pages; > + * Use normal TLB instruction to handle odd pages. > + * If the stride != PAGE_SIZE, this will never happen. > + */ > + if (range_pages % 2 == 1) { > + addr = __TLBI_VADDR(start, asid); > + __tlbi_last_level(vale1is, vae1is, addr, last_level); > + start += 1 << PAGE_SHIFT; > + range_pages >>= 1; > + } Shouldn't this be range_pages-- or -= stride >> 12? Your goto follow-up fixes this, though I'm not a big fan of gotos jumping in the middle of a loop. > - dsb(ishst); > - for (addr = start; addr < end; addr += stride) { > - if (last_level) { > - __tlbi(vale1is, addr); > - __tlbi_user(vale1is, addr); > - } else { > - __tlbi(vae1is, addr); > - __tlbi_user(vae1is, addr); > + while (range_pages > 0) { > + if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && > + stride == PAGE_SIZE) { I think we could have the odd range_pages check here: if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && stride == PAGE_SIZE && range_pages % 2 == 0) { and avoid the one outside the loop. > + num = (range_pages & TLB_RANGE_MASK) - 1; > + if (num >= 0) { > + addr = __TLBI_VADDR_RANGE(start, asid, scale, > + num, 0); > + __tlbi_last_level(rvale1is, rvae1is, addr, > + last_level); > + start += __TLBI_RANGE_SIZES(num, scale); > + } > + scale++; > + range_pages >>= TLB_RANGE_MASK_SHIFT; > + continue; > } > + > + addr = __TLBI_VADDR(start, asid); > + __tlbi_last_level(vale1is, vae1is, addr, last_level); > + start += stride; > + range_pages -= stride >> 12; > } > dsb(ish); > } > -- > 2.19.1 -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-07-07 17:36 ` Catalin Marinas @ 2020-07-07 17:43 ` Marc Zyngier 2020-07-07 17:46 ` Catalin Marinas 2020-07-08 9:01 ` Zhenyu Ye 1 sibling, 1 reply; 10+ messages in thread From: Marc Zyngier @ 2020-07-07 17:43 UTC (permalink / raw) To: Catalin Marinas Cc: linux-arch, suzuki.poulose, Zhenyu Ye, linux-kernel, xiexiangyou, steven.price, zhangshaokun, linux-mm, arm, prime.zeng, guohanjun, olof, kuhn.chenqun, will, linux-arm-kernel On 2020-07-07 18:36, Catalin Marinas wrote: > On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote: >> @@ -59,6 +69,47 @@ >> __ta; \ >> }) >> >> +/* >> + * __TG defines translation granule of the system, which is decided >> by >> + * PAGE_SHIFT. Used by TTL. >> + * - 4KB : 1 >> + * - 16KB : 2 >> + * - 64KB : 3 >> + */ >> +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) > > Nitpick: maybe something like __TLBI_TG to avoid clashes in case > someone > else defines a __TG macro. I have commented on this in the past, and still maintain that this would be better served by a switch statement similar to what is used for TTL already (I don't think this magic formula exists in the ARM ARM). M. -- Jazz is not dead. It just smells funny... _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-07-07 17:43 ` Marc Zyngier @ 2020-07-07 17:46 ` Catalin Marinas 2020-07-07 18:12 ` Robin Murphy 0 siblings, 1 reply; 10+ messages in thread From: Catalin Marinas @ 2020-07-07 17:46 UTC (permalink / raw) To: Marc Zyngier Cc: linux-arch, suzuki.poulose, Zhenyu Ye, linux-kernel, xiexiangyou, steven.price, zhangshaokun, linux-mm, arm, prime.zeng, guohanjun, olof, kuhn.chenqun, will, linux-arm-kernel On Tue, Jul 07, 2020 at 06:43:35PM +0100, Marc Zyngier wrote: > On 2020-07-07 18:36, Catalin Marinas wrote: > > On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote: > > > @@ -59,6 +69,47 @@ > > > __ta; \ > > > }) > > > > > > +/* > > > + * __TG defines translation granule of the system, which is decided > > > by > > > + * PAGE_SHIFT. Used by TTL. > > > + * - 4KB : 1 > > > + * - 16KB : 2 > > > + * - 64KB : 3 > > > + */ > > > +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) > > > > Nitpick: maybe something like __TLBI_TG to avoid clashes in case someone > > else defines a __TG macro. > > I have commented on this in the past, and still maintain that this > would be better served by a switch statement similar to what is used > for TTL already (I don't think this magic formula exists in the > ARM ARM). Good point, it would be cleaner indeed. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-07-07 17:46 ` Catalin Marinas @ 2020-07-07 18:12 ` Robin Murphy 0 siblings, 0 replies; 10+ messages in thread From: Robin Murphy @ 2020-07-07 18:12 UTC (permalink / raw) To: Catalin Marinas, Marc Zyngier Cc: linux-arch, Zhenyu Ye, suzuki.poulose, linux-kernel, xiexiangyou, steven.price, zhangshaokun, linux-mm, arm, prime.zeng, guohanjun, olof, kuhn.chenqun, will, linux-arm-kernel On 2020-07-07 18:46, Catalin Marinas wrote: > On Tue, Jul 07, 2020 at 06:43:35PM +0100, Marc Zyngier wrote: >> On 2020-07-07 18:36, Catalin Marinas wrote: >>> On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote: >>>> @@ -59,6 +69,47 @@ >>>> __ta; \ >>>> }) >>>> >>>> +/* >>>> + * __TG defines translation granule of the system, which is decided >>>> by >>>> + * PAGE_SHIFT. Used by TTL. >>>> + * - 4KB : 1 >>>> + * - 16KB : 2 >>>> + * - 64KB : 3 >>>> + */ >>>> +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) >>> >>> Nitpick: maybe something like __TLBI_TG to avoid clashes in case someone >>> else defines a __TG macro. >> >> I have commented on this in the past, and still maintain that this >> would be better served by a switch statement similar to what is used >> for TTL already (I don't think this magic formula exists in the >> ARM ARM). > > Good point, it would be cleaner indeed. FWIW, we use the somewhat obtuse "(shift - 10) / 2" in the SMMUv3 driver, but that's because we support multiple different granules at runtime and want to generate efficient code. Anything based on PAGE_SHIFT that resolves to a compile-time constant has no excuse for not being written in a clear and obvious manner ;) Robin. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 2020-07-07 17:36 ` Catalin Marinas 2020-07-07 17:43 ` Marc Zyngier @ 2020-07-08 9:01 ` Zhenyu Ye 1 sibling, 0 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-07-08 9:01 UTC (permalink / raw) To: Catalin Marinas Cc: linux-arch, suzuki.poulose, maz, linux-kernel, xiexiangyou, steven.price, zhangshaokun, linux-mm, arm, prime.zeng, guohanjun, olof, kuhn.chenqun, will, linux-arm-kernel Hi Catalin, On 2020/7/8 1:36, Catalin Marinas wrote: > On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote: >> @@ -59,6 +69,47 @@ >> __ta; \ >> }) >> >> +/* >> + * __TG defines translation granule of the system, which is decided by >> + * PAGE_SHIFT. Used by TTL. >> + * - 4KB : 1 >> + * - 16KB : 2 >> + * - 64KB : 3 >> + */ >> +#define __TG ((PAGE_SHIFT - 12) / 2 + 1) > > Nitpick: maybe something like __TLBI_TG to avoid clashes in case someone > else defines a __TG macro. > Thanks for your review. According to Marc and Robin's suggestion, I will remove this macro. I'd like implement this in a function beacause it's used in both TTL and TLB RANGE. >> - for (addr = start; addr < end; addr += stride) { >> - if (last_level) { >> - __tlbi(vale1is, addr); >> - __tlbi_user(vale1is, addr); >> - } else { >> - __tlbi(vae1is, addr); >> - __tlbi_user(vae1is, addr); >> + while (range_pages > 0) { >> + if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && >> + stride == PAGE_SIZE) { > > I think we could have the odd range_pages check here: > > if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) && > stride == PAGE_SIZE && range_pages % 2 == 0) { > > and avoid the one outside the loop. > This may need some other necessary changes to do this. See in next version series. Thanks, Zhenyu _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions @ 2020-04-14 11:28 Zhenyu Ye 0 siblings, 0 replies; 10+ messages in thread From: Zhenyu Ye @ 2020-04-14 11:28 UTC (permalink / raw) To: will, catalin.marinas, suzuki.poulose, maz, steven.price, guohanjun, olof Cc: linux-arch, yezhenyu2, linux-kernel, xiexiangyou, zhangshaokun, linux-mm, arm, prime.zeng, kuhn.chenqun, linux-arm-kernel ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a range of input addresses. This series add support for this feature. -- ChangeList: v3: rebase this series on Linux 5.7-rc1. v2: Link: https://lkml.org/lkml/2019/11/11/348 Zhenyu Ye (2): arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature arm64: tlb: Use the TLBI RANGE feature in arm64 arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/sysreg.h | 4 ++ arch/arm64/include/asm/tlbflush.h | 108 +++++++++++++++++++++++++++++- arch/arm64/kernel/cpufeature.c | 11 +++ 4 files changed, 124 insertions(+), 2 deletions(-) -- 2.19.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-07-08 9:03 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-01 14:47 [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 1/2] arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature Zhenyu Ye 2020-06-01 14:47 ` [RFC PATCH v4 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64 Zhenyu Ye 2020-06-02 12:06 ` Zhenyu Ye 2020-07-07 17:36 ` Catalin Marinas 2020-07-07 17:43 ` Marc Zyngier 2020-07-07 17:46 ` Catalin Marinas 2020-07-07 18:12 ` Robin Murphy 2020-07-08 9:01 ` Zhenyu Ye -- strict thread matches above, loose matches on Subject: below -- 2020-04-14 11:28 [RFC PATCH v3 0/2] arm64: tlb: add support for TLBI RANGE instructions Zhenyu Ye
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).