linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature
@ 2020-03-31 14:29 Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 1/8] arm64: Detect the ARMv8.4 " Zhenyu Ye
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL
feature allows TLB invalidation instructions to be issued with a level
hint, allowing for quicker invalidation.  This series provides support
for this feature.

Patches 1 and 2 come from Marc's NV series[1]; they detect the TTL
feature and add the __tlbi_level interface.
Patch 3 adds the __tlbi_user_level helper.
Patches 4-7 pass struct mmu_gather to flush_tlb_range, so that the level
of a TLB invalidation can be passed down.  Arm64 and power9 can benefit
from this.
Patch 8 sets the TTL field on arm64 by using the cleared_* values in
struct mmu_gather.

See the patches for details.  Thanks.

[1] https://lore.kernel.org/linux-arm-kernel/20200211174938.27809-1-maz@kernel.org/
[2] https://lore.kernel.org/linux-arm-kernel/7859561b-78b4-4a12-2642-3741d7d3e7b8@huawei.com/

--
ChangeList:
v1:
add support for TTL feature in arm64.

v2:
build the patch on Marc's NV series[1].

v3:
use vma->vm_flags to replace mm->context.flags.

v4:
add Marc's patches into my series.

v5:
pass struct mmu_gather to flush_tlb_range, then set the
TTL field by using the information in struct mmu_gather.


Marc Zyngier (2):
  arm64: Detect the ARMv8.4 TTL feature
  arm64: Add level-hinted TLB invalidation helper

Zhenyu Ye (6):
  arm64: Add tlbi_user_level TLB invalidation helper
  mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  mm: tlb: Pass struct mmu_gather to flush_pud_tlb_range
  mm: tlb: Pass struct mmu_gather to flush_hugetlb_tlb_range
  mm: tlb: Pass struct mmu_gather to flush_tlb_range
  arm64: tlb: Set the TTL field in flush_tlb_range

 Documentation/core-api/cachetlb.rst           |  8 ++-
 arch/alpha/include/asm/tlbflush.h             |  8 +--
 arch/alpha/kernel/smp.c                       |  3 +-
 arch/arc/include/asm/hugepage.h               |  4 +-
 arch/arc/include/asm/tlbflush.h               | 11 ++--
 arch/arc/mm/tlb.c                             |  8 +--
 arch/arm/include/asm/tlbflush.h               | 12 ++--
 arch/arm/kernel/smp_tlb.c                     |  4 +-
 arch/arm/mach-rpc/ecard.c                     |  8 ++-
 arch/arm64/crypto/aes-glue.c                  |  1 -
 arch/arm64/include/asm/cpucaps.h              |  3 +-
 arch/arm64/include/asm/sysreg.h               |  1 +
 arch/arm64/include/asm/tlb.h                  | 39 +++++++++++-
 arch/arm64/include/asm/tlbflush.h             | 63 +++++++++++++------
 arch/arm64/kernel/cpufeature.c                | 11 ++++
 arch/arm64/mm/hugetlbpage.c                   | 10 ++-
 arch/csky/include/asm/tlb.h                   |  2 +-
 arch/csky/include/asm/tlbflush.h              |  6 +-
 arch/csky/mm/tlb.c                            |  4 +-
 arch/hexagon/include/asm/tlbflush.h           |  2 +-
 arch/hexagon/mm/vm_tlb.c                      |  4 +-
 arch/ia64/include/asm/tlbflush.h              |  6 +-
 arch/ia64/mm/tlb.c                            |  5 +-
 arch/m68k/include/asm/tlbflush.h              | 10 +--
 arch/microblaze/include/asm/tlbflush.h        |  5 +-
 arch/mips/include/asm/hugetlb.h               |  6 +-
 arch/mips/include/asm/tlbflush.h              |  9 +--
 arch/mips/kernel/smp.c                        |  3 +-
 arch/nds32/include/asm/tlbflush.h             |  3 +-
 arch/nios2/include/asm/tlbflush.h             |  9 +--
 arch/nios2/mm/tlb.c                           |  8 ++-
 arch/openrisc/include/asm/tlbflush.h          | 10 +--
 arch/openrisc/kernel/smp.c                    |  2 +-
 arch/parisc/include/asm/tlbflush.h            |  2 +-
 arch/parisc/kernel/cache.c                    | 13 +++-
 arch/powerpc/include/asm/book3s/32/tlbflush.h |  4 +-
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  9 ++-
 arch/powerpc/include/asm/nohash/tlbflush.h    |  7 ++-
 arch/powerpc/mm/book3s32/tlb.c                |  6 +-
 arch/powerpc/mm/book3s64/pgtable.c            |  8 ++-
 arch/powerpc/mm/book3s64/radix_tlb.c          |  2 +-
 arch/powerpc/mm/nohash/tlb.c                  |  6 +-
 arch/riscv/include/asm/tlbflush.h             |  7 ++-
 arch/riscv/mm/tlbflush.c                      |  4 +-
 arch/s390/include/asm/tlbflush.h              |  5 +-
 arch/sh/include/asm/tlbflush.h                |  8 +--
 arch/sh/kernel/smp.c                          |  2 +-
 arch/sparc/include/asm/tlbflush_32.h          |  2 +-
 arch/sparc/include/asm/tlbflush_64.h          |  3 +-
 arch/sparc/mm/tlb.c                           |  5 +-
 arch/um/include/asm/tlbflush.h                |  6 +-
 arch/um/kernel/tlb.c                          |  4 +-
 arch/unicore32/include/asm/tlbflush.h         |  5 +-
 arch/x86/include/asm/tlbflush.h               |  4 +-
 arch/x86/mm/pgtable.c                         | 10 ++-
 arch/xtensa/include/asm/tlbflush.h            | 10 +--
 arch/xtensa/kernel/smp.c                      |  2 +-
 include/asm-generic/pgtable.h                 | 10 +--
 include/asm-generic/tlb.h                     |  2 +-
 mm/huge_memory.c                              | 19 +++++-
 mm/hugetlb.c                                  | 17 +++--
 mm/mapping_dirty_helpers.c                    | 23 ++++---
 mm/migrate.c                                  |  8 ++-
 mm/mprotect.c                                 |  8 ++-
 mm/mremap.c                                   | 17 ++++-
 mm/pgtable-generic.c                          | 51 ++++++++++++---
 mm/rmap.c                                     |  6 +-
 67 files changed, 409 insertions(+), 174 deletions(-)

-- 
2.19.1




* [RFC PATCH v5 1/8] arm64: Detect the ARMv8.4 TTL feature
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 2/8] arm64: Add level-hinted TLB invalidation helper Zhenyu Ye
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

From: Marc Zyngier <maz@kernel.org>

In order to reduce the cost of TLB invalidation, the ARMv8.4 TTL
feature allows TLB invalidation instructions to be issued with a level
hint, allowing for quicker invalidation.

Let's detect the feature for now. Further patches will implement
its actual usage.
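
As a rough illustration of what the detection amounts to, the check can
be sketched in plain C as reading a 4-bit unsigned field at bit 48 of
ID_AA64MMFR2_EL1 and comparing it against a minimum value of 1.  The
helpers id_field() and has_ttl() are hypothetical names for this sketch;
the kernel does this through has_cpuid_feature and the capability table
entry, not through this code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift of the TTL field in ID_AA64MMFR2_EL1, as defined by this patch. */
#define ID_AA64MMFR2_TTL_SHIFT	48

/* Hypothetical helper: extract an unsigned 4-bit ID register field. */
static inline unsigned int id_field(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Hypothetical helper: TTL is usable when the field value is at least 1. */
static inline bool has_ttl(uint64_t id_aa64mmfr2)
{
	return id_field(id_aa64mmfr2, ID_AA64MMFR2_TTL_SHIFT) >= 1;
}
```

The register value is passed in as a plain integer here, so the sketch
says nothing about how the kernel sanitises the value across CPUs.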

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  1 +
 arch/arm64/kernel/cpufeature.c   | 11 +++++++++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 865e0253fc1e..8b3b4dd612b3 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -58,7 +58,8 @@
 #define ARM64_WORKAROUND_SPECULATIVE_AT_NVHE	48
 #define ARM64_HAS_E0PD				49
 #define ARM64_HAS_RNG				50
+#define ARM64_HAS_ARMv8_4_TTL			51
 
-#define ARM64_NCAPS				51
+#define ARM64_NCAPS				52
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index b91570ff9db1..a28b76f32ba7 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -685,6 +685,7 @@
 
 /* id_aa64mmfr2 */
 #define ID_AA64MMFR2_E0PD_SHIFT		60
+#define ID_AA64MMFR2_TTL_SHIFT		48
 #define ID_AA64MMFR2_FWB_SHIFT		40
 #define ID_AA64MMFR2_AT_SHIFT		32
 #define ID_AA64MMFR2_LVA_SHIFT		16
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0b6715625cf6..cbe46ad2900a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -241,6 +241,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_E0PD_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_TTL_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_FWB_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_AT_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LVA_SHIFT, 4, 0),
@@ -1523,6 +1524,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		.cpu_enable = cpu_has_fwb,
 	},
+	{
+		.desc = "ARMv8.4 Translation Table Level",
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HAS_ARMv8_4_TTL,
+		.sys_reg = SYS_ID_AA64MMFR2_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR2_TTL_SHIFT,
+		.min_field_value = 1,
+		.matches = has_cpuid_feature,
+	},
 #ifdef CONFIG_ARM64_HW_AFDBM
 	{
 		/*
-- 
2.19.1




* [RFC PATCH v5 2/8] arm64: Add level-hinted TLB invalidation helper
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 1/8] arm64: Detect the ARMv8.4 " Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 3/8] arm64: Add tlbi_user_level " Zhenyu Ye
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

From: Marc Zyngier <maz@kernel.org>

Add a level-hinted TLB invalidation helper that only gets used if
ARMv8.4-TTL gets detected.
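
The operand encoding done by the helper can be sketched in standalone C
as follows.  This is only an illustration under stated assumptions: the
function name tlbi_encode_ttl() and its granule argument are
hypothetical, the feature is assumed to be present, and the granule
codes (1 = 4K, 2 = 16K, 3 = 64K) are taken from the switch statement in
the macro below.

```c
#include <assert.h>
#include <stdint.h>

/* TTL occupies bits [47:44] of the TLBI operand, as in TLBI_TTL_MASK. */
#define TLBI_TTL_SHIFT	44
#define TLBI_TTL_MASK	(UINT64_C(0xf) << TLBI_TTL_SHIFT)

/*
 * Hypothetical helper: return the operand with the TTL field filled in.
 * A level of 0 means "no hint" and leaves the operand untouched.
 */
static inline uint64_t tlbi_encode_ttl(uint64_t arg, unsigned int granule,
				       unsigned int level)
{
	uint64_t ttl;

	assert(granule >= 1 && granule <= 3);
	assert(level <= 3);

	if (level == 0)
		return arg;

	/* Upper two bits: granule code; lower two bits: table level. */
	ttl = ((uint64_t)granule << 2) | level;

	arg &= ~TLBI_TTL_MASK;
	arg |= ttl << TLBI_TTL_SHIFT;
	return arg;
}
```

For example, a 4K granule with a level 3 leaf entry gives a TTL value of
0b0111 in bits [47:44].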

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/arm64/include/asm/tlbflush.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc3949064725..5f9f189bc6d2 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -10,6 +10,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/bitfield.h>
 #include <linux/mm_types.h>
 #include <linux/sched.h>
 #include <asm/cputype.h>
@@ -59,6 +60,35 @@
 		__ta;						\
 	})
 
+#define TLBI_TTL_MASK	GENMASK_ULL(47, 44)
+
+#define __tlbi_level(op, addr, level)					\
+	do {								\
+		u64 arg = addr;						\
+									\
+		if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&	\
+		    level) {						\
+			u64 ttl = level;				\
+									\
+			switch (PAGE_SIZE) {				\
+			case SZ_4K:					\
+				ttl |= 1 << 2;				\
+				break;					\
+			case SZ_16K:					\
+				ttl |= 2 << 2;				\
+				break;					\
+			case SZ_64K:					\
+				ttl |= 3 << 2;				\
+				break;					\
+			}						\
+									\
+			arg &= ~TLBI_TTL_MASK;				\
+			arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);		\
+		}							\
+									\
+		__tlbi(op,  arg);					\
+	} while (0)
+
 /*
  *	TLB Invalidation
  *	================
-- 
2.19.1




* [RFC PATCH v5 3/8] arm64: Add tlbi_user_level TLB invalidation helper
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 1/8] arm64: Detect the ARMv8.4 " Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 2/8] arm64: Add level-hinted TLB invalidation helper Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range Zhenyu Ye
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Add __tlbi_user_level, a variant of __tlbi_user that takes a level
hint.  The hint only gets used if ARMv8.4-TTL gets detected.

ARMv8.4-TTL provides the TTL field in the TLBI instruction operand to
indicate the level of the translation table walk holding the leaf entry
for the address that is being invalidated.

This patch passes a level of 0 at every call site, so no hint is issued
yet.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/arm64/include/asm/tlbflush.h | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 5f9f189bc6d2..892f33235dc7 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -89,6 +89,12 @@
 		__tlbi(op,  arg);					\
 	} while (0)
 
+#define __tlbi_user_level(op, arg, level) do {				\
+	if (arm64_kernel_unmapped_at_el0())				\
+		__tlbi_level(op, (arg | USER_ASID_FLAG), level);	\
+} while (0)
+
+
 /*
  *	TLB Invalidation
  *	================
@@ -190,8 +196,8 @@ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
 	unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
 
 	dsb(ishst);
-	__tlbi(vale1is, addr);
-	__tlbi_user(vale1is, addr);
+	__tlbi_level(vale1is, addr, 0);
+	__tlbi_user_level(vale1is, addr, 0);
 }
 
 static inline void flush_tlb_page(struct vm_area_struct *vma,
@@ -231,11 +237,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ishst);
 	for (addr = start; addr < end; addr += stride) {
 		if (last_level) {
-			__tlbi(vale1is, addr);
-			__tlbi_user(vale1is, addr);
+			__tlbi_level(vale1is, addr, 0);
+			__tlbi_user_level(vale1is, addr, 0);
 		} else {
-			__tlbi(vae1is, addr);
-			__tlbi_user(vae1is, addr);
+			__tlbi_level(vae1is, addr, 0);
+			__tlbi_user_level(vae1is, addr, 0);
 		}
 	}
 	dsb(ish);
-- 
2.19.1




* [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
                   ` (2 preceding siblings ...)
  2020-03-31 14:29 ` [RFC PATCH v5 3/8] arm64: Add tlbi_user_level " Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 15:13   ` Peter Zijlstra
  2020-03-31 14:29 ` [RFC PATCH v5 5/8] mm: tlb: Pass struct mmu_gather to flush_pud_tlb_range Zhenyu Ye
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Preparation for passing struct mmu_gather to flush_tlb_range; see the
later patches in this series.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/arc/include/asm/hugepage.h               |  4 +--
 arch/arc/include/asm/tlbflush.h               |  5 +--
 arch/arc/mm/tlb.c                             |  4 +--
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  3 +-
 arch/powerpc/mm/book3s64/pgtable.c            |  8 ++++-
 include/asm-generic/pgtable.h                 |  4 +--
 mm/pgtable-generic.c                          | 35 ++++++++++++++++---
 7 files changed, 48 insertions(+), 15 deletions(-)

diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index 30ac40fed2c5..c2b325dd47f2 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -67,8 +67,8 @@ extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
 
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
-extern void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
-				unsigned long end);
+extern void flush_pmd_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+				unsigned long start, unsigned long end);
 
 /* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine.*/
 #define pmdp_establish generic_pmdp_establish
diff --git a/arch/arc/include/asm/tlbflush.h b/arch/arc/include/asm/tlbflush.h
index 992a2837a53f..49e4e5b59bb2 100644
--- a/arch/arc/include/asm/tlbflush.h
+++ b/arch/arc/include/asm/tlbflush.h
@@ -26,7 +26,7 @@ void local_flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define flush_pmd_tlb_range(vma, s, e)	local_flush_pmd_tlb_range(vma, s, e)
+#define flush_pmd_tlb_range(tlb, vma, s, e)	local_flush_pmd_tlb_range(vma, s, e)
 #endif
 #else
 extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
@@ -36,7 +36,8 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-extern void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_pmd_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+				unsigned long start, unsigned long end);
 #endif
 #endif /* CONFIG_SMP */
 #endif
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index c340acd989a0..10b2a2373dc0 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -464,8 +464,8 @@ void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			 unsigned long end)
+void flush_pmd_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			 unsigned long start, unsigned long end)
 {
 	struct tlb_args ta = {
 		.ta_vma = vma,
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index dcb5c3839d2f..6445d179ac15 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -47,7 +47,8 @@ static inline void tlbiel_all_lpid(bool radix)
 
 
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
-static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
+static inline void flush_pmd_tlb_range(struct mmu_gather *tlb,
+				       struct vm_area_struct *vma,
 				       unsigned long start, unsigned long end)
 {
 	if (radix_enabled())
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 2bf7e1b4fd82..0a9c7ad7ee81 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -106,9 +106,15 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		     pmd_t *pmdp)
 {
 	unsigned long old_pmd;
+	struct mmu_gather tlb;
+	unsigned long tlb_start = address;
+	unsigned long tlb_end = address + HPAGE_PMD_SIZE;
 
 	old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID);
-	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+	tlb.cleared_pmds = 1;
+	flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	/*
 	 * This ensures that generic code that rely on IRQ disabling
 	 * to prevent a parallel THP split work as expected.
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index e2e2bef07dd2..32d4661e5a56 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1160,10 +1160,10 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
  * invalidate the entire TLB which is not desitable.
  * e.g. see arch/arc: flush_pmd_tlb_range
  */
-#define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
+#define flush_pmd_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
 #define flush_pud_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
 #else
-#define flush_pmd_tlb_range(vma, addr, end)	BUILD_BUG()
+#define flush_pmd_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
 #define flush_pud_tlb_range(vma, addr, end)	BUILD_BUG()
 #endif
 #endif
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 3d7c01e76efc..96c9cf77bfb5 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -109,8 +109,14 @@ int pmdp_set_access_flags(struct vm_area_struct *vma,
 	int changed = !pmd_same(*pmdp, entry);
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	if (changed) {
+		struct mmu_gather tlb;
+		unsigned long tlb_start = address;
+		unsigned long tlb_end = address + HPAGE_PMD_SIZE;
 		set_pmd_at(vma->vm_mm, address, pmdp, entry);
-		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+		tlb.cleared_pmds = 1;
+		flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	}
 	return changed;
 }
@@ -123,8 +129,15 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma,
 	int young;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
-	if (young)
-		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	if (young) {
+		struct mmu_gather tlb;
+		unsigned long tlb_start = address;
+		unsigned long tlb_end = address + HPAGE_PMD_SIZE;
+		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+		tlb.cleared_pmds = 1;
+		flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
+	}
 	return young;
 }
 #endif
@@ -134,11 +147,17 @@ pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address,
 			    pmd_t *pmdp)
 {
 	pmd_t pmd;
+	struct mmu_gather tlb;
+	unsigned long tlb_start = address;
+	unsigned long tlb_end = address + HPAGE_PMD_SIZE;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	VM_BUG_ON((pmd_present(*pmdp) && !pmd_trans_huge(*pmdp) &&
 			   !pmd_devmap(*pmdp)) || !pmd_present(*pmdp));
 	pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
-	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+	tlb.cleared_pmds = 1;
+	flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	return pmd;
 }
 
@@ -195,7 +214,13 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		     pmd_t *pmdp)
 {
 	pmd_t old = pmdp_establish(vma, address, pmdp, pmd_mknotpresent(*pmdp));
-	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	struct mmu_gather tlb;
+	unsigned long tlb_start = address;
+	unsigned long tlb_end = address + HPAGE_PMD_SIZE;
+	tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+	tlb.cleared_pmds = 1;
+	flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	return old;
 }
 #endif
-- 
2.19.1




* [RFC PATCH v5 5/8] mm: tlb: Pass struct mmu_gather to flush_pud_tlb_range
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
                   ` (3 preceding siblings ...)
  2020-03-31 14:29 ` [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 6/8] mm: tlb: Pass struct mmu_gather to flush_hugetlb_tlb_range Zhenyu Ye
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Preparation for passing struct mmu_gather to flush_tlb_range; see the
later patches in this series.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 include/asm-generic/pgtable.h | 4 ++--
 mm/pgtable-generic.c          | 8 +++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 32d4661e5a56..1c67a744877e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1161,10 +1161,10 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
  * e.g. see arch/arc: flush_pmd_tlb_range
  */
 #define flush_pmd_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
-#define flush_pud_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
+#define flush_pud_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
 #else
 #define flush_pmd_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
-#define flush_pud_tlb_range(vma, addr, end)	BUILD_BUG()
+#define flush_pud_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
 #endif
 #endif
 
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 96c9cf77bfb5..9ab9d8f698ea 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -166,11 +166,17 @@ pud_t pudp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address,
 			    pud_t *pudp)
 {
 	pud_t pud;
+	struct mmu_gather tlb;
+	unsigned long tlb_start = address;
+	unsigned long tlb_end = address + HPAGE_PUD_SIZE;
 
 	VM_BUG_ON(address & ~HPAGE_PUD_MASK);
 	VM_BUG_ON(!pud_trans_huge(*pudp) && !pud_devmap(*pudp));
 	pud = pudp_huge_get_and_clear(vma->vm_mm, address, pudp);
-	flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE);
+	tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+	tlb.cleared_puds = 1;
+	flush_pud_tlb_range(&tlb, vma, tlb_start, tlb_end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	return pud;
 }
 #endif
-- 
2.19.1




* [RFC PATCH v5 6/8] mm: tlb: Pass struct mmu_gather to flush_hugetlb_tlb_range
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
                   ` (4 preceding siblings ...)
  2020-03-31 14:29 ` [RFC PATCH v5 5/8] mm: tlb: Pass struct mmu_gather to flush_pud_tlb_range Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 7/8] mm: tlb: Pass struct mmu_gather to flush_tlb_range Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 8/8] arm64: tlb: Set the TTL field in flush_tlb_range Zhenyu Ye
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Preparation for passing struct mmu_gather to flush_tlb_range; see the
later patches in this series.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  3 ++-
 mm/hugetlb.c                                  | 17 ++++++++++++-----
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 6445d179ac15..968f10ef3d51 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -57,7 +57,8 @@ static inline void flush_pmd_tlb_range(struct mmu_gather *tlb,
 }
 
 #define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
-static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+static inline void flush_hugetlb_tlb_range(struct mmu_gather *tlb,
+					   struct vm_area_struct *vma,
 					   unsigned long start,
 					   unsigned long end)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dd8737a94bec..f913ce0b4831 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4441,7 +4441,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
  * ARCHes with special requirements for evicting HUGETLB backing TLB entries can
  * implement this.
  */
-#define flush_hugetlb_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
+#define flush_hugetlb_tlb_range(tlb, vma, addr, end)	\
+	flush_tlb_range(vma, addr, end)
 #endif
 
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
@@ -4455,6 +4456,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	unsigned long pages = 0;
 	bool shared_pmd = false;
 	struct mmu_notifier_range range;
+	struct mmu_gather tlb;
 
 	/*
 	 * In the case of shared PMDs, the area to flush could be beyond
@@ -4520,10 +4522,15 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * and that page table be reused and filled with junk.  If we actually
 	 * did unshare a page of pmds, flush the range corresponding to the pud.
 	 */
-	if (shared_pmd)
-		flush_hugetlb_tlb_range(vma, range.start, range.end);
-	else
-		flush_hugetlb_tlb_range(vma, start, end);
+	if (shared_pmd) {
+		tlb_gather_mmu(&tlb, mm, range.start, range.end);
+		flush_hugetlb_tlb_range(&tlb, vma, range.start, range.end);
+		tlb_finish_mmu(&tlb, range.start, range.end);
+	} else {
+		tlb_gather_mmu(&tlb, mm, start, end);
+		flush_hugetlb_tlb_range(&tlb, vma, start, end);
+		tlb_finish_mmu(&tlb, start, end);
+	}
 	/*
 	 * No need to call mmu_notifier_invalidate_range() we are downgrading
 	 * page table protection not changing it to point to a new page.
-- 
2.19.1




* [RFC PATCH v5 7/8] mm: tlb: Pass struct mmu_gather to flush_tlb_range
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
                   ` (5 preceding siblings ...)
  2020-03-31 14:29 ` [RFC PATCH v5 6/8] mm: tlb: Pass struct mmu_gather to flush_hugetlb_tlb_range Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  2020-03-31 14:29 ` [RFC PATCH v5 8/8] arm64: tlb: Set the TTL field in flush_tlb_range Zhenyu Ye
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

A few new fields were added to mmu_gather to make TLB flushing smarter
for huge pages by recording which level of the page table was changed.
However, this information is currently not available in
flush_tlb_range().

This patch passes struct mmu_gather to flush_tlb_range interface, and
arm64 and power9 can all benefit from this.
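
To illustrate how an architecture might turn that information into a
level hint, here is a minimal standalone sketch.  Everything in it is an
assumption for illustration: struct gather_levels is a stand-in for the
cleared_* bits of struct mmu_gather, level_hint() is a hypothetical
name, and the mapping (PTE = 3, PMD = 2, PUD = 1, otherwise 0 for "no
hint") follows the arm64 translation table level numbering rather than
anything defined in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the cleared_* bits of struct mmu_gather (assumption). */
struct gather_levels {
	unsigned int cleared_ptes : 1;
	unsigned int cleared_pmds : 1;
	unsigned int cleared_puds : 1;
	unsigned int cleared_p4ds : 1;
};

/*
 * Hypothetical helper: return a level hint only when exactly one level
 * was cleared; return 0 (no hint) when the levels are mixed or unknown.
 */
static inline int level_hint(const struct gather_levels *g)
{
	assert(g != NULL);

	if (g->cleared_ptes &&
	    !(g->cleared_pmds || g->cleared_puds || g->cleared_p4ds))
		return 3;
	if (g->cleared_pmds &&
	    !(g->cleared_ptes || g->cleared_puds || g->cleared_p4ds))
		return 2;
	if (g->cleared_puds &&
	    !(g->cleared_ptes || g->cleared_pmds || g->cleared_p4ds))
		return 1;
	return 0;
}
```

Falling back to 0 whenever more than one level was touched keeps the
sketch conservative, since a wrong hint would be worse than no hint.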

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 Documentation/core-api/cachetlb.rst           |  8 +++++--
 arch/alpha/include/asm/tlbflush.h             |  8 +++----
 arch/alpha/kernel/smp.c                       |  3 ++-
 arch/arc/include/asm/tlbflush.h               |  6 ++---
 arch/arc/mm/tlb.c                             |  4 ++--
 arch/arm/include/asm/tlbflush.h               | 12 ++++++----
 arch/arm/kernel/smp_tlb.c                     |  4 ++--
 arch/arm/mach-rpc/ecard.c                     |  8 +++++--
 arch/arm64/crypto/aes-glue.c                  |  1 -
 arch/arm64/include/asm/tlbflush.h             |  5 ++--
 arch/arm64/mm/hugetlbpage.c                   | 10 ++++++--
 arch/csky/include/asm/tlb.h                   |  2 +-
 arch/csky/include/asm/tlbflush.h              |  6 ++---
 arch/csky/mm/tlb.c                            |  4 ++--
 arch/hexagon/include/asm/tlbflush.h           |  2 +-
 arch/hexagon/mm/vm_tlb.c                      |  4 ++--
 arch/ia64/include/asm/tlbflush.h              |  6 +++--
 arch/ia64/mm/tlb.c                            |  5 +++-
 arch/m68k/include/asm/tlbflush.h              | 10 ++++----
 arch/microblaze/include/asm/tlbflush.h        |  5 ++--
 arch/mips/include/asm/hugetlb.h               |  6 ++++-
 arch/mips/include/asm/tlbflush.h              |  9 ++++----
 arch/mips/kernel/smp.c                        |  3 ++-
 arch/nds32/include/asm/tlbflush.h             |  3 ++-
 arch/nios2/include/asm/tlbflush.h             |  9 ++++----
 arch/nios2/mm/tlb.c                           |  8 +++++--
 arch/openrisc/include/asm/tlbflush.h          | 10 ++++----
 arch/openrisc/kernel/smp.c                    |  2 +-
 arch/parisc/include/asm/tlbflush.h            |  2 +-
 arch/parisc/kernel/cache.c                    | 13 ++++++++---
 arch/powerpc/include/asm/book3s/32/tlbflush.h |  4 ++--
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  3 ++-
 arch/powerpc/include/asm/nohash/tlbflush.h    |  7 +++---
 arch/powerpc/mm/book3s32/tlb.c                |  6 ++---
 arch/powerpc/mm/book3s64/radix_tlb.c          |  2 +-
 arch/powerpc/mm/nohash/tlb.c                  |  6 ++---
 arch/riscv/include/asm/tlbflush.h             |  7 +++---
 arch/riscv/mm/tlbflush.c                      |  4 ++--
 arch/s390/include/asm/tlbflush.h              |  5 ++--
 arch/sh/include/asm/tlbflush.h                |  8 +++----
 arch/sh/kernel/smp.c                          |  2 +-
 arch/sparc/include/asm/tlbflush_32.h          |  2 +-
 arch/sparc/include/asm/tlbflush_64.h          |  3 ++-
 arch/sparc/mm/tlb.c                           |  5 +++-
 arch/um/include/asm/tlbflush.h                |  6 ++---
 arch/um/kernel/tlb.c                          |  4 ++--
 arch/unicore32/include/asm/tlbflush.h         |  5 ++--
 arch/x86/include/asm/tlbflush.h               |  4 ++--
 arch/x86/mm/pgtable.c                         | 10 ++++++--
 arch/xtensa/include/asm/tlbflush.h            | 10 ++++----
 arch/xtensa/kernel/smp.c                      |  2 +-
 include/asm-generic/pgtable.h                 |  6 +++--
 include/asm-generic/tlb.h                     |  2 +-
 mm/huge_memory.c                              | 19 ++++++++++++---
 mm/hugetlb.c                                  |  2 +-
 mm/mapping_dirty_helpers.c                    | 23 +++++++++++++------
 mm/migrate.c                                  |  8 +++++--
 mm/mprotect.c                                 |  8 ++++---
 mm/mremap.c                                   | 17 +++++++++++---
 mm/pgtable-generic.c                          |  8 ++++++-
 mm/rmap.c                                     |  6 ++++-
 61 files changed, 247 insertions(+), 135 deletions(-)

diff --git a/Documentation/core-api/cachetlb.rst b/Documentation/core-api/cachetlb.rst
index 93cb65d52720..05f9522dca17 100644
--- a/Documentation/core-api/cachetlb.rst
+++ b/Documentation/core-api/cachetlb.rst
@@ -50,7 +50,7 @@ changes occur:
 	page table operations such as what happens during
 	fork, and exec.
 
-3) ``void flush_tlb_range(struct vm_area_struct *vma,
+3) ``void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
    unsigned long start, unsigned long end)``
 
 	Here we are flushing a specific range of (user) virtual
@@ -61,6 +61,10 @@ changes occur:
 	running, there will be no entries in the TLB for 'mm' for
 	virtual addresses in the range 'start' to 'end-1'.
 
+	The "tlb" argument carries any data that arch-specific code
+	needs in flush_tlb_range(). For example, it can pass the
+	page-table level used as a hint by TLBI instructions.
+
 	The "vma" is the backing store being used for the region.
 	Primarily, this is used for munmap() type operations.
 
@@ -111,7 +115,7 @@ the sequence will be in one of the following forms::
 
 	2) flush_cache_range(vma, start, end);
 	   change_range_of_page_tables(mm, start, end);
-	   flush_tlb_range(vma, start, end);
+	   flush_tlb_range(tlb, vma, start, end);
 
 	3) flush_cache_page(vma, addr, pfn);
 	   set_pte(pte_pointer, new_pte_val);
diff --git a/arch/alpha/include/asm/tlbflush.h b/arch/alpha/include/asm/tlbflush.h
index f8b492408f51..7512d817acee 100644
--- a/arch/alpha/include/asm/tlbflush.h
+++ b/arch/alpha/include/asm/tlbflush.h
@@ -128,8 +128,8 @@ flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
 /* Flush a specified range of user mapping.  On the Alpha we flush
    the whole user tlb.  */
 static inline void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		unsigned long end)
+flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
 {
 	flush_tlb_mm(vma->vm_mm);
 }
@@ -139,8 +139,8 @@ flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void flush_tlb_range(struct vm_area_struct *, unsigned long,
-			    unsigned long);
+extern void flush_tlb_range(struct mmu_gather *, struct vm_area_struct *,
+			    unsigned long, unsigned long);
 
 #endif /* CONFIG_SMP */
 
diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 5f90df30be20..e57c5f5007e0 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -722,7 +722,8 @@ flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
 EXPORT_SYMBOL(flush_tlb_page);
 
 void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
 {
 	/* On the Alpha we always flush the whole user tlb.  */
 	flush_tlb_mm(vma->vm_mm);
diff --git a/arch/arc/include/asm/tlbflush.h b/arch/arc/include/asm/tlbflush.h
index 49e4e5b59bb2..92f336840baf 100644
--- a/arch/arc/include/asm/tlbflush.h
+++ b/arch/arc/include/asm/tlbflush.h
@@ -20,7 +20,7 @@ void local_flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 #endif
 
 #ifndef CONFIG_SMP
-#define flush_tlb_range(vma, s, e)	local_flush_tlb_range(vma, s, e)
+#define flush_tlb_range(tlb, vma, s, e)	local_flush_tlb_range(vma, s, e)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_kernel_range(s, e)	local_flush_tlb_kernel_range(s, e)
 #define flush_tlb_all()			local_flush_tlb_all()
@@ -29,8 +29,8 @@ void local_flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 #define flush_pmd_tlb_range(tlb, vma, s, e)	local_flush_pmd_tlb_range(vma, s, e)
 #endif
 #else
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-							 unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_tlb_all(void);
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index 10b2a2373dc0..2f85c5a19a40 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -451,8 +451,8 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
 	on_each_cpu_mask(mm_cpumask(vma->vm_mm), ipi_flush_tlb_page, &ta, 1);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	struct tlb_args ta = {
 		.ta_vma = vma,
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 24cbfc112dfa..cf52a76f97c1 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -253,10 +253,11 @@ extern struct cpu_tlb_fns cpu_tlb;
  *		space.
  *		- mm	- mm_struct describing address space
  *
- *	flush_tlb_range(mm,start,end)
+ *	flush_tlb_range(tlb, mm, start, end)
  *
  *		Invalidate a range of TLB entries in the specified
  *		address space.
+ *		- tlb	- mmu_gather contains any data needed by tlbi interface
  *		- mm	- mm_struct describing address space
  *		- start - start address (may not be aligned)
  *		- end	- end address (exclusive, may not be aligned)
@@ -609,7 +610,8 @@ static inline void clean_pmd_entry(void *pmd)
 #define flush_tlb_mm		local_flush_tlb_mm
 #define flush_tlb_page		local_flush_tlb_page
 #define flush_tlb_kernel_page	local_flush_tlb_kernel_page
-#define flush_tlb_range		local_flush_tlb_range
+#define flush_tlb_range(tlb, vma, start, end)	\
+	local_flush_tlb_range(vma, start, end)
 #define flush_tlb_kernel_range	local_flush_tlb_kernel_range
 #define flush_bp_all		local_flush_bp_all
 #else
@@ -617,7 +619,8 @@ extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr);
 extern void flush_tlb_kernel_page(unsigned long kaddr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_bp_all(void);
 #endif
@@ -657,7 +660,8 @@ extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr);
 extern void flush_tlb_kernel_page(unsigned long kaddr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_bp_all(void);
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/arm/kernel/smp_tlb.c b/arch/arm/kernel/smp_tlb.c
index d4908b3736d8..7a0437dd3b64 100644
--- a/arch/arm/kernel/smp_tlb.c
+++ b/arch/arm/kernel/smp_tlb.c
@@ -217,8 +217,8 @@ void flush_tlb_kernel_page(unsigned long kaddr)
 	broadcast_tlb_a15_erratum();
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
-                     unsigned long start, unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
diff --git a/arch/arm/mach-rpc/ecard.c b/arch/arm/mach-rpc/ecard.c
index 75cfad2cb143..4bad2bdb1755 100644
--- a/arch/arm/mach-rpc/ecard.c
+++ b/arch/arm/mach-rpc/ecard.c
@@ -49,6 +49,7 @@
 #include <asm/irq.h>
 #include <asm/mmu_context.h>
 #include <asm/mach/irq.h>
+#include <asm/tlb.h>
 #include <asm/tlbflush.h>
 
 #include "ecard.h"
@@ -214,6 +215,7 @@ static DEFINE_MUTEX(ecard_mutex);
 static void ecard_init_pgtables(struct mm_struct *mm)
 {
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, VM_EXEC);
+	struct mmu_gather tlb;
 
 	/* We want to set up the page tables for the following mapping:
 	 *  Virtual	Physical
@@ -238,8 +240,10 @@ static void ecard_init_pgtables(struct mm_struct *mm)
 
 	memcpy(dst_pgd, src_pgd, sizeof(pgd_t) * (EASI_SIZE / PGDIR_SIZE));
 
-	flush_tlb_range(&vma, IO_START, IO_START + IO_SIZE);
-	flush_tlb_range(&vma, EASI_START, EASI_START + EASI_SIZE);
+	tlb_gather_mmu(&tlb, mm, 0, -1);
+	flush_tlb_range(&tlb, &vma, IO_START, IO_START + IO_SIZE);
+	flush_tlb_range(&tlb, &vma, EASI_START, EASI_START + EASI_SIZE);
+	tlb_finish_mmu(&tlb, 0, -1);
 }
 
 static int ecard_init_mm(void)
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index ed5409c6abf4..116da8026154 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -18,7 +18,6 @@
 #include <linux/module.h>
 #include <linux/cpufeature.h>
 #include <crypto/xts.h>
-
 #include "aes-ce-setkey.h"
 
 #ifdef USE_V8_CRYPTO_EXTENSIONS
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 892f33235dc7..0b4d75a2270b 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -121,7 +121,7 @@
  *		Invalidate an entire user address space on all CPUs.
  *		The 'mm' argument identifies the ASID to invalidate.
  *
- *	flush_tlb_range(vma, start, end)
+ *	flush_tlb_range(tlb, vma, start, end)
  *		Invalidate the virtual-address range '[start, end)' on all
  *		CPUs for the user address space corresponding to 'vma->mm'.
  *		Note that this operation also invalidates any walk-cache
@@ -247,7 +247,8 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ish);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
 	/*
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bbeb6a5a6ba6..50ae5c300f8a 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -141,7 +141,10 @@ static pte_t get_clear_flush(struct mm_struct *mm,
 
 	if (valid) {
 		struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
-		flush_tlb_range(&vma, saddr, addr);
+		struct mmu_gather tlb;
+		tlb_gather_mmu(&tlb, mm, saddr, addr);
+		flush_tlb_range(&tlb, &vma, saddr, addr);
+		tlb_finish_mmu(&tlb, saddr, addr);
 	}
 	return orig_pte;
 }
@@ -162,12 +165,15 @@ static void clear_flush(struct mm_struct *mm,
 			     unsigned long ncontig)
 {
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	struct mmu_gather tlb;
 	unsigned long i, saddr = addr;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		pte_clear(mm, addr, ptep);
 
-	flush_tlb_range(&vma, saddr, addr);
+	tlb_gather_mmu(&tlb, mm, saddr, addr);
+	flush_tlb_range(&tlb, &vma, saddr, addr);
+	tlb_finish_mmu(&tlb, saddr, addr);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
diff --git a/arch/csky/include/asm/tlb.h b/arch/csky/include/asm/tlb.h
index fdff9b8d70c8..05f756b32e5f 100644
--- a/arch/csky/include/asm/tlb.h
+++ b/arch/csky/include/asm/tlb.h
@@ -15,7 +15,7 @@
 #define tlb_end_vma(tlb, vma) \
 	do { \
 		if (!(tlb)->fullmm) \
-			flush_tlb_range(vma, (vma)->vm_start, (vma)->vm_end); \
+			flush_tlb_range(tlb, vma, (vma)->vm_start, (vma)->vm_end); \
 	}  while (0)
 
 #define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
diff --git a/arch/csky/include/asm/tlbflush.h b/arch/csky/include/asm/tlbflush.h
index 6845b0667703..8f922a693e11 100644
--- a/arch/csky/include/asm/tlbflush.h
+++ b/arch/csky/include/asm/tlbflush.h
@@ -10,14 +10,14 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 extern void flush_tlb_one(unsigned long vaddr);
diff --git a/arch/csky/mm/tlb.c b/arch/csky/mm/tlb.c
index eb3ba6c9c927..52e9087c45a7 100644
--- a/arch/csky/mm/tlb.c
+++ b/arch/csky/mm/tlb.c
@@ -44,8 +44,8 @@ do { \
 } while (0)
 #endif
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
 {
 	unsigned long newpid = cpu_asid(vma->vm_mm);
 
diff --git a/arch/hexagon/include/asm/tlbflush.h b/arch/hexagon/include/asm/tlbflush.h
index a7c9ab398cab..837deece2876 100644
--- a/arch/hexagon/include/asm/tlbflush.h
+++ b/arch/hexagon/include/asm/tlbflush.h
@@ -24,7 +24,7 @@
 extern void tlb_flush_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma,
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 				unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_tlb_one(unsigned long);
diff --git a/arch/hexagon/mm/vm_tlb.c b/arch/hexagon/mm/vm_tlb.c
index 53482f2a9ff9..31ab187b4e44 100644
--- a/arch/hexagon/mm/vm_tlb.c
+++ b/arch/hexagon/mm/vm_tlb.c
@@ -22,8 +22,8 @@
  * processors must be induced to flush the copies in their local TLBs,
  * but Hexagon thread-based virtual processors share the same MMU.
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
 {
 	struct mm_struct *mm = vma->vm_mm;
 
diff --git a/arch/ia64/include/asm/tlbflush.h b/arch/ia64/include/asm/tlbflush.h
index ceac10c4d6e2..b67b5527a7ba 100644
--- a/arch/ia64/include/asm/tlbflush.h
+++ b/arch/ia64/include/asm/tlbflush.h
@@ -92,7 +92,8 @@ flush_tlb_mm (struct mm_struct *mm)
 #endif
 }
 
-extern void flush_tlb_range (struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 
 /*
  * Page-granular tlb flush.
@@ -101,7 +102,8 @@ static inline void
 flush_tlb_page (struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_SMP
-	flush_tlb_range(vma, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
+	flush_tlb_range(NULL, vma, (addr & PAGE_MASK),
+			(addr & PAGE_MASK) + PAGE_SIZE);
 #else
 	if (vma->vm_mm == current->active_mm)
 		ia64_ptcl(addr, (PAGE_SHIFT << 2));
diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 72cc568bc841..84b11faaaf0c 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -347,7 +347,10 @@ __flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
 	ia64_srlz_i();			/* srlz.i implies srlz.d */
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+/* struct mmu_gather *tlb might be NULL in this architecture. See
+ * arch/ia64/include/asm/tlbflush.h: flush_tlb_page().
+ */
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
 {
 	if (unlikely(end - start >= 1024*1024*1024*1024UL
diff --git a/arch/m68k/include/asm/tlbflush.h b/arch/m68k/include/asm/tlbflush.h
index 191e75a6bb24..bad618573ea3 100644
--- a/arch/m68k/include/asm/tlbflush.h
+++ b/arch/m68k/include/asm/tlbflush.h
@@ -92,7 +92,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr
 	}
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
 	if (vma->vm_mm == current->active_mm)
@@ -189,8 +190,9 @@ static inline void flush_tlb_page (struct vm_area_struct *vma,
 }
 /* Flush a range of pages from TLB. */
 
-static inline void flush_tlb_range (struct vm_area_struct *vma,
-		      unsigned long start, unsigned long end)
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				    struct vm_area_struct *vma,
+				    unsigned long start, unsigned long end)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned char seg, oldctx;
@@ -263,7 +265,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr
 	BUG();
 }
 
-static inline void flush_tlb_range(struct mm_struct *mm,
+static inline void flush_tlb_range(struct mmu_gather *tlb, struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 	BUG();
diff --git a/arch/microblaze/include/asm/tlbflush.h b/arch/microblaze/include/asm/tlbflush.h
index 2e1353c2d18d..2c0f8b1a3b0b 100644
--- a/arch/microblaze/include/asm/tlbflush.h
+++ b/arch/microblaze/include/asm/tlbflush.h
@@ -44,7 +44,8 @@ static inline void local_flush_tlb_range(struct vm_area_struct *vma,
 #define flush_tlb_all local_flush_tlb_all
 #define flush_tlb_mm local_flush_tlb_mm
 #define flush_tlb_page local_flush_tlb_page
-#define flush_tlb_range local_flush_tlb_range
+#define flush_tlb_range(tlb, mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 
 /*
  * This is called in munmap when we have freed up some page-table
@@ -60,7 +61,7 @@ static inline void flush_tlb_pgtables(struct mm_struct *mm,
 #define flush_tlb_all()				BUG()
 #define flush_tlb_mm(mm)			BUG()
 #define flush_tlb_page(vma, addr)		BUG()
-#define flush_tlb_range(mm, start, end)		BUG()
+#define flush_tlb_range(tlb, mm, start, end)	BUG()
 #define flush_tlb_pgtables(mm, start, end)	BUG()
 #define flush_tlb_kernel_range(start, end)	BUG()
 
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index 425bb6fc3bda..26acebd3a58a 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -10,6 +10,7 @@
 #define __ASM_HUGETLB_H
 
 #include <asm/page.h>
+#include <asm/tlb.h>
 
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
@@ -72,12 +73,15 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	int changed = !pte_same(*ptep, pte);
 
 	if (changed) {
+		struct mmu_gather tlb;
 		set_pte_at(vma->vm_mm, addr, ptep, pte);
 		/*
 		 * There could be some standard sized pages in there,
 		 * get them all.
 		 */
-		flush_tlb_range(vma, addr, addr + HPAGE_SIZE);
+		tlb_gather_mmu(&tlb, vma->vm_mm, addr, addr + HPAGE_SIZE);
+		flush_tlb_range(&tlb, vma, addr, addr + HPAGE_SIZE);
+		tlb_finish_mmu(&tlb, addr, addr + HPAGE_SIZE);
 	}
 	return changed;
 }
diff --git a/arch/mips/include/asm/tlbflush.h b/arch/mips/include/asm/tlbflush.h
index 9789e7a32def..ad5ae304f2fa 100644
--- a/arch/mips/include/asm/tlbflush.h
+++ b/arch/mips/include/asm/tlbflush.h
@@ -10,7 +10,7 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
@@ -28,8 +28,8 @@ extern void local_flush_tlb_one(unsigned long vaddr);
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long,
-	unsigned long);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long, unsigned long);
 extern void flush_tlb_kernel_range(unsigned long, unsigned long);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
 extern void flush_tlb_one(unsigned long vaddr);
@@ -38,7 +38,8 @@ extern void flush_tlb_one(unsigned long vaddr);
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		drop_mmu_context(mm)
-#define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(tlb, vma, vmaddr, end) \
+	local_flush_tlb_range(vma, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr,end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index f510c00bda88..9c3a40d23314 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -563,7 +563,8 @@ static void flush_tlb_range_ipi(void *info)
 	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long addr;
diff --git a/arch/nds32/include/asm/tlbflush.h b/arch/nds32/include/asm/tlbflush.h
index 97155366ea01..81ba671759f9 100644
--- a/arch/nds32/include/asm/tlbflush.h
+++ b/arch/nds32/include/asm/tlbflush.h
@@ -36,7 +36,8 @@ void local_flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
 
 #define flush_tlb_all		local_flush_tlb_all
 #define flush_tlb_mm		local_flush_tlb_mm
-#define flush_tlb_range		local_flush_tlb_range
+#define flush_tlb_range(tlb, vma, start, end) \
+	local_flush_tlb_range(vma, start, end)
 #define flush_tlb_page		local_flush_tlb_page
 #define flush_tlb_kernel_range	local_flush_tlb_kernel_range
 
diff --git a/arch/nios2/include/asm/tlbflush.h b/arch/nios2/include/asm/tlbflush.h
index 362d6da09d02..823ea991b6d7 100644
--- a/arch/nios2/include/asm/tlbflush.h
+++ b/arch/nios2/include/asm/tlbflush.h
@@ -7,13 +7,14 @@
 #define _ASM_NIOS2_TLBFLUSH_H
 
 struct mm_struct;
+struct mmu_gather;
 
 /*
  * TLB flushing:
  *
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_page(vma, address) flushes a page
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_kernel_page(address) flushes a kernel page
@@ -23,14 +24,14 @@ struct mm_struct;
  */
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 static inline void flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long address)
 {
-	flush_tlb_range(vma, address, address + PAGE_SIZE);
+	flush_tlb_range(NULL, vma, address, address + PAGE_SIZE);
 }
 
 static inline void flush_tlb_kernel_page(unsigned long address)
diff --git a/arch/nios2/mm/tlb.c b/arch/nios2/mm/tlb.c
index 7fea59e53f94..5cb6d9a64082 100644
--- a/arch/nios2/mm/tlb.c
+++ b/arch/nios2/mm/tlb.c
@@ -100,8 +100,12 @@ static void reload_tlb_one_pid(unsigned long addr, unsigned long mmu_pid, pte_t
 	replace_tlb_one_pid(addr, mmu_pid, pte_val(pte));
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			unsigned long end)
+/*
+ * struct mmu_gather *tlb might be NULL in this architecture. See
+ * arch/nios2/include/asm/tlbflush.h: flush_tlb_page().
+ */
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
 {
 	unsigned long mmu_pid = get_pid_from_context(&vma->vm_mm->context);
 
diff --git a/arch/openrisc/include/asm/tlbflush.h b/arch/openrisc/include/asm/tlbflush.h
index e9a7f0b35a15..67e8e492d5f0 100644
--- a/arch/openrisc/include/asm/tlbflush.h
+++ b/arch/openrisc/include/asm/tlbflush.h
@@ -27,7 +27,7 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(mm, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, mm, start, end) flushes a range of pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
@@ -41,13 +41,13 @@ extern void local_flush_tlb_range(struct vm_area_struct *vma,
 #define flush_tlb_all	local_flush_tlb_all
 #define flush_tlb_mm	local_flush_tlb_mm
 #define flush_tlb_page	local_flush_tlb_page
-#define flush_tlb_range	local_flush_tlb_range
+#define flush_tlb_range(tlb, vma, start, end)	local_flush_tlb_range(vma, start, end)
 #else
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 #endif
 
 static inline void flush_tlb(void)
@@ -58,7 +58,7 @@ static inline void flush_tlb(void)
 static inline void flush_tlb_kernel_range(unsigned long start,
 					  unsigned long end)
 {
-	flush_tlb_range(NULL, start, end);
+	flush_tlb_range(NULL, NULL, start, end);
 }
 
 #endif /* __ASM_OPENRISC_TLBFLUSH_H */
diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
index 7d518ee8bddc..b22d5afb8daa 100644
--- a/arch/openrisc/kernel/smp.c
+++ b/arch/openrisc/kernel/smp.c
@@ -238,7 +238,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
 	on_each_cpu(ipi_flush_tlb_all, NULL, 1);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		     unsigned long start, unsigned long end)
 {
 	on_each_cpu(ipi_flush_tlb_all, NULL, 1);
diff --git a/arch/parisc/include/asm/tlbflush.h b/arch/parisc/include/asm/tlbflush.h
index c5ded01d45be..a46d27e7b9b0 100644
--- a/arch/parisc/include/asm/tlbflush.h
+++ b/arch/parisc/include/asm/tlbflush.h
@@ -16,7 +16,7 @@ extern void flush_tlb_all_local(void *);
 int __flush_tlb_range(unsigned long sid,
 	unsigned long start, unsigned long end);
 
-#define flush_tlb_range(vma, start, end) \
+#define flush_tlb_range(tlb, vma, start, end) \
 	__flush_tlb_range((vma)->vm_mm->context, start, end)
 
 #define flush_tlb_kernel_range(start, end) \
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index 1eedfecc5137..341eeff566c9 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -22,6 +22,7 @@
 #include <asm/pdc.h>
 #include <asm/cache.h>
 #include <asm/cacheflush.h>
+#include <asm/tlb.h>
 #include <asm/tlbflush.h>
 #include <asm/page.h>
 #include <asm/pgalloc.h>
@@ -563,11 +564,14 @@ void flush_cache_mm(struct mm_struct *mm)
 	}
 
 	if (mm->context == mfsp(3)) {
+		struct mmu_gather tlb;
 		for (vma = mm->mmap; vma; vma = vma->vm_next) {
 			flush_user_dcache_range_asm(vma->vm_start, vma->vm_end);
 			if (vma->vm_flags & VM_EXEC)
 				flush_user_icache_range_asm(vma->vm_start, vma->vm_end);
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end);
+			tlb_gather_mmu(&tlb, mm, vma->vm_start, vma->vm_end);
+			flush_tlb_range(&tlb, vma, vma->vm_start, vma->vm_end);
+			tlb_finish_mmu(&tlb, vma->vm_start, vma->vm_end);
 		}
 		return;
 	}
@@ -600,11 +604,13 @@ void flush_cache_range(struct vm_area_struct *vma,
 {
 	pgd_t *pgd;
 	unsigned long addr;
+	struct mmu_gather tlb;
 
+	tlb_gather_mmu(&tlb, vma->vm_mm, start, end);
 	if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
 	    end - start >= parisc_cache_flush_threshold) {
 		if (vma->vm_mm->context)
-			flush_tlb_range(vma, start, end);
+			flush_tlb_range(&tlb, vma, start, end);
 		flush_cache_all();
 		return;
 	}
@@ -613,9 +619,10 @@ void flush_cache_range(struct vm_area_struct *vma,
 		flush_user_dcache_range_asm(start, end);
 		if (vma->vm_flags & VM_EXEC)
 			flush_user_icache_range_asm(start, end);
-		flush_tlb_range(vma, start, end);
+		flush_tlb_range(&tlb, vma, start, end);
 		return;
 	}
+	tlb_finish_mmu(&tlb, start, end);
 
 	pgd = vma->vm_mm->pgd;
 	for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
diff --git a/arch/powerpc/include/asm/book3s/32/tlbflush.h b/arch/powerpc/include/asm/book3s/32/tlbflush.h
index 068085b709fb..79790529fc7c 100644
--- a/arch/powerpc/include/asm/book3s/32/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/32/tlbflush.h
@@ -9,8 +9,8 @@
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
 					unsigned long vmaddr)
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 968f10ef3d51..b2af60c9a5ca 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -67,7 +67,8 @@ static inline void flush_hugetlb_tlb_range(struct mmu_gather *tlb,
 	return hash__flush_tlb_range(vma, start, end);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
 	if (radix_enabled())
diff --git a/arch/powerpc/include/asm/nohash/tlbflush.h b/arch/powerpc/include/asm/nohash/tlbflush.h
index b1d8fec29169..471ce66d32c2 100644
--- a/arch/powerpc/include/asm/nohash/tlbflush.h
+++ b/arch/powerpc/include/asm/nohash/tlbflush.h
@@ -11,7 +11,7 @@
  *                           the local processor
  *  - local_flush_tlb_page(vma, vmaddr) flushes one page on the local processor
  *  - flush_tlb_page_nohash(vma, vmaddr) flushes one page if SW loaded TLB
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *
  */
@@ -26,11 +26,12 @@
 
 struct vm_area_struct;
 struct mm_struct;
+struct mmu_gather;
 
 #define MMU_NO_CONTEXT      	((unsigned int)-1)
 
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 extern void local_flush_tlb_mm(struct mm_struct *mm);
diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
index 2fcd321040ff..96dbdcffc140 100644
--- a/arch/powerpc/mm/book3s32/tlb.c
+++ b/arch/powerpc/mm/book3s32/tlb.c
@@ -63,7 +63,7 @@ void tlb_flush(struct mmu_gather *tlb)
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  * since the hardware hash table functions as an extension of the
@@ -156,8 +156,8 @@ EXPORT_SYMBOL(flush_tlb_page);
  * and check _PAGE_HASHPTE bit; if it is set, find and destroy
  * the corresponding HPTE.
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	flush_range(vma->vm_mm, start, end);
 }
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 03f43c924e00..78ea6e4192d1 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -557,7 +557,7 @@ static inline void _tlbiel_va_range_multicast(struct mm_struct *mm,
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  *  - local_* variants of page and mm only apply to the current
diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c
index 696f568253a0..1fd5c837630e 100644
--- a/arch/powerpc/mm/nohash/tlb.c
+++ b/arch/powerpc/mm/nohash/tlb.c
@@ -181,7 +181,7 @@ EXPORT_PER_CPU_SYMBOL(next_tlbcam_idx);
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  *  - local_* variants of page and mm only apply to the current
@@ -379,8 +379,8 @@ EXPORT_SYMBOL(flush_tlb_kernel_range);
  * some implementation can stack multiple tlbivax before a tlbsync but
  * for now, we keep it that way
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 
 {
 	if (end - start == PAGE_SIZE && !(start & ~PAGE_MASK))
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 394cfbccdcd9..d1ede9c434ec 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -30,14 +30,15 @@ static inline void local_flush_tlb_page(unsigned long addr)
 void flush_tlb_all(void);
 void flush_tlb_mm(struct mm_struct *mm);
 void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end);
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end);
 #else /* CONFIG_SMP && CONFIG_MMU */
 
 #define flush_tlb_all() local_flush_tlb_all()
 #define flush_tlb_page(vma, addr) local_flush_tlb_page(addr)
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+		struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
 {
 	local_flush_tlb_all();
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 720b443c4528..4c679f942b14 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -49,8 +49,8 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
 	__sbi_tlb_flush_range(mm_cpumask(vma->vm_mm), addr, PAGE_SIZE);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	__sbi_tlb_flush_range(mm_cpumask(vma->vm_mm), start, end - start);
 }
diff --git a/arch/s390/include/asm/tlbflush.h b/arch/s390/include/asm/tlbflush.h
index 82703e03f35d..c9d372bbbb3e 100644
--- a/arch/s390/include/asm/tlbflush.h
+++ b/arch/s390/include/asm/tlbflush.h
@@ -99,7 +99,7 @@ static inline void __tlb_flush_mm_lazy(struct mm_struct * mm)
  *  flush_tlb_all() - flushes all processes TLBs
  *  flush_tlb_mm(mm) - flushes the specified mm context TLB's
  *  flush_tlb_page(vma, vmaddr) - flushes one page
- *  flush_tlb_range(vma, start, end) - flushes a range of pages
+ *  flush_tlb_range(tlb, vma, start, end) - flushes a range of pages
  *  flush_tlb_kernel_range(start, end) - flushes a range of kernel pages
  */
 
@@ -120,7 +120,8 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
 	__tlb_flush_mm_lazy(mm);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
 	__tlb_flush_mm_lazy(vma->vm_mm);
diff --git a/arch/sh/include/asm/tlbflush.h b/arch/sh/include/asm/tlbflush.h
index 8f180cd3bcd6..bfaf0e9677c2 100644
--- a/arch/sh/include/asm/tlbflush.h
+++ b/arch/sh/include/asm/tlbflush.h
@@ -8,7 +8,7 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
@@ -28,8 +28,8 @@ extern void __flush_tlb_global(void);
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void flush_tlb_one(unsigned long asid, unsigned long page);
@@ -41,7 +41,7 @@ extern void flush_tlb_one(unsigned long asid, unsigned long page);
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_one(asid, page)	local_flush_tlb_one(asid, page)
 
-#define flush_tlb_range(vma, start, end)	\
+#define flush_tlb_range(tlb, vma, start, end)	\
 	local_flush_tlb_range(vma, start, end)
 
 #define flush_tlb_kernel_range(start, end)	\
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index 372acdc9033e..8bee178f6882 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -387,7 +387,7 @@ static void flush_tlb_range_ipi(void *info)
 	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		     unsigned long start, unsigned long end)
 {
 	struct mm_struct *mm = vma->vm_mm;
diff --git a/arch/sparc/include/asm/tlbflush_32.h b/arch/sparc/include/asm/tlbflush_32.h
index 470531991a08..231f705cd314 100644
--- a/arch/sparc/include/asm/tlbflush_32.h
+++ b/arch/sparc/include/asm/tlbflush_32.h
@@ -8,7 +8,7 @@
 	sparc32_cachetlb_ops->tlb_all()
 #define flush_tlb_mm(mm) \
 	sparc32_cachetlb_ops->tlb_mm(mm)
-#define flush_tlb_range(vma, start, end) \
+#define flush_tlb_range(tlb, vma, start, end) \
 	sparc32_cachetlb_ops->tlb_range(vma, start, end)
 #define flush_tlb_page(vma, addr) \
 	sparc32_cachetlb_ops->tlb_page(vma, addr)
diff --git a/arch/sparc/include/asm/tlbflush_64.h b/arch/sparc/include/asm/tlbflush_64.h
index 8b8cdaa69272..c371f46c71e8 100644
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -32,7 +32,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 {
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
 }
diff --git a/arch/sparc/mm/tlb.c b/arch/sparc/mm/tlb.c
index 3d72d2deb13b..9bb1fd1c2668 100644
--- a/arch/sparc/mm/tlb.c
+++ b/arch/sparc/mm/tlb.c
@@ -245,10 +245,13 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		     pmd_t *pmdp)
 {
 	pmd_t old, entry;
+	struct mmu_gather tlb;
 
 	entry = __pmd(pmd_val(*pmdp) & ~_PAGE_VALID);
 	old = pmdp_establish(vma, address, pmdp, entry);
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	tlb_gather_mmu(&tlb, vma->vm_mm, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(&tlb, vma, address, address + HPAGE_PMD_SIZE);
+	tlb_finish_mmu(&tlb, address, address + HPAGE_PMD_SIZE);
 
 	/*
 	 * set_pmd_at() will not be called in a way to decrement
diff --git a/arch/um/include/asm/tlbflush.h b/arch/um/include/asm/tlbflush.h
index a5bda890390d..e41a03181aa8 100644
--- a/arch/um/include/asm/tlbflush.h
+++ b/arch/um/include/asm/tlbflush.h
@@ -16,13 +16,13 @@
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
  *  - flush_tlb_kernel_vm() flushes the kernel vm area
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  */
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, 
-			    unsigned long end);
+extern void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+			    unsigned long start, unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
 extern void flush_tlb_kernel_vm(void);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index 80a358c6d652..5aa1b8100f73 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
@@ -574,8 +574,8 @@ static void fix_range(struct mm_struct *mm, unsigned long start_addr,
 	fix_range_common(mm, start_addr, end_addr, force);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		     unsigned long end)
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+		     unsigned long start, unsigned long end)
 {
 	if (vma->vm_mm == NULL)
 		flush_tlb_kernel_range_common(start, end);
diff --git a/arch/unicore32/include/asm/tlbflush.h b/arch/unicore32/include/asm/tlbflush.h
index 1cf18ef55515..69ea6c3079e7 100644
--- a/arch/unicore32/include/asm/tlbflush.h
+++ b/arch/unicore32/include/asm/tlbflush.h
@@ -38,7 +38,7 @@ extern void __cpu_flush_kern_tlb_range(unsigned long, unsigned long);
  *		space.
  *		- mm	- mm_struct describing address space
  *
- *	flush_tlb_range(mm,start,end)
+ *	flush_tlb_range(tlb,mm,start,end)
  *
  *		Invalidate a range of TLB entries in the specified
  *		address space.
@@ -173,7 +173,8 @@ static inline void clean_pmd_entry(pmd_t *pmd)
 #define flush_tlb_mm		local_flush_tlb_mm
 #define flush_tlb_page		local_flush_tlb_page
 #define flush_tlb_kernel_page	local_flush_tlb_kernel_page
-#define flush_tlb_range		local_flush_tlb_range
+#define flush_tlb_range(tlb, vma, start, end)	\
+	local_flush_tlb_range(vma, start, end)
 #define flush_tlb_kernel_range	local_flush_tlb_kernel_range
 
 /*
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6f66d841262d..217e37da8271 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -531,7 +531,7 @@ static inline void __flush_tlb_one_kernel(unsigned long addr)
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, vma, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, info) flushes TLBs on other cpus
  *
@@ -568,7 +568,7 @@ struct flush_tlb_info {
 #define flush_tlb_mm(mm)						\
 		flush_tlb_mm_range(mm, 0UL, TLB_FLUSH_ALL, 0UL, true)
 
-#define flush_tlb_range(vma, start, end)				\
+#define flush_tlb_range(tlb, vma, start, end)				\
 	flush_tlb_mm_range((vma)->vm_mm, start, end,			\
 			   ((vma)->vm_flags & VM_HUGETLB)		\
 				? huge_page_shift(hstate_vma(vma))	\
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7bd2c3a52297..7ff38507086c 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -596,8 +596,14 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma,
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
-	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	if (young) {
+		struct mmu_gather tlb;
+		unsigned long tlb_start = address;
+		unsigned long tlb_end = address + HPAGE_PMD_SIZE;
+		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+		flush_tlb_range(&tlb, vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
+	}
 
 	return young;
 }
diff --git a/arch/xtensa/include/asm/tlbflush.h b/arch/xtensa/include/asm/tlbflush.h
index 856e2da2e397..613127ebabfb 100644
--- a/arch/xtensa/include/asm/tlbflush.h
+++ b/arch/xtensa/include/asm/tlbflush.h
@@ -27,7 +27,7 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(mm, vmaddr) flushes a single page
- *  - flush_tlb_range(mm, start, end) flushes a range of pages
+ *  - flush_tlb_range(tlb, mm, start, end) flushes a range of pages
  */
 
 void local_flush_tlb_all(void);
@@ -43,8 +43,8 @@ void local_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 void flush_tlb_all(void);
 void flush_tlb_mm(struct mm_struct *);
 void flush_tlb_page(struct vm_area_struct *, unsigned long);
-void flush_tlb_range(struct vm_area_struct *, unsigned long,
-		unsigned long);
+void flush_tlb_range(struct mmu_gather *, struct vm_area_struct *,
+		unsigned long, unsigned long);
 void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 #else /* !CONFIG_SMP */
@@ -52,8 +52,8 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 #define flush_tlb_all()			   local_flush_tlb_all()
 #define flush_tlb_mm(mm)		   local_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	   local_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, vmaddr, end)  local_flush_tlb_range(vma, vmaddr, \
-								 end)
+#define flush_tlb_range(tlb, vma, vmaddr, end)	\
+	local_flush_tlb_range(vma, vmaddr, end)
 #define flush_tlb_kernel_range(start, end) local_flush_tlb_kernel_range(start, \
 									end)
 
diff --git a/arch/xtensa/kernel/smp.c b/arch/xtensa/kernel/smp.c
index 83b244ce61ee..6fec29ea865a 100644
--- a/arch/xtensa/kernel/smp.c
+++ b/arch/xtensa/kernel/smp.c
@@ -509,7 +509,7 @@ static void ipi_flush_tlb_range(void *arg)
 	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		     unsigned long start, unsigned long end)
 {
 	struct flush_data fd = {
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 1c67a744877e..884be388fe38 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1160,8 +1160,10 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
  * invalidate the entire TLB which is not desitable.
  * e.g. see arch/arc: flush_pmd_tlb_range
  */
-#define flush_pmd_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
-#define flush_pud_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
+#define flush_pmd_tlb_range(tlb, vma, addr, end)	\
+	flush_tlb_range(tlb, vma, addr, end)
+#define flush_pud_tlb_range(tlb, vma, addr, end)	\
+	flush_tlb_range(tlb, vma, addr, end)
 #else
 #define flush_pmd_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
 #define flush_pud_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index f391f6b500b4..1aa1ff9f72a2 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -380,7 +380,7 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 				    (tlb->vma_huge ? VM_HUGETLB : 0),
 		};
 
-		flush_tlb_range(&vma, tlb->start, tlb->end);
+		flush_tlb_range(tlb, &vma, tlb->start, tlb->end);
 	}
 }
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 24ad53b4dfc0..5f1d29c409df 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1646,7 +1646,13 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	 * mapping or not. Hence use the tlb range variant
 	 */
 	if (mm_tlb_flush_pending(vma->vm_mm)) {
-		flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+		struct mmu_gather tlb;
+		unsigned long tlb_start = haddr;
+		unsigned long tlb_end = haddr + HPAGE_PMD_SIZE;
+		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+		tlb.cleared_pmds = 1;
+		flush_tlb_range(&tlb, vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 		/*
 		 * change_huge_pmd() released the pmd lock before
 		 * invalidating the secondary MMUs sharing the primary
@@ -1917,8 +1923,15 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 		}
 		pmd = move_soft_dirty_pmd(pmd);
 		set_pmd_at(mm, new_addr, new_pmd, pmd);
-		if (force_flush)
-			flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
+		if (force_flush) {
+			struct mmu_gather tlb;
+			unsigned long tlb_start = old_addr;
+			unsigned long tlb_end = old_addr + PMD_SIZE;
+			tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+			tlb.cleared_pmds = 1;
+			flush_tlb_range(&tlb, vma, tlb_start, tlb_end);
+			tlb_finish_mmu(&tlb, tlb_start, tlb_end);
+		}
 		if (new_ptl != old_ptl)
 			spin_unlock(new_ptl);
 		spin_unlock(old_ptl);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f913ce0b4831..f4a2c9a9e478 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4442,7 +4442,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
  * implement this.
  */
 #define flush_hugetlb_tlb_range(tlb, vma, addr, end)	\
-	flush_tlb_range(vma, addr, end)
+	flush_tlb_range(tlb, vma, addr, end)
 #endif
 
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c
index 71070dda9643..6b9df57b6301 100644
--- a/mm/mapping_dirty_helpers.c
+++ b/mm/mapping_dirty_helpers.c
@@ -175,13 +175,22 @@ static int wp_clean_pre_vma(unsigned long start, unsigned long end,
 static void wp_clean_post_vma(struct mm_walk *walk)
 {
 	struct wp_walk *wpwalk = walk->private;
-
-	if (mm_tlb_flush_nested(walk->mm))
-		flush_tlb_range(walk->vma, wpwalk->range.start,
-				wpwalk->range.end);
-	else if (wpwalk->tlbflush_end > wpwalk->tlbflush_start)
-		flush_tlb_range(walk->vma, wpwalk->tlbflush_start,
-				wpwalk->tlbflush_end);
+	struct mmu_gather tlb;
+	unsigned long tlb_start, tlb_end;
+
+	if (mm_tlb_flush_nested(walk->mm)) {
+		tlb_start = wpwalk->range.start;
+		tlb_end = wpwalk->range.end;
+		tlb_gather_mmu(&tlb, walk->mm, tlb_start, tlb_end);
+		flush_tlb_range(&tlb, walk->vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
+	} else if (wpwalk->tlbflush_end > wpwalk->tlbflush_start) {
+		tlb_start = wpwalk->tlbflush_start;
+		tlb_end = wpwalk->tlbflush_end;
+		tlb_gather_mmu(&tlb, walk->mm, tlb_start, tlb_end);
+		flush_tlb_range(&tlb, walk->vma, tlb_start, tlb_end);
+		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
+	}
 
 	mmu_notifier_invalidate_range_end(&wpwalk->range);
 	dec_tlb_flush_pending(walk->mm);
diff --git a/mm/migrate.c b/mm/migrate.c
index b1092876e537..7f62b16ef36b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2340,8 +2340,12 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 	pte_unmap_unlock(ptep - 1, ptl);
 
 	/* Only flush the TLB if we actually modified any entries */
-	if (unmapped)
-		flush_tlb_range(walk->vma, start, end);
+	if (unmapped) {
+		struct mmu_gather tlb;
+		tlb_gather_mmu(&tlb, mm, start, end);
+		flush_tlb_range(&tlb, walk->vma, start, end);
+		tlb_finish_mmu(&tlb, start, end);
+	}
 
 	return 0;
 }
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 311c0dadf71c..1e79254776b4 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -31,6 +31,7 @@
 #include <asm/pgtable.h>
 #include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
+#include <asm/tlb.h>
 #include <asm/tlbflush.h>
 
 #include "internal.h"
@@ -303,6 +304,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 		int dirty_accountable, int prot_numa)
 {
 	struct mm_struct *mm = vma->vm_mm;
+	struct mmu_gather tlb;
 	pgd_t *pgd;
 	unsigned long next;
 	unsigned long start = addr;
@@ -311,7 +313,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 	BUG_ON(addr >= end);
 	pgd = pgd_offset(mm, addr);
 	flush_cache_range(vma, addr, end);
-	inc_tlb_flush_pending(mm);
+	tlb_gather_mmu(&tlb, mm, start, end);
 	do {
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd))
@@ -322,8 +324,8 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 
 	/* Only flush the TLB if we actually modified any entries: */
 	if (pages)
-		flush_tlb_range(vma, start, end);
-	dec_tlb_flush_pending(mm);
+		flush_tlb_range(&tlb, vma, start, end);
+	tlb_finish_mmu(&tlb, start, end);
 
 	return pages;
 }
diff --git a/mm/mremap.c b/mm/mremap.c
index af363063ea23..28cfe2f820ea 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -26,6 +26,7 @@
 #include <linux/userfaultfd_k.h>
 
 #include <asm/cacheflush.h>
+#include <asm/tlb.h>
 #include <asm/tlbflush.h>
 
 #include "internal.h"
@@ -181,8 +182,14 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 	}
 
 	arch_leave_lazy_mmu_mode();
-	if (force_flush)
-		flush_tlb_range(vma, old_end - len, old_end);
+	if (force_flush) {
+		struct mmu_gather tlb;
+		tlb_gather_mmu(&tlb, mm, old_end - len, old_end);
+		tlb.cleared_ptes = 1;
+		flush_tlb_range(&tlb, vma, old_end - len, old_end);
+		tlb_finish_mmu(&tlb, old_end - len, old_end);
+	}
+
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
 	pte_unmap(new_pte - 1);
@@ -198,6 +205,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 {
 	spinlock_t *old_ptl, *new_ptl;
 	struct mm_struct *mm = vma->vm_mm;
+	struct mmu_gather tlb;
 	pmd_t pmd;
 
 	if ((old_addr & ~PMD_MASK) || (new_addr & ~PMD_MASK)
@@ -228,7 +236,10 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 
 	/* Set the new pmd */
 	set_pmd_at(mm, new_addr, new_pmd, pmd);
-	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
+	tlb_gather_mmu(&tlb, mm, old_addr, old_addr + PMD_SIZE);
+	tlb.cleared_pmds = 1;
+	flush_tlb_range(&tlb, vma, old_addr, old_addr + PMD_SIZE);
+	tlb_finish_mmu(&tlb, old_addr, old_addr + PMD_SIZE);
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
 	spin_unlock(old_ptl);
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 9ab9d8f698ea..ef3ac1752033 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -240,13 +240,19 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
 	 * use the same function.
 	 */
 	pmd_t pmd;
+	struct mmu_gather tlb;
+	unsigned long tlb_start = address;
+	unsigned long tlb_end = address + HPAGE_PMD_SIZE;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	VM_BUG_ON(pmd_trans_huge(*pmdp));
 	pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
 
 	/* collapse entails shooting down ptes not pmd */
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
+	tlb.cleared_ptes = 1;
+	flush_tlb_range(&tlb, vma, tlb_start, tlb_end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 	return pmd;
 }
 #endif
diff --git a/mm/rmap.c b/mm/rmap.c
index b3e381919835..72133bfb1c07 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -67,6 +67,7 @@
 #include <linux/memremap.h>
 #include <linux/userfaultfd_k.h>
 
+#include <asm/tlb.h>
 #include <asm/tlbflush.h>
 
 #include <trace/events/tlb.h>
@@ -1377,6 +1378,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)arg;
+	struct mmu_gather tlb;
 
 	/* munlock has nothing to gain from examining un-locked vmas */
 	if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED))
@@ -1462,7 +1464,9 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				 * already adjusted above to cover this range.
 				 */
 				flush_cache_range(vma, range.start, range.end);
-				flush_tlb_range(vma, range.start, range.end);
+				tlb_gather_mmu(&tlb, mm, range.start, range.end);
+				flush_tlb_range(&tlb, vma, range.start, range.end);
+				tlb_finish_mmu(&tlb, range.start, range.end);
 				mmu_notifier_invalidate_range(mm, range.start,
 							      range.end);
 
-- 
2.19.1




* [RFC PATCH v5 8/8] arm64: tlb: Set the TTL field in flush_tlb_range
  2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
                   ` (6 preceding siblings ...)
  2020-03-31 14:29 ` [RFC PATCH v5 7/8] mm: tlb: Pass struct mmu_gather to flush_tlb_range Zhenyu Ye
@ 2020-03-31 14:29 ` Zhenyu Ye
  7 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-03-31 14:29 UTC (permalink / raw)
  To: peterz, mark.rutland, will, catalin.marinas, aneesh.kumar, akpm,
	npiggin, arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao,
	Dave.Martin, steven.price, broonie, guohanjun, corbet, vgupta,
	tony.luck
  Cc: yezhenyu2, linux-arm-kernel, linux-kernel, linux-arch, linux-mm,
	arm, xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

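Use the cleared_* bits in struct mmu_gather to derive the page-table
level and pass it as the TTL hint from tlb_flush() and flush_tlb_range().
The level is only set when exactly one of the cleared_* bits is set;
otherwise it stays 0, which means no level hint is given.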

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
---
 arch/arm64/include/asm/tlb.h      | 39 ++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/tlbflush.h | 22 +++++------------
 2 files changed, 44 insertions(+), 17 deletions(-)
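
For readers skimming the diff, the level selection done by tlb_get_level()
can be sketched as a small standalone C program. This is only an
illustration under stated assumptions: struct gather_sketch,
sketch_get_level() and sketch_self_check() are hypothetical names, not
kernel symbols. Only the cleared_* field names and the returned levels are
taken from the patch below.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the cleared_* bits of struct mmu_gather.
 * The real structure has many more members; only these four matter here.
 */
struct gather_sketch {
	unsigned int cleared_ptes : 1;
	unsigned int cleared_pmds : 1;
	unsigned int cleared_puds : 1;
	unsigned int cleared_p4ds : 1;
};

/*
 * Mirror of the selection logic in the patch: return a level hint only
 * when exactly one cleared_* bit is set, otherwise 0 (no hint).
 */
static int sketch_get_level(const struct gather_sketch *tlb)
{
	int sum = tlb->cleared_ptes + tlb->cleared_pmds +
		  tlb->cleared_puds + tlb->cleared_p4ds;

	if (sum != 1)
		return 0;
	if (tlb->cleared_ptes)
		return 3;
	if (tlb->cleared_pmds)
		return 2;
	if (tlb->cleared_puds)
		return 1;

	/* Only cleared_p4ds is set: no level hint is produced. */
	return 0;
}

/* Small self-check covering each branch of the selection. */
static void sketch_self_check(void)
{
	struct gather_sketch none = { 0 };
	struct gather_sketch pte = { .cleared_ptes = 1 };
	struct gather_sketch pmd = { .cleared_pmds = 1 };
	struct gather_sketch pud = { .cleared_puds = 1 };
	struct gather_sketch p4d = { .cleared_p4ds = 1 };
	struct gather_sketch mixed = { .cleared_ptes = 1, .cleared_pmds = 1 };

	assert(sketch_get_level(&none) == 0);
	assert(sketch_get_level(&pte) == 3);
	assert(sketch_get_level(&pmd) == 2);
	assert(sketch_get_level(&pud) == 1);
	assert(sketch_get_level(&p4d) == 0);
	assert(sketch_get_level(&mixed) == 0);
}
```

Falling back to 0 whenever more than one level was touched keeps the hint
conservative: a caller that cannot name a single level simply gets the
invalidation without a level hint.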

diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index b76df828e6b7..72b6e3763df2 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -21,11 +21,34 @@ static void tlb_flush(struct mmu_gather *tlb);
 
 #include <asm-generic/tlb.h>
 
+/*
+ * Get the TLBI level hint on arm64.  Return 0 (no hint) unless exactly
+ * one of the cleared_* bits is set.
+ * arm64 does not support p4ds yet, so cleared_p4ds alone also yields 0.
+ */
+static inline int tlb_get_level(struct mmu_gather *tlb)
+{
+	int sum = tlb->cleared_ptes + tlb->cleared_pmds +
+		  tlb->cleared_puds + tlb->cleared_p4ds;
+
+	if (sum != 1)
+		return 0;
+	else if (tlb->cleared_ptes)
+		return 3;
+	else if (tlb->cleared_pmds)
+		return 2;
+	else if (tlb->cleared_puds)
+		return 1;
+
+	return 0;
+}
+
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
 	struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
 	bool last_level = !tlb->freed_tables;
 	unsigned long stride = tlb_get_unmap_size(tlb);
+	int tlb_level = tlb_get_level(tlb);
 
 	/*
 	 * If we're tearing down the address space then we only care about
@@ -38,7 +61,21 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 		return;
 	}
 
-	__flush_tlb_range(&vma, tlb->start, tlb->end, stride, last_level);
+	__flush_tlb_range(&vma, tlb->start, tlb->end, stride,
+			  last_level, tlb_level);
+}
+
+static inline void flush_tlb_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma,
+				   unsigned long start, unsigned long end)
+{
+	/*
+	 * We cannot use leaf-only invalidation here, since we may be invalidating
+	 * table entries as part of collapsing hugepages or moving page tables.
+	 */
+	unsigned long stride = tlb_get_unmap_size(tlb);
+	int tlb_level = tlb_get_level(tlb);
+	__flush_tlb_range(vma, start, end, stride, false, tlb_level);
 }
 
 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 0b4d75a2270b..dc8e803692f8 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -215,7 +215,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
-				     unsigned long stride, bool last_level)
+				     unsigned long stride, bool last_level,
+				     int tlb_level)
 {
 	unsigned long asid = ASID(vma->vm_mm);
 	unsigned long addr;
@@ -237,27 +238,16 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ishst);
 	for (addr = start; addr < end; addr += stride) {
 		if (last_level) {
-			__tlbi_level(vale1is, addr, 0);
-			__tlbi_user_level(vale1is, addr, 0);
+			__tlbi_level(vale1is, addr, tlb_level);
+			__tlbi_user_level(vale1is, addr, tlb_level);
 		} else {
-			__tlbi_level(vae1is, addr, 0);
-			__tlbi_user_level(vae1is, addr, 0);
+			__tlbi_level(vae1is, addr, tlb_level);
+			__tlbi_user_level(vae1is, addr, tlb_level);
 		}
 	}
 	dsb(ish);
 }
 
-static inline void flush_tlb_range(struct mmu_gather *tlb,
-				   struct vm_area_struct *vma,
-				   unsigned long start, unsigned long end)
-{
-	/*
-	 * We cannot use leaf-only invalidation here, since we may be invalidating
-	 * table entries as part of collapsing hugepages or moving page tables.
-	 */
-	__flush_tlb_range(vma, start, end, PAGE_SIZE, false);
-}
-
 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
 	unsigned long addr;
-- 
2.19.1
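
As an illustration only (not part of the patch), the level selection done by
tlb_get_level() above can be modelled in plain userspace C.  The struct
gather_model type and the model_* names are hypothetical stand-ins for the
cleared_* bits of struct mmu_gather; the decision logic is copied from the
patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the cleared_* bits of struct mmu_gather. */
struct gather_model {
	unsigned int cleared_ptes : 1;
	unsigned int cleared_pmds : 1;
	unsigned int cleared_puds : 1;
	unsigned int cleared_p4ds : 1;
};

/* Same decision logic as tlb_get_level() in the patch. */
static int model_get_level(const struct gather_model *tlb)
{
	int sum = tlb->cleared_ptes + tlb->cleared_pmds +
		  tlb->cleared_puds + tlb->cleared_p4ds;

	if (sum != 1)
		return 0;	/* no level or several levels cleared */
	if (tlb->cleared_ptes)
		return 3;
	if (tlb->cleared_pmds)
		return 2;
	if (tlb->cleared_puds)
		return 1;
	return 0;		/* only p4ds cleared */
}

/* Exercise each interesting combination once. */
static void model_demo(void)
{
	struct gather_model pte_only = { .cleared_ptes = 1 };
	struct gather_model pmd_only = { .cleared_pmds = 1 };
	struct gather_model p4d_only = { .cleared_p4ds = 1 };
	struct gather_model mixed = { .cleared_ptes = 1, .cleared_pmds = 1 };
	struct gather_model none = { 0 };

	assert(model_get_level(&pte_only) == 3);
	assert(model_get_level(&pmd_only) == 2);
	assert(model_get_level(&p4d_only) == 0);
	assert(model_get_level(&mixed) == 0);
	assert(model_get_level(&none) == 0);
}
```

A level is only returned when exactly one of the pte, pmd or pud bits is set;
every other combination falls back to 0.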



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-03-31 14:29 ` [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range Zhenyu Ye
@ 2020-03-31 15:13   ` Peter Zijlstra
  2020-04-01  8:51     ` Zhenyu Ye
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-03-31 15:13 UTC (permalink / raw)
  To: Zhenyu Ye
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

On Tue, Mar 31, 2020 at 10:29:23PM +0800, Zhenyu Ye wrote:
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index e2e2bef07dd2..32d4661e5a56 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -1160,10 +1160,10 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
>   * invalidate the entire TLB which is not desitable.
>   * e.g. see arch/arc: flush_pmd_tlb_range
>   */
> -#define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
> +#define flush_pmd_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
>  #define flush_pud_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
>  #else
> -#define flush_pmd_tlb_range(vma, addr, end)	BUILD_BUG()
> +#define flush_pmd_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
>  #define flush_pud_tlb_range(vma, addr, end)	BUILD_BUG()
>  #endif
>  #endif
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index 3d7c01e76efc..96c9cf77bfb5 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -109,8 +109,14 @@ int pmdp_set_access_flags(struct vm_area_struct *vma,
>  	int changed = !pmd_same(*pmdp, entry);
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	if (changed) {
> +		struct mmu_gather tlb;
> +		unsigned long tlb_start = address;
> +		unsigned long tlb_end = address + HPAGE_PMD_SIZE;
>  		set_pmd_at(vma->vm_mm, address, pmdp, entry);
> -		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
> +		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
> +		tlb.cleared_pmds = 1;
> +		flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
> +		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
>  	}
>  	return changed;
>  }

This is madness. Please, carefully consider what you just wrote and what
it will do in the generic case.

Instead of trying to retro-fit flush_*tlb_range() to take an mmu_gather
parameter, please replace them out-right.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-03-31 15:13   ` Peter Zijlstra
@ 2020-04-01  8:51     ` Zhenyu Ye
  2020-04-01 12:20       ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Zhenyu Ye @ 2020-04-01  8:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Hi Peter,

On 2020/3/31 23:13, Peter Zijlstra wrote:
> On Tue, Mar 31, 2020 at 10:29:23PM +0800, Zhenyu Ye wrote:
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index e2e2bef07dd2..32d4661e5a56 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -1160,10 +1160,10 @@ static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
>>   * invalidate the entire TLB which is not desitable.
>>   * e.g. see arch/arc: flush_pmd_tlb_range
>>   */
>> -#define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
>> +#define flush_pmd_tlb_range(tlb, vma, addr, end)	flush_tlb_range(vma, addr, end)
>>  #define flush_pud_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
>>  #else
>> -#define flush_pmd_tlb_range(vma, addr, end)	BUILD_BUG()
>> +#define flush_pmd_tlb_range(tlb, vma, addr, end)	BUILD_BUG()
>>  #define flush_pud_tlb_range(vma, addr, end)	BUILD_BUG()
>>  #endif
>>  #endif
>> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
>> index 3d7c01e76efc..96c9cf77bfb5 100644
>> --- a/mm/pgtable-generic.c
>> +++ b/mm/pgtable-generic.c
>> @@ -109,8 +109,14 @@ int pmdp_set_access_flags(struct vm_area_struct *vma,
>>  	int changed = !pmd_same(*pmdp, entry);
>>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>  	if (changed) {
>> +		struct mmu_gather tlb;
>> +		unsigned long tlb_start = address;
>> +		unsigned long tlb_end = address + HPAGE_PMD_SIZE;
>>  		set_pmd_at(vma->vm_mm, address, pmdp, entry);
>> -		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
>> +		tlb_gather_mmu(&tlb, vma->vm_mm, tlb_start, tlb_end);
>> +		tlb.cleared_pmds = 1;
>> +		flush_pmd_tlb_range(&tlb, vma, tlb_start, tlb_end);
>> +		tlb_finish_mmu(&tlb, tlb_start, tlb_end);
>>  	}
>>  	return changed;
>>  }
> 
> This is madness. Please, carefully consider what you just wrote and what
> it will do in the generic case.
> 
> Instead of trying to retro-fit flush_*tlb_range() to take an mmu_gather
> parameter, please replace them out-right.
> 

I'm sorry that I'm not sure what "replace them out-right" means.  Do you
mean that I should define flush_*_tlb_range like this?

#define flush_pmd_tlb_range(vma, addr, end)				\
	do {								\
		struct mmu_gather tlb;					\
		tlb_gather_mmu(&tlb, (vma)->vm_mm, addr, end);		\
		tlb.cleared_pmds = 1;					\
		flush_tlb_range(&tlb, vma, addr, end);			\
		tlb_finish_mmu(&tlb, addr, end);			\
	} while (0)


Thanks,
Zhenyu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-04-01  8:51     ` Zhenyu Ye
@ 2020-04-01 12:20       ` Peter Zijlstra
  2020-04-02 11:24         ` Zhenyu Ye
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-04-01 12:20 UTC (permalink / raw)
  To: Zhenyu Ye
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

On Wed, Apr 01, 2020 at 04:51:15PM +0800, Zhenyu Ye wrote:
> On 2020/3/31 23:13, Peter Zijlstra wrote:

> > Instead of trying to retro-fit flush_*tlb_range() to take an mmu_gather
> > parameter, please replace them out-right.
> > 
> 
> I'm sorry that I'm not sure what "replace them out-right" means.  Do you
> mean that I should define flush_*_tlb_range like this?
> 
> #define flush_pmd_tlb_range(vma, addr, end)				\
> 	do {								\
> 		struct mmu_gather tlb;					\
> 		tlb_gather_mmu(&tlb, (vma)->vm_mm, addr, end);		\
> 		tlb.cleared_pmds = 1;					\
> 		flush_tlb_range(&tlb, vma, addr, end);			\
> 		tlb_finish_mmu(&tlb, addr, end);			\
> 	} while (0)
> 

I was thinking to remove flush_*tlb_range() entirely (from generic
code).

And specifically to not use them like the above; instead extend the
mmu_gather API.

Specifically, if you wanted to express flush_pmd_tlb_range() in mmu
gather, you'd write it like:

static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
{
	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
	tlb_start_vma(&tlb, vma);
	tlb.cleared_pmds = 1;
	__tlb_adjust_range(&tlb, addr, end - addr);
	tlb_end_vma(&tlb, vma);
	tlb_finish_mmu(&tlb, addr, end);
}

Except of course, that the code between start_vma and end_vma is not a
proper mmu_gather API.

So maybe add:

  tlb_flush_{pte,pmd,pud,p4d}_range()

Then we can write:

static inline void flush_XXX_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
{
	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
	tlb_start_vma(&tlb, vma);
	tlb_flush_XXX_range(&tlb, addr, end - addr);
	tlb_end_vma(&tlb, vma);
	tlb_finish_mmu(&tlb, addr, end);
}

But when I look at the output of:

  git grep flush_.*tlb_range -- :^arch/

I doubt it makes sense to provide wrappers like the above.
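
For illustration, the bookkeeping that the proposed
tlb_flush_{pte,pmd,pud,p4d}_range() helpers perform can be modelled in plain
userspace C.  This is only a sketch: struct gather_model and the model_*
names are hypothetical, only the pte and pmd variants are shown, and the
range handling assumes the usual "grow to cover" behaviour of
__tlb_adjust_range() (start = min, end = max).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical, reduced model of struct mmu_gather. */
struct gather_model {
	unsigned long start;
	unsigned long end;
	unsigned int cleared_ptes : 1;
	unsigned int cleared_pmds : 1;
};

/* Empty range: start above any address, end below any address. */
static void model_reset(struct gather_model *tlb)
{
	tlb->start = ULONG_MAX;
	tlb->end = 0;
	tlb->cleared_ptes = 0;
	tlb->cleared_pmds = 0;
}

/* Grow the pending range so that it covers [address, address + size). */
static void model_adjust_range(struct gather_model *tlb,
			       unsigned long address, unsigned long size)
{
	if (address < tlb->start)
		tlb->start = address;
	if (address + size > tlb->end)
		tlb->end = address + size;
}

/* Record a pending pte-level invalidation; nothing is flushed here. */
static void model_flush_pte_range(struct gather_model *tlb,
				  unsigned long address, unsigned long size)
{
	model_adjust_range(tlb, address, size);
	tlb->cleared_ptes = 1;
}

/* Record a pending pmd-level invalidation; nothing is flushed here. */
static void model_flush_pmd_range(struct gather_model *tlb,
				  unsigned long address, unsigned long size)
{
	model_adjust_range(tlb, address, size);
	tlb->cleared_pmds = 1;
}

static void model_demo(void)
{
	struct gather_model tlb;

	model_reset(&tlb);
	model_flush_pmd_range(&tlb, 0x200000UL, 0x200000UL);
	assert(tlb.start == 0x200000UL && tlb.end == 0x400000UL);
	assert(tlb.cleared_pmds && !tlb.cleared_ptes);

	/* A second, lower range widens the window and sets another flag. */
	model_flush_pte_range(&tlb, 0x1000UL, 0x1000UL);
	assert(tlb.start == 0x1000UL && tlb.end == 0x400000UL);
	assert(tlb.cleared_pmds && tlb.cleared_ptes);
}
```

The helpers only accumulate state; the actual invalidation happens later,
when the gather is flushed.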

( Also, we should probably remove the (addr, end) arguments from
tlb_finish_mmu(), Will? )

---
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index f391f6b500b4..be5452a8efaa 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -511,6 +511,34 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 }
 #endif
 
+static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
+				       unsigned long address, unsigned long size)
+{
+	__tlb_adjust_range(tlb, address, size);
+	tlb->cleared_ptes = 1;
+}
+
+static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
+				       unsigned long address, unsigned long size)
+{
+	__tlb_adjust_range(tlb, address, size);
+	tlb->cleared_pmds = 1;
+}
+
+static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
+				       unsigned long address, unsigned long size)
+{
+	__tlb_adjust_range(tlb, address, size);
+	tlb->cleared_puds = 1;
+}
+
+static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
+				       unsigned long address, unsigned long size)
+{
+	__tlb_adjust_range(tlb, address, size);
+	tlb->cleared_p4ds = 1;
+}
+
 #ifndef __tlb_remove_tlb_entry
 #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 #endif
@@ -524,8 +552,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
  */
 #define tlb_remove_tlb_entry(tlb, ptep, address)		\
 	do {							\
-		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
-		tlb->cleared_ptes = 1;				\
+		tlb_flush_pte_range(tlb, address, PAGE_SIZE);	\
 		__tlb_remove_tlb_entry(tlb, ptep, address);	\
 	} while (0)
 
@@ -550,8 +577,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 
 #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address)			\
 	do {								\
-		__tlb_adjust_range(tlb, address, HPAGE_PMD_SIZE);	\
-		tlb->cleared_pmds = 1;					\
+		tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE);	\
 		__tlb_remove_pmd_tlb_entry(tlb, pmdp, address);		\
 	} while (0)
 
@@ -565,8 +591,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
 
 #define tlb_remove_pud_tlb_entry(tlb, pudp, address)			\
 	do {								\
-		__tlb_adjust_range(tlb, address, HPAGE_PUD_SIZE);	\
-		tlb->cleared_puds = 1;					\
+		tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE);	\
 		__tlb_remove_pud_tlb_entry(tlb, pudp, address);		\
 	} while (0)
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-04-01 12:20       ` Peter Zijlstra
@ 2020-04-02 11:24         ` Zhenyu Ye
  2020-04-02 16:38           ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Zhenyu Ye @ 2020-04-02 11:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Hi Peter,

On 2020/4/1 20:20, Peter Zijlstra wrote:
> On Wed, Apr 01, 2020 at 04:51:15PM +0800, Zhenyu Ye wrote:
>> On 2020/3/31 23:13, Peter Zijlstra wrote:
> 
>>> Instead of trying to retro-fit flush_*tlb_range() to take an mmu_gather
>>> parameter, please replace them out-right.
>>>
>>
>> I'm sorry that I'm not sure what "replace them out-right" means.  Do you
>> mean that I should define flush_*_tlb_range like this?
>>
>> #define flush_pmd_tlb_range(vma, addr, end)				\
>> 	do {								\
>> 		struct mmu_gather tlb;					\
>> 		tlb_gather_mmu(&tlb, (vma)->vm_mm, addr, end);		\
>> 		tlb.cleared_pmds = 1;					\
>> 		flush_tlb_range(&tlb, vma, addr, end);			\
>> 		tlb_finish_mmu(&tlb, addr, end);			\
>> 	} while (0)
>>
> 
> I was thinking to remove flush_*tlb_range() entirely (from generic
> code).
> 
> And specifically to not use them like the above; instead extend the
> mmu_gather API.
> 
> Specifically, if you wanted to express flush_pmd_tlb_range() in mmu
> gather, you'd write it like:
> 
> static inline void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
> {
> 	struct mmu_gather tlb;
> 
> 	tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
> 	tlb_start_vma(&tlb, vma);
> 	tlb.cleared_pmds = 1;
> 	__tlb_adjust_range(&tlb, addr, end - addr);
> 	tlb_end_vma(&tlb, vma);
> 	tlb_finish_mmu(&tlb, addr, end);
> }
> 
> Except of course, that the code between start_vma and end_vma is not a
> proper mmu_gather API.
> 
> So maybe add:
> 
>   tlb_flush_{pte,pmd,pud,p4d}_range()
> 
> Then we can write:
> 
> static inline void flush_XXX_tlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end)
> {
> 	struct mmu_gather tlb;
> 
> 	tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
> 	tlb_start_vma(&tlb, vma);
> 	tlb_flush_XXX_range(&tlb, addr, end - addr);
> 	tlb_end_vma(&tlb, vma);
> 	tlb_finish_mmu(&tlb, addr, end);
> }
> 
> But when I look at the output of:
> 
>   git grep flush_.*tlb_range -- :^arch/
> 
> I doubt it makes sense to provide wrappers like the above.
> 

Thanks for your detailed explanation.  I notice that you used
`tlb_end_vma` to replace `flush_tlb_range`; it will call `tlb_flush`,
which finally calls `flush_tlb_range` in generic code.  However, some
architectures define tlb_end_vma|tlb_flush|flush_tlb_range themselves,
so this may cause problems.

For example, s390 defines:

#define tlb_end_vma(tlb, vma)			do { } while (0)

And it doesn't define its own flush_pmd_tlb_range().  So there would be
a mistake if we changed flush_pmd_tlb_range() to use tlb_end_vma().

Is this really a problem, or have I misunderstood something?



If true, I think there are three ways to solve this problem:

1. Use `flush_tlb_range` rather than `tlb_end_vma` in flush_XXX_tlb_range.
   In this way, we still need to retro-fit `flush_tlb_range` to take an
   mmu_gather parameter.

2. Use `tlb_flush` rather than `tlb_end_vma`.
   There is a constraint like:

	#ifndef tlb_flush
	#if defined(tlb_start_vma) || defined(tlb_end_vma)
	#error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma()
	#endif

   So all architectures that define tlb_{start|end}_vma have defined tlb_flush.
   Also, we can add a constraint to flush_XXX_tlb_range like:

	#ifndef flush_XXX_tlb_range
	#if defined(tlb_start_vma) || defined(tlb_end_vma)
	#error Default flush_XXX_tlb_range() relies on default tlb_start/end_vma()
	#endif

3. Define flush_XXX_tlb_range() per architecture, and keep the original
   definition in generic code, such as:

In arm64:
	#define flush_XXX_tlb_range flush_XXX_tlb_range

In generic:
	#ifndef flush_XXX_tlb_range
	#define flush_XXX_tlb_range flush_tlb_range


Which do you think is more appropriate?


> ( Also, we should probably remove the (addr, end) arguments from
> tlb_finish_mmu(), Will? )
> 

This can be changed quickly. If you want I can do this with a
separate patch.

> ---
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index f391f6b500b4..be5452a8efaa 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -511,6 +511,34 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>  }
>  #endif
>  
> +static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
> +				       unsigned long address, unsigned long size)
> +{
> +	__tlb_adjust_range(tlb, address, size);
> +	tlb->cleared_ptes = 1;
> +}
> +
> +static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
> +				       unsigned long address, unsigned long size)
> +{
> +	__tlb_adjust_range(tlb, address, size);
> +	tlb->cleared_pmds = 1;
> +}
> +
> +static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
> +				       unsigned long address, unsigned long size)
> +{
> +	__tlb_adjust_range(tlb, address, size);
> +	tlb->cleared_puds = 1;
> +}
> +
> +static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
> +				       unsigned long address, unsigned long size)
> +{
> +	__tlb_adjust_range(tlb, address, size);
> +	tlb->cleared_p4ds = 1;
> +}
> +

By the way, I think the name tlb_set_XXX_range() is more suitable, because
we don't do an actual flush there.

>  #ifndef __tlb_remove_tlb_entry
>  #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
>  #endif
> @@ -524,8 +552,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>   */
>  #define tlb_remove_tlb_entry(tlb, ptep, address)		\
>  	do {							\
> -		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
> -		tlb->cleared_ptes = 1;				\
> +		tlb_flush_pte_range(tlb, address, PAGE_SIZE);	\
>  		__tlb_remove_tlb_entry(tlb, ptep, address);	\
>  	} while (0)
>  
> @@ -550,8 +577,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>  
>  #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address)			\
>  	do {								\
> -		__tlb_adjust_range(tlb, address, HPAGE_PMD_SIZE);	\
> -		tlb->cleared_pmds = 1;					\
> +		tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE);	\
>  		__tlb_remove_pmd_tlb_entry(tlb, pmdp, address);		\
>  	} while (0)
>  
> @@ -565,8 +591,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>  
>  #define tlb_remove_pud_tlb_entry(tlb, pudp, address)			\
>  	do {								\
> -		__tlb_adjust_range(tlb, address, HPAGE_PUD_SIZE);	\
> -		tlb->cleared_puds = 1;					\
> +		tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE);	\
>  		__tlb_remove_pud_tlb_entry(tlb, pudp, address);		\
>  	} while (0)
>  
> 
> .
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-04-02 11:24         ` Zhenyu Ye
@ 2020-04-02 16:38           ` Peter Zijlstra
  2020-04-03  5:14             ` Zhenyu Ye
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-04-02 16:38 UTC (permalink / raw)
  To: Zhenyu Ye
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

On Thu, Apr 02, 2020 at 07:24:04PM +0800, Zhenyu Ye wrote:
> Thanks for your detailed explanation.  I notice that you used
> `tlb_end_vma` to replace `flush_tlb_range`; it will call `tlb_flush`,
> which finally calls `flush_tlb_range` in generic code.  However, some
> architectures define tlb_end_vma|tlb_flush|flush_tlb_range themselves,
> so this may cause problems.
> 
> For example, s390 defines:
> 
> #define tlb_end_vma(tlb, vma)			do { } while (0)
> 
> And it doesn't define its own flush_pmd_tlb_range().  So there would be
> a mistake if we changed flush_pmd_tlb_range() to use tlb_end_vma().
> 
> Is this really a problem, or have I misunderstood something?

If tlb_end_vma() is a no-op, then tlb_finish_mmu() will do:
tlb_flush_mmu() -> tlb_flush_mmu_tlbonly() -> tlb_flush()

And s390 has tlb_flush().

If tlb_end_vma() is not a no-op and it calls tlb_flush_mmu_tlbonly(),
then tlb_finish_mmu()'s invocation of tlb_flush_mmu_tlbonly() will
terminate early due to no flags set.

IOW, it should all just work.
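
A toy userspace model of the two cases described above, for illustration
only.  The struct gather_model type and the model_* functions are
hypothetical; a single pending flag stands in for all of the cleared_ptes,
cleared_pmds, cleared_puds, cleared_p4ds and freed_tables bits, and it is
assumed that a flush resets that state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of struct mmu_gather. */
struct gather_model {
	bool pending;		/* something was cleared since the last flush */
	bool end_vma_is_noop;	/* models an s390-style empty tlb_end_vma() */
	int flush_count;	/* number of times the arch flush ran */
};

/* Models tlb_flush_mmu_tlbonly(): bail out early when no flags are set. */
static void model_flush_tlbonly(struct gather_model *tlb)
{
	if (!tlb->pending)
		return;
	tlb->flush_count++;	/* stands in for the arch tlb_flush() */
	tlb->pending = false;	/* flushing resets the gathered state */
}

/* Models tlb_end_vma(): either a no-op or a tlbonly flush. */
static void model_end_vma(struct gather_model *tlb)
{
	if (tlb->end_vma_is_noop)
		return;
	model_flush_tlbonly(tlb);
}

/* Models the flush performed on behalf of tlb_finish_mmu(). */
static void model_finish(struct gather_model *tlb)
{
	model_flush_tlbonly(tlb);
}

/* Runs one "flush a pmd range" sequence and reports how often we flushed. */
static int model_flush_pmd_once(bool end_vma_is_noop)
{
	struct gather_model tlb = { .end_vma_is_noop = end_vma_is_noop };

	tlb.pending = true;	/* as if tlb_flush_pmd_range() had been called */
	model_end_vma(&tlb);
	model_finish(&tlb);
	return tlb.flush_count;
}
```

In both configurations the model flushes exactly once: either in the end_vma
step, with the later call terminating early, or only in the finish step.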


FYI the whole tlb_{start,end}_vma() thing is only needed when the
architecture doesn't implement tlb_flush() and instead defaults to using
flush_tlb_range(), at which point we need to provide a 'fake' vma.

At the time I audited all architectures and they only look at VM_EXEC
(to do $I invalidation) and VM_HUGETLB (for pmd level invalidations),
but I forgot which architectures those were.

But that is all legacy code; eventually we'll get all archs a native
tlb_flush() and this can go away.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-04-02 16:38           ` Peter Zijlstra
@ 2020-04-03  5:14             ` Zhenyu Ye
  2020-04-08  9:00               ` Zhenyu Ye
  0 siblings, 1 reply; 16+ messages in thread
From: Zhenyu Ye @ 2020-04-03  5:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Hi Peter,

On 2020/4/3 0:38, Peter Zijlstra wrote:
> On Thu, Apr 02, 2020 at 07:24:04PM +0800, Zhenyu Ye wrote:
>> Thanks for your detailed explanation.  I notice that you used
>> `tlb_end_vma` to replace `flush_tlb_range`; it will call `tlb_flush`,
>> which finally calls `flush_tlb_range` in generic code.  However, some
>> architectures define tlb_end_vma|tlb_flush|flush_tlb_range themselves,
>> so this may cause problems.
>>
>> For example, s390 defines:
>>
>> #define tlb_end_vma(tlb, vma)			do { } while (0)
>>
>> And it doesn't define its own flush_pmd_tlb_range().  So there would be
>> a mistake if we changed flush_pmd_tlb_range() to use tlb_end_vma().
>>
>> Is this really a problem, or have I misunderstood something?
> 
> If tlb_end_vma() is a no-op, then tlb_finish_mmu() will do:
> tlb_flush_mmu() -> tlb_flush_mmu_tlbonly() -> tlb_flush()
> 
> And s390 has tlb_flush().
> 
> If tlb_end_vma() is not a no-op and it calls tlb_flush_mmu_tlbonly(),
> then tlb_finish_mmu()'s invocation of tlb_flush_mmu_tlbonly() will
> terminate early due to no flags set.
> 
> IOW, it should all just work.
> 
> 
> FYI the whole tlb_{start,end}_vma() thing is only needed when the
> architecture doesn't implement tlb_flush() and instead defaults to using
> flush_tlb_range(), at which point we need to provide a 'fake' vma.
> 
> At the time I audited all architectures and they only look at VM_EXEC
> (to do $I invalidation) and VM_HUGETLB (for pmd level invalidations),
> but I forgot which architectures those were.

Many architectures, such as alpha, arc, arm and so on.
I really understand why you hate making vma->vm_flags more important for
tlbi :).

> But that is all legacy code; eventually we'll get all archs a native
> tlb_flush() and this can go away.
> 

Thanks for your reply.  Currently, to enable the TTL feature, extending
the flush_*tlb_range() may be more convenient.
I will send a formal PATCH soon.

Thanks,
Zhenyu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range
  2020-04-03  5:14             ` Zhenyu Ye
@ 2020-04-08  9:00               ` Zhenyu Ye
  0 siblings, 0 replies; 16+ messages in thread
From: Zhenyu Ye @ 2020-04-08  9:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, will, catalin.marinas, aneesh.kumar, akpm, npiggin,
	arnd, rostedt, maz, suzuki.poulose, tglx, yuzhao, Dave.Martin,
	steven.price, broonie, guohanjun, corbet, vgupta, tony.luck,
	linux-arm-kernel, linux-kernel, linux-arch, linux-mm, arm,
	xiexiangyou, prime.zeng, zhangshaokun, kuhn.chenqun

Hi Peter,

On 2020/4/3 13:14, Zhenyu Ye wrote:
> Hi Peter,
> 
> On 2020/4/3 0:38, Peter Zijlstra wrote:
>> On Thu, Apr 02, 2020 at 07:24:04PM +0800, Zhenyu Ye wrote:
>>> Thanks for your detailed explanation.  I notice that you used
>>> `tlb_end_vma` to replace `flush_tlb_range`; it will call `tlb_flush`,
>>> which finally calls `flush_tlb_range` in generic code.  However, some
>>> architectures define tlb_end_vma|tlb_flush|flush_tlb_range themselves,
>>> so this may cause problems.
>>>
>>> For example, s390 defines:
>>>
>>> #define tlb_end_vma(tlb, vma)			do { } while (0)
>>>
>>> And it doesn't define its own flush_pmd_tlb_range().  So there would be
>>> a mistake if we changed flush_pmd_tlb_range() to use tlb_end_vma().
>>>
>>> Is this really a problem, or have I misunderstood something?
>>
>> If tlb_end_vma() is a no-op, then tlb_finish_mmu() will do:
>> tlb_flush_mmu() -> tlb_flush_mmu_tlbonly() -> tlb_flush()
>>
>> And s390 has tlb_flush().
>>
>> If tlb_end_vma() is not a no-op and it calls tlb_flush_mmu_tlbonly(),
>> then tlb_finish_mmu()'s invocation of tlb_flush_mmu_tlbonly() will
>> terminate early due to no flags set.
>>
>> IOW, it should all just work.
>>
>>
>> FYI the whole tlb_{start,end}_vma() thing is only needed when the
>> architecture doesn't implement tlb_flush() and instead defaults to using
>> flush_tlb_range(), at which point we need to provide a 'fake' vma.
>>
>> At the time I audited all architectures and they only look at VM_EXEC
>> (to do $I invalidation) and VM_HUGETLB (for pmd level invalidations),
>> but I forgot which architectures those were.
> 
> Many architectures, such as alpha, arc, arm and so on.
> I really understand why you hate making vma->vm_flags more important for
> tlbi :).
> 
>> But that is all legacy code; eventually we'll get all archs a native
>> tlb_flush() and this can go away.
>>
> 
> Thanks for your reply.  Currently, to enable the TTL feature, extending
> the flush_*tlb_range() may be more convenient.
> I will send a formal PATCH soon.
> 
> Thanks,
> Zhenyu
> 

I sent [PATCH v1] a few days ago [1].  Do you have time to review
my changes?  Are those changes appropriate?

I look forward to your suggestions.

[1] https://lore.kernel.org/linux-arm-kernel/20200403090048.938-1-yezhenyu2@huawei.com/

Thanks,
Zhenyu


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-04-08  9:01 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2020-03-31 14:29 [RFC PATCH v5 0/8] arm64: tlb: add support for TTL feature Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 1/8] arm64: Detect the ARMv8.4 " Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 2/8] arm64: Add level-hinted TLB invalidation helper Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 3/8] arm64: Add tlbi_user_level " Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 4/8] mm: tlb: Pass struct mmu_gather to flush_pmd_tlb_range Zhenyu Ye
2020-03-31 15:13   ` Peter Zijlstra
2020-04-01  8:51     ` Zhenyu Ye
2020-04-01 12:20       ` Peter Zijlstra
2020-04-02 11:24         ` Zhenyu Ye
2020-04-02 16:38           ` Peter Zijlstra
2020-04-03  5:14             ` Zhenyu Ye
2020-04-08  9:00               ` Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 5/8] mm: tlb: Pass struct mmu_gather to flush_pud_tlb_range Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 6/8] mm: tlb: Pass struct mmu_gather to flush_hugetlb_tlb_range Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 7/8] mm: tlb: Pass struct mmu_gather to flush_tlb_range Zhenyu Ye
2020-03-31 14:29 ` [RFC PATCH v5 8/8] arm64: tlb: Set the TTL field in flush_tlb_range Zhenyu Ye
