Linux-mm Archive on lore.kernel.org
* [PATCH v6 00/18] generic mmu_gather patches
@ 2019-02-19 10:31 Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 01/18] asm-generic/tlb: Provide a comment Peter Zijlstra
                   ` (17 more replies)
  0 siblings, 18 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Hi all,

Sorry I haven't posted these in a while; I sort of forgot about them for a bit.

Not much has changed since last time: one change to the ARM patch as suggested
by Will, a fresh changelog for patch 12 as requested by Vineet, and some
trivial rebasing of the s390 bits.

They've sat in my queue.git for a while and 0-day hasn't reported anything
funny with them.

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git mm/tlb

I'm thinking this is about ready to go.

---
 arch/Kconfig                      |   8 +-
 arch/alpha/Kconfig                |   1 +
 arch/alpha/include/asm/tlb.h      |   6 -
 arch/arc/include/asm/tlb.h        |  32 -----
 arch/arm/include/asm/tlb.h        | 255 ++-------------------------------
 arch/arm64/Kconfig                |   1 -
 arch/arm64/include/asm/tlb.h      |   1 +
 arch/c6x/Kconfig                  |   1 +
 arch/c6x/include/asm/tlb.h        |   2 -
 arch/h8300/include/asm/tlb.h      |   2 -
 arch/hexagon/include/asm/tlb.h    |  12 --
 arch/ia64/include/asm/tlb.h       | 257 +---------------------------------
 arch/ia64/include/asm/tlbflush.h  |  25 ++++
 arch/ia64/mm/tlb.c                |  23 ++-
 arch/m68k/Kconfig                 |   1 +
 arch/m68k/include/asm/tlb.h       |  14 --
 arch/microblaze/Kconfig           |   1 +
 arch/microblaze/include/asm/tlb.h |   9 --
 arch/mips/include/asm/tlb.h       |  17 ---
 arch/nds32/include/asm/tlb.h      |  16 ---
 arch/nios2/Kconfig                |   1 +
 arch/nios2/include/asm/tlb.h      |  14 +-
 arch/openrisc/Kconfig             |   1 +
 arch/openrisc/include/asm/tlb.h   |   8 +-
 arch/parisc/include/asm/tlb.h     |  18 ---
 arch/powerpc/Kconfig              |   2 +
 arch/powerpc/include/asm/tlb.h    |  18 +--
 arch/riscv/include/asm/tlb.h      |   1 +
 arch/s390/Kconfig                 |   2 +
 arch/s390/include/asm/tlb.h       | 130 ++++++-----------
 arch/s390/mm/pgalloc.c            |  63 +--------
 arch/sh/include/asm/pgalloc.h     |   9 ++
 arch/sh/include/asm/tlb.h         | 132 +----------------
 arch/sparc/Kconfig                |   1 +
 arch/sparc/include/asm/tlb_32.h   |  18 ---
 arch/um/include/asm/tlb.h         | 158 +--------------------
 arch/unicore32/Kconfig            |   1 +
 arch/unicore32/include/asm/tlb.h  |   7 +-
 arch/x86/Kconfig                  |   1 -
 arch/x86/include/asm/tlb.h        |   1 +
 arch/xtensa/include/asm/tlb.h     |  26 ----
 include/asm-generic/tlb.h         | 288 ++++++++++++++++++++++++++++++++++----
 mm/huge_memory.c                  |   4 +-
 mm/hugetlb.c                      |   2 +-
 mm/madvise.c                      |   2 +-
 mm/memory.c                       |   6 +-
 mm/mmu_gather.c                   | 129 +++++++++--------
 47 files changed, 477 insertions(+), 1250 deletions(-)


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 01/18] asm-generic/tlb: Provide a comment
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 02/18] asm-generic/tlb: Provide HAVE_MMU_GATHER_PAGE_SIZE Peter Zijlstra
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Write a comment explaining some of this.

Cc: Nick Piggin <npiggin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |  119 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 116 insertions(+), 3 deletions(-)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -22,6 +22,118 @@
 
 #ifdef CONFIG_MMU
 
+/*
+ * Generic MMU-gather implementation.
+ *
+ * The mmu_gather data structure is used by the mm code to implement the
+ * correct and efficient ordering of freeing pages and TLB invalidations.
+ *
+ * This correct ordering is:
+ *
+ *  1) unhook page
+ *  2) TLB invalidate page
+ *  3) free page
+ *
+ * That is, we must never free a page before we have ensured there are no live
+ * translations left to it. Otherwise it might be possible to observe (or
+ * worse, change) the page content after it has been reused.
+ *
+ * The mmu_gather API consists of:
+ *
+ *  - tlb_gather_mmu() / tlb_finish_mmu(); start and finish a mmu_gather
+ *
+ *    Finish in particular will issue a (final) TLB invalidate and free
+ *    all (remaining) queued pages.
+ *
+ *  - tlb_start_vma() / tlb_end_vma(); marks the start / end of a VMA
+ *
+ *    Defaults to flushing at tlb_end_vma() to reset the range; helps when
+ *    there are large holes between the VMAs.
+ *
+ *  - tlb_remove_page() / __tlb_remove_page()
+ *  - tlb_remove_page_size() / __tlb_remove_page_size()
+ *
+ *    __tlb_remove_page_size() is the basic primitive that queues a page for
+ *    freeing. __tlb_remove_page() assumes PAGE_SIZE. Both will return a
+ *    boolean indicating if the queue is (now) full and a call to
+ *    tlb_flush_mmu() is required.
+ *
+ *    tlb_remove_page() and tlb_remove_page_size() imply the call to
+ *    tlb_flush_mmu() when required and have no return value.
+ *
+ *  - tlb_remove_check_page_size_change()
+ *
+ *    call before __tlb_remove_page*() to set the current page-size; implies a
+ *    possible tlb_flush_mmu() call.
+ *
+ *  - tlb_flush_mmu() / tlb_flush_mmu_tlbonly() / tlb_flush_mmu_free()
+ *
+ *    tlb_flush_mmu_tlbonly() - does the TLB invalidate (and resets
+ *                              related state, like the range)
+ *
+ *    tlb_flush_mmu_free() - frees the queued pages; make absolutely
+ *			     sure no additional tlb_remove_page()
+ *			     calls happen between _tlbonly() and this.
+ *
+ *    tlb_flush_mmu() - the above two calls.
+ *
+ *  - mmu_gather::fullmm
+ *
+ *    A flag set by tlb_gather_mmu() to indicate we're going to free
+ *    the entire mm; this allows a number of optimizations.
+ *
+ *    - We can ignore tlb_{start,end}_vma(); because we don't
+ *      care about ranges. Everything will be shot down.
+ *
+ *    - (RISC) architectures that use ASIDs can cycle to a new ASID
+ *      and delay the invalidation until ASID space runs out.
+ *
+ *  - mmu_gather::need_flush_all
+ *
+ *    A flag that can be set by the arch code if it wants to force
+ *    flush the entire TLB irrespective of the range. For instance
+ *    x86-PAE needs this when changing top-level entries.
+ *
+ * And requires the architecture to provide and implement tlb_flush().
+ *
+ * tlb_flush() may, in addition to the above mentioned mmu_gather fields, make
+ * use of:
+ *
+ *  - mmu_gather::start / mmu_gather::end
+ *
+ *    which provides the range that needs to be flushed to cover the pages to
+ *    be freed.
+ *
+ *  - mmu_gather::freed_tables
+ *
+ *    set when we freed page table pages
+ *
+ *  - tlb_get_unmap_shift() / tlb_get_unmap_size()
+ *
+ *    returns the smallest TLB entry size unmapped in this range
+ *
+ * Additionally there are a few opt-in features:
+ *
+ *  HAVE_RCU_TABLE_FREE
+ *
+ *  This provides tlb_remove_table(), to be used instead of tlb_remove_page()
+ *  for page directories (__p*_free_tlb()). This provides separate freeing of
+ *  the page-table pages themselves in a semi-RCU fashion (see comment below).
+ *  Useful if your architecture doesn't use IPIs for remote TLB invalidates
+ *  and therefore doesn't naturally serialize with software page-table walkers.
+ *
+ *  When used, an architecture is expected to provide __tlb_remove_table()
+ *  which does the actual freeing of these pages.
+ *
+ *  HAVE_RCU_TABLE_INVALIDATE
+ *
+ *  This makes HAVE_RCU_TABLE_FREE call tlb_flush_mmu_tlbonly() before freeing
+ *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
+ *  architecture uses the Linux page-tables natively.
+ *
+ */
+#define HAVE_GENERIC_MMU_GATHER
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 /*
  * Semi RCU freeing of the page directories.
@@ -89,14 +201,17 @@ struct mmu_gather_batch {
  */
 #define MAX_GATHER_BATCH_COUNT	(10000UL/MAX_GATHER_BATCH)
 
-/* struct mmu_gather is an opaque type used by the mm code for passing around
+/*
+ * struct mmu_gather is an opaque type used by the mm code for passing around
  * any data needed by arch specific code for tlb_remove_page.
  */
 struct mmu_gather {
 	struct mm_struct	*mm;
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	struct mmu_table_batch	*batch;
 #endif
+
 	unsigned long		start;
 	unsigned long		end;
 	/*
@@ -131,8 +246,6 @@ struct mmu_gather {
 	int page_size;
 };
 
-#define HAVE_GENERIC_MMU_GATHER
-
 void arch_tlb_gather_mmu(struct mmu_gather *tlb,
 	struct mm_struct *mm, unsigned long start, unsigned long end);
 void tlb_flush_mmu(struct mmu_gather *tlb);
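
For orientation, the ordering documented above corresponds to the usage
pattern below -- an illustrative sketch, not part of the patch; the
signatures are those current as of this series:

	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, mm, start, end);	/* start a gather		*/
	tlb_start_vma(&tlb, vma);		/* per-VMA: cache maintenance	*/
	/* 1) unhook pages: clear PTEs, queueing pages via __tlb_remove_page() */
	tlb_end_vma(&tlb, vma);			/* 2) TLB invalidate the range	*/
	tlb_finish_mmu(&tlb, start, end);	/* final flush, 3) free pages	*/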



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 02/18] asm-generic/tlb: Provide HAVE_MMU_GATHER_PAGE_SIZE
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 01/18] asm-generic/tlb: Provide a comment Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 03/18] asm-generic/tlb: Provide generic VIPT cache flush Peter Zijlstra
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Move the mmu_gather::page_size things into the generic code instead of
powerpc specific bits.

Cc: Nick Piggin <npiggin@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/Kconfig                   |    3 +++
 arch/arm/include/asm/tlb.h     |    3 +--
 arch/ia64/include/asm/tlb.h    |    3 +--
 arch/powerpc/Kconfig           |    1 +
 arch/powerpc/include/asm/tlb.h |   17 -----------------
 arch/s390/include/asm/tlb.h    |    4 +---
 arch/sh/include/asm/tlb.h      |    4 +---
 arch/um/include/asm/tlb.h      |    4 +---
 include/asm-generic/tlb.h      |   32 +++++++++++++++++++-------------
 mm/huge_memory.c               |    4 ++--
 mm/hugetlb.c                   |    2 +-
 mm/madvise.c                   |    2 +-
 mm/memory.c                    |    4 ++--
 mm/mmu_gather.c                |    5 +++++
 14 files changed, 39 insertions(+), 49 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -368,6 +368,9 @@ config HAVE_RCU_TABLE_FREE
 config HAVE_RCU_TABLE_INVALIDATE
 	bool
 
+config HAVE_MMU_GATHER_PAGE_SIZE
+	bool
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -286,8 +286,7 @@ tlb_remove_pmd_tlb_entry(struct mmu_gath
 
 #define tlb_migrate_finish(mm)		do { } while (0)
 
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
+static inline void tlb_change_page_size(struct mmu_gather *tlb,
 						     unsigned int page_size)
 {
 }
--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -282,8 +282,7 @@ do {							\
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	tlb_remove_tlb_entry(tlb, ptep, address)
 
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
+static inline void tlb_change_page_size(struct mmu_gather *tlb,
 						     unsigned int page_size)
 {
 }
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -216,6 +216,7 @@ config PPC
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE		if SMP
+	select HAVE_MMU_GATHER_PAGE_SIZE
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RELIABLE_STACKTRACE		if PPC64 && CPU_LITTLE_ENDIAN
 	select HAVE_SYSCALL_TRACEPOINTS
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -27,7 +27,6 @@
 #define tlb_start_vma(tlb, vma)	do { } while (0)
 #define tlb_end_vma(tlb, vma)	do { } while (0)
 #define __tlb_remove_tlb_entry	__tlb_remove_tlb_entry
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
 
 extern void tlb_flush(struct mmu_gather *tlb);
 
@@ -46,22 +45,6 @@ static inline void __tlb_remove_tlb_entr
 #endif
 }
 
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
-						     unsigned int page_size)
-{
-	if (!tlb->page_size)
-		tlb->page_size = page_size;
-	else if (tlb->page_size != page_size) {
-		if (!tlb->fullmm)
-			tlb_flush_mmu(tlb);
-		/*
-		 * update the page size after flush for the new
-		 * mmu_gather.
-		 */
-		tlb->page_size = page_size;
-	}
-}
-
 #ifdef CONFIG_SMP
 static inline int mm_is_core_local(struct mm_struct *mm)
 {
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -180,9 +180,7 @@ static inline void pud_free_tlb(struct m
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	tlb_remove_tlb_entry(tlb, ptep, address)
 
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
-						     unsigned int page_size)
+static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
 {
 }
 
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -127,9 +127,7 @@ static inline void tlb_remove_page_size(
 	return tlb_remove_page(tlb, page);
 }
 
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
-						     unsigned int page_size)
+static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
 {
 }
 
--- a/arch/um/include/asm/tlb.h
+++ b/arch/um/include/asm/tlb.h
@@ -146,9 +146,7 @@ static inline void tlb_remove_page_size(
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	tlb_remove_tlb_entry(tlb, ptep, address)
 
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
-						     unsigned int page_size)
+static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
 {
 }
 
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -61,7 +61,7 @@
  *    tlb_remove_page() and tlb_remove_page_size() imply the call to
 *    tlb_flush_mmu() when required and have no return value.
  *
- *  - tlb_remove_check_page_size_change()
+ *  - tlb_change_page_size()
  *
  *    call before __tlb_remove_page*() to set the current page-size; implies a
  *    possible tlb_flush_mmu() call.
@@ -114,6 +114,11 @@
  *
  * Additionally there are a few opt-in features:
  *
+ *  HAVE_MMU_GATHER_PAGE_SIZE
+ *
+ *  This ensures we call tlb_flush() every time tlb_change_page_size() actually
+ *  changes the size and provides mmu_gather::page_size to tlb_flush().
+ *
  *  HAVE_RCU_TABLE_FREE
  *
  *  This provides tlb_remove_table(), to be used instead of tlb_remove_page()
@@ -239,11 +244,15 @@ struct mmu_gather {
 	unsigned int		cleared_puds : 1;
 	unsigned int		cleared_p4ds : 1;
 
+	unsigned int		batch_count;
+
 	struct mmu_gather_batch *active;
 	struct mmu_gather_batch	local;
 	struct page		*__pages[MMU_GATHER_BUNDLE];
-	unsigned int		batch_count;
-	int page_size;
+
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+	unsigned int page_size;
+#endif
 };
 
 void arch_tlb_gather_mmu(struct mmu_gather *tlb,
@@ -309,21 +318,18 @@ static inline void tlb_remove_page(struc
 	return tlb_remove_page_size(tlb, page, PAGE_SIZE);
 }
 
-#ifndef tlb_remove_check_page_size_change
-#define tlb_remove_check_page_size_change tlb_remove_check_page_size_change
-static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
+static inline void tlb_change_page_size(struct mmu_gather *tlb,
 						     unsigned int page_size)
 {
-	/*
-	 * We don't care about page size change, just update
-	 * mmu_gather page size here so that debug checks
-	 * doesn't throw false warning.
-	 */
-#ifdef CONFIG_DEBUG_VM
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+	if (tlb->page_size && tlb->page_size != page_size) {
+		if (!tlb->fullmm)
+			tlb_flush_mmu(tlb);
+	}
+
 	tlb->page_size = page_size;
 #endif
 }
-#endif
 
 static inline unsigned long tlb_get_unmap_shift(struct mmu_gather *tlb)
 {
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1617,7 +1617,7 @@ bool madvise_free_huge_pmd(struct mmu_ga
 	struct mm_struct *mm = tlb->mm;
 	bool ret = false;
 
-	tlb_remove_check_page_size_change(tlb, HPAGE_PMD_SIZE);
+	tlb_change_page_size(tlb, HPAGE_PMD_SIZE);
 
 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
@@ -1693,7 +1693,7 @@ int zap_huge_pmd(struct mmu_gather *tlb,
 	pmd_t orig_pmd;
 	spinlock_t *ptl;
 
-	tlb_remove_check_page_size_change(tlb, HPAGE_PMD_SIZE);
+	tlb_change_page_size(tlb, HPAGE_PMD_SIZE);
 
 	ptl = __pmd_trans_huge_lock(pmd, vma);
 	if (!ptl)
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3337,7 +3337,7 @@ void __unmap_hugepage_range(struct mmu_g
 	 * This is a hugetlb vma, all the pte entries should point
 	 * to huge page.
 	 */
-	tlb_remove_check_page_size_change(tlb, sz);
+	tlb_change_page_size(tlb, sz);
 	tlb_start_vma(tlb, vma);
 
 	/*
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -328,7 +328,7 @@ static int madvise_free_pte_range(pmd_t
 	if (pmd_trans_unstable(pmd))
 		return 0;
 
-	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
+	tlb_change_page_size(tlb, PAGE_SIZE);
 	orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -355,7 +355,7 @@ void free_pgd_range(struct mmu_gather *t
 	 * We add page table cache pages with PAGE_SIZE,
 	 * (see pte_free_tlb()), flush the tlb if we need
 	 */
-	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
+	tlb_change_page_size(tlb, PAGE_SIZE);
 	pgd = pgd_offset(tlb->mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -1046,7 +1046,7 @@ static unsigned long zap_pte_range(struc
 	pte_t *pte;
 	swp_entry_t entry;
 
-	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
+	tlb_change_page_size(tlb, PAGE_SIZE);
 again:
 	init_rss_vec(rss);
 	start_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -58,7 +58,9 @@ void arch_tlb_gather_mmu(struct mmu_gath
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
 #endif
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
 	tlb->page_size = 0;
+#endif
 
 	__tlb_reset_range(tlb);
 }
@@ -121,7 +123,10 @@ bool __tlb_remove_page_size(struct mmu_g
 	struct mmu_gather_batch *batch;
 
 	VM_BUG_ON(!tlb->end);
+
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
 	VM_WARN_ON(tlb->page_size != page_size);
+#endif
 
 	batch = tlb->active;
 	/*
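
With this in place, an architecture that cares about the invalidation
granule opts in via Kconfig and can consult mmu_gather::page_size from its
tlb_flush(). A sketch under assumptions -- the arch and its flush
primitive are invented for illustration:

	# arch/foo/Kconfig
	select HAVE_MMU_GATHER_PAGE_SIZE

	/* arch/foo/include/asm/tlb.h -- foo_flush_tlb_range() is hypothetical */
	static inline void tlb_flush(struct mmu_gather *tlb)
	{
		/*
		 * page_size is stable here: tlb_change_page_size() forces a
		 * tlb_flush_mmu() whenever the size actually changes.
		 */
		foo_flush_tlb_range(tlb->mm, tlb->start, tlb->end, tlb->page_size);
	}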



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 03/18] asm-generic/tlb: Provide generic VIPT cache flush
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 01/18] asm-generic/tlb: Provide a comment Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 02/18] asm-generic/tlb: Provide HAVE_MMU_GATHER_PAGE_SIZE Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 04/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_range() Peter Zijlstra
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, David Miller, Guan Xuetao

The one obvious thing SH and ARM want is a sensible default for
tlb_start_vma(). (also: https://lkml.org/lkml/2004/1/15/6)

Rather than have every VIPT architecture provide its own tlb_start_vma()
implementation, rely on architectures to provide a no-op
flush_cache_range() when it is not relevant.

The below makes tlb_start_vma() default to flush_cache_range(), which
should be right and sufficient. The only exceptions that I found where
(oddly):

  - m68k-mmu
  - sparc64
  - unicore

Those architectures appear to have flush_cache_range(), but their
current tlb_start_vma() does not call it.

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arc/include/asm/tlb.h      |    9 ---------
 arch/mips/include/asm/tlb.h     |    9 ---------
 arch/nds32/include/asm/tlb.h    |    6 ------
 arch/nios2/include/asm/tlb.h    |   10 ----------
 arch/parisc/include/asm/tlb.h   |    5 -----
 arch/sparc/include/asm/tlb_32.h |    5 -----
 arch/xtensa/include/asm/tlb.h   |    9 ---------
 include/asm-generic/tlb.h       |   19 +++++++++++--------
 8 files changed, 11 insertions(+), 61 deletions(-)

--- a/arch/arc/include/asm/tlb.h
+++ b/arch/arc/include/asm/tlb.h
@@ -23,15 +23,6 @@ do {						\
  *
  * Note, read http://lkml.org/lkml/2004/1/15/6
  */
-#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING
-#define tlb_start_vma(tlb, vma)
-#else
-#define tlb_start_vma(tlb, vma)						\
-do {									\
-	if (!tlb->fullmm)						\
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
-} while(0)
-#endif
 
 #define tlb_end_vma(tlb, vma)						\
 do {									\
--- a/arch/mips/include/asm/tlb.h
+++ b/arch/mips/include/asm/tlb.h
@@ -5,15 +5,6 @@
 #include <asm/cpu-features.h>
 #include <asm/mipsregs.h>
 
-/*
- * MIPS doesn't need any special per-pte or per-vma handling, except
- * we need to flush cache for area to be unmapped.
- */
-#define tlb_start_vma(tlb, vma)					\
-	do {							\
-		if (!tlb->fullmm)				\
-			flush_cache_range(vma, vma->vm_start, vma->vm_end); \
-	}  while (0)
 #define tlb_end_vma(tlb, vma) do { } while (0)
 #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 
--- a/arch/nds32/include/asm/tlb.h
+++ b/arch/nds32/include/asm/tlb.h
@@ -4,12 +4,6 @@
 #ifndef __ASMNDS32_TLB_H
 #define __ASMNDS32_TLB_H
 
-#define tlb_start_vma(tlb,vma)						\
-	do {								\
-		if (!tlb->fullmm)					\
-			flush_cache_range(vma, vma->vm_start, vma->vm_end); \
-	} while (0)
-
 #define tlb_end_vma(tlb,vma)				\
 	do { 						\
 		if(!tlb->fullmm)			\
--- a/arch/nios2/include/asm/tlb.h
+++ b/arch/nios2/include/asm/tlb.h
@@ -15,16 +15,6 @@
 
 extern void set_mmu_pid(unsigned long pid);
 
-/*
- * NiosII doesn't need any special per-pte or per-vma handling, except
- * we need to flush cache for the area to be unmapped.
- */
-#define tlb_start_vma(tlb, vma)					\
-	do {							\
-		if (!tlb->fullmm)				\
-			flush_cache_range(vma, vma->vm_start, vma->vm_end); \
-	}  while (0)
-
 #define tlb_end_vma(tlb, vma)	do { } while (0)
 #define __tlb_remove_tlb_entry(tlb, ptep, address)	do { } while (0)
 
--- a/arch/parisc/include/asm/tlb.h
+++ b/arch/parisc/include/asm/tlb.h
@@ -7,11 +7,6 @@ do {	if ((tlb)->fullmm)		\
 		flush_tlb_mm((tlb)->mm);\
 } while (0)
 
-#define tlb_start_vma(tlb, vma) \
-do {	if (!(tlb)->fullmm)	\
-		flush_cache_range(vma, vma->vm_start, vma->vm_end); \
-} while (0)
-
 #define tlb_end_vma(tlb, vma)	\
 do {	if (!(tlb)->fullmm)	\
 		flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
--- a/arch/sparc/include/asm/tlb_32.h
+++ b/arch/sparc/include/asm/tlb_32.h
@@ -2,11 +2,6 @@
 #ifndef _SPARC_TLB_H
 #define _SPARC_TLB_H
 
-#define tlb_start_vma(tlb, vma) \
-do {								\
-	flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
-} while (0)
-
 #define tlb_end_vma(tlb, vma) \
 do {								\
 	flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
--- a/arch/xtensa/include/asm/tlb.h
+++ b/arch/xtensa/include/asm/tlb.h
@@ -16,19 +16,10 @@
 
 #if (DCACHE_WAY_SIZE <= PAGE_SIZE)
 
-/* Note, read http://lkml.org/lkml/2004/1/15/6 */
-
-# define tlb_start_vma(tlb,vma)			do { } while (0)
 # define tlb_end_vma(tlb,vma)			do { } while (0)
 
 #else
 
-# define tlb_start_vma(tlb, vma)					      \
-	do {								      \
-		if (!tlb->fullmm)					      \
-			flush_cache_range(vma, vma->vm_start, vma->vm_end);   \
-	} while(0)
-
 # define tlb_end_vma(tlb, vma)						      \
 	do {								      \
 		if (!tlb->fullmm)					      \
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -19,6 +19,7 @@
 #include <linux/swap.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
+#include <asm/cacheflush.h>
 
 #ifdef CONFIG_MMU
 
@@ -351,17 +352,19 @@ static inline unsigned long tlb_get_unma
  * the vmas are adjusted to only cover the region to be torn down.
  */
 #ifndef tlb_start_vma
-#define tlb_start_vma(tlb, vma) do { } while (0)
+#define tlb_start_vma(tlb, vma)						\
+do {									\
+	if (!tlb->fullmm)						\
+		flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
+} while (0)
 #endif
 
-#define __tlb_end_vma(tlb, vma)					\
-	do {							\
-		if (!tlb->fullmm)				\
-			tlb_flush_mmu_tlbonly(tlb);		\
-	} while (0)
-
 #ifndef tlb_end_vma
-#define tlb_end_vma	__tlb_end_vma
+#define tlb_end_vma(tlb, vma)						\
+do {									\
+	if (!tlb->fullmm)						\
+		tlb_flush_mmu_tlbonly(tlb);				\
+} while (0)
 #endif
 
 #ifndef __tlb_remove_tlb_entry
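
Architectures where cache maintenance is irrelevant satisfy the new
default cheaply: the generic fallback header already provides
(approximately) a no-op, shown here for context only:

	/* include/asm-generic/cacheflush.h (pre-existing) */
	#define flush_cache_range(vma, start, end)	do { } while (0)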



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 04/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_range()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (2 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 03/18] asm-generic/tlb: Provide generic VIPT cache flush Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() Peter Zijlstra
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Provide a generic tlb_flush() implementation that relies on
flush_tlb_range(). This is a little awkward because flush_tlb_range()
assumes a VMA for range invalidation, but we no longer have one.

An audit of all flush_tlb_range() implementations shows only vma->vm_mm
and vma->vm_flags are used, and of the latter only VM_EXEC (I-TLB
invalidates) and VM_HUGETLB (large TLB invalidate) are used.

Therefore, track VM_EXEC and VM_HUGETLB in two more bits, and create a
'fake' VMA.

This allows architectures that have a reasonably efficient
flush_tlb_range() to not require any additional effort.

Cc: Nick Piggin <npiggin@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm64/include/asm/tlb.h   |    1 
 arch/powerpc/include/asm/tlb.h |    1 
 arch/riscv/include/asm/tlb.h   |    1 
 arch/x86/include/asm/tlb.h     |    1 
 include/asm-generic/tlb.h      |   95 +++++++++++++++++++++++++++++++++++------
 5 files changed, 87 insertions(+), 12 deletions(-)

--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -27,6 +27,7 @@ static inline void __tlb_remove_table(vo
 	free_page_and_swap_cache((struct page *)_table);
 }
 
+#define tlb_flush tlb_flush
 static void tlb_flush(struct mmu_gather *tlb);
 
 #include <asm-generic/tlb.h>
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -28,6 +28,7 @@
 #define tlb_end_vma(tlb, vma)	do { } while (0)
 #define __tlb_remove_tlb_entry	__tlb_remove_tlb_entry
 
+#define tlb_flush tlb_flush
 extern void tlb_flush(struct mmu_gather *tlb);
 
 /* Get the generic bits... */
--- a/arch/riscv/include/asm/tlb.h
+++ b/arch/riscv/include/asm/tlb.h
@@ -18,6 +18,7 @@ struct mmu_gather;
 
 static void tlb_flush(struct mmu_gather *tlb);
 
+#define tlb_flush tlb_flush
 #include <asm-generic/tlb.h>
 
 static inline void tlb_flush(struct mmu_gather *tlb)
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -6,6 +6,7 @@
 #define tlb_end_vma(tlb, vma) do { } while (0)
 #define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 
+#define tlb_flush tlb_flush
 static inline void tlb_flush(struct mmu_gather *tlb);
 
 #include <asm-generic/tlb.h>
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -95,7 +95,7 @@
  *    flush the entire TLB irrespective of the range. For instance
  *    x86-PAE needs this when changing top-level entries.
  *
- * And requires the architecture to provide and implement tlb_flush().
+ * And allows the architecture to provide and implement tlb_flush():
  *
  * tlb_flush() may, in addition to the above mentioned mmu_gather fields, make
  * use of:
@@ -111,7 +111,10 @@
  *
  *  - tlb_get_unmap_shift() / tlb_get_unmap_size()
  *
- *    returns the smallest TLB entry size unmapped in this range
+ *    returns the smallest TLB entry size unmapped in this range.
+ *
+ * If an architecture does not provide tlb_flush() a default implementation
+ * based on flush_tlb_range() will be used.
  *
  * Additionally there are a few opt-in features:
  *
@@ -245,6 +248,12 @@ struct mmu_gather {
 	unsigned int		cleared_puds : 1;
 	unsigned int		cleared_p4ds : 1;
 
+	/*
+	 * tracks VM_EXEC | VM_HUGETLB in tlb_start_vma
+	 */
+	unsigned int		vma_exec : 1;
+	unsigned int		vma_huge : 1;
+
 	unsigned int		batch_count;
 
 	struct mmu_gather_batch *active;
@@ -286,8 +295,59 @@ static inline void __tlb_reset_range(str
 	tlb->cleared_pmds = 0;
 	tlb->cleared_puds = 0;
 	tlb->cleared_p4ds = 0;
+	/*
+	 * Do not reset mmu_gather::vma_* fields here, we do not
+	 * call into tlb_start_vma() again to set them if there is an
+	 * intermediate flush.
+	 */
 }
 
+#ifndef tlb_flush
+
+#if defined(tlb_start_vma) || defined(tlb_end_vma)
+#error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma()
+#endif
+
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+	if (tlb->fullmm || tlb->need_flush_all) {
+		flush_tlb_mm(tlb->mm);
+	} else if (tlb->end) {
+		struct vm_area_struct vma = {
+			.vm_mm = tlb->mm,
+			.vm_flags = (tlb->vma_exec ? VM_EXEC    : 0) |
+				    (tlb->vma_huge ? VM_HUGETLB : 0),
+		};
+
+		flush_tlb_range(&vma, tlb->start, tlb->end);
+	}
+}
+
+static inline void
+tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	/*
+	 * flush_tlb_range() implementations that look at VM_HUGETLB (tile,
+	 * mips-4k) flush only large pages.
+	 *
+	 * flush_tlb_range() implementations that flush I-TLB also flush D-TLB
+	 * (tile, xtensa, arm), so it's ok to just add VM_EXEC to an existing
+	 * range.
+	 *
+	 * We rely on tlb_end_vma() to issue a flush, such that when we reset
+	 * these values the batch is empty.
+	 */
+	tlb->vma_huge = !!(vma->vm_flags & VM_HUGETLB);
+	tlb->vma_exec = !!(vma->vm_flags & VM_EXEC);
+}
+
+#else
+
+static inline void
+tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { }
+
+#endif
+
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 {
 	if (!tlb->end)
@@ -357,19 +417,30 @@ static inline unsigned long tlb_get_unma
  * the vmas are adjusted to only cover the region to be torn down.
  */
 #ifndef tlb_start_vma
-#define tlb_start_vma(tlb, vma)						\
-do {									\
-	if (!tlb->fullmm)						\
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
-} while (0)
+static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (tlb->fullmm)
+		return;
+
+	tlb_update_vma_flags(tlb, vma);
+	flush_cache_range(vma, vma->vm_start, vma->vm_end);
+}
 #endif
 
 #ifndef tlb_end_vma
-#define tlb_end_vma(tlb, vma)						\
-do {									\
-	if (!tlb->fullmm)						\
-		tlb_flush_mmu_tlbonly(tlb);				\
-} while (0)
+static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (tlb->fullmm)
+		return;
+
+	/*
+	 * Do a TLB flush and reset the range at VMA boundaries; this avoids
+	 * the ranges growing with the unused space between consecutive VMAs,
+	 * but also the mmu_gather::vma_* flags from tlb_start_vma() rely on
+	 * this.
+	 */
+	tlb_flush_mmu_tlbonly(tlb);
+}
 #endif
 
 #ifndef __tlb_remove_tlb_entry
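
To make the audit claim concrete, a typical flush_tlb_range() has roughly
the shape below -- a hypothetical implementation, shown only to illustrate
that nothing beyond vma->vm_mm and vma->vm_flags is consumed; the
foo_invalidate_*() primitives are invented:

	void flush_tlb_range(struct vm_area_struct *vma,
			     unsigned long start, unsigned long end)
	{
		struct mm_struct *mm = vma->vm_mm;	/* only ->vm_mm ...	*/

		foo_invalidate_dtlb(mm, start, end);
		if (vma->vm_flags & VM_EXEC)		/* ... and ->vm_flags	*/
			foo_invalidate_itlb(mm, start, end);
	}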



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (3 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 04/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_range() Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
  2019-02-19 10:31 ` [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish() Peter Zijlstra
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

When an architecture does not have (an efficient) flush_tlb_range(),
but instead always uses full TLB invalidates, the current generic
tlb_flush() is sub-optimal, for it will generate extra flushes in
order to keep the range small.

But if we cannot do range flushes, that is a moot concern. Optionally
provide this simplified default.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |   41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -114,7 +114,8 @@
  *    returns the smallest TLB entry size unmapped in this range.
  *
  * If an architecture does not provide tlb_flush() a default implementation
- * based on flush_tlb_range() will be used.
+ * based on flush_tlb_range() will be used, unless MMU_GATHER_NO_RANGE is
+ * specified, in which case we'll default to flush_tlb_mm().
  *
  * Additionally there are a few opt-in features:
  *
@@ -140,6 +141,9 @@
  *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
  *  architecture uses the Linux page-tables natively.
  *
+ *  MMU_GATHER_NO_RANGE
+ *
+ *  Use this if your architecture lacks an efficient flush_tlb_range().
  */
 #define HAVE_GENERIC_MMU_GATHER
 
@@ -302,12 +306,45 @@ static inline void __tlb_reset_range(str
 	 */
 }
 
+#ifdef CONFIG_MMU_GATHER_NO_RANGE
+
+#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma)
+#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma()
+#endif
+
+/*
+ * When an architecture does not have efficient means of range flushing TLBs
+ * there is no point in doing intermediate flushes on tlb_end_vma() to keep the
+ * range small. We equally don't have to worry about page granularity or other
+ * things.
+ *
+ * All we need to do is issue a full flush for any !0 range.
+ */
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+	if (tlb->end)
+		flush_tlb_mm(tlb->mm);
+}
+
+static inline void
+tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { }
+
+#define tlb_end_vma tlb_end_vma
+static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { }
+
+#else /* CONFIG_MMU_GATHER_NO_RANGE */
+
 #ifndef tlb_flush
 
 #if defined(tlb_start_vma) || defined(tlb_end_vma)
 #error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma()
 #endif
 
+/*
+ * When an architecture does not provide its own tlb_flush() implementation
+ * but does have a reasonably efficient flush_tlb_range() implementation
+ * use that.
+ */
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
 	if (tlb->fullmm || tlb->need_flush_all) {
@@ -348,6 +385,8 @@ tlb_update_vma_flags(struct mmu_gather *
 
 #endif
 
+#endif /* CONFIG_MMU_GATHER_NO_RANGE */
+
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 {
 	if (!tlb->end)
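
Subsequent patches in this series opt architectures in from their Kconfig;
the shape is simply (hypothetical arch shown for illustration):

	# arch/foo/Kconfig
	config FOO
		def_bool y
		select MMU_GATHER_NO_RANGE if MMU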



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (4 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
  2019-02-19 10:31 ` [PATCH v6 07/18] asm-generic/tlb: Invert HAVE_RCU_TABLE_INVALIDATE Peter Zijlstra
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Needed for ia64 -- alternatively we drop the entire hook.

Cc: Will Deacon <will.deacon@arm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -539,6 +539,8 @@ static inline void tlb_end_vma(struct mm
 
 #endif /* CONFIG_MMU */
 
+#ifndef tlb_migrate_finish
 #define tlb_migrate_finish(mm) do {} while (0)
+#endif
 
 #endif /* _ASM_GENERIC__TLB_H */
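
For reference, ia64 is the user of this escape hatch; patch 09 of this
series keeps its platform hook by defining the macro before including the
generic header:

	/* arch/ia64/include/asm/tlb.h */
	#define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)

	#include <asm-generic/tlb.h>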



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 07/18] asm-generic/tlb: Invert HAVE_RCU_TABLE_INVALIDATE
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (5 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish() Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 08/18] arm/tlb: Convert to generic mmu_gather Peter Zijlstra
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Make issuing a TLB invalidate for page-table pages the normal case.

The reason is twofold:

 - too many invalidates is safer than too few,
 - most architectures use the linux page-tables natively
   and would thus require this.

Make it an opt-out, instead of an opt-in.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/Kconfig              |    2 +-
 arch/arm64/Kconfig        |    1 -
 arch/powerpc/Kconfig      |    1 +
 arch/sparc/Kconfig        |    1 +
 arch/x86/Kconfig          |    1 -
 include/asm-generic/tlb.h |    9 +++++----
 mm/mmu_gather.c           |    2 +-
 7 files changed, 9 insertions(+), 8 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -372,7 +372,7 @@ config HAVE_ARCH_JUMP_LABEL_RELATIVE
 config HAVE_RCU_TABLE_FREE
 	bool
 
-config HAVE_RCU_TABLE_INVALIDATE
+config HAVE_RCU_TABLE_NO_INVALIDATE
 	bool
 
 config HAVE_MMU_GATHER_PAGE_SIZE
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -147,7 +147,6 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
-	select HAVE_RCU_TABLE_INVALIDATE
 	select HAVE_RSEQ
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -218,6 +218,7 @@ config PPC
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE		if SMP
+	select HAVE_RCU_TABLE_NO_INVALIDATE	if HAVE_RCU_TABLE_FREE
 	select HAVE_MMU_GATHER_PAGE_SIZE
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RELIABLE_STACKTRACE		if PPC64 && CPU_LITTLE_ENDIAN
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -62,6 +62,7 @@ config SPARC64
 	select HAVE_KRETPROBES
 	select HAVE_KPROBES
 	select HAVE_RCU_TABLE_FREE if SMP
+	select HAVE_RCU_TABLE_NO_INVALIDATE if HAVE_RCU_TABLE_FREE
 	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_DYNAMIC_FTRACE
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -183,7 +183,6 @@ config X86
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE		if PARAVIRT
-	select HAVE_RCU_TABLE_INVALIDATE	if HAVE_RCU_TABLE_FREE
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RELIABLE_STACKTRACE		if X86_64 && (UNWINDER_FRAME_POINTER || UNWINDER_ORC) && STACK_VALIDATION
 	select HAVE_FUNCTION_ARG_ACCESS_API
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -135,11 +135,12 @@
  *  When used, an architecture is expected to provide __tlb_remove_table()
  *  which does the actual freeing of these pages.
  *
- *  HAVE_RCU_TABLE_INVALIDATE
+ *  HAVE_RCU_TABLE_NO_INVALIDATE
  *
- *  This makes HAVE_RCU_TABLE_FREE call tlb_flush_mmu_tlbonly() before freeing
- *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
- *  architecture uses the Linux page-tables natively.
+ *  This makes HAVE_RCU_TABLE_FREE avoid calling tlb_flush_mmu_tlbonly() before
+ *  freeing the page-table pages. Only select this if you use
+ *  HAVE_RCU_TABLE_FREE and your architecture does _NOT_ use the Linux
+ *  page-tables natively.
  *
  *  MMU_GATHER_NO_RANGE
  *
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -157,7 +157,7 @@ bool __tlb_remove_page_size(struct mmu_g
  */
 static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 {
-#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
+#ifndef CONFIG_HAVE_RCU_TABLE_NO_INVALIDATE
 	/*
 	 * Invalidate page-table caches used by hardware walkers. Then we still
 	 * need to RCU-sched wait while freeing the pages because software
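
The net effect is the sketch below, assuming the body of
tlb_table_invalidate() is otherwise unchanged:

	static inline void tlb_table_invalidate(struct mmu_gather *tlb)
	{
	#ifndef CONFIG_HAVE_RCU_TABLE_NO_INVALIDATE
		/* default: invalidate before freeing page-table pages */
		tlb_flush_mmu_tlbonly(tlb);
	#endif
	}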



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 08/18] arm/tlb: Convert to generic mmu_gather
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (6 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 07/18] asm-generic/tlb: Invert HAVE_RCU_TABLE_INVALIDATE Peter Zijlstra
@ 2019-02-19 10:31 ` Peter Zijlstra
  2019-02-19 10:31 ` [PATCH v6 09/18] ia64/tlb: Convert " Peter Zijlstra
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Generic mmu_gather provides everything that ARM needs:

 - range tracking
 - RCU table free
 - VM_EXEC tracking
 - VIPT cache flushing

The one notable curiosity is the 'funny' range tracking for classical
ARM in __pte_free_tlb().

Cc: Nick Piggin <npiggin@gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Russell King <linux@armlinux.org.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm/include/asm/tlb.h |  254 ++-------------------------------------------
 1 file changed, 13 insertions(+), 241 deletions(-)

--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -33,270 +33,42 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 
-#define MMU_GATHER_BUNDLE	8
-
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
 static inline void __tlb_remove_table(void *_table)
 {
 	free_page_and_swap_cache((struct page *)_table);
 }
 
-struct mmu_table_batch {
-	struct rcu_head		rcu;
-	unsigned int		nr;
-	void			*tables[0];
-};
-
-#define MAX_TABLE_BATCH		\
-	((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
-
-extern void tlb_table_flush(struct mmu_gather *tlb);
-extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
-
-#define tlb_remove_entry(tlb, entry)	tlb_remove_table(tlb, entry)
-#else
-#define tlb_remove_entry(tlb, entry)	tlb_remove_page(tlb, entry)
-#endif /* CONFIG_HAVE_RCU_TABLE_FREE */
-
-/*
- * TLB handling.  This allows us to remove pages from the page
- * tables, and efficiently handle the TLB issues.
- */
-struct mmu_gather {
-	struct mm_struct	*mm;
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
-	struct mmu_table_batch	*batch;
-	unsigned int		need_flush;
-#endif
-	unsigned int		fullmm;
-	struct vm_area_struct	*vma;
-	unsigned long		start, end;
-	unsigned long		range_start;
-	unsigned long		range_end;
-	unsigned int		nr;
-	unsigned int		max;
-	struct page		**pages;
-	struct page		*local[MMU_GATHER_BUNDLE];
-};
-
-DECLARE_PER_CPU(struct mmu_gather, mmu_gathers);
-
-/*
- * This is unnecessarily complex.  There's three ways the TLB shootdown
- * code is used:
- *  1. Unmapping a range of vmas.  See zap_page_range(), unmap_region().
- *     tlb->fullmm = 0, and tlb_start_vma/tlb_end_vma will be called.
- *     tlb->vma will be non-NULL.
- *  2. Unmapping all vmas.  See exit_mmap().
- *     tlb->fullmm = 1, and tlb_start_vma/tlb_end_vma will be called.
- *     tlb->vma will be non-NULL.  Additionally, page tables will be freed.
- *  3. Unmapping argument pages.  See shift_arg_pages().
- *     tlb->fullmm = 0, but tlb_start_vma/tlb_end_vma will not be called.
- *     tlb->vma will be NULL.
- */
-static inline void tlb_flush(struct mmu_gather *tlb)
-{
-	if (tlb->fullmm || !tlb->vma)
-		flush_tlb_mm(tlb->mm);
-	else if (tlb->range_end > 0) {
-		flush_tlb_range(tlb->vma, tlb->range_start, tlb->range_end);
-		tlb->range_start = TASK_SIZE;
-		tlb->range_end = 0;
-	}
-}
-
-static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
-{
-	if (!tlb->fullmm) {
-		if (addr < tlb->range_start)
-			tlb->range_start = addr;
-		if (addr + PAGE_SIZE > tlb->range_end)
-			tlb->range_end = addr + PAGE_SIZE;
-	}
-}
-
-static inline void __tlb_alloc_page(struct mmu_gather *tlb)
-{
-	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
-
-	if (addr) {
-		tlb->pages = (void *)addr;
-		tlb->max = PAGE_SIZE / sizeof(struct page *);
-	}
-}
-
-static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
-{
-	tlb_flush(tlb);
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
-	tlb_table_flush(tlb);
-#endif
-}
-
-static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-	free_pages_and_swap_cache(tlb->pages, tlb->nr);
-	tlb->nr = 0;
-	if (tlb->pages == tlb->local)
-		__tlb_alloc_page(tlb);
-}
-
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	tlb_flush_mmu_tlbonly(tlb);
-	tlb_flush_mmu_free(tlb);
-}
-
-static inline void
-arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-			unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-	tlb->fullmm = !(start | (end+1));
-	tlb->start = start;
-	tlb->end = end;
-	tlb->vma = NULL;
-	tlb->max = ARRAY_SIZE(tlb->local);
-	tlb->pages = tlb->local;
-	tlb->nr = 0;
-	__tlb_alloc_page(tlb);
+#include <asm-generic/tlb.h>
 
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
-	tlb->batch = NULL;
+#ifndef CONFIG_HAVE_RCU_TABLE_FREE
+#define tlb_remove_table(tlb, entry) tlb_remove_page(tlb, entry)
 #endif
-}
-
-static inline void
-arch_tlb_finish_mmu(struct mmu_gather *tlb,
-			unsigned long start, unsigned long end, bool force)
-{
-	if (force) {
-		tlb->range_start = start;
-		tlb->range_end = end;
-	}
-
-	tlb_flush_mmu(tlb);
 
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-
-	if (tlb->pages != tlb->local)
-		free_pages((unsigned long)tlb->pages, 0);
-}
-
-/*
- * Memorize the range for the TLB flush.
- */
 static inline void
-tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long addr)
-{
-	tlb_add_flush(tlb, addr);
-}
-
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	tlb_remove_tlb_entry(tlb, ptep, address)
-/*
- * In the case of tlb vma handling, we can optimise these away in the
- * case where we're doing a full MM flush.  When we're doing a munmap,
- * the vmas are adjusted to only cover the region to be torn down.
- */
-static inline void
-tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm) {
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);
-		tlb->vma = vma;
-		tlb->range_start = TASK_SIZE;
-		tlb->range_end = 0;
-	}
-}
-
-static inline void
-tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm)
-		tlb_flush(tlb);
-}
-
-static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	tlb->pages[tlb->nr++] = page;
-	VM_WARN_ON(tlb->nr > tlb->max);
-	if (tlb->nr == tlb->max)
-		return true;
-	return false;
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	if (__tlb_remove_page(tlb, page))
-		tlb_flush_mmu(tlb);
-}
-
-static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-					  struct page *page, int page_size)
-{
-	return __tlb_remove_page(tlb, page);
-}
-
-static inline void tlb_remove_page_size(struct mmu_gather *tlb,
-					struct page *page, int page_size)
-{
-	return tlb_remove_page(tlb, page);
-}
-
-static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
-	unsigned long addr)
+__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
 {
 	pgtable_page_dtor(pte);
 
-#ifdef CONFIG_ARM_LPAE
-	tlb_add_flush(tlb, addr);
-#else
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * With the classic ARM MMU, a pte page has two corresponding pmd
 	 * entries, each covering 1MB.
 	 */
-	addr &= PMD_MASK;
-	tlb_add_flush(tlb, addr + SZ_1M - PAGE_SIZE);
-	tlb_add_flush(tlb, addr + SZ_1M);
+	addr = (addr & PMD_MASK) + SZ_1M;
+	__tlb_adjust_range(tlb, addr - PAGE_SIZE, 2 * PAGE_SIZE);
 #endif
 
-	tlb_remove_entry(tlb, pte);
-}
-
-static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
-				  unsigned long addr)
-{
-#ifdef CONFIG_ARM_LPAE
-	tlb_add_flush(tlb, addr);
-	tlb_remove_entry(tlb, virt_to_page(pmdp));
-#endif
+	tlb_remove_table(tlb, pte);
 }
 
 static inline void
-tlb_remove_pmd_tlb_entry(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
+__pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
 {
-	tlb_add_flush(tlb, addr);
-}
-
-#define pte_free_tlb(tlb, ptep, addr)	__pte_free_tlb(tlb, ptep, addr)
-#define pmd_free_tlb(tlb, pmdp, addr)	__pmd_free_tlb(tlb, pmdp, addr)
-#define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
-
-#define tlb_migrate_finish(mm)		do { } while (0)
-
-static inline void tlb_change_page_size(struct mmu_gather *tlb,
-						     unsigned int page_size)
-{
-}
-
-static inline void tlb_flush_remove_tables(struct mm_struct *mm)
-{
-}
+#ifdef CONFIG_ARM_LPAE
+	struct page *page = virt_to_page(pmdp);
 
-static inline void tlb_flush_remove_tables_local(void *arg)
-{
+	tlb_remove_table(tlb, page);
+#endif
 }
 
 #endif /* CONFIG_MMU */
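
A worked example of the classical-ARM range adjustment in __pte_free_tlb()
above -- not in the patch; it assumes 4K pages and a 2M pmd span, so one
pte page backs two 1M hardware entries:

	/*
	 *   addr            = 0x00340000  (anywhere in the 2M pmd region)
	 *   addr & PMD_MASK = 0x00200000  (base of the region)
	 *   + SZ_1M         = 0x00300000  (boundary between the two 1M entries)
	 *
	 * __tlb_adjust_range(tlb, 0x00300000 - PAGE_SIZE, 2 * PAGE_SIZE)
	 * covers one page either side of that boundary, so the eventual
	 * flush hits both hardware entries mapping this pte page.
	 */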



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 09/18] ia64/tlb: Convert to generic mmu_gather
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (7 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 08/18] arm/tlb: Convert to generic mmu_gather Peter Zijlstra
@ 2019-02-19 10:31 ` " Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
  2019-02-21  2:52   ` Souptick Joarder
  2019-02-19 10:31 ` [PATCH v6 10/18] sh/tlb: Convert SH " Peter Zijlstra
                   ` (8 subsequent siblings)
  17 siblings, 2 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, Tony Luck

Generic mmu_gather provides everything ia64 needs (range tracking).

Cc: Will Deacon <will.deacon@arm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/ia64/include/asm/tlb.h      |  256 ---------------------------------------
 arch/ia64/include/asm/tlbflush.h |   25 +++
 arch/ia64/mm/tlb.c               |   23 +++
 3 files changed, 47 insertions(+), 257 deletions(-)

--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -47,262 +47,8 @@
 #include <asm/tlbflush.h>
 #include <asm/machvec.h>
 
-/*
- * If we can't allocate a page to make a big batch of page pointers
- * to work on, then just handle a few from the on-stack structure.
- */
-#define	IA64_GATHER_BUNDLE	8
-
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		nr;
-	unsigned int		max;
-	unsigned char		fullmm;		/* non-zero means full mm flush */
-	unsigned char		need_flush;	/* really unmapped some PTEs? */
-	unsigned long		start, end;
-	unsigned long		start_addr;
-	unsigned long		end_addr;
-	struct page		**pages;
-	struct page		*local[IA64_GATHER_BUNDLE];
-};
-
-struct ia64_tr_entry {
-	u64 ifa;
-	u64 itir;
-	u64 pte;
-	u64 rr;
-}; /*Record for tr entry!*/
-
-extern int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size);
-extern void ia64_ptr_entry(u64 target_mask, int slot);
-
-extern struct ia64_tr_entry *ia64_idtrs[NR_CPUS];
-
-/*
- region register macros
-*/
-#define RR_TO_VE(val)   (((val) >> 0) & 0x0000000000000001)
-#define RR_VE(val)	(((val) & 0x0000000000000001) << 0)
-#define RR_VE_MASK	0x0000000000000001L
-#define RR_VE_SHIFT	0
-#define RR_TO_PS(val)	(((val) >> 2) & 0x000000000000003f)
-#define RR_PS(val)	(((val) & 0x000000000000003f) << 2)
-#define RR_PS_MASK	0x00000000000000fcL
-#define RR_PS_SHIFT	2
-#define RR_RID_MASK	0x00000000ffffff00L
-#define RR_TO_RID(val) 	((val >> 8) & 0xffffff)
-
-static inline void
-ia64_tlb_flush_mmu_tlbonly(struct mmu_gather *tlb, unsigned long start, unsigned long end)
-{
-	tlb->need_flush = 0;
-
-	if (tlb->fullmm) {
-		/*
-		 * Tearing down the entire address space.  This happens both as a result
-		 * of exit() and execve().  The latter case necessitates the call to
-		 * flush_tlb_mm() here.
-		 */
-		flush_tlb_mm(tlb->mm);
-	} else if (unlikely (end - start >= 1024*1024*1024*1024UL
-			     || REGION_NUMBER(start) != REGION_NUMBER(end - 1)))
-	{
-		/*
-		 * If we flush more than a tera-byte or across regions, we're probably
-		 * better off just flushing the entire TLB(s).  This should be very rare
-		 * and is not worth optimizing for.
-		 */
-		flush_tlb_all();
-	} else {
-		/*
-		 * flush_tlb_range() takes a vma instead of a mm pointer because
-		 * some architectures want the vm_flags for ITLB/DTLB flush.
-		 */
-		struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
-
-		/* flush the address range from the tlb: */
-		flush_tlb_range(&vma, start, end);
-		/* now flush the virt. page-table area mapping the address range: */
-		flush_tlb_range(&vma, ia64_thash(start), ia64_thash(end));
-	}
-
-}
-
-static inline void
-ia64_tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-	unsigned long i;
-	unsigned int nr;
-
-	/* lastly, release the freed pages */
-	nr = tlb->nr;
-
-	tlb->nr = 0;
-	tlb->start_addr = ~0UL;
-	for (i = 0; i < nr; ++i)
-		free_page_and_swap_cache(tlb->pages[i]);
-}
-
-/*
- * Flush the TLB for address range START to END and, if not in fast mode, release the
- * freed pages that where gathered up to this point.
- */
-static inline void
-ia64_tlb_flush_mmu (struct mmu_gather *tlb, unsigned long start, unsigned long end)
-{
-	if (!tlb->need_flush)
-		return;
-	ia64_tlb_flush_mmu_tlbonly(tlb, start, end);
-	ia64_tlb_flush_mmu_free(tlb);
-}
-
-static inline void __tlb_alloc_page(struct mmu_gather *tlb)
-{
-	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
-
-	if (addr) {
-		tlb->pages = (void *)addr;
-		tlb->max = PAGE_SIZE / sizeof(void *);
-	}
-}
-
-
-static inline void
-arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-			unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-	tlb->max = ARRAY_SIZE(tlb->local);
-	tlb->pages = tlb->local;
-	tlb->nr = 0;
-	tlb->fullmm = !(start | (end+1));
-	tlb->start = start;
-	tlb->end = end;
-	tlb->start_addr = ~0UL;
-}
-
-/*
- * Called at the end of the shootdown operation to free up any resources that were
- * collected.
- */
-static inline void
-arch_tlb_finish_mmu(struct mmu_gather *tlb,
-			unsigned long start, unsigned long end, bool force)
-{
-	if (force)
-		tlb->need_flush = 1;
-	/*
-	 * Note: tlb->nr may be 0 at this point, so we can't rely on tlb->start_addr and
-	 * tlb->end_addr.
-	 */
-	ia64_tlb_flush_mmu(tlb, start, end);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-
-	if (tlb->pages != tlb->local)
-		free_pages((unsigned long)tlb->pages, 0);
-}
-
-/*
- * Logically, this routine frees PAGE.  On MP machines, the actual freeing of the page
- * must be delayed until after the TLB has been flushed (see comments at the beginning of
- * this file).
- */
-static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	tlb->need_flush = 1;
-
-	if (!tlb->nr && tlb->pages == tlb->local)
-		__tlb_alloc_page(tlb);
-
-	tlb->pages[tlb->nr++] = page;
-	VM_WARN_ON(tlb->nr > tlb->max);
-	if (tlb->nr == tlb->max)
-		return true;
-	return false;
-}
-
-static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
-{
-	ia64_tlb_flush_mmu_tlbonly(tlb, tlb->start_addr, tlb->end_addr);
-}
-
-static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-	ia64_tlb_flush_mmu_free(tlb);
-}
-
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	ia64_tlb_flush_mmu(tlb, tlb->start_addr, tlb->end_addr);
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	if (__tlb_remove_page(tlb, page))
-		tlb_flush_mmu(tlb);
-}
-
-static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-					  struct page *page, int page_size)
-{
-	return __tlb_remove_page(tlb, page);
-}
-
-static inline void tlb_remove_page_size(struct mmu_gather *tlb,
-					struct page *page, int page_size)
-{
-	return tlb_remove_page(tlb, page);
-}
-
-/*
- * Remove TLB entry for PTE mapped at virtual address ADDRESS.  This is called for any
- * PTE, not just those pointing to (normal) physical memory.
- */
-static inline void
-__tlb_remove_tlb_entry (struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
-{
-	if (tlb->start_addr == ~0UL)
-		tlb->start_addr = address;
-	tlb->end_addr = address + PAGE_SIZE;
-}
-
 #define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)
 
-#define tlb_start_vma(tlb, vma)			do { } while (0)
-#define tlb_end_vma(tlb, vma)			do { } while (0)
-
-#define tlb_remove_tlb_entry(tlb, ptep, addr)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__tlb_remove_tlb_entry(tlb, ptep, addr);	\
-} while (0)
-
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	tlb_remove_tlb_entry(tlb, ptep, address)
-
-static inline void tlb_change_page_size(struct mmu_gather *tlb,
-						     unsigned int page_size)
-{
-}
-
-#define pte_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pte_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pmd_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pmd_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pud_free_tlb(tlb, pudp, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pud_free_tlb(tlb, pudp, address);		\
-} while (0)
+#include <asm-generic/tlb.h>
 
 #endif /* _ASM_IA64_TLB_H */
--- a/arch/ia64/include/asm/tlbflush.h
+++ b/arch/ia64/include/asm/tlbflush.h
@@ -14,6 +14,31 @@
 #include <asm/mmu_context.h>
 #include <asm/page.h>
 
+struct ia64_tr_entry {
+	u64 ifa;
+	u64 itir;
+	u64 pte;
+	u64 rr;
+}; /* record for TR entry */
+
+extern int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size);
+extern void ia64_ptr_entry(u64 target_mask, int slot);
+extern struct ia64_tr_entry *ia64_idtrs[NR_CPUS];
+
+/*
+ * region register macros
+ */
+#define RR_TO_VE(val)   (((val) >> 0) & 0x0000000000000001)
+#define RR_VE(val)     (((val) & 0x0000000000000001) << 0)
+#define RR_VE_MASK     0x0000000000000001L
+#define RR_VE_SHIFT    0
+#define RR_TO_PS(val)  (((val) >> 2) & 0x000000000000003f)
+#define RR_PS(val)     (((val) & 0x000000000000003f) << 2)
+#define RR_PS_MASK     0x00000000000000fcL
+#define RR_PS_SHIFT    2
+#define RR_RID_MASK    0x00000000ffffff00L
+#define RR_TO_RID(val)         ((val >> 8) & 0xffffff)
+
 /*
  * Now for some TLB flushing routines.  This is the kind of stuff that
  * can be very expensive, so try to avoid them whenever possible.
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -297,8 +297,8 @@ local_flush_tlb_all (void)
 	ia64_srlz_i();			/* srlz.i implies srlz.d */
 }
 
-void
-flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
+static void
+__flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
 		 unsigned long end)
 {
 	struct mm_struct *mm = vma->vm_mm;
@@ -335,6 +335,25 @@ flush_tlb_range (struct vm_area_struct *
 	preempt_enable();
 	ia64_srlz_i();			/* srlz.i implies srlz.d */
 }
+
+void flush_tlb_range(struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
+{
+	if (unlikely(end - start >= 1024*1024*1024*1024UL
+			|| REGION_NUMBER(start) != REGION_NUMBER(end - 1))) {
+		/*
+		 * If we flush more than a tera-byte or across regions, we're
+		 * probably better off just flushing the entire TLB(s).  This
+		 * should be very rare and is not worth optimizing for.
+		 */
+		flush_tlb_all();
+	} else {
+		/* flush the address range from the tlb */
+		__flush_tlb_range(vma, start, end);
+		/* flush the virt. page-table area mapping the addr range */
+		__flush_tlb_range(vma, ia64_thash(start), ia64_thash(end));
+	}
+}
 EXPORT_SYMBOL(flush_tlb_range);
 
 void ia64_tlb_init(void)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (8 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 09/18] ia64/tlb: Convert " Peter Zijlstra
@ 2019-02-19 10:31 ` " Peter Zijlstra
  2019-12-03 11:19   ` Geert Uytterhoeven
  2019-02-19 10:31 ` [PATCH v6 11/18] um/tlb: Convert " Peter Zijlstra
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, Yoshinori Sato, Rich Felker

Generic mmu_gather provides everything SH needs (range tracking and
cache coherency).
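
For illustration, this is roughly what the generic hooks boil down to
for an architecture like SH (a sketch of the asm-generic behaviour,
not the literal code):

	static inline void
	tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
	{
		if (!tlb->fullmm)
			flush_cache_range(vma, vma->vm_start, vma->vm_end);
	}

	static inline void
	tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
	{
		if (!tlb->fullmm && tlb->end)
			tlb_flush_mmu_tlbonly(tlb);
	}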

Cc: Will Deacon <will.deacon@arm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sh/include/asm/pgalloc.h |    7 ++
 arch/sh/include/asm/tlb.h     |  130 ------------------------------------------
 2 files changed, 8 insertions(+), 129 deletions(-)

--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -72,6 +72,15 @@ do {							\
 	tlb_remove_page((tlb), (pte));			\
 } while (0)
 
+#if CONFIG_PGTABLE_LEVELS > 2
+#define __pmd_free_tlb(tlb, pmdp, addr)			\
+do {							\
+	struct page *page = virt_to_page(pmdp);		\
+	pgtable_pmd_page_dtor(page);			\
+	tlb_remove_page((tlb), page);			\
+} while (0)
+#endif
+
 static inline void check_pgt_cache(void)
 {
 	quicklist_trim(QUICK_PT, NULL, 25, 16);
--- a/arch/sh/include/asm/tlb.h
+++ b/arch/sh/include/asm/tlb.h
@@ -11,131 +11,8 @@
 
 #ifdef CONFIG_MMU
 #include <linux/swap.h>
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-#include <asm/mmu_context.h>
-
-/*
- * TLB handling.  This allows us to remove pages from the page
- * tables, and efficiently handle the TLB issues.
- */
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		fullmm;
-	unsigned long		start, end;
-};
 
-static inline void init_tlb_gather(struct mmu_gather *tlb)
-{
-	tlb->start = TASK_SIZE;
-	tlb->end = 0;
-
-	if (tlb->fullmm) {
-		tlb->start = 0;
-		tlb->end = TASK_SIZE;
-	}
-}
-
-static inline void
-arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-		unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-	tlb->start = start;
-	tlb->end = end;
-	tlb->fullmm = !(start | (end+1));
-
-	init_tlb_gather(tlb);
-}
-
-static inline void
-arch_tlb_finish_mmu(struct mmu_gather *tlb,
-		unsigned long start, unsigned long end, bool force)
-{
-	if (tlb->fullmm || force)
-		flush_tlb_mm(tlb->mm);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-}
-
-static inline void
-tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
-{
-	if (tlb->start > address)
-		tlb->start = address;
-	if (tlb->end < address + PAGE_SIZE)
-		tlb->end = address + PAGE_SIZE;
-}
-
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	tlb_remove_tlb_entry(tlb, ptep, address)
-
-/*
- * In the case of tlb vma handling, we can optimise these away in the
- * case where we're doing a full MM flush.  When we're doing a munmap,
- * the vmas are adjusted to only cover the region to be torn down.
- */
-static inline void
-tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm)
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);
-}
-
-static inline void
-tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm && tlb->end) {
-		flush_tlb_range(vma, tlb->start, tlb->end);
-		init_tlb_gather(tlb);
-	}
-}
-
-static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
-{
-}
-
-static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-}
-
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-}
-
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	free_page_and_swap_cache(page);
-	return false; /* avoid calling tlb_flush_mmu */
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	__tlb_remove_page(tlb, page);
-}
-
-static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-					  struct page *page, int page_size)
-{
-	return __tlb_remove_page(tlb, page);
-}
-
-static inline void tlb_remove_page_size(struct mmu_gather *tlb,
-					struct page *page, int page_size)
-{
-	return tlb_remove_page(tlb, page);
-}
-
-static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
-{
-}
-
-#define pte_free_tlb(tlb, ptep, addr)	pte_free((tlb)->mm, ptep)
-#define pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
-#define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
-
-#define tlb_migrate_finish(mm)		do { } while (0)
+#include <asm-generic/tlb.h>
 
 #if defined(CONFIG_CPU_SH4) || defined(CONFIG_SUPERH64)
 extern void tlb_wire_entry(struct vm_area_struct *, unsigned long, pte_t);
@@ -155,11 +32,6 @@ static inline void tlb_unwire_entry(void
 
 #else /* CONFIG_MMU */
 
-#define tlb_start_vma(tlb, vma)				do { } while (0)
-#define tlb_end_vma(tlb, vma)				do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
-#define tlb_flush(tlb)					do { } while (0)
-
 #include <asm-generic/tlb.h>
 
 #endif /* CONFIG_MMU */



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 11/18] um/tlb: Convert to generic mmu_gather
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (9 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 10/18] sh/tlb: Convert SH " Peter Zijlstra
@ 2019-02-19 10:31 ` " Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 12/18] arch/tlb: Clean up simple architectures Peter Zijlstra
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:31 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, Richard Weinberger

Generic mmu_gather provides the simple flush_tlb_range() based
range tracking mmu_gather that UM needs.
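
For illustration, the flush_tlb_range() based default tlb_flush() that
UM now inherits looks roughly like this (a sketch; details such as the
vma flags are elided):

	static inline void tlb_flush(struct mmu_gather *tlb)
	{
		if (tlb->fullmm || tlb->need_flush_all) {
			flush_tlb_mm(tlb->mm);
		} else if (tlb->end) {
			struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);

			flush_tlb_range(&vma, tlb->start, tlb->end);
		}
	}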

Cc: Will Deacon <will.deacon@arm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/um/include/asm/tlb.h |  156 ----------------------------------------------
 1 file changed, 2 insertions(+), 154 deletions(-)

--- a/arch/um/include/asm/tlb.h
+++ b/arch/um/include/asm/tlb.h
@@ -2,160 +2,8 @@
 #ifndef __UM_TLB_H
 #define __UM_TLB_H
 
-#include <linux/pagemap.h>
-#include <linux/swap.h>
-#include <asm/percpu.h>
-#include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
-
-#define tlb_start_vma(tlb, vma) do { } while (0)
-#define tlb_end_vma(tlb, vma) do { } while (0)
-#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
-
-/* struct mmu_gather is an opaque type used by the mm code for passing around
- * any data needed by arch specific code for tlb_remove_page.
- */
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		need_flush; /* Really unmapped some ptes? */
-	unsigned long		start;
-	unsigned long		end;
-	unsigned int		fullmm; /* non-zero means full mm flush */
-};
-
-static inline void __tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep,
-					  unsigned long address)
-{
-	if (tlb->start > address)
-		tlb->start = address;
-	if (tlb->end < address + PAGE_SIZE)
-		tlb->end = address + PAGE_SIZE;
-}
-
-static inline void init_tlb_gather(struct mmu_gather *tlb)
-{
-	tlb->need_flush = 0;
-
-	tlb->start = TASK_SIZE;
-	tlb->end = 0;
-
-	if (tlb->fullmm) {
-		tlb->start = 0;
-		tlb->end = TASK_SIZE;
-	}
-}
-
-static inline void
-arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-		unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-	tlb->start = start;
-	tlb->end = end;
-	tlb->fullmm = !(start | (end+1));
-
-	init_tlb_gather(tlb);
-}
-
-extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
-			       unsigned long end);
-
-static inline void
-tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
-{
-	flush_tlb_mm_range(tlb->mm, tlb->start, tlb->end);
-}
-
-static inline void
-tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-	init_tlb_gather(tlb);
-}
-
-static inline void
-tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	if (!tlb->need_flush)
-		return;
-
-	tlb_flush_mmu_tlbonly(tlb);
-	tlb_flush_mmu_free(tlb);
-}
-
-/* arch_tlb_finish_mmu
- *	Called at the end of the shootdown operation to free up any resources
- *	that were required.
- */
-static inline void
-arch_tlb_finish_mmu(struct mmu_gather *tlb,
-		unsigned long start, unsigned long end, bool force)
-{
-	if (force) {
-		tlb->start = start;
-		tlb->end = end;
-		tlb->need_flush = 1;
-	}
-	tlb_flush_mmu(tlb);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-}
-
-/* tlb_remove_page
- *	Must perform the equivalent to __free_pte(pte_get_and_clear(ptep)),
- *	while handling the additional races in SMP caused by other CPUs
- *	caching valid mappings in their TLBs.
- */
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	tlb->need_flush = 1;
-	free_page_and_swap_cache(page);
-	return false; /* avoid calling tlb_flush_mmu */
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	__tlb_remove_page(tlb, page);
-}
-
-static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
-					  struct page *page, int page_size)
-{
-	return __tlb_remove_page(tlb, page);
-}
-
-static inline void tlb_remove_page_size(struct mmu_gather *tlb,
-					struct page *page, int page_size)
-{
-	return tlb_remove_page(tlb, page);
-}
-
-/**
- * tlb_remove_tlb_entry - remember a pte unmapping for later tlb invalidation.
- *
- * Record the fact that pte's were really umapped in ->need_flush, so we can
- * later optimise away the tlb invalidate.   This helps when userspace is
- * unmapping already-unmapped pages, which happens quite a lot.
- */
-#define tlb_remove_tlb_entry(tlb, ptep, address)		\
-	do {							\
-		tlb->need_flush = 1;				\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
-	} while (0)
-
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	tlb_remove_tlb_entry(tlb, ptep, address)
-
-static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
-{
-}
-
-#define pte_free_tlb(tlb, ptep, addr) __pte_free_tlb(tlb, ptep, addr)
-
-#define pud_free_tlb(tlb, pudp, addr) __pud_free_tlb(tlb, pudp, addr)
-
-#define pmd_free_tlb(tlb, pmdp, addr) __pmd_free_tlb(tlb, pmdp, addr)
-
-#define tlb_migrate_finish(mm) do {} while (0)
+#include <asm-generic/cacheflush.h>
+#include <asm-generic/tlb.h>
 
 #endif



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 12/18] arch/tlb: Clean up simple architectures
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (10 preceding siblings ...)
  2019-02-19 10:31 ` [PATCH v6 11/18] um/tlb: Convert " Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER Peter Zijlstra
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, David S. Miller, Michal Simek,
	Helge Deller, Greentime Hu, Richard Henderson, Ley Foon Tan,
	Jonas Bonn, Mark Salter, Richard Kuo, Vineet Gupta, Paul Burton,
	Max Filippov, Guan Xuetao

For the architectures that do not implement their own tlb_flush() but
do already use the generic mmu_gather, there are two options:

 1) the platform has an efficient flush_tlb_range() and
    asm-generic/tlb.h doesn't need any overrides at all.

 2) the platform lacks an efficient flush_tlb_range() and
    we select MMU_GATHER_NO_RANGE to minimize full invalidates.

Convert all 'simple' architectures to one of these two forms.

alpha:	    has no range invalidate -> 2
arc:	    already uses flush_tlb_range() -> 1
c6x:	    has no range invalidate -> 2
hexagon:    has an efficient flush_tlb_range() -> 1
            (flush_tlb_mm() is in fact a full range invalidate,
	     so no need to shoot down everything)
m68k:	    has inefficient flush_tlb_range() -> 2
microblaze: has no flush_tlb_range() -> 2
mips:	    has efficient flush_tlb_range() -> 1
	    (even though it currently seems to use flush_tlb_mm())
nds32:	    already uses flush_tlb_range() -> 1
nios2:	    has inefficient flush_tlb_range() -> 2
	    (no limit on range iteration)
openrisc:   has inefficient flush_tlb_range() -> 2
	    (no limit on range iteration)
parisc:	    already uses flush_tlb_range() -> 1
sparc32:    already uses flush_tlb_range() -> 1
unicore32:  has inefficient flush_tlb_range() -> 2
	    (no limit on range iteration)
xtensa:	    has efficient flush_tlb_range() -> 1

Note this also fixes a bug in the existing code for a number of
platforms. Those platforms that did:

  tlb_end_vma() -> if (!full_mm) flush_tlb_*()
  tlb_flush -> if (full_mm) flush_tlb_mm()

missed the case of shift_arg_pages(), which doesn't have @fullmm set
and doesn't call into tlb_*vma(), but still frees page-tables and thus
needs an invalidate. The new code handles this by detecting a
non-empty range, and either issuing the matching range invalidate or a
full invalidate, depending on the capabilities.
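
For case 1) the architecture's tlb.h reduces to (almost) a bare
include of asm-generic/tlb.h; for case 2) the default tlb_flush()
under MMU_GATHER_NO_RANGE reduces to roughly this sketch:

	static inline void tlb_flush(struct mmu_gather *tlb)
	{
		/* only flush when something was actually unmapped/freed */
		if (tlb->end)
			flush_tlb_mm(tlb->mm);
	}

which is how the non-empty range detection mentioned above turns into
a full invalidate on these platforms.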


Cc: Nick Piggin <npiggin@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Helge Deller <deller@gmx.de>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/Kconfig                |    1 +
 arch/alpha/include/asm/tlb.h      |    6 ------
 arch/arc/include/asm/tlb.h        |   23 -----------------------
 arch/c6x/Kconfig                  |    1 +
 arch/c6x/include/asm/tlb.h        |    2 --
 arch/h8300/include/asm/tlb.h      |    2 --
 arch/hexagon/include/asm/tlb.h    |   12 ------------
 arch/m68k/Kconfig                 |    1 +
 arch/m68k/include/asm/tlb.h       |   14 --------------
 arch/microblaze/Kconfig           |    1 +
 arch/microblaze/include/asm/tlb.h |    9 ---------
 arch/mips/include/asm/tlb.h       |    8 --------
 arch/nds32/include/asm/tlb.h      |   10 ----------
 arch/nios2/Kconfig                |    1 +
 arch/nios2/include/asm/tlb.h      |    8 ++++----
 arch/openrisc/Kconfig             |    1 +
 arch/openrisc/include/asm/tlb.h   |    8 ++------
 arch/parisc/include/asm/tlb.h     |   13 -------------
 arch/sparc/include/asm/tlb_32.h   |   13 -------------
 arch/unicore32/Kconfig            |    1 +
 arch/unicore32/include/asm/tlb.h  |    7 +++----
 arch/xtensa/include/asm/tlb.h     |   17 -----------------
 22 files changed, 16 insertions(+), 143 deletions(-)

--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -36,6 +36,7 @@ config ALPHA
 	select ODD_RT_SIGACTION
 	select OLD_SIGSUSPEND
 	select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67
+	select MMU_GATHER_NO_RANGE
 	help
 	  The Alpha is a 64-bit general-purpose processor designed and
 	  marketed by the Digital Equipment Corporation of blessed memory,
--- a/arch/alpha/include/asm/tlb.h
+++ b/arch/alpha/include/asm/tlb.h
@@ -2,12 +2,6 @@
 #ifndef _ALPHA_TLB_H
 #define _ALPHA_TLB_H
 
-#define tlb_start_vma(tlb, vma)			do { } while (0)
-#define tlb_end_vma(tlb, vma)			do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, pte, addr)	do { } while (0)
-
-#define tlb_flush(tlb)				flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #define __pte_free_tlb(tlb, pte, address)		pte_free((tlb)->mm, pte)
--- a/arch/arc/include/asm/tlb.h
+++ b/arch/arc/include/asm/tlb.h
@@ -9,29 +9,6 @@
 #ifndef _ASM_ARC_TLB_H
 #define _ASM_ARC_TLB_H
 
-#define tlb_flush(tlb)				\
-do {						\
-	if (tlb->fullmm)			\
-		flush_tlb_mm((tlb)->mm);	\
-} while (0)
-
-/*
- * This pair is called at time of munmap/exit to flush cache and TLB entries
- * for mappings being torn down.
- * 1) cache-flush part -implemented via tlb_start_vma( ) for VIPT aliasing D$
- * 2) tlb-flush part - implemted via tlb_end_vma( ) flushes the TLB range
- *
- * Note, read http://lkml.org/lkml/2004/1/15/6
- */
-
-#define tlb_end_vma(tlb, vma)						\
-do {									\
-	if (!tlb->fullmm)						\
-		flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
-} while (0)
-
-#define __tlb_remove_tlb_entry(tlb, ptep, address)
-
 #include <linux/pagemap.h>
 #include <asm-generic/tlb.h>
 
--- a/arch/c6x/Kconfig
+++ b/arch/c6x/Kconfig
@@ -19,6 +19,7 @@ config C6X
 	select GENERIC_CLOCKEVENTS
 	select MODULES_USE_ELF_RELA
 	select ARCH_NO_COHERENT_DMA_MMAP
+	select MMU_GATHER_NO_RANGE if MMU
 
 config MMU
 	def_bool n
--- a/arch/c6x/include/asm/tlb.h
+++ b/arch/c6x/include/asm/tlb.h
@@ -2,8 +2,6 @@
 #ifndef _ASM_C6X_TLB_H
 #define _ASM_C6X_TLB_H
 
-#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #endif /* _ASM_C6X_TLB_H */
--- a/arch/h8300/include/asm/tlb.h
+++ b/arch/h8300/include/asm/tlb.h
@@ -2,8 +2,6 @@
 #ifndef __H8300_TLB_H__
 #define __H8300_TLB_H__
 
-#define tlb_flush(tlb)	do { } while (0)
-
 #include <asm-generic/tlb.h>
 
 #endif
--- a/arch/hexagon/include/asm/tlb.h
+++ b/arch/hexagon/include/asm/tlb.h
@@ -22,18 +22,6 @@
 #include <linux/pagemap.h>
 #include <asm/tlbflush.h>
 
-/*
- * We don't need any special per-pte or per-vma handling...
- */
-#define tlb_start_vma(tlb, vma)				do { } while (0)
-#define tlb_end_vma(tlb, vma)				do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address)	do { } while (0)
-
-/*
- * .. because we flush the whole mm when it fills up
- */
-#define tlb_flush(tlb)		flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #endif
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -27,6 +27,7 @@ config M68K
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
 	select ARCH_DISCARD_MEMBLOCK
+	select MMU_GATHER_NO_RANGE if MMU
 
 config CPU_BIG_ENDIAN
 	def_bool y
--- a/arch/m68k/include/asm/tlb.h
+++ b/arch/m68k/include/asm/tlb.h
@@ -2,20 +2,6 @@
 #ifndef _M68K_TLB_H
 #define _M68K_TLB_H
 
-/*
- * m68k doesn't need any special per-pte or
- * per-vma handling..
- */
-#define tlb_start_vma(tlb, vma)	do { } while (0)
-#define tlb_end_vma(tlb, vma)	do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address)	do { } while (0)
-
-/*
- * .. because we flush the whole mm when it
- * fills up.
- */
-#define tlb_flush(tlb)		flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #endif /* _M68K_TLB_H */
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -40,6 +40,7 @@ config MICROBLAZE
 	select TRACING_SUPPORT
 	select VIRT_TO_BUS
 	select CPU_NO_EFFICIENT_FFS
+	select MMU_GATHER_NO_RANGE if MMU
 
 # Endianness selection
 choice
--- a/arch/microblaze/include/asm/tlb.h
+++ b/arch/microblaze/include/asm/tlb.h
@@ -11,16 +11,7 @@
 #ifndef _ASM_MICROBLAZE_TLB_H
 #define _ASM_MICROBLAZE_TLB_H
 
-#define tlb_flush(tlb)	flush_tlb_mm((tlb)->mm)
-
 #include <linux/pagemap.h>
-
-#ifdef CONFIG_MMU
-#define tlb_start_vma(tlb, vma)		do { } while (0)
-#define tlb_end_vma(tlb, vma)		do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, pte, address) do { } while (0)
-#endif
-
 #include <asm-generic/tlb.h>
 
 #endif /* _ASM_MICROBLAZE_TLB_H */
--- a/arch/mips/include/asm/tlb.h
+++ b/arch/mips/include/asm/tlb.h
@@ -5,14 +5,6 @@
 #include <asm/cpu-features.h>
 #include <asm/mipsregs.h>
 
-#define tlb_end_vma(tlb, vma) do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
-
-/*
- * .. because we flush the whole mm when it fills up.
- */
-#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
-
 #define _UNIQUE_ENTRYHI(base, idx)					\
 		(((base) + ((idx) << (PAGE_SHIFT + 1))) |		\
 		 (cpu_has_tlbinv ? MIPS_ENTRYHI_EHINV : 0))
--- a/arch/nds32/include/asm/tlb.h
+++ b/arch/nds32/include/asm/tlb.h
@@ -4,16 +4,6 @@
 #ifndef __ASMNDS32_TLB_H
 #define __ASMNDS32_TLB_H
 
-#define tlb_end_vma(tlb,vma)				\
-	do { 						\
-		if(!tlb->fullmm)			\
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
-	} while (0)
-
-#define __tlb_remove_tlb_entry(tlb, pte, addr) do { } while (0)
-
-#define tlb_flush(tlb)	flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #define __pte_free_tlb(tlb, pte, addr)	pte_free((tlb)->mm, pte)
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -23,6 +23,7 @@ config NIOS2
 	select USB_ARCH_HAS_HCD if USB_SUPPORT
 	select CPU_NO_EFFICIENT_FFS
 	select ARCH_DISCARD_MEMBLOCK
+	select MMU_GATHER_NO_RANGE if MMU
 
 config GENERIC_CSUM
 	def_bool y
--- a/arch/nios2/include/asm/tlb.h
+++ b/arch/nios2/include/asm/tlb.h
@@ -11,12 +11,12 @@
 #ifndef _ASM_NIOS2_TLB_H
 #define _ASM_NIOS2_TLB_H
 
-#define tlb_flush(tlb)	flush_tlb_mm((tlb)->mm)
-
 extern void set_mmu_pid(unsigned long pid);
 
-#define tlb_end_vma(tlb, vma)	do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address)	do { } while (0)
+/*
+ * NIOS2 does have flush_tlb_range(), but it lacks a limit on range iteration
+ * and a fallback to full mm invalidation. So use flush_tlb_mm() for everything.
+ */
 
 #include <linux/pagemap.h>
 #include <asm-generic/tlb.h>
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -35,6 +35,7 @@ config OPENRISC
 	select OMPIC if SMP
 	select ARCH_WANT_FRAME_POINTERS
 	select GENERIC_IRQ_MULTI_HANDLER
+	select MMU_GATHER_NO_RANGE if MMU
 
 config CPU_BIG_ENDIAN
 	def_bool y
--- a/arch/openrisc/include/asm/tlb.h
+++ b/arch/openrisc/include/asm/tlb.h
@@ -20,14 +20,10 @@
 #define __ASM_OPENRISC_TLB_H__
 
 /*
- * or32 doesn't need any special per-pte or
- * per-vma handling..
+ * OpenRISC doesn't have an efficient flush_tlb_range() so use flush_tlb_mm()
+ * for everything.
  */
-#define tlb_start_vma(tlb, vma) do { } while (0)
-#define tlb_end_vma(tlb, vma) do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
 
-#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
 #include <linux/pagemap.h>
 #include <asm-generic/tlb.h>
 
--- a/arch/parisc/include/asm/tlb.h
+++ b/arch/parisc/include/asm/tlb.h
@@ -2,19 +2,6 @@
 #ifndef _PARISC_TLB_H
 #define _PARISC_TLB_H
 
-#define tlb_flush(tlb)			\
-do {	if ((tlb)->fullmm)		\
-		flush_tlb_mm((tlb)->mm);\
-} while (0)
-
-#define tlb_end_vma(tlb, vma)	\
-do {	if (!(tlb)->fullmm)	\
-		flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
-} while (0)
-
-#define __tlb_remove_tlb_entry(tlb, pte, address) \
-	do { } while (0)
-
 #include <asm-generic/tlb.h>
 
 #define __pmd_free_tlb(tlb, pmd, addr)	pmd_free((tlb)->mm, pmd)
--- a/arch/sparc/include/asm/tlb_32.h
+++ b/arch/sparc/include/asm/tlb_32.h
@@ -2,19 +2,6 @@
 #ifndef _SPARC_TLB_H
 #define _SPARC_TLB_H
 
-#define tlb_end_vma(tlb, vma) \
-do {								\
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
-} while (0)
-
-#define __tlb_remove_tlb_entry(tlb, pte, address) \
-	do { } while (0)
-
-#define tlb_flush(tlb) \
-do {								\
-	flush_tlb_mm((tlb)->mm);				\
-} while (0)
-
 #include <asm-generic/tlb.h>
 
 #endif /* _SPARC_TLB_H */
--- a/arch/unicore32/Kconfig
+++ b/arch/unicore32/Kconfig
@@ -20,6 +20,7 @@ config UNICORE32
 	select GENERIC_IOMAP
 	select MODULES_USE_ELF_REL
 	select NEED_DMA_MAP_STATE
+	select MMU_GATHER_NO_RANGE if MMU
 	help
 	  UniCore-32 is 32-bit Instruction Set Architecture,
 	  including a series of low-power-consumption RISC chip
--- a/arch/unicore32/include/asm/tlb.h
+++ b/arch/unicore32/include/asm/tlb.h
@@ -12,10 +12,9 @@
 #ifndef __UNICORE_TLB_H__
 #define __UNICORE_TLB_H__
 
-#define tlb_start_vma(tlb, vma)				do { } while (0)
-#define tlb_end_vma(tlb, vma)				do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, ptep, address)	do { } while (0)
-#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
+/*
+ * unicore32 lacks an efficient flush_tlb_range(), use flush_tlb_mm().
+ */
 
 #define __pte_free_tlb(tlb, pte, addr)				\
 	do {							\
--- a/arch/xtensa/include/asm/tlb.h
+++ b/arch/xtensa/include/asm/tlb.h
@@ -14,23 +14,6 @@
 #include <asm/cache.h>
 #include <asm/page.h>
 
-#if (DCACHE_WAY_SIZE <= PAGE_SIZE)
-
-# define tlb_end_vma(tlb,vma)			do { } while (0)
-
-#else
-
-# define tlb_end_vma(tlb, vma)						      \
-	do {								      \
-		if (!tlb->fullmm)					      \
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end);     \
-	} while(0)
-
-#endif
-
-#define __tlb_remove_tlb_entry(tlb,pte,addr)	do { } while (0)
-#define tlb_flush(tlb)				flush_tlb_mm((tlb)->mm)
-
 #include <asm-generic/tlb.h>
 
 #define __pte_free_tlb(tlb, pte, address)	pte_free((tlb)->mm, pte)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (11 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 12/18] arch/tlb: Clean up simple architectures Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
  2019-02-19 10:32 ` [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather Peter Zijlstra
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, Linus Torvalds, Martin Schwidefsky

Add the Kconfig option HAVE_MMU_GATHER_NO_GATHER to the generic
mmu_gather code. If the option is set, the mmu_gather will no longer
track individual pages for delayed page freeing. A platform that
enables the option needs to provide its own implementation of
__tlb_remove_page_size() to free pages.
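
For example, a platform that can free pages immediately could use
something along these lines (a sketch; s390 does essentially this in
the next patch):

	static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
						  struct page *page, int page_size)
	{
		free_page_and_swap_cache(page);
		return false;	/* never forces a tlb_flush_mmu() */
	}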

Cc: npiggin@gmail.com
Cc: heiko.carstens@de.ibm.com
Cc: will.deacon@arm.com
Cc: aneesh.kumar@linux.vnet.ibm.com
Cc: akpm@linux-foundation.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux@armlinux.org.uk
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180918125151.31744-2-schwidefsky@de.ibm.com
---
 arch/Kconfig              |    3 +
 include/asm-generic/tlb.h |    9 +++
 mm/mmu_gather.c           |  107 +++++++++++++++++++++++++---------------------
 3 files changed, 70 insertions(+), 49 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -368,6 +368,9 @@ config HAVE_RCU_TABLE_NO_INVALIDATE
 config HAVE_MMU_GATHER_PAGE_SIZE
 	bool
 
+config HAVE_MMU_GATHER_NO_GATHER
+	bool
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -184,6 +184,7 @@ extern void tlb_remove_table(struct mmu_
 
 #endif
 
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
 /*
  * If we can't allocate a page to make a big batch of page pointers
  * to work on, then just handle a few from the on-stack structure.
@@ -208,6 +209,10 @@ struct mmu_gather_batch {
  */
 #define MAX_GATHER_BATCH_COUNT	(10000UL/MAX_GATHER_BATCH)
 
+extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
+				   int page_size);
+#endif
+
 /*
  * struct mmu_gather is an opaque type used by the mm code for passing around
  * any data needed by arch specific code for tlb_remove_page.
@@ -254,6 +259,7 @@ struct mmu_gather {
 
 	unsigned int		batch_count;
 
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
 	struct mmu_gather_batch *active;
 	struct mmu_gather_batch	local;
 	struct page		*__pages[MMU_GATHER_BUNDLE];
@@ -261,6 +267,7 @@ struct mmu_gather {
 #ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
 	unsigned int page_size;
 #endif
+#endif
 };
 
 void arch_tlb_gather_mmu(struct mmu_gather *tlb,
@@ -269,8 +276,6 @@ void tlb_flush_mmu(struct mmu_gather *tl
 void arch_tlb_finish_mmu(struct mmu_gather *tlb,
 			 unsigned long start, unsigned long end, bool force);
 void tlb_flush_mmu_free(struct mmu_gather *tlb);
-extern bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
-				   int page_size);
 
 static inline void __tlb_adjust_range(struct mmu_gather *tlb,
 				      unsigned long address,
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -13,6 +13,8 @@
 
 #ifdef HAVE_GENERIC_MMU_GATHER
 
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+
 static bool tlb_next_batch(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
@@ -41,6 +43,56 @@ static bool tlb_next_batch(struct mmu_ga
 	return true;
 }
 
+static void tlb_batch_pages_flush(struct mmu_gather *tlb)
+{
+	struct mmu_gather_batch *batch;
+
+	for (batch = &tlb->local; batch && batch->nr; batch = batch->next) {
+		free_pages_and_swap_cache(batch->pages, batch->nr);
+		batch->nr = 0;
+	}
+	tlb->active = &tlb->local;
+}
+
+static void tlb_batch_list_free(struct mmu_gather *tlb)
+{
+	struct mmu_gather_batch *batch, *next;
+
+	for (batch = tlb->local.next; batch; batch = next) {
+		next = batch->next;
+		free_pages((unsigned long)batch, 0);
+	}
+	tlb->local.next = NULL;
+}
+
+bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_size)
+{
+	struct mmu_gather_batch *batch;
+
+	VM_BUG_ON(!tlb->end);
+
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+	VM_WARN_ON(tlb->page_size != page_size);
+#endif
+
+	batch = tlb->active;
+	/*
+	 * Add the page and check if we are full. If so
+	 * force a flush.
+	 */
+	batch->pages[batch->nr++] = page;
+	if (batch->nr == batch->max) {
+		if (!tlb_next_batch(tlb))
+			return true;
+		batch = tlb->active;
+	}
+	VM_BUG_ON_PAGE(batch->nr > batch->max, page);
+
+	return false;
+}
+
+#endif /* HAVE_MMU_GATHER_NO_GATHER */
+
 void arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
 				unsigned long start, unsigned long end)
 {
@@ -48,12 +100,15 @@ void arch_tlb_gather_mmu(struct mmu_gath
 
 	/* Is it from 0 to ~0? */
 	tlb->fullmm     = !(start | (end+1));
+
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
 	tlb->need_flush_all = 0;
 	tlb->local.next = NULL;
 	tlb->local.nr   = 0;
 	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
 	tlb->active     = &tlb->local;
 	tlb->batch_count = 0;
+#endif
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
@@ -67,16 +122,12 @@ void arch_tlb_gather_mmu(struct mmu_gath
 
 void tlb_flush_mmu_free(struct mmu_gather *tlb)
 {
-	struct mmu_gather_batch *batch;
-
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb_table_flush(tlb);
 #endif
-	for (batch = &tlb->local; batch && batch->nr; batch = batch->next) {
-		free_pages_and_swap_cache(batch->pages, batch->nr);
-		batch->nr = 0;
-	}
-	tlb->active = &tlb->local;
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+	tlb_batch_pages_flush(tlb);
+#endif
 }
 
 void tlb_flush_mmu(struct mmu_gather *tlb)
@@ -92,8 +143,6 @@ void tlb_flush_mmu(struct mmu_gather *tl
 void arch_tlb_finish_mmu(struct mmu_gather *tlb,
 		unsigned long start, unsigned long end, bool force)
 {
-	struct mmu_gather_batch *batch, *next;
-
 	if (force) {
 		__tlb_reset_range(tlb);
 		__tlb_adjust_range(tlb, start, end - start);
@@ -103,45 +152,9 @@ void arch_tlb_finish_mmu(struct mmu_gath
 
 	/* keep the page table cache within bounds */
 	check_pgt_cache();
-
-	for (batch = tlb->local.next; batch; batch = next) {
-		next = batch->next;
-		free_pages((unsigned long)batch, 0);
-	}
-	tlb->local.next = NULL;
-}
-
-/* __tlb_remove_page
- *	Must perform the equivalent to __free_pte(pte_get_and_clear(ptep)), while
- *	handling the additional races in SMP caused by other CPUs caching valid
- *	mappings in their TLBs. Returns the number of free page slots left.
- *	When out of page slots we must call tlb_flush_mmu().
- *returns true if the caller should flush.
- */
-bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_size)
-{
-	struct mmu_gather_batch *batch;
-
-	VM_BUG_ON(!tlb->end);
-
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
-	VM_WARN_ON(tlb->page_size != page_size);
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+	tlb_batch_list_free(tlb);
 #endif
-
-	batch = tlb->active;
-	/*
-	 * Add the page and check if we are full. If so
-	 * force a flush.
-	 */
-	batch->pages[batch->nr++] = page;
-	if (batch->nr == batch->max) {
-		if (!tlb_next_batch(tlb))
-			return true;
-		batch = tlb->active;
-	}
-	VM_BUG_ON_PAGE(batch->nr > batch->max, page);
-
-	return false;
 }
 
 #endif /* HAVE_GENERIC_MMU_GATHER */



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (12 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
  2019-02-19 10:32 ` [PATCH v6 15/18] asm-generic/tlb: Remove arch_tlb*_mmu() Peter Zijlstra
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux,
	heiko.carstens, riel, Linus Torvalds, Martin Schwidefsky

Convert s390 to use the generic mmu_gather code. s390 selects
HAVE_MMU_GATHER_NO_GATHER and HAVE_RCU_TABLE_FREE and keeps its own
page-table freeing in the {pte,pmd,p4d,pud}_free_tlb() hooks.

Cc: heiko.carstens@de.ibm.com
Cc: npiggin@gmail.com
Cc: akpm@linux-foundation.org
Cc: aneesh.kumar@linux.vnet.ibm.com
Cc: will.deacon@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux@armlinux.org.uk
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180918125151.31744-3-schwidefsky@de.ibm.com
---
 arch/s390/Kconfig           |    2 
 arch/s390/include/asm/tlb.h |  128 +++++++++++++-------------------------------
 arch/s390/mm/pgalloc.c      |   63 ---------------------
 3 files changed, 42 insertions(+), 151 deletions(-)

--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -163,11 +163,13 @@ config S390
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_MEMBLOCK_PHYS_MAP
+	select HAVE_MMU_GATHER_NO_GATHER
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NOP_MCOUNT
 	select HAVE_OPROFILE
 	select HAVE_PCI
 	select HAVE_PERF_EVENTS
+	select HAVE_RCU_TABLE_FREE
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
 	select HAVE_SYSCALL_TRACEPOINTS
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -22,98 +22,39 @@
  * Pages used for the page tables is a different story. FIXME: more
  */
 
-#include <linux/mm.h>
-#include <linux/pagemap.h>
-#include <linux/swap.h>
-#include <asm/processor.h>
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-
-struct mmu_gather {
-	struct mm_struct *mm;
-	struct mmu_table_batch *batch;
-	unsigned int fullmm;
-	unsigned long start, end;
-};
-
-struct mmu_table_batch {
-	struct rcu_head		rcu;
-	unsigned int		nr;
-	void			*tables[0];
-};
-
-#define MAX_TABLE_BATCH		\
-	((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
-
-extern void tlb_table_flush(struct mmu_gather *tlb);
-extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
-
-static inline void
-arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-			unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-	tlb->start = start;
-	tlb->end = end;
-	tlb->fullmm = !(start | (end+1));
-	tlb->batch = NULL;
-}
-
-static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
-{
-	__tlb_flush_mm_lazy(tlb->mm);
-}
-
-static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-	tlb_table_flush(tlb);
-}
-
+void __tlb_remove_table(void *_table);
+static inline void tlb_flush(struct mmu_gather *tlb);
+static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
+					  struct page *page, int page_size);
 
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	tlb_flush_mmu_tlbonly(tlb);
-	tlb_flush_mmu_free(tlb);
-}
+#define tlb_start_vma(tlb, vma)			do { } while (0)
+#define tlb_end_vma(tlb, vma)			do { } while (0)
 
-static inline void
-arch_tlb_finish_mmu(struct mmu_gather *tlb,
-		unsigned long start, unsigned long end, bool force)
-{
-	if (force) {
-		tlb->start = start;
-		tlb->end = end;
-	}
+#define tlb_flush tlb_flush
+#define pte_free_tlb pte_free_tlb
+#define pmd_free_tlb pmd_free_tlb
+#define p4d_free_tlb p4d_free_tlb
+#define pud_free_tlb pud_free_tlb
 
-	tlb_flush_mmu(tlb);
-}
+#include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
+#include <asm-generic/tlb.h>
 
 /*
  * Release the page cache reference for a pte removed by
  * tlb_ptep_clear_flush. In both flush modes the tlb for a page cache page
  * has already been freed, so just do free_page_and_swap_cache.
  */
-static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	free_page_and_swap_cache(page);
-	return false; /* avoid calling tlb_flush_mmu */
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	free_page_and_swap_cache(page);
-}
-
 static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
 					  struct page *page, int page_size)
 {
-	return __tlb_remove_page(tlb, page);
+	free_page_and_swap_cache(page);
+	return false;
 }
 
-static inline void tlb_remove_page_size(struct mmu_gather *tlb,
-					struct page *page, int page_size)
+static inline void tlb_flush(struct mmu_gather *tlb)
 {
-	return tlb_remove_page(tlb, page);
+	__tlb_flush_mm_lazy(tlb->mm);
 }
 
 /*
@@ -121,8 +62,17 @@ static inline void tlb_remove_page_size(
  * page table from the tlb.
  */
 static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
-				unsigned long address)
+                                unsigned long address)
 {
+	__tlb_adjust_range(tlb, address, PAGE_SIZE);
+	tlb->mm->context.flush_mm = 1;
+	tlb->freed_tables = 1;
+	tlb->cleared_ptes = 1;
+	/*
+	 * page_table_free_rcu takes care of the allocation bit masks
+	 * of the 2K table fragments in the 4K page table page,
+	 * then calls tlb_remove_table.
+	 */
 	page_table_free_rcu(tlb, (unsigned long *) pte, address);
 }
 
@@ -139,6 +89,10 @@ static inline void pmd_free_tlb(struct m
 	if (mm_pmd_folded(tlb->mm))
 		return;
 	pgtable_pmd_page_dtor(virt_to_page(pmd));
+	__tlb_adjust_range(tlb, address, PAGE_SIZE);
+	tlb->mm->context.flush_mm = 1;
+	tlb->freed_tables = 1;
+	tlb->cleared_puds = 1;
 	tlb_remove_table(tlb, pmd);
 }
 
@@ -154,6 +108,10 @@ static inline void p4d_free_tlb(struct m
 {
 	if (mm_p4d_folded(tlb->mm))
 		return;
+	__tlb_adjust_range(tlb, address, PAGE_SIZE);
+	tlb->mm->context.flush_mm = 1;
+	tlb->freed_tables = 1;
+	tlb->cleared_p4ds = 1;
 	tlb_remove_table(tlb, p4d);
 }
 
@@ -169,19 +127,11 @@ static inline void pud_free_tlb(struct m
 {
 	if (mm_pud_folded(tlb->mm))
 		return;
+	tlb->mm->context.flush_mm = 1;
+	tlb->freed_tables = 1;
+	tlb->cleared_puds = 1;
 	tlb_remove_table(tlb, pud);
 }
 
-#define tlb_start_vma(tlb, vma)			do { } while (0)
-#define tlb_end_vma(tlb, vma)			do { } while (0)
-#define tlb_remove_tlb_entry(tlb, ptep, addr)	do { } while (0)
-#define tlb_remove_pmd_tlb_entry(tlb, pmdp, addr)	do { } while (0)
-#define tlb_migrate_finish(mm)			do { } while (0)
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	tlb_remove_tlb_entry(tlb, ptep, address)
-
-static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
-{
-}
 
 #endif /* _S390_TLB_H */
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -290,7 +290,7 @@ void page_table_free_rcu(struct mmu_gath
 	tlb_remove_table(tlb, table);
 }
 
-static void __tlb_remove_table(void *_table)
+void __tlb_remove_table(void *_table)
 {
 	unsigned int mask = (unsigned long) _table & 3;
 	void *table = (void *)((unsigned long) _table ^ mask);
@@ -316,67 +316,6 @@ static void __tlb_remove_table(void *_ta
 	}
 }
 
-static void tlb_remove_table_smp_sync(void *arg)
-{
-	/* Simply deliver the interrupt */
-}
-
-static void tlb_remove_table_one(void *table)
-{
-	/*
-	 * This isn't an RCU grace period and hence the page-tables cannot be
-	 * assumed to be actually RCU-freed.
-	 *
-	 * It is however sufficient for software page-table walkers that rely
-	 * on IRQ disabling. See the comment near struct mmu_table_batch.
-	 */
-	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
-	__tlb_remove_table(table);
-}
-
-static void tlb_remove_table_rcu(struct rcu_head *head)
-{
-	struct mmu_table_batch *batch;
-	int i;
-
-	batch = container_of(head, struct mmu_table_batch, rcu);
-
-	for (i = 0; i < batch->nr; i++)
-		__tlb_remove_table(batch->tables[i]);
-
-	free_page((unsigned long)batch);
-}
-
-void tlb_table_flush(struct mmu_gather *tlb)
-{
-	struct mmu_table_batch **batch = &tlb->batch;
-
-	if (*batch) {
-		call_rcu(&(*batch)->rcu, tlb_remove_table_rcu);
-		*batch = NULL;
-	}
-}
-
-void tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct mmu_table_batch **batch = &tlb->batch;
-
-	tlb->mm->context.flush_mm = 1;
-	if (*batch == NULL) {
-		*batch = (struct mmu_table_batch *)
-			__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
-		if (*batch == NULL) {
-			__tlb_flush_mm_lazy(tlb->mm);
-			tlb_remove_table_one(table);
-			return;
-		}
-		(*batch)->nr = 0;
-	}
-	(*batch)->tables[(*batch)->nr++] = table;
-	if ((*batch)->nr == MAX_TABLE_BATCH)
-		tlb_flush_mmu(tlb);
-}
-
 /*
  * Base infrastructure required to generate basic asces, region, segment,
  * and page tables that do not make use of enhanced features like EDAT1.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 15/18] asm-generic/tlb: Remove arch_tlb*_mmu()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (13 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 16/18] asm-generic/tlb: Remove HAVE_GENERIC_MMU_GATHER Peter Zijlstra
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Now that all architectures are converted to the generic code, remove
the arch hooks.
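
With the arch hooks gone, tlb_gather_mmu() and tlb_finish_mmu() are
the only entry points; a typical tear-down looks roughly like this
(a sketch after the pattern in mm/memory.c; @floor and @ceiling are
placeholders):

	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, mm, start, end);
	unmap_vmas(&tlb, vma, start, end);	/* queue pages on the gather */
	free_pgtables(&tlb, vma, floor, ceiling);
	tlb_finish_mmu(&tlb, start, end);	/* flush TLBs, free pages */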

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 mm/mmu_gather.c |   93 +++++++++++++++++++++++++-------------------------------
 1 file changed, 42 insertions(+), 51 deletions(-)

--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -93,33 +93,6 @@ bool __tlb_remove_page_size(struct mmu_g
 
 #endif /* HAVE_MMU_GATHER_NO_GATHER */
 
-void arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
-				unsigned long start, unsigned long end)
-{
-	tlb->mm = mm;
-
-	/* Is it from 0 to ~0? */
-	tlb->fullmm     = !(start | (end+1));
-
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
-	tlb->need_flush_all = 0;
-	tlb->local.next = NULL;
-	tlb->local.nr   = 0;
-	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
-	tlb->active     = &tlb->local;
-	tlb->batch_count = 0;
-#endif
-
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
-	tlb->batch = NULL;
-#endif
-#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
-	tlb->page_size = 0;
-#endif
-
-	__tlb_reset_range(tlb);
-}
-
 void tlb_flush_mmu_free(struct mmu_gather *tlb)
 {
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
@@ -136,27 +109,6 @@ void tlb_flush_mmu(struct mmu_gather *tl
 	tlb_flush_mmu_free(tlb);
 }
 
-/* tlb_finish_mmu
- *	Called at the end of the shootdown operation to free up any resources
- *	that were required.
- */
-void arch_tlb_finish_mmu(struct mmu_gather *tlb,
-		unsigned long start, unsigned long end, bool force)
-{
-	if (force) {
-		__tlb_reset_range(tlb);
-		__tlb_adjust_range(tlb, start, end - start);
-	}
-
-	tlb_flush_mmu(tlb);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
-	tlb_batch_list_free(tlb);
-#endif
-}
-
 #endif /* HAVE_GENERIC_MMU_GATHER */
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
@@ -258,10 +210,40 @@ void tlb_remove_table(struct mmu_gather
 void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
 			unsigned long start, unsigned long end)
 {
-	arch_tlb_gather_mmu(tlb, mm, start, end);
+	tlb->mm = mm;
+
+	/* Is it from 0 to ~0? */
+	tlb->fullmm     = !(start | (end+1));
+
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+	tlb->need_flush_all = 0;
+	tlb->local.next = NULL;
+	tlb->local.nr   = 0;
+	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
+	tlb->active     = &tlb->local;
+	tlb->batch_count = 0;
+#endif
+
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb->batch = NULL;
+#endif
+#ifdef CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
+	tlb->page_size = 0;
+#endif
+
+	__tlb_reset_range(tlb);
 	inc_tlb_flush_pending(tlb->mm);
 }
 
+/**
+ * tlb_finish_mmu - finish an mmu_gather structure
+ * @tlb: the mmu_gather structure to finish
+ * @start: start of the region that will be removed from the page-table
+ * @end: end of the region that will be removed from the page-table
+ *
+ * Called at the end of the shootdown operation to free up any resources that
+ * were required.
+ */
 void tlb_finish_mmu(struct mmu_gather *tlb,
 		unsigned long start, unsigned long end)
 {
@@ -272,8 +254,17 @@ void tlb_finish_mmu(struct mmu_gather *t
 	 * the TLB by observing pte_none|!pte_dirty, for example so flush TLB
 	 * forcefully if we detect parallel PTE batching threads.
 	 */
-	bool force = mm_tlb_flush_nested(tlb->mm);
+	if (mm_tlb_flush_nested(tlb->mm)) {
+		__tlb_reset_range(tlb);
+		__tlb_adjust_range(tlb, start, end - start);
+	}
 
-	arch_tlb_finish_mmu(tlb, start, end, force);
+	tlb_flush_mmu(tlb);
+
+	/* keep the page table cache within bounds */
+	check_pgt_cache();
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+	tlb_batch_list_free(tlb);
+#endif
 	dec_tlb_flush_pending(tlb->mm);
 }



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 16/18] asm-generic/tlb: Remove HAVE_GENERIC_MMU_GATHER
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (14 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 15/18] asm-generic/tlb: Remove arch_tlb*_mmu() Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 17/18] asm-generic/tlb: Remove tlb_flush_mmu_free() Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 18/18] asm-generic/tlb: Remove tlb_table_flush() Peter Zijlstra
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

Since all architectures are now using it, it is redundant.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |    1 -
 mm/mmu_gather.c           |    4 ----
 2 files changed, 5 deletions(-)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -142,7 +142,6 @@
  *
  *  Use this if your architecture lacks an efficient flush_tlb_range().
  */
-#define HAVE_GENERIC_MMU_GATHER
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 /*
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -11,8 +11,6 @@
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
 
-#ifdef HAVE_GENERIC_MMU_GATHER
-
 #ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
 
 static bool tlb_next_batch(struct mmu_gather *tlb)
@@ -109,8 +107,6 @@ void tlb_flush_mmu(struct mmu_gather *tl
 	tlb_flush_mmu_free(tlb);
 }
 
-#endif /* HAVE_GENERIC_MMU_GATHER */
-
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 
 /*



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 17/18] asm-generic/tlb: Remove tlb_flush_mmu_free()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (15 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 16/18] asm-generic/tlb: Remove HAVE_GENERIC_MMU_GATHER Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  2019-02-19 10:32 ` [PATCH v6 18/18] asm-generic/tlb: Remove tlb_table_flush() Peter Zijlstra
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

As the comment notes, it is a potentially dangerous operation. Just
use tlb_flush_mmu(), which will skip the (double) TLB invalidate if
it really isn't needed anyway.
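
This is safe because tlb_flush_mmu_tlbonly() already bails when no
range was accumulated; roughly (a sketch):

	static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
	{
		if (!tlb->end)	/* nothing unmapped since the last flush */
			return;

		tlb_flush(tlb);
		__tlb_reset_range(tlb);
	}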

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |   10 +++-------
 mm/memory.c               |    2 +-
 mm/mmu_gather.c           |    2 +-
 3 files changed, 5 insertions(+), 9 deletions(-)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -67,16 +67,13 @@
  *    call before __tlb_remove_page*() to set the current page-size; implies a
  *    possible tlb_flush_mmu() call.
  *
- *  - tlb_flush_mmu() / tlb_flush_mmu_tlbonly() / tlb_flush_mmu_free()
+ *  - tlb_flush_mmu() / tlb_flush_mmu_tlbonly()
  *
  *    tlb_flush_mmu_tlbonly() - does the TLB invalidate (and resets
  *                              related state, like the range)
  *
- *    tlb_flush_mmu_free() - frees the queued pages; make absolutely
- *			     sure no additional tlb_remove_page()
- *			     calls happen between _tlbonly() and this.
- *
- *    tlb_flush_mmu() - the above two calls.
+ *    tlb_flush_mmu() - in addition to the above TLB invalidate, also frees
+ *			whatever pages are still batched.
  *
  *  - mmu_gather::fullmm
  *
@@ -274,7 +271,6 @@ void arch_tlb_gather_mmu(struct mmu_gath
 void tlb_flush_mmu(struct mmu_gather *tlb);
 void arch_tlb_finish_mmu(struct mmu_gather *tlb,
 			 unsigned long start, unsigned long end, bool force);
-void tlb_flush_mmu_free(struct mmu_gather *tlb);
 
 static inline void __tlb_adjust_range(struct mmu_gather *tlb,
 				      unsigned long address,
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1155,7 +1155,7 @@ static unsigned long zap_pte_range(struc
 	 */
 	if (force_flush) {
 		force_flush = 0;
-		tlb_flush_mmu_free(tlb);
+		tlb_flush_mmu(tlb);
 		if (addr != end)
 			goto again;
 	}
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -91,7 +91,7 @@ bool __tlb_remove_page_size(struct mmu_g
 
 #endif /* HAVE_MMU_GATHER_NO_GATHER */
 
-void tlb_flush_mmu_free(struct mmu_gather *tlb)
+static void tlb_flush_mmu_free(struct mmu_gather *tlb)
 {
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb_table_flush(tlb);



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6 18/18] asm-generic/tlb: Remove tlb_table_flush()
  2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
                   ` (16 preceding siblings ...)
  2019-02-19 10:32 ` [PATCH v6 17/18] asm-generic/tlb: Remove tlb_flush_mmu_free() Peter Zijlstra
@ 2019-02-19 10:32 ` Peter Zijlstra
  17 siblings, 0 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 10:32 UTC (permalink / raw)
  To: will.deacon, aneesh.kumar, akpm, npiggin
  Cc: linux-arch, linux-mm, linux-kernel, peterz, linux, heiko.carstens, riel

There are no external users of this API (nor should there be); remove it.
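
After this, the only table-freeing entry point left visible to
architectures is tlb_remove_table(). A hedged sketch of the arch-side call
shape (the macro below is illustrative, not taken from any in-tree
architecture):

	/*
	 * Queue a page-table page for (RCU-delayed) freeing; the batch is
	 * drained internally from tlb_flush_mmu(), so arch code never needs
	 * to call tlb_table_flush() itself.
	 */
	#define __pte_free_tlb(tlb, ptep, address)		\
		tlb_remove_table((tlb), (void *)(ptep))

(The (void *) cast stands in for whatever pte-to-table-pointer conversion a
given architecture actually needs.)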

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h |    1 -
 mm/mmu_gather.c           |   34 +++++++++++++++++-----------------
 2 files changed, 17 insertions(+), 18 deletions(-)

--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -174,7 +174,6 @@ struct mmu_table_batch {
 #define MAX_TABLE_BATCH		\
 	((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
 
-extern void tlb_table_flush(struct mmu_gather *tlb);
 extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
 
 #endif
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -91,22 +91,6 @@ bool __tlb_remove_page_size(struct mmu_g
 
 #endif /* HAVE_MMU_GATHER_NO_GATHER */
 
-static void tlb_flush_mmu_free(struct mmu_gather *tlb)
-{
-#ifdef CONFIG_HAVE_RCU_TABLE_FREE
-	tlb_table_flush(tlb);
-#endif
-#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
-	tlb_batch_pages_flush(tlb);
-#endif
-}
-
-void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	tlb_flush_mmu_tlbonly(tlb);
-	tlb_flush_mmu_free(tlb);
-}
-
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 
 /*
@@ -159,7 +143,7 @@ static void tlb_remove_table_rcu(struct
 	free_page((unsigned long)batch);
 }
 
-void tlb_table_flush(struct mmu_gather *tlb)
+static void tlb_table_flush(struct mmu_gather *tlb)
 {
 	struct mmu_table_batch **batch = &tlb->batch;
 
@@ -191,6 +175,22 @@ void tlb_remove_table(struct mmu_gather
 
 #endif /* CONFIG_HAVE_RCU_TABLE_FREE */
 
+static void tlb_flush_mmu_free(struct mmu_gather *tlb)
+{
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb_table_flush(tlb);
+#endif
+#ifndef CONFIG_HAVE_MMU_GATHER_NO_GATHER
+	tlb_batch_pages_flush(tlb);
+#endif
+}
+
+void tlb_flush_mmu(struct mmu_gather *tlb)
+{
+	tlb_flush_mmu_tlbonly(tlb);
+	tlb_flush_mmu_free(tlb);
+}
+
 /**
  * tlb_gather_mmu - initialize an mmu_gather structure for page-table tear-down
  * @tlb: the mmu_gather structure to initialize



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm()
  2019-02-19 10:31 ` [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() Peter Zijlstra
@ 2019-02-19 12:47   ` Will Deacon
  0 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2019-02-19 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel

On Tue, Feb 19, 2019 at 11:31:53AM +0100, Peter Zijlstra wrote:
> When an architecture does not have (an efficient) flush_tlb_range(),
> but instead always uses full TLB invalidates, the current generic
> tlb_flush() is sub-optimal, for it will generate extra flushes in
> order to keep the range small.
> 
> But if we cannot do range flushes, that is a moot concern. Optionally
> provide this simplified default.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/asm-generic/tlb.h |   41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -114,7 +114,8 @@
>   *    returns the smallest TLB entry size unmapped in this range.
>   *
>   * If an architecture does not provide tlb_flush() a default implementation
> - * based on flush_tlb_range() will be used.
> + * based on flush_tlb_range() will be used, unless MMU_GATHER_NO_RANGE is
> + * specified, in which case we'll default to flush_tlb_mm().
>   *
>   * Additionally there are a few opt-in features:
>   *
> @@ -140,6 +141,9 @@
>   *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
>   *  architecture uses the Linux page-tables natively.
>   *
> + *  MMU_GATHER_NO_RANGE
> + *
> + *  Use this if your architecture lacks an efficient flush_tlb_range().
>   */
>  #define HAVE_GENERIC_MMU_GATHER
>  
> @@ -302,12 +306,45 @@ static inline void __tlb_reset_range(str
>  	 */
>  }
>  
> +#ifdef CONFIG_MMU_GATHER_NO_RANGE
> +
> +#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma)
> +#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma()
> +#endif
> +
> +/*
> + * When an architecture does not have efficient means of range flushing TLBs
> + * there is no point in doing intermediate flushes on tlb_end_vma() to keep the
> + * range small. We equally don't have to worry about page granularity or other
> + * things.
> + *
> + * All we need to do is issue a full flush for any !0 range.
> + */
> +static inline void tlb_flush(struct mmu_gather *tlb)
> +{
> +	if (tlb->end)
> +		flush_tlb_mm(tlb->mm);
> +}

I guess another way we could handle these architectures is by
unconditionally resetting tlb->fullmm to 1, but this works too.
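
Purely for illustration, that alternative might look like the sketch below
-- assuming it were applied wherever the mmu_gather is initialized; none of
this is in the patch:

	static inline void arch_tlb_gather_mmu(struct mmu_gather *tlb,
					       struct mm_struct *mm,
					       unsigned long start,
					       unsigned long end)
	{
		/* ... existing setup ... */
		tlb->fullmm = 1;	/* always take the full-mm flush path */
	}

though fullmm also influences other paths (the tlb_start_vma()/tlb_end_vma()
shortcuts, for one), so the explicit tlb_flush() above is arguably the more
contained change.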

Acked-by: Will Deacon <will.deacon@arm.com>

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 09/18] ia64/tlb: Conver to generic mmu_gather
  2019-02-19 10:31 ` [PATCH v6 09/18] ia64/tlb: Conver " Peter Zijlstra
@ 2019-02-19 12:47   ` Will Deacon
  2019-02-21  2:52   ` Souptick Joarder
  1 sibling, 0 replies; 42+ messages in thread
From: Will Deacon @ 2019-02-19 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel, Tony Luck

On Tue, Feb 19, 2019 at 11:31:57AM +0100, Peter Zijlstra wrote:
> Generic mmu_gather provides everything ia64 needs (range tracking).
> 
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nick Piggin <npiggin@gmail.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/ia64/include/asm/tlb.h      |  256 ---------------------------------------
>  arch/ia64/include/asm/tlbflush.h |   25 +++
>  arch/ia64/mm/tlb.c               |   23 +++
>  3 files changed, 47 insertions(+), 257 deletions(-)

Typo in subject (s/Conver/Convert) but other than that this looks sensible:

Acked-by: Will Deacon <will.deacon@arm.com>

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish()
  2019-02-19 10:31 ` [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish() Peter Zijlstra
@ 2019-02-19 12:47   ` Will Deacon
  2019-02-19 13:41     ` Peter Zijlstra
  0 siblings, 1 reply; 42+ messages in thread
From: Will Deacon @ 2019-02-19 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel

On Tue, Feb 19, 2019 at 11:31:54AM +0100, Peter Zijlstra wrote:
> Needed for ia64 -- alternatively we drop the entire hook.
> 
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nick Piggin <npiggin@gmail.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/asm-generic/tlb.h |    2 ++
>  1 file changed, 2 insertions(+)
> 
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -539,6 +539,8 @@ static inline void tlb_end_vma(struct mm
>  
>  #endif /* CONFIG_MMU */
>  
> +#ifndef tlb_migrate_finish
>  #define tlb_migrate_finish(mm) do {} while (0)
> +#endif

Fine for now, but I agree that we should drop the hook altogether. AFAICT,
this only exists to help an ia64 optimisation which looks suspicious to
me since it uses:

    mm == current->active_mm && atomic_read(&mm->mm_users) == 1

to identify a "single-threaded fork()" and therefore perform only local TLB
invalidation. Even if this was the right thing to do, it's not clear to me
that tlb_migrate_finish() is called on the right CPU anyway.

So I'd be keen to remove this hook before it spreads, but in the meantime:

Acked-by: Will Deacon <will.deacon@arm.com>

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather
  2019-02-19 10:32 ` [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather Peter Zijlstra
@ 2019-02-19 12:47   ` Will Deacon
  0 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2019-02-19 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel, Linus Torvalds, Martin Schwidefsky

On Tue, Feb 19, 2019 at 11:32:02AM +0100, Peter Zijlstra wrote:
> 
> Cc: heiko.carstens@de.ibm.com
> Cc: npiggin@gmail.com
> Cc: akpm@linux-foundation.org
> Cc: aneesh.kumar@linux.vnet.ibm.com
> Cc: will.deacon@arm.com
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: linux@armlinux.org.uk
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/20180918125151.31744-3-schwidefsky@de.ibm.com
> ---
>  arch/s390/Kconfig           |    2 
>  arch/s390/include/asm/tlb.h |  128 +++++++++++++-------------------------------
>  arch/s390/mm/pgalloc.c      |   63 ---------------------
>  3 files changed, 42 insertions(+), 151 deletions(-)

-ENOCOMMITMESSAGE ?

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER
  2019-02-19 10:32 ` [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER Peter Zijlstra
@ 2019-02-19 12:47   ` Will Deacon
  0 siblings, 0 replies; 42+ messages in thread
From: Will Deacon @ 2019-02-19 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel, Linus Torvalds, Martin Schwidefsky

On Tue, Feb 19, 2019 at 11:32:01AM +0100, Peter Zijlstra wrote:
> Add the Kconfig option HAVE_MMU_GATHER_NO_GATHER to the generic
> mmu_gather code. If the option is set the mmu_gather will not
> track individual pages for delayed page free anymore. A platform
> that enables the option needs to provide its own implementation
> of the __tlb_remove_page_size function to free pages.
> 
> Cc: npiggin@gmail.com
> Cc: heiko.carstens@de.ibm.com
> Cc: will.deacon@arm.com
> Cc: aneesh.kumar@linux.vnet.ibm.com
> Cc: akpm@linux-foundation.org
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: linux@armlinux.org.uk
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/20180918125151.31744-2-schwidefsky@de.ibm.com
> ---
>  arch/Kconfig              |    3 +
>  include/asm-generic/tlb.h |    9 +++
>  mm/mmu_gather.c           |  107 +++++++++++++++++++++++++---------------------
>  3 files changed, 70 insertions(+), 49 deletions(-)

Acked-by: Will Deacon <will.deacon@arm.com>

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish()
  2019-02-19 12:47   ` Will Deacon
@ 2019-02-19 13:41     ` Peter Zijlstra
  2019-02-20 14:47       ` Will Deacon
  0 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-02-19 13:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel, tony.luck

On Tue, Feb 19, 2019 at 12:47:38PM +0000, Will Deacon wrote:
> Fine for now, but I agree that we should drop the hook altogether. AFAICT,
> this only exists to help an ia64 optimisation which looks suspicious to
> me since it uses:
> 
>     mm == current->active_mm && atomic_read(&mm->mm_users) == 1
> 
> to identify a "single-threaded fork()" and therefore perform only local TLB
> invalidation. Even if this was the right thing to do, it's not clear to me
> that tlb_migrate_finish() is called on the right CPU anyway.
> 
> So I'd be keen to remove this hook before it spreads, but in the meantime:

Agreed :-)

The obvious slash-and-kill patch ... untested

---
 Documentation/core-api/cachetlb.rst |   10 ----------
 arch/ia64/include/asm/machvec.h     |    8 --------
 arch/ia64/include/asm/machvec_sn2.h |    2 --
 arch/ia64/include/asm/tlb.h         |    2 --
 arch/ia64/sn/kernel/sn2/sn2_smp.c   |    7 -------
 arch/nds32/include/asm/tlbflush.h   |    1 -
 include/asm-generic/tlb.h           |    4 ----
 kernel/sched/core.c                 |    1 -
 8 files changed, 35 deletions(-)

--- a/Documentation/core-api/cachetlb.rst
+++ b/Documentation/core-api/cachetlb.rst
@@ -101,16 +101,6 @@ invoke one of the following flush method
 	translations for software managed TLB configurations.
 	The sparc64 port currently does this.
 
-6) ``void tlb_migrate_finish(struct mm_struct *mm)``
-
-	This interface is called at the end of an explicit
-	process migration. This interface provides a hook
-	to allow a platform to update TLB or context-specific
-	information for the address space.
-
-	The ia64 sn2 platform is one example of a platform
-	that uses this interface.
-
 Next, we have the cache flushing interfaces.  In general, when Linux
 is changing an existing virtual-->physical mapping to a new value,
 the sequence will be in one of the following forms::
--- a/arch/ia64/include/asm/machvec.h
+++ b/arch/ia64/include/asm/machvec.h
@@ -30,7 +30,6 @@ typedef void ia64_mv_irq_init_t (void);
 typedef void ia64_mv_send_ipi_t (int, int, int, int);
 typedef void ia64_mv_timer_interrupt_t (int, void *);
 typedef void ia64_mv_global_tlb_purge_t (struct mm_struct *, unsigned long, unsigned long, unsigned long);
-typedef void ia64_mv_tlb_migrate_finish_t (struct mm_struct *);
 typedef u8 ia64_mv_irq_to_vector (int);
 typedef unsigned int ia64_mv_local_vector_to_irq (u8);
 typedef char *ia64_mv_pci_get_legacy_mem_t (struct pci_bus *);
@@ -96,7 +95,6 @@ machvec_noop_bus (struct pci_bus *bus)
 
 extern void machvec_setup (char **);
 extern void machvec_timer_interrupt (int, void *);
-extern void machvec_tlb_migrate_finish (struct mm_struct *);
 
 # if defined (CONFIG_IA64_HP_SIM)
 #  include <asm/machvec_hpsim.h>
@@ -124,7 +122,6 @@ extern void machvec_tlb_migrate_finish (
 #  define platform_send_ipi	ia64_mv.send_ipi
 #  define platform_timer_interrupt	ia64_mv.timer_interrupt
 #  define platform_global_tlb_purge	ia64_mv.global_tlb_purge
-#  define platform_tlb_migrate_finish	ia64_mv.tlb_migrate_finish
 #  define platform_dma_init		ia64_mv.dma_init
 #  define platform_dma_get_ops		ia64_mv.dma_get_ops
 #  define platform_irq_to_vector	ia64_mv.irq_to_vector
@@ -167,7 +164,6 @@ struct ia64_machine_vector {
 	ia64_mv_send_ipi_t *send_ipi;
 	ia64_mv_timer_interrupt_t *timer_interrupt;
 	ia64_mv_global_tlb_purge_t *global_tlb_purge;
-	ia64_mv_tlb_migrate_finish_t *tlb_migrate_finish;
 	ia64_mv_dma_init *dma_init;
 	ia64_mv_dma_get_ops *dma_get_ops;
 	ia64_mv_irq_to_vector *irq_to_vector;
@@ -206,7 +202,6 @@ struct ia64_machine_vector {
 	platform_send_ipi,			\
 	platform_timer_interrupt,		\
 	platform_global_tlb_purge,		\
-	platform_tlb_migrate_finish,		\
 	platform_dma_init,			\
 	platform_dma_get_ops,			\
 	platform_irq_to_vector,			\
@@ -270,9 +265,6 @@ extern const struct dma_map_ops *dma_get
 #ifndef platform_global_tlb_purge
 # define platform_global_tlb_purge	ia64_global_tlb_purge /* default to architected version */
 #endif
-#ifndef platform_tlb_migrate_finish
-# define platform_tlb_migrate_finish	machvec_noop_mm
-#endif
 #ifndef platform_kernel_launch_event
 # define platform_kernel_launch_event	machvec_noop
 #endif
--- a/arch/ia64/include/asm/machvec_sn2.h
+++ b/arch/ia64/include/asm/machvec_sn2.h
@@ -34,7 +34,6 @@ extern ia64_mv_irq_init_t sn_irq_init;
 extern ia64_mv_send_ipi_t sn2_send_IPI;
 extern ia64_mv_timer_interrupt_t sn_timer_interrupt;
 extern ia64_mv_global_tlb_purge_t sn2_global_tlb_purge;
-extern ia64_mv_tlb_migrate_finish_t	sn_tlb_migrate_finish;
 extern ia64_mv_irq_to_vector sn_irq_to_vector;
 extern ia64_mv_local_vector_to_irq sn_local_vector_to_irq;
 extern ia64_mv_pci_get_legacy_mem_t sn_pci_get_legacy_mem;
@@ -77,7 +76,6 @@ extern ia64_mv_pci_fixup_bus_t		sn_pci_f
 #define platform_send_ipi		sn2_send_IPI
 #define platform_timer_interrupt	sn_timer_interrupt
 #define platform_global_tlb_purge       sn2_global_tlb_purge
-#define platform_tlb_migrate_finish	sn_tlb_migrate_finish
 #define platform_pci_fixup		sn_pci_fixup
 #define platform_inb			__sn_inb
 #define platform_inw			__sn_inw
--- a/arch/ia64/include/asm/tlb.h
+++ b/arch/ia64/include/asm/tlb.h
@@ -47,8 +47,6 @@
 #include <asm/tlbflush.h>
 #include <asm/machvec.h>
 
-#define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)
-
 #include <asm-generic/tlb.h>
 
 #endif /* _ASM_IA64_TLB_H */
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -120,13 +120,6 @@ void sn_migrate(struct task_struct *task
 		cpu_relax();
 }
 
-void sn_tlb_migrate_finish(struct mm_struct *mm)
-{
-	/* flush_tlb_mm is inefficient if more than 1 users of mm */
-	if (mm == current->mm && mm && atomic_read(&mm->mm_users) == 1)
-		flush_tlb_mm(mm);
-}
-
 static void
 sn2_ipi_flush_all_tlb(struct mm_struct *mm)
 {
--- a/arch/nds32/include/asm/tlbflush.h
+++ b/arch/nds32/include/asm/tlbflush.h
@@ -42,6 +42,5 @@ void local_flush_tlb_page(struct vm_area
 
 void update_mmu_cache(struct vm_area_struct *vma,
 		      unsigned long address, pte_t * pte);
-void tlb_migrate_finish(struct mm_struct *mm);
 
 #endif
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -604,8 +604,4 @@ static inline void tlb_end_vma(struct mm
 
 #endif /* CONFIG_MMU */
 
-#ifndef tlb_migrate_finish
-#define tlb_migrate_finish(mm) do {} while (0)
-#endif
-
 #endif /* _ASM_GENERIC__TLB_H */
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1151,7 +1151,6 @@ static int __set_cpus_allowed_ptr(struct
 		/* Need help from migration thread: drop lock and wait. */
 		task_rq_unlock(rq, p, &rf);
 		stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
-		tlb_migrate_finish(p->mm);
 		return 0;
 	} else if (task_on_rq_queued(p)) {
 		/*


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish()
  2019-02-19 13:41     ` Peter Zijlstra
@ 2019-02-20 14:47       ` Will Deacon
  2019-02-20 15:02         ` Matthew Wilcox
  0 siblings, 1 reply; 42+ messages in thread
From: Will Deacon @ 2019-02-20 14:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aneesh.kumar, akpm, npiggin, linux-arch, linux-mm, linux-kernel,
	linux, heiko.carstens, riel, tony.luck

On Tue, Feb 19, 2019 at 02:41:47PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 19, 2019 at 12:47:38PM +0000, Will Deacon wrote:
> > Fine for now, but I agree that we should drop the hook altogether. AFAICT,
> > this only exists to help an ia64 optimisation which looks suspicious to
> > me since it uses:
> > 
> >     mm == current->active_mm && atomic_read(&mm->mm_users) == 1
> > 
> > to identify a "single-threaded fork()" and therefore perform only local TLB
> > invalidation. Even if this was the right thing to do, it's not clear to me
> > that tlb_migrate_finish() is called on the right CPU anyway.
> > 
> > So I'd be keen to remove this hook before it spreads, but in the meantime:
> 
> Agreed :-)
> 
> The obvious slash and kill patch ... untested

I'm also unable to test this, unfortunately. Can we get it into next after
the merge window and see if anybody reports issues?

Will


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish()
  2019-02-20 14:47       ` Will Deacon
@ 2019-02-20 15:02         ` Matthew Wilcox
  0 siblings, 0 replies; 42+ messages in thread
From: Matthew Wilcox @ 2019-02-20 15:02 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, aneesh.kumar, akpm, npiggin, linux-arch,
	linux-mm, linux-kernel, linux, heiko.carstens, riel, tony.luck

On Wed, Feb 20, 2019 at 02:47:05PM +0000, Will Deacon wrote:
> On Tue, Feb 19, 2019 at 02:41:47PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 19, 2019 at 12:47:38PM +0000, Will Deacon wrote:
> > > Fine for now, but I agree that we should drop the hook altogether. AFAICT,
> > > this only exists to help an ia64 optimisation which looks suspicious to
> > > me since it uses:
> > > 
> > >     mm == current->active_mm && atomic_read(&mm->mm_users) == 1
> > > 
> > > to identify a "single-threaded fork()" and therefore perform only local TLB
> > > invalidation. Even if this was the right thing to do, it's not clear to me
> > > that tlb_migrate_finish() is called on the right CPU anyway.
> > > 
> > > So I'd be keen to remove this hook before it spreads, but in the meantime:
> > 
> > Agreed :-)
> > 
> > The obvious slash and kill patch ... untested
> 
> I'm also unable to test this, unfortunately. Can we get it into next after
> the merge window and see if anybody reports issues?

While I do have a pair of Itanium systems in my basement, neither are
sn2 machines, which was the only sub-architecture that implemented
tlb_migrate_finish().  I see NASA decommissioned Columbia in 2013, and
I imagine most sn2 machines have been similarly scrapped.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 09/18] ia64/tlb: Conver to generic mmu_gather
  2019-02-19 10:31 ` [PATCH v6 09/18] ia64/tlb: Conver " Peter Zijlstra
  2019-02-19 12:47   ` Will Deacon
@ 2019-02-21  2:52   ` Souptick Joarder
  1 sibling, 0 replies; 42+ messages in thread
From: Souptick Joarder @ 2019-02-21  2:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: will.deacon, Aneesh Kumar K.V, Andrew Morton, npiggin,
	linux-arch, Linux-MM, linux-kernel, Russell King - ARM Linux,
	heiko.carstens, Rik van Riel, Tony Luck

On Tue, Feb 19, 2019 at 4:03 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Generic mmu_gather provides everything ia64 needs (range tracking).
>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nick Piggin <npiggin@gmail.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/ia64/include/asm/tlb.h      |  256 ---------------------------------------
>  arch/ia64/include/asm/tlbflush.h |   25 +++
>  arch/ia64/mm/tlb.c               |   23 +++
>  3 files changed, 47 insertions(+), 257 deletions(-)
>
> --- a/arch/ia64/include/asm/tlb.h
> +++ b/arch/ia64/include/asm/tlb.h
> @@ -47,262 +47,8 @@
>  #include <asm/tlbflush.h>
>  #include <asm/machvec.h>
>
> -/*
> - * If we can't allocate a page to make a big batch of page pointers
> - * to work on, then just handle a few from the on-stack structure.
> - */
> -#define        IA64_GATHER_BUNDLE      8
> -
> -struct mmu_gather {
> -       struct mm_struct        *mm;
> -       unsigned int            nr;
> -       unsigned int            max;
> -       unsigned char           fullmm;         /* non-zero means full mm flush */
> -       unsigned char           need_flush;     /* really unmapped some PTEs? */
> -       unsigned long           start, end;
> -       unsigned long           start_addr;
> -       unsigned long           end_addr;
> -       struct page             **pages;
> -       struct page             *local[IA64_GATHER_BUNDLE];
> -};
> -
> -struct ia64_tr_entry {
> -       u64 ifa;
> -       u64 itir;
> -       u64 pte;
> -       u64 rr;
> -}; /*Record for tr entry!*/
> -
> -extern int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size);
> -extern void ia64_ptr_entry(u64 target_mask, int slot);
> -
> -extern struct ia64_tr_entry *ia64_idtrs[NR_CPUS];
> -
> -/*
> - region register macros
> -*/
> -#define RR_TO_VE(val)   (((val) >> 0) & 0x0000000000000001)
> -#define RR_VE(val)     (((val) & 0x0000000000000001) << 0)
> -#define RR_VE_MASK     0x0000000000000001L
> -#define RR_VE_SHIFT    0
> -#define RR_TO_PS(val)  (((val) >> 2) & 0x000000000000003f)
> -#define RR_PS(val)     (((val) & 0x000000000000003f) << 2)
> -#define RR_PS_MASK     0x00000000000000fcL
> -#define RR_PS_SHIFT    2
> -#define RR_RID_MASK    0x00000000ffffff00L
> -#define RR_TO_RID(val)         ((val >> 8) & 0xffffff)
> -
> -static inline void
> -ia64_tlb_flush_mmu_tlbonly(struct mmu_gather *tlb, unsigned long start, unsigned long end)
> -{
> -       tlb->need_flush = 0;
> -
> -       if (tlb->fullmm) {
> -               /*
> -                * Tearing down the entire address space.  This happens both as a result
> -                * of exit() and execve().  The latter case necessitates the call to
> -                * flush_tlb_mm() here.
> -                */
> -               flush_tlb_mm(tlb->mm);
> -       } else if (unlikely (end - start >= 1024*1024*1024*1024UL
> -                            || REGION_NUMBER(start) != REGION_NUMBER(end - 1)))
> -       {
> -               /*
> -                * If we flush more than a tera-byte or across regions, we're probably
> -                * better off just flushing the entire TLB(s).  This should be very rare
> -                * and is not worth optimizing for.
> -                */
> -               flush_tlb_all();
> -       } else {
> -               /*
> -                * flush_tlb_range() takes a vma instead of a mm pointer because
> -                * some architectures want the vm_flags for ITLB/DTLB flush.
> -                */
> -               struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
> -
> -               /* flush the address range from the tlb: */
> -               flush_tlb_range(&vma, start, end);
> -               /* now flush the virt. page-table area mapping the address range: */
> -               flush_tlb_range(&vma, ia64_thash(start), ia64_thash(end));
> -       }
> -
> -}
> -
> -static inline void
> -ia64_tlb_flush_mmu_free(struct mmu_gather *tlb)
> -{
> -       unsigned long i;
> -       unsigned int nr;
> -
> -       /* lastly, release the freed pages */
> -       nr = tlb->nr;
> -
> -       tlb->nr = 0;
> -       tlb->start_addr = ~0UL;
> -       for (i = 0; i < nr; ++i)
> -               free_page_and_swap_cache(tlb->pages[i]);
> -}
> -
> -/*
> - * Flush the TLB for address range START to END and, if not in fast mode, release the
> - * freed pages that where gathered up to this point.
> - */
> -static inline void
> -ia64_tlb_flush_mmu (struct mmu_gather *tlb, unsigned long start, unsigned long end)
> -{
> -       if (!tlb->need_flush)
> -               return;
> -       ia64_tlb_flush_mmu_tlbonly(tlb, start, end);
> -       ia64_tlb_flush_mmu_free(tlb);
> -}
> -
> -static inline void __tlb_alloc_page(struct mmu_gather *tlb)
> -{
> -       unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
> -
> -       if (addr) {
> -               tlb->pages = (void *)addr;
> -               tlb->max = PAGE_SIZE / sizeof(void *);
> -       }
> -}
> -
> -
> -static inline void
> -arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> -                       unsigned long start, unsigned long end)
> -{
> -       tlb->mm = mm;
> -       tlb->max = ARRAY_SIZE(tlb->local);
> -       tlb->pages = tlb->local;
> -       tlb->nr = 0;
> -       tlb->fullmm = !(start | (end+1));
> -       tlb->start = start;
> -       tlb->end = end;
> -       tlb->start_addr = ~0UL;
> -}
> -
> -/*
> - * Called at the end of the shootdown operation to free up any resources that were
> - * collected.
> - */
> -static inline void
> -arch_tlb_finish_mmu(struct mmu_gather *tlb,
> -                       unsigned long start, unsigned long end, bool force)
> -{
> -       if (force)
> -               tlb->need_flush = 1;
> -       /*
> -        * Note: tlb->nr may be 0 at this point, so we can't rely on tlb->start_addr and
> -        * tlb->end_addr.
> -        */
> -       ia64_tlb_flush_mmu(tlb, start, end);
> -
> -       /* keep the page table cache within bounds */
> -       check_pgt_cache();
> -
> -       if (tlb->pages != tlb->local)
> -               free_pages((unsigned long)tlb->pages, 0);
> -}
> -
> -/*
> - * Logically, this routine frees PAGE.  On MP machines, the actual freeing of the page
> - * must be delayed until after the TLB has been flushed (see comments at the beginning of
> - * this file).
> - */
> -static inline bool __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> -{
> -       tlb->need_flush = 1;
> -
> -       if (!tlb->nr && tlb->pages == tlb->local)
> -               __tlb_alloc_page(tlb);
> -
> -       tlb->pages[tlb->nr++] = page;
> -       VM_WARN_ON(tlb->nr > tlb->max);
> -       if (tlb->nr == tlb->max)
> -               return true;
> -       return false;
> -}
> -
> -static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
> -{
> -       ia64_tlb_flush_mmu_tlbonly(tlb, tlb->start_addr, tlb->end_addr);
> -}
> -
> -static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
> -{
> -       ia64_tlb_flush_mmu_free(tlb);
> -}
> -
> -static inline void tlb_flush_mmu(struct mmu_gather *tlb)
> -{
> -       ia64_tlb_flush_mmu(tlb, tlb->start_addr, tlb->end_addr);
> -}
> -
> -static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> -{
> -       if (__tlb_remove_page(tlb, page))
> -               tlb_flush_mmu(tlb);
> -}
> -
> -static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
> -                                         struct page *page, int page_size)
> -{
> -       return __tlb_remove_page(tlb, page);
> -}
> -
> -static inline void tlb_remove_page_size(struct mmu_gather *tlb,
> -                                       struct page *page, int page_size)
> -{
> -       return tlb_remove_page(tlb, page);
> -}
> -
> -/*
> - * Remove TLB entry for PTE mapped at virtual address ADDRESS.  This is called for any
> - * PTE, not just those pointing to (normal) physical memory.
> - */
> -static inline void
> -__tlb_remove_tlb_entry (struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
> -{
> -       if (tlb->start_addr == ~0UL)
> -               tlb->start_addr = address;
> -       tlb->end_addr = address + PAGE_SIZE;
> -}
> -
>  #define tlb_migrate_finish(mm) platform_tlb_migrate_finish(mm)
>
> -#define tlb_start_vma(tlb, vma)                        do { } while (0)
> -#define tlb_end_vma(tlb, vma)                  do { } while (0)
> -
> -#define tlb_remove_tlb_entry(tlb, ptep, addr)          \
> -do {                                                   \
> -       tlb->need_flush = 1;                            \
> -       __tlb_remove_tlb_entry(tlb, ptep, addr);        \
> -} while (0)
> -
> -#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)       \
> -       tlb_remove_tlb_entry(tlb, ptep, address)
> -
> -static inline void tlb_change_page_size(struct mmu_gather *tlb,
> -                                                    unsigned int page_size)
> -{
> -}
> -
> -#define pte_free_tlb(tlb, ptep, address)               \
> -do {                                                   \
> -       tlb->need_flush = 1;                            \
> -       __pte_free_tlb(tlb, ptep, address);             \
> -} while (0)
> -
> -#define pmd_free_tlb(tlb, ptep, address)               \
> -do {                                                   \
> -       tlb->need_flush = 1;                            \
> -       __pmd_free_tlb(tlb, ptep, address);             \
> -} while (0)
> -
> -#define pud_free_tlb(tlb, pudp, address)               \
> -do {                                                   \
> -       tlb->need_flush = 1;                            \
> -       __pud_free_tlb(tlb, pudp, address);             \
> -} while (0)
> +#include <asm-generic/tlb.h>
>
>  #endif /* _ASM_IA64_TLB_H */
> --- a/arch/ia64/include/asm/tlbflush.h
> +++ b/arch/ia64/include/asm/tlbflush.h
> @@ -14,6 +14,31 @@
>  #include <asm/mmu_context.h>
>  #include <asm/page.h>
>
> +struct ia64_tr_entry {
> +       u64 ifa;
> +       u64 itir;
> +       u64 pte;
> +       u64 rr;
> +}; /*Record for tr entry!*/
> +
> +extern int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size);
> +extern void ia64_ptr_entry(u64 target_mask, int slot);
> +extern struct ia64_tr_entry *ia64_idtrs[NR_CPUS];
> +
> +/*
> + region register macros
> +*/
> +#define RR_TO_VE(val)   (((val) >> 0) & 0x0000000000000001)
> +#define RR_VE(val)     (((val) & 0x0000000000000001) << 0)
> +#define RR_VE_MASK     0x0000000000000001L
> +#define RR_VE_SHIFT    0
> +#define RR_TO_PS(val)  (((val) >> 2) & 0x000000000000003f)
> +#define RR_PS(val)     (((val) & 0x000000000000003f) << 2)
> +#define RR_PS_MASK     0x00000000000000fcL
> +#define RR_PS_SHIFT    2
> +#define RR_RID_MASK    0x00000000ffffff00L
> +#define RR_TO_RID(val)         ((val >> 8) & 0xffffff)
> +
>  /*
>   * Now for some TLB flushing routines.  This is the kind of stuff that
>   * can be very expensive, so try to avoid them whenever possible.
> --- a/arch/ia64/mm/tlb.c
> +++ b/arch/ia64/mm/tlb.c
> @@ -297,8 +297,8 @@ local_flush_tlb_all (void)
>         ia64_srlz_i();                  /* srlz.i implies srlz.d */
>  }
>
> -void
> -flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
> +static void
> +__flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
>                  unsigned long end)
>  {
>         struct mm_struct *mm = vma->vm_mm;
> @@ -335,6 +335,25 @@ flush_tlb_range (struct vm_area_struct *
>         preempt_enable();
>         ia64_srlz_i();                  /* srlz.i implies srlz.d */
>  }
> +
> +void flush_tlb_range(struct vm_area_struct *vma,
> +               unsigned long start, unsigned long end)
> +{
> +       if (unlikely(end - start >= 1024*1024*1024*1024UL
> +                       || REGION_NUMBER(start) != REGION_NUMBER(end - 1))) {
> +               /*
> +                * If we flush more than a tera-byte or across regions, we're
> +                * probably better off just flushing the entire TLB(s).  This
> +                * should be very rare and is not worth optimizing for.
> +                */
> +               flush_tlb_all();
> +       } else {
> +               /* flush the address range from the tlb */
> +               __flush_tlb_range(vma, start, end);
> +               /* flush the virt. page-table area mapping the addr range */
> +               __flush_tlb_range(vma, ia64_thash(start), ia64_thash(end));
> +       }
> +}
>  EXPORT_SYMBOL(flush_tlb_range);
Just a minor one: as this is a public API, I think adding documentation
might be helpful.
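
Something along these lines, perhaps -- a sketch of the kind of kernel-doc
meant here, with wording that is illustrative rather than from the patch:

/**
 * flush_tlb_range - flush the TLB for a range of virtual addresses
 * @vma:	the vm_area_struct containing the range
 * @start:	first virtual address to flush
 * @end:	one past the last virtual address to flush
 *
 * Falls back to flush_tlb_all() when the range spans a terabyte or more,
 * or crosses an ia64 region boundary; otherwise flushes the range itself
 * and then the virtually mapped page-table area covering it.
 */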


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-02-19 10:31 ` [PATCH v6 10/18] sh/tlb: Convert SH " Peter Zijlstra
@ 2019-12-03 11:19   ` Geert Uytterhoeven
  2019-12-04 10:47     ` Peter Zijlstra
  0 siblings, 1 reply; 42+ messages in thread
From: Geert Uytterhoeven @ 2019-12-03 11:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

Hoi Peter,

On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
> Generic mmu_gather provides everything SH needs (range tracking and
> cache coherency).
>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nick Piggin <npiggin@gmail.com>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Cc: Rich Felker <dalias@libc.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

I got remote access to an SH7722-based Migo-R again, which spews a long
sequence of BUGs during userspace startup.  I've bisected this to commit
c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
Do you have a clue?
Thanks!

mount: mounting none on /dev failed: No such device
BUG: Bad page state in process grep  pfn:0e8c1
page:8fb50820 count:0 mapcount:2 mapping:8f403480 index:0x0
  (null)
flags: 0xc000000()
raw: 0c000000 00000100 00000200 8f403480 00000000 00000000 00000001 00000000
page dumped because: non-NULL mapping
CPU: 0 PID: 765 Comm: grep Tainted: G        W
5.1.0-rc3-migor-00024-gc5b27a889da92f4a #31
Stack: (0x8e8a7d4c to 0x8e8a8000)
7d40: 8c072e90 8fb50820 8fbf400c 00000000 8c0734bc
7d60: 00000200 8fbf4010 00000001 00000002 8fbf4000 00000002 8fbf400c 8e8a7d9c
7d80: 0030f231 00000100 8e8a7da0 00000000 8c2a8a38 8c2a8c0c 8c2a8a94 8c339fa0
7da0: 8fb052a4 8fb052a4 9165dc07 8c0744ea 00000010 00000000 8c0027ac 00000000
7dc0: 8c073ad0 8e8a7df8 8fb054a0 8fb50980 8c07ab54 000000aa 00000011 8e8b52b4
7de0: 00010000 000000aa 8c2a8a38 8fb50820 8c0027ac 00000000 8fb50824 8fb054a4
7e00: 9165dc07 8c0928e0 8c210e48 8c28900c 8e8a7ecc 8e8a7ee4 8c07a958 00000000
7e20: 8e8b5000 8c09296e 8e8a7ed4 8c08f414 8c28900c 8e8a7e60 8e8a7ecc 00000000
7e40: 9165dc07 8c099d08 8ef16178 8c339f60 00000001 00000000 8c0950bc 8c0027ac
7e60: 8c08b00e 7ba1bfff 40000000 7ba1c000 00000000 7ba1bfff 00000000 ffffffff
7e80: 8c08b1ec 8ef16140 8c08ae58 00000000 8ef16140 7b9fb000 8c094fe4 9165dc07
7ea0: 8c0929e6 8e8a7ecc 00000001 8e8a7ecc 8c092b3e 00100000 00000001 8e8a7ecc
7ec0: 8c090cfe 00000000 8ef16020 8f4b78e0 ffffffff ffffffff 8fb50501 00000001
7ee0: 8e8b5000 8e8b5000 00000000 00000008 8fb74f80 8fb74fa0 8fb74f00 8fb74f20
7f00: 8fb74f40 8fb74f60 8fb76c00 8fb76c20 9165dc07 8c0128a2 00000000 00000000
7f20: 8f4b7918 8f4b78e0 00000000 8f4b78e0 8c016326 8ea34f00 8ea3503c 8ea34f00
7f40: 8ea351bc 00000000 8ea3503c 8ea3507c 8c297c1c 8f4a717c 8f4f37a0 9165dc07
7f60: 8c016a58 7ba1bbe4 00000000 00000000 00000000 8eedd720 8eedd6e0 00000000
7f80: 8c016ac6 00000000 00000071 00000100 8c016abc 8c006242 00000000 00427b38
7fa0: 004c025c 00000000 00427b38 004c025c 000000fc 00000000 000000fc fffff000
7fc0: 00000000 00098d9c 00000000 004ac6f4 7ba1be94 00000000 00000000 7ba1bbe4
7fe0: 7ba1bbe4 00427b4e 0040a978 00000101 004c5450 00000022 ffffd000 00000044

Call trace:
 [<(ptrval)>] free_pcppages_bulk+0x114/0x33c
 [<(ptrval)>] free_unref_page_list+0xae/0x10c
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] free_unref_page_commit.isra.24+0x0/0x74
 [<(ptrval)>] release_pages+0x1fc/0x2d0
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] tlb_flush_mmu_free+0x30/0x4c
 [<(ptrval)>] down_read+0x0/0xc
 [<(ptrval)>] release_pages+0x0/0x2d0
 [<(ptrval)>] tlb_flush_mmu+0x72/0xc0
 [<(ptrval)>] remove_vma+0x0/0x48
 [<(ptrval)>] kmem_cache_free+0x34/0x90
 [<(ptrval)>] unlink_anon_vmas+0xd8/0x150
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] free_pgd_range+0x1b6/0x324
 [<(ptrval)>] free_pgtables+0x70/0xb4
 [<(ptrval)>] free_pgd_range+0x0/0x324
 [<(ptrval)>] unlink_anon_vmas+0x0/0x150
 [<(ptrval)>] arch_tlb_finish_mmu+0x2a/0x74
 [<(ptrval)>] tlb_finish_mmu+0x1a/0x38
 [<(ptrval)>] exit_mmap+0xa2/0x16c
 [<(ptrval)>] mmput+0x2a/0x80
 [<(ptrval)>] do_exit+0x186/0x85c
 [<(ptrval)>] do_group_exit+0x2c/0x90
 [<(ptrval)>] sys_exit_group+0xa/0x10
 [<(ptrval)>] sys_exit_group+0x0/0x10
 [<(ptrval)>] syscall_call+0x18/0x1e

Disabling lock debugging due to kernel taint
BUG: Bad page state in process rcS  pfn:0ee78
page:8fb5bf00 count:0 mapcount:2 mapping:8f403480 index:0x0
  (null)
flags: 0xc000000()
raw: 0c000000 00000100 00000200 8f403480 00000000 00000000 00000001 00000000
page dumped because: non-NULL mapping
CPU: 0 PID: 763 Comm: rcS Tainted: G    B   W
5.1.0-rc3-migor-00024-gc5b27a889da92f4a #31
Stack: (0x8ef55e54 to 0x8ef56000)
5e40: 8c072e90 8fb5bf00 8fbf400c
5e60: 00000000 8c0734bc 00000200 8fbf4010 00000001 00000002 8fbf4000 00000002
5e80: 8fbf400c 00989680 0030f231 00000100 8ef55ea8 00000000 8c2a8a38 00000028
5ea0: 00000000 8c29d340 8fb05344 8fb05344 b362ad67 8c074416 8f4f38e0 8c0027ac
5ec0: 8e9125e0 00000000 00000000 8c0027ac 8fb50760 8c0a59e4 8e9125a0 00000180
5ee0: 00000010 8c0a5a9a 8f4a69e0 8f01ad00 8f4f38e0 8f01b21c 8e9125a0 8c09f526
5f00: 8f414cf0 8f01b120 8f4f38e0 8c0281e0 8c20f7dc 8f4a6cdc 00000000 8f4a6cec
5f20: 8c003c18 00401f80 004c6aa0 00000000 8ef55fa4 8ef55fa4 09000080 8c28900c
5f40: 00000000 b362ad67 8c00bbde 8ef55fe4 8f4b7aa0 00000001 09000000 8f4a6c1c
5f60: 004c6ab4 00000055 8c215f8c ffff8bce 00000004 00000009 00003f10 8f4f38e0
5f80: b362ad67 8c0060f8 00401f80 004c6aa0 00000000 00000000 40000000 00000100
5fa0: 8ef54000 00000000 00000450 00000002 00000006 00000003 00000000 004c6ab4
5fc0: 00000000 7bdf069c 004c61d4 00000003 004c61bc 7bdf0670 004c6aa0 00401f80
5fe0: 7bdf069c 00401f94 0047d1cc 00000001 004c5450 00000043 0000000c 00000044

Call trace:
 [<(ptrval)>] free_pcppages_bulk+0x114/0x33c
 [<(ptrval)>] free_unref_page+0x4a/0x70
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] free_pipe_info+0x60/0x84
 [<(ptrval)>] pipe_release+0x92/0xb8
 [<(ptrval)>] __fput+0x3e/0x120
 [<(ptrval)>] task_work_run+0x78/0xa8
 [<(ptrval)>] _cond_resched+0x0/0x54
 [<(ptrval)>] do_notify_resume+0xd4/0x4c8
 [<(ptrval)>] do_page_fault+0xda/0x2a8
 [<(ptrval)>] resume_userspace+0x0/0x10

BUG: Bad page state in process rcS  pfn:0e8b2
page:8fb50640 count:0 mapcount:1026 mapping:8f403480 index:0x0
  (null)
flags: 0xc000000()
raw: 0c000000 00000100 00000200 8f403480 00000000 00000000 00000401 00000000
page dumped because: non-NULL mapping
CPU: 0 PID: 763 Comm: rcS Tainted: G    B   W
5.1.0-rc3-migor-00024-gc5b27a889da92f4a #31
Stack: (0x8ef55e54 to 0x8ef56000)
5e40: 8c072e90 8fb50640 8fbf400c
5e60: 00000000 8c0734bc 00000200 8fbf4010 00000001 00000001 8fbf4000 00000001
5e80: 8fbf400c 00989680 0030f231 00000100 8ef55ea8 00000000 8c2a8a38 00000028
5ea0: 00000000 8c29d340 8fb05344 8fb05344 b362ad67 8c074416 8f4f38e0 8c0027ac
5ec0: 8e9125e0 00000000 00000000 8c0027ac 8fb50760 8c0a59e4 8e9125a0 00000180
5ee0: 00000010 8c0a5a9a 8f4a69e0 8f01ad00 8f4f38e0 8f01b21c 8e9125a0 8c09f526
5f00: 8f414cf0 8f01b120 8f4f38e0 8c0281e0 8c20f7dc 8f4a6cdc 00000000 8f4a6cec
5f20: 8c003c18 00401f80 004c6aa0 00000000 8ef55fa4 8ef55fa4 09000080 8c28900c
5f40: 00000000 b362ad67 8c00bbde 8ef55fe4 8f4b7aa0 00000001 09000000 8f4a6c1c
5f60: 004c6ab4 00000055 8c215f8c ffff8bce 00000004 00000009 00003f10 8f4f38e0
5f80: b362ad67 8c0060f8 00401f80 004c6aa0 00000000 00000000 40000000 00000100
5fa0: 8ef54000 00000000 00000450 00000002 00000006 00000003 00000000 004c6ab4
5fc0: 00000000 7bdf069c 004c61d4 00000003 004c61bc 7bdf0670 004c6aa0 00401f80
5fe0: 7bdf069c 00401f94 0047d1cc 00000001 004c5450 00000043 0000000c 00000044

Call trace:
 [<(ptrval)>] free_pcppages_bulk+0x114/0x33c
 [<(ptrval)>] free_unref_page+0x4a/0x70
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] free_pipe_info+0x60/0x84
 [<(ptrval)>] pipe_release+0x92/0xb8
 [<(ptrval)>] __fput+0x3e/0x120
 [<(ptrval)>] task_work_run+0x78/0xa8
 [<(ptrval)>] _cond_resched+0x0/0x54
 [<(ptrval)>] do_notify_resume+0xd4/0x4c8
 [<(ptrval)>] do_page_fault+0xda/0x2a8
 [<(ptrval)>] resume_userspace+0x0/0x10
...

> ---
>  arch/sh/include/asm/pgalloc.h |    7 ++
>  arch/sh/include/asm/tlb.h     |  130 ------------------------------------------
>  2 files changed, 8 insertions(+), 129 deletions(-)
>
> --- a/arch/sh/include/asm/pgalloc.h
> +++ b/arch/sh/include/asm/pgalloc.h
> @@ -72,6 +72,15 @@ do {                                                 \
>         tlb_remove_page((tlb), (pte));                  \
>  } while (0)
>
> +#if CONFIG_PGTABLE_LEVELS > 2
> +#define __pmd_free_tlb(tlb, pmdp, addr)                        \
> +do {                                                   \
> +       struct page *page = virt_to_page(pmdp);         \
> +       pgtable_pmd_page_dtor(page);                    \
> +       tlb_remove_page((tlb), page);                   \
> +} while (0);
> +#endif
> +
>  static inline void check_pgt_cache(void)
>  {
>         quicklist_trim(QUICK_PT, NULL, 25, 16);
> --- a/arch/sh/include/asm/tlb.h
> +++ b/arch/sh/include/asm/tlb.h
> @@ -11,131 +11,8 @@
>
>  #ifdef CONFIG_MMU
>  #include <linux/swap.h>
> -#include <asm/pgalloc.h>
> -#include <asm/tlbflush.h>
> -#include <asm/mmu_context.h>
> -
> -/*
> - * TLB handling.  This allows us to remove pages from the page
> - * tables, and efficiently handle the TLB issues.
> - */
> -struct mmu_gather {
> -       struct mm_struct        *mm;
> -       unsigned int            fullmm;
> -       unsigned long           start, end;
> -};
>
> -static inline void init_tlb_gather(struct mmu_gather *tlb)
> -{
> -       tlb->start = TASK_SIZE;
> -       tlb->end = 0;
> -
> -       if (tlb->fullmm) {
> -               tlb->start = 0;
> -               tlb->end = TASK_SIZE;
> -       }
> -}
> -
> -static inline void
> -arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> -               unsigned long start, unsigned long end)
> -{
> -       tlb->mm = mm;
> -       tlb->start = start;
> -       tlb->end = end;
> -       tlb->fullmm = !(start | (end+1));
> -
> -       init_tlb_gather(tlb);
> -}
> -
> -static inline void
> -arch_tlb_finish_mmu(struct mmu_gather *tlb,
> -               unsigned long start, unsigned long end, bool force)
> -{
> -       if (tlb->fullmm || force)
> -               flush_tlb_mm(tlb->mm);
> -
> -       /* keep the page table cache within bounds */
> -       check_pgt_cache();
> -}
> -
> -static inline void
> -tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
> -{
> -       if (tlb->start > address)
> -               tlb->start = address;
> -       if (tlb->end < address + PAGE_SIZE)
> -               tlb->end = address + PAGE_SIZE;
> -}
> -
> -#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)       \
> -       tlb_remove_tlb_entry(tlb, ptep, address)
> -
> -/*
> - * In the case of tlb vma handling, we can optimise these away in the
> - * case where we're doing a full MM flush.  When we're doing a munmap,
> - * the vmas are adjusted to only cover the region to be torn down.
> - */
> -static inline void
> -tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
> -{
> -       if (!tlb->fullmm)
> -               flush_cache_range(vma, vma->vm_start, vma->vm_end);
> -}
> -
> -static inline void
> -tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
> -{
> -       if (!tlb->fullmm && tlb->end) {
> -               flush_tlb_range(vma, tlb->start, tlb->end);
> -               init_tlb_gather(tlb);
> -       }
> -}
> -
> -static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
> -{
> -}
> -
> -static inline void tlb_flush_mmu_free(struct mmu_gather *tlb)
> -{
> -}
> -
> -static inline void tlb_flush_mmu(struct mmu_gather *tlb)
> -{
> -}
> -
> -static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> -{
> -       free_page_and_swap_cache(page);
> -       return false; /* avoid calling tlb_flush_mmu */
> -}
> -
> -static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> -{
> -       __tlb_remove_page(tlb, page);
> -}
> -
> -static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
> -                                         struct page *page, int page_size)
> -{
> -       return __tlb_remove_page(tlb, page);
> -}
> -
> -static inline void tlb_remove_page_size(struct mmu_gather *tlb,
> -                                       struct page *page, int page_size)
> -{
> -       return tlb_remove_page(tlb, page);
> -}
> -
> -static inline void tlb_change_page_size(struct mmu_gather *tlb, unsigned int page_size)
> -{
> -}
> -
> -#define pte_free_tlb(tlb, ptep, addr)  pte_free((tlb)->mm, ptep)
> -#define pmd_free_tlb(tlb, pmdp, addr)  pmd_free((tlb)->mm, pmdp)
> -#define pud_free_tlb(tlb, pudp, addr)  pud_free((tlb)->mm, pudp)
> -
> -#define tlb_migrate_finish(mm)         do { } while (0)
> +#include <asm-generic/tlb.h>
>
>  #if defined(CONFIG_CPU_SH4) || defined(CONFIG_SUPERH64)
>  extern void tlb_wire_entry(struct vm_area_struct *, unsigned long, pte_t);
> @@ -155,11 +32,6 @@ static inline void tlb_unwire_entry(void
>
>  #else /* CONFIG_MMU */
>
> -#define tlb_start_vma(tlb, vma)                                do { } while (0)
> -#define tlb_end_vma(tlb, vma)                          do { } while (0)
> -#define __tlb_remove_tlb_entry(tlb, pte, address)      do { } while (0)
> -#define tlb_flush(tlb)                                 do { } while (0)
> -
>  #include <asm-generic/tlb.h>
>
>  #endif /* CONFIG_MMU */

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-03 11:19   ` Geert Uytterhoeven
@ 2019-12-04 10:47     ` Peter Zijlstra
  2019-12-04 12:32       ` Geert Uytterhoeven
                         ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Peter Zijlstra @ 2019-12-04 10:47 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
> Hoi Peter,
> 
> On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > Generic mmu_gather provides everything SH needs (range tracking and
> > cache coherency).
> >
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Nick Piggin <npiggin@gmail.com>
> > Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> > Cc: Rich Felker <dalias@libc.org>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> I got remote access to an SH7722-based Migo-R again, which spews a long
> sequence of BUGs during userspace startup.  I've bisected this to commit
> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").

Whoopsy.. also, is this really the first time anybody booted an SH
kernel in over a year ?!?

> Do you have a clue?

Does the below help?

diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
index 22d968bfe9bb..73a2c00de6c5 100644
--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -36,9 +36,8 @@ do {							\
 #if CONFIG_PGTABLE_LEVELS > 2
 #define __pmd_free_tlb(tlb, pmdp, addr)			\
 do {							\
-	struct page *page = virt_to_page(pmdp);		\
-	pgtable_pmd_page_dtor(page);			\
-	tlb_remove_page((tlb), page);			\
+	pgtable_pmd_page_dtor(pmdp);			\
+	tlb_remove_page((tlb), (pmdp));			\
 } while (0);
 #endif
 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 10:47     ` Peter Zijlstra
@ 2019-12-04 12:32       ` Geert Uytterhoeven
  2019-12-04 13:22         ` Guenter Roeck
  2019-12-04 13:34         ` Peter Zijlstra
  2019-12-05 19:24       ` Rob Landley
  2019-12-05 19:30       ` John Paul Adrian Glaubitz
  2 siblings, 2 replies; 42+ messages in thread
From: Geert Uytterhoeven @ 2019-12-04 12:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

Hoi Peter,

On Wed, Dec 4, 2019 at 11:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
> > On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > Generic mmu_gather provides everything SH needs (range tracking and
> > > cache coherency).
> > >
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Nick Piggin <npiggin@gmail.com>
> > > Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> > > Cc: Rich Felker <dalias@libc.org>
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >
> > I got remote access to an SH7722-based Migo-R again, which spews a long
> > sequence of BUGs during userspace startup.  I've bisected this to commit
> > c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
>
> Whoopsy.. also, is this really the first time anybody booted an SH
> kernel in over a year ?!?

Nah, but the v5.4-rc3 I booted recently on qemu -M r2d had
CONFIG_PGTABLE_LEVELS=2, so it didn't show the problem.

> > Do you have a clue?
>
> Does the below help?

Unfortunately not.

> diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
> index 22d968bfe9bb..73a2c00de6c5 100644
> --- a/arch/sh/include/asm/pgalloc.h
> +++ b/arch/sh/include/asm/pgalloc.h
> @@ -36,9 +36,8 @@ do {                                                  \
>  #if CONFIG_PGTABLE_LEVELS > 2
>  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
>  do {                                                   \
> -       struct page *page = virt_to_page(pmdp);         \
> -       pgtable_pmd_page_dtor(page);                    \
> -       tlb_remove_page((tlb), page);                   \
> +       pgtable_pmd_page_dtor(pmdp);                    \

expected ‘struct page *’ but argument is of type ‘pmd_t * {aka struct
<anonymous> *}’

> +       tlb_remove_page((tlb), (pmdp));                 \

likewise

>  } while (0);
>  #endif

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 12:32       ` Geert Uytterhoeven
@ 2019-12-04 13:22         ` Guenter Roeck
  2019-12-04 15:17           ` Geert Uytterhoeven
  2019-12-04 13:34         ` Peter Zijlstra
  1 sibling, 1 reply; 42+ messages in thread
From: Guenter Roeck @ 2019-12-04 13:22 UTC (permalink / raw)
  To: Geert Uytterhoeven, Peter Zijlstra
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list

On 12/4/19 4:32 AM, Geert Uytterhoeven wrote:
> Hoi Peter,
> 
> On Wed, Dec 4, 2019 at 11:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
>> On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
>>> On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>>> Generic mmu_gather provides everything SH needs (range tracking and
>>>> cache coherency).
>>>>
>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: Nick Piggin <npiggin@gmail.com>
>>>> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
>>>> Cc: Rich Felker <dalias@libc.org>
>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>>
>>> I got remote access to an SH7722-based Migo-R again, which spews a long
>>> sequence of BUGs during userspace startup.  I've bisected this to commit
>>> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
>>
>> Whoopsy.. also, is this really the first time anybody booted an SH
>> kernel in over a year ?!?
> 
> Nah, but the v5.4-rc3 I booted recently on qemu -M r2d had
> CONFIG_PGTABLE_LEVELS=2, so it didn't show the problem.
> 

Guess that explains why I do not see the problem with my qemu boots.
I use rts7751r2dplus_defconfig. Is it possible to reproduce the problem
with qemu? I don't think so, but maybe I am missing something.

Guenter

>>> Do you have a clue?
>>
>> Does the below help?
> 
> Unfortunately not.
> 
>> diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
>> index 22d968bfe9bb..73a2c00de6c5 100644
>> --- a/arch/sh/include/asm/pgalloc.h
>> +++ b/arch/sh/include/asm/pgalloc.h
>> @@ -36,9 +36,8 @@ do {                                                  \
>>   #if CONFIG_PGTABLE_LEVELS > 2
>>   #define __pmd_free_tlb(tlb, pmdp, addr)                        \
>>   do {                                                   \
>> -       struct page *page = virt_to_page(pmdp);         \
>> -       pgtable_pmd_page_dtor(page);                    \
>> -       tlb_remove_page((tlb), page);                   \
>> +       pgtable_pmd_page_dtor(pmdp);                    \
> 
> expected ‘struct page *’ but argument is of type ‘pmd_t * {aka struct
> <anonymous> *}’
> 
>> +       tlb_remove_page((tlb), (pmdp));                 \
> 
> likewise
> 
>>   } while (0);
>>   #endif
> 
> Gr{oetje,eeting}s,
> 
>                          Geert
> 




* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 12:32       ` Geert Uytterhoeven
  2019-12-04 13:22         ` Guenter Roeck
@ 2019-12-04 13:34         ` Peter Zijlstra
  2019-12-04 15:07           ` Geert Uytterhoeven
  1 sibling, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-12-04 13:34 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

On Wed, Dec 04, 2019 at 01:32:58PM +0100, Geert Uytterhoeven wrote:

> > Does the below help?
> 
> Unfortunately not.
> 
> > diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
> > index 22d968bfe9bb..73a2c00de6c5 100644
> > --- a/arch/sh/include/asm/pgalloc.h
> > +++ b/arch/sh/include/asm/pgalloc.h
> > @@ -36,9 +36,8 @@ do {                                                  \
> >  #if CONFIG_PGTABLE_LEVELS > 2
> >  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
> >  do {                                                   \
> > -       struct page *page = virt_to_page(pmdp);         \
> > -       pgtable_pmd_page_dtor(page);                    \
> > -       tlb_remove_page((tlb), page);                   \
> > +       pgtable_pmd_page_dtor(pmdp);                    \
> 
> expected ‘struct page *’ but argument is of type ‘pmd_t * {aka struct
> <anonymous> *}’
> 
> > +       tlb_remove_page((tlb), (pmdp));                 \
> 
> likewise

Duh.. clearly I misplaced my SH cross compiler. Let me go find it.

Also, looking at pgtable.c, the pmd_t * actually comes from a kmem_cache
and should probably use pmd_free() (which is what the old code did too).

Also, since SH doesn't have ARCH_ENABLE_SPLIT_PMD_PTLOCK, it will never
need pgtable_pmd_page_dtor().
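
For reference, the shape of the type clash (signatures as in the
generic headers, simplified; a sketch, not a patch):

	static inline void tlb_remove_page(struct mmu_gather *tlb,
					   struct page *page);
	static inline void pgtable_pmd_page_dtor(struct page *page);

	/* __pmd_free_tlb() is handed a pmd_t *, not a struct page *,
	 * hence the errors above. The old code converted with
	 * virt_to_page(pmdp), but since this pmd comes from a
	 * kmem_cache, handing its backing page to tlb_remove_page()
	 * frees memory the slab allocator still owns; it needs to go
	 * back through pmd_free() instead. */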

The below seems to build se7722_defconfig using sh4-linux-. That is, the
build still fails, but on 'node_reclaim_distance', not on pgtable stuff.

Does this fare better?

---
 arch/sh/include/asm/pgalloc.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
index 22d968bfe9bb..c910e5bcde62 100644
--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -36,9 +36,7 @@ do {							\
 #if CONFIG_PGTABLE_LEVELS > 2
 #define __pmd_free_tlb(tlb, pmdp, addr)			\
 do {							\
-	struct page *page = virt_to_page(pmdp);		\
-	pgtable_pmd_page_dtor(page);			\
-	tlb_remove_page((tlb), page);			\
+	pmd_free((tlb)->mm, (pmdp));			\
 } while (0);
 #endif
 



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 13:34         ` Peter Zijlstra
@ 2019-12-04 15:07           ` Geert Uytterhoeven
  2019-12-04 16:41             ` Peter Zijlstra
  0 siblings, 1 reply; 42+ messages in thread
From: Geert Uytterhoeven @ 2019-12-04 15:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

Hoi Peter,

On Wed, Dec 4, 2019 at 2:35 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Dec 04, 2019 at 01:32:58PM +0100, Geert Uytterhoeven wrote:
> > > Does the below help?
> >
> > Unfortunately not.
> >
> > > diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
> > > index 22d968bfe9bb..73a2c00de6c5 100644
> > > --- a/arch/sh/include/asm/pgalloc.h
> > > +++ b/arch/sh/include/asm/pgalloc.h
> > > @@ -36,9 +36,8 @@ do {                                                  \
> > >  #if CONFIG_PGTABLE_LEVELS > 2
> > >  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
> > >  do {                                                   \
> > > -       struct page *page = virt_to_page(pmdp);         \
> > > -       pgtable_pmd_page_dtor(page);                    \
> > > -       tlb_remove_page((tlb), page);                   \
> > > +       pgtable_pmd_page_dtor(pmdp);                    \
> >
> > expected ‘struct page *’ but argument is of type ‘pmd_t * {aka struct
> > <anonymous> *}’
> >
> > > +       tlb_remove_page((tlb), (pmdp));                 \
> >
> > likewise
>
> Duh.. clearly I misplaced my SH cross compiler. Let me go find it.
>
> Also, looking at pgtable.c the pmd_t* actually comes from a kmemcach()
> and should probably use pmd_free() (which is what the old code did too).
>
> Also, since SH doesn't have ARCH_ENABLE_SPLIT_PMD_PTLOCK, it will never
> need pgtable_pmd_page_dtor().
>
> The below seems to build se7722_defconfig using sh4-linux-. That is, the
> build fails, on 'node_reclaim_distance', not pgtable stuff.
>
> Does this fare better?

Yes. Migo-R is happy again.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

> --- a/arch/sh/include/asm/pgalloc.h
> +++ b/arch/sh/include/asm/pgalloc.h
> @@ -36,9 +36,7 @@ do {                                                  \
>  #if CONFIG_PGTABLE_LEVELS > 2
>  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
>  do {                                                   \
> -       struct page *page = virt_to_page(pmdp);         \
> -       pgtable_pmd_page_dtor(page);                    \
> -       tlb_remove_page((tlb), page);                   \
> +       pmd_free((tlb)->mm, (pmdp));                    \
>  } while (0);
>  #endif

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 13:22         ` Guenter Roeck
@ 2019-12-04 15:17           ` Geert Uytterhoeven
  2019-12-04 19:03             ` Guenter Roeck
  0 siblings, 1 reply; 42+ messages in thread
From: Geert Uytterhoeven @ 2019-12-04 15:17 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Peter Zijlstra, Will Deacon, Aneesh Kumar K.V, Andrew Morton,
	Nicholas Piggin, Linux-Arch, Linux MM, Linux Kernel Mailing List,
	Russell King, Heiko Carstens, Rik van Riel, Yoshinori Sato,
	Rich Felker, Linux-sh list

Hi Günter,

On Wed, Dec 4, 2019 at 2:22 PM Guenter Roeck <linux@roeck-us.net> wrote:
> On 12/4/19 4:32 AM, Geert Uytterhoeven wrote:
> > On Wed, Dec 4, 2019 at 11:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >> On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
> >>> On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>>> Generic mmu_gather provides everything SH needs (range tracking and
> >>>> cache coherency).

> >>> I got remote access to an SH7722-based Migo-R again, which spews a long
> >>> sequence of BUGs during userspace startup.  I've bisected this to commit
> >>> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
> >>
> >> Whoopsy.. also, is this really the first time anybody booted an SH
> >> kernel in over a year ?!?
> >
> > Nah, but the v5.4-rc3 I booted recently on qemu -M r2d had
> > CONFIG_PGTABLE_LEVELS=2, so it didn't show the problem.
> >
>
> Guess that explains why I do not see the problem with my qemu boots.
> I use rts7751r2dplus_defconfig. Is it possible to reproduce the problem
> with qemu ? I don't think so, but maybe I am missing something.

Qemu seems to support r2d and shix only.
For the latter, the website pointed to by the qemu sources no longer exists.
But according to those sources, it's also sh7750-based, so no luck.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 15:07           ` Geert Uytterhoeven
@ 2019-12-04 16:41             ` Peter Zijlstra
  2019-12-05 15:26               ` Geert Uytterhoeven
  0 siblings, 1 reply; 42+ messages in thread
From: Peter Zijlstra @ 2019-12-04 16:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

On Wed, Dec 04, 2019 at 04:07:53PM +0100, Geert Uytterhoeven wrote:
> On Wed, Dec 4, 2019 at 2:35 PM Peter Zijlstra <peterz@infradead.org> wrote:

> > Does this fare better?
> 
> Yes. Migo-R is happy again.
> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
> 
> > --- a/arch/sh/include/asm/pgalloc.h
> > +++ b/arch/sh/include/asm/pgalloc.h
> > @@ -36,9 +36,7 @@ do {                                                  \
> >  #if CONFIG_PGTABLE_LEVELS > 2
> >  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
> >  do {                                                   \
> > -       struct page *page = virt_to_page(pmdp);         \
> > -       pgtable_pmd_page_dtor(page);                    \
> > -       tlb_remove_page((tlb), page);                   \
> > +       pmd_free((tlb)->mm, (pmdp));                    \
> >  } while (0);
> >  #endif

OK, so I was going to write a Changelog to go with that, but then I
realized that while this works and is similar to before the patch, I'm
not sure this is in fact correct.

With this on (and also before) we're freeing the PMD before we've done
the TLB invalidate, that seems wrong!
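
Spelled out (a sketch of the ordering, not code from the tree):

	pmd_free(tlb->mm, pmdp);	/* table memory reusable right away */
	/* ... */
	tlb_flush_mmu(tlb);		/* TLB only invalidated here */

	/* whereas tlb_remove_page() queues the page on the mmu_gather,
	 * so the actual free happens only after the flush. */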

Looking at the size of that pmd_cache, it comes out at
2^(30-(12+12-3)+3) == 2^12 bytes, which is exactly 1 page for
PAGE_SIZE_4K, and less for the larger page sizes.
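
Unpacking that arithmetic (the constants 30, 12 and 3 are read off the
SH 4K/X2TLB configuration, so treat the exact values as assumptions):

	/*
	 * sizeof(pte_t)       = 1 << PTE_MAGNITUDE = 1 << 3 = 8 bytes
	 * log2(PTRS_PER_PTE)  = PAGE_SHIFT - 3     = 12 - 3 = 9
	 * bits below the pmd  = 12 + 9             = 21      (the 12+12-3)
	 * log2(PTRS_PER_PMD)  = 30 - 21            = 9       (30 vaddr bits)
	 * pmd table size      = (1 << 9) * 8       = 1 << 12 = 4096 bytes
	 *
	 * i.e. exactly one 4K page; bigger page sizes make the pmd table
	 * smaller than a page.
	 */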

I'm thinking perhaps we should do something like the below instead?


---
 arch/sh/mm/pgtable.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/arch/sh/mm/pgtable.c b/arch/sh/mm/pgtable.c
index 5c8f9247c3c2..fac7e822fd0c 100644
--- a/arch/sh/mm/pgtable.c
+++ b/arch/sh/mm/pgtable.c
@@ -5,9 +5,6 @@
 #define PGALLOC_GFP GFP_KERNEL | __GFP_ZERO
 
 static struct kmem_cache *pgd_cachep;
-#if PAGETABLE_LEVELS > 2
-static struct kmem_cache *pmd_cachep;
-#endif
 
 void pgd_ctor(void *x)
 {
@@ -23,11 +20,6 @@ void pgtable_cache_init(void)
 	pgd_cachep = kmem_cache_create("pgd_cache",
 				       PTRS_PER_PGD * (1<<PTE_MAGNITUDE),
 				       PAGE_SIZE, SLAB_PANIC, pgd_ctor);
-#if PAGETABLE_LEVELS > 2
-	pmd_cachep = kmem_cache_create("pmd_cache",
-				       PTRS_PER_PMD * (1<<PTE_MAGNITUDE),
-				       PAGE_SIZE, SLAB_PANIC, NULL);
-#endif
 }
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
@@ -48,11 +40,7 @@ void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 
 pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
 {
-	return kmem_cache_alloc(pmd_cachep, PGALLOC_GFP);
-}
-
-void pmd_free(struct mm_struct *mm, pmd_t *pmd)
-{
-	kmem_cache_free(pmd_cachep, pmd);
+	BUILD_BUG_ON(PTRS_PER_PMD * (1<<PTE_MAGNITUDE) <= PAGE_SIZE);
+	return (pmd_t *)__get_free_page(PGALLOC_GFP);
 }
 #endif /* PAGETABLE_LEVELS > 2 */



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 15:17           ` Geert Uytterhoeven
@ 2019-12-04 19:03             ` Guenter Roeck
  0 siblings, 0 replies; 42+ messages in thread
From: Guenter Roeck @ 2019-12-04 19:03 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Peter Zijlstra, Will Deacon, Aneesh Kumar K.V, Andrew Morton,
	Nicholas Piggin, Linux-Arch, Linux MM, Linux Kernel Mailing List,
	Russell King, Heiko Carstens, Rik van Riel, Yoshinori Sato,
	Rich Felker, Linux-sh list

On Wed, Dec 04, 2019 at 04:17:26PM +0100, Geert Uytterhoeven wrote:
> Hi Günter,
> 
> > > Nah, but the v5.4-rc3 I booted recently on qemu -M r2d had
> > > CONFIG_PGTABLE_LEVELS=2, so it didn't show the problem.
> > >
> >
> > Guess that explains why I do not see the problem with my qemu boots.
> > I use rts7751r2dplus_defconfig. Is it possible to reproduce the problem
> > with qemu ? I don't think so, but maybe I am missing something.
> 
> Qemu seems to support r2d and shix only.
> For the latter, the website pointed to by the qemu sources no longer exists.
> But according to those sources, it's also sh7750-based, so no luck.
> 
Oh, well, worth asking. Thanks for the feedback.

Guenter



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 16:41             ` Peter Zijlstra
@ 2019-12-05 15:26               ` Geert Uytterhoeven
  0 siblings, 0 replies; 42+ messages in thread
From: Geert Uytterhoeven @ 2019-12-05 15:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

Hoi Peter,

On Wed, Dec 4, 2019 at 5:42 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Dec 04, 2019 at 04:07:53PM +0100, Geert Uytterhoeven wrote:
> > On Wed, Dec 4, 2019 at 2:35 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > Does this fare better?
> >
> > Yes. Migo-R is happy again.
> > Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
> >
> > > --- a/arch/sh/include/asm/pgalloc.h
> > > +++ b/arch/sh/include/asm/pgalloc.h
> > > @@ -36,9 +36,7 @@ do {                                                  \
> > >  #if CONFIG_PGTABLE_LEVELS > 2
> > >  #define __pmd_free_tlb(tlb, pmdp, addr)                        \
> > >  do {                                                   \
> > > -       struct page *page = virt_to_page(pmdp);         \
> > > -       pgtable_pmd_page_dtor(page);                    \
> > > -       tlb_remove_page((tlb), page);                   \
> > > +       pmd_free((tlb)->mm, (pmdp));                    \
> > >  } while (0);
> > >  #endif
>
> OK, so I was going to write a Changelog to go with that, but then I
> realized that while this works and is similar to before the patch, I'm
> not sure this is in fact correct.
>
> With this on (and also before) we're freeing the PMD before we've done
> the TLB invalidate, that seems wrong!
>
> Looking at the size of that pmd_cache, that looks to be 30-(12+12-3)+3
> == 12, which is exactly 1 page, for PAGE_SIZE_4K, less for the larger
> pages.
>
> I'm thinking perhaps we should do something like the below instead?

Your advice is better when in close vicinity of an SH cross compiler,
though ;-)

> --- a/arch/sh/mm/pgtable.c
> +++ b/arch/sh/mm/pgtable.c
> @@ -5,9 +5,6 @@
>  #define PGALLOC_GFP GFP_KERNEL | __GFP_ZERO
>
>  static struct kmem_cache *pgd_cachep;
> -#if PAGETABLE_LEVELS > 2
> -static struct kmem_cache *pmd_cachep;
> -#endif
>
>  void pgd_ctor(void *x)
>  {
> @@ -23,11 +20,6 @@ void pgtable_cache_init(void)
>         pgd_cachep = kmem_cache_create("pgd_cache",
>                                        PTRS_PER_PGD * (1<<PTE_MAGNITUDE),
>                                        PAGE_SIZE, SLAB_PANIC, pgd_ctor);
> -#if PAGETABLE_LEVELS > 2
> -       pmd_cachep = kmem_cache_create("pmd_cache",
> -                                      PTRS_PER_PMD * (1<<PTE_MAGNITUDE),
> -                                      PAGE_SIZE, SLAB_PANIC, NULL);
> -#endif
>  }
>
>  pgd_t *pgd_alloc(struct mm_struct *mm)
> @@ -48,11 +40,7 @@ void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
>
>  pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
>  {
> -       return kmem_cache_alloc(pmd_cachep, PGALLOC_GFP);
> -}
> -
> -void pmd_free(struct mm_struct *mm, pmd_t *pmd)

mm/memory.o: In function `__pmd_alloc':
memory.c:(.text+0x1d74): undefined reference to `pmd_free'
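
So a pmd_free() definition has to stay. Paired with the
__get_free_page() below, a minimal sketch (untested) would be:

	void pmd_free(struct mm_struct *mm, pmd_t *pmd)
	{
		free_page((unsigned long)pmd);
	}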

> -{
> -       kmem_cache_free(pmd_cachep, pmd);
> +       BUILD_BUG_ON(PTRS_PER_PMD * (1<<PTE_MAGNITUDE) <= PAGE_SIZE);

... > PAGE_SIZE ?

Else it triggers all the time.

> +       return (pmd_t *)__get_free_page(PGALLOC_GFP);
>  }
>  #endif /* PAGETABLE_LEVELS > 2 */
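
So, spelled out, the corrected assertion would be (a sketch):

	/* break the build only when a pmd table would not fit in the
	 * single page that __get_free_page() returns: */
	BUILD_BUG_ON(PTRS_PER_PMD * (1 << PTE_MAGNITUDE) > PAGE_SIZE);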

BTW, I'm still running Willy's fix that never made it upstream
to kill an ugly boot warning, which also touches this code:
https://patchwork.kernel.org/patch/10549883/#22166333

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-05 19:24       ` Rob Landley
@ 2019-12-05 19:23         ` Rich Felker
  0 siblings, 0 replies; 42+ messages in thread
From: Rich Felker @ 2019-12-05 19:23 UTC (permalink / raw)
  To: Rob Landley
  Cc: Peter Zijlstra, Geert Uytterhoeven, Will Deacon,
	Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin, Linux-Arch,
	Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Linux-sh list,
	Guenter Roeck

On Thu, Dec 05, 2019 at 01:24:02PM -0600, Rob Landley wrote:
> On 12/4/19 4:47 AM, Peter Zijlstra wrote:
> > On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
> >> Hoi Peter,
> >>
> >> On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>> Generic mmu_gather provides everything SH needs (range tracking and
> >>> cache coherency).
> >>>
> >>> Cc: Will Deacon <will.deacon@arm.com>
> >>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>> Cc: Nick Piggin <npiggin@gmail.com>
> >>> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> >>> Cc: Rich Felker <dalias@libc.org>
> >>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >>
> >> I got remote access to an SH7722-based Migo-R again, which spews a long
> >> sequence of BUGs during userspace startup.  I've bisected this to commit
> >> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
> > 
> > Whoopsy.. also, is this really the first time anybody booted an SH
> > kernel in over a year ?!?
> 
> No, but most people running this kind of hardware tend not to upgrade to current
> kernels on a regular basis.
> 
> The j-core guys tested the 5.3 release. I can't find an email about 5.4 so I
> dunno if that's been tested yet?

Given that this code is about the MMU, does it affect nommu machines?
That's what we've got at present on the J-core side.

Rich



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 10:47     ` Peter Zijlstra
  2019-12-04 12:32       ` Geert Uytterhoeven
@ 2019-12-05 19:24       ` Rob Landley
  2019-12-05 19:23         ` Rich Felker
  2019-12-05 19:30       ` John Paul Adrian Glaubitz
  2 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2019-12-05 19:24 UTC (permalink / raw)
  To: Peter Zijlstra, Geert Uytterhoeven
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

On 12/4/19 4:47 AM, Peter Zijlstra wrote:
> On Tue, Dec 03, 2019 at 12:19:00PM +0100, Geert Uytterhoeven wrote:
>> Hoi Peter,
>>
>> On Tue, Feb 19, 2019 at 11:35 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>> Generic mmu_gather provides everything SH needs (range tracking and
>>> cache coherency).
>>>
>>> Cc: Will Deacon <will.deacon@arm.com>
>>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Nick Piggin <npiggin@gmail.com>
>>> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
>>> Cc: Rich Felker <dalias@libc.org>
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>
>> I got remote access to an SH7722-based Migo-R again, which spews a long
>> sequence of BUGs during userspace startup.  I've bisected this to commit
>> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
> 
> Whoopsy.. also, is this really the first time anybody booted an SH
> kernel in over a year ?!?

No, but most people running this kind of hardware tend not to upgrade to current
kernels on a regular basis.

The j-core guys tested the 5.3 release. I can't find an email about 5.4 so I
dunno if that's been tested yet?

I just tested yesterday's git and it works fine with
http://lkml.iu.edu/hypermail/linux/kernel/1912.0/01554.html installed, modulo it
_still_ has the spurious stack dump shortly before calling init, which I've
complained about on and off on linux-sh for a year now.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at mm/slub.c:2451 kmem_cache_free_bulk+0x2c2/0x37c

CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.0 #1
PC is at kmem_cache_free_bulk+0x2c2/0x37c
PR is at kmem_cache_alloc_bulk+0x36/0x1a0
PC  : 8c0a6fae SP  : 8f829e9c SR  : 400080f0
TEA : c0001240
R0  : 8c0a6de4 R1  : 00000100 R2  : 00000100 R3  : 00000000
R4  : 8f8020a0 R5  : 00000dc0 R6  : 8c01d66c R7  : 8fff5180
R8  : 8c011a00 R9  : 8fff5180 R10 : 8c01d66c R11 : 80000000
R12 : 00007fff R13 : 00000dc0 R14 : 8f8020a0
MACH: 0000017a MACL: 0ae4849d GBR : 00000000 PR  : 8c0a709e

Call trace:
 [<(ptrval)>] copy_process+0x218/0x1094
 [<(ptrval)>] copy_process+0x7ba/0x1094
 [<(ptrval)>] kmem_cache_alloc_bulk+0x36/0x1a0
 [<(ptrval)>] restore_sigcontext+0x94/0x1b0
 [<(ptrval)>] restore_sigcontext+0x70/0x1b0
 [<(ptrval)>] copy_process+0x218/0x1094
 [<(ptrval)>] sysfs_slab_add+0x106/0x354
 [<(ptrval)>] restore_sigcontext+0x70/0x1b0
 [<(ptrval)>] copy_process+0x218/0x1094
 [<(ptrval)>] copy_process+0x218/0x1094
 [<(ptrval)>] fprop_fraction_single+0x38/0xa4
 [<(ptrval)>] pipe_read+0x7a/0x23c
 [<(ptrval)>] restore_sigcontext+0x70/0x1b0
 [<(ptrval)>] restore_sigcontext+0x94/0x1b0
 [<(ptrval)>] alloc_pipe_info+0x162/0x1c8
 [<(ptrval)>] restore_sigcontext+0x94/0x1b0
 [<(ptrval)>] restore_sigcontext+0x70/0x1b0
 [<(ptrval)>] handle_bad_irq+0x154/0x188
 [<(ptrval)>] raw6_exit_net+0x0/0x14
 [<(ptrval)>] prepare_stack+0xe4/0x2fc
 [<(ptrval)>] sys_sched_get_priority_min+0x18/0x28
 [<(ptrval)>] ndisc_net_exit+0x4/0x24

---[ end trace 6ce4eefeb577b078 ]---

But it's cosmetic...

I haven't got one of the new Turtle boards yet (next time I visit Japan...) and
the USB connector broke off my old one, so I don't have test hardware with me
at this coffee shop to boot it on. So it's just qemu testing at the moment.
The actual j-core deployment environment I'm working on this month is a deeply
embedded thing with 128k of SRAM, so it isn't running Linux. :)

Rob



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-04 10:47     ` Peter Zijlstra
  2019-12-04 12:32       ` Geert Uytterhoeven
  2019-12-05 19:24       ` Rob Landley
@ 2019-12-05 19:30       ` John Paul Adrian Glaubitz
  2019-12-05 22:56         ` Guenter Roeck
  2 siblings, 1 reply; 42+ messages in thread
From: John Paul Adrian Glaubitz @ 2019-12-05 19:30 UTC (permalink / raw)
  To: Peter Zijlstra, Geert Uytterhoeven
  Cc: Will Deacon, Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin,
	Linux-Arch, Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list, Guenter Roeck

Hi!

On 12/4/19 11:47 AM, Peter Zijlstra wrote:
>> I got remote access to an SH7722-based Migo-R again, which spews a long
>> sequence of BUGs during userspace startup.  I've bisected this to commit
>> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
> 
> Whoopsy.. also, is this really the first time anybody booted an SH
> kernel in over a year ?!?

I have to admit I have been very lazy with kernel updates. I have been
planning to upgrade my boards to a much more recent release for a while now;
I have just been postponing it since the machines run very stably with the
current kernel I am using.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



* Re: [PATCH v6 10/18] sh/tlb: Convert SH to generic mmu_gather
  2019-12-05 19:30       ` John Paul Adrian Glaubitz
@ 2019-12-05 22:56         ` Guenter Roeck
  0 siblings, 0 replies; 42+ messages in thread
From: Guenter Roeck @ 2019-12-05 22:56 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Peter Zijlstra, Geert Uytterhoeven, Will Deacon,
	Aneesh Kumar K.V, Andrew Morton, Nicholas Piggin, Linux-Arch,
	Linux MM, Linux Kernel Mailing List, Russell King,
	Heiko Carstens, Rik van Riel, Yoshinori Sato, Rich Felker,
	Linux-sh list

On Thu, Dec 05, 2019 at 08:30:17PM +0100, John Paul Adrian Glaubitz wrote:
> Hi!
> 
> On 12/4/19 11:47 AM, Peter Zijlstra wrote:
> >> I got remote access to an SH7722-based Migo-R again, which spews a long
> >> sequence of BUGs during userspace startup.  I've bisected this to commit
> >> c5b27a889da92f4a ("sh/tlb: Convert SH to generic mmu_gather").
> > 
> > Whoopsy.. also, is this really the first time anybody booted an SH
> > kernel in over a year ?!?
> 
> I have to admit, I have been very lazy with kernel updates. I have been
> planning to upgrade to a much more recent release on my boards for a while
> now, I have just been postponing it since the machines run very stable
> with the current kernel I am using.
> 

Hey, if you write a qemu emulation, I'll be happy to run it on a regular
basis :-)

Problem is really that the architecture doesn't get as much attention as
it needs. The backtrace pointed to by Rob has been seen for a long time,
but either there is no one with the knowledge to fix it, or they are all
busy with other stuff.

Guenter

> Adrian
> 
> -- 
>  .''`.  John Paul Adrian Glaubitz
> : :' :  Debian Developer - glaubitz@debian.org
> `. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
>   `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Thread overview: 42+ messages
2019-02-19 10:31 [PATCH v6 00/18] generic mmu_gather patches Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 01/18] asm-generic/tlb: Provide a comment Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 02/18] asm-generic/tlb: Provide HAVE_MMU_GATHER_PAGE_SIZE Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 03/18] asm-generic/tlb: Provide generic VIPT cache flush Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 04/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_range() Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 05/18] asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() Peter Zijlstra
2019-02-19 12:47   ` Will Deacon
2019-02-19 10:31 ` [PATCH v6 06/18] asm-generic/tlb: Conditionally provide tlb_migrate_finish() Peter Zijlstra
2019-02-19 12:47   ` Will Deacon
2019-02-19 13:41     ` Peter Zijlstra
2019-02-20 14:47       ` Will Deacon
2019-02-20 15:02         ` Matthew Wilcox
2019-02-19 10:31 ` [PATCH v6 07/18] asm-generic/tlb: Invert HAVE_RCU_TABLE_INVALIDATE Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 08/18] arm/tlb: Convert to generic mmu_gather Peter Zijlstra
2019-02-19 10:31 ` [PATCH v6 09/18] ia64/tlb: Convert " Peter Zijlstra
2019-02-19 12:47   ` Will Deacon
2019-02-21  2:52   ` Souptick Joarder
2019-02-19 10:31 ` [PATCH v6 10/18] sh/tlb: Convert SH " Peter Zijlstra
2019-12-03 11:19   ` Geert Uytterhoeven
2019-12-04 10:47     ` Peter Zijlstra
2019-12-04 12:32       ` Geert Uytterhoeven
2019-12-04 13:22         ` Guenter Roeck
2019-12-04 15:17           ` Geert Uytterhoeven
2019-12-04 19:03             ` Guenter Roeck
2019-12-04 13:34         ` Peter Zijlstra
2019-12-04 15:07           ` Geert Uytterhoeven
2019-12-04 16:41             ` Peter Zijlstra
2019-12-05 15:26               ` Geert Uytterhoeven
2019-12-05 19:24       ` Rob Landley
2019-12-05 19:23         ` Rich Felker
2019-12-05 19:30       ` John Paul Adrian Glaubitz
2019-12-05 22:56         ` Guenter Roeck
2019-02-19 10:31 ` [PATCH v6 11/18] um/tlb: Convert " Peter Zijlstra
2019-02-19 10:32 ` [PATCH v6 12/18] arch/tlb: Clean up simple architectures Peter Zijlstra
2019-02-19 10:32 ` [PATCH v6 13/18] asm-generic/tlb: Introduce HAVE_MMU_GATHER_NO_GATHER Peter Zijlstra
2019-02-19 12:47   ` Will Deacon
2019-02-19 10:32 ` [PATCH v6 14/18] s390/tlb: convert to generic mmu_gather Peter Zijlstra
2019-02-19 12:47   ` Will Deacon
2019-02-19 10:32 ` [PATCH v6 15/18] asm-generic/tlb: Remove arch_tlb*_mmu() Peter Zijlstra
2019-02-19 10:32 ` [PATCH v6 16/18] asm-generic/tlb: Remove HAVE_GENERIC_MMU_GATHER Peter Zijlstra
2019-02-19 10:32 ` [PATCH v6 17/18] asm-generic/tlb: Remove tlb_flush_mmu_free() Peter Zijlstra
2019-02-19 10:32 ` [PATCH v6 18/18] asm-generic/tlb: Remove tlb_table_flush() Peter Zijlstra
