* [RFC][PATCH 0/6] mm: Unify TLB gather implementations
@ 2011-03-02 17:59 ` Peter Zijlstra
  0 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

This is a series that attempts to unify and fix the current tlb gather
implementations. There are numerous more tlb range architectures that I
still need to visit, but please do comment on the direction taken.

Compile tested, but note I don't have anything to actually test this one
(aside from an omap board without network access).



* [RFC][PATCH 1/6] mm: Optimize fullmm TLB flushing
  2011-03-02 17:59 ` Peter Zijlstra
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: mmu_gather_fullmm.patch --]
[-- Type: text/plain, Size: 1377 bytes --]

This originated from s390, which does something similar, and would allow
s390 to use the generic TLB flushing code.

The idea is to flush the mm-wide cache and TLB a priori and not bother
with multiple flushes if the batching isn't large enough.

This can safely be done since there cannot be any concurrency on this
mm: it's either after the process died (exit) or in the middle of
execve where the thread switched to the new mm.
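
For illustration, a minimal sketch of how the fullmm path is reached
(hypothetical caller, loosely modelled on exit_mmap(); the unmap/free
steps and the exact tlb_finish_mmu() range are illustrative and not
part of the patch):

	static void exit_mmap_sketch(struct mm_struct *mm)
	{
		struct mmu_gather tlb;

		/* fullmm == 1: with this patch the mm-wide cache and
		 * TLB flush happens here, once, up front. */
		tlb_gather_mmu(&tlb, mm, 1);

		/* ... unmap_vmas() and free_pgtables() proceed without
		 * further per-range flushes; nothing can race with a
		 * dying mm ... */

		tlb_finish_mmu(&tlb, 0, -1);	/* range illustrative */
	}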

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/asm-generic/tlb.h |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Index: linux-2.6/include/asm-generic/tlb.h
===================================================================
--- linux-2.6.orig/include/asm-generic/tlb.h
+++ linux-2.6/include/asm-generic/tlb.h
@@ -149,6 +149,11 @@ tlb_gather_mmu(struct mmu_gather *tlb, s
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
 #endif
+
+	if (fullmm) {
+		flush_cache_mm(mm);
+		flush_tlb_mm(mm);
+	}
 }
 
 static inline void
@@ -156,13 +161,15 @@ tlb_flush_mmu(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
 
-	if (!tlb->need_flush)
-		return;
-	tlb->need_flush = 0;
-	tlb_flush(tlb);
+	if (!tlb->fullmm && tlb->need_flush) {
+		tlb->need_flush = 0;
+		tlb_flush(tlb);
+	}
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb_table_flush(tlb);
 #endif
+
 	if (tlb_fast_mode(tlb))
 		return;
 




* [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 17:59 ` Peter Zijlstra
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: fixup-flush_tlb_range.patch --]
[-- Type: text/plain, Size: 69108 bytes --]

In order to be able to properly support architectures that want/need to
support TLB range invalidation, we need to change the
flush_tlb_range() argument from a vm_area_struct to an mm_struct
because the range might very well extend past one VMA, or not have a
VMA at all.

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmap_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things

There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast(); gup_fast() walks the tables with interrupts
disabled, so the IPI-based TLB flush cannot complete until the walker
is done, and only then are the table pages actually freed.

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables, and concurrent
hardware walkers could re-populate these caches if the final TLB flush
were to be from tlb_end_vma(), since a concurrent walk could still be
in progress.
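
Concretely, the ordering relied upon looks like this (hypothetical
helper, simplified from the generic __pte_free_tlb() path; not part of
the patch):

	static void free_pte_sketch(struct mmu_gather *tlb, pmd_t *pmd,
				    struct page *pte_page)
	{
		pmd_clear(pmd);		/* unhook the table */

		/* Queue the page; it is only freed once tlb_finish_mmu()
		 * has flushed the TLB (and, with CONFIG_HAVE_RCU_TABLE_FREE,
		 * once a grace period has passed). */
		tlb_remove_page(tlb, pte_page);
	}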

Therefore we need to track the range over all VMAs and over the freeing
of the page-tables themselves. This means we cannot use a VMA argument
to flush the TLB range.

Most architectures only used the ->vm_mm argument anyway, so the
conversion is straightforward and removes numerous fake vma
instances created just to pass an mm pointer.

The exceptions are ARM and TILE, both of which also look at
->vm_flags. ARM uses this to optimize TLB flushes for Harvard-style
MMUs that have independent I-TLB ops. The conversion taken here is
rather ugly (because I can't write ARM asm) and creates a fake VMA
with VM_EXEC set so that it effectively always flushes the I-TLBs and
thus loses the optimization.

TILE uses vm_flags to check for VM_EXEC in order to flush the I-cache,
but also checks VM_HUGETLB. Arguably it shouldn't flush the I-cache
here and we can use things like update_mmu_cache() to solve this. As
for the HUGETLB case, we can simply flush both at a small penalty. The
current conversion does all three: I-cache, TLB and HUGETLB.
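
The shape of the conversion for callers that had no real VMA is
illustrated below (sketch only; compare the ecard.c and ia64 hunks in
the diff):

	/* Before: fabricate a vma just to carry the mm pointer. */
	struct vm_area_struct vma;
	vma.vm_mm = mm;
	flush_tlb_range(&vma, start, end);

	/* After: pass the mm directly. */
	flush_tlb_range(mm, start, end);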

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 Documentation/cachetlb.txt             |    9 +++------
 arch/alpha/include/asm/tlbflush.h      |    8 +++-----
 arch/alpha/kernel/smp.c                |    4 ++--
 arch/arm/include/asm/tlb.h             |    2 +-
 arch/arm/include/asm/tlbflush.h        |    5 +++--
 arch/arm/kernel/ecard.c                |    8 ++------
 arch/arm/kernel/smp_tlb.c              |   29 +++++++++++++++++++++--------
 arch/avr32/include/asm/tlb.h           |    2 +-
 arch/avr32/include/asm/tlbflush.h      |    4 ++--
 arch/avr32/mm/tlb.c                    |    4 +---
 arch/cris/include/asm/tlbflush.h       |    4 ++--
 arch/frv/include/asm/tlbflush.h        |    4 ++--
 arch/ia64/include/asm/tlb.h            |   11 ++---------
 arch/ia64/include/asm/tlbflush.h       |    4 ++--
 arch/ia64/mm/tlb.c                     |    3 +--
 arch/m32r/include/asm/tlbflush.h       |   14 +++++++-------
 arch/m32r/kernel/smp.c                 |    6 +++---
 arch/m32r/mm/fault-nommu.c             |    2 +-
 arch/m32r/mm/fault.c                   |    5 +----
 arch/m68k/include/asm/tlbflush.h       |    7 +++----
 arch/microblaze/include/asm/tlbflush.h |    2 +-
 arch/mips/include/asm/tlbflush.h       |    8 ++++----
 arch/mips/kernel/smp.c                 |   12 +++++-------
 arch/mips/mm/tlb-r3k.c                 |    3 +--
 arch/mips/mm/tlb-r4k.c                 |    3 +--
 arch/mips/mm/tlb-r8k.c                 |    3 +--
 arch/mn10300/include/asm/tlbflush.h    |    6 +++---
 arch/parisc/include/asm/tlb.h          |    2 +-
 arch/parisc/include/asm/tlbflush.h     |    2 +-
 arch/powerpc/include/asm/tlbflush.h    |    8 ++++----
 arch/powerpc/mm/tlb_hash32.c           |    6 +++---
 arch/powerpc/mm/tlb_nohash.c           |    4 ++--
 arch/s390/include/asm/tlbflush.h       |    6 +++---
 arch/score/include/asm/tlbflush.h      |    6 +++---
 arch/score/mm/tlb-score.c              |    3 +--
 arch/sh/include/asm/tlb.h              |    2 +-
 arch/sh/include/asm/tlbflush.h         |   10 +++++-----
 arch/sh/kernel/smp.c                   |   12 +++++-------
 arch/sh/mm/nommu.c                     |    2 +-
 arch/sh/mm/tlbflush_32.c               |    3 +--
 arch/sh/mm/tlbflush_64.c               |    4 +---
 arch/sparc/include/asm/tlb_32.h        |    2 +-
 arch/sparc/include/asm/tlbflush_32.h   |   12 ++++++------
 arch/sparc/include/asm/tlbflush_64.h   |    2 +-
 arch/sparc/kernel/smp_32.c             |    8 +++-----
 arch/sparc/mm/generic_32.c             |    2 +-
 arch/sparc/mm/generic_64.c             |    2 +-
 arch/sparc/mm/hypersparc.S             |    1 -
 arch/sparc/mm/srmmu.c                  |   17 ++++++++---------
 arch/sparc/mm/sun4c.c                  |    3 +--
 arch/sparc/mm/swift.S                  |    1 -
 arch/sparc/mm/tsunami.S                |    1 -
 arch/sparc/mm/viking.S                 |    2 --
 arch/tile/include/asm/tlbflush.h       |    4 ++--
 arch/tile/kernel/tlb.c                 |   11 +++++------
 arch/um/include/asm/tlbflush.h         |    4 ++--
 arch/um/kernel/tlb.c                   |    6 +++---
 arch/unicore32/include/asm/tlb.h       |    2 +-
 arch/unicore32/include/asm/tlbflush.h  |    2 +-
 arch/x86/include/asm/tlbflush.h        |   10 +++++-----
 arch/x86/mm/pgtable.c                  |    6 +++---
 arch/xtensa/include/asm/tlb.h          |    2 +-
 arch/xtensa/include/asm/tlbflush.h     |    2 +-
 arch/xtensa/mm/tlb.c                   |    3 +--
 mm/huge_memory.c                       |    6 +++---
 mm/hugetlb.c                           |    4 ++--
 mm/mprotect.c                          |    2 +-
 mm/pgtable-generic.c                   |    8 ++++----
 68 files changed, 168 insertions(+), 199 deletions(-)

Index: linux-2.6/Documentation/cachetlb.txt
===================================================================
--- linux-2.6.orig/Documentation/cachetlb.txt
+++ linux-2.6/Documentation/cachetlb.txt
@@ -49,20 +49,17 @@ invoke one of the following flush method
 	page table operations such as what happens during
 	fork, and exec.
 
-3) void flush_tlb_range(struct vm_area_struct *vma,
+3) void flush_tlb_range(struct mm_struct *mm,
 			unsigned long start, unsigned long end)
 
 	Here we are flushing a specific range of (user) virtual
 	address translations from the TLB.  After running, this
 	interface must make sure that any previous page table
-	modifications for the address space 'vma->vm_mm' in the range
+	modifications for the address space 'mm' in the range
 	'start' to 'end-1' will be visible to the cpu.  That is, after
 	running, here will be no entries in the TLB for 'mm' for
 	virtual addresses in the range 'start' to 'end-1'.
 
-	The "vma" is the backing store being used for the region.
-	Primarily, this is used for munmap() type operations.
-
 	The interface is provided in hopes that the port can find
 	a suitably efficient method for removing multiple page
 	sized translations from the TLB, instead of having the kernel
@@ -120,7 +117,7 @@ is changing an existing virtual-->physic
 
 	2) flush_cache_range(vma, start, end);
 	   change_range_of_page_tables(mm, start, end);
-	   flush_tlb_range(vma, start, end);
+	   flush_tlb_range(vma->vm_mm, start, end);
 
 	3) flush_cache_page(vma, addr, pfn);
 	   set_pte(pte_pointer, new_pte_val);
Index: linux-2.6/arch/alpha/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/tlbflush.h
+++ linux-2.6/arch/alpha/include/asm/tlbflush.h
@@ -127,10 +127,9 @@ flush_tlb_page(struct vm_area_struct *vm
 /* Flush a specified range of user mapping.  On the Alpha we flush
    the whole user tlb.  */
 static inline void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else /* CONFIG_SMP */
@@ -138,8 +137,7 @@ flush_tlb_range(struct vm_area_struct *v
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void flush_tlb_range(struct vm_area_struct *, unsigned long,
-			    unsigned long);
+extern void flush_tlb_range(struct mm_struct *, unsigned long, unsigned long);
 
 #endif /* CONFIG_SMP */
 
Index: linux-2.6/arch/alpha/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/alpha/kernel/smp.c
+++ linux-2.6/arch/alpha/kernel/smp.c
@@ -773,10 +773,10 @@ flush_tlb_page(struct vm_area_struct *vm
 EXPORT_SYMBOL(flush_tlb_page);
 
 void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
 	/* On the Alpha we always flush the whole user tlb.  */
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 EXPORT_SYMBOL(flush_tlb_range);
 
Index: linux-2.6/arch/arm/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlb.h
+++ linux-2.6/arch/arm/include/asm/tlb.h
@@ -83,7 +83,7 @@ static inline void tlb_flush(struct mmu_
 	if (tlb->fullmm || !tlb->vma)
 		flush_tlb_mm(tlb->mm);
 	else if (tlb->range_end > 0) {
-		flush_tlb_range(tlb->vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(tlb->mm, tlb->range_start, tlb->range_end);
 		tlb->range_start = TASK_SIZE;
 		tlb->range_end = 0;
 	}
Index: linux-2.6/arch/arm/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlbflush.h
+++ linux-2.6/arch/arm/include/asm/tlbflush.h
@@ -545,7 +545,8 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
+extern void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
+
 #define local_flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
 
 #ifndef CONFIG_SMP
@@ -560,7 +561,7 @@ extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr);
 extern void flush_tlb_kernel_page(unsigned long kaddr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 #endif
 
Index: linux-2.6/arch/arm/kernel/ecard.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/ecard.c
+++ linux-2.6/arch/arm/kernel/ecard.c
@@ -217,8 +217,6 @@ static DEFINE_MUTEX(ecard_mutex);
  */
 static void ecard_init_pgtables(struct mm_struct *mm)
 {
-	struct vm_area_struct vma;
-
 	/* We want to set up the page tables for the following mapping:
 	 *  Virtual	Physical
 	 *  0x03000000	0x03000000
@@ -242,10 +240,8 @@ static void ecard_init_pgtables(struct m
 
 	memcpy(dst_pgd, src_pgd, sizeof(pgd_t) * (EASI_SIZE / PGDIR_SIZE));
 
-	vma.vm_mm = mm;
-
-	flush_tlb_range(&vma, IO_START, IO_START + IO_SIZE);
-	flush_tlb_range(&vma, EASI_START, EASI_START + EASI_SIZE);
+	flush_tlb_range(mm, IO_START, IO_START + IO_SIZE);
+	flush_tlb_range(mm, EASI_START, EASI_START + EASI_SIZE);
 }
 
 static int ecard_init_mm(void)
Index: linux-2.6/arch/arm/kernel/smp_tlb.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/smp_tlb.c
+++ linux-2.6/arch/arm/kernel/smp_tlb.c
@@ -9,6 +9,7 @@
  */
 #include <linux/preempt.h>
 #include <linux/smp.h>
+#include <linux/mm.h>
 
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
@@ -31,7 +32,7 @@ static void on_each_cpu_mask(void (*func
  * TLB operations
  */
 struct tlb_args {
-	struct vm_area_struct *ta_vma;
+	struct mm_struct *ta_mm;
 	unsigned long ta_start;
 	unsigned long ta_end;
 };
@@ -51,8 +52,11 @@ static inline void ipi_flush_tlb_mm(void
 static inline void ipi_flush_tlb_page(void *arg)
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
+	struct vm_area_struct vma = {
+		.vm_mm = ta->ta_mm,
+	};
 
-	local_flush_tlb_page(ta->ta_vma, ta->ta_start);
+	local_flush_tlb_page(&vma, ta->ta_start);
 }
 
 static inline void ipi_flush_tlb_kernel_page(void *arg)
@@ -66,7 +70,7 @@ static inline void ipi_flush_tlb_range(v
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
 
-	local_flush_tlb_range(ta->ta_vma, ta->ta_start, ta->ta_end);
+	local_flush_tlb_range(ta->ta_mm, ta->ta_start, ta->ta_end);
 }
 
 static inline void ipi_flush_tlb_kernel_range(void *arg)
@@ -96,7 +100,7 @@ void flush_tlb_page(struct vm_area_struc
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = vma->vm_mm;
 		ta.ta_start = uaddr;
 		on_each_cpu_mask(ipi_flush_tlb_page, &ta, 1, mm_cpumask(vma->vm_mm));
 	} else
@@ -113,17 +117,17 @@ void flush_tlb_kernel_page(unsigned long
 		local_flush_tlb_kernel_page(kaddr);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
                      unsigned long start, unsigned long end)
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = mm;
 		ta.ta_start = start;
 		ta.ta_end = end;
-		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(vma->vm_mm));
+		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(mm));
 	} else
-		local_flush_tlb_range(vma, start, end);
+		local_flush_tlb_range(mm, start, end);
 }
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
@@ -137,3 +141,12 @@ void flush_tlb_kernel_range(unsigned lon
 		local_flush_tlb_kernel_range(start, end);
 }
 
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
+{
+	struct vm_area_struct vma = {
+		.vm_mm = mm,
+		.vm_flags = VM_EXEC,
+	};
+
+	__cpu_flush_user_tlb_range(start, end, &vma);
+}
Index: linux-2.6/arch/avr32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlb.h
+++ linux-2.6/arch/avr32/include/asm/tlb.h
@@ -12,7 +12,7 @@
 	flush_cache_range(vma, vma->vm_start, vma->vm_end)
 
 #define tlb_end_vma(tlb, vma) \
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end)
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) do { } while(0)
 
Index: linux-2.6/arch/avr32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlbflush.h
+++ linux-2.6/arch/avr32/include/asm/tlbflush.h
@@ -17,13 +17,13 @@
  *  - flush_tlb_all() flushes all processes' TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLBs
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void flush_tlb(void);
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
Index: linux-2.6/arch/avr32/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/avr32/mm/tlb.c
+++ linux-2.6/arch/avr32/mm/tlb.c
@@ -170,11 +170,9 @@ void flush_tlb_page(struct vm_area_struc
 	}
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/cris/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/cris/include/asm/tlbflush.h
+++ linux-2.6/arch/cris/include/asm/tlbflush.h
@@ -33,9 +33,9 @@ extern void flush_tlb_page(struct vm_are
 #define flush_tlb_page __flush_tlb_page
 #endif
 
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
+static inline void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 static inline void flush_tlb(void)
Index: linux-2.6/arch/frv/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/frv/include/asm/tlbflush.h
+++ linux-2.6/arch/frv/include/asm/tlbflush.h
@@ -39,10 +39,10 @@ do {						\
 	preempt_enable();			\
 } while(0)
 
-#define flush_tlb_range(vma,start,end)					\
+#define flush_tlb_range(mm,start,end)					\
 do {									\
 	preempt_disable();						\
-	__flush_tlb_range((vma)->vm_mm->context.id, start, end);	\
+	__flush_tlb_range((mm)->context.id, start, end);		\
 	preempt_enable();						\
 } while(0)
 
Index: linux-2.6/arch/ia64/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlb.h
+++ linux-2.6/arch/ia64/include/asm/tlb.h
@@ -126,17 +126,10 @@ ia64_tlb_flush_mmu (struct mmu_gather *t
 		 */
 		flush_tlb_all();
 	} else {
-		/*
-		 * XXX fix me: flush_tlb_range() should take an mm pointer instead of a
-		 * vma pointer.
-		 */
-		struct vm_area_struct vma;
-
-		vma.vm_mm = tlb->mm;
 		/* flush the address range from the tlb: */
-		flush_tlb_range(&vma, start, end);
+		flush_tlb_range(tlb->mm, start, end);
 		/* now flush the virt. page-table area mapping the address range: */
-		flush_tlb_range(&vma, ia64_thash(start), ia64_thash(end));
+		flush_tlb_range(tlb->mm, ia64_thash(start), ia64_thash(end));
 	}
 
 	/* lastly, release the freed pages */
Index: linux-2.6/arch/ia64/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlbflush.h
+++ linux-2.6/arch/ia64/include/asm/tlbflush.h
@@ -66,7 +66,7 @@ flush_tlb_mm (struct mm_struct *mm)
 #endif
 }
 
-extern void flush_tlb_range (struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range (struct mm_struct *mm, unsigned long start, unsigned long end);
 
 /*
  * Page-granular tlb flush.
@@ -75,7 +75,7 @@ static inline void
 flush_tlb_page (struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_SMP
-	flush_tlb_range(vma, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
+	flush_tlb_range(vma->vm_mm, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
 #else
 	if (vma->vm_mm == current->active_mm)
 		ia64_ptcl(addr, (PAGE_SHIFT << 2));
Index: linux-2.6/arch/ia64/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/ia64/mm/tlb.c
+++ linux-2.6/arch/ia64/mm/tlb.c
@@ -298,10 +298,9 @@ local_flush_tlb_all (void)
 }
 
 void
-flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
+flush_tlb_range (struct mm_struct *mm, unsigned long start,
 		 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size = end - start;
 	unsigned long nbits;
 
Index: linux-2.6/arch/m32r/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m32r/include/asm/tlbflush.h
+++ linux-2.6/arch/m32r/include/asm/tlbflush.h
@@ -17,7 +17,7 @@
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *);
 extern void local_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void local_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void local_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #ifndef CONFIG_SMP
@@ -25,27 +25,27 @@ extern void local_flush_tlb_range(struct
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	local_flush_tlb_all()
 #else	/* CONFIG_MMU */
 #define flush_tlb_all()			do { } while (0)
 #define flush_tlb_mm(mm)		do { } while (0)
 #define flush_tlb_page(vma, vmaddr)	do { } while (0)
-#define flush_tlb_range(vma, start, end)	do { } while (0)
+#define flush_tlb_range(mm, start, end)	do { } while (0)
 #endif	/* CONFIG_MMU */
 #else	/* CONFIG_SMP */
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *);
 extern void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void smp_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void smp_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #define flush_tlb_all()			smp_flush_tlb_all()
 #define flush_tlb_mm(mm)		smp_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	smp_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	smp_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	smp_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	smp_flush_tlb_all()
 #endif	/* CONFIG_SMP */
 
Index: linux-2.6/arch/m32r/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/m32r/kernel/smp.c
+++ linux-2.6/arch/m32r/kernel/smp.c
@@ -71,7 +71,7 @@ void smp_flush_tlb_all(void);
 static void flush_tlb_all_ipi(void *);
 
 void smp_flush_tlb_mm(struct mm_struct *);
-void smp_flush_tlb_range(struct vm_area_struct *, unsigned long, \
+void smp_flush_tlb_range(struct mm_struct *, unsigned long, \
 	unsigned long);
 void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
 static void flush_tlb_others(cpumask_t, struct mm_struct *,
@@ -299,10 +299,10 @@ void smp_flush_tlb_mm(struct mm_struct *
  * ---------- --- --------------------------------------------------------
  *
  *==========================================================================*/
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	smp_flush_tlb_mm(vma->vm_mm);
+	smp_flush_tlb_mm(mm);
 }
 
 /*==========================================================================*
Index: linux-2.6/arch/m32r/mm/fault-nommu.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault-nommu.c
+++ linux-2.6/arch/m32r/mm/fault-nommu.c
@@ -111,7 +111,7 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/m32r/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault.c
+++ linux-2.6/arch/m32r/mm/fault.c
@@ -468,12 +468,9 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm;
-
-	mm = vma->vm_mm;
 	if (mm_context(mm) != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/m68k/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m68k/include/asm/tlbflush.h
+++ linux-2.6/arch/m68k/include/asm/tlbflush.h
@@ -80,10 +80,10 @@ static inline void flush_tlb_page(struct
 	}
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -177,10 +177,9 @@ static inline void flush_tlb_page (struc
 }
 /* Flush a range of pages from TLB. */
 
-static inline void flush_tlb_range (struct vm_area_struct *vma,
+static inline void flush_tlb_range (struct mm_struct *mm,
 		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned char seg, oldctx;
 
 	start &= ~SUN3_PMEG_MASK;
Index: linux-2.6/arch/microblaze/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/microblaze/include/asm/tlbflush.h
+++ linux-2.6/arch/microblaze/include/asm/tlbflush.h
@@ -33,7 +33,7 @@ static inline void local_flush_tlb_mm(st
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
 				unsigned long vmaddr)
 	{ __tlbie(vmaddr); }
-static inline void local_flush_tlb_range(struct vm_area_struct *vma,
+static inline void local_flush_tlb_range(struct mm_struct *mm,
 		unsigned long start, unsigned long end)
 	{ __tlbia(); }
 
Index: linux-2.6/arch/mips/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/tlbflush.h
+++ linux-2.6/arch/mips/include/asm/tlbflush.h
@@ -9,12 +9,12 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -26,7 +26,7 @@ extern void local_flush_tlb_one(unsigned
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long,
 	unsigned long);
 extern void flush_tlb_kernel_range(unsigned long, unsigned long);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
@@ -36,7 +36,7 @@ extern void flush_tlb_one(unsigned long 
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end)	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr,end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/mips/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/mips/kernel/smp.c
+++ linux-2.6/arch/mips/kernel/smp.c
@@ -307,7 +307,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -316,17 +316,15 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd = {
-			.vma = vma,
+			.mm = mm,
 			.addr1 = start,
 			.addr2 = end,
 		};
@@ -341,7 +339,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (cpu_context(cpu, mm))
 				cpu_context(cpu, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/mips/mm/tlb-r3k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r3k.c
+++ linux-2.6/arch/mips/mm/tlb-r3k.c
@@ -76,10 +76,9 @@ void local_flush_tlb_mm(struct mm_struct
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r4k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r4k.c
+++ linux-2.6/arch/mips/mm/tlb-r4k.c
@@ -112,10 +112,9 @@ void local_flush_tlb_mm(struct mm_struct
 	preempt_enable();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r8k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r8k.c
+++ linux-2.6/arch/mips/mm/tlb-r8k.c
@@ -60,10 +60,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm, cpu);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 	unsigned long flags;
 	int oldpid, newpid, size;
Index: linux-2.6/arch/mn10300/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mn10300/include/asm/tlbflush.h
+++ linux-2.6/arch/mn10300/include/asm/tlbflush.h
@@ -105,10 +105,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()		flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else   /* CONFIG_SMP */
@@ -127,7 +127,7 @@ static inline void flush_tlb_mm(struct m
 	preempt_enable();
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 	preempt_disable();
Index: linux-2.6/arch/parisc/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlb.h
+++ linux-2.6/arch/parisc/include/asm/tlb.h
@@ -13,7 +13,7 @@ do {	if (!(tlb)->fullmm)	\
 
 #define tlb_end_vma(tlb, vma)	\
 do {	if (!(tlb)->fullmm)	\
-		flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
+		flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end); \
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/parisc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlbflush.h
+++ linux-2.6/arch/parisc/include/asm/tlbflush.h
@@ -76,7 +76,7 @@ static inline void flush_tlb_page(struct
 void __flush_tlb_range(unsigned long sid,
 	unsigned long start, unsigned long end);
 
-#define flush_tlb_range(vma,start,end) __flush_tlb_range((vma)->vm_mm->context,start,end)
+#define flush_tlb_range(mm,start,end) __flush_tlb_range((mm)->context,start,end)
 
 #define flush_tlb_kernel_range(start, end) __flush_tlb_range(0,start,end)
 
Index: linux-2.6/arch/powerpc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/tlbflush.h
+++ linux-2.6/arch/powerpc/include/asm/tlbflush.h
@@ -10,7 +10,7 @@
  *                           the local processor
  *  - local_flush_tlb_page(vma, vmaddr) flushes one page on the local processor
  *  - flush_tlb_page_nohash(vma, vmaddr) flushes one page if SW loaded TLB
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *
  *  This program is free software; you can redistribute it and/or
@@ -34,7 +34,7 @@ struct mm_struct;
 
 #define MMU_NO_CONTEXT      	((unsigned int)-1)
 
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
@@ -64,7 +64,7 @@ extern void __flush_tlb_page(struct mm_s
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -153,7 +153,7 @@ static inline void flush_tlb_page_nohash
 {
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 }
Index: linux-2.6/arch/powerpc/mm/tlb_hash32.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_hash32.c
+++ linux-2.6/arch/powerpc/mm/tlb_hash32.c
@@ -78,7 +78,7 @@ void tlb_flush(struct mmu_gather *tlb)
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  * since the hardware hash table functions as an extension of the
@@ -171,9 +171,9 @@ EXPORT_SYMBOL(flush_tlb_page);
  * and check _PAGE_HASHPTE bit; if it is set, find and destroy
  * the corresponding HPTE.
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	flush_range(vma->vm_mm, start, end);
+	flush_range(mm, start, end);
 }
 EXPORT_SYMBOL(flush_tlb_range);
Index: linux-2.6/arch/powerpc/mm/tlb_nohash.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_nohash.c
+++ linux-2.6/arch/powerpc/mm/tlb_nohash.c
@@ -107,7 +107,7 @@ unsigned long linear_map_top;	/* Top of 
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  *  - local_* variants of page and mm only apply to the current
@@ -288,7 +288,7 @@ EXPORT_SYMBOL(flush_tlb_kernel_range);
  * some implementation can stack multiple tlbivax before a tlbsync but
  * for now, we keep it that way
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 
 {
Index: linux-2.6/arch/s390/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/tlbflush.h
+++ linux-2.6/arch/s390/include/asm/tlbflush.h
@@ -108,7 +108,7 @@ static inline void __tlb_flush_mm_cond(s
  *  flush_tlb_all() - flushes all processes TLBs
  *  flush_tlb_mm(mm) - flushes the specified mm context TLB's
  *  flush_tlb_page(vma, vmaddr) - flushes one page
- *  flush_tlb_range(vma, start, end) - flushes a range of pages
+ *  flush_tlb_range(mm, start, end) - flushes a range of pages
  *  flush_tlb_kernel_range(start, end) - flushes a range of kernel pages
  */
 
@@ -129,10 +129,10 @@ static inline void flush_tlb_mm(struct m
 	__tlb_flush_mm_cond(mm);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	__tlb_flush_mm_cond(vma->vm_mm);
+	__tlb_flush_mm_cond(mm);
 }
 
 static inline void flush_tlb_kernel_range(unsigned long start,
Index: linux-2.6/arch/score/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/score/include/asm/tlbflush.h
+++ linux-2.6/arch/score/include/asm/tlbflush.h
@@ -14,7 +14,7 @@
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -24,8 +24,8 @@ extern void local_flush_tlb_one(unsigned
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end) \
-	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end) \
+	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr, end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/score/mm/tlb-score.c
===================================================================
--- linux-2.6.orig/arch/score/mm/tlb-score.c
+++ linux-2.6/arch/score/mm/tlb-score.c
@@ -77,10 +77,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long vma_mm_context = mm->context;
 	if (mm->context != 0) {
 		unsigned long flags;
Index: linux-2.6/arch/sh/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlb.h
+++ linux-2.6/arch/sh/include/asm/tlb.h
@@ -78,7 +78,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->end) {
-		flush_tlb_range(vma, tlb->start, tlb->end);
+		flush_tlb_range(vma->vm_mm, tlb->start, tlb->end);
 		init_tlb_gather(tlb);
 	}
 }
Index: linux-2.6/arch/sh/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlbflush.h
+++ linux-2.6/arch/sh/include/asm/tlbflush.h
@@ -7,12 +7,12 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -27,7 +27,7 @@ extern void __flush_tlb_global(void);
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
@@ -40,8 +40,8 @@ extern void flush_tlb_one(unsigned long 
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_one(asid, page)	local_flush_tlb_one(asid, page)
 
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 
 #define flush_tlb_kernel_range(start, end)	\
 	local_flush_tlb_kernel_range(start, end)
Index: linux-2.6/arch/sh/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/sh/kernel/smp.c
+++ linux-2.6/arch/sh/kernel/smp.c
@@ -390,7 +390,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -399,19 +399,17 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = (struct flush_tlb_data *)info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd;
 
-		fd.vma = vma;
+		fd.mm = mm;
 		fd.addr1 = start;
 		fd.addr2 = end;
 		smp_call_function(flush_tlb_range_ipi, (void *)&fd, 1);
@@ -421,7 +419,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (smp_processor_id() != i)
 				cpu_context(i, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/sh/mm/nommu.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/nommu.c
+++ linux-2.6/arch/sh/mm/nommu.c
@@ -46,7 +46,7 @@ void local_flush_tlb_mm(struct mm_struct
 	BUG();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/sh/mm/tlbflush_32.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_32.c
+++ linux-2.6/arch/sh/mm/tlbflush_32.c
@@ -36,10 +36,9 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != NO_CONTEXT) {
Index: linux-2.6/arch/sh/mm/tlbflush_64.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_64.c
+++ linux-2.6/arch/sh/mm/tlbflush_64.c
@@ -365,16 +365,14 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
 	unsigned long flags;
 	unsigned long long match, pteh=0, pteh_epn, pteh_low;
 	unsigned long tlb;
 	unsigned int cpu = smp_processor_id();
-	struct mm_struct *mm;
 
-	mm = vma->vm_mm;
 	if (cpu_context(cpu, mm) == NO_CONTEXT)
 		return;
 
Index: linux-2.6/arch/sparc/include/asm/tlb_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlb_32.h
+++ linux-2.6/arch/sparc/include/asm/tlb_32.h
@@ -8,7 +8,7 @@ do {								\
 
 #define tlb_end_vma(tlb, vma) \
 do {								\
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);\
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/sparc/include/asm/tlbflush_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_32.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_32.h
@@ -11,7 +11,7 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 
@@ -19,17 +19,17 @@
 
 BTFIXUPDEF_CALL(void, local_flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, local_flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, local_flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define local_flush_tlb_all() BTFIXUP_CALL(local_flush_tlb_all)()
 #define local_flush_tlb_mm(mm) BTFIXUP_CALL(local_flush_tlb_mm)(mm)
-#define local_flush_tlb_range(vma,start,end) BTFIXUP_CALL(local_flush_tlb_range)(vma,start,end)
+#define local_flush_tlb_range(mm,start,end) BTFIXUP_CALL(local_flush_tlb_range)(mm,start,end)
 #define local_flush_tlb_page(vma,addr) BTFIXUP_CALL(local_flush_tlb_page)(vma,addr)
 
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *mm);
-extern void smp_flush_tlb_range(struct vm_area_struct *vma,
+extern void smp_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void smp_flush_tlb_page(struct vm_area_struct *mm, unsigned long page);
@@ -38,12 +38,12 @@ extern void smp_flush_tlb_page(struct vm
 
 BTFIXUPDEF_CALL(void, flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define flush_tlb_all() BTFIXUP_CALL(flush_tlb_all)()
 #define flush_tlb_mm(mm) BTFIXUP_CALL(flush_tlb_mm)(mm)
-#define flush_tlb_range(vma,start,end) BTFIXUP_CALL(flush_tlb_range)(vma,start,end)
+#define flush_tlb_range(mm,start,end) BTFIXUP_CALL(flush_tlb_range)(mm,start,end)
 #define flush_tlb_page(vma,addr) BTFIXUP_CALL(flush_tlb_page)(vma,addr)
 
 // #define flush_tlb() flush_tlb_mm(current->active_mm)	/* XXX Sure? */
Index: linux-2.6/arch/sparc/include/asm/tlbflush_64.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_64.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_64.h
@@ -21,7 +21,7 @@ extern void flush_tsb_user(struct tlb_ba
 
 extern void flush_tlb_pending(void);
 
-#define flush_tlb_range(vma,start,end)	\
+#define flush_tlb_range(mm,start,end)	\
 	do { (void)(start); flush_tlb_pending(); } while (0)
 #define flush_tlb_page(vma,addr)	flush_tlb_pending()
 #define flush_tlb_mm(mm)		flush_tlb_pending()
Index: linux-2.6/arch/sparc/kernel/smp_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/smp_32.c
+++ linux-2.6/arch/sparc/kernel/smp_32.c
@@ -184,17 +184,15 @@ void smp_flush_cache_range(struct vm_are
 	}
 }
 
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		cpumask_t cpu_mask = *mm_cpumask(mm);
 		cpu_clear(smp_processor_id(), cpu_mask);
 		if (!cpus_empty(cpu_mask))
-			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) vma, start, end);
-		local_flush_tlb_range(vma, start, end);
+			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) mm, start, end);
+		local_flush_tlb_range(mm, start, end);
 	}
 }
 
Index: linux-2.6/arch/sparc/mm/generic_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_32.c
+++ linux-2.6/arch/sparc/mm/generic_32.c
@@ -92,7 +92,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/generic_64.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_64.c
+++ linux-2.6/arch/sparc/mm/generic_64.c
@@ -158,7 +158,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/hypersparc.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/hypersparc.S
+++ linux-2.6/arch/sparc/mm/hypersparc.S
@@ -284,7 +284,6 @@
 	 sta	%g5, [%g1] ASI_M_MMUREGS
 
 hypersparc_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
Index: linux-2.6/arch/sparc/mm/srmmu.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/srmmu.c
+++ linux-2.6/arch/sparc/mm/srmmu.c
@@ -679,7 +679,7 @@ extern void tsunami_flush_page_for_dma(u
 extern void tsunami_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void tsunami_flush_tlb_all(void);
 extern void tsunami_flush_tlb_mm(struct mm_struct *mm);
-extern void tsunami_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void tsunami_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void tsunami_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void tsunami_setup_blockops(void);
 
@@ -726,7 +726,7 @@ extern void swift_flush_page_for_dma(uns
 extern void swift_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void swift_flush_tlb_all(void);
 extern void swift_flush_tlb_mm(struct mm_struct *mm);
-extern void swift_flush_tlb_range(struct vm_area_struct *vma,
+extern void swift_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 extern void swift_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
@@ -964,9 +964,8 @@ static void cypress_flush_tlb_mm(struct 
 	FLUSH_END
 }
 
-static void cypress_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void cypress_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size;
 
 	FLUSH_BEGIN(mm)
@@ -1018,13 +1017,13 @@ extern void viking_flush_page(unsigned l
 extern void viking_mxcc_flush_page(unsigned long page);
 extern void viking_flush_tlb_all(void);
 extern void viking_flush_tlb_mm(struct mm_struct *mm);
-extern void viking_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void viking_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void viking_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
 extern void sun4dsmp_flush_tlb_all(void);
 extern void sun4dsmp_flush_tlb_mm(struct mm_struct *mm);
-extern void sun4dsmp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void sun4dsmp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void sun4dsmp_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
@@ -1039,7 +1038,7 @@ extern void hypersparc_flush_page_for_dm
 extern void hypersparc_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void hypersparc_flush_tlb_all(void);
 extern void hypersparc_flush_tlb_mm(struct mm_struct *mm);
-extern void hypersparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void hypersparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void hypersparc_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void hypersparc_setup_blockops(void);
 
@@ -1761,9 +1760,9 @@ static void turbosparc_flush_tlb_mm(stru
 	FLUSH_END
 }
 
-static void turbosparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void turbosparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	FLUSH_BEGIN(vma->vm_mm)
+	FLUSH_BEGIN(mm)
 	srmmu_flush_whole_tlb();
 	FLUSH_END
 }
Index: linux-2.6/arch/sparc/mm/sun4c.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/sun4c.c
+++ linux-2.6/arch/sparc/mm/sun4c.c
@@ -1419,9 +1419,8 @@ static void sun4c_flush_tlb_mm(struct mm
 	}
 }
 
-static void sun4c_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void sun4c_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int new_ctx = mm->context;
 
 	if (new_ctx != NO_CONTEXT) {
Index: linux-2.6/arch/sparc/mm/swift.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/swift.S
+++ linux-2.6/arch/sparc/mm/swift.S
@@ -219,7 +219,6 @@
 	.globl	swift_flush_tlb_range
 	.globl	swift_flush_tlb_all
 swift_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 swift_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/tsunami.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/tsunami.S
+++ linux-2.6/arch/sparc/mm/tsunami.S
@@ -46,7 +46,6 @@
 
 	/* More slick stuff... */
 tsunami_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 tsunami_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/viking.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/viking.S
+++ linux-2.6/arch/sparc/mm/viking.S
@@ -149,7 +149,6 @@
 #endif
 
 viking_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
@@ -240,7 +239,6 @@
 	tst	%g5
 	bne	3f
 	 mov	SRMMU_CTX_REG, %g1
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
 	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
Index: linux-2.6/arch/tile/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/tile/include/asm/tlbflush.h
+++ linux-2.6/arch/tile/include/asm/tlbflush.h
@@ -105,7 +105,7 @@ static inline void local_flush_tlb_all(v
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -120,7 +120,7 @@ extern void flush_tlb_mm(struct mm_struc
 extern void flush_tlb_page(const struct vm_area_struct *, unsigned long);
 extern void flush_tlb_page_mm(const struct vm_area_struct *,
 			      struct mm_struct *, unsigned long);
-extern void flush_tlb_range(const struct vm_area_struct *,
+extern void flush_tlb_range(const struct mm_struct *,
 			    unsigned long start, unsigned long end);
 
 #define flush_tlb()     flush_tlb_current_task()
Index: linux-2.6/arch/tile/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/tile/kernel/tlb.c
+++ linux-2.6/arch/tile/kernel/tlb.c
@@ -64,14 +64,13 @@ void flush_tlb_page(const struct vm_area
 }
 EXPORT_SYMBOL(flush_tlb_page);
 
-void flush_tlb_range(const struct vm_area_struct *vma,
+void flush_tlb_range(const struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	unsigned long size = hv_page_size(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
-	flush_remote(0, cache, &mm->cpu_vm_mask, start, end - start, size,
-		     &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
+		     start, end - start, PAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, 0, &mm->cpu_vm_mask,
+		     start, end - start, HPAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
 }
 
 void flush_tlb_all(void)
Index: linux-2.6/arch/um/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/um/include/asm/tlbflush.h
+++ linux-2.6/arch/um/include/asm/tlbflush.h
@@ -16,12 +16,12 @@
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
  *  - flush_tlb_kernel_vm() flushes the kernel vm area
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  */
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, 
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
 extern void flush_tlb_kernel_vm(void);
Index: linux-2.6/arch/um/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/um/kernel/tlb.c
+++ linux-2.6/arch/um/kernel/tlb.c
@@ -492,12 +492,12 @@ static void fix_range(struct mm_struct *
 	fix_range_common(mm, start_addr, end_addr, force);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	if (vma->vm_mm == NULL)
+	if (mm == NULL)
 		flush_tlb_kernel_range_common(start, end);
-	else fix_range(vma->vm_mm, start, end, 0);
+	else fix_range(mm, start, end, 0);
 }
 
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
Index: linux-2.6/arch/unicore32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlb.h
+++ linux-2.6/arch/unicore32/include/asm/tlb.h
@@ -77,7 +77,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->range_end > 0)
-		flush_tlb_range(vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(vma->vm_mm, tlb->range_start, tlb->range_end);
 }
 
 static inline void tlb_flush_mmu(struct mmu_gather *tlb)
Index: linux-2.6/arch/unicore32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlbflush.h
+++ linux-2.6/arch/unicore32/include/asm/tlbflush.h
@@ -167,7 +167,7 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma, start, end)	\
-	__cpu_flush_user_tlb_range(start, end, vma)
+#define local_flush_tlb_range(mm, start, end)	\
+	__cpu_flush_user_tlb_range(start, end, &(struct vm_area_struct){ .vm_mm = (mm), .vm_flags = VM_EXEC })
 #define local_flush_tlb_kernel_range(s, e)	\
 	__cpu_flush_kern_tlb_range(s, e)
Index: linux-2.6/arch/x86/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/tlbflush.h
+++ linux-2.6/arch/x86/include/asm/tlbflush.h
@@ -75,7 +75,7 @@ static inline void __flush_tlb_one(unsig
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -106,10 +106,10 @@ static inline void flush_tlb_page(struct
 		__flush_tlb_one(addr);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -136,10 +136,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()	flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 void native_flush_tlb_others(const struct cpumask *cpumask,
Index: linux-2.6/arch/x86/mm/pgtable.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pgtable.c
+++ linux-2.6/arch/x86/mm/pgtable.c
@@ -332,7 +332,7 @@ int pmdp_set_access_flags(struct vm_area
 	if (changed && dirty) {
 		*pmdp = entry;
 		pmd_update_defer(vma->vm_mm, address, pmdp);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 
 	return changed;
@@ -393,7 +393,7 @@ int pmdp_clear_flush_young(struct vm_are
 
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 
 	return young;
 }
@@ -408,7 +408,7 @@ void pmdp_splitting_flush(struct vm_area
 	if (set) {
 		pmd_update(vma->vm_mm, address, pmdp);
 		/* need tlb flush only to serialize against gup-fast */
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 }
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlb.h
+++ linux-2.6/arch/xtensa/include/asm/tlb.h
@@ -32,7 +32,7 @@
 # define tlb_end_vma(tlb, vma)						      \
 	do {								      \
 		if (!tlb->fullmm)					      \
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end);     \
+			flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);     \
 	} while(0)
 
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlbflush.h
+++ linux-2.6/arch/xtensa/include/asm/tlbflush.h
@@ -37,7 +37,7 @@
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct*);
 extern void flush_tlb_page(struct vm_area_struct*,unsigned long);
-extern void flush_tlb_range(struct vm_area_struct*,unsigned long,unsigned long);
+extern void flush_tlb_range(struct mm_struct*,unsigned long,unsigned long);
 
 #define flush_tlb_kernel_range(start,end) flush_tlb_all()
 
Index: linux-2.6/arch/xtensa/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/xtensa/mm/tlb.c
+++ linux-2.6/arch/xtensa/mm/tlb.c
@@ -82,10 +82,9 @@ void flush_tlb_mm(struct mm_struct *mm)
 # define _TLB_ENTRIES _DTLB_ENTRIES
 #endif
 
-void flush_tlb_range (struct vm_area_struct *vma,
+void flush_tlb_range (struct mm_struct *mm,
     		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long flags;
 
 	if (mm->context == NO_CONTEXT)
Index: linux-2.6/mm/huge_memory.c
===================================================================
--- linux-2.6.orig/mm/huge_memory.c
+++ linux-2.6/mm/huge_memory.c
@@ -1058,8 +1058,8 @@ int change_huge_pmd(struct vm_area_struc
 			entry = pmdp_get_and_clear(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			set_pmd_at(mm, addr, pmd, entry);
-			spin_unlock(&vma->vm_mm->page_table_lock);
-			flush_tlb_range(vma, addr, addr + HPAGE_PMD_SIZE);
+			spin_unlock(&mm->page_table_lock);
+			flush_tlb_range(mm, addr, addr + HPAGE_PMD_SIZE);
 			ret = 1;
 		}
 	} else
@@ -1313,7 +1313,7 @@ static int __split_huge_page_map(struct 
 		 * of the pmd entry with pmd_populate.
 		 */
 		set_pmd_at(mm, address, pmd, pmd_mknotpresent(*pmd));
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(mm, address, address + HPAGE_PMD_SIZE);
 		pmd_populate(mm, pmd, pgtable);
 		ret = 1;
 	}
Index: linux-2.6/mm/hugetlb.c
===================================================================
--- linux-2.6.orig/mm/hugetlb.c
+++ linux-2.6/mm/hugetlb.c
@@ -2264,7 +2264,7 @@ void __unmap_hugepage_range(struct vm_ar
 		list_add(&page->lru, &page_list);
 	}
 	spin_unlock(&mm->page_table_lock);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 	mmu_notifier_invalidate_range_end(mm, start, end);
 	list_for_each_entry_safe(page, tmp, &page_list, lru) {
 		page_remove_rmap(page);
@@ -2829,7 +2829,7 @@ void hugetlb_change_protection(struct vm
 	spin_unlock(&mm->page_table_lock);
 	mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
 
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int hugetlb_reserve_pages(struct inode *inode,
Index: linux-2.6/mm/mprotect.c
===================================================================
--- linux-2.6.orig/mm/mprotect.c
+++ linux-2.6/mm/mprotect.c
@@ -138,7 +138,7 @@ static void change_protection(struct vm_
 		change_pud_range(vma, pgd, addr, next, newprot,
 				 dirty_accountable);
 	} while (pgd++, addr = next, addr != end);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int
Index: linux-2.6/mm/pgtable-generic.c
===================================================================
--- linux-2.6.orig/mm/pgtable-generic.c
+++ linux-2.6/mm/pgtable-generic.c
@@ -43,7 +43,7 @@ int pmdp_set_access_flags(struct vm_area
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	if (changed) {
 		set_pmd_at(vma->vm_mm, address, pmdp, entry);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 	return changed;
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -76,7 +76,7 @@ int pmdp_clear_flush_young(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return young;
 }
 #endif
@@ -100,7 +100,7 @@ pmd_t pmdp_clear_flush(struct vm_area_st
 	pmd_t pmd;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	pmd = pmdp_get_and_clear(vma->vm_mm, address, pmdp);
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return pmd;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -115,7 +115,7 @@ pmd_t pmdp_splitting_flush(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
 	/* tlb flush only to serialize against gup-fast */
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
@ 2011-03-02 17:59   ` Peter Zijlstra
  0 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: fixup-flush_tlb_range.patch --]
[-- Type: text/plain, Size: 69411 bytes --]

In order to properly support architectures that want/need TLB range
invalidation, we need to change the flush_tlb_range() argument from a
vm_area_struct to an mm_struct, because the range might very well
extend past one VMA, or not have a VMA at all.
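
To make this concrete, compare the two calling conventions (for
illustration only; the 'before' half is exactly the fake-vma pattern
that the ia64 and ARM ecard hunks below remove):

	/* before: no VMA at hand, so one is fabricated around the mm */
	struct vm_area_struct vma;
	vma.vm_mm = mm;
	flush_tlb_range(&vma, start, end);

	/* after: the mm is passed directly */
	flush_tlb_range(mm, start, end);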

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmap_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things
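
For reference, that second pattern reduces to the following sketch
(assuming the current mmu_gather API; the actual range arguments used
by shift_arg_pages() are elided):

	struct mmu_gather *tlb;

	tlb = tlb_gather_mmu(mm, 0);	/* not a full-mm teardown */
	free_pgd_range(tlb, start, end, floor, ceiling);
	tlb_finish_mmu(tlb, start, end);	/* one flush covering everything */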

There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast().
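
To illustrate the x86 case: gup_fast() walks the page tables with
interrupts disabled, and remote TLB flushes are delivered by IPI, so
the flush cannot complete while a lockless walk is in flight (a sketch
of the invariant, not the actual gup code):

	unsigned long flags;

	local_irq_save(flags);
	/*
	 * Any page-table page observed during the lockless walk here
	 * cannot be freed and re-used: freeing happens only after the
	 * TLB flush IPI has run on all CPUs, and this CPU will not
	 * service that IPI until local_irq_restore() below.
	 */
	local_irq_restore(flags);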

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables, and concurrent
hardware walkers could re-populate these caches if the final tlb flush
were to come from tlb_end_vma(), since a concurrent walk could still
be in progress.

Therefore we need to track the range over all VMAs, including the
freeing of the page-tables themselves. This means we cannot use a VMA
argument for the TLB range flush.
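
Concretely, the tracking amounts to accumulating one
[range_start, range_end) interval in the mmu_gather, modeled on the
ARM code below (the helper name here is made up):

	static inline void tlb_track_range(struct mmu_gather *tlb,
					   unsigned long addr, unsigned long end)
	{
		if (addr < tlb->range_start)
			tlb->range_start = addr;
		if (end > tlb->range_end)
			tlb->range_end = end;
	}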

Most architectures only used the ->vm_mm argument anyway, so the
conversion is straightforward and removes numerous fake vma instances
created just to pass an mm pointer.

The exceptions are ARM and TILE, both of which also look at
->vm_flags. ARM uses this to optimize TLB flushes for Harvard-style
MMUs that have independent I-TLB ops. The conversion taken here is
rather ugly (because I can't write ARM asm) and creates a fake VMA
with VM_EXEC set so that it effectively always flushes the I-TLBs and
thus loses the optimization.

TILE uses vm_flags to check for VM_EXEC in order to flush the I-cache,
and also checks VM_HUGETLB to pick the page size. Arguably it
shouldn't flush the I-cache here, and we could use things like
update_mmu_cache() to solve that. As for the HUGETLB case, we can
simply flush both page sizes at a small penalty. The current
conversion does all three: I-cache, regular TLB and HUGETLB.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 Documentation/cachetlb.txt             |    9 +++------
 arch/alpha/include/asm/tlbflush.h      |    8 +++-----
 arch/alpha/kernel/smp.c                |    4 ++--
 arch/arm/include/asm/tlb.h             |    2 +-
 arch/arm/include/asm/tlbflush.h        |    5 +++--
 arch/arm/kernel/ecard.c                |    8 ++------
 arch/arm/kernel/smp_tlb.c              |   29 +++++++++++++++++++++--------
 arch/avr32/include/asm/tlb.h           |    2 +-
 arch/avr32/include/asm/tlbflush.h      |    4 ++--
 arch/avr32/mm/tlb.c                    |    4 +---
 arch/cris/include/asm/tlbflush.h       |    4 ++--
 arch/frv/include/asm/tlbflush.h        |    4 ++--
 arch/ia64/include/asm/tlb.h            |   11 ++---------
 arch/ia64/include/asm/tlbflush.h       |    4 ++--
 arch/ia64/mm/tlb.c                     |    3 +--
 arch/m32r/include/asm/tlbflush.h       |   14 +++++++-------
 arch/m32r/kernel/smp.c                 |    6 +++---
 arch/m32r/mm/fault-nommu.c             |    2 +-
 arch/m32r/mm/fault.c                   |    5 +----
 arch/m68k/include/asm/tlbflush.h       |    7 +++----
 arch/microblaze/include/asm/tlbflush.h |    2 +-
 arch/mips/include/asm/tlbflush.h       |    8 ++++----
 arch/mips/kernel/smp.c                 |   12 +++++-------
 arch/mips/mm/tlb-r3k.c                 |    3 +--
 arch/mips/mm/tlb-r4k.c                 |    3 +--
 arch/mips/mm/tlb-r8k.c                 |    3 +--
 arch/mn10300/include/asm/tlbflush.h    |    6 +++---
 arch/parisc/include/asm/tlb.h          |    2 +-
 arch/parisc/include/asm/tlbflush.h     |    2 +-
 arch/powerpc/include/asm/tlbflush.h    |    8 ++++----
 arch/powerpc/mm/tlb_hash32.c           |    6 +++---
 arch/powerpc/mm/tlb_nohash.c           |    4 ++--
 arch/s390/include/asm/tlbflush.h       |    6 +++---
 arch/score/include/asm/tlbflush.h      |    6 +++---
 arch/score/mm/tlb-score.c              |    3 +--
 arch/sh/include/asm/tlb.h              |    2 +-
 arch/sh/include/asm/tlbflush.h         |   10 +++++-----
 arch/sh/kernel/smp.c                   |   12 +++++-------
 arch/sh/mm/nommu.c                     |    2 +-
 arch/sh/mm/tlbflush_32.c               |    3 +--
 arch/sh/mm/tlbflush_64.c               |    4 +---
 arch/sparc/include/asm/tlb_32.h        |    2 +-
 arch/sparc/include/asm/tlbflush_32.h   |   12 ++++++------
 arch/sparc/include/asm/tlbflush_64.h   |    2 +-
 arch/sparc/kernel/smp_32.c             |    8 +++-----
 arch/sparc/mm/generic_32.c             |    2 +-
 arch/sparc/mm/generic_64.c             |    2 +-
 arch/sparc/mm/hypersparc.S             |    1 -
 arch/sparc/mm/srmmu.c                  |   17 ++++++++---------
 arch/sparc/mm/sun4c.c                  |    3 +--
 arch/sparc/mm/swift.S                  |    1 -
 arch/sparc/mm/tsunami.S                |    1 -
 arch/sparc/mm/viking.S                 |    2 --
 arch/tile/include/asm/tlbflush.h       |    4 ++--
 arch/tile/kernel/tlb.c                 |   11 +++++------
 arch/um/include/asm/tlbflush.h         |    4 ++--
 arch/um/kernel/tlb.c                   |    6 +++---
 arch/unicore32/include/asm/tlb.h       |    2 +-
 arch/unicore32/include/asm/tlbflush.h  |    2 +-
 arch/x86/include/asm/tlbflush.h        |   10 +++++-----
 arch/x86/mm/pgtable.c                  |    6 +++---
 arch/xtensa/include/asm/tlb.h          |    2 +-
 arch/xtensa/include/asm/tlbflush.h     |    2 +-
 arch/xtensa/mm/tlb.c                   |    3 +--
 mm/huge_memory.c                       |    6 +++---
 mm/hugetlb.c                           |    4 ++--
 mm/mprotect.c                          |    2 +-
 mm/pgtable-generic.c                   |    8 ++++----
 68 files changed, 168 insertions(+), 199 deletions(-)

Index: linux-2.6/Documentation/cachetlb.txt
===================================================================
--- linux-2.6.orig/Documentation/cachetlb.txt
+++ linux-2.6/Documentation/cachetlb.txt
@@ -49,20 +49,17 @@ invoke one of the following flush method
 	page table operations such as what happens during
 	fork, and exec.
 
-3) void flush_tlb_range(struct vm_area_struct *vma,
+3) void flush_tlb_range(struct mm_struct *mm,
 			unsigned long start, unsigned long end)
 
 	Here we are flushing a specific range of (user) virtual
 	address translations from the TLB.  After running, this
 	interface must make sure that any previous page table
-	modifications for the address space 'vma->vm_mm' in the range
+	modifications for the address space 'mm' in the range
 	'start' to 'end-1' will be visible to the cpu.  That is, after
	running, there will be no entries in the TLB for 'mm' for
 	virtual addresses in the range 'start' to 'end-1'.
 
-	The "vma" is the backing store being used for the region.
-	Primarily, this is used for munmap() type operations.
-
 	The interface is provided in hopes that the port can find
 	a suitably efficient method for removing multiple page
 	sized translations from the TLB, instead of having the kernel
@@ -120,7 +117,7 @@ is changing an existing virtual-->physic
 
 	2) flush_cache_range(vma, start, end);
 	   change_range_of_page_tables(mm, start, end);
-	   flush_tlb_range(vma, start, end);
+	   flush_tlb_range(vma->vm_mm, start, end);
 
 	3) flush_cache_page(vma, addr, pfn);
 	   set_pte(pte_pointer, new_pte_val);
Index: linux-2.6/arch/alpha/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/tlbflush.h
+++ linux-2.6/arch/alpha/include/asm/tlbflush.h
@@ -127,10 +127,9 @@ flush_tlb_page(struct vm_area_struct *vm
 /* Flush a specified range of user mapping.  On the Alpha we flush
    the whole user tlb.  */
 static inline void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else /* CONFIG_SMP */
@@ -138,8 +137,7 @@ flush_tlb_range(struct vm_area_struct *v
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void flush_tlb_range(struct vm_area_struct *, unsigned long,
-			    unsigned long);
+extern void flush_tlb_range(struct mm_struct *, unsigned long, unsigned long);
 
 #endif /* CONFIG_SMP */
 
Index: linux-2.6/arch/alpha/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/alpha/kernel/smp.c
+++ linux-2.6/arch/alpha/kernel/smp.c
@@ -773,10 +773,10 @@ flush_tlb_page(struct vm_area_struct *vm
 EXPORT_SYMBOL(flush_tlb_page);
 
 void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
 	/* On the Alpha we always flush the whole user tlb.  */
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 EXPORT_SYMBOL(flush_tlb_range);
 
Index: linux-2.6/arch/arm/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlb.h
+++ linux-2.6/arch/arm/include/asm/tlb.h
@@ -83,7 +83,7 @@ static inline void tlb_flush(struct mmu_
 	if (tlb->fullmm || !tlb->vma)
 		flush_tlb_mm(tlb->mm);
 	else if (tlb->range_end > 0) {
-		flush_tlb_range(tlb->vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(tlb->mm, tlb->range_start, tlb->range_end);
 		tlb->range_start = TASK_SIZE;
 		tlb->range_end = 0;
 	}
Index: linux-2.6/arch/arm/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlbflush.h
+++ linux-2.6/arch/arm/include/asm/tlbflush.h
@@ -545,7 +545,8 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
+extern void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
+
 #define local_flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
 
 #ifndef CONFIG_SMP
@@ -560,7 +561,7 @@ extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr);
 extern void flush_tlb_kernel_page(unsigned long kaddr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 #endif
 
Index: linux-2.6/arch/arm/kernel/ecard.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/ecard.c
+++ linux-2.6/arch/arm/kernel/ecard.c
@@ -217,8 +217,6 @@ static DEFINE_MUTEX(ecard_mutex);
  */
 static void ecard_init_pgtables(struct mm_struct *mm)
 {
-	struct vm_area_struct vma;
-
 	/* We want to set up the page tables for the following mapping:
 	 *  Virtual	Physical
 	 *  0x03000000	0x03000000
@@ -242,10 +240,8 @@ static void ecard_init_pgtables(struct m
 
 	memcpy(dst_pgd, src_pgd, sizeof(pgd_t) * (EASI_SIZE / PGDIR_SIZE));
 
-	vma.vm_mm = mm;
-
-	flush_tlb_range(&vma, IO_START, IO_START + IO_SIZE);
-	flush_tlb_range(&vma, EASI_START, EASI_START + EASI_SIZE);
+	flush_tlb_range(mm, IO_START, IO_START + IO_SIZE);
+	flush_tlb_range(mm, EASI_START, EASI_START + EASI_SIZE);
 }
 
 static int ecard_init_mm(void)
Index: linux-2.6/arch/arm/kernel/smp_tlb.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/smp_tlb.c
+++ linux-2.6/arch/arm/kernel/smp_tlb.c
@@ -9,6 +9,7 @@
  */
 #include <linux/preempt.h>
 #include <linux/smp.h>
+#include <linux/mm.h>
 
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
@@ -31,7 +32,7 @@ static void on_each_cpu_mask(void (*func
  * TLB operations
  */
 struct tlb_args {
-	struct vm_area_struct *ta_vma;
+	struct mm_struct *ta_mm;
 	unsigned long ta_start;
 	unsigned long ta_end;
 };
@@ -51,8 +52,11 @@ static inline void ipi_flush_tlb_mm(void
 static inline void ipi_flush_tlb_page(void *arg)
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
+	struct vm_area_struct vma = {
+		.vm_mm = ta->ta_mm,
+	};
 
-	local_flush_tlb_page(ta->ta_vma, ta->ta_start);
+	local_flush_tlb_page(&vma, ta->ta_start);
 }
 
 static inline void ipi_flush_tlb_kernel_page(void *arg)
@@ -66,7 +70,7 @@ static inline void ipi_flush_tlb_range(v
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
 
-	local_flush_tlb_range(ta->ta_vma, ta->ta_start, ta->ta_end);
+	local_flush_tlb_range(ta->ta_mm, ta->ta_start, ta->ta_end);
 }
 
 static inline void ipi_flush_tlb_kernel_range(void *arg)
@@ -96,7 +100,7 @@ void flush_tlb_page(struct vm_area_struc
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = vma->vm_mm;
 		ta.ta_start = uaddr;
 		on_each_cpu_mask(ipi_flush_tlb_page, &ta, 1, mm_cpumask(vma->vm_mm));
 	} else
@@ -113,17 +117,17 @@ void flush_tlb_kernel_page(unsigned long
 		local_flush_tlb_kernel_page(kaddr);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
                      unsigned long start, unsigned long end)
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = mm;
 		ta.ta_start = start;
 		ta.ta_end = end;
-		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(vma->vm_mm));
+		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(mm));
 	} else
-		local_flush_tlb_range(vma, start, end);
+		local_flush_tlb_range(mm, start, end);
 }
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
@@ -137,3 +141,12 @@ void flush_tlb_kernel_range(unsigned lon
 		local_flush_tlb_kernel_range(start, end);
 }
 
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
+{
+	struct vm_area_struct vma = {
+		.vm_mm = mm,
+		.vm_flags = VM_EXEC,
+	};
+
+	__cpu_flush_user_tlb_range(start, end, &vma);
+}
Index: linux-2.6/arch/avr32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlb.h
+++ linux-2.6/arch/avr32/include/asm/tlb.h
@@ -12,7 +12,7 @@
 	flush_cache_range(vma, vma->vm_start, vma->vm_end)
 
 #define tlb_end_vma(tlb, vma) \
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end)
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) do { } while(0)
 
Index: linux-2.6/arch/avr32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlbflush.h
+++ linux-2.6/arch/avr32/include/asm/tlbflush.h
@@ -17,13 +17,13 @@
  *  - flush_tlb_all() flushes all processes' TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLBs
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void flush_tlb(void);
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
Index: linux-2.6/arch/avr32/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/avr32/mm/tlb.c
+++ linux-2.6/arch/avr32/mm/tlb.c
@@ -170,11 +170,9 @@ void flush_tlb_page(struct vm_area_struc
 	}
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/cris/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/cris/include/asm/tlbflush.h
+++ linux-2.6/arch/cris/include/asm/tlbflush.h
@@ -33,9 +33,9 @@ extern void flush_tlb_page(struct vm_are
 #define flush_tlb_page __flush_tlb_page
 #endif
 
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
+static inline void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 static inline void flush_tlb(void)
Index: linux-2.6/arch/frv/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/frv/include/asm/tlbflush.h
+++ linux-2.6/arch/frv/include/asm/tlbflush.h
@@ -39,10 +39,10 @@ do {						\
 	preempt_enable();			\
 } while(0)
 
-#define flush_tlb_range(vma,start,end)					\
+#define flush_tlb_range(mm,start,end)					\
 do {									\
 	preempt_disable();						\
-	__flush_tlb_range((vma)->vm_mm->context.id, start, end);	\
+	__flush_tlb_range((mm)->context.id, start, end);		\
 	preempt_enable();						\
 } while(0)
 
Index: linux-2.6/arch/ia64/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlb.h
+++ linux-2.6/arch/ia64/include/asm/tlb.h
@@ -126,17 +126,10 @@ ia64_tlb_flush_mmu (struct mmu_gather *t
 		 */
 		flush_tlb_all();
 	} else {
-		/*
-		 * XXX fix me: flush_tlb_range() should take an mm pointer instead of a
-		 * vma pointer.
-		 */
-		struct vm_area_struct vma;
-
-		vma.vm_mm = tlb->mm;
 		/* flush the address range from the tlb: */
-		flush_tlb_range(&vma, start, end);
+		flush_tlb_range(tlb->mm, start, end);
 		/* now flush the virt. page-table area mapping the address range: */
-		flush_tlb_range(&vma, ia64_thash(start), ia64_thash(end));
+		flush_tlb_range(tlb->mm, ia64_thash(start), ia64_thash(end));
 	}
 
 	/* lastly, release the freed pages */
Index: linux-2.6/arch/ia64/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlbflush.h
+++ linux-2.6/arch/ia64/include/asm/tlbflush.h
@@ -66,7 +66,7 @@ flush_tlb_mm (struct mm_struct *mm)
 #endif
 }
 
-extern void flush_tlb_range (struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range (struct mm_struct *mm, unsigned long start, unsigned long end);
 
 /*
  * Page-granular tlb flush.
@@ -75,7 +75,7 @@ static inline void
 flush_tlb_page (struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_SMP
-	flush_tlb_range(vma, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
+	flush_tlb_range(vma->vm_mm, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
 #else
 	if (vma->vm_mm == current->active_mm)
 		ia64_ptcl(addr, (PAGE_SHIFT << 2));
Index: linux-2.6/arch/ia64/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/ia64/mm/tlb.c
+++ linux-2.6/arch/ia64/mm/tlb.c
@@ -298,10 +298,9 @@ local_flush_tlb_all (void)
 }
 
 void
-flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
+flush_tlb_range (struct mm_struct *mm, unsigned long start,
 		 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size = end - start;
 	unsigned long nbits;
 
Index: linux-2.6/arch/m32r/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m32r/include/asm/tlbflush.h
+++ linux-2.6/arch/m32r/include/asm/tlbflush.h
@@ -17,7 +17,7 @@
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *);
 extern void local_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void local_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void local_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #ifndef CONFIG_SMP
@@ -25,27 +25,27 @@ extern void local_flush_tlb_range(struct
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	local_flush_tlb_all()
 #else	/* CONFIG_MMU */
 #define flush_tlb_all()			do { } while (0)
 #define flush_tlb_mm(mm)		do { } while (0)
 #define flush_tlb_page(vma, vmaddr)	do { } while (0)
-#define flush_tlb_range(vma, start, end)	do { } while (0)
+#define flush_tlb_range(mm, start, end)	do { } while (0)
 #endif	/* CONFIG_MMU */
 #else	/* CONFIG_SMP */
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *);
 extern void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void smp_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void smp_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #define flush_tlb_all()			smp_flush_tlb_all()
 #define flush_tlb_mm(mm)		smp_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	smp_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	smp_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	smp_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	smp_flush_tlb_all()
 #endif	/* CONFIG_SMP */
 
Index: linux-2.6/arch/m32r/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/m32r/kernel/smp.c
+++ linux-2.6/arch/m32r/kernel/smp.c
@@ -71,7 +71,7 @@ void smp_flush_tlb_all(void);
 static void flush_tlb_all_ipi(void *);
 
 void smp_flush_tlb_mm(struct mm_struct *);
-void smp_flush_tlb_range(struct vm_area_struct *, unsigned long, \
+void smp_flush_tlb_range(struct mm_struct *, unsigned long, \
 	unsigned long);
 void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
 static void flush_tlb_others(cpumask_t, struct mm_struct *,
@@ -299,10 +299,10 @@ void smp_flush_tlb_mm(struct mm_struct *
  * ---------- --- --------------------------------------------------------
  *
  *==========================================================================*/
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	smp_flush_tlb_mm(vma->vm_mm);
+	smp_flush_tlb_mm(mm);
 }
 
 /*==========================================================================*
Index: linux-2.6/arch/m32r/mm/fault-nommu.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault-nommu.c
+++ linux-2.6/arch/m32r/mm/fault-nommu.c
@@ -111,7 +111,7 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/m32r/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault.c
+++ linux-2.6/arch/m32r/mm/fault.c
@@ -468,12 +468,9 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm;
-
-	mm = vma->vm_mm;
 	if (mm_context(mm) != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/m68k/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m68k/include/asm/tlbflush.h
+++ linux-2.6/arch/m68k/include/asm/tlbflush.h
@@ -80,10 +80,10 @@ static inline void flush_tlb_page(struct
 	}
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -177,10 +177,9 @@ static inline void flush_tlb_page (struc
 }
 /* Flush a range of pages from TLB. */
 
-static inline void flush_tlb_range (struct vm_area_struct *vma,
+static inline void flush_tlb_range (struct mm_struct *mm,
 		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned char seg, oldctx;
 
 	start &= ~SUN3_PMEG_MASK;
Index: linux-2.6/arch/microblaze/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/microblaze/include/asm/tlbflush.h
+++ linux-2.6/arch/microblaze/include/asm/tlbflush.h
@@ -33,7 +33,7 @@ static inline void local_flush_tlb_mm(st
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
 				unsigned long vmaddr)
 	{ __tlbie(vmaddr); }
-static inline void local_flush_tlb_range(struct vm_area_struct *vma,
+static inline void local_flush_tlb_range(struct mm_struct *mm,
 		unsigned long start, unsigned long end)
 	{ __tlbia(); }
 
Index: linux-2.6/arch/mips/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/tlbflush.h
+++ linux-2.6/arch/mips/include/asm/tlbflush.h
@@ -9,12 +9,12 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -26,7 +26,7 @@ extern void local_flush_tlb_one(unsigned
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long,
 	unsigned long);
 extern void flush_tlb_kernel_range(unsigned long, unsigned long);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
@@ -36,7 +36,7 @@ extern void flush_tlb_one(unsigned long 
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end)	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr,end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/mips/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/mips/kernel/smp.c
+++ linux-2.6/arch/mips/kernel/smp.c
@@ -307,7 +307,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -316,17 +316,15 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd = {
-			.vma = vma,
+			.mm = mm,
 			.addr1 = start,
 			.addr2 = end,
 		};
@@ -341,7 +339,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (cpu_context(cpu, mm))
 				cpu_context(cpu, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/mips/mm/tlb-r3k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r3k.c
+++ linux-2.6/arch/mips/mm/tlb-r3k.c
@@ -76,10 +76,9 @@ void local_flush_tlb_mm(struct mm_struct
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r4k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r4k.c
+++ linux-2.6/arch/mips/mm/tlb-r4k.c
@@ -112,10 +112,9 @@ void local_flush_tlb_mm(struct mm_struct
 	preempt_enable();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r8k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r8k.c
+++ linux-2.6/arch/mips/mm/tlb-r8k.c
@@ -60,10 +60,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm, cpu);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 	unsigned long flags;
 	int oldpid, newpid, size;
Index: linux-2.6/arch/mn10300/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mn10300/include/asm/tlbflush.h
+++ linux-2.6/arch/mn10300/include/asm/tlbflush.h
@@ -105,10 +105,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()		flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else   /* CONFIG_SMP */
@@ -127,7 +127,7 @@ static inline void flush_tlb_mm(struct m
 	preempt_enable();
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 	preempt_disable();
Index: linux-2.6/arch/parisc/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlb.h
+++ linux-2.6/arch/parisc/include/asm/tlb.h
@@ -13,7 +13,7 @@ do {	if (!(tlb)->fullmm)	\
 
 #define tlb_end_vma(tlb, vma)	\
 do {	if (!(tlb)->fullmm)	\
-		flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
+		flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end); \
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/parisc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlbflush.h
+++ linux-2.6/arch/parisc/include/asm/tlbflush.h
@@ -76,7 +76,7 @@ static inline void flush_tlb_page(struct
 void __flush_tlb_range(unsigned long sid,
 	unsigned long start, unsigned long end);
 
-#define flush_tlb_range(vma,start,end) __flush_tlb_range((vma)->vm_mm->context,start,end)
+#define flush_tlb_range(mm,start,end) __flush_tlb_range((mm)->context,start,end)
 
 #define flush_tlb_kernel_range(start, end) __flush_tlb_range(0,start,end)
 
Index: linux-2.6/arch/powerpc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/tlbflush.h
+++ linux-2.6/arch/powerpc/include/asm/tlbflush.h
@@ -10,7 +10,7 @@
  *                           the local processor
  *  - local_flush_tlb_page(vma, vmaddr) flushes one page on the local processor
  *  - flush_tlb_page_nohash(vma, vmaddr) flushes one page if SW loaded TLB
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *
  *  This program is free software; you can redistribute it and/or
@@ -34,7 +34,7 @@ struct mm_struct;
 
 #define MMU_NO_CONTEXT      	((unsigned int)-1)
 
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
@@ -64,7 +64,7 @@ extern void __flush_tlb_page(struct mm_s
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -153,7 +153,7 @@ static inline void flush_tlb_page_nohash
 {
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 }
Index: linux-2.6/arch/powerpc/mm/tlb_hash32.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_hash32.c
+++ linux-2.6/arch/powerpc/mm/tlb_hash32.c
@@ -78,7 +78,7 @@ void tlb_flush(struct mmu_gather *tlb)
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  * since the hardware hash table functions as an extension of the
@@ -171,9 +171,9 @@ EXPORT_SYMBOL(flush_tlb_page);
  * and check _PAGE_HASHPTE bit; if it is set, find and destroy
  * the corresponding HPTE.
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	flush_range(vma->vm_mm, start, end);
+	flush_range(mm, start, end);
 }
 EXPORT_SYMBOL(flush_tlb_range);
Index: linux-2.6/arch/powerpc/mm/tlb_nohash.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_nohash.c
+++ linux-2.6/arch/powerpc/mm/tlb_nohash.c
@@ -107,7 +107,7 @@ unsigned long linear_map_top;	/* Top of 
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  *  - local_* variants of page and mm only apply to the current
@@ -288,7 +288,7 @@ EXPORT_SYMBOL(flush_tlb_kernel_range);
  * some implementation can stack multiple tlbivax before a tlbsync but
  * for now, we keep it that way
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 
 {
Index: linux-2.6/arch/s390/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/tlbflush.h
+++ linux-2.6/arch/s390/include/asm/tlbflush.h
@@ -108,7 +108,7 @@ static inline void __tlb_flush_mm_cond(s
  *  flush_tlb_all() - flushes all processes TLBs
  *  flush_tlb_mm(mm) - flushes the specified mm context TLB's
  *  flush_tlb_page(vma, vmaddr) - flushes one page
- *  flush_tlb_range(vma, start, end) - flushes a range of pages
+ *  flush_tlb_range(mm, start, end) - flushes a range of pages
  *  flush_tlb_kernel_range(start, end) - flushes a range of kernel pages
  */
 
@@ -129,10 +129,10 @@ static inline void flush_tlb_mm(struct m
 	__tlb_flush_mm_cond(mm);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	__tlb_flush_mm_cond(vma->vm_mm);
+	__tlb_flush_mm_cond(mm);
 }
 
 static inline void flush_tlb_kernel_range(unsigned long start,
Index: linux-2.6/arch/score/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/score/include/asm/tlbflush.h
+++ linux-2.6/arch/score/include/asm/tlbflush.h
@@ -14,7 +14,7 @@
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -24,8 +24,8 @@ extern void local_flush_tlb_one(unsigned
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end) \
-	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end) \
+	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr, end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/score/mm/tlb-score.c
===================================================================
--- linux-2.6.orig/arch/score/mm/tlb-score.c
+++ linux-2.6/arch/score/mm/tlb-score.c
@@ -77,10 +77,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long vma_mm_context = mm->context;
 	if (mm->context != 0) {
 		unsigned long flags;
Index: linux-2.6/arch/sh/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlb.h
+++ linux-2.6/arch/sh/include/asm/tlb.h
@@ -78,7 +78,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->end) {
-		flush_tlb_range(vma, tlb->start, tlb->end);
+		flush_tlb_range(vma->vm_mm, tlb->start, tlb->end);
 		init_tlb_gather(tlb);
 	}
 }
Index: linux-2.6/arch/sh/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlbflush.h
+++ linux-2.6/arch/sh/include/asm/tlbflush.h
@@ -7,12 +7,12 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -27,7 +27,7 @@ extern void __flush_tlb_global(void);
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
@@ -40,8 +40,8 @@ extern void flush_tlb_one(unsigned long 
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_one(asid, page)	local_flush_tlb_one(asid, page)
 
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 
 #define flush_tlb_kernel_range(start, end)	\
 	local_flush_tlb_kernel_range(start, end)
Index: linux-2.6/arch/sh/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/sh/kernel/smp.c
+++ linux-2.6/arch/sh/kernel/smp.c
@@ -390,7 +390,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -399,19 +399,17 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = (struct flush_tlb_data *)info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd;
 
-		fd.vma = vma;
+		fd.mm = mm;
 		fd.addr1 = start;
 		fd.addr2 = end;
 		smp_call_function(flush_tlb_range_ipi, (void *)&fd, 1);
@@ -421,7 +419,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (smp_processor_id() != i)
 				cpu_context(i, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/sh/mm/nommu.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/nommu.c
+++ linux-2.6/arch/sh/mm/nommu.c
@@ -46,7 +46,7 @@ void local_flush_tlb_mm(struct mm_struct
 	BUG();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/sh/mm/tlbflush_32.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_32.c
+++ linux-2.6/arch/sh/mm/tlbflush_32.c
@@ -36,10 +36,9 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != NO_CONTEXT) {
Index: linux-2.6/arch/sh/mm/tlbflush_64.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_64.c
+++ linux-2.6/arch/sh/mm/tlbflush_64.c
@@ -365,16 +365,14 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
 	unsigned long flags;
 	unsigned long long match, pteh=0, pteh_epn, pteh_low;
 	unsigned long tlb;
 	unsigned int cpu = smp_processor_id();
-	struct mm_struct *mm;
 
-	mm = vma->vm_mm;
 	if (cpu_context(cpu, mm) == NO_CONTEXT)
 		return;
 
Index: linux-2.6/arch/sparc/include/asm/tlb_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlb_32.h
+++ linux-2.6/arch/sparc/include/asm/tlb_32.h
@@ -8,7 +8,7 @@ do {								\
 
 #define tlb_end_vma(tlb, vma) \
 do {								\
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);\
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/sparc/include/asm/tlbflush_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_32.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_32.h
@@ -11,7 +11,7 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 
@@ -19,17 +19,17 @@
 
 BTFIXUPDEF_CALL(void, local_flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, local_flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, local_flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define local_flush_tlb_all() BTFIXUP_CALL(local_flush_tlb_all)()
 #define local_flush_tlb_mm(mm) BTFIXUP_CALL(local_flush_tlb_mm)(mm)
-#define local_flush_tlb_range(vma,start,end) BTFIXUP_CALL(local_flush_tlb_range)(vma,start,end)
+#define local_flush_tlb_range(mm,start,end) BTFIXUP_CALL(local_flush_tlb_range)(mm,start,end)
 #define local_flush_tlb_page(vma,addr) BTFIXUP_CALL(local_flush_tlb_page)(vma,addr)
 
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *mm);
-extern void smp_flush_tlb_range(struct vm_area_struct *vma,
+extern void smp_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void smp_flush_tlb_page(struct vm_area_struct *mm, unsigned long page);
@@ -38,12 +38,12 @@ extern void smp_flush_tlb_page(struct vm
 
 BTFIXUPDEF_CALL(void, flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define flush_tlb_all() BTFIXUP_CALL(flush_tlb_all)()
 #define flush_tlb_mm(mm) BTFIXUP_CALL(flush_tlb_mm)(mm)
-#define flush_tlb_range(vma,start,end) BTFIXUP_CALL(flush_tlb_range)(vma,start,end)
+#define flush_tlb_range(mm,start,end) BTFIXUP_CALL(flush_tlb_range)(mm,start,end)
 #define flush_tlb_page(vma,addr) BTFIXUP_CALL(flush_tlb_page)(vma,addr)
 
 // #define flush_tlb() flush_tlb_mm(current->active_mm)	/* XXX Sure? */
Index: linux-2.6/arch/sparc/include/asm/tlbflush_64.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_64.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_64.h
@@ -21,7 +21,7 @@ extern void flush_tsb_user(struct tlb_ba
 
 extern void flush_tlb_pending(void);
 
-#define flush_tlb_range(vma,start,end)	\
+#define flush_tlb_range(mm,start,end)	\
 	do { (void)(start); flush_tlb_pending(); } while (0)
 #define flush_tlb_page(vma,addr)	flush_tlb_pending()
 #define flush_tlb_mm(mm)		flush_tlb_pending()
Index: linux-2.6/arch/sparc/kernel/smp_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/smp_32.c
+++ linux-2.6/arch/sparc/kernel/smp_32.c
@@ -184,17 +184,15 @@ void smp_flush_cache_range(struct vm_are
 	}
 }
 
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		cpumask_t cpu_mask = *mm_cpumask(mm);
 		cpu_clear(smp_processor_id(), cpu_mask);
 		if (!cpus_empty(cpu_mask))
-			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) vma, start, end);
-		local_flush_tlb_range(vma, start, end);
+			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) mm, start, end);
+		local_flush_tlb_range(mm, start, end);
 	}
 }
 
Index: linux-2.6/arch/sparc/mm/generic_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_32.c
+++ linux-2.6/arch/sparc/mm/generic_32.c
@@ -92,7 +92,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/generic_64.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_64.c
+++ linux-2.6/arch/sparc/mm/generic_64.c
@@ -158,7 +158,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/hypersparc.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/hypersparc.S
+++ linux-2.6/arch/sparc/mm/hypersparc.S
@@ -284,7 +284,6 @@
 	 sta	%g5, [%g1] ASI_M_MMUREGS
 
 hypersparc_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
Index: linux-2.6/arch/sparc/mm/srmmu.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/srmmu.c
+++ linux-2.6/arch/sparc/mm/srmmu.c
@@ -679,7 +679,7 @@ extern void tsunami_flush_page_for_dma(u
 extern void tsunami_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void tsunami_flush_tlb_all(void);
 extern void tsunami_flush_tlb_mm(struct mm_struct *mm);
-extern void tsunami_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void tsunami_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void tsunami_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void tsunami_setup_blockops(void);
 
@@ -726,7 +726,7 @@ extern void swift_flush_page_for_dma(uns
 extern void swift_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void swift_flush_tlb_all(void);
 extern void swift_flush_tlb_mm(struct mm_struct *mm);
-extern void swift_flush_tlb_range(struct vm_area_struct *vma,
+extern void swift_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 extern void swift_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
@@ -964,9 +964,8 @@ static void cypress_flush_tlb_mm(struct 
 	FLUSH_END
 }
 
-static void cypress_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void cypress_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size;
 
 	FLUSH_BEGIN(mm)
@@ -1018,13 +1017,13 @@ extern void viking_flush_page(unsigned l
 extern void viking_mxcc_flush_page(unsigned long page);
 extern void viking_flush_tlb_all(void);
 extern void viking_flush_tlb_mm(struct mm_struct *mm);
-extern void viking_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void viking_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void viking_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
 extern void sun4dsmp_flush_tlb_all(void);
 extern void sun4dsmp_flush_tlb_mm(struct mm_struct *mm);
-extern void sun4dsmp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void sun4dsmp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void sun4dsmp_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
@@ -1039,7 +1038,7 @@ extern void hypersparc_flush_page_for_dm
 extern void hypersparc_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void hypersparc_flush_tlb_all(void);
 extern void hypersparc_flush_tlb_mm(struct mm_struct *mm);
-extern void hypersparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void hypersparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void hypersparc_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void hypersparc_setup_blockops(void);
 
@@ -1761,9 +1760,9 @@ static void turbosparc_flush_tlb_mm(stru
 	FLUSH_END
 }
 
-static void turbosparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void turbosparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	FLUSH_BEGIN(vma->vm_mm)
+	FLUSH_BEGIN(mm)
 	srmmu_flush_whole_tlb();
 	FLUSH_END
 }
Index: linux-2.6/arch/sparc/mm/sun4c.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/sun4c.c
+++ linux-2.6/arch/sparc/mm/sun4c.c
@@ -1419,9 +1419,8 @@ static void sun4c_flush_tlb_mm(struct mm
 	}
 }
 
-static void sun4c_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void sun4c_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int new_ctx = mm->context;
 
 	if (new_ctx != NO_CONTEXT) {
Index: linux-2.6/arch/sparc/mm/swift.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/swift.S
+++ linux-2.6/arch/sparc/mm/swift.S
@@ -219,7 +219,6 @@
 	.globl	swift_flush_tlb_range
 	.globl	swift_flush_tlb_all
 swift_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 swift_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/tsunami.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/tsunami.S
+++ linux-2.6/arch/sparc/mm/tsunami.S
@@ -46,7 +46,6 @@
 
 	/* More slick stuff... */
 tsunami_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 tsunami_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/viking.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/viking.S
+++ linux-2.6/arch/sparc/mm/viking.S
@@ -149,7 +149,6 @@
 #endif
 
 viking_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
@@ -240,7 +239,6 @@
 	tst	%g5
 	bne	3f
 	 mov	SRMMU_CTX_REG, %g1
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
 	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
Index: linux-2.6/arch/tile/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/tile/include/asm/tlbflush.h
+++ linux-2.6/arch/tile/include/asm/tlbflush.h
@@ -105,7 +105,7 @@ static inline void local_flush_tlb_all(v
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -120,7 +120,7 @@ extern void flush_tlb_mm(struct mm_struc
 extern void flush_tlb_page(const struct vm_area_struct *, unsigned long);
 extern void flush_tlb_page_mm(const struct vm_area_struct *,
 			      struct mm_struct *, unsigned long);
-extern void flush_tlb_range(const struct vm_area_struct *,
+extern void flush_tlb_range(const struct mm_struct *,
 			    unsigned long start, unsigned long end);
 
 #define flush_tlb()     flush_tlb_current_task()
Index: linux-2.6/arch/tile/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/tile/kernel/tlb.c
+++ linux-2.6/arch/tile/kernel/tlb.c
@@ -64,14 +64,13 @@ void flush_tlb_page(const struct vm_area
 }
 EXPORT_SYMBOL(flush_tlb_page);
 
-void flush_tlb_range(const struct vm_area_struct *vma,
+void flush_tlb_range(const struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	unsigned long size = hv_page_size(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
-	flush_remote(0, cache, &mm->cpu_vm_mask, start, end - start, size,
-		     &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
+		     start, end - start, PAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, 0, &mm->cpu_vm_mask,
+		     start, end - start, HPAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
 }
 
 void flush_tlb_all(void)
Index: linux-2.6/arch/um/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/um/include/asm/tlbflush.h
+++ linux-2.6/arch/um/include/asm/tlbflush.h
@@ -16,12 +16,12 @@
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
  *  - flush_tlb_kernel_vm() flushes the kernel vm area
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  */
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, 
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
 extern void flush_tlb_kernel_vm(void);
Index: linux-2.6/arch/um/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/um/kernel/tlb.c
+++ linux-2.6/arch/um/kernel/tlb.c
@@ -492,12 +492,12 @@ static void fix_range(struct mm_struct *
 	fix_range_common(mm, start_addr, end_addr, force);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	if (vma->vm_mm == NULL)
+	if (mm == NULL)
 		flush_tlb_kernel_range_common(start, end);
-	else fix_range(vma->vm_mm, start, end, 0);
+	else fix_range(mm, start, end, 0);
 }
 
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
Index: linux-2.6/arch/unicore32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlb.h
+++ linux-2.6/arch/unicore32/include/asm/tlb.h
@@ -77,7 +77,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->range_end > 0)
-		flush_tlb_range(vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(vma->vm_mm, tlb->range_start, tlb->range_end);
 }
 
 static inline void tlb_flush_mmu(struct mmu_gather *tlb)
Index: linux-2.6/arch/unicore32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlbflush.h
+++ linux-2.6/arch/unicore32/include/asm/tlbflush.h
@@ -167,7 +167,7 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma, start, end)	\
+#define local_flush_tlb_range(mm, start, end)	\
 	__cpu_flush_user_tlb_range(start, end, vma)
 #define local_flush_tlb_kernel_range(s, e)	\
 	__cpu_flush_kern_tlb_range(s, e)
Index: linux-2.6/arch/x86/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/tlbflush.h
+++ linux-2.6/arch/x86/include/asm/tlbflush.h
@@ -75,7 +75,7 @@ static inline void __flush_tlb_one(unsig
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -106,10 +106,10 @@ static inline void flush_tlb_page(struct
 		__flush_tlb_one(addr);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -136,10 +136,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()	flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 void native_flush_tlb_others(const struct cpumask *cpumask,
Index: linux-2.6/arch/x86/mm/pgtable.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pgtable.c
+++ linux-2.6/arch/x86/mm/pgtable.c
@@ -332,7 +332,7 @@ int pmdp_set_access_flags(struct vm_area
 	if (changed && dirty) {
 		*pmdp = entry;
 		pmd_update_defer(vma->vm_mm, address, pmdp);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 
 	return changed;
@@ -393,7 +393,7 @@ int pmdp_clear_flush_young(struct vm_are
 
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 
 	return young;
 }
@@ -408,7 +408,7 @@ void pmdp_splitting_flush(struct vm_area
 	if (set) {
 		pmd_update(vma->vm_mm, address, pmdp);
 		/* need tlb flush only to serialize against gup-fast */
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 }
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlb.h
+++ linux-2.6/arch/xtensa/include/asm/tlb.h
@@ -32,7 +32,7 @@
 # define tlb_end_vma(tlb, vma)						      \
 	do {								      \
 		if (!tlb->fullmm)					      \
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end);     \
+			flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);     \
 	} while(0)
 
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlbflush.h
+++ linux-2.6/arch/xtensa/include/asm/tlbflush.h
@@ -37,7 +37,7 @@
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct*);
 extern void flush_tlb_page(struct vm_area_struct*,unsigned long);
-extern void flush_tlb_range(struct vm_area_struct*,unsigned long,unsigned long);
+extern void flush_tlb_range(struct mm_struct*,unsigned long,unsigned long);
 
 #define flush_tlb_kernel_range(start,end) flush_tlb_all()
 
Index: linux-2.6/arch/xtensa/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/xtensa/mm/tlb.c
+++ linux-2.6/arch/xtensa/mm/tlb.c
@@ -82,10 +82,9 @@ void flush_tlb_mm(struct mm_struct *mm)
 # define _TLB_ENTRIES _DTLB_ENTRIES
 #endif
 
-void flush_tlb_range (struct vm_area_struct *vma,
+void flush_tlb_range (struct mm_struct *mm,
     		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long flags;
 
 	if (mm->context == NO_CONTEXT)
Index: linux-2.6/mm/huge_memory.c
===================================================================
--- linux-2.6.orig/mm/huge_memory.c
+++ linux-2.6/mm/huge_memory.c
@@ -1058,8 +1058,8 @@ int change_huge_pmd(struct vm_area_struc
 			entry = pmdp_get_and_clear(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			set_pmd_at(mm, addr, pmd, entry);
-			spin_unlock(&vma->vm_mm->page_table_lock);
-			flush_tlb_range(vma, addr, addr + HPAGE_PMD_SIZE);
+			spin_unlock(&mm->page_table_lock);
+			flush_tlb_range(mm, addr, addr + HPAGE_PMD_SIZE);
 			ret = 1;
 		}
 	} else
@@ -1313,7 +1313,7 @@ static int __split_huge_page_map(struct 
 		 * of the pmd entry with pmd_populate.
 		 */
 		set_pmd_at(mm, address, pmd, pmd_mknotpresent(*pmd));
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(mm, address, address + HPAGE_PMD_SIZE);
 		pmd_populate(mm, pmd, pgtable);
 		ret = 1;
 	}
Index: linux-2.6/mm/hugetlb.c
===================================================================
--- linux-2.6.orig/mm/hugetlb.c
+++ linux-2.6/mm/hugetlb.c
@@ -2264,7 +2264,7 @@ void __unmap_hugepage_range(struct vm_ar
 		list_add(&page->lru, &page_list);
 	}
 	spin_unlock(&mm->page_table_lock);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 	mmu_notifier_invalidate_range_end(mm, start, end);
 	list_for_each_entry_safe(page, tmp, &page_list, lru) {
 		page_remove_rmap(page);
@@ -2829,7 +2829,7 @@ void hugetlb_change_protection(struct vm
 	spin_unlock(&mm->page_table_lock);
 	mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
 
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int hugetlb_reserve_pages(struct inode *inode,
Index: linux-2.6/mm/mprotect.c
===================================================================
--- linux-2.6.orig/mm/mprotect.c
+++ linux-2.6/mm/mprotect.c
@@ -138,7 +138,7 @@ static void change_protection(struct vm_
 		change_pud_range(vma, pgd, addr, next, newprot,
 				 dirty_accountable);
 	} while (pgd++, addr = next, addr != end);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int
Index: linux-2.6/mm/pgtable-generic.c
===================================================================
--- linux-2.6.orig/mm/pgtable-generic.c
+++ linux-2.6/mm/pgtable-generic.c
@@ -43,7 +43,7 @@ int pmdp_set_access_flags(struct vm_area
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	if (changed) {
 		set_pmd_at(vma->vm_mm, address, pmdp, entry);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 	return changed;
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -76,7 +76,7 @@ int pmdp_clear_flush_young(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return young;
 }
 #endif
@@ -100,7 +100,7 @@ pmd_t pmdp_clear_flush(struct vm_area_st
 	pmd_t pmd;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	pmd = pmdp_get_and_clear(vma->vm_mm, address, pmdp);
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return pmd;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -115,7 +115,7 @@ pmd_t pmdp_splitting_flush(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
 	/* tlb flush only to serialize against gup-fast */
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif



* [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
@ 2011-03-02 17:59   ` Peter Zijlstra
  0 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: fixup-flush_tlb_range.patch --]
[-- Type: text/plain, Size: 69411 bytes --]

In order to properly support architectures that want/need to do TLB
range invalidation, we need to change the flush_tlb_range() argument
from a vm_area_struct to an mm_struct, because the range might very
well extend past one VMA, or not have a VMA at all.

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmap_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things

There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast().

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables, and concurrent
hardware walkers could re-populate these caches if the final TLB
flush were to come from tlb_end_vma(), since a concurrent walk could
still be in progress.

Therefore we need to track the range over all VMAs and over the
freeing of the page-tables themselves. This means we cannot use a VMA
argument to flush the TLB range.
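
To illustrate, a minimal before/after sketch of the interface change
(declarations and a typical caller, as they appear in the hunks
below):

	/* before: */
	extern void flush_tlb_range(struct vm_area_struct *vma,
				    unsigned long start, unsigned long end);

	/* after: */
	extern void flush_tlb_range(struct mm_struct *mm,
				    unsigned long start, unsigned long end);

	/* and the typical caller conversion: */
	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);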

Most architectures only used the ->vm_mm argument anyway, so the
conversion is straightforward and removes numerous fake vma instances
created just to pass an mm pointer.
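
For example, the ia64 mmu_gather code currently builds such a fake
vma, which the arch/ia64/include/asm/tlb.h hunk below reduces from:

	struct vm_area_struct vma;

	vma.vm_mm = tlb->mm;
	flush_tlb_range(&vma, start, end);

to simply:

	flush_tlb_range(tlb->mm, start, end);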

The exceptions are ARM and TILE, both of which also look at
->vm_flags. ARM uses this to optimize TLB flushes for Harvard-style
MMUs that have independent I-TLB ops. The conversion taken here is
rather ugly (because I can't write ARM asm) and creates a fake VMA
with VM_EXEC set, so that it effectively always flushes the I-TLBs
and thus loses the optimization.
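
Concretely, the new ARM local_flush_tlb_range() (see the
arch/arm/kernel/smp_tlb.c hunk below) wraps the existing asm entry
point with an on-stack vma:

	void local_flush_tlb_range(struct mm_struct *mm,
				   unsigned long start, unsigned long end)
	{
		struct vm_area_struct vma = {
			.vm_mm = mm,
			.vm_flags = VM_EXEC,	/* always flush the I-TLB */
		};

		__cpu_flush_user_tlb_range(start, end, &vma);
	}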

TILE uses vm_flags to check for VM_EXEC in order to flush the
I-cache, but also checks VM_HUGETLB. Arguably it shouldn't flush the
I-cache here, and we could use things like update_mmu_cache() to
solve this. As for the HUGETLB case, we can simply flush both page
sizes at a small penalty. The current conversion does all three:
I-cache, TLB and HUGETLB.
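
The resulting TILE flush_tlb_range() (see the arch/tile/kernel/tlb.c
hunk below) therefore issues one flush per page size and always
evicts the L1 I-cache:

	flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
		     start, end - start, PAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
	flush_remote(0, 0, &mm->cpu_vm_mask,
		     start, end - start, HPAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);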

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 Documentation/cachetlb.txt             |    9 +++------
 arch/alpha/include/asm/tlbflush.h      |    8 +++-----
 arch/alpha/kernel/smp.c                |    4 ++--
 arch/arm/include/asm/tlb.h             |    2 +-
 arch/arm/include/asm/tlbflush.h        |    5 +++--
 arch/arm/kernel/ecard.c                |    8 ++------
 arch/arm/kernel/smp_tlb.c              |   29 +++++++++++++++++++++--------
 arch/avr32/include/asm/tlb.h           |    2 +-
 arch/avr32/include/asm/tlbflush.h      |    4 ++--
 arch/avr32/mm/tlb.c                    |    4 +---
 arch/cris/include/asm/tlbflush.h       |    4 ++--
 arch/frv/include/asm/tlbflush.h        |    4 ++--
 arch/ia64/include/asm/tlb.h            |   11 ++---------
 arch/ia64/include/asm/tlbflush.h       |    4 ++--
 arch/ia64/mm/tlb.c                     |    3 +--
 arch/m32r/include/asm/tlbflush.h       |   14 +++++++-------
 arch/m32r/kernel/smp.c                 |    6 +++---
 arch/m32r/mm/fault-nommu.c             |    2 +-
 arch/m32r/mm/fault.c                   |    5 +----
 arch/m68k/include/asm/tlbflush.h       |    7 +++----
 arch/microblaze/include/asm/tlbflush.h |    2 +-
 arch/mips/include/asm/tlbflush.h       |    8 ++++----
 arch/mips/kernel/smp.c                 |   12 +++++-------
 arch/mips/mm/tlb-r3k.c                 |    3 +--
 arch/mips/mm/tlb-r4k.c                 |    3 +--
 arch/mips/mm/tlb-r8k.c                 |    3 +--
 arch/mn10300/include/asm/tlbflush.h    |    6 +++---
 arch/parisc/include/asm/tlb.h          |    2 +-
 arch/parisc/include/asm/tlbflush.h     |    2 +-
 arch/powerpc/include/asm/tlbflush.h    |    8 ++++----
 arch/powerpc/mm/tlb_hash32.c           |    6 +++---
 arch/powerpc/mm/tlb_nohash.c           |    4 ++--
 arch/s390/include/asm/tlbflush.h       |    6 +++---
 arch/score/include/asm/tlbflush.h      |    6 +++---
 arch/score/mm/tlb-score.c              |    3 +--
 arch/sh/include/asm/tlb.h              |    2 +-
 arch/sh/include/asm/tlbflush.h         |   10 +++++-----
 arch/sh/kernel/smp.c                   |   12 +++++-------
 arch/sh/mm/nommu.c                     |    2 +-
 arch/sh/mm/tlbflush_32.c               |    3 +--
 arch/sh/mm/tlbflush_64.c               |    4 +---
 arch/sparc/include/asm/tlb_32.h        |    2 +-
 arch/sparc/include/asm/tlbflush_32.h   |   12 ++++++------
 arch/sparc/include/asm/tlbflush_64.h   |    2 +-
 arch/sparc/kernel/smp_32.c             |    8 +++-----
 arch/sparc/mm/generic_32.c             |    2 +-
 arch/sparc/mm/generic_64.c             |    2 +-
 arch/sparc/mm/hypersparc.S             |    1 -
 arch/sparc/mm/srmmu.c                  |   17 ++++++++---------
 arch/sparc/mm/sun4c.c                  |    3 +--
 arch/sparc/mm/swift.S                  |    1 -
 arch/sparc/mm/tsunami.S                |    1 -
 arch/sparc/mm/viking.S                 |    2 --
 arch/tile/include/asm/tlbflush.h       |    4 ++--
 arch/tile/kernel/tlb.c                 |   11 +++++------
 arch/um/include/asm/tlbflush.h         |    4 ++--
 arch/um/kernel/tlb.c                   |    6 +++---
 arch/unicore32/include/asm/tlb.h       |    2 +-
 arch/unicore32/include/asm/tlbflush.h  |    2 +-
 arch/x86/include/asm/tlbflush.h        |   10 +++++-----
 arch/x86/mm/pgtable.c                  |    6 +++---
 arch/xtensa/include/asm/tlb.h          |    2 +-
 arch/xtensa/include/asm/tlbflush.h     |    2 +-
 arch/xtensa/mm/tlb.c                   |    3 +--
 mm/huge_memory.c                       |    6 +++---
 mm/hugetlb.c                           |    4 ++--
 mm/mprotect.c                          |    2 +-
 mm/pgtable-generic.c                   |    8 ++++----
 68 files changed, 168 insertions(+), 199 deletions(-)

Index: linux-2.6/Documentation/cachetlb.txt
===================================================================
--- linux-2.6.orig/Documentation/cachetlb.txt
+++ linux-2.6/Documentation/cachetlb.txt
@@ -49,20 +49,17 @@ invoke one of the following flush method
 	page table operations such as what happens during
 	fork, and exec.
 
-3) void flush_tlb_range(struct vm_area_struct *vma,
+3) void flush_tlb_range(struct mm_struct *mm,
 			unsigned long start, unsigned long end)
 
 	Here we are flushing a specific range of (user) virtual
 	address translations from the TLB.  After running, this
 	interface must make sure that any previous page table
-	modifications for the address space 'vma->vm_mm' in the range
+	modifications for the address space 'mm' in the range
 	'start' to 'end-1' will be visible to the cpu.  That is, after
 	running, here will be no entries in the TLB for 'mm' for
 	virtual addresses in the range 'start' to 'end-1'.
 
-	The "vma" is the backing store being used for the region.
-	Primarily, this is used for munmap() type operations.
-
 	The interface is provided in hopes that the port can find
 	a suitably efficient method for removing multiple page
 	sized translations from the TLB, instead of having the kernel
@@ -120,7 +117,7 @@ is changing an existing virtual-->physic
 
 	2) flush_cache_range(vma, start, end);
 	   change_range_of_page_tables(mm, start, end);
-	   flush_tlb_range(vma, start, end);
+	   flush_tlb_range(vma->vm_mm, start, end);
 
 	3) flush_cache_page(vma, addr, pfn);
 	   set_pte(pte_pointer, new_pte_val);
Index: linux-2.6/arch/alpha/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/tlbflush.h
+++ linux-2.6/arch/alpha/include/asm/tlbflush.h
@@ -127,10 +127,9 @@ flush_tlb_page(struct vm_area_struct *vm
 /* Flush a specified range of user mapping.  On the Alpha we flush
    the whole user tlb.  */
 static inline void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-		unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else /* CONFIG_SMP */
@@ -138,8 +137,7 @@ flush_tlb_range(struct vm_area_struct *v
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void flush_tlb_range(struct vm_area_struct *, unsigned long,
-			    unsigned long);
+extern void flush_tlb_range(struct mm_struct *, unsigned long, unsigned long);
 
 #endif /* CONFIG_SMP */
 
Index: linux-2.6/arch/alpha/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/alpha/kernel/smp.c
+++ linux-2.6/arch/alpha/kernel/smp.c
@@ -773,10 +773,10 @@ flush_tlb_page(struct vm_area_struct *vm
 EXPORT_SYMBOL(flush_tlb_page);
 
 void
-flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
 	/* On the Alpha we always flush the whole user tlb.  */
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 EXPORT_SYMBOL(flush_tlb_range);
 
Index: linux-2.6/arch/arm/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlb.h
+++ linux-2.6/arch/arm/include/asm/tlb.h
@@ -83,7 +83,7 @@ static inline void tlb_flush(struct mmu_
 	if (tlb->fullmm || !tlb->vma)
 		flush_tlb_mm(tlb->mm);
 	else if (tlb->range_end > 0) {
-		flush_tlb_range(tlb->vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(tlb->mm, tlb->range_start, tlb->range_end);
 		tlb->range_start = TASK_SIZE;
 		tlb->range_end = 0;
 	}
Index: linux-2.6/arch/arm/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlbflush.h
+++ linux-2.6/arch/arm/include/asm/tlbflush.h
@@ -545,7 +545,8 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
+extern void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
+
 #define local_flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
 
 #ifndef CONFIG_SMP
@@ -560,7 +561,7 @@ extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr);
 extern void flush_tlb_kernel_page(unsigned long kaddr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 #endif
 
Index: linux-2.6/arch/arm/kernel/ecard.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/ecard.c
+++ linux-2.6/arch/arm/kernel/ecard.c
@@ -217,8 +217,6 @@ static DEFINE_MUTEX(ecard_mutex);
  */
 static void ecard_init_pgtables(struct mm_struct *mm)
 {
-	struct vm_area_struct vma;
-
 	/* We want to set up the page tables for the following mapping:
 	 *  Virtual	Physical
 	 *  0x03000000	0x03000000
@@ -242,10 +240,8 @@ static void ecard_init_pgtables(struct m
 
 	memcpy(dst_pgd, src_pgd, sizeof(pgd_t) * (EASI_SIZE / PGDIR_SIZE));
 
-	vma.vm_mm = mm;
-
-	flush_tlb_range(&vma, IO_START, IO_START + IO_SIZE);
-	flush_tlb_range(&vma, EASI_START, EASI_START + EASI_SIZE);
+	flush_tlb_range(mm, IO_START, IO_START + IO_SIZE);
+	flush_tlb_range(mm, EASI_START, EASI_START + EASI_SIZE);
 }
 
 static int ecard_init_mm(void)
Index: linux-2.6/arch/arm/kernel/smp_tlb.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/smp_tlb.c
+++ linux-2.6/arch/arm/kernel/smp_tlb.c
@@ -9,6 +9,7 @@
  */
 #include <linux/preempt.h>
 #include <linux/smp.h>
+#include <linux/mm.h>
 
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
@@ -31,7 +32,7 @@ static void on_each_cpu_mask(void (*func
  * TLB operations
  */
 struct tlb_args {
-	struct vm_area_struct *ta_vma;
+	struct mm_struct *ta_mm;
 	unsigned long ta_start;
 	unsigned long ta_end;
 };
@@ -51,8 +52,11 @@ static inline void ipi_flush_tlb_mm(void
 static inline void ipi_flush_tlb_page(void *arg)
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
+	struct vm_area_struct vma = {
+		.vm_mm = ta->ta_mm,
+	};
 
-	local_flush_tlb_page(ta->ta_vma, ta->ta_start);
+	local_flush_tlb_page(&vma, ta->ta_start);
 }
 
 static inline void ipi_flush_tlb_kernel_page(void *arg)
@@ -66,7 +70,7 @@ static inline void ipi_flush_tlb_range(v
 {
 	struct tlb_args *ta = (struct tlb_args *)arg;
 
-	local_flush_tlb_range(ta->ta_vma, ta->ta_start, ta->ta_end);
+	local_flush_tlb_range(ta->ta_mm, ta->ta_start, ta->ta_end);
 }
 
 static inline void ipi_flush_tlb_kernel_range(void *arg)
@@ -96,7 +100,7 @@ void flush_tlb_page(struct vm_area_struc
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = vma->vm_mm;
 		ta.ta_start = uaddr;
 		on_each_cpu_mask(ipi_flush_tlb_page, &ta, 1, mm_cpumask(vma->vm_mm));
 	} else
@@ -113,17 +117,17 @@ void flush_tlb_kernel_page(unsigned long
 		local_flush_tlb_kernel_page(kaddr);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
                      unsigned long start, unsigned long end)
 {
 	if (tlb_ops_need_broadcast()) {
 		struct tlb_args ta;
-		ta.ta_vma = vma;
+		ta.ta_mm = mm;
 		ta.ta_start = start;
 		ta.ta_end = end;
-		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(vma->vm_mm));
+		on_each_cpu_mask(ipi_flush_tlb_range, &ta, 1, mm_cpumask(mm));
 	} else
-		local_flush_tlb_range(vma, start, end);
+		local_flush_tlb_range(mm, start, end);
 }
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
@@ -137,3 +141,12 @@ void flush_tlb_kernel_range(unsigned lon
 		local_flush_tlb_kernel_range(start, end);
 }
 
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
+{
+	struct vm_area_struct vma = {
+		.vm_mm = mm,
+		.vm_flags = VM_EXEC,
+	};
+
+	__cpu_flush_user_tlb_range(start, end, &vma);
+}
Index: linux-2.6/arch/avr32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlb.h
+++ linux-2.6/arch/avr32/include/asm/tlb.h
@@ -12,7 +12,7 @@
 	flush_cache_range(vma, vma->vm_start, vma->vm_end)
 
 #define tlb_end_vma(tlb, vma) \
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end)
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) do { } while(0)
 
Index: linux-2.6/arch/avr32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/tlbflush.h
+++ linux-2.6/arch/avr32/include/asm/tlbflush.h
@@ -17,13 +17,13 @@
  *  - flush_tlb_all() flushes all processes' TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLBs
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void flush_tlb(void);
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
Index: linux-2.6/arch/avr32/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/avr32/mm/tlb.c
+++ linux-2.6/arch/avr32/mm/tlb.c
@@ -170,11 +170,9 @@ void flush_tlb_page(struct vm_area_struc
 	}
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/cris/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/cris/include/asm/tlbflush.h
+++ linux-2.6/arch/cris/include/asm/tlbflush.h
@@ -33,9 +33,9 @@ extern void flush_tlb_page(struct vm_are
 #define flush_tlb_page __flush_tlb_page
 #endif
 
-static inline void flush_tlb_range(struct vm_area_struct * vma, unsigned long start, unsigned long end)
+static inline void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 static inline void flush_tlb(void)
Index: linux-2.6/arch/frv/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/frv/include/asm/tlbflush.h
+++ linux-2.6/arch/frv/include/asm/tlbflush.h
@@ -39,10 +39,10 @@ do {						\
 	preempt_enable();			\
 } while(0)
 
-#define flush_tlb_range(vma,start,end)					\
+#define flush_tlb_range(mm,start,end)					\
 do {									\
 	preempt_disable();						\
-	__flush_tlb_range((vma)->vm_mm->context.id, start, end);	\
+	__flush_tlb_range((mm)->context.id, start, end);		\
 	preempt_enable();						\
 } while(0)
 
Index: linux-2.6/arch/ia64/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlb.h
+++ linux-2.6/arch/ia64/include/asm/tlb.h
@@ -126,17 +126,10 @@ ia64_tlb_flush_mmu (struct mmu_gather *t
 		 */
 		flush_tlb_all();
 	} else {
-		/*
-		 * XXX fix me: flush_tlb_range() should take an mm pointer instead of a
-		 * vma pointer.
-		 */
-		struct vm_area_struct vma;
-
-		vma.vm_mm = tlb->mm;
 		/* flush the address range from the tlb: */
-		flush_tlb_range(&vma, start, end);
+		flush_tlb_range(tlb->mm, start, end);
 		/* now flush the virt. page-table area mapping the address range: */
-		flush_tlb_range(&vma, ia64_thash(start), ia64_thash(end));
+		flush_tlb_range(tlb->mm, ia64_thash(start), ia64_thash(end));
 	}
 
 	/* lastly, release the freed pages */
Index: linux-2.6/arch/ia64/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlbflush.h
+++ linux-2.6/arch/ia64/include/asm/tlbflush.h
@@ -66,7 +66,7 @@ flush_tlb_mm (struct mm_struct *mm)
 #endif
 }
 
-extern void flush_tlb_range (struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_tlb_range (struct mm_struct *mm, unsigned long start, unsigned long end);
 
 /*
  * Page-granular tlb flush.
@@ -75,7 +75,7 @@ static inline void
 flush_tlb_page (struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_SMP
-	flush_tlb_range(vma, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
+	flush_tlb_range(vma->vm_mm, (addr & PAGE_MASK), (addr & PAGE_MASK) + PAGE_SIZE);
 #else
 	if (vma->vm_mm == current->active_mm)
 		ia64_ptcl(addr, (PAGE_SHIFT << 2));
Index: linux-2.6/arch/ia64/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/ia64/mm/tlb.c
+++ linux-2.6/arch/ia64/mm/tlb.c
@@ -298,10 +298,9 @@ local_flush_tlb_all (void)
 }
 
 void
-flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
+flush_tlb_range (struct mm_struct *mm, unsigned long start,
 		 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size = end - start;
 	unsigned long nbits;
 
Index: linux-2.6/arch/m32r/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m32r/include/asm/tlbflush.h
+++ linux-2.6/arch/m32r/include/asm/tlbflush.h
@@ -17,7 +17,7 @@
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *);
 extern void local_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void local_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void local_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #ifndef CONFIG_SMP
@@ -25,27 +25,27 @@ extern void local_flush_tlb_range(struct
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	local_flush_tlb_all()
 #else	/* CONFIG_MMU */
 #define flush_tlb_all()			do { } while (0)
 #define flush_tlb_mm(mm)		do { } while (0)
 #define flush_tlb_page(vma, vmaddr)	do { } while (0)
-#define flush_tlb_range(vma, start, end)	do { } while (0)
+#define flush_tlb_range(mm, start, end)	do { } while (0)
 #endif	/* CONFIG_MMU */
 #else	/* CONFIG_SMP */
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *);
 extern void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
-extern void smp_flush_tlb_range(struct vm_area_struct *, unsigned long,
+extern void smp_flush_tlb_range(struct mm_struct *, unsigned long,
 	unsigned long);
 
 #define flush_tlb_all()			smp_flush_tlb_all()
 #define flush_tlb_mm(mm)		smp_flush_tlb_mm(mm)
 #define flush_tlb_page(vma, page)	smp_flush_tlb_page(vma, page)
-#define flush_tlb_range(vma, start, end)	\
-	smp_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	smp_flush_tlb_range(mm, start, end)
 #define flush_tlb_kernel_range(start, end)	smp_flush_tlb_all()
 #endif	/* CONFIG_SMP */
 
Index: linux-2.6/arch/m32r/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/m32r/kernel/smp.c
+++ linux-2.6/arch/m32r/kernel/smp.c
@@ -71,7 +71,7 @@ void smp_flush_tlb_all(void);
 static void flush_tlb_all_ipi(void *);
 
 void smp_flush_tlb_mm(struct mm_struct *);
-void smp_flush_tlb_range(struct vm_area_struct *, unsigned long, \
+void smp_flush_tlb_range(struct mm_struct *, unsigned long, \
 	unsigned long);
 void smp_flush_tlb_page(struct vm_area_struct *, unsigned long);
 static void flush_tlb_others(cpumask_t, struct mm_struct *,
@@ -299,10 +299,10 @@ void smp_flush_tlb_mm(struct mm_struct *
  * ---------- --- --------------------------------------------------------
  *
  *==========================================================================*/
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	smp_flush_tlb_mm(vma->vm_mm);
+	smp_flush_tlb_mm(mm);
 }
 
 /*==========================================================================*
Index: linux-2.6/arch/m32r/mm/fault-nommu.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault-nommu.c
+++ linux-2.6/arch/m32r/mm/fault-nommu.c
@@ -111,7 +111,7 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/m32r/mm/fault.c
===================================================================
--- linux-2.6.orig/arch/m32r/mm/fault.c
+++ linux-2.6/arch/m32r/mm/fault.c
@@ -468,12 +468,9 @@ void local_flush_tlb_page(struct vm_area
 /*======================================================================*
  * flush_tlb_range() : flushes a range of pages
  *======================================================================*/
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm;
-
-	mm = vma->vm_mm;
 	if (mm_context(mm) != NO_CONTEXT) {
 		unsigned long flags;
 		int size;
Index: linux-2.6/arch/m68k/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/m68k/include/asm/tlbflush.h
+++ linux-2.6/arch/m68k/include/asm/tlbflush.h
@@ -80,10 +80,10 @@ static inline void flush_tlb_page(struct
 	}
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -177,10 +177,9 @@ static inline void flush_tlb_page (struc
 }
 /* Flush a range of pages from TLB. */
 
-static inline void flush_tlb_range (struct vm_area_struct *vma,
+static inline void flush_tlb_range (struct mm_struct *mm,
 		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned char seg, oldctx;
 
 	start &= ~SUN3_PMEG_MASK;
Index: linux-2.6/arch/microblaze/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/microblaze/include/asm/tlbflush.h
+++ linux-2.6/arch/microblaze/include/asm/tlbflush.h
@@ -33,7 +33,7 @@ static inline void local_flush_tlb_mm(st
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
 				unsigned long vmaddr)
 	{ __tlbie(vmaddr); }
-static inline void local_flush_tlb_range(struct vm_area_struct *vma,
+static inline void local_flush_tlb_range(struct mm_struct *mm,
 		unsigned long start, unsigned long end)
 	{ __tlbia(); }
 
Index: linux-2.6/arch/mips/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/tlbflush.h
+++ linux-2.6/arch/mips/include/asm/tlbflush.h
@@ -9,12 +9,12 @@
  *  - flush_tlb_all() flushes all processes TLB entries
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB entries
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -26,7 +26,7 @@ extern void local_flush_tlb_one(unsigned
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long,
 	unsigned long);
 extern void flush_tlb_kernel_range(unsigned long, unsigned long);
 extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
@@ -36,7 +36,7 @@ extern void flush_tlb_one(unsigned long 
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end)	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end)	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr,end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/mips/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/mips/kernel/smp.c
+++ linux-2.6/arch/mips/kernel/smp.c
@@ -307,7 +307,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -316,17 +316,15 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+void flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd = {
-			.vma = vma,
+			.mm = mm,
 			.addr1 = start,
 			.addr2 = end,
 		};
@@ -341,7 +339,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (cpu_context(cpu, mm))
 				cpu_context(cpu, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/mips/mm/tlb-r3k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r3k.c
+++ linux-2.6/arch/mips/mm/tlb-r3k.c
@@ -76,10 +76,9 @@ void local_flush_tlb_mm(struct mm_struct
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r4k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r4k.c
+++ linux-2.6/arch/mips/mm/tlb-r4k.c
@@ -112,10 +112,9 @@ void local_flush_tlb_mm(struct mm_struct
 	preempt_enable();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != 0) {
Index: linux-2.6/arch/mips/mm/tlb-r8k.c
===================================================================
--- linux-2.6.orig/arch/mips/mm/tlb-r8k.c
+++ linux-2.6/arch/mips/mm/tlb-r8k.c
@@ -60,10 +60,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm, cpu);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int cpu = smp_processor_id();
 	unsigned long flags;
 	int oldpid, newpid, size;
Index: linux-2.6/arch/mn10300/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/mn10300/include/asm/tlbflush.h
+++ linux-2.6/arch/mn10300/include/asm/tlbflush.h
@@ -105,10 +105,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()		flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 #else   /* CONFIG_SMP */
@@ -127,7 +127,7 @@ static inline void flush_tlb_mm(struct m
 	preempt_enable();
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 	preempt_disable();
Index: linux-2.6/arch/parisc/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlb.h
+++ linux-2.6/arch/parisc/include/asm/tlb.h
@@ -13,7 +13,7 @@ do {	if (!(tlb)->fullmm)	\
 
 #define tlb_end_vma(tlb, vma)	\
 do {	if (!(tlb)->fullmm)	\
-		flush_tlb_range(vma, vma->vm_start, vma->vm_end); \
+		flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end); \
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/parisc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/tlbflush.h
+++ linux-2.6/arch/parisc/include/asm/tlbflush.h
@@ -76,7 +76,7 @@ static inline void flush_tlb_page(struct
 void __flush_tlb_range(unsigned long sid,
 	unsigned long start, unsigned long end);
 
-#define flush_tlb_range(vma,start,end) __flush_tlb_range((vma)->vm_mm->context,start,end)
+#define flush_tlb_range(mm,start,end) __flush_tlb_range((mm)->context,start,end)
 
 #define flush_tlb_kernel_range(start, end) __flush_tlb_range(0,start,end)
 
Index: linux-2.6/arch/powerpc/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/tlbflush.h
+++ linux-2.6/arch/powerpc/include/asm/tlbflush.h
@@ -10,7 +10,7 @@
  *                           the local processor
  *  - local_flush_tlb_page(vma, vmaddr) flushes one page on the local processor
  *  - flush_tlb_page_nohash(vma, vmaddr) flushes one page if SW loaded TLB
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *
  *  This program is free software; you can redistribute it and/or
@@ -34,7 +34,7 @@ struct mm_struct;
 
 #define MMU_NO_CONTEXT      	((unsigned int)-1)
 
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
@@ -64,7 +64,7 @@ extern void __flush_tlb_page(struct mm_s
 extern void flush_tlb_mm(struct mm_struct *mm);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr);
 extern void flush_tlb_page_nohash(struct vm_area_struct *vma, unsigned long addr);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 static inline void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -153,7 +153,7 @@ static inline void flush_tlb_page_nohash
 {
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
 }
Index: linux-2.6/arch/powerpc/mm/tlb_hash32.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_hash32.c
+++ linux-2.6/arch/powerpc/mm/tlb_hash32.c
@@ -78,7 +78,7 @@ void tlb_flush(struct mmu_gather *tlb)
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  * since the hardware hash table functions as an extension of the
@@ -171,9 +171,9 @@ EXPORT_SYMBOL(flush_tlb_page);
  * and check _PAGE_HASHPTE bit; if it is set, find and destroy
  * the corresponding HPTE.
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	flush_range(vma->vm_mm, start, end);
+	flush_range(mm, start, end);
 }
 EXPORT_SYMBOL(flush_tlb_range);
Index: linux-2.6/arch/powerpc/mm/tlb_nohash.c
===================================================================
--- linux-2.6.orig/arch/powerpc/mm/tlb_nohash.c
+++ linux-2.6/arch/powerpc/mm/tlb_nohash.c
@@ -107,7 +107,7 @@ unsigned long linear_map_top;	/* Top of 
  *
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes kernel pages
  *
  *  - local_* variants of page and mm only apply to the current
@@ -288,7 +288,7 @@ EXPORT_SYMBOL(flush_tlb_kernel_range);
  * some implementation can stack multiple tlbivax before a tlbsync but
  * for now, we keep it that way
  */
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 
 {
Index: linux-2.6/arch/s390/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/tlbflush.h
+++ linux-2.6/arch/s390/include/asm/tlbflush.h
@@ -108,7 +108,7 @@ static inline void __tlb_flush_mm_cond(s
  *  flush_tlb_all() - flushes all processes TLBs
  *  flush_tlb_mm(mm) - flushes the specified mm context TLB's
  *  flush_tlb_page(vma, vmaddr) - flushes one page
- *  flush_tlb_range(vma, start, end) - flushes a range of pages
+ *  flush_tlb_range(mm, start, end) - flushes a range of pages
  *  flush_tlb_kernel_range(start, end) - flushes a range of kernel pages
  */
 
@@ -129,10 +129,10 @@ static inline void flush_tlb_mm(struct m
 	__tlb_flush_mm_cond(mm);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	__tlb_flush_mm_cond(vma->vm_mm);
+	__tlb_flush_mm_cond(mm);
 }
 
 static inline void flush_tlb_kernel_range(unsigned long start,
Index: linux-2.6/arch/score/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/score/include/asm/tlbflush.h
+++ linux-2.6/arch/score/include/asm/tlbflush.h
@@ -14,7 +14,7 @@
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 	unsigned long start, unsigned long end);
 extern void local_flush_tlb_kernel_range(unsigned long start,
 	unsigned long end);
@@ -24,8 +24,8 @@ extern void local_flush_tlb_one(unsigned
 
 #define flush_tlb_all()			local_flush_tlb_all()
 #define flush_tlb_mm(mm)		local_flush_tlb_mm(mm)
-#define flush_tlb_range(vma, vmaddr, end) \
-	local_flush_tlb_range(vma, vmaddr, end)
+#define flush_tlb_range(mm, vmaddr, end) \
+	local_flush_tlb_range(mm, vmaddr, end)
 #define flush_tlb_kernel_range(vmaddr, end) \
 	local_flush_tlb_kernel_range(vmaddr, end)
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
Index: linux-2.6/arch/score/mm/tlb-score.c
===================================================================
--- linux-2.6.orig/arch/score/mm/tlb-score.c
+++ linux-2.6/arch/score/mm/tlb-score.c
@@ -77,10 +77,9 @@ void local_flush_tlb_mm(struct mm_struct
 		drop_mmu_context(mm);
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 	unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long vma_mm_context = mm->context;
 	if (mm->context != 0) {
 		unsigned long flags;
Index: linux-2.6/arch/sh/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlb.h
+++ linux-2.6/arch/sh/include/asm/tlb.h
@@ -78,7 +78,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->end) {
-		flush_tlb_range(vma, tlb->start, tlb->end);
+		flush_tlb_range(vma->vm_mm, tlb->start, tlb->end);
 		init_tlb_gather(tlb);
 	}
 }
Index: linux-2.6/arch/sh/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlbflush.h
+++ linux-2.6/arch/sh/include/asm/tlbflush.h
@@ -7,12 +7,12 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 extern void local_flush_tlb_all(void);
 extern void local_flush_tlb_mm(struct mm_struct *mm);
-extern void local_flush_tlb_range(struct vm_area_struct *vma,
+extern void local_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void local_flush_tlb_page(struct vm_area_struct *vma,
@@ -27,7 +27,7 @@ extern void __flush_tlb_global(void);
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
@@ -40,8 +40,8 @@ extern void flush_tlb_one(unsigned long 
 #define flush_tlb_page(vma, page)	local_flush_tlb_page(vma, page)
 #define flush_tlb_one(asid, page)	local_flush_tlb_one(asid, page)
 
-#define flush_tlb_range(vma, start, end)	\
-	local_flush_tlb_range(vma, start, end)
+#define flush_tlb_range(mm, start, end)	\
+	local_flush_tlb_range(mm, start, end)
 
 #define flush_tlb_kernel_range(start, end)	\
 	local_flush_tlb_kernel_range(start, end)
Index: linux-2.6/arch/sh/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/sh/kernel/smp.c
+++ linux-2.6/arch/sh/kernel/smp.c
@@ -390,7 +390,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 }
 
 struct flush_tlb_data {
-	struct vm_area_struct *vma;
+	struct mm_struct *mm;
 	unsigned long addr1;
 	unsigned long addr2;
 };
@@ -399,19 +399,17 @@ static void flush_tlb_range_ipi(void *in
 {
 	struct flush_tlb_data *fd = (struct flush_tlb_data *)info;
 
-	local_flush_tlb_range(fd->vma, fd->addr1, fd->addr2);
+	local_flush_tlb_range(fd->mm, fd->addr1, fd->addr2);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma,
+void flush_tlb_range(struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	preempt_disable();
 	if ((atomic_read(&mm->mm_users) != 1) || (current->mm != mm)) {
 		struct flush_tlb_data fd;
 
-		fd.vma = vma;
+		fd.mm = mm;
 		fd.addr1 = start;
 		fd.addr2 = end;
 		smp_call_function(flush_tlb_range_ipi, (void *)&fd, 1);
@@ -421,7 +419,7 @@ void flush_tlb_range(struct vm_area_stru
 			if (smp_processor_id() != i)
 				cpu_context(i, mm) = 0;
 	}
-	local_flush_tlb_range(vma, start, end);
+	local_flush_tlb_range(mm, start, end);
 	preempt_enable();
 }
 
Index: linux-2.6/arch/sh/mm/nommu.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/nommu.c
+++ linux-2.6/arch/sh/mm/nommu.c
@@ -46,7 +46,7 @@ void local_flush_tlb_mm(struct mm_struct
 	BUG();
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end)
 {
 	BUG();
Index: linux-2.6/arch/sh/mm/tlbflush_32.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_32.c
+++ linux-2.6/arch/sh/mm/tlbflush_32.c
@@ -36,10 +36,9 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned int cpu = smp_processor_id();
 
 	if (cpu_context(cpu, mm) != NO_CONTEXT) {
Index: linux-2.6/arch/sh/mm/tlbflush_64.c
===================================================================
--- linux-2.6.orig/arch/sh/mm/tlbflush_64.c
+++ linux-2.6/arch/sh/mm/tlbflush_64.c
@@ -365,16 +365,14 @@ void local_flush_tlb_page(struct vm_area
 	}
 }
 
-void local_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void local_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			   unsigned long end)
 {
 	unsigned long flags;
 	unsigned long long match, pteh=0, pteh_epn, pteh_low;
 	unsigned long tlb;
 	unsigned int cpu = smp_processor_id();
-	struct mm_struct *mm;
 
-	mm = vma->vm_mm;
 	if (cpu_context(cpu, mm) == NO_CONTEXT)
 		return;
 
Index: linux-2.6/arch/sparc/include/asm/tlb_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlb_32.h
+++ linux-2.6/arch/sparc/include/asm/tlb_32.h
@@ -8,7 +8,7 @@ do {								\
 
 #define tlb_end_vma(tlb, vma) \
 do {								\
-	flush_tlb_range(vma, vma->vm_start, vma->vm_end);	\
+	flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);\
 } while (0)
 
 #define __tlb_remove_tlb_entry(tlb, pte, address) \
Index: linux-2.6/arch/sparc/include/asm/tlbflush_32.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_32.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_32.h
@@ -11,7 +11,7 @@
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  */
 
@@ -19,17 +19,17 @@
 
 BTFIXUPDEF_CALL(void, local_flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, local_flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, local_flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, local_flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define local_flush_tlb_all() BTFIXUP_CALL(local_flush_tlb_all)()
 #define local_flush_tlb_mm(mm) BTFIXUP_CALL(local_flush_tlb_mm)(mm)
-#define local_flush_tlb_range(vma,start,end) BTFIXUP_CALL(local_flush_tlb_range)(vma,start,end)
+#define local_flush_tlb_range(mm,start,end) BTFIXUP_CALL(local_flush_tlb_range)(mm,start,end)
 #define local_flush_tlb_page(vma,addr) BTFIXUP_CALL(local_flush_tlb_page)(vma,addr)
 
 extern void smp_flush_tlb_all(void);
 extern void smp_flush_tlb_mm(struct mm_struct *mm);
-extern void smp_flush_tlb_range(struct vm_area_struct *vma,
+extern void smp_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start,
 				  unsigned long end);
 extern void smp_flush_tlb_page(struct vm_area_struct *mm, unsigned long page);
@@ -38,12 +38,12 @@ extern void smp_flush_tlb_page(struct vm
 
 BTFIXUPDEF_CALL(void, flush_tlb_all, void)
 BTFIXUPDEF_CALL(void, flush_tlb_mm, struct mm_struct *)
-BTFIXUPDEF_CALL(void, flush_tlb_range, struct vm_area_struct *, unsigned long, unsigned long)
+BTFIXUPDEF_CALL(void, flush_tlb_range, struct mm_struct *, unsigned long, unsigned long)
 BTFIXUPDEF_CALL(void, flush_tlb_page, struct vm_area_struct *, unsigned long)
 
 #define flush_tlb_all() BTFIXUP_CALL(flush_tlb_all)()
 #define flush_tlb_mm(mm) BTFIXUP_CALL(flush_tlb_mm)(mm)
-#define flush_tlb_range(vma,start,end) BTFIXUP_CALL(flush_tlb_range)(vma,start,end)
+#define flush_tlb_range(mm,start,end) BTFIXUP_CALL(flush_tlb_range)(mm,start,end)
 #define flush_tlb_page(vma,addr) BTFIXUP_CALL(flush_tlb_page)(vma,addr)
 
 // #define flush_tlb() flush_tlb_mm(current->active_mm)	/* XXX Sure? */
Index: linux-2.6/arch/sparc/include/asm/tlbflush_64.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/tlbflush_64.h
+++ linux-2.6/arch/sparc/include/asm/tlbflush_64.h
@@ -21,7 +21,7 @@ extern void flush_tsb_user(struct tlb_ba
 
 extern void flush_tlb_pending(void);
 
-#define flush_tlb_range(vma,start,end)	\
+#define flush_tlb_range(mm,start,end)	\
 	do { (void)(start); flush_tlb_pending(); } while (0)
 #define flush_tlb_page(vma,addr)	flush_tlb_pending()
 #define flush_tlb_mm(mm)		flush_tlb_pending()
Index: linux-2.6/arch/sparc/kernel/smp_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/smp_32.c
+++ linux-2.6/arch/sparc/kernel/smp_32.c
@@ -184,17 +184,15 @@ void smp_flush_cache_range(struct vm_are
 	}
 }
 
-void smp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void smp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			 unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
 	if (mm->context != NO_CONTEXT) {
 		cpumask_t cpu_mask = *mm_cpumask(mm);
 		cpu_clear(smp_processor_id(), cpu_mask);
 		if (!cpus_empty(cpu_mask))
-			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) vma, start, end);
-		local_flush_tlb_range(vma, start, end);
+			xc3((smpfunc_t) BTFIXUP_CALL(local_flush_tlb_range), (unsigned long) mm, start, end);
+		local_flush_tlb_range(mm, start, end);
 	}
 }
 
Index: linux-2.6/arch/sparc/mm/generic_32.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_32.c
+++ linux-2.6/arch/sparc/mm/generic_32.c
@@ -92,7 +92,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/generic_64.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/generic_64.c
+++ linux-2.6/arch/sparc/mm/generic_64.c
@@ -158,7 +158,7 @@ int io_remap_pfn_range(struct vm_area_st
 		dir++;
 	}
 
-	flush_tlb_range(vma, beg, end);
+	flush_tlb_range(vma->vm_mm, beg, end);
 	return error;
 }
 EXPORT_SYMBOL(io_remap_pfn_range);
Index: linux-2.6/arch/sparc/mm/hypersparc.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/hypersparc.S
+++ linux-2.6/arch/sparc/mm/hypersparc.S
@@ -284,7 +284,6 @@
 	 sta	%g5, [%g1] ASI_M_MMUREGS
 
 hypersparc_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
Index: linux-2.6/arch/sparc/mm/srmmu.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/srmmu.c
+++ linux-2.6/arch/sparc/mm/srmmu.c
@@ -679,7 +679,7 @@ extern void tsunami_flush_page_for_dma(u
 extern void tsunami_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void tsunami_flush_tlb_all(void);
 extern void tsunami_flush_tlb_mm(struct mm_struct *mm);
-extern void tsunami_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void tsunami_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void tsunami_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void tsunami_setup_blockops(void);
 
@@ -726,7 +726,7 @@ extern void swift_flush_page_for_dma(uns
 extern void swift_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void swift_flush_tlb_all(void);
 extern void swift_flush_tlb_mm(struct mm_struct *mm);
-extern void swift_flush_tlb_range(struct vm_area_struct *vma,
+extern void swift_flush_tlb_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 extern void swift_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 
@@ -964,9 +964,8 @@ static void cypress_flush_tlb_mm(struct 
 	FLUSH_END
 }
 
-static void cypress_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void cypress_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long size;
 
 	FLUSH_BEGIN(mm)
@@ -1018,13 +1017,13 @@ extern void viking_flush_page(unsigned l
 extern void viking_mxcc_flush_page(unsigned long page);
 extern void viking_flush_tlb_all(void);
 extern void viking_flush_tlb_mm(struct mm_struct *mm);
-extern void viking_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void viking_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void viking_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
 extern void sun4dsmp_flush_tlb_all(void);
 extern void sun4dsmp_flush_tlb_mm(struct mm_struct *mm);
-extern void sun4dsmp_flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+extern void sun4dsmp_flush_tlb_range(struct mm_struct *mm, unsigned long start,
 				   unsigned long end);
 extern void sun4dsmp_flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long page);
@@ -1039,7 +1038,7 @@ extern void hypersparc_flush_page_for_dm
 extern void hypersparc_flush_sig_insns(struct mm_struct *mm, unsigned long insn_addr);
 extern void hypersparc_flush_tlb_all(void);
 extern void hypersparc_flush_tlb_mm(struct mm_struct *mm);
-extern void hypersparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void hypersparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end);
 extern void hypersparc_flush_tlb_page(struct vm_area_struct *vma, unsigned long page);
 extern void hypersparc_setup_blockops(void);
 
@@ -1761,9 +1760,9 @@ static void turbosparc_flush_tlb_mm(stru
 	FLUSH_END
 }
 
-static void turbosparc_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void turbosparc_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	FLUSH_BEGIN(vma->vm_mm)
+	FLUSH_BEGIN(mm)
 	srmmu_flush_whole_tlb();
 	FLUSH_END
 }
Index: linux-2.6/arch/sparc/mm/sun4c.c
===================================================================
--- linux-2.6.orig/arch/sparc/mm/sun4c.c
+++ linux-2.6/arch/sparc/mm/sun4c.c
@@ -1419,9 +1419,8 @@ static void sun4c_flush_tlb_mm(struct mm
 	}
 }
 
-static void sun4c_flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
+static void sun4c_flush_tlb_range(struct mm_struct *mm, unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	int new_ctx = mm->context;
 
 	if (new_ctx != NO_CONTEXT) {
Index: linux-2.6/arch/sparc/mm/swift.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/swift.S
+++ linux-2.6/arch/sparc/mm/swift.S
@@ -219,7 +219,6 @@
 	.globl	swift_flush_tlb_range
 	.globl	swift_flush_tlb_all
 swift_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 swift_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/tsunami.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/tsunami.S
+++ linux-2.6/arch/sparc/mm/tsunami.S
@@ -46,7 +46,6 @@
 
 	/* More slick stuff... */
 tsunami_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 tsunami_flush_tlb_mm:
 	ld	[%o0 + AOFF_mm_context], %g2
 	cmp	%g2, -1
Index: linux-2.6/arch/sparc/mm/viking.S
===================================================================
--- linux-2.6.orig/arch/sparc/mm/viking.S
+++ linux-2.6/arch/sparc/mm/viking.S
@@ -149,7 +149,6 @@
 #endif
 
 viking_flush_tlb_range:
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	mov	SRMMU_CTX_REG, %g1
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
@@ -240,7 +239,6 @@
 	tst	%g5
 	bne	3f
 	 mov	SRMMU_CTX_REG, %g1
-	ld	[%o0 + 0x00], %o0	/* XXX vma->vm_mm GROSS XXX */
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
 	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
Index: linux-2.6/arch/tile/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/tile/include/asm/tlbflush.h
+++ linux-2.6/arch/tile/include/asm/tlbflush.h
@@ -105,7 +105,7 @@ static inline void local_flush_tlb_all(v
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -120,7 +120,7 @@ extern void flush_tlb_mm(struct mm_struc
 extern void flush_tlb_page(const struct vm_area_struct *, unsigned long);
 extern void flush_tlb_page_mm(const struct vm_area_struct *,
 			      struct mm_struct *, unsigned long);
-extern void flush_tlb_range(const struct vm_area_struct *,
+extern void flush_tlb_range(const struct mm_struct *,
 			    unsigned long start, unsigned long end);
 
 #define flush_tlb()     flush_tlb_current_task()
Index: linux-2.6/arch/tile/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/tile/kernel/tlb.c
+++ linux-2.6/arch/tile/kernel/tlb.c
@@ -64,14 +64,13 @@ void flush_tlb_page(const struct vm_area
 }
 EXPORT_SYMBOL(flush_tlb_page);
 
-void flush_tlb_range(const struct vm_area_struct *vma,
+void flush_tlb_range(const struct mm_struct *mm,
 		     unsigned long start, unsigned long end)
 {
-	unsigned long size = hv_page_size(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
-	flush_remote(0, cache, &mm->cpu_vm_mask, start, end - start, size,
-		     &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
+		     start, end - start, PAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, 0, &mm->cpu_vm_mask,
+		     start, end - start, HPAGE_SIZE, &mm->cpu_vm_mask, NULL, 0);
 }
 
 void flush_tlb_all(void)
Index: linux-2.6/arch/um/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/um/include/asm/tlbflush.h
+++ linux-2.6/arch/um/include/asm/tlbflush.h
@@ -16,12 +16,12 @@
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
  *  - flush_tlb_kernel_vm() flushes the kernel vm area
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  */
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, 
+extern void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 			    unsigned long end);
 extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
 extern void flush_tlb_kernel_vm(void);
Index: linux-2.6/arch/um/kernel/tlb.c
===================================================================
--- linux-2.6.orig/arch/um/kernel/tlb.c
+++ linux-2.6/arch/um/kernel/tlb.c
@@ -492,12 +492,12 @@ static void fix_range(struct mm_struct *
 	fix_range_common(mm, start_addr, end_addr, force);
 }
 
-void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+void flush_tlb_range(struct mm_struct *mm, unsigned long start,
 		     unsigned long end)
 {
-	if (vma->vm_mm == NULL)
+	if (mm == NULL)
 		flush_tlb_kernel_range_common(start, end);
-	else fix_range(vma->vm_mm, start, end, 0);
+	else fix_range(mm, start, end, 0);
 }
 
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
Index: linux-2.6/arch/unicore32/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlb.h
+++ linux-2.6/arch/unicore32/include/asm/tlb.h
@@ -77,7 +77,7 @@ static inline void
 tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
 	if (!tlb->fullmm && tlb->range_end > 0)
-		flush_tlb_range(vma, tlb->range_start, tlb->range_end);
+		flush_tlb_range(vma->vm_mm, tlb->range_start, tlb->range_end);
 }
 
 static inline void tlb_flush_mmu(struct mmu_gather *tlb)
Index: linux-2.6/arch/unicore32/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/unicore32/include/asm/tlbflush.h
+++ linux-2.6/arch/unicore32/include/asm/tlbflush.h
@@ -167,7 +167,7 @@ static inline void clean_pmd_entry(pmd_t
 /*
  * Convert calls to our calling convention.
  */
-#define local_flush_tlb_range(vma, start, end)	\
+#define local_flush_tlb_range(mm, start, end)	\
 	__cpu_flush_user_tlb_range(start, end, vma)
 #define local_flush_tlb_kernel_range(s, e)	\
 	__cpu_flush_kern_tlb_range(s, e)
Index: linux-2.6/arch/x86/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/tlbflush.h
+++ linux-2.6/arch/x86/include/asm/tlbflush.h
@@ -75,7 +75,7 @@ static inline void __flush_tlb_one(unsig
  *  - flush_tlb_all() flushes all processes TLBs
  *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
  *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_range(mm, start, end) flushes a range of pages
  *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
  *  - flush_tlb_others(cpumask, mm, va) flushes TLBs on other cpus
  *
@@ -106,10 +106,10 @@ static inline void flush_tlb_page(struct
 		__flush_tlb_one(addr);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	if (vma->vm_mm == current->active_mm)
+	if (mm == current->active_mm)
 		__flush_tlb();
 }
 
@@ -136,10 +136,10 @@ extern void flush_tlb_page(struct vm_are
 
 #define flush_tlb()	flush_tlb_current_task()
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
+static inline void flush_tlb_range(struct mm_struct *mm,
 				   unsigned long start, unsigned long end)
 {
-	flush_tlb_mm(vma->vm_mm);
+	flush_tlb_mm(mm);
 }
 
 void native_flush_tlb_others(const struct cpumask *cpumask,
Index: linux-2.6/arch/x86/mm/pgtable.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pgtable.c
+++ linux-2.6/arch/x86/mm/pgtable.c
@@ -332,7 +332,7 @@ int pmdp_set_access_flags(struct vm_area
 	if (changed && dirty) {
 		*pmdp = entry;
 		pmd_update_defer(vma->vm_mm, address, pmdp);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 
 	return changed;
@@ -393,7 +393,7 @@ int pmdp_clear_flush_young(struct vm_are
 
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 
 	return young;
 }
@@ -408,7 +408,7 @@ void pmdp_splitting_flush(struct vm_area
 	if (set) {
 		pmd_update(vma->vm_mm, address, pmdp);
 		/* need tlb flush only to serialize against gup-fast */
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 }
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlb.h
+++ linux-2.6/arch/xtensa/include/asm/tlb.h
@@ -32,7 +32,7 @@
 # define tlb_end_vma(tlb, vma)						      \
 	do {								      \
 		if (!tlb->fullmm)					      \
-			flush_tlb_range(vma, vma->vm_start, vma->vm_end);     \
+			flush_tlb_range(vma->vm_mm, vma->vm_start, vma->vm_end);     \
 	} while(0)
 
 #endif
Index: linux-2.6/arch/xtensa/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/tlbflush.h
+++ linux-2.6/arch/xtensa/include/asm/tlbflush.h
@@ -37,7 +37,7 @@
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm(struct mm_struct*);
 extern void flush_tlb_page(struct vm_area_struct*,unsigned long);
-extern void flush_tlb_range(struct vm_area_struct*,unsigned long,unsigned long);
+extern void flush_tlb_range(struct mm_struct*,unsigned long,unsigned long);
 
 #define flush_tlb_kernel_range(start,end) flush_tlb_all()
 
Index: linux-2.6/arch/xtensa/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/xtensa/mm/tlb.c
+++ linux-2.6/arch/xtensa/mm/tlb.c
@@ -82,10 +82,9 @@ void flush_tlb_mm(struct mm_struct *mm)
 # define _TLB_ENTRIES _DTLB_ENTRIES
 #endif
 
-void flush_tlb_range (struct vm_area_struct *vma,
+void flush_tlb_range (struct mm_struct *mm,
     		      unsigned long start, unsigned long end)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long flags;
 
 	if (mm->context == NO_CONTEXT)
Index: linux-2.6/mm/huge_memory.c
===================================================================
--- linux-2.6.orig/mm/huge_memory.c
+++ linux-2.6/mm/huge_memory.c
@@ -1058,8 +1058,8 @@ int change_huge_pmd(struct vm_area_struc
 			entry = pmdp_get_and_clear(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			set_pmd_at(mm, addr, pmd, entry);
-			spin_unlock(&vma->vm_mm->page_table_lock);
-			flush_tlb_range(vma, addr, addr + HPAGE_PMD_SIZE);
+			spin_unlock(&mm->page_table_lock);
+			flush_tlb_range(mm, addr, addr + HPAGE_PMD_SIZE);
 			ret = 1;
 		}
 	} else
@@ -1313,7 +1313,7 @@ static int __split_huge_page_map(struct 
 		 * of the pmd entry with pmd_populate.
 		 */
 		set_pmd_at(mm, address, pmd, pmd_mknotpresent(*pmd));
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(mm, address, address + HPAGE_PMD_SIZE);
 		pmd_populate(mm, pmd, pgtable);
 		ret = 1;
 	}
Index: linux-2.6/mm/hugetlb.c
===================================================================
--- linux-2.6.orig/mm/hugetlb.c
+++ linux-2.6/mm/hugetlb.c
@@ -2264,7 +2264,7 @@ void __unmap_hugepage_range(struct vm_ar
 		list_add(&page->lru, &page_list);
 	}
 	spin_unlock(&mm->page_table_lock);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 	mmu_notifier_invalidate_range_end(mm, start, end);
 	list_for_each_entry_safe(page, tmp, &page_list, lru) {
 		page_remove_rmap(page);
@@ -2829,7 +2829,7 @@ void hugetlb_change_protection(struct vm
 	spin_unlock(&mm->page_table_lock);
 	mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
 
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int hugetlb_reserve_pages(struct inode *inode,
Index: linux-2.6/mm/mprotect.c
===================================================================
--- linux-2.6.orig/mm/mprotect.c
+++ linux-2.6/mm/mprotect.c
@@ -138,7 +138,7 @@ static void change_protection(struct vm_
 		change_pud_range(vma, pgd, addr, next, newprot,
 				 dirty_accountable);
 	} while (pgd++, addr = next, addr != end);
-	flush_tlb_range(vma, start, end);
+	flush_tlb_range(mm, start, end);
 }
 
 int
Index: linux-2.6/mm/pgtable-generic.c
===================================================================
--- linux-2.6.orig/mm/pgtable-generic.c
+++ linux-2.6/mm/pgtable-generic.c
@@ -43,7 +43,7 @@ int pmdp_set_access_flags(struct vm_area
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	if (changed) {
 		set_pmd_at(vma->vm_mm, address, pmdp, entry);
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	}
 	return changed;
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -76,7 +76,7 @@ int pmdp_clear_flush_young(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	young = pmdp_test_and_clear_young(vma, address, pmdp);
 	if (young)
-		flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+		flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return young;
 }
 #endif
@@ -100,7 +100,7 @@ pmd_t pmdp_clear_flush(struct vm_area_st
 	pmd_t pmd;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	pmd = pmdp_get_and_clear(vma->vm_mm, address, pmdp);
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 	return pmd;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -115,7 +115,7 @@ pmd_t pmdp_splitting_flush(struct vm_are
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
 	/* tlb flush only to serialize against gup-fast */
-	flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	flush_tlb_range(vma->vm_mm, address, address + HPAGE_PMD_SIZE);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif


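The conversion above is mechanical throughout; as a caller-side sketch
(not taken from the patch itself), code that used to hand in a vma now
passes the mm, dereferencing vma->vm_mm where only a vma is at hand:

	/* old convention: range flushes took a vm_area_struct */
	flush_tlb_range(vma, start, end);

	/* new convention: they take the mm_struct directly */
	flush_tlb_range(vma->vm_mm, start, end);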

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 3/6] mm: Provide generic range tracking and flushing
  2011-03-02 17:59 ` Peter Zijlstra
  (?)
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: mm-generic-tlb-range.patch --]
[-- Type: text/plain, Size: 6218 bytes --]

In order to convert ia64, arm and sh to generic tlb we need to provide
some extra infrastructure to track the range of the flushed page
tables.

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmap_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things

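In code, the second case boils down to something like the following
caller-side sketch; tlb_gather_mmu()/tlb_finish_mmu() and the range
tracking are from this patch, the argument names are illustrative:

	struct mmu_gather tlb;

	tlb_gather_mmu(&tlb, mm, 0);		/* 0: not a full-mm teardown */
	free_pgd_range(&tlb, start, end, floor, ceiling);
						/* *_free_tlb() -> track tlb range */
	tlb_finish_mmu(&tlb, start, end);	/* flushes the tracked range */
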
There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast().  

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables and concurrent
hardware walkers could re-populate these caches if the final tlb flush
were to be from tlb_end_vma(), since a concurrent walk could still be
in progress.

Now that we've fixed flush_tlb_range() to take an mm_struct argument,
we can implement the needed range tracking as outlined above.
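
As a (hypothetical) illustration of what this buys an architecture:
once it selects HAVE_MMU_GATHER_RANGE, its tlb_flush() hook can be
driven entirely by the tracked window (tlb_flush_mmu() below only
invokes the hook in the !fullmm case). The mmu_gather fields are from
this patch; the hook itself is a sketch, not part of the series:

	static inline void tlb_flush(struct mmu_gather *tlb)
	{
		/* sketch: flush only the range the gather accumulated */
		if (tlb->start < tlb->end)
			flush_tlb_range(tlb->mm, tlb->start, tlb->end);
	}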

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/Kconfig              |    3 +
 include/asm-generic/tlb.h |   93 ++++++++++++++++++++++++++++++++++++----------
 2 files changed, 77 insertions(+), 19 deletions(-)

Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -184,4 +184,7 @@ config HAVE_ARCH_MUTEX_CPU_RELAX
 config HAVE_RCU_TABLE_FREE
 	bool
 
+config HAVE_MMU_GATHER_RANGE
+	bool
+
 source "kernel/gcov/Kconfig"
Index: linux-2.6/include/asm-generic/tlb.h
===================================================================
--- linux-2.6.orig/include/asm-generic/tlb.h
+++ linux-2.6/include/asm-generic/tlb.h
@@ -78,11 +78,15 @@ struct mmu_gather_batch {
 #define MAX_GATHER_BATCH	\
 	((PAGE_SIZE - sizeof(struct mmu_gather_batch)) / sizeof(void *))
 
-/* struct mmu_gather is an opaque type used by the mm code for passing around
+/*
+ * struct mmu_gather is an opaque type used by the mm code for passing around
  * any data needed by arch specific code for tlb_remove_page.
  */
 struct mmu_gather {
 	struct mm_struct	*mm;
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+	unsigned long		start, end;
+#endif
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	struct mmu_table_batch	*batch;
 #endif
@@ -106,6 +110,48 @@ struct mmu_gather {
   #define tlb_fast_mode(tlb) 1
 #endif
 
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+	tlb->start = TASK_SIZE;
+	tlb->end = 0;
+}
+
+static inline void
+tlb_track_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end)
+{
+	if (!tlb->fullmm) {
+		tlb->start = min(tlb->start, addr);
+		tlb->end = max(tlb->end, end);
+	}
+}
+
+static inline void
+tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (!tlb->fullmm)
+		flush_cache_range(vma, vma->vm_start, vma->vm_end);
+}
+
+static inline void
+tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+}
+
+#else /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+}
+
+/*
+ * Macro avoids argument evaluation.
+ */
+#define tlb_track_range(tlb, addr, end) do { } while (0)
+
+#endif /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
 static inline int tlb_next_batch(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
@@ -146,6 +192,8 @@ tlb_gather_mmu(struct mmu_gather *tlb, s
 	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
 	tlb->active     = &tlb->local;
 
+	tlb_init_range(tlb);
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
 #endif
@@ -164,6 +212,7 @@ tlb_flush_mmu(struct mmu_gather *tlb)
 	if (!tlb->fullmm && tlb->need_flush) {
 		tlb->need_flush = 0;
 		tlb_flush(tlb);
+		tlb_init_range(tlb);
 	}
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
@@ -240,32 +289,38 @@ static inline void tlb_remove_page(struc
  * later optimise away the tlb invalidate.   This helps when userspace is
  * unmapping already-unmapped pages, which happens quite a lot.
  */
-#define tlb_remove_tlb_entry(tlb, ptep, address)		\
-	do {							\
-		tlb->need_flush = 1;				\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+#define tlb_remove_tlb_entry(tlb, ptep, address)			\
+	do {								\
+		tlb->need_flush = 1;					\
+		tlb_track_range(tlb, address, address + PAGE_SIZE);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
-#define pte_free_tlb(tlb, ptep, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pte_free_tlb(tlb, ptep, address);		\
+#define pte_free_tlb(tlb, ptep, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
+		__pte_free_tlb(tlb, ptep, address);				\
 	} while (0)
 
-#ifndef __ARCH_HAS_4LEVEL_HACK
-#define pud_free_tlb(tlb, pudp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pud_free_tlb(tlb, pudp, address);		\
+#define pmd_free_tlb(tlb, pmdp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pud_addr_end(address, TASK_SIZE));\
+		__pmd_free_tlb(tlb, pmdp, address);				\
 	} while (0)
-#endif
 
-#define pmd_free_tlb(tlb, pmdp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pmd_free_tlb(tlb, pmdp, address);		\
+#ifndef __ARCH_HAS_4LEVEL_HACK
+#define pud_free_tlb(tlb, pudp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pgd_addr_end(address, TASK_SIZE));\
+		__pud_free_tlb(tlb, pudp, address);				\
 	} while (0)
+#endif
 
+#ifndef tlb_migrate_finish
 #define tlb_migrate_finish(mm) do {} while (0)
+#endif
 
 #endif /* _ASM_GENERIC__TLB_H */



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 3/6] mm: Provide generic range tracking and flushing
@ 2011-03-02 17:59   ` Peter Zijlstra
  0 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: mm-generic-tlb-range.patch --]
[-- Type: text/plain, Size: 6521 bytes --]

In order to convert ia64, arm and sh to generic tlb we need to provide
some extra infrastructure to track the range of the flushed page
tables.

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmap_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things

There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast().  

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables and concurrent
hardware walkers could re-populate these caches if the final tlb flush
were to be from tlb_end_vma() since an concurrent walk could still be
in progress.

Now that we fixed flush_tlb_range() to take an mm_struct argument we
can implement the needed range tracking as outlined above.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/Kconfig              |    3 +
 include/asm-generic/tlb.h |   93 ++++++++++++++++++++++++++++++++++++----------
 2 files changed, 77 insertions(+), 19 deletions(-)

Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -184,4 +184,7 @@ config HAVE_ARCH_MUTEX_CPU_RELAX
 config HAVE_RCU_TABLE_FREE
 	bool
 
+config HAVE_MMU_GATHER_RANGE
+	bool
+
 source "kernel/gcov/Kconfig"
Index: linux-2.6/include/asm-generic/tlb.h
===================================================================
--- linux-2.6.orig/include/asm-generic/tlb.h
+++ linux-2.6/include/asm-generic/tlb.h
@@ -78,11 +78,15 @@ struct mmu_gather_batch {
 #define MAX_GATHER_BATCH	\
 	((PAGE_SIZE - sizeof(struct mmu_gather_batch)) / sizeof(void *))
 
-/* struct mmu_gather is an opaque type used by the mm code for passing around
+/*
+ * struct mmu_gather is an opaque type used by the mm code for passing around
  * any data needed by arch specific code for tlb_remove_page.
  */
 struct mmu_gather {
 	struct mm_struct	*mm;
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+	unsigned long		start, end;
+#endif
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	struct mmu_table_batch	*batch;
 #endif
@@ -106,6 +110,48 @@ struct mmu_gather {
   #define tlb_fast_mode(tlb) 1
 #endif
 
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+	tlb->start = TASK_SIZE;
+	tlb->end = 0;
+}
+
+static inline void
+tlb_track_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end)
+{
+	if (!tlb->fullmm) {
+		tlb->start = min(tlb->start, addr);
+		tlb->end = max(tlb->end, end);
+	}
+}
+
+static inline void
+tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (!tlb->fullmm)
+		flush_cache_range(vma, vma->vm_start, vma->vm_end);
+}
+
+static inline void
+tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+}
+
+#else /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+}
+
+/*
+ * Macro avoids argument evaluation.
+ */
+#define tlb_track_range(tlb, addr, end) do { } while (0)
+
+#endif /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
 static inline int tlb_next_batch(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
@@ -146,6 +192,8 @@ tlb_gather_mmu(struct mmu_gather *tlb, s
 	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
 	tlb->active     = &tlb->local;
 
+	tlb_init_range(tlb);
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
 #endif
@@ -164,6 +212,7 @@ tlb_flush_mmu(struct mmu_gather *tlb)
 	if (!tlb->fullmm && tlb->need_flush) {
 		tlb->need_flush = 0;
 		tlb_flush(tlb);
+		tlb_init_range(tlb);
 	}
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
@@ -240,32 +289,38 @@ static inline void tlb_remove_page(struc
  * later optimise away the tlb invalidate.   This helps when userspace is
  * unmapping already-unmapped pages, which happens quite a lot.
  */
-#define tlb_remove_tlb_entry(tlb, ptep, address)		\
-	do {							\
-		tlb->need_flush = 1;				\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+#define tlb_remove_tlb_entry(tlb, ptep, address)			\
+	do {								\
+		tlb->need_flush = 1;					\
+		tlb_track_range(tlb, address, address + PAGE_SIZE);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
-#define pte_free_tlb(tlb, ptep, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pte_free_tlb(tlb, ptep, address);		\
+#define pte_free_tlb(tlb, ptep, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
+		__pte_free_tlb(tlb, ptep, address);				\
 	} while (0)
 
-#ifndef __ARCH_HAS_4LEVEL_HACK
-#define pud_free_tlb(tlb, pudp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pud_free_tlb(tlb, pudp, address);		\
+#define pmd_free_tlb(tlb, pmdp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pud_addr_end(address, TASK_SIZE));\
+		__pmd_free_tlb(tlb, pmdp, address);				\
 	} while (0)
-#endif
 
-#define pmd_free_tlb(tlb, pmdp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pmd_free_tlb(tlb, pmdp, address);		\
+#ifndef __ARCH_HAS_4LEVEL_HACK
+#define pud_free_tlb(tlb, pudp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pgd_addr_end(address, TASK_SIZE));\
+		__pud_free_tlb(tlb, pudp, address);				\
 	} while (0)
+#endif
 
+#ifndef tlb_migrate_finish
 #define tlb_migrate_finish(mm) do {} while (0)
+#endif
 
 #endif /* _ASM_GENERIC__TLB_H */


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 3/6] mm: Provide generic range tracking and flushing
@ 2011-03-02 17:59   ` Peter Zijlstra
  0 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: mm-generic-tlb-range.patch --]
[-- Type: text/plain, Size: 6521 bytes --]

In order to convert ia64, arm and sh to generic tlb we need to provide
some extra infrastructure to track the range of the flushed page
tables.

There are two mmu_gather cases to consider:

  unmap_region()
    tlb_gather_mmu()
    unmap_vmas()
      for (; vma; vma = vma->vm_next)
        unmao_page_range()
          tlb_start_vma() -> flush cache range
          zap_*_range()
            ptep_get_and_clear_full() -> batch/track external tlbs
            tlb_remove_tlb_entry() -> batch/track external tlbs
            tlb_remove_page() -> track range/batch page
          tlb_end_vma()
    free_pgtables()
      while (vma)
        unlink_*_vma()
        free_*_range()
          *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush everything
  free vmas

and:

  shift_arg_pages()
    tlb_gather_mmu()
    free_*_range()
      *_free_tlb() -> track tlb range
    tlb_finish_mmu() -> flush things

There are various reasons that we need to flush TLBs _after_ freeing
the page-tables themselves. For some architectures (x86 among others)
this serializes against (both hardware and software) page table
walkers like gup_fast().  

For others (ARM) this is (also) needed to evict stale page-table
caches - ARM LPAE mode apparently caches page tables and concurrent
hardware walkers could re-populate these caches if the final tlb flush
were to be from tlb_end_vma() since an concurrent walk could still be
in progress.

Now that we fixed flush_tlb_range() to take an mm_struct argument we
can implement the needed range tracking as outlined above.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/Kconfig              |    3 +
 include/asm-generic/tlb.h |   93 ++++++++++++++++++++++++++++++++++++----------
 2 files changed, 77 insertions(+), 19 deletions(-)

Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -184,4 +184,7 @@ config HAVE_ARCH_MUTEX_CPU_RELAX
 config HAVE_RCU_TABLE_FREE
 	bool
 
+config HAVE_MMU_GATHER_RANGE
+	bool
+
 source "kernel/gcov/Kconfig"
Index: linux-2.6/include/asm-generic/tlb.h
===================================================================
--- linux-2.6.orig/include/asm-generic/tlb.h
+++ linux-2.6/include/asm-generic/tlb.h
@@ -78,11 +78,15 @@ struct mmu_gather_batch {
 #define MAX_GATHER_BATCH	\
 	((PAGE_SIZE - sizeof(struct mmu_gather_batch)) / sizeof(void *))
 
-/* struct mmu_gather is an opaque type used by the mm code for passing around
+/*
+ * struct mmu_gather is an opaque type used by the mm code for passing around
  * any data needed by arch specific code for tlb_remove_page.
  */
 struct mmu_gather {
 	struct mm_struct	*mm;
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+	unsigned long		start, end;
+#endif
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	struct mmu_table_batch	*batch;
 #endif
@@ -106,6 +110,48 @@ struct mmu_gather {
   #define tlb_fast_mode(tlb) 1
 #endif
 
+#ifdef CONFIG_HAVE_MMU_GATHER_RANGE
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+	tlb->start = TASK_SIZE;
+	tlb->end = 0;
+}
+
+static inline void
+tlb_track_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end)
+{
+	if (!tlb->fullmm) {
+		tlb->start = min(tlb->start, addr);
+		tlb->end = max(tlb->end, end);
+	}
+}
+
+static inline void
+tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (!tlb->fullmm)
+		flush_cache_range(vma, vma->vm_start, vma->vm_end);
+}
+
+static inline void
+tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+}
+
+#else /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
+static inline void tlb_init_range(struct mmu_gather *tlb)
+{
+}
+
+/*
+ * Macro avoids argument evaluation.
+ */
+#define tlb_track_range(tlb, addr, end) do { } while (0)
+
+#endif /* CONFIG_HAVE_MMU_GATHER_RANGE */
+
 static inline int tlb_next_batch(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
@@ -146,6 +192,8 @@ tlb_gather_mmu(struct mmu_gather *tlb, s
 	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
 	tlb->active     = &tlb->local;
 
+	tlb_init_range(tlb);
+
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb->batch = NULL;
 #endif
@@ -164,6 +212,7 @@ tlb_flush_mmu(struct mmu_gather *tlb)
 	if (!tlb->fullmm && tlb->need_flush) {
 		tlb->need_flush = 0;
 		tlb_flush(tlb);
+		tlb_init_range(tlb);
 	}
 
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
@@ -240,32 +289,38 @@ static inline void tlb_remove_page(struc
  * later optimise away the tlb invalidate.   This helps when userspace is
  * unmapping already-unmapped pages, which happens quite a lot.
  */
-#define tlb_remove_tlb_entry(tlb, ptep, address)		\
-	do {							\
-		tlb->need_flush = 1;				\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+#define tlb_remove_tlb_entry(tlb, ptep, address)			\
+	do {								\
+		tlb->need_flush = 1;					\
+		tlb_track_range(tlb, address, address + PAGE_SIZE);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
-#define pte_free_tlb(tlb, ptep, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pte_free_tlb(tlb, ptep, address);		\
+#define pte_free_tlb(tlb, ptep, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
+		__pte_free_tlb(tlb, ptep, address);				\
 	} while (0)
 
-#ifndef __ARCH_HAS_4LEVEL_HACK
-#define pud_free_tlb(tlb, pudp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pud_free_tlb(tlb, pudp, address);		\
+#define pmd_free_tlb(tlb, pmdp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pud_addr_end(address, TASK_SIZE));\
+		__pmd_free_tlb(tlb, pmdp, address);				\
 	} while (0)
-#endif
 
-#define pmd_free_tlb(tlb, pmdp, address)			\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pmd_free_tlb(tlb, pmdp, address);		\
+#ifndef __ARCH_HAS_4LEVEL_HACK
+#define pud_free_tlb(tlb, pudp, address)					\
+	do {									\
+		tlb->need_flush = 1;						\
+		tlb_track_range(tlb, address, pgd_addr_end(address, TASK_SIZE));\
+		__pud_free_tlb(tlb, pudp, address);				\
 	} while (0)
+#endif
 
+#ifndef tlb_migrate_finish
 #define tlb_migrate_finish(mm) do {} while (0)
+#endif
 
 #endif /* _ASM_GENERIC__TLB_H */


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-02 17:59 ` Peter Zijlstra
  (?)
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky

[-- Attachment #1: mm-arm-tlb-range.patch --]
[-- Type: text/plain, Size: 6440 bytes --]

We might want to optimize the tlb_flush() function to do a full-mm flush
when the range is 'large'; IA64 already does this.
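
One possible shape for that (a sketch only; FLUSH_RANGE_THRESHOLD is a
hypothetical per-arch cutoff, nothing in this series defines it):

static inline void tlb_flush(struct mmu_gather *tlb)
{
	/* FLUSH_RANGE_THRESHOLD is made up; an arch would pick its own. */
	if (tlb->end - tlb->start >= FLUSH_RANGE_THRESHOLD)
		flush_tlb_mm(tlb->mm);
	else
		flush_tlb_range(tlb->mm, tlb->start, tlb->end);
}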

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/arm/Kconfig           |    1 
 arch/arm/include/asm/tlb.h |  174 ++-------------------------------------------
 2 files changed, 9 insertions(+), 166 deletions(-)

Index: linux-2.6/arch/arm/Kconfig
===================================================================
--- linux-2.6.orig/arch/arm/Kconfig
+++ linux-2.6/arch/arm/Kconfig
@@ -28,6 +28,7 @@ config ARM
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_GENERIC_HARDIRQS
 	select HAVE_SPARSE_IRQ
+	select HAVE_MMU_GATHER_RANGE if MMU
 	help
 	  The ARM series is a line of low-power-consumption RISC chip designs
 	  licensed by ARM Ltd and targeted at embedded applications and
Index: linux-2.6/arch/arm/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/tlb.h
+++ linux-2.6/arch/arm/include/asm/tlb.h
@@ -29,184 +29,26 @@
 
 #else /* !CONFIG_MMU */
 
-#include <linux/swap.h>
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-
-/*
- * We need to delay page freeing for SMP as other CPUs can access pages
- * which have been removed but not yet had their TLB entries invalidated.
- * Also, as ARMv7 speculative prefetch can drag new entries into the TLB,
- * we need to apply this same delaying tactic to ensure correct operation.
- */
-#if defined(CONFIG_SMP) || defined(CONFIG_CPU_32v7)
-#define tlb_fast_mode(tlb)	0
-#else
-#define tlb_fast_mode(tlb)	1
-#endif
-
-#define MMU_GATHER_BUNDLE	8
-
-/*
- * TLB handling.  This allows us to remove pages from the page
- * tables, and efficiently handle the TLB issues.
- */
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		fullmm;
-	struct vm_area_struct	*vma;
-	unsigned long		range_start;
-	unsigned long		range_end;
-	unsigned int		nr;
-	unsigned int		max;
-	struct page		**pages;
-	struct page		*local[MMU_GATHER_BUNDLE];
-};
-
-DECLARE_PER_CPU(struct mmu_gather, mmu_gathers);
-
-/*
- * This is unnecessarily complex.  There's three ways the TLB shootdown
- * code is used:
- *  1. Unmapping a range of vmas.  See zap_page_range(), unmap_region().
- *     tlb->fullmm = 0, and tlb_start_vma/tlb_end_vma will be called.
- *     tlb->vma will be non-NULL.
- *  2. Unmapping all vmas.  See exit_mmap().
- *     tlb->fullmm = 1, and tlb_start_vma/tlb_end_vma will be called.
- *     tlb->vma will be non-NULL.  Additionally, page tables will be freed.
- *  3. Unmapping argument pages.  See shift_arg_pages().
- *     tlb->fullmm = 0, but tlb_start_vma/tlb_end_vma will not be called.
- *     tlb->vma will be NULL.
- */
-static inline void tlb_flush(struct mmu_gather *tlb)
-{
-	if (tlb->fullmm || !tlb->vma)
-		flush_tlb_mm(tlb->mm);
-	else if (tlb->range_end > 0) {
-		flush_tlb_range(tlb->mm, tlb->range_start, tlb->range_end);
-		tlb->range_start = TASK_SIZE;
-		tlb->range_end = 0;
-	}
-}
-
-static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
-{
-	if (!tlb->fullmm) {
-		if (addr < tlb->range_start)
-			tlb->range_start = addr;
-		if (addr + PAGE_SIZE > tlb->range_end)
-			tlb->range_end = addr + PAGE_SIZE;
-	}
-}
-
-static inline void __tlb_alloc_page(struct mmu_gather *tlb)
-{
-	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
-
-	if (addr) {
-		tlb->pages = (void *)addr;
-		tlb->max = PAGE_SIZE / sizeof(struct page *);
-	}
-}
-
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	tlb_flush(tlb);
-	if (!tlb_fast_mode(tlb)) {
-		free_pages_and_swap_cache(tlb->pages, tlb->nr);
-		tlb->nr = 0;
-		if (tlb->pages == tlb->local)
-			__tlb_alloc_page(tlb);
-	}
-}
+static inline void tlb_flush(struct mmu_gather *tlb);
 
-static inline void
-tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, unsigned int fullmm)
-{
-	tlb->mm = mm;
-	tlb->fullmm = fullmm;
-	tlb->vma = NULL;
-	tlb->max = ARRAY_SIZE(tlb->local);
-	tlb->pages = tlb->local;
-	tlb->nr = 0;
-	__tlb_alloc_page(tlb);
-}
+#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)
 
 static inline void
-tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
-{
-	tlb_flush_mmu(tlb);
+__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr);
+#define __pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
 
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
+#include <asm-generic/tlb.h>
 
-	if (tlb->pages != tlb->local)
-		free_pages((unsigned long)tlb->pages, 0);
-}
-
-/*
- * Memorize the range for the TLB flush.
- */
-static inline void
-tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long addr)
-{
-	tlb_add_flush(tlb, addr);
-}
-
-/*
- * In the case of tlb vma handling, we can optimise these away in the
- * case where we're doing a full MM flush.  When we're doing a munmap,
- * the vmas are adjusted to only cover the region to be torn down.
- */
-static inline void
-tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+static inline void tlb_flush(struct mmu_gather *tlb)
 {
-	if (!tlb->fullmm) {
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);
-		tlb->vma = vma;
-		tlb->range_start = TASK_SIZE;
-		tlb->range_end = 0;
-	}
+	flush_tlb_range(tlb->mm, tlb->start, tlb->end);
 }
 
 static inline void
-tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm)
-		tlb_flush(tlb);
-}
-
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	if (tlb_fast_mode(tlb)) {
-		free_page_and_swap_cache(page);
-	} else {
-		tlb->pages[tlb->nr++] = page;
-		if (tlb->nr >= tlb->max)
-			return 1;
-	}
-	return 0;
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	if (__tlb_remove_page(tlb, page))
-		tlb_flush_mmu(tlb);
-}
-
-static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
-	unsigned long addr)
+__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
 {
 	pgtable_page_dtor(pte);
-	tlb_add_flush(tlb, addr);
 	tlb_remove_page(tlb, pte);
 }
-
-#define pte_free_tlb(tlb, ptep, addr)	__pte_free_tlb(tlb, ptep, addr)
-#define pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
-#define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
-
-#define tlb_migrate_finish(mm)		do { } while (0)
-
 #endif /* CONFIG_MMU */
 #endif
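
Note the ordering in the converted header: tlb_flush() is only
forward-declared before <asm-generic/tlb.h> is included, since the
generic header both completes struct mmu_gather and calls tlb_flush()
from its inline helpers; the arch definition follows the include. A
minimal sketch of the pattern (hypothetical names, not the real headers):

/* arch header, before the generic include */
struct gather;					/* completed below */
static inline void arch_flush(struct gather *g);	/* forward declaration */

/* stand-in for the generic header: defines the type, calls the hook */
struct gather { unsigned long start, end; };
static inline void generic_flush_mmu(struct gather *g)
{
	arch_flush(g);
}

/* arch header, after the include: the type is now complete */
static inline void arch_flush(struct gather *g)
{
	/* arch-specific flush over [g->start, g->end) would go here */
}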



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 5/6] ia64, mm: Convert ia64 to generic tlb
  2011-03-02 17:59 ` Peter Zijlstra
  (?)
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky,
	Tony Luck

[-- Attachment #1: mm-ia64-tlb-range.patch --]
[-- Type: text/plain, Size: 8148 bytes --]

Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/ia64/Kconfig           |    1 
 arch/ia64/include/asm/tlb.h |  216 ++++----------------------------------------
 2 files changed, 24 insertions(+), 193 deletions(-)

Index: linux-2.6/arch/ia64/Kconfig
===================================================================
--- linux-2.6.orig/arch/ia64/Kconfig
+++ linux-2.6/arch/ia64/Kconfig
@@ -25,6 +25,7 @@ config IA64
 	select HAVE_GENERIC_HARDIRQS
 	select GENERIC_IRQ_PROBE
 	select GENERIC_PENDING_IRQ if SMP
+	select HAVE_MMU_GATHER_RANGE
 	select IRQ_PER_CPU
 	default y
 	help
Index: linux-2.6/arch/ia64/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/tlb.h
+++ linux-2.6/arch/ia64/include/asm/tlb.h
@@ -46,77 +46,48 @@
 #include <asm/tlbflush.h>
 #include <asm/machvec.h>
 
-#ifdef CONFIG_SMP
-# define tlb_fast_mode(tlb)	((tlb)->nr == ~0U)
-#else
-# define tlb_fast_mode(tlb)	(1)
-#endif
-
-/*
- * If we can't allocate a page to make a big batch of page pointers
- * to work on, then just handle a few from the on-stack structure.
- */
-#define	IA64_GATHER_BUNDLE	8
-
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		nr;		/* == ~0U => fast mode */
-	unsigned int		max;
-	unsigned char		fullmm;		/* non-zero means full mm flush */
-	unsigned char		need_flush;	/* really unmapped some PTEs? */
-	unsigned long		start_addr;
-	unsigned long		end_addr;
-	struct page		**pages;
-	struct page		*local[IA64_GATHER_BUNDLE];
-};
-
 struct ia64_tr_entry {
-	u64 ifa;
-	u64 itir;
-	u64 pte;
-	u64 rr;
+       u64 ifa;
+       u64 itir;
+       u64 pte;
+       u64 rr;
 }; /*Record for tr entry!*/
 
 extern int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size);
 extern void ia64_ptr_entry(u64 target_mask, int slot);
-
 extern struct ia64_tr_entry *ia64_idtrs[NR_CPUS];
 
 /*
  region register macros
 */
 #define RR_TO_VE(val)   (((val) >> 0) & 0x0000000000000001)
-#define RR_VE(val)	(((val) & 0x0000000000000001) << 0)
-#define RR_VE_MASK	0x0000000000000001L
-#define RR_VE_SHIFT	0
-#define RR_TO_PS(val)	(((val) >> 2) & 0x000000000000003f)
-#define RR_PS(val)	(((val) & 0x000000000000003f) << 2)
-#define RR_PS_MASK	0x00000000000000fcL
-#define RR_PS_SHIFT	2
-#define RR_RID_MASK	0x00000000ffffff00L
-#define RR_TO_RID(val) 	((val >> 8) & 0xffffff)
+#define RR_VE(val)     (((val) & 0x0000000000000001) << 0)
+#define RR_VE_MASK     0x0000000000000001L
+#define RR_VE_SHIFT    0
+#define RR_TO_PS(val)  (((val) >> 2) & 0x000000000000003f)
+#define RR_PS(val)     (((val) & 0x000000000000003f) << 2)
+#define RR_PS_MASK     0x00000000000000fcL
+#define RR_PS_SHIFT    2
+#define RR_RID_MASK    0x00000000ffffff00L
+#define RR_TO_RID(val)         ((val >> 8) & 0xffffff)
+
+static inline void tlb_flush(struct mmu_gather *tlb);
+
+#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)
+#define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)
+
+#include <asm-generic/tlb.h>
 
 /*
  * Flush the TLB for address range START to END and, if not in fast mode, release the
  * freed pages that where gathered up to this point.
  */
 static inline void
-ia64_tlb_flush_mmu (struct mmu_gather *tlb, unsigned long start, unsigned long end)
+tlb_flush(struct mmu_gather *tlb)
 {
-	unsigned int nr;
+	unsigned long start = tlb->start, end = tlb->end;
 
-	if (!tlb->need_flush)
-		return;
-	tlb->need_flush = 0;
-
-	if (tlb->fullmm) {
-		/*
-		 * Tearing down the entire address space.  This happens both as a result
-		 * of exit() and execve().  The latter case necessitates the call to
-		 * flush_tlb_mm() here.
-		 */
-		flush_tlb_mm(tlb->mm);
-	} else if (unlikely (end - start >= 1024*1024*1024*1024UL
+	if (unlikely (end - start >= 1024*1024*1024*1024UL
 			     || REGION_NUMBER(start) != REGION_NUMBER(end - 1)))
 	{
 		/*
@@ -131,147 +102,6 @@ ia64_tlb_flush_mmu (struct mmu_gather *t
 		/* now flush the virt. page-table area mapping the address range: */
 		flush_tlb_range(tlb->mm, ia64_thash(start), ia64_thash(end));
 	}
-
-	/* lastly, release the freed pages */
-	nr = tlb->nr;
-	if (!tlb_fast_mode(tlb)) {
-		unsigned long i;
-		tlb->nr = 0;
-		tlb->start_addr = ~0UL;
-		for (i = 0; i < nr; ++i)
-			free_page_and_swap_cache(tlb->pages[i]);
-	}
 }
 
-static inline void __tlb_alloc_page(struct mmu_gather *tlb)
-{
-	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
-
-	if (addr) {
-		tlb->pages = (void *)addr;
-		tlb->max = PAGE_SIZE / sizeof(void *);
-	}
-}
-
-
-static inline void
-tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, unsigned int full_mm_flush)
-{
-	tlb->mm = mm;
-	tlb->max = ARRAY_SIZE(tlb->local);
-	tlb->pages = tlb->local;
-	/*
-	 * Use fast mode if only 1 CPU is online.
-	 *
-	 * It would be tempting to turn on fast-mode for full_mm_flush as well.  But this
-	 * doesn't work because of speculative accesses and software prefetching: the page
-	 * table of "mm" may (and usually is) the currently active page table and even
-	 * though the kernel won't do any user-space accesses during the TLB shoot down, a
-	 * compiler might use speculation or lfetch.fault on what happens to be a valid
-	 * user-space address.  This in turn could trigger a TLB miss fault (or a VHPT
-	 * walk) and re-insert a TLB entry we just removed.  Slow mode avoids such
-	 * problems.  (We could make fast-mode work by switching the current task to a
-	 * different "mm" during the shootdown.) --davidm 08/02/2002
-	 */
-	tlb->nr = (num_online_cpus() == 1) ? ~0U : 0;
-	tlb->fullmm = full_mm_flush;
-	tlb->start_addr = ~0UL;
-}
-
-/*
- * Called at the end of the shootdown operation to free up any resources that were
- * collected.
- */
-static inline void
-tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
-{
-	/*
-	 * Note: tlb->nr may be 0 at this point, so we can't rely on tlb->start_addr and
-	 * tlb->end_addr.
-	 */
-	ia64_tlb_flush_mmu(tlb, start, end);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-
-	if (tlb->pages != tlb->local)
-		free_pages((unsigned long)tlb->pages, 0);
-}
-
-/*
- * Logically, this routine frees PAGE.  On MP machines, the actual freeing of the page
- * must be delayed until after the TLB has been flushed (see comments at the beginning of
- * this file).
- */
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	tlb->need_flush = 1;
-
-	if (tlb_fast_mode(tlb)) {
-		free_page_and_swap_cache(page);
-		return 0;
-	}
-
-	if (!tlb->nr && tlb->pages == tlb->local)
-		__tlb_alloc_page(tlb);
-
-	tlb->pages[tlb->nr++] = page;
-	if (tlb->nr >= tlb->max)
-		return 1;
-
-	return 0;
-}
-
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-	ia64_tlb_flush_mmu(tlb, tlb->start_addr, tlb->end_addr);
-}
-
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	if (__tlb_remove_page(tlb, page))
-		tlb_flush_mmu(tlb);
-}
-
-/*
- * Remove TLB entry for PTE mapped at virtual address ADDRESS.  This is called for any
- * PTE, not just those pointing to (normal) physical memory.
- */
-static inline void
-__tlb_remove_tlb_entry (struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
-{
-	if (tlb->start_addr == ~0UL)
-		tlb->start_addr = address;
-	tlb->end_addr = address + PAGE_SIZE;
-}
-
-#define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)
-
-#define tlb_start_vma(tlb, vma)			do { } while (0)
-#define tlb_end_vma(tlb, vma)			do { } while (0)
-
-#define tlb_remove_tlb_entry(tlb, ptep, addr)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__tlb_remove_tlb_entry(tlb, ptep, addr);	\
-} while (0)
-
-#define pte_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pte_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pmd_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pmd_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pud_free_tlb(tlb, pudp, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pud_free_tlb(tlb, pudp, address);		\
-} while (0)
-
 #endif /* _ASM_IA64_TLB_H */
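
For reference, the condition kept above treats a range as too big for a
page-by-page flush when it spans at least 1TB or crosses an ia64 region
boundary (the top three address bits select the region). A stand-alone
sketch of just that condition (illustration only):

#include <stdio.h>

#define REGION_NUMBER(a)	((unsigned long)(a) >> 61)	/* top 3 bits */

static int too_big_to_range_flush(unsigned long start, unsigned long end)
{
	return end - start >= (1UL << 40) ||		/* >= 1TB */
	       REGION_NUMBER(start) != REGION_NUMBER(end - 1);
}

int main(void)
{
	/* a small range that straddles regions 0 and 1 */
	printf("%d\n", too_big_to_range_flush(0x1ffffffffffff000UL,
					      0x2000000000001000UL));
	return 0;
}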



^ permalink raw reply	[flat|nested] 86+ messages in thread

-{
-	if (__tlb_remove_page(tlb, page))
-		tlb_flush_mmu(tlb);
-}
-
-/*
- * Remove TLB entry for PTE mapped at virtual address ADDRESS.  This is called for any
- * PTE, not just those pointing to (normal) physical memory.
- */
-static inline void
-__tlb_remove_tlb_entry (struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
-{
-	if (tlb->start_addr == ~0UL)
-		tlb->start_addr = address;
-	tlb->end_addr = address + PAGE_SIZE;
-}
-
-#define tlb_migrate_finish(mm)	platform_tlb_migrate_finish(mm)
-
-#define tlb_start_vma(tlb, vma)			do { } while (0)
-#define tlb_end_vma(tlb, vma)			do { } while (0)
-
-#define tlb_remove_tlb_entry(tlb, ptep, addr)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__tlb_remove_tlb_entry(tlb, ptep, addr);	\
-} while (0)
-
-#define pte_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pte_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pmd_free_tlb(tlb, ptep, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pmd_free_tlb(tlb, ptep, address);		\
-} while (0)
-
-#define pud_free_tlb(tlb, pudp, address)		\
-do {							\
-	tlb->need_flush = 1;				\
-	__pud_free_tlb(tlb, pudp, address);		\
-} while (0)
-
 #endif /* _ASM_IA64_TLB_H */
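
Taken together with the sh conversion below, the per-architecture recipe
is small: select HAVE_MMU_GATHER_RANGE in Kconfig, provide tlb_flush()
and __tlb_remove_tlb_entry(), and include asm-generic/tlb.h. A minimal
sketch for a hypothetical architecture, mirroring the ia64/sh hunks
above (the header-guard name is made up):

#ifndef _ASM_EXAMPLE_TLB_H
#define _ASM_EXAMPLE_TLB_H

/* Forward declaration; the generic gather code calls this to flush. */
static inline void tlb_flush(struct mmu_gather *tlb);

/* No per-PTE work needed beyond the generic range tracking. */
#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)

#include <asm-generic/tlb.h>

/* tlb->start/tlb->end are maintained by the generic gather code. */
static inline void tlb_flush(struct mmu_gather *tlb)
{
	flush_tlb_range(tlb->mm, tlb->start, tlb->end);
}

#endif /* _ASM_EXAMPLE_TLB_H */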



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [RFC][PATCH 6/6] sh, mm: Convert sh to generic tlb
  2011-03-02 17:59 ` Peter Zijlstra
  (?)
@ 2011-03-02 17:59   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 17:59 UTC (permalink / raw)
  To: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds
  Cc: linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Peter Zijlstra, Russell King, Chris Metcalf, Martin Schwidefsky,
	Paul Mundt

[-- Attachment #1: mm-sh-tlb-range.patch --]
[-- Type: text/plain, Size: 4432 bytes --]

Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/sh/Kconfig           |    1 
 arch/sh/include/asm/tlb.h |  103 +++++-----------------------------------------
 2 files changed, 13 insertions(+), 91 deletions(-)

Index: linux-2.6/arch/sh/Kconfig
===================================================================
--- linux-2.6.orig/arch/sh/Kconfig
+++ linux-2.6/arch/sh/Kconfig
@@ -23,6 +23,7 @@ config SUPERH
 	select HAVE_SPARSE_IRQ
 	select RTC_LIB
 	select GENERIC_ATOMIC64
+	select HAVE_MMU_GATHER_RANGE if MMU
 	# Support the deprecated APIs until MFD and GPIOLIB catch up.
 	select GENERIC_HARDIRQS_NO_DEPRECATED if !MFD_SUPPORT && !GPIOLIB
 	help
Index: linux-2.6/arch/sh/include/asm/tlb.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/tlb.h
+++ linux-2.6/arch/sh/include/asm/tlb.h
@@ -9,101 +9,22 @@
 #include <linux/pagemap.h>
 
 #ifdef CONFIG_MMU
-#include <asm/pgalloc.h>
-#include <asm/tlbflush.h>
-#include <asm/mmu_context.h>
-
-/*
- * TLB handling.  This allows us to remove pages from the page
- * tables, and efficiently handle the TLB issues.
- */
-struct mmu_gather {
-	struct mm_struct	*mm;
-	unsigned int		fullmm;
-	unsigned long		start, end;
-};
 
-static inline void init_tlb_gather(struct mmu_gather *tlb)
-{
-	tlb->start = TASK_SIZE;
-	tlb->end = 0;
-
-	if (tlb->fullmm) {
-		tlb->start = 0;
-		tlb->end = TASK_SIZE;
-	}
-}
-
-static inline void
-tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, unsigned int full_mm_flush)
-{
-	tlb->mm = mm;
-	tlb->fullmm = full_mm_flush;
-
-	init_tlb_gather(tlb);
-}
-
-static inline void
-tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
-{
-	if (tlb->fullmm)
-		flush_tlb_mm(tlb->mm);
-
-	/* keep the page table cache within bounds */
-	check_pgt_cache();
-}
-
-static inline void
-tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long address)
-{
-	if (tlb->start > address)
-		tlb->start = address;
-	if (tlb->end < address + PAGE_SIZE)
-		tlb->end = address + PAGE_SIZE;
-}
-
-/*
- * In the case of tlb vma handling, we can optimise these away in the
- * case where we're doing a full MM flush.  When we're doing a munmap,
- * the vmas are adjusted to only cover the region to be torn down.
- */
-static inline void
-tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm)
-		flush_cache_range(vma, vma->vm_start, vma->vm_end);
-}
+static inline void tlb_flush(struct mmu_gather *tlb);
 
-static inline void
-tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
-{
-	if (!tlb->fullmm && tlb->end) {
-		flush_tlb_range(vma->vm_mm, tlb->start, tlb->end);
-		init_tlb_gather(tlb);
-	}
-}
+#define __tlb_remove_tlb_entry(tlb, ptep, addr) do { } while (0)
 
-static inline void tlb_flush_mmu(struct mmu_gather *tlb)
-{
-}
+#define __pte_free_tlb(tlb, ptep, addr)	pte_free((tlb)->mm, ptep)
+#define __pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
+#define __pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
 
-static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
-{
-	free_page_and_swap_cache(page);
-	return 0;
-}
+#include <asm-generic/tlb.h>
 
-static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+static inline void tlb_flush(struct mmu_gather *tlb)
 {
-	__tlb_remove_page(tlb, page);
+	flush_tlb_range(tlb->mm, tlb->start, tlb->end);
 }
 
-#define pte_free_tlb(tlb, ptep, addr)	pte_free((tlb)->mm, ptep)
-#define pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
-#define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
-
-#define tlb_migrate_finish(mm)		do { } while (0)
-
 #if defined(CONFIG_CPU_SH4) || defined(CONFIG_SUPERH64)
 extern void tlb_wire_entry(struct vm_area_struct *, unsigned long, pte_t);
 extern void tlb_unwire_entry(void);
@@ -122,13 +43,13 @@ static inline void tlb_unwire_entry(void
 
 #else /* CONFIG_MMU */
 
-#define tlb_start_vma(tlb, vma)				do { } while (0)
-#define tlb_end_vma(tlb, vma)				do { } while (0)
-#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
 #define tlb_flush(tlb)					do { } while (0)
 
+#endif /* CONFIG_MMU */
+
+#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
+
 #include <asm-generic/tlb.h>
 
-#endif /* CONFIG_MMU */
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_SH_TLB_H */



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 17:59   ` Peter Zijlstra
@ 2011-03-02 19:19     ` Linus Torvalds
  -1 siblings, 0 replies; 86+ messages in thread
From: Linus Torvalds @ 2011-03-02 19:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> In order to be able to properly support architectures that want/need to
> support TLB range invalidation, we need to change the
> flush_tlb_range() argument from a vm_area_struct to an mm_struct
> because the range might very well extend past one VMA, or not have a
> VMA at all.

I really don't think this is right. The whole "drop the icache
information" thing is a total anti-optimization, since for some
architectures, the icache flush is the _big_ deal. Possibly much
bigger than the TLB flush itself. Doing an icache flush was much more
expensive than the TLB flush on alpha, for example (the tlb had ASI's
etc, the icache did not).

> There are various reasons that we need to flush TLBs _after_ freeing
> the page-tables themselves. For some architectures (x86 among others)
> this serializes against (both hardware and software) page table
> walkers like gup_fast().

This part of the changelog also makes no sense whatsoever. It's
actively wrong.

On x86, we absolutely *must* do the TLB flush _before_ we release the
page tables. So your commentary is actively wrong and misleading.

The order has to be:
 - clear the page table entry, queue the page to be free'd
 - flush the TLB
 - free the page (and page tables)

and nothing else is correct, afaik. So the changelog is pure and utter
garbage. I didn't look at what the patch actually changed.

NAK.

                         Linus

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 19:19     ` Linus Torvalds
@ 2011-03-02 20:58       ` Rik van Riel
  -1 siblings, 0 replies; 86+ messages in thread
From: Rik van Riel @ 2011-03-02 20:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Andrea Arcangeli, Thomas Gleixner, Ingo Molnar,
	akpm, linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Russell King, Chris Metcalf, Martin Schwidefsky

On 03/02/2011 02:19 PM, Linus Torvalds wrote:

>> There are various reasons that we need to flush TLBs _after_ freeing
>> the page-tables themselves. For some architectures (x86 among others)
>> this serializes against (both hardware and software) page table
>> walkers like gup_fast().
>
> This part of the changelog also makes no sense what-so-ever. It's
> actively wrong.
>
> On x86, we absolutely *must* do the TLB flush _before_ we release the
> page tables. So your commentary is actively wrong and misleading.
>
> The order has to be:
>   - clear the page table entry, queue the page to be free'd
>   - flush the TLB
>   - free the page (and page tables)
>
> and nothing else is correct, afaik. So the changelog is pure and utter
> garbage. I didn't look at what the patch actually changed.

The patch seems to preserve the correct behaviour.

The changelog should probably read something along the
lines of:

"There are various reasons that we need to flush TLBs _after_
  clearing the page-table entries themselves."

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 19:19     ` Linus Torvalds
@ 2011-03-02 21:40       ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-02 21:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-02 at 11:19 -0800, Linus Torvalds wrote:
> On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > In order to be able to properly support architectures that want/need to
> > support TLB range invalidation, we need to change the
> > flush_tlb_range() argument from a vm_area_struct to an mm_struct
> > because the range might very well extend past one VMA, or not have a
> > VMA at all.
> 
> I really don't think this is right. The whole "drop the icache
> information" thing is a total anti-optimization, since for some
> architectures, the icache flush is the _big_ deal. 

Right, so Tile does its I-cache flush from flush_tlb_range(); I'm not
sure that's the right thing to do, since Documentation/cachetlb.txt
seems to suggest doing it from update_mmu_cache()-like hooks.

However, I really don't know, and would happily have someone explain how
these things are supposed to work. Also:

> Possibly much
> bigger than the TLB flush itself. Doing an icache flush was much more
> expensive than the TLB flush on alpha, for example (the tlb had ASI's
> etc, the icache did not).

Right, but the problem remains that we do page-table teardown without
having a vma.

Now we can re-introduce I/D variants by assuming D-only and using
tlb_start_vma() to set an I-too bit on VM_EXEC (this assumes the vm_args
range is non-executable -- which it had better be).

How about I do something like:

enum {
  TLB_FLUSH_I = 1,
  TLB_FLUSH_D = 2,
  TLB_FLUSH_PAGE = 4,
  TLB_FLUSH_HPAGE = 8,
};

void flush_tlb_range(struct mm_struct *mm, unsigned long start,
		     unsigned long end, unsigned int flags);

And we then do:

tlb_gather_mmu(struct mmu_gather *tlb, ...)
{
  ...
  tlb->flush_type = TLB_FLUSH_D | TLB_FLUSH_PAGE;
}

tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
{
  if (!tlb->fullmm)
    flush_cache_range(vma, vma->vm_start, vma->vm_end);

  if (vma->vm_flags & VM_EXEC)
    tlb->flush_type |= TLB_FLUSH_I;

  if (vma->vm_flags & VM_HUGEPAGE)
    tlb->flush_type |= TLB_FLUSH_HPAGE;
}

tlb_flush_mmu(struct mmu_gather *tlb)
{
  if (!tlb->fullmm && tlb->need_flush) {
    flush_tlb_range(tlb->mm, tlb->start, tlb->end, tlb->flush_type);	
    tlb->start = TASK_SIZE;
    tlb->end = 0;
  }
  ...
}
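
On the arch side, flush_tlb_range() could then key off those bits,
something like this (the per-entry and I-cache helpers here are purely
illustrative, not existing interfaces):

void flush_tlb_range(struct mm_struct *mm, unsigned long start,
		     unsigned long end, unsigned int flags)
{
	/* pick the invalidation granule from what was gathered */
	unsigned long step = (flags & TLB_FLUSH_HPAGE) ? HPAGE_SIZE
						       : PAGE_SIZE;
	unsigned long addr;

	for (addr = start; addr < end; addr += step)
		local_flush_tlb_page(mm, addr);		/* hypothetical */

	/* only pay for the I-cache when an executable vma was seen */
	if (flags & TLB_FLUSH_I)
		local_flush_icache_range(start, end);	/* hypothetical */
}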

> > There are various reasons that we need to flush TLBs _after_ freeing
> > the page-tables themselves. For some architectures (x86 among others)
> > this serializes against (both hardware and software) page table
> > walkers like gup_fast().
> 
> This part of the changelog also makes no sense what-so-ever. It's
> actively wrong.
> 
> On x86, we absolutely *must* do the TLB flush _before_ we release the
> page tables. So your commentary is actively wrong and misleading.
> 
> The order has to be:
>  - clear the page table entry, queue the page to be free'd
>  - flush the TLB
>  - free the page (and page tables)
> 
> and nothing else is correct, afaik. So the changelog is pure and utter
> garbage. I didn't look at what the patch actually changed.

OK, so I used the wrong terms; I meant page-table tear-down, where we
remove the pte page pointer from the pmd, remove the pmd page from the
pud, etc.

We then flush the TLBs and only then actually free the pages. I think
the confusion stems from the fact that we call the tear-down
free_pgtables().

The point was that we need to TLB flush _after_ tear-down (but before
the actual free), not before tear-down. The problem is that currently we
either end up doing too many TLB flushes or one too few.
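
In code form, the intended ordering is roughly (argument lists elided;
see unmap_region() for the real call chain):

	tlb_gather_mmu(&tlb, mm, 0);
	unmap_vmas(...);	/* clear the PTEs, queue pages on the gather */
	free_pgtables(...);	/* tear-down: unhook pte/pmd/pud pages */
	tlb_finish_mmu(...);	/* flush TLBs, and only then free everything */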




^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 21:40       ` Peter Zijlstra
@ 2011-03-02 21:47         ` David Miller
  -1 siblings, 0 replies; 86+ messages in thread
From: David Miller @ 2011-03-02 21:47 UTC (permalink / raw)
  To: a.p.zijlstra
  Cc: torvalds, aarcange, tglx, riel, mingo, akpm, linux-kernel,
	linux-arch, linux-mm, benh, hugh.dickins, mel, npiggin, rmk,
	cmetcalf, schwidefsky

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed, 02 Mar 2011 22:40:27 +0100

> On Wed, 2011-03-02 at 11:19 -0800, Linus Torvalds wrote:
>> On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>> > In order to be able to properly support architectures that want/need to
>> > support TLB range invalidation, we need to change the
>> > flush_tlb_range() argument from a vm_area_struct to an mm_struct
>> > because the range might very well extend past one VMA, or not have a
>> > VMA at all.
>> 
>> I really don't think this is right. The whole "drop the icache
>> information" thing is a total anti-optimization, since for some
>> architectures, the icache flush is the _big_ deal. 
> 
> Right, so Tile has the I-cache flush from flush_tlb_range(), I'm not
> sure if that's the right thing to do, Documentation/cachetlb.txt seems
> to suggest doing it from update_mmu_cache() like things.

Sparc32 chips that require a valid TLB entry for I-cache flushes do
the flush from flush_cache_range() and similar.

Sparc64 does not have the "present TLB entry" requirement (since I-cache
is physical), and we handle it in update_mmu_cache() but only as an
optimization.  This scheme works in concert with flush_dcache_page().

Either scheme is valid, the former is best when flushing is based upon
virtual addresses.

But I'll be the first to admit that the interfaces we have for doing
this stuff are basically nothing more than a set of hooks, with
assurances that the hooks will be called in specific situations.  Like
anything else, they've evolved over time based upon architectural needs.


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-02 21:47         ` David Miller
  (?)
@ 2011-03-03 17:22           ` Chris Metcalf
  -1 siblings, 0 replies; 86+ messages in thread
From: Chris Metcalf @ 2011-03-03 17:22 UTC (permalink / raw)
  To: David Miller
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	linux-kernel, linux-arch, linux-mm, benh, hugh.dickins, mel,
	npiggin, rmk, schwidefsky

On 3/2/2011 4:47 PM, David Miller wrote:
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed, 02 Mar 2011 22:40:27 +0100
>
>> On Wed, 2011-03-02 at 11:19 -0800, Linus Torvalds wrote:
>>> On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>>>> In order to be able to properly support architectures that want/need to
>>>> support TLB range invalidation, we need to change the
>>>> flush_tlb_range() argument from a vm_area_struct to an mm_struct
>>>> because the range might very well extend past one VMA, or not have a
>>>> VMA at all.
>>> I really don't think this is right. The whole "drop the icache
>>> information" thing is a total anti-optimization, since for some
>>> architectures, the icache flush is the _big_ deal. 
>> Right, so Tile has the I-cache flush from flush_tlb_range(), I'm not
>> sure if that's the right thing to do, Documentation/cachetlb.txt seems
>> to suggest doing it from update_mmu_cache() like things.
> Sparc32 chips that require a valid TLB entry for I-cache flushes do
> the flush from flush_cache_range() and similar.
>
> Sparc64 does not have the "present TLB entry" requirement (since I-cache
> is physical), and we handle it in update_mmu_cache() but only as an
> optimization.  This scheme works in concert with flush_dcache_page().
>
> Either scheme is valid, the former is best when flushing is based upon
> virtual addresses.
>
> But I'll be the first to admit that the interfaces we have for doing
> this stuff is basically nothing more than a set of hooks, with
> assurances that the hooks will be called in specific situations.  Like
> anything else, it's evolved over time based upon architectural needs.

I'm finding it hard to understand how the Sparc code handles icache
coherence.  It seems that the Spitfire MMU is the interesting one, but the
hard case seems to be when a process migrates around to various cores
during execution (thus leaving incoherent icache lines everywhere), and the
page is then freed and re-used for different executable code.  I'd think
that there would have to be xcall IPIs to flush all the cpus' icaches, or
to flush every core in the cpu_vm_mask plus do something at context switch,
but I don't see any of that.  No doubt I'm missing something :-)

Currently on Tile I assume that we flush icaches in cpu_vm_mask at TLB
flush time, and flush the icache on context-switch, since I'm confident I
can reason correctly about that and prove that with this model you can
never have stale icache data.  But the "every context-switch" is a
nuisance, only somewhat mitigated by the fact that with 64 cores we don't
do a lot of context-switching.

To give some more specificity to my thinking, here's one optimization we
could do on Tile that would both address Peter Zijlstra's generic
architecture in an obvious way and improve context-switch time:

- Add a "free time" field to struct page.  The free time field could be a
64-bit cycle counter value, or maybe some kind of 32-bit counter that just
increments every time we free, etc., though then we'd need to worry about
handling wraparound.  We'd record the free time when we freed the page back
to the buddy allocator.  Since we only care about executable page frees,
we'd want to use a page bit to track if a given page was ever associated
with an executable PTE, and if it wasn't, we could just record the "free
time" as zero, for book-keeping purposes.

- Keep a per-cpu "icache flush time", with the same timekeeping system as
the page free time.  Every time we flush the whole icache on a cpu, we
update its per-cpu timestamp.

- When writing an executable PTE into the page table, we'd check the
cpu_vm_mask, and any cpu that hadn't done a full icache flush since the
page in question was previously freed would be IPI'ed and would do the
icache flush, making it safe to start running code on the page with its new
code.  We'd also update a per-mm "latest free" timestamp to hold the most
recent "free time" of all the pages faulted in for that mm.

- When context-switching, we'd check the per-mm "latest free" timestamp,
and if the mm held a page that was freed more recently than that cpu's
timestamp, we'd do a full icache flush and update the per-cpu timestamp.
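
In rough code, the bookkeeping might look like this (the fields and
flush helpers are all hypothetical, named here just for illustration):

/*
 * Hypothetical fields, per the above:
 *   page->exec_free_time  -- stamp taken when an exec page was freed
 *                            (zero if never mapped executable)
 *   mm->latest_exec_free  -- newest exec_free_time faulted into the mm
 */
DEFINE_PER_CPU(u64, icache_flush_time);	/* last full I-cache flush */

/* On installing an executable PTE for @page into @mm: */
static void note_exec_fault(struct mm_struct *mm, struct page *page)
{
	int cpu;

	/* IPI only cpus that haven't flushed since the page was freed. */
	for_each_cpu(cpu, mm_cpumask(mm)) {
		if (per_cpu(icache_flush_time, cpu) < page->exec_free_time)
			ipi_flush_icache(cpu);		/* hypothetical */
	}

	if (mm->latest_exec_free < page->exec_free_time)
		mm->latest_exec_free = page->exec_free_time;
}

/* At context switch, flush only if the incoming mm might be stale: */
static void icache_check_on_switch(struct mm_struct *next)
{
	if (__this_cpu_read(icache_flush_time) < next->latest_exec_free) {
		local_flush_icache_all();		/* hypothetical */
		__this_cpu_write(icache_flush_time, get_cycles());
	}
}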

This has several good properties:

- We are unlikely to do much icache flushing, since we only do it when an
executable page is freed back to the buddy allocator and then reused as
executable again.

- If two processes share a cpu, they don't end up having to icache flush at
every context switch.

- We never need to IPI a cpu that isn't actively involved with the process
that is faulting in a new executable page.  (This is particularly important
since we want to avoid disturbing "dataplane" cpus that are running
latency-sensitive tasks.)

- We don't need to worry about vma's at flush_tlb_range() time, thus making
Peter happy :-)

I'm not worrying about kernel module executable pages, since I'm happy to
do much more heavy-weight operations for them, i.e. flush all the icaches
on all the cores.

So: does this general approach seem appropriate, or am I missing a key
subtlety of the Sparc approach that makes this all unnecessary?

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
@ 2011-03-03 17:22           ` Chris Metcalf
  0 siblings, 0 replies; 86+ messages in thread
From: Chris Metcalf @ 2011-03-03 17:22 UTC (permalink / raw)
  To: David Miller
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	linux-kernel, linux-arch, linux-mm, benh, hugh.dickins, mel,
	npiggin, rmk, schwidefsky

On 3/2/2011 4:47 PM, David Miller wrote:
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed, 02 Mar 2011 22:40:27 +0100
>
>> On Wed, 2011-03-02 at 11:19 -0800, Linus Torvalds wrote:
>>> On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>>>> In order to be able to properly support architecture that want/need to
>>>> support TLB range invalidation, we need to change the
>>>> flush_tlb_range() argument from a vm_area_struct to an mm_struct
>>>> because the range might very well extend past one VMA, or not have a
>>>> VMA at all.
>>> I really don't think this is right. The whole "drop the icache
>>> information" thing is a total anti-optimization, since for some
>>> architectures, the icache flush is the _big_ deal. 
>> Right, so Tile has the I-cache flush from flush_tlb_range(), I'm not
>> sure if that's the right thing to do, Documentation/cachetlb.txt seems
>> to suggest doing it from update_mmu_cache() like things.
> Sparc32 chips that require a valid TLB entry for I-cache flushes do
> the flush from flush_cache_range() and similar.
>
> Sparc64 does not have the "present TLB entry" requirement (since I-cache
> is physical), and we handle it in update_mmu_cache() but only as an
> optimization.  This scheme works in concert with flush_dcache_page().
>
> Either scheme is valid, the former is best when flushing is based upon
> virtual addresses.
>
> But I'll be the first to admit that the interfaces we have for doing
> this stuff is basically nothing more than a set of hooks, with
> assurances that the hooks will be called in specific situations.  Like
> anything else, it's evolved over time based upon architectural needs.

I'm finding it hard to understand how the Sparc code handles icache
coherence.  It seems that the Spitfire MMU is the interesting one, but the
hard case seems to be when a process migrates around to various cores
during execution (thus leaving incoherent icache lines everywhere), and the
page is then freed and re-used for different executable code.  I'd think
that there would have to be xcall IPIs to flush all the cpus' icaches, or
to flush every core in the cpu_vm_mask plus do something at context switch,
but I don't see any of that.  No doubt I'm missing something :-)

Currently on Tile I assume that we flush icaches in cpu_vm_mask at TLB
flush time, and flush the icache on context-switch, since I'm confident I
can reason correctly about that and prove that with this model you can
never have stale icache data.  But the "every context-switch" is a
nuisance, only somewhat mitigated by the fact that with 64 cores we don't
do a lot of context-switching.

To give some more specificity to my thinking, here's one optimization we
could do on Tile, that would both address Peter Zijlstra's generic
architecture in an obvious way, and also improve context switch time:

- Add a "free time" field to struct page.  The free time field could be a
64-bit cycle counter value, or maybe some kind of 32-bit counter that just
increments every time we free, etc., though then we'd need to worry about
handling wraparound.  We'd record the free time when we freed the page back
to the buddy allocator.  Since we only care about executable page frees,
we'd want to use a page bit to track if a given page was ever associated
with an executable PTE, and if it wasn't, we could just record the "free
time" as zero, for book-keeping purposes.

- Keep a per-cpu "icache flush time", with the same timekeeping system as
the page free time.  Every time we flush the whole icache on a cpu, we
update its per-cpu timestamp.

- When writing an executable PTE into the page table, we'd check the
cpu_vm_mask, and any cpu that hadn't done a full icache flush since the
page in question was previously freed would be IPI'ed and would do the
icache flush, making it safe to start running code on the page with its new
code.  We'd also update a per-mm "latest free" timestamp to hold the most
recent "free time" of all the pages faulted in for that mm.

- When context-switching, we'd check the per-mm "latest free" timestamp,
and if the mm held a page that was freed more recently than that cpu's
timestamp, we'd do a full icache flush and update the per-cpu timestamp.

This has several good properties:

- We are unlikely to do much icache flushing, since we only do it when an
executable page is freed back to the buddy allocator and then reused as
executable again.

- If two processes share a cpu, they don't end up having to icache flush at
every context switch.

- We never need to IPI a cpu that isn't actively involved with the process
that is faulting in a new executable page.  (This is particularly important
since we want to avoid disturbing "dataplane" cpus that are running
latency-sensitive tasks.)

- We don't need to worry about vma's at flush_tlb_range() time, thus making
Peter happy :-)

I'm not worrying about kernel module executable pages, since I'm happy to
do much more heavy-weight operations for them, i.e. flush all the icaches
on all the cores.

So: does this general approach seem appropriate, or am I missing a key
subtlety of the Sparc approach that makes this all unnecessary?

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
@ 2011-03-03 17:22           ` Chris Metcalf
  0 siblings, 0 replies; 86+ messages in thread
From: Chris Metcalf @ 2011-03-03 17:22 UTC (permalink / raw)
  To: David Miller
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	linux-kernel, linux-arch, linux-mm, benh, hugh.dickins, mel,
	npiggin, rmk, schwidefsky

On 3/2/2011 4:47 PM, David Miller wrote:
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed, 02 Mar 2011 22:40:27 +0100
>
>> On Wed, 2011-03-02 at 11:19 -0800, Linus Torvalds wrote:
>>> On Wed, Mar 2, 2011 at 9:59 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>>>> In order to be able to properly support architecture that want/need to
>>>> support TLB range invalidation, we need to change the
>>>> flush_tlb_range() argument from a vm_area_struct to an mm_struct
>>>> because the range might very well extend past one VMA, or not have a
>>>> VMA at all.
>>> I really don't think this is right. The whole "drop the icache
>>> information" thing is a total anti-optimization, since for some
>>> architectures, the icache flush is the _big_ deal. 
>> Right, so Tile has the I-cache flush from flush_tlb_range(), I'm not
>> sure if that's the right thing to do, Documentation/cachetlb.txt seems
>> to suggest doing it from update_mmu_cache() like things.
> Sparc32 chips that require a valid TLB entry for I-cache flushes do
> the flush from flush_cache_range() and similar.
>
> Sparc64 does not have the "present TLB entry" requirement (since I-cache
> is physical), and we handle it in update_mmu_cache() but only as an
> optimization.  This scheme works in concert with flush_dcache_page().
>
> Either scheme is valid, the former is best when flushing is based upon
> virtual addresses.
>
> But I'll be the first to admit that the interfaces we have for doing
> this stuff is basically nothing more than a set of hooks, with
> assurances that the hooks will be called in specific situations.  Like
> anything else, it's evolved over time based upon architectural needs.

I'm finding it hard to understand how the Sparc code handles icache
coherence.  It seems that the Spitfire MMU is the interesting one, but the
hard case seems to be when a process migrates around to various cores
during execution (thus leaving incoherent icache lines everywhere), and the
page is then freed and re-used for different executable code.  I'd think
that there would have to be xcall IPIs to flush all the cpus' icaches, or
to flush every core in the cpu_vm_mask plus do something at context switch,
but I don't see any of that.  No doubt I'm missing something :-)

Currently on Tile I assume that we flush icaches in cpu_vm_mask at TLB
flush time, and flush the icache on context-switch, since I'm confident I
can reason correctly about that and prove that with this model you can
never have stale icache data.  But the "every context-switch" is a
nuisance, only somewhat mitigated by the fact that with 64 cores we don't
do a lot of context-switching.

To give some more specificity to my thinking, here's one optimization we
could do on Tile, that would both address Peter Zijlstra's generic
architecture in an obvious way, and also improve context switch time:

- Add a "free time" field to struct page.  The free time field could be a
64-bit cycle counter value, or maybe some kind of 32-bit counter that just
increments every time we free, etc., though then we'd need to worry about
handling wraparound.  We'd record the free time when we freed the page back
to the buddy allocator.  Since we only care about executable page frees,
we'd want to use a page bit to track if a given page was ever associated
with an executable PTE, and if it wasn't, we could just record the "free
time" as zero, for book-keeping purposes.

- Keep a per-cpu "icache flush time", with the same timekeeping system as
the page free time.  Every time we flush the whole icache on a cpu, we
update its per-cpu timestamp.

- When writing an executable PTE into the page table, we'd check the
cpu_vm_mask, and any cpu that hadn't done a full icache flush since the
page in question was previously freed would be IPI'd to do the icache
flush, making it safe to start running the page's new code.  We'd also
update a per-mm "latest free" timestamp to hold the most recent "free
time" of all the pages faulted in for that mm.

- When context-switching, we'd check the per-mm "latest free" timestamp,
and if the mm held a page that was freed more recently than that cpu's
timestamp, we'd do a full icache flush and update the per-cpu timestamp.
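
A minimal sketch of the mechanics, with entirely made-up names: the
free_time and latest_free fields are the proposed additions to struct
page and struct mm_struct, and local_flush_icache_all() stands in for
whatever arch helper does the full local flush.  Illustration only, not
working code:

#include <linux/mm_types.h>
#include <linux/percpu.h>
#include <linux/smp.h>
#include <asm/timex.h>

static DEFINE_PER_CPU(u64, icache_flush_time);

static void ipi_flush_icache(void *unused)
{
	local_flush_icache_all();		/* hypothetical arch helper */
	__this_cpu_write(icache_flush_time, get_cycles());
}

/* Faulting in an executable PTE: IPI only those cpus in cpu_vm_mask
 * that haven't done a full icache flush since this page was freed. */
static void sync_icache_for_exec(struct mm_struct *mm, struct page *page)
{
	int cpu;

	for_each_cpu(cpu, mm_cpumask(mm))
		if (per_cpu(icache_flush_time, cpu) < page->free_time)
			smp_call_function_single(cpu, ipi_flush_icache,
						 NULL, 1);
	if (page->free_time > mm->latest_free)
		mm->latest_free = page->free_time;
}

/* Context switch: flush only if this mm faulted in a page freed more
 * recently than our last full icache flush. */
static void maybe_flush_icache(struct mm_struct *next)
{
	if (next->latest_free > __this_cpu_read(icache_flush_time))
		ipi_flush_icache(NULL);
}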

This has several good properties:

- We are unlikely to do much icache flushing, since we only do it when an
executable page is freed back to the buddy allocator and then reused as
executable again.

- If two processes share a cpu, they don't end up having to icache flush at
every context switch.

- We never need to IPI a cpu that isn't actively involved with the process
that is faulting in a new executable page.  (This is particularly important
since we want to avoid disturbing "dataplane" cpus that are running
latency-sensitive tasks.)

- We don't need to worry about vma's at flush_tlb_range() time, thus making
Peter happy :-)

I'm not worrying about kernel module executable pages, since I'm happy to
do much more heavy-weight operations for them, i.e. flush all the icaches
on all the cores.

So: does this general approach seem appropriate, or am I missing a key
subtlety of the Sparc approach that makes this all unnecessary?

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-03 17:22           ` Chris Metcalf
@ 2011-03-03 18:45             ` David Miller
  -1 siblings, 0 replies; 86+ messages in thread
From: David Miller @ 2011-03-03 18:45 UTC (permalink / raw)
  To: cmetcalf
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	linux-kernel, linux-arch, linux-mm, benh, hugh.dickins, mel,
	npiggin, rmk, schwidefsky

From: Chris Metcalf <cmetcalf@tilera.com>
Date: Thu, 3 Mar 2011 12:22:37 -0500

> I'm finding it hard to understand how the Sparc code handles icache
> coherence.  It seems that the Spitfire MMU is the interesting one, but the
> hard case seems to be when a process migrates around to various cores
> during execution (thus leaving incoherent icache lines everywhere), and the
> page is then freed and re-used for different executable code.  I'd think
> that there would have to be xcall IPIs to flush all the cpus' icaches, or
> to flush every core in the cpu_vm_mask plus do something at context switch,
> but I don't see any of that.  No doubt I'm missing something :-)

flush_dcache_page() remembers the cpu that wrote to the page (in the
page flags), and cross-calls to that specific cpu.

It is only that cpu which must flush its I-cache, since all other cpus
saw the write on the bus and updated their I-cache lines as a result.

See, in the sparc64 case, the incoherency issue is purely local to the
store.  The problem case is specifically the local I-cache not seeing
local writes; everything else is fine.  CPU I-caches see writes done
by other cpus, just not those done by the local cpu.
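
In very rough pseudocode (the accessor below is invented; the real
sparc64 code packs the writer cpu into spare page->flags bits, and the
flush helper is a placeholder):

#include <linux/mm_types.h>
#include <linux/smp.h>

static void ipi_flush_icache(void *unused)
{
	local_flush_icache();	/* placeholder for the arch flush */
}

static void flush_for_writer(struct page *page)
{
	/* page_writer_cpu() is hypothetical: it reads the cpu id that
	 * flush_dcache_page() stashed in page->flags, or -1 if none. */
	int writer = page_writer_cpu(page);

	if (writer >= 0)
		smp_call_function_single(writer, ipi_flush_icache,
					 NULL, 1);
}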


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct
  2011-03-03 18:45             ` David Miller
  (?)
@ 2011-03-03 18:56               ` Chris Metcalf
  -1 siblings, 0 replies; 86+ messages in thread
From: Chris Metcalf @ 2011-03-03 18:56 UTC (permalink / raw)
  To: David Miller
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	linux-kernel, linux-arch, linux-mm, benh, hugh.dickins, mel,
	npiggin, rmk, schwidefsky

On 3/3/2011 1:45 PM, David Miller wrote:
>> I'm finding it hard to understand how the Sparc code handles icache
>> coherence.  It seems that the Spitfire MMU is the interesting one, but the
>> hard case seems to be when a process migrates around to various cores
>> during execution (thus leaving incoherent icache lines everywhere), and the
>> page is then freed and re-used for different executable code.  I'd think
>> that there would have to be xcall IPIs to flush all the cpus' icaches, or
>> to flush every core in the cpu_vm_mask plus do something at context switch,
>> but I don't see any of that.  No doubt I'm missing something :-)
> flush_dcache_page() remembers the cpu that wrote to the page (in the
> page flags), and cross-calls to that specific cpu.
>
> It is only that cpu which must flush his I-cache, since all other cpus
> saw the write on the bus and updated their I-cache lines as a result.
>
> See, in the sparc64 case, the incoherency issue is purely local to the
> store.  The problem case is specifically the local I-cache not seeing
> local writes, everything else is fine.  CPU I-caches see writes done
> by other cpus, just not those done by the local cpu.

Thanks, that makes sense.  Our architecture has no bus to snoop, so we
couldn't take advantage of that approach.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-02 17:59   ` Peter Zijlstra
@ 2011-03-09 15:16     ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2011-03-09 15:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

Hi Peter,

On 2 March 2011 17:59, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> --- linux-2.6.orig/arch/arm/include/asm/tlb.h
> +++ linux-2.6/arch/arm/include/asm/tlb.h
[...]
> +__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
>  {
>        pgtable_page_dtor(pte);
> -       tlb_add_flush(tlb, addr);
>        tlb_remove_page(tlb, pte);
>  }

I think we still need a tlb_track_range() call here. On the path to
pte_free_tlb() (for example shift_arg_pages ... free_pte_range) there
doesn't seem to be any code setting the tlb->start/end range. Did I
miss anything?

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-09 15:16     ` Catalin Marinas
@ 2011-03-09 15:19       ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-09 15:19 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-09 at 15:16 +0000, Catalin Marinas wrote:
> Hi Peter,
> 
> On 2 March 2011 17:59, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > --- linux-2.6.orig/arch/arm/include/asm/tlb.h
> > +++ linux-2.6/arch/arm/include/asm/tlb.h
> [...]
> > +__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
> >  {
> >        pgtable_page_dtor(pte);
> > -       tlb_add_flush(tlb, addr);
> >        tlb_remove_page(tlb, pte);
> >  }
> 
> I think we still need a tlb_track_range() call here. On the path to
> pte_free_tlb() (for example shift_arg_pages ... free_pte_range) there
> doesn't seem to be any code setting the tlb->start/end range. Did I
> miss anything?

Patch 3 included:

-#define pte_free_tlb(tlb, ptep, address)                       \
-       do {                                                    \
-               tlb->need_flush = 1;                            \
-               __pte_free_tlb(tlb, ptep, address);             \
+#define pte_free_tlb(tlb, ptep, address)                                       \
+       do {                                                                    \
+               tlb->need_flush = 1;                                            \
+               tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
+               __pte_free_tlb(tlb, ptep, address);                             \
        } while (0)
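
(For context, tlb_track_range() just widens the gather's tracked
start/end window; a minimal sketch of its shape, which may differ from
the exact code in the series:)

static inline void tlb_track_range(struct mmu_gather *tlb,
				   unsigned long addr, unsigned long end)
{
	if (tlb->start > addr)
		tlb->start = addr;
	if (tlb->end < end)
		tlb->end = end;
}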

Also, I posted a new version of this series here:

  https://lkml.org/lkml/2011/3/7/308

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-09 15:19       ` Peter Zijlstra
@ 2011-03-09 15:36         ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2011-03-09 15:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-09 at 15:19 +0000, Peter Zijlstra wrote:
> On Wed, 2011-03-09 at 15:16 +0000, Catalin Marinas wrote:
> > Hi Peter,
> >
> > On 2 March 2011 17:59, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > --- linux-2.6.orig/arch/arm/include/asm/tlb.h
> > > +++ linux-2.6/arch/arm/include/asm/tlb.h
> > [...]
> > > +__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
> > >  {
> > >        pgtable_page_dtor(pte);
> > > -       tlb_add_flush(tlb, addr);
> > >        tlb_remove_page(tlb, pte);
> > >  }
> >
> > I think we still need a tlb_track_range() call here. On the path to
> > pte_free_tlb() (for example shift_arg_pages ... free_pte_range) there
> > doesn't seem to be any code setting the tlb->start/end range. Did I
> > miss anything?
> 
> Patch 3 included:
> 
> -#define pte_free_tlb(tlb, ptep, address)                       \
> -       do {                                                    \
> -               tlb->need_flush = 1;                            \
> -               __pte_free_tlb(tlb, ptep, address);             \
> +#define pte_free_tlb(tlb, ptep, address)                                       \
> +       do {                                                                    \
> +               tlb->need_flush = 1;                                            \
> +               tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
> +               __pte_free_tlb(tlb, ptep, address);                             \
>         } while (0)

OK, so the range is tracked. The only issue is that for platforms with a
folded pmd the range end would go to TASK_SIZE. In this case
pgd_addr_end() would make more sense (or something like
PTRS_PER_PTE*PAGE_SIZE).

-- 
Catalin



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-09 15:36         ` Catalin Marinas
@ 2011-03-09 15:39           ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-09 15:39 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-09 at 15:36 +0000, Catalin Marinas wrote:
> On Wed, 2011-03-09 at 15:19 +0000, Peter Zijlstra wrote:
> > On Wed, 2011-03-09 at 15:16 +0000, Catalin Marinas wrote:
> > > Hi Peter,
> > >
> > > On 2 March 2011 17:59, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > > --- linux-2.6.orig/arch/arm/include/asm/tlb.h
> > > > +++ linux-2.6/arch/arm/include/asm/tlb.h
> > > [...]
> > > > +__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
> > > >  {
> > > >        pgtable_page_dtor(pte);
> > > > -       tlb_add_flush(tlb, addr);
> > > >        tlb_remove_page(tlb, pte);
> > > >  }
> > >
> > > I think we still need a tlb_track_range() call here. On the path to
> > > pte_free_tlb() (for example shift_arg_pages ... free_pte_range) there
> > > doesn't seem to be any code setting the tlb->start/end range. Did I
> > > miss anything?
> > 
> > Patch 3 included:
> > 
> > -#define pte_free_tlb(tlb, ptep, address)                       \
> > -       do {                                                    \
> > -               tlb->need_flush = 1;                            \
> > -               __pte_free_tlb(tlb, ptep, address);             \
> > +#define pte_free_tlb(tlb, ptep, address)                                       \
> > +       do {                                                                    \
> > +               tlb->need_flush = 1;                                            \
> > +               tlb_track_range(tlb, address, pmd_addr_end(address, TASK_SIZE));\
> > +               __pte_free_tlb(tlb, ptep, address);                             \
> >         } while (0)
> 
> OK, so the range is tracked. The only issue is that for platforms with a
> folded pmd the range end would go to TASK_SIZE. In this case
> pgd_addr_end() would make more sense (or something like
> PTRS_PER_PTE*PAGE_SIZE).

Urgh, so when pmds are folded pmd_addr_end() doesn't get to be the next
biggest thing?  PTRS_PER_PTE*PAGE_SIZE-style things don't work either,
since there is no guarantee that addr is at the beginning of the pmd.
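
For reference, the stock generic definitions make the folding behaviour
explicit.  With a real pmd level (include/asm-generic/pgtable.h) the
end is clipped to the next pmd boundary, so an unaligned addr is fine:

#define pmd_addr_end(addr, end)						\
({	unsigned long __boundary = ((addr) + PMD_SIZE) & PMD_MASK;	\
	(__boundary - 1 < (end) - 1) ? __boundary : (end);		\
})

whereas with a folded pmd (include/asm-generic/pgtable-nopmd.h) it
degenerates to the passed-in end, which is how TASK_SIZE leaks through
above:

#define pmd_addr_end(addr, end)	(end)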

Ok, will try and sort that out.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-09 15:39           ` Peter Zijlstra
@ 2011-03-09 15:48             ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2011-03-09 15:48 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-09 at 16:39 +0100, Peter Zijlstra wrote:
> 
> Ok, will try and sort that out. 

We could do something like the below and use the end passed down; since
it propagates top-down, it should already be clipped to the appropriate
size.  It just means touching all the p??_free_tlb() implementations ;-)

Will do on the next iteration ;-)

---

diff --git a/mm/memory.c b/mm/memory.c
index 5823698..833bd90 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -222,11 +222,11 @@ void pmd_clear_bad(pmd_t *pmd)
  * has been handled earlier when unmapping all the memory regions.
  */
 static void free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
-			   unsigned long addr)
+			   unsigned long addr, unsigned long end)
 {
 	pgtable_t token = pmd_pgtable(*pmd);
 	pmd_clear(pmd);
-	pte_free_tlb(tlb, token, addr);
+	pte_free_tlb(tlb, token, addr, end);
 	tlb->mm->nr_ptes--;
 }
 
@@ -244,7 +244,7 @@ static inline void free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 		next = pmd_addr_end(addr, end);
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
-		free_pte_range(tlb, pmd, addr);
+		free_pte_range(tlb, pmd, addr, next);
 	} while (pmd++, addr = next, addr != end);
 
 	start &= PUD_MASK;
@@ -260,7 +260,7 @@ static inline void free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 
 	pmd = pmd_offset(pud, start);
 	pud_clear(pud);
-	pmd_free_tlb(tlb, pmd, start);
+	pmd_free_tlb(tlb, pmd, start, end);
 }
 
 static inline void free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
@@ -293,7 +293,7 @@ static inline void free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
 
 	pud = pud_offset(pgd, start);
 	pgd_clear(pgd);
-	pud_free_tlb(tlb, pud, start);
+	pud_free_tlb(tlb, pud, start, end);
 }
 
 /*




^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-09 15:48             ` Peter Zijlstra
@ 2011-03-09 16:34               ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2011-03-09 16:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, 2011-03-09 at 15:48 +0000, Peter Zijlstra wrote:
> On Wed, 2011-03-09 at 16:39 +0100, Peter Zijlstra wrote:
> >
> > Ok, will try and sort that out.
> 
> We could do something like the below and use the end passed down, which
> because it goes top down should be clipped at the appropriate size, just
> means touching all the p??_free_tlb() implementations ;-)

Looks fine to me (apart from the hassle to change the p??_free_tlb()
definitions).

-- 
Catalin



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH] arch/tile: optimize icache flush
  2011-03-03 17:22           ` Chris Metcalf
  (?)
@ 2011-03-10 18:05             ` Chris Metcalf
  -1 siblings, 0 replies; 86+ messages in thread
From: Chris Metcalf @ 2011-03-10 18:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: a.p.zijlstra, torvalds, aarcange, tglx, riel, mingo, akpm,
	David Miller <davem@davemloft.net>,
	linux-mm, benh, hugh.dickins, mel, npiggin, rmk, schwidefsky

Tile has incoherent icaches, so they must be explicitly invalidated
when necessary.  Until now we have done so at tlb flush and context
switch time, which means more invalidation than strictly necessary.
The new model for icache flush is:

- When we fault in a page as executable, we set an "Exec" bit in the
  "struct page" information; the bit stays set until page free time.
  (We use the arch_1 page bit for our "Exec" bit.)

- At page free time, if the Exec bit is set, we do an icache flush.
  This should happen relatively rarely: e.g., deleting a binary from disk,
  or evicting a binary's pages from the page cache due to memory pressure.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
---

This change was motivated initially by Peter Zijlstra's attempt to
handle flush_tlb_range() without the vm_flags.  Since we no longer do I$
flushing at TLB flush time, this is arguably a step in that direction,
though in practice Peter gave up and is now passing the vm_flags anyway.

This change is also much simpler than the one I proposed a few days ago:

https://lkml.org/lkml/2011/3/3/284

since I decided that it would be overkill to track page free times in
the struct page and compare it as part of this.

Note that Tilera's shipping sources include some amortizing code that
collects freed pages on a separate list if they need any kind of special
cache-management attention, so there we defer the icache flushes until
even later.  That code involves enough hooks into the
platform-independent kernel mm sources that I haven't tried to push it
back to the community yet; it's basically performance-optimization code
for our architecture.

 arch/tile/include/asm/page.h    |    3 +++
 arch/tile/include/asm/pgtable.h |   18 +++++++++++++-----
 arch/tile/kernel/module.c       |    1 +
 arch/tile/kernel/tlb.c          |   24 ++++++------------------
 arch/tile/mm/homecache.c        |   14 ++++++++++++++
 arch/tile/mm/init.c             |    4 ++++
 6 files changed, 41 insertions(+), 23 deletions(-)

diff --git a/arch/tile/include/asm/page.h b/arch/tile/include/asm/page.h
index 3eb5352..24e0f8c 100644
--- a/arch/tile/include/asm/page.h
+++ b/arch/tile/include/asm/page.h
@@ -324,6 +324,9 @@ static inline int pfn_valid(unsigned long pfn)
 struct mm_struct;
 extern pte_t *virt_to_pte(struct mm_struct *mm, unsigned long addr);
 
+void arch_free_page(struct page *page, int order);
+#define HAVE_ARCH_FREE_PAGE
+
 #endif /* !__ASSEMBLY__ */
 
 #define VM_DATA_DEFAULT_FLAGS \
diff --git a/arch/tile/include/asm/pgtable.h b/arch/tile/include/asm/pgtable.h
index 1a20b7e..39a2c3d 100644
--- a/arch/tile/include/asm/pgtable.h
+++ b/arch/tile/include/asm/pgtable.h
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>
+#include <linux/page-flags.h>
 #include <asm/processor.h>
 #include <asm/fixmap.h>
 #include <asm/system.h>
@@ -351,11 +352,18 @@ do {						\
 	local_flush_tlb_page(FLUSH_NONEXEC, (vaddr), PAGE_SIZE); \
 } while (0)
 
-/*
- * The kernel page tables contain what we need, and we flush when we
- * change specific page table entries.
- */
-#define update_mmu_cache(vma, address, pte) do { } while (0)
+/* Use this bit to track whether a page may have been cached in an icache. */
+PAGEFLAG(Exec, arch_1)
+__TESTCLEARFLAG(Exec, arch_1)
+
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+				    unsigned long address,
+				    pte_t *pte)
+{
+	pte_t pteval = *pte;
+	if (pte_exec(pteval))
+		SetPageExec(pte_page(pteval));
+}
 
 #ifdef CONFIG_FLATMEM
 #define kern_addr_valid(addr)	(1)
diff --git a/arch/tile/kernel/module.c b/arch/tile/kernel/module.c
index e2ab82b..37615b3 100644
--- a/arch/tile/kernel/module.c
+++ b/arch/tile/kernel/module.c
@@ -61,6 +61,7 @@ void *module_alloc(unsigned long size)
 		pages[i] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
 		if (!pages[i])
 			goto error;
+		SetPageExec(pages[i]);   /* do icache flush when we free */
 	}
 
 	area = __get_vm_area(size, VM_ALLOC, MEM_MODULE_START, MEM_MODULE_END);
diff --git a/arch/tile/kernel/tlb.c b/arch/tile/kernel/tlb.c
index 2dffc10..e8b9062 100644
--- a/arch/tile/kernel/tlb.c
+++ b/arch/tile/kernel/tlb.c
@@ -23,13 +23,6 @@
 DEFINE_PER_CPU(int, current_asid);
 int min_asid, max_asid;
 
-/*
- * Note that we flush the L1I (for VM_EXEC pages) as well as the TLB
- * so that when we are unmapping an executable page, we also flush it.
- * Combined with flushing the L1I at context switch time, this means
- * we don't have to do any other icache flushes.
- */
-
 void flush_tlb_mm(struct mm_struct *mm)
 {
 	HV_Remote_ASID asids[NR_CPUS];
@@ -40,8 +33,7 @@ void flush_tlb_mm(struct mm_struct *mm)
 		asid->x = cpu % smp_topology.width;
 		asid->asid = per_cpu(current_asid, cpu);
 	}
-	flush_remote(0, HV_FLUSH_EVICT_L1I, &mm->cpu_vm_mask,
-		     0, 0, 0, NULL, asids, i);
+	flush_remote(0, 0, NULL, 0, 0, 0, NULL, asids, i);
 }
 
 void flush_tlb_current_task(void)
@@ -53,9 +45,7 @@ void flush_tlb_page_mm(const struct vm_area_struct *vma, struct mm_struct *mm,
 		       unsigned long va)
 {
 	unsigned long size = hv_page_size(vma);
-	int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
-	flush_remote(0, cache, &mm->cpu_vm_mask,
-		     va, size, size, &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, 0, NULL, va, size, size, &mm->cpu_vm_mask, NULL, 0);
 }
 
 void flush_tlb_page(const struct vm_area_struct *vma, unsigned long va)
@@ -68,10 +58,8 @@ void flush_tlb_range(const struct vm_area_struct *vma,
 		     unsigned long start, unsigned long end)
 {
 	unsigned long size = hv_page_size(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	int cache = (vma->vm_flags & VM_EXEC) ? HV_FLUSH_EVICT_L1I : 0;
-	flush_remote(0, cache, &mm->cpu_vm_mask, start, end - start, size,
-		     &mm->cpu_vm_mask, NULL, 0);
+	flush_remote(0, 0, NULL, start, end - start, size,
+		     &vma->vm_mm->cpu_vm_mask, NULL, 0);
 }
 
 void flush_tlb_all(void)
@@ -81,7 +69,7 @@ void flush_tlb_all(void)
 		HV_VirtAddrRange r = hv_inquire_virtual(i);
 		if (r.size == 0)
 			break;
-		flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+		flush_remote(0, 0, NULL,
 			     r.start, r.size, PAGE_SIZE, cpu_online_mask,
 			     NULL, 0);
 		flush_remote(0, 0, NULL,
@@ -92,6 +80,6 @@ void flush_tlb_all(void)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-	flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+	flush_remote(0, 0, NULL,
 		     start, end - start, PAGE_SIZE, cpu_online_mask, NULL, 0);
 }
diff --git a/arch/tile/mm/homecache.c b/arch/tile/mm/homecache.c
index cbe6f4f..e020b54 100644
--- a/arch/tile/mm/homecache.c
+++ b/arch/tile/mm/homecache.c
@@ -455,3 +455,17 @@ void homecache_free_pages(unsigned long addr, unsigned int order)
 			__free_page(page++);
 	}
 }
+
+/*
+ * When freeing a page that was executable, we flush all icaches to
+ * avoid incoherence.  This should be relatively rare, e.g. deleting a
+ * binary or evicting an executable page-cache page.  Enabling dynamic
+ * homecaching support amortizes this overhead even further.
+ */
+void arch_free_page(struct page *page, int order)
+{
+	if (__TestClearPageExec(page)) {
+		flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+			     0, 0, 0, NULL, NULL, 0);
+	}
+}
diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index d6e87fd..6332347 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -1082,4 +1082,8 @@ void free_initmem(void)
 
 	/* Do a global TLB flush so everyone sees the changes. */
 	flush_tlb_all();
+
+	/* Do a global L1I flush now that we've freed kernel text. */
+	flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+		     0, 0, 0, NULL, NULL, 0);
 }
-- 
1.6.5.2


^ permalink raw reply related	[flat|nested] 86+ messages in thread

+ * homecaching support amortizes this overhead even further.
+ */
+void arch_free_page(struct page *page, int order)
+{
+	if (__TestClearPageExec(page)) {
+		flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+			     0, 0, 0, NULL, NULL, 0);
+	}
+}
diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index d6e87fd..6332347 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -1082,4 +1082,8 @@ void free_initmem(void)
 
 	/* Do a global TLB flush so everyone sees the changes. */
 	flush_tlb_all();
+
+	/* Do a global L1I flush now that we've freed kernel text. */
+	flush_remote(0, HV_FLUSH_EVICT_L1I, cpu_online_mask,
+		     0, 0, 0, NULL, NULL, 0);
 }
-- 
1.6.5.2

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH] arch/tile: optimize icache flush
  2011-03-10 18:05             ` Chris Metcalf
@ 2011-03-10 23:19               ` Rik van Riel
  -1 siblings, 0 replies; 86+ messages in thread
From: Rik van Riel @ 2011-03-10 23:19 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: linux-kernel, a.p.zijlstra, torvalds, aarcange, tglx, mingo,
	akpm, David Miller <davem@davemloft.net>,
	linux-mm, benh, hugh.dickins, mel, npiggin, rmk, schwidefsky

On 03/10/2011 01:05 PM, Chris Metcalf wrote:
> Tile has incoherent icaches, so they must be explicitly invalidated
> when necessary.  Until now we have done so at tlb flush and context
> switch time, which means more invalidation than strictly necessary.
> The new model for icache flush is:
>
> - When we fault in a page as executable, we set an "Exec" bit in the
>    "struct page" information; the bit stays set until page free time.
>    (We use the arch_1 page bit for our "Exec" bit.)
>
> - At page free time, if the Exec bit is set, we do an icache flush.
>    This should happen relatively rarely: e.g., deleting a binary from disk,
>    or evicting a binary's pages from the page cache due to memory pressure.
>
> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

Nice trick.

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2011-03-02 17:59   ` Peter Zijlstra
@ 2012-05-17  3:05     ` Paul Mundt
  -1 siblings, 0 replies; 86+ messages in thread
From: Paul Mundt @ 2012-05-17  3:05 UTC (permalink / raw)
  To: Peter Zijlstra, Catalin Marinas
  Cc: Andrea Arcangeli, Thomas Gleixner, Rik van Riel, Ingo Molnar,
	akpm, Linus Torvalds, linux-kernel, linux-arch, linux-mm,
	Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman,
	Nick Piggin, Russell King, Chris Metcalf, Martin Schwidefsky

On Wed, Mar 02, 2011 at 06:59:32PM +0100, Peter Zijlstra wrote:
> Might want to optimize the tlb_flush() function to do a full mm flush
> when the range is 'large', IA64 does this too.
> 
> Cc: Russell King <rmk@arm.linux.org.uk>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

The current version in tlb-unify blows up due to a missing
tlb_add_flush() definition. I can see in this thread tlb_track_range()
was factored in, but the __pte_free_tlb()/__pmd_free_tlb() semantics have
changed since then. Adding a dumb tlb_add_flush() that wraps into
tlb_track_range() seems to do the right thing, but someone more familiar
with LPAE and ARM's double PMDs will have to figure out whether the
tlb_track_range() calls in asm-generic/tlb.h's pmd/pte_free_tlb() are
sufficient to remove the tlb_add_flush() calls or not.

Here's the dumb build fix for now though:

Signed-off-by: Paul Mundt <lethal@linux-sh.org>

---

diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 37dbce9..1de4b21 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -38,6 +38,11 @@ __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr);
 
 #include <asm-generic/tlb.h>
 
+static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
+{
+	tlb_track_range(tlb, addr, addr + PAGE_SIZE);
+}
+
 static inline void
 __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
 {

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17  3:05     ` Paul Mundt
@ 2012-05-17  9:30       ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17  9:30 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Peter Zijlstra, Andrea Arcangeli, Thomas Gleixner, Rik van Riel,
	Ingo Molnar, akpm, Linus Torvalds, linux-kernel, linux-arch,
	linux-mm, Benjamin Herrenschmidt, David Miller, Hugh Dickins,
	Mel Gorman, Nick Piggin, Russell King, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 04:05:52AM +0100, Paul Mundt wrote:
> On Wed, Mar 02, 2011 at 06:59:32PM +0100, Peter Zijlstra wrote:
> > Might want to optimize the tlb_flush() function to do a full mm flush
> > when the range is 'large', IA64 does this too.
> > 
> > Cc: Russell King <rmk@arm.linux.org.uk>
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> 
> The current version in tlb-unify blows up due to a missing
> tlb_add_flush() definition. I can see in this thread tlb_track_range()
> was factored in, but the __pte_free_tlb()/__pmd_free_tlb() semantics have
> changed since then. Adding a dumb tlb_add_flush() that wraps into
> tlb_track_range() seems to do the right thing, but someone more familiar
> with LPAE and ARM's double PMDs will have to figure out whether the
> tlb_track_range() calls in asm-generic/tlb.h's pmd/pte_free_tlb() are
> sufficient to remove the tlb_add_flush() calls or not.
> 
> Here's the dumb build fix for now though:
> 
> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
> 
> ---
> 
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index 37dbce9..1de4b21 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -38,6 +38,11 @@ __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr);
>  
>  #include <asm-generic/tlb.h>
>  
> +static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
> +{
> +	tlb_track_range(tlb, addr, addr + PAGE_SIZE);
> +}


I think that's still needed in case the range given to pte_free_tlb()
does not cover both pmd entries (1MB each) that the classic ARM MMU
uses. But we could call tlb_track_range() directly rather than adding a
tlb_add_flush() function (untested):


diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 37dbce9..efe2831 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -42,15 +46,14 @@ static inline void
 __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
 {
 	pgtable_page_dtor(pte);
-
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * With the classic ARM MMU, a pte page has two corresponding pmd
 	 * entries, each covering 1MB.
 	 */
-	addr &= PMD_MASK;
-	tlb_add_flush(tlb, addr + SZ_1M - PAGE_SIZE);
-	tlb_add_flush(tlb, addr + SZ_1M);
-
+	addr = (addr & PMD_MASK) + SZ_1M;
+	tlb_track_range(tlb, addr - PAGE_SIZE, addr + PAGE_SIZE);
+#endif
 	tlb_remove_page(tlb, pte);
 }
 
@@ -58,7 +61,6 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
 				  unsigned long addr)
 {
 #ifdef CONFIG_ARM_LPAE
-	tlb_add_flush(tlb, addr);
 	tlb_remove_page(tlb, virt_to_page(pmdp));
 #endif
 }


Another minor thing is that on newer ARM processors (Cortex-A15) we
need the TLB shootdown even on UP systems, so tlb_fast_mode should
always return 0. Something like below (untested):


diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 37dbce9..8e79689 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -23,6 +23,10 @@
 
 #include <linux/pagemap.h>
 
+#ifdef CONFIG_CPU_32v7
+#define tlb_fast_mode(tlb)	(0)
+#endif
+
 #include <asm-generic/tlb.h>
 
 #else /* !CONFIG_MMU */
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 90a725c..9ddf7ee 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -194,6 +194,7 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 
 #endif /* CONFIG_HAVE_MMU_GATHER_RANGE */
 
+#ifndef tlb_fast_mode
 static inline int tlb_fast_mode(struct mmu_gather *tlb)
 {
 #ifdef CONFIG_SMP
@@ -206,6 +207,7 @@ static inline int tlb_fast_mode(struct mmu_gather *tlb)
 	return 1;
 #endif
 }
+#endif
 
 void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, bool fullmm);
 void tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end);


-- 
Catalin

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17  9:30       ` Catalin Marinas
@ 2012-05-17  9:39         ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17  9:39 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Peter Zijlstra, Andrea Arcangeli, Thomas Gleixner, Rik van Riel,
	Ingo Molnar, akpm, Linus Torvalds, linux-kernel, linux-arch,
	linux-mm, Benjamin Herrenschmidt, David Miller, Hugh Dickins,
	Mel Gorman, Nick Piggin, Russell King, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> Another minor thing is that on newer ARM processors (Cortex-A15) we
> need the TLB shootdown even on UP systems, so tlb_fast_mode should
> always return 0. Something like below (untested):
> 
> 
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index 37dbce9..8e79689 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -23,6 +23,10 @@
>  
>  #include <linux/pagemap.h>
>  
> +#ifdef CONFIG_CPU_32v7
> +#define tlb_fast_mode(tlb)	(0)
> +#endif
> +
>  #include <asm-generic/tlb.h>
>  
>  #else /* !CONFIG_MMU */

This hunk should have been a few lines down for the CONFIG_MMU case.

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17  9:30       ` Catalin Marinas
@ 2012-05-17  9:51         ` Russell King
  -1 siblings, 0 replies; 86+ messages in thread
From: Russell King @ 2012-05-17  9:51 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Paul Mundt, Peter Zijlstra, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> Another minor thing is that on newer ARM processors (Cortex-A15) we
> need the TLB shootdown even on UP systems, so tlb_fast_mode should
> always return 0. Something like below (untested):

No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
or SMP, not just for A15, because of the speculative prefetch which can
re-load TLB entries from the page tables at _any_ time.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17  9:51         ` Russell King
@ 2012-05-17 11:28           ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 11:28 UTC (permalink / raw)
  To: Russell King
  Cc: Catalin Marinas, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 10:51 +0100, Russell King wrote:
> On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> > Another minor thing is that on newer ARM processors (Cortex-A15) we
> > need the TLB shootdown even on UP systems, so tlb_fast_mode should
> > always return 0. Something like below (untested):
> 
> No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
> or SMP, not just for A15, because of the speculative prefetch which can
> re-load TLB entries from the page tables at _any_ time.

Hmm, so this is mostly because of the confusion/coupling between
tlb_remove_page() and tlb_remove_table(), I guess, since I don't see the
freeing of the actual pages being a problem with speculative TLB
reloads, just the page-tables.

Should we introduce a tlb_remove_table() regardless of
HAVE_RCU_TABLE_FREE which always queues the tables regardless of
tlb_fast_mode()? 
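
Something along these lines, perhaps (completely untested sketch; the
batch layout and names mirror the existing HAVE_RCU_TABLE_FREE code but
are assumptions here, and batch allocation/teardown is elided):

#define MAX_TABLE_BATCH	32	/* arbitrary for this sketch */

struct mmu_table_batch {
	unsigned int	nr;
	void		*tables[MAX_TABLE_BATCH];
};

static void tlb_table_flush(struct mmu_gather *tlb)
{
	struct mmu_table_batch *batch = tlb->batch;
	unsigned int i;

	/* Make sure no walker can still reach the tables... */
	tlb_flush(tlb);
	/* ...before the pages go back to the allocator. */
	for (i = 0; i < batch->nr; i++)
		free_page((unsigned long)batch->tables[i]);
	batch->nr = 0;
}

/* Always queue the table page, independent of tlb_fast_mode(). */
void tlb_remove_table(struct mmu_gather *tlb, void *table)
{
	struct mmu_table_batch *batch = tlb->batch;

	batch->tables[batch->nr++] = table;
	if (batch->nr == MAX_TABLE_BATCH)
		tlb_table_flush(tlb);
}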



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 11:28           ` Peter Zijlstra
@ 2012-05-17 12:14             ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 12:28:06PM +0100, Peter Zijlstra wrote:
> On Thu, 2012-05-17 at 10:51 +0100, Russell King wrote:
> > On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> > > Another minor thing is that on newer ARM processors (Cortex-A15) we
> > > need the TLB shootdown even on UP systems, so tlb_fast_mode should
> > > always return 0. Something like below (untested):
> > 
> > No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
> > or SMP, not just for A15, because of the speculative prefetch which can
> > re-load TLB entries from the page tables at _any_ time.
> 
> Hmm, so this is mostly because of the confusion/coupling between
> tlb_remove_page() and tlb_remove_table(), I guess, since I don't see the
> freeing of the actual pages being a problem with speculative TLB
> reloads, just the page-tables.

The TLB on newer ARM cores can cache intermediate entries (e.g. pmd) as
long as they are valid, even if the full translation is not possible
(e.g. because the pte entry is 0). With fast_mode, this could lead to
the MMU reading an already-freed pte page, since it is still pointed at
by the old pmd.

Older ARMv7 CPUs (Cortex-A8) don't do this intermediate caching, and UP
should be fine with fast_mode==1 as we already track the pte range via
tlb_remove_tlb_entry(). The MMU on ARM is treated like any other agent
that accesses the memory, so standard memory ordering issues apply. In
theory Linux can clear the pmd, free the page, and have it re-used
shortly after while the MMU hasn't observed the pmd_clear() yet (we
don't have a barrier in this function).
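
For reference, the fast-mode path this bites on looks roughly like the
following (paraphrased from memory from asm-generic/tlb.h, so treat the
details as approximate):

static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
{
	if (tlb_fast_mode(tlb)) {
		/* Immediate free: unsafe if the MMU can still reach
		 * this page through a cached intermediate entry. */
		free_page_and_swap_cache(page);
		return;
	}
	tlb->pages[tlb->nr++] = page;	/* batch the page... */
	if (tlb->nr >= tlb->max)
		tlb_flush_mmu(tlb);	/* ...flush TLBs, then free */
}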

> Should we introduce a tlb_remove_table() regardless of
> HAVE_RCU_TABLE_FREE which always queues the tables regardless of
> tlb_fast_mode()? 

This would probably work as well (or we just add support for
HAVE_RCU_TABLE_FREE on ARM).

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 11:28           ` Peter Zijlstra
@ 2012-05-17 16:00             ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17 16:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 12:28:06PM +0100, Peter Zijlstra wrote:
> On Thu, 2012-05-17 at 10:51 +0100, Russell King wrote:
> > On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> > > Another minor thing is that on newer ARM processors (Cortex-A15) we
> > > need the TLB shootdown even on UP systems, so tlb_fast_mode should
> > > always return 0. Something like below (untested):
> > 
> > No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
> > or SMP, not just for A15, because of the speculative prefetch which can
> > re-load TLB entries from the page tables at _any_ time.
> 
> Hmm,. so this is mostly because of the confusion/coupling between
> tlb_remove_page() and tlb_remove_table() I guess. Since I don't see the
> freeing of the actual pages being a problem with speculative TLB
> reloads, just the page-tables.
> 
> Should we introduce a tlb_remove_table() regardless of
> HAVE_RCU_TABLE_FREE which always queues the tables regardless of
> tlb_fast_mode()? 

BTW, looking at your tlb-unify branch, does tlb_remove_table() call
tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
after clearing the pmd and before freeing the pte page table (and
ideally doing it less often than at every pte_free_tlb() call).
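
To spell out the ordering we need (illustrative sketch only, not a
patch; the helper name is invented):

static void pte_free_ordering_sketch(struct mmu_gather *tlb, pmd_t *pmdp,
				     struct page *pte_page)
{
	pmd_clear(pmdp);	/* 1. unhook the pte table */
	tlb_flush(tlb);		/* 2. invalidate TLB and walk caches */
	__free_page(pte_page);	/* 3. only now is reuse safe */
}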

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:00             ` Catalin Marinas
@ 2012-05-17 16:24               ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 16:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 17:00 +0100, Catalin Marinas wrote:

> BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> after clearing the pmd and before freeing the pte page table (and
> ideally doing it less often than at every pte_free_tlb() call).

No, I don't think it does; so far the only archs using the RCU stuff are
ppc, sparc and s390, and none of those needed that (Xen might join them
soon though). But I will have to look and consider this more carefully;
I've 'lost' too many of the ppc/sparc/s390 details from memory to say
this with any certainty.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:24               ` Peter Zijlstra
@ 2012-05-17 16:33                 ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 16:33 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 18:24 +0200, Peter Zijlstra wrote:
> On Thu, 2012-05-17 at 17:00 +0100, Catalin Marinas wrote:
> 
> > BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> > tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> > tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> > after clearing the pmd and before freeing the pte page table (and
> > ideally doing it less often than at every pte_free_tlb() call).
> 
> No, I don't think it does; so far the only archs using the RCU stuff are
> ppc, sparc and s390, and none of those needed that (Xen might join them
> soon though). But I will have to look and consider this more carefully;
> I've 'lost' too many of the ppc/sparc/s390 details from memory to say
> this with any certainty.


Hmm, no, thinking about it more, that does indeed sound strange. I'll
still have to consider it more carefully, but I think you might have
found a bug there.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:33                 ` Peter Zijlstra
@ 2012-05-17 16:44                   ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 16:44 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 18:33 +0200, Peter Zijlstra wrote:
> On Thu, 2012-05-17 at 18:24 +0200, Peter Zijlstra wrote:
> > On Thu, 2012-05-17 at 17:00 +0100, Catalin Marinas wrote:
> > 
> > > BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> > > tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> > > tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> > > after clearing the pmd and before freeing the pte page table (and
> > > ideally doing it less often than at every pte_free_tlb() call).
> > 
> > No, I don't think it does; so far the only archs using the RCU stuff are
> > ppc, sparc and s390, and none of those needed that (Xen might join them
> > soon though). But I will have to look and consider this more carefully;
> > I've 'lost' too many of the ppc/sparc/s390 details from memory to say
> > this with any certainty.
> 
> 
> Hmm, no, thinking about it more, that does indeed sound strange. I'll
> still have to consider it more carefully, but I think you might have
> found a bug there.

So the RCU code came from ppc in commit
267239116987d64850ad2037d8e0f3071dc3b5ce, which has similar behaviour.
Also, I suspect the mm_users < 2 test will be incorrect for ARM, since
even the one user can be concurrent with your speculation engine.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:44                   ` Peter Zijlstra
@ 2012-05-17 16:59                     ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 16:59 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 18:44 +0200, Peter Zijlstra wrote:
> 
> So the RCU code came from ppc in commit
> 267239116987d64850ad2037d8e0f3071dc3b5ce, which has similar behaviour.
> Also, I suspect the mm_users < 2 test will be incorrect for ARM, since
> even the one user can be concurrent with your speculation engine.
> 
> 
Right, last mail, I promise, I've confused myself enough already! :-)

OK, so ppc/sparc are special (forgot all about s390). I think by the time
they are done with unmap_page_range() their hardware hash-tables are
empty, and nobody but software page-table walkers will still access the
Linux page tables.

So when we do free_pgtables() to clean up the actual page-tables,
Power/Sparc need to RCU-free them to allow concurrent software
page-table walkers like gup_fast.

Thus I don't think they need to TLB flush again, because their hardware
doesn't actually walk the Linux page-tables; it walks hash-tables, which
by this time are empty.

Now if x86/Xen were to use this, it would indeed also need to TLB flush
when freeing the page-tables, since its hardware walkers do indeed
traverse these pages and we need to sync against them.

So my first patch in the tlb-unify tree is actually buggy.

Hmm, what to do? Adding a tlb flush in there might slow down ppc/sparc
unnecessarily... dave/ben? I guess we need more knobs :-(
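
Something like the following, maybe, as a hedged illustration (the
config symbol and helper are invented for this mail, not an actual
interface):

#ifdef CONFIG_ARCH_HW_WALKS_LINUX_PGTABLES	/* x86, ARM, ... */
static inline void tlb_table_pre_free(struct mmu_gather *tlb)
{
	/* Hardware walkers may still hold pointers into the tables;
	 * flush before the pages go back to the allocator. */
	tlb_flush(tlb);
}
#else						/* ppc, sparc, s390 */
static inline void tlb_table_pre_free(struct mmu_gather *tlb)
{
	/* Hash-table MMUs: the hash is already empty after
	 * unmap_page_range(), no extra flush needed. */
}
#endif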


Now it's quite possible I've utterly confused myself and everybody
reading; apologies for that. I shall rest and purge all from memory and
start over before commenting more...

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:44                   ` Peter Zijlstra
@ 2012-05-17 17:01                     ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17 17:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 05:44:13PM +0100, Peter Zijlstra wrote:
> On Thu, 2012-05-17 at 18:33 +0200, Peter Zijlstra wrote:
> > On Thu, 2012-05-17 at 18:24 +0200, Peter Zijlstra wrote:
> > > On Thu, 2012-05-17 at 17:00 +0100, Catalin Marinas wrote:
> > > > BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> > > > tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> > > > tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> > > > after clearing the pmd and before freeing the pte page table (and
> > > > ideally doing it less often than at every pte_free_tlb() call).
> > > 
> > > No, I don't think it does; so far the only archs using the RCU stuff are
> > > ppc, sparc and s390, and none of those needed that (Xen might join them
> > > soon though). But I will have to look and consider this more carefully;
> > > I've 'lost' too many of the ppc/sparc/s390 details from memory to say
> > > this with any certainty.
> > 
> > Hmm, no, thinking about it more, that does indeed sound strange. I'll
> > still have to consider it more carefully, but I think you might have
> > found a bug there.
> 
> So the RCU code came from ppc in commit
> 267239116987d64850ad2037d8e0f3071dc3b5ce, which has similar behaviour.
> Also, I suspect the mm_users < 2 test will be incorrect for ARM, since
> even the one user can be concurrent with your speculation engine.

That's correct.

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 17:01                     ` Catalin Marinas
@ 2012-05-17 17:11                       ` Peter Zijlstra
  -1 siblings, 0 replies; 86+ messages in thread
From: Peter Zijlstra @ 2012-05-17 17:11 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, 2012-05-17 at 18:01 +0100, Catalin Marinas wrote:
> > So the RCU code came from ppc in commit
> > 267239116987d64850ad2037d8e0f3071dc3b5ce, which has similar behaviour.
> > Also I suspect the mm_users < 2 test will be incorrect for ARM, since
> > even the one user can be concurrent with your speculation engine.
> 
> That's correct. 

(I'm not sending this... really :-)

---
commit cd94154cc6a28dd9dc271042c1a59c08d26da886
Author: Martin Schwidefsky <schwidefsky@de.ibm.com>
Date:   Wed Apr 11 14:28:07 2012 +0200

    [S390] fix tlb flushing for page table pages
    
    Git commit 36409f6353fc2d7b6516e631415f938eadd92ffa "use generic RCU
    page-table freeing code" introduced a tlb flushing bug. Partially revert
    the above git commit and go back to s390 specific page table flush code.
    
    For s390 the TLB can contain three types of entries, "normal" TLB
    page-table entries, TLB combined region-and-segment-table (CRST) entries
    and real-space entries. Linux does not use real-space entries which
    leaves normal TLB entries and CRST entries. The CRST entries are
    intermediate steps in the page-table translation called translation paths.
    For example a 4K page access in a three-level page table setup will
    create two CRST TLB entries and one page-table TLB entry. The advantage
    of that approach is that a page access next to the previous one can reuse
    the CRST entries and needs just a single read from memory to create the
    page-table TLB entry. The disadvantage is that the TLB flushing rules
    are more complicated: before any page table may be freed, the TLB
    needs to be flushed.
    
    In short: the generic RCU page-table freeing code is incorrect for the
    CRST entries, in particular the check for mm_users < 2 is troublesome.
    
    This is applicable to 3.0+ kernels.
    
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>


^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:00             ` Catalin Marinas
@ 2012-05-17 17:22               ` Russell King
  -1 siblings, 0 replies; 86+ messages in thread
From: Russell King @ 2012-05-17 17:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Peter Zijlstra, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 05:00:12PM +0100, Catalin Marinas wrote:
> On Thu, May 17, 2012 at 12:28:06PM +0100, Peter Zijlstra wrote:
> > On Thu, 2012-05-17 at 10:51 +0100, Russell King wrote:
> > > On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> > > > Another minor thing is that on newer ARM processors (Cortex-A15) we
> > > > need the TLB shootdown even on UP systems, so tlb_fast_mode should
> > > > always return 0. Something like below (untested):
> > > 
> > > No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
> > > or SMP, not just for A15, because of the speculative prefetch which can
> > > re-load TLB entries from the page tables at _any_ time.
> > 
> > Hmm, so this is mostly because of the confusion/coupling between
> > tlb_remove_page() and tlb_remove_table() I guess. Since I don't see the
> > freeing of the actual pages being a problem with speculative TLB
> > reloads, just the page-tables.
> > 
> > Should we introduce a tlb_remove_table() regardless of
> > HAVE_RCU_TABLE_FREE which always queues the tables regardless of
> > tlb_fast_mode()? 
> 
> BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> after clearing the pmd and before freeing the pte page table (and
> ideally doing it less often than at every pte_free_tlb() call).

Catalin,

The way TLB shootdown stuff works is that _every_ single bit of memory
which gets freed, whether it's a page or a page table, gets added to a
list of pages to be freed.

So, the sequence is:
- remove pte/pmd/pud/pgd pointers
- add pages, whether they be pages pointed to by pte entries or page tables
  to be freed to a list
- when list is sufficiently full, invalidate TLBs
- free list of pages

That means the pages will not be freed, whether it be a page mapped
into userspace or a page table, until the TLB has been invalidated.

For page tables, this is done via pXX_free_tlb(), which then calls out
to the arch specific __pXX_free_tlb(), which ultimately then hands the
page table over to tlb_remove_page() to add to the list of to-be-freed
pages.
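
In code, that sequence is roughly the following -- a simplified sketch
of the generic mmu_gather (tlb_fast_mode() is omitted and the exact
helper names vary by tree):

static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
{
	tlb->need_flush = 1;
	tlb->pages[tlb->nr++] = page;		/* queue, don't free yet */
	if (tlb->nr >= FREE_PTE_NR) {		/* list sufficiently full */
		tlb_flush(tlb);			/* invalidate TLBs first */
		free_pages_and_swap_cache(tlb->pages, tlb->nr);
		tlb->nr = 0;			/* only now are pages freed */
	}
}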

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 17:22               ` Russell King
@ 2012-05-17 18:31                 ` Catalin Marinas
  -1 siblings, 0 replies; 86+ messages in thread
From: Catalin Marinas @ 2012-05-17 18:31 UTC (permalink / raw)
  To: Russell King
  Cc: Peter Zijlstra, Paul Mundt, Andrea Arcangeli, Thomas Gleixner,
	Rik van Riel, Ingo Molnar, akpm, Linus Torvalds, linux-kernel,
	linux-arch, linux-mm, Benjamin Herrenschmidt, David Miller,
	Hugh Dickins, Mel Gorman, Nick Piggin, Chris Metcalf,
	Martin Schwidefsky

On Thu, May 17, 2012 at 06:22:15PM +0100, Russell King wrote:
> On Thu, May 17, 2012 at 05:00:12PM +0100, Catalin Marinas wrote:
> > On Thu, May 17, 2012 at 12:28:06PM +0100, Peter Zijlstra wrote:
> > > On Thu, 2012-05-17 at 10:51 +0100, Russell King wrote:
> > > > On Thu, May 17, 2012 at 10:30:23AM +0100, Catalin Marinas wrote:
> > > > > Another minor thing is that on newer ARM processors (Cortex-A15) we
> > > > > need the TLB shootdown even on UP systems, so tlb_fast_mode should
> > > > > always return 0. Something like below (untested):
> > > > 
> > > > No Catalin, we need this for virtually all ARMv7 CPUs whether they're UP
> > > > or SMP, not just for A15, because of the speculative prefetch which can
> > > > re-load TLB entries from the page tables at _any_ time.
> > > 
> > > Hmm, so this is mostly because of the confusion/coupling between
> > > tlb_remove_page() and tlb_remove_table() I guess. Since I don't see the
> > > freeing of the actual pages being a problem with speculative TLB
> > > reloads, just the page-tables.
> > > 
> > > Should we introduce a tlb_remove_table() regardless of
> > > HAVE_RCU_TABLE_FREE which always queues the tables regardless of
> > > tlb_fast_mode()? 
> > 
> > BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> > tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> > tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> > after clearing the pmd and before freeing the pte page table (and
> > ideally doing it less often than at every pte_free_tlb() call).
> 
> Catalin,
> 
> The way TLB shootdown stuff works is that _every_ single bit of memory
> which gets freed, whether its a page or a page table, gets added to a
> list of pages to be freed.
> 
> So, the sequence is:
> - remove pte/pmd/pud/pgd pointers
> - add pages, whether they be pages pointed to by pte entries or page tables
>   to be freed to a list
> - when list is sufficiently full, invalidate TLBs
> - free list of pages
> 
> That means the pages will not be freed, whether it be a page mapped
> into userspace or a page table until such time that the TLB has been
> invalidated.
> 
> For page tables, this is done via pXX_free_tlb(), which then calls out
> to the arch specific __pXX_free_tlb(), which ultimately then hands the
> page table over to tlb_remove_page() to add to the list of to-be-freed
> pages.

I know that already; I'm not sure why you explained it again (but it's
good for future reference).

My point was that if we move to HAVE_RCU_TABLE_FREE, the other
architectures doing this are calling tlb_remove_table() instead of
tlb_remove_page() in __p??_free_tlb(). And tlb_remove_table() does not
do any TLB maintenance when it can no longer queue pages (batch table
overflow).
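
Concretely, the overflow path does no more than queue an RCU callback;
sketching it from memory (not a verbatim quote):

static void tlb_table_flush(struct mmu_gather *tlb)
{
	struct mmu_table_batch **batch = &tlb->batch;

	if (*batch) {
		/*
		 * Only an RCU grace period -- nothing here issues a
		 * tlb_flush() before the tables are handed back.
		 */
		call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
		*batch = NULL;
	}
}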

-- 
Catalin

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb
  2012-05-17 16:24               ` Peter Zijlstra
@ 2012-05-21  7:47                 ` Martin Schwidefsky
  -1 siblings, 0 replies; 86+ messages in thread
From: Martin Schwidefsky @ 2012-05-21  7:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Catalin Marinas, Russell King, Paul Mundt, Andrea Arcangeli,
	Thomas Gleixner, Rik van Riel, Ingo Molnar, akpm, Linus Torvalds,
	linux-kernel, linux-arch, linux-mm, Benjamin Herrenschmidt,
	David Miller, Hugh Dickins, Mel Gorman, Nick Piggin,
	Chris Metcalf

On Thu, 17 May 2012 18:24:44 +0200
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Thu, 2012-05-17 at 17:00 +0100, Catalin Marinas wrote:
> 
> > BTW, looking at your tlb-unify branch, does tlb_remove_table() call
> > tlb_flush/tlb_flush_mmu before freeing the tables?  I can only see
> > tlb_remove_page() doing this. On ARM, even UP, we need the TLB flushing
> > after clearing the pmd and before freeing the pte page table (and
> > ideally doing it less often than at every pte_free_tlb() call).
> 
> No, I don't think it does; so far the only archs using the RCU stuff
> are ppc, sparc and s390, and none of those needed that (Xen might join
> them soon though). But I will have to look and consider this more
> carefully. I've 'lost' too many of the ppc/sparc/s390 details from
> memory to say this with any certainty.
 
s390 needs a TLB flush for the pgd, pud and pmd tables. See git commit
cd94154cc6a28dd9dc271042c1a59c08d26da886 for the sad details.
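
The shape of that fix, roughly -- a simplified sketch reusing the s390
helper names, not the actual patch:

/*
 * Make sure no CRST (region/segment) translation-path entry can still
 * reference a table page before the page is given back.
 */
static void crst_table_free_flushed(struct mm_struct *mm, unsigned long *table)
{
	__tlb_flush_mm(mm);		/* purge the translation paths */
	crst_table_free(mm, table);	/* only then free the table page */
}

(crst_table_free_flushed() is an invented wrapper; the real patch keeps
its own batching on top of this.)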

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 86+ messages in thread

end of thread, other threads:[~2012-05-21  7:47 UTC | newest]

Thread overview: 86+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-02 17:59 [RFC][PATCH 0/6] mm: Unify TLB gather implementations Peter Zijlstra
2011-03-02 17:59 ` Peter Zijlstra
2011-03-02 17:59 ` Peter Zijlstra
2011-03-02 17:59 ` [RFC][PATCH 1/6] mm: Optimize fullmm TLB flushing Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59 ` [RFC][PATCH 2/6] mm: Change flush_tlb_range() to take an mm_struct Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 19:19   ` Linus Torvalds
2011-03-02 19:19     ` Linus Torvalds
2011-03-02 20:58     ` Rik van Riel
2011-03-02 20:58       ` Rik van Riel
2011-03-02 21:40     ` Peter Zijlstra
2011-03-02 21:40       ` Peter Zijlstra
2011-03-02 21:47       ` David Miller
2011-03-02 21:47         ` David Miller
2011-03-03 17:22         ` Chris Metcalf
2011-03-03 17:22           ` Chris Metcalf
2011-03-03 17:22           ` Chris Metcalf
2011-03-03 18:45           ` David Miller
2011-03-03 18:45             ` David Miller
2011-03-03 18:56             ` Chris Metcalf
2011-03-03 18:56               ` Chris Metcalf
2011-03-03 18:56               ` Chris Metcalf
2011-03-10 18:05           ` [PATCH] arch/tile: optimize icache flush Chris Metcalf
2011-03-10 18:05             ` Chris Metcalf
2011-03-10 18:05             ` Chris Metcalf
2011-03-10 23:19             ` Rik van Riel
2011-03-10 23:19               ` Rik van Riel
2011-03-02 17:59 ` [RFC][PATCH 3/6] mm: Provide generic range tracking and flushing Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59 ` [RFC][PATCH 4/6] arm, mm: Convert arm to generic tlb Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-09 15:16   ` Catalin Marinas
2011-03-09 15:16     ` Catalin Marinas
2011-03-09 15:19     ` Peter Zijlstra
2011-03-09 15:19       ` Peter Zijlstra
2011-03-09 15:36       ` Catalin Marinas
2011-03-09 15:36         ` Catalin Marinas
2011-03-09 15:39         ` Peter Zijlstra
2011-03-09 15:39           ` Peter Zijlstra
2011-03-09 15:48           ` Peter Zijlstra
2011-03-09 15:48             ` Peter Zijlstra
2011-03-09 16:34             ` Catalin Marinas
2011-03-09 16:34               ` Catalin Marinas
2012-05-17  3:05   ` Paul Mundt
2012-05-17  3:05     ` Paul Mundt
2012-05-17  9:30     ` Catalin Marinas
2012-05-17  9:30       ` Catalin Marinas
2012-05-17  9:39       ` Catalin Marinas
2012-05-17  9:39         ` Catalin Marinas
2012-05-17  9:51       ` Russell King
2012-05-17  9:51         ` Russell King
2012-05-17 11:28         ` Peter Zijlstra
2012-05-17 11:28           ` Peter Zijlstra
2012-05-17 12:14           ` Catalin Marinas
2012-05-17 12:14             ` Catalin Marinas
2012-05-17 16:00           ` Catalin Marinas
2012-05-17 16:00             ` Catalin Marinas
2012-05-17 16:24             ` Peter Zijlstra
2012-05-17 16:24               ` Peter Zijlstra
2012-05-17 16:33               ` Peter Zijlstra
2012-05-17 16:33                 ` Peter Zijlstra
2012-05-17 16:44                 ` Peter Zijlstra
2012-05-17 16:44                   ` Peter Zijlstra
2012-05-17 16:59                   ` Peter Zijlstra
2012-05-17 16:59                     ` Peter Zijlstra
2012-05-17 17:01                   ` Catalin Marinas
2012-05-17 17:01                     ` Catalin Marinas
2012-05-17 17:11                     ` Peter Zijlstra
2012-05-17 17:11                       ` Peter Zijlstra
2012-05-21  7:47               ` Martin Schwidefsky
2012-05-21  7:47                 ` Martin Schwidefsky
2012-05-17 17:22             ` Russell King
2012-05-17 17:22               ` Russell King
2012-05-17 18:31               ` Catalin Marinas
2012-05-17 18:31                 ` Catalin Marinas
2011-03-02 17:59 ` [RFC][PATCH 5/6] ia64, mm: Convert ia64 " Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59 ` [RFC][PATCH 6/6] sh, mm: Convert sh " Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
2011-03-02 17:59   ` Peter Zijlstra
