linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/14] sparc64 shared context/TLB support
@ 2016-12-16 18:35 Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 01/14] sparc64: placeholder for needed mmu shared context patching Mike Kravetz
                   ` (13 more replies)
  0 siblings, 14 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

In Sparc mm code today, each address space is assigned a unique context
identifier.  This context ID is stored in context register 0 of the MMU.
This same context ID is stored in TLB entries.  When the MMU is searching
for a virtual address translation, the context ID as well as the virtual
address must match for a TLB hit.

Beginning with Sparc Niagara 2 processors, the MMU contains an additional
context register (register 1).  When searching the TLB, the MMU will find
a match if the virtual address matches and the ID contained in either
context register 0 -OR- context register 1 matches.

In the Linux kernel today, only context register 0 is set and used by
the MMU.  Solaris has made use of the additional context register for shared
mappings.  If two tasks share an appropriate mapping, then both tasks set
context register 1 to the same value and associate that value with the
shared mapping.  In this way, both tasks can use the same TLB entries for
pages of the shared mapping.

This RFC adds support for the additional context register, and extends the
mmap and System V shared memory system calls so that an application can
request shared context mappings.  At a very high level, this works as follows:
- An application passes a new SHARED_CTX flag to mmap or shmat
- The vma associated with the mapping is marked with a SHARED_CTX flag
  - When a SHARED_CTX marked vma is first created, all other vma's mapping
    the same underlying object are searched looking for a match that:
	1) Is also marked SHARED_CTX 
	2) Is mapped at the same virtual address
  - If a match is found, the new vma shares a context ID with the existing vma.
  - If no match is found, a context ID is allocated for the new vma
- sparc specific code associates the context ID with pages in the shared
  mappings.

This RFC patch series limits a task to having only a single shared context
vma.  Shared context vmas in different processes must match exactly (start
and length) to be shared.  In addition, shared context support is only
provided for huge page (hugetlb) mappings.  These and other restrictions can
be relaxed as the code is further developed.

Most of the code in this patch series is sparc specific for management of
the new context ID and associated TSB entries.  However, there is arch
independent code which needs to enable the flagging of mappings which request
shared context.

This is early proof of concept code.  It is not polished, and there is need
for much more work.  There are even FIXME comments in the code.  My hope is
that it is sufficiently readable to start a discussion about the general
direction to enable such functionality.

It does function, and with perf you can see a reduction in TLB misses for
shared context mappings.  A simple test program which has two tasks touch
pages in a shared mapping has the following dTLB miss rates.

Testing		Normal Mapping			Shared Context Mapping
Rounds		dTLB-load-misses		dTLB-load-misses
1			771				834
10		      1,651				881
100		     10,422				874
1000		     97,992				958
10000		    975,910				963
100000	  	  9,719,193			      1,017
1000000		 97,941,327			      4,148

Mike Kravetz (14):
  sparc64: placeholder for needed mmu shared context patching
  sparc64: add new fields to mmu context for shared context support
  sparc64: routines for basic mmu shared context structure management
  sparc64: load shared id into context register 1
  sparc64: Add PAGE_SHR_CTX flag
  sparc64: general shared context tsb creation and support
  sparc64: move COMPUTE_TAG_TARGET and COMPUTE_TSB_PTR to header file
  sparc64: shared context tsb handling at context switch time
  sparc64: TLB/TSB miss handling for shared context
  mm: add shared context to vm_area_struct
  sparc64: add routines to look for vmsa which can share context
  mm: add mmap and shmat arch hooks for shared context
  sparc64 mm: add shared context support to mmap() and shmat() APIs
  sparc64: add SHARED_MMU_CTX Kconfig option

 arch/powerpc/include/asm/mmu_context.h   |  12 ++
 arch/s390/include/asm/mmu_context.h      |  12 ++
 arch/sparc/Kconfig                       |   3 +
 arch/sparc/include/asm/hugetlb.h         |   4 +
 arch/sparc/include/asm/mman.h            |   6 +
 arch/sparc/include/asm/mmu_64.h          |  36 +++++-
 arch/sparc/include/asm/mmu_context_64.h  | 139 ++++++++++++++++++++++--
 arch/sparc/include/asm/page_64.h         |   1 +
 arch/sparc/include/asm/pgtable_64.h      |  13 +++
 arch/sparc/include/asm/spitfire.h        |   2 +
 arch/sparc/include/asm/tlb_64.h          |   3 +
 arch/sparc/include/asm/trap_block.h      |   3 +-
 arch/sparc/include/asm/tsb.h             |  40 +++++++
 arch/sparc/include/uapi/asm/mman.h       |   1 +
 arch/sparc/kernel/fpu_traps.S            |  63 +++++++++++
 arch/sparc/kernel/head_64.S              |   2 +-
 arch/sparc/kernel/rtrap_64.S             |  20 ++++
 arch/sparc/kernel/setup_64.c             |  11 ++
 arch/sparc/kernel/smp_64.c               |  22 ++++
 arch/sparc/kernel/sun4v_tlb_miss.S       |  37 ++-----
 arch/sparc/kernel/sys_sparc_64.c         |  17 +++
 arch/sparc/kernel/trampoline_64.S        |  20 ++++
 arch/sparc/kernel/tsb.S                  | 172 +++++++++++++++++++++++------
 arch/sparc/mm/fault_64.c                 |  10 ++
 arch/sparc/mm/hugetlbpage.c              |  94 +++++++++++++++-
 arch/sparc/mm/init_64.c                  | 181 ++++++++++++++++++++++++++++++-
 arch/sparc/mm/tsb.c                      |  95 +++++++++++++++-
 arch/unicore32/include/asm/mmu_context.h |  12 ++
 arch/x86/include/asm/mmu_context.h       |  12 ++
 include/asm-generic/mm_hooks.h           |  18 ++-
 include/linux/mm.h                       |   1 +
 include/linux/mm_types.h                 |  13 +++
 include/uapi/linux/shm.h                 |   1 +
 ipc/shm.c                                |  13 +++
 mm/hugetlb.c                             |   9 ++
 mm/mmap.c                                |  10 ++
 36 files changed, 1018 insertions(+), 90 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC PATCH 01/14] sparc64: placeholder for needed mmu shared context patching
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support Mike Kravetz
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

MMU shared context patching will be supported on Sun4V platforms with
Niagara 2 or later processors.  There will be a need for kernel patching
based on this criteria.  This 'patch' simply adds a comment as a reminder
and placeholder to add that support.

For now, MMU shared context support will be determined at follows:
- sun4v patching will be used for shared context support.  This is too
  general as most but not all sun4v platforms contain the required
  processors.
- A new config option (CONFIG_SHARED_MMU_CTX) is added

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/kernel/setup_64.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c
index 6b7331d..ffda69b 100644
--- a/arch/sparc/kernel/setup_64.c
+++ b/arch/sparc/kernel/setup_64.c
@@ -276,6 +276,17 @@ void sun_m7_patch_2insn_range(struct sun4v_2insn_patch_entry *start,
 	}
 }
 
+/*
+ * FIXME - TODO
+ *
+ * Shared MMU context support will only be provided on sun4v platforms
+ * with Niagara 2 or later processors.  A patching mechanism for this
+ * this type of support will need to be implemented.  For now, the code
+ * is making the too general assumption of supporting shared context on
+ * all sun4v platforms.  This is a placeholder to add correct support
+ * at a later time.
+ */
+
 static void __init sun4v_patch(void)
 {
 	extern void sun4v_hvapi_init(void);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 01/14] sparc64: placeholder for needed mmu shared context patching Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-17  7:34   ` Sam Ravnborg
  2016-12-17  7:38   ` Sam Ravnborg
  2016-12-16 18:35 ` [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management Mike Kravetz
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Add new fields to the mm_context structure to support shared context.
Instead of a simple context ID, add a pointer to a structure with a
reference count.  This is needed as multiple tasks will share the
context ID.

Pages using the shared context ID will reside in a separate TSB.  So
changes are made to increase the number of TSBs as well.  Note that
only support for context sharing of huge pages is provided.  Therefore,
no base page size shared context TSB.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/mmu_64.h         | 36 +++++++++++++++++++++++++++++----
 arch/sparc/include/asm/mmu_context_64.h |  8 ++++----
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_64.h b/arch/sparc/include/asm/mmu_64.h
index f7de0db..edf8663 100644
--- a/arch/sparc/include/asm/mmu_64.h
+++ b/arch/sparc/include/asm/mmu_64.h
@@ -57,6 +57,13 @@
 	 (!(((__ctx.sparc64_ctx_val) ^ tlb_context_cache) & CTX_VERSION_MASK))
 #define CTX_HWBITS(__ctx)	((__ctx.sparc64_ctx_val) & CTX_HW_MASK)
 #define CTX_NRBITS(__ctx)	((__ctx.sparc64_ctx_val) & CTX_NR_MASK)
+#define	SHARED_CTX_VALID(__ctx)	(__ctx.shared_ctx && \
+	 (!(((__ctx.shared_ctx->shared_ctx_val) ^ tlb_context_cache) & \
+	   CTX_VERSION_MASK)))
+#define	SHARED_CTX_HWBITS(__ctx)	\
+	 ((__ctx.shared_ctx->shared_ctx_val) & CTX_HW_MASK)
+#define	SHARED_CTX_NRBITS(__ctx)	\
+	 ((__ctx.shared_ctx->shared_ctx_val) & CTX_NR_MASK)
 
 #ifndef __ASSEMBLY__
 
@@ -80,24 +87,45 @@ struct tsb_config {
 	unsigned long		tsb_map_pte;
 };
 
-#define MM_TSB_BASE	0
+#if defined(CONFIG_SHARED_MMU_CTX)
+struct shared_mmu_ctx {
+	atomic_t	refcount;
+	unsigned long	shared_ctx_val;
+};
+
+#define MM_TSB_HUGE_SHARED	0
+#define MM_TSB_BASE		1
+#define MM_TSB_HUGE		2
+#define MM_NUM_TSBS		3
+#else
 
+#define MM_TSB_BASE		0
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
-#define MM_TSB_HUGE	1
-#define MM_NUM_TSBS	2
+#define MM_TSB_HUGE		1
+#define MM_TSB_HUGE_SHARED	1	/* Simplifies conditions in code */
+#define MM_NUM_TSBS		2
 #else
-#define MM_NUM_TSBS	1
+#define MM_NUM_TSBS		1
+#endif
 #endif
 
 typedef struct {
 	spinlock_t		lock;
 	unsigned long		sparc64_ctx_val;
+#if defined(CONFIG_SHARED_MMU_CTX)
+	struct shared_mmu_ctx	*shared_ctx;
+	unsigned long		shared_hugetlb_pte_count;
+#endif
 	unsigned long		hugetlb_pte_count;
 	unsigned long		thp_pte_count;
 	struct tsb_config	tsb_block[MM_NUM_TSBS];
 	struct hv_tsb_descr	tsb_descr[MM_NUM_TSBS];
 } mm_context_t;
 
+#define	mm_shared_ctx_val(mm)					\
+	((mm)->context.shared_ctx ?				\
+	 (mm)->context.shared_ctx->shared_ctx_val : 0UL)
+
 #endif /* !__ASSEMBLY__ */
 
 #define TSB_CONFIG_TSB		0x00
diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index b84be67..d031799 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -35,15 +35,15 @@ void __tsb_context_switch(unsigned long pgd_pa,
 static inline void tsb_context_switch(struct mm_struct *mm)
 {
 	__tsb_context_switch(__pa(mm->pgd),
-			     &mm->context.tsb_block[0],
+			     &mm->context.tsb_block[MM_TSB_BASE],
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
-			     (mm->context.tsb_block[1].tsb ?
-			      &mm->context.tsb_block[1] :
+			     (mm->context.tsb_block[MM_TSB_HUGE].tsb ?
+			      &mm->context.tsb_block[MM_TSB_HUGE] :
 			      NULL)
 #else
 			     NULL
 #endif
-			     , __pa(&mm->context.tsb_descr[0]));
+			     , __pa(&mm->context.tsb_descr[MM_TSB_BASE]));
 }
 
 void tsb_grow(struct mm_struct *mm,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 01/14] sparc64: placeholder for needed mmu shared context patching Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-18  3:07   ` David Miller
  2016-12-16 18:35 ` [RFC PATCH 04/14] sparc64: load shared id into context register 1 Mike Kravetz
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Add routines for basic management of mmu shared context data structures.
These routines have to do with allocation/deallocation and get/put
of the structures.  The structures themselves will come from a new
kmem cache.

FIXMEs were added to then code where additional work is needed.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/mmu_context_64.h |  6 +++
 arch/sparc/include/asm/tlb_64.h         |  3 ++
 arch/sparc/include/asm/tsb.h            |  2 +
 arch/sparc/kernel/smp_64.c              | 22 +++++++++
 arch/sparc/mm/init_64.c                 | 84 +++++++++++++++++++++++++++++++--
 arch/sparc/mm/tsb.c                     | 54 +++++++++++++++++++++
 6 files changed, 168 insertions(+), 3 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index d031799..acaea6d 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -18,6 +18,12 @@ extern unsigned long tlb_context_cache;
 extern unsigned long mmu_context_bmap[];
 
 void get_new_mmu_context(struct mm_struct *mm);
+#if defined(CONFIG_SHARED_MMU_CTX)
+void get_new_mmu_shared_context(struct mm_struct *mm);
+void put_shared_context(struct mm_struct *mm);
+void set_mm_shared_ctx(struct mm_struct *mm, struct shared_mmu_ctx *ctx);
+void destroy_shared_context(struct mm_struct *mm);
+#endif
 #ifdef CONFIG_SMP
 void smp_new_mmu_context_version(void);
 #else
diff --git a/arch/sparc/include/asm/tlb_64.h b/arch/sparc/include/asm/tlb_64.h
index 4cb392f..e348a1b 100644
--- a/arch/sparc/include/asm/tlb_64.h
+++ b/arch/sparc/include/asm/tlb_64.h
@@ -14,6 +14,9 @@ void smp_flush_tlb_pending(struct mm_struct *,
 
 #ifdef CONFIG_SMP
 void smp_flush_tlb_mm(struct mm_struct *mm);
+#if defined(CONFIG_SHARED_MMU_CTX)
+void smp_flush_shared_tlb_mm(struct mm_struct *mm);
+#endif
 #define do_flush_tlb_mm(mm) smp_flush_tlb_mm(mm)
 #else
 #define do_flush_tlb_mm(mm) __flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT)
diff --git a/arch/sparc/include/asm/tsb.h b/arch/sparc/include/asm/tsb.h
index 32258e0..311cd4e 100644
--- a/arch/sparc/include/asm/tsb.h
+++ b/arch/sparc/include/asm/tsb.h
@@ -72,6 +72,8 @@ struct tsb_phys_patch_entry {
 	unsigned int	insn;
 };
 extern struct tsb_phys_patch_entry __tsb_phys_patch, __tsb_phys_patch_end;
+
+extern struct kmem_cache *shared_mmu_ctx_cachep __read_mostly;
 #endif
 #define TSB_LOAD_QUAD(TSB, REG)	\
 661:	ldda		[TSB] ASI_NUCLEUS_QUAD_LDD, REG; \
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index 8182f7c..c0f23ee 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1078,6 +1078,28 @@ void smp_flush_tlb_mm(struct mm_struct *mm)
 	put_cpu();
 }
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+/*
+ * Called when last reference to shared context is dropped.  Flush
+ * all TLB entries associated with the shared clontext ID.
+ *
+ * FIXME
+ * Future optimization would be to store cpumask in shared context
+ * structure and only make cross call to those cpus.
+ */
+void smp_flush_shared_tlb_mm(struct mm_struct *mm)
+{
+	u32 ctx = SHARED_CTX_HWBITS(mm->context);
+
+	(void)get_cpu();		/* prevent preemption */
+
+	smp_cross_call(&xcall_flush_tlb_mm, ctx, 0, 0);
+	__flush_tlb_mm(ctx, SECONDARY_CONTEXT);
+
+	put_cpu();
+}
+#endif
+
 struct tlb_pending_info {
 	unsigned long ctx;
 	unsigned long nr;
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 37aa537..bb9a6ee 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -673,14 +673,24 @@ DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);
  *
  * Always invoked with interrupts disabled.
  */
-void get_new_mmu_context(struct mm_struct *mm)
+static void __get_new_mmu_context_common(struct mm_struct *mm, bool shared)
 {
 	unsigned long ctx, new_ctx;
 	unsigned long orig_pgsz_bits;
 	int new_version;
 
 	spin_lock(&ctx_alloc_lock);
-	orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);
+#if defined(CONFIG_SHARED_MMU_CTX)
+	if (shared)
+		/*
+		 * Note that we are only called from get_new_mmu_shared_context
+		 * which guarantees the existence of shared_ctx structure.
+		 */
+		orig_pgsz_bits = (mm->context.shared_ctx->shared_ctx_val &
+				  CTX_PGSZ_MASK);
+	else
+#endif
+		orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);
 	ctx = (tlb_context_cache + 1) & CTX_NR_MASK;
 	new_ctx = find_next_zero_bit(mmu_context_bmap, 1 << CTX_NR_BITS, ctx);
 	new_version = 0;
@@ -714,13 +724,81 @@ void get_new_mmu_context(struct mm_struct *mm)
 	new_ctx |= (tlb_context_cache & CTX_VERSION_MASK);
 out:
 	tlb_context_cache = new_ctx;
-	mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;
+#if defined(CONFIG_SHARED_MMU_CTX)
+	if (shared)
+		mm->context.shared_ctx->shared_ctx_val =
+					new_ctx | orig_pgsz_bits;
+	else
+#endif
+		mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;
 	spin_unlock(&ctx_alloc_lock);
 
+	/*
+	 * FIXME
+	 * Not sure if the case where a shared context ID changed (not just
+	 * newly allocated) is handled properly.  May need to modify
+	 * smp_new_mmu_context_version to handle correctly.
+	 */
 	if (unlikely(new_version))
 		smp_new_mmu_context_version();
 }
 
+void get_new_mmu_context(struct mm_struct *mm)
+{
+	__get_new_mmu_context_common(mm, false);
+}
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+void get_new_mmu_shared_context(struct mm_struct *mm)
+{
+	/*
+	 * For now, we only support one shared context mapping per mm.  So,
+	 * if mm->context.shared_ctx  is already set, we have a bug
+	 *
+	 * Note that we are called from mmap with mmap_sem held.  Thus,
+	 * there can not be two threads racing to initialize.
+	 */
+	BUG_ON(mm->context.shared_ctx);
+
+	mm->context.shared_ctx = kmem_cache_alloc(shared_mmu_ctx_cachep,
+						GFP_NOWAIT);
+	if (!mm->context.shared_ctx)
+		return;
+
+	__get_new_mmu_context_common(mm, true);
+}
+
+void put_shared_context(struct mm_struct *mm)
+{
+	if (!mm->context.shared_ctx)
+		return;
+
+	if (atomic_dec_and_test(&mm->context.shared_ctx->refcount)) {
+		smp_flush_shared_tlb_mm(mm);
+		destroy_shared_context(mm);
+		kmem_cache_free(shared_mmu_ctx_cachep, mm->context.shared_ctx);
+	}
+
+	/*
+	 * For now we assume/expect only one shared context reference per mm
+	 */
+	mm->context.shared_ctx = NULL;
+}
+
+void set_mm_shared_ctx(struct mm_struct *mm, struct shared_mmu_ctx *ctx)
+{
+	BUG_ON(mm->context.shared_ctx || !ctx);
+
+	/*
+	 * Note that we are called with mmap_lock held on underlying
+	 * mapping.  Hence, the ctx structure pointed to by the matching
+	 * vma can not go away.
+	 */
+	atomic_inc(&ctx->refcount);
+	mm->context.shared_ctx = ctx;
+}
+#endif
+
 static int numa_enabled = 1;
 static int numa_debug;
 
diff --git a/arch/sparc/mm/tsb.c b/arch/sparc/mm/tsb.c
index e20fbba..8c2d148 100644
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -277,6 +277,8 @@ static void setup_tsb_params(struct mm_struct *mm, unsigned long tsb_idx, unsign
 	}
 }
 
+struct kmem_cache *shared_mmu_ctx_cachep __read_mostly;
+
 struct kmem_cache *pgtable_cache __read_mostly;
 
 static struct kmem_cache *tsb_caches[8] __read_mostly;
@@ -292,6 +294,27 @@ static const char *tsb_cache_names[8] = {
 	"tsb_1MB",
 };
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+static void init_once_shared_mmu_ctx(void *mem)
+{
+	struct shared_mmu_ctx *ctx = (struct shared_mmu_ctx *) mem;
+
+	ctx->shared_ctx_val = 0;
+	atomic_set(&ctx->refcount, 1);
+}
+
+static void __init sun4v_shared_mmu_ctx_init(void)
+{
+	shared_mmu_ctx_cachep = kmem_cache_create("shared_mmu_ctx_cache",
+					sizeof(struct shared_mmu_ctx),
+					0,
+					SLAB_HWCACHE_ALIGN|SLAB_PANIC,
+					init_once_shared_mmu_ctx);
+}
+#else
+static void __init sun4v_shared_mmu_ctx_init(void) { }
+#endif
+
 void __init pgtable_cache_init(void)
 {
 	unsigned long i;
@@ -317,6 +340,13 @@ void __init pgtable_cache_init(void)
 			prom_halt();
 		}
 	}
+
+	if (tlb_type == hypervisor)
+		/*
+		 * FIXME - shared context enables/supported on most
+		 * but not all sun4v priocessors
+		 */
+		sun4v_shared_mmu_ctx_init();
 }
 
 int sysctl_tsb_ratio = -2;
@@ -547,6 +577,30 @@ static void tsb_destroy_one(struct tsb_config *tp)
 	tp->tsb_reg_val = 0UL;
 }
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+void destroy_shared_context(struct mm_struct *mm)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ctx_alloc_lock, flags);
+
+	if (SHARED_CTX_VALID(mm->context)) {
+		unsigned long nr = SHARED_CTX_NRBITS(mm->context);
+
+		mmu_context_bmap[nr>>6] &= ~(1UL << (nr & 63));
+	}
+
+	spin_unlock_irqrestore(&ctx_alloc_lock, flags);
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+	/*
+	 * Any shared context should have been cleaned up by now
+	 */
+	BUG_ON(SHARED_CTX_VALID(mm->context));
+#endif
+}
+#endif
+
 void destroy_context(struct mm_struct *mm)
 {
 	unsigned long flags, i;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (2 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-17  7:45   ` Sam Ravnborg
  2016-12-18  3:14   ` David Miller
  2016-12-16 18:35 ` [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag Mike Kravetz
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

In current code, only context ID register 0 is set and used by the MMU.
On sun4v platforms that support MMU shared context, there is an additional
context ID register: specifically context register 1.  When searching
the TLB, the MMU will find a match if the virtual address matches and
the ID contained in context register 0 -OR- context register 1 matches.

Load the shared context ID into context ID register 1.  Care must be
taken to load register 1 after register 0, as loading register 0
overwrites both register 0 and 1.  Modify code loading register 0 to
also load register one if applicable.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/mmu_context_64.h | 37 +++++++++++++++++--
 arch/sparc/include/asm/spitfire.h       |  2 ++
 arch/sparc/kernel/fpu_traps.S           | 63 +++++++++++++++++++++++++++++++++
 arch/sparc/kernel/rtrap_64.S            | 20 +++++++++++
 arch/sparc/kernel/trampoline_64.S       | 20 +++++++++++
 5 files changed, 140 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index acaea6d..84268df 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -61,8 +61,11 @@ void smp_tsb_sync(struct mm_struct *mm);
 #define smp_tsb_sync(__mm) do { } while (0)
 #endif
 
-/* Set MMU context in the actual hardware. */
-#define load_secondary_context(__mm) \
+/*
+ * Set MMU context in the actual hardware.  Secondary context register
+ * zero is loaded with task specific context.
+ */
+#define load_secondary_context_0(__mm) \
 	__asm__ __volatile__( \
 	"\n661:	stxa		%0, [%1] %2\n" \
 	"	.section	.sun4v_1insn_patch, \"ax\"\n" \
@@ -74,6 +77,36 @@ void smp_tsb_sync(struct mm_struct *mm);
 	: "r" (CTX_HWBITS((__mm)->context)), \
 	  "r" (SECONDARY_CONTEXT), "i" (ASI_DMMU), "i" (ASI_MMU))
 
+/*
+ * Secondary context register one is loaded with shared context if
+ * it exists for the task.
+ */
+#define load_secondary_context_1(__mm) \
+	__asm__ __volatile__( \
+	"\n661: stxa		%0, [%1] %2\n" \
+	"	.section	.sun4v_1insn_patch, \"ax\"\n" \
+	"	.word		661b\n" \
+	"	stxa		%0, [%1] %3\n" \
+	"	.previous\n" \
+	"	flush		%%g6\n" \
+	: /* No outputs */ \
+	: "r" (SHARED_CTX_HWBITS((__mm)->context)), \
+	  "r" (SECONDARY_CONTEXT_R1), "i" (ASI_DMMU), "i" (ASI_MMU))
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+#define load_secondary_context(__mm) \
+	do { \
+		load_secondary_context_0(__mm); \
+		if ((__mm)->context.shared_ctx) \
+			load_secondary_context_1(__mm); \
+	} while (0)
+#else
+#define load_secondary_context(__mm) \
+	do { \
+		load_secondary_context_0(__mm); \
+	} while (0)
+#endif
+
 void __flush_tlb_mm(unsigned long, unsigned long);
 
 /* Switch the current MM context. */
diff --git a/arch/sparc/include/asm/spitfire.h b/arch/sparc/include/asm/spitfire.h
index 1d8321c..1fa4594 100644
--- a/arch/sparc/include/asm/spitfire.h
+++ b/arch/sparc/include/asm/spitfire.h
@@ -33,6 +33,8 @@
 #define DMMU_SFAR		0x0000000000000020
 #define VIRT_WATCHPOINT		0x0000000000000038
 #define PHYS_WATCHPOINT		0x0000000000000040
+#define	PRIMARY_CONTEXT_R1	0x0000000000000108
+#define	SECONDARY_CONTEXT_R1	0x0000000000000110
 
 #define SPITFIRE_HIGHEST_LOCKED_TLBENT	(64 - 1)
 #define CHEETAH_HIGHEST_LOCKED_TLBENT	(16 - 1)
diff --git a/arch/sparc/kernel/fpu_traps.S b/arch/sparc/kernel/fpu_traps.S
index 336d275..f85a034 100644
--- a/arch/sparc/kernel/fpu_traps.S
+++ b/arch/sparc/kernel/fpu_traps.S
@@ -73,6 +73,16 @@ do_fpdis:
 	ldxa		[%g3] ASI_MMU, %g5
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g3
+	ldxa		[%g3] ASI_MMU, %g4
+	.previous
+	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
+	mov		SECONDARY_CONTEXT, %g3
+
 	sethi		%hi(sparc64_kern_sec_context), %g2
 	ldx		[%g2 + %lo(sparc64_kern_sec_context)], %g2
 
@@ -114,6 +124,16 @@ do_fpdis:
 	ldxa		[%g3] ASI_MMU, %g5
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g3
+	ldxa		[%g3] ASI_MMU, %g4
+	.previous
+	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
+	mov		SECONDARY_CONTEXT, %g3
+
 	add		%g6, TI_FPREGS, %g1
 	sethi		%hi(sparc64_kern_sec_context), %g2
 	ldx		[%g2 + %lo(sparc64_kern_sec_context)], %g2
@@ -155,6 +175,16 @@ do_fpdis:
 	ldxa		[%g3] ASI_MMU, %g5
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g3
+	ldxa		[%g3] ASI_MMU, %g4
+	.previous
+	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
+	mov		SECONDARY_CONTEXT, %g3
+
 	sethi		%hi(sparc64_kern_sec_context), %g2
 	ldx		[%g2 + %lo(sparc64_kern_sec_context)], %g2
 
@@ -181,11 +211,24 @@ fpdis_exit:
 	stxa		%g5, [%g3] ASI_MMU
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g3
+	stxa		%g4, [%g3] ASI_MMU
+	.previous
+
 	membar		#Sync
 fpdis_exit2:
 	wr		%g7, 0, %gsr
 	ldx		[%g6 + TI_XFSR], %fsr
 	rdpr		%tstate, %g3
+661:	nop
+	.section	.sun4v_1insn_patch, "ax"
+	.word		661b
+	sethi		%hi(TSTATE_PEF), %g4
+	.previous
 	or		%g3, %g4, %g3		! anal...
 	wrpr		%g3, %tstate
 	wr		%g0, FPRS_FEF, %fprs	! clean DU/DL bits
@@ -347,6 +390,16 @@ do_fptrap_after_fsr:
 	ldxa		[%g3] ASI_MMU, %g5
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g3
+	ldxa		[%g3] ASI_MMU, %g4
+	.previous
+	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
+	mov		SECONDARY_CONTEXT, %g3
+
 	sethi		%hi(sparc64_kern_sec_context), %g2
 	ldx		[%g2 + %lo(sparc64_kern_sec_context)], %g2
 
@@ -377,7 +430,17 @@ do_fptrap_after_fsr:
 	stxa		%g5, [%g1] ASI_MMU
 	.previous
 
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g1
+	stxa		%g4, [%g1] ASI_MMU
+	.previous
+
 	membar		#Sync
+	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
+	mov		SECONDARY_CONTEXT, %g1
 	ba,pt		%xcc, etrap
 	 wr		%g0, 0, %fprs
 	.size		do_fptrap,.-do_fptrap
diff --git a/arch/sparc/kernel/rtrap_64.S b/arch/sparc/kernel/rtrap_64.S
index 216948c..d409d84 100644
--- a/arch/sparc/kernel/rtrap_64.S
+++ b/arch/sparc/kernel/rtrap_64.S
@@ -202,6 +202,7 @@ rt_continue:	ldx			[%sp + PTREGS_OFF + PT_V9_G1], %g1
 		brnz,pn			%l3, kern_rtt
 		 mov			PRIMARY_CONTEXT, %l7
 
+		/* Get value from SECONDARY_CONTEXT register */
 661:		ldxa			[%l7 + %l7] ASI_DMMU, %l0
 		.section		.sun4v_1insn_patch, "ax"
 		.word			661b
@@ -212,12 +213,31 @@ rt_continue:	ldx			[%sp + PTREGS_OFF + PT_V9_G1], %g1
 		ldx			[%l1 + %lo(sparc64_kern_pri_nuc_bits)], %l1
 		or			%l0, %l1, %l0
 
+		/* and, put into PRIMARY_CONTEXT register */
 661:		stxa			%l0, [%l7] ASI_DMMU
 		.section		.sun4v_1insn_patch, "ax"
 		.word			661b
 		stxa			%l0, [%l7] ASI_MMU
 		.previous
 
+		/* Get value from SECONDARY_CONTEXT_R1 register */
+661:		nop
+		nop
+		.section		.sun4v_2insn_patch, "ax"
+		.word			661b
+		mov			SECONDARY_CONTEXT_R1, %l7
+		ldxa			[%l7] ASI_MMU, %l0
+		.previous
+
+		/* and, put into PRIMARY_CONTEXT_R1 register */
+661:		nop
+		nop
+		.section		.sun4v_2insn_patch, "ax"
+		.word			661b
+		mov			PRIMARY_CONTEXT_R1, %l7
+		stxa			%l0, [%l7] ASI_MMU
+		.previous
+
 		sethi			%hi(KERNBASE), %l7
 		flush			%l7
 		rdpr			%wstate, %l1
diff --git a/arch/sparc/kernel/trampoline_64.S b/arch/sparc/kernel/trampoline_64.S
index 88ede1d..7c4ab3b 100644
--- a/arch/sparc/kernel/trampoline_64.S
+++ b/arch/sparc/kernel/trampoline_64.S
@@ -260,6 +260,16 @@ after_lock_tlb:
 	stxa		%g0, [%g7] ASI_MMU
 	.previous
 
+	/* Save SECONDARY_CONTEXT_R1, membar should be part of patch */
+	membar		#Sync
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g7
+	ldxa		[%g7] ASI_MMU, %g1
+	.previous
+
 	membar		#Sync
 	mov		SECONDARY_CONTEXT, %g7
 
@@ -269,6 +279,16 @@ after_lock_tlb:
 	stxa		%g0, [%g7] ASI_MMU
 	.previous
 
+	/* Restore SECONDARY_CONTEXT_R1, membar should be part of patch */
+	membar		#Sync
+661:	nop
+	nop
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	mov		SECONDARY_CONTEXT_R1, %g7
+	stxa		%g1, [%g7] ASI_MMU
+	.previous
+
 	membar		#Sync
 
 	/* Everything we do here, until we properly take over the
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (3 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 04/14] sparc64: load shared id into context register 1 Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-18  3:12   ` David Miller
  2016-12-16 18:35 ` [RFC PATCH 06/14] sparc64: general shared context tsb creation and support Mike Kravetz
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

This new page flag is used to identify pages which are associated with
a shared context ID.  It is needed at page fault time when we only
have access to the PTE and need to determine whether the associated
TSB entry should be associated with the regular ot shared context TSB.

A new helper routine is_sharedctx_pte() is also added.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/pgtable_64.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 1fb317f..f2fd088 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -166,6 +166,7 @@ bool kern_addr_valid(unsigned long addr);
 #define _PAGE_EXEC_4V	  _AC(0x0000000000000080,UL) /* Executable Page      */
 #define _PAGE_W_4V	  _AC(0x0000000000000040,UL) /* Writable             */
 #define _PAGE_SOFT_4V	  _AC(0x0000000000000030,UL) /* Software bits        */
+#define _PAGE_SHR_CTX_4V  _AC(0x0000000000000020,UL) /* Shared Context       */
 #define _PAGE_PRESENT_4V  _AC(0x0000000000000010,UL) /* Present              */
 #define _PAGE_RESV_4V	  _AC(0x0000000000000008,UL) /* Reserved             */
 #define _PAGE_SZ16GB_4V	  _AC(0x0000000000000007,UL) /* 16GB Page            */
@@ -426,6 +427,18 @@ static inline bool is_hugetlb_pte(pte_t pte)
 }
 #endif
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+static inline bool is_sharedctx_pte(pte_t pte)
+{
+	return !!(pte_val(pte) & _PAGE_SHR_CTX_4V);
+}
+#else
+static inline bool is_sharedctx_pte(pte_t pte)
+{
+	return false;
+}
+#endif
+
 static inline pte_t pte_mkdirty(pte_t pte)
 {
 	unsigned long val = pte_val(pte), tmp;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 06/14] sparc64: general shared context tsb creation and support
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (4 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-17  7:53   ` Sam Ravnborg
  2016-12-16 18:35 ` [RFC PATCH 07/14] sparc64: move COMPUTE_TAG_TARGET and COMPUTE_TSB_PTR to header file Mike Kravetz
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Take into account the shared context TSB when creating and updating
TSBs.  Existing routines are modified to key off the TSB index or
PTE flag (_PAGE_SHR_CTX_4V) to determine this is a shared context
operation.

With shared context support the sun4v TSB descriptor array could
contain a 'hole' if there is a shared context TSB and no huge page
TSB. An array with a hole can not be bassed to the hypervisor, so
make sure no hole exists in the array.

For shared context TSBs, the context index in the hypervisor descriptor
structure is set to 1.  This indicates the context ID stored in context
register 1 should be used for TLB matching.

This commit does NOT load the shared context TSB into the hv MMU.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/mm/fault_64.c    | 10 ++++++++++
 arch/sparc/mm/hugetlbpage.c | 20 ++++++++++++++++----
 arch/sparc/mm/init_64.c     | 42 +++++++++++++++++++++++++++++++++++++++---
 arch/sparc/mm/tsb.c         | 41 ++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 643c149..2b82cdb 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -493,6 +493,16 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 			hugetlb_setup(regs);
 
 	}
+#if defined(CONFIG_SHARED_MMU_CTX)
+	mm_rss = mm->context.shared_hugetlb_pte_count * REAL_HPAGE_PER_HPAGE;
+	if (unlikely(mm_shared_ctx_val(mm) && mm_rss >
+		     mm->context.tsb_block[MM_TSB_HUGE_SHARED].tsb_rss_limit)) {
+		if (mm->context.tsb_block[MM_TSB_HUGE_SHARED].tsb)
+			tsb_grow(mm, MM_TSB_HUGE_SHARED, mm_rss);
+		else
+			hugetlb_shared_setup(regs);
+	}
+#endif
 #endif
 exit_exception:
 	exception_exit(prev_state);
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 988acc8b..2039d45 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -162,8 +162,14 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 {
 	pte_t orig;
 
-	if (!pte_present(*ptep) && pte_present(entry))
-		mm->context.hugetlb_pte_count++;
+	if (!pte_present(*ptep) && pte_present(entry)) {
+#if defined(CONFIG_SHARED_MMU_CTX)
+		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
+			mm->context.shared_hugetlb_pte_count++;
+		else
+#endif
+			mm->context.hugetlb_pte_count++;
+	}
 
 	addr &= HPAGE_MASK;
 	orig = *ptep;
@@ -180,8 +186,14 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	pte_t entry;
 
 	entry = *ptep;
-	if (pte_present(entry))
-		mm->context.hugetlb_pte_count--;
+	if (pte_present(entry)) {
+#if defined(CONFIG_SHARED_MMU_CTX)
+		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
+			mm->context.shared_hugetlb_pte_count--;
+		else
+#endif
+			mm->context.hugetlb_pte_count--;
+	}
 
 	addr &= HPAGE_MASK;
 	*ptep = __pte(0UL);
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index bb9a6ee..2b310e5 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -346,6 +346,21 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *
 	spin_lock_irqsave(&mm->context.lock, flags);
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
+#if defined(CONFIG_SHARED_MMU_CTX)
+	if ((mm->context.hugetlb_pte_count || mm->context.thp_pte_count ||
+	    mm->context.shared_hugetlb_pte_count) && is_hugetlb_pte(pte)) {
+		/* We are fabricating 8MB pages using 4MB real hw pages.  */
+		pte_val(pte) |= (address & (1UL << REAL_HPAGE_SHIFT));
+		if (is_sharedctx_pte(pte))
+			__update_mmu_tsb_insert(mm, MM_TSB_HUGE_SHARED,
+					REAL_HPAGE_SHIFT, address,
+					pte_val(pte));
+		else
+			__update_mmu_tsb_insert(mm, MM_TSB_HUGE,
+					REAL_HPAGE_SHIFT, address,
+					pte_val(pte));
+	} else
+#else
 	if ((mm->context.hugetlb_pte_count || mm->context.thp_pte_count) &&
 	    is_hugetlb_pte(pte)) {
 		/* We are fabricating 8MB pages using 4MB real hw pages.  */
@@ -354,6 +369,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *
 					address, pte_val(pte));
 	} else
 #endif
+#endif
 		__update_mmu_tsb_insert(mm, MM_TSB_BASE, PAGE_SHIFT,
 					address, pte_val(pte));
 
@@ -2915,7 +2931,7 @@ static void context_reload(void *__data)
 		load_secondary_context(mm);
 }
 
-void hugetlb_setup(struct pt_regs *regs)
+static void __hugetlb_setup_common(struct pt_regs *regs, unsigned long tsb_idx)
 {
 	struct mm_struct *mm = current->mm;
 	struct tsb_config *tp;
@@ -2933,15 +2949,18 @@ void hugetlb_setup(struct pt_regs *regs)
 		die_if_kernel("HugeTSB in atomic", regs);
 	}
 
-	tp = &mm->context.tsb_block[MM_TSB_HUGE];
+	tp = &mm->context.tsb_block[tsb_idx];
 	if (likely(tp->tsb == NULL))
-		tsb_grow(mm, MM_TSB_HUGE, 0);
+		tsb_grow(mm, tsb_idx, 0);
 
 	tsb_context_switch(mm);
 	smp_tsb_sync(mm);
 
 	/* On UltraSPARC-III+ and later, configure the second half of
 	 * the Data-TLB for huge pages.
+	 *
+	 * Note that the following does not execute on platforms where
+	 * shared context is supported.
 	 */
 	if (tlb_type == cheetah_plus) {
 		bool need_context_reload = false;
@@ -2974,6 +2993,23 @@ void hugetlb_setup(struct pt_regs *regs)
 			on_each_cpu(context_reload, mm, 0);
 	}
 }
+
+void hugetlb_setup(struct pt_regs *regs)
+{
+	__hugetlb_setup_common(regs, MM_TSB_HUGE);
+}
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+void hugetlb_shared_setup(struct pt_regs *regs)
+{
+	__hugetlb_setup_common(regs, MM_TSB_HUGE_SHARED);
+}
+#else
+void hugetlb_shared_setup(struct pt_regs *regs)
+{
+	BUG();
+}
+#endif
 #endif
 
 static struct resource code_resource = {
diff --git a/arch/sparc/mm/tsb.c b/arch/sparc/mm/tsb.c
index 8c2d148..0b684de 100644
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -108,6 +108,12 @@ void flush_tsb_user(struct tlb_batch *tb)
 			base = __pa(base);
 		__flush_tsb_one(tb, REAL_HPAGE_SHIFT, base, nentries);
 	}
+
+	/*
+	 * FIXME
+	 * I don't "think" we want to flush shared context tsb entries here.
+	 * There should at least be a comment.
+	 */
 #endif
 	spin_unlock_irqrestore(&mm->context.lock, flags);
 }
@@ -133,6 +139,11 @@ void flush_tsb_user_page(struct mm_struct *mm, unsigned long vaddr, bool huge)
 			base = __pa(base);
 		__flush_tsb_one_entry(base, vaddr, REAL_HPAGE_SHIFT, nentries);
 	}
+	/*
+	 * FIXME
+	 * Again, we should give more thought to the need for flushing
+	 * shared context pages.  At least a comment is needed.
+	 */
 #endif
 	spin_unlock_irqrestore(&mm->context.lock, flags);
 }
@@ -159,6 +170,7 @@ static void setup_tsb_params(struct mm_struct *mm, unsigned long tsb_idx, unsign
 		break;
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 	case MM_TSB_HUGE:
+	case MM_TSB_HUGE_SHARED:
 		base = TSBMAP_4M_BASE;
 		break;
 #endif
@@ -251,6 +263,7 @@ static void setup_tsb_params(struct mm_struct *mm, unsigned long tsb_idx, unsign
 			break;
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 		case MM_TSB_HUGE:
+		case MM_TSB_HUGE_SHARED:
 			hp->pgsz_idx = HV_PGSZ_IDX_HUGE;
 			break;
 #endif
@@ -260,12 +273,21 @@ static void setup_tsb_params(struct mm_struct *mm, unsigned long tsb_idx, unsign
 		hp->assoc = 1;
 		hp->num_ttes = tsb_bytes / 16;
 		hp->ctx_idx = 0;
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+		/*
+		 * For shared context TSBs, adjust the context register index
+		 */
+		if (mm->context.shared_ctx && tsb_idx == MM_TSB_HUGE_SHARED)
+			hp->ctx_idx = 1;
+#endif
 		switch (tsb_idx) {
 		case MM_TSB_BASE:
 			hp->pgsz_mask = HV_PGSZ_MASK_BASE;
 			break;
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 		case MM_TSB_HUGE:
+		case MM_TSB_HUGE_SHARED:
 			hp->pgsz_mask = HV_PGSZ_MASK_HUGE;
 			break;
 #endif
@@ -520,12 +542,18 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 	unsigned long saved_hugetlb_pte_count;
 	unsigned long saved_thp_pte_count;
+#if defined(CONFIG_SHARED_MMU_CTX)
+	unsigned long saved_shared_hugetlb_pte_count;
+#endif
 #endif
 	unsigned int i;
 
 	spin_lock_init(&mm->context.lock);
 
 	mm->context.sparc64_ctx_val = 0UL;
+#if defined(CONFIG_SHARED_MMU_CTX)
+	mm->context.shared_ctx = NULL;
+#endif
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 	/* We reset them to zero because the fork() page copying
@@ -536,6 +564,10 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 	saved_thp_pte_count = mm->context.thp_pte_count;
 	mm->context.hugetlb_pte_count = 0;
 	mm->context.thp_pte_count = 0;
+#if defined(CONFIG_SHARED_MMU_CTX)
+	saved_shared_hugetlb_pte_count = mm->context.shared_hugetlb_pte_count;
+	mm->context.shared_hugetlb_pte_count = 0;
+#endif
 
 	mm_rss -= saved_thp_pte_count * (HPAGE_SIZE / PAGE_SIZE);
 #endif
@@ -544,8 +576,10 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 	 * us, so we need to zero out the TSB pointer or else tsb_grow()
 	 * will be confused and think there is an older TSB to free up.
 	 */
-	for (i = 0; i < MM_NUM_TSBS; i++)
+	for (i = 0; i < MM_NUM_TSBS; i++) {
 		mm->context.tsb_block[i].tsb = NULL;
+		mm->context.tsb_descr[i].tsb_base = 0UL;
+	}
 
 	/* If this is fork, inherit the parent's TSB size.  We would
 	 * grow it to that size on the first page fault anyways.
@@ -557,6 +591,11 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 		tsb_grow(mm, MM_TSB_HUGE,
 			 (saved_hugetlb_pte_count + saved_thp_pte_count) *
 			 REAL_HPAGE_PER_HPAGE);
+#if defined(CONFIG_SHARED_MMU_CTX)
+	if (unlikely(saved_shared_hugetlb_pte_count))
+		tsb_grow(mm, MM_TSB_HUGE_SHARED,
+			saved_shared_hugetlb_pte_count * REAL_HPAGE_PER_HPAGE);
+#endif
 #endif
 
 	if (unlikely(!mm->context.tsb_block[MM_TSB_BASE].tsb))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 07/14] sparc64: move COMPUTE_TAG_TARGET and COMPUTE_TSB_PTR to header file
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (5 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 06/14] sparc64: general shared context tsb creation and support Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 08/14] sparc64: shared context tsb handling at context switch time Mike Kravetz
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Move macros COMPUTE_TSB_PTR and COMPUTE_TSB_PTR out of .S file to
headers so that they can be used in other files.

Also, add new macro IF_TLB_TYPE_NOT_HYPE

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/tsb.h       | 38 ++++++++++++++++++++++++++++++++++++++
 arch/sparc/kernel/sun4v_tlb_miss.S | 29 ++---------------------------
 2 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/arch/sparc/include/asm/tsb.h b/arch/sparc/include/asm/tsb.h
index 311cd4e..bb7df61 100644
--- a/arch/sparc/include/asm/tsb.h
+++ b/arch/sparc/include/asm/tsb.h
@@ -75,6 +75,44 @@ extern struct tsb_phys_patch_entry __tsb_phys_patch, __tsb_phys_patch_end;
 
 extern struct kmem_cache *shared_mmu_ctx_cachep __read_mostly;
 #endif
+
+	/*
+	 * If tlb type is not hypervisor, branch to label
+	 */
+#define	IF_TLB_TYPE_NOT_HYPE(TMP, NOT_HYPE_LABEL)	\
+	sethi	%hi(tlb_type), TMP;			\
+	lduw	[TMP + %lo(tlb_type)], TMP;		\
+	cmp	TMP, 3;					\
+	bne,pn	%icc, NOT_HYPE_LABEL;			\
+	nop
+
+	/* DEST = (VADDR >> 22)
+	 *
+	 * Branch to ZERO_CTX_LABEL if context is zero.
+	 */
+#define	COMPUTE_TAG_TARGET(DEST, VADDR, CTX, ZERO_CTX_LABEL) \
+	srlx	VADDR, 22, DEST; \
+	brz,pn	CTX, ZERO_CTX_LABEL; \
+	 nop;
+
+	/* Create TSB pointer.  This is something like:
+	 *
+	 * index_mask = (512 << (tsb_reg & 0x7UL)) - 1UL;
+	 * tsb_base = tsb_reg & ~0x7UL;
+	 * tsb_index = ((vaddr >> HASH_SHIFT) & tsb_mask);
+	 * tsb_ptr = tsb_base + (tsb_index * 16);
+	 */
+#define COMPUTE_TSB_PTR(TSB_PTR, VADDR, HASH_SHIFT, TMP1, TMP2) \
+	and	TSB_PTR, 0x7, TMP1;			\
+	mov	512, TMP2;				\
+	andn	TSB_PTR, 0x7, TSB_PTR;			\
+	sllx	TMP2, TMP1, TMP2;			\
+	srlx	VADDR, HASH_SHIFT, TMP1;		\
+	sub	TMP2, 1, TMP2;				\
+	and	TMP1, TMP2, TMP1;			\
+	sllx	TMP1, 4, TMP1;				\
+	add	TSB_PTR, TMP1, TSB_PTR;
+
 #define TSB_LOAD_QUAD(TSB, REG)	\
 661:	ldda		[TSB] ASI_NUCLEUS_QUAD_LDD, REG; \
 	.section	.tsb_ldquad_phys_patch, "ax"; \
diff --git a/arch/sparc/kernel/sun4v_tlb_miss.S b/arch/sparc/kernel/sun4v_tlb_miss.S
index 6179e19..46fbc16 100644
--- a/arch/sparc/kernel/sun4v_tlb_miss.S
+++ b/arch/sparc/kernel/sun4v_tlb_miss.S
@@ -3,6 +3,8 @@
  * Copyright (C) 2006 <davem@davemloft.net>
  */
 
+#include <asm/tsb.h>
+
 	.text
 	.align	32
 
@@ -16,33 +18,6 @@
 	ldx	[BASE + HV_FAULT_D_ADDR_OFFSET], VADDR; \
 	ldx	[BASE + HV_FAULT_D_CTX_OFFSET], CTX;
 
-	/* DEST = (VADDR >> 22)
-	 *
-	 * Branch to ZERO_CTX_LABEL if context is zero.
-	 */
-#define	COMPUTE_TAG_TARGET(DEST, VADDR, CTX, ZERO_CTX_LABEL) \
-	srlx	VADDR, 22, DEST; \
-	brz,pn	CTX, ZERO_CTX_LABEL; \
-	 nop;
-
-	/* Create TSB pointer.  This is something like:
-	 *
-	 * index_mask = (512 << (tsb_reg & 0x7UL)) - 1UL;
-	 * tsb_base = tsb_reg & ~0x7UL;
-	 * tsb_index = ((vaddr >> HASH_SHIFT) & tsb_mask);
-	 * tsb_ptr = tsb_base + (tsb_index * 16);
-	 */
-#define COMPUTE_TSB_PTR(TSB_PTR, VADDR, HASH_SHIFT, TMP1, TMP2) \
-	and	TSB_PTR, 0x7, TMP1;			\
-	mov	512, TMP2;				\
-	andn	TSB_PTR, 0x7, TSB_PTR;			\
-	sllx	TMP2, TMP1, TMP2;			\
-	srlx	VADDR, HASH_SHIFT, TMP1;		\
-	sub	TMP2, 1, TMP2;				\
-	and	TMP1, TMP2, TMP1;			\
-	sllx	TMP1, 4, TMP1;				\
-	add	TSB_PTR, TMP1, TSB_PTR;
-
 sun4v_itlb_miss:
 	/* Load MMU Miss base into %g2.  */
 	ldxa	[%g0] ASI_SCRATCHPAD, %g2
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 08/14] sparc64: shared context tsb handling at context switch time
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (6 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 07/14] sparc64: move COMPUTE_TAG_TARGET and COMPUTE_TSB_PTR to header file Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 09/14] sparc64: TLB/TSB miss handling for shared context Mike Kravetz
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

At context switch time, load the shared context TSB into the MMU (if
applicable) and set up global state to include the TSB.

sun4v loads the address of base and huge page TSBs into scratchpad
registers.  There is not an extra register for shared context TSB.
So, use offset 0xd0 in the trap block.  This is TRAP_PER_CPU_TSB_HUGE,
and is only used on sun4u.  We can then use this area for the shared
context on sun4v.

With this commit, global state is set up for shared context TSB but
still not used.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/mmu_context_64.h | 27 ++++++++++++++----
 arch/sparc/include/asm/trap_block.h     |  3 +-
 arch/sparc/kernel/head_64.S             |  2 +-
 arch/sparc/kernel/tsb.S                 | 50 +++++++++++++++++++++------------
 4 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index 84268df..0dc95cb5 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -36,21 +36,38 @@ void destroy_context(struct mm_struct *mm);
 void __tsb_context_switch(unsigned long pgd_pa,
 			  struct tsb_config *tsb_base,
 			  struct tsb_config *tsb_huge,
+			  struct tsb_config *tsb_huge_shared,
 			  unsigned long tsb_descr_pa);
 
+#if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 static inline void tsb_context_switch(struct mm_struct *mm)
 {
+	/*
+	 * The conditional for tsb_descr_pa handles shared context
+	 * case where tsb_block[0] may not be used.
+	 */
 	__tsb_context_switch(__pa(mm->pgd),
 			     &mm->context.tsb_block[MM_TSB_BASE],
-#if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 			     (mm->context.tsb_block[MM_TSB_HUGE].tsb ?
 			      &mm->context.tsb_block[MM_TSB_HUGE] :
-			      NULL)
+			      NULL),
+			     (mm->context.tsb_block[MM_TSB_HUGE_SHARED].tsb ?
+			      &mm->context.tsb_block[MM_TSB_HUGE_SHARED] :
+			      NULL),
+			     (mm->context.tsb_block[0].tsb ?
+			      __pa(&mm->context.tsb_descr[0]) :
+			      __pa(&mm->context.tsb_descr[1])));
+}
 #else
-			     NULL
-#endif
-			     , __pa(&mm->context.tsb_descr[MM_TSB_BASE]));
+static inline void tsb_context_switch(struct mm_struct *mm)
+{
+	__tsb_context_switch(__pa(mm->pgd),
+			     &mm->context.tsb_block[MM_TSB_BASE],
+			     NULL,
+			     NULL,
+			     __pa(&mm->context.tsb_descr[MM_TSB_BASE]);
 }
+#endif
 
 void tsb_grow(struct mm_struct *mm,
 	      unsigned long tsb_index,
diff --git a/arch/sparc/include/asm/trap_block.h b/arch/sparc/include/asm/trap_block.h
index ec9c04d..e971785 100644
--- a/arch/sparc/include/asm/trap_block.h
+++ b/arch/sparc/include/asm/trap_block.h
@@ -96,7 +96,8 @@ extern struct sun4v_2insn_patch_entry __sun_m7_2insn_patch,
 #define TRAP_PER_CPU_FAULT_INFO		0x40
 #define TRAP_PER_CPU_CPU_MONDO_BLOCK_PA	0xc0
 #define TRAP_PER_CPU_CPU_LIST_PA	0xc8
-#define TRAP_PER_CPU_TSB_HUGE		0xd0
+#define TRAP_PER_CPU_TSB_HUGE		0xd0	/* sun4u only */
+#define TRAP_PER_CPU_TSB_HUGE_SHARED	0xd0	/* sun4v only */
 #define TRAP_PER_CPU_TSB_HUGE_TEMP	0xd8
 #define TRAP_PER_CPU_IRQ_WORKLIST_PA	0xe0
 #define TRAP_PER_CPU_CPU_MONDO_QMASK	0xe8
diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S
index 6aa3da1..0bf1e1f 100644
--- a/arch/sparc/kernel/head_64.S
+++ b/arch/sparc/kernel/head_64.S
@@ -875,7 +875,6 @@ sparc64_boot_end:
 #include "sun4v_tlb_miss.S"
 #include "sun4v_ivec.S"
 #include "ktlb.S"
-#include "tsb.S"
 
 /*
  * The following skip makes sure the trap table in ttable.S is aligned
@@ -916,6 +915,7 @@ swapper_4m_tsb:
 
 ! 0x0000000000428000
 
+#include "tsb.S"
 #include "systbls_64.S"
 
 	.data
diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S
index d568c82..3ed3e7c 100644
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -374,7 +374,8 @@ tsb_flush:
 	 * %o0: page table physical address
 	 * %o1:	TSB base config pointer
 	 * %o2:	TSB huge config pointer, or NULL if none
-	 * %o3:	Hypervisor TSB descriptor physical address
+	 * %o3: TSB huge shared config pointer, or NULL if none
+	 * %o4: Hypervisor TSB descriptor physical address
 	 *
 	 * We have to run this whole thing with interrupts
 	 * disabled so that the current cpu doesn't change
@@ -387,6 +388,8 @@ __tsb_context_switch:
 	rdpr	%pstate, %g1
 	wrpr	%g1, PSTATE_IE, %pstate
 
+	mov	%o4, %g7
+
 	TRAP_LOAD_TRAP_BLOCK(%g2, %g3)
 
 	stx	%o0, [%g2 + TRAP_PER_CPU_PGD_PADDR]
@@ -397,13 +400,8 @@ __tsb_context_switch:
 
 	ldx	[%o2 + TSB_CONFIG_REG_VAL], %g3
 
-1:	stx	%g3, [%g2 + TRAP_PER_CPU_TSB_HUGE]
-
-	sethi	%hi(tlb_type), %g2
-	lduw	[%g2 + %lo(tlb_type)], %g2
-	cmp	%g2, 3
-	bne,pt	%icc, 50f
-	 nop
+1:	IF_TLB_TYPE_NOT_HYPE(%o5, 50f)
+	/* Only setup HV TSB descriptors on appropriate MMU */
 
 	/* Hypervisor TSB switch. */
 	mov	SCRATCHPAD_UTSBREG1, %o5
@@ -411,27 +409,43 @@ __tsb_context_switch:
 	mov	SCRATCHPAD_UTSBREG2, %o5
 	stxa	%g3, [%o5] ASI_SCRATCHPAD
 
-	mov	2, %o0
+	/* Start counting HV tsb descriptors. */
+	mov	1, %o0				/* Always MM_TSB_BASE */
+	cmp	%g3, -1				/* MM_TSB_HUGE ? */
+	beq	%xcc, 2f
+	 nop
+	add	%o0, 1, %o0
+2:
+	brz,pt	%o3, 3f				/* MM_TSB_HUGE_SHARED ? */
+	 mov	-1, %g3
+	ldx	[%o3 + TSB_CONFIG_REG_VAL], %g3
+3:
+	/* Put Huge Shared TSB in trap block */
+	stx	%g3, [%g2 + TRAP_PER_CPU_TSB_HUGE_SHARED]
 	cmp	%g3, -1
-	move	%xcc, 1, %o0
-
+	beq	%xcc, 4f
+	 nop
+	add	%o0, 1, %o0
+4:
 	mov	HV_FAST_MMU_TSB_CTXNON0, %o5
-	mov	%o3, %o1
+	mov	%g7, %o1
 	ta	HV_FAST_TRAP
 
 	/* Finish up.  */
-	ba,pt	%xcc, 9f
+	ba,pt	%xcc, 60f
 	 nop
 
 	/* SUN4U TSB switch.  */
-50:	mov	TSB_REG, %o5
+50:	stx	%g3, [%g2 + TRAP_PER_CPU_TSB_HUGE]
+
+	mov	TSB_REG, %o5
 	stxa	%o0, [%o5] ASI_DMMU
 	membar	#Sync
 	stxa	%o0, [%o5] ASI_IMMU
 	membar	#Sync
 
-2:	ldx	[%o1 + TSB_CONFIG_MAP_VADDR], %o4
-	brz	%o4, 9f
+	ldx	[%o1 + TSB_CONFIG_MAP_VADDR], %o4
+	brz	%o4, 60f
 	 ldx	[%o1 + TSB_CONFIG_MAP_PTE], %o5
 
 	sethi	%hi(sparc64_highest_unlocked_tlb_ent), %g2
@@ -443,7 +457,7 @@ __tsb_context_switch:
 	stxa	%o5, [%g2] ASI_DTLB_DATA_ACCESS
 	membar	#Sync
 
-	brz,pt	%o2, 9f
+	brz,pt	%o2, 60f
 	 nop
 
 	ldx	[%o2 + TSB_CONFIG_MAP_VADDR], %o4
@@ -455,7 +469,7 @@ __tsb_context_switch:
 	stxa	%o5, [%g2] ASI_DTLB_DATA_ACCESS
 	membar	#Sync
 
-9:
+60:
 	wrpr	%g1, %pstate
 
 	retl
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 09/14] sparc64: TLB/TSB miss handling for shared context
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (7 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 08/14] sparc64: shared context tsb handling at context switch time Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 10/14] mm: add shared context to vm_area_struct Mike Kravetz
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Modifications to the fault handling code to take shared context TSB
into account.  For now, the shared context code mirrors the huge
page code.  The _PAGE_SHR_CTX_4V page flag is used to determine
which TSB should be used.

Note, TRAP_PER_CPU_TSB_HUGE_TEMP is used to stash away calculation
of a TTE address in the huge page TSB.  At present, tehre is no
similar mechanism for shared context TSB so the address must be
recalculated.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/kernel/sun4v_tlb_miss.S |   8 +++
 arch/sparc/kernel/tsb.S            | 122 ++++++++++++++++++++++++++++++++-----
 2 files changed, 116 insertions(+), 14 deletions(-)

diff --git a/arch/sparc/kernel/sun4v_tlb_miss.S b/arch/sparc/kernel/sun4v_tlb_miss.S
index 46fbc16..c438ccc 100644
--- a/arch/sparc/kernel/sun4v_tlb_miss.S
+++ b/arch/sparc/kernel/sun4v_tlb_miss.S
@@ -152,6 +152,14 @@ sun4v_tsb_miss_common:
 	sub	%g2, TRAP_PER_CPU_FAULT_INFO, %g2
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
+	/*
+	 * FIXME
+	 *
+	 * This just computes the possible huge page TSB entry.  It does
+	 * not consider the shared huge page TSB.  Also, care must be taken
+	 * so that TRAP_PER_CPU_TSB_HUGE_TEMP is only used for non-shared
+	 * huge TSB.
+	 */
 	mov	SCRATCHPAD_UTSBREG2, %g5
 	ldxa	[%g5] ASI_SCRATCHPAD, %g5
 	cmp	%g5, -1
diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S
index 3ed3e7c..57ee5ad 100644
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -55,6 +55,9 @@ tsb_miss_page_table_walk:
 	 */
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 
+	/*
+	 * First check the normal huge page TSB
+	 */
 661:	ldx		[%g7 + TRAP_PER_CPU_TSB_HUGE], %g5
 	nop
 	.section	.sun4v_2insn_patch, "ax"
@@ -64,7 +67,47 @@ tsb_miss_page_table_walk:
 	.previous
 
 	cmp		%g5, -1
-	be,pt		%xcc, 80f
+	be,pt		%xcc, chk_huge_page_shared
+	 nop
+
+	/* We need an aligned pair of registers containing 2 values
+	 * which can be easily rematerialized.  %g6 and %g7 foot the
+	 * bill just nicely.  We'll save %g6 away into %g2 for the
+	 * huge page TSB TAG comparison.
+	 *
+	 * Perform a huge page TSB lookup.
+	 */
+	mov		%g6, %g2
+
+	COMPUTE_TSB_PTR(%g5, %g4, REAL_HPAGE_SHIFT, %g6, %g7)
+
+	TSB_LOAD_QUAD(%g5, %g6)
+	cmp		%g6, %g2
+	be,a,pt		%xcc, tsb_tlb_reload
+	 mov		%g7, %g5
+
+	/*
+	 * No match, restore %g6 and %g7.
+	 * Store huge page TSB entry address
+	 *
+	 * FIXME - Look into use of TRAP_PER_CPU_TSB_HUGE_TEMP as it
+	 * is only used for regular, not shared huge pages.
+	 */
+	TRAP_LOAD_TRAP_BLOCK(%g7, %g6)
+	srlx		%g4, 22, %g6
+
+chk_huge_page_shared:
+	stx		%g5, [%g7 + TRAP_PER_CPU_TSB_HUGE_TEMP]
+
+	/*
+	 * For now (POC) only check shared context on hypervisor
+	 */
+	IF_TLB_TYPE_NOT_HYPE(%g2, huge_checks_done)
+
+	/* Check the shared huge page TSB */
+	ldx		[%g7 + TRAP_PER_CPU_TSB_HUGE_SHARED], %g5
+	cmp		%g5, -1
+	bne,pn		%xcc, huge_checks_done
 	 nop
 
 	/* We need an aligned pair of registers containing 2 values
@@ -75,15 +118,8 @@ tsb_miss_page_table_walk:
 	 * Perform a huge page TSB lookup.
 	 */
 	mov		%g6, %g2
-	and		%g5, 0x7, %g6
-	mov		512, %g7
-	andn		%g5, 0x7, %g5
-	sllx		%g7, %g6, %g7
-	srlx		%g4, REAL_HPAGE_SHIFT, %g6
-	sub		%g7, 1, %g7
-	and		%g6, %g7, %g6
-	sllx		%g6, 4, %g6
-	add		%g5, %g6, %g5
+
+	COMPUTE_TSB_PTR(%g5, %g4, REAL_HPAGE_SHIFT, %g6, %g7)
 
 	TSB_LOAD_QUAD(%g5, %g6)
 	cmp		%g6, %g2
@@ -91,25 +127,29 @@ tsb_miss_page_table_walk:
 	 mov		%g7, %g5
 
 	/* No match, remember the huge page TSB entry address,
-	 * and restore %g6 and %g7.
+	 * restore %g6 and %g7.
+	 *
+	 * NOT REALLY REMEMBERING -  See FIXME above
 	 */
 	TRAP_LOAD_TRAP_BLOCK(%g7, %g6)
 	srlx		%g4, 22, %g6
-80:	stx		%g5, [%g7 + TRAP_PER_CPU_TSB_HUGE_TEMP]
 
+huge_checks_done:
+	stx		%g5, [%g7 + TRAP_PER_CPU_TSB_HUGE_TEMP]
 #endif
 
 	ldx		[%g7 + TRAP_PER_CPU_PGD_PADDR], %g7
 
 	/* At this point we have:
-	 * %g1 --	TSB entry address
+	 * %g1 --	Base TSB entry address
 	 * %g3 --	FAULT_CODE_{D,I}TLB
 	 * %g4 --	missing virtual address
 	 * %g6 --	TAG TARGET (vaddr >> 22)
 	 * %g7 --	page table physical address
 	 *
 	 * We know that both the base PAGE_SIZE TSB and the HPAGE_SIZE
-	 * TSB both lack a matching entry.
+	 * TSB both lack a matching entry, as well as shared TSBs if
+	 * present.
 	 */
 tsb_miss_page_table_walk_sun4v_fastpath:
 	USER_PGTABLE_WALK_TL1(%g4, %g7, %g5, %g2, tsb_do_fault)
@@ -152,12 +192,42 @@ tsb_miss_page_table_walk_sun4v_fastpath:
 	 * thus handle it here.  This also makes sure that we can
 	 * allocate the TSB hash table on the correct NUMA node.
 	 */
+
+	/*
+	 * Check for shared context PTE, in this case we do not have
+	 * a saved TSB entry pointer and must compute now
+	 */
+	IF_TLB_TYPE_NOT_HYPE(%g2, no_shared_ctx_pte)
+
+	mov		_PAGE_SHR_CTX_4V, %g2
+	andcc		%g5, %g2, %g2
+	be,pn		%xcc, no_shared_ctx_pte
+
+	/*
+	 * If there was a shared context TSB, then we need to copmute the
+	 * TSB entry address.  Previously, only the non-shared context
+	 * TSB entry address was calculated.
+	 *
+	 * FIXME
+	 */
+	TRAP_LOAD_TRAP_BLOCK(%g7, %g1)
+	ldx		[%g7 + TRAP_PER_CPU_TSB_HUGE_SHARED], %g1
+	cmp		%g1, -1
+	be,pn		%xcc, no_shared_hugetlb
+	 nop
+
+	COMPUTE_TSB_PTR(%g1, %g4, REAL_HPAGE_SHIFT, %g2, %g7)
+
+	ba,a,pt %xcc,tsb_reload
+
+no_shared_ctx_pte:
 	TRAP_LOAD_TRAP_BLOCK(%g7, %g2)
 	ldx		[%g7 + TRAP_PER_CPU_TSB_HUGE_TEMP], %g1
 	cmp		%g1, -1
 	bne,pt		%xcc, 60f
 	 nop
 
+no_hugetlb:
 661:	rdpr		%pstate, %g5
 	wrpr		%g5, PSTATE_AG | PSTATE_MG, %pstate
 	.section	.sun4v_2insn_patch, "ax"
@@ -177,6 +247,30 @@ tsb_miss_page_table_walk_sun4v_fastpath:
 	ba,pt	%xcc, rtrap
 	 nop
 
+	/*
+	 * This is the same as above call to hugetlb_setup.
+	 * FIXME
+	 */
+no_shared_hugetlb:
+661:	rdpr		%pstate, %g5
+	wrpr		%g5, PSTATE_AG | PSTATE_MG, %pstate
+	.section	.sun4v_2insn_patch, "ax"
+	.word		661b
+	SET_GL(1)
+	nop
+	.previous
+
+	rdpr	%tl, %g7
+	cmp	%g7, 1
+	bne,pn	%xcc, winfix_trampoline
+	 mov	%g3, %g4
+	ba,pt	%xcc, etrap
+	 rd	%pc, %g7
+	call	hugetlb_shared_setup
+	 add	%sp, PTREGS_OFF, %o0
+	ba,pt	%xcc, rtrap
+	 nop
+
 60:
 #endif
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 10/14] mm: add shared context to vm_area_struct
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (8 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 09/14] sparc64: TLB/TSB miss handling for shared context Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 11/14] sparc64: add routines to look for vmsa which can share context Mike Kravetz
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Shared context usage is reflected in a vm area (vma).  To handle this,
a new flag (VM_SHARED_CTX) is added anlng with a pointer to a shared
context structure (vm_shared_mmu_ctx).

This commit does not contain the method by which a vma is marked for
shared context.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/mm.h       |  1 +
 include/linux/mm_types.h | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a92c8d7..9d82028 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -182,6 +182,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_ACCOUNT	0x00100000	/* Is a VM accounted object */
 #define VM_NORESERVE	0x00200000	/* should the VM suppress accounting */
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
+#define VM_SHARED_CTX	0x00800000	/* Shared TLB context */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
 #define VM_ARCH_2	0x02000000
 #define VM_DONTDUMP	0x04000000	/* Do not include in the core dump */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4a8aced..0c30d43 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -291,6 +291,18 @@ struct vm_userfaultfd_ctx {
 struct vm_userfaultfd_ctx {};
 #endif /* CONFIG_USERFAULTFD */
 
+#ifdef CONFIG_SHARED_MMU_CTX
+#define NULL_VM_SHARED_MMU_CTX ((struct vm_shared_mmu_ctx) { NULL, })
+struct vm_shared_mmu_ctx {
+	struct shared_mmu_ctx *ctx;
+};
+#define vma_shared_ctx_val(vma)					\
+	((vma)->vm_shared_mmu_ctx.ctx ?				\
+	 (vma)->vm_shared_mmu_ctx.ctx->shared_ctx_val : 0UL)
+#else /* CONFIG_SHARED__MMU_CTX */
+struct vm_shared_mmu_ctx {};
+#endif /* CONFIG_SHARED_MMU_CTX */
+
 /*
  * This struct defines a memory VMM memory area. There is one of these
  * per VM-area/task.  A VM area is any part of the process virtual memory
@@ -358,6 +370,7 @@ struct vm_area_struct {
 	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+	struct vm_shared_mmu_ctx vm_shared_mmu_ctx;
 };
 
 struct core_thread {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 11/14] sparc64: add routines to look for vmsa which can share context
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (9 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 10/14] mm: add shared context to vm_area_struct Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 12/14] mm: add mmap and shmat arch hooks for shared context Mike Kravetz
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

When a shared context mapping is requested, a search of the other
vmas mapping the same object is searched.  For simplicity, vmas
can only share context if the following is true:
- They both request shared context mapping
- The are at the same virtual address
- They are of the same size
In addition, a task is only allowed to have a single vma with shared
context.

Some of these contstraints can be relaxed at a later date.  They
make the code simpler for now.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/mmu_context_64.h |  1 +
 arch/sparc/include/asm/page_64.h        |  1 +
 arch/sparc/mm/hugetlbpage.c             | 78 ++++++++++++++++++++++++++++++++-
 arch/sparc/mm/init_64.c                 | 19 ++++++++
 mm/hugetlb.c                            |  9 ++++
 5 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index 0dc95cb5..46c2c7e 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -23,6 +23,7 @@ void get_new_mmu_shared_context(struct mm_struct *mm);
 void put_shared_context(struct mm_struct *mm);
 void set_mm_shared_ctx(struct mm_struct *mm, struct shared_mmu_ctx *ctx);
 void destroy_shared_context(struct mm_struct *mm);
+void set_vma_shared_ctx(struct vm_area_struct *vma);
 #endif
 #ifdef CONFIG_SMP
 void smp_new_mmu_context_version(void);
diff --git a/arch/sparc/include/asm/page_64.h b/arch/sparc/include/asm/page_64.h
index c1263fc..ccceb76 100644
--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -33,6 +33,7 @@
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
 struct pt_regs;
 void hugetlb_setup(struct pt_regs *regs);
+void hugetlb_shared_setup(struct pt_regs *regs);
 #endif
 
 #define WANT_PAGE_VIRTUAL
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 2039d45..5681df6 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -127,6 +127,80 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 				pgoff, flags);
 }
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+static bool huge_vma_can_share_ctx(struct vm_area_struct *vma,
+					struct vm_area_struct *tvma)
+{
+	/*
+	 * Do not match unless there is an actual context value.  It
+	 * could be the case that tvma is a new mapping with VM_SHARED_CTX
+	 * set, but still not associated with a shared context ID.
+	 */
+	if (!vma_shared_ctx_val(tvma))
+		return false;
+
+	/*
+	 * For simple functionality now, vmas must be exactly the same
+	 */
+	if ((vma->vm_flags & VM_LOCKED_CLEAR_MASK) ==
+	    (tvma->vm_flags & VM_LOCKED_CLEAR_MASK) &&
+	    vma->vm_pgoff == tvma->vm_pgoff &&
+	    vma->vm_start == tvma->vm_start &&
+	    vma->vm_end == tvma->vm_end)
+		return true;
+
+	return false;
+}
+
+/*
+ * If vma is marked as desiring shared contexxt, search for a context to
+ * share.  If no context found, assign one.
+ */
+void huge_get_shared_ctx(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma = find_vma(mm, addr);
+	struct address_space *mapping = vma->vm_file->f_mapping;
+	pgoff_t idx = ((addr - vma->vm_start) >> PAGE_SHIFT) +
+			vma->vm_pgoff;
+	struct vm_area_struct *tvma;
+
+	/*
+	 * FIXME
+	 *
+	 * For now limit a task to a single shared context mapping
+	 */
+	if (!(vma->vm_flags & VM_SHARED_CTX) || vma_shared_ctx_val(vma) ||
+	    mm_shared_ctx_val(mm))
+		return;
+
+	i_mmap_lock_write(mapping);
+	vma_interval_tree_foreach(tvma, &mapping->i_mmap, idx, idx) {
+		if (tvma == vma)
+			continue;
+
+		if (huge_vma_can_share_ctx(vma, tvma)) {
+			set_mm_shared_ctx(mm, tvma->vm_shared_mmu_ctx.ctx);
+			set_vma_shared_ctx(vma);
+			if (likely(mm_shared_ctx_val(mm))) {
+				load_secondary_context(mm);
+				/*
+				 * What about multiple matches ?
+				 */
+				break;
+			}
+		}
+	}
+
+	if (!mm_shared_ctx_val(mm)) {
+		get_new_mmu_shared_context(mm);
+		set_vma_shared_ctx(vma);
+		load_secondary_context(mm);
+	}
+
+	i_mmap_unlock_write(mapping);
+}
+#endif
+
 pte_t *huge_pte_alloc(struct mm_struct *mm,
 			unsigned long addr, unsigned long sz)
 {
@@ -164,7 +238,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 	if (!pte_present(*ptep) && pte_present(entry)) {
 #if defined(CONFIG_SHARED_MMU_CTX)
-		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
+		if (is_sharedctx_pte(entry))
 			mm->context.shared_hugetlb_pte_count++;
 		else
 #endif
@@ -188,7 +262,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	entry = *ptep;
 	if (pte_present(entry)) {
 #if defined(CONFIG_SHARED_MMU_CTX)
-		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
+		if (is_sharedctx_pte(entry))
 			mm->context.shared_hugetlb_pte_count--;
 		else
 #endif
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 2b310e5..25ad5bd 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -813,6 +813,25 @@ void set_mm_shared_ctx(struct mm_struct *mm, struct shared_mmu_ctx *ctx)
 	atomic_inc(&ctx->refcount);
 	mm->context.shared_ctx = ctx;
 }
+
+/*
+ * Set the shared context value in the vma to that in the mm.
+ *
+ *
+ * Note that we are called from mmap with mmap_sem held.
+ */
+void set_vma_shared_ctx(struct vm_area_struct *vma)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	BUG_ON(vma->vm_shared_mmu_ctx.ctx);
+
+	if (!mm_shared_ctx_val(mm))
+		return;
+
+	atomic_inc(&mm->context.shared_ctx->refcount);
+	vma->vm_shared_mmu_ctx.ctx = mm->context.shared_ctx;
+}
 #endif
 
 static int numa_enabled = 1;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 418bf01..3733ba1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3150,6 +3150,15 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
 	entry = pte_mkhuge(entry);
 	entry = arch_make_huge_pte(entry, vma, page, writable);
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+	/*
+	 * FIXME
+	 * needs arch independent way of setting - perhaps arch_make_huge_pte
+	 */
+	if (vma->vm_flags & VM_SHARED_CTX)
+		pte_val(entry) |= _PAGE_SHR_CTX_4V;
+#endif
+
 	return entry;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 12/14] mm: add mmap and shmat arch hooks for shared context
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (10 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 11/14] sparc64: add routines to look for vmsa which can share context Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 13/14] sparc64 mm: add shared context support to mmap() and shmat() APIs Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 14/14] sparc64: add SHARED_MMU_CTX Kconfig option Mike Kravetz
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Shared context will require some additional checking and processing
when mappings are created.  To faciliate this, add new mmap hooks
arch_pre_mmap_flags and arch_post_mmap to generic mm_hooks.  For
shmat, a new hook arch_shmat_check is added.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/powerpc/include/asm/mmu_context.h   | 12 ++++++++++++
 arch/s390/include/asm/mmu_context.h      | 12 ++++++++++++
 arch/unicore32/include/asm/mmu_context.h | 12 ++++++++++++
 arch/x86/include/asm/mmu_context.h       | 12 ++++++++++++
 include/asm-generic/mm_hooks.h           | 18 +++++++++++++++---
 ipc/shm.c                                | 13 +++++++++++++
 mm/mmap.c                                | 10 ++++++++++
 7 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 5c45114..d5ce33a 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -133,6 +133,18 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 #endif
 }
 
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return 0;	/* no errors */
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags)
+{
+}
+
 static inline void arch_dup_mmap(struct mm_struct *oldmm,
 				 struct mm_struct *mm)
 {
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 515fea5..0a2322d 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -129,6 +129,18 @@ static inline void activate_mm(struct mm_struct *prev,
 	set_user_asce(next);
 }
 
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return 0;	/* no errors */
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags)
+{
+}
+
 static inline void arch_dup_mmap(struct mm_struct *oldmm,
 				 struct mm_struct *mm)
 {
diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h
index 62dfc64..8b57b9d 100644
--- a/arch/unicore32/include/asm/mmu_context.h
+++ b/arch/unicore32/include/asm/mmu_context.h
@@ -81,6 +81,18 @@ do { \
 	} \
 } while (0)
 
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return 0;	/* no errors */
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags)
+{
+}
+
 static inline void arch_dup_mmap(struct mm_struct *oldmm,
 				 struct mm_struct *mm)
 {
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 8e0a9fe..fe60309 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -151,6 +151,18 @@ do {						\
 } while (0)
 #endif
 
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return 0;	/* no errors */
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags)
+{
+}
+
 static inline void arch_dup_mmap(struct mm_struct *oldmm,
 				 struct mm_struct *mm)
 {
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index cc5d9a1..c742e52 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -1,11 +1,23 @@
 /*
- * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
- * and arch_unmap to be included in asm-FOO/mmu_context.h for any
- * arch FOO which doesn't need to hook these.
+ * Define generic no-op hooks for mmap and protection related routines
+ * to be included in asm-FOO/mmu_context.h for any arch FOO which doesn't
+ * need to hook these.
  */
 #ifndef _ASM_GENERIC_MM_HOOKS_H
 #define _ASM_GENERIC_MM_HOOKS_H
 
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return 0;	/* no errors */
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags)
+{
+}
+
 static inline void arch_dup_mmap(struct mm_struct *oldmm,
 				 struct mm_struct *mm)
 {
diff --git a/ipc/shm.c b/ipc/shm.c
index dbac886..dab6cd1 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -72,6 +72,14 @@ static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp);
 static int sysvipc_shm_proc_show(struct seq_file *s, void *it);
 #endif
 
+#ifndef arch_shmat_check
+#define arch_shmat_check(file, shmflg, flags) (0)
+#endif
+
+#ifndef arch_shmat_check
+#define arch_shmat_check(file, shmflg, flags) (0)
+#endif
+
 void shm_init_ns(struct ipc_namespace *ns)
 {
 	ns->shm_ctlmax = SHMMAX;
@@ -1149,6 +1157,11 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg, ulong *raddr,
 		goto out_unlock;
 	}
 
+	/* arch specific check and possible flag modification */
+	err = arch_shmat_check(shp->shm_file, shmflg, &flags);
+	if (err)
+		goto out_unlock;
+
 	err = -EACCES;
 	if (ipcperms(ns, &shp->shm_perm, acc_mode))
 		goto out_unlock;
diff --git a/mm/mmap.c b/mm/mmap.c
index 1af87c1..7fc946b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1307,6 +1307,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			unsigned long pgoff, unsigned long *populate)
 {
 	struct mm_struct *mm = current->mm;
+	unsigned long ret;
 	int pkey = 0;
 
 	*populate = 0;
@@ -1314,6 +1315,11 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (!len)
 		return -EINVAL;
 
+	/* arch specific check and possible modification of vm_flags */
+	ret = arch_pre_mmap_flags(file, flags, &vm_flags);
+	if (ret)
+		return ret;
+
 	/*
 	 * Does the application expect PROT_READ to imply PROT_EXEC?
 	 *
@@ -1452,6 +1458,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
 		*populate = len;
+
+	if (!IS_ERR_VALUE(addr))
+		arch_post_mmap(mm, addr, vm_flags);
+
 	return addr;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 13/14] sparc64 mm: add shared context support to mmap() and shmat() APIs
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (11 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 12/14] mm: add mmap and shmat arch hooks for shared context Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  2016-12-16 18:35 ` [RFC PATCH 14/14] sparc64: add SHARED_MMU_CTX Kconfig option Mike Kravetz
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Add new mmap(MAP_SHAREDCTX) and shm(SHM_SHAREDCTX) flags to specify
desire for shared context mappings.  This only works on HUGETLB
mappings.  In addition, the mappings must be SHARED and at a FIXED
address otherwize EINVAL will be returned.

Also, populate the sparc specific hooks to mmap and shmat that perform
shared context processing.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/include/asm/hugetlb.h        |  4 +++
 arch/sparc/include/asm/mman.h           |  6 ++++
 arch/sparc/include/asm/mmu_context_64.h | 62 ++++++++++++++++++++++++++++++++-
 arch/sparc/include/uapi/asm/mman.h      |  1 +
 arch/sparc/kernel/sys_sparc_64.c        | 17 +++++++++
 arch/sparc/mm/init_64.c                 | 36 +++++++++++++++++++
 include/uapi/linux/shm.h                |  1 +
 7 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index dcbf985..13157b3 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -78,4 +78,8 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 			    unsigned long end, unsigned long floor,
 			    unsigned long ceiling);
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+void huge_get_shared_ctx(struct mm_struct *mm, unsigned long addr);
+#endif
+
 #endif /* _ASM_SPARC64_HUGETLB_H */
diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h
index 59bb593..cbe384e 100644
--- a/arch/sparc/include/asm/mman.h
+++ b/arch/sparc/include/asm/mman.h
@@ -6,5 +6,11 @@
 #ifndef __ASSEMBLY__
 #define arch_mmap_check(addr,len,flags)	sparc_mmap_check(addr,len)
 int sparc_mmap_check(unsigned long addr, unsigned long len);
+
+#if defined(CONFIG_SHARED_MMU_CTX)
+#define arch_shmat_check(file, shmflg, flags) \
+				sparc_shmat_check(file, shmflg, flags)
+int sparc_shmat_check(struct file *file, int shmflg, unsigned long *flags);
+#endif
 #endif
 #endif /* __SPARC_MMAN_H__ */
diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
index 46c2c7e..8ab05f2 100644
--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -7,7 +7,6 @@
 
 #include <linux/spinlock.h>
 #include <asm/spitfire.h>
-#include <asm-generic/mm_hooks.h>
 
 static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 {
@@ -24,6 +23,13 @@ void put_shared_context(struct mm_struct *mm);
 void set_mm_shared_ctx(struct mm_struct *mm, struct shared_mmu_ctx *ctx);
 void destroy_shared_context(struct mm_struct *mm);
 void set_vma_shared_ctx(struct vm_area_struct *vma);
+void sparc64_exit_mmap(struct mm_struct *mm);
+void sparc64_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
+			unsigned long start, unsigned long end);
+unsigned long sparc64_pre_mmap_flags(struct file *file, unsigned long flags,
+					vm_flags_t *vm_flags);
+void sparc64_post_mmap(struct mm_struct *mm, unsigned long addr,
+					vm_flags_t vm_flags);
 #endif
 #ifdef CONFIG_SMP
 void smp_new_mmu_context_version(void);
@@ -208,6 +214,60 @@ static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm
 	spin_unlock_irqrestore(&mm->context.lock, flags);
 }
 
+#if defined(CONFIG_SHARED_MMU_CTX)
+/*
+ * mm_hooks only needed for CONFIG_SHARED_MMU_CTX
+ */
+static inline unsigned long arch_pre_mmap_flags(struct file *file,
+						unsigned long flags,
+						vm_flags_t *vm_flags)
+{
+	return sparc64_pre_mmap_flags(file, flags, vm_flags);
+}
+
+static inline void arch_post_mmap(struct mm_struct *mm, unsigned long addr,
+							vm_flags_t vm_flags)
+{
+	sparc64_post_mmap(mm, addr, vm_flags);
+}
+
+static inline void arch_dup_mmap(struct mm_struct *oldmm,
+				 struct mm_struct *mm)
+{
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+	sparc64_exit_mmap(mm);
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+			struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
+{
+	sparc64_unmap(mm, vma, start, end);
+}
+
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+				     struct vm_area_struct *vma)
+{
+}
+
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool execute, bool foreign)
+{
+	/* by default, allow everything */
+	return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+	/* by default, allow everything */
+	return true;
+}
+#else
+#include <asm-generic/mm_hooks.h>
+#endif
 #endif /* !(__ASSEMBLY__) */
 
 #endif /* !(__SPARC64_MMU_CONTEXT_H) */
diff --git a/arch/sparc/include/uapi/asm/mman.h b/arch/sparc/include/uapi/asm/mman.h
index 9765896..a52c6fe 100644
--- a/arch/sparc/include/uapi/asm/mman.h
+++ b/arch/sparc/include/uapi/asm/mman.h
@@ -23,6 +23,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define	MAP_SHAREDCTX	0x80000		/* request shared cxt mapping */
 
 
 #endif /* _UAPI__SPARC_MMAN_H__ */
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index fe8b8ee..23fa538 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -25,6 +25,7 @@
 #include <linux/random.h>
 #include <linux/export.h>
 #include <linux/context_tracking.h>
+#include <linux/hugetlb.h>
 
 #include <asm/uaccess.h>
 #include <asm/utrap.h>
@@ -444,6 +445,22 @@ int sparc_mmap_check(unsigned long addr, unsigned long len)
 	return 0;
 }
 
+int sparc_shmat_check(struct file *file, int shmflg, unsigned long *flags)
+{
+	if (shmflg & SHM_SHAREDCTX) {
+		if ((*flags & (MAP_SHARED | MAP_FIXED)) !=
+		    (unsigned long)(MAP_SHARED | MAP_FIXED))
+			return -EINVAL;
+
+		if (!is_file_hugepages(file))
+			return -EINVAL;
+
+		*flags |= MAP_SHAREDCTX;
+	}
+
+	return 0;
+}
+
 /* Linux version of mmap */
 SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
 		unsigned long, prot, unsigned long, flags, unsigned long, fd,
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 25ad5bd..0637762 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -27,6 +27,7 @@
 #include <linux/memblock.h>
 #include <linux/mmzone.h>
 #include <linux/gfp.h>
+#include <linux/mman.h>
 
 #include <asm/head.h>
 #include <asm/page.h>
@@ -832,6 +833,41 @@ void set_vma_shared_ctx(struct vm_area_struct *vma)
 	atomic_inc(&mm->context.shared_ctx->refcount);
 	vma->vm_shared_mmu_ctx.ctx = mm->context.shared_ctx;
 }
+
+unsigned long sparc64_pre_mmap_flags(struct file *file, unsigned long flags,
+					vm_flags_t *vm_flags)
+{
+	if (flags & MAP_SHAREDCTX) {
+		/* Must be a shared huge page mapping */
+		if (!(flags & (MAP_SHARED | MAP_FIXED)))
+			return -EINVAL;
+		if (!(flags & MAP_HUGETLB)  &&
+		    !(file && is_file_hugepages(file)))
+			return -EINVAL;
+
+		*vm_flags |= VM_SHARED_CTX;
+	}
+
+	return 0;
+}
+
+void sparc64_post_mmap(struct mm_struct *mm, unsigned long addr,
+							vm_flags_t vm_flags)
+{
+	if (vm_flags & VM_SHARED_CTX)
+		huge_get_shared_ctx(mm, addr);
+}
+
+void sparc64_exit_mmap(struct mm_struct *mm)
+{
+	put_shared_context(mm);
+}
+
+void sparc64_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
+			unsigned long start, unsigned long end)
+{
+	put_shared_context(mm);
+}
 #endif
 
 static int numa_enabled = 1;
diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h
index 1fbf24e..3373567 100644
--- a/include/uapi/linux/shm.h
+++ b/include/uapi/linux/shm.h
@@ -49,6 +49,7 @@ struct shmid_ds {
 #define	SHM_RND		020000	/* round attach address to SHMLBA boundary */
 #define	SHM_REMAP	040000	/* take-over region on attach */
 #define	SHM_EXEC	0100000	/* execution access */
+#define	SHM_SHAREDCTX	0200000	/* share context (TLB entries) if possible */
 
 /* super user shmctl commands */
 #define SHM_LOCK 	11
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 14/14] sparc64: add SHARED_MMU_CTX Kconfig option
  2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
                   ` (12 preceding siblings ...)
  2016-12-16 18:35 ` [RFC PATCH 13/14] sparc64 mm: add shared context support to mmap() and shmat() APIs Mike Kravetz
@ 2016-12-16 18:35 ` Mike Kravetz
  13 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-16 18:35 UTC (permalink / raw)
  To: sparclinux, linux-mm, linux-kernel
  Cc: David S . Miller, Bob Picco, Nitin Gupta, Vijay Kumar,
	Julian Calaby, Adam Buchbinder, Kirill A . Shutemov,
	Michal Hocko, Andrew Morton, Mike Kravetz

Depends on SPARC64 && HUGETLB_PAGE

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/sparc/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 165ecdd..f39dcdf 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -155,6 +155,9 @@ config PGTABLE_LEVELS
 	default 4 if 64BIT
 	default 3
 
+config SHARED_MMU_CTX
+	def_bool y if SPARC64 && HUGETLB_PAGE
+
 source "init/Kconfig"
 
 source "kernel/Kconfig.freezer"
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-16 18:35 ` [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support Mike Kravetz
@ 2016-12-17  7:34   ` Sam Ravnborg
  2016-12-18 23:33     ` Mike Kravetz
  2016-12-17  7:38   ` Sam Ravnborg
  1 sibling, 1 reply; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-17  7:34 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

Hi Mike.

On Fri, Dec 16, 2016 at 10:35:25AM -0800, Mike Kravetz wrote:
> Add new fields to the mm_context structure to support shared context.
> Instead of a simple context ID, add a pointer to a structure with a
> reference count.  This is needed as multiple tasks will share the
> context ID.

What are the benefits with the shared_mmu_ctx struct?
It does not save any space in mm_context_t, and the CPU only
supports one extra context.
So it looks like over-engineering with all the extra administration
required to handle it with refcount, poitners etc.

what do I miss?

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-16 18:35 ` [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support Mike Kravetz
  2016-12-17  7:34   ` Sam Ravnborg
@ 2016-12-17  7:38   ` Sam Ravnborg
  2016-12-18 23:45     ` Mike Kravetz
  1 sibling, 1 reply; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-17  7:38 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

Hi Mike

> diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
> index b84be67..d031799 100644
> --- a/arch/sparc/include/asm/mmu_context_64.h
> +++ b/arch/sparc/include/asm/mmu_context_64.h
> @@ -35,15 +35,15 @@ void __tsb_context_switch(unsigned long pgd_pa,
>  static inline void tsb_context_switch(struct mm_struct *mm)
>  {
>  	__tsb_context_switch(__pa(mm->pgd),
> -			     &mm->context.tsb_block[0],
> +			     &mm->context.tsb_block[MM_TSB_BASE],
>  #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
> -			     (mm->context.tsb_block[1].tsb ?
> -			      &mm->context.tsb_block[1] :
> +			     (mm->context.tsb_block[MM_TSB_HUGE].tsb ?
> +			      &mm->context.tsb_block[MM_TSB_HUGE] :
>  			      NULL)
>  #else
>  			     NULL
>  #endif
> -			     , __pa(&mm->context.tsb_descr[0]));
> +			     , __pa(&mm->context.tsb_descr[MM_TSB_BASE]));
>  }
>  
This is a nice cleanup that has nothing to do with your series.
Could you submit this as a separate patch so we can get it applied.

This is the only place left where the array index for tsb_block
and tsb_descr uses hardcoded values. And it would be good to get
rid of these.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-16 18:35 ` [RFC PATCH 04/14] sparc64: load shared id into context register 1 Mike Kravetz
@ 2016-12-17  7:45   ` Sam Ravnborg
  2016-12-19  0:22     ` Mike Kravetz
  2016-12-18  3:14   ` David Miller
  1 sibling, 1 reply; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-17  7:45 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

Hi Mike

> diff --git a/arch/sparc/kernel/fpu_traps.S b/arch/sparc/kernel/fpu_traps.S
> index 336d275..f85a034 100644
> --- a/arch/sparc/kernel/fpu_traps.S
> +++ b/arch/sparc/kernel/fpu_traps.S
> @@ -73,6 +73,16 @@ do_fpdis:
>  	ldxa		[%g3] ASI_MMU, %g5
>  	.previous
>  
> +661:	nop
> +	nop
> +	.section	.sun4v_2insn_patch, "ax"
> +	.word		661b
> +	mov		SECONDARY_CONTEXT_R1, %g3
> +	ldxa		[%g3] ASI_MMU, %g4
> +	.previous
> +	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
> +	mov		SECONDARY_CONTEXT, %g3
> +
>  	sethi		%hi(sparc64_kern_sec_context), %g2

You missed the second instruction to patch with here.
This bug repeats itself further down.

Just noted while briefly reading the code - did not really follow the code.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 06/14] sparc64: general shared context tsb creation and support
  2016-12-16 18:35 ` [RFC PATCH 06/14] sparc64: general shared context tsb creation and support Mike Kravetz
@ 2016-12-17  7:53   ` Sam Ravnborg
  2016-12-19  0:52     ` Mike Kravetz
  0 siblings, 1 reply; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-17  7:53 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

Hi Mike

> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -162,8 +162,14 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
>  {
>  	pte_t orig;
>  
> -	if (!pte_present(*ptep) && pte_present(entry))
> -		mm->context.hugetlb_pte_count++;
> +	if (!pte_present(*ptep) && pte_present(entry)) {
> +#if defined(CONFIG_SHARED_MMU_CTX)
> +		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
> +			mm->context.shared_hugetlb_pte_count++;
> +		else
> +#endif
> +			mm->context.hugetlb_pte_count++;
> +	}

This kind of conditional code it just too ugly to survive...
Could a static inline be used to help you here?
The compiler will inline it so there should not be any run-time cost

>  
>  	mm_rss -= saved_thp_pte_count * (HPAGE_SIZE / PAGE_SIZE);
>  #endif
> @@ -544,8 +576,10 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
>  	 * us, so we need to zero out the TSB pointer or else tsb_grow()
>  	 * will be confused and think there is an older TSB to free up.
>  	 */
> -	for (i = 0; i < MM_NUM_TSBS; i++)
> +	for (i = 0; i < MM_NUM_TSBS; i++) {
>  		mm->context.tsb_block[i].tsb = NULL;
> +		mm->context.tsb_descr[i].tsb_base = 0UL;
> +	}
This change seems un-related to the rest?

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management
  2016-12-16 18:35 ` [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management Mike Kravetz
@ 2016-12-18  3:07   ` David Miller
  0 siblings, 0 replies; 35+ messages in thread
From: David Miller @ 2016-12-18  3:07 UTC (permalink / raw)
  To: mike.kravetz
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Fri, 16 Dec 2016 10:35:26 -0800

> +void smp_flush_shared_tlb_mm(struct mm_struct *mm)
> +{
> +	u32 ctx = SHARED_CTX_HWBITS(mm->context);
> +
> +	(void)get_cpu();		/* prevent preemption */

preempt_disable();

> +
> +	smp_cross_call(&xcall_flush_tlb_mm, ctx, 0, 0);
> +	__flush_tlb_mm(ctx, SECONDARY_CONTEXT);
> +
> +	put_cpu();

preempt_enable();

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag
  2016-12-16 18:35 ` [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag Mike Kravetz
@ 2016-12-18  3:12   ` David Miller
  2016-12-19  0:42     ` Mike Kravetz
  0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2016-12-18  3:12 UTC (permalink / raw)
  To: mike.kravetz
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Fri, 16 Dec 2016 10:35:28 -0800

> @@ -166,6 +166,7 @@ bool kern_addr_valid(unsigned long addr);
>  #define _PAGE_EXEC_4V	  _AC(0x0000000000000080,UL) /* Executable Page      */
>  #define _PAGE_W_4V	  _AC(0x0000000000000040,UL) /* Writable             */
>  #define _PAGE_SOFT_4V	  _AC(0x0000000000000030,UL) /* Software bits        */
> +#define _PAGE_SHR_CTX_4V  _AC(0x0000000000000020,UL) /* Shared Context       */
>  #define _PAGE_PRESENT_4V  _AC(0x0000000000000010,UL) /* Present              */
>  #define _PAGE_RESV_4V	  _AC(0x0000000000000008,UL) /* Reserved             */
>  #define _PAGE_SZ16GB_4V	  _AC(0x0000000000000007,UL) /* 16GB Page            */

You really don't need this.

The VMA is available, and you can obtain the information you need
about whether this is a shared mapping or not from the. It just isn't
being passed down into things like set_huge_pte_at().  Simply make it
do so.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-16 18:35 ` [RFC PATCH 04/14] sparc64: load shared id into context register 1 Mike Kravetz
  2016-12-17  7:45   ` Sam Ravnborg
@ 2016-12-18  3:14   ` David Miller
  2016-12-19  0:06     ` Mike Kravetz
  1 sibling, 1 reply; 35+ messages in thread
From: David Miller @ 2016-12-18  3:14 UTC (permalink / raw)
  To: mike.kravetz
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Fri, 16 Dec 2016 10:35:27 -0800

> In current code, only context ID register 0 is set and used by the MMU.
> On sun4v platforms that support MMU shared context, there is an additional
> context ID register: specifically context register 1.  When searching
> the TLB, the MMU will find a match if the virtual address matches and
> the ID contained in context register 0 -OR- context register 1 matches.
> 
> Load the shared context ID into context ID register 1.  Care must be
> taken to load register 1 after register 0, as loading register 0
> overwrites both register 0 and 1.  Modify code loading register 0 to
> also load register one if applicable.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

You can't make these register accesses if the feature isn't being
used.

Considering the percentage of applications which will actually use
this thing, incuring the overhead of even loading the shared context
register is simply unacceptable.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-17  7:34   ` Sam Ravnborg
@ 2016-12-18 23:33     ` Mike Kravetz
  2016-12-21 18:12       ` Sam Ravnborg
  0 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-18 23:33 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On 12/16/2016 11:34 PM, Sam Ravnborg wrote:
> Hi Mike.
> 
> On Fri, Dec 16, 2016 at 10:35:25AM -0800, Mike Kravetz wrote:
>> Add new fields to the mm_context structure to support shared context.
>> Instead of a simple context ID, add a pointer to a structure with a
>> reference count.  This is needed as multiple tasks will share the
>> context ID.
> 
> What are the benefits with the shared_mmu_ctx struct?
> It does not save any space in mm_context_t, and the CPU only
> supports one extra context.
> So it looks like over-engineering with all the extra administration
> required to handle it with refcount, poitners etc.
> 
> what do I miss?

Multiple tasks will share this same context ID.  The first task to need
a new shared context will allocate the structure, increment the ref count
and point to it.  As other tasks join the sharing, they will increment
the ref count and point to the same structure.  Similarly, when tasks
no longer use the shared context ID, they will decrement the reference
count.

The reference count is important so that we will know when the last
reference to the shared context ID is dropped.  When the last reference
is dropped, then the ID can be recycled/given back to the global pool
of context IDs.

This seemed to be the most straight forward way to implement this.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-17  7:38   ` Sam Ravnborg
@ 2016-12-18 23:45     ` Mike Kravetz
  2016-12-21 18:13       ` Sam Ravnborg
  0 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-18 23:45 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On 12/16/2016 11:38 PM, Sam Ravnborg wrote:
> Hi Mike
> 
>> diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
>> index b84be67..d031799 100644
>> --- a/arch/sparc/include/asm/mmu_context_64.h
>> +++ b/arch/sparc/include/asm/mmu_context_64.h
>> @@ -35,15 +35,15 @@ void __tsb_context_switch(unsigned long pgd_pa,
>>  static inline void tsb_context_switch(struct mm_struct *mm)
>>  {
>>  	__tsb_context_switch(__pa(mm->pgd),
>> -			     &mm->context.tsb_block[0],
>> +			     &mm->context.tsb_block[MM_TSB_BASE],
>>  #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
>> -			     (mm->context.tsb_block[1].tsb ?
>> -			      &mm->context.tsb_block[1] :
>> +			     (mm->context.tsb_block[MM_TSB_HUGE].tsb ?
>> +			      &mm->context.tsb_block[MM_TSB_HUGE] :
>>  			      NULL)
>>  #else
>>  			     NULL
>>  #endif
>> -			     , __pa(&mm->context.tsb_descr[0]));
>> +			     , __pa(&mm->context.tsb_descr[MM_TSB_BASE]));
>>  }
>>  
> This is a nice cleanup that has nothing to do with your series.
> Could you submit this as a separate patch so we can get it applied.
> 
> This is the only place left where the array index for tsb_block
> and tsb_descr uses hardcoded values. And it would be good to get
> rid of these.

Sure, I will submit a separate cleanup patch for this.

However, do note that in my series if CONFIG_SHARED_MMU_CTX is defined,
then MM_TSB_HUGE_SHARED is index 0, instead of MM_TSB_BASE being 0 in
the case where CONFIG_SHARED_MMU_CTX is not defined.  This may seem
'strange' and the obvious question would be 'why not put CONFIG_SHARED_MMU_CTX
at the end of the existing array (index 2)?'.  The reason is that tsb_descr
array can not have any 'holes' when passed to the hypervisor.  Since there
will always be a MM_TSB_BASE tsb, with MM_TSB_HUGE_SHARED before and
MM_TSB_HUGE after MM_TSB_BASE, few tricks are necessary to ensure no holes
are in the array passed to the hypervisor.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-18  3:14   ` David Miller
@ 2016-12-19  0:06     ` Mike Kravetz
  2016-12-20 18:33       ` David Miller
  2016-12-21 18:17       ` Sam Ravnborg
  0 siblings, 2 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-19  0:06 UTC (permalink / raw)
  To: David Miller
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

On 12/17/2016 07:14 PM, David Miller wrote:
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Date: Fri, 16 Dec 2016 10:35:27 -0800
> 
>> In current code, only context ID register 0 is set and used by the MMU.
>> On sun4v platforms that support MMU shared context, there is an additional
>> context ID register: specifically context register 1.  When searching
>> the TLB, the MMU will find a match if the virtual address matches and
>> the ID contained in context register 0 -OR- context register 1 matches.
>>
>> Load the shared context ID into context ID register 1.  Care must be
>> taken to load register 1 after register 0, as loading register 0
>> overwrites both register 0 and 1.  Modify code loading register 0 to
>> also load register one if applicable.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> You can't make these register accesses if the feature isn't being
> used.
> 
> Considering the percentage of applications which will actually use
> this thing, incuring the overhead of even loading the shared context
> register is simply unacceptable.

Ok, let me try to find a way to eliminate these loads unless the application
is using shared context.

Part of the issue is a 'backwards compatibility' feature of the processor
which loads/overwrites register 1 every time register 0 is loaded.  Somewhere
in the evolution of the processor, a feature was added so that register 0
could be loaded without overwriting register 1.  That could be used to
eliminate the extra load in some/many cases.  But, that would likely lead
to more runtime kernel patching based on processor level.  And, I don't
really want to add more of that if possible.  Or, perhaps we only enable
the shared context ID feature on processors which have the ability to work
around the backwards compatibility feature.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-17  7:45   ` Sam Ravnborg
@ 2016-12-19  0:22     ` Mike Kravetz
  2016-12-21 18:16       ` Sam Ravnborg
  0 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-19  0:22 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On 12/16/2016 11:45 PM, Sam Ravnborg wrote:
> Hi Mike
> 
>> diff --git a/arch/sparc/kernel/fpu_traps.S b/arch/sparc/kernel/fpu_traps.S
>> index 336d275..f85a034 100644
>> --- a/arch/sparc/kernel/fpu_traps.S
>> +++ b/arch/sparc/kernel/fpu_traps.S
>> @@ -73,6 +73,16 @@ do_fpdis:
>>  	ldxa		[%g3] ASI_MMU, %g5
>>  	.previous
>>  
>> +661:	nop
>> +	nop
>> +	.section	.sun4v_2insn_patch, "ax"
>> +	.word		661b
>> +	mov		SECONDARY_CONTEXT_R1, %g3
>> +	ldxa		[%g3] ASI_MMU, %g4
>> +	.previous
>> +	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
>> +	mov		SECONDARY_CONTEXT, %g3
>> +
>>  	sethi		%hi(sparc64_kern_sec_context), %g2
> 
> You missed the second instruction to patch with here.
> This bug repeats itself further down.
> 
> Just noted while briefly reading the code - did not really follow the code.

Hi Sam,

This is my first sparc assembly code, so I could certainly have this
wrong.  The code I was trying to write has the two nop instructions,
that get patched with the mov and ldxa on sun4v.  Certainly, this is
not elegant.  And, the formatting may lead to some confusion.

Did you perhaps think the mov instruction after the comment was for
patching?  I am just trying to understand your comment.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag
  2016-12-18  3:12   ` David Miller
@ 2016-12-19  0:42     ` Mike Kravetz
  2016-12-20 18:33       ` David Miller
  0 siblings, 1 reply; 35+ messages in thread
From: Mike Kravetz @ 2016-12-19  0:42 UTC (permalink / raw)
  To: David Miller
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

On 12/17/2016 07:12 PM, David Miller wrote:
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Date: Fri, 16 Dec 2016 10:35:28 -0800
> 
>> @@ -166,6 +166,7 @@ bool kern_addr_valid(unsigned long addr);
>>  #define _PAGE_EXEC_4V	  _AC(0x0000000000000080,UL) /* Executable Page      */
>>  #define _PAGE_W_4V	  _AC(0x0000000000000040,UL) /* Writable             */
>>  #define _PAGE_SOFT_4V	  _AC(0x0000000000000030,UL) /* Software bits        */
>> +#define _PAGE_SHR_CTX_4V  _AC(0x0000000000000020,UL) /* Shared Context       */
>>  #define _PAGE_PRESENT_4V  _AC(0x0000000000000010,UL) /* Present              */
>>  #define _PAGE_RESV_4V	  _AC(0x0000000000000008,UL) /* Reserved             */
>>  #define _PAGE_SZ16GB_4V	  _AC(0x0000000000000007,UL) /* 16GB Page            */
> 
> You really don't need this.
> 
> The VMA is available, and you can obtain the information you need
> about whether this is a shared mapping or not from the. It just isn't
> being passed down into things like set_huge_pte_at().  Simply make it
> do so.
> 

I was more concerned about the page table walk code at tlb/tsb miss time.
Specifically, the code after tsb_miss_page_table_walk_sun4v_fastpath in
tsb.S.  AFAICT, the tsb entries should have been created when the pte entries
were created.  Yet, this code is still walking the page table and creating
tsb entries.  We do not have a pointer to the vma here, and I thought it
would be somewhat difficult to get access.  This is the reason why I went
down the path of a page flag.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 06/14] sparc64: general shared context tsb creation and support
  2016-12-17  7:53   ` Sam Ravnborg
@ 2016-12-19  0:52     ` Mike Kravetz
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-19  0:52 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On 12/16/2016 11:53 PM, Sam Ravnborg wrote:
> Hi Mike
> 
>> --- a/arch/sparc/mm/hugetlbpage.c
>> +++ b/arch/sparc/mm/hugetlbpage.c
>> @@ -162,8 +162,14 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
>>  {
>>  	pte_t orig;
>>  
>> -	if (!pte_present(*ptep) && pte_present(entry))
>> -		mm->context.hugetlb_pte_count++;
>> +	if (!pte_present(*ptep) && pte_present(entry)) {
>> +#if defined(CONFIG_SHARED_MMU_CTX)
>> +		if (pte_val(entry) | _PAGE_SHR_CTX_4V)
>> +			mm->context.shared_hugetlb_pte_count++;
>> +		else
>> +#endif
>> +			mm->context.hugetlb_pte_count++;
>> +	}
> 
> This kind of conditional code it just too ugly to survive...
> Could a static inline be used to help you here?
> The compiler will inline it so there should not be any run-time cost

Yes, this can be cleaned up in that way.

> 
>>  
>>  	mm_rss -= saved_thp_pte_count * (HPAGE_SIZE / PAGE_SIZE);
>>  #endif
>> @@ -544,8 +576,10 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
>>  	 * us, so we need to zero out the TSB pointer or else tsb_grow()
>>  	 * will be confused and think there is an older TSB to free up.
>>  	 */
>> -	for (i = 0; i < MM_NUM_TSBS; i++)
>> +	for (i = 0; i < MM_NUM_TSBS; i++) {
>>  		mm->context.tsb_block[i].tsb = NULL;
>> +		mm->context.tsb_descr[i].tsb_base = 0UL;
>> +	}
> This change seems un-related to the rest?

Correct.  I was experimenting with some other ways of managing the tsb_descr
array that got dropped, but forgot to remove this.

-- 
Mike Kravetz

> 
> 	Sam
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-19  0:06     ` Mike Kravetz
@ 2016-12-20 18:33       ` David Miller
  2016-12-20 20:27         ` Mike Kravetz
  2016-12-21 18:17       ` Sam Ravnborg
  1 sibling, 1 reply; 35+ messages in thread
From: David Miller @ 2016-12-20 18:33 UTC (permalink / raw)
  To: mike.kravetz
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Sun, 18 Dec 2016 16:06:01 -0800

> Ok, let me try to find a way to eliminate these loads unless the application
> is using shared context.
> 
> Part of the issue is a 'backwards compatibility' feature of the processor
> which loads/overwrites register 1 every time register 0 is loaded.  Somewhere
> in the evolution of the processor, a feature was added so that register 0
> could be loaded without overwriting register 1.  That could be used to
> eliminate the extra load in some/many cases.  But, that would likely lead
> to more runtime kernel patching based on processor level.  And, I don't
> really want to add more of that if possible.  Or, perhaps we only enable
> the shared context ID feature on processors which have the ability to work
> around the backwards compatibility feature.

Until the first process uses shared mappings, you should not touch the
context 1 register in any way for any reason at all.

And even once a process _does_ use shared mappings, you only need to
access the context 1 register in 2 cases:

1) TLB processing for the processes using shared mappings.

2) Context switch MMU state handling, where either the previous or
   next process is using shared mappings.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag
  2016-12-19  0:42     ` Mike Kravetz
@ 2016-12-20 18:33       ` David Miller
  0 siblings, 0 replies; 35+ messages in thread
From: David Miller @ 2016-12-20 18:33 UTC (permalink / raw)
  To: mike.kravetz
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Sun, 18 Dec 2016 16:42:52 -0800

> On 12/17/2016 07:12 PM, David Miller wrote:
>> From: Mike Kravetz <mike.kravetz@oracle.com>
>> Date: Fri, 16 Dec 2016 10:35:28 -0800
>> 
>>> @@ -166,6 +166,7 @@ bool kern_addr_valid(unsigned long addr);
>>>  #define _PAGE_EXEC_4V	  _AC(0x0000000000000080,UL) /* Executable Page      */
>>>  #define _PAGE_W_4V	  _AC(0x0000000000000040,UL) /* Writable             */
>>>  #define _PAGE_SOFT_4V	  _AC(0x0000000000000030,UL) /* Software bits        */
>>> +#define _PAGE_SHR_CTX_4V  _AC(0x0000000000000020,UL) /* Shared Context       */
>>>  #define _PAGE_PRESENT_4V  _AC(0x0000000000000010,UL) /* Present              */
>>>  #define _PAGE_RESV_4V	  _AC(0x0000000000000008,UL) /* Reserved             */
>>>  #define _PAGE_SZ16GB_4V	  _AC(0x0000000000000007,UL) /* 16GB Page            */
>> 
>> You really don't need this.
>> 
>> The VMA is available, and you can obtain the information you need
>> about whether this is a shared mapping or not from the. It just isn't
>> being passed down into things like set_huge_pte_at().  Simply make it
>> do so.
>> 
> 
> I was more concerned about the page table walk code at tlb/tsb miss time.
> Specifically, the code after tsb_miss_page_table_walk_sun4v_fastpath in
> tsb.S.  AFAICT, the tsb entries should have been created when the pte entries
> were created.  Yet, this code is still walking the page table and creating
> tsb entries.  We do not have a pointer to the vma here, and I thought it
> would be somewhat difficult to get access.  This is the reason why I went
> down the path of a page flag.

You are right, you will need a page flag for that part.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-20 18:33       ` David Miller
@ 2016-12-20 20:27         ` Mike Kravetz
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-12-20 20:27 UTC (permalink / raw)
  To: David Miller
  Cc: sparclinux, linux-mm, linux-kernel, bob.picco, nitin.m.gupta,
	vijay.ac.kumar, julian.calaby, adam.buchbinder, kirill.shutemov,
	mhocko, akpm

On 12/20/2016 10:33 AM, David Miller wrote:
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Date: Sun, 18 Dec 2016 16:06:01 -0800
> 
>> Ok, let me try to find a way to eliminate these loads unless the application
>> is using shared context.
>>
>> Part of the issue is a 'backwards compatibility' feature of the processor
>> which loads/overwrites register 1 every time register 0 is loaded.  Somewhere
>> in the evolution of the processor, a feature was added so that register 0
>> could be loaded without overwriting register 1.  That could be used to
>> eliminate the extra load in some/many cases.  But, that would likely lead
>> to more runtime kernel patching based on processor level.  And, I don't
>> really want to add more of that if possible.  Or, perhaps we only enable
>> the shared context ID feature on processors which have the ability to work
>> around the backwards compatibility feature.
> 
> Until the first process uses shared mappings, you should not touch the
> context 1 register in any way for any reason at all.
> 
> And even once a process _does_ use shared mappings, you only need to
> access the context 1 register in 2 cases:
> 
> 1) TLB processing for the processes using shared mappings.
> 
> 2) Context switch MMU state handling, where either the previous or
>    next process is using shared mappings.

I agree.

But, we still need to address the issue of existing code that is
overwriting context register 1 today.  Due to that backwards
compatibility feature, code like:

	mov	SECONDARY_CONTEXT, %g3
	stxa	%g2, [%g3] ASI_DMMU

will store not only to register 0, but register 1 as well.

In this RFC, I used an ugly brute force method of always restoring
register 1 after storing register 0 to make sure any unique value
in register 1 was preserved.  I agree this is not acceptable and needs
to be fixed.  We could check if register 1 is in use and only do the
save/restore in that case.  But, that is still an additional check.

The Sparc M7 processor has new ASIs to handle this better:
ASI	ASI Name	R/W	VA 	Per Strand	Description
0x21	ASI_MMU 	RW	0x28 	Y 		I/DMMUPrimary Context
							register 0 (no Primary
							Context register 1
							update)
0x21	ASI_MMU		RW	0x30	Y 		DMMUSecondary Context
							register 0 (no Secondary
							Context register 1
							update)
More details at,
http://www.oracle.com/technetwork/server-storage/sun-sparc-enterprise/documentation/sparc-architecture-supplement-3093429.pdf

Of course, this could only be used on processors where the new ASIs are
available.

Still need to think about the best way to handle this.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-18 23:33     ` Mike Kravetz
@ 2016-12-21 18:12       ` Sam Ravnborg
  0 siblings, 0 replies; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-21 18:12 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

Hi Mike.

On Sun, Dec 18, 2016 at 03:33:59PM -0800, Mike Kravetz wrote:
> On 12/16/2016 11:34 PM, Sam Ravnborg wrote:
> > Hi Mike.
> > 
> > On Fri, Dec 16, 2016 at 10:35:25AM -0800, Mike Kravetz wrote:
> >> Add new fields to the mm_context structure to support shared context.
> >> Instead of a simple context ID, add a pointer to a structure with a
> >> reference count.  This is needed as multiple tasks will share the
> >> context ID.
> > 
> > What are the benefits with the shared_mmu_ctx struct?
> > It does not save any space in mm_context_t, and the CPU only
> > supports one extra context.
> > So it looks like over-engineering with all the extra administration
> > required to handle it with refcount, poitners etc.
> > 
> > what do I miss?
> 
> Multiple tasks will share this same context ID.  The first task to need
> a new shared context will allocate the structure, increment the ref count
> and point to it.  As other tasks join the sharing, they will increment
> the ref count and point to the same structure.  Similarly, when tasks
> no longer use the shared context ID, they will decrement the reference
> count.
> 
> The reference count is important so that we will know when the last
> reference to the shared context ID is dropped.  When the last reference
> is dropped, then the ID can be recycled/given back to the global pool
> of context IDs.
> 
> This seemed to be the most straight forward way to implement this.

This nice explanation clarified it - thanks.
Could you try to include this info in the description
of the struct - so it is obvious what the intention with the
reference counter is.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support
  2016-12-18 23:45     ` Mike Kravetz
@ 2016-12-21 18:13       ` Sam Ravnborg
  0 siblings, 0 replies; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-21 18:13 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On Sun, Dec 18, 2016 at 03:45:31PM -0800, Mike Kravetz wrote:
> On 12/16/2016 11:38 PM, Sam Ravnborg wrote:
> > Hi Mike
> > 
> >> diff --git a/arch/sparc/include/asm/mmu_context_64.h b/arch/sparc/include/asm/mmu_context_64.h
> >> index b84be67..d031799 100644
> >> --- a/arch/sparc/include/asm/mmu_context_64.h
> >> +++ b/arch/sparc/include/asm/mmu_context_64.h
> >> @@ -35,15 +35,15 @@ void __tsb_context_switch(unsigned long pgd_pa,
> >>  static inline void tsb_context_switch(struct mm_struct *mm)
> >>  {
> >>  	__tsb_context_switch(__pa(mm->pgd),
> >> -			     &mm->context.tsb_block[0],
> >> +			     &mm->context.tsb_block[MM_TSB_BASE],
> >>  #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
> >> -			     (mm->context.tsb_block[1].tsb ?
> >> -			      &mm->context.tsb_block[1] :
> >> +			     (mm->context.tsb_block[MM_TSB_HUGE].tsb ?
> >> +			      &mm->context.tsb_block[MM_TSB_HUGE] :
> >>  			      NULL)
> >>  #else
> >>  			     NULL
> >>  #endif
> >> -			     , __pa(&mm->context.tsb_descr[0]));
> >> +			     , __pa(&mm->context.tsb_descr[MM_TSB_BASE]));
> >>  }
> >>  
> > This is a nice cleanup that has nothing to do with your series.
> > Could you submit this as a separate patch so we can get it applied.
> > 
> > This is the only place left where the array index for tsb_block
> > and tsb_descr uses hardcoded values. And it would be good to get
> > rid of these.
> 
> Sure, I will submit a separate cleanup patch for this.
> 
> However, do note that in my series if CONFIG_SHARED_MMU_CTX is defined,
> then MM_TSB_HUGE_SHARED is index 0, instead of MM_TSB_BASE being 0 in
> the case where CONFIG_SHARED_MMU_CTX is not defined.  This may seem
> 'strange' and the obvious question would be 'why not put CONFIG_SHARED_MMU_CTX
> at the end of the existing array (index 2)?'.  The reason is that tsb_descr
> array can not have any 'holes' when passed to the hypervisor.  Since there
> will always be a MM_TSB_BASE tsb, with MM_TSB_HUGE_SHARED before and
> MM_TSB_HUGE after MM_TSB_BASE, few tricks are necessary to ensure no holes
> are in the array passed to the hypervisor.
So this is the explanation for the strange changes of the constants.
Add a similar explanation to the code to help the next reader.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-19  0:22     ` Mike Kravetz
@ 2016-12-21 18:16       ` Sam Ravnborg
  0 siblings, 0 replies; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-21 18:16 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: sparclinux, linux-mm, linux-kernel, David S . Miller, Bob Picco,
	Nitin Gupta, Vijay Kumar, Julian Calaby, Adam Buchbinder,
	Kirill A . Shutemov, Michal Hocko, Andrew Morton

On Sun, Dec 18, 2016 at 04:22:31PM -0800, Mike Kravetz wrote:
> On 12/16/2016 11:45 PM, Sam Ravnborg wrote:
> > Hi Mike
> > 
> >> diff --git a/arch/sparc/kernel/fpu_traps.S b/arch/sparc/kernel/fpu_traps.S
> >> index 336d275..f85a034 100644
> >> --- a/arch/sparc/kernel/fpu_traps.S
> >> +++ b/arch/sparc/kernel/fpu_traps.S
> >> @@ -73,6 +73,16 @@ do_fpdis:
> >>  	ldxa		[%g3] ASI_MMU, %g5
> >>  	.previous
> >>  
> >> +661:	nop
> >> +	nop
> >> +	.section	.sun4v_2insn_patch, "ax"
> >> +	.word		661b
> >> +	mov		SECONDARY_CONTEXT_R1, %g3
> >> +	ldxa		[%g3] ASI_MMU, %g4
> >> +	.previous
> >> +	/* Unnecessary on sun4u and pre-Niagara 2 sun4v */
> >> +	mov		SECONDARY_CONTEXT, %g3
> >> +
> >>  	sethi		%hi(sparc64_kern_sec_context), %g2
> > 
> > You missed the second instruction to patch with here.
> > This bug repeats itself further down.
> > 
> > Just noted while briefly reading the code - did not really follow the code.
> 
> Hi Sam,
> 
> This is my first sparc assembly code, so I could certainly have this
> wrong.

Nope. I was to quick in my reading and in the reply.
when I looked at this with fresh eyes it looked perfectly OK.

That is to say - the patching part. I did not follow the code logic.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 04/14] sparc64: load shared id into context register 1
  2016-12-19  0:06     ` Mike Kravetz
  2016-12-20 18:33       ` David Miller
@ 2016-12-21 18:17       ` Sam Ravnborg
  1 sibling, 0 replies; 35+ messages in thread
From: Sam Ravnborg @ 2016-12-21 18:17 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: David Miller, sparclinux, linux-mm, linux-kernel, bob.picco,
	nitin.m.gupta, vijay.ac.kumar, julian.calaby, adam.buchbinder,
	kirill.shutemov, mhocko, akpm

Hi Mike.

> Or, perhaps we only enable
> the shared context ID feature on processors which have the ability to work
> around the backwards compatibility feature.

Start out like this, and then see if it is really needed with the older processors.
This should keep the code logic simpler - which is always good for this complicated stuff.

	Sam

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2016-12-21 18:17 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-16 18:35 [RFC PATCH 00/14] sparc64 shared context/TLB support Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 01/14] sparc64: placeholder for needed mmu shared context patching Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 02/14] sparc64: add new fields to mmu context for shared context support Mike Kravetz
2016-12-17  7:34   ` Sam Ravnborg
2016-12-18 23:33     ` Mike Kravetz
2016-12-21 18:12       ` Sam Ravnborg
2016-12-17  7:38   ` Sam Ravnborg
2016-12-18 23:45     ` Mike Kravetz
2016-12-21 18:13       ` Sam Ravnborg
2016-12-16 18:35 ` [RFC PATCH 03/14] sparc64: routines for basic mmu shared context structure management Mike Kravetz
2016-12-18  3:07   ` David Miller
2016-12-16 18:35 ` [RFC PATCH 04/14] sparc64: load shared id into context register 1 Mike Kravetz
2016-12-17  7:45   ` Sam Ravnborg
2016-12-19  0:22     ` Mike Kravetz
2016-12-21 18:16       ` Sam Ravnborg
2016-12-18  3:14   ` David Miller
2016-12-19  0:06     ` Mike Kravetz
2016-12-20 18:33       ` David Miller
2016-12-20 20:27         ` Mike Kravetz
2016-12-21 18:17       ` Sam Ravnborg
2016-12-16 18:35 ` [RFC PATCH 05/14] sparc64: Add PAGE_SHR_CTX flag Mike Kravetz
2016-12-18  3:12   ` David Miller
2016-12-19  0:42     ` Mike Kravetz
2016-12-20 18:33       ` David Miller
2016-12-16 18:35 ` [RFC PATCH 06/14] sparc64: general shared context tsb creation and support Mike Kravetz
2016-12-17  7:53   ` Sam Ravnborg
2016-12-19  0:52     ` Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 07/14] sparc64: move COMPUTE_TAG_TARGET and COMPUTE_TSB_PTR to header file Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 08/14] sparc64: shared context tsb handling at context switch time Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 09/14] sparc64: TLB/TSB miss handling for shared context Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 10/14] mm: add shared context to vm_area_struct Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 11/14] sparc64: add routines to look for vmsa which can share context Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 12/14] mm: add mmap and shmat arch hooks for shared context Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 13/14] sparc64 mm: add shared context support to mmap() and shmat() APIs Mike Kravetz
2016-12-16 18:35 ` [RFC PATCH 14/14] sparc64: add SHARED_MMU_CTX Kconfig option Mike Kravetz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).