* [PATCH 1/2] powerpc: Add function to copy mm_context_t to the paca
From: Michael Neuling @ 2015-10-28  4:54 UTC
  To: mpe, benh; +Cc: mikey, anton, linuxppc-dev, Cyril Bur

This adds a function to copy the mm->context to the paca.  This is
only a basic conversion for now but will be used more extensively in
the next patch.

This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's
not used elsewhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/paca.h   | 11 +++++++++++
 arch/powerpc/kernel/asm-offsets.c |  2 ++
 arch/powerpc/mm/hash_utils_64.c   |  5 +++--
 arch/powerpc/mm/slb.c             |  2 +-
 arch/powerpc/mm/slice.c           |  3 +--
 5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 70bd438..1cc6e08 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -131,7 +131,9 @@ struct paca_struct {
 	struct tlb_core_data tcd;
 #endif /* CONFIG_PPC_BOOK3E */
 
+#ifdef CONFIG_PPC_BOOK3S
 	mm_context_t context;
+#endif
 
 	/*
 	 * then miscellaneous read-write fields
@@ -194,6 +196,15 @@ struct paca_struct {
 #endif
 };
 
+#ifdef CONFIG_PPC_BOOK3S
+static inline void copy_mm_to_paca(mm_context_t *context)
+{
+	get_paca()->context = *context;
+}
+#else
+static inline void copy_mm_to_paca(mm_context_t *context){}
+#endif
+
 extern struct paca_struct *paca;
 extern void initialise_paca(struct paca_struct *new_paca, int cpu);
 extern void setup_paca(struct paca_struct *new_paca);
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 221d584..9db7be2 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -185,6 +185,7 @@ int main(void)
 	DEFINE(PACAKMSR, offsetof(struct paca_struct, kernel_msr));
 	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
 	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
+#ifdef CONFIG_PPC_BOOK3S
 	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
@@ -193,6 +194,7 @@ int main(void)
 					    context.high_slices_psize));
 	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
 #endif /* CONFIG_PPC_MM_SLICES */
+#endif
 
 #ifdef CONFIG_PPC_BOOK3E
 	DEFINE(PACAPGD, offsetof(struct paca_struct, pgd));
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index aee7017..fa62eb0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -906,7 +906,8 @@ void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
 	slice_set_range_psize(mm, addr, 1, MMU_PAGE_4K);
 	copro_flush_all_slbs(mm);
 	if ((get_paca_psize(addr) != MMU_PAGE_4K) && (current->mm == mm)) {
-		get_paca()->context = mm->context;
+
+		copy_mm_to_paca(&mm->context);
 		slb_flush_and_rebolt();
 	}
 }
@@ -973,7 +974,7 @@ static void check_paca_psize(unsigned long ea, struct mm_struct *mm,
 {
 	if (user_region) {
 		if (psize != get_paca_psize(ea)) {
-			get_paca()->context = mm->context;
+			copy_mm_to_paca(&mm->context);
 			slb_flush_and_rebolt();
 		}
 	} else if (get_paca()->vmalloc_sllp !=
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 8a32a2be..4412b8e 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -223,7 +223,7 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 		asm volatile("slbie %0" : : "r" (slbie_data));
 
 	get_paca()->slb_cache_ptr = 0;
-	get_paca()->context = mm->context;
+	copy_mm_to_paca(&mm->context);
 
 	/*
 	 * preload some userspace segments into the SLB.
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 0f432a7..42954f0 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -185,8 +185,7 @@ static void slice_flush_segments(void *parm)
 	if (mm != current->active_mm)
 		return;
 
-	/* update the paca copy of the context struct */
-	get_paca()->context = current->active_mm->context;
+	copy_mm_to_paca(&current->active_mm->context);
 
 	local_irq_save(flags);
 	slb_flush_and_rebolt();
-- 
2.5.0
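
As an illustration of the pattern in this patch, here is a minimal
user-space sketch, not kernel code.  The names demo_context,
demo_percpu, demo_paca, demo_copy_to_percpu() and DEMO_HAS_CONTEXT are
hypothetical stand-ins for mm_context_t, the paca, copy_mm_to_paca()
and CONFIG_PPC_BOOK3S.  It shows why the empty #else stub is useful:
callers can invoke the helper unconditionally.

```c
#include <stdint.h>

/* Hypothetical stand-ins for mm_context_t and the paca; not kernel types. */
struct demo_context {
	uint64_t id;
	uint64_t low_slices_psize;
};

struct demo_percpu {
	struct demo_context context;
};

static struct demo_percpu demo_paca;

/* Assumption: defining this mimics a configuration that keeps the copy. */
#define DEMO_HAS_CONTEXT 1

#ifdef DEMO_HAS_CONTEXT
static inline void demo_copy_to_percpu(const struct demo_context *ctx)
{
	/* Whole-struct assignment, the same basic conversion as patch 1/2. */
	demo_paca.context = *ctx;
}
#else
/* Empty stub so that call sites need no #ifdef of their own. */
static inline void demo_copy_to_percpu(const struct demo_context *ctx) {}
#endif
```

With a single helper in place, a later change to what gets copied only
has to touch the helper body, not every call site.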


* [PATCH 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Michael Neuling @ 2015-10-28  4:54 UTC
  To: mpe, benh; +Cc: mikey, anton, linuxppc-dev, Cyril Bur

Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of paca space and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/paca.h   | 17 +++++++++++++++--
 arch/powerpc/kernel/asm-offsets.c |  8 ++++----
 arch/powerpc/mm/hash_utils_64.c   |  4 ++--
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 1cc6e08..1c0d9f4 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -132,7 +132,13 @@ struct paca_struct {
 #endif /* CONFIG_PPC_BOOK3E */
 
 #ifdef CONFIG_PPC_BOOK3S
-	mm_context_t context;
+	mm_context_id_t context_id;
+#ifdef CONFIG_PPC_MM_SLICES
+	u64 context_low_slices_psize;
+	unsigned char context_high_slices_psize[SLICE_ARRAY_SIZE];
+#else
+	u16 context_sllp;
+#endif
 #endif
 
 	/*
@@ -199,7 +205,14 @@ struct paca_struct {
 #ifdef CONFIG_PPC_BOOK3S
 static inline void copy_mm_to_paca(mm_context_t *context)
 {
-	get_paca()->context = *context;
+	get_paca()->context_id = context->id;
+#ifdef CONFIG_PPC_MM_SLICES
+	get_paca()->context_low_slices_psize = context->low_slices_psize;
+	memcpy(&get_paca()->context_high_slices_psize,
+	       &context->high_slices_psize, SLICE_ARRAY_SIZE);
+#else
+	get_paca()->context_sllp = context->sllp;
+#endif
 }
 #else
 static inline void copy_mm_to_paca(mm_context_t *context){}
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9db7be2..d5903a9 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -186,12 +186,12 @@ int main(void)
 	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
 	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
 #ifdef CONFIG_PPC_BOOK3S
-	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
+	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context_id));
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
-					    context.low_slices_psize));
+					    context_low_slices_psize));
 	DEFINE(PACAHIGHSLICEPSIZE, offsetof(struct paca_struct,
-					    context.high_slices_psize));
+					    context_high_slices_psize));
 	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
 #endif /* CONFIG_PPC_MM_SLICES */
 #endif
@@ -224,7 +224,7 @@ int main(void)
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(MMUPSIZESLLP, offsetof(struct mmu_psize_def, sllp));
 #else
-	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context.sllp));
+	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context_sllp));
 #endif /* CONFIG_PPC_MM_SLICES */
 	DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen));
 	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index fa62eb0..0e087e4 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -877,11 +877,11 @@ static unsigned int get_paca_psize(unsigned long addr)
 	unsigned long index, mask_index;
 
 	if (addr < SLICE_LOW_TOP) {
-		lpsizes = get_paca()->context.low_slices_psize;
+		lpsizes = get_paca()->context_low_slices_psize;
 		index = GET_LOW_SLICE_INDEX(addr);
 		return (lpsizes >> (index * 4)) & 0xF;
 	}
-	hpsizes = get_paca()->context.high_slices_psize;
+	hpsizes = get_paca()->context_high_slices_psize;
 	index = GET_HIGH_SLICE_INDEX(addr);
 	mask_index = index & 0x1;
 	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
-- 
2.5.0
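
To make the space argument concrete, here is a rough user-space
sketch, not kernel code.  All names (demo_context, demo_percpu,
demo_paca, DEMO_SLICE_ARRAY_SIZE, the rarely_used padding) and all
sizes are assumptions chosen for illustration; the real mm_context_t
layout depends on the kernel configuration.  The sketch copies only
the fields the per-CPU area needs and reports how many bytes are no
longer duplicated.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_SLICE_ARRAY_SIZE 32	/* hypothetical array size */

/* Hypothetical stand-in for mm_context_t. */
struct demo_context {
	uint64_t id;
	uint64_t low_slices_psize;
	unsigned char high_slices_psize[DEMO_SLICE_ARRAY_SIZE];
	/* Stand-in for fields the per-CPU fast paths never read. */
	unsigned char rarely_used[256];
};

/* Hypothetical stand-in for the paca: only the required pieces. */
struct demo_percpu {
	uint64_t context_id;
	uint64_t context_low_slices_psize;
	unsigned char context_high_slices_psize[DEMO_SLICE_ARRAY_SIZE];
};

static struct demo_percpu demo_paca;

static inline void demo_copy_to_percpu(const struct demo_context *ctx)
{
	/* Field-by-field copy instead of a whole-struct assignment. */
	demo_paca.context_id = ctx->id;
	demo_paca.context_low_slices_psize = ctx->low_slices_psize;
	memcpy(demo_paca.context_high_slices_psize,
	       ctx->high_slices_psize, DEMO_SLICE_ARRAY_SIZE);
}

/* Bytes no longer copied (or stored) per CPU in this toy layout. */
static inline size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_context) - sizeof(struct demo_percpu);
}
```

In this toy layout the saving is exactly the size of the rarely_used
array; the figure says nothing about the real structures.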


* Re: [PATCH 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Anton Blanchard @ 2015-12-09 12:17 UTC
  To: Michael Neuling; +Cc: mpe, benh, linuxppc-dev, Cyril Bur

> Currently we copy the whole mm_context_t to the paca but only access a
> few bits of it.  This is wasteful of paca space and also takes quite
> some time in the hot path of context switching.
> 
> This patch pulls in only the required bits from the mm_context_t to
> the paca and on context switch, copies only those.
> 
> Benchmarking this (On top of Anton's recent MSR context switching
> changes [1]) using processes and yield shows an improvement of almost
> 3% on POWER8:
> 
>   http://ozlabs.org/~anton/junkcode/context_switch2.c
>   ./context_switch2 --test=yield --process 0 0

Now that the context switch series is in, I tested this against
powerpc-next and still see a ~3% improvement.

Anton

> 1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> ---
>  arch/powerpc/include/asm/paca.h   | 17 +++++++++++++++--
>  arch/powerpc/kernel/asm-offsets.c |  8 ++++----
>  arch/powerpc/mm/hash_utils_64.c   |  4 ++--
>  3 files changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index 1cc6e08..1c0d9f4 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -132,7 +132,13 @@ struct paca_struct {
>  #endif /* CONFIG_PPC_BOOK3E */
>  
>  #ifdef CONFIG_PPC_BOOK3S
> -	mm_context_t context;
> +	mm_context_id_t context_id;
> +#ifdef CONFIG_PPC_MM_SLICES
> +	u64 context_low_slices_psize;
> +	unsigned char context_high_slices_psize[SLICE_ARRAY_SIZE];
> +#else
> +	u16 context_sllp;
> +#endif
>  #endif
>  
>  	/*
> @@ -199,7 +205,14 @@ struct paca_struct {
>  #ifdef CONFIG_PPC_BOOK3S
>  static inline void copy_mm_to_paca(mm_context_t *context)
>  {
> -	get_paca()->context = *context;
> +	get_paca()->context_id = context->id;
> +#ifdef CONFIG_PPC_MM_SLICES
> +	get_paca()->context_low_slices_psize = context->low_slices_psize;
> +	memcpy(&get_paca()->context_high_slices_psize,
> +	       &context->high_slices_psize, SLICE_ARRAY_SIZE);
> +#else
> +	get_paca()->context_sllp = context->sllp;
> +#endif
>  }
>  #else
>  static inline void copy_mm_to_paca(mm_context_t *context){}
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 9db7be2..d5903a9 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -186,12 +186,12 @@ int main(void)
>  	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
>  	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
>  #ifdef CONFIG_PPC_BOOK3S
> -	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
> +	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context_id));
>  #ifdef CONFIG_PPC_MM_SLICES
>  	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
> -					    context.low_slices_psize));
> +					    context_low_slices_psize));
>  	DEFINE(PACAHIGHSLICEPSIZE, offsetof(struct paca_struct,
> -					    context.high_slices_psize));
> +					    context_high_slices_psize));
>  	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
>  #endif /* CONFIG_PPC_MM_SLICES */
>  #endif
> @@ -224,7 +224,7 @@ int main(void)
>  #ifdef CONFIG_PPC_MM_SLICES
>  	DEFINE(MMUPSIZESLLP, offsetof(struct mmu_psize_def, sllp));
>  #else
> -	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context.sllp));
> +	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context_sllp));
>  #endif /* CONFIG_PPC_MM_SLICES */
>  	DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen));
>  	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index fa62eb0..0e087e4 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -877,11 +877,11 @@ static unsigned int get_paca_psize(unsigned long addr)
>  	unsigned long index, mask_index;
>  
>  	if (addr < SLICE_LOW_TOP) {
> -		lpsizes = get_paca()->context.low_slices_psize;
> +		lpsizes = get_paca()->context_low_slices_psize;
>  		index = GET_LOW_SLICE_INDEX(addr);
>  		return (lpsizes >> (index * 4)) & 0xF;
>  	}
> -	hpsizes = get_paca()->context.high_slices_psize;
> +	hpsizes = get_paca()->context_high_slices_psize;
>  	index = GET_HIGH_SLICE_INDEX(addr);
>  	mask_index = index & 0x1;
>  	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
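
The asm-offsets.c hunks in this patch only rename the members passed
to offsetof().  As a hedged illustration of what those DEFINE() lines
compute, here is a user-space sketch with a hypothetical struct
demo_percpu and hypothetical DEMO_* constants.  In the kernel the
values are emitted into a generated header for use from assembly; the
sketch shows only the offsetof() part and assumes a typical 64-bit
layout with 8-byte uint64_t alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a slice of struct paca_struct. */
struct demo_percpu {
	uint64_t kernel_msr;
	uint64_t context_id;
	uint64_t context_low_slices_psize;
	unsigned char context_high_slices_psize[32];
};

/*
 * Byte offsets that assembly code could use as load displacements.
 * offsetof() is an integer constant expression, so it can initialise
 * enumerators at compile time.
 */
enum {
	DEMO_CONTEXTID = offsetof(struct demo_percpu, context_id),
	DEMO_LOWSLICESPSIZE = offsetof(struct demo_percpu,
				       context_low_slices_psize),
	DEMO_HIGHSLICEPSIZE = offsetof(struct demo_percpu,
				       context_high_slices_psize),
};
```

Because the constants are derived from the struct, flattening
context.id into context_id changes the offsetof() argument but leaves
the assembly users of the constant untouched.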

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Michael Neuling @ 2015-12-10  3:46 UTC
  To: mpe, benh; +Cc: linuxppc-dev, anton, Cyril Bur

Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of paca space and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
v2:
  Added missing include which broke allmodconfig (noticed by anton)

 include/asm/paca.h   |   18 ++++++++++++++++--
 kernel/asm-offsets.c |    8 ++++----
 mm/hash_utils_64.c   |    4 ++--
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h
b/arch/powerpc/include/asm/paca.h
index 1cc6e08..06cdaee 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -16,6 +16,7 @@
 
 #ifdef CONFIG_PPC64
 
+#include <linux/string.h>
 #include <asm/types.h>
 #include <asm/lppaca.h>
 #include <asm/mmu.h>
@@ -132,7 +133,13 @@ struct paca_struct {
 #endif /* CONFIG_PPC_BOOK3E */
 
 #ifdef CONFIG_PPC_BOOK3S
-	mm_context_t context;
+	mm_context_id_t context_id;
+#ifdef CONFIG_PPC_MM_SLICES
+	u64 context_low_slices_psize;
+	unsigned char context_high_slices_psize[SLICE_ARRAY_SIZE];
+#else
+	u16 context_sllp;
+#endif
 #endif
 
 	/*
@@ -199,7 +206,14 @@ struct paca_struct {
 #ifdef CONFIG_PPC_BOOK3S
 static inline void copy_mm_to_paca(mm_context_t *context)
 {
-	get_paca()->context = *context;
+	get_paca()->context_id = context->id;
+#ifdef CONFIG_PPC_MM_SLICES
+	get_paca()->context_low_slices_psize = context
->low_slices_psize;
+	memcpy(&get_paca()->context_high_slices_psize,
+	       &context->high_slices_psize, SLICE_ARRAY_SIZE);
+#else
+	get_paca()->context_sllp = context->sllp;
+#endif
 }
 #else
 static inline void copy_mm_to_paca(mm_context_t *context){}
diff --git a/arch/powerpc/kernel/asm-offsets.c
b/arch/powerpc/kernel/asm-offsets.c
index 9db7be2..d5903a9 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -186,12 +186,12 @@ int main(void)
 	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct,
soft_enabled));
 	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct,
irq_happened));
 #ifdef CONFIG_PPC_BOOK3S
-	DEFINE(PACACONTEXTID, offsetof(struct paca_struct,
context.id));
+	DEFINE(PACACONTEXTID, offsetof(struct paca_struct,
context_id));
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
-					   
 context.low_slices_psize));
+					   
 context_low_slices_psize));
 	DEFINE(PACAHIGHSLICEPSIZE, offsetof(struct paca_struct,
-					   
 context.high_slices_psize));
+					   
 context_high_slices_psize));
 	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
 #endif /* CONFIG_PPC_MM_SLICES */
 #endif
@@ -224,7 +224,7 @@ int main(void)
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(MMUPSIZESLLP, offsetof(struct mmu_psize_def, sllp));
 #else
-	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct,
context.sllp));
+	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct,
context_sllp));
 #endif /* CONFIG_PPC_MM_SLICES */
 	DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen));
 	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
diff --git a/arch/powerpc/mm/hash_utils_64.c
b/arch/powerpc/mm/hash_utils_64.c
index 0aa526e..ae98d58 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -877,11 +877,11 @@ static unsigned int get_paca_psize(unsigned long
addr)
 	unsigned long index, mask_index;
 
 	if (addr < SLICE_LOW_TOP) {
-		lpsizes = get_paca()->context.low_slices_psize;
+		lpsizes = get_paca()->context_low_slices_psize;
 		index = GET_LOW_SLICE_INDEX(addr);
 		return (lpsizes >> (index * 4)) & 0xF;
 	}
-	hpsizes = get_paca()->context.high_slices_psize;
+	hpsizes = get_paca()->context_high_slices_psize;
 	index = GET_HIGH_SLICE_INDEX(addr);
 	mask_index = index & 0x1;
 	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;


* Re: [PATCH v2 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Michael Ellerman @ 2015-12-10 10:00 UTC
  To: Michael Neuling, benh; +Cc: linuxppc-dev, anton, Cyril Bur

On Thu, 2015-12-10 at 14:46 +1100, Michael Neuling wrote:

> Currently we copy the whole mm_context_t to the paca but only access a
> few bits of it.  This is wasteful of paca space and also takes quite
> some time in the hot path of context switching.
> 
> diff --git a/arch/powerpc/include/asm/paca.h
> b/arch/powerpc/include/asm/paca.h
> index 1cc6e08..06cdaee 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -199,7 +206,14 @@ struct paca_struct {
>  #ifdef CONFIG_PPC_BOOK3S
>  static inline void copy_mm_to_paca(mm_context_t *context)
>  {
> -	get_paca()->context = *context;
> +	get_paca()->context_id = context->id;
> +#ifdef CONFIG_PPC_MM_SLICES
> +	get_paca()->context_low_slices_psize = context
> ->low_slices_psize;

Patch is wrapped ^

And so on.

cheers


* [PATCH v3 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Michael Neuling @ 2015-12-10 22:34 UTC
  To: Michael Ellerman, benh; +Cc: linuxppc-dev, anton, Cyril Bur

Currently we copy the whole mm_context_t to the paca but only access a
few bits of it.  This is wasteful of paca space and also takes quite
some time in the hot path of context switching.

This patch pulls in only the required bits from the mm_context_t to
the paca and on context switch, copies only those.

Benchmarking this (On top of Anton's recent MSR context switching
changes [1]) using processes and yield shows an improvement of almost
3% on POWER8:

  http://ozlabs.org/~anton/junkcode/context_switch2.c
  ./context_switch2 --test=yield --process 0 0

1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html

Signed-off-by: Michael Neuling <mikey@neuling.org>
--
v3:
  Fix line wrapping (WTF sorry!?!)
v2:
  Added missing include which broke allmodconfig (noticed by anton)


diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 1cc6e08..06cdaee 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -16,6 +16,7 @@
 
 #ifdef CONFIG_PPC64
 
+#include <linux/string.h>
 #include <asm/types.h>
 #include <asm/lppaca.h>
 #include <asm/mmu.h>
@@ -132,7 +133,13 @@ struct paca_struct {
 #endif /* CONFIG_PPC_BOOK3E */
 
 #ifdef CONFIG_PPC_BOOK3S
-	mm_context_t context;
+	mm_context_id_t context_id;
+#ifdef CONFIG_PPC_MM_SLICES
+	u64 context_low_slices_psize;
+	unsigned char context_high_slices_psize[SLICE_ARRAY_SIZE];
+#else
+	u16 context_sllp;
+#endif
 #endif
 
 	/*
@@ -199,7 +206,14 @@ struct paca_struct {
 #ifdef CONFIG_PPC_BOOK3S
 static inline void copy_mm_to_paca(mm_context_t *context)
 {
-	get_paca()->context = *context;
+	get_paca()->context_id = context->id;
+#ifdef CONFIG_PPC_MM_SLICES
+	get_paca()->context_low_slices_psize = context->low_slices_psize;
+	memcpy(&get_paca()->context_high_slices_psize,
+	       &context->high_slices_psize, SLICE_ARRAY_SIZE);
+#else
+	get_paca()->context_sllp = context->sllp;
+#endif
 }
 #else
 static inline void copy_mm_to_paca(mm_context_t *context){}
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9db7be2..d5903a9 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -186,12 +186,12 @@ int main(void)
 	DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
 	DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
 #ifdef CONFIG_PPC_BOOK3S
-	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
+	DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context_id));
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct,
-					    context.low_slices_psize));
+					    context_low_slices_psize));
 	DEFINE(PACAHIGHSLICEPSIZE, offsetof(struct paca_struct,
-					    context.high_slices_psize));
+					    context_high_slices_psize));
 	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
 #endif /* CONFIG_PPC_MM_SLICES */
 #endif
@@ -224,7 +224,7 @@ int main(void)
 #ifdef CONFIG_PPC_MM_SLICES
 	DEFINE(MMUPSIZESLLP, offsetof(struct mmu_psize_def, sllp));
 #else
-	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context.sllp));
+	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, context_sllp));
 #endif /* CONFIG_PPC_MM_SLICES */
 	DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen));
 	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0aa526e..ae98d58 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -877,11 +877,11 @@ static unsigned int get_paca_psize(unsigned long addr)
 	unsigned long index, mask_index;
 
 	if (addr < SLICE_LOW_TOP) {
-		lpsizes = get_paca()->context.low_slices_psize;
+		lpsizes = get_paca()->context_low_slices_psize;
 		index = GET_LOW_SLICE_INDEX(addr);
 		return (lpsizes >> (index * 4)) & 0xF;
 	}
-	hpsizes = get_paca()->context.high_slices_psize;
+	hpsizes = get_paca()->context_high_slices_psize;
 	index = GET_HIGH_SLICE_INDEX(addr);
 	mask_index = index & 0x1;
 	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
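
The get_paca_psize() hunk above reads a 4-bit page-size index per
slice: low slices are packed into one 64-bit word, high slices two per
byte in an array.  The following user-space sketch mirrors that lookup
so the shifts and masks can be tried in isolation.  It is not kernel
code: the DEMO_* constants (256MB low slices, 1TB high slices, a 4GB
boundary), demo_get_psize() and demo_set_high_psize() are assumptions
made for illustration.

```c
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define DEMO_SLICE_LOW_SHIFT	28		/* 256MB low slices */
#define DEMO_SLICE_HIGH_SHIFT	40		/* 1TB high slices */
#define DEMO_SLICE_LOW_TOP	0x100000000ULL	/* 4GB boundary */

/* Look up the 4-bit page-size index for addr, as in the hunk above. */
static unsigned int demo_get_psize(uint64_t low, const unsigned char *high,
				   uint64_t addr)
{
	uint64_t index;

	if (addr < DEMO_SLICE_LOW_TOP) {
		/* One nibble per low slice, packed into a 64-bit word. */
		index = addr >> DEMO_SLICE_LOW_SHIFT;
		return (low >> (index * 4)) & 0xF;
	}
	/* Two high slices per byte: even index low nibble, odd index high. */
	index = addr >> DEMO_SLICE_HIGH_SHIFT;
	return (high[index >> 1] >> ((index & 0x1) * 4)) & 0xF;
}

/* Hypothetical helper to fill the array when experimenting. */
static void demo_set_high_psize(unsigned char *high, uint64_t index,
				unsigned int psize)
{
	unsigned int shift = (index & 0x1) * 4;
	unsigned int old = high[index >> 1];

	high[index >> 1] = (unsigned char)((old & ~(0xFu << shift)) |
					   ((psize & 0xFu) << shift));
}
```

For example, storing 4 for high slice 2 and 9 for high slice 3 leaves
0x94 in byte 1 of the array, and looking up an address in either slice
returns the corresponding nibble.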


* Re: [1/2] powerpc: Add function to copy mm_context_t to the paca
From: Michael Ellerman @ 2016-01-11  9:14 UTC
  To: Michael Neuling, benh; +Cc: linuxppc-dev, mikey, anton, Cyril Bur

On Wed, 2015-10-28 at 04:54:06 UTC, Michael Neuling wrote:
> This adds a function to copy the mm->context to the paca.  This is
> only a basic conversion for now but will be used more extensively in
> the next patch.
> 
> This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's
> not used elsewhere.
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c395465da68bfc3a238d5bc15f

cheers


* Re: [v3, 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca
From: Michael Ellerman @ 2016-01-11  9:14 UTC
  To: Michael Neuling, benh; +Cc: linuxppc-dev, anton, Cyril Bur

On Thu, 2015-12-10 at 22:34:42 UTC, Michael Neuling wrote:
> Currently we copy the whole mm_context_t to the paca but only access a
> few bits of it.  This is wasteful of paca space and also takes quite
> some time in the hot path of context switching.
> 
> This patch pulls in only the required bits from the mm_context_t to
> the paca and on context switch, copies only those.
> 
> Benchmarking this (On top of Anton's recent MSR context switching
> changes [1]) using processes and yield shows an improvement of almost
> 3% on POWER8:
> 
>   http://ozlabs.org/~anton/junkcode/context_switch2.c
>   ./context_switch2 --test=yield --process 0 0
> 
> 1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2fc251a8dda56b71ec491bee4c

cheers
