All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] powerpc: add BookS wait opcode macro
@ 2022-07-11  3:11 Nicholas Piggin
  2022-07-11  3:11 ` [PATCH 2/2] powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops Nicholas Piggin
  2022-07-12 16:29 ` [PATCH 1/2] powerpc: add BookS wait opcode macro Segher Boessenkool
  0 siblings, 2 replies; 5+ messages in thread
From: Nicholas Piggin @ 2022-07-11  3:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The wait instruction has a different encoding between BookE and BookS.
Add the BookS variant.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/ppc-opcode.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 89beabf5325c..46fca27e8101 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -331,6 +331,7 @@
 #define __PPC_XSP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
 #define __PPC_XTP(s)	__PPC_XSP(s)
 #define __PPC_T_TLB(t)	(((t) & 0x3) << 21)
+#define __PPC_PL(p)	(((p) & 0x3) << 16)
 #define __PPC_WC(w)	(((w) & 0x3) << 21)
 #define __PPC_WS(w)	(((w) & 0x1f) << 11)
 #define __PPC_SH(s)	__PPC_WS(s)
@@ -399,6 +400,7 @@
 #define PPC_RAW_RFMCI			(0x4c00004c)
 #define PPC_RAW_TLBILX(t, a, b)		(0x7c000024 | __PPC_T_TLB(t) | 	__PPC_RA0(a) | __PPC_RB(b))
 #define PPC_RAW_WAIT(w)			(0x7c00007c | __PPC_WC(w))
+#define PPC_RAW_WAIT_BOOKS(w, p)	(0x7c00003c | __PPC_WC(w) | __PPC_PL(p))
 #define PPC_RAW_TLBIE(lp, a)		(0x7c000264 | ___PPC_RB(a) | ___PPC_RS(lp))
 #define PPC_RAW_TLBIE_5(rb, rs, ric, prs, r) \
 	(0x7c000264 | ___PPC_RB(rb) | ___PPC_RS(rs) | ___PPC_RIC(ric) | ___PPC_PRS(prs) | ___PPC_R(r))
@@ -614,6 +616,7 @@
 #define PPC_TLBILX_PID(a, b)	PPC_TLBILX(1, a, b)
 #define PPC_TLBILX_VA(a, b)	PPC_TLBILX(3, a, b)
 #define PPC_WAIT(w)		stringify_in_c(.long PPC_RAW_WAIT(w))
+#define PPC_WAIT_BOOKS(w, p)	stringify_in_c(.long PPC_RAW_WAIT_BOOKS(w, p))
 #define PPC_TLBIE(lp, a) 	stringify_in_c(.long PPC_RAW_TLBIE(lp, a))
 #define	PPC_TLBIE_5(rb, rs, ric, prs, r) \
 				stringify_in_c(.long PPC_RAW_TLBIE_5(rb, rs, ric, prs, r))
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops
  2022-07-11  3:11 [PATCH 1/2] powerpc: add BookS wait opcode macro Nicholas Piggin
@ 2022-07-11  3:11 ` Nicholas Piggin
  2022-07-12 16:29 ` [PATCH 1/2] powerpc: add BookS wait opcode macro Segher Boessenkool
  1 sibling, 0 replies; 5+ messages in thread
From: Nicholas Piggin @ 2022-07-11  3:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

We want to move away from using SMT prioroty updates for cpu_relax, and
use a 'wait' instruction which is similar to x86. As well as being a
much better fit for what everybody else uses and tests with, priority
nops are stateful which is nasty (interrupts have to consider they might
be taken at a different priority), and they're expensive to execute,
similar to a mtSPR which can effect other threads in the pipe.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Unfortunately qemu TCG does not emulate pause_short properly and will
cause hangs. I have a patch for it but not merged yet. But if we tune
qspinlock code it would be best to do it with this patch.
---
 arch/powerpc/include/asm/processor.h      | 30 +++++++++++++++++++----
 arch/powerpc/include/asm/vdso/processor.h | 10 +++++++-
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fdfaae194ddd..61f16515cbe0 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -355,11 +355,31 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 
 #ifdef CONFIG_PPC64
 
-#define spin_begin()	HMT_low()
-
-#define spin_cpu_relax()	barrier()
-
-#define spin_end()	HMT_medium()
+#define spin_begin()							\
+do {									\
+	asm volatile(ASM_FTR_IFCLR(					\
+		"or 1,1,1", /* HMT_LOW */				\
+		"nop",/* POWER10 onward uses pause_short (wait 2,0) */	\
+				%0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \
+} while (0)
+
+#define spin_cpu_relax()						\
+do {									\
+	asm volatile(ASM_FTR_IFCLR(					\
+		/* Pre-POWER10 uses low ; medium priority nops */	\
+		"nop",							\
+		/* POWER10 onward uses pause_short (wait 2,0) */	\
+		PPC_WAIT_BOOKS(2, 0),					\
+				%0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \
+} while (0)
+
+#define spin_end()							\
+do {									\
+	asm volatile(ASM_FTR_IFCLR(					\
+		"or 2,2,2", /* HMT_MEDIUM */				\
+		"nop",/* POWER10 onward uses pause_short (wait 2,0) */	\
+				%0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \
+} while (0)
 
 #endif
 
diff --git a/arch/powerpc/include/asm/vdso/processor.h b/arch/powerpc/include/asm/vdso/processor.h
index 8d79f994b4aa..1116230ebb08 100644
--- a/arch/powerpc/include/asm/vdso/processor.h
+++ b/arch/powerpc/include/asm/vdso/processor.h
@@ -22,7 +22,15 @@
 #endif
 
 #ifdef CONFIG_PPC64
-#define cpu_relax()	do { HMT_low(); HMT_medium(); barrier(); } while (0)
+#define cpu_relax()							\
+do {									\
+	asm volatile(ASM_FTR_IFCLR(					\
+		/* Pre-POWER10 uses low ; medium priority nops */	\
+		"or 1,1,1 ; or 2,2,2",					\
+		/* POWER10 onward uses pause_short (wait 2,0) */	\
+		PPC_WAIT_BOOKS(2, 0),					\
+				%0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \
+} while (0)
 #else
 #define cpu_relax()	barrier()
 #endif
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: add BookS wait opcode macro
  2022-07-11  3:11 [PATCH 1/2] powerpc: add BookS wait opcode macro Nicholas Piggin
  2022-07-11  3:11 ` [PATCH 2/2] powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops Nicholas Piggin
@ 2022-07-12 16:29 ` Segher Boessenkool
  2022-07-20  9:54   ` Nicholas Piggin
  1 sibling, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2022-07-12 16:29 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev

Hi!

On Mon, Jul 11, 2022 at 01:11:27PM +1000, Nicholas Piggin wrote:
> The wait instruction has a different encoding between BookE and BookS.
> Add the BookS variant.

>  #define PPC_RAW_WAIT(w)			(0x7c00007c | __PPC_WC(w))
> +#define PPC_RAW_WAIT_BOOKS(w, p)	(0x7c00003c | __PPC_WC(w) | __PPC_PL(p))

The embedded extensions are no longer part of the PowerPC architecture,
so wouldn't it be a better way forward to rename the existing one,
instead?  A bit more work now, but less in the future :-)


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: add BookS wait opcode macro
  2022-07-12 16:29 ` [PATCH 1/2] powerpc: add BookS wait opcode macro Segher Boessenkool
@ 2022-07-20  9:54   ` Nicholas Piggin
  2022-07-20 21:19     ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: Nicholas Piggin @ 2022-07-20  9:54 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linuxppc-dev

Excerpts from Segher Boessenkool's message of July 13, 2022 2:29 am:
> Hi!
> 
> On Mon, Jul 11, 2022 at 01:11:27PM +1000, Nicholas Piggin wrote:
>> The wait instruction has a different encoding between BookE and BookS.
>> Add the BookS variant.
> 
>>  #define PPC_RAW_WAIT(w)			(0x7c00007c | __PPC_WC(w))
>> +#define PPC_RAW_WAIT_BOOKS(w, p)	(0x7c00003c | __PPC_WC(w) | __PPC_PL(p))
> 
> The embedded extensions are no longer part of the PowerPC architecture,
> so wouldn't it be a better way forward to rename the existing one,
> instead?  A bit more work now, but less in the future :-)

And I actually misremembered this too, was off digging and asking
about it, but the change isn't strictly BookE vs BookS, but rather
the wait opcode was changed in ISA v3.0, which is a bit of an
unfortunate landmine.

It seems apparently POWER8 implemented a non-architected instruction
'waitasec' that uses this opcode, then I suppose it was decided to
continue with that opcode in v3.0 when BookE was dropped, for reasons.
Maybe it was more widely used?

In any case, I will rename it. Precedent is divided. We have
PPC_RAW_TLBIEL_v205 for older tlbiel, and PPC_ISA_3_0_INVALIDATE_ERAT
for a new ERAT invalidation instruction. I guess making the older
instruction the exceptional case ends up being better in the long
term.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: add BookS wait opcode macro
  2022-07-20  9:54   ` Nicholas Piggin
@ 2022-07-20 21:19     ` Segher Boessenkool
  0 siblings, 0 replies; 5+ messages in thread
From: Segher Boessenkool @ 2022-07-20 21:19 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev

On Wed, Jul 20, 2022 at 07:54:54PM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of July 13, 2022 2:29 am:
> > The embedded extensions are no longer part of the PowerPC architecture,
> > so wouldn't it be a better way forward to rename the existing one,
> > instead?  A bit more work now, but less in the future :-)
> 
> And I actually misremembered this too, was off digging and asking
> about it, but the change isn't strictly BookE vs BookS, but rather
> the wait opcode was changed in ISA v3.0, which is a bit of an
> unfortunate landmine.

The current wait opcode is *new* in 3.0, but it has the same mnemonic
as the old one, more formally.

> It seems apparently POWER8 implemented a non-architected instruction
> 'waitasec' that uses this opcode, then I suppose it was decided to
> continue with that opcode in v3.0 when BookE was dropped, for reasons.

The (newer) wait instruction is 0/30 while the old one is 1/30
(secondary opcodes 30 resp. 62), although 0/30 stayed open; maybe there
is more history there already?  Curious.

> In any case, I will rename it. Precedent is divided. We have
> PPC_RAW_TLBIEL_v205 for older tlbiel, and PPC_ISA_3_0_INVALIDATE_ERAT
> for a new ERAT invalidation instruction. I guess making the older
> instruction the exceptional case ends up being better in the long
> term.

Yeah exactly, it's better if the simpler name is the one that newer
code should use: more readable, more writable, and importantly less
room for mistakes :-)

Thanks,


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-07-20 21:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-11  3:11 [PATCH 1/2] powerpc: add BookS wait opcode macro Nicholas Piggin
2022-07-11  3:11 ` [PATCH 2/2] powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops Nicholas Piggin
2022-07-12 16:29 ` [PATCH 1/2] powerpc: add BookS wait opcode macro Segher Boessenkool
2022-07-20  9:54   ` Nicholas Piggin
2022-07-20 21:19     ` Segher Boessenkool

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.