All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11  7:41 ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11  7:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
  Cc: linux-kernel, linuxppc-dev

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.

By using a generic implementation, gcc will generate a branch
to the unconditional trap generated by BUG().

As modern powerpc implement branch folding, that's even more efficient.

See below the difference at the entry of system_call_exception.

With the patch:

	00000000 <system_call_exception>:
	   0:	81 6a 00 84 	lwz     r11,132(r10)
	   4:	90 6a 00 88 	stw     r3,136(r10)
	   8:	71 60 00 02 	andi.   r0,r11,2
	   c:	41 82 00 70 	beq     7c <system_call_exception+0x7c>
	  10:	71 60 40 00 	andi.   r0,r11,16384
	  14:	41 82 00 6c 	beq     80 <system_call_exception+0x80>
	  18:	71 6b 80 00 	andi.   r11,r11,32768
	  1c:	41 82 00 68 	beq     84 <system_call_exception+0x84>
	  20:	94 21 ff e0 	stwu    r1,-32(r1)
	  24:	93 e1 00 1c 	stw     r31,28(r1)
	  28:	7d 8c 42 e6 	mftb    r12
	...
	  7c:	0f e0 00 00 	twui    r0,0
	  80:	0f e0 00 00 	twui    r0,0
	  84:	0f e0 00 00 	twui    r0,0

Without the patch:

	00000000 <system_call_exception>:
	   0:	94 21 ff e0 	stwu    r1,-32(r1)
	   4:	93 e1 00 1c 	stw     r31,28(r1)
	   8:	90 6a 00 88 	stw     r3,136(r10)
	   c:	81 6a 00 84 	lwz     r11,132(r10)
	  10:	69 60 00 02 	xori    r0,r11,2
	  14:	54 00 ff fe 	rlwinm  r0,r0,31,31,31
	  18:	0f 00 00 00 	twnei   r0,0
	  1c:	69 60 40 00 	xori    r0,r11,16384
	  20:	54 00 97 fe 	rlwinm  r0,r0,18,31,31
	  24:	0f 00 00 00 	twnei   r0,0
	  28:	69 6b 80 00 	xori    r11,r11,32768
	  2c:	55 6b 8f fe 	rlwinm  r11,r11,17,31,31
	  30:	0f 0b 00 00 	twnei   r11,0
	  34:	7d 8c 42 e6 	mftb    r12

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/bug.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..21103d3e1f29 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -69,15 +69,6 @@
 	unreachable();						\
 } while (0)
 
-#define BUG_ON(x) do {						\
-	if (__builtin_constant_p(x)) {				\
-		if (x)						\
-			BUG();					\
-	} else {						\
-		BUG_ENTRY(PPC_TLNEI " %4, 0", 0, "r" ((__force long)(x)));	\
-	}							\
-} while (0)
-
 #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
 
 #define WARN_ON(x) ({						\
@@ -94,7 +85,6 @@
 })
 
 #define HAVE_ARCH_BUG
-#define HAVE_ARCH_BUG_ON
 #define HAVE_ARCH_WARN_ON
 #endif /* __ASSEMBLY __ */
 #else
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11  7:41 ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11  7:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
  Cc: linuxppc-dev, linux-kernel

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.

By using a generic implementation, gcc will generate a branch
to the unconditional trap generated by BUG().

As modern powerpc implement branch folding, that's even more efficient.

See below the difference at the entry of system_call_exception.

With the patch:

	00000000 <system_call_exception>:
	   0:	81 6a 00 84 	lwz     r11,132(r10)
	   4:	90 6a 00 88 	stw     r3,136(r10)
	   8:	71 60 00 02 	andi.   r0,r11,2
	   c:	41 82 00 70 	beq     7c <system_call_exception+0x7c>
	  10:	71 60 40 00 	andi.   r0,r11,16384
	  14:	41 82 00 6c 	beq     80 <system_call_exception+0x80>
	  18:	71 6b 80 00 	andi.   r11,r11,32768
	  1c:	41 82 00 68 	beq     84 <system_call_exception+0x84>
	  20:	94 21 ff e0 	stwu    r1,-32(r1)
	  24:	93 e1 00 1c 	stw     r31,28(r1)
	  28:	7d 8c 42 e6 	mftb    r12
	...
	  7c:	0f e0 00 00 	twui    r0,0
	  80:	0f e0 00 00 	twui    r0,0
	  84:	0f e0 00 00 	twui    r0,0

Without the patch:

	00000000 <system_call_exception>:
	   0:	94 21 ff e0 	stwu    r1,-32(r1)
	   4:	93 e1 00 1c 	stw     r31,28(r1)
	   8:	90 6a 00 88 	stw     r3,136(r10)
	   c:	81 6a 00 84 	lwz     r11,132(r10)
	  10:	69 60 00 02 	xori    r0,r11,2
	  14:	54 00 ff fe 	rlwinm  r0,r0,31,31,31
	  18:	0f 00 00 00 	twnei   r0,0
	  1c:	69 60 40 00 	xori    r0,r11,16384
	  20:	54 00 97 fe 	rlwinm  r0,r0,18,31,31
	  24:	0f 00 00 00 	twnei   r0,0
	  28:	69 6b 80 00 	xori    r11,r11,32768
	  2c:	55 6b 8f fe 	rlwinm  r11,r11,17,31,31
	  30:	0f 0b 00 00 	twnei   r11,0
	  34:	7d 8c 42 e6 	mftb    r12

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/bug.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..21103d3e1f29 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -69,15 +69,6 @@
 	unreachable();						\
 } while (0)
 
-#define BUG_ON(x) do {						\
-	if (__builtin_constant_p(x)) {				\
-		if (x)						\
-			BUG();					\
-	} else {						\
-		BUG_ENTRY(PPC_TLNEI " %4, 0", 0, "r" ((__force long)(x)));	\
-	}							\
-} while (0)
-
 #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
 
 #define WARN_ON(x) ({						\
@@ -94,7 +85,6 @@
 })
 
 #define HAVE_ARCH_BUG
-#define HAVE_ARCH_BUG_ON
 #define HAVE_ARCH_WARN_ON
 #endif /* __ASSEMBLY __ */
 #else
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11  7:41 ` Christophe Leroy
@ 2021-02-11 10:04   ` Nicholas Piggin
  -1 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2021-02-11 10:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras
  Cc: linux-kernel, linuxppc-dev

Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.
> 
> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

We don't want to do this on 64s because that will lose the useful CFAR
contents.

Unfortunately the code generation is not great and the registers that 
give some useful information about the condition are often mangled :(

It would be nice if we could have a __builtin_trap_if that gcc would use 
conditional traps with, (and which never assumes following code is 
unreachable even for constant true, so we can use it with WARN and put 
explicit unreachable for BUG).

> 
> As modern powerpc implement branch folding, that's even more efficient.

I think POWER will speculate conditional traps as non faulting always
so it should be just as good if not better than the branch.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 10:04   ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2021-02-11 10:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras
  Cc: linuxppc-dev, linux-kernel

Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.
> 
> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

We don't want to do this on 64s because that will lose the useful CFAR
contents.

Unfortunately the code generation is not great and the registers that 
give some useful information about the condition are often mangled :(

It would be nice if we could have a __builtin_trap_if that gcc would use 
conditional traps with, (and which never assumes following code is 
unreachable even for constant true, so we can use it with WARN and put 
explicit unreachable for BUG).

> 
> As modern powerpc implement branch folding, that's even more efficient.

I think POWER will speculate conditional traps as non faulting always
so it should be just as good if not better than the branch.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11  7:41 ` Christophe Leroy
@ 2021-02-11 11:49   ` Segher Boessenkool
  -1 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 11:49 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel

On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.

Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?

> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

That is many more instructions than ideal.

> As modern powerpc implement branch folding, that's even more efficient.

What PowerPC cpus implement branch folding?  I know none.

Some example code generated via __builtin_trap:

void trap(void) { __builtin_trap(); }
void trap_if_0(int x) { if (x == 0) __builtin_trap(); }
void trap_if_not_0(int x) { if (x != 0) __builtin_trap(); }

-m64:

trap:
	trap
trap_if_0:
	tdeqi 3,0
	blr
trap_if_not_0:
	tdnei 3,0
	blr

-m32:

trap:
	trap
trap_if_0:
	tweqi 3,0
	blr
trap_if_not_0:
	twnei 3,0
	blr


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 11:49   ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 11:49 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev

On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.

Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?

> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

That is many more instructions than ideal.

> As modern powerpc implement branch folding, that's even more efficient.

What PowerPC cpus implement branch folding?  I know none.

Some example code generated via __builtin_trap:

void trap(void) { __builtin_trap(); }
void trap_if_0(int x) { if (x == 0) __builtin_trap(); }
void trap_if_not_0(int x) { if (x != 0) __builtin_trap(); }

-m64:

trap:
	trap
trap_if_0:
	tdeqi 3,0
	blr
trap_if_not_0:
	tdnei 3,0
	blr

-m32:

trap:
	trap
trap_if_0:
	tweqi 3,0
	blr
trap_if_not_0:
	twnei 3,0
	blr


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 10:04   ` Nicholas Piggin
@ 2021-02-11 11:50     ` Segher Boessenkool
  -1 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 11:50 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras, linuxppc-dev, linux-kernel

On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> It would be nice if we could have a __builtin_trap_if that gcc would use 
> conditional traps with, (and which never assumes following code is 
> unreachable even for constant true, so we can use it with WARN and put 
> explicit unreachable for BUG).

It automatically does that with just __builtin_trap, see my other mail :-)


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 11:50     ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 11:50 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, Paul Mackerras, linuxppc-dev

On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> It would be nice if we could have a __builtin_trap_if that gcc would use 
> conditional traps with, (and which never assumes following code is 
> unreachable even for constant true, so we can use it with WARN and put 
> explicit unreachable for BUG).

It automatically does that with just __builtin_trap, see my other mail :-)


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 10:04   ` Nicholas Piggin
@ 2021-02-11 12:22     ` Segher Boessenkool
  -1 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 12:22 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras, linuxppc-dev, linux-kernel

On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> > As modern powerpc implement branch folding, that's even more efficient.

Ah, it seems you mean what Arm calls branch folding.  Sure, power4
already did that, and this has not changed.

> I think POWER will speculate conditional traps as non faulting always
> so it should be just as good if not better than the branch.

Right, these are not branch instructions, so are not branch predicted;
all trap instructions are assumed to fall through, like all other
non-branch instructions.


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 12:22     ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 12:22 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linux-kernel, Paul Mackerras, linuxppc-dev

On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> > As modern powerpc implement branch folding, that's even more efficient.

Ah, it seems you mean what Arm calls branch folding.  Sure, power4
already did that, and this has not changed.

> I think POWER will speculate conditional traps as non faulting always
> so it should be just as good if not better than the branch.

Right, these are not branch instructions, so are not branch predicted;
all trap instructions are assumed to fall through, like all other
non-branch instructions.


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 11:49   ` Segher Boessenkool
@ 2021-02-11 12:26     ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 12:26 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel



Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>> which obliges gcc to format the condition into a 0 or 1 value
>> in a register.
> 
> Huh?  Why is that?
> 
> Will it work better if this used __builtin_trap?  Or does the kernel only
> detect very specific forms of trap instructions?
> 
>> By using a generic implementation, gcc will generate a branch
>> to the unconditional trap generated by BUG().
> 
> That is many more instructions than ideal.
> 
>> As modern powerpc implement branch folding, that's even more efficient.
> 
> What PowerPC cpus implement branch folding?  I know none.

Extract from powerpc mpc8323 reference manual:

High instruction and data throughput
— Zero-cycle branch capability (branch folding)
— Programmable static branch prediction on unresolved conditional branches
— Two integer units with enhanced multipliers in thee300c2 for increased integer instruction
throughput and a maximum two-cycle latency for multiply instructions
— Instruction fetch unit capable of fetching two instructions per clock from the instruction cache
— A six-entry instruction queue (IQ) that provides lookahead capability
— Independent pipelines with feed-forwarding that reduces data dependencies in hardware
— 16-Kbyte, four-way set-associative instruction and data caches on the e300c2.
— Cache write-back or write-through operation programmable on a per-page or per-block basis
— Features for instruction and data cache locking and protection
— BPU that performs CR lookahead operations
— Address translation facilities for 4-Kbyte page size, variable block size, and 256-Mbyte
segment size
— A 64-entry, two-way, set-associative ITLB and DTLB
— Eight-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte blocks
— Software table search operations and updates supported through fast trap mechanism
— 52-bit virtual address; 32-bit physical address

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 12:26     ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 12:26 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev



Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>> which obliges gcc to format the condition into a 0 or 1 value
>> in a register.
> 
> Huh?  Why is that?
> 
> Will it work better if this used __builtin_trap?  Or does the kernel only
> detect very specific forms of trap instructions?
> 
>> By using a generic implementation, gcc will generate a branch
>> to the unconditional trap generated by BUG().
> 
> That is many more instructions than ideal.
> 
>> As modern powerpc implement branch folding, that's even more efficient.
> 
> What PowerPC cpus implement branch folding?  I know none.

Extract from powerpc mpc8323 reference manual:

High instruction and data throughput
— Zero-cycle branch capability (branch folding)
— Programmable static branch prediction on unresolved conditional branches
— Two integer units with enhanced multipliers in thee300c2 for increased integer instruction
throughput and a maximum two-cycle latency for multiply instructions
— Instruction fetch unit capable of fetching two instructions per clock from the instruction cache
— A six-entry instruction queue (IQ) that provides lookahead capability
— Independent pipelines with feed-forwarding that reduces data dependencies in hardware
— 16-Kbyte, four-way set-associative instruction and data caches on the e300c2.
— Cache write-back or write-through operation programmable on a per-page or per-block basis
— Features for instruction and data cache locking and protection
— BPU that performs CR lookahead operations
— Address translation facilities for 4-Kbyte page size, variable block size, and 256-Mbyte
segment size
— A 64-entry, two-way, set-associative ITLB and DTLB
— Eight-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte blocks
— Software table search operations and updates supported through fast trap mechanism
— 52-bit virtual address; 32-bit physical address

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 12:26     ` Christophe Leroy
@ 2021-02-11 12:39       ` Segher Boessenkool
  -1 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 12:39 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel

On Thu, Feb 11, 2021 at 01:26:12PM +0100, Christophe Leroy wrote:
> >What PowerPC cpus implement branch folding?  I know none.
> 
> Extract from powerpc mpc8323 reference manual:

> — Zero-cycle branch capability (branch folding)

Yeah, this is not what is traditionally called branch folding (which
stores the instruction being branched to in some cache, originally the
instruction cache itself; somewhat similar (but different) to the BTIC
on 750).  Overloaded terminology :-)

6xx/7xx CPUs had the branch execution unit in the frontend, and it would
not issue a branch at all if it could be resolved then already.  Power4
and later predict all branches, and most are not issued at all (those
that do need to be executed, like bdnz, are).  At completion time it is
checked if the prediction was correct (and corrective action is taken if
not).


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 12:39       ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 12:39 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev

On Thu, Feb 11, 2021 at 01:26:12PM +0100, Christophe Leroy wrote:
> >What PowerPC cpus implement branch folding?  I know none.
> 
> Extract from powerpc mpc8323 reference manual:

> — Zero-cycle branch capability (branch folding)

Yeah, this is not what is traditionally called branch folding (which
stores the instruction being branched to in some cache, originally the
instruction cache itself; somewhat similar (but different) to the BTIC
on 750).  Overloaded terminology :-)

6xx/7xx CPUs had the branch execution unit in the frontend, and it would
not issue a branch at all if it could be resolved then already.  Power4
and later predict all branches, and most are not issued at all (those
that do need to be executed, like bdnz, are).  At completion time it is
checked if the prediction was correct (and corrective action is taken if
not).


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 11:49   ` Segher Boessenkool
@ 2021-02-11 14:09     ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 14:09 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel



Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>> which obliges gcc to format the condition into a 0 or 1 value
>> in a register.
> 
> Huh?  Why is that?
> 
> Will it work better if this used __builtin_trap?  Or does the kernel only
> detect very specific forms of trap instructions?

We already made a try with __builtin_trap() 1,5 year ago, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/

The main problems encountered are:
- It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is 
there any workaround ?
- The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line 
corresponding to the trap. How can that be done with __builtin_trap() ?

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 14:09     ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 14:09 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev



Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>> which obliges gcc to format the condition into a 0 or 1 value
>> in a register.
> 
> Huh?  Why is that?
> 
> Will it work better if this used __builtin_trap?  Or does the kernel only
> detect very specific forms of trap instructions?

We already made a try with __builtin_trap() 1,5 year ago, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/

The main problems encountered are:
- It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is 
there any workaround ?
- The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line 
corresponding to the trap. How can that be done with __builtin_trap() ?

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 14:09     ` Christophe Leroy
@ 2021-02-11 14:30       ` Segher Boessenkool
  -1 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 14:30 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel

On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> >On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
> >>powerpc BUG_ON() is based on using twnei or tdnei instruction,
> >>which obliges gcc to format the condition into a 0 or 1 value
> >>in a register.
> >
> >Huh?  Why is that?
> >
> >Will it work better if this used __builtin_trap?  Or does the kernel only
> >detect very specific forms of trap instructions?
> 
> We already made a try with __builtin_trap() 1,5 year ago, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
> 
> The main problems encountered are:
> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC 
> considers it as noreturn. Is there any workaround ?

A trap is noreturn by definition:

 -- Built-in Function: void __builtin_trap (void)
     This function causes the program to exit abnormally.

> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify 
> the source file and line corresponding to the trap. How can that be done 
> with __builtin_trap() ?

The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 14:30       ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2021-02-11 14:30 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev

On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> >On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
> >>powerpc BUG_ON() is based on using twnei or tdnei instruction,
> >>which obliges gcc to format the condition into a 0 or 1 value
> >>in a register.
> >
> >Huh?  Why is that?
> >
> >Will it work better if this used __builtin_trap?  Or does the kernel only
> >detect very specific forms of trap instructions?
> 
> We already made a try with __builtin_trap() 1,5 year ago, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
> 
> The main problems encountered are:
> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC 
> considers it as noreturn. Is there any workaround ?

A trap is noreturn by definition:

 -- Built-in Function: void __builtin_trap (void)
     This function causes the program to exit abnormally.

> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify 
> the source file and line corresponding to the trap. How can that be done 
> with __builtin_trap() ?

The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.


Segher

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 14:30       ` Segher Boessenkool
@ 2021-02-11 15:30         ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 15:30 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel



Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
>> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
>>> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>>>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>>>> which obliges gcc to format the condition into a 0 or 1 value
>>>> in a register.
>>>
>>> Huh?  Why is that?
>>>
>>> Will it work better if this used __builtin_trap?  Or does the kernel only
>>> detect very specific forms of trap instructions?
>>
>> We already made a try with __builtin_trap() 1,5 year ago, see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
>>
>> The main problems encountered are:
>> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC
>> considers it as noreturn. Is there any workaround ?
> 
> A trap is noreturn by definition:
> 
>   -- Built-in Function: void __builtin_trap (void)
>       This function causes the program to exit abnormally.
> 
>> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
>> the source file and line corresponding to the trap. How can that be done
>> with __builtin_trap() ?
> 
> The DWARF debug info should be sufficient.  Perhaps you can post-process
> some way?
> 
> You can create a trap that falls through yourself (by having a trap-on
> condition with a condition that is always true, but make the compiler
> not see that).  This isn't efficient though.
> 
> Could you file a feature request (in bugzilla)?  It is probably useful
> for generic code as well, but we could implement this for powerpc only
> if needed.
> 

Ok I will.


For sure using __builtin_trap() would be the best.

unsigned long test1(unsigned long msr)
{
	WARN_ON((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI));

	return msr;
}

unsigned long test2(unsigned long msr)
{
	if ((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI))
		__builtin_trap();

	return msr;
}

000003c0 <test1>:
  3c0:	70 69 00 32 	andi.   r9,r3,50
  3c4:	69 29 00 32 	xori    r9,r9,50
  3c8:	31 49 ff ff 	addic   r10,r9,-1
  3cc:	7d 2a 49 10 	subfe   r9,r10,r9
  3d0:	0f 09 00 00 	twnei   r9,0
  3d4:	4e 80 00 20 	blr

000003d8 <test2>:
  3d8:	70 69 00 32 	andi.   r9,r3,50
  3dc:	0f 09 00 32 	twnei   r9,50
  3e0:	4e 80 00 20 	blr

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 15:30         ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-11 15:30 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev



Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
>> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
>>> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>>>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>>>> which obliges gcc to format the condition into a 0 or 1 value
>>>> in a register.
>>>
>>> Huh?  Why is that?
>>>
>>> Will it work better if this used __builtin_trap?  Or does the kernel only
>>> detect very specific forms of trap instructions?
>>
>> We already made a try with __builtin_trap() 1,5 year ago, see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
>>
>> The main problems encountered are:
>> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC
>> considers it as noreturn. Is there any workaround ?
> 
> A trap is noreturn by definition:
> 
>   -- Built-in Function: void __builtin_trap (void)
>       This function causes the program to exit abnormally.
> 
>> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
>> the source file and line corresponding to the trap. How can that be done
>> with __builtin_trap() ?
> 
> The DWARF debug info should be sufficient.  Perhaps you can post-process
> some way?
> 
> You can create a trap that falls through yourself (by having a trap-on
> condition with a condition that is always true, but make the compiler
> not see that).  This isn't efficient though.
> 
> Could you file a feature request (in bugzilla)?  It is probably useful
> for generic code as well, but we could implement this for powerpc only
> if needed.
> 

Ok I will.


For sure using __builtin_trap() would be the best.

unsigned long test1(unsigned long msr)
{
	WARN_ON((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI));

	return msr;
}

unsigned long test2(unsigned long msr)
{
	if ((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI))
		__builtin_trap();

	return msr;
}

000003c0 <test1>:
  3c0:	70 69 00 32 	andi.   r9,r3,50
  3c4:	69 29 00 32 	xori    r9,r9,50
  3c8:	31 49 ff ff 	addic   r10,r9,-1
  3cc:	7d 2a 49 10 	subfe   r9,r10,r9
  3d0:	0f 09 00 00 	twnei   r9,0
  3d4:	4e 80 00 20 	blr

000003d8 <test2>:
  3d8:	70 69 00 32 	andi.   r9,r3,50
  3dc:	0f 09 00 32 	twnei   r9,50
  3e0:	4e 80 00 20 	blr

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 11:50     ` Segher Boessenkool
@ 2021-02-11 22:47       ` Nicholas Piggin
  -1 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2021-02-11 22:47 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Christophe Leroy, linux-kernel,
	linuxppc-dev, Michael Ellerman, Paul Mackerras

Excerpts from Segher Boessenkool's message of February 11, 2021 9:50 pm:
> On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
>> It would be nice if we could have a __builtin_trap_if that gcc would use 
>> conditional traps with, (and which never assumes following code is 
>> unreachable even for constant true, so we can use it with WARN and put 
>> explicit unreachable for BUG).
> 
> It automatically does that with just __builtin_trap, see my other mail :-)

If that is generated without branches (or at least with no more
branches than existing asm implementation), then it could be usable 
without trashing CFAR.

Unfortunately I don't think we will be parsing the dwarf information
to get line numbers from it any time soon, so not a drop in replacement 
but maybe one day someone would find a solution.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-11 22:47       ` Nicholas Piggin
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas Piggin @ 2021-02-11 22:47 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linux-kernel, Paul Mackerras, linuxppc-dev

Excerpts from Segher Boessenkool's message of February 11, 2021 9:50 pm:
> On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
>> It would be nice if we could have a __builtin_trap_if that gcc would use 
>> conditional traps with, (and which never assumes following code is 
>> unreachable even for constant true, so we can use it with WARN and put 
>> explicit unreachable for BUG).
> 
> It automatically does that with just __builtin_trap, see my other mail :-)

If that is generated without branches (or at least with no more
branches than existing asm implementation), then it could be usable 
without trashing CFAR.

Unfortunately I don't think we will be parsing the dwarf information
to get line numbers from it any time soon, so not a drop in replacement 
but maybe one day someone would find a solution.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
  2021-02-11 14:30       ` Segher Boessenkool
@ 2021-02-27 10:31         ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-27 10:31 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	npiggin, linuxppc-dev, linux-kernel



Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
>> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
>>> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>>>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>>>> which obliges gcc to format the condition into a 0 or 1 value
>>>> in a register.
>>>
>>> Huh?  Why is that?
>>>
>>> Will it work better if this used __builtin_trap?  Or does the kernel only
>>> detect very specific forms of trap instructions?
>>
>> We already made a try with __builtin_trap() 1,5 year ago, see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
>>
>> The main problems encountered are:
>> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC
>> considers it as noreturn. Is there any workaround ?
> 
> A trap is noreturn by definition:
> 
>   -- Built-in Function: void __builtin_trap (void)
>       This function causes the program to exit abnormally.
> 
>> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
>> the source file and line corresponding to the trap. How can that be done
>> with __builtin_trap() ?
> 
> The DWARF debug info should be sufficient.  Perhaps you can post-process
> some way?
> 
> You can create a trap that falls through yourself (by having a trap-on
> condition with a condition that is always true, but make the compiler
> not see that).  This isn't efficient though.
> 
> Could you file a feature request (in bugzilla)?  It is probably useful
> for generic code as well, but we could implement this for powerpc only
> if needed.
> 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99299

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
@ 2021-02-27 10:31         ` Christophe Leroy
  0 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2021-02-27 10:31 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev



Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :
> On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
>> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
>>> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>>>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>>>> which obliges gcc to format the condition into a 0 or 1 value
>>>> in a register.
>>>
>>> Huh?  Why is that?
>>>
>>> Will it work better if this used __builtin_trap?  Or does the kernel only
>>> detect very specific forms of trap instructions?
>>
>> We already made a try with __builtin_trap() 1,5 year ago, see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.leroy@c-s.fr/
>>
>> The main problems encountered are:
>> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC
>> considers it as noreturn. Is there any workaround ?
> 
> A trap is noreturn by definition:
> 
>   -- Built-in Function: void __builtin_trap (void)
>       This function causes the program to exit abnormally.
> 
>> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
>> the source file and line corresponding to the trap. How can that be done
>> with __builtin_trap() ?
> 
> The DWARF debug info should be sufficient.  Perhaps you can post-process
> some way?
> 
> You can create a trap that falls through yourself (by having a trap-on
> condition with a condition that is always true, but make the compiler
> not see that).  This isn't efficient though.
> 
> Could you file a feature request (in bugzilla)?  It is probably useful
> for generic code as well, but we could implement this for powerpc only
> if needed.
> 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99299

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-02-27 10:32 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-11  7:41 [PATCH] powerpc/bug: Remove specific powerpc BUG_ON() Christophe Leroy
2021-02-11  7:41 ` Christophe Leroy
2021-02-11 10:04 ` Nicholas Piggin
2021-02-11 10:04   ` Nicholas Piggin
2021-02-11 11:50   ` Segher Boessenkool
2021-02-11 11:50     ` Segher Boessenkool
2021-02-11 22:47     ` Nicholas Piggin
2021-02-11 22:47       ` Nicholas Piggin
2021-02-11 12:22   ` Segher Boessenkool
2021-02-11 12:22     ` Segher Boessenkool
2021-02-11 11:49 ` Segher Boessenkool
2021-02-11 11:49   ` Segher Boessenkool
2021-02-11 12:26   ` Christophe Leroy
2021-02-11 12:26     ` Christophe Leroy
2021-02-11 12:39     ` Segher Boessenkool
2021-02-11 12:39       ` Segher Boessenkool
2021-02-11 14:09   ` Christophe Leroy
2021-02-11 14:09     ` Christophe Leroy
2021-02-11 14:30     ` Segher Boessenkool
2021-02-11 14:30       ` Segher Boessenkool
2021-02-11 15:30       ` Christophe Leroy
2021-02-11 15:30         ` Christophe Leroy
2021-02-27 10:31       ` Christophe Leroy
2021-02-27 10:31         ` Christophe Leroy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.