* [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32
@ 2021-04-12 16:26 Christophe Leroy
2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy
2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
0 siblings, 2 replies; 5+ messages in thread
From: Christophe Leroy @ 2021-04-12 16:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linux-kernel, linuxppc-dev
powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
For catching simple conditions like a variable having value 0, this
is efficient because it does the test and the trap at the same time.
But most conditions used with BUG_ON or WARN_ON are more complex and
forces GCC to format the condition into a 0 or 1 value in a register.
This will usually require 2 to 3 instructions.
The most efficient solution would be to use __builtin_trap() because
GCC is able to optimise the use of the different trap instructions
based on the requested condition, but this is complex if not
impossible for the following reasons:
- __builtin_trap() is a non-recoverable instruction, so it can't be
used for WARN_ON
- Knowing which line of code generated the trap would require the
analysis of DWARF information. This is not a feature we have today.
As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
the way WARN_ON() is implemented is suboptimal. That commit also
mentions an issue with 'long long' condition. It fixed it for
WARN_ON() but the same problem still exists today with BUG_ON() on
PPC32. It will be fixed by using the generic implementation.
By using the generic implementation, gcc will naturally generate a
branch to the unconditional trap generated by BUG().
As modern powerpc implement zero-cycle branch,
that's even more efficient.
And for the functions using WARN_ON() and its return, the test
on return from WARN_ON() is now also used for the WARN_ON() itself.
On PPC64 we don't want it because we want to be able to use CFAR
register to track how we entered the code that trapped. The CFAR
register would be clobbered by the branch.
A simple test function:
unsigned long test9w(unsigned long a, unsigned long b)
{
if (WARN_ON(!b))
return 0;
return a / b;
}
Before the patch:
0000046c <test9w>:
46c: 7c 89 00 34 cntlzw r9,r4
470: 55 29 d9 7e rlwinm r9,r9,27,5,31
474: 0f 09 00 00 twnei r9,0
478: 2c 04 00 00 cmpwi r4,0
47c: 41 82 00 0c beq 488 <test9w+0x1c>
480: 7c 63 23 96 divwu r3,r3,r4
484: 4e 80 00 20 blr
488: 38 60 00 00 li r3,0
48c: 4e 80 00 20 blr
After the patch:
00000468 <test9w>:
468: 2c 04 00 00 cmpwi r4,0
46c: 41 82 00 0c beq 478 <test9w+0x10>
470: 7c 63 23 96 divwu r3,r3,r4
474: 4e 80 00 20 blr
478: 0f e0 00 00 twui r0,0
47c: 38 60 00 00 li r3,0
480: 4e 80 00 20 blr
So we see before the patch we need 3 instructions on the likely path
to handle the WARN_ON(). With the patch the trap goes on the unlikely
path.
See below the difference at the entry of system_call_exception where
we have several BUG_ON(), allthough less impressing.
With the patch:
00000000 <system_call_exception>:
0: 81 6a 00 84 lwz r11,132(r10)
4: 90 6a 00 88 stw r3,136(r10)
8: 71 60 00 02 andi. r0,r11,2
c: 41 82 00 70 beq 7c <system_call_exception+0x7c>
10: 71 60 40 00 andi. r0,r11,16384
14: 41 82 00 6c beq 80 <system_call_exception+0x80>
18: 71 6b 80 00 andi. r11,r11,32768
1c: 41 82 00 68 beq 84 <system_call_exception+0x84>
20: 94 21 ff e0 stwu r1,-32(r1)
24: 93 e1 00 1c stw r31,28(r1)
28: 7d 8c 42 e6 mftb r12
...
7c: 0f e0 00 00 twui r0,0
80: 0f e0 00 00 twui r0,0
84: 0f e0 00 00 twui r0,0
Without the patch:
00000000 <system_call_exception>:
0: 94 21 ff e0 stwu r1,-32(r1)
4: 93 e1 00 1c stw r31,28(r1)
8: 90 6a 00 88 stw r3,136(r10)
c: 81 6a 00 84 lwz r11,132(r10)
10: 69 60 00 02 xori r0,r11,2
14: 54 00 ff fe rlwinm r0,r0,31,31,31
18: 0f 00 00 00 twnei r0,0
1c: 69 60 40 00 xori r0,r11,16384
20: 54 00 97 fe rlwinm r0,r0,18,31,31
24: 0f 00 00 00 twnei r0,0
28: 69 6b 80 00 xori r11,r11,32768
2c: 55 6b 8f fe rlwinm r11,r11,17,31,31
30: 0f 0b 00 00 twnei r11,0
34: 7d 8c 42 e6 mftb r12
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/bug.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..101dea4eec8d 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -68,7 +68,11 @@
BUG_ENTRY("twi 31, 0, 0", 0); \
unreachable(); \
} while (0)
+#define HAVE_ARCH_BUG
+
+#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
+#ifdef CONFIG_PPC64
#define BUG_ON(x) do { \
if (__builtin_constant_p(x)) { \
if (x) \
@@ -78,8 +82,6 @@
} \
} while (0)
-#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
-
#define WARN_ON(x) ({ \
int __ret_warn_on = !!(x); \
if (__builtin_constant_p(__ret_warn_on)) { \
@@ -93,9 +95,10 @@
unlikely(__ret_warn_on); \
})
-#define HAVE_ARCH_BUG
#define HAVE_ARCH_BUG_ON
#define HAVE_ARCH_WARN_ON
+#endif
+
#endif /* __ASSEMBLY __ */
#else
#ifdef __ASSEMBLY__
--
2.25.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto
2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
@ 2021-04-12 16:26 ` Christophe Leroy
2021-04-12 20:40 ` kernel test robot
2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
1 sibling, 1 reply; 5+ messages in thread
From: Christophe Leroy @ 2021-04-12 16:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linux-kernel, linuxppc-dev
Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
flexibility to GCC.
For that add an entry to the exception table so that
program_check_exception() knowns where to resume execution
after a WARNING.
Here are two exemples. The first one is done on PPC32 (which
benefits from the previous patch), the second is on PPC64.
unsigned long test(struct pt_regs *regs)
{
int ret;
WARN_ON(regs->msr & MSR_PR);
return regs->gpr[3];
}
unsigned long test9w(unsigned long a, unsigned long b)
{
if (WARN_ON(!b))
return 0;
return a / b;
}
Before the patch:
000003a8 <test>:
3a8: 81 23 00 84 lwz r9,132(r3)
3ac: 71 29 40 00 andi. r9,r9,16384
3b0: 40 82 00 0c bne 3bc <test+0x14>
3b4: 80 63 00 0c lwz r3,12(r3)
3b8: 4e 80 00 20 blr
3bc: 0f e0 00 00 twui r0,0
3c0: 80 63 00 0c lwz r3,12(r3)
3c4: 4e 80 00 20 blr
0000000000000bf0 <.test9w>:
bf0: 7c 89 00 74 cntlzd r9,r4
bf4: 79 29 d1 82 rldicl r9,r9,58,6
bf8: 0b 09 00 00 tdnei r9,0
bfc: 2c 24 00 00 cmpdi r4,0
c00: 41 82 00 0c beq c0c <.test9w+0x1c>
c04: 7c 63 23 92 divdu r3,r3,r4
c08: 4e 80 00 20 blr
c0c: 38 60 00 00 li r3,0
c10: 4e 80 00 20 blr
After the patch:
000003a8 <test>:
3a8: 81 23 00 84 lwz r9,132(r3)
3ac: 71 29 40 00 andi. r9,r9,16384
3b0: 40 82 00 0c bne 3bc <test+0x14>
3b4: 80 63 00 0c lwz r3,12(r3)
3b8: 4e 80 00 20 blr
3bc: 0f e0 00 00 twui r0,0
0000000000000c50 <.test9w>:
c50: 7c 89 00 74 cntlzd r9,r4
c54: 79 29 d1 82 rldicl r9,r9,58,6
c58: 0b 09 00 00 tdnei r9,0
c5c: 7c 63 23 92 divdu r3,r3,r4
c60: 4e 80 00 20 blr
c70: 38 60 00 00 li r3,0
c74: 4e 80 00 20 blr
In the first exemple, we see GCC doesn't need to duplicate what
happens after the trap.
In the second exemple, we see that GCC doesn't need to emit a test
and a branch in the likely path in addition to the trap.
We've got some WARN_ON() in .softirqentry.text section so it needs
to be added in the OTHER_TEXT_SECTIONS in modpost.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/book3s/64/kup.h | 2 +-
arch/powerpc/include/asm/bug.h | 51 +++++++++++++++++++-----
arch/powerpc/include/asm/extable.h | 14 +++++++
arch/powerpc/include/asm/ppc_asm.h | 11 +----
arch/powerpc/kernel/entry_64.S | 2 +-
arch/powerpc/kernel/exceptions-64e.S | 2 +-
arch/powerpc/kernel/misc_32.S | 2 +-
arch/powerpc/kernel/traps.c | 9 ++++-
scripts/mod/modpost.c | 2 +-
9 files changed, 69 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 9700da3a4093..a22839cba32e 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -90,7 +90,7 @@
/* Prevent access to userspace using any key values */
LOAD_REG_IMMEDIATE(\gpr2, AMR_KUAP_BLOCKED)
999: tdne \gpr1, \gpr2
- EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
+ EMIT_WARN_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_BOOK3S_KUAP, 67)
#endif
.endm
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 101dea4eec8d..d92afdbd4449 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -4,6 +4,7 @@
#ifdef __KERNEL__
#include <asm/asm-compat.h>
+#include <asm/extable.h>
#ifdef CONFIG_BUG
@@ -30,6 +31,11 @@
.endm
#endif /* verbose */
+.macro EMIT_WARN_ENTRY addr,file,line,flags
+ EX_TABLE(\addr,\addr+4)
+ EMIT_BUG_ENTRY \addr,\file,\line,\flags
+.endm
+
#else /* !__ASSEMBLY__ */
/* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and
sizeof(struct bug_entry), respectively */
@@ -58,6 +64,16 @@
"i" (sizeof(struct bug_entry)), \
##__VA_ARGS__)
+#define WARN_ENTRY(insn, flags, label, ...) \
+ asm_volatile_goto( \
+ "1: " insn "\n" \
+ EX_TABLE(1b, %l[label]) \
+ _EMIT_BUG_ENTRY \
+ : : "i" (__FILE__), "i" (__LINE__), \
+ "i" (flags), \
+ "i" (sizeof(struct bug_entry)), \
+ ##__VA_ARGS__ : : label)
+
/*
* BUG_ON() and WARN_ON() do their best to cooperate with compile-time
* optimisations. However depending on the complexity of the condition
@@ -70,7 +86,15 @@
} while (0)
#define HAVE_ARCH_BUG
-#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
+#define __WARN_FLAGS(flags) do { \
+ __label__ __label_warn_on; \
+ \
+ WARN_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags), __label_warn_on); \
+ unreachable(); \
+ \
+__label_warn_on: \
+ break; \
+} while (0)
#ifdef CONFIG_PPC64
#define BUG_ON(x) do { \
@@ -83,15 +107,24 @@
} while (0)
#define WARN_ON(x) ({ \
- int __ret_warn_on = !!(x); \
- if (__builtin_constant_p(__ret_warn_on)) { \
- if (__ret_warn_on) \
+ bool __ret_warn_on = false; \
+ do { \
+ if (__builtin_constant_p((x))) { \
+ if (!(x)) \
+ break; \
__WARN(); \
- } else { \
- BUG_ENTRY(PPC_TLNEI " %4, 0", \
- BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \
- "r" (__ret_warn_on)); \
- } \
+ __ret_warn_on = true; \
+ } else { \
+ __label__ __label_warn_on; \
+ \
+ WARN_ENTRY(PPC_TLNEI " %4, 0", \
+ BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \
+ __label_warn_on, "r" (x)); \
+ break; \
+__label_warn_on: \
+ __ret_warn_on = true; \
+ } \
+ } while (0); \
unlikely(__ret_warn_on); \
})
diff --git a/arch/powerpc/include/asm/extable.h b/arch/powerpc/include/asm/extable.h
index eb91b2d2935a..26ce2e5c0fa8 100644
--- a/arch/powerpc/include/asm/extable.h
+++ b/arch/powerpc/include/asm/extable.h
@@ -17,6 +17,8 @@
#define ARCH_HAS_RELATIVE_EXTABLE
+#ifndef __ASSEMBLY__
+
struct exception_table_entry {
int insn;
int fixup;
@@ -28,3 +30,15 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x)
}
#endif
+
+/*
+ * Helper macro for exception table entries
+ */
+#define EX_TABLE(_fault, _target) \
+ stringify_in_c(.section __ex_table,"a";)\
+ stringify_in_c(.balign 4;) \
+ stringify_in_c(.long (_fault) - . ;) \
+ stringify_in_c(.long (_target) - . ;) \
+ stringify_in_c(.previous)
+
+#endif
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 8998122fc7e2..a74e1561535a 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -10,6 +10,7 @@
#include <asm/ppc-opcode.h>
#include <asm/firmware.h>
#include <asm/feature-fixups.h>
+#include <asm/extable.h>
#ifdef __ASSEMBLY__
@@ -772,16 +773,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96)
#endif /* __ASSEMBLY__ */
-/*
- * Helper macro for exception table entries
- */
-#define EX_TABLE(_fault, _target) \
- stringify_in_c(.section __ex_table,"a";)\
- stringify_in_c(.balign 4;) \
- stringify_in_c(.long (_fault) - . ;) \
- stringify_in_c(.long (_target) - . ;) \
- stringify_in_c(.previous)
-
#ifdef CONFIG_PPC_FSL_BOOK3E
#define BTB_FLUSH(reg) \
lis reg,BUCSR_INIT@h; \
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 6c4d9e276c4d..faa64cc90f02 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -835,7 +835,7 @@ _GLOBAL(enter_rtas)
*/
lbz r0,PACAIRQSOFTMASK(r13)
1: tdeqi r0,IRQS_ENABLED
- EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING
+ EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING
#endif
/* Hard-disable interrupts */
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index e8eb9992a270..3f8c51390a04 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1249,7 +1249,7 @@ fast_exception_return:
/* The interrupt should not have soft enabled. */
lbz r7,PACAIRQSOFTMASK(r13)
1: tdeqi r7,IRQS_ENABLED
- EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING
+ EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING
#endif
b fast_exception_return
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 6a076bef2932..21390f3119a9 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -237,7 +237,7 @@ _GLOBAL(copy_page)
addi r3,r3,-4
0: twnei r5, 0 /* WARN if r3 is not cache aligned */
- EMIT_BUG_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING
+ EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING
addi r4,r4,-4
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index efba99870691..ee40a637313d 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1467,8 +1467,13 @@ static void do_program_check(struct pt_regs *regs)
if (!(regs->msr & MSR_PR) && /* not user-mode */
report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
- regs->nip += 4;
- return;
+ const struct exception_table_entry *entry;
+
+ entry = search_exception_tables(bugaddr);
+ if (entry) {
+ regs->nip = extable_fixup(entry) + regs->nip - bugaddr;
+ return;
+ }
}
_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
return;
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 24725e50c7b4..34745f239208 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -926,7 +926,7 @@ static void check_section(const char *modname, struct elf_info *elf,
".kprobes.text", ".cpuidle.text", ".noinstr.text"
#define OTHER_TEXT_SECTIONS ".ref.text", ".head.text", ".spinlock.text", \
".fixup", ".entry.text", ".exception.text", ".text.*", \
- ".coldtext"
+ ".coldtext .softirqentry.text"
#define INIT_SECTIONS ".init.*"
#define MEM_INIT_SECTIONS ".meminit.*"
--
2.25.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto
2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy
@ 2021-04-12 20:40 ` kernel test robot
0 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2021-04-12 20:40 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman
Cc: kbuild-all, clang-built-linux, linuxppc-dev, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2874 bytes --]
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on kbuild/for-next v5.12-rc7 next-20210412]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-bug-Remove-specific-powerpc-BUG_ON-and-WARN_ON-on-PPC32/20210413-002741
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r035-20210412 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9829f5e6b1bca9b61efc629770d28bb9014dec45)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# https://github.com/0day-ci/linux/commit/398ea05a716b58d231d62d20083a101488d155b4
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Christophe-Leroy/powerpc-bug-Remove-specific-powerpc-BUG_ON-and-WARN_ON-on-PPC32/20210413-002741
git checkout 398ea05a716b58d231d62d20083a101488d155b4
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
arch/powerpc/kernel/misc_32.S: Assembler messages:
>> arch/powerpc/kernel/misc_32.S:240: Error: unrecognized opcode: `emit_warn_entry'
clang-13: error: assembler command failed with exit code 1 (use -v to see invocation)
vim +240 arch/powerpc/kernel/misc_32.S
218
219 /*
220 * Copy a whole page. We use the dcbz instruction on the destination
221 * to reduce memory traffic (it eliminates the unnecessary reads of
222 * the destination into cache). This requires that the destination
223 * is cacheable.
224 */
225 #define COPY_16_BYTES \
226 lwz r6,4(r4); \
227 lwz r7,8(r4); \
228 lwz r8,12(r4); \
229 lwzu r9,16(r4); \
230 stw r6,4(r3); \
231 stw r7,8(r3); \
232 stw r8,12(r3); \
233 stwu r9,16(r3)
234
235 _GLOBAL(copy_page)
236 rlwinm r5, r3, 0, L1_CACHE_BYTES - 1
237 addi r3,r3,-4
238
239 0: twnei r5, 0 /* WARN if r3 is not cache aligned */
> 240 EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING
241
242 addi r4,r4,-4
243
244 li r5,4
245
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27424 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32
2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy
@ 2021-06-25 14:41 ` Christophe Leroy
2021-08-04 5:25 ` Christophe Leroy
1 sibling, 1 reply; 5+ messages in thread
From: Christophe Leroy @ 2021-06-25 14:41 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Hi Michael,
What happened to this series ? It has been flagged 'under review' in Patchwork since mid April but I
never saw it in next-test.
Thanks
Christophe
Le 12/04/2021 à 18:26, Christophe Leroy a écrit :
> powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
>
> For catching simple conditions like a variable having value 0, this
> is efficient because it does the test and the trap at the same time.
> But most conditions used with BUG_ON or WARN_ON are more complex and
> forces GCC to format the condition into a 0 or 1 value in a register.
> This will usually require 2 to 3 instructions.
>
> The most efficient solution would be to use __builtin_trap() because
> GCC is able to optimise the use of the different trap instructions
> based on the requested condition, but this is complex if not
> impossible for the following reasons:
> - __builtin_trap() is a non-recoverable instruction, so it can't be
> used for WARN_ON
> - Knowing which line of code generated the trap would require the
> analysis of DWARF information. This is not a feature we have today.
>
> As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
> the way WARN_ON() is implemented is suboptimal. That commit also
> mentions an issue with 'long long' condition. It fixed it for
> WARN_ON() but the same problem still exists today with BUG_ON() on
> PPC32. It will be fixed by using the generic implementation.
>
> By using the generic implementation, gcc will naturally generate a
> branch to the unconditional trap generated by BUG().
>
> As modern powerpc implement zero-cycle branch,
> that's even more efficient.
>
> And for the functions using WARN_ON() and its return, the test
> on return from WARN_ON() is now also used for the WARN_ON() itself.
>
> On PPC64 we don't want it because we want to be able to use CFAR
> register to track how we entered the code that trapped. The CFAR
> register would be clobbered by the branch.
>
> A simple test function:
>
> unsigned long test9w(unsigned long a, unsigned long b)
> {
> if (WARN_ON(!b))
> return 0;
> return a / b;
> }
>
> Before the patch:
>
> 0000046c <test9w>:
> 46c: 7c 89 00 34 cntlzw r9,r4
> 470: 55 29 d9 7e rlwinm r9,r9,27,5,31
> 474: 0f 09 00 00 twnei r9,0
> 478: 2c 04 00 00 cmpwi r4,0
> 47c: 41 82 00 0c beq 488 <test9w+0x1c>
> 480: 7c 63 23 96 divwu r3,r3,r4
> 484: 4e 80 00 20 blr
>
> 488: 38 60 00 00 li r3,0
> 48c: 4e 80 00 20 blr
>
> After the patch:
>
> 00000468 <test9w>:
> 468: 2c 04 00 00 cmpwi r4,0
> 46c: 41 82 00 0c beq 478 <test9w+0x10>
> 470: 7c 63 23 96 divwu r3,r3,r4
> 474: 4e 80 00 20 blr
>
> 478: 0f e0 00 00 twui r0,0
> 47c: 38 60 00 00 li r3,0
> 480: 4e 80 00 20 blr
>
> So we see before the patch we need 3 instructions on the likely path
> to handle the WARN_ON(). With the patch the trap goes on the unlikely
> path.
>
> See below the difference at the entry of system_call_exception where
> we have several BUG_ON(), allthough less impressing.
>
> With the patch:
>
> 00000000 <system_call_exception>:
> 0: 81 6a 00 84 lwz r11,132(r10)
> 4: 90 6a 00 88 stw r3,136(r10)
> 8: 71 60 00 02 andi. r0,r11,2
> c: 41 82 00 70 beq 7c <system_call_exception+0x7c>
> 10: 71 60 40 00 andi. r0,r11,16384
> 14: 41 82 00 6c beq 80 <system_call_exception+0x80>
> 18: 71 6b 80 00 andi. r11,r11,32768
> 1c: 41 82 00 68 beq 84 <system_call_exception+0x84>
> 20: 94 21 ff e0 stwu r1,-32(r1)
> 24: 93 e1 00 1c stw r31,28(r1)
> 28: 7d 8c 42 e6 mftb r12
> ...
> 7c: 0f e0 00 00 twui r0,0
> 80: 0f e0 00 00 twui r0,0
> 84: 0f e0 00 00 twui r0,0
>
> Without the patch:
>
> 00000000 <system_call_exception>:
> 0: 94 21 ff e0 stwu r1,-32(r1)
> 4: 93 e1 00 1c stw r31,28(r1)
> 8: 90 6a 00 88 stw r3,136(r10)
> c: 81 6a 00 84 lwz r11,132(r10)
> 10: 69 60 00 02 xori r0,r11,2
> 14: 54 00 ff fe rlwinm r0,r0,31,31,31
> 18: 0f 00 00 00 twnei r0,0
> 1c: 69 60 40 00 xori r0,r11,16384
> 20: 54 00 97 fe rlwinm r0,r0,18,31,31
> 24: 0f 00 00 00 twnei r0,0
> 28: 69 6b 80 00 xori r11,r11,32768
> 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31
> 30: 0f 0b 00 00 twnei r11,0
> 34: 7d 8c 42 e6 mftb r12
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> arch/powerpc/include/asm/bug.h | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index d1635ffbb179..101dea4eec8d 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -68,7 +68,11 @@
> BUG_ENTRY("twi 31, 0, 0", 0); \
> unreachable(); \
> } while (0)
> +#define HAVE_ARCH_BUG
> +
> +#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
>
> +#ifdef CONFIG_PPC64
> #define BUG_ON(x) do { \
> if (__builtin_constant_p(x)) { \
> if (x) \
> @@ -78,8 +82,6 @@
> } \
> } while (0)
>
> -#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
> -
> #define WARN_ON(x) ({ \
> int __ret_warn_on = !!(x); \
> if (__builtin_constant_p(__ret_warn_on)) { \
> @@ -93,9 +95,10 @@
> unlikely(__ret_warn_on); \
> })
>
> -#define HAVE_ARCH_BUG
> #define HAVE_ARCH_BUG_ON
> #define HAVE_ARCH_WARN_ON
> +#endif
> +
> #endif /* __ASSEMBLY __ */
> #else
> #ifdef __ASSEMBLY__
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32
2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
@ 2021-08-04 5:25 ` Christophe Leroy
0 siblings, 0 replies; 5+ messages in thread
From: Christophe Leroy @ 2021-08-04 5:25 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Gentle ping Michael ?
Le 25/06/2021 à 16:41, Christophe Leroy a écrit :
> Hi Michael,
>
> What happened to this series ? It has been flagged 'under review' in Patchwork since mid April but I
> never saw it in next-test.
>
> Thanks
> Christophe
>
> Le 12/04/2021 à 18:26, Christophe Leroy a écrit :
>> powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
>>
>> For catching simple conditions like a variable having value 0, this
>> is efficient because it does the test and the trap at the same time.
>> But most conditions used with BUG_ON or WARN_ON are more complex and
>> forces GCC to format the condition into a 0 or 1 value in a register.
>> This will usually require 2 to 3 instructions.
>>
>> The most efficient solution would be to use __builtin_trap() because
>> GCC is able to optimise the use of the different trap instructions
>> based on the requested condition, but this is complex if not
>> impossible for the following reasons:
>> - __builtin_trap() is a non-recoverable instruction, so it can't be
>> used for WARN_ON
>> - Knowing which line of code generated the trap would require the
>> analysis of DWARF information. This is not a feature we have today.
>>
>> As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
>> the way WARN_ON() is implemented is suboptimal. That commit also
>> mentions an issue with 'long long' condition. It fixed it for
>> WARN_ON() but the same problem still exists today with BUG_ON() on
>> PPC32. It will be fixed by using the generic implementation.
>>
>> By using the generic implementation, gcc will naturally generate a
>> branch to the unconditional trap generated by BUG().
>>
>> As modern powerpc implement zero-cycle branch,
>> that's even more efficient.
>>
>> And for the functions using WARN_ON() and its return, the test
>> on return from WARN_ON() is now also used for the WARN_ON() itself.
>>
>> On PPC64 we don't want it because we want to be able to use CFAR
>> register to track how we entered the code that trapped. The CFAR
>> register would be clobbered by the branch.
>>
>> A simple test function:
>>
>> unsigned long test9w(unsigned long a, unsigned long b)
>> {
>> if (WARN_ON(!b))
>> return 0;
>> return a / b;
>> }
>>
>> Before the patch:
>>
>> 0000046c <test9w>:
>> 46c: 7c 89 00 34 cntlzw r9,r4
>> 470: 55 29 d9 7e rlwinm r9,r9,27,5,31
>> 474: 0f 09 00 00 twnei r9,0
>> 478: 2c 04 00 00 cmpwi r4,0
>> 47c: 41 82 00 0c beq 488 <test9w+0x1c>
>> 480: 7c 63 23 96 divwu r3,r3,r4
>> 484: 4e 80 00 20 blr
>>
>> 488: 38 60 00 00 li r3,0
>> 48c: 4e 80 00 20 blr
>>
>> After the patch:
>>
>> 00000468 <test9w>:
>> 468: 2c 04 00 00 cmpwi r4,0
>> 46c: 41 82 00 0c beq 478 <test9w+0x10>
>> 470: 7c 63 23 96 divwu r3,r3,r4
>> 474: 4e 80 00 20 blr
>>
>> 478: 0f e0 00 00 twui r0,0
>> 47c: 38 60 00 00 li r3,0
>> 480: 4e 80 00 20 blr
>>
>> So we see before the patch we need 3 instructions on the likely path
>> to handle the WARN_ON(). With the patch the trap goes on the unlikely
>> path.
>>
>> See below the difference at the entry of system_call_exception where
>> we have several BUG_ON(), allthough less impressing.
>>
>> With the patch:
>>
>> 00000000 <system_call_exception>:
>> 0: 81 6a 00 84 lwz r11,132(r10)
>> 4: 90 6a 00 88 stw r3,136(r10)
>> 8: 71 60 00 02 andi. r0,r11,2
>> c: 41 82 00 70 beq 7c <system_call_exception+0x7c>
>> 10: 71 60 40 00 andi. r0,r11,16384
>> 14: 41 82 00 6c beq 80 <system_call_exception+0x80>
>> 18: 71 6b 80 00 andi. r11,r11,32768
>> 1c: 41 82 00 68 beq 84 <system_call_exception+0x84>
>> 20: 94 21 ff e0 stwu r1,-32(r1)
>> 24: 93 e1 00 1c stw r31,28(r1)
>> 28: 7d 8c 42 e6 mftb r12
>> ...
>> 7c: 0f e0 00 00 twui r0,0
>> 80: 0f e0 00 00 twui r0,0
>> 84: 0f e0 00 00 twui r0,0
>>
>> Without the patch:
>>
>> 00000000 <system_call_exception>:
>> 0: 94 21 ff e0 stwu r1,-32(r1)
>> 4: 93 e1 00 1c stw r31,28(r1)
>> 8: 90 6a 00 88 stw r3,136(r10)
>> c: 81 6a 00 84 lwz r11,132(r10)
>> 10: 69 60 00 02 xori r0,r11,2
>> 14: 54 00 ff fe rlwinm r0,r0,31,31,31
>> 18: 0f 00 00 00 twnei r0,0
>> 1c: 69 60 40 00 xori r0,r11,16384
>> 20: 54 00 97 fe rlwinm r0,r0,18,31,31
>> 24: 0f 00 00 00 twnei r0,0
>> 28: 69 6b 80 00 xori r11,r11,32768
>> 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31
>> 30: 0f 0b 00 00 twnei r11,0
>> 34: 7d 8c 42 e6 mftb r12
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>> arch/powerpc/include/asm/bug.h | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
>> index d1635ffbb179..101dea4eec8d 100644
>> --- a/arch/powerpc/include/asm/bug.h
>> +++ b/arch/powerpc/include/asm/bug.h
>> @@ -68,7 +68,11 @@
>> BUG_ENTRY("twi 31, 0, 0", 0); \
>> unreachable(); \
>> } while (0)
>> +#define HAVE_ARCH_BUG
>> +
>> +#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
>> +#ifdef CONFIG_PPC64
>> #define BUG_ON(x) do { \
>> if (__builtin_constant_p(x)) { \
>> if (x) \
>> @@ -78,8 +82,6 @@
>> } \
>> } while (0)
>> -#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
>> -
>> #define WARN_ON(x) ({ \
>> int __ret_warn_on = !!(x); \
>> if (__builtin_constant_p(__ret_warn_on)) { \
>> @@ -93,9 +95,10 @@
>> unlikely(__ret_warn_on); \
>> })
>> -#define HAVE_ARCH_BUG
>> #define HAVE_ARCH_BUG_ON
>> #define HAVE_ARCH_WARN_ON
>> +#endif
>> +
>> #endif /* __ASSEMBLY __ */
>> #else
>> #ifdef __ASSEMBLY__
>>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-08-04 5:25 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy
2021-04-12 20:40 ` kernel test robot
2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
2021-08-04 5:25 ` Christophe Leroy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).