* [PATCH v2 0/7] Implement inline static calls on PPC32
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

This series applies on top of v3 of the series "objtool: Enable and
implement --mcount option on powerpc" [1], rebased on the powerpc-next
branch.

A few modifications are made to core code to enable the powerpc
implementation (see the background sketch below):
- R_X86_64_PC32 is abstracted to an R_REL32 macro so that it can be
defined as R_PPC_REL32 on powerpc.
- A call to static_call_init() is added to start_kernel() so that each
architecture does not have to call it itself.
- The trampoline address is provided to arch_static_call_transform()
even when setting a site, so that the site can fall back on a call to
the trampoline when the target is too far away.

[1] https://lore.kernel.org/lkml/70b6d08d-aced-7f4e-b958-a3c7ae1a9319@csgroup.eu/T/#rb3a073c54aba563a135fba891e0c34c46e47beef
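
For background, a static call site compiles to a direct branch that the
kernel patches at runtime: with out-of-line static calls the branch
targets a per-key trampoline, whereas inline static calls patch each
call site directly. A minimal usage sketch of the generic API (the
names my_call, slow_impl and fast_impl are made up for illustration):

#include <linux/static_call.h>

static int slow_impl(int x) { return x * 2; }
static int fast_impl(int x) { return x << 1; }

DEFINE_STATIC_CALL(my_call, slow_impl);

static int compute(int x)
{
	/* Compiles to a direct call instruction that can be repatched. */
	return static_call(my_call)(x);
}

static void speed_up(void)
{
	/* Inline static calls patch every call site; out-of-line ones
	 * patch only the trampoline. */
	static_call_update(my_call, fast_impl);
}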

Christophe Leroy (7):
  powerpc: Add missing asm/asm.h for objtool
  objtool/powerpc: Activate objtool on PPC32
  objtool: Add architecture specific R_REL32 macro
  objtool/powerpc: Add necessary support for inline static calls
  init: Call static_call_init() from start_kernel()
  static_call_inline: Provide trampoline address when updating sites
  powerpc/static_call: Implement inline static calls

 arch/powerpc/Kconfig                          |  3 +-
 arch/powerpc/include/asm/asm.h                |  7 +++
 arch/powerpc/include/asm/static_call.h        |  2 +
 arch/powerpc/kernel/cpu_setup_6xx.S           | 26 ++++++---
 arch/powerpc/kernel/cpu_setup_fsl_booke.S     |  8 ++-
 arch/powerpc/kernel/entry_32.S                |  8 ++-
 arch/powerpc/kernel/head_40x.S                |  5 +-
 arch/powerpc/kernel/head_8xx.S                |  5 +-
 arch/powerpc/kernel/head_book3s_32.S          | 29 +++++++---
 arch/powerpc/kernel/head_fsl_booke.S          |  5 +-
 arch/powerpc/kernel/static_call.c             | 56 ++++++++++++++-----
 arch/powerpc/kernel/swsusp_32.S               |  5 +-
 arch/powerpc/kvm/fpu.S                        | 17 ++++--
 arch/powerpc/platforms/52xx/lite5200_sleep.S  | 15 +++--
 arch/x86/kernel/static_call.c                 |  2 +-
 init/main.c                                   |  1 +
 kernel/static_call_inline.c                   |  2 +-
 tools/objtool/arch/powerpc/decode.c           | 16 ++++--
 tools/objtool/arch/powerpc/include/arch/elf.h |  1 +
 tools/objtool/arch/x86/include/arch/elf.h     |  1 +
 tools/objtool/check.c                         | 10 ++--
 tools/objtool/orc_gen.c                       |  2 +-
 22 files changed, 162 insertions(+), 64 deletions(-)
 create mode 100644 arch/powerpc/include/asm/asm.h

-- 
2.36.1


* [PATCH v2 1/7] powerpc: Add missing asm/asm.h for objtool
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

Since commit e2ef115813c3 ("objtool: Fix STACK_FRAME_NON_STANDARD
reloc type"), powerpc needs asm/asm.h to enable objtool.
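
Since that commit, the annotation is emitted from inline asm and uses
_ASM_PTR; roughly (a simplified sketch, not the verbatim definition in
include/linux/objtool.h):

/* Records 'func' in a discard section that objtool reads. */
#define STACK_FRAME_NON_STANDARD(func)					\
	asm (".pushsection .discard.func_stack_frame_non_standard, \"aw\"\n\t" \
	     _ASM_PTR " " __stringify(func) "\n\t"			\
	     ".popsection")

Hence every architecture that enables objtool must provide an asm/asm.h
defining _ASM_PTR as its native pointer-sized data directive (.long on
32-bit powerpc).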

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/asm.h | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 arch/powerpc/include/asm/asm.h

diff --git a/arch/powerpc/include/asm/asm.h b/arch/powerpc/include/asm/asm.h
new file mode 100644
index 000000000000..86f46b604e9a
--- /dev/null
+++ b/arch/powerpc/include/asm/asm.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_ASM_H
+#define _ASM_POWERPC_ASM_H
+
+#define _ASM_PTR	" .long "
+
+#endif /* _ASM_POWERPC_ASM_H */
-- 
2.36.1


* [PATCH v2 2/7] objtool/powerpc: Activate objtool on PPC32
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

Fix several annotations in assembly files and enable objtool on PPC32.
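
The annotations matter because objtool delimits functions by symbol
type and size. Roughly, the linkage helpers expand as below (a
simplified sketch; the real definitions are in include/linux/linkage.h):

#define SYM_FUNC_START_LOCAL(name)	\
	.align 4;			\
	name:

#define SYM_FUNC_END(name)		\
	.type name, @function;		\
	.size name, . - name

A bare "name:" label gives objtool neither the STT_FUNC type nor the
symbol size, so the code following it cannot be attributed to a
function.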

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig                         |  2 +-
 arch/powerpc/kernel/cpu_setup_6xx.S          | 26 ++++++++++++------
 arch/powerpc/kernel/cpu_setup_fsl_booke.S    |  8 ++++--
 arch/powerpc/kernel/entry_32.S               |  8 ++++--
 arch/powerpc/kernel/head_40x.S               |  5 +++-
 arch/powerpc/kernel/head_8xx.S               |  5 +++-
 arch/powerpc/kernel/head_book3s_32.S         | 29 ++++++++++++++------
 arch/powerpc/kernel/head_fsl_booke.S         |  5 +++-
 arch/powerpc/kernel/swsusp_32.S              |  5 +++-
 arch/powerpc/kvm/fpu.S                       | 17 ++++++++----
 arch/powerpc/platforms/52xx/lite5200_sleep.S | 15 +++++++---
 11 files changed, 90 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 96263d78aec9..00a43eb26418 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -237,7 +237,7 @@ config PPC
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NMI				if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
 	select HAVE_OPTPROBES
-	select HAVE_OBJTOOL			if PPC64
+	select HAVE_OBJTOOL
 	select HAVE_OBJTOOL_MCOUNT		if HAVE_OBJTOOL
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_EVENTS_NMI		if PPC64
diff --git a/arch/powerpc/kernel/cpu_setup_6xx.S b/arch/powerpc/kernel/cpu_setup_6xx.S
index f8b5ff64b604..f29ce3dd6140 100644
--- a/arch/powerpc/kernel/cpu_setup_6xx.S
+++ b/arch/powerpc/kernel/cpu_setup_6xx.S
@@ -4,6 +4,8 @@
  *    Copyright (C) 2003 Benjamin Herrenschmidt (benh@kernel.crashing.org)
  */
 
+#include <linux/linkage.h>
+
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/cputable.h>
@@ -81,7 +83,7 @@ _GLOBAL(__setup_cpu_745x)
 	blr
 
 /* Enable caches for 603's, 604, 750 & 7400 */
-setup_common_caches:
+SYM_FUNC_START_LOCAL(setup_common_caches)
 	mfspr	r11,SPRN_HID0
 	andi.	r0,r11,HID0_DCE
 	ori	r11,r11,HID0_ICE|HID0_DCE
@@ -95,11 +97,12 @@ setup_common_caches:
 	sync
 	isync
 	blr
+SYM_FUNC_END(setup_common_caches)
 
 /* 604, 604e, 604ev, ...
  * Enable superscalar execution & branch history table
  */
-setup_604_hid0:
+SYM_FUNC_START_LOCAL(setup_604_hid0)
 	mfspr	r11,SPRN_HID0
 	ori	r11,r11,HID0_SIED|HID0_BHTE
 	ori	r8,r11,HID0_BTCD
@@ -110,6 +113,7 @@ setup_604_hid0:
 	sync
 	isync
 	blr
+SYM_FUNC_END(setup_604_hid0)
 
 /* 7400 <= rev 2.7 and 7410 rev = 1.0 suffer from some
  * erratas we work around here.
@@ -125,13 +129,14 @@ setup_604_hid0:
  * needed once we have applied workaround #5 (though it's
  * not set by Apple's firmware at least).
  */
-setup_7400_workarounds:
+SYM_FUNC_START_LOCAL(setup_7400_workarounds)
 	mfpvr	r3
 	rlwinm	r3,r3,0,20,31
 	cmpwi	0,r3,0x0207
 	ble	1f
 	blr
-setup_7410_workarounds:
+SYM_FUNC_END(setup_7400_workarounds)
+SYM_FUNC_START_LOCAL(setup_7410_workarounds)
 	mfpvr	r3
 	rlwinm	r3,r3,0,20,31
 	cmpwi	0,r3,0x0100
@@ -151,6 +156,7 @@ setup_7410_workarounds:
 	sync
 	isync
 	blr
+SYM_FUNC_END(setup_7410_workarounds)
 
 /* 740/750/7400/7410
  * Enable Store Gathering (SGE), Address Broadcast (ABE),
@@ -158,7 +164,7 @@ setup_7410_workarounds:
  * Dynamic Power Management (DPM), Speculative (SPD)
  * Clear Instruction cache throttling (ICTC)
  */
-setup_750_7400_hid0:
+SYM_FUNC_START_LOCAL(setup_750_7400_hid0)
 	mfspr	r11,SPRN_HID0
 	ori	r11,r11,HID0_SGE | HID0_ABE | HID0_BHTE | HID0_BTIC
 	oris	r11,r11,HID0_DPM@h
@@ -177,12 +183,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_NO_DPM)
 	sync
 	isync
 	blr
+SYM_FUNC_END(setup_750_7400_hid0)
 
 /* 750cx specific
  * Looks like we have to disable NAP feature for some PLL settings...
  * (waiting for confirmation)
  */
-setup_750cx:
+SYM_FUNC_START_LOCAL(setup_750cx)
 	mfspr	r10, SPRN_HID1
 	rlwinm	r10,r10,4,28,31
 	cmpwi	cr0,r10,7
@@ -196,11 +203,13 @@ setup_750cx:
 	andc	r6,r6,r7
 	stw	r6,CPU_SPEC_FEATURES(r4)
 	blr
+SYM_FUNC_END(setup_750cx)
 
 /* 750fx specific
  */
-setup_750fx:
+SYM_FUNC_START_LOCAL(setup_750fx)
 	blr
+SYM_FUNC_END(setup_750fx)
 
 /* MPC 745x
  * Enable Store Gathering (SGE), Branch Folding (FOLD)
@@ -212,7 +221,7 @@ setup_750fx:
  * Clear Instruction cache throttling (ICTC)
  * Enable L2 HW prefetch
  */
-setup_745x_specifics:
+SYM_FUNC_START_LOCAL(setup_745x_specifics)
 	/* We check for the presence of an L3 cache setup by
 	 * the firmware. If any, we disable NAP capability as
 	 * it's known to be bogus on rev 2.1 and earlier
@@ -270,6 +279,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_NO_DPM)
 	sync
 	isync
 	blr
+SYM_FUNC_END(setup_745x_specifics)
 
 /*
  * Initialize the FPU registers. This is needed to work around an errata
diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
index 4bf33f1b4193..f573a4f3bbe6 100644
--- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S
+++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
@@ -8,6 +8,8 @@
  * Benjamin Herrenschmidt <benh@kernel.crashing.org>
  */
 
+#include <linux/linkage.h>
+
 #include <asm/page.h>
 #include <asm/processor.h>
 #include <asm/cputable.h>
@@ -274,7 +276,7 @@ _GLOBAL(flush_dcache_L1)
 
 	blr
 
-has_L2_cache:
+SYM_FUNC_START_LOCAL(has_L2_cache)
 	/* skip L2 cache on P2040/P2040E as they have no L2 cache */
 	mfspr	r3, SPRN_SVR
 	/* shift right by 8 bits and clear E bit of SVR */
@@ -290,9 +292,10 @@ has_L2_cache:
 1:
 	li	r3, 0
 	blr
+SYM_FUNC_END(has_L2_cache)
 
 /* flush backside L2 cache */
-flush_backside_L2_cache:
+SYM_FUNC_START_LOCAL(flush_backside_L2_cache)
 	mflr	r10
 	bl	has_L2_cache
 	mtlr	r10
@@ -313,6 +316,7 @@ flush_backside_L2_cache:
 	bne	1b
 2:
 	blr
+SYM_FUNC_END(flush_backside_L2_cache)
 
 _GLOBAL(cpu_down_flush_e500v2)
 	mflr r0
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1d599df6f169..f47b682d4667 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -18,6 +18,8 @@
 #include <linux/err.h>
 #include <linux/sys.h>
 #include <linux/threads.h>
+#include <linux/linkage.h>
+
 #include <asm/reg.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -74,17 +76,19 @@ _ASM_NOKPROBE_SYMBOL(prepare_transfer_to_handler)
 #endif /* CONFIG_PPC_BOOK3S_32 || CONFIG_E500 */
 
 #if defined(CONFIG_PPC_KUEP) && defined(CONFIG_PPC_BOOK3S_32)
-	.globl	__kuep_lock
+SYM_FUNC_START(__kuep_lock)
 __kuep_lock:
 	lwz	r9, THREAD+THSR0(r2)
 	update_user_segments_by_4 r9, r10, r11, r12
 	blr
+SYM_FUNC_END(__kuep_lock)
 
-__kuep_unlock:
+SYM_FUNC_START_LOCAL(__kuep_unlock)
 	lwz	r9, THREAD+THSR0(r2)
 	rlwinm  r9,r9,0,~SR_NX
 	update_user_segments_by_4 r9, r10, r11, r12
 	blr
+SYM_FUNC_END(__kuep_unlock)
 
 .macro	kuep_lock
 	bl	__kuep_lock
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 088f500896c7..9110fe9d6747 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -28,6 +28,8 @@
 #include <linux/init.h>
 #include <linux/pgtable.h>
 #include <linux/sizes.h>
+#include <linux/linkage.h>
+
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -662,7 +664,7 @@ start_here:
  * kernel initialization.  This maps the first 32 MBytes of memory 1:1
  * virtual to physical and more importantly sets the cache mode.
  */
-initial_mmu:
+SYM_FUNC_START_LOCAL(initial_mmu)
 	tlbia			/* Invalidate all TLB entries */
 	isync
 
@@ -711,6 +713,7 @@ initial_mmu:
 	mtspr	SPRN_EVPR,r0
 
 	blr
+SYM_FUNC_END(initial_mmu)
 
 _GLOBAL(abort)
         mfspr   r13,SPRN_DBCR0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 0b05f2be66b9..c94ed5a08c93 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -18,6 +18,8 @@
 #include <linux/magic.h>
 #include <linux/pgtable.h>
 #include <linux/sizes.h>
+#include <linux/linkage.h>
+
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -625,7 +627,7 @@ start_here:
  * 24 Mbytes of data, and the 512k IMMR space.  Anything not covered by
  * these mappings is mapped by page tables.
  */
-initial_mmu:
+SYM_FUNC_START_LOCAL(initial_mmu)
 	li	r8, 0
 	mtspr	SPRN_MI_CTR, r8		/* remove PINNED ITLB entries */
 	lis	r10, MD_TWAM@h
@@ -686,6 +688,7 @@ initial_mmu:
 #endif
 	mtspr	SPRN_DER, r8
 	blr
+SYM_FUNC_END(initial_mmu)
 
 _GLOBAL(mmu_pin_tlb)
 	lis	r9, (1f - PAGE_OFFSET)@h
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 6c739beb938c..c0e0868ba01a 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -18,6 +18,8 @@
 
 #include <linux/init.h>
 #include <linux/pgtable.h>
+#include <linux/linkage.h>
+
 #include <asm/reg.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -877,7 +879,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
  * Load stuff into the MMU.  Intended to be called with
  * IR=0 and DR=0.
  */
-early_hash_table:
+SYM_FUNC_START_LOCAL(early_hash_table)
 	sync			/* Force all PTE updates to finish */
 	isync
 	tlbia			/* Clear all TLB entries */
@@ -888,8 +890,9 @@ early_hash_table:
 	ori	r6, r6, 3	/* 256kB table */
 	mtspr	SPRN_SDR1, r6
 	blr
+SYM_FUNC_END(early_hash_table)
 
-load_up_mmu:
+SYM_FUNC_START_LOCAL(load_up_mmu)
 	sync			/* Force all PTE updates to finish */
 	isync
 	tlbia			/* Clear all TLB entries */
@@ -918,6 +921,7 @@ BEGIN_MMU_FTR_SECTION
 	LOAD_BAT(7,r3,r4,r5)
 END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
 	blr
+SYM_FUNC_END(load_up_mmu)
 
 _GLOBAL(load_segment_registers)
 	li	r0, NUM_USER_SEGMENTS /* load up user segment register values */
@@ -1028,7 +1032,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
  * this makes sure it's done.
  *  -- Cort
  */
-clear_bats:
+SYM_FUNC_START_LOCAL(clear_bats)
 	li	r10,0
 
 	mtspr	SPRN_DBAT0U,r10
@@ -1072,6 +1076,7 @@ BEGIN_MMU_FTR_SECTION
 	mtspr	SPRN_IBAT7L,r10
 END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
 	blr
+SYM_FUNC_END(clear_bats)
 
 _GLOBAL(update_bats)
 	lis	r4, 1f@h
@@ -1108,15 +1113,16 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
 	mtspr	SPRN_SRR1, r6
 	rfi
 
-flush_tlbs:
+SYM_FUNC_START_LOCAL(flush_tlbs)
 	lis	r10, 0x40
 1:	addic.	r10, r10, -0x1000
 	tlbie	r10
 	bgt	1b
 	sync
 	blr
+SYM_FUNC_END(flush_tlbs)
 
-mmu_off:
+SYM_FUNC_START_LOCAL(mmu_off)
  	addi	r4, r3, __after_mmu_off - _start
 	mfmsr	r3
 	andi.	r0,r3,MSR_DR|MSR_IR		/* MMU enabled? */
@@ -1128,9 +1134,10 @@ mmu_off:
 	mtspr	SPRN_SRR1,r3
 	sync
 	rfi
+SYM_FUNC_END(mmu_off)
 
 /* We use one BAT to map up to 256M of RAM at _PAGE_OFFSET */
-initial_bats:
+SYM_FUNC_START_LOCAL(initial_bats)
 	lis	r11,PAGE_OFFSET@h
 	tophys(r8,r11)
 #ifdef CONFIG_SMP
@@ -1146,9 +1153,10 @@ initial_bats:
 	mtspr	SPRN_IBAT0U,r11
 	isync
 	blr
+SYM_FUNC_END(initial_bats)
 
 #ifdef CONFIG_BOOTX_TEXT
-setup_disp_bat:
+SYM_FUNC_START_LOCAL(setup_disp_bat)
 	/*
 	 * setup the display bat prepared for us in prom.c
 	 */
@@ -1164,10 +1172,11 @@ setup_disp_bat:
 	mtspr	SPRN_DBAT3L,r8
 	mtspr	SPRN_DBAT3U,r11
 	blr
+SYM_FUNC_END(setup_disp_bat)
 #endif /* CONFIG_BOOTX_TEXT */
 
 #ifdef CONFIG_PPC_EARLY_DEBUG_CPM
-setup_cpm_bat:
+SYM_FUNC_START_LOCAL(setup_cpm_bat)
 	lis	r8, 0xf000
 	ori	r8, r8,	0x002a
 	mtspr	SPRN_DBAT1L, r8
@@ -1177,10 +1186,11 @@ setup_cpm_bat:
 	mtspr	SPRN_DBAT1U, r11
 
 	blr
+SYM_FUNC_END(setup_cpm_bat)
 #endif
 
 #ifdef CONFIG_PPC_EARLY_DEBUG_USBGECKO
-setup_usbgecko_bat:
+SYM_FUNC_START_LOCAL(setup_usbgecko_bat)
 	/* prepare a BAT for early io */
 #if defined(CONFIG_GAMECUBE)
 	lis	r8, 0x0c00
@@ -1199,6 +1209,7 @@ setup_usbgecko_bat:
 	mtspr	SPRN_DBAT1L, r8
 	mtspr	SPRN_DBAT1U, r11
 	blr
+SYM_FUNC_END(setup_usbgecko_bat)
 #endif
 
 	.data
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index f0db4f52bc00..744b096857a1 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -29,6 +29,8 @@
 #include <linux/init.h>
 #include <linux/threads.h>
 #include <linux/pgtable.h>
+#include <linux/linkage.h>
+
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -885,7 +887,7 @@ KernelSPE:
  * Translate the effec addr in r3 to phys addr. The phys addr will be put
  * into r3(higher 32bit) and r4(lower 32bit)
  */
-get_phys_addr:
+SYM_FUNC_START_LOCAL(get_phys_addr)
 	mfmsr	r8
 	mfspr	r9,SPRN_PID
 	rlwinm	r9,r9,16,0x3fff0000	/* turn PID into MAS6[SPID] */
@@ -907,6 +909,7 @@ get_phys_addr:
 	mfspr	r3,SPRN_MAS7
 #endif
 	blr
+SYM_FUNC_END(get_phys_addr)
 
 /*
  * Global functions
diff --git a/arch/powerpc/kernel/swsusp_32.S b/arch/powerpc/kernel/swsusp_32.S
index e0cbd63007f2..ffb79326483c 100644
--- a/arch/powerpc/kernel/swsusp_32.S
+++ b/arch/powerpc/kernel/swsusp_32.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/threads.h>
+#include <linux/linkage.h>
+
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/cputable.h>
@@ -400,7 +402,7 @@ _ASM_NOKPROBE_SYMBOL(swsusp_arch_resume)
 /* FIXME:This construct is actually not useful since we don't shut
  * down the instruction MMU, we could just flip back MSR-DR on.
  */
-turn_on_mmu:
+SYM_FUNC_START_LOCAL(turn_on_mmu)
 	mflr	r4
 	mtsrr0	r4
 	mtsrr1	r3
@@ -408,4 +410,5 @@ turn_on_mmu:
 	isync
 	rfi
 _ASM_NOKPROBE_SYMBOL(turn_on_mmu)
+SYM_FUNC_END(turn_on_mmu)
 
diff --git a/arch/powerpc/kvm/fpu.S b/arch/powerpc/kvm/fpu.S
index 315c94946bad..b68e7f26a81f 100644
--- a/arch/powerpc/kvm/fpu.S
+++ b/arch/powerpc/kvm/fpu.S
@@ -6,6 +6,8 @@
  */
 
 #include <linux/pgtable.h>
+#include <linux/linkage.h>
+
 #include <asm/reg.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -110,18 +112,22 @@ FPS_THREE_IN(fsel)
  * R8 = (double*)&param3 [load_three]
  * LR = instruction call function
  */
-fpd_load_three:
+SYM_FUNC_START_LOCAL(fpd_load_three)
 	lfd	2,0(r8)			/* load param3 */
-fpd_load_two:
+SYM_FUNC_START_LOCAL(fpd_load_two)
 	lfd	1,0(r7)			/* load param2 */
-fpd_load_one:
+SYM_FUNC_START_LOCAL(fpd_load_one)
 	lfd	0,0(r6)			/* load param1 */
-fpd_load_none:
+SYM_FUNC_START_LOCAL(fpd_load_none)
 	lfd	3,0(r3)			/* load up fpscr value */
 	MTFSF_L(3)
 	lwz	r6, 0(r4)		/* load cr */
 	mtcr	r6
 	blr
+SYM_FUNC_END(fpd_load_none)
+SYM_FUNC_END(fpd_load_one)
+SYM_FUNC_END(fpd_load_two)
+SYM_FUNC_END(fpd_load_three)
 
 /*
  * End of double instruction processing
@@ -131,13 +137,14 @@ fpd_load_none:
  * R5 = (double*)&result
  * LR = caller of instruction call function
  */
-fpd_return:
+SYM_FUNC_START_LOCAL(fpd_return)
 	mfcr	r6
 	stfd	0,0(r5)			/* save result */
 	mffs	0
 	stfd	0,0(r3)			/* save new fpscr value */
 	stw	r6,0(r4)		/* save new cr value */
 	blr
+SYM_FUNC_END(fpd_return)
 
 /*
  * Double operation with no input operand
diff --git a/arch/powerpc/platforms/52xx/lite5200_sleep.S b/arch/powerpc/platforms/52xx/lite5200_sleep.S
index afee8b1515a8..0b12647e7b42 100644
--- a/arch/powerpc/platforms/52xx/lite5200_sleep.S
+++ b/arch/powerpc/platforms/52xx/lite5200_sleep.S
@@ -1,4 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/linkage.h>
+
 #include <asm/reg.h>
 #include <asm/ppc_asm.h>
 #include <asm/processor.h>
@@ -178,7 +180,8 @@ sram_code:
 
 
 	/* local udelay in sram is needed */
-  udelay: /* r11 - tb_ticks_per_usec, r12 - usecs, overwrites r13 */
+SYM_FUNC_START_LOCAL(udelay)
+	/* r11 - tb_ticks_per_usec, r12 - usecs, overwrites r13 */
 	mullw	r12, r12, r11
 	mftb	r13	/* start */
 	add	r12, r13, r12 /* end */
@@ -187,6 +190,7 @@ sram_code:
 	cmp	cr0, r13, r12
 	blt	1b
 	blr
+SYM_FUNC_END(udelay)
 
 sram_code_end:
 
@@ -271,7 +275,7 @@ _ASM_NOKPROBE_SYMBOL(lite5200_wakeup)
 	SAVE_SR(n+2, addr+2);	\
 	SAVE_SR(n+3, addr+3);
 
-save_regs:
+SYM_FUNC_START_LOCAL(save_regs)
 	stw	r0, 0(r4)
 	stw	r1, 0x4(r4)
 	stw	r2, 0x8(r4)
@@ -317,6 +321,7 @@ save_regs:
 	SAVE_SPRN(TBRU,  0x5b)
 
 	blr
+SYM_FUNC_END(save_regs)
 
 
 /* restore registers */
@@ -336,7 +341,7 @@ save_regs:
 	LOAD_SR(n+2, addr+2);	\
 	LOAD_SR(n+3, addr+3);
 
-restore_regs:
+SYM_FUNC_START_LOCAL(restore_regs)
 	lis	r4, registers@h
 	ori	r4, r4, registers@l
 
@@ -393,6 +398,7 @@ restore_regs:
 
 	blr
 _ASM_NOKPROBE_SYMBOL(restore_regs)
+SYM_FUNC_END(restore_regs)
 
 
 
@@ -403,7 +409,7 @@ _ASM_NOKPROBE_SYMBOL(restore_regs)
  * Flush data cache
  * Do this by just reading lots of stuff into the cache.
  */
-flush_data_cache:
+SYM_FUNC_START_LOCAL(flush_data_cache)
 	lis	r3,CONFIG_KERNEL_START@h
 	ori	r3,r3,CONFIG_KERNEL_START@l
 	li	r4,NUM_CACHE_LINES
@@ -413,3 +419,4 @@ flush_data_cache:
 	addi	r3,r3,L1_CACHE_BYTES	/* Next line, please */
 	bdnz	1b
 	blr
+SYM_FUNC_END(flush_data_cache)
-- 
2.36.1


* [PATCH v2 3/7] objtool: Add architecture specific R_REL32 macro
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

In order to allow architectures other than x86 to use 32-bit
PC-relative relocations (S+A-P), introduce an R_REL32 macro that each
architecture defines, in the same way as is already done for R_NONE,
R_ABS32 and R_ABS64.

For x86 that corresponds to R_X86_64_PC32.
For powerpc it will be R_PPC_REL32/R_PPC64_REL32.
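
Here S, A and P are the symbol value, the addend and the place
(address) of the relocation, so the stored 32-bit value is computed as
below; rel32_value() is a hypothetical helper for illustration only,
not part of objtool:

#include <stdint.h>

static inline int32_t rel32_value(uint64_t sym, int64_t addend,
				  uint64_t place)
{
	return (int32_t)(sym + addend - place);	/* S + A - P */
}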

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Improved commit message based on feedback from Segher
---
 tools/objtool/arch/x86/include/arch/elf.h |  1 +
 tools/objtool/check.c                     | 10 +++++-----
 tools/objtool/orc_gen.c                   |  2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/arch/x86/include/arch/elf.h b/tools/objtool/arch/x86/include/arch/elf.h
index ac14987cf687..e7d228c686db 100644
--- a/tools/objtool/arch/x86/include/arch/elf.h
+++ b/tools/objtool/arch/x86/include/arch/elf.h
@@ -4,5 +4,6 @@
 #define R_NONE R_X86_64_NONE
 #define R_ABS64 R_X86_64_64
 #define R_ABS32 R_X86_64_32
+#define R_REL32	R_X86_64_PC32
 
 #endif /* _OBJTOOL_ARCH_ELF */
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index dec42a226048..ba8fd313372c 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -652,7 +652,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		/* populate reloc for 'addr' */
 		if (elf_add_reloc_to_insn(file->elf, sec,
 					  idx * sizeof(struct static_call_site),
-					  R_X86_64_PC32,
+					  R_REL32,
 					  insn->sec, insn->offset))
 			return -1;
 
@@ -693,7 +693,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		/* populate reloc for 'key' */
 		if (elf_add_reloc(file->elf, sec,
 				  idx * sizeof(struct static_call_site) + 4,
-				  R_X86_64_PC32, key_sym,
+				  R_REL32, key_sym,
 				  is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
 			return -1;
 
@@ -737,7 +737,7 @@ static int create_retpoline_sites_sections(struct objtool_file *file)
 
 		if (elf_add_reloc_to_insn(file->elf, sec,
 					  idx * sizeof(int),
-					  R_X86_64_PC32,
+					  R_REL32,
 					  insn->sec, insn->offset)) {
 			WARN("elf_add_reloc_to_insn: .retpoline_sites");
 			return -1;
@@ -789,7 +789,7 @@ static int create_ibt_endbr_seal_sections(struct objtool_file *file)
 
 		if (elf_add_reloc_to_insn(file->elf, sec,
 					  idx * sizeof(int),
-					  R_X86_64_PC32,
+					  R_REL32,
 					  insn->sec, insn->offset)) {
 			WARN("elf_add_reloc_to_insn: .ibt_endbr_seal");
 			return -1;
@@ -3718,7 +3718,7 @@ static int validate_ibt_insn(struct objtool_file *file, struct instruction *insn
 			continue;
 
 		off = reloc->sym->offset;
-		if (reloc->type == R_X86_64_PC32 || reloc->type == R_X86_64_PLT32)
+		if (reloc->type == R_REL32 || reloc->type == R_X86_64_PLT32)
 			off += arch_dest_reloc_offset(reloc->addend);
 		else
 			off += reloc->addend;
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 1f22b7ebae58..49a877b9c879 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -101,7 +101,7 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
 	orc->bp_offset = bswap_if_needed(elf, orc->bp_offset);
 
 	/* populate reloc for ip */
-	if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
+	if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_REL32,
 				  insn_sec, insn_off))
 		return -1;
 
-- 
2.36.1


* [PATCH v2 4/7] objtool/powerpc: Add necessary support for inline static calls
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

In order to support inline static calls on powerpc, objtool needs the
following additions:
- the R_REL32 macro
- support for the jump instruction used for tail calls

Add support for decoding the branch instruction 'b', the jump
instruction used for tail calls: a static call may itself be a tail
call. The decode logic is sketched below.
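
For reference, 'b' and 'bl' share primary opcode 18 (I-form:
opcode | LI | AA | LK) and differ only in the low two bits, which is
what the decode keys on. A standalone sketch of the same logic in
plain C, outside objtool:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t insn = 0x4bfffff5;	/* example: 'bl' back 12 bytes */
	uint32_t opcode = insn >> 26;	/* 18 for b/bl */
	int32_t imm = insn & 0x3fffffc;	/* LI field (already shifted) */

	if (imm & 0x2000000)		/* sign-extend the displacement */
		imm -= 0x4000000;

	if (opcode == 18 && (insn & 3) == 1)
		printf("bl (call), offset %d\n", imm);
	else if (opcode == 18 && (insn & 3) == 0)
		printf("b (jump), offset %d\n", imm);
	return 0;
}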

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 tools/objtool/arch/powerpc/decode.c           | 16 ++++++++++------
 tools/objtool/arch/powerpc/include/arch/elf.h |  1 +
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/arch/powerpc/decode.c b/tools/objtool/arch/powerpc/decode.c
index 06fc0206bf8e..ba84869cd134 100644
--- a/tools/objtool/arch/powerpc/decode.c
+++ b/tools/objtool/arch/powerpc/decode.c
@@ -59,13 +59,17 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
 	opcode = insn >> 26;
 
 	switch (opcode) {
-	case 18: /* bl */
-		if ((insn & 3) == 1) {
+	case 18: /* bl/b */
+		if ((insn & 3) == 1)
 			*type = INSN_CALL;
-			*immediate = insn & 0x3fffffc;
-			if (*immediate & 0x2000000)
-				*immediate -= 0x4000000;
-		}
+		else if ((insn & 3) == 0)
+			*type = INSN_JUMP_UNCONDITIONAL;
+		else
+			break;
+
+		*immediate = insn & 0x3fffffc;
+		if (*immediate & 0x2000000)
+			*immediate -= 0x4000000;
 		break;
 	}
 
diff --git a/tools/objtool/arch/powerpc/include/arch/elf.h b/tools/objtool/arch/powerpc/include/arch/elf.h
index 73f9ae172fe5..befc2e30d38b 100644
--- a/tools/objtool/arch/powerpc/include/arch/elf.h
+++ b/tools/objtool/arch/powerpc/include/arch/elf.h
@@ -6,5 +6,6 @@
 #define R_NONE R_PPC_NONE
 #define R_ABS64 R_PPC64_ADDR64
 #define R_ABS32 R_PPC_ADDR32
+#define R_REL32 R_PPC_REL32 /* R_PPC64_REL32 is identical */
 
 #endif /* _OBJTOOL_ARCH_ELF */
-- 
2.36.1


* [PATCH v2 5/7] init: Call static_call_init() from start_kernel()
From: Christophe Leroy @ 2022-07-08 17:31 UTC
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

Call static_call_init() just after jump_label_init().

x86 already calls it from setup_arch(). This is not a problem, as
static_call_init() is guarded against being called twice.
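
The guard is a simple "already initialized" flag, roughly of this
shape (a sketch, not the verbatim code in kernel/static_call_inline.c):

static bool static_call_initialized;

void __init static_call_init(void)
{
	if (static_call_initialized)
		return;		/* a second call is a no-op */
	static_call_initialized = true;

	/* ... sort and patch the __start_static_call_sites table ... */
}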

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 init/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/init/main.c b/init/main.c
index 0ee39cdcfcac..7b8e9608f091 100644
--- a/init/main.c
+++ b/init/main.c
@@ -962,6 +962,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 	pr_notice("Kernel command line: %s\n", saved_command_line);
 	/* parameters may set static keys */
 	jump_label_init();
+	static_call_init();
 	parse_early_param();
 	after_dashes = parse_args("Booting kernel",
 				  static_command_line, __start___param,
-- 
2.36.1


* [PATCH v2 6/7] static_call_inline: Provide trampoline address when updating sites
  2022-07-08 17:31 ` Christophe Leroy
@ 2022-07-08 17:31   ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2022-07-08 17:31 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

In preparation for supporting inline static calls on powerpc, provide
the trampoline address when updating call sites, so that if the
destination function is too far away for a direct call, the site can
be patched with a call to the trampoline instead.
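
As an illustration, the per-site fallback this enables on an
architecture with a limited direct-branch range looks roughly like this
(hypothetical helper names, not the actual powerpc code, which comes in
the last patch of the series):

	void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
	{
		if (site && func) {
			if (target_in_branch_range(site, func))		/* hypothetical */
				patch_direct_call(site, func, tail);	/* bl/b func  */
			else if (tramp)
				patch_direct_call(site, tramp, tail);	/* bl/b tramp */
		}
		/* trampoline-only updates (site == NULL) are unchanged */
	}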

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/x86/kernel/static_call.c | 2 +-
 kernel/static_call_inline.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
index aa72cefdd5be..4db30b0ea71c 100644
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -102,7 +102,7 @@ void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
 {
 	mutex_lock(&text_mutex);
 
-	if (tramp) {
+	if (tramp && !site) {
 		__static_call_validate(tramp, true, true);
 		__static_call_transform(tramp, __sc_insn(!func, true), func);
 	}
diff --git a/kernel/static_call_inline.c b/kernel/static_call_inline.c
index dc5665b62814..b5de9d92fa4e 100644
--- a/kernel/static_call_inline.c
+++ b/kernel/static_call_inline.c
@@ -195,7 +195,7 @@ void __static_call_update(struct static_call_key *key, void *tramp, void *func)
 				continue;
 			}
 
-			arch_static_call_transform(site_addr, NULL, func,
+			arch_static_call_transform(site_addr, tramp, func,
 						   static_call_is_tail(site));
 		}
 	}
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 7/7] powerpc/static_call: Implement inline static calls
  2022-07-08 17:31 ` Christophe Leroy
@ 2022-07-08 17:31   ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2022-07-08 17:31 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, sv, agust, jpoimboe, peterz,
	jbaron, rostedt, ardb, tglx, mingo, bp, dave.hansen, hpa
  Cc: Christophe Leroy, linux-kernel, linuxppc-dev, x86, chenzhongjin

Implement inline static calls:
- Put a 'bl' to the destination function ('b' if tail call)
- Put a 'nop' when the destination function is NULL ('blr' if tail call)
- Put a 'li r3,0' when the destination is the RET0 function and it is
not a tail call.

If the destination is too far away (beyond the ±32MB direct branch
range), go via the trampoline; see the sketch below.
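
Summarised, the instruction patched at an inline call site is (a
sketch; PPC_SCT_RET0 is the existing ret0 slot offset in the
trampoline):

	/*
	 * Patched call site, non-tail / tail:
	 *
	 *   func == NULL:                   nop       /  blr
	 *   func == __static_call_return0:  li r3,0   /  b tramp+PPC_SCT_RET0
	 *   func within +/-32MB:            bl func   /  b func
	 *   func out of range:              bl tramp  /  b tramp
	 */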

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig                   |  1 +
 arch/powerpc/include/asm/static_call.h |  2 +
 arch/powerpc/kernel/static_call.c      | 56 +++++++++++++++++++-------
 3 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 00a43eb26418..cb92887acc3f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -251,6 +251,7 @@ config PPC
 	select HAVE_STACKPROTECTOR		if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
 	select HAVE_STACKPROTECTOR		if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
 	select HAVE_STATIC_CALL			if PPC32
+	select HAVE_STATIC_CALL_INLINE		if PPC32
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HUGETLB_PAGE_SIZE_VARIABLE	if PPC_BOOK3S_64 && HUGETLB_PAGE
diff --git a/arch/powerpc/include/asm/static_call.h b/arch/powerpc/include/asm/static_call.h
index de1018cc522b..e3d5d3823dac 100644
--- a/arch/powerpc/include/asm/static_call.h
+++ b/arch/powerpc/include/asm/static_call.h
@@ -26,4 +26,6 @@
 #define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)	__PPC_SCT(name, "blr")
 #define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name)	__PPC_SCT(name, "b .+20")
 
+#define CALL_INSN_SIZE		4
+
 #endif /* _ASM_POWERPC_STATIC_CALL_H */
diff --git a/arch/powerpc/kernel/static_call.c b/arch/powerpc/kernel/static_call.c
index 863a7aa24650..0093b471186d 100644
--- a/arch/powerpc/kernel/static_call.c
+++ b/arch/powerpc/kernel/static_call.c
@@ -8,26 +8,52 @@ void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
 {
 	int err;
 	bool is_ret0 = (func == __static_call_return0);
-	unsigned long target = (unsigned long)(is_ret0 ? tramp + PPC_SCT_RET0 : func);
-	bool is_short = is_offset_in_branch_range((long)target - (long)tramp);
-
-	if (!tramp)
-		return;
+	unsigned long _tramp = (unsigned long)tramp;
+	unsigned long _func = (unsigned long)func;
+	unsigned long _ret0 = _tramp + PPC_SCT_RET0;
+	bool is_short = is_offset_in_branch_range((long)func - (long)(site ? : tramp));
 
 	mutex_lock(&text_mutex);
 
-	if (func && !is_short) {
-		err = patch_instruction(tramp + PPC_SCT_DATA, ppc_inst(target));
-		if (err)
-			goto out;
+	if (site && !tail) {
+		if (!func)
+			err = patch_instruction(site, ppc_inst(PPC_RAW_NOP()));
+		else if (is_ret0)
+			err = patch_instruction(site, ppc_inst(PPC_RAW_LI(_R3, 0)));
+		else if (is_short)
+			err = patch_branch(site, _func, BRANCH_SET_LINK);
+		else if (tramp)
+			err = patch_branch(site, _tramp, BRANCH_SET_LINK);
+		else
+			err = 0;
+	} else if (site) {
+		if (!func)
+			err = patch_instruction(site, ppc_inst(PPC_RAW_BLR()));
+		else if (is_ret0)
+			err = patch_branch(site, _ret0, 0);
+		else if (is_short)
+			err = patch_branch(site, _func, 0);
+		else if (tramp)
+			err = patch_branch(site, _tramp, 0);
+		else
+			err = 0;
+	} else if (tramp) {
+		if (func && !is_short) {
+			err = patch_instruction(tramp + PPC_SCT_DATA, ppc_inst(_func));
+			if (err)
+				goto out;
+		}
+
+		if (!func)
+			err = patch_instruction(tramp, ppc_inst(PPC_RAW_BLR()));
+		else if (is_ret0)
+			err = patch_branch(tramp, _ret0, 0);
+		else if (is_short)
+			err = patch_branch(tramp, _func, 0);
+		else
+			err = patch_instruction(tramp, ppc_inst(PPC_RAW_NOP()));
 	}
 
-	if (!func)
-		err = patch_instruction(tramp, ppc_inst(PPC_RAW_BLR()));
-	else if (is_short)
-		err = patch_branch(tramp, target, 0);
-	else
-		err = patch_instruction(tramp, ppc_inst(PPC_RAW_NOP()));
 out:
 	mutex_unlock(&text_mutex);
 
-- 
2.36.1
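
One subtlety in the hunk above: 'site ? : tramp' uses the GNU C '?:'
extension, where a missing middle operand defaults to the first, so
is_short measures the branch distance from whichever location is
actually being patched. In plain C:

	void *from = site ? site : tramp;	/* the location being patched */
	bool is_short = is_offset_in_branch_range((long)func - (long)from);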


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2
  2022-07-08 17:31 ` Christophe Leroy
@ 2022-07-09  6:52   ` Ard Biesheuvel
  -1 siblings, 0 replies; 24+ messages in thread
From: Ard Biesheuvel @ 2022-07-09  6:52 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Michael Ellerman, Nicholas Piggin, sv, agust, Josh Poimboeuf,
	Peter Zijlstra, Jason Baron, Steven Rostedt (VMware),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Linux Kernel Mailing List,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	X86 ML, Chen Zhongjin

Hello Christophe,

On Fri, 8 Jul 2022 at 19:32, Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
> This series applies on top of the series v3 "objtool: Enable and
> implement --mcount option on powerpc" [1] rebased on powerpc-next branch
>
> A few modifications are done to core parts to enable powerpc
> implementation:
> - R_X86_64_PC32 is abstracted to R_REL32 so that it can then be
> redefined as R_PPC_REL32.
> - A call to static_call_init() is added to start_kernel() to avoid
> every architecture to have to call it
> - Trampoline address is provided to arch_static_call_transform() even
> when setting a site to fallback on a call to the trampoline when the
> target is too far.
>
> [1] https://lore.kernel.org/lkml/70b6d08d-aced-7f4e-b958-a3c7ae1a9319@csgroup.eu/T/#rb3a073c54aba563a135fba891e0c34c46e47beef
>
> Christophe Leroy (7):
>   powerpc: Add missing asm/asm.h for objtool
>   objtool/powerpc: Activate objtool on PPC32
>   objtool: Add architecture specific R_REL32 macro
>   objtool/powerpc: Add necessary support for inline static calls
>   init: Call static_call_init() from start_kernel()
>   static_call_inline: Provide trampoline address when updating sites
>   powerpc/static_call: Implement inline static calls
>

Could you quantify the performance gains of moving from out-of-line,
patched tail-call branch instructions to full-fledged inline static
calls? On x86, the retpoline problem makes this glaringly obvious, but
on other architectures, the complexity of supporting this model may
outweigh the performance advantages.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2
  2022-07-09  6:52   ` Ard Biesheuvel
@ 2022-09-01 16:46     ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2022-09-01 16:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Michael Ellerman, Nicholas Piggin, sv, agust, Josh Poimboeuf,
	Peter Zijlstra, Jason Baron, Steven Rostedt (VMware),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Linux Kernel Mailing List,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	X86 ML, Chen Zhongjin



On 09/07/2022 at 08:52, Ard Biesheuvel wrote:
> Hello Christophe,
> 
> On Fri, 8 Jul 2022 at 19:32, Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>> This series applies on top of the series v3 "objtool: Enable and
>> implement --mcount option on powerpc" [1] rebased on powerpc-next branch
>>
>> A few modifications are done to core parts to enable powerpc
>> implementation:
>> - R_X86_64_PC32 is abstracted to R_REL32 so that it can then be
>> redefined as R_PPC_REL32.
>> - A call to static_call_init() is added to start_kernel() to avoid
>> every architecture to have to call it
>> - Trampoline address is provided to arch_static_call_transform() even
>> when setting a site to fallback on a call to the trampoline when the
>> target is too far.
>>
>> [1] https://lore.kernel.org/lkml/70b6d08d-aced-7f4e-b958-a3c7ae1a9319@csgroup.eu/T/#rb3a073c54aba563a135fba891e0c34c46e47beef
>>
>> Christophe Leroy (7):
>>    powerpc: Add missing asm/asm.h for objtool
>>    objtool/powerpc: Activate objtool on PPC32
>>    objtool: Add architecture specific R_REL32 macro
>>    objtool/powerpc: Add necessary support for inline static calls
>>    init: Call static_call_init() from start_kernel()
>>    static_call_inline: Provide trampoline address when updating sites
>>    powerpc/static_call: Implement inline static calls
>>
> 
> Could you quantify the performance gains of moving from out-of-line,
> patched tail-call branch instructions to full-fledged inline static
> calls? On x86, the retpoline problem makes this glaringly obvious, but
> on other architectures, the complexity of supporting this model may
> outweigh the performance advantages.

Surprisingly, I get worse performance with inline static calls than
with out-of-line static calls:

No static calls:

root@vgoip:~# perf stat -r 10 ./hackbench 1
Running with 1*40 (== 40) tasks.
Time: 17.186
Running with 1*40 (== 40) tasks.
Time: 16.738
Running with 1*40 (== 40) tasks.
Time: 16.579
Running with 1*40 (== 40) tasks.
Time: 16.838
Running with 1*40 (== 40) tasks.
Time: 16.652
Running with 1*40 (== 40) tasks.
Time: 17.380
Running with 1*40 (== 40) tasks.
Time: 16.630
Running with 1*40 (== 40) tasks.
Time: 16.850
Running with 1*40 (== 40) tasks.
Time: 17.161
Running with 1*40 (== 40) tasks.
Time: 16.722

  Performance counter stats for './hackbench 1' (10 runs):

           17019.55 msec task-clock         #    0.980 CPUs utilized    ( +-  0.51% )
               4847      context-switches   #  282.280 /sec             ( +-  6.32% )
                  0      cpu-migrations     #    0.000 /sec
               1249      page-faults        #   72.739 /sec             ( +-  0.49% )
         2245344976      cycles             #    0.131 GHz              ( +-  0.51% )
          727437072      instructions       #    0.32  insn per cycle   ( +-  0.40% )
    <not supported>      branches
    <not supported>      branch-misses

            17.3585 +- 0.0909 seconds time elapsed  ( +-  0.52% )


Out-of-line static calls:

root@vgoip:~# perf stat -r 10 ./hackbench 1
Running with 1*40 (== 40) tasks.
Time: 15.892
Running with 1*40 (== 40) tasks.
Time: 15.731
Running with 1*40 (== 40) tasks.
Time: 15.507
Running with 1*40 (== 40) tasks.
Time: 16.269
Running with 1*40 (== 40) tasks.
Time: 15.934
Running with 1*40 (== 40) tasks.
Time: 16.048
Running with 1*40 (== 40) tasks.
Time: 15.700
Running with 1*40 (== 40) tasks.
Time: 16.063
Running with 1*40 (== 40) tasks.
Time: 15.852
Running with 1*40 (== 40) tasks.
Time: 15.941

  Performance counter stats for './hackbench 1' (10 runs):

           16227.32 msec task-clock         #    0.992 CPUs utilized    ( +-  0.42% )
               3732      context-switches   #  230.525 /sec             ( +-  6.42% )
                  0      cpu-migrations     #    0.000 /sec
               1244      page-faults        #   76.842 /sec             ( +-  0.11% )
         2141094288      cycles             #    0.132 GHz              ( +-  0.42% )
          712598441      instructions       #    0.33  insn per cycle   ( +-  0.29% )
    <not supported>      branches
    <not supported>      branch-misses

            16.3539 +- 0.0675 seconds time elapsed  ( +-  0.41% )


Inline static calls:

root@vgoip:~# perf stat -r 10 ./hackbench 1
Running with 1*40 (== 40) tasks.
Time: 17.512
Running with 1*40 (== 40) tasks.
Time: 17.240
Running with 1*40 (== 40) tasks.
Time: 16.901
Running with 1*40 (== 40) tasks.
Time: 17.125
Running with 1*40 (== 40) tasks.
Time: 17.262
Running with 1*40 (== 40) tasks.
Time: 17.298
Running with 1*40 (== 40) tasks.
Time: 17.182
Running with 1*40 (== 40) tasks.
Time: 16.988
Running with 1*40 (== 40) tasks.
Time: 17.102
Running with 1*40 (== 40) tasks.
Time: 16.669

  Performance counter stats for './hackbench 1' (10 runs):

           16976.76 msec task-clock         #    0.964 CPUs utilized    ( +-  0.44% )
               4760      context-switches   #  273.007 /sec             ( +-  4.93% )
                  0      cpu-migrations     #    0.000 /sec
               1252      page-faults        #   71.808 /sec             ( +-  0.35% )
         2239986112      cycles             #    0.128 GHz              ( +-  0.44% )
          721540184      instructions       #    0.31  insn per cycle   ( +-  0.31% )
    <not supported>      branches
    <not supported>      branch-misses

            17.6126 +- 0.0762 seconds time elapsed  ( +-  0.43% )


Summary:

No static calls:
            17.3585 +- 0.0909 seconds time elapsed  ( +-  0.52% )
Out-of-line static calls:
            16.3539 +- 0.0675 seconds time elapsed  ( +-  0.41% )
Inline static calls:
            17.6126 +- 0.0762 seconds time elapsed  ( +-  0.43% )
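
Relative to no static calls, out-of-line is thus
(17.3585 - 16.3539) / 17.3585 ≈ 5.8% faster, while inline is
(17.6126 - 17.3585) / 17.3585 ≈ 1.5% slower.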

Is there anything wrong with inline static calls?

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2
  2022-09-01 16:46     ` Christophe Leroy
@ 2022-09-08  0:13       ` Benjamin Gray
  -1 siblings, 0 replies; 24+ messages in thread
From: Benjamin Gray @ 2022-09-08  0:13 UTC (permalink / raw)
  To: Christophe Leroy, Ard Biesheuvel
  Cc: X86 ML, Peter Zijlstra, Chen Zhongjin, Dave Hansen,
	Linux Kernel Mailing List, Nicholas Piggin, Jason Baron,
	Ingo Molnar, sv, Steven Rostedt (VMware),
	H. Peter Anvin, Borislav Petkov, Thomas Gleixner, agust,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Josh Poimboeuf

On Thu, 2022-09-01 at 16:46 +0000, Christophe Leroy wrote:
> Surprisingly, I get worse performance with inline static calls than
> with out-of-line static calls:

I'm not sure what hackbench is doing, but when microbenchmarking 64-bit
out-of-line calls in a loop I saw a similar thing where adding more
indirection improved the performance despite doing more work. The cause
seemed to be a combination of using older hardware and the target being
too short (just an integer increment). Moving to a newer machine and
adding a lot of NOPs to the target made the performance make sense.
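
A sketch of that kind of microbenchmark with the generic static call
API (hypothetical names; the NOP padding keeps the target body from
being trivially short — a sketch only, not a drop-in module):

	#include <linux/printk.h>
	#include <linux/static_call.h>

	static noinline int inc_padded(int x)
	{
		asm volatile(".rept 32; nop; .endr");	/* pad the target body */
		return x + 1;
	}

	DEFINE_STATIC_CALL(sc_inc, inc_padded);

	static int __init bench_init(void)
	{
		int i, acc = 0;

		for (i = 0; i < 1000000; i++)
			acc = static_call(sc_inc)(acc);	/* patched direct call */

		pr_info("bench: acc=%d\n", acc);
		return 0;
	}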


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2
  2022-09-08  0:13       ` Benjamin Gray
@ 2022-09-08  6:11         ` Christophe Leroy
  -1 siblings, 0 replies; 24+ messages in thread
From: Christophe Leroy @ 2022-09-08  6:11 UTC (permalink / raw)
  To: Benjamin Gray, Ard Biesheuvel
  Cc: X86 ML, Peter Zijlstra, Chen Zhongjin, Dave Hansen,
	Linux Kernel Mailing List, Nicholas Piggin, Jason Baron,
	Ingo Molnar, sv, Steven Rostedt (VMware),
	H. Peter Anvin, Borislav Petkov, Thomas Gleixner, agust,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Josh Poimboeuf



On 08/09/2022 at 02:13, Benjamin Gray wrote:
> On Thu, 2022-09-01 at 16:46 +0000, Christophe Leroy wrote:
>> Surprisingly, I get worse performance with inline static calls than
>> with out-of-line static calls:
> 
> I'm not sure what hackbench is doing, but when microbenchmarking 64-bit
> out-of-line calls in a loop I saw a similar thing where adding more
> indirection improved the performance despite doing more work. The cause
> seemed to be a combination of using older hardware and the target being
> too short (just an integer increment). Moving to a newer machine and
> adding a lot of NOPs to the target made the performance make sense.

Yes, that might be it.

I think I'll first do new tests with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B.

Christophe

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-09-08  6:12 UTC | newest]

Thread overview: 24+ messages
2022-07-08 17:31 [PATCH v2 0/7] Implement inline static calls on PPC32 - v2 Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 1/7] powerpc: Add missing asm/asm.h for objtool Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 2/7] objtool/powerpc: Activate objtool on PPC32 Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 3/7] objtool: Add architecture specific R_REL32 macro Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 4/7] objtool/powerpc: Add necessary support for inline static calls Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 5/7] init: Call static_call_init() from start_kernel() Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 6/7] static_call_inline: Provide trampoline address when updating sites Christophe Leroy
2022-07-08 17:31 ` [PATCH v2 7/7] powerpc/static_call: Implement inline static calls Christophe Leroy
2022-07-09  6:52 ` [PATCH v2 0/7] Implement inline static calls on PPC32 - v2 Ard Biesheuvel
2022-09-01 16:46   ` Christophe Leroy
2022-09-08  0:13     ` Benjamin Gray
2022-09-08  6:11       ` Christophe Leroy
