* [PATCH 00/14] powerpc/64: fast interrupt exits
@ 2021-03-15 22:03 Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument Nicholas Piggin
                   ` (13 more replies)
  0 siblings, 14 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This applies on top of powerpc next-test (particularly Christophe's ppc32
interrupt conversion) plus the 64e interrupt conversion patches I
recently posted.

This series attempts to improve the speed of interrupts and system calls
in three major ways.

Firstly, the SRR/HSRR registers do not need to be reloaded if they were
not used or clobbered for the duration of the interrupt. 64e does not
implement this, but it could.

Secondly, an alternate return location facility is added for soft-masked
asynchronous interrupts and then that's used to set everything up for
return without having to disable MSR RI or EE.

Thirdly, mtmsrd and mtspr are reduced by various means. This is mostly
specific to 64s.
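
As a rough pseudo-C sketch of the first point (the srr_valid/hsrr_valid
paca fields and regs_set_return_ip() follow patch 4; the two
interrupt_entry/exit helpers below are made up here for illustration --
the real entry and exit checks are done in asm):

        /* at interrupt or system call entry, SRR0/1 == regs->nip/msr */
        static inline void interrupt_entry_mark_srr_valid(void)
        {
                local_paca->srr_valid = 1;
        }

        /* anything that changes the saved return state must invalidate */
        static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
        {
                regs->nip = ip;
                local_paca->srr_valid = 0;
                local_paca->hsrr_valid = 0;
        }

        /* the exit path then only reloads the SPRs when invalidated */
        static inline void interrupt_exit_restore_srr(struct pt_regs *regs)
        {
                if (!local_paca->srr_valid) {
                        mtspr(SPRN_SRR0, regs->nip);
                        mtspr(SPRN_SRR1, regs->msr);
                }
                local_paca->srr_valid = 0;
        }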

After this series, the entire system call / interrupt handler fast path
executes no mtsprs and one mtmsrd to enable interrupts initially, and
the system call vectored path doesn't even need to do that. This gives a
decent performance benefit. On POWER9 with a powernv_defconfig without
VIRT_CPU_ACCOUNTING_NATIVE and no Meltdown workarounds, a gettid sc system
call goes from 481 -> 344 cycles, gettid scv from 345 -> 299 cycles, and a
page fault from 1225 -> 1064 cycles.
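
(Numbers of this sort typically come from a tight null-syscall loop
measured with the cycle counter; a minimal userspace sketch, not the
harness actually used:)

        /* gcc -O2 gettid_bench.c; run under "perf stat -e cycles" and
         * divide by ITERS to get cycles per call */
        #include <unistd.h>
        #include <sys/syscall.h>

        #define ITERS 10000000UL

        int main(void)
        {
                unsigned long i;

                for (i = 0; i < ITERS; i++)
                        syscall(SYS_gettid);    /* exercises the sc entry/exit path */
                return 0;
        }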

Since the RFC, this no longer breaks 64e, several techniques for reducing
MSR/SPR updates became possible or tidier with the interrupt wrappers,
security fallback flushes are no longer broken, and there are the usual
bug fixes.

Thanks,
Nick

Nicholas Piggin (14):
  powerpc: remove interrupt exit helpers unused argument
  powerpc/64s: security fallback improvement
  powerpc/64s: introduce different functions to return from SRR vs HSRR
    interrupts
  powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  powerpc/64: move interrupt return asm to interrupt_64.S
  powerpc/64s: save one more register in the masked interrupt handler
  powerpc/64: allow alternate return locations for soft-masked
    interrupts
  powerpc/64: interrupt soft-enable race fix
  powerpc/64: treat low kernel text as irqs soft-masked
  powerpc/64: use interrupt restart table to speed up return from
    interrupt
  powerpc/64e: Remove PPR from pt_regs
  powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
  powerpc/64: handle MSR EE and RI in interrupt entry wrapper
  powerpc/64s: use the same default PPR for user and kernel

 arch/powerpc/Kconfig.debug                 |   5 +
 arch/powerpc/include/asm/asm-prototypes.h  |   4 +-
 arch/powerpc/include/asm/exception-64e.h   |   6 +
 arch/powerpc/include/asm/exception-64s.h   |  52 +-
 arch/powerpc/include/asm/feature-fixups.h  |  18 +
 arch/powerpc/include/asm/head-64.h         |   2 +-
 arch/powerpc/include/asm/interrupt.h       |  41 +-
 arch/powerpc/include/asm/paca.h            |   9 +-
 arch/powerpc/include/asm/ppc_asm.h         |   8 +
 arch/powerpc/include/asm/processor.h       |   4 +-
 arch/powerpc/include/asm/ptrace.h          |  65 +-
 arch/powerpc/kernel/asm-offsets.c          |   7 +-
 arch/powerpc/kernel/entry_64.S             | 516 --------------
 arch/powerpc/kernel/exceptions-64e.S       |  53 +-
 arch/powerpc/kernel/exceptions-64s.S       | 384 +++++------
 arch/powerpc/kernel/fpu.S                  |   2 +
 arch/powerpc/kernel/head_64.S              |   5 +-
 arch/powerpc/kernel/interrupt.c            | 319 +++++----
 arch/powerpc/kernel/interrupt_64.S         | 738 +++++++++++++++++++++
 arch/powerpc/kernel/irq.c                  |  81 ++-
 arch/powerpc/kernel/kgdb.c                 |   2 +-
 arch/powerpc/kernel/kprobes-ftrace.c       |   2 +-
 arch/powerpc/kernel/kprobes.c              |  10 +-
 arch/powerpc/kernel/process.c              |  20 +-
 arch/powerpc/kernel/rtas.c                 |  13 +-
 arch/powerpc/kernel/signal.c               |   2 +-
 arch/powerpc/kernel/signal_64.c            |  14 +
 arch/powerpc/kernel/syscalls.c             |   2 +
 arch/powerpc/kernel/traps.c                |  18 +-
 arch/powerpc/kernel/vector.S               |   6 +-
 arch/powerpc/kernel/vmlinux.lds.S          |  24 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S    |   4 +
 arch/powerpc/lib/Makefile                  |   2 +-
 arch/powerpc/lib/feature-fixups.c          | 241 ++++---
 arch/powerpc/lib/restart_table.c           |  29 +
 arch/powerpc/lib/sstep.c                   |   5 +-
 arch/powerpc/math-emu/math.c               |   2 +-
 arch/powerpc/platforms/powernv/opal-call.c |   3 +
 arch/powerpc/sysdev/fsl_pci.c              |   2 +-
 39 files changed, 1631 insertions(+), 1089 deletions(-)
 create mode 100644 arch/powerpc/kernel/interrupt_64.S
 create mode 100644 arch/powerpc/lib/restart_table.c

-- 
2.23.0


* [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-05-17 13:49   ` Christophe Leroy
  2021-03-15 22:03 ` [PATCH 02/14] powerpc/64s: security fallback improvement Nicholas Piggin
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The msr argument is not used, so remove it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/asm-prototypes.h | 4 ++--
 arch/powerpc/kernel/interrupt.c           | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index 1c7b75834e04..95492655462e 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -71,8 +71,8 @@ void __init machine_init(u64 dt_ptr);
 #endif
 long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs);
 notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv);
-notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr);
-notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr);
+notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs);
+notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs);
 
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
 		      u32 len_high, u32 len_low);
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 96ca27ef68ae..efeeefe6ee8f 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -359,7 +359,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 	return ret;
 }
 
-notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr)
+notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 {
 	unsigned long ti_flags;
 	unsigned long flags;
@@ -443,7 +443,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
 
 void preempt_schedule_irq(void);
 
-notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr)
+notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 {
 	unsigned long flags;
 	unsigned long ret = 0;
-- 
2.23.0


* [PATCH 02/14] powerpc/64s: security fallback improvement
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 03/14] powerpc/64s: introduce different functions to return from SRR vs HSRR interrupts Nicholas Piggin
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The fallback sequences for L1D flushing and store forwarding barriers
require reloading r13, and saving and restoring registers using a
special PACA save area and SPRGs.

This is painful, and has caused a few difficult bugs (most recently the
scv interrupt re-entrancy issue). Things would get even hairier with the
planned interrupt exit optimizations that can return without disabling
interrupts.

This patch moves those fallbacks further into the kernel, to a point
where r13 and some working registers are already available. This exposes
slightly more attack surface, but not a huge amount (mainly some stack
frame and more of the paca). Firmware implementing the stateless
security ops, which does not use this path, has been available for
several years now.
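
At runtime the call sites are then flipped between a nop and a branch; a
simplified sketch of the logic in update_fallback_calls() below (names
shortened here, and the real code tracks the enter and exit sides
separately and avoids redundant re-patching):

        if (stf_type == STF_BARRIER_FALLBACK || flush_type == L1D_FLUSH_FALLBACK)
                do_enter_security_fallback_fixups(true);   /* nop -> bl enter_security_fallback */
        else if (stf_type == STF_BARRIER_NONE && flush_type == L1D_FLUSH_NONE)
                do_enter_security_fallback_fixups(false);  /* bl -> nop */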

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64e.h  |   6 +
 arch/powerpc/include/asm/exception-64s.h  |  52 +++--
 arch/powerpc/include/asm/feature-fixups.h |  18 ++
 arch/powerpc/include/asm/paca.h           |   6 +-
 arch/powerpc/kernel/asm-offsets.c         |   2 +-
 arch/powerpc/kernel/entry_64.S            |  26 ++-
 arch/powerpc/kernel/exceptions-64s.S      | 186 +++++------------
 arch/powerpc/kernel/vmlinux.lds.S         |  14 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |   4 +
 arch/powerpc/lib/feature-fixups.c         | 241 ++++++++++++----------
 10 files changed, 276 insertions(+), 279 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index 40cdcb2fb057..bc90e872484e 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -164,5 +164,11 @@ exc_##label##_book3e:
 #define RFI_TO_USER							\
 	rfi
 
+#define ENTER_KERNEL_SECURITY_FALLBACK
+
+#define EXIT_KERNEL_SECURITY_FALLBACK
+
+#define ENTER_GUEST_SECURITY_FALLBACK
+
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
 
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index c1a8aac01cf9..9f2684488922 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -44,30 +44,21 @@
 
 #define STF_ENTRY_BARRIER_SLOT						\
 	STF_ENTRY_BARRIER_FIXUP_SECTION;				\
-	nop;								\
-	nop;								\
 	nop
 
 #define STF_EXIT_BARRIER_SLOT						\
 	STF_EXIT_BARRIER_FIXUP_SECTION;					\
-	nop;								\
-	nop;								\
-	nop;								\
-	nop;								\
-	nop;								\
 	nop
 
 #define ENTRY_FLUSH_SLOT						\
 	ENTRY_FLUSH_FIXUP_SECTION;					\
 	nop;								\
-	nop;								\
-	nop;
+	nop
 
 #define SCV_ENTRY_FLUSH_SLOT						\
 	SCV_ENTRY_FLUSH_FIXUP_SECTION;					\
 	nop;								\
-	nop;								\
-	nop;
+	nop
 
 /*
  * r10 must be free to use, r13 must be paca
@@ -100,7 +91,6 @@
 #define RFI_FLUSH_SLOT							\
 	RFI_FLUSH_FIXUP_SECTION;					\
 	nop;								\
-	nop;								\
 	nop
 
 #define RFI_TO_KERNEL							\
@@ -109,20 +99,17 @@
 #define RFI_TO_USER							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	rfid;								\
-	b	rfi_flush_fallback
+	rfid
 
 #define RFI_TO_USER_OR_KERNEL						\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	rfid;								\
-	b	rfi_flush_fallback
+	rfid
 
 #define RFI_TO_GUEST							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	rfid;								\
-	b	rfi_flush_fallback
+	rfid
 
 #define HRFI_TO_KERNEL							\
 	hrfid
@@ -130,35 +117,44 @@
 #define HRFI_TO_USER							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	hrfid;								\
-	b	hrfi_flush_fallback
+	hrfid
 
 #define HRFI_TO_USER_OR_KERNEL						\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	hrfid;								\
-	b	hrfi_flush_fallback
+	hrfid
 
 #define HRFI_TO_GUEST							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	hrfid;								\
-	b	hrfi_flush_fallback
+	hrfid
 
 #define HRFI_TO_UNKNOWN							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	hrfid;								\
-	b	hrfi_flush_fallback
+	hrfid
 
 #define RFSCV_TO_USER							\
 	STF_EXIT_BARRIER_SLOT;						\
 	RFI_FLUSH_SLOT;							\
-	RFSCV;								\
-	b	rfscv_flush_fallback
+	RFSCV
+
+#define ENTER_KERNEL_SECURITY_FALLBACK					\
+	ENTER_SECURITY_FALLBACK_SECTION;				\
+	nop
+
+#define EXIT_KERNEL_SECURITY_FALLBACK					\
+	EXIT_SECURITY_FALLBACK_SECTION;					\
+	nop
+
+#define ENTER_GUEST_SECURITY_FALLBACK					\
+	EXIT_SECURITY_FALLBACK_SECTION;					\
+	nop
 
 #else /* __ASSEMBLY__ */
 /* Prototype for function defined in exceptions-64s.S */
+void exit_security_fallback(void);
+void enter_security_fallback(void);
 void do_uaccess_flush(void);
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h
index ac605fc369c4..4b5fab33688a 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -256,6 +256,22 @@ label##3:					       	\
 	FTR_ENTRY_OFFSET 951b-952b;			\
 	.popsection;
 
+#define ENTER_SECURITY_FALLBACK_SECTION			\
+958:							\
+	.pushsection __enter_security_fallback_fixup,"a"; \
+	.align 2;					\
+959:							\
+	FTR_ENTRY_OFFSET 958b-959b;			\
+	.popsection;
+
+#define EXIT_SECURITY_FALLBACK_SECTION			\
+960:							\
+	.pushsection __exit_security_fallback_fixup,"a"; \
+	.align 2;					\
+961:							\
+	FTR_ENTRY_OFFSET 960b-961b;			\
+	.popsection;
+
 #define NOSPEC_BARRIER_FIXUP_SECTION			\
 953:							\
 	.pushsection __barrier_nospec_fixup,"a";	\
@@ -288,6 +304,8 @@ extern long __start___uaccess_flush_fixup, __stop___uaccess_flush_fixup;
 extern long __start___entry_flush_fixup, __stop___entry_flush_fixup;
 extern long __start___scv_entry_flush_fixup, __stop___scv_entry_flush_fixup;
 extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+extern long __start___enter_security_fallback_fixup, __stop___enter_security_fallback_fixup;
+extern long __start___exit_security_fallback_fixup, __stop___exit_security_fallback_fixup;
 extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup;
 extern long __start__btb_flush_fixup, __stop__btb_flush_fixup;
 
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index ec18ac818e3a..819db8afd425 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -249,11 +249,7 @@ struct paca_struct {
 #endif
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
-	/*
-	 * rfi fallback flush must be in its own cacheline to prevent
-	 * other paca data leaking into the L1d
-	 */
-	u64 exrfi[EX_SIZE] __aligned(0x80);
+	u64 stf_fallback_scratch[2];
 	void *rfi_flush_fallback_area;
 	u64 l1d_flush_size;
 #endif
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 85ba2b0bc8d8..e33f04280f77 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -275,7 +275,7 @@ int main(void)
 	OFFSET(PACA_IN_MCE, paca_struct, in_mce);
 	OFFSET(PACA_IN_NMI, paca_struct, in_nmi);
 	OFFSET(PACA_RFI_FLUSH_FALLBACK_AREA, paca_struct, rfi_flush_fallback_area);
-	OFFSET(PACA_EXRFI, paca_struct, exrfi);
+	OFFSET(PACA_STF_FALLBACK_SCRATCH, paca_struct, stf_fallback_scratch);
 	OFFSET(PACA_L1D_FLUSH_SIZE, paca_struct, l1d_flush_size);
 
 #endif
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 03727308d8cc..3632d8c56e48 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -116,6 +116,8 @@ BEGIN_FTR_SECTION
 	HMT_MEDIUM
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
+	ENTER_KERNEL_SECURITY_FALLBACK
+
 	/*
 	 * scv enters with MSR[EE]=1 and is immediately considered soft-masked.
 	 * The entry vector already sets PACAIRQSOFTMASK to IRQS_ALL_DISABLED,
@@ -134,6 +136,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	li	r5,1 /* scv */
 	bl	syscall_exit_prepare
 
+	EXIT_KERNEL_SECURITY_FALLBACK
+
 	ld	r2,_CCR(r1)
 	ld	r4,_NIP(r1)
 	ld	r5,_MSR(r1)
@@ -296,6 +300,8 @@ END_BTB_FLUSH_SECTION
 	stb	r11,PACAIRQSOFTMASK(r13)
 	stb	r12,PACAIRQHAPPENED(r13)
 
+	ENTER_KERNEL_SECURITY_FALLBACK
+
 	/* Calling convention has r9 = orig r0, r10 = regs */
 	mr	r9,r0
 	bl	system_call_exception
@@ -305,6 +311,8 @@ END_BTB_FLUSH_SECTION
 	li	r5,0 /* !scv */
 	bl	syscall_exit_prepare
 
+	EXIT_KERNEL_SECURITY_FALLBACK
+
 	ld	r2,_CCR(r1)
 	ld	r4,_NIP(r1)
 	ld	r5,_MSR(r1)
@@ -642,11 +650,16 @@ _ASM_NOKPROBE_SYMBOL(fast_interrupt_return)
 	ld	r5,_MSR(r1)
 	andi.	r0,r5,MSR_PR
 #ifdef CONFIG_PPC_BOOK3S
-	bne	.Lfast_user_interrupt_return_amr
-	kuap_kernel_restore r3, r4
+	beq	1f
+	kuap_user_restore r3, r4
+	b	.Lfast_user_interrupt_return
+1:
 	andi.	r0,r5,MSR_RI
+	beq-	2f
+	kuap_kernel_restore r3, r4
 	li	r3,0 /* 0 return value, no EMULATE_STACK_STORE */
-	bne+	.Lfast_kernel_interrupt_return
+	b	.Lfast_kernel_interrupt_return
+2:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	unrecoverable_exception
 	b	. /* should not get here */
@@ -666,12 +679,9 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
 	bl	interrupt_exit_user_prepare
 	cmpdi	r3,0
 	bne-	.Lrestore_nvgprs
-
-#ifdef CONFIG_PPC_BOOK3S
-.Lfast_user_interrupt_return_amr:
-	kuap_user_restore r3, r4
-#endif
 .Lfast_user_interrupt_return:
+	EXIT_KERNEL_SECURITY_FALLBACK
+
 	ld	r11,_NIP(r1)
 	ld	r12,_MSR(r1)
 BEGIN_FTR_SECTION
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0cdb59e8b577..0127032bc2aa 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -644,6 +644,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r11,exception_marker@toc(r2)
 	std	r10,RESULT(r1)		/* clear regs->result		*/
 	std	r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame	*/
+
+	ENTER_KERNEL_SECURITY_FALLBACK
 .endm
 
 /*
@@ -996,6 +998,8 @@ EXC_COMMON_BEGIN(system_reset_common)
 	subi	r10,r10,1
 	sth	r10,PACA_IN_NMI(r13)
 
+	EXIT_KERNEL_SECURITY_FALLBACK
+
 	kuap_kernel_restore r9, r10
 	EXCEPTION_RESTORE_REGS
 	RFI_TO_USER_OR_KERNEL
@@ -2199,6 +2203,8 @@ EXC_COMMON_BEGIN(hmi_exception_early_common)
 	cmpdi	cr0,r3,0
 	bne	1f
 
+	EXIT_KERNEL_SECURITY_FALLBACK
+
 	EXCEPTION_RESTORE_REGS hsrr=1
 	HRFI_TO_USER_OR_KERNEL
 
@@ -2843,26 +2849,13 @@ masked_interrupt:
 	b	.
 .endm
 
-TRAMP_REAL_BEGIN(stf_barrier_fallback)
-	std	r9,PACA_EXRFI+EX_R9(r13)
-	std	r10,PACA_EXRFI+EX_R10(r13)
-	sync
-	ld	r9,PACA_EXRFI+EX_R9(r13)
-	ld	r10,PACA_EXRFI+EX_R10(r13)
-	ori	31,31,0
-	.rept 14
-	b	1f
-1:
-	.endr
-	blr
-
-/* Clobbers r10, r11, ctr */
+/* Clobbers r11, r12, ctr */
 .macro L1D_DISPLACEMENT_FLUSH
-	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
-	ld	r11,PACA_L1D_FLUSH_SIZE(r13)
-	srdi	r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
-	mtctr	r11
-	DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+	ld	r11,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	ld	r12,PACA_L1D_FLUSH_SIZE(r13)
+	srdi	r12,r12,(7 + 3) /* 128 byte lines, unrolled 8x */
+	mtctr	r12
+	DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r12) /* Stop prefetch streams */
 
 	/* order ld/st prior to dcbt stop all streams with flushing */
 	sync
@@ -2873,125 +2866,31 @@ TRAMP_REAL_BEGIN(stf_barrier_fallback)
 	 * hurt).
 	 */
 1:
-	ld	r11,(0x80 + 8)*0(r10)
-	ld	r11,(0x80 + 8)*1(r10)
-	ld	r11,(0x80 + 8)*2(r10)
-	ld	r11,(0x80 + 8)*3(r10)
-	ld	r11,(0x80 + 8)*4(r10)
-	ld	r11,(0x80 + 8)*5(r10)
-	ld	r11,(0x80 + 8)*6(r10)
-	ld	r11,(0x80 + 8)*7(r10)
-	addi	r10,r10,0x80*8
+	ld	r12,(0x80 + 8)*0(r11)
+	ld	r12,(0x80 + 8)*1(r11)
+	ld	r12,(0x80 + 8)*2(r11)
+	ld	r12,(0x80 + 8)*3(r11)
+	ld	r12,(0x80 + 8)*4(r11)
+	ld	r12,(0x80 + 8)*5(r11)
+	ld	r12,(0x80 + 8)*6(r11)
+	ld	r12,(0x80 + 8)*7(r11)
+	addi	r11,r11,0x80*8
 	bdnz	1b
 .endm
 
-TRAMP_REAL_BEGIN(entry_flush_fallback)
-	std	r9,PACA_EXRFI+EX_R9(r13)
-	std	r10,PACA_EXRFI+EX_R10(r13)
-	std	r11,PACA_EXRFI+EX_R11(r13)
-	mfctr	r9
-	L1D_DISPLACEMENT_FLUSH
-	mtctr	r9
-	ld	r9,PACA_EXRFI+EX_R9(r13)
-	ld	r10,PACA_EXRFI+EX_R10(r13)
-	ld	r11,PACA_EXRFI+EX_R11(r13)
-	blr
-
-/*
- * The SCV entry flush happens with interrupts enabled, so it must disable
- * to prevent EXRFI being clobbered by NMIs (e.g., soft_nmi_common). r10
- * (containing LR) does not need to be preserved here because scv entry
- * puts 0 in the pt_regs, CTR can be clobbered for the same reason.
- */
-TRAMP_REAL_BEGIN(scv_entry_flush_fallback)
-	li	r10,0
-	mtmsrd	r10,1
-	lbz	r10,PACAIRQHAPPENED(r13)
-	ori	r10,r10,PACA_IRQ_HARD_DIS
-	stb	r10,PACAIRQHAPPENED(r13)
-	std	r11,PACA_EXRFI+EX_R11(r13)
-	L1D_DISPLACEMENT_FLUSH
-	ld	r11,PACA_EXRFI+EX_R11(r13)
-	li	r10,MSR_RI
-	mtmsrd	r10,1
-	blr
-
-TRAMP_REAL_BEGIN(rfi_flush_fallback)
-	SET_SCRATCH0(r13);
-	GET_PACA(r13);
-	std	r1,PACA_EXRFI+EX_R12(r13)
-	ld	r1,PACAKSAVE(r13)
-	std	r9,PACA_EXRFI+EX_R9(r13)
-	std	r10,PACA_EXRFI+EX_R10(r13)
-	std	r11,PACA_EXRFI+EX_R11(r13)
-	mfctr	r9
-	L1D_DISPLACEMENT_FLUSH
-	mtctr	r9
-	ld	r9,PACA_EXRFI+EX_R9(r13)
-	ld	r10,PACA_EXRFI+EX_R10(r13)
-	ld	r11,PACA_EXRFI+EX_R11(r13)
-	ld	r1,PACA_EXRFI+EX_R12(r13)
-	GET_SCRATCH0(r13);
-	rfid
-
-TRAMP_REAL_BEGIN(hrfi_flush_fallback)
-	SET_SCRATCH0(r13);
-	GET_PACA(r13);
-	std	r1,PACA_EXRFI+EX_R12(r13)
-	ld	r1,PACAKSAVE(r13)
-	std	r9,PACA_EXRFI+EX_R9(r13)
-	std	r10,PACA_EXRFI+EX_R10(r13)
-	std	r11,PACA_EXRFI+EX_R11(r13)
-	mfctr	r9
-	L1D_DISPLACEMENT_FLUSH
-	mtctr	r9
-	ld	r9,PACA_EXRFI+EX_R9(r13)
-	ld	r10,PACA_EXRFI+EX_R10(r13)
-	ld	r11,PACA_EXRFI+EX_R11(r13)
-	ld	r1,PACA_EXRFI+EX_R12(r13)
-	GET_SCRATCH0(r13);
-	hrfid
-
-TRAMP_REAL_BEGIN(rfscv_flush_fallback)
-	/* system call volatile */
-	mr	r7,r13
-	GET_PACA(r13);
-	mr	r8,r1
-	ld	r1,PACAKSAVE(r13)
-	mfctr	r9
-	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
-	ld	r11,PACA_L1D_FLUSH_SIZE(r13)
-	srdi	r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
-	mtctr	r11
-	DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
-
-	/* order ld/st prior to dcbt stop all streams with flushing */
+/* Clobbers r11, r12 */
+.macro STF_FALLBACK_BARRIER
+	std	r11,PACA_STF_FALLBACK_SCRATCH+0(r13)
+	std	r12,PACA_STF_FALLBACK_SCRATCH+8(r13)
 	sync
-
-	/*
-	 * The load adresses are at staggered offsets within cachelines,
-	 * which suits some pipelines better (on others it should not
-	 * hurt).
-	 */
+	ld	r11,PACA_STF_FALLBACK_SCRATCH+0(r13)
+	ld	r12,PACA_STF_FALLBACK_SCRATCH+8(r13)
+	ori	31,31,0
+	.rept 14
+	b	1f
 1:
-	ld	r11,(0x80 + 8)*0(r10)
-	ld	r11,(0x80 + 8)*1(r10)
-	ld	r11,(0x80 + 8)*2(r10)
-	ld	r11,(0x80 + 8)*3(r10)
-	ld	r11,(0x80 + 8)*4(r10)
-	ld	r11,(0x80 + 8)*5(r10)
-	ld	r11,(0x80 + 8)*6(r10)
-	ld	r11,(0x80 + 8)*7(r10)
-	addi	r10,r10,0x80*8
-	bdnz	1b
-
-	mtctr	r9
-	li	r9,0
-	li	r10,0
-	li	r11,0
-	mr	r1,r8
-	mr	r13,r7
-	RFSCV
+	.endr
+.endm
 
 USE_TEXT_SECTION()
 
@@ -3006,6 +2905,27 @@ _GLOBAL(do_uaccess_flush)
 _ASM_NOKPROBE_SYMBOL(do_uaccess_flush)
 EXPORT_SYMBOL(do_uaccess_flush)
 
+_GLOBAL(enter_security_fallback)
+	STF_FALLBACK_BARRIER
+	ld	r11,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	cmpdi	r11,0
+	beq	10f
+	L1D_DISPLACEMENT_FLUSH
+10:
+	ld	r12,_MSR(r1)	// some interrupts require r12==SRR1
+	andi.	r11,r12,MSR_PR	// and cr0 set
+	blr
+_ASM_NOKPROBE_SYMBOL(enter_security_fallback)
+
+_GLOBAL(exit_security_fallback)
+	STF_FALLBACK_BARRIER
+	ld	r11,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	cmpdi	r11,0
+	beq	10f
+	L1D_DISPLACEMENT_FLUSH
+10:
+	blr
+_ASM_NOKPROBE_SYMBOL(exit_security_fallback)
 
 MASKED_INTERRUPT
 MASKED_INTERRUPT hsrr=1
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 72fa3c00229a..582009dacef4 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -165,6 +165,20 @@ SECTIONS
 		*(__rfi_flush_fixup)
 		__stop___rfi_flush_fixup = .;
 	}
+
+	. = ALIGN(8);
+	__enter_security_fallback_fixup : AT(ADDR(__enter_security_fallback_fixup) - LOAD_OFFSET) {
+		__start___enter_security_fallback_fixup = .;
+		*(__enter_security_fallback_fixup)
+		__stop___enter_security_fallback_fixup = .;
+	}
+
+	. = ALIGN(8);
+	__exit_security_fallback_fixup : AT(ADDR(__exit_security_fallback_fixup) - LOAD_OFFSET) {
+		__start___exit_security_fallback_fixup = .;
+		*(__exit_security_fallback_fixup)
+		__stop___exit_security_fallback_fixup = .;
+	}
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_PPC_BARRIER_NOSPEC
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 5e634db4809b..e5adfa090c6a 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1066,6 +1066,8 @@ fast_guest_return:
 	bl	kvmhv_accumulate_time
 #endif
 
+	ENTER_GUEST_SECURITY_FALLBACK
+
 	/* Enter guest */
 
 BEGIN_FTR_SECTION
@@ -1348,6 +1350,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	li	r0, MSR_RI
 	mtmsrd	r0, 1
 
+	ENTER_KERNEL_SECURITY_FALLBACK
+
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
 	addi	r3, r9, VCPU_TB_RMINTR
 	mr	r4, r9
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 1fd31b4b0e13..370e98dc64db 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -118,115 +118,169 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
-static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
+static void do_enter_security_fallback_fixups(bool enable)
 {
-	unsigned int instrs[3], *dest;
+	unsigned int instr, *dest;
 	long *start, *end;
 	int i;
 
-	start = PTRRELOC(&__start___stf_entry_barrier_fixup);
-	end = PTRRELOC(&__stop___stf_entry_barrier_fixup);
+	start = PTRRELOC(&__start___enter_security_fallback_fixup);
+	end = PTRRELOC(&__stop___enter_security_fallback_fixup);
 
-	instrs[0] = 0x60000000; /* nop */
-	instrs[1] = 0x60000000; /* nop */
-	instrs[2] = 0x60000000; /* nop */
+	instr = 0x60000000; /* nop */
 
-	i = 0;
-	if (types & STF_BARRIER_FALLBACK) {
-		instrs[i++] = 0x7d4802a6; /* mflr r10		*/
-		instrs[i++] = 0x60000000; /* branch patched below */
-		instrs[i++] = 0x7d4803a6; /* mtlr r10		*/
-	} else if (types & STF_BARRIER_EIEIO) {
-		instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */
-	} else if (types & STF_BARRIER_SYNC_ORI) {
-		instrs[i++] = 0x7c0004ac; /* hwsync		*/
-		instrs[i++] = 0xe94d0000; /* ld r10,0(r13)	*/
-		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+	for (i = 0; start < end; start++, i++) {
+		dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		if (enable) {
+			patch_branch((struct ppc_inst *)dest,
+				     (unsigned long)&enter_security_fallback,
+				     BRANCH_SET_LINK);
+		} else {
+			patch_instruction((struct ppc_inst *)dest,
+					  ppc_inst(instr));
+		}
 	}
 
+	printk(KERN_DEBUG "enter-security-fallback: patched %d locations (%s)\n", i,
+			enable ? "enable" : "disable");
+}
+
+static void do_exit_security_fallback_fixups(bool enable)
+{
+	unsigned int instr, *dest;
+	long *start, *end;
+	int i;
+
+	start = PTRRELOC(&__start___exit_security_fallback_fixup);
+	end = PTRRELOC(&__stop___exit_security_fallback_fixup);
+
+	instr = 0x60000000; /* nop */
+
 	for (i = 0; start < end; start++, i++) {
 		dest = (void *)start + *start;
 
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
-		patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
-
-		if (types & STF_BARRIER_FALLBACK)
-			patch_branch((struct ppc_inst *)(dest + 1),
-				     (unsigned long)&stf_barrier_fallback,
+		if (enable) {
+			patch_branch((struct ppc_inst *)dest,
+				     (unsigned long)&exit_security_fallback,
 				     BRANCH_SET_LINK);
-		else
-			patch_instruction((struct ppc_inst *)(dest + 1),
-					  ppc_inst(instrs[1]));
+		} else {
+			patch_instruction((struct ppc_inst *)dest,
+					  ppc_inst(instr));
+		}
+	}
 
-		patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
+	printk(KERN_DEBUG "exit-security-fallback: patched %d locations (%s)\n", i,
+			enable ? "enable" : "disable");
+}
+
+static enum stf_barrier_type enter_stf_barrier_type = STF_BARRIER_NONE;
+static enum stf_barrier_type exit_stf_barrier_type = STF_BARRIER_NONE;
+static enum l1d_flush_type enter_flush_type = L1D_FLUSH_NONE;
+static enum l1d_flush_type exit_flush_type = L1D_FLUSH_NONE;
+
+static void update_fallback_calls(void)
+{
+	static bool enter_fallback_enabled = false;
+	static bool exit_fallback_enabled = false;
+
+	// This is slightly racy if called concurrently.
+
+	if (enter_stf_barrier_type == STF_BARRIER_FALLBACK ||
+			enter_flush_type == L1D_FLUSH_FALLBACK) {
+		if (!enter_fallback_enabled) {
+			do_enter_security_fallback_fixups(true);
+			enter_fallback_enabled = true;
+		}
+	} else if (enter_stf_barrier_type == STF_BARRIER_NONE &&
+			enter_flush_type == L1D_FLUSH_NONE) {
+		if (enter_fallback_enabled) {
+			do_enter_security_fallback_fixups(false);
+			enter_fallback_enabled = false;
+		}
+	}
+
+	if (exit_stf_barrier_type == STF_BARRIER_FALLBACK ||
+			exit_flush_type == L1D_FLUSH_FALLBACK) {
+		if (!exit_fallback_enabled) {
+			do_exit_security_fallback_fixups(true);
+			exit_fallback_enabled = true;
+		}
+	} else if (exit_stf_barrier_type == STF_BARRIER_NONE &&
+			exit_flush_type == L1D_FLUSH_NONE) {
+		if (exit_fallback_enabled) {
+			do_exit_security_fallback_fixups(false);
+			exit_fallback_enabled = false;
+		}
+	}
+}
+
+static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
+{
+	unsigned int instr, *dest;
+	long *start, *end;
+	int i;
+
+	start = PTRRELOC(&__start___stf_entry_barrier_fixup);
+	end = PTRRELOC(&__stop___stf_entry_barrier_fixup);
+
+	instr = 0x60000000; /* nop */
+	if (types & STF_BARRIER_EIEIO)
+		instr = 0x7e0006ac; /* eieio + bit 6 hint */
+
+	for (i = 0; start < end; start++, i++) {
+		dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		patch_instruction((struct ppc_inst *)dest, ppc_inst(instr));
 	}
 
 	printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i,
 		(types == STF_BARRIER_NONE)                  ? "no" :
 		(types == STF_BARRIER_FALLBACK)              ? "fallback" :
 		(types == STF_BARRIER_EIEIO)                 ? "eieio" :
-		(types == (STF_BARRIER_SYNC_ORI))            ? "hwsync"
-		                                           : "unknown");
+		(types == STF_BARRIER_SYNC_ORI)              ? "hwsync"
+		                                             : "unknown");
+
+	enter_stf_barrier_type = types;
+	update_fallback_calls();
 }
 
 static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
 {
-	unsigned int instrs[6], *dest;
+	unsigned int instr, *dest;
 	long *start, *end;
 	int i;
 
 	start = PTRRELOC(&__start___stf_exit_barrier_fixup);
 	end = PTRRELOC(&__stop___stf_exit_barrier_fixup);
 
-	instrs[0] = 0x60000000; /* nop */
-	instrs[1] = 0x60000000; /* nop */
-	instrs[2] = 0x60000000; /* nop */
-	instrs[3] = 0x60000000; /* nop */
-	instrs[4] = 0x60000000; /* nop */
-	instrs[5] = 0x60000000; /* nop */
-
-	i = 0;
-	if (types & STF_BARRIER_FALLBACK || types & STF_BARRIER_SYNC_ORI) {
-		if (cpu_has_feature(CPU_FTR_HVMODE)) {
-			instrs[i++] = 0x7db14ba6; /* mtspr 0x131, r13 (HSPRG1) */
-			instrs[i++] = 0x7db04aa6; /* mfspr r13, 0x130 (HSPRG0) */
-		} else {
-			instrs[i++] = 0x7db243a6; /* mtsprg 2,r13	*/
-			instrs[i++] = 0x7db142a6; /* mfsprg r13,1    */
-	        }
-		instrs[i++] = 0x7c0004ac; /* hwsync		*/
-		instrs[i++] = 0xe9ad0000; /* ld r13,0(r13)	*/
-		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
-		if (cpu_has_feature(CPU_FTR_HVMODE)) {
-			instrs[i++] = 0x7db14aa6; /* mfspr r13, 0x131 (HSPRG1) */
-		} else {
-			instrs[i++] = 0x7db242a6; /* mfsprg r13,2 */
-		}
-	} else if (types & STF_BARRIER_EIEIO) {
-		instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */
-	}
+	instr = 0x60000000; /* nop */
+	if (types == STF_BARRIER_EIEIO)
+		instr = 0x7e0006ac; /* eieio + bit 6 hint */
 
 	for (i = 0; start < end; start++, i++) {
 		dest = (void *)start + *start;
 
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
-		patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
-		patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
-		patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
-		patch_instruction((struct ppc_inst *)(dest + 3), ppc_inst(instrs[3]));
-		patch_instruction((struct ppc_inst *)(dest + 4), ppc_inst(instrs[4]));
-		patch_instruction((struct ppc_inst *)(dest + 5), ppc_inst(instrs[5]));
+		patch_instruction((struct ppc_inst *)dest, ppc_inst(instr));
 	}
 	printk(KERN_DEBUG "stf-barrier: patched %d exit locations (%s barrier)\n", i,
 		(types == STF_BARRIER_NONE)                  ? "no" :
 		(types == STF_BARRIER_FALLBACK)              ? "fallback" :
 		(types == STF_BARRIER_EIEIO)                 ? "eieio" :
-		(types == (STF_BARRIER_SYNC_ORI))            ? "hwsync"
-		                                           : "unknown");
-}
+		(types == STF_BARRIER_SYNC_ORI)              ? "hwsync"
+		                                             : "unknown");
 
+	exit_stf_barrier_type = types;
+	update_fallback_calls();
+}
 
 void do_stf_barrier_fixups(enum stf_barrier_type types)
 {
@@ -286,28 +340,20 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
 
 void do_entry_flush_fixups(enum l1d_flush_type types)
 {
-	unsigned int instrs[3], *dest;
+	unsigned int instrs[2], *dest;
 	long *start, *end;
 	int i;
 
 	instrs[0] = 0x60000000; /* nop */
 	instrs[1] = 0x60000000; /* nop */
-	instrs[2] = 0x60000000; /* nop */
 
 	i = 0;
-	if (types == L1D_FLUSH_FALLBACK) {
-		instrs[i++] = 0x7d4802a6; /* mflr r10		*/
-		instrs[i++] = 0x60000000; /* branch patched below */
-		instrs[i++] = 0x7d4803a6; /* mtlr r10		*/
-	}
-
 	if (types & L1D_FLUSH_ORI) {
 		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
 		instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
-	}
-
-	if (types & L1D_FLUSH_MTTRIG)
+	} else if (types & L1D_FLUSH_MTTRIG) {
 		instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+	}
 
 	start = PTRRELOC(&__start___entry_flush_fixup);
 	end = PTRRELOC(&__stop___entry_flush_fixup);
@@ -316,15 +362,9 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
 
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
-		patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
-
-		if (types == L1D_FLUSH_FALLBACK)
-			patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&entry_flush_fallback,
-				     BRANCH_SET_LINK);
-		else
-			patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
+		patch_instruction((struct ppc_inst *)(dest + 0), ppc_inst(instrs[0]));
 
-		patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
+		patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
 	}
 
 	start = PTRRELOC(&__start___scv_entry_flush_fixup);
@@ -334,15 +374,9 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
 
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
-		patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
-
-		if (types == L1D_FLUSH_FALLBACK)
-			patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&scv_entry_flush_fallback,
-				     BRANCH_SET_LINK);
-		else
-			patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
+		patch_instruction((struct ppc_inst *)(dest + 0), ppc_inst(instrs[0]));
 
-		patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
+		patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
 	}
 
 
@@ -354,11 +388,14 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
 							: "ori type" :
 		(types &  L1D_FLUSH_MTTRIG)     ? "mttrig type"
 						: "unknown");
+
+	enter_flush_type = types;
+	update_fallback_calls();
 }
 
 void do_rfi_flush_fixups(enum l1d_flush_type types)
 {
-	unsigned int instrs[3], *dest;
+	unsigned int instrs[2], *dest;
 	long *start, *end;
 	int i;
 
@@ -367,29 +404,22 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
 
 	instrs[0] = 0x60000000; /* nop */
 	instrs[1] = 0x60000000; /* nop */
-	instrs[2] = 0x60000000; /* nop */
-
-	if (types & L1D_FLUSH_FALLBACK)
-		/* b .+16 to fallback flush */
-		instrs[0] = 0x48000010;
 
 	i = 0;
 	if (types & L1D_FLUSH_ORI) {
 		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
 		instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
-	}
-
-	if (types & L1D_FLUSH_MTTRIG)
+	} else if (types & L1D_FLUSH_MTTRIG) {
 		instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+	}
 
 	for (i = 0; start < end; start++, i++) {
 		dest = (void *)start + *start;
 
 		pr_devel("patching dest %lx\n", (unsigned long)dest);
 
-		patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
+		patch_instruction((struct ppc_inst *)(dest + 0), ppc_inst(instrs[0]));
 		patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
-		patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
 	}
 
 	printk(KERN_DEBUG "rfi-flush: patched %d locations (%s flush)\n", i,
@@ -400,6 +430,9 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
 							: "ori type" :
 		(types &  L1D_FLUSH_MTTRIG)     ? "mttrig type"
 						: "unknown");
+
+	exit_flush_type = types;
+	update_fallback_calls();
 }
 
 void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
-- 
2.23.0


* [PATCH 03/14] powerpc/64s: introduce different functions to return from SRR vs HSRR interrupts
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 02/14] powerpc/64s: security fallback improvement Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid Nicholas Piggin
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This makes no real difference yet except that HSRR type interrupts will
use hrfid to return. This is important for the next patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/entry_64.S       | 63 ++++++++++++++++-------
 arch/powerpc/kernel/exceptions-64e.S |  4 ++
 arch/powerpc/kernel/exceptions-64s.S | 76 +++++++++++++++-------------
 arch/powerpc/kernel/vector.S         |  2 +-
 4 files changed, 91 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 3632d8c56e48..ccf913cedd29 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -643,43 +643,44 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	 * touched, no exit work created, then this can be used.
 	 */
 	.balign IFETCH_ALIGN_BYTES
-	.globl fast_interrupt_return
-fast_interrupt_return:
-_ASM_NOKPROBE_SYMBOL(fast_interrupt_return)
+	.globl fast_interrupt_return_srr
+fast_interrupt_return_srr:
+_ASM_NOKPROBE_SYMBOL(fast_interrupt_return_srr)
 	kuap_check_amr r3, r4
 	ld	r5,_MSR(r1)
 	andi.	r0,r5,MSR_PR
 #ifdef CONFIG_PPC_BOOK3S
 	beq	1f
 	kuap_user_restore r3, r4
-	b	.Lfast_user_interrupt_return
+	b	.Lfast_user_interrupt_return_srr
 1:
 	andi.	r0,r5,MSR_RI
 	beq-	2f
 	kuap_kernel_restore r3, r4
 	li	r3,0 /* 0 return value, no EMULATE_STACK_STORE */
-	b	.Lfast_kernel_interrupt_return
+	b	.Lfast_kernel_interrupt_return_srr
 2:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	unrecoverable_exception
 	b	. /* should not get here */
 #else
-	bne	.Lfast_user_interrupt_return
-	b	.Lfast_kernel_interrupt_return
+	bne	.Lfast_user_interrupt_return_srr
+	b	.Lfast_kernel_interrupt_return_srr
 #endif
 
+.macro interrupt_return_macro srr
 	.balign IFETCH_ALIGN_BYTES
-	.globl interrupt_return
-interrupt_return:
-_ASM_NOKPROBE_SYMBOL(interrupt_return)
+	.globl interrupt_return_\srr
+interrupt_return_\srr\():
+_ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
 	ld	r4,_MSR(r1)
 	andi.	r0,r4,MSR_PR
-	beq	.Lkernel_interrupt_return
+	beq	.Lkernel_interrupt_return_\srr
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	interrupt_exit_user_prepare
 	cmpdi	r3,0
-	bne-	.Lrestore_nvgprs
-.Lfast_user_interrupt_return:
+	bne-	.Lrestore_nvgprs_\srr
+.Lfast_user_interrupt_return_\srr\():
 	EXIT_KERNEL_SECURITY_FALLBACK
 
 	ld	r11,_NIP(r1)
@@ -688,8 +689,13 @@ BEGIN_FTR_SECTION
 	ld	r10,_PPR(r1)
 	mtspr	SPRN_PPR,r10
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+	.ifc \srr,srr
 	mtspr	SPRN_SRR0,r11
 	mtspr	SPRN_SRR1,r12
+	.else
+	mtspr	SPRN_HSRR0,r11
+	mtspr	SPRN_HSRR1,r12
+	.endif
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */
@@ -716,24 +722,33 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	REST_GPR(6, r1)
 	REST_GPR(0, r1)
 	REST_GPR(1, r1)
+	.ifc \srr,srr
 	RFI_TO_USER
+	.else
+	HRFI_TO_USER
+	.endif
 	b	.	/* prevent speculative execution */
 
-.Lrestore_nvgprs:
+.Lrestore_nvgprs_\srr\():
 	REST_NVGPRS(r1)
-	b	.Lfast_user_interrupt_return
+	b	.Lfast_user_interrupt_return_\srr
 
 	.balign IFETCH_ALIGN_BYTES
-.Lkernel_interrupt_return:
+.Lkernel_interrupt_return_\srr\():
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	interrupt_exit_kernel_prepare
 
-.Lfast_kernel_interrupt_return:
+.Lfast_kernel_interrupt_return_\srr\():
 	cmpdi	cr1,r3,0
 	ld	r11,_NIP(r1)
 	ld	r12,_MSR(r1)
+	.ifc \srr,srr
 	mtspr	SPRN_SRR0,r11
 	mtspr	SPRN_SRR1,r12
+	.else
+	mtspr	SPRN_HSRR0,r11
+	mtspr	SPRN_HSRR1,r12
+	.endif
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */
@@ -767,7 +782,11 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	REST_GPR(6, r1)
 	REST_GPR(0, r1)
 	REST_GPR(1, r1)
+	.ifc \srr,srr
 	RFI_TO_KERNEL
+	.else
+	HRFI_TO_KERNEL
+	.endif
 	b	.	/* prevent speculative execution */
 
 1:	/*
@@ -787,8 +806,18 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	std	r9,0(r1) /* perform store component of stdu */
 	ld	r9,PACA_EXGEN+0(r13)
 
+	.ifc \srr,srr
 	RFI_TO_KERNEL
+	.else
+	HRFI_TO_KERNEL
+	.endif
 	b	.	/* prevent speculative execution */
+.endm
+
+interrupt_return_macro srr
+#ifdef CONFIG_PPC_BOOK3S
+interrupt_return_macro hsrr
+#endif
 
 #ifdef CONFIG_PPC_RTAS
 /*
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index b08c84e0fa56..86612f68f5bd 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -26,6 +26,10 @@
 #include <asm/feature-fixups.h>
 #include <asm/context_tracking.h>
 
+/* 64e interrupt returns always use SRR registers */
+#define fast_interrupt_return fast_interrupt_return_srr
+#define interrupt_return interrupt_return_srr
+
 /* XXX This will ultimately add space for a special exception save
  *     structure used to save things like SRR0/SRR1, SPRGs, MAS, etc...
  *     when taking special interrupts. For now we don't support that,
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0127032bc2aa..136323d38c80 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1238,7 +1238,7 @@ EXC_COMMON_BEGIN(machine_check_common)
 	mtmsrd 	r10,1
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM machine_check
 
@@ -1367,7 +1367,7 @@ BEGIN_MMU_FTR_SECTION
 MMU_FTR_SECTION_ELSE
 	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
-	b	interrupt_return
+	b	interrupt_return_srr
 
 1:	bl	do_break
 	/*
@@ -1375,7 +1375,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	 * If so, we need to restore them with their updated values.
 	 */
 	REST_NVGPRS(r1)
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM data_access
 
@@ -1418,7 +1418,7 @@ BEGIN_MMU_FTR_SECTION
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
-	b	fast_interrupt_return
+	b	fast_interrupt_return_srr
 1:	/* Error case */
 MMU_FTR_SECTION_ELSE
 	/* Radix case, access is outside page table range */
@@ -1427,7 +1427,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM data_access_slb
 
@@ -1465,7 +1465,7 @@ BEGIN_MMU_FTR_SECTION
 MMU_FTR_SECTION_ELSE
 	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM instruction_access
 
@@ -1502,7 +1502,7 @@ BEGIN_MMU_FTR_SECTION
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
-	b	fast_interrupt_return
+	b	fast_interrupt_return_srr
 1:	/* Error case */
 MMU_FTR_SECTION_ELSE
 	/* Radix case, access is outside page table range */
@@ -1511,7 +1511,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM instruction_access_slb
 
@@ -1557,7 +1557,11 @@ EXC_COMMON_BEGIN(hardware_interrupt_common)
 	GEN_COMMON hardware_interrupt
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
-	b	interrupt_return
+	BEGIN_FTR_SECTION
+	b	interrupt_return_hsrr
+	FTR_SECTION_ELSE
+	b	interrupt_return_srr
+	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 
 	GEN_KVM hardware_interrupt
 
@@ -1586,7 +1590,7 @@ EXC_COMMON_BEGIN(alignment_common)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	alignment_exception
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM alignment
 
@@ -1695,7 +1699,7 @@ EXC_COMMON_BEGIN(program_check_common)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	program_check_exception
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM program_check
 
@@ -1740,12 +1744,12 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_TM)
 #endif
 	bl	load_up_fpu
-	b	fast_interrupt_return
+	b	fast_interrupt_return_srr
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	fp_unavailable_tm
-	b	interrupt_return
+	b	interrupt_return_srr
 #endif
 
 	GEN_KVM fp_unavailable
@@ -1786,7 +1790,7 @@ EXC_COMMON_BEGIN(decrementer_common)
 	GEN_COMMON decrementer
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	timer_interrupt
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM decrementer
 
@@ -1874,7 +1878,7 @@ EXC_COMMON_BEGIN(doorbell_super_common)
 #else
 	bl	unknown_async_exception
 #endif
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM doorbell_super
 
@@ -2045,7 +2049,7 @@ EXC_COMMON_BEGIN(single_step_common)
 	GEN_COMMON single_step
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	single_step_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM single_step
 
@@ -2086,7 +2090,7 @@ BEGIN_MMU_FTR_SECTION
 MMU_FTR_SECTION_ELSE
 	bl      unknown_exception
 ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
-	b       interrupt_return
+	b       interrupt_return_hsrr
 
 	GEN_KVM h_data_storage
 
@@ -2113,7 +2117,7 @@ EXC_COMMON_BEGIN(h_instr_storage_common)
 	GEN_COMMON h_instr_storage
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	unknown_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM h_instr_storage
 
@@ -2139,7 +2143,7 @@ EXC_COMMON_BEGIN(emulation_assist_common)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	emulation_assist_interrupt
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM emulation_assist
 
@@ -2222,7 +2226,7 @@ EXC_COMMON_BEGIN(hmi_exception_common)
 	GEN_COMMON hmi_exception
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	handle_hmi_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM hmi_exception
 
@@ -2254,7 +2258,7 @@ EXC_COMMON_BEGIN(h_doorbell_common)
 #else
 	bl	unknown_async_exception
 #endif
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM h_doorbell
 
@@ -2282,7 +2286,7 @@ EXC_COMMON_BEGIN(h_virt_irq_common)
 	GEN_COMMON h_virt_irq
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM h_virt_irq
 
@@ -2327,7 +2331,7 @@ EXC_COMMON_BEGIN(performance_monitor_common)
 	GEN_COMMON performance_monitor
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	performance_monitor_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM performance_monitor
 
@@ -2366,19 +2370,19 @@ BEGIN_FTR_SECTION
   END_FTR_SECTION_NESTED(CPU_FTR_TM, CPU_FTR_TM, 69)
 #endif
 	bl	load_up_altivec
-	b	fast_interrupt_return
+	b	fast_interrupt_return_srr
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	altivec_unavailable_tm
-	b	interrupt_return
+	b	interrupt_return_srr
 #endif
 1:
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	altivec_unavailable_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM altivec_unavailable
 
@@ -2421,14 +2425,14 @@ BEGIN_FTR_SECTION
 2:	/* User process was in a transaction */
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	vsx_unavailable_tm
-	b	interrupt_return
+	b	interrupt_return_srr
 #endif
 1:
 END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 #endif
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	vsx_unavailable_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM vsx_unavailable
 
@@ -2458,7 +2462,7 @@ EXC_COMMON_BEGIN(facility_unavailable_common)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	facility_unavailable_exception
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM facility_unavailable
 
@@ -2488,7 +2492,7 @@ EXC_COMMON_BEGIN(h_facility_unavailable_common)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	facility_unavailable_exception
 	REST_NVGPRS(r1) /* XXX Shouldn't be necessary in practice */
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM h_facility_unavailable
 
@@ -2521,7 +2525,7 @@ EXC_COMMON_BEGIN(cbe_system_error_common)
 	GEN_COMMON cbe_system_error
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	cbe_system_error_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM cbe_system_error
 
@@ -2549,7 +2553,7 @@ EXC_COMMON_BEGIN(instruction_breakpoint_common)
 	GEN_COMMON instruction_breakpoint
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	instruction_breakpoint_exception
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM instruction_breakpoint
 
@@ -2671,7 +2675,7 @@ EXC_COMMON_BEGIN(denorm_exception_common)
 	GEN_COMMON denorm_exception
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	unknown_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM denorm_exception
 
@@ -2692,7 +2696,7 @@ EXC_COMMON_BEGIN(cbe_maintenance_common)
 	GEN_COMMON cbe_maintenance
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	cbe_maintenance_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM cbe_maintenance
 
@@ -2724,7 +2728,7 @@ EXC_COMMON_BEGIN(altivec_assist_common)
 #else
 	bl	unknown_exception
 #endif
-	b	interrupt_return
+	b	interrupt_return_srr
 
 	GEN_KVM altivec_assist
 
@@ -2745,7 +2749,7 @@ EXC_COMMON_BEGIN(cbe_thermal_common)
 	GEN_COMMON cbe_thermal
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	cbe_thermal_exception
-	b	interrupt_return
+	b	interrupt_return_hsrr
 
 	GEN_KVM cbe_thermal
 
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index f5a52f444e36..54dbefcb4cde 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -131,7 +131,7 @@ _GLOBAL(load_up_vsx)
 	/* enable use of VSX after return */
 	oris	r12,r12,MSR_VSX@h
 	std	r12,_MSR(r1)
-	b	fast_interrupt_return
+	b	fast_interrupt_return_srr
 
 #endif /* CONFIG_VSX */
 
-- 
2.23.0


* [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (2 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 03/14] powerpc/64s: introduce different functions to return from SRR vs HSRR interrupts Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-04-02 22:39   ` Michael Ellerman
  2021-04-03  2:28   ` Michael Ellerman
  2021-03-15 22:03 ` [PATCH 05/14] powerpc/64: move interrupt return asm to interrupt_64.S Nicholas Piggin
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

When an interrupt is taken, the SRR registers are set to return to the
point where the interrupt occurred. Unless they are clobbered in the
meantime, or the saved return address or MSR is modified, there is no
need to reload these registers when returning from interrupt.

Introduce per-CPU flags that track the validity of the SRR and HSRR
registers. They are cleared when returning from interrupt, when the
registers are used for something else (e.g., OPAL calls), or when the
return address or MSR is adjusted.

This improves the performance of interrupt returns.

XXX: may not need to invalidate both hsrr and srr all the time
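
Concretely, code that modifies the saved return state has to go through
the new helpers so the flags are cleared; an illustrative pair of lines
(the actual conversions are in the traps.c, sstep.c, etc. hunks listed
in the diffstat below):

        regs->nip += 4;                 /* old: would leave a stale SRR0 */
        regs_add_return_ip(regs, 4);    /* new: also invalidates srr/hsrr_valid */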

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/Kconfig.debug                 |  5 ++
 arch/powerpc/include/asm/paca.h            |  2 +
 arch/powerpc/include/asm/ptrace.h          | 57 +++++++++-----
 arch/powerpc/kernel/asm-offsets.c          |  2 +
 arch/powerpc/kernel/entry_64.S             | 91 ++++++++++++++++++++--
 arch/powerpc/kernel/exceptions-64s.S       | 27 +++++++
 arch/powerpc/kernel/fpu.S                  |  2 +
 arch/powerpc/kernel/kgdb.c                 |  2 +-
 arch/powerpc/kernel/kprobes-ftrace.c       |  2 +-
 arch/powerpc/kernel/kprobes.c              | 10 +--
 arch/powerpc/kernel/process.c              | 20 ++++-
 arch/powerpc/kernel/rtas.c                 | 13 +++-
 arch/powerpc/kernel/signal.c               |  2 +-
 arch/powerpc/kernel/signal_64.c            | 14 ++++
 arch/powerpc/kernel/syscalls.c             |  2 +
 arch/powerpc/kernel/traps.c                | 18 ++---
 arch/powerpc/kernel/vector.S               |  4 +
 arch/powerpc/lib/sstep.c                   |  5 +-
 arch/powerpc/math-emu/math.c               |  2 +-
 arch/powerpc/platforms/powernv/opal-call.c |  3 +
 arch/powerpc/sysdev/fsl_pci.c              |  2 +-
 21 files changed, 233 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index ae084357994e..359ed36c5487 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -84,6 +84,11 @@ config MSI_BITMAP_SELFTEST
 
 config PPC_IRQ_SOFT_MASK_DEBUG
 	bool "Include extra checks for powerpc irq soft masking"
+	depends on PPC64
+
+config PPC_RFI_SRR_DEBUG
+	bool "Include extra checks for RFI SRR register validity"
+	depends on PPC_BOOK3S_64
 
 config XMON
 	bool "Include xmon kernel debugger"
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 819db8afd425..4cbfaa09950a 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -170,6 +170,8 @@ struct paca_struct {
 #ifdef CONFIG_PPC_BOOK3E
 	u16 trap_save;			/* Used when bad stack is encountered */
 #endif
+	u8 hsrr_valid;			/* HSRRs set for HRFID */
+	u8 srr_valid;			/* SRRs set for RFID */
 	u8 irq_soft_mask;		/* mask for irq soft masking */
 	u8 irq_happened;		/* irq happened while soft-disabled */
 	u8 irq_work_pending;		/* IRQ_WORK interrupt while soft-disable */
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 6a04abfe5eb6..77c86ce01f20 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -121,27 +121,7 @@ struct pt_regs
 #endif /* __powerpc64__ */
 
 #ifndef __ASSEMBLY__
-
-static inline unsigned long instruction_pointer(struct pt_regs *regs)
-{
-	return regs->nip;
-}
-
-static inline void instruction_pointer_set(struct pt_regs *regs,
-		unsigned long val)
-{
-	regs->nip = val;
-}
-
-static inline unsigned long user_stack_pointer(struct pt_regs *regs)
-{
-	return regs->gpr[1];
-}
-
-static inline unsigned long frame_pointer(struct pt_regs *regs)
-{
-	return 0;
-}
+#include <asm/paca.h>
 
 #ifdef CONFIG_SMP
 extern unsigned long profile_pc(struct pt_regs *regs);
@@ -171,6 +151,41 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
 	regs->gpr[3] = rc;
 }
 
+static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
+{
+	regs->nip = ip;
+#ifdef CONFIG_PPC_BOOK3S_64
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+#endif
+}
+
+static inline void regs_add_return_ip(struct pt_regs *regs, long offset)
+{
+	regs_set_return_ip(regs, regs->nip + offset);
+}
+
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+	return regs->nip;
+}
+
+static inline void instruction_pointer_set(struct pt_regs *regs,
+		unsigned long val)
+{
+	regs_set_return_ip(regs, val);
+}
+
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+	return regs->gpr[1];
+}
+
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+	return 0;
+}
+
 #ifdef __powerpc64__
 #define user_mode(regs) ((((regs)->msr) >> MSR_PR_LG) & 0x1)
 #else
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index e33f04280f77..35ce6e36f593 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -209,6 +209,8 @@ int main(void)
 	OFFSET(PACATOC, paca_struct, kernel_toc);
 	OFFSET(PACAKBASE, paca_struct, kernelbase);
 	OFFSET(PACAKMSR, paca_struct, kernel_msr);
+	OFFSET(PACAHSRR_VALID, paca_struct, hsrr_valid);
+	OFFSET(PACASRR_VALID, paca_struct, srr_valid);
 	OFFSET(PACAIRQSOFTMASK, paca_struct, irq_soft_mask);
 	OFFSET(PACAIRQHAPPENED, paca_struct, irq_happened);
 	OFFSET(PACA_FTRACE_ENABLED, paca_struct, ftrace_enabled);
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index ccf913cedd29..b466b3e1bb3f 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -64,6 +64,30 @@ exception_marker:
 	.section	".text"
 	.align 7
 
+.macro DEBUG_SRR_VALID srr
+#ifdef CONFIG_PPC_RFI_SRR_DEBUG
+	.ifc \srr,srr
+	mfspr	r11,SPRN_SRR0
+	ld	r12,_NIP(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	mfspr	r11,SPRN_SRR1
+	ld	r12,_MSR(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	.else
+	mfspr	r11,SPRN_HSRR0
+	ld	r12,_NIP(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	mfspr	r11,SPRN_HSRR1
+	ld	r12,_MSR(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	.endif
+#endif
+.endm
+
 #ifdef CONFIG_PPC_BOOK3S
 .macro system_call_vectored name trapnr
 	.globl system_call_vectored_\name
@@ -290,6 +314,11 @@ END_BTB_FLUSH_SECTION
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)		/* "regshere" marker */
 
+#ifdef CONFIG_PPC_BOOK3S
+	li	r11,1
+	stb	r11,PACASRR_VALID(r13)
+#endif
+
 	/*
 	 * We always enter kernel from userspace with irq soft-mask enabled and
 	 * nothing pending. system_call_exception() will call
@@ -314,18 +343,27 @@ END_BTB_FLUSH_SECTION
 	EXIT_KERNEL_SECURITY_FALLBACK
 
 	ld	r2,_CCR(r1)
+	ld	r6,_LINK(r1)
+	mtlr	r6
+
+#ifdef CONFIG_PPC_BOOK3S
+	lbz	r4,PACASRR_VALID(r13)
+	cmpdi	r4,0
+	bne	1f
+	li	r4,0
+	stb	r4,PACASRR_VALID(r13)
+#endif
 	ld	r4,_NIP(r1)
 	ld	r5,_MSR(r1)
-	ld	r6,_LINK(r1)
+	mtspr	SPRN_SRR0,r4
+	mtspr	SPRN_SRR1,r5
+1:
+	DEBUG_SRR_VALID srr
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1			/* to clear the reservation */
 END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 
-	mtspr	SPRN_SRR0,r4
-	mtspr	SPRN_SRR1,r5
-	mtlr	r6
-
 	cmpdi	r3,0
 	bne	.Lsyscall_restore_regs
 	/* Zero volatile regs that may contain sensitive kernel data */
@@ -683,19 +721,39 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
 .Lfast_user_interrupt_return_\srr\():
 	EXIT_KERNEL_SECURITY_FALLBACK
 
-	ld	r11,_NIP(r1)
-	ld	r12,_MSR(r1)
 BEGIN_FTR_SECTION
 	ld	r10,_PPR(r1)
 	mtspr	SPRN_PPR,r10
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+
+#ifdef CONFIG_PPC_BOOK3S
+	.ifc \srr,srr
+	lbz	r4,PACASRR_VALID(r13)
+	.else
+	lbz	r4,PACAHSRR_VALID(r13)
+	.endif
+	cmpdi	r4,0
+	li	r4,0
+	bne	1f
+#endif
+	ld	r11,_NIP(r1)
+	ld	r12,_MSR(r1)
 	.ifc \srr,srr
 	mtspr	SPRN_SRR0,r11
 	mtspr	SPRN_SRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACASRR_VALID(r13)
+#endif
 	.else
 	mtspr	SPRN_HSRR0,r11
 	mtspr	SPRN_HSRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACAHSRR_VALID(r13)
+#endif
 	.endif
+	DEBUG_SRR_VALID \srr
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */
@@ -740,15 +798,34 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 
 .Lfast_kernel_interrupt_return_\srr\():
 	cmpdi	cr1,r3,0
+#ifdef CONFIG_PPC_BOOK3S
+	.ifc \srr,srr
+	lbz	r4,PACASRR_VALID(r13)
+	.else
+	lbz	r4,PACAHSRR_VALID(r13)
+	.endif
+	cmpdi	r4,0
+	li	r4,0
+	bne	1f
+#endif
 	ld	r11,_NIP(r1)
 	ld	r12,_MSR(r1)
 	.ifc \srr,srr
 	mtspr	SPRN_SRR0,r11
 	mtspr	SPRN_SRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACASRR_VALID(r13)
+#endif
 	.else
 	mtspr	SPRN_HSRR0,r11
 	mtspr	SPRN_HSRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACAHSRR_VALID(r13)
+#endif
 	.endif
+	DEBUG_SRR_VALID \srr
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 136323d38c80..0c7af27d6dc1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -567,6 +567,20 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 	std	r0,GPR0(r1)		/* save r0 in stackframe	*/
 	std	r10,GPR1(r1)		/* save r1 in stackframe	*/
 
+	/* Mark our [H]SRRs valid for return */
+	li	r10,1
+	.if IHSRR_IF_HVMODE
+	BEGIN_FTR_SECTION
+	stb	r10,PACAHSRR_VALID(r13)
+	FTR_SECTION_ELSE
+	stb	r10,PACASRR_VALID(r13)
+	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
+	.elseif IHSRR
+	stb	r10,PACAHSRR_VALID(r13)
+	.else
+	stb	r10,PACASRR_VALID(r13)
+	.endif
+
 	.if ISET_RI
 	li	r10,MSR_RI
 	mtmsrd	r10,1			/* Set MSR_RI */
@@ -668,10 +682,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 .macro EXCEPTION_RESTORE_REGS hsrr=0
 	/* Move original SRR0 and SRR1 into the respective regs */
 	ld	r9,_MSR(r1)
+	li	r10,0
 	.if \hsrr
 	mtspr	SPRN_HSRR1,r9
+	stb	r10,PACAHSRR_VALID(r13)
 	.else
 	mtspr	SPRN_SRR1,r9
+	stb	r10,PACASRR_VALID(r13)
 	.endif
 	ld	r9,_NIP(r1)
 	.if \hsrr
@@ -1829,6 +1846,8 @@ EXC_COMMON_BEGIN(hdecrementer_common)
 	 *
 	 * Be careful to avoid touching the kernel stack.
 	 */
+	li	r10,0
+	stb	r10,PACAHSRR_VALID(r13)
 	ld	r10,PACA_EXGEN+EX_CTR(r13)
 	mtctr	r10
 	mtcrf	0x80,r9
@@ -2663,6 +2682,8 @@ BEGIN_FTR_SECTION
 	ld	r10,PACA_EXGEN+EX_CFAR(r13)
 	mtspr	SPRN_CFAR,r10
 END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
+	li	r10,0
+	stb	r10,PACAHSRR_VALID(r13)
 	ld	r10,PACA_EXGEN+EX_R10(r13)
 	ld	r11,PACA_EXGEN+EX_R11(r13)
 	ld	r12,PACA_EXGEN+EX_R12(r13)
@@ -2835,6 +2856,12 @@ masked_interrupt:
 	ori	r11,r11,PACA_IRQ_HARD_DIS
 	stb	r11,PACAIRQHAPPENED(r13)
 2:	/* done */
+	li	r10,0
+	.if \hsrr
+	stb	r10,PACAHSRR_VALID(r13)
+	.else
+	stb	r10,PACASRR_VALID(r13)
+	.endif
 	ld	r10,PACA_EXGEN+EX_CTR(r13)
 	mtctr	r10
 	mtcrf	0x80,r9
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 2c57ece6671c..44526c157bff 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -103,6 +103,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 	ori	r12,r12,MSR_FP
 	or	r12,r12,r4
 	std	r12,_MSR(r1)
+	li	r4,0
+	stb	r4,PACASRR_VALID(r13)
 #endif
 	li	r4,1
 	stb	r4,THREAD_LOAD_FP(r5)
diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index 409080208a6c..dcac6c74a93c 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -147,7 +147,7 @@ static int kgdb_handle_breakpoint(struct pt_regs *regs)
 		return 0;
 
 	if (*(u32 *)regs->nip == BREAK_INSTR)
-		regs->nip += BREAK_INSTR_SIZE;
+		regs_add_return_ip(regs, BREAK_INSTR_SIZE);
 
 	return 1;
 }
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 660138f6c4b2..a4965a32628a 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -48,7 +48,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 			 * Emulate singlestep (and also recover regs->nip)
 			 * as if there is a nop
 			 */
-			regs->nip += MCOUNT_INSN_SIZE;
+			regs_add_return_ip(regs, MCOUNT_INSN_SIZE);
 			if (unlikely(p->post_handler)) {
 				kcb->kprobe_status = KPROBE_HIT_SSDONE;
 				p->post_handler(p, regs, 0);
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 01ab2163659e..8165ed71ab51 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -178,7 +178,7 @@ static nokprobe_inline void prepare_singlestep(struct kprobe *p, struct pt_regs
 	 * variant as values in regs could play a part in
 	 * if the trap is taken or not
 	 */
-	regs->nip = (unsigned long)p->ainsn.insn;
+	regs_set_return_ip(regs, (unsigned long)p->ainsn.insn);
 }
 
 static nokprobe_inline void save_previous_kprobe(struct kprobe_ctlblk *kcb)
@@ -415,7 +415,7 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 	 * we end up emulating it in kprobe_handler(), which increments the nip
 	 * again.
 	 */
-	regs->nip = orig_ret_address - 4;
+	regs_set_return_ip(regs, orig_ret_address - 4);
 	regs->link = orig_ret_address;
 
 	return 0;
@@ -450,7 +450,7 @@ int kprobe_post_handler(struct pt_regs *regs)
 	}
 
 	/* Adjust nip to after the single-stepped instruction */
-	regs->nip = (unsigned long)cur->addr + len;
+	regs_set_return_ip(regs, (unsigned long)cur->addr + len);
 	regs->msr |= kcb->kprobe_saved_msr;
 
 	/*Restore back the original saved kprobes variables and continue. */
@@ -490,7 +490,7 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 		 * and allow the page fault handler to continue as a
 		 * normal page fault.
 		 */
-		regs->nip = (unsigned long)cur->addr;
+		regs_set_return_ip(regs, (unsigned long)cur->addr);
 		regs->msr &= ~MSR_SINGLESTEP; /* Turn off 'trace' bits */
 		regs->msr |= kcb->kprobe_saved_msr;
 		if (kcb->kprobe_status == KPROBE_REENTER)
@@ -523,7 +523,7 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 		 * zero, try to fix up.
 		 */
 		if ((entry = search_exception_tables(regs->nip)) != NULL) {
-			regs->nip = extable_fixup(entry);
+			regs_set_return_ip(regs, extable_fixup(entry));
 			return 1;
 		}
 
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1e62a70a29aa..ee8e274bbfd1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -98,6 +98,8 @@ static void check_if_tm_restore_required(struct task_struct *tsk)
 	    !test_thread_flag(TIF_RESTORE_TM)) {
 		tsk->thread.ckpt_regs.msr = tsk->thread.regs->msr;
 		set_thread_flag(TIF_RESTORE_TM);
+		local_paca->hsrr_valid = 0;
+		local_paca->srr_valid = 0;
 	}
 }
 
@@ -162,6 +164,8 @@ static void __giveup_fpu(struct task_struct *tsk)
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr &= ~MSR_VSX;
 	tsk->thread.regs->msr = msr;
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 }
 
 void giveup_fpu(struct task_struct *tsk)
@@ -245,6 +249,8 @@ static void __giveup_altivec(struct task_struct *tsk)
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr &= ~MSR_VSX;
 	tsk->thread.regs->msr = msr;
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 }
 
 void giveup_altivec(struct task_struct *tsk)
@@ -560,6 +566,8 @@ void notrace restore_math(struct pt_regs *regs)
 		msr_check_and_clear(new_msr);
 
 		regs->msr |= new_msr | fpexc_mode;
+		local_paca->hsrr_valid = 0;
+		local_paca->srr_valid = 0;
 	}
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
@@ -1284,6 +1292,8 @@ struct task_struct *__switch_to(struct task_struct *prev,
 			atomic_read(&current->mm->context.vas_windows)))
 			asm volatile(PPC_CP_ABORT);
 	}
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 	return last;
@@ -1873,6 +1883,8 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 	current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 }
 EXPORT_SYMBOL(start_thread);
 
@@ -1920,9 +1932,12 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
 	if (val > PR_FP_EXC_PRECISE)
 		return -EINVAL;
 	tsk->thread.fpexc_mode = __pack_fe01(val);
-	if (regs != NULL && (regs->msr & MSR_FP) != 0)
+	if (regs != NULL && (regs->msr & MSR_FP) != 0) {
 		regs->msr = (regs->msr & ~(MSR_FE0|MSR_FE1))
 			| tsk->thread.fpexc_mode;
+		local_paca->hsrr_valid = 0;
+		local_paca->srr_valid = 0;
+	}
 	return 0;
 }
 
@@ -1974,6 +1989,9 @@ int set_endian(struct task_struct *tsk, unsigned int val)
 	else
 		return -EINVAL;
 
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	return 0;
 }
 
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index d126d71ea5bd..adb92f7d00a1 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -46,6 +46,13 @@
 /* This is here deliberately so it's only used in this file */
 void enter_rtas(unsigned long);
 
+static inline void do_enter_rtas(unsigned long args)
+{
+	enter_rtas(args);
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+}
+
 struct rtas_t rtas = {
 	.lock = __ARCH_SPIN_LOCK_UNLOCKED
 };
@@ -384,7 +391,7 @@ static char *__fetch_rtas_last_error(char *altbuf)
 	save_args = rtas.args;
 	rtas.args = err_args;
 
-	enter_rtas(__pa(&rtas.args));
+	do_enter_rtas(__pa(&rtas.args));
 
 	err_args = rtas.args;
 	rtas.args = save_args;
@@ -430,7 +437,7 @@ va_rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret,
 	for (i = 0; i < nret; ++i)
 		args->rets[i] = 0;
 
-	enter_rtas(__pa(args));
+	do_enter_rtas(__pa(args));
 }
 
 void rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret, ...)
@@ -1127,7 +1134,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 	flags = lock_rtas();
 
 	rtas.args = args;
-	enter_rtas(__pa(&rtas.args));
+	do_enter_rtas(__pa(&rtas.args));
 	args = rtas.args;
 
 	/* A -1 return code indicates that the last command couldn't
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 9ded046edb0e..285f036ef3c0 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -214,7 +214,7 @@ static void check_syscall_restart(struct pt_regs *regs, struct k_sigaction *ka,
 			regs->gpr[0] = __NR_restart_syscall;
 		else
 			regs->gpr[3] = regs->orig_gpr3;
-		regs->nip -= 4;
+		regs_add_return_ip(regs, - 4);
 		regs->result = 0;
 	} else {
 		if (trap_is_scv(regs)) {
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 0e3637722e97..6f1309e3c338 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -686,6 +686,10 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
 
 	/* This returns like rt_sigreturn */
 	set_thread_flag(TIF_RESTOREALL);
+
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	return 0;
 }
 
@@ -791,6 +795,10 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		goto badframe;
 
 	set_thread_flag(TIF_RESTOREALL);
+
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	return 0;
 
 badframe:
@@ -866,6 +874,7 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	err |= put_user(regs->gpr[1], (unsigned long __user *)newsp);
 
 	/* Set up "regs" so we "return" to the signal handler. */
+	/* XXX: use set return IP */
 	if (is_elf2_task()) {
 		regs->ctr = (unsigned long) ksig->ka.sa.sa_handler;
 		regs->gpr[12] = regs->ctr;
@@ -898,10 +907,15 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	if (err)
 		goto badframe;
 
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	return 0;
 
 badframe:
 	signal_fault(current, regs, "handle_rt_signal64", frame);
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 
 	return 1;
 }
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 078608ec2e92..13bf62ccfa18 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -123,6 +123,8 @@ SYSCALL_DEFINE0(switch_endian)
 	struct thread_info *ti;
 
 	current->thread.regs->msr ^= MSR_LE;
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
 
 	/*
 	 * Set TIF_RESTOREALL so that r3 isn't clobbered on return to
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 6c62e4e87979..c9a5bfeb8a7c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1032,7 +1032,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
 #endif /* !__LITTLE_ENDIAN__ */
 
 	/* Go to next instruction */
-	regs->nip += 4;
+	regs_add_return_ip(regs, 4);
 }
 #endif /* CONFIG_VSX */
 
@@ -1478,7 +1478,7 @@ static void do_program_check(struct pt_regs *regs)
 
 		if (!(regs->msr & MSR_PR) &&  /* not user-mode */
 		    report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
-			regs->nip += 4;
+			regs_add_return_ip(regs, 4);
 			return;
 		}
 		_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
@@ -1540,7 +1540,7 @@ static void do_program_check(struct pt_regs *regs)
 	if (reason & (REASON_ILLEGAL | REASON_PRIVILEGED)) {
 		switch (emulate_instruction(regs)) {
 		case 0:
-			regs->nip += 4;
+			regs_add_return_ip(regs, 4);
 			emulate_single_step(regs);
 			return;
 		case -EFAULT:
@@ -1595,7 +1595,7 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
 
 	if (fixed == 1) {
 		/* skip over emulated instruction */
-		regs->nip += inst_length(reason);
+		regs_add_return_ip(regs, inst_length(reason));
 		emulate_single_step(regs);
 		return;
 	}
@@ -1753,7 +1753,7 @@ DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
 				pr_err("DSCR based mfspr emulation failed\n");
 				return;
 			}
-			regs->nip += 4;
+			regs_add_return_ip(regs, 4);
 			emulate_single_step(regs);
 		}
 		return;
@@ -2046,7 +2046,7 @@ DEFINE_INTERRUPT_HANDLER(altivec_assist_exception)
 	PPC_WARN_EMULATED(altivec, regs);
 	err = emulate_altivec(regs);
 	if (err == 0) {
-		regs->nip += 4;		/* skip emulated instruction */
+		regs_add_return_ip(regs, 4); /* skip emulated instruction */
 		emulate_single_step(regs);
 		return;
 	}
@@ -2111,7 +2111,7 @@ DEFINE_INTERRUPT_HANDLER(SPEFloatingPointException)
 
 	err = do_spe_mathemu(regs);
 	if (err == 0) {
-		regs->nip += 4;		/* skip emulated instruction */
+		regs_add_return_ip(regs, 4); /* skip emulated instruction */
 		emulate_single_step(regs);
 		return;
 	}
@@ -2142,10 +2142,10 @@ DEFINE_INTERRUPT_HANDLER(SPEFloatingPointRoundException)
 		giveup_spe(current);
 	preempt_enable();
 
-	regs->nip -= 4;
+	regs_add_return_ip(regs, - 4);
 	err = speround_handler(regs);
 	if (err == 0) {
-		regs->nip += 4;		/* skip emulated instruction */
+		regs_add_return_ip(regs, 4); /* skip emulated instruction */
 		emulate_single_step(regs);
 		return;
 	}
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index 54dbefcb4cde..02f8925c7919 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -73,6 +73,8 @@ _GLOBAL(load_up_altivec)
 	addi	r5,r4,THREAD		/* Get THREAD */
 	oris	r12,r12,MSR_VEC@h
 	std	r12,_MSR(r1)
+	li	r4,0
+	stb	r4,PACASRR_VALID(r13)
 #endif
 	li	r4,1
 	stb	r4,THREAD_LOAD_VEC(r5)
@@ -131,6 +133,8 @@ _GLOBAL(load_up_vsx)
 	/* enable use of VSX after return */
 	oris	r12,r12,MSR_VSX@h
 	std	r12,_MSR(r1)
+	li	r4,0
+	stb	r4,PACASRR_VALID(r13)
 	b	fast_interrupt_return_srr
 
 #endif /* CONFIG_VSX */
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 45bda2520755..96505d4bba1c 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -3203,7 +3203,7 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
 	default:
 		WARN_ON_ONCE(1);
 	}
-	regs->nip = next_pc;
+	regs_set_return_ip(regs, next_pc);
 }
 NOKPROBE_SYMBOL(emulate_update_regs);
 
@@ -3480,6 +3480,9 @@ int emulate_step(struct pt_regs *regs, struct ppc_inst instr)
 	unsigned long val;
 	unsigned long ea;
 
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	r = analyse_instr(&op, regs, instr);
 	if (r < 0)
 		return r;
diff --git a/arch/powerpc/math-emu/math.c b/arch/powerpc/math-emu/math.c
index 30b4b69c6941..d92416d78aee 100644
--- a/arch/powerpc/math-emu/math.c
+++ b/arch/powerpc/math-emu/math.c
@@ -453,7 +453,7 @@ do_mathemu(struct pt_regs *regs)
 		break;
 	}
 
-	regs->nip += 4;
+	regs_add_return_ip(regs, 4);
 	return 0;
 
 illegal:
diff --git a/arch/powerpc/platforms/powernv/opal-call.c b/arch/powerpc/platforms/powernv/opal-call.c
index 5cd0f52d258f..1a7bc261d156 100644
--- a/arch/powerpc/platforms/powernv/opal-call.c
+++ b/arch/powerpc/platforms/powernv/opal-call.c
@@ -100,6 +100,9 @@ static int64_t opal_call(int64_t a0, int64_t a1, int64_t a2, int64_t a3,
 	bool mmu = (msr & (MSR_IR|MSR_DR));
 	int64_t ret;
 
+	local_paca->hsrr_valid = 0;
+	local_paca->srr_valid = 0;
+
 	msr &= ~MSR_EE;
 
 	if (unlikely(!mmu))
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 040b9d01c079..af78e7c3108f 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -1072,7 +1072,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
 			ret = get_kernel_nofault(inst, (void *)regs->nip);
 
 		if (!ret && mcheck_handle_load(regs, inst)) {
-			regs->nip += 4;
+			regs_add_return_ip(regs, 4);
 			return 1;
 		}
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/14] powerpc/64: move interrupt return asm to interrupt_64.S
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (3 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 06/14] powerpc/64s: save one more register in the masked interrupt handler Nicholas Piggin
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The next patch would like to move the interrupt return asm to a low
location before the general text, so move it into its own file and
include it via head_64.S.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/head-64.h |   2 +-
 arch/powerpc/kernel/entry_64.S     | 632 ----------------------------
 arch/powerpc/kernel/head_64.S      |   5 +-
 arch/powerpc/kernel/interrupt_64.S | 645 +++++++++++++++++++++++++++++
 4 files changed, 650 insertions(+), 634 deletions(-)
 create mode 100644 arch/powerpc/kernel/interrupt_64.S

diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index 4cb9efa2eb21..242204e12993 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -16,7 +16,7 @@
 	.section ".head.data.\name\()","a",@progbits
 .endm
 .macro use_ftsec name
-	.section ".head.text.\name\()"
+	.section ".head.text.\name\()","ax",@progbits
 .endm
 
 /*
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index b466b3e1bb3f..15720f8661a1 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -32,7 +32,6 @@
 #include <asm/irqflags.h>
 #include <asm/hw_irq.h>
 #include <asm/context_tracking.h>
-#include <asm/tm.h>
 #include <asm/ppc-opcode.h>
 #include <asm/barrier.h>
 #include <asm/export.h>
@@ -48,418 +47,7 @@
 /*
  * System calls.
  */
-	.section	".toc","aw"
-SYS_CALL_TABLE:
-	.tc sys_call_table[TC],sys_call_table
-
-#ifdef CONFIG_COMPAT
-COMPAT_SYS_CALL_TABLE:
-	.tc compat_sys_call_table[TC],compat_sys_call_table
-#endif
-
-/* This value is used to mark exception frames on the stack. */
-exception_marker:
-	.tc	ID_EXC_MARKER[TC],STACK_FRAME_REGS_MARKER
-
 	.section	".text"
-	.align 7
-
-.macro DEBUG_SRR_VALID srr
-#ifdef CONFIG_PPC_RFI_SRR_DEBUG
-	.ifc \srr,srr
-	mfspr	r11,SPRN_SRR0
-	ld	r12,_NIP(r1)
-100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
-	mfspr	r11,SPRN_SRR1
-	ld	r12,_MSR(r1)
-100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
-	.else
-	mfspr	r11,SPRN_HSRR0
-	ld	r12,_NIP(r1)
-100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
-	mfspr	r11,SPRN_HSRR1
-	ld	r12,_MSR(r1)
-100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
-	.endif
-#endif
-.endm
-
-#ifdef CONFIG_PPC_BOOK3S
-.macro system_call_vectored name trapnr
-	.globl system_call_vectored_\name
-system_call_vectored_\name:
-_ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-BEGIN_FTR_SECTION
-	extrdi.	r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */
-	bne	.Ltabort_syscall
-END_FTR_SECTION_IFSET(CPU_FTR_TM)
-#endif
-	SCV_INTERRUPT_TO_KERNEL
-	mr	r10,r1
-	ld	r1,PACAKSAVE(r13)
-	std	r10,0(r1)
-	std	r11,_NIP(r1)
-	std	r12,_MSR(r1)
-	std	r0,GPR0(r1)
-	std	r10,GPR1(r1)
-	std	r2,GPR2(r1)
-	ld	r2,PACATOC(r13)
-	mfcr	r12
-	li	r11,0
-	/* Can we avoid saving r3-r8 in common case? */
-	std	r3,GPR3(r1)
-	std	r4,GPR4(r1)
-	std	r5,GPR5(r1)
-	std	r6,GPR6(r1)
-	std	r7,GPR7(r1)
-	std	r8,GPR8(r1)
-	/* Zero r9-r12, this should only be required when restoring all GPRs */
-	std	r11,GPR9(r1)
-	std	r11,GPR10(r1)
-	std	r11,GPR11(r1)
-	std	r11,GPR12(r1)
-	std	r9,GPR13(r1)
-	SAVE_NVGPRS(r1)
-	std	r11,_XER(r1)
-	std	r11,_LINK(r1)
-	std	r11,_CTR(r1)
-
-	li	r11,\trapnr
-	std	r11,_TRAP(r1)
-	std	r12,_CCR(r1)
-	addi	r10,r1,STACK_FRAME_OVERHEAD
-	ld	r11,exception_marker@toc(r2)
-	std	r11,-16(r10)		/* "regshere" marker */
-
-BEGIN_FTR_SECTION
-	HMT_MEDIUM
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
-	ENTER_KERNEL_SECURITY_FALLBACK
-
-	/*
-	 * scv enters with MSR[EE]=1 and is immediately considered soft-masked.
-	 * The entry vector already sets PACAIRQSOFTMASK to IRQS_ALL_DISABLED,
-	 * and interrupts may be masked and pending already.
-	 * system_call_exception() will call trace_hardirqs_off() which means
-	 * interrupts could already have been blocked before trace_hardirqs_off,
-	 * but this is the best we can do.
-	 */
-
-	/* Calling convention has r9 = orig r0, r10 = regs */
-	mr	r9,r0
-	bl	system_call_exception
-
-.Lsyscall_vectored_\name\()_exit:
-	addi    r4,r1,STACK_FRAME_OVERHEAD
-	li	r5,1 /* scv */
-	bl	syscall_exit_prepare
-
-	EXIT_KERNEL_SECURITY_FALLBACK
-
-	ld	r2,_CCR(r1)
-	ld	r4,_NIP(r1)
-	ld	r5,_MSR(r1)
-
-BEGIN_FTR_SECTION
-	stdcx.	r0,0,r1			/* to clear the reservation */
-END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
-
-BEGIN_FTR_SECTION
-	HMT_MEDIUM_LOW
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
-	cmpdi	r3,0
-	bne	.Lsyscall_vectored_\name\()_restore_regs
-
-	/* rfscv returns with LR->NIA and CTR->MSR */
-	mtlr	r4
-	mtctr	r5
-
-	/* Could zero these as per ABI, but we may consider a stricter ABI
-	 * which preserves these if libc implementations can benefit, so
-	 * restore them for now until further measurement is done. */
-	ld	r0,GPR0(r1)
-	ld	r4,GPR4(r1)
-	ld	r5,GPR5(r1)
-	ld	r6,GPR6(r1)
-	ld	r7,GPR7(r1)
-	ld	r8,GPR8(r1)
-	/* Zero volatile regs that may contain sensitive kernel data */
-	li	r9,0
-	li	r10,0
-	li	r11,0
-	li	r12,0
-	mtspr	SPRN_XER,r0
-
-	/*
-	 * We don't need to restore AMR on the way back to userspace for KUAP.
-	 * The value of AMR only matters while we're in the kernel.
-	 */
-	mtcr	r2
-	ld	r2,GPR2(r1)
-	ld	r3,GPR3(r1)
-	ld	r13,GPR13(r1)
-	ld	r1,GPR1(r1)
-	RFSCV_TO_USER
-	b	.	/* prevent speculative execution */
-
-.Lsyscall_vectored_\name\()_restore_regs:
-	li	r3,0
-	mtmsrd	r3,1
-	mtspr	SPRN_SRR0,r4
-	mtspr	SPRN_SRR1,r5
-
-	ld	r3,_CTR(r1)
-	ld	r4,_LINK(r1)
-	ld	r5,_XER(r1)
-
-	REST_NVGPRS(r1)
-	ld	r0,GPR0(r1)
-	mtcr	r2
-	mtctr	r3
-	mtlr	r4
-	mtspr	SPRN_XER,r5
-	REST_10GPRS(2, r1)
-	REST_2GPRS(12, r1)
-	ld	r1,GPR1(r1)
-	RFI_TO_USER
-.endm
-
-system_call_vectored common 0x3000
-/*
- * We instantiate another entry copy for the SIGILL variant, with TRAP=0x7ff0
- * which is tested by system_call_exception when r0 is -1 (as set by vector
- * entry code).
- */
-system_call_vectored sigill 0x7ff0
-
-
-/*
- * Entered via kernel return set up by kernel/sstep.c, must match entry regs
- */
-	.globl system_call_vectored_emulate
-system_call_vectored_emulate:
-_ASM_NOKPROBE_SYMBOL(system_call_vectored_emulate)
-	li	r10,IRQS_ALL_DISABLED
-	stb	r10,PACAIRQSOFTMASK(r13)
-	b	system_call_vectored_common
-#endif
-
-	.balign IFETCH_ALIGN_BYTES
-	.globl system_call_common_real
-system_call_common_real:
-	ld	r10,PACAKMSR(r13)	/* get MSR value for kernel */
-	mtmsrd	r10
-
-	.balign IFETCH_ALIGN_BYTES
-	.globl system_call_common
-system_call_common:
-_ASM_NOKPROBE_SYMBOL(system_call_common)
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-BEGIN_FTR_SECTION
-	extrdi.	r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */
-	bne	.Ltabort_syscall
-END_FTR_SECTION_IFSET(CPU_FTR_TM)
-#endif
-	mr	r10,r1
-	ld	r1,PACAKSAVE(r13)
-	std	r10,0(r1)
-	std	r11,_NIP(r1)
-	std	r12,_MSR(r1)
-	std	r0,GPR0(r1)
-	std	r10,GPR1(r1)
-	std	r2,GPR2(r1)
-#ifdef CONFIG_PPC_FSL_BOOK3E
-START_BTB_FLUSH_SECTION
-	BTB_FLUSH(r10)
-END_BTB_FLUSH_SECTION
-#endif
-	ld	r2,PACATOC(r13)
-	mfcr	r12
-	li	r11,0
-	/* Can we avoid saving r3-r8 in common case? */
-	std	r3,GPR3(r1)
-	std	r4,GPR4(r1)
-	std	r5,GPR5(r1)
-	std	r6,GPR6(r1)
-	std	r7,GPR7(r1)
-	std	r8,GPR8(r1)
-	/* Zero r9-r12, this should only be required when restoring all GPRs */
-	std	r11,GPR9(r1)
-	std	r11,GPR10(r1)
-	std	r11,GPR11(r1)
-	std	r11,GPR12(r1)
-	std	r9,GPR13(r1)
-	SAVE_NVGPRS(r1)
-	std	r11,_XER(r1)
-	std	r11,_CTR(r1)
-	mflr	r10
-
-	/*
-	 * This clears CR0.SO (bit 28), which is the error indication on
-	 * return from this system call.
-	 */
-	rldimi	r12,r11,28,(63-28)
-	li	r11,0xc00
-	std	r10,_LINK(r1)
-	std	r11,_TRAP(r1)
-	std	r12,_CCR(r1)
-	addi	r10,r1,STACK_FRAME_OVERHEAD
-	ld	r11,exception_marker@toc(r2)
-	std	r11,-16(r10)		/* "regshere" marker */
-
-#ifdef CONFIG_PPC_BOOK3S
-	li	r11,1
-	stb	r11,PACASRR_VALID(r13)
-#endif
-
-	/*
-	 * We always enter kernel from userspace with irq soft-mask enabled and
-	 * nothing pending. system_call_exception() will call
-	 * trace_hardirqs_off().
-	 */
-	li	r11,IRQS_ALL_DISABLED
-	li	r12,PACA_IRQ_HARD_DIS
-	stb	r11,PACAIRQSOFTMASK(r13)
-	stb	r12,PACAIRQHAPPENED(r13)
-
-	ENTER_KERNEL_SECURITY_FALLBACK
-
-	/* Calling convention has r9 = orig r0, r10 = regs */
-	mr	r9,r0
-	bl	system_call_exception
-
-.Lsyscall_exit:
-	addi    r4,r1,STACK_FRAME_OVERHEAD
-	li	r5,0 /* !scv */
-	bl	syscall_exit_prepare
-
-	EXIT_KERNEL_SECURITY_FALLBACK
-
-	ld	r2,_CCR(r1)
-	ld	r6,_LINK(r1)
-	mtlr	r6
-
-#ifdef CONFIG_PPC_BOOK3S
-	lbz	r4,PACASRR_VALID(r13)
-	cmpdi	r4,0
-	bne	1f
-	li	r4,0
-	stb	r4,PACASRR_VALID(r13)
-#endif
-	ld	r4,_NIP(r1)
-	ld	r5,_MSR(r1)
-	mtspr	SPRN_SRR0,r4
-	mtspr	SPRN_SRR1,r5
-1:
-	DEBUG_SRR_VALID srr
-
-BEGIN_FTR_SECTION
-	stdcx.	r0,0,r1			/* to clear the reservation */
-END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
-
-	cmpdi	r3,0
-	bne	.Lsyscall_restore_regs
-	/* Zero volatile regs that may contain sensitive kernel data */
-	li	r0,0
-	li	r4,0
-	li	r5,0
-	li	r6,0
-	li	r7,0
-	li	r8,0
-	li	r9,0
-	li	r10,0
-	li	r11,0
-	li	r12,0
-	mtctr	r0
-	mtspr	SPRN_XER,r0
-.Lsyscall_restore_regs_cont:
-
-BEGIN_FTR_SECTION
-	HMT_MEDIUM_LOW
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
-	/*
-	 * We don't need to restore AMR on the way back to userspace for KUAP.
-	 * The value of AMR only matters while we're in the kernel.
-	 */
-	mtcr	r2
-	ld	r2,GPR2(r1)
-	ld	r3,GPR3(r1)
-	ld	r13,GPR13(r1)
-	ld	r1,GPR1(r1)
-	RFI_TO_USER
-	b	.	/* prevent speculative execution */
-
-.Lsyscall_restore_regs:
-	ld	r3,_CTR(r1)
-	ld	r4,_XER(r1)
-	REST_NVGPRS(r1)
-	mtctr	r3
-	mtspr	SPRN_XER,r4
-	ld	r0,GPR0(r1)
-	REST_8GPRS(4, r1)
-	ld	r12,GPR12(r1)
-	b	.Lsyscall_restore_regs_cont
-
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-.Ltabort_syscall:
-	/* Firstly we need to enable TM in the kernel */
-	mfmsr	r10
-	li	r9, 1
-	rldimi	r10, r9, MSR_TM_LG, 63-MSR_TM_LG
-	mtmsrd	r10, 0
-
-	/* tabort, this dooms the transaction, nothing else */
-	li	r9, (TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT)
-	TABORT(R9)
-
-	/*
-	 * Return directly to userspace. We have corrupted user register state,
-	 * but userspace will never see that register state. Execution will
-	 * resume after the tbegin of the aborted transaction with the
-	 * checkpointed register state.
-	 */
-	li	r9, MSR_RI
-	andc	r10, r10, r9
-	mtmsrd	r10, 1
-	mtspr	SPRN_SRR0, r11
-	mtspr	SPRN_SRR1, r12
-	RFI_TO_USER
-	b	.	/* prevent speculative execution */
-#endif
-
-#ifdef CONFIG_PPC_BOOK3S
-_GLOBAL(ret_from_fork_scv)
-	bl	schedule_tail
-	REST_NVGPRS(r1)
-	li	r3,0	/* fork() return value */
-	b	.Lsyscall_vectored_common_exit
-#endif
-
-_GLOBAL(ret_from_fork)
-	bl	schedule_tail
-	REST_NVGPRS(r1)
-	li	r3,0	/* fork() return value */
-	b	.Lsyscall_exit
-
-_GLOBAL(ret_from_kernel_thread)
-	bl	schedule_tail
-	REST_NVGPRS(r1)
-	mtctr	r14
-	mr	r3,r15
-#ifdef PPC64_ELF_ABI_v2
-	mr	r12,r14
-#endif
-	bctrl
-	li	r3,0
-	b	.Lsyscall_exit
 
 #ifdef CONFIG_PPC_BOOK3S_64
 
@@ -676,226 +264,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	addi	r1,r1,SWITCH_FRAME_SIZE
 	blr
 
-	/*
-	 * If MSR EE/RI was never enabled, IRQs not reconciled, NVGPRs not
-	 * touched, no exit work created, then this can be used.
-	 */
-	.balign IFETCH_ALIGN_BYTES
-	.globl fast_interrupt_return_srr
-fast_interrupt_return_srr:
-_ASM_NOKPROBE_SYMBOL(fast_interrupt_return_srr)
-	kuap_check_amr r3, r4
-	ld	r5,_MSR(r1)
-	andi.	r0,r5,MSR_PR
-#ifdef CONFIG_PPC_BOOK3S
-	beq	1f
-	kuap_user_restore r3, r4
-	b	.Lfast_user_interrupt_return_srr
-1:
-	andi.	r0,r5,MSR_RI
-	beq-	2f
-	kuap_kernel_restore r3, r4
-	li	r3,0 /* 0 return value, no EMULATE_STACK_STORE */
-	b	.Lfast_kernel_interrupt_return_srr
-2:
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	unrecoverable_exception
-	b	. /* should not get here */
-#else
-	bne	.Lfast_user_interrupt_return_srr
-	b	.Lfast_kernel_interrupt_return_srr
-#endif
-
-.macro interrupt_return_macro srr
-	.balign IFETCH_ALIGN_BYTES
-	.globl interrupt_return_\srr
-interrupt_return_\srr\():
-_ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
-	ld	r4,_MSR(r1)
-	andi.	r0,r4,MSR_PR
-	beq	.Lkernel_interrupt_return_\srr
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	interrupt_exit_user_prepare
-	cmpdi	r3,0
-	bne-	.Lrestore_nvgprs_\srr
-.Lfast_user_interrupt_return_\srr\():
-	EXIT_KERNEL_SECURITY_FALLBACK
-
-BEGIN_FTR_SECTION
-	ld	r10,_PPR(r1)
-	mtspr	SPRN_PPR,r10
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
-#ifdef CONFIG_PPC_BOOK3S
-	.ifc \srr,srr
-	lbz	r4,PACASRR_VALID(r13)
-	.else
-	lbz	r4,PACAHSRR_VALID(r13)
-	.endif
-	cmpdi	r4,0
-	li	r4,0
-	bne	1f
-#endif
-	ld	r11,_NIP(r1)
-	ld	r12,_MSR(r1)
-	.ifc \srr,srr
-	mtspr	SPRN_SRR0,r11
-	mtspr	SPRN_SRR1,r12
-1:
-#ifdef CONFIG_PPC_BOOK3S
-	stb	r4,PACASRR_VALID(r13)
-#endif
-	.else
-	mtspr	SPRN_HSRR0,r11
-	mtspr	SPRN_HSRR1,r12
-1:
-#ifdef CONFIG_PPC_BOOK3S
-	stb	r4,PACAHSRR_VALID(r13)
-#endif
-	.endif
-	DEBUG_SRR_VALID \srr
-
-BEGIN_FTR_SECTION
-	stdcx.	r0,0,r1		/* to clear the reservation */
-FTR_SECTION_ELSE
-	ldarx	r0,0,r1
-ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
-
-	ld	r3,_CCR(r1)
-	ld	r4,_LINK(r1)
-	ld	r5,_CTR(r1)
-	ld	r6,_XER(r1)
-	li	r0,0
-
-	REST_4GPRS(7, r1)
-	REST_2GPRS(11, r1)
-	REST_GPR(13, r1)
-
-	mtcr	r3
-	mtlr	r4
-	mtctr	r5
-	mtspr	SPRN_XER,r6
-
-	REST_4GPRS(2, r1)
-	REST_GPR(6, r1)
-	REST_GPR(0, r1)
-	REST_GPR(1, r1)
-	.ifc \srr,srr
-	RFI_TO_USER
-	.else
-	HRFI_TO_USER
-	.endif
-	b	.	/* prevent speculative execution */
-
-.Lrestore_nvgprs_\srr\():
-	REST_NVGPRS(r1)
-	b	.Lfast_user_interrupt_return_\srr
-
-	.balign IFETCH_ALIGN_BYTES
-.Lkernel_interrupt_return_\srr\():
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	interrupt_exit_kernel_prepare
-
-.Lfast_kernel_interrupt_return_\srr\():
-	cmpdi	cr1,r3,0
-#ifdef CONFIG_PPC_BOOK3S
-	.ifc \srr,srr
-	lbz	r4,PACASRR_VALID(r13)
-	.else
-	lbz	r4,PACAHSRR_VALID(r13)
-	.endif
-	cmpdi	r4,0
-	li	r4,0
-	bne	1f
-#endif
-	ld	r11,_NIP(r1)
-	ld	r12,_MSR(r1)
-	.ifc \srr,srr
-	mtspr	SPRN_SRR0,r11
-	mtspr	SPRN_SRR1,r12
-1:
-#ifdef CONFIG_PPC_BOOK3S
-	stb	r4,PACASRR_VALID(r13)
-#endif
-	.else
-	mtspr	SPRN_HSRR0,r11
-	mtspr	SPRN_HSRR1,r12
-1:
-#ifdef CONFIG_PPC_BOOK3S
-	stb	r4,PACAHSRR_VALID(r13)
-#endif
-	.endif
-	DEBUG_SRR_VALID \srr
-
-BEGIN_FTR_SECTION
-	stdcx.	r0,0,r1		/* to clear the reservation */
-FTR_SECTION_ELSE
-	ldarx	r0,0,r1
-ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
-
-	ld	r3,_LINK(r1)
-	ld	r4,_CTR(r1)
-	ld	r5,_XER(r1)
-	ld	r6,_CCR(r1)
-	li	r0,0
-
-	REST_4GPRS(7, r1)
-	REST_2GPRS(11, r1)
-
-	mtlr	r3
-	mtctr	r4
-	mtspr	SPRN_XER,r5
-
-	/*
-	 * Leaving a stale exception_marker on the stack can confuse
-	 * the reliable stack unwinder later on. Clear it.
-	 */
-	std	r0,STACK_FRAME_OVERHEAD-16(r1)
-
-	REST_4GPRS(2, r1)
-
-	bne-	cr1,1f /* emulate stack store */
-	mtcr	r6
-	REST_GPR(6, r1)
-	REST_GPR(0, r1)
-	REST_GPR(1, r1)
-	.ifc \srr,srr
-	RFI_TO_KERNEL
-	.else
-	HRFI_TO_KERNEL
-	.endif
-	b	.	/* prevent speculative execution */
-
-1:	/*
-	 * Emulate stack store with update. New r1 value was already calculated
-	 * and updated in our interrupt regs by emulate_loadstore, but we can't
-	 * store the previous value of r1 to the stack before re-loading our
-	 * registers from it, otherwise they could be clobbered.  Use
-	 * PACA_EXGEN as temporary storage to hold the store data, as
-	 * interrupts are disabled here so it won't be clobbered.
-	 */
-	mtcr	r6
-	std	r9,PACA_EXGEN+0(r13)
-	addi	r9,r1,INT_FRAME_SIZE /* get original r1 */
-	REST_GPR(6, r1)
-	REST_GPR(0, r1)
-	REST_GPR(1, r1)
-	std	r9,0(r1) /* perform store component of stdu */
-	ld	r9,PACA_EXGEN+0(r13)
-
-	.ifc \srr,srr
-	RFI_TO_KERNEL
-	.else
-	HRFI_TO_KERNEL
-	.endif
-	b	.	/* prevent speculative execution */
-.endm
-
-interrupt_return_macro srr
-#ifdef CONFIG_PPC_BOOK3S
-interrupt_return_macro hsrr
-#endif
-
 #ifdef CONFIG_PPC_RTAS
 /*
  * On CHRP, the Run-Time Abstraction Services (RTAS) have to be
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index ece7f97bafff..d49c25daf1c0 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -194,8 +194,9 @@ CLOSE_FIXED_SECTION(first_256B)
 
 /* This value is used to mark exception frames on the stack. */
 	.section ".toc","aw"
+/* This value is used to mark exception frames on the stack. */
 exception_marker:
-	.tc	ID_72656773_68657265[TC],0x7265677368657265
+	.tc	ID_EXC_MARKER[TC],STACK_FRAME_REGS_MARKER
 	.previous
 
 /*
@@ -211,6 +212,8 @@ OPEN_TEXT_SECTION(0x100)
 
 USE_TEXT_SECTION()
 
+#include "interrupt_64.S"
+
 #ifdef CONFIG_PPC_BOOK3E
 /*
  * The booting_thread_hwid holds the thread id we want to boot in cpu
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
new file mode 100644
index 000000000000..8a2b8188108b
--- /dev/null
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -0,0 +1,645 @@
+#include <asm/ppc_asm.h>
+#include <asm/head-64.h>
+#include <asm/bug.h>
+#include <asm/hw_irq.h>
+#include <asm/tm.h>
+#include <asm/mmu.h>
+#include <asm/asm-offsets.h>
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/exception-64s.h>
+#else
+#include <asm/exception-64e.h>
+#endif
+#include <asm/ptrace.h>
+#include <asm/head-64.h>
+#include <asm/feature-fixups.h>
+#include <asm/kup.h>
+
+	.section	".toc","aw"
+SYS_CALL_TABLE:
+	.tc sys_call_table[TC],sys_call_table
+
+#ifdef CONFIG_COMPAT
+COMPAT_SYS_CALL_TABLE:
+	.tc compat_sys_call_table[TC],compat_sys_call_table
+#endif
+	.previous
+
+	.align 7
+
+.macro DEBUG_SRR_VALID srr
+#ifdef CONFIG_PPC_RFI_SRR_DEBUG
+	.ifc \srr,srr
+	mfspr	r11,SPRN_SRR0
+	ld	r12,_NIP(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	mfspr	r11,SPRN_SRR1
+	ld	r12,_MSR(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	.else
+	mfspr	r11,SPRN_HSRR0
+	ld	r12,_NIP(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	mfspr	r11,SPRN_HSRR1
+	ld	r12,_MSR(r1)
+100:	tdne	r11,r12
+	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	.endif
+#endif
+.endm
+
+#ifdef CONFIG_PPC_BOOK3S
+.macro system_call_vectored name trapnr
+	.globl system_call_vectored_\name
+system_call_vectored_\name:
+_ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+BEGIN_FTR_SECTION
+	extrdi.	r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */
+	bne	.Ltabort_syscall
+END_FTR_SECTION_IFSET(CPU_FTR_TM)
+#endif
+	SCV_INTERRUPT_TO_KERNEL
+	mr	r10,r1
+	ld	r1,PACAKSAVE(r13)
+	std	r10,0(r1)
+	std	r11,_NIP(r1)
+	std	r12,_MSR(r1)
+	std	r0,GPR0(r1)
+	std	r10,GPR1(r1)
+	std	r2,GPR2(r1)
+	ld	r2,PACATOC(r13)
+	mfcr	r12
+	li	r11,0
+	/* Can we avoid saving r3-r8 in common case? */
+	std	r3,GPR3(r1)
+	std	r4,GPR4(r1)
+	std	r5,GPR5(r1)
+	std	r6,GPR6(r1)
+	std	r7,GPR7(r1)
+	std	r8,GPR8(r1)
+	/* Zero r9-r12, this should only be required when restoring all GPRs */
+	std	r11,GPR9(r1)
+	std	r11,GPR10(r1)
+	std	r11,GPR11(r1)
+	std	r11,GPR12(r1)
+	std	r9,GPR13(r1)
+	SAVE_NVGPRS(r1)
+	std	r11,_XER(r1)
+	std	r11,_LINK(r1)
+	std	r11,_CTR(r1)
+
+	li	r11,\trapnr
+	std	r11,_TRAP(r1)
+	std	r12,_CCR(r1)
+	addi	r10,r1,STACK_FRAME_OVERHEAD
+	ld	r11,exception_marker@toc(r2)
+	std	r11,-16(r10)		/* "regshere" marker */
+
+BEGIN_FTR_SECTION
+	HMT_MEDIUM
+END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+
+	ENTER_KERNEL_SECURITY_FALLBACK
+
+	/*
+	 * scv enters with MSR[EE]=1 and is immediately considered soft-masked.
+	 * The entry vector already sets PACAIRQSOFTMASK to IRQS_ALL_DISABLED,
+	 * and interrupts may be masked and pending already.
+	 * system_call_exception() will call trace_hardirqs_off() which means
+	 * interrupts could already have been blocked before trace_hardirqs_off,
+	 * but this is the best we can do.
+	 */
+
+	/* Calling convention has r9 = orig r0, r10 = regs */
+	mr	r9,r0
+	bl	system_call_exception
+
+.Lsyscall_vectored_\name\()_exit:
+	addi    r4,r1,STACK_FRAME_OVERHEAD
+	li	r5,1 /* scv */
+	bl	syscall_exit_prepare
+
+	EXIT_KERNEL_SECURITY_FALLBACK
+
+	ld	r2,_CCR(r1)
+	ld	r4,_NIP(r1)
+	ld	r5,_MSR(r1)
+
+BEGIN_FTR_SECTION
+	stdcx.	r0,0,r1			/* to clear the reservation */
+END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
+
+BEGIN_FTR_SECTION
+	HMT_MEDIUM_LOW
+END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+
+	cmpdi	r3,0
+	bne	.Lsyscall_vectored_\name\()_restore_regs
+
+	/* rfscv returns with LR->NIA and CTR->MSR */
+	mtlr	r4
+	mtctr	r5
+
+	/* Could zero these as per ABI, but we may consider a stricter ABI
+	 * which preserves these if libc implementations can benefit, so
+	 * restore them for now until further measurement is done. */
+	ld	r0,GPR0(r1)
+	ld	r4,GPR4(r1)
+	ld	r5,GPR5(r1)
+	ld	r6,GPR6(r1)
+	ld	r7,GPR7(r1)
+	ld	r8,GPR8(r1)
+	/* Zero volatile regs that may contain sensitive kernel data */
+	li	r9,0
+	li	r10,0
+	li	r11,0
+	li	r12,0
+	mtspr	SPRN_XER,r0
+
+	/*
+	 * We don't need to restore AMR on the way back to userspace for KUAP.
+	 * The value of AMR only matters while we're in the kernel.
+	 */
+	mtcr	r2
+	ld	r2,GPR2(r1)
+	ld	r3,GPR3(r1)
+	ld	r13,GPR13(r1)
+	ld	r1,GPR1(r1)
+	RFSCV_TO_USER
+	b	.	/* prevent speculative execution */
+
+.Lsyscall_vectored_\name\()_restore_regs:
+	li	r3,0
+	mtmsrd	r3,1
+	mtspr	SPRN_SRR0,r4
+	mtspr	SPRN_SRR1,r5
+
+	ld	r3,_CTR(r1)
+	ld	r4,_LINK(r1)
+	ld	r5,_XER(r1)
+
+	REST_NVGPRS(r1)
+	ld	r0,GPR0(r1)
+	mtcr	r2
+	mtctr	r3
+	mtlr	r4
+	mtspr	SPRN_XER,r5
+	REST_10GPRS(2, r1)
+	REST_2GPRS(12, r1)
+	ld	r1,GPR1(r1)
+	RFI_TO_USER
+.endm
+
+system_call_vectored common 0x3000
+/*
+ * We instantiate another entry copy for the SIGILL variant, with TRAP=0x7ff0
+ * which is tested by system_call_exception when r0 is -1 (as set by vector
+ * entry code).
+ */
+system_call_vectored sigill 0x7ff0
+
+
+/*
+ * Entered via kernel return set up by kernel/sstep.c, must match entry regs
+ */
+	.globl system_call_vectored_emulate
+system_call_vectored_emulate:
+_ASM_NOKPROBE_SYMBOL(system_call_vectored_emulate)
+	li	r10,IRQS_ALL_DISABLED
+	stb	r10,PACAIRQSOFTMASK(r13)
+	b	system_call_vectored_common
+#endif
+
+	.balign IFETCH_ALIGN_BYTES
+	.globl system_call_common_real
+system_call_common_real:
+	ld	r10,PACAKMSR(r13)	/* get MSR value for kernel */
+	mtmsrd	r10
+
+	.balign IFETCH_ALIGN_BYTES
+	.globl system_call_common
+system_call_common:
+_ASM_NOKPROBE_SYMBOL(system_call_common)
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+BEGIN_FTR_SECTION
+	extrdi.	r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */
+	bne	.Ltabort_syscall
+END_FTR_SECTION_IFSET(CPU_FTR_TM)
+#endif
+	mr	r10,r1
+	ld	r1,PACAKSAVE(r13)
+	std	r10,0(r1)
+	std	r11,_NIP(r1)
+	std	r12,_MSR(r1)
+	std	r0,GPR0(r1)
+	std	r10,GPR1(r1)
+	std	r2,GPR2(r1)
+#ifdef CONFIG_PPC_FSL_BOOK3E
+START_BTB_FLUSH_SECTION
+	BTB_FLUSH(r10)
+END_BTB_FLUSH_SECTION
+#endif
+	ld	r2,PACATOC(r13)
+	mfcr	r12
+	li	r11,0
+	/* Can we avoid saving r3-r8 in common case? */
+	std	r3,GPR3(r1)
+	std	r4,GPR4(r1)
+	std	r5,GPR5(r1)
+	std	r6,GPR6(r1)
+	std	r7,GPR7(r1)
+	std	r8,GPR8(r1)
+	/* Zero r9-r12, this should only be required when restoring all GPRs */
+	std	r11,GPR9(r1)
+	std	r11,GPR10(r1)
+	std	r11,GPR11(r1)
+	std	r11,GPR12(r1)
+	std	r9,GPR13(r1)
+	SAVE_NVGPRS(r1)
+	std	r11,_XER(r1)
+	std	r11,_CTR(r1)
+	mflr	r10
+
+	/*
+	 * This clears CR0.SO (bit 28), which is the error indication on
+	 * return from this system call.
+	 */
+	rldimi	r12,r11,28,(63-28)
+	li	r11,0xc00
+	std	r10,_LINK(r1)
+	std	r11,_TRAP(r1)
+	std	r12,_CCR(r1)
+	addi	r10,r1,STACK_FRAME_OVERHEAD
+	ld	r11,exception_marker@toc(r2)
+	std	r11,-16(r10)		/* "regshere" marker */
+
+#ifdef CONFIG_PPC_BOOK3S
+	li	r11,1
+	stb	r11,PACASRR_VALID(r13)
+#endif
+
+	/*
+	 * We always enter kernel from userspace with irq soft-mask enabled and
+	 * nothing pending. system_call_exception() will call
+	 * trace_hardirqs_off().
+	 */
+	li	r11,IRQS_ALL_DISABLED
+	li	r12,PACA_IRQ_HARD_DIS
+	stb	r11,PACAIRQSOFTMASK(r13)
+	stb	r12,PACAIRQHAPPENED(r13)
+
+	ENTER_KERNEL_SECURITY_FALLBACK
+
+	/* Calling convention has r9 = orig r0, r10 = regs */
+	mr	r9,r0
+	bl	system_call_exception
+
+.Lsyscall_exit:
+	addi    r4,r1,STACK_FRAME_OVERHEAD
+	li	r5,0 /* !scv */
+	bl	syscall_exit_prepare
+
+	EXIT_KERNEL_SECURITY_FALLBACK
+
+	ld	r2,_CCR(r1)
+	ld	r6,_LINK(r1)
+	mtlr	r6
+
+#ifdef CONFIG_PPC_BOOK3S
+	lbz	r4,PACASRR_VALID(r13)
+	cmpdi	r4,0
+	bne	1f
+	li	r4,0
+	stb	r4,PACASRR_VALID(r13)
+#endif
+	ld	r4,_NIP(r1)
+	ld	r5,_MSR(r1)
+	mtspr	SPRN_SRR0,r4
+	mtspr	SPRN_SRR1,r5
+1:
+	DEBUG_SRR_VALID srr
+
+BEGIN_FTR_SECTION
+	stdcx.	r0,0,r1			/* to clear the reservation */
+END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
+
+	cmpdi	r3,0
+	bne	.Lsyscall_restore_regs
+	/* Zero volatile regs that may contain sensitive kernel data */
+	li	r0,0
+	li	r4,0
+	li	r5,0
+	li	r6,0
+	li	r7,0
+	li	r8,0
+	li	r9,0
+	li	r10,0
+	li	r11,0
+	li	r12,0
+	mtctr	r0
+	mtspr	SPRN_XER,r0
+.Lsyscall_restore_regs_cont:
+
+BEGIN_FTR_SECTION
+	HMT_MEDIUM_LOW
+END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+
+	/*
+	 * We don't need to restore AMR on the way back to userspace for KUAP.
+	 * The value of AMR only matters while we're in the kernel.
+	 */
+	mtcr	r2
+	ld	r2,GPR2(r1)
+	ld	r3,GPR3(r1)
+	ld	r13,GPR13(r1)
+	ld	r1,GPR1(r1)
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
+
+.Lsyscall_restore_regs:
+	ld	r3,_CTR(r1)
+	ld	r4,_XER(r1)
+	REST_NVGPRS(r1)
+	mtctr	r3
+	mtspr	SPRN_XER,r4
+	ld	r0,GPR0(r1)
+	REST_8GPRS(4, r1)
+	ld	r12,GPR12(r1)
+	b	.Lsyscall_restore_regs_cont
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+.Ltabort_syscall:
+	/* Firstly we need to enable TM in the kernel */
+	mfmsr	r10
+	li	r9, 1
+	rldimi	r10, r9, MSR_TM_LG, 63-MSR_TM_LG
+	mtmsrd	r10, 0
+
+	/* tabort, this dooms the transaction, nothing else */
+	li	r9, (TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT)
+	TABORT(R9)
+
+	/*
+	 * Return directly to userspace. We have corrupted user register state,
+	 * but userspace will never see that register state. Execution will
+	 * resume after the tbegin of the aborted transaction with the
+	 * checkpointed register state.
+	 */
+	li	r9, MSR_RI
+	andc	r10, r10, r9
+	mtmsrd	r10, 1
+	mtspr	SPRN_SRR0, r11
+	mtspr	SPRN_SRR1, r12
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S
+_GLOBAL(ret_from_fork_scv)
+	bl	schedule_tail
+	REST_NVGPRS(r1)
+	li	r3,0	/* fork() return value */
+	b	.Lsyscall_vectored_common_exit
+#endif
+
+_GLOBAL(ret_from_fork)
+	bl	schedule_tail
+	REST_NVGPRS(r1)
+	li	r3,0	/* fork() return value */
+	b	.Lsyscall_exit
+
+_GLOBAL(ret_from_kernel_thread)
+	bl	schedule_tail
+	REST_NVGPRS(r1)
+	mtctr	r14
+	mr	r3,r15
+#ifdef PPC64_ELF_ABI_v2
+	mr	r12,r14
+#endif
+	bctrl
+	li	r3,0
+	b	.Lsyscall_exit
+
+	/*
+	 * If MSR EE/RI was never enabled, IRQs not reconciled, NVGPRs not
+	 * touched, no exit work created, then this can be used.
+	 */
+	.balign IFETCH_ALIGN_BYTES
+	.globl fast_interrupt_return_srr
+fast_interrupt_return_srr:
+_ASM_NOKPROBE_SYMBOL(fast_interrupt_return_srr)
+	kuap_check_amr r3, r4
+	ld	r5,_MSR(r1)
+	andi.	r0,r5,MSR_PR
+#ifdef CONFIG_PPC_BOOK3S
+	beq	1f
+	kuap_user_restore r3, r4
+	b	.Lfast_user_interrupt_return_srr
+1:
+	andi.	r0,r5,MSR_RI
+	beq-	2f
+	kuap_kernel_restore r3, r4
+	li	r3,0 /* 0 return value, no EMULATE_STACK_STORE */
+	b	.Lfast_kernel_interrupt_return_srr
+2:
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	unrecoverable_exception
+	b	. /* should not get here */
+#else
+	bne	.Lfast_user_interrupt_return_srr
+	b	.Lfast_kernel_interrupt_return_srr
+#endif
+
+.macro interrupt_return_macro srr
+	.balign IFETCH_ALIGN_BYTES
+	.globl interrupt_return_\srr
+interrupt_return_\srr\():
+_ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
+	ld	r4,_MSR(r1)
+	andi.	r0,r4,MSR_PR
+	beq	.Lkernel_interrupt_return_\srr
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	interrupt_exit_user_prepare
+	cmpdi	r3,0
+	bne-	.Lrestore_nvgprs_\srr
+.Lfast_user_interrupt_return_\srr\():
+	EXIT_KERNEL_SECURITY_FALLBACK
+
+BEGIN_FTR_SECTION
+	ld	r10,_PPR(r1)
+	mtspr	SPRN_PPR,r10
+END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
+
+#ifdef CONFIG_PPC_BOOK3S
+	.ifc \srr,srr
+	lbz	r4,PACASRR_VALID(r13)
+	.else
+	lbz	r4,PACAHSRR_VALID(r13)
+	.endif
+	cmpdi	r4,0
+	li	r4,0
+	bne	1f
+#endif
+	ld	r11,_NIP(r1)
+	ld	r12,_MSR(r1)
+	.ifc \srr,srr
+	mtspr	SPRN_SRR0,r11
+	mtspr	SPRN_SRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACASRR_VALID(r13)
+#endif
+	.else
+	mtspr	SPRN_HSRR0,r11
+	mtspr	SPRN_HSRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACAHSRR_VALID(r13)
+#endif
+	.endif
+	DEBUG_SRR_VALID \srr
+
+BEGIN_FTR_SECTION
+	stdcx.	r0,0,r1		/* to clear the reservation */
+FTR_SECTION_ELSE
+	ldarx	r0,0,r1
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
+
+	ld	r3,_CCR(r1)
+	ld	r4,_LINK(r1)
+	ld	r5,_CTR(r1)
+	ld	r6,_XER(r1)
+	li	r0,0
+
+	REST_4GPRS(7, r1)
+	REST_2GPRS(11, r1)
+	REST_GPR(13, r1)
+
+	mtcr	r3
+	mtlr	r4
+	mtctr	r5
+	mtspr	SPRN_XER,r6
+
+	REST_4GPRS(2, r1)
+	REST_GPR(6, r1)
+	REST_GPR(0, r1)
+	REST_GPR(1, r1)
+	.ifc \srr,srr
+	RFI_TO_USER
+	.else
+	HRFI_TO_USER
+	.endif
+	b	.	/* prevent speculative execution */
+
+.Lrestore_nvgprs_\srr\():
+	REST_NVGPRS(r1)
+	b	.Lfast_user_interrupt_return_\srr
+
+	.balign IFETCH_ALIGN_BYTES
+.Lkernel_interrupt_return_\srr\():
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	interrupt_exit_kernel_prepare
+
+.Lfast_kernel_interrupt_return_\srr\():
+	cmpdi	cr1,r3,0
+#ifdef CONFIG_PPC_BOOK3S
+	.ifc \srr,srr
+	lbz	r4,PACASRR_VALID(r13)
+	.else
+	lbz	r4,PACAHSRR_VALID(r13)
+	.endif
+	cmpdi	r4,0
+	li	r4,0
+	bne	1f
+#endif
+	ld	r11,_NIP(r1)
+	ld	r12,_MSR(r1)
+	.ifc \srr,srr
+	mtspr	SPRN_SRR0,r11
+	mtspr	SPRN_SRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACASRR_VALID(r13)
+#endif
+	.else
+	mtspr	SPRN_HSRR0,r11
+	mtspr	SPRN_HSRR1,r12
+1:
+#ifdef CONFIG_PPC_BOOK3S
+	stb	r4,PACAHSRR_VALID(r13)
+#endif
+	.endif
+	DEBUG_SRR_VALID \srr
+
+BEGIN_FTR_SECTION
+	stdcx.	r0,0,r1		/* to clear the reservation */
+FTR_SECTION_ELSE
+	ldarx	r0,0,r1
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
+
+	ld	r3,_LINK(r1)
+	ld	r4,_CTR(r1)
+	ld	r5,_XER(r1)
+	ld	r6,_CCR(r1)
+	li	r0,0
+
+	REST_4GPRS(7, r1)
+	REST_2GPRS(11, r1)
+
+	mtlr	r3
+	mtctr	r4
+	mtspr	SPRN_XER,r5
+
+	/*
+	 * Leaving a stale exception_marker on the stack can confuse
+	 * the reliable stack unwinder later on. Clear it.
+	 */
+	std	r0,STACK_FRAME_OVERHEAD-16(r1)
+
+	REST_4GPRS(2, r1)
+
+	bne-	cr1,1f /* emulate stack store */
+	mtcr	r6
+	REST_GPR(6, r1)
+	REST_GPR(0, r1)
+	REST_GPR(1, r1)
+	.ifc \srr,srr
+	RFI_TO_KERNEL
+	.else
+	HRFI_TO_KERNEL
+	.endif
+	b	.	/* prevent speculative execution */
+
+1:	/*
+	 * Emulate stack store with update. New r1 value was already calculated
+	 * and updated in our interrupt regs by emulate_loadstore, but we can't
+	 * store the previous value of r1 to the stack before re-loading our
+	 * registers from it, otherwise they could be clobbered.  Use
+	 * PACA_EXGEN as temporary storage to hold the store data, as
+	 * interrupts are disabled here so it won't be clobbered.
+	 */
+	mtcr	r6
+	std	r9,PACA_EXGEN+0(r13)
+	addi	r9,r1,INT_FRAME_SIZE /* get original r1 */
+	REST_GPR(6, r1)
+	REST_GPR(0, r1)
+	REST_GPR(1, r1)
+	std	r9,0(r1) /* perform store component of stdu */
+	ld	r9,PACA_EXGEN+0(r13)
+
+	.ifc \srr,srr
+	RFI_TO_KERNEL
+	.else
+	HRFI_TO_KERNEL
+	.endif
+	b	.	/* prevent speculative execution */
+.endm
+
+interrupt_return_macro srr
+#ifdef CONFIG_PPC_BOOK3S
+interrupt_return_macro hsrr
+#endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/14] powerpc/64s: save one more register in the masked interrupt handler
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (4 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 05/14] powerpc/64: move interrupt return asm to interrupt_64.S Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 07/14] powerpc/64: allow alternate return locations for soft-masked interrupts Nicholas Piggin
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This frees up one more register in the masked interrupt handler (and
takes the opportunity to clean things up a little).

The freed register will be used in the following patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 34 ++++++++++++++++------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0c7af27d6dc1..a5a0b17f77bf 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2797,7 +2797,6 @@ INT_DEFINE_END(soft_nmi)
  * and run it entirely with interrupts hard disabled.
  */
 EXC_COMMON_BEGIN(soft_nmi_common)
-	mfspr	r11,SPRN_SRR0
 	mr	r10,r1
 	ld	r1,PACAEMERGSP(r13)
 	subi	r1,r1,INT_FRAME_SIZE
@@ -2832,19 +2831,24 @@ masked_Hinterrupt:
 	.else
 masked_interrupt:
 	.endif
-	lbz	r11,PACAIRQHAPPENED(r13)
-	or	r11,r11,r10
-	stb	r11,PACAIRQHAPPENED(r13)
+	stw	r9,PACA_EXGEN+EX_CCR(r13)
+	lbz	r9,PACAIRQHAPPENED(r13)
+	or	r9,r9,r10
+	stb	r9,PACAIRQHAPPENED(r13)
+
+	.if ! \hsrr
 	cmpwi	r10,PACA_IRQ_DEC
 	bne	1f
-	lis	r10,0x7fff
-	ori	r10,r10,0xffff
-	mtspr	SPRN_DEC,r10
+	LOAD_REG_IMMEDIATE(r9, 0x7fffffff)
+	mtspr	SPRN_DEC,r9
 #ifdef CONFIG_PPC_WATCHDOG
+	lwz	r9,PACA_EXGEN+EX_CCR(r13)
 	b	soft_nmi_common
 #else
 	b	2f
 #endif
+	.endif
+
 1:	andi.	r10,r10,PACA_IRQ_MUST_HARD_MASK
 	beq	2f
 	xori	r12,r12,MSR_EE	/* clear MSR_EE */
@@ -2853,17 +2857,19 @@ masked_interrupt:
 	.else
 	mtspr	SPRN_SRR1,r12
 	.endif
-	ori	r11,r11,PACA_IRQ_HARD_DIS
-	stb	r11,PACAIRQHAPPENED(r13)
+	ori	r9,r9,PACA_IRQ_HARD_DIS
+	stb	r9,PACAIRQHAPPENED(r13)
 2:	/* done */
-	li	r10,0
+	li	r9,0
 	.if \hsrr
-	stb	r10,PACAHSRR_VALID(r13)
+	stb	r9,PACAHSRR_VALID(r13)
 	.else
-	stb	r10,PACASRR_VALID(r13)
+	stb	r9,PACASRR_VALID(r13)
 	.endif
-	ld	r10,PACA_EXGEN+EX_CTR(r13)
-	mtctr	r10
+
+	ld	r9,PACA_EXGEN+EX_CTR(r13)
+	mtctr	r9
+	lwz	r9,PACA_EXGEN+EX_CCR(r13)
 	mtcrf	0x80,r9
 	std	r1,PACAR1(r13)
 	ld	r9,PACA_EXGEN+EX_R9(r13)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/14] powerpc/64: allow alternate return locations for soft-masked interrupts
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (5 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 06/14] powerpc/64s: save one more register in the masked interrupt handler Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 08/14] powerpc/64: interrupt soft-enable race fix Nicholas Piggin
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The exception table fixup adjusts the interrupt return location of a
failed page fault: if the fault was taken at an address listed in the
exception table, the return address is replaced with the corresponding
fixup handler address.

Introduce a variation of that idea which adds a fixup table for NMIs and
soft-masked asynchronous interrupts. This will be used to protect certain
critical sections that are sensitive to being clobbered by incoming
interrupts (because they use the same SPRs and/or irq soft-mask state).
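
For illustration, the table lookup is conceptually just a scan over
(start, end, fixup) triples, one per protected range, matching the
layout emitted by the RESTART_TABLE macro below (three 64-bit values
per entry). A rough C sketch of the idea, with hypothetical names
rather than the actual code in arch/powerpc/lib/restart_table.c:

	/*
	 * Illustrative sketch only. The entry layout (start, end, fixup)
	 * follows the RESTART_TABLE asm macro; the struct and function
	 * names here are made up for the example.
	 */
	struct restart_entry {
		unsigned long start;	/* first address covered */
		unsigned long end;	/* one past the last covered address */
		unsigned long fixup;	/* address to restart at instead */
	};

	extern struct restart_entry __start___restart_table[];
	extern struct restart_entry __stop___restart_table[];

	static unsigned long restart_table_lookup(unsigned long addr)
	{
		struct restart_entry *re;

		for (re = __start___restart_table; re < __stop___restart_table; re++) {
			if (addr >= re->start && addr < re->end)
				return re->fixup;
		}
		return 0;	/* no match: keep the original return NIP */
	}

A protected asm sequence registers its range with the
RESTART_TABLE(_start, _end, _target) macro added to ppc_asm.h, and
interrupt_nmi_exit_prepare() redirects the return NIP via
regs_set_return_ip() when the lookup finds a match.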

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 13 ++++++++++
 arch/powerpc/include/asm/ppc_asm.h   |  8 ++++++
 arch/powerpc/kernel/exceptions-64e.S | 37 ++++++++++++++++++++++++++--
 arch/powerpc/kernel/exceptions-64s.S | 33 +++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S    | 10 ++++++++
 arch/powerpc/lib/Makefile            |  2 +-
 arch/powerpc/lib/restart_table.c     | 29 ++++++++++++++++++++++
 7 files changed, 129 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/lib/restart_table.c

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index dfa50bb3734d..5cdbd3630254 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -9,6 +9,11 @@
 #include <asm/kprobes.h>
 #include <asm/runlatch.h>
 
+#ifdef CONFIG_PPC64
+extern char __end_soft_masked[];
+unsigned long search_kernel_restart_table(unsigned long addr);
+#endif
+
 static inline void nap_adjust_return(struct pt_regs *regs)
 {
 #ifdef CONFIG_PPC_970_NAP
@@ -183,6 +188,14 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
 	 * new work to do (must use irq_work for that).
 	 */
 
+#ifdef CONFIG_PPC64
+	if (arch_irq_disabled_regs(regs)) {
+		unsigned long rst = search_kernel_restart_table(regs->nip);
+		if (rst)
+			regs_set_return_ip(regs, rst);
+	}
+#endif
+
 #ifdef CONFIG_PPC64
 	if (nmi_disables_ftrace(regs))
 		this_cpu_set_ftrace_enabled(state->ftrace_enabled);
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 8998122fc7e2..03447f79f684 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -782,6 +782,14 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96)
 	stringify_in_c(.long (_target) - . ;)	\
 	stringify_in_c(.previous)
 
+#define RESTART_TABLE(_start, _end, _target)	\
+	stringify_in_c(.section __restart_table,"a";)\
+	stringify_in_c(.balign 8;)		\
+	stringify_in_c(.llong (_start);)	\
+	stringify_in_c(.llong (_end);)		\
+	stringify_in_c(.llong (_target);)	\
+	stringify_in_c(.previous)
+
 #ifdef CONFIG_PPC_FSL_BOOK3E
 #define BTB_FLUSH(reg)			\
 	lis reg,BUCSR_INIT@h;		\
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 86612f68f5bd..69d0d63cee85 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -883,6 +883,28 @@ kernel_dbg_exc:
 	bl	unknown_exception
 	b	interrupt_return
 
+.macro SEARCH_RESTART_TABLE
+	LOAD_REG_IMMEDIATE_SYM(r14, r11, __start___restart_table)
+	LOAD_REG_IMMEDIATE_SYM(r15, r11, __stop___restart_table)
+300:
+	cmpd	r14,r15
+	beq	302f
+	ld	r11,0(r14)
+	cmpld	r10,r11
+	blt	301f
+	ld	r11,8(r14)
+	cmpld	r10,r11
+	bge	301f
+	ld	r11,16(r14)
+	b	303f
+301:
+	addi	r14,r14,24
+	b	300b
+302:
+	li	r11,0
+303:
+.endm
+
 /*
  * An interrupt came in while soft-disabled; We mark paca->irq_happened
  * accordingly and if the interrupt is level sensitive, we hard disable
@@ -891,6 +913,9 @@ kernel_dbg_exc:
  */
 
 .macro masked_interrupt_book3e paca_irq full_mask
+	std	r14,PACA_EXGEN+EX_R14(r13)
+	std	r15,PACA_EXGEN+EX_R15(r13)
+
 	lbz	r10,PACAIRQHAPPENED(r13)
 	.if \full_mask == 1
 	ori	r10,r10,\paca_irq | PACA_IRQ_HARD_DIS
@@ -900,15 +925,23 @@ kernel_dbg_exc:
 	stb	r10,PACAIRQHAPPENED(r13)
 
 	.if \full_mask == 1
-	rldicl	r10,r11,48,1		/* clear MSR_EE */
-	rotldi	r11,r10,16
+	xori	r11,r11,MSR_EE		/* clear MSR_EE */
 	mtspr	SPRN_SRR1,r11
 	.endif
 
+	mfspr	r10,SPRN_SRR0
+	SEARCH_RESTART_TABLE
+	cmpdi	r11,0
+	beq	1f
+	mtspr	SPRN_SRR0,r11		/* return to restart address */
+1:
+
 	lwz	r11,PACA_EXGEN+EX_CR(r13)
 	mtcr	r11
 	ld	r10,PACA_EXGEN+EX_R10(r13)
 	ld	r11,PACA_EXGEN+EX_R11(r13)
+	ld	r14,PACA_EXGEN+EX_R14(r13)
+	ld	r15,PACA_EXGEN+EX_R15(r13)
 	mfspr	r13,SPRN_SPRG_GEN_SCRATCH
 	rfi
 	b	.
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index a5a0b17f77bf..32b11431ac4a 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -675,6 +675,28 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	__GEN_COMMON_BODY \name
 .endm
 
+.macro SEARCH_RESTART_TABLE
+	LOAD_REG_IMMEDIATE_SYM(r9, r12, __start___restart_table)
+	LOAD_REG_IMMEDIATE_SYM(r10, r12, __stop___restart_table)
+300:
+	cmpd	r9,r10
+	beq	302f
+	ld	r12,0(r9)
+	cmpld	r11,r12
+	blt	301f
+	ld	r12,8(r9)
+	cmpld	r11,r12
+	bge	301f
+	ld	r12,16(r9)
+	b	303f
+301:
+	addi	r9,r9,24
+	b	300b
+302:
+	li	r12,0
+303:
+.endm
+
 /*
  * Restore all registers including H/SRR0/1 saved in a stack frame of a
  * standard exception.
@@ -2810,6 +2832,7 @@ EXC_COMMON_BEGIN(soft_nmi_common)
 	mtmsrd	r9,1
 
 	kuap_kernel_restore r9, r10
+
 	EXCEPTION_RESTORE_REGS hsrr=0
 	RFI_TO_KERNEL
 
@@ -2867,6 +2890,16 @@ masked_interrupt:
 	stb	r9,PACASRR_VALID(r13)
 	.endif
 
+	SEARCH_RESTART_TABLE
+	cmpdi	r12,0
+	beq	3f
+	.if \hsrr
+	mtspr	SPRN_HSRR0,r12
+	.else
+	mtspr	SPRN_SRR0,r12
+	.endif
+3:
+
 	ld	r9,PACA_EXGEN+EX_CTR(r13)
 	mtctr	r9
 	lwz	r9,PACA_EXGEN+EX_CCR(r13)
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 582009dacef4..badcca54e968 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -9,6 +9,14 @@
 #define EMITS_PT_NOTE
 #define RO_EXCEPTION_TABLE_ALIGN	0
 
+#define RESTART_TABLE(align)						\
+	. = ALIGN(align);						\
+	__restart_table : AT(ADDR(__restart_table) - LOAD_OFFSET) {	\
+		__start___restart_table = .;				\
+		KEEP(*(__restart_table))				\
+		__stop___restart_table = .;				\
+	}
+
 #include <asm/page.h>
 #include <asm-generic/vmlinux.lds.h>
 #include <asm/cache.h>
@@ -124,6 +132,8 @@ SECTIONS
 	RO_DATA(PAGE_SIZE)
 
 #ifdef CONFIG_PPC64
+	RESTART_TABLE(8)
+
 	. = ALIGN(8);
 	__stf_entry_barrier_fixup : AT(ADDR(__stf_entry_barrier_fixup) - LOAD_OFFSET) {
 		__start___stf_entry_barrier_fixup = .;
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index d4efc182662a..a9bbd80e2748 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -39,7 +39,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
 			       memcpy_power7.o
 
 obj64-y	+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
-	   memcpy_64.o copy_mc_64.o
+	   memcpy_64.o copy_mc_64.o restart_table.o
 
 ifndef CONFIG_PPC_QUEUED_SPINLOCKS
 obj64-$(CONFIG_SMP)	+= locks.o
diff --git a/arch/powerpc/lib/restart_table.c b/arch/powerpc/lib/restart_table.c
new file mode 100644
index 000000000000..3ccb31914036
--- /dev/null
+++ b/arch/powerpc/lib/restart_table.c
@@ -0,0 +1,29 @@
+#include <asm/kprobes.h>
+
+struct restart_table_entry {
+	unsigned long start;
+	unsigned long end;
+	unsigned long fixup;
+};
+
+extern struct restart_table_entry __start___restart_table[];
+extern struct restart_table_entry __stop___restart_table[];
+
+/* Given an address, look for it in the kernel exception table */
+unsigned long search_kernel_restart_table(unsigned long addr)
+{
+	struct restart_table_entry *rte = __start___restart_table;
+
+	while (rte < __stop___restart_table) {
+		unsigned long start = rte->start;
+		unsigned long end = rte->end;
+		unsigned long fixup = rte->fixup;
+
+		if (addr >= start && addr < end)
+			return fixup;
+
+		rte++;
+	}
+	return 0;
+}
+NOKPROBE_SYMBOL(search_kernel_restart_table);
-- 
2.23.0



* [PATCH 08/14] powerpc/64: interrupt soft-enable race fix
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (6 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 07/14] powerpc/64: allow alternate return locations for soft-masked interrupts Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 09/14] powerpc/64: treat low kernel text as irqs soft-masked Nicholas Piggin
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Prevent interrupt restore from letting racing hard interrupts run ahead
of previously soft-pending ones, by using the soft-masked restart handler
so that a single store can clear the soft-mask while it is known that
nothing is soft-pending.

This probably doesn't matter much in practice, but it's a simple
demonstrator / test case to exercise the restart table logic.
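
For context, a minimal standalone sketch of the window being closed here
(modelled with plain variables, not the real paca fields or the inline
asm this patch actually adds):

	/* Stand-ins for paca->irq_happened and paca->irq_soft_mask. */
	extern unsigned char irq_happened;	/* set by soft-masked interrupts */
	extern unsigned char irq_soft_mask;	/* 0 == IRQS_ENABLED */

	void naive_irq_restore(void)
	{
		if (irq_happened)
			goto replay;		/* something pending: replay it masked */
		/*
		 * A soft-masked interrupt arriving HERE would set irq_happened,
		 * and the store below would then unmask with that interrupt
		 * still pending.  The patch instead covers the check and the
		 * store with a restart table entry, so an interrupt taken in
		 * this window returns to the start of the sequence.
		 */
		irq_soft_mask = 0;		/* IRQS_ENABLED */
		return;
	replay:
		;	/* replay pending interrupts with the mask still set */
	}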

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/irq.c | 81 ++++++++++++++++++++++++---------------
 1 file changed, 51 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 08a747b92735..a032701e81be 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -224,54 +224,75 @@ notrace void arch_local_irq_restore(unsigned long mask)
 {
 	unsigned char irq_happened;
 
-	/* Write the new soft-enabled value */
-	irq_soft_mask_set(mask);
-	if (mask)
+	/* Write the new soft-enabled value if it is a disable */
+	if (mask) {
+		irq_soft_mask_set(mask);
 		return;
+	}
 
 	/*
-	 * From this point onward, we can take interrupts, preempt,
-	 * etc... unless we got hard-disabled. We check if an event
-	 * happened. If none happened, we know we can just return.
-	 *
-	 * We may have preempted before the check below, in which case
-	 * we are checking the "new" CPU instead of the old one. This
-	 * is only a problem if an event happened on the "old" CPU.
+	 * After the stb, interrupts are unmasked and there are no interrupts
+	 * pending replay. The restart sequence makes this atomic with
+	 * respect to soft-masked interrupts. If this was just a simple code
+	 * sequence, a soft-masked interrupt could become pending right after
+	 * the comparison and before the stb.
 	 *
-	 * External interrupt events will have caused interrupts to
-	 * be hard-disabled, so there is no problem, we
-	 * cannot have preempted.
+	 * This allows interrupts to be unmasked without hard disabling, and
+	 * also without new hard interrupts coming in ahead of pending ones.
 	 */
+	asm_volatile_goto(
+"1:					\n"
+"		lbz	9,%0(13)	\n"
+"		cmpwi	9,0		\n"
+"		bne	%l[happened]	\n"
+"		stb	9,%1(13)	\n"
+"2:					\n"
+		RESTART_TABLE(1b, 2b, 1b)
+	: : "i" (offsetof(struct paca_struct, irq_happened)),
+	    "i" (offsetof(struct paca_struct, irq_soft_mask))
+	: "cr0", "r9"
+	: happened);
+
+	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
+		WARN_ON_ONCE(!(mfmsr() & MSR_EE));
+
+	return;
+
+happened:
 	irq_happened = get_irq_happened();
-	if (!irq_happened) {
+	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
+		WARN_ON_ONCE(!irq_happened);
+
+	if (irq_happened == PACA_IRQ_HARD_DIS) {
 		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-			WARN_ON_ONCE(!(mfmsr() & MSR_EE));
+			WARN_ON_ONCE(mfmsr() & MSR_EE);
+		irq_soft_mask_set(IRQS_ENABLED);
+		local_paca->irq_happened = 0;
+		__hard_irq_enable();
 		return;
 	}
 
-	/* We need to hard disable to replay. */
+	/* Have interrupts to replay, need to hard disable first */
 	if (!(irq_happened & PACA_IRQ_HARD_DIS)) {
-		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-			WARN_ON_ONCE(!(mfmsr() & MSR_EE));
+		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) {
+			if (!(mfmsr() & MSR_EE)) {
+				/*
+				 * An interrupt could have come in and cleared
+				 * MSR[EE] and set IRQ_HARD_DIS, so check
+				 * IRQ_HARD_DIS again and warn if it is still
+				 * clear.
+				 */
+				irq_happened = get_irq_happened();
+				WARN_ON_ONCE(!(irq_happened & PACA_IRQ_HARD_DIS));
+			}
+		}
 		__hard_irq_disable();
 		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 	} else {
-		/*
-		 * We should already be hard disabled here. We had bugs
-		 * where that wasn't the case so let's dbl check it and
-		 * warn if we are wrong. Only do that when IRQ tracing
-		 * is enabled as mfmsr() can be costly.
-		 */
 		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) {
 			if (WARN_ON_ONCE(mfmsr() & MSR_EE))
 				__hard_irq_disable();
 		}
-
-		if (irq_happened == PACA_IRQ_HARD_DIS) {
-			local_paca->irq_happened = 0;
-			__hard_irq_enable();
-			return;
-		}
 	}
 
 	/*
-- 
2.23.0



* [PATCH 09/14] powerpc/64: treat low kernel text as irqs soft-masked
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (7 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 08/14] powerpc/64: interrupt soft-enable race fix Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:03 ` [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt Nicholas Piggin
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Treat code below __end_soft_masked as soft-masked for the purpose
of alternate return. 64s already mostly does this for scv entry.

This will be used to exit from interrupts without disabling MSR[EE].

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  8 ++++++++
 arch/powerpc/kernel/exceptions-64e.S | 12 +++++++++++-
 arch/powerpc/kernel/exceptions-64s.S |  3 ++-
 arch/powerpc/kernel/interrupt_64.S   |  6 +++++-
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 5cdbd3630254..8796eb4630c9 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -72,6 +72,10 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 		 */
 		if (TRAP(regs) != 0x700)
 			CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+		BUG_ON(regs->nip < (unsigned long)__end_soft_masked);
+		/* Move this under a debugging check */
+		if (arch_irq_disabled_regs(regs))
+			BUG_ON(search_kernel_restart_table(regs->nip));
 	}
 #endif
 
@@ -147,6 +151,10 @@ static inline bool nmi_disables_ftrace(struct pt_regs *regs)
 static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
 {
 #ifdef CONFIG_PPC64
+	/* Ensure arch_irq_disabled_regs(regs) looks right. */
+	if (!(regs->msr & MSR_PR) && regs->nip < (unsigned long)__end_soft_masked)
+		regs->softe = IRQS_ALL_DISABLED;
+
 	state->irq_soft_mask = local_paca->irq_soft_mask;
 	state->irq_happened = local_paca->irq_happened;
 
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 69d0d63cee85..87fe307b4da8 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -342,7 +342,17 @@ ret_from_mc_except:
 #define PROLOG_ADDITION_MASKABLE_GEN(n)					    \
 	lbz	r10,PACAIRQSOFTMASK(r13);	/* are irqs soft-masked? */ \
 	andi.	r10,r10,IRQS_DISABLED;	/* yes -> go out of line */ \
-	bne	masked_interrupt_book3e_##n
+	bne	masked_interrupt_book3e_##n;				    \
+	/* Kernel code below __end_soft_masked is implicitly masked */	    \
+	andi.	r10,r11,MSR_PR;						    \
+	bne	1f;			/* user -> not masked */	    \
+	std	r14,PACA_EXGEN+EX_R14(r13);				    \
+	LOAD_REG_IMMEDIATE_SYM(r14, r10, __end_soft_masked);		    \
+	mfspr	r10,SPRN_SRR0;						    \
+	cmpld	r10,r14;						    \
+	ld	r14,PACA_EXGEN+EX_R14(r13);				    \
+	blt	masked_interrupt_book3e_##n;				    \
+1:
 
 #define PROLOG_ADDITION_2REGS_GEN(n)					    \
 	std	r14,PACA_EXGEN+EX_R14(r13);				    \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 32b11431ac4a..bd0c82ac9de5 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -514,8 +514,9 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 
 		/* Kernel code running below __end_interrupts is implicitly
 		 * soft-masked */
-		LOAD_HANDLER(r10, __end_interrupts)
+		LOAD_HANDLER(r10, __end_soft_masked)
 		cmpld	r11,r10
+
 		li	r10,IMASK
 		blt-	1f
 
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 8a2b8188108b..c6a0349dde59 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -642,4 +642,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 interrupt_return_macro srr
 #ifdef CONFIG_PPC_BOOK3S
 interrupt_return_macro hsrr
-#endif
+#endif /* CONFIG_PPC_BOOK3S */
+
+	.globl __end_soft_masked
+__end_soft_masked:
+DEFINE_FIXED_SYMBOL(__end_soft_masked)
-- 
2.23.0



* [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (8 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 09/14] powerpc/64: treat low kernel text as irqs soft-masked Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-16 19:34   ` Christophe Leroy
  2021-03-15 22:03 ` [PATCH 11/14] powerpc/64e: Remove PPR from pt_regs Nicholas Piggin
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The restart table facility is used to return from interrupt without
disabling MSR EE or RI.

Interrupt return code is put into the low soft-masked region.

Critical code, which has no exit work remaining, has the SRRs set, and
has the soft-masked state set to the return state, saves r1 in the PACA
and then begins to run instructions that have an alternate return
handler.

In this region, pending interrupts are checked, and if any exist the
code branches directly to the restart handler.

If it does not branch, then no masked interrupts are pending, and if an
interrupt does hit, it will exit via the restart handler.

The restart handler re-loads the saved r1, finds regs from there, and
reloads critical state before setting things up to replay interrupts and
go around the exit prepare sequence again.
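
Rendered as plain C for readability (an approximation of the asm
sequences added to interrupt_64.S below; the real code must run with
fixed registers from the low soft-masked text), the per-exit check is
roughly:

	/* Approximation only -- see the .L*_rst_start sequences below. */
	static inline bool exit_can_return_directly(void)
	{
		/* Anything pending besides the hard-disable marker? */
		if (local_paca->irq_happened & ~PACA_IRQ_HARD_DIS)
			return false;	/* branch to the restart handler and replay */

		local_paca->irq_soft_mask = IRQS_ENABLED;
		local_paca->irq_happened = 0;	/* clear a possible HARD_DIS */
		return true;	/* go on to restore registers and rfid */
	}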

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/paca.h    |   1 +
 arch/powerpc/include/asm/ptrace.h  |   4 +-
 arch/powerpc/kernel/asm-offsets.c  |   1 +
 arch/powerpc/kernel/interrupt.c    | 282 ++++++++++++++++-------------
 arch/powerpc/kernel/interrupt_64.S | 118 +++++++++++-
 5 files changed, 270 insertions(+), 136 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 4cbfaa09950a..039ccedcfcdb 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -167,6 +167,7 @@ struct paca_struct {
 	u64 kstack;			/* Saved Kernel stack addr */
 	u64 saved_r1;			/* r1 save for RTAS calls or PM or EE=0 */
 	u64 saved_msr;			/* MSR saved here by enter_rtas */
+	u64 exit_save_r1;		/* Syscall/interrupt R1 save */
 #ifdef CONFIG_PPC_BOOK3E
 	u16 trap_save;			/* Used when bad stack is encountered */
 #endif
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 77c86ce01f20..d37787a74342 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -52,6 +52,7 @@ struct pt_regs
 		struct {
 #ifdef CONFIG_PPC64
 			unsigned long ppr;
+			unsigned long exit_result;
 #endif
 			union {
 #ifdef CONFIG_PPC_KUAP
@@ -65,7 +66,8 @@ struct pt_regs
 			unsigned long iamr;
 #endif
 		};
-		unsigned long __pad[4];	/* Maintain 16 byte interrupt stack alignment */
+		/* Maintain 16 byte interrupt stack alignment */
+		unsigned long __pad[4];
 	};
 };
 #endif
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 35ce6e36f593..44d557dacc77 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -288,6 +288,7 @@ int main(void)
 	OFFSET(ACCOUNT_STARTTIME_USER, paca_struct, accounting.starttime_user);
 	OFFSET(ACCOUNT_USER_TIME, paca_struct, accounting.utime);
 	OFFSET(ACCOUNT_SYSTEM_TIME, paca_struct, accounting.stime);
+	OFFSET(PACA_EXIT_SAVE_R1, paca_struct, exit_save_r1);
 #ifdef CONFIG_PPC_BOOK3E
 	OFFSET(PACA_TRAP_SAVE, paca_struct, trap_save);
 #endif
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index efeeefe6ee8f..09cf699d0e2e 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -147,71 +147,6 @@ notrace long system_call_exception(long r3, long r4, long r5,
 	return f(r3, r4, r5, r6, r7, r8);
 }
 
-/*
- * local irqs must be disabled. Returns false if the caller must re-enable
- * them, check for new work, and try again.
- *
- * This should be called with local irqs disabled, but if they were previously
- * enabled when the interrupt handler returns (indicating a process-context /
- * synchronous interrupt) then irqs_enabled should be true.
- */
-static notrace __always_inline bool __prep_irq_for_enabled_exit(bool clear_ri)
-{
-	/* This must be done with RI=1 because tracing may touch vmaps */
-	trace_hardirqs_on();
-
-	/* This pattern matches prep_irq_for_idle */
-	if (clear_ri)
-		__hard_EE_RI_disable();
-	else
-		__hard_irq_disable();
-#ifdef CONFIG_PPC64
-	if (unlikely(lazy_irq_pending_nocheck())) {
-		/* Took an interrupt, may have more exit work to do. */
-		if (clear_ri)
-			__hard_RI_enable();
-		trace_hardirqs_off();
-		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
-
-		return false;
-	}
-	local_paca->irq_happened = 0;
-	irq_soft_mask_set(IRQS_ENABLED);
-#endif
-	return true;
-}
-
-static notrace inline bool prep_irq_for_enabled_exit(bool clear_ri, bool irqs_enabled)
-{
-	if (__prep_irq_for_enabled_exit(clear_ri))
-		return true;
-
-	/*
-	 * Must replay pending soft-masked interrupts now. Don't just
-	 * local_irq_enabe(); local_irq_disable(); because if we are
-	 * returning from an asynchronous interrupt here, another one
-	 * might hit after irqs are enabled, and it would exit via this
-	 * same path allowing another to fire, and so on unbounded.
-	 *
-	 * If interrupts were enabled when this interrupt exited,
-	 * indicating a process context (synchronous) interrupt,
-	 * local_irq_enable/disable can be used, which will enable
-	 * interrupts rather than keeping them masked (unclear how
-	 * much benefit this is over just replaying for all cases,
-	 * because we immediately disable again, so all we're really
-	 * doing is allowing hard interrupts to execute directly for
-	 * a very small time, rather than being masked and replayed).
-	 */
-	if (irqs_enabled) {
-		local_irq_enable();
-		local_irq_disable();
-	} else {
-		replay_soft_interrupts();
-	}
-
-	return false;
-}
-
 static notrace void booke_load_dbcr0(void)
 {
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
@@ -234,57 +169,11 @@ static notrace void booke_load_dbcr0(void)
 #endif
 }
 
-/*
- * This should be called after a syscall returns, with r3 the return value
- * from the syscall. If this function returns non-zero, the system call
- * exit assembly should additionally load all GPR registers and CTR and XER
- * from the interrupt frame.
- *
- * The function graph tracer can not trace the return side of this function,
- * because RI=0 and soft mask state is "unreconciled", so it is marked notrace.
- */
-notrace unsigned long syscall_exit_prepare(unsigned long r3,
-					   struct pt_regs *regs,
-					   long scv)
+notrace unsigned long syscall_exit_prepare_main(unsigned long r3,
+						struct pt_regs *regs)
 {
 	unsigned long ti_flags;
 	unsigned long ret = 0;
-	bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
-
-	CT_WARN_ON(ct_state() == CONTEXT_USER);
-
-	kuap_assert_locked();
-
-	regs->result = r3;
-
-	/* Check whether the syscall is issued inside a restartable sequence */
-	rseq_syscall(regs);
-
-	ti_flags = current_thread_info()->flags;
-
-	if (unlikely(r3 >= (unsigned long)-MAX_ERRNO) && is_not_scv) {
-		if (likely(!(ti_flags & (_TIF_NOERROR | _TIF_RESTOREALL)))) {
-			r3 = -r3;
-			regs->ccr |= 0x10000000; /* Set SO bit in CR */
-		}
-	}
-
-	if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
-		if (ti_flags & _TIF_RESTOREALL)
-			ret = _TIF_RESTOREALL;
-		else
-			regs->gpr[3] = r3;
-		clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
-	} else {
-		regs->gpr[3] = r3;
-	}
-
-	if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
-		do_syscall_trace_leave(regs);
-		ret |= _TIF_RESTOREALL;
-	}
-
-	local_irq_disable();
 
 again:
 	ti_flags = READ_ONCE(current_thread_info()->flags);
@@ -330,16 +219,16 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 		}
 	}
 
-	user_enter_irqoff();
-
 	/* scv need not set RI=0 because SRRs are not used */
-	if (unlikely(!__prep_irq_for_enabled_exit(is_not_scv))) {
-		user_exit_irqoff();
+	if (unlikely(lazy_irq_pending_nocheck())) {
 		local_irq_enable();
 		local_irq_disable();
 		goto again;
 	}
 
+	user_enter_irqoff();
+	trace_hardirqs_on();
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	local_paca->tm_scratch = regs->msr;
 #endif
@@ -359,6 +248,84 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 	return ret;
 }
 
+/*
+ * This should be called after a syscall returns, with r3 the return value
+ * from the syscall. If this function returns non-zero, the system call
+ * exit assembly should additionally load all GPR registers and CTR and XER
+ * from the interrupt frame.
+ *
+ * The function graph tracer can not trace the return side of this function,
+ * because RI=0 and soft mask state is "unreconciled", so it is marked notrace.
+ */
+notrace unsigned long syscall_exit_prepare(unsigned long r3,
+					   struct pt_regs *regs,
+					   long scv)
+{
+	unsigned long ti_flags;
+	unsigned long ret = 0;
+	bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
+
+	CT_WARN_ON(ct_state() == CONTEXT_USER);
+
+	kuap_assert_locked();
+
+	regs->result = r3;
+
+	/* Check whether the syscall is issued inside a restartable sequence */
+	rseq_syscall(regs);
+
+	ti_flags = current_thread_info()->flags;
+
+	if (unlikely(r3 >= (unsigned long)-MAX_ERRNO) && is_not_scv) {
+		if (likely(!(ti_flags & (_TIF_NOERROR | _TIF_RESTOREALL)))) {
+			r3 = -r3;
+			regs->ccr |= 0x10000000; /* Set SO bit in CR */
+		}
+	}
+
+	if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
+		if (ti_flags & _TIF_RESTOREALL)
+			ret = _TIF_RESTOREALL;
+		else
+			regs->gpr[3] = r3;
+		clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
+	} else {
+		regs->gpr[3] = r3;
+	}
+
+	if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
+		do_syscall_trace_leave(regs);
+		ret |= _TIF_RESTOREALL;
+	}
+
+	local_irq_disable();
+	ret |= syscall_exit_prepare_main(r3, regs);
+
+	regs->exit_result = ret;
+
+	return ret;
+}
+
+notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *regs)
+{
+	/* This is called when detecting a soft-pending interrupt as well,
+	 * so can't just have restart table returns clear SRR1[MSR] and set
+	 * PACA_IRQ_HARD_DIS here (unless the soft-pending case were to clear
+	 * MSR[EE] too).
+	 */
+	hard_irq_disable();
+
+	trace_hardirqs_off();
+	user_exit_irqoff();
+	account_cpu_user_entry();
+
+	BUG_ON(!user_mode(regs));
+
+	regs->exit_result |= syscall_exit_prepare_main(r3, regs);
+
+	return regs->exit_result;
+}
+
 notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 {
 	unsigned long ti_flags;
@@ -414,15 +381,15 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 		}
 	}
 
-	user_enter_irqoff();
-
-	if (unlikely(!__prep_irq_for_enabled_exit(true))) {
-		user_exit_irqoff();
+	if (unlikely(lazy_irq_pending_nocheck())) {
 		local_irq_enable();
 		local_irq_disable();
 		goto again;
 	}
 
+	user_enter_irqoff();
+	trace_hardirqs_on();
+
 	booke_load_dbcr0();
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -437,6 +404,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 #ifndef CONFIG_PPC_BOOK3E_64
 	kuap_user_restore(regs);
 #endif
+	regs->exit_result = ret;
 
 	return ret;
 }
@@ -466,11 +434,6 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 	kuap = kuap_get_and_assert_locked();
 #endif
 
-	if (unlikely(current_thread_info()->flags & _TIF_EMULATE_STACK_STORE)) {
-		clear_bits(_TIF_EMULATE_STACK_STORE, &current_thread_info()->flags);
-		ret = 1;
-	}
-
 	local_irq_save(flags);
 
 	if (!arch_irq_disabled_regs(regs)) {
@@ -485,17 +448,42 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 			}
 		}
 
-		if (unlikely(!prep_irq_for_enabled_exit(true, !irqs_disabled_flags(flags))))
+		/*
+		 * May replay pending soft-masked interrupts now. Don't just
+		 * local_irq_enable(); local_irq_disable(); because if we are
+		 * returning from an asynchronous interrupt here, another one
+		 * might hit after irqs are enabled, and it would exit via this
+		 * same path allowing another to fire, and so on unbounded.
+		 */
+		if (unlikely(lazy_irq_pending_nocheck())) {
+			hard_irq_disable();
+			replay_soft_interrupts();
+			/* Took an interrupt, may have more exit work to do. */
 			goto again;
+		}
+		local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
+
+		trace_hardirqs_on();
 	} else {
-		/* Returning to a kernel context with local irqs disabled. */
-		__hard_EE_RI_disable();
 #ifdef CONFIG_PPC64
+		/* Returning to a kernel context with local irqs disabled. */
 		if (regs->msr & MSR_EE)
 			local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
 #endif
 	}
 
+	if (unlikely(current_thread_info()->flags & _TIF_EMULATE_STACK_STORE)) {
+		/* Stack store can't be restarted (stack might have been clobbered) */
+		__hard_EE_RI_disable();
+		if (unlikely(lazy_irq_pending_nocheck()) && !arch_irq_disabled_regs(regs)) {
+			local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+			__hard_RI_enable();
+			goto again;
+		}
+
+		clear_bits(_TIF_EMULATE_STACK_STORE, &current_thread_info()->flags);
+		ret = 1;
+	}
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	local_paca->tm_scratch = regs->msr;
@@ -512,3 +500,39 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 
 	return ret;
 }
+
+#ifdef CONFIG_PPC64
+notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
+{
+	hard_irq_disable();
+
+	trace_hardirqs_off();
+	user_exit_irqoff();
+	account_cpu_user_entry();
+
+	BUG_ON(!user_mode(regs));
+
+	regs->exit_result |= interrupt_exit_user_prepare(regs);
+
+	return regs->exit_result;
+}
+
+/* No real need to return a value here because the stack store can't be
+ * restarted
+ */
+notrace unsigned long interrupt_exit_kernel_restart(struct pt_regs *regs)
+{
+	hard_irq_disable();
+
+#ifndef CONFIG_PPC_BOOK3E_64
+	set_kuap(AMR_KUAP_BLOCKED);
+#endif
+
+	if (regs->softe == IRQS_ENABLED)
+		trace_hardirqs_off();
+
+	BUG_ON(user_mode(regs));
+
+	return interrupt_exit_kernel_prepare(regs);
+}
+#endif
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index c6a0349dde59..2b68b1dea8bf 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -119,9 +119,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	bl	system_call_exception
 
 .Lsyscall_vectored_\name\()_exit:
-	addi    r4,r1,STACK_FRAME_OVERHEAD
+	addi	r4,r1,STACK_FRAME_OVERHEAD
 	li	r5,1 /* scv */
 	bl	syscall_exit_prepare
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+.Lsyscall_vectored_\name\()_rst_start:
+	lbz	r11,PACAIRQHAPPENED(r13)
+	andi.	r11,r11,(~PACA_IRQ_HARD_DIS)@l
+	bne-	.Lsyscall_vectored_\name\()_restart
+	li	r11,IRQS_ENABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	li	r11,0
+	stb	r11,PACAIRQHAPPENED(r13) # clear out possible HARD_DIS
 
 	EXIT_KERNEL_SECURITY_FALLBACK
 
@@ -173,8 +182,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	b	.	/* prevent speculative execution */
 
 .Lsyscall_vectored_\name\()_restore_regs:
-	li	r3,0
-	mtmsrd	r3,1
 	mtspr	SPRN_SRR0,r4
 	mtspr	SPRN_SRR1,r5
 
@@ -192,9 +199,26 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	REST_2GPRS(12, r1)
 	ld	r1,GPR1(r1)
 	RFI_TO_USER
+.Lsyscall_vectored_\name\()_rst_end:
+
+.Lsyscall_vectored_\name\()_restart:
+	GET_PACA(r13)
+	ld	r1,PACA_EXIT_SAVE_R1(r13)
+	ld	r2,PACATOC(r13)
+	ld	r3,RESULT(r1)
+	addi	r4,r1,STACK_FRAME_OVERHEAD
+	li	r11,IRQS_ALL_DISABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	bl	syscall_exit_restart
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+	b	.Lsyscall_vectored_\name\()_rst_start
+
+RESTART_TABLE(.Lsyscall_vectored_\name\()_rst_start, .Lsyscall_vectored_\name\()_rst_end, .Lsyscall_vectored_\name\()_restart)
+
 .endm
 
 system_call_vectored common 0x3000
+
 /*
  * We instantiate another entry copy for the SIGILL variant, with TRAP=0x7ff0
  * which is tested by system_call_exception when r0 is -1 (as set by vector
@@ -299,9 +323,18 @@ END_BTB_FLUSH_SECTION
 	bl	system_call_exception
 
 .Lsyscall_exit:
-	addi    r4,r1,STACK_FRAME_OVERHEAD
+	addi	r4,r1,STACK_FRAME_OVERHEAD
 	li	r5,0 /* !scv */
 	bl	syscall_exit_prepare
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+.Lsyscall_rst_start:
+	lbz	r11,PACAIRQHAPPENED(r13)
+	andi.	r11,r11,(~PACA_IRQ_HARD_DIS)@l
+	bne-	.Lsyscall_restart
+	li	r11,IRQS_ENABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	li	r11,0
+	stb	r11,PACAIRQHAPPENED(r13) # clear out possible HARD_DIS
 
 	EXIT_KERNEL_SECURITY_FALLBACK
 
@@ -370,6 +403,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	REST_8GPRS(4, r1)
 	ld	r12,GPR12(r1)
 	b	.Lsyscall_restore_regs_cont
+.Lsyscall_rst_end:
+
+.Lsyscall_restart:
+	GET_PACA(r13)
+	ld	r1,PACA_EXIT_SAVE_R1(r13)
+	ld	r2,PACATOC(r13)
+	ld	r3,RESULT(r1)
+	addi	r4,r1,STACK_FRAME_OVERHEAD
+	li	r11,IRQS_ALL_DISABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	bl	syscall_exit_restart
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+	b	.Lsyscall_rst_start
+
+RESTART_TABLE(.Lsyscall_rst_start, .Lsyscall_rst_end, .Lsyscall_restart)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 .Ltabort_syscall:
@@ -462,13 +510,28 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
 	ld	r4,_MSR(r1)
 	andi.	r0,r4,MSR_PR
 	beq	.Lkernel_interrupt_return_\srr
+.Linterrupt_return_\srr\()_user:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	interrupt_exit_user_prepare
 	cmpdi	r3,0
 	bne-	.Lrestore_nvgprs_\srr
-.Lfast_user_interrupt_return_\srr\():
+.Lrestore_nvgprs_\srr\()_cont:
 	EXIT_KERNEL_SECURITY_FALLBACK
 
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+.Linterrupt_return_\srr\()_user_rst_start:
+	lbz	r11,PACAIRQHAPPENED(r13)
+	andi.	r11,r11,(~PACA_IRQ_HARD_DIS)@l
+	bne-	.Linterrupt_return_\srr\()_user_restart
+	li	r11,IRQS_ENABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	li	r11,0
+	stb	r11,PACAIRQHAPPENED(r13) # clear out possible HARD_DIS
+
+.Lfast_user_interrupt_return_\srr\():
+	lbz	r4,PACAIRQSOFTMASK(r13)
+	tdnei	r4,IRQS_ENABLED
+
 BEGIN_FTR_SECTION
 	ld	r10,_PPR(r1)
 	mtspr	SPRN_PPR,r10
@@ -534,16 +597,44 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	HRFI_TO_USER
 	.endif
 	b	.	/* prevent speculative execution */
+.Linterrupt_return_\srr\()_user_rst_end:
 
 .Lrestore_nvgprs_\srr\():
 	REST_NVGPRS(r1)
-	b	.Lfast_user_interrupt_return_\srr
+	b	.Lrestore_nvgprs_\srr\()_cont
+
+.Linterrupt_return_\srr\()_user_restart:
+	GET_PACA(r13)
+	ld	r1,PACA_EXIT_SAVE_R1(r13)
+	ld	r2,PACATOC(r13)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	li	r11,IRQS_ALL_DISABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	bl	interrupt_exit_user_restart
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+	b	.Linterrupt_return_\srr\()_user_rst_start
+
+RESTART_TABLE(.Linterrupt_return_\srr\()_user_rst_start, .Linterrupt_return_\srr\()_user_rst_end, .Linterrupt_return_\srr\()_user_restart)
 
 	.balign IFETCH_ALIGN_BYTES
 .Lkernel_interrupt_return_\srr\():
+.Linterrupt_return_\srr\()_kernel:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	interrupt_exit_kernel_prepare
 
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+.Linterrupt_return_\srr\()_kernel_rst_start:
+	lbz	r11,SOFTE(r1)
+	cmpwi	r11,IRQS_ENABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	bne	1f
+	lbz	r11,PACAIRQHAPPENED(r13)
+	andi.	r11,r11,(~PACA_IRQ_HARD_DIS)@l
+	bne-	.Linterrupt_return_\srr\()_kernel_restart
+	li	r11,0
+	stb	r11,PACAIRQHAPPENED(r13) # clear out possible HARD_DIS
+1:
+
 .Lfast_kernel_interrupt_return_\srr\():
 	cmpdi	cr1,r3,0
 #ifdef CONFIG_PPC_BOOK3S
@@ -637,6 +728,21 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	HRFI_TO_KERNEL
 	.endif
 	b	.	/* prevent speculative execution */
+.Linterrupt_return_\srr\()_kernel_rst_end:
+
+.Linterrupt_return_\srr\()_kernel_restart:
+	GET_PACA(r13)
+	ld	r1,PACA_EXIT_SAVE_R1(r13)
+	ld	r2,PACATOC(r13)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	li	r11,IRQS_ALL_DISABLED
+	stb	r11,PACAIRQSOFTMASK(r13)
+	bl	interrupt_exit_kernel_restart
+	std	r1,PACA_EXIT_SAVE_R1(r13) /* save r1 for restart */
+	b	.Linterrupt_return_\srr\()_kernel_rst_start
+
+RESTART_TABLE(.Linterrupt_return_\srr\()_kernel_rst_start, .Linterrupt_return_\srr\()_kernel_rst_end, .Linterrupt_return_\srr\()_kernel_restart)
+
 .endm
 
 interrupt_return_macro srr
-- 
2.23.0



* [PATCH 11/14] powerpc/64e: Remove PPR from pt_regs
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (9 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt Nicholas Piggin
@ 2021-03-15 22:03 ` Nicholas Piggin
  2021-03-15 22:04 ` [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE] Nicholas Piggin
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

64e does not have a PPR register.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/ptrace.h  | 4 +++-
 arch/powerpc/kernel/asm-offsets.c  | 2 ++
 arch/powerpc/kernel/interrupt_64.S | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index d37787a74342..a59bdc020195 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -50,8 +50,10 @@ struct pt_regs
 
 	union {
 		struct {
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
 			unsigned long ppr;
+#endif
+#ifdef CONFIG_PPC64
 			unsigned long exit_result;
 #endif
 			union {
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 44d557dacc77..0b7828eff7ff 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -352,7 +352,9 @@ int main(void)
 	STACK_PT_REGS_OFFSET(_ESR, dsisr);
 #else /* CONFIG_PPC64 */
 	STACK_PT_REGS_OFFSET(SOFTE, softe);
+#ifdef CONFIG_PPC_BOOK3S_64
 	STACK_PT_REGS_OFFSET(_PPR, ppr);
+#endif
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_PPC_PKEY
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 2b68b1dea8bf..f28f41a1a85a 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -532,12 +532,12 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
 	lbz	r4,PACAIRQSOFTMASK(r13)
 	tdnei	r4,IRQS_ENABLED
 
+#ifdef CONFIG_PPC_BOOK3S
 BEGIN_FTR_SECTION
 	ld	r10,_PPR(r1)
 	mtspr	SPRN_PPR,r10
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
-#ifdef CONFIG_PPC_BOOK3S
 	.ifc \srr,srr
 	lbz	r4,PACASRR_VALID(r13)
 	.else
-- 
2.23.0



* [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (10 preceding siblings ...)
  2021-03-15 22:03 ` [PATCH 11/14] powerpc/64e: Remove PPR from pt_regs Nicholas Piggin
@ 2021-03-15 22:04 ` Nicholas Piggin
  2021-03-16  7:21   ` Christophe Leroy
  2021-03-15 22:04 ` [PATCH 13/14] powerpc/64: handle MSR EE and RI in interrupt entry wrapper Nicholas Piggin
  2021-03-15 22:04 ` [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel Nicholas Piggin
  13 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:04 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This extends the MSR[RI]=0 window a little further into the system
call in order to pair RI and EE enabling with a single mtmsrd.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 2 --
 arch/powerpc/kernel/interrupt_64.S   | 6 +++---
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index bd0c82ac9de5..2f14ac3c377c 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1999,8 +1999,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	mtctr	r10
 	bctr
 	.else
-	li	r10,MSR_RI
-	mtmsrd 	r10,1			/* Set RI (EE=0) */
 #ifdef CONFIG_RELOCATABLE
 	__LOAD_HANDLER(r10, system_call_common)
 	mtctr	r10
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index f28f41a1a85a..eef61800f734 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -311,10 +311,10 @@ END_BTB_FLUSH_SECTION
 	 * nothing pending. system_call_exception() will call
 	 * trace_hardirqs_off().
 	 */
-	li	r11,IRQS_ALL_DISABLED
-	li	r12,PACA_IRQ_HARD_DIS
+	li	r11,IRQS_DISABLED
+	li	r12,-1 /* Set MSR_EE and MSR_RI */
 	stb	r11,PACAIRQSOFTMASK(r13)
-	stb	r12,PACAIRQHAPPENED(r13)
+	mtmsrd	r12,1
 
 	ENTER_KERNEL_SECURITY_FALLBACK
 
-- 
2.23.0



* [PATCH 13/14] powerpc/64: handle MSR EE and RI in interrupt entry wrapper
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (11 preceding siblings ...)
  2021-03-15 22:04 ` [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE] Nicholas Piggin
@ 2021-03-15 22:04 ` Nicholas Piggin
  2021-03-15 22:04 ` [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel Nicholas Piggin
  13 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:04 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Similarly to the system call change in the previous patch, the mtmsrd to
enable RI can be combined with the mtmsrd to enable EE for interrupts
that enable the latter, which tend to be the important synchronous
interrupts (i.e., page faults).

Do this by enabling EE and RI together at the beginning of the entry
wrapper if PACA_IRQ_HARD_DIS is clear, and just enabling RI if it is set
(which means something wanted EE=0).

Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones
leave it unchanged, so by default they always get EE=1 unless they
interrupt a caller that has hard disabled. When the sync interrupt
later calls interrupt_cond_local_irq_enable(), that will not require
another mtmsrd because we already enabled here.

This tends to save one mtmsrd L=1 for synchronous interrupts on 64s.
64e is conceptually unchanged, but it also sets MSR[EE]=1 now in the
interrupt wrapper for synchronous interrupts with the same code.

From: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 18 +++++++++++++++--
 arch/powerpc/kernel/exceptions-64s.S | 30 ----------------------------
 2 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 8796eb4630c9..d6d54bbcba2f 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -55,9 +55,20 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 #endif
 
 #ifdef CONFIG_PPC64
-	if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
+	bool trace_enable = false;
+
+	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) {
+		if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED)
+			trace_enable = true;
+	} else {
+		irq_soft_mask_set(IRQS_DISABLED);
+	}
+	if (local_paca->irq_happened & PACA_IRQ_HARD_DIS)
+		__hard_RI_enable();
+	else
+		__hard_irq_enable();
+	if (trace_enable)
 		trace_hardirqs_off();
-	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 
 	if (user_mode(regs)) {
 		CT_WARN_ON(ct_state() != CONTEXT_USER);
@@ -110,6 +121,7 @@ static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct in
 		__ppc64_runlatch_on();
 #endif
 
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 	interrupt_enter_prepare(regs, state);
 	irq_enter();
 }
@@ -166,6 +178,8 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
 	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 
+	__hard_RI_enable();
+
 	/* Don't do any per-CPU operations until interrupt state is fixed */
 
 	if (nmi_disables_ftrace(regs)) {
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 2f14ac3c377c..75cee7cdf887 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -129,7 +129,6 @@ name:
 #define IISIDE		.L_IISIDE_\name\()	/* Uses SRR0/1 not DAR/DSISR */
 #define IDAR		.L_IDAR_\name\()	/* Uses DAR (or SRR0) */
 #define IDSISR		.L_IDSISR_\name\()	/* Uses DSISR (or SRR1) */
-#define ISET_RI		.L_ISET_RI_\name\()	/* Run common code w/ MSR[RI]=1 */
 #define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
 #define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\() /* Common runs in realmode */
 #define IMASK		.L_IMASK_\name\()	/* IRQ soft-mask bit */
@@ -174,9 +173,6 @@ do_define_int n
 	.ifndef IDSISR
 		IDSISR=0
 	.endif
-	.ifndef ISET_RI
-		ISET_RI=1
-	.endif
 	.ifndef IBRANCH_TO_COMMON
 		IBRANCH_TO_COMMON=1
 	.endif
@@ -582,11 +578,6 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 	stb	r10,PACASRR_VALID(r13)
 	.endif
 
-	.if ISET_RI
-	li	r10,MSR_RI
-	mtmsrd	r10,1			/* Set MSR_RI */
-	.endif
-
 	.if ISTACK
 	.if IKUAP
 	kuap_save_amr_and_lock r9, r10, cr1, cr0
@@ -931,11 +922,6 @@ INT_DEFINE_BEGIN(system_reset)
 	IVEC=0x100
 	IAREA=PACA_EXNMI
 	IVIRT=0 /* no virt entry point */
-	/*
-	 * MSR_RI is not enabled, because PACA_EXNMI and nmi stack is
-	 * being used, so a nested NMI exception would corrupt it.
-	 */
-	ISET_RI=0
 	ISTACK=0
 	IKVM_REAL=1
 INT_DEFINE_END(system_reset)
@@ -1016,8 +1002,6 @@ EXC_COMMON_BEGIN(system_reset_common)
 	lhz	r10,PACA_IN_NMI(r13)
 	addi	r10,r10,1
 	sth	r10,PACA_IN_NMI(r13)
-	li	r10,MSR_RI
-	mtmsrd 	r10,1
 
 	mr	r10,r1
 	ld	r1,PACA_NMI_EMERG_SP(r13)
@@ -1095,12 +1079,6 @@ INT_DEFINE_BEGIN(machine_check_early)
 	IAREA=PACA_EXMC
 	IVIRT=0 /* no virt entry point */
 	IREALMODE_COMMON=1
-	/*
-	 * MSR_RI is not enabled, because PACA_EXMC is being used, so a
-	 * nested machine check corrupts it. machine_check_common enables
-	 * MSR_RI.
-	 */
-	ISET_RI=0
 	ISTACK=0
 	IDAR=1
 	IDSISR=1
@@ -1111,7 +1089,6 @@ INT_DEFINE_BEGIN(machine_check)
 	IVEC=0x200
 	IAREA=PACA_EXMC
 	IVIRT=0 /* no virt entry point */
-	ISET_RI=0
 	IDAR=1
 	IDSISR=1
 	IKVM_SKIP=1
@@ -1182,9 +1159,6 @@ EXC_COMMON_BEGIN(machine_check_early_common)
 BEGIN_FTR_SECTION
 	bl	enable_machine_check
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
-	li	r10,MSR_RI
-	mtmsrd	r10,1
-
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_early
 	std	r3,RESULT(r1)	/* Save result */
@@ -1272,10 +1246,6 @@ EXC_COMMON_BEGIN(machine_check_common)
 	 * save area: PACA_EXMC instead of PACA_EXGEN.
 	 */
 	GEN_COMMON machine_check
-
-	/* Enable MSR_RI when finished with PACA_EXMC */
-	li	r10,MSR_RI
-	mtmsrd 	r10,1
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_exception
 	b	interrupt_return_srr
-- 
2.23.0



* [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel
  2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
                   ` (12 preceding siblings ...)
  2021-03-15 22:04 ` [PATCH 13/14] powerpc/64: handle MSR EE and RI in interrupt entry wrapper Nicholas Piggin
@ 2021-03-15 22:04 ` Nicholas Piggin
  2021-05-17 14:09   ` Christophe Leroy
  13 siblings, 1 reply; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-15 22:04 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Change the default PPR for userspace to 4 (medium), matching the
normal kernel PPR.

This allows system calls and user interrupts to avoid setting PPR on
entry and exit, providing a significant speedup.

This is a change to the user environment. The problem with instead
changing the kernel to match userspace at 3 (medium-low) is that
userspace could then boost its priority above the kernel, which is also
undesirable.

glibc does not seem to change PPR anywhere, so the decision is to
go with this.
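
For reference, the default PPR value this produces, expanded from the
macros changed in the hunk below (the hex value is just the arithmetic
worked out, not a new definition):

	/* PPR_PRIORITY 4 shifted into the PPR priority field (bits 11-13): */
	#define PPR_PRIORITY	4
	#define DEFAULT_PPR	((unsigned long)PPR_PRIORITY << 50)
	/* i.e. DEFAULT_PPR == 0x0010000000000000 for both kernel and user */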

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  2 ++
 arch/powerpc/include/asm/processor.h |  4 ++--
 arch/powerpc/kernel/exceptions-64s.S |  3 ---
 arch/powerpc/kernel/interrupt.c      | 33 ++++++++++++++++++++++++++++
 arch/powerpc/kernel/interrupt_64.S   | 17 --------------
 5 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index d6d54bbcba2f..293e6be9fd71 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -57,6 +57,8 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 #ifdef CONFIG_PPC64
 	bool trace_enable = false;
 
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, DEFAULT_PPR);
 	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) {
 		if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED)
 			trace_enable = true;
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index cb1edf21a82e..5ff589042103 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -27,8 +27,8 @@
 #endif
 
 #ifdef CONFIG_PPC64
-/* Default SMT priority is set to 3. Use 11- 13bits to save priority. */
-#define PPR_PRIORITY 3
+/* Default SMT priority is set to 4. Use 11- 13bits to save priority. */
+#define PPR_PRIORITY 4
 #ifdef __ASSEMBLY__
 #define DEFAULT_PPR (PPR_PRIORITY << 50)
 #else
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 75cee7cdf887..0d40614d13e0 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -367,7 +367,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 BEGIN_FTR_SECTION
 	mfspr	r9,SPRN_PPR
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-	HMT_MEDIUM
 	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
 BEGIN_FTR_SECTION
 	mfspr	r10,SPRN_CFAR
@@ -1962,8 +1961,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	mfspr	r11,SPRN_SRR0
 	mfspr	r12,SPRN_SRR1
 
-	HMT_MEDIUM
-
 	.if ! \virt
 	__LOAD_HANDLER(r10, system_call_common_real)
 	mtctr	r10
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 09cf699d0e2e..a6e0595da0dd 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -40,6 +40,11 @@ notrace long system_call_exception(long r3, long r4, long r5,
 
 	regs->orig_gpr3 = r3;
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, DEFAULT_PPR);
+#endif
+
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
 
@@ -237,6 +242,11 @@ notrace unsigned long syscall_exit_prepare_main(unsigned long r3,
 
 	account_cpu_user_exit();
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, regs->ppr);
+#endif
+
 #ifndef CONFIG_PPC_BOOK3E_64 /* BOOK3E not using this */
 	/*
 	 * We do this at the end so that we do context switch with KERNEL AMR
@@ -315,6 +325,11 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
 	 */
 	hard_irq_disable();
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, DEFAULT_PPR);
+#endif
+
 	trace_hardirqs_off();
 	user_exit_irqoff();
 	account_cpu_user_entry();
@@ -398,6 +413,11 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 
 	account_cpu_user_exit();
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, regs->ppr);
+#endif
+
 	/*
 	 * We do this at the end so that we do context switch with KERNEL AMR
 	 */
@@ -489,6 +509,11 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 	local_paca->tm_scratch = regs->msr;
 #endif
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, regs->ppr);
+#endif
+
 	/*
 	 * Don't want to mfspr(SPRN_AMR) here, because this comes after mtmsr,
 	 * which would cause Read-After-Write stalls. Hence, we take the AMR
@@ -505,6 +530,10 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
 {
 	hard_irq_disable();
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, DEFAULT_PPR);
+#endif
 
 	trace_hardirqs_off();
 	user_exit_irqoff();
@@ -523,6 +552,10 @@ notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
 notrace unsigned long interrupt_exit_kernel_restart(struct pt_regs *regs)
 {
 	hard_irq_disable();
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (unlikely(regs->ppr != DEFAULT_PPR))
+		mtspr(SPRN_PPR, DEFAULT_PPR);
+#endif
 
 #ifndef CONFIG_PPC_BOOK3E_64
 	set_kuap(AMR_KUAP_BLOCKED);
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index eef61800f734..53fc446dcbeb 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -99,10 +99,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)		/* "regshere" marker */
 
-BEGIN_FTR_SECTION
-	HMT_MEDIUM
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
 	ENTER_KERNEL_SECURITY_FALLBACK
 
 	/*
@@ -142,10 +138,6 @@ BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1			/* to clear the reservation */
 END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 
-BEGIN_FTR_SECTION
-	HMT_MEDIUM_LOW
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
 	cmpdi	r3,0
 	bne	.Lsyscall_vectored_\name\()_restore_regs
 
@@ -377,10 +369,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	mtspr	SPRN_XER,r0
 .Lsyscall_restore_regs_cont:
 
-BEGIN_FTR_SECTION
-	HMT_MEDIUM_LOW
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
 	/*
 	 * We don't need to restore AMR on the way back to userspace for KUAP.
 	 * The value of AMR only matters while we're in the kernel.
@@ -533,11 +521,6 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
 	tdnei	r4,IRQS_ENABLED
 
 #ifdef CONFIG_PPC_BOOK3S
-BEGIN_FTR_SECTION
-	ld	r10,_PPR(r1)
-	mtspr	SPRN_PPR,r10
-END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
-
 	.ifc \srr,srr
 	lbz	r4,PACASRR_VALID(r13)
 	.else
-- 
2.23.0



* Re: [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
  2021-03-15 22:04 ` [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE] Nicholas Piggin
@ 2021-03-16  7:21   ` Christophe Leroy
  2021-03-16  8:13     ` Nicholas Piggin
  2021-03-19 11:29     ` Michael Ellerman
  0 siblings, 2 replies; 26+ messages in thread
From: Christophe Leroy @ 2021-03-16  7:21 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



Le 15/03/2021 à 23:04, Nicholas Piggin a écrit :
> This extends the MSR[RI]=0 window a little further into the system
> call in order to pair RI and EE enabling with a single mtmsrd.

Time ago, I proposed to delay that on PPC32 and Michael objected, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/9f9dd859d571e324c7412ed9db9da8cfba678257.1548956511.git.christophe.leroy@c-s.fr/


> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/kernel/exceptions-64s.S | 2 --
>   arch/powerpc/kernel/interrupt_64.S   | 6 +++---
>   2 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index bd0c82ac9de5..2f14ac3c377c 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1999,8 +1999,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
>   	mtctr	r10
>   	bctr
>   	.else
> -	li	r10,MSR_RI
> -	mtmsrd 	r10,1			/* Set RI (EE=0) */
>   #ifdef CONFIG_RELOCATABLE
>   	__LOAD_HANDLER(r10, system_call_common)
>   	mtctr	r10
> diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
> index f28f41a1a85a..eef61800f734 100644
> --- a/arch/powerpc/kernel/interrupt_64.S
> +++ b/arch/powerpc/kernel/interrupt_64.S
> @@ -311,10 +311,10 @@ END_BTB_FLUSH_SECTION
>   	 * nothing pending. system_call_exception() will call
>   	 * trace_hardirqs_off().
>   	 */
> -	li	r11,IRQS_ALL_DISABLED
> -	li	r12,PACA_IRQ_HARD_DIS
> +	li	r11,IRQS_DISABLED
> +	li	r12,-1 /* Set MSR_EE and MSR_RI */
>   	stb	r11,PACAIRQSOFTMASK(r13)
> -	stb	r12,PACAIRQHAPPENED(r13)
> +	mtmsrd	r12,1
>   
>   	ENTER_KERNEL_SECURITY_FALLBACK
>   
> 


* Re: [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
  2021-03-16  7:21   ` Christophe Leroy
@ 2021-03-16  8:13     ` Nicholas Piggin
  2021-03-19 11:29     ` Michael Ellerman
  1 sibling, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-16  8:13 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of March 16, 2021 5:21 pm:
> 
> 
> Le 15/03/2021 à 23:04, Nicholas Piggin a écrit :
>> This extends the MSR[RI]=0 window a little further into the system
>> call in order to pair RI and EE enabling with a single mtmsrd.
> 
> Some time ago, I proposed delaying that on PPC32 and Michael objected, see
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/9f9dd859d571e324c7412ed9db9da8cfba678257.1548956511.git.christophe.leroy@c-s.fr/

Yeah, it is a concern. The speedup should be at least 5% on 64s, I think
(I have not measured it in isolation yet), so it might be worth it.

We'll see.

Thanks,
Nick


* Re: [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt
  2021-03-15 22:03 ` [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt Nicholas Piggin
@ 2021-03-16 19:34   ` Christophe Leroy
  2021-03-16 23:46     ` Nicholas Piggin
  0 siblings, 1 reply; 26+ messages in thread
From: Christophe Leroy @ 2021-03-16 19:34 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 15/03/2021 at 23:03, Nicholas Piggin wrote:
> The restart table facility is used to return from interrupt without
> disabling MSR EE or RI.

What happens when an interrupt occurs between the point where you restore the user r1 and the rfi that
returns to userspace?
If an interrupt happens there, the interrupt prologue sees it as an interrupt coming from the kernel, so
it uses r1 as-is, but r1 points to the user stack.

Don't we end up in kernel_bad_stack()?

Or do we take a KUAP fault and end up in an infinite loop?

> 
> Interrupt return code is put into the low soft-masked region.
> 
> The critical code (entered with no exit work left, the SRRs set, and the
> soft-masked state set to the return state) saves r1 in the PACA and then
> begins to run instructions that have an alternate return handler.
> 
> In this region, pending interrupts are checked, and if any exist then it
> branches directly to the restart handler.
> 
> If it does not branch, then no masked interrupts are pending, and if any
> interrupts do hit, we will go out via the restart handler.
> 
> The restart handler reloads the saved r1, and from there we can find regs
> and reload critical state before setting things up to replay interrupts
> and go around the exit prepare sequence again.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>


* Re: [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt
  2021-03-16 19:34   ` Christophe Leroy
@ 2021-03-16 23:46     ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-03-16 23:46 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of March 17, 2021 5:34 am:
> 
> 
> On 15/03/2021 at 23:03, Nicholas Piggin wrote:
>> The restart table facility is used to return from interrupt without
>> disabling MSR EE or RI.
> 
> What happens when an interrupt occurs between the point where you restore the user r1 and the rfi that
> returns to userspace?
> If an interrupt happens there, the interrupt prologue sees it as an interrupt coming from the kernel, so
> it uses r1 as-is, but r1 points to the user stack.

The interrupt is "soft-masked" because it arrives from the kernel with an
address below __end_soft_masked. Masked interrupts never touch the
stack. The masked-interrupt handler then checks the restart table and finds
an entry, so it returns to the restart handler, which loads the previous r1
from the PACA.

Thanks,
Nick
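
A rough C sketch of the restart-table lookup described above, purely for
illustration: the structure and function names below are invented, and the
real kernel keeps the table in a dedicated section and walks it from the
masked-interrupt assembly rather than from C.

	struct restart_entry {
		unsigned long start;	/* first instruction of the critical region */
		unsigned long end;	/* one past the last instruction */
		unsigned long fixup;	/* where to restart if interrupted in [start, end) */
	};

	/* hypothetical table bounds, as if provided by the linker script */
	extern struct restart_entry __restart_table_start[], __restart_table_end[];

	static unsigned long search_restart_table(unsigned long nip)
	{
		struct restart_entry *re;

		for (re = __restart_table_start; re < __restart_table_end; re++) {
			if (nip >= re->start && nip < re->end)
				return re->fixup;
		}
		return 0;	/* not in a restartable region */
	}

If the masked interrupt's return NIP falls inside such a region, the return
address is rewritten to the fixup, so execution resumes at the restart
handler, which reloads the saved r1 from the PACA and goes around the exit
prepare sequence again.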



* Re: [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
  2021-03-16  7:21   ` Christophe Leroy
  2021-03-16  8:13     ` Nicholas Piggin
@ 2021-03-19 11:29     ` Michael Ellerman
  1 sibling, 0 replies; 26+ messages in thread
From: Michael Ellerman @ 2021-03-19 11:29 UTC (permalink / raw)
  To: Christophe Leroy, Nicholas Piggin, linuxppc-dev

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 15/03/2021 at 23:04, Nicholas Piggin wrote:
>> This extends the MSR[RI]=0 window a little further into the system
>> call in order to pair RI and EE enabling with a single mtmsrd.
>
> Some time ago, I proposed delaying that on PPC32 and Michael objected, see
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/9f9dd859d571e324c7412ed9db9da8cfba678257.1548956511.git.christophe.leroy@c-s.fr/

I don't think I objected, I was just curious about what the added
exposure to RI=0 was :)

cheers

>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index bd0c82ac9de5..2f14ac3c377c 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -1999,8 +1999,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
>>   	mtctr	r10
>>   	bctr
>>   	.else
>> -	li	r10,MSR_RI
>> -	mtmsrd 	r10,1			/* Set RI (EE=0) */
>>   #ifdef CONFIG_RELOCATABLE
>>   	__LOAD_HANDLER(r10, system_call_common)
>>   	mtctr	r10
>> diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
>> index f28f41a1a85a..eef61800f734 100644
>> --- a/arch/powerpc/kernel/interrupt_64.S
>> +++ b/arch/powerpc/kernel/interrupt_64.S
>> @@ -311,10 +311,10 @@ END_BTB_FLUSH_SECTION
>>   	 * nothing pending. system_call_exception() will call
>>   	 * trace_hardirqs_off().
>>   	 */
>> -	li	r11,IRQS_ALL_DISABLED
>> -	li	r12,PACA_IRQ_HARD_DIS
>> +	li	r11,IRQS_DISABLED
>> +	li	r12,-1 /* Set MSR_EE and MSR_RI */
>>   	stb	r11,PACAIRQSOFTMASK(r13)
>> -	stb	r12,PACAIRQHAPPENED(r13)
>> +	mtmsrd	r12,1
>>   
>>   	ENTER_KERNEL_SECURITY_FALLBACK
>>   
>> 


* Re: [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  2021-03-15 22:03 ` [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid Nicholas Piggin
@ 2021-04-02 22:39   ` Michael Ellerman
  2021-04-04  0:49     ` Nicholas Piggin
  2021-04-03  2:28   ` Michael Ellerman
  1 sibling, 1 reply; 26+ messages in thread
From: Michael Ellerman @ 2021-04-02 22:39 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:
> When an interrupt is taken, the SRR registers are set to return to
> where it left off. Unless they are modified in the meantime, or the
> return address or MSR are modified, there is no need to reload these
> registers when returning from interrupt.
>
> Introduce per-CPU flags that track the validity of the SRR and HSRR
> registers, and clear them when returning from an interrupt, when using the
> registers for something else (e.g., OPAL calls), or when adjusting the
> return address or MSR.
>
> This improves the performance of interrupt returns.
>
> XXX: may not need to invalidate both hsrr and srr all the time
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

I needed something like below to get 32-bit building.

cheers


diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 6d6237e0cbd7..7f9bbd19db10 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -153,15 +153,21 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
 	regs->gpr[3] = rc;
 }
 
-static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
+static inline void invalidate_srrs(void)
 {
-	regs->nip = ip;
 #ifdef CONFIG_PPC_BOOK3S_64
+	// XXX: We may not need to invalidate both hsrr and srr all the time
 	local_paca->hsrr_valid = 0;
 	local_paca->srr_valid = 0;
 #endif
 }
 
+static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
+{
+	regs->nip = ip;
+	invalidate_srrs();
+}
+
 static inline void regs_add_return_ip(struct pt_regs *regs, long offset)
 {
 	regs_set_return_ip(regs, regs->nip + offset);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 200b4805f999..82623b57e2d6 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -98,8 +98,7 @@ static void check_if_tm_restore_required(struct task_struct *tsk)
 	    !test_thread_flag(TIF_RESTORE_TM)) {
 		tsk->thread.ckpt_regs.msr = tsk->thread.regs->msr;
 		set_thread_flag(TIF_RESTORE_TM);
-		local_paca->hsrr_valid = 0;
-		local_paca->srr_valid = 0;
+		invalidate_srrs();
 	}
 }
 
@@ -164,8 +163,7 @@ static void __giveup_fpu(struct task_struct *tsk)
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr &= ~MSR_VSX;
 	tsk->thread.regs->msr = msr;
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 }
 
 void giveup_fpu(struct task_struct *tsk)
@@ -249,8 +247,7 @@ static void __giveup_altivec(struct task_struct *tsk)
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr &= ~MSR_VSX;
 	tsk->thread.regs->msr = msr;
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 }
 
 void giveup_altivec(struct task_struct *tsk)
@@ -566,8 +563,7 @@ void notrace restore_math(struct pt_regs *regs)
 		msr_check_and_clear(new_msr);
 
 		regs->msr |= new_msr | fpexc_mode;
-		local_paca->hsrr_valid = 0;
-		local_paca->srr_valid = 0;
+		invalidate_srrs();
 	}
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
@@ -1293,8 +1289,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 			atomic_read(&current->mm->context.vas_windows)))
 			asm volatile(PPC_CP_ABORT);
 	}
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 	return last;
@@ -1884,8 +1879,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 	current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 }
 EXPORT_SYMBOL(start_thread);
 
@@ -1936,8 +1930,7 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
 	if (regs != NULL && (regs->msr & MSR_FP) != 0) {
 		regs->msr = (regs->msr & ~(MSR_FE0|MSR_FE1))
 			| tsk->thread.fpexc_mode;
-		local_paca->hsrr_valid = 0;
-		local_paca->srr_valid = 0;
+		invalidate_srrs();
 	}
 	return 0;
 }
@@ -1990,8 +1983,7 @@ int set_endian(struct task_struct *tsk, unsigned int val)
 	else
 		return -EINVAL;
 
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 
 	return 0;
 }
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 4cb38afa28a8..9d1d6070a516 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -115,8 +115,8 @@ SYSCALL_DEFINE0(switch_endian)
 	struct thread_info *ti;
 
 	current->thread.regs->msr ^= MSR_LE;
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+
+	invalidate_srrs();
 
 	/*
 	 * Set TIF_RESTOREALL so that r3 isn't clobbered on return to
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 96505d4bba1c..2b94bf21d6ae 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -3480,8 +3480,7 @@ int emulate_step(struct pt_regs *regs, struct ppc_inst instr)
 	unsigned long val;
 	unsigned long ea;
 
-	local_paca->hsrr_valid = 0;
-	local_paca->srr_valid = 0;
+	invalidate_srrs();
 
 	r = analyse_instr(&op, regs, instr);
 	if (r < 0)
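
For what it's worth, the fast path that these validity flags enable can be
pictured in C pseudocode roughly as below. This is only a sketch of the idea
in the commit message; the helper name is invented and the real code is the
assembly in the interrupt return path.

	/*
	 * Sketch: skip the SRR0/SRR1 mtsprs on interrupt return when the
	 * per-CPU flag says the registers still hold the values we are
	 * returning with.
	 */
	static void restore_srrs_if_needed(struct pt_regs *regs)
	{
		if (!local_paca->srr_valid) {
			/* NIP/MSR changed or the SRRs were clobbered: reload them */
			mtspr(SPRN_SRR0, regs->nip);
			mtspr(SPRN_SRR1, regs->msr);
			local_paca->srr_valid = 1;
		}
		/* otherwise SRR0/SRR1 are already correct and the mtsprs are skipped */
	}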


* Re: [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  2021-03-15 22:03 ` [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid Nicholas Piggin
  2021-04-02 22:39   ` Michael Ellerman
@ 2021-04-03  2:28   ` Michael Ellerman
  2021-04-04  0:51     ` Nicholas Piggin
  1 sibling, 1 reply; 26+ messages in thread
From: Michael Ellerman @ 2021-04-03  2:28 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index ccf913cedd29..b466b3e1bb3f 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -64,6 +64,30 @@ exception_marker:
>  	.section	".text"
>  	.align 7
>  
> +.macro DEBUG_SRR_VALID srr
> +#ifdef CONFIG_PPC_RFI_SRR_DEBUG
> +	.ifc \srr,srr
> +	mfspr	r11,SPRN_SRR0
> +	ld	r12,_NIP(r1)
> +100:	tdne	r11,r12
> +	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)

This always points at *this* line, not the caller. Works better with the
patch below.

cheers


diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index b466b3e1bb3f..ada76b1279f9 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -64,26 +64,26 @@
 	.section	".text"
 	.align 7
 
-.macro DEBUG_SRR_VALID srr
+.macro DEBUG_SRR_VALID srr line
 #ifdef CONFIG_PPC_RFI_SRR_DEBUG
 	.ifc \srr,srr
 	mfspr	r11,SPRN_SRR0
 	ld	r12,_NIP(r1)
 100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
 	mfspr	r11,SPRN_SRR1
 	ld	r12,_MSR(r1)
 100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
 	.else
 	mfspr	r11,SPRN_HSRR0
 	ld	r12,_NIP(r1)
 100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
 	mfspr	r11,SPRN_HSRR1
 	ld	r12,_MSR(r1)
 100:	tdne	r11,r12
-	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
+	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
 	.endif
 #endif
 .endm
@@ -358,7 +358,7 @@ END_BTB_FLUSH_SECTION
 	mtspr	SPRN_SRR0,r4
 	mtspr	SPRN_SRR1,r5
 1:
-	DEBUG_SRR_VALID srr
+	DEBUG_SRR_VALID srr __LINE__
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1			/* to clear the reservation */
@@ -753,7 +753,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	stb	r4,PACAHSRR_VALID(r13)
 #endif
 	.endif
-	DEBUG_SRR_VALID \srr
+	DEBUG_SRR_VALID \srr __LINE__
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */
@@ -825,7 +825,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	stb	r4,PACAHSRR_VALID(r13)
 #endif
 	.endif
-	DEBUG_SRR_VALID \srr
+	DEBUG_SRR_VALID \srr __LINE__
 
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1		/* to clear the reservation */


* Re: [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  2021-04-02 22:39   ` Michael Ellerman
@ 2021-04-04  0:49     ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-04-04  0:49 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman

Excerpts from Michael Ellerman's message of April 3, 2021 8:39 am:
> Nicholas Piggin <npiggin@gmail.com> writes:
>> When an interrupt is taken, the SRR registers are set to return to
>> where it left off. Unless they are modified in the meantime, or the
>> return address or MSR are modified, there is no need to reload these
>> registers when returning from interrupt.
>>
>> Introduce per-CPU flags that track the validity of the SRR and HSRR
>> registers, and clear them when returning from an interrupt, when using the
>> registers for something else (e.g., OPAL calls), or when adjusting the
>> return address or MSR.
>>
>> This improves the performance of interrupt returns.
>>
>> XXX: may not need to invalidate both hsrr and srr all the time
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
> 
> I needed something like below to get 32-bit building.

That looks much better.

Thanks,
Nick

> 
> cheers
> 
> 
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index 6d6237e0cbd7..7f9bbd19db10 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -153,15 +153,21 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
>  	regs->gpr[3] = rc;
>  }
>  
> -static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
> +static inline void invalidate_srrs(void)
>  {
> -	regs->nip = ip;
>  #ifdef CONFIG_PPC_BOOK3S_64
> +	// XXX: We may not need to invalidate both hsrr and srr all the time
>  	local_paca->hsrr_valid = 0;
>  	local_paca->srr_valid = 0;
>  #endif
>  }
>  
> +static inline void regs_set_return_ip(struct pt_regs *regs, unsigned long ip)
> +{
> +	regs->nip = ip;
> +	invalidate_srrs();
> +}
> +
>  static inline void regs_add_return_ip(struct pt_regs *regs, long offset)
>  {
>  	regs_set_return_ip(regs, regs->nip + offset);
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 200b4805f999..82623b57e2d6 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -98,8 +98,7 @@ static void check_if_tm_restore_required(struct task_struct *tsk)
>  	    !test_thread_flag(TIF_RESTORE_TM)) {
>  		tsk->thread.ckpt_regs.msr = tsk->thread.regs->msr;
>  		set_thread_flag(TIF_RESTORE_TM);
> -		local_paca->hsrr_valid = 0;
> -		local_paca->srr_valid = 0;
> +		invalidate_srrs();
>  	}
>  }
>  
> @@ -164,8 +163,7 @@ static void __giveup_fpu(struct task_struct *tsk)
>  	if (cpu_has_feature(CPU_FTR_VSX))
>  		msr &= ~MSR_VSX;
>  	tsk->thread.regs->msr = msr;
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  }
>  
>  void giveup_fpu(struct task_struct *tsk)
> @@ -249,8 +247,7 @@ static void __giveup_altivec(struct task_struct *tsk)
>  	if (cpu_has_feature(CPU_FTR_VSX))
>  		msr &= ~MSR_VSX;
>  	tsk->thread.regs->msr = msr;
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  }
>  
>  void giveup_altivec(struct task_struct *tsk)
> @@ -566,8 +563,7 @@ void notrace restore_math(struct pt_regs *regs)
>  		msr_check_and_clear(new_msr);
>  
>  		regs->msr |= new_msr | fpexc_mode;
> -		local_paca->hsrr_valid = 0;
> -		local_paca->srr_valid = 0;
> +		invalidate_srrs();
>  	}
>  }
>  #endif /* CONFIG_PPC_BOOK3S_64 */
> @@ -1293,8 +1289,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
>  			atomic_read(&current->mm->context.vas_windows)))
>  			asm volatile(PPC_CP_ABORT);
>  	}
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  #endif /* CONFIG_PPC_BOOK3S_64 */
>  
>  	return last;
> @@ -1884,8 +1879,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
>  	current->thread.load_tm = 0;
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>  
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  }
>  EXPORT_SYMBOL(start_thread);
>  
> @@ -1936,8 +1930,7 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
>  	if (regs != NULL && (regs->msr & MSR_FP) != 0) {
>  		regs->msr = (regs->msr & ~(MSR_FE0|MSR_FE1))
>  			| tsk->thread.fpexc_mode;
> -		local_paca->hsrr_valid = 0;
> -		local_paca->srr_valid = 0;
> +		invalidate_srrs();
>  	}
>  	return 0;
>  }
> @@ -1990,8 +1983,7 @@ int set_endian(struct task_struct *tsk, unsigned int val)
>  	else
>  		return -EINVAL;
>  
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  
>  	return 0;
>  }
> diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
> index 4cb38afa28a8..9d1d6070a516 100644
> --- a/arch/powerpc/kernel/syscalls.c
> +++ b/arch/powerpc/kernel/syscalls.c
> @@ -115,8 +115,8 @@ SYSCALL_DEFINE0(switch_endian)
>  	struct thread_info *ti;
>  
>  	current->thread.regs->msr ^= MSR_LE;
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +
> +	invalidate_srrs();
>  
>  	/*
>  	 * Set TIF_RESTOREALL so that r3 isn't clobbered on return to
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 96505d4bba1c..2b94bf21d6ae 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -3480,8 +3480,7 @@ int emulate_step(struct pt_regs *regs, struct ppc_inst instr)
>  	unsigned long val;
>  	unsigned long ea;
>  
> -	local_paca->hsrr_valid = 0;
> -	local_paca->srr_valid = 0;
> +	invalidate_srrs();
>  
>  	r = analyse_instr(&op, regs, instr);
>  	if (r < 0)
> 


* Re: [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid
  2021-04-03  2:28   ` Michael Ellerman
@ 2021-04-04  0:51     ` Nicholas Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nicholas Piggin @ 2021-04-04  0:51 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman

Excerpts from Michael Ellerman's message of April 3, 2021 12:28 pm:
> Nicholas Piggin <npiggin@gmail.com> writes:
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index ccf913cedd29..b466b3e1bb3f 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -64,6 +64,30 @@ exception_marker:
>>  	.section	".text"
>>  	.align 7
>>  
>> +.macro DEBUG_SRR_VALID srr
>> +#ifdef CONFIG_PPC_RFI_SRR_DEBUG
>> +	.ifc \srr,srr
>> +	mfspr	r11,SPRN_SRR0
>> +	ld	r12,_NIP(r1)
>> +100:	tdne	r11,r12
>> +	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
> 
> This always points at *this* line, not the caller. Works better with the
> patch below.

Good thinking.

Thanks,
Nick

> 
> cheers
> 
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index b466b3e1bb3f..ada76b1279f9 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -64,26 +64,26 @@
>  	.section	".text"
>  	.align 7
>  
> -.macro DEBUG_SRR_VALID srr
> +.macro DEBUG_SRR_VALID srr line
>  #ifdef CONFIG_PPC_RFI_SRR_DEBUG
>  	.ifc \srr,srr
>  	mfspr	r11,SPRN_SRR0
>  	ld	r12,_NIP(r1)
>  100:	tdne	r11,r12
> -	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
> +	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
>  	mfspr	r11,SPRN_SRR1
>  	ld	r12,_MSR(r1)
>  100:	tdne	r11,r12
> -	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
> +	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
>  	.else
>  	mfspr	r11,SPRN_HSRR0
>  	ld	r12,_NIP(r1)
>  100:	tdne	r11,r12
> -	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
> +	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
>  	mfspr	r11,SPRN_HSRR1
>  	ld	r12,_MSR(r1)
>  100:	tdne	r11,r12
> -	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,(BUGFLAG_WARNING | BUGFLAG_ONCE)
> +	EMIT_BUG_ENTRY 100b,__FILE__,\line,(BUGFLAG_WARNING | BUGFLAG_ONCE)
>  	.endif
>  #endif
>  .endm
> @@ -358,7 +358,7 @@ END_BTB_FLUSH_SECTION
>  	mtspr	SPRN_SRR0,r4
>  	mtspr	SPRN_SRR1,r5
>  1:
> -	DEBUG_SRR_VALID srr
> +	DEBUG_SRR_VALID srr __LINE__
>  
>  BEGIN_FTR_SECTION
>  	stdcx.	r0,0,r1			/* to clear the reservation */
> @@ -753,7 +753,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	stb	r4,PACAHSRR_VALID(r13)
>  #endif
>  	.endif
> -	DEBUG_SRR_VALID \srr
> +	DEBUG_SRR_VALID \srr __LINE__
>  
>  BEGIN_FTR_SECTION
>  	stdcx.	r0,0,r1		/* to clear the reservation */
> @@ -825,7 +825,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>  	stb	r4,PACAHSRR_VALID(r13)
>  #endif
>  	.endif
> -	DEBUG_SRR_VALID \srr
> +	DEBUG_SRR_VALID \srr __LINE__
>  
>  BEGIN_FTR_SECTION
>  	stdcx.	r0,0,r1		/* to clear the reservation */
> 


* Re: [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument
  2021-03-15 22:03 ` [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument Nicholas Piggin
@ 2021-05-17 13:49   ` Christophe Leroy
  0 siblings, 0 replies; 26+ messages in thread
From: Christophe Leroy @ 2021-05-17 13:49 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 15/03/2021 at 23:03, Nicholas Piggin wrote:
> The msr argument is not used, remove it.

And why not use it instead of re-reading regs->msr?

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/asm-prototypes.h | 4 ++--
>   arch/powerpc/kernel/interrupt.c           | 4 ++--
>   2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
> index 1c7b75834e04..95492655462e 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -71,8 +71,8 @@ void __init machine_init(u64 dt_ptr);
>   #endif
>   long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs);
>   notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv);
> -notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr);
> -notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr);
> +notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs);
> +notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs);
>   
>   long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
>   		      u32 len_high, u32 len_low);
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 96ca27ef68ae..efeeefe6ee8f 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -359,7 +359,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   	return ret;
>   }
>   
> -notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr)
> +notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
>   {
>   	unsigned long ti_flags;
>   	unsigned long flags;
> @@ -443,7 +443,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
>   
>   void preempt_schedule_irq(void);
>   
> -notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr)
> +notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>   {
>   	unsigned long flags;
>   	unsigned long ret = 0;
> 


* Re: [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel
  2021-03-15 22:04 ` [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel Nicholas Piggin
@ 2021-05-17 14:09   ` Christophe Leroy
  0 siblings, 0 replies; 26+ messages in thread
From: Christophe Leroy @ 2021-05-17 14:09 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 15/03/2021 at 23:04, Nicholas Piggin wrote:
> Change the default PPR for userspace to 4 (medium), matching the
> normal kernel PPR.
> 
> This allows system calls and user interrupts to avoid setting PPR on
> entry and exit, providing a significant speedup.
> 
> This is a change to the user environment. The problem with changing
> the kernel to match userspace at 3 (medium-low) is that userspace
> can then boost its priority above the kernel, which is also undesirable.
> 
> glibc does not seem to change PPR anywhere, so the decision is to
> go with this.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/interrupt.h |  2 ++
>   arch/powerpc/include/asm/processor.h |  4 ++--
>   arch/powerpc/kernel/exceptions-64s.S |  3 ---
>   arch/powerpc/kernel/interrupt.c      | 33 ++++++++++++++++++++++++++++
>   arch/powerpc/kernel/interrupt_64.S   | 17 --------------
>   5 files changed, 37 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index d6d54bbcba2f..293e6be9fd71 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -57,6 +57,8 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>   #ifdef CONFIG_PPC64
>   	bool trace_enable = false;
>   
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, DEFAULT_PPR);
>   	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) {
>   		if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED)
>   			trace_enable = true;
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index cb1edf21a82e..5ff589042103 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -27,8 +27,8 @@
>   #endif
>   
>   #ifdef CONFIG_PPC64
> -/* Default SMT priority is set to 3. Use 11- 13bits to save priority. */
> -#define PPR_PRIORITY 3
> +/* Default SMT priority is set to 4. Use 11- 13bits to save priority. */
> +#define PPR_PRIORITY 4
>   #ifdef __ASSEMBLY__
>   #define DEFAULT_PPR (PPR_PRIORITY << 50)
>   #else
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 75cee7cdf887..0d40614d13e0 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -367,7 +367,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>   BEGIN_FTR_SECTION
>   	mfspr	r9,SPRN_PPR
>   END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> -	HMT_MEDIUM
>   	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
>   BEGIN_FTR_SECTION
>   	mfspr	r10,SPRN_CFAR
> @@ -1962,8 +1961,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
>   	mfspr	r11,SPRN_SRR0
>   	mfspr	r12,SPRN_SRR1
>   
> -	HMT_MEDIUM
> -
>   	.if ! \virt
>   	__LOAD_HANDLER(r10, system_call_common_real)
>   	mtctr	r10
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 09cf699d0e2e..a6e0595da0dd 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -40,6 +40,11 @@ notrace long system_call_exception(long r3, long r4, long r5,
>   
>   	regs->orig_gpr3 = r3;
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, DEFAULT_PPR);
> +#endif

Can you add some helper functions to do this instead of those 4-line #ifdef'ed blocks all over the
place?

Something like

#ifdef CONFIG_PPC_BOOK3S_64
static inline void set_ppr_regs(struct pt_regs *regs)
{
	if (unlikely(regs->ppr != DEFAULT_PPR))
		mtspr(SPRN_PPR, regs->ppr);
}

static inline void set_ppr_default(struct pt_regs *regs)
{
	if (unlikely(regs->ppr != DEFAULT_PPR))
		mtspr(SPRN_PPR, DEFAULT_PPR);
}
#else
static inline void set_ppr_regs(struct pt_regs *regs) { }
static inline void set_ppr_default(struct pt_regs *regs) { }
#endif
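
With helpers along those lines, each call site quoted in this patch would
presumably shrink to a single line, for example (a sketch only, reusing the
helper names suggested above):

	/* before (as in the patch) */
#ifdef CONFIG_PPC_BOOK3S_64
	if (unlikely(regs->ppr != DEFAULT_PPR))
		mtspr(SPRN_PPR, DEFAULT_PPR);
#endif

	/* after (with the suggested helper) */
	set_ppr_default(regs);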

> +
>   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
>   		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
>   
> @@ -237,6 +242,11 @@ notrace unsigned long syscall_exit_prepare_main(unsigned long r3,
>   
>   	account_cpu_user_exit();
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, regs->ppr);
> +#endif
> +
>   #ifndef CONFIG_PPC_BOOK3E_64 /* BOOK3E not using this */
>   	/*
>   	 * We do this at the end so that we do context switch with KERNEL AMR
> @@ -315,6 +325,11 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
>   	 */
>   	hard_irq_disable();
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, DEFAULT_PPR);
> +#endif
> +
>   	trace_hardirqs_off();
>   	user_exit_irqoff();
>   	account_cpu_user_entry();
> @@ -398,6 +413,11 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
>   
>   	account_cpu_user_exit();
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, regs->ppr);
> +#endif
> +
>   	/*
>   	 * We do this at the end so that we do context switch with KERNEL AMR
>   	 */
> @@ -489,6 +509,11 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>   	local_paca->tm_scratch = regs->msr;
>   #endif
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, regs->ppr);
> +#endif
> +
>   	/*
>   	 * Don't want to mfspr(SPRN_AMR) here, because this comes after mtmsr,
>   	 * which would cause Read-After-Write stalls. Hence, we take the AMR
> @@ -505,6 +530,10 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>   notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
>   {
>   	hard_irq_disable();
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, DEFAULT_PPR);
> +#endif
>   
>   	trace_hardirqs_off();
>   	user_exit_irqoff();
> @@ -523,6 +552,10 @@ notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
>   notrace unsigned long interrupt_exit_kernel_restart(struct pt_regs *regs)
>   {
>   	hard_irq_disable();
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (unlikely(regs->ppr != DEFAULT_PPR))
> +		mtspr(SPRN_PPR, DEFAULT_PPR);
> +#endif
>   
>   #ifndef CONFIG_PPC_BOOK3E_64
>   	set_kuap(AMR_KUAP_BLOCKED);
> diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
> index eef61800f734..53fc446dcbeb 100644
> --- a/arch/powerpc/kernel/interrupt_64.S
> +++ b/arch/powerpc/kernel/interrupt_64.S
> @@ -99,10 +99,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
>   	ld	r11,exception_marker@toc(r2)
>   	std	r11,-16(r10)		/* "regshere" marker */
>   
> -BEGIN_FTR_SECTION
> -	HMT_MEDIUM
> -END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> -
>   	ENTER_KERNEL_SECURITY_FALLBACK
>   
>   	/*
> @@ -142,10 +138,6 @@ BEGIN_FTR_SECTION
>   	stdcx.	r0,0,r1			/* to clear the reservation */
>   END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>   
> -BEGIN_FTR_SECTION
> -	HMT_MEDIUM_LOW
> -END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> -
>   	cmpdi	r3,0
>   	bne	.Lsyscall_vectored_\name\()_restore_regs
>   
> @@ -377,10 +369,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>   	mtspr	SPRN_XER,r0
>   .Lsyscall_restore_regs_cont:
>   
> -BEGIN_FTR_SECTION
> -	HMT_MEDIUM_LOW
> -END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> -
>   	/*
>   	 * We don't need to restore AMR on the way back to userspace for KUAP.
>   	 * The value of AMR only matters while we're in the kernel.
> @@ -533,11 +521,6 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
>   	tdnei	r4,IRQS_ENABLED
>   
>   #ifdef CONFIG_PPC_BOOK3S
> -BEGIN_FTR_SECTION
> -	ld	r10,_PPR(r1)
> -	mtspr	SPRN_PPR,r10
> -END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> -
>   	.ifc \srr,srr
>   	lbz	r4,PACASRR_VALID(r13)
>   	.else
> 


end of thread

Thread overview: 26+ messages
2021-03-15 22:03 [PATCH 00/14] powerpc/64: fast interrupt exits Nicholas Piggin
2021-03-15 22:03 ` [PATCH 01/14] powerpc: remove interrupt exit helpers unused argument Nicholas Piggin
2021-05-17 13:49   ` Christophe Leroy
2021-03-15 22:03 ` [PATCH 02/14] powerpc/64s: security fallback improvement Nicholas Piggin
2021-03-15 22:03 ` [PATCH 03/14] powerpc/64s: introduce different functions to return from SRR vs HSRR interrupts Nicholas Piggin
2021-03-15 22:03 ` [PATCH 04/14] powerpc/64s: avoid reloading (H)SRR registers if they are still valid Nicholas Piggin
2021-04-02 22:39   ` Michael Ellerman
2021-04-04  0:49     ` Nicholas Piggin
2021-04-03  2:28   ` Michael Ellerman
2021-04-04  0:51     ` Nicholas Piggin
2021-03-15 22:03 ` [PATCH 05/14] powerpc/64: move interrupt return asm to interrupt_64.S Nicholas Piggin
2021-03-15 22:03 ` [PATCH 06/14] powerpc/64s: save one more register in the masked interrupt handler Nicholas Piggin
2021-03-15 22:03 ` [PATCH 07/14] powerpc/64: allow alternate return locations for soft-masked interrupts Nicholas Piggin
2021-03-15 22:03 ` [PATCH 08/14] powerpc/64: interrupt soft-enable race fix Nicholas Piggin
2021-03-15 22:03 ` [PATCH 09/14] powerpc/64: treat low kernel text as irqs soft-masked Nicholas Piggin
2021-03-15 22:03 ` [PATCH 10/14] powerpc/64: use interrupt restart table to speed up return from interrupt Nicholas Piggin
2021-03-16 19:34   ` Christophe Leroy
2021-03-16 23:46     ` Nicholas Piggin
2021-03-15 22:03 ` [PATCH 11/14] powerpc/64e: Remove PPR from pt_regs Nicholas Piggin
2021-03-15 22:04 ` [PATCH 12/14] powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE] Nicholas Piggin
2021-03-16  7:21   ` Christophe Leroy
2021-03-16  8:13     ` Nicholas Piggin
2021-03-19 11:29     ` Michael Ellerman
2021-03-15 22:04 ` [PATCH 13/14] powerpc/64: handle MSR EE and RI in interrupt entry wrapper Nicholas Piggin
2021-03-15 22:04 ` [PATCH 14/14] powerpc/64s: use the same default PPR for user and kernel Nicholas Piggin
2021-05-17 14:09   ` Christophe Leroy
