linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v5 00/21] powerpc: interrupt wrappers
@ 2021-01-13  7:31 Nicholas Piggin
  2021-01-13  7:31 ` [PATCH v5 01/21] powerpc/32s: Do DABR match out of handle_page_fault() Nicholas Piggin
                   ` (20 more replies)
  0 siblings, 21 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

This adds interrupt handler wrapper functions, similar to the
generic / x86 code, and moves several common operations into them
from asm or from open-coded logic in the individual handlers.
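
To give a rough idea of the end state (a sketch only: the wrapper
macros are introduced in patch 6, the entry/exit stubs in patch 7,
and the details here are illustrative), a handler ends up defined as
a plain C function:

	DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
	{
		/* handler body: only the timer work lives here */
	}

The macro generates the outer timer_interrupt() entry point, which
performs the common entry/exit work (irq reconcile, context tracking,
irq_enter/irq_exit) around the body, rather than each handler or asm
stub open coding it.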

Since v1:
- Fixed a couple of compile issues
- Fixed perf weirdness (sometimes NMI, sometimes not)
- Also moved irq_enter/exit into wrappers

Since v2:
- Rebased upstream
- Took code in patch 3 from Christophe
- Fixed some compile errors from 0day

Since v3:
- Rebased
- Split Christophe's 32s DABR patch into its own patch
- Fixed missing asm change for 32s in patch 3, noticed by Christophe.
- Moved changes around, split out one more patch (patch 9) to make
  changes more logical and atomic.
- Added comments better explaining the _RAW (SLB, HPTE) interrupt handlers

Since v4:
- Rebased (on top of scv fallback flush fix)
- Rearranged a few changes into different patches following Christophe's
  review, e.g., the ___do_page_fault change moved from patch 2 to 10. I
  didn't do everything (e.g., splitting out an update of __hash_page to
  drop the msr argument ahead of the bulk of patch 2 seemed like churn
  without much improvement). Other things, like removing the new
  ___do_page_fault variant if we can change hash fault context tracking,
  I didn't get time to completely investigate and implement. I don't
  think this should be a showstopper though; we can make more
  improvements as we go.

Thanks,
Nick

Christophe Leroy (1):
  powerpc/32s: Do DABR match out of handle_page_fault()

Nicholas Piggin (20):
  powerpc/64s: move the last of the page fault handling logic to C
  powerpc: remove arguments from fault handler functions
  powerpc: bad_page_fault, do_break get registers from regs
  powerpc/perf: move perf irq/nmi handling details into traps.c
  powerpc: interrupt handler wrapper functions
  powerpc: add interrupt wrapper entry / exit stub functions
  powerpc: add interrupt_cond_local_irq_enable helper
  powerpc/64: context tracking remove _TIF_NOHZ
  powerpc/64s/hash: improve context tracking of hash faults
  powerpc/64: context tracking move to interrupt wrappers
  powerpc/64: add context tracking to asynchronous interrupts
  powerpc: handle irq_enter/irq_exit in interrupt handler wrappers
  powerpc/64s: move context tracking exit to interrupt exit path
  powerpc/64s: reconcile interrupts in C
  powerpc/64: move account_stolen_time into its own function
  powerpc/64: entry cpu time accounting in C
  powerpc: move NMI entry/exit code into wrapper
  powerpc/64s: move NMI soft-mask handling to C
  powerpc/64s: runlatch interrupt handling in C
  powerpc/64s: power4 nap fixup in C

 arch/powerpc/Kconfig                       |   1 -
 arch/powerpc/include/asm/asm-prototypes.h  |  29 --
 arch/powerpc/include/asm/bug.h             |   7 +-
 arch/powerpc/include/asm/cputime.h         |  15 +
 arch/powerpc/include/asm/debug.h           |   3 +-
 arch/powerpc/include/asm/hw_irq.h          |   9 -
 arch/powerpc/include/asm/interrupt.h       | 418 +++++++++++++++++++++
 arch/powerpc/include/asm/ppc_asm.h         |  24 --
 arch/powerpc/include/asm/processor.h       |   1 +
 arch/powerpc/include/asm/thread_info.h     |  10 +-
 arch/powerpc/include/asm/time.h            |   2 +
 arch/powerpc/kernel/dbell.c                |  15 +-
 arch/powerpc/kernel/entry_32.S             |  24 +-
 arch/powerpc/kernel/exceptions-64e.S       |   6 +-
 arch/powerpc/kernel/exceptions-64s.S       | 307 ++-------------
 arch/powerpc/kernel/head_40x.S             |  10 +-
 arch/powerpc/kernel/head_8xx.S             |  11 +-
 arch/powerpc/kernel/head_book3s_32.S       |  14 +-
 arch/powerpc/kernel/head_booke.h           |   4 +-
 arch/powerpc/kernel/idle_book3s.S          |   4 +
 arch/powerpc/kernel/irq.c                  |   7 +-
 arch/powerpc/kernel/mce.c                  |  16 +-
 arch/powerpc/kernel/process.c              |   7 +-
 arch/powerpc/kernel/ptrace/ptrace.c        |   4 -
 arch/powerpc/kernel/signal.c               |   4 -
 arch/powerpc/kernel/syscall_64.c           |  30 +-
 arch/powerpc/kernel/tau_6xx.c              |   5 +-
 arch/powerpc/kernel/time.c                 |   7 +-
 arch/powerpc/kernel/traps.c                | 231 ++++++------
 arch/powerpc/kernel/watchdog.c             |  15 +-
 arch/powerpc/kvm/book3s_hv.c               |   1 +
 arch/powerpc/kvm/book3s_hv_builtin.c       |   1 +
 arch/powerpc/kvm/booke.c                   |   1 +
 arch/powerpc/mm/book3s64/hash_utils.c      |  92 +++--
 arch/powerpc/mm/book3s64/slb.c             |  36 +-
 arch/powerpc/mm/fault.c                    |  92 +++--
 arch/powerpc/perf/core-book3s.c            |  35 +-
 arch/powerpc/perf/core-fsl-emb.c           |  25 --
 arch/powerpc/platforms/8xx/machine_check.c |   2 +-
 arch/powerpc/platforms/powernv/idle.c      |   1 +
 40 files changed, 813 insertions(+), 713 deletions(-)
 create mode 100644 arch/powerpc/include/asm/interrupt.h

-- 
2.23.0



* [PATCH v5 01/21] powerpc/32s: Do DABR match out of handle_page_fault()
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
@ 2021-01-13  7:31 ` Nicholas Piggin
  2021-01-13  7:31 ` [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C Nicholas Piggin
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

From: Christophe Leroy <christophe.leroy@csgroup.eu>

handle_page_fault() has some code dedicated to book3s/32 to
call do_break() when the DSI is a DABR match.

On other platforms, do_break() is handled separately.

Do the same for book3s/32, and do it earlier in the DSI processing.

This change also avoids doing the test on ISI.
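
In C terms, the DSI trampoline now does the following (only an
illustration; the real logic is the asm change below, and do_break()
still takes the DAR/DSISR arguments at this point in the series):

	if (dsisr & DSISR_DABRMATCH)		/* data breakpoint match? */
		do_break(regs, dar, dsisr);
	else
		handle_page_fault();		/* the normal DSI path */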

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/entry_32.S       | 15 ---------------
 arch/powerpc/kernel/head_book3s_32.S |  3 +++
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1c9b0ccc2172..238eacfda7b0 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -670,10 +670,6 @@ ppc_swapcontext:
 	.globl	handle_page_fault
 handle_page_fault:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-#ifdef CONFIG_PPC_BOOK3S_32
-	andis.  r0,r5,DSISR_DABRMATCH@h
-	bne-    handle_dabr_fault
-#endif
 	bl	do_page_fault
 	cmpwi	r3,0
 	beq+	ret_from_except
@@ -687,17 +683,6 @@ handle_page_fault:
 	bl	__bad_page_fault
 	b	ret_from_except_full
 
-#ifdef CONFIG_PPC_BOOK3S_32
-	/* We have a data breakpoint exception - handle it */
-handle_dabr_fault:
-	SAVE_NVGPRS(r1)
-	lwz	r0,_TRAP(r1)
-	clrrwi	r0,r0,1
-	stw	r0,_TRAP(r1)
-	bl      do_break
-	b	ret_from_except_full
-#endif
-
 /*
  * This routine switches between two different tasks.  The process
  * state of one is saved on its kernel stack.  Then the state
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 349bf3f0c3af..fc9a12768a14 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -680,7 +680,10 @@ handle_page_fault_tramp_1:
 	lwz	r5, _DSISR(r11)
 	/* fall through */
 handle_page_fault_tramp_2:
+	andis.	r0, r5, DSISR_DABRMATCH@h
+	bne-	1f
 	EXC_XFER_LITE(0x300, handle_page_fault)
+1:	EXC_XFER_STD(0x300, do_break)
 
 #ifdef CONFIG_VMAP_STACK
 .macro save_regs_thread		thread
-- 
2.23.0



* [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
  2021-01-13  7:31 ` [PATCH v5 01/21] powerpc/32s: Do DABR match out of handle_page_fault() Nicholas Piggin
@ 2021-01-13  7:31 ` Nicholas Piggin
  2021-01-13 14:12   ` Christophe Leroy
  2021-01-13  7:31 ` [PATCH v5 03/21] powerpc: remove arguments from fault handler functions Nicholas Piggin
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

The page fault handling still has some complex logic in asm,
particularly around hash table handling. Implement this in C instead.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
 arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
 arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
 arch/powerpc/mm/fault.c                       |  46 ++++--
 4 files changed, 107 insertions(+), 148 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 066b1d34c7bc..60a669379aa0 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
 #define HPTE_NOHPTE_UPDATE	0x2
 #define HPTE_USE_KERNEL_KEY	0x4
 
+int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
 			  unsigned long vsid, pte_t *ptep, unsigned long trap,
 			  unsigned long flags, int ssize, int subpage_prot);
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 6e53f7638737..bcb5e81d2088 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
  *
  * Handling:
  * - Hash MMU
- *   Go to do_hash_page first to see if the HPT can be filled from an entry in
- *   the Linux page table. Hash faults can hit in kernel mode in a fairly
+ *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
+ *   Linux page table. Hash faults can hit in kernel mode in a fairly
  *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
  *   "non-bolted" regions, e.g., vmalloc space. However these should always be
- *   backed by Linux page tables.
+ *   backed by Linux page table entries.
  *
- *   If none is found, do a Linux page fault. Linux page faults can happen in
- *   kernel mode due to user copy operations of course.
+ *   If no entry is found the Linux page fault handler is invoked (by
+ *   do_hash_fault). Linux page faults can happen in kernel mode due to user
+ *   copy operations of course.
  *
  *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
  *   MMU context, which may cause a DSI in the host, which must go to the
@@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
 	GEN_COMMON data_access
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-	ld	r6,_MSR(r1)
-	li	r3,0x300
-	b	do_hash_page		/* Try to handle as hpte fault */
+	bl	do_hash_fault
 MMU_FTR_SECTION_ELSE
-	b	handle_page_fault
+	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+	cmpdi	r3,0
+	beq+	interrupt_return
+	/* We need to restore NVGPRS */
+	REST_NVGPRS(r1)
+	b       interrupt_return
 
 	GEN_KVM data_access
 
@@ -1540,13 +1545,17 @@ EXC_COMMON_BEGIN(instruction_access_common)
 	GEN_COMMON instruction_access
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-	ld      r6,_MSR(r1)
-	li	r3,0x400
-	b	do_hash_page		/* Try to handle as hpte fault */
+	bl	do_hash_fault
 MMU_FTR_SECTION_ELSE
-	b	handle_page_fault
+	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+	cmpdi	r3,0
+	beq+	interrupt_return
+	/* We need to restore NVGPRS */
+	REST_NVGPRS(r1)
+	b       interrupt_return
 
 	GEN_KVM instruction_access
 
@@ -3221,99 +3230,3 @@ disable_machine_check:
 	RFI_TO_KERNEL
 1:	mtlr	r0
 	blr
-
-/*
- * Hash table stuff
- */
-	.balign	IFETCH_ALIGN_BYTES
-do_hash_page:
-#ifdef CONFIG_PPC_BOOK3S_64
-	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
-	ori	r0,r0,DSISR_BAD_FAULT_64S@l
-	and.	r0,r5,r0		/* weird error? */
-	bne-	handle_page_fault	/* if not, try to insert a HPTE */
-
-	/*
-	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
-	 * don't call hash_page, just fail the fault. This is required to
-	 * prevent re-entrancy problems in the hash code, namely perf
-	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
-	 * hash fault. See the comment in hash_preload().
-	 */
-	ld	r11, PACA_THREAD_INFO(r13)
-	lwz	r0,TI_PREEMPT(r11)
-	andis.	r0,r0,NMI_MASK@h
-	bne	77f
-
-	/*
-	 * r3 contains the trap number
-	 * r4 contains the faulting address
-	 * r5 contains dsisr
-	 * r6 msr
-	 *
-	 * at return r3 = 0 for success, 1 for page fault, negative for error
-	 */
-	bl	__hash_page		/* build HPTE if possible */
-        cmpdi	r3,0			/* see if __hash_page succeeded */
-
-	/* Success */
-	beq	interrupt_return	/* Return from exception on success */
-
-	/* Error */
-	blt-	13f
-
-	/* Reload DAR/DSISR into r4/r5 for the DABR check below */
-	ld	r4,_DAR(r1)
-	ld      r5,_DSISR(r1)
-#endif /* CONFIG_PPC_BOOK3S_64 */
-
-/* Here we have a page fault that hash_page can't handle. */
-handle_page_fault:
-11:	andis.  r0,r5,DSISR_DABRMATCH@h
-	bne-    handle_dabr_fault
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	do_page_fault
-	cmpdi	r3,0
-	beq+	interrupt_return
-	mr	r5,r3
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	ld	r4,_DAR(r1)
-	bl	__bad_page_fault
-	b	interrupt_return
-
-/* We have a data breakpoint exception - handle it */
-handle_dabr_fault:
-	ld      r4,_DAR(r1)
-	ld      r5,_DSISR(r1)
-	addi    r3,r1,STACK_FRAME_OVERHEAD
-	bl      do_break
-	/*
-	 * do_break() may have changed the NV GPRS while handling a breakpoint.
-	 * If so, we need to restore them with their updated values.
-	 */
-	REST_NVGPRS(r1)
-	b       interrupt_return
-
-
-#ifdef CONFIG_PPC_BOOK3S_64
-/* We have a page fault that hash_page could handle but HV refused
- * the PTE insertion
- */
-13:	mr	r5,r3
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	ld	r4,_DAR(r1)
-	bl	low_hash_fault
-	b	interrupt_return
-#endif
-
-/*
- * We come here as a result of a DSI at a point where we don't want
- * to call hash_page, such as when we are accessing memory (possibly
- * user memory) inside a PMU interrupt that occurred while interrupts
- * were soft-disabled.  We want to invoke the exception handler for
- * the access, or panic if there isn't a handler.
- */
-77:	addi	r3,r1,STACK_FRAME_OVERHEAD
-	li	r5,SIGSEGV
-	bl	bad_page_fault
-	b	interrupt_return
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 73b06adb6eeb..5a61182ddf75 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1512,16 +1512,40 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
-		unsigned long msr)
+int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
 {
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
 	unsigned long flags = 0;
-	struct mm_struct *mm = current->mm;
-	unsigned int region_id = get_region_id(ea);
+	struct mm_struct *mm;
+	unsigned int region_id;
+	int err;
+
+	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
+		goto page_fault;
+
+	/*
+	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
+	 * don't call hash_page, just fail the fault. This is required to
+	 * prevent re-entrancy problems in the hash code, namely perf
+	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
+	 * hash fault. See the comment in hash_preload().
+	 *
+	 * We come here as a result of a DSI at a point where we don't want
+	 * to call hash_page, such as when we are accessing memory (possibly
+	 * user memory) inside a PMU interrupt that occurred while interrupts
+	 * were soft-disabled.  We want to invoke the exception handler for
+	 * the access, or panic if there isn't a handler.
+	 */
+	if (unlikely(in_nmi())) {
+		bad_page_fault(regs, ea, SIGSEGV);
+		return 0;
+	}
 
+	region_id = get_region_id(ea);
 	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
 		mm = &init_mm;
+	else
+		mm = current->mm;
 
 	if (dsisr & DSISR_NOHPTE)
 		flags |= HPTE_NOHPTE_UPDATE;
@@ -1537,13 +1561,31 @@ int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
 	 * 2) user space access kernel space.
 	 */
 	access |= _PAGE_PRIVILEGED;
-	if ((msr & MSR_PR) || (region_id == USER_REGION_ID))
+	if (user_mode(regs) || (region_id == USER_REGION_ID))
 		access &= ~_PAGE_PRIVILEGED;
 
-	if (trap == 0x400)
+	if (regs->trap == 0x400)
 		access |= _PAGE_EXEC;
 
-	return hash_page_mm(mm, ea, access, trap, flags);
+	err = hash_page_mm(mm, ea, access, regs->trap, flags);
+	if (unlikely(err < 0)) {
+		// failed to insert a hash PTE due to a hypervisor error
+		if (user_mode(regs)) {
+			if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
+				_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
+			else
+				_exception(SIGBUS, regs, BUS_ADRERR, ea);
+		} else {
+			bad_page_fault(regs, ea, SIGBUS);
+		}
+		err = 0;
+
+	} else if (err) {
+page_fault:
+		err = do_page_fault(regs, ea, dsisr);
+	}
+
+	return err;
 }
 
 #ifdef CONFIG_PPC_MM_SLICES
@@ -1843,27 +1885,6 @@ void flush_hash_range(unsigned long number, int local)
 	}
 }
 
-/*
- * low_hash_fault is called when we the low level hash code failed
- * to instert a PTE due to an hypervisor error
- */
-void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
-{
-	enum ctx_state prev_state = exception_enter();
-
-	if (user_mode(regs)) {
-#ifdef CONFIG_PPC_SUBPAGE_PROT
-		if (rc == -2)
-			_exception(SIGSEGV, regs, SEGV_ACCERR, address);
-		else
-#endif
-			_exception(SIGBUS, regs, BUS_ADRERR, address);
-	} else
-		bad_page_fault(regs, address, SIGBUS);
-
-	exception_exit(prev_state);
-}
-
 long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
 			   unsigned long pa, unsigned long rflags,
 			   unsigned long vflags, int psize, int ssize)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 8961b44f350c..77a3155c77b6 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -369,7 +369,9 @@ static void sanity_check_fault(bool is_write, bool is_user,
 #define page_fault_is_bad(__err)	(0)
 #elif defined(CONFIG_PPC_8xx)
 #define page_fault_is_bad(__err)	((__err) & DSISR_NOEXEC_OR_G)
-#elif defined(CONFIG_PPC64)
+#elif defined(CONFIG_PPC_BOOK3S_64)
+#define page_fault_is_bad(__err)	((__err) & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH))
+#elif defined(CONFIG_PPC_BOOK3E_64)
 #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_64S)
 #else
 #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_32S)
@@ -404,6 +406,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return 0;
 
 	if (unlikely(page_fault_is_bad(error_code))) {
+		if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && (error_code & DSISR_DABRMATCH))
+			return -1;
+
 		if (is_user) {
 			_exception(SIGBUS, regs, BUS_OBJERR, address);
 			return 0;
@@ -545,20 +550,39 @@ NOKPROBE_SYMBOL(__do_page_fault);
 int do_page_fault(struct pt_regs *regs, unsigned long address,
 		  unsigned long error_code)
 {
-	const struct exception_table_entry *entry;
 	enum ctx_state prev_state = exception_enter();
-	int rc = __do_page_fault(regs, address, error_code);
-	exception_exit(prev_state);
-	if (likely(!rc))
-		return 0;
+	int err;
 
-	entry = search_exception_tables(regs->nip);
-	if (unlikely(!entry))
-		return rc;
+	err = __do_page_fault(regs, address, error_code);
+	if (unlikely(err)) {
+		const struct exception_table_entry *entry;
 
-	instruction_pointer_set(regs, extable_fixup(entry));
+		entry = search_exception_tables(regs->nip);
+		if (likely(entry)) {
+			instruction_pointer_set(regs, extable_fixup(entry));
+			err = 0;
+		}
+	}
 
-	return 0;
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* 32 and 64e handle these errors in asm */
+	if (unlikely(err)) {
+		if (err > 0) {
+			__bad_page_fault(regs, address, err);
+			err = 0;
+		} else {
+			/*
+			 * do_break() may change NV GPRS while handling the
+			 * breakpoint. Return -ve to caller to do that.
+			 */
+			do_break(regs, address, error_code);
+		}
+	}
+#endif
+
+	exception_exit(prev_state);
+
+	return err;
 }
 NOKPROBE_SYMBOL(do_page_fault);
 
-- 
2.23.0



* [PATCH v5 03/21] powerpc: remove arguments from fault handler functions
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
  2021-01-13  7:31 ` [PATCH v5 01/21] powerpc/32s: Do DABR match out of handle_page_fault() Nicholas Piggin
  2021-01-13  7:31 ` [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C Nicholas Piggin
@ 2021-01-13  7:31 ` Nicholas Piggin
  2021-01-14 14:12   ` Christophe Leroy
  2021-01-13  7:31 ` [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs Nicholas Piggin
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

Make mm fault handlers all just take the pt_regs * argument and load
DAR/DSISR from that. Make those that return a value return long.

This is done to make the function signatures match other handlers, which
will help with a future patch to add wrappers. Explicit arguments could
be added for performance but that would require more wrapper macro
variants.
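
For example, do_hash_fault() changes (as in the diff below) from:

	int do_hash_fault(struct pt_regs *regs, unsigned long ea,
			  unsigned long dsisr);

to:

	long do_hash_fault(struct pt_regs *regs);

with ea and dsisr loaded from regs->dar and regs->dsisr inside the
handler.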

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/asm-prototypes.h     |  4 ++--
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
 arch/powerpc/include/asm/bug.h                |  2 +-
 arch/powerpc/kernel/entry_32.S                |  6 +-----
 arch/powerpc/kernel/exceptions-64e.S          |  2 --
 arch/powerpc/kernel/exceptions-64s.S          | 14 ++------------
 arch/powerpc/kernel/head_40x.S                | 10 +++++-----
 arch/powerpc/kernel/head_8xx.S                |  6 +++---
 arch/powerpc/kernel/head_book3s_32.S          |  5 ++---
 arch/powerpc/kernel/head_booke.h              |  4 +---
 arch/powerpc/mm/book3s64/hash_utils.c         |  8 +++++---
 arch/powerpc/mm/book3s64/slb.c                | 11 +++++++----
 arch/powerpc/mm/fault.c                       |  7 ++++---
 13 files changed, 34 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index d0b832cbbec8..22c9d08fa3a4 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -82,8 +82,8 @@ void kernel_bad_stack(struct pt_regs *regs);
 void system_reset_exception(struct pt_regs *regs);
 void machine_check_exception(struct pt_regs *regs);
 void emulation_assist_interrupt(struct pt_regs *regs);
-long do_slb_fault(struct pt_regs *regs, unsigned long ea);
-void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
+long do_slb_fault(struct pt_regs *regs);
+void do_bad_slb_fault(struct pt_regs *regs);
 
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 60a669379aa0..b9968e297da2 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -454,7 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
 #define HPTE_NOHPTE_UPDATE	0x2
 #define HPTE_USE_KERNEL_KEY	0x4
 
-int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
+long do_hash_fault(struct pt_regs *regs);
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
 			  unsigned long vsid, pte_t *ptep, unsigned long trap,
 			  unsigned long flags, int ssize, int subpage_prot);
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 464f8ca8a5c9..f7827e993196 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -111,7 +111,7 @@
 #ifndef __ASSEMBLY__
 
 struct pt_regs;
-extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
+long do_page_fault(struct pt_regs *);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 238eacfda7b0..a32157ce0551 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -277,7 +277,7 @@ reenable_mmu:
 	 * r3 can be different from GPR3(r1) at this point, r9 and r11
 	 * contains the old MSR and handler address respectively,
 	 * r4 & r5 can contain page fault arguments that need to be passed
-	 * along as well. r0, r6-r8, r12, CCR, CTR, XER etc... are left
+	 * r0, r4-r8, r12, CCR, CTR, XER etc... are left
 	 * clobbered as they aren't useful past this point.
 	 */
 
@@ -285,15 +285,11 @@ reenable_mmu:
 	stw	r9,8(r1)
 	stw	r11,12(r1)
 	stw	r3,16(r1)
-	stw	r4,20(r1)
-	stw	r5,24(r1)
 
 	/* If we are disabling interrupts (normal case), simply log it with
 	 * lockdep
 	 */
 1:	bl	trace_hardirqs_off
-	lwz	r5,24(r1)
-	lwz	r4,20(r1)
 	lwz	r3,16(r1)
 	lwz	r11,12(r1)
 	lwz	r9,8(r1)
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 74d07dc0bb48..43e71d86dcbf 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1011,8 +1011,6 @@ storage_fault_common:
 	std	r14,_DAR(r1)
 	std	r15,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	mr	r4,r14
-	mr	r5,r15
 	ld	r14,PACA_EXGEN+EX_R14(r13)
 	ld	r15,PACA_EXGEN+EX_R15(r13)
 	bl	do_page_fault
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index bcb5e81d2088..814cff2c649e 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1438,8 +1438,6 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
 EXC_VIRT_END(data_access, 0x4300, 0x80)
 EXC_COMMON_BEGIN(data_access_common)
 	GEN_COMMON data_access
-	ld	r4,_DAR(r1)
-	ld	r5,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	bl	do_hash_fault
@@ -1492,10 +1490,9 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
 EXC_COMMON_BEGIN(data_access_slb_common)
 	GEN_COMMON data_access_slb
-	ld	r4,_DAR(r1)
-	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	/* HPT case, do SLB fault */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
@@ -1507,8 +1504,6 @@ MMU_FTR_SECTION_ELSE
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	RECONCILE_IRQ_STATE(r10, r11)
-	ld	r4,_DAR(r1)
-	ld	r5,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
@@ -1543,8 +1538,6 @@ EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
 EXC_VIRT_END(instruction_access, 0x4400, 0x80)
 EXC_COMMON_BEGIN(instruction_access_common)
 	GEN_COMMON instruction_access
-	ld	r4,_DAR(r1)
-	ld	r5,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	bl	do_hash_fault
@@ -1588,10 +1581,9 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
 EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
 EXC_COMMON_BEGIN(instruction_access_slb_common)
 	GEN_COMMON instruction_access_slb
-	ld	r4,_DAR(r1)
-	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	/* HPT case, do SLB fault */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
@@ -1603,8 +1595,6 @@ MMU_FTR_SECTION_ELSE
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	RECONCILE_IRQ_STATE(r10, r11)
-	ld	r4,_DAR(r1)
-	ld	r5,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index a1ae00689e0f..3c5577ac4dc8 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -179,9 +179,9 @@ _ENTRY(saved_ksp_limit)
  */
 	START_EXCEPTION(0x0300,	DataStorage)
 	EXCEPTION_PROLOG
-	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it, pass arg3 */
+	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it */
 	stw	r5, _ESR(r11)
-	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it, pass arg2 */
+	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it */
 	stw	r4, _DEAR(r11)
 	EXC_XFER_LITE(0x300, handle_page_fault)
 
@@ -191,9 +191,9 @@ _ENTRY(saved_ksp_limit)
  */
 	START_EXCEPTION(0x0400, InstructionAccess)
 	EXCEPTION_PROLOG
-	mr	r4,r12			/* Pass SRR0 as arg2 */
-	stw	r4, _DEAR(r11)
-	li	r5,0			/* Pass zero as arg3 */
+	li	r5,0
+	stw	r5, _ESR(r11)		/* Zero ESR */
+	stw	r12, _DEAR(r11)		/* SRR0 as DEAR */
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* 0x0500 - External Interrupt Exception */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 52702f3db6df..0b2c247cfdff 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -312,14 +312,14 @@ DataStoreTLBMiss:
 	. = 0x1300
 InstructionTLBError:
 	EXCEPTION_PROLOG
-	mr	r4,r12
 	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
 	andis.	r10,r9,SRR1_ISI_NOPT@h
 	beq+	.Litlbie
-	tlbie	r4
+	tlbie	r12
 	/* 0x400 is InstructionAccess exception, needed by bad_page_fault() */
 .Litlbie:
-	stw	r4, _DAR(r11)
+	stw	r12, _DAR(r11)
+	stw	r5, _DSISR(r11)
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* This is the data TLB error on the MPC8xx.  This could be due to
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index fc9a12768a14..94ad1372c490 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -352,9 +352,9 @@ BEGIN_MMU_FTR_SECTION
 	bl	hash_page
 END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
 #endif	/* CONFIG_VMAP_STACK */
-1:	mr	r4,r12
 	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
-	stw	r4, _DAR(r11)
+	stw	r5, _DSISR(r11)
+	stw	r12, _DAR(r11)
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* External interrupt */
@@ -676,7 +676,6 @@ handle_page_fault_tramp_1:
 #ifdef CONFIG_VMAP_STACK
 	EXCEPTION_PROLOG_2 handle_dar_dsisr=1
 #endif
-	lwz	r4, _DAR(r11)
 	lwz	r5, _DSISR(r11)
 	/* fall through */
 handle_page_fault_tramp_2:
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 74e230c200fb..0fbdacc7fab7 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -476,9 +476,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
 	NORMAL_EXCEPTION_PROLOG(INST_STORAGE);		      \
 	mfspr	r5,SPRN_ESR;		/* Grab the ESR and save it */	      \
 	stw	r5,_ESR(r11);						      \
-	mr      r4,r12;                 /* Pass SRR0 as arg2 */		      \
-	stw	r4, _DEAR(r11);						      \
-	li      r5,0;                   /* Pass zero as arg3 */		      \
+	stw	r12, _DEAR(r11);	/* SRR0 as DEAR */		      \
 	EXC_XFER_LITE(0x0400, handle_page_fault)
 
 #define ALIGNMENT_EXCEPTION						      \
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 5a61182ddf75..8d014924ee0d 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1512,13 +1512,15 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
+long do_hash_fault(struct pt_regs *regs)
 {
+	unsigned long ea = regs->dar;
+	unsigned long dsisr = regs->dsisr;
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
 	unsigned long flags = 0;
 	struct mm_struct *mm;
 	unsigned int region_id;
-	int err;
+	long err;
 
 	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
 		goto page_fault;
@@ -1582,7 +1584,7 @@ int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
 
 	} else if (err) {
 page_fault:
-		err = do_page_fault(regs, ea, dsisr);
+		err = do_page_fault(regs);
 	}
 
 	return err;
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 584567970c11..985902ce0272 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -813,8 +813,9 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
 	return slb_insert_entry(ea, context, flags, ssize, false);
 }
 
-long do_slb_fault(struct pt_regs *regs, unsigned long ea)
+long do_slb_fault(struct pt_regs *regs)
 {
+	unsigned long ea = regs->dar;
 	unsigned long id = get_region_id(ea);
 
 	/* IRQs are not reconciled here, so can't check irqs_disabled */
@@ -865,13 +866,15 @@ long do_slb_fault(struct pt_regs *regs, unsigned long ea)
 	}
 }
 
-void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err)
+void do_bad_slb_fault(struct pt_regs *regs)
 {
+	int err = regs->result;
+
 	if (err == -EFAULT) {
 		if (user_mode(regs))
-			_exception(SIGSEGV, regs, SEGV_BNDERR, ea);
+			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
 		else
-			bad_page_fault(regs, ea, SIGSEGV);
+			bad_page_fault(regs, regs->dar, SIGSEGV);
 	} else if (err == -EINVAL) {
 		unrecoverable_exception(regs);
 	} else {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 77a3155c77b6..e170501081a7 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -547,11 +547,12 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 }
 NOKPROBE_SYMBOL(__do_page_fault);
 
-int do_page_fault(struct pt_regs *regs, unsigned long address,
-		  unsigned long error_code)
+long do_page_fault(struct pt_regs *regs)
 {
 	enum ctx_state prev_state = exception_enter();
-	int err;
+	unsigned long address = regs->dar;
+	unsigned long error_code = regs->dsisr;
+	long err;
 
 	err = __do_page_fault(regs, address, error_code);
 	if (unlikely(err)) {
-- 
2.23.0



* [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (2 preceding siblings ...)
  2021-01-13  7:31 ` [PATCH v5 03/21] powerpc: remove arguments from fault handler functions Nicholas Piggin
@ 2021-01-13  7:31 ` Nicholas Piggin
  2021-01-13 14:25   ` Christophe Leroy
  2021-01-13  7:31 ` [PATCH v5 05/21] powerpc/perf: move perf irq/nmi handling details into traps.c Nicholas Piggin
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

Similar to the previous patch, this makes interrupt handler function
types more regular so they can be wrapped by the next patch.

bad_page_fault and do_break are not performance critical.
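
The signatures become (as in the hunks below):

	void do_break(struct pt_regs *regs);
	void bad_page_fault(struct pt_regs *regs, int sig);

with the faulting address and error code read from regs->dar and
regs->dsisr inside the handlers.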

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/bug.h             |  4 ++--
 arch/powerpc/include/asm/debug.h           |  3 +--
 arch/powerpc/kernel/entry_32.S             |  3 +--
 arch/powerpc/kernel/exceptions-64e.S       |  3 +--
 arch/powerpc/kernel/exceptions-64s.S       |  3 +--
 arch/powerpc/kernel/head_8xx.S             |  5 ++---
 arch/powerpc/kernel/process.c              |  7 +++----
 arch/powerpc/kernel/traps.c                |  2 +-
 arch/powerpc/mm/book3s64/hash_utils.c      |  4 ++--
 arch/powerpc/mm/book3s64/slb.c             |  2 +-
 arch/powerpc/mm/fault.c                    | 10 +++++-----
 arch/powerpc/platforms/8xx/machine_check.c |  2 +-
 12 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index f7827e993196..4220789b9a97 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -112,8 +112,8 @@
 
 struct pt_regs;
 long do_page_fault(struct pt_regs *);
-extern void bad_page_fault(struct pt_regs *, unsigned long, int);
-void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig);
+void bad_page_fault(struct pt_regs *, int);
+void __bad_page_fault(struct pt_regs *regs, int sig);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
 extern void _exception_pkey(struct pt_regs *, unsigned long, int);
 extern void die(const char *, struct pt_regs *, long);
diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
index ec57daf87f40..0550eceab3ca 100644
--- a/arch/powerpc/include/asm/debug.h
+++ b/arch/powerpc/include/asm/debug.h
@@ -52,8 +52,7 @@ extern void do_send_trap(struct pt_regs *regs, unsigned long address,
 			 unsigned long error_code, int brkpt);
 #else
 
-extern void do_break(struct pt_regs *regs, unsigned long address,
-		     unsigned long error_code);
+void do_break(struct pt_regs *regs);
 #endif
 
 #endif /* _ASM_POWERPC_DEBUG_H */
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index a32157ce0551..a94127eed56b 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -673,9 +673,8 @@ handle_page_fault:
 	lwz	r0,_TRAP(r1)
 	clrrwi	r0,r0,1
 	stw	r0,_TRAP(r1)
-	mr	r5,r3
+	mr	r4,r3		/* err arg for bad_page_fault */
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	lwz	r4,_DAR(r1)
 	bl	__bad_page_fault
 	b	ret_from_except_full
 
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 43e71d86dcbf..52421042a020 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1018,9 +1018,8 @@ storage_fault_common:
 	bne-	1f
 	b	ret_from_except_lite
 1:	bl	save_nvgprs
-	mr	r5,r3
+	mr	r4,r3
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	ld	r4,_DAR(r1)
 	bl	__bad_page_fault
 	b	ret_from_except
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 814cff2c649e..36dea2020ec5 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
 	GEN_COMMON h_data_storage
 	addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-	ld	r4,_DAR(r1)
-	li	r5,SIGSEGV
+	li	r4,SIGSEGV
 	bl      bad_page_fault
 MMU_FTR_SECTION_ELSE
 	bl      unknown_exception
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 0b2c247cfdff..7869db974185 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -364,10 +364,9 @@ do_databreakpoint:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	mfspr	r4,SPRN_BAR
 	stw	r4,_DAR(r11)
-#ifdef CONFIG_VMAP_STACK
-	lwz	r5,_DSISR(r11)
-#else
+#ifndef CONFIG_VMAP_STACK
 	mfspr	r5,SPRN_DSISR
+	stw	r5,_DSISR(r11)
 #endif
 	EXC_XFER_STD(0x1c00, do_break)
 
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a66f435dabbf..4f0f81e9420b 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -659,11 +659,10 @@ static void do_break_handler(struct pt_regs *regs)
 	}
 }
 
-void do_break (struct pt_regs *regs, unsigned long address,
-		    unsigned long error_code)
+void do_break(struct pt_regs *regs)
 {
 	current->thread.trap_nr = TRAP_HWBKPT;
-	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code,
+	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, regs->dsisr,
 			11, SIGSEGV) == NOTIFY_STOP)
 		return;
 
@@ -681,7 +680,7 @@ void do_break (struct pt_regs *regs, unsigned long address,
 		do_break_handler(regs);
 
 	/* Deliver the signal to userspace */
-	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
+	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)regs->dar);
 }
 #endif	/* CONFIG_PPC_ADV_DEBUG_REGS */
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 3ec7b443fe6b..f3f6af3141ee 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1612,7 +1612,7 @@ void alignment_exception(struct pt_regs *regs)
 	if (user_mode(regs))
 		_exception(sig, regs, code, regs->dar);
 	else
-		bad_page_fault(regs, regs->dar, sig);
+		bad_page_fault(regs, sig);
 
 bail:
 	exception_exit(prev_state);
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 8d014924ee0d..77073a256cff 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1539,7 +1539,7 @@ long do_hash_fault(struct pt_regs *regs)
 	 * the access, or panic if there isn't a handler.
 	 */
 	if (unlikely(in_nmi())) {
-		bad_page_fault(regs, ea, SIGSEGV);
+		bad_page_fault(regs, SIGSEGV);
 		return 0;
 	}
 
@@ -1578,7 +1578,7 @@ long do_hash_fault(struct pt_regs *regs)
 			else
 				_exception(SIGBUS, regs, BUS_ADRERR, ea);
 		} else {
-			bad_page_fault(regs, ea, SIGBUS);
+			bad_page_fault(regs, SIGBUS);
 		}
 		err = 0;
 
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 985902ce0272..c581548b533f 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -874,7 +874,7 @@ void do_bad_slb_fault(struct pt_regs *regs)
 		if (user_mode(regs))
 			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
 		else
-			bad_page_fault(regs, regs->dar, SIGSEGV);
+			bad_page_fault(regs, SIGSEGV);
 	} else if (err == -EINVAL) {
 		unrecoverable_exception(regs);
 	} else {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index e170501081a7..36604ff8b3ec 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -569,14 +569,14 @@ long do_page_fault(struct pt_regs *regs)
 	/* 32 and 64e handle these errors in asm */
 	if (unlikely(err)) {
 		if (err > 0) {
-			__bad_page_fault(regs, address, err);
+			__bad_page_fault(regs, err);
 			err = 0;
 		} else {
 			/*
 			 * do_break() may change NV GPRS while handling the
 			 * breakpoint. Return -ve to caller to do that.
 			 */
-			do_break(regs, address, error_code);
+			do_break(regs);
 		}
 	}
 #endif
@@ -592,7 +592,7 @@ NOKPROBE_SYMBOL(do_page_fault);
  * It is called from the DSI and ISI handlers in head.S and from some
  * of the procedures in traps.c.
  */
-void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
+void __bad_page_fault(struct pt_regs *regs, int sig)
 {
 	int is_write = page_fault_is_write(regs->dsisr);
 
@@ -630,7 +630,7 @@ void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	die("Kernel access of bad area", regs, sig);
 }
 
-void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
+void bad_page_fault(struct pt_regs *regs, int sig)
 {
 	const struct exception_table_entry *entry;
 
@@ -639,5 +639,5 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	if (entry)
 		instruction_pointer_set(regs, extable_fixup(entry));
 	else
-		__bad_page_fault(regs, address, sig);
+		__bad_page_fault(regs, sig);
 }
diff --git a/arch/powerpc/platforms/8xx/machine_check.c b/arch/powerpc/platforms/8xx/machine_check.c
index 88dedf38eccd..656365975895 100644
--- a/arch/powerpc/platforms/8xx/machine_check.c
+++ b/arch/powerpc/platforms/8xx/machine_check.c
@@ -26,7 +26,7 @@ int machine_check_8xx(struct pt_regs *regs)
 	 * to deal with that than having a wart in the mcheck handler.
 	 * -- BenH
 	 */
-	bad_page_fault(regs, regs->dar, SIGBUS);
+	bad_page_fault(regs, SIGBUS);
 	return 1;
 #else
 	return 0;
-- 
2.23.0



* [PATCH v5 05/21] powerpc/perf: move perf irq/nmi handling details into traps.c
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (3 preceding siblings ...)
  2021-01-13  7:31 ` [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs Nicholas Piggin
@ 2021-01-13  7:31 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 06/21] powerpc: interrupt handler wrapper functions Nicholas Piggin
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:31 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

This is required to allow more significant differences between
NMI-type interrupt handlers and regular asynchronous handlers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c      | 31 +++++++++++++++++++++++++++-
 arch/powerpc/perf/core-book3s.c  | 35 ++------------------------------
 arch/powerpc/perf/core-fsl-emb.c | 25 -----------------------
 3 files changed, 32 insertions(+), 59 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f3f6af3141ee..9b5298c016c7 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1890,11 +1890,40 @@ void vsx_unavailable_tm(struct pt_regs *regs)
 }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
-void performance_monitor_exception(struct pt_regs *regs)
+static void performance_monitor_exception_nmi(struct pt_regs *regs)
+{
+	nmi_enter();
+
+	__this_cpu_inc(irq_stat.pmu_irqs);
+
+	perf_irq(regs);
+
+	nmi_exit();
+}
+
+static void performance_monitor_exception_async(struct pt_regs *regs)
 {
+	irq_enter();
+
 	__this_cpu_inc(irq_stat.pmu_irqs);
 
 	perf_irq(regs);
+
+	irq_exit();
+}
+
+void performance_monitor_exception(struct pt_regs *regs)
+{
+	/*
+	 * On 64-bit, if perf interrupts hit in a local_irq_disable
+	 * (soft-masked) region, we consider them as NMIs. This is required to
+	 * prevent hash faults on user addresses when reading callchains (and
+	 * looks better from an irq tracing perspective).
+	 */
+	if (IS_ENABLED(CONFIG_PPC64) && unlikely(arch_irq_disabled_regs(regs)))
+		performance_monitor_exception_nmi(regs);
+	else
+		performance_monitor_exception_async(regs);
 }
 
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 28206b1fe172..9fd06010e8b6 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -110,10 +110,6 @@ static inline void perf_read_regs(struct pt_regs *regs)
 {
 	regs->result = 0;
 }
-static inline int perf_intr_is_nmi(struct pt_regs *regs)
-{
-	return 0;
-}
 
 static inline int siar_valid(struct pt_regs *regs)
 {
@@ -353,15 +349,6 @@ static inline void perf_read_regs(struct pt_regs *regs)
 	regs->result = use_siar;
 }
 
-/*
- * If interrupts were soft-disabled when a PMU interrupt occurs, treat
- * it as an NMI.
- */
-static inline int perf_intr_is_nmi(struct pt_regs *regs)
-{
-	return (regs->softe & IRQS_DISABLED);
-}
-
 /*
  * On processors like P7+ that have the SIAR-Valid bit, marked instructions
  * must be sampled only if the SIAR-valid bit is set.
@@ -2279,7 +2266,6 @@ static void __perf_event_interrupt(struct pt_regs *regs)
 	struct perf_event *event;
 	unsigned long val[8];
 	int found, active;
-	int nmi;
 
 	if (cpuhw->n_limited)
 		freeze_limited_counters(cpuhw, mfspr(SPRN_PMC5),
@@ -2287,18 +2273,6 @@ static void __perf_event_interrupt(struct pt_regs *regs)
 
 	perf_read_regs(regs);
 
-	/*
-	 * If perf interrupts hit in a local_irq_disable (soft-masked) region,
-	 * we consider them as NMIs. This is required to prevent hash faults on
-	 * user addresses when reading callchains. See the NMI test in
-	 * do_hash_page.
-	 */
-	nmi = perf_intr_is_nmi(regs);
-	if (nmi)
-		nmi_enter();
-	else
-		irq_enter();
-
 	/* Read all the PMCs since we'll need them a bunch of times */
 	for (i = 0; i < ppmu->n_counter; ++i)
 		val[i] = read_pmc(i + 1);
@@ -2344,8 +2318,8 @@ static void __perf_event_interrupt(struct pt_regs *regs)
 			}
 		}
 	}
-	if (!found && !nmi && printk_ratelimit())
-		printk(KERN_WARNING "Can't find PMC that caused IRQ\n");
+	if (unlikely(!found) && !arch_irq_disabled_regs(regs))
+		printk_ratelimited(KERN_WARNING "Can't find PMC that caused IRQ\n");
 
 	/*
 	 * Reset MMCR0 to its normal value.  This will set PMXE and
@@ -2355,11 +2329,6 @@ static void __perf_event_interrupt(struct pt_regs *regs)
 	 * we get back out of this interrupt.
 	 */
 	write_mmcr0(cpuhw, cpuhw->mmcr.mmcr0);
-
-	if (nmi)
-		nmi_exit();
-	else
-		irq_exit();
 }
 
 static void perf_event_interrupt(struct pt_regs *regs)
diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
index e0e7e276bfd2..ee721f420a7b 100644
--- a/arch/powerpc/perf/core-fsl-emb.c
+++ b/arch/powerpc/perf/core-fsl-emb.c
@@ -31,19 +31,6 @@ static atomic_t num_events;
 /* Used to avoid races in calling reserve/release_pmc_hardware */
 static DEFINE_MUTEX(pmc_reserve_mutex);
 
-/*
- * If interrupts were soft-disabled when a PMU interrupt occurs, treat
- * it as an NMI.
- */
-static inline int perf_intr_is_nmi(struct pt_regs *regs)
-{
-#ifdef __powerpc64__
-	return (regs->softe & IRQS_DISABLED);
-#else
-	return 0;
-#endif
-}
-
 static void perf_event_interrupt(struct pt_regs *regs);
 
 /*
@@ -659,13 +646,6 @@ static void perf_event_interrupt(struct pt_regs *regs)
 	struct perf_event *event;
 	unsigned long val;
 	int found = 0;
-	int nmi;
-
-	nmi = perf_intr_is_nmi(regs);
-	if (nmi)
-		nmi_enter();
-	else
-		irq_enter();
 
 	for (i = 0; i < ppmu->n_counter; ++i) {
 		event = cpuhw->event[i];
@@ -690,11 +670,6 @@ static void perf_event_interrupt(struct pt_regs *regs)
 	mtmsr(mfmsr() | MSR_PMM);
 	mtpmr(PMRN_PMGC0, PMGC0_PMIE | PMGC0_FCECE);
 	isync();
-
-	if (nmi)
-		nmi_exit();
-	else
-		irq_exit();
 }
 
 void hw_perf_event_setup(int cpu)
-- 
2.23.0



* [PATCH v5 06/21] powerpc: interrupt handler wrapper functions
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (4 preceding siblings ...)
  2021-01-13  7:31 ` [PATCH v5 05/21] powerpc/perf: move perf irq/nmi handling details into traps.c Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 14:45   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 07/21] powerpc: add interrupt wrapper entry / exit stub functions Nicholas Piggin
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC
  To: linuxppc-dev; +Cc: Nicholas Piggin

Add wrapper functions (derived from x86 macros) for interrupt handler
functions. This allows interrupt entry code to be written in C.
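
As an example of usage (a sketch based on the traps.c conversions in
this patch; see the macro definitions in interrupt.h below), a handler
that was previously declared and defined as:

	void unknown_exception(struct pt_regs *regs)
	{
		...
	}

becomes:

	DECLARE_INTERRUPT_HANDLER(unknown_exception);

	DEFINE_INTERRUPT_HANDLER(unknown_exception)
	{
		...
	}

so asm only ever calls the generated outer function, and common entry
code can later be moved into the wrapper in C.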

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/asm-prototypes.h     |  29 ---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 -
 arch/powerpc/include/asm/hw_irq.h             |   9 -
 arch/powerpc/include/asm/interrupt.h          | 218 ++++++++++++++++++
 arch/powerpc/include/asm/time.h               |   2 +
 arch/powerpc/kernel/dbell.c                   |  12 +-
 arch/powerpc/kernel/exceptions-64s.S          |   7 +-
 arch/powerpc/kernel/head_book3s_32.S          |   6 +-
 arch/powerpc/kernel/irq.c                     |   3 +-
 arch/powerpc/kernel/mce.c                     |   5 +-
 arch/powerpc/kernel/syscall_64.c              |   1 +
 arch/powerpc/kernel/tau_6xx.c                 |   2 +-
 arch/powerpc/kernel/time.c                    |   3 +-
 arch/powerpc/kernel/traps.c                   |  90 +++++---
 arch/powerpc/kernel/watchdog.c                |   7 +-
 arch/powerpc/kvm/book3s_hv.c                  |   1 +
 arch/powerpc/kvm/book3s_hv_builtin.c          |   1 +
 arch/powerpc/kvm/booke.c                      |   1 +
 arch/powerpc/mm/book3s64/hash_utils.c         |  57 +++--
 arch/powerpc/mm/book3s64/slb.c                |  29 +--
 arch/powerpc/mm/fault.c                       |  15 +-
 arch/powerpc/platforms/powernv/idle.c         |   1 +
 22 files changed, 374 insertions(+), 126 deletions(-)
 create mode 100644 arch/powerpc/include/asm/interrupt.h

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index 22c9d08fa3a4..939f3c94c8f3 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -56,35 +56,6 @@ int exit_vmx_usercopy(void);
 int enter_vmx_ops(void);
 void *exit_vmx_ops(void *dest);
 
-/* Traps */
-long machine_check_early(struct pt_regs *regs);
-long hmi_exception_realmode(struct pt_regs *regs);
-void SMIException(struct pt_regs *regs);
-void handle_hmi_exception(struct pt_regs *regs);
-void instruction_breakpoint_exception(struct pt_regs *regs);
-void RunModeException(struct pt_regs *regs);
-void single_step_exception(struct pt_regs *regs);
-void program_check_exception(struct pt_regs *regs);
-void alignment_exception(struct pt_regs *regs);
-void StackOverflow(struct pt_regs *regs);
-void stack_overflow_exception(struct pt_regs *regs);
-void kernel_fp_unavailable_exception(struct pt_regs *regs);
-void altivec_unavailable_exception(struct pt_regs *regs);
-void vsx_unavailable_exception(struct pt_regs *regs);
-void fp_unavailable_tm(struct pt_regs *regs);
-void altivec_unavailable_tm(struct pt_regs *regs);
-void vsx_unavailable_tm(struct pt_regs *regs);
-void facility_unavailable_exception(struct pt_regs *regs);
-void TAUException(struct pt_regs *regs);
-void altivec_assist_exception(struct pt_regs *regs);
-void unrecoverable_exception(struct pt_regs *regs);
-void kernel_bad_stack(struct pt_regs *regs);
-void system_reset_exception(struct pt_regs *regs);
-void machine_check_exception(struct pt_regs *regs);
-void emulation_assist_interrupt(struct pt_regs *regs);
-long do_slb_fault(struct pt_regs *regs);
-void do_bad_slb_fault(struct pt_regs *regs);
-
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
 		    struct ucontext __user *new_ctx,
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index b9968e297da2..066b1d34c7bc 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -454,7 +454,6 @@ static inline unsigned long hpt_hash(unsigned long vpn,
 #define HPTE_NOHPTE_UPDATE	0x2
 #define HPTE_USE_KERNEL_KEY	0x4
 
-long do_hash_fault(struct pt_regs *regs);
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
 			  unsigned long vsid, pte_t *ptep, unsigned long trap,
 			  unsigned long flags, int ssize, int subpage_prot);
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 0363734ff56e..614957f74cee 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -50,15 +50,6 @@
 
 #ifndef __ASSEMBLY__
 
-extern void replay_system_reset(void);
-extern void replay_soft_interrupts(void);
-
-extern void timer_interrupt(struct pt_regs *);
-extern void timer_broadcast_interrupt(void);
-extern void performance_monitor_exception(struct pt_regs *regs);
-extern void WatchdogException(struct pt_regs *regs);
-extern void unknown_exception(struct pt_regs *regs);
-
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
 
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
new file mode 100644
index 000000000000..60363e5eeffa
--- /dev/null
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -0,0 +1,218 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_POWERPC_INTERRUPT_H
+#define _ASM_POWERPC_INTERRUPT_H
+
+#include <linux/context_tracking.h>
+#include <asm/ftrace.h>
+
+/**
+ * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ */
+#define DECLARE_INTERRUPT_HANDLER_RAW(func)				\
+	__visible long func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER_RAW - Define raw interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ *
+ * @func is called from ASM entry code.
+ *
+ * This is a plain function which does no tracing, reconciling, etc.
+ * The macro is written so it acts as function definition. Append the
+ * body with a pair of curly brackets.
+ *
+ * Raw interrupt handlers must not enable or disable interrupts, or
+ * schedule. Tracing and instrumentation (ftrace, lockdep, etc) would
+ * not be advisable either, although it may be possible in a pinch; at
+ * least the trace will look odd.
+ *
+ * A raw handler may call one of the other interrupt handler functions
+ * to be converted into that interrupt context without these restrictions.
+ *
+ * On PPC64, _RAW handlers may return with fast_interrupt_return.
+ *
+ * Specific handlers may have additional restrictions.
+ */
+#define DEFINE_INTERRUPT_HANDLER_RAW(func)				\
+static __always_inline long ____##func(struct pt_regs *regs);		\
+									\
+__visible noinstr long func(struct pt_regs *regs)			\
+{									\
+	long ret;							\
+									\
+	ret = ____##func (regs);					\
+									\
+	return ret;							\
+}									\
+									\
+static __always_inline long ____##func(struct pt_regs *regs)
+
+/**
+ * DECLARE_INTERRUPT_HANDLER - Declare synchronous interrupt handler function
+ * @func:	Function name of the entry point
+ */
+#define DECLARE_INTERRUPT_HANDLER(func)					\
+	__visible void func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER - Define synchronous interrupt handler function
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code.
+ *
+ * The macro is written so it acts as a function definition. Append the
+ * body within a pair of curly brackets.
+ */
+#define DEFINE_INTERRUPT_HANDLER(func)					\
+static __always_inline void ____##func(struct pt_regs *regs);		\
+									\
+__visible noinstr void func(struct pt_regs *regs)			\
+{									\
+	____##func (regs);						\
+}									\
+									\
+static __always_inline void ____##func(struct pt_regs *regs)
+
+/**
+ * DECLARE_INTERRUPT_HANDLER_RET - Declare synchronous interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ */
+#define DECLARE_INTERRUPT_HANDLER_RET(func)				\
+	__visible long func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER_RET - Define synchronous interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ *
+ * @func is called from ASM entry code.
+ *
+ * The macro is written so it acts as a function definition. Append the
+ * body within a pair of curly brackets.
+ */
+#define DEFINE_INTERRUPT_HANDLER_RET(func)				\
+static __always_inline long ____##func(struct pt_regs *regs);		\
+									\
+__visible noinstr long func(struct pt_regs *regs)			\
+{									\
+	long ret;							\
+									\
+	ret = ____##func (regs);					\
+									\
+	return ret;							\
+}									\
+									\
+static __always_inline long ____##func(struct pt_regs *regs)
+
+/**
+ * DECLARE_INTERRUPT_HANDLER_ASYNC - Declare asynchronous interrupt handler function
+ * @func:	Function name of the entry point
+ */
+#define DECLARE_INTERRUPT_HANDLER_ASYNC(func)				\
+	__visible void func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER_ASYNC - Define asynchronous interrupt handler function
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code.
+ *
+ * The macro is written so it acts as a function definition. Append the
+ * body within a pair of curly brackets.
+ */
+#define DEFINE_INTERRUPT_HANDLER_ASYNC(func)				\
+static __always_inline void ____##func(struct pt_regs *regs);		\
+									\
+__visible noinstr void func(struct pt_regs *regs)			\
+{									\
+	____##func (regs);						\
+}									\
+									\
+static __always_inline void ____##func(struct pt_regs *regs)
+
+/**
+ * DECLARE_INTERRUPT_HANDLER_NMI - Declare NMI interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ */
+#define DECLARE_INTERRUPT_HANDLER_NMI(func)				\
+	__visible long func(struct pt_regs *regs)
+
+/**
+ * DEFINE_INTERRUPT_HANDLER_NMI - Define NMI interrupt handler function
+ * @func:	Function name of the entry point
+ * @returns:	Returns a value back to asm caller
+ *
+ * @func is called from ASM entry code.
+ *
+ * The macro is written so it acts as a function definition. Append the
+ * body within a pair of curly brackets.
+ */
+#define DEFINE_INTERRUPT_HANDLER_NMI(func)				\
+static __always_inline long ____##func(struct pt_regs *regs);		\
+									\
+__visible noinstr long func(struct pt_regs *regs)			\
+{									\
+	long ret;							\
+									\
+	ret = ____##func (regs);					\
+									\
+	return ret;							\
+}									\
+									\
+static __always_inline long ____##func(struct pt_regs *regs)
+
+
+/* Interrupt handlers */
+DECLARE_INTERRUPT_HANDLER_NMI(machine_check_early);
+DECLARE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode);
+DECLARE_INTERRUPT_HANDLER(SMIException);
+DECLARE_INTERRUPT_HANDLER(handle_hmi_exception);
+DECLARE_INTERRUPT_HANDLER(instruction_breakpoint_exception);
+DECLARE_INTERRUPT_HANDLER(RunModeException);
+DECLARE_INTERRUPT_HANDLER(single_step_exception);
+DECLARE_INTERRUPT_HANDLER(program_check_exception);
+DECLARE_INTERRUPT_HANDLER(alignment_exception);
+DECLARE_INTERRUPT_HANDLER(StackOverflow);
+DECLARE_INTERRUPT_HANDLER(stack_overflow_exception);
+DECLARE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception);
+DECLARE_INTERRUPT_HANDLER(altivec_unavailable_exception);
+DECLARE_INTERRUPT_HANDLER(vsx_unavailable_exception);
+DECLARE_INTERRUPT_HANDLER(fp_unavailable_tm);
+DECLARE_INTERRUPT_HANDLER(altivec_unavailable_tm);
+DECLARE_INTERRUPT_HANDLER(vsx_unavailable_tm);
+DECLARE_INTERRUPT_HANDLER(facility_unavailable_exception);
+DECLARE_INTERRUPT_HANDLER_ASYNC(TAUException);
+DECLARE_INTERRUPT_HANDLER(altivec_assist_exception);
+DECLARE_INTERRUPT_HANDLER(unrecoverable_exception);
+DECLARE_INTERRUPT_HANDLER(kernel_bad_stack);
+DECLARE_INTERRUPT_HANDLER_NMI(system_reset_exception);
+#ifdef CONFIG_PPC_BOOK3S_64
+DECLARE_INTERRUPT_HANDLER_ASYNC(machine_check_exception);
+#else
+DECLARE_INTERRUPT_HANDLER_NMI(machine_check_exception);
+#endif
+DECLARE_INTERRUPT_HANDLER(emulation_assist_interrupt);
+DECLARE_INTERRUPT_HANDLER_RAW(do_slb_fault);
+DECLARE_INTERRUPT_HANDLER(do_bad_slb_fault);
+DECLARE_INTERRUPT_HANDLER_RAW(do_hash_fault);
+DECLARE_INTERRUPT_HANDLER_RET(do_page_fault);
+DECLARE_INTERRUPT_HANDLER(__do_bad_page_fault);
+DECLARE_INTERRUPT_HANDLER(do_bad_page_fault);
+
+DECLARE_INTERRUPT_HANDLER_ASYNC(timer_interrupt);
+DECLARE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi);
+DECLARE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async);
+DECLARE_INTERRUPT_HANDLER_RAW(performance_monitor_exception);
+DECLARE_INTERRUPT_HANDLER(WatchdogException);
+DECLARE_INTERRUPT_HANDLER(unknown_exception);
+DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
+
+void replay_system_reset(void);
+void replay_soft_interrupts(void);
+
+#endif /* _ASM_POWERPC_INTERRUPT_H */
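
For illustration, a handler is then written as below (a sketch with a
hypothetical handler name; the outer function is what the asm entry code
calls, and the inlined ____ variant holds the body):

	DEFINE_INTERRUPT_HANDLER(example_exception)
	{
		/* this body becomes ____example_exception(regs) */
		_exception(SIGTRAP, regs, TRAP_UNK, 0);
	}
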
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 8f789b597bae..8dd3cdb25338 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -102,6 +102,8 @@ DECLARE_PER_CPU(u64, decrementers_next_tb);
 /* Convert timebase ticks to nanoseconds */
 unsigned long long tb_to_ns(unsigned long long tb_ticks);
 
+void timer_broadcast_interrupt(void);
+
 /* SPLPAR */
 void accumulate_stolen_time(void);
 
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index 52680cf07c9d..c0f99f8ffa7d 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -12,14 +12,14 @@
 #include <linux/hardirq.h>
 
 #include <asm/dbell.h>
+#include <asm/interrupt.h>
 #include <asm/irq_regs.h>
 #include <asm/kvm_ppc.h>
 #include <asm/trace.h>
 
-#ifdef CONFIG_SMP
-
-void doorbell_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
 {
+#ifdef CONFIG_SMP
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
 	irq_enter();
@@ -37,11 +37,7 @@ void doorbell_exception(struct pt_regs *regs)
 	trace_doorbell_exit(regs);
 	irq_exit();
 	set_irq_regs(old_regs);
-}
 #else /* CONFIG_SMP */
-void doorbell_exception(struct pt_regs *regs)
-{
 	printk(KERN_WARNING "Received doorbell on non-smp system\n");
-}
 #endif /* CONFIG_SMP */
-
+}
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 36dea2020ec5..8b0db807974c 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1923,7 +1923,7 @@ EXC_COMMON_BEGIN(doorbell_super_common)
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
 #else
-	bl	unknown_exception
+	bl	unknown_async_exception
 #endif
 	b	interrupt_return
 
@@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
 	GEN_COMMON h_data_storage
 	addi    r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-	li	r4,SIGSEGV
-	bl      bad_page_fault
+	bl      do_bad_page_fault
 MMU_FTR_SECTION_ELSE
 	bl      unknown_exception
 ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
@@ -2310,7 +2309,7 @@ EXC_COMMON_BEGIN(h_doorbell_common)
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
 #else
-	bl	unknown_exception
+	bl	unknown_async_exception
 #endif
 	b	interrupt_return
 
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 94ad1372c490..9b4d5432e2db 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -238,8 +238,8 @@ __secondary_hold_acknowledge:
 
 /* System reset */
/* core99 pmac starts the secondary here by changing the vector, and
-   putting it back to what it was (unknown_exception) when done.  */
-	EXCEPTION(0x100, Reset, unknown_exception, EXC_XFER_STD)
+   putting it back to what it was (unknown_async_exception) when done.  */
+	EXCEPTION(0x100, Reset, unknown_async_exception, EXC_XFER_STD)
 
 /* Machine check */
 /*
@@ -631,7 +631,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_NEED_DTLB_SW_LRU)
 #endif
 
 #ifndef CONFIG_TAU_INT
-#define TAUException	unknown_exception
+#define TAUException	unknown_async_exception
 #endif
 
 	EXCEPTION(0x1300, Trap_13, instruction_breakpoint_exception, EXC_XFER_STD)
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6b1eca53e36c..2055d204d08e 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -54,6 +54,7 @@
 #include <linux/pgtable.h>
 
 #include <linux/uaccess.h>
+#include <asm/interrupt.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/cache.h>
@@ -665,7 +666,7 @@ void __do_irq(struct pt_regs *regs)
 	irq_exit();
 }
 
-void do_IRQ(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
 {
 	struct pt_regs *old_regs = set_irq_regs(regs);
 	void *cursp, *irqsp, *sirqsp;
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 9f3e133b57b7..54269947113d 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -18,6 +18,7 @@
 #include <linux/extable.h>
 #include <linux/ftrace.h>
 
+#include <asm/interrupt.h>
 #include <asm/machdep.h>
 #include <asm/mce.h>
 #include <asm/nmi.h>
@@ -588,7 +589,7 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
  *
  * regs->nip and regs->msr contains srr0 and ssr1.
  */
-long notrace machine_check_early(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
 {
 	long handled = 0;
 	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
@@ -722,7 +723,7 @@ long hmi_handle_debugtrig(struct pt_regs *regs)
 /*
  * Return values:
  */
-long hmi_exception_realmode(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode)
 {	
 	int ret;
 
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 7c85ed04a164..dd87b2118620 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -5,6 +5,7 @@
 #include <asm/kup.h>
 #include <asm/cputime.h>
 #include <asm/hw_irq.h>
+#include <asm/interrupt.h>
 #include <asm/kprobes.h>
 #include <asm/paca.h>
 #include <asm/ptrace.h>
diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
index 0b4694b8d248..46b2e5de4ef5 100644
--- a/arch/powerpc/kernel/tau_6xx.c
+++ b/arch/powerpc/kernel/tau_6xx.c
@@ -100,7 +100,7 @@ static void TAUupdate(int cpu)
  * with interrupts disabled
  */
 
-void TAUException(struct pt_regs * regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(TAUException)
 {
 	int cpu = smp_processor_id();
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 67feb3524460..435a251247ed 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -56,6 +56,7 @@
 #include <linux/processor.h>
 #include <asm/trace.h>
 
+#include <asm/interrupt.h>
 #include <asm/io.h>
 #include <asm/nvram.h>
 #include <asm/cache.h>
@@ -570,7 +571,7 @@ void arch_irq_work_raise(void)
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
  */
-void timer_interrupt(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 {
 	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
 	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 9b5298c016c7..f4462b481248 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -41,6 +41,7 @@
 #include <asm/emulated_ops.h>
 #include <linux/uaccess.h>
 #include <asm/debugfs.h>
+#include <asm/interrupt.h>
 #include <asm/io.h>
 #include <asm/machdep.h>
 #include <asm/rtas.h>
@@ -430,8 +431,7 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
 	regs->msr &= ~MSR_RI;
 #endif
 }
-
-void system_reset_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 {
 	unsigned long hsrr0, hsrr1;
 	bool saved_hsrrs = false;
@@ -516,7 +516,10 @@ void system_reset_exception(struct pt_regs *regs)
 	this_cpu_set_ftrace_enabled(ftrace_enabled);
 
 	/* What should we do here? We could issue a shutdown or hard reset. */
+
+	return 0;
 }
+NOKPROBE_SYMBOL(system_reset_exception);
 
 /*
  * I/O accesses can cause machine checks on powermacs.
@@ -788,7 +791,12 @@ int machine_check_generic(struct pt_regs *regs)
 }
 #endif /* everything else */
 
-void machine_check_exception(struct pt_regs *regs)
+
+#ifdef CONFIG_PPC_BOOK3S_64
+DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
+#else
+DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
+#endif
 {
 	int recover = 0;
 
@@ -838,13 +846,21 @@ void machine_check_exception(struct pt_regs *regs)
 	if (!(regs->msr & MSR_RI))
 		die("Unrecoverable Machine check", regs, SIGBUS);
 
+#ifdef CONFIG_PPC_BOOK3S_64
+bail:
 	return;
+#else
+	return 0;
 
 bail:
 	if (nmi) nmi_exit();
+
+	return 0;
+#endif
 }
+NOKPROBE_SYMBOL(machine_check_exception);
 
-void SMIException(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(SMIException) /* async? */
 {
 	die("System Management Interrupt", regs, SIGABRT);
 }
@@ -1030,7 +1046,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
 }
 #endif /* CONFIG_VSX */
 
-void handle_hmi_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
 {
 	struct pt_regs *old_regs;
 
@@ -1059,7 +1075,7 @@ void handle_hmi_exception(struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
-void unknown_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(unknown_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1071,7 +1087,19 @@ void unknown_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void instruction_breakpoint_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception)
+{
+	enum ctx_state prev_state = exception_enter();
+
+	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
+	       regs->nip, regs->msr, regs->trap);
+
+	_exception(SIGTRAP, regs, TRAP_UNK, 0);
+
+	exception_exit(prev_state);
+}
+
+DEFINE_INTERRUPT_HANDLER(instruction_breakpoint_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1086,12 +1114,12 @@ void instruction_breakpoint_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void RunModeException(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(RunModeException)
 {
 	_exception(SIGTRAP, regs, TRAP_UNK, 0);
 }
 
-void single_step_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(single_step_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1436,7 +1464,7 @@ static int emulate_math(struct pt_regs *regs)
 static inline int emulate_math(struct pt_regs *regs) { return -1; }
 #endif
 
-void program_check_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(program_check_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 	unsigned int reason = get_reason(regs);
@@ -1561,14 +1589,14 @@ NOKPROBE_SYMBOL(program_check_exception);
  * This occurs when running in hypervisor mode on POWER6 or later
  * and an illegal instruction is encountered.
  */
-void emulation_assist_interrupt(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(emulation_assist_interrupt)
 {
 	regs->msr |= REASON_ILLEGAL;
 	program_check_exception(regs);
 }
 NOKPROBE_SYMBOL(emulation_assist_interrupt);
 
-void alignment_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(alignment_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 	int sig, code, fixed = 0;
@@ -1618,7 +1646,7 @@ void alignment_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void StackOverflow(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(StackOverflow)
 {
 	pr_crit("Kernel stack overflow in process %s[%d], r1=%lx\n",
 		current->comm, task_pid_nr(current), regs->gpr[1]);
@@ -1627,7 +1655,7 @@ void StackOverflow(struct pt_regs *regs)
 	panic("kernel stack overflow");
 }
 
-void stack_overflow_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(stack_overflow_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1636,7 +1664,7 @@ void stack_overflow_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void kernel_fp_unavailable_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1647,7 +1675,7 @@ void kernel_fp_unavailable_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void altivec_unavailable_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(altivec_unavailable_exception)
 {
 	enum ctx_state prev_state = exception_enter();
 
@@ -1666,7 +1694,7 @@ void altivec_unavailable_exception(struct pt_regs *regs)
 	exception_exit(prev_state);
 }
 
-void vsx_unavailable_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(vsx_unavailable_exception)
 {
 	if (user_mode(regs)) {
		/* A user program has executed a vsx instruction,
@@ -1697,7 +1725,7 @@ static void tm_unavailable(struct pt_regs *regs)
 	die("Unrecoverable TM Unavailable Exception", regs, SIGABRT);
 }
 
-void facility_unavailable_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
 {
 	static char *facility_strings[] = {
 		[FSCR_FP_LG] = "FPU",
@@ -1817,7 +1845,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 
-void fp_unavailable_tm(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(fp_unavailable_tm)
 {
 	/* Note:  This does not handle any kind of FP laziness. */
 
@@ -1850,7 +1878,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
 	tm_recheckpoint(&current->thread);
 }
 
-void altivec_unavailable_tm(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(altivec_unavailable_tm)
 {
 	/* See the comments in fp_unavailable_tm().  This function operates
 	 * the same way.
@@ -1865,7 +1893,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
 	current->thread.used_vr = 1;
 }
 
-void vsx_unavailable_tm(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(vsx_unavailable_tm)
 {
 	/* See the comments in fp_unavailable_tm().  This works similarly,
 	 * though we're loading both FP and VEC registers in here.
@@ -1890,7 +1918,8 @@ void vsx_unavailable_tm(struct pt_regs *regs)
 }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
-static void performance_monitor_exception_nmi(struct pt_regs *regs)
+#ifdef CONFIG_PPC64
+DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
 {
 	nmi_enter();
 
@@ -1899,9 +1928,12 @@ static void performance_monitor_exception_nmi(struct pt_regs *regs)
 	perf_irq(regs);
 
 	nmi_exit();
+
+	return 0;
 }
+#endif
 
-static void performance_monitor_exception_async(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async)
 {
 	irq_enter();
 
@@ -1912,7 +1944,7 @@ static void performance_monitor_exception_async(struct pt_regs *regs)
 	irq_exit();
 }
 
-void performance_monitor_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
 {
 	/*
 	 * On 64-bit, if perf interrupts hit in a local_irq_disable
@@ -1924,6 +1956,8 @@ void performance_monitor_exception(struct pt_regs *regs)
 		performance_monitor_exception_nmi(regs);
 	else
 		performance_monitor_exception_async(regs);
+
+	return 0;
 }
 
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
@@ -2057,7 +2091,7 @@ NOKPROBE_SYMBOL(DebugException);
 #endif /* CONFIG_PPC_ADV_DEBUG_REGS */
 
 #ifdef CONFIG_ALTIVEC
-void altivec_assist_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(altivec_assist_exception)
 {
 	int err;
 
@@ -2199,7 +2233,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
  * in the MSR is 0.  This indicates that SRR0/1 are live, and that
  * we therefore lost state by taking this exception.
  */
-void unrecoverable_exception(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(unrecoverable_exception)
 {
 	pr_emerg("Unrecoverable exception %lx at %lx (msr=%lx)\n",
 		 regs->trap, regs->nip, regs->msr);
@@ -2219,7 +2253,7 @@ void __attribute__ ((weak)) WatchdogHandler(struct pt_regs *regs)
 	return;
 }
 
-void WatchdogException(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(WatchdogException) /* XXX NMI? async? */
 {
 	printk (KERN_EMERG "PowerPC Book-E Watchdog Exception\n");
 	WatchdogHandler(regs);
@@ -2230,7 +2264,7 @@ void WatchdogException(struct pt_regs *regs)
  * We enter here if we discover during exception entry that we are
  * running in supervisor mode with a userspace value in the stack pointer.
  */
-void kernel_bad_stack(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(kernel_bad_stack)
 {
 	printk(KERN_EMERG "Bad kernel stack pointer %lx at %lx\n",
 	       regs->gpr[1], regs->nip);
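
The performance_monitor_exception conversion above is the motivating
case for the _RAW wrapper: the raw entry only chooses which real
context to enter. Roughly (a sketch; the exact condition sits in
unchanged lines the hunk elides and is reconstructed here from the
comment in the hunk):

	DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
	{
		/* treat perf interrupts in soft-masked regions as NMIs */
		if (IS_ENABLED(CONFIG_PPC64) && arch_irq_disabled_regs(regs))
			performance_monitor_exception_nmi(regs);   /* nmi_enter/exit */
		else
			performance_monitor_exception_async(regs); /* irq_enter/exit */

		return 0;
	}
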
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index af3c15a1d41e..824b9376ac35 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -26,6 +26,7 @@
 #include <linux/delay.h>
 #include <linux/smp.h>
 
+#include <asm/interrupt.h>
 #include <asm/paca.h>
 
 /*
@@ -247,14 +248,14 @@ static void watchdog_timer_interrupt(int cpu)
 		watchdog_smp_panic(cpu, tb);
 }
 
-void soft_nmi_interrupt(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 {
 	unsigned long flags;
 	int cpu = raw_smp_processor_id();
 	u64 tb;
 
 	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
-		return;
+		return 0;
 
 	nmi_enter();
 
@@ -291,6 +292,8 @@ void soft_nmi_interrupt(struct pt_regs *regs)
 
 out:
 	nmi_exit();
+
+	return 0;
 }
 
 static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6f612d240392..3f9a229f82a2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -53,6 +53,7 @@
 #include <asm/cputable.h>
 #include <asm/cacheflush.h>
 #include <linux/uaccess.h>
+#include <asm/interrupt.h>
 #include <asm/io.h>
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 8053efdf7ea7..10fc274bea65 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -17,6 +17,7 @@
 
 #include <asm/asm-prototypes.h>
 #include <asm/cputable.h>
+#include <asm/interrupt.h>
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
 #include <asm/archrandom.h>
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 288a9820ec01..bd2bb73021d8 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -20,6 +20,7 @@
 
 #include <asm/cputable.h>
 #include <linux/uaccess.h>
+#include <asm/interrupt.h>
 #include <asm/kvm_ppc.h>
 #include <asm/cacheflush.h>
 #include <asm/dbell.h>
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 77073a256cff..453afb9ae9b4 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -38,6 +38,7 @@
 #include <linux/pgtable.h>
 
 #include <asm/debugfs.h>
+#include <asm/interrupt.h>
 #include <asm/processor.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
@@ -1512,7 +1513,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-long do_hash_fault(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
 {
 	unsigned long ea = regs->dar;
 	unsigned long dsisr = regs->dsisr;
@@ -1522,27 +1523,6 @@ long do_hash_fault(struct pt_regs *regs)
 	unsigned int region_id;
 	long err;
 
-	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
-		goto page_fault;
-
-	/*
-	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
-	 * don't call hash_page, just fail the fault. This is required to
-	 * prevent re-entrancy problems in the hash code, namely perf
-	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
-	 * hash fault. See the comment in hash_preload().
-	 *
-	 * We come here as a result of a DSI at a point where we don't want
-	 * to call hash_page, such as when we are accessing memory (possibly
-	 * user memory) inside a PMU interrupt that occurred while interrupts
-	 * were soft-disabled.  We want to invoke the exception handler for
-	 * the access, or panic if there isn't a handler.
-	 */
-	if (unlikely(in_nmi())) {
-		bad_page_fault(regs, SIGSEGV);
-		return 0;
-	}
-
 	region_id = get_region_id(ea);
 	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
 		mm = &init_mm;
@@ -1583,13 +1563,44 @@ long do_hash_fault(struct pt_regs *regs)
 		err = 0;
 
 	} else if (err) {
-page_fault:
 		err = do_page_fault(regs);
 	}
 
 	return err;
 }
 
+/*
+ * The _RAW interrupt entry checks for the in_nmi() case before
+ * running the full handler.
+ */
+DEFINE_INTERRUPT_HANDLER_RAW(do_hash_fault)
+{
+	unsigned long dsisr = regs->dsisr;
+
+	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
+		return do_page_fault(regs);
+
+	/*
+	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
+	 * don't call hash_page, just fail the fault. This is required to
+	 * prevent re-entrancy problems in the hash code, namely perf
+	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
+	 * hash fault. See the comment in hash_preload().
+	 *
+	 * We come here as a result of a DSI at a point where we don't want
+	 * to call hash_page, such as when we are accessing memory (possibly
+	 * user memory) inside a PMU interrupt that occurred while interrupts
+	 * were soft-disabled.  We want to invoke the exception handler for
+	 * the access, or panic if there isn't a handler.
+	 */
+	if (unlikely(in_nmi())) {
+		do_bad_page_fault(regs);
+		return 0;
+	}
+
+	return __do_hash_fault(regs);
+}
+
 #ifdef CONFIG_PPC_MM_SLICES
 static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
 {
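
The net hash fault flow after this change is (call-graph sketch):

	do_hash_fault (RAW, called from asm)
	  -> do_page_fault	DSISR_BAD_FAULT_64S/DABRMATCH/KEYFAULT
	  -> do_bad_page_fault	in_nmi(), fail the fault
	  -> __do_hash_fault (RET), which may itself fall back to
	     do_page_fault when hash_page_mm() returns an error
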
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index c581548b533f..0ae10adae203 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -10,6 +10,7 @@
  */
 
 #include <asm/asm-prototypes.h>
+#include <asm/interrupt.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
 #include <asm/paca.h>
@@ -813,7 +814,7 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
 	return slb_insert_entry(ea, context, flags, ssize, false);
 }
 
-long do_slb_fault(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_RAW(do_slb_fault)
 {
 	unsigned long ea = regs->dar;
 	unsigned long id = get_region_id(ea);
@@ -827,17 +828,19 @@ long do_slb_fault(struct pt_regs *regs)
 	/*
 	 * SLB kernel faults must be very careful not to touch anything
 	 * that is not bolted. E.g., PACA and global variables are okay,
-	 * mm->context stuff is not.
-	 *
-	 * SLB user faults can access all of kernel memory, but must be
-	 * careful not to touch things like IRQ state because it is not
-	 * "reconciled" here. The difficulty is that we must use
-	 * fast_exception_return to return from kernel SLB faults without
-	 * looking at possible non-bolted memory. We could test user vs
-	 * kernel faults in the interrupt handler asm and do a full fault,
-	 * reconcile, ret_from_except for user faults which would make them
-	 * first class kernel code. But for performance it's probably nicer
-	 * if they go via fast_exception_return too.
+	 * mm->context stuff is not. SLB user faults may access all of
+	 * memory (and induce one recursive SLB kernel fault), so the
+	 * kernel fault must not trample on the user fault state at those
+	 * points.
+	 */
+
+	/*
+	 * This is a _RAW interrupt handler, so it must not touch local
+	 * irq state, or schedule. We could test for usermode and upgrade
+	 * to a normal process context (synchronous) interrupt for those,
+	 * which would make them first-class kernel code and able to be
+	 * traced and instrumented; although performance would suffer a
+	 * bit, it would probably be a good tradeoff.
 	 */
 	if (id >= LINEAR_MAP_REGION_ID) {
 		long err;
@@ -866,7 +869,7 @@ long do_slb_fault(struct pt_regs *regs)
 	}
 }
 
-void do_bad_slb_fault(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER(do_bad_slb_fault)
 {
 	int err = regs->result;
 
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 36604ff8b3ec..9e1cd74ebb13 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -34,6 +34,7 @@
 #include <linux/uaccess.h>
 
 #include <asm/firmware.h>
+#include <asm/interrupt.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
 #include <asm/mmu_context.h>
@@ -547,7 +548,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 }
 NOKPROBE_SYMBOL(__do_page_fault);
 
-long do_page_fault(struct pt_regs *regs)
+DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
 {
 	enum ctx_state prev_state = exception_enter();
 	unsigned long address = regs->dar;
@@ -641,3 +642,15 @@ void bad_page_fault(struct pt_regs *regs, int sig)
 	else
 		__bad_page_fault(regs, sig);
 }
+
+#ifdef CONFIG_PPC_BOOK3S_64
+DEFINE_INTERRUPT_HANDLER(__do_bad_page_fault)
+{
+	__bad_page_fault(regs, SIGSEGV);
+}
+
+DEFINE_INTERRUPT_HANDLER(do_bad_page_fault)
+{
+	bad_page_fault(regs, SIGSEGV);
+}
+#endif
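
Spelling out the two new BOOK3S_64 wrappers: do_bad_page_fault() goes
via bad_page_fault(), whose tail (visible in the hunk context above)
only calls __bad_page_fault() as a fallback after an earlier check,
while __do_bad_page_fault() skips that and calls __bad_page_fault()
directly. As a sketch:

	do_bad_page_fault(regs)   -> bad_page_fault(regs, SIGSEGV)
	                               -> earlier check, else __bad_page_fault()
	__do_bad_page_fault(regs) -> __bad_page_fault(regs, SIGSEGV)
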
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index e6f461812856..999997d9e9a9 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -14,6 +14,7 @@
 
 #include <asm/asm-prototypes.h>
 #include <asm/firmware.h>
+#include <asm/interrupt.h>
 #include <asm/machdep.h>
 #include <asm/opal.h>
 #include <asm/cputhreads.h>
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 07/21] powerpc: add interrupt wrapper entry / exit stub functions
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (5 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 06/21] powerpc: interrupt handler wrapper functions Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 08/21] powerpc: add interrupt_cond_local_irq_enable helper Nicholas Piggin
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

These will be used by subsequent patches.
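
After this patch, for example, DEFINE_INTERRUPT_HANDLER(fn) expands to
roughly the following (sketch, attributes elided):

	void fn(struct pt_regs *regs)
	{
		struct interrupt_state state;

		interrupt_enter_prepare(regs, &state);
		____fn(regs);
		interrupt_exit_prepare(regs, &state);
	}

The *_prepare functions are empty stubs at this point; subsequent
patches fill them in.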

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 66 ++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 60363e5eeffa..7c72c91c21ce 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -5,6 +5,50 @@
 #include <linux/context_tracking.h>
 #include <asm/ftrace.h>
 
+struct interrupt_state {
+};
+
+static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
+{
+}
+
+/*
+ * Note that interrupt_exit_prepare and interrupt_async_exit_prepare do
+ * not necessarily return immediately to the regs context (e.g., if regs
+ * is usermode, we don't necessarily return to user mode). Other
+ * interrupts might be taken between here and return, a context switch /
+ * preemption may occur in the exit path after this, or a signal may be
+ * delivered, etc.
+ *
+ * The real interrupt exit code is platform specific, e.g.,
+ * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
+ *
+ * However interrupt_nmi_exit_prepare does return directly to regs, because
+ * NMIs do not do "exit work" or replay soft-masked interrupts.
+ */
+static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
+{
+}
+
+static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
+{
+}
+
+static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
+{
+}
+
+struct interrupt_nmi_state {
+};
+
+static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+{
+}
+
+static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+{
+}
+
 /**
  * DECLARE_INTERRUPT_HANDLER_RAW - Declare raw interrupt handler function
  * @func:	Function name of the entry point
@@ -71,7 +115,13 @@ static __always_inline void ____##func(struct pt_regs *regs);		\
 									\
 __visible noinstr void func(struct pt_regs *regs)			\
 {									\
+	struct interrupt_state state;					\
+									\
+	interrupt_enter_prepare(regs, &state);				\
+									\
 	____##func (regs);						\
+									\
+	interrupt_exit_prepare(regs, &state);				\
 }									\
 									\
 static __always_inline void ____##func(struct pt_regs *regs)
@@ -99,10 +149,15 @@ static __always_inline long ____##func(struct pt_regs *regs);		\
 									\
 __visible noinstr long func(struct pt_regs *regs)			\
 {									\
+	struct interrupt_state state;					\
 	long ret;							\
 									\
+	interrupt_enter_prepare(regs, &state);				\
+									\
 	ret = ____##func (regs);					\
 									\
+	interrupt_exit_prepare(regs, &state);				\
+									\
 	return ret;							\
 }									\
 									\
@@ -129,7 +184,13 @@ static __always_inline void ____##func(struct pt_regs *regs);		\
 									\
 __visible noinstr void func(struct pt_regs *regs)			\
 {									\
+	struct interrupt_state state;					\
+									\
+	interrupt_async_enter_prepare(regs, &state);			\
+									\
 	____##func (regs);						\
+									\
+	interrupt_async_exit_prepare(regs, &state);			\
 }									\
 									\
 static __always_inline void ____##func(struct pt_regs *regs)
@@ -157,10 +218,15 @@ static __always_inline long ____##func(struct pt_regs *regs);		\
 									\
 __visible noinstr long func(struct pt_regs *regs)			\
 {									\
+	struct interrupt_nmi_state state;				\
 	long ret;							\
 									\
+	interrupt_nmi_enter_prepare(regs, &state);			\
+									\
 	ret = ____##func (regs);					\
 									\
+	interrupt_nmi_exit_prepare(regs, &state);			\
+									\
 	return ret;							\
 }									\
 									\
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 08/21] powerpc: add interrupt_cond_local_irq_enable helper
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (6 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 07/21] powerpc: add interrupt wrapper entry / exit stub functions Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ Nicholas Piggin
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Add a simple helper for synchronous (i.e., process-context) interrupt
handlers to enable interrupts if the interrupt was taken in an
interrupts-enabled context.
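
Call sites collapse from the open-coded test

	/* We restore the interrupt state now */
	if (!arch_irq_disabled_regs(regs))
		local_irq_enable();

to a single interrupt_cond_local_irq_enable(regs) call.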

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  7 +++++++
 arch/powerpc/kernel/traps.c          | 24 +++++++-----------------
 arch/powerpc/mm/fault.c              |  4 +---
 3 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 7c72c91c21ce..dfa846ebae43 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -3,6 +3,7 @@
 #define _ASM_POWERPC_INTERRUPT_H
 
 #include <linux/context_tracking.h>
+#include <linux/hardirq.h>
 #include <asm/ftrace.h>
 
 struct interrupt_state {
@@ -281,4 +282,10 @@ DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
 void replay_system_reset(void);
 void replay_soft_interrupts(void);
 
+static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
+{
+	if (!arch_irq_disabled_regs(regs))
+		local_irq_enable();
+}
+
 #endif /* _ASM_POWERPC_INTERRUPT_H */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f4462b481248..0b712c40272b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -343,8 +343,8 @@ static bool exception_common(int signr, struct pt_regs *regs, int code,
 
 	show_signal_msg(signr, regs, code, addr);
 
-	if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	if (arch_irqs_disabled())
+		interrupt_cond_local_irq_enable(regs);
 
 	current->thread.trap_nr = code;
 
@@ -1546,9 +1546,7 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 	if (!user_mode(regs))
 		goto sigill;
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	/* (reason & REASON_ILLEGAL) would be the obvious thing here,
 	 * but there seems to be a hardware bug on the 405GP (RevD)
@@ -1602,9 +1600,7 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
 	int sig, code, fixed = 0;
 	unsigned long  reason;
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	reason = get_reason(regs);
 
@@ -1765,9 +1761,7 @@ DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
 		die("Unexpected facility unavailable exception", regs, SIGABRT);
 	}
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	if (status == FSCR_DSCR_LG) {
 		/*
@@ -2147,9 +2141,7 @@ void SPEFloatingPointException(struct pt_regs *regs)
 	int code = FPE_FLTUNK;
 	int err;
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	flush_spe_to_thread(current);
 
@@ -2196,9 +2188,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
 	extern int speround_handler(struct pt_regs *regs);
 	int err;
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	preempt_disable();
 	if (regs->msr & MSR_SPE)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 9e1cd74ebb13..e971712c95c6 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -441,9 +441,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return bad_area_nosemaphore(regs, address);
 	}
 
-	/* We restore the interrupt state now */
-	if (!arch_irq_disabled_regs(regs))
-		local_irq_enable();
+	interrupt_cond_local_irq_enable(regs);
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (7 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 08/21] powerpc: add interrupt_cond_local_irq_enable helper Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 14:50   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 10/21] powerpc/64s/hash: improve context tracking of hash faults Nicholas Piggin
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Add context tracking to the system call handler explicitly, and remove
_TIF_NOHZ.

This saves 35 cycles from the cost of a gettid system call on POWER9
with a CONFIG_NOHZ_FULL kernel.
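
The resulting flow, condensed from the syscall_64.c hunks below
(sketch):

	system_call_exception():
		CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
		user_exit_irqoff();		/* user -> kernel */
		...
	syscall_exit_prepare():
		CT_WARN_ON(ct_state() == CONTEXT_USER);
		...
		user_enter_irqoff();		/* kernel -> user */
		if (unlikely(!prep_irq_for_enabled_exit(!scv))) {
			user_exit_irqoff();	/* back out and retry */
			local_irq_enable();
			goto again;
		}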

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/Kconfig                   |  1 -
 arch/powerpc/include/asm/thread_info.h |  4 +---
 arch/powerpc/kernel/ptrace/ptrace.c    |  4 ----
 arch/powerpc/kernel/signal.c           |  4 ----
 arch/powerpc/kernel/syscall_64.c       | 10 ++++++++++
 5 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 107bb4319e0e..28d5a1b1510f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -196,7 +196,6 @@ config PPC
 	select HAVE_STACKPROTECTOR		if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
 	select HAVE_STACKPROTECTOR		if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
 	select HAVE_CONTEXT_TRACKING		if PPC64
-	select HAVE_TIF_NOHZ			if PPC64
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_STACKOVERFLOW
 	select HAVE_DYNAMIC_FTRACE
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 3d8a47af7a25..386d576673a1 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -94,7 +94,6 @@ void arch_setup_new_exec(void);
 #define TIF_PATCH_PENDING	6	/* pending live patching update */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SINGLESTEP		8	/* singlestepping active */
-#define TIF_NOHZ		9	/* in adaptive nohz mode */
 #define TIF_SECCOMP		10	/* secure computing */
 #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
 #define TIF_NOERROR		12	/* Force successful syscall return */
@@ -128,11 +127,10 @@ void arch_setup_new_exec(void);
 #define _TIF_UPROBE		(1<<TIF_UPROBE)
 #define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
 #define _TIF_EMULATE_STACK_STORE	(1<<TIF_EMULATE_STACK_STORE)
-#define _TIF_NOHZ		(1<<TIF_NOHZ)
 #define _TIF_SYSCALL_EMU	(1<<TIF_SYSCALL_EMU)
 #define _TIF_SYSCALL_DOTRACE	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
-				 _TIF_NOHZ | _TIF_SYSCALL_EMU)
+				 _TIF_SYSCALL_EMU)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 3d44b73adb83..4f3d4ff3728c 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -262,8 +262,6 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	u32 flags;
 
-	user_exit();
-
 	flags = READ_ONCE(current_thread_info()->flags) &
 		(_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
 
@@ -340,8 +338,6 @@ void do_syscall_trace_leave(struct pt_regs *regs)
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, step);
-
-	user_enter();
 }
 
 void __init pt_regs_check(void);
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 53782aa60ade..9ded046edb0e 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -282,8 +282,6 @@ static void do_signal(struct task_struct *tsk)
 
 void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 {
-	user_exit();
-
 	if (thread_info_flags & _TIF_UPROBE)
 		uprobe_notify_resume(regs);
 
@@ -299,8 +297,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 		tracehook_notify_resume(regs);
 		rseq_handle_notify_resume(NULL, regs);
 	}
-
-	user_enter();
 }
 
 static unsigned long get_tm_stackpointer(struct task_struct *tsk)
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index dd87b2118620..d7d256a7a41f 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -1,9 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 
+#include <linux/context_tracking.h>
 #include <linux/err.h>
 #include <asm/asm-prototypes.h>
 #include <asm/kup.h>
 #include <asm/cputime.h>
+#include <asm/interrupt.h>
 #include <asm/hw_irq.h>
 #include <asm/interrupt.h>
 #include <asm/kprobes.h>
@@ -28,6 +30,9 @@ notrace long system_call_exception(long r3, long r4, long r5,
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
 
+	CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
+	user_exit_irqoff();
+
 	trace_hardirqs_off(); /* finish reconciling */
 
 	if (IS_ENABLED(CONFIG_PPC_BOOK3S))
@@ -182,6 +187,8 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 	unsigned long ti_flags;
 	unsigned long ret = 0;
 
+	CT_WARN_ON(ct_state() == CONTEXT_USER);
+
 	kuap_check_amr();
 
 	regs->result = r3;
@@ -258,8 +265,11 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 		}
 	}
 
+	user_enter_irqoff();
+
 	/* scv need not set RI=0 because SRRs are not used */
 	if (unlikely(!prep_irq_for_enabled_exit(!scv))) {
+		user_exit_irqoff();
 		local_irq_enable();
 		goto again;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 10/21] powerpc/64s/hash: improve context tracking of hash faults
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (8 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 11/21] powerpc/64: context tracking move to interrupt wrappers Nicholas Piggin
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This moves the 64s/hash context tracking from hash_page_mm() to
__do_hash_fault(), so it is no longer done for the OCXL / SPU
accelerator callers. That was certainly the wrong thing to be doing,
because those callers are not low-level interrupt handlers and so
should already have entered kernel context tracking.

Then remain in kernel context for the duration of the fault, rather
than doing enter/exit once for the hash fault and again for the page
fault, which is pointless.

Even so, calling exception_enter/exit in __do_hash_fault seems
questionable, because that touches per-cpu variables, tracing, etc.,
which might have been interrupted by this hash fault or can themselves
cause hash faults. But maybe I'm missing something, because
hash_page_mm very deliberately calls trace_hash_fault too, for example.
So go with it for now; it's no worse than before in this regard.
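
The enter/exit nesting, before and after, for a hash fault that falls
back to the page fault path (sketch):

	before:	do_hash_fault
		  hash_page_mm:  exception_enter() ... exception_exit()
		  do_page_fault: exception_enter() ... exception_exit()

	after:	__do_hash_fault: exception_enter()
		  -> hash__do_page_fault	(no extra enter/exit)
		exception_exit()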

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/bug.h        |  1 +
 arch/powerpc/mm/book3s64/hash_utils.c |  7 ++++---
 arch/powerpc/mm/fault.c               | 29 ++++++++++++++++++++++-----
 3 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 4220789b9a97..e048c820ca02 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -112,6 +112,7 @@
 
 struct pt_regs;
 long do_page_fault(struct pt_regs *);
+long hash__do_page_fault(struct pt_regs *);
 void bad_page_fault(struct pt_regs *, int);
 void __bad_page_fault(struct pt_regs *regs, int sig);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 453afb9ae9b4..801d5e94cd2b 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1289,7 +1289,6 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea,
 		 unsigned long flags)
 {
 	bool is_thp;
-	enum ctx_state prev_state = exception_enter();
 	pgd_t *pgdir;
 	unsigned long vsid;
 	pte_t *ptep;
@@ -1491,7 +1490,6 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea,
 	DBG_LOW(" -> rc=%d\n", rc);
 
 bail:
-	exception_exit(prev_state);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(hash_page_mm);
@@ -1515,6 +1513,7 @@ EXPORT_SYMBOL_GPL(hash_page);
 
 DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
 {
+	enum ctx_state prev_state = exception_enter();
 	unsigned long ea = regs->dar;
 	unsigned long dsisr = regs->dsisr;
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
@@ -1563,9 +1562,11 @@ DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
 		err = 0;
 
 	} else if (err) {
-		err = do_page_fault(regs);
+		err = hash__do_page_fault(regs);
 	}
 
+	exception_exit(prev_state);
+
 	return err;
 }
 
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index e971712c95c6..495edce9dc51 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -391,7 +391,7 @@ static void sanity_check_fault(bool is_write, bool is_user,
  * The return value is 0 if the fault was handled, or the signal
  * number if this is a kernel fault that can't be handled here.
  */
-static int __do_page_fault(struct pt_regs *regs, unsigned long address,
+static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 			   unsigned long error_code)
 {
 	struct vm_area_struct * vma;
@@ -544,16 +544,15 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	return 0;
 }
-NOKPROBE_SYMBOL(__do_page_fault);
+NOKPROBE_SYMBOL(___do_page_fault);
 
-DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
+static long __do_page_fault(struct pt_regs *regs)
 {
-	enum ctx_state prev_state = exception_enter();
 	unsigned long address = regs->dar;
 	unsigned long error_code = regs->dsisr;
 	long err;
 
-	err = __do_page_fault(regs, address, error_code);
+	err = ___do_page_fault(regs, address, error_code);
 	if (unlikely(err)) {
 		const struct exception_table_entry *entry;
 
@@ -580,12 +579,32 @@ DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
 	}
 #endif
 
+	return err;
+}
+NOKPROBE_SYMBOL(__do_page_fault);
+
+DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
+{
+	enum ctx_state prev_state = exception_enter();
+	long err;
+
+	err = __do_page_fault(regs);
+
 	exception_exit(prev_state);
 
 	return err;
 }
 NOKPROBE_SYMBOL(do_page_fault);
 
+#ifdef CONFIG_PPC_BOOK3S_64
+/* Same as do_page_fault but interrupt entry has already run in do_hash_fault */
+long hash__do_page_fault(struct pt_regs *regs)
+{
+	return __do_page_fault(regs);
+}
+NOKPROBE_SYMBOL(hash__do_page_fault);
+#endif
+
 /*
  * bad_page_fault is called when we have a bad access from the kernel.
  * It is called from the DSI and ISI handlers in head.S and from some
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 11/21] powerpc/64: context tracking move to interrupt wrappers
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (9 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 10/21] powerpc/64s/hash: improve context tracking of hash faults Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 12/21] powerpc/64: add context tracking to asynchronous interrupts Nicholas Piggin
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This moves the exception_enter/exit calls into the wrapper functions
for synchronous interrupts, which covers more interrupt handlers than
before.
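
On PPC64 the wrappers now carry the context tracking state (from the
interrupt.h hunk below):

	interrupt_enter_prepare():	state->ctx_state = exception_enter();
	interrupt_exit_prepare():	exception_exit(state->ctx_state);

so individual synchronous handlers drop their open-coded prev_state
enter/exit pairs.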

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h  |  9 ++++
 arch/powerpc/kernel/traps.c           | 74 ++++++---------------------
 arch/powerpc/mm/book3s64/hash_utils.c |  3 --
 arch/powerpc/mm/fault.c               |  9 +---
 4 files changed, 27 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index dfa846ebae43..7fab54a14152 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -7,10 +7,16 @@
 #include <asm/ftrace.h>
 
 struct interrupt_state {
+#ifdef CONFIG_PPC64
+	enum ctx_state ctx_state;
+#endif
 };
 
 static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+#ifdef CONFIG_PPC64
+	state->ctx_state = exception_enter();
+#endif
 }
 
 /*
@@ -29,6 +35,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
  */
 static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+#ifdef CONFIG_PPC64
+	exception_exit(state->ctx_state);
+#endif
 }
 
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0b712c40272b..b2c53883580b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1077,41 +1077,28 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
 
 DEFINE_INTERRUPT_HANDLER(unknown_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
 	       regs->nip, regs->msr, regs->trap);
 
 	_exception(SIGTRAP, regs, TRAP_UNK, 0);
-
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
 	       regs->nip, regs->msr, regs->trap);
 
 	_exception(SIGTRAP, regs, TRAP_UNK, 0);
-
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(instruction_breakpoint_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	if (notify_die(DIE_IABR_MATCH, "iabr_match", regs, 5,
 					5, SIGTRAP) == NOTIFY_STOP)
-		goto bail;
+		return;
 	if (debugger_iabr_match(regs))
-		goto bail;
+		return;
 	_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
-
-bail:
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(RunModeException)
@@ -1121,8 +1108,6 @@ DEFINE_INTERRUPT_HANDLER(RunModeException)
 
 DEFINE_INTERRUPT_HANDLER(single_step_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	clear_single_step(regs);
 	clear_br_trace(regs);
 
@@ -1131,14 +1116,11 @@ DEFINE_INTERRUPT_HANDLER(single_step_exception)
 
 	if (notify_die(DIE_SSTEP, "single_step", regs, 5,
 					5, SIGTRAP) == NOTIFY_STOP)
-		goto bail;
+		return;
 	if (debugger_sstep(regs))
-		goto bail;
+		return;
 
 	_exception(SIGTRAP, regs, TRAP_TRACE, regs->nip);
-
-bail:
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(single_step_exception);
 
@@ -1466,7 +1448,6 @@ static inline int emulate_math(struct pt_regs *regs) { return -1; }
 
 DEFINE_INTERRUPT_HANDLER(program_check_exception)
 {
-	enum ctx_state prev_state = exception_enter();
 	unsigned int reason = get_reason(regs);
 
 	/* We can now get here via a FP Unavailable exception if the core
@@ -1475,22 +1456,22 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 	if (reason & REASON_FP) {
 		/* IEEE FP exception */
 		parse_fpe(regs);
-		goto bail;
+		return;
 	}
 	if (reason & REASON_TRAP) {
 		unsigned long bugaddr;
 		/* Debugger is first in line to stop recursive faults in
 		 * rcu_lock, notify_die, or atomic_notifier_call_chain */
 		if (debugger_bpt(regs))
-			goto bail;
+			return;
 
 		if (kprobe_handler(regs))
-			goto bail;
+			return;
 
 		/* trap exception */
 		if (notify_die(DIE_BPT, "breakpoint", regs, 5, 5, SIGTRAP)
 				== NOTIFY_STOP)
-			goto bail;
+			return;
 
 		bugaddr = regs->nip;
 		/*
@@ -1502,10 +1483,10 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 		if (!(regs->msr & MSR_PR) &&  /* not user-mode */
 		    report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
 			regs->nip += 4;
-			goto bail;
+			return;
 		}
 		_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
-		goto bail;
+		return;
 	}
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	if (reason & REASON_TM) {
@@ -1526,7 +1507,7 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 		 */
 		if (user_mode(regs)) {
 			_exception(SIGILL, regs, ILL_ILLOPN, regs->nip);
-			goto bail;
+			return;
 		} else {
 			printk(KERN_EMERG "Unexpected TM Bad Thing exception "
 			       "at %lx (msr 0x%lx) tm_scratch=%llx\n",
@@ -1557,7 +1538,7 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 	 * pattern to occurrences etc. -dgibson 31/Mar/2003
 	 */
 	if (!emulate_math(regs))
-		goto bail;
+		return;
 
 	/* Try to emulate it if we should. */
 	if (reason & (REASON_ILLEGAL | REASON_PRIVILEGED)) {
@@ -1565,10 +1546,10 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 		case 0:
 			regs->nip += 4;
 			emulate_single_step(regs);
-			goto bail;
+			return;
 		case -EFAULT:
 			_exception(SIGSEGV, regs, SEGV_MAPERR, regs->nip);
-			goto bail;
+			return;
 		}
 	}
 
@@ -1577,9 +1558,6 @@ DEFINE_INTERRUPT_HANDLER(program_check_exception)
 		_exception(SIGILL, regs, ILL_PRVOPC, regs->nip);
 	else
 		_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
-
-bail:
-	exception_exit(prev_state);
 }
 NOKPROBE_SYMBOL(program_check_exception);
 
@@ -1596,14 +1574,12 @@ NOKPROBE_SYMBOL(emulation_assist_interrupt);
 
 DEFINE_INTERRUPT_HANDLER(alignment_exception)
 {
-	enum ctx_state prev_state = exception_enter();
 	int sig, code, fixed = 0;
 	unsigned long  reason;
 
 	interrupt_cond_local_irq_enable(regs);
 
 	reason = get_reason(regs);
-
 	if (reason & REASON_BOUNDARY) {
 		sig = SIGBUS;
 		code = BUS_ADRALN;
@@ -1611,7 +1587,7 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
 	}
 
 	if (tm_abort_check(regs, TM_CAUSE_ALIGNMENT | TM_CAUSE_PERSISTENT))
-		goto bail;
+		return;
 
 	/* we don't implement logging of alignment exceptions */
 	if (!(current->thread.align_ctl & PR_UNALIGN_SIGBUS))
@@ -1621,7 +1597,7 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
 		/* skip over emulated instruction */
 		regs->nip += inst_length(reason);
 		emulate_single_step(regs);
-		goto bail;
+		return;
 	}
 
 	/* Operand address was bad */
@@ -1637,9 +1613,6 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
 		_exception(sig, regs, code, regs->dar);
 	else
 		bad_page_fault(regs, sig);
-
-bail:
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(StackOverflow)
@@ -1653,41 +1626,28 @@ DEFINE_INTERRUPT_HANDLER(StackOverflow)
 
 DEFINE_INTERRUPT_HANDLER(stack_overflow_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	die("Kernel stack overflow", regs, SIGSEGV);
-
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	printk(KERN_EMERG "Unrecoverable FP Unavailable Exception "
 			  "%lx at %lx\n", regs->trap, regs->nip);
 	die("Unrecoverable FP Unavailable Exception", regs, SIGABRT);
-
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(altivec_unavailable_exception)
 {
-	enum ctx_state prev_state = exception_enter();
-
 	if (user_mode(regs)) {
 		/* A user program has executed an altivec instruction,
 		   but this kernel doesn't support altivec. */
 		_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
-		goto bail;
+		return;
 	}
 
 	printk(KERN_EMERG "Unrecoverable VMX/Altivec Unavailable Exception "
 			"%lx at %lx\n", regs->trap, regs->nip);
 	die("Unrecoverable VMX/Altivec Unavailable Exception", regs, SIGABRT);
-
-bail:
-	exception_exit(prev_state);
 }
 
 DEFINE_INTERRUPT_HANDLER(vsx_unavailable_exception)
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 801d5e94cd2b..662adafc92e0 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1513,7 +1513,6 @@ EXPORT_SYMBOL_GPL(hash_page);
 
 DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
 {
-	enum ctx_state prev_state = exception_enter();
 	unsigned long ea = regs->dar;
 	unsigned long dsisr = regs->dsisr;
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
@@ -1565,8 +1564,6 @@ DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
 		err = hash__do_page_fault(regs);
 	}
 
-	exception_exit(prev_state);
-
 	return err;
 }
 
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 495edce9dc51..cc71c93cceaf 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -585,14 +585,7 @@ NOKPROBE_SYMBOL(__do_page_fault);
 
 DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
 {
-	enum ctx_state prev_state = exception_enter();
-	long err;
-
-	err = __do_page_fault(regs);
-
-	exception_exit(prev_state);
-
-	return err;
+	return __do_page_fault(regs);
 }
 NOKPROBE_SYMBOL(do_page_fault);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 12/21] powerpc/64: add context tracking to asynchronous interrupts
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (10 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 11/21] powerpc/64: context tracking move to interrupt wrappers Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 13/21] powerpc: handle irq_enter/irq_exit in interrupt handler wrappers Nicholas Piggin
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Previously, context tracking was not done for asynchronous interrupts
(those that run in interrupt context). If such an interrupt caused a
reschedule on exit, the scheduling functions (schedule_user,
preempt_schedule_irq) would call exception_enter/exit to fix this up
and exit user context.

This is a hack we would like to get away from, so do context tracking
for asynchronous interrupts too.

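As a reminder of where this hook sits: DEFINE_INTERRUPT_HANDLER_ASYNC
from earlier in the series expands, for a handler func, roughly to the
following (a sketch only; the real macro in asm/interrupt.h differs in
detail):

    void func(struct pt_regs *regs)
    {
            struct interrupt_state state;

            interrupt_async_enter_prepare(regs, &state);
            ____func(regs);                 /* the handler body */
            interrupt_async_exit_prepare(regs, &state);
    }

so doing context tracking in interrupt_async_enter_prepare() covers
every asynchronous handler at once.
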
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 7fab54a14152..7c40ce78c4bb 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -42,10 +42,12 @@ static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt
 
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+	interrupt_enter_prepare(regs, state);
 }
 
 static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+	interrupt_exit_prepare(regs, state);
 }
 
 struct interrupt_nmi_state {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 13/21] powerpc: handle irq_enter/irq_exit in interrupt handler wrappers
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (11 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 12/21] powerpc/64: add context tracking to asynchronous interrupts Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 14/21] powerpc/64s: move context tracking exit to interrupt exit path Nicholas Piggin
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Move irq_enter/irq_exit into asynchronous interrupt handler wrappers.

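After this, an async handler body no longer brackets itself with
irq_enter()/irq_exit(); the wrapper does it. For example, the TAU
handler reduces to just its work (as the tau_6xx.c hunk below shows):

    DEFINE_INTERRUPT_HANDLER_ASYNC(TAUException)
    {
            int cpu = smp_processor_id();

            /* the wrapper has already done irq_enter() at this point */
            tau[cpu].interrupts++;
            TAUupdate(cpu);
    }
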
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 2 ++
 arch/powerpc/kernel/dbell.c          | 3 +--
 arch/powerpc/kernel/irq.c            | 4 ----
 arch/powerpc/kernel/tau_6xx.c        | 3 ---
 arch/powerpc/kernel/time.c           | 4 ++--
 arch/powerpc/kernel/traps.c          | 6 ------
 6 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 7c40ce78c4bb..bee393c72fe5 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -43,10 +43,12 @@ static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
 	interrupt_enter_prepare(regs, state);
+	irq_enter();
 }
 
 static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+	irq_exit();
 	interrupt_exit_prepare(regs, state);
 }
 
diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index c0f99f8ffa7d..84ee9c511459 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -22,7 +22,6 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
 #ifdef CONFIG_SMP
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
-	irq_enter();
 	trace_doorbell_entry(regs);
 
 	ppc_msgsync();
@@ -35,7 +34,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
 	smp_ipi_demux_relaxed(); /* already performed the barrier */
 
 	trace_doorbell_exit(regs);
-	irq_exit();
+
 	set_irq_regs(old_regs);
 #else /* CONFIG_SMP */
 	printk(KERN_WARNING "Received doorbell on non-smp system\n");
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 2055d204d08e..681abb7c0507 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -641,8 +641,6 @@ void __do_irq(struct pt_regs *regs)
 {
 	unsigned int irq;
 
-	irq_enter();
-
 	trace_irq_entry(regs);
 
 	/*
@@ -662,8 +660,6 @@ void __do_irq(struct pt_regs *regs)
 		generic_handle_irq(irq);
 
 	trace_irq_exit(regs);
-
-	irq_exit();
 }
 
 DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
index 46b2e5de4ef5..d864f07bab74 100644
--- a/arch/powerpc/kernel/tau_6xx.c
+++ b/arch/powerpc/kernel/tau_6xx.c
@@ -104,12 +104,9 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(TAUException)
 {
 	int cpu = smp_processor_id();
 
-	irq_enter();
 	tau[cpu].interrupts++;
 
 	TAUupdate(cpu);
-
-	irq_exit();
 }
 #endif /* CONFIG_TAU_INT */
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 435a251247ed..2177defb7884 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -610,7 +610,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 #endif
 
 	old_regs = set_irq_regs(regs);
-	irq_enter();
+
 	trace_timer_interrupt_entry(regs);
 
 	if (test_irq_work_pending()) {
@@ -635,7 +635,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 	}
 
 	trace_timer_interrupt_exit(regs);
-	irq_exit();
+
 	set_irq_regs(old_regs);
 }
 EXPORT_SYMBOL(timer_interrupt);
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b2c53883580b..b4f23e871a68 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1051,7 +1051,6 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
 	struct pt_regs *old_regs;
 
 	old_regs = set_irq_regs(regs);
-	irq_enter();
 
 #ifdef CONFIG_VSX
 	/* Real mode flagged P9 special emu is needed */
@@ -1071,7 +1070,6 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
 	if (ppc_md.handle_hmi_exception)
 		ppc_md.handle_hmi_exception(regs);
 
-	irq_exit();
 	set_irq_regs(old_regs);
 }
 
@@ -1889,13 +1887,9 @@ DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
 
 DEFINE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async)
 {
-	irq_enter();
-
 	__this_cpu_inc(irq_stat.pmu_irqs);
 
 	perf_irq(regs);
-
-	irq_exit();
 }
 
 DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 14/21] powerpc/64s: move context tracking exit to interrupt exit path
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (12 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 13/21] powerpc: handle irq_enter/irq_exit in interrupt handler wrappers Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C Nicholas Piggin
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The interrupt handler wrapper functions are not the ideal place to
maintain context tracking because after they return, the low-level exit
code must still determine whether there are interrupts to replay, or
whether the task should be preempted, etc. Those paths (e.g.,
schedule_user) include their own exception_enter/exit pairs to fix this
up, but that is a bit hacky (see the schedule_user() comments).

Ideally context tracking will go to user mode only when there are no
more interrupts or context switches or other exit processing work to
handle.

64e cannot do this because it does not use the C interrupt exit code.

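Concretely, the exit-to-user path now enters user context only once all
exit work is done, backing out and retrying if that fails (simplified
from the interrupt_exit_user_prepare() hunk below):

    again:
            /* ... handle signals, preemption, irq replay ... */
            user_enter_irqoff();
            if (unlikely(!prep_irq_for_enabled_exit(true))) {
                    user_exit_irqoff();     /* back out, retry */
                    local_irq_enable();
                    local_irq_disable();
                    goto again;
            }
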
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 34 +++++++++++++++++++++++++---
 arch/powerpc/kernel/syscall_64.c     |  9 ++++++++
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index bee393c72fe5..34d7cca2cb2e 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -7,16 +7,30 @@
 #include <asm/ftrace.h>
 
 struct interrupt_state {
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3E_64
 	enum ctx_state ctx_state;
 #endif
 };
 
 static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3E_64
 	state->ctx_state = exception_enter();
 #endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (user_mode(regs)) {
+		CT_WARN_ON(ct_state() != CONTEXT_USER);
+		user_exit_irqoff();
+	} else {
+		/*
+		 * CT_WARN_ON comes here via program_check_exception,
+		 * so avoid recursion.
+		 */
+		if (TRAP(regs) != 0x700)
+			CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+	}
+#endif
 }
 
 /*
@@ -35,9 +49,23 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
  */
 static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3E_64
 	exception_exit(state->ctx_state);
 #endif
+
+	/*
+	 * Book3S exits to user via interrupt_exit_user_prepare(), which does
+	 * context tracking, which is a cleaner way to handle PREEMPT=y
+	 * and avoid context entry/exit in e.g., preempt_schedule_irq()),
+	 * which is likely to be where the core code wants to end up.
+	 *
+	 * The above comment explains why we can't do the
+	 *
+	 *     if (user_mode(regs))
+	 *         user_exit_irqoff();
+	 *
+	 * sequence here.
+	 */
 }
 
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index d7d256a7a41f..42f0ad4b2fbb 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -305,6 +305,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
 	BUG_ON(!(regs->msr & MSR_PR));
 	BUG_ON(!FULL_REGS(regs));
 	BUG_ON(regs->softe != IRQS_ENABLED);
+	CT_WARN_ON(ct_state() == CONTEXT_USER);
 
 	/*
 	 * We don't need to restore AMR on the way back to userspace for KUAP.
@@ -347,7 +348,9 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
 		}
 	}
 
+	user_enter_irqoff();
 	if (unlikely(!prep_irq_for_enabled_exit(true))) {
+		user_exit_irqoff();
 		local_irq_enable();
 		local_irq_disable();
 		goto again;
@@ -392,6 +395,12 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
 		unrecoverable_exception(regs);
 	BUG_ON(regs->msr & MSR_PR);
 	BUG_ON(!FULL_REGS(regs));
+	/*
+	 * CT_WARN_ON comes here via program_check_exception,
+	 * so avoid recursion.
+	 */
+	if (TRAP(regs) != 0x700)
+		CT_WARN_ON(ct_state() == CONTEXT_USER);
 
 	amr = kuap_get_and_check_amr();
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (13 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 14/21] powerpc/64s: move context tracking exit to interrupt exit path Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 14:54   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function Nicholas Piggin
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

There is no need for this to be in asm; use the new interrupt entry wrapper.

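The RECONCILE_IRQ_STATE(r10, r11) asm being removed soft-disables
interrupts and marks them hard-disabled; its C equivalent is the
sequence this patch adds to interrupt_enter_prepare():

    if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
            trace_hardirqs_off();
    local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
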
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 15 +++++++++++----
 arch/powerpc/kernel/exceptions-64s.S | 26 --------------------------
 2 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 34d7cca2cb2e..6eba7c489753 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -14,11 +14,14 @@ struct interrupt_state {
 
 static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
-#ifdef CONFIG_PPC_BOOK3E_64
-	state->ctx_state = exception_enter();
-#endif
-
+	/*
+	 * Book3E reconciles irq soft mask in asm
+	 */
 #ifdef CONFIG_PPC_BOOK3S_64
+	if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
+		trace_hardirqs_off();
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
 	if (user_mode(regs)) {
 		CT_WARN_ON(ct_state() != CONTEXT_USER);
 		user_exit_irqoff();
@@ -31,6 +34,10 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 			CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
 	}
 #endif
+
+#ifdef CONFIG_PPC_BOOK3E_64
+	state->ctx_state = exception_enter();
+#endif
 }
 
 /*
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 8b0db807974c..df4ee073386b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -139,7 +139,6 @@ name:
 #define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
 #define ISTACK		.L_ISTACK_\name\()	/* Set regular kernel stack */
 #define __ISTACK(name)	.L_ISTACK_ ## name
-#define IRECONCILE	.L_IRECONCILE_\name\()	/* Do RECONCILE_IRQ_STATE */
 #define IKUAP		.L_IKUAP_\name\()	/* Do KUAP lock */
 
 #define INT_DEFINE_BEGIN(n)						\
@@ -203,9 +202,6 @@ do_define_int n
 	.ifndef ISTACK
 		ISTACK=1
 	.endif
-	.ifndef IRECONCILE
-		IRECONCILE=1
-	.endif
 	.ifndef IKUAP
 		IKUAP=1
 	.endif
@@ -653,10 +649,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	.if ISTACK
 	ACCOUNT_STOLEN_TIME
 	.endif
-
-	.if IRECONCILE
-	RECONCILE_IRQ_STATE(r10, r11)
-	.endif
 .endm
 
 /*
@@ -935,7 +927,6 @@ INT_DEFINE_BEGIN(system_reset)
 	 */
 	ISET_RI=0
 	ISTACK=0
-	IRECONCILE=0
 	IKVM_REAL=1
 INT_DEFINE_END(system_reset)
 
@@ -1123,7 +1114,6 @@ INT_DEFINE_BEGIN(machine_check_early)
 	ISTACK=0
 	IDAR=1
 	IDSISR=1
-	IRECONCILE=0
 	IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
 INT_DEFINE_END(machine_check_early)
 
@@ -1476,7 +1466,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 INT_DEFINE_BEGIN(data_access_slb)
 	IVEC=0x380
 	IAREA=PACA_EXSLB
-	IRECONCILE=0
 	IDAR=1
 	IKVM_SKIP=1
 	IKVM_REAL=1
@@ -1503,7 +1492,6 @@ MMU_FTR_SECTION_ELSE
 	li	r3,-EFAULT
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
@@ -1565,7 +1553,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 INT_DEFINE_BEGIN(instruction_access_slb)
 	IVEC=0x480
 	IAREA=PACA_EXSLB
-	IRECONCILE=0
 	IISIDE=1
 	IDAR=1
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@@ -1594,7 +1581,6 @@ MMU_FTR_SECTION_ELSE
 	li	r3,-EFAULT
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
@@ -1754,7 +1740,6 @@ EXC_COMMON_BEGIN(program_check_common)
  */
 INT_DEFINE_BEGIN(fp_unavailable)
 	IVEC=0x800
-	IRECONCILE=0
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
@@ -1769,7 +1754,6 @@ EXC_VIRT_END(fp_unavailable, 0x4800, 0x100)
 EXC_COMMON_BEGIN(fp_unavailable_common)
 	GEN_COMMON fp_unavailable
 	bne	1f			/* if from user, just load it up */
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	kernel_fp_unavailable_exception
 0:	trap
@@ -1788,7 +1772,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
 	b	fast_interrupt_return
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	fp_unavailable_tm
 	b	interrupt_return
@@ -1853,7 +1836,6 @@ INT_DEFINE_BEGIN(hdecrementer)
 	IVEC=0x980
 	IHSRR=1
 	ISTACK=0
-	IRECONCILE=0
 	IKVM_REAL=1
 	IKVM_VIRT=1
 INT_DEFINE_END(hdecrementer)
@@ -2227,7 +2209,6 @@ INT_DEFINE_BEGIN(hmi_exception_early)
 	IHSRR=1
 	IREALMODE_COMMON=1
 	ISTACK=0
-	IRECONCILE=0
 	IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
 	IKVM_REAL=1
 INT_DEFINE_END(hmi_exception_early)
@@ -2401,7 +2382,6 @@ EXC_COMMON_BEGIN(performance_monitor_common)
  */
 INT_DEFINE_BEGIN(altivec_unavailable)
 	IVEC=0xf20
-	IRECONCILE=0
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
@@ -2431,7 +2411,6 @@ BEGIN_FTR_SECTION
 	b	fast_interrupt_return
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	altivec_unavailable_tm
 	b	interrupt_return
@@ -2439,7 +2418,6 @@ BEGIN_FTR_SECTION
 1:
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	altivec_unavailable_exception
 	b	interrupt_return
@@ -2455,7 +2433,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
  */
 INT_DEFINE_BEGIN(vsx_unavailable)
 	IVEC=0xf40
-	IRECONCILE=0
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
 #endif
@@ -2484,7 +2461,6 @@ BEGIN_FTR_SECTION
 	b	load_up_vsx
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 2:	/* User process was in a transaction */
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	vsx_unavailable_tm
 	b	interrupt_return
@@ -2492,7 +2468,6 @@ BEGIN_FTR_SECTION
 1:
 END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 #endif
-	RECONCILE_IRQ_STATE(r10, r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	vsx_unavailable_exception
 	b	interrupt_return
@@ -2827,7 +2802,6 @@ EXC_VIRT_NONE(0x5800, 0x100)
 INT_DEFINE_BEGIN(soft_nmi)
 	IVEC=0x900
 	ISTACK=0
-	IRECONCILE=0	/* Soft-NMI may fire under local_irq_disable */
 INT_DEFINE_END(soft_nmi)
 
 /*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (14 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 14:59   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C Nicholas Piggin
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This will be used by interrupt entry as well.

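The intended interrupt-entry caller (added by the next patch) looks
roughly like this:

    static inline void interrupt_enter_prepare(struct pt_regs *regs,
                                               struct interrupt_state *state)
    {
            if (user_mode(regs)) {
                    CT_WARN_ON(ct_state() != CONTEXT_USER);
                    user_exit_irqoff();

                    account_cpu_user_entry();
                    account_stolen_time();
            }
            /* ... kernel-mode checks elided ... */
    }
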
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/cputime.h | 15 +++++++++++++++
 arch/powerpc/kernel/syscall_64.c   | 10 +---------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h
index ed75d1c318e3..3f61604e1fcf 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -87,6 +87,18 @@ static notrace inline void account_cpu_user_exit(void)
 	acct->starttime_user = tb;
 }
 
+static notrace inline void account_stolen_time(void)
+{
+#ifdef CONFIG_PPC_SPLPAR
+	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
+	    firmware_has_feature(FW_FEATURE_SPLPAR)) {
+		struct lppaca *lp = local_paca->lppaca_ptr;
+
+		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
+			accumulate_stolen_time();
+	}
+#endif
+}
 
 #endif /* __KERNEL__ */
 #else /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
@@ -96,5 +108,8 @@ static inline void account_cpu_user_entry(void)
 static inline void account_cpu_user_exit(void)
 {
 }
+static notrace inline void account_stolen_time(void)
+{
+}
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 #endif /* __POWERPC_CPUTIME_H */
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 42f0ad4b2fbb..32f72965da26 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -69,15 +69,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
 
 	account_cpu_user_entry();
 
-#ifdef CONFIG_PPC_SPLPAR
-	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
-	    firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct lppaca *lp = local_paca->lppaca_ptr;
-
-		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
-			accumulate_stolen_time();
-	}
-#endif
+	account_stolen_time();
 
 	/*
 	 * This is not required for the syscall exit path, but makes the
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (15 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 15:05   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper Nicholas Piggin
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  7 +++++++
 arch/powerpc/include/asm/ppc_asm.h   | 24 ------------------------
 arch/powerpc/kernel/exceptions-64e.S |  1 -
 arch/powerpc/kernel/exceptions-64s.S |  5 -----
 4 files changed, 7 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 6eba7c489753..e278dffe7657 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -4,6 +4,7 @@
 
 #include <linux/context_tracking.h>
 #include <linux/hardirq.h>
+#include <asm/cputime.h>
 #include <asm/ftrace.h>
 
 struct interrupt_state {
@@ -25,6 +26,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 	if (user_mode(regs)) {
 		CT_WARN_ON(ct_state() != CONTEXT_USER);
 		user_exit_irqoff();
+
+		account_cpu_user_entry();
+		account_stolen_time();
 	} else {
 		/*
 		 * CT_WARN_ON comes here via program_check_exception,
@@ -38,6 +42,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
 #ifdef CONFIG_PPC_BOOK3E_64
 	state->ctx_state = exception_enter();
 #endif
+
+	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) && user_mode(regs))
+		account_cpu_user_entry();
 }
 
 /*
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index cc1bca571332..3dceb64fc9af 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -25,7 +25,6 @@
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)
 #define ACCOUNT_CPU_USER_EXIT(ptr, ra, rb)
-#define ACCOUNT_STOLEN_TIME
 #else
 #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)				\
 	MFTB(ra);			/* get timebase */		\
@@ -44,29 +43,6 @@
 	PPC_LL	ra, ACCOUNT_SYSTEM_TIME(ptr);				\
 	add	ra,ra,rb;		/* add on to system time */	\
 	PPC_STL	ra, ACCOUNT_SYSTEM_TIME(ptr)
-
-#ifdef CONFIG_PPC_SPLPAR
-#define ACCOUNT_STOLEN_TIME						\
-BEGIN_FW_FTR_SECTION;							\
-	beq	33f;							\
-	/* from user - see if there are any DTL entries to process */	\
-	ld	r10,PACALPPACAPTR(r13);	/* get ptr to VPA */		\
-	ld	r11,PACA_DTL_RIDX(r13);	/* get log read index */	\
-	addi	r10,r10,LPPACA_DTLIDX;					\
-	LDX_BE	r10,0,r10;		/* get log write index */	\
-	cmpd	cr1,r11,r10;						\
-	beq+	cr1,33f;						\
-	bl	accumulate_stolen_time;				\
-	ld	r12,_MSR(r1);						\
-	andi.	r10,r12,MSR_PR;		/* Restore cr0 (coming from user) */ \
-33:									\
-END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
-
-#else  /* CONFIG_PPC_SPLPAR */
-#define ACCOUNT_STOLEN_TIME
-
-#endif /* CONFIG_PPC_SPLPAR */
-
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 /*
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 52421042a020..87b3e74ded41 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -398,7 +398,6 @@ exc_##n##_common:							    \
 	std	r10,_NIP(r1);		/* save SRR0 to stackframe */	    \
 	std	r11,_MSR(r1);		/* save SRR1 to stackframe */	    \
 	beq	2f;			/* if from kernel mode */	    \
-	ACCOUNT_CPU_USER_ENTRY(r13,r10,r11);/* accounting (uses cr0+eq) */  \
 2:	ld	r3,excf+EX_R10(r13);	/* get back r10 */		    \
 	ld	r4,excf+EX_R11(r13);	/* get back r11 */		    \
 	mfspr	r5,scratch;		/* get back r13 */		    \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index df4ee073386b..68505e35bcf7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -577,7 +577,6 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 	kuap_save_amr_and_lock r9, r10, cr1, cr0
 	.endif
 	beq	101f			/* if from kernel mode		*/
-	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10)
 BEGIN_FTR_SECTION
 	ld	r9,IAREA+EX_PPR(r13)	/* Read PPR from paca		*/
 	std	r9,_PPR(r1)
@@ -645,10 +644,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r11,exception_marker@toc(r2)
 	std	r10,RESULT(r1)		/* clear regs->result		*/
 	std	r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame	*/
-
-	.if ISTACK
-	ACCOUNT_STOLEN_TIME
-	.endif
 .endm
 
 /*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (16 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13 15:13   ` Christophe Leroy
  2021-01-13  7:32 ` [PATCH v5 19/21] powerpc/64s: move NMI soft-mask handling to C Nicholas Piggin
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This moves the common NMI entry and exit code into the interrupt handler
wrappers.

This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
also MCE interrupts on 64e, by adding missing parts of the NMI entry to
them.

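For reference, DEFINE_INTERRUPT_HANDLER_NMI expands, for a handler
func, roughly to the following (a sketch only; the real macro differs
in detail):

    long func(struct pt_regs *regs)
    {
            struct interrupt_nmi_state state;
            long ret;

            interrupt_nmi_enter_prepare(regs, &state);
            ret = ____func(regs);           /* the handler body */
            interrupt_nmi_exit_prepare(regs, &state);

            return ret;
    }

so soft-NMI, HMI, and 64e MCE pick up the full NMI entry/exit without
touching each handler individually.
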
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 24 ++++++++++++++++
 arch/powerpc/kernel/mce.c            | 11 --------
 arch/powerpc/kernel/traps.c          | 42 +++++-----------------------
 arch/powerpc/kernel/watchdog.c       | 10 +++----
 4 files changed, 35 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index e278dffe7657..01192e213f9a 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -95,14 +95,38 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct int
 }
 
 struct interrupt_nmi_state {
+#ifdef CONFIG_PPC64
+	u8 ftrace_enabled;
+#endif
 };
 
 static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
 {
+#ifdef CONFIG_PPC64
+	state->ftrace_enabled = this_cpu_get_ftrace_enabled();
+	this_cpu_set_ftrace_enabled(0);
+#endif
+
+	/*
+	 * Do not use nmi_enter() for pseries hash guest taking a real-mode
+	 * NMI because not everything it touches is within the RMA limit.
+	 */
+	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
+			!firmware_has_feature(FW_FEATURE_LPAR) ||
+			radix_enabled() || (mfmsr() & MSR_DR))
+		nmi_enter();
 }
 
 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
 {
+	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
+			!firmware_has_feature(FW_FEATURE_LPAR) ||
+			radix_enabled() || (mfmsr() & MSR_DR))
+		nmi_exit();
+
+#ifdef CONFIG_PPC64
+	this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+#endif
 }
 
 /**
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 54269947113d..51456217ec40 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -592,12 +592,6 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
 DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
 {
 	long handled = 0;
-	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
-
-	this_cpu_set_ftrace_enabled(0);
-	/* Do not use nmi_enter/exit for pseries hpte guest */
-	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
-		nmi_enter();
 
 	hv_nmi_check_nonrecoverable(regs);
 
@@ -607,11 +601,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
 	if (ppc_md.machine_check_early)
 		handled = ppc_md.machine_check_early(regs);
 
-	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
-		nmi_exit();
-
-	this_cpu_set_ftrace_enabled(ftrace_enabled);
-
 	return handled;
 }
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b4f23e871a68..43d23232ef5c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -435,11 +435,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 {
 	unsigned long hsrr0, hsrr1;
 	bool saved_hsrrs = false;
-	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
-
-	this_cpu_set_ftrace_enabled(0);
-
-	nmi_enter();
 
 	/*
 	 * System reset can interrupt code where HSRRs are live and MSR[RI]=1.
@@ -511,10 +506,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 		mtspr(SPRN_HSRR1, hsrr1);
 	}
 
-	nmi_exit();
-
-	this_cpu_set_ftrace_enabled(ftrace_enabled);
-
 	/* What should we do here? We could issue a shutdown or hard reset. */
 
 	return 0;
@@ -792,6 +783,12 @@ int machine_check_generic(struct pt_regs *regs)
 #endif /* everything else */
 
 
+/*
+ * BOOK3S_64 does not call this handler as a non-maskable interrupt
+ * (it uses its own early real-mode handler to handle the MCE proper
+ * and then raises irq_work to call this handler when interrupts are
+ * enabled).
+ */
 #ifdef CONFIG_PPC_BOOK3S_64
 DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
 #else
@@ -800,20 +797,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
 {
 	int recover = 0;
 
-	/*
-	 * BOOK3S_64 does not call this handler as a non-maskable interrupt
-	 * (it uses its own early real-mode handler to handle the MCE proper
-	 * and then raises irq_work to call this handler when interrupts are
-	 * enabled).
-	 *
-	 * This is silly. The BOOK3S_64 should just call a different function
-	 * rather than expecting semantics to magically change. Something
-	 * like 'non_nmi_machine_check_exception()', perhaps?
-	 */
-	const bool nmi = !IS_ENABLED(CONFIG_PPC_BOOK3S_64);
-
-	if (nmi) nmi_enter();
-
 	__this_cpu_inc(irq_stat.mce_exceptions);
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
@@ -838,24 +821,17 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
 	if (check_io_access(regs))
 		goto bail;
 
-	if (nmi) nmi_exit();
-
 	die("Machine check", regs, SIGBUS);
 
 	/* Must die if the interrupt is not recoverable */
 	if (!(regs->msr & MSR_RI))
 		die("Unrecoverable Machine check", regs, SIGBUS);
 
-#ifdef CONFIG_PPC_BOOK3S_64
 bail:
+#ifdef CONFIG_PPC_BOOK3S_64
 	return;
 #else
 	return 0;
-
-bail:
-	if (nmi) nmi_exit();
-
-	return 0;
 #endif
 }
 NOKPROBE_SYMBOL(machine_check_exception);
@@ -1873,14 +1849,10 @@ DEFINE_INTERRUPT_HANDLER(vsx_unavailable_tm)
 #ifdef CONFIG_PPC64
 DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
 {
-	nmi_enter();
-
 	__this_cpu_inc(irq_stat.pmu_irqs);
 
 	perf_irq(regs);
 
-	nmi_exit();
-
 	return 0;
 }
 #endif
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 824b9376ac35..dc39534836a3 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -254,11 +254,12 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 	int cpu = raw_smp_processor_id();
 	u64 tb;
 
+	/* should only arrive from kernel, with irqs disabled */
+	WARN_ON_ONCE(!arch_irq_disabled_regs(regs));
+
 	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
 		return 0;
 
-	nmi_enter();
-
 	__this_cpu_inc(irq_stat.soft_nmi_irqs);
 
 	tb = get_tb();
@@ -266,7 +267,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 		wd_smp_lock(&flags);
 		if (cpumask_test_cpu(cpu, &wd_smp_cpus_stuck)) {
 			wd_smp_unlock(&flags);
-			goto out;
+			return 0;
 		}
 		set_cpu_stuck(cpu, tb);
 
@@ -290,9 +291,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 	if (wd_panic_timeout_tb < 0x7fffffff)
 		mtspr(SPRN_DEC, wd_panic_timeout_tb);
 
-out:
-	nmi_exit();
-
 	return 0;
 }
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 19/21] powerpc/64s: move NMI soft-mask handling to C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (17 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 20/21] powerpc/64s: runlatch interrupt handling in C Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 21/21] powerpc/64s: power4 nap fixup " Nicholas Piggin
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Saving and restoring soft-mask state can now be done in C using the
interrupt handler wrapper functions.

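The asm saved PACAIRQHAPPENED into the RESULT(r1) stack slot and
restored the soft mask from SOFTE(r1); in C the same state simply lives
in the wrapper's on-stack state, as added below:

    struct interrupt_nmi_state {
    #ifdef CONFIG_PPC64
    #ifdef CONFIG_PPC_BOOK3S_64
            u8 irq_soft_mask;
            u8 irq_happened;
    #endif
            u8 ftrace_enabled;
    #endif
    };
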
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 25 +++++++++++++++++++++++++
 arch/powerpc/kernel/exceptions-64s.S | 60 ----------------------------
 2 files changed, 25 insertions(+), 60 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 01192e213f9a..db89ecfef762 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -96,6 +96,10 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct int
 
 struct interrupt_nmi_state {
 #ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
+	u8 irq_soft_mask;
+	u8 irq_happened;
+#endif
 	u8 ftrace_enabled;
 #endif
 };
@@ -103,6 +107,20 @@ struct interrupt_nmi_state {
 static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
 {
 #ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
+	state->irq_soft_mask = local_paca->irq_soft_mask;
+	state->irq_happened = local_paca->irq_happened;
+
+	/*
+	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
+	 * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
+	 * because that goes through irq tracing which we don't want in NMI.
+	 */
+	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
+	/* Don't do any per-CPU operations until interrupt state is fixed */
+#endif
 	state->ftrace_enabled = this_cpu_get_ftrace_enabled();
 	this_cpu_set_ftrace_enabled(0);
 #endif
@@ -126,6 +144,13 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
 
 #ifdef CONFIG_PPC64
 	this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* Check we didn't change the pending interrupt mask. */
+	WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
+	local_paca->irq_happened = state->irq_happened;
+	local_paca->irq_soft_mask = state->irq_soft_mask;
+#endif
 #endif
 }
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 68505e35bcf7..ceea3f3c5619 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1008,20 +1008,6 @@ EXC_COMMON_BEGIN(system_reset_common)
 	ld	r1,PACA_NMI_EMERG_SP(r13)
 	subi	r1,r1,INT_FRAME_SIZE
 	__GEN_COMMON_BODY system_reset
-	/*
-	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
-	 * the right thing. We do not want to reconcile because that goes
-	 * through irq tracing which we don't want in NMI.
-	 *
-	 * Save PACAIRQHAPPENED to RESULT (otherwise unused), and set HARD_DIS
-	 * as we are running with MSR[EE]=0.
-	 */
-	li	r10,IRQS_ALL_DISABLED
-	stb	r10,PACAIRQSOFTMASK(r13)
-	lbz	r10,PACAIRQHAPPENED(r13)
-	std	r10,RESULT(r1)
-	ori	r10,r10,PACA_IRQ_HARD_DIS
-	stb	r10,PACAIRQHAPPENED(r13)
 
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	system_reset_exception
@@ -1037,14 +1023,6 @@ EXC_COMMON_BEGIN(system_reset_common)
 	subi	r10,r10,1
 	sth	r10,PACA_IN_NMI(r13)
 
-	/*
-	 * Restore soft mask settings.
-	 */
-	ld	r10,RESULT(r1)
-	stb	r10,PACAIRQHAPPENED(r13)
-	ld	r10,SOFTE(r1)
-	stb	r10,PACAIRQSOFTMASK(r13)
-
 	kuap_kernel_restore r9, r10
 	EXCEPTION_RESTORE_REGS
 	RFI_TO_USER_OR_KERNEL
@@ -1190,30 +1168,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 	li	r10,MSR_RI
 	mtmsrd	r10,1
 
-	/*
-	 * Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
-	 * system_reset_common)
-	 */
-	li	r10,IRQS_ALL_DISABLED
-	stb	r10,PACAIRQSOFTMASK(r13)
-	lbz	r10,PACAIRQHAPPENED(r13)
-	std	r10,RESULT(r1)
-	ori	r10,r10,PACA_IRQ_HARD_DIS
-	stb	r10,PACAIRQHAPPENED(r13)
-
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_early
 	std	r3,RESULT(r1)	/* Save result */
 	ld	r12,_MSR(r1)
 
-	/*
-	 * Restore soft mask settings.
-	 */
-	ld	r10,RESULT(r1)
-	stb	r10,PACAIRQHAPPENED(r13)
-	ld	r10,SOFTE(r1)
-	stb	r10,PACAIRQSOFTMASK(r13)
-
 #ifdef CONFIG_PPC_P7_NAP
 	/*
 	 * Check if thread was in power saving mode. We come here when any
@@ -2815,17 +2774,6 @@ EXC_COMMON_BEGIN(soft_nmi_common)
 	subi	r1,r1,INT_FRAME_SIZE
 	__GEN_COMMON_BODY soft_nmi
 
-	/*
-	 * Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
-	 * system_reset_common)
-	 */
-	li	r10,IRQS_ALL_DISABLED
-	stb	r10,PACAIRQSOFTMASK(r13)
-	lbz	r10,PACAIRQHAPPENED(r13)
-	std	r10,RESULT(r1)
-	ori	r10,r10,PACA_IRQ_HARD_DIS
-	stb	r10,PACAIRQHAPPENED(r13)
-
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	soft_nmi_interrupt
 
@@ -2833,14 +2781,6 @@ EXC_COMMON_BEGIN(soft_nmi_common)
 	li	r9,0
 	mtmsrd	r9,1
 
-	/*
-	 * Restore soft mask settings.
-	 */
-	ld	r10,RESULT(r1)
-	stb	r10,PACAIRQHAPPENED(r13)
-	ld	r10,SOFTE(r1)
-	stb	r10,PACAIRQSOFTMASK(r13)
-
 	kuap_kernel_restore r9, r10
 	EXCEPTION_RESTORE_REGS hsrr=0
 	RFI_TO_KERNEL
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 20/21] powerpc/64s: runlatch interrupt handling in C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (18 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 19/21] powerpc/64s: move NMI soft-mask handling to C Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  2021-01-13  7:32 ` [PATCH v5 21/21] powerpc/64s: power4 nap fixup " Nicholas Piggin
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

There is no need for this to be in asm; use the new interrupt entry wrapper.

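The removed RUNLATCH_ON asm turned the runlatch (a hint bit in the CTRL
SPR) back on when an interrupt arrived with it off; the C version added
to interrupt_async_enter_prepare() below is the direct equivalent:

    if (cpu_has_feature(CPU_FTR_CTRL) &&
        !test_thread_local_flags(_TLF_RUNLATCH))
            __ppc64_runlatch_on();
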
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h |  7 +++++++
 arch/powerpc/kernel/exceptions-64s.S | 18 ------------------
 2 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index db89ecfef762..9c16e9a48df6 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -6,6 +6,7 @@
 #include <linux/hardirq.h>
 #include <asm/cputime.h>
 #include <asm/ftrace.h>
+#include <asm/runlatch.h>
 
 struct interrupt_state {
 #ifdef CONFIG_PPC_BOOK3E_64
@@ -84,6 +85,12 @@ static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt
 
 static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
 {
+#ifdef CONFIG_PPC_BOOK3S_64
+	if (cpu_has_feature(CPU_FTR_CTRL) &&
+	    !test_thread_local_flags(_TLF_RUNLATCH))
+		__ppc64_runlatch_on();
+#endif
+
 	interrupt_enter_prepare(regs, state);
 	irq_enter();
 }
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index ceea3f3c5619..8966aa3419d5 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -692,14 +692,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r1,GPR1(r1)
 .endm
 
-#define RUNLATCH_ON				\
-BEGIN_FTR_SECTION				\
-	ld	r3, PACA_THREAD_INFO(r13);	\
-	ld	r4,TI_LOCAL_FLAGS(r3);		\
-	andi.	r0,r4,_TLF_RUNLATCH;		\
-	beql	ppc64_runlatch_on_trampoline;	\
-END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
-
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
  * it has to do so in a loop, and relies on the external interrupt
@@ -1582,7 +1574,6 @@ EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100)
 EXC_COMMON_BEGIN(hardware_interrupt_common)
 	GEN_COMMON hardware_interrupt
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
 	b	interrupt_return
@@ -1768,7 +1759,6 @@ EXC_VIRT_END(decrementer, 0x4900, 0x80)
 EXC_COMMON_BEGIN(decrementer_common)
 	GEN_COMMON decrementer
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	timer_interrupt
 	b	interrupt_return
@@ -1854,7 +1844,6 @@ EXC_VIRT_END(doorbell_super, 0x4a00, 0x100)
 EXC_COMMON_BEGIN(doorbell_super_common)
 	GEN_COMMON doorbell_super
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
@@ -2209,7 +2198,6 @@ EXC_COMMON_BEGIN(hmi_exception_early_common)
 EXC_COMMON_BEGIN(hmi_exception_common)
 	GEN_COMMON hmi_exception
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	handle_hmi_exception
 	b	interrupt_return
@@ -2239,7 +2227,6 @@ EXC_VIRT_END(h_doorbell, 0x4e80, 0x20)
 EXC_COMMON_BEGIN(h_doorbell_common)
 	GEN_COMMON h_doorbell
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
@@ -2273,7 +2260,6 @@ EXC_VIRT_END(h_virt_irq, 0x4ea0, 0x20)
 EXC_COMMON_BEGIN(h_virt_irq_common)
 	GEN_COMMON h_virt_irq
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
 	b	interrupt_return
@@ -2320,7 +2306,6 @@ EXC_VIRT_END(performance_monitor, 0x4f00, 0x20)
 EXC_COMMON_BEGIN(performance_monitor_common)
 	GEN_COMMON performance_monitor
 	FINISH_NAP
-	RUNLATCH_ON
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	performance_monitor_exception
 	b	interrupt_return
@@ -3054,9 +3039,6 @@ kvmppc_skip_Hinterrupt:
 	 * come here.
 	 */
 
-EXC_COMMON_BEGIN(ppc64_runlatch_on_trampoline)
-	b	__ppc64_runlatch_on
-
 USE_FIXED_SECTION(virt_trampolines)
 	/*
 	 * All code below __end_interrupts is treated as soft-masked. If
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 21/21] powerpc/64s: power4 nap fixup in C
  2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
                   ` (19 preceding siblings ...)
  2021-01-13  7:32 ` [PATCH v5 20/21] powerpc/64s: runlatch interrupt handling in C Nicholas Piggin
@ 2021-01-13  7:32 ` Nicholas Piggin
  20 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-13  7:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

There is no need for this to be in asm; use the new interrupt entry wrapper.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h   | 15 +++++++++
 arch/powerpc/include/asm/processor.h   |  1 +
 arch/powerpc/include/asm/thread_info.h |  6 ++++
 arch/powerpc/kernel/exceptions-64s.S   | 45 --------------------------
 arch/powerpc/kernel/idle_book3s.S      |  4 +++
 5 files changed, 26 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 9c16e9a48df6..4e290680f461 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -8,6 +8,16 @@
 #include <asm/ftrace.h>
 #include <asm/runlatch.h>
 
+static inline void nap_adjust_return(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_970_NAP
+	if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
+		clear_thread_local_flags(_TLF_NAPPING);
+		regs->nip = (unsigned long)power4_idle_nap_return;
+	}
+#endif
+}
+
 struct interrupt_state {
 #ifdef CONFIG_PPC_BOOK3E_64
 	enum ctx_state ctx_state;
@@ -99,6 +109,9 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct int
 {
 	irq_exit();
 	interrupt_exit_prepare(regs, state);
+
+	/* Adjust at exit so the main handler sees the true NIA */
+	nap_adjust_return(regs);
 }
 
 struct interrupt_nmi_state {
@@ -150,6 +163,8 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
 			radix_enabled() || (mfmsr() & MSR_DR))
 		nmi_exit();
 
+	nap_adjust_return(regs);
+
 #ifdef CONFIG_PPC64
 	this_cpu_set_ftrace_enabled(state->ftrace_enabled);
 
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 8acc3590c971..eedc3c775141 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -393,6 +393,7 @@ extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val);
 extern unsigned long isa206_idle_insn_mayloss(unsigned long type);
 #ifdef CONFIG_PPC_970_NAP
 extern void power4_idle_nap(void);
+void power4_idle_nap_return(void);
 #endif
 
 extern unsigned long cpuidle_disable;
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 386d576673a1..bf137151100b 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -152,6 +152,12 @@ void arch_setup_new_exec(void);
 
 #ifndef __ASSEMBLY__
 
+static inline void clear_thread_local_flags(unsigned int flags)
+{
+	struct thread_info *ti = current_thread_info();
+	ti->local_flags &= ~flags;
+}
+
 static inline bool test_thread_local_flags(unsigned int flags)
 {
 	struct thread_info *ti = current_thread_info();
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 8966aa3419d5..12ab2e9f920f 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -692,25 +692,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r1,GPR1(r1)
 .endm
 
-/*
- * When the idle code in power4_idle puts the CPU into NAP mode,
- * it has to do so in a loop, and relies on the external interrupt
- * and decrementer interrupt entry code to get it out of the loop.
- * It sets the _TLF_NAPPING bit in current_thread_info()->local_flags
- * to signal that it is in the loop and needs help to get out.
- */
-#ifdef CONFIG_PPC_970_NAP
-#define FINISH_NAP				\
-BEGIN_FTR_SECTION				\
-	ld	r11, PACA_THREAD_INFO(r13);	\
-	ld	r9,TI_LOCAL_FLAGS(r11);		\
-	andi.	r10,r9,_TLF_NAPPING;		\
-	bnel	power4_fixup_nap;		\
-END_FTR_SECTION_IFSET(CPU_FTR_CAN_NAP)
-#else
-#define FINISH_NAP
-#endif
-
 /*
  * There are a few constraints to be concerned with.
  * - Real mode exceptions code/data must be located at their physical location.
@@ -1248,7 +1229,6 @@ EXC_COMMON_BEGIN(machine_check_common)
 	 */
 	GEN_COMMON machine_check
 
-	FINISH_NAP
 	/* Enable MSR_RI when finished with PACA_EXMC */
 	li	r10,MSR_RI
 	mtmsrd 	r10,1
@@ -1573,7 +1553,6 @@ EXC_VIRT_BEGIN(hardware_interrupt, 0x4500, 0x100)
 EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100)
 EXC_COMMON_BEGIN(hardware_interrupt_common)
 	GEN_COMMON hardware_interrupt
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
 	b	interrupt_return
@@ -1758,7 +1737,6 @@ EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80)
 EXC_VIRT_END(decrementer, 0x4900, 0x80)
 EXC_COMMON_BEGIN(decrementer_common)
 	GEN_COMMON decrementer
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	timer_interrupt
 	b	interrupt_return
@@ -1843,7 +1821,6 @@ EXC_VIRT_BEGIN(doorbell_super, 0x4a00, 0x100)
 EXC_VIRT_END(doorbell_super, 0x4a00, 0x100)
 EXC_COMMON_BEGIN(doorbell_super_common)
 	GEN_COMMON doorbell_super
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
@@ -2197,7 +2174,6 @@ EXC_COMMON_BEGIN(hmi_exception_early_common)
 
 EXC_COMMON_BEGIN(hmi_exception_common)
 	GEN_COMMON hmi_exception
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	handle_hmi_exception
 	b	interrupt_return
@@ -2226,7 +2202,6 @@ EXC_VIRT_BEGIN(h_doorbell, 0x4e80, 0x20)
 EXC_VIRT_END(h_doorbell, 0x4e80, 0x20)
 EXC_COMMON_BEGIN(h_doorbell_common)
 	GEN_COMMON h_doorbell
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_DOORBELL
 	bl	doorbell_exception
@@ -2259,7 +2234,6 @@ EXC_VIRT_BEGIN(h_virt_irq, 0x4ea0, 0x20)
 EXC_VIRT_END(h_virt_irq, 0x4ea0, 0x20)
 EXC_COMMON_BEGIN(h_virt_irq_common)
 	GEN_COMMON h_virt_irq
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_IRQ
 	b	interrupt_return
@@ -2305,7 +2279,6 @@ EXC_VIRT_BEGIN(performance_monitor, 0x4f00, 0x20)
 EXC_VIRT_END(performance_monitor, 0x4f00, 0x20)
 EXC_COMMON_BEGIN(performance_monitor_common)
 	GEN_COMMON performance_monitor
-	FINISH_NAP
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	performance_monitor_exception
 	b	interrupt_return
@@ -3056,24 +3029,6 @@ USE_FIXED_SECTION(virt_trampolines)
 __end_interrupts:
 DEFINE_FIXED_SYMBOL(__end_interrupts)
 
-#ifdef CONFIG_PPC_970_NAP
-	/*
-	 * Called by exception entry code if _TLF_NAPPING was set, this clears
-	 * the NAPPING flag, and redirects the exception exit to
-	 * power4_fixup_nap_return.
-	 */
-	.globl power4_fixup_nap
-EXC_COMMON_BEGIN(power4_fixup_nap)
-	andc	r9,r9,r10
-	std	r9,TI_LOCAL_FLAGS(r11)
-	LOAD_REG_ADDR(r10, power4_idle_nap_return)
-	std	r10,_NIP(r1)
-	blr
-
-power4_idle_nap_return:
-	blr
-#endif
-
 CLOSE_FIXED_SECTION(real_vectors);
 CLOSE_FIXED_SECTION(real_trampolines);
 CLOSE_FIXED_SECTION(virt_vectors);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 22f249b6f58d..27d2e6a72ec9 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -201,4 +201,8 @@ _GLOBAL(power4_idle_nap)
 	mtmsrd	r7
 	isync
 	b	1b
+
+	.globl power4_idle_nap_return
+power4_idle_nap_return:
+	blr
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-13  7:31 ` [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C Nicholas Piggin
@ 2021-01-13 14:12   ` Christophe Leroy
  2021-01-14  3:24     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:12 UTC (permalink / raw)
  To: linuxppc-dev



On 13/01/2021 at 08:31, Nicholas Piggin wrote:
> The page fault handling still has some complex logic particularly around
> hash table handling, in asm. Implement this in C instead.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>   arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>   arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>   arch/powerpc/mm/fault.c                       |  46 ++++--
>   4 files changed, 107 insertions(+), 148 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 066b1d34c7bc..60a669379aa0 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>   #define HPTE_NOHPTE_UPDATE	0x2
>   #define HPTE_USE_KERNEL_KEY	0x4
>   
> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>   extern int __hash_page_4K(unsigned long ea, unsigned long access,
>   			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>   			  unsigned long flags, int ssize, int subpage_prot);
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 6e53f7638737..bcb5e81d2088 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>    *
>    * Handling:
>    * - Hash MMU
> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>    *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>    *   "non-bolted" regions, e.g., vmalloc space. However these should always be
> - *   backed by Linux page tables.
> + *   backed by Linux page table entries.
>    *
> - *   If none is found, do a Linux page fault. Linux page faults can happen in
> - *   kernel mode due to user copy operations of course.
> + *   If no entry is found the Linux page fault handler is invoked (by
> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
> + *   copy operations of course.
>    *
>    *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>    *   MMU context, which may cause a DSI in the host, which must go to the
> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>   	GEN_COMMON data_access
>   	ld	r4,_DAR(r1)
>   	ld	r5,_DSISR(r1)

We have DSISR here. I think the dispatch between page fault and do_break() should be done here:
- It would be more similar to other arches
- It would avoid doing it in the instruction fault path as well
- It would avoid that -1 return, which looks more like a hack (see the sketch below).
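
Roughly, in C terms (an illustrative sketch only, not a worked patch --
dispatch_data_storage() is a hypothetical name; the signatures are the
ones used at this point in the series):

static long dispatch_data_storage(struct pt_regs *regs,
				  unsigned long dar, unsigned long dsisr)
{
	/* Breakpoint match gets its own path, decided up front on DSISR */
	if (unlikely(dsisr & DSISR_DABRMATCH)) {
		do_break(regs, dar, dsisr);
		return 1;	/* tell the asm caller to restore NV GPRS */
	}

	/* Everything else takes the normal page fault path */
	return do_page_fault(regs, dar, dsisr);
}

That keeps the DABR check out of do_page_fault() itself, and the
instruction fault path never has to repeat it.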

> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
> -	ld	r6,_MSR(r1)
> -	li	r3,0x300
> -	b	do_hash_page		/* Try to handle as hpte fault */
> +	bl	do_hash_fault
>   MMU_FTR_SECTION_ELSE
> -	b	handle_page_fault
> +	bl	do_page_fault
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
> +        cmpdi	r3,0

IIUC, this is for do_break(). It would be better done in a separate path.

> +	beq+	interrupt_return
> +	/* We need to restore NVGPRS */
> +	REST_NVGPRS(r1)
> +	b       interrupt_return
>   
>   	GEN_KVM data_access
>   
> @@ -1540,13 +1545,17 @@ EXC_COMMON_BEGIN(instruction_access_common)
>   	GEN_COMMON instruction_access
>   	ld	r4,_DAR(r1)
>   	ld	r5,_DSISR(r1)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
> -	ld      r6,_MSR(r1)
> -	li	r3,0x400
> -	b	do_hash_page		/* Try to handle as hpte fault */
> +	bl	do_hash_fault
>   MMU_FTR_SECTION_ELSE
> -	b	handle_page_fault
> +	bl	do_page_fault
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
> +        cmpdi	r3,0

What is that for? If that's for do_break(), it's irrelevant in an ISI.

> +	beq+	interrupt_return
> +	/* We need to restore NVGPRS */
> +	REST_NVGPRS(r1)
> +	b       interrupt_return
>   
>   	GEN_KVM instruction_access
>   
> @@ -3221,99 +3230,3 @@ disable_machine_check:
>   	RFI_TO_KERNEL
>   1:	mtlr	r0
>   	blr
> -
> -/*
> - * Hash table stuff
> - */
> -	.balign	IFETCH_ALIGN_BYTES
> -do_hash_page:
> -#ifdef CONFIG_PPC_BOOK3S_64
> -	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
> -	ori	r0,r0,DSISR_BAD_FAULT_64S@l
> -	and.	r0,r5,r0		/* weird error? */
> -	bne-	handle_page_fault	/* if not, try to insert a HPTE */
> -
> -	/*
> -	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> -	 * don't call hash_page, just fail the fault. This is required to
> -	 * prevent re-entrancy problems in the hash code, namely perf
> -	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> -	 * hash fault. See the comment in hash_preload().
> -	 */
> -	ld	r11, PACA_THREAD_INFO(r13)
> -	lwz	r0,TI_PREEMPT(r11)
> -	andis.	r0,r0,NMI_MASK@h
> -	bne	77f
> -
> -	/*
> -	 * r3 contains the trap number
> -	 * r4 contains the faulting address
> -	 * r5 contains dsisr
> -	 * r6 msr
> -	 *
> -	 * at return r3 = 0 for success, 1 for page fault, negative for error
> -	 */
> -	bl	__hash_page		/* build HPTE if possible */
> -        cmpdi	r3,0			/* see if __hash_page succeeded */
> -
> -	/* Success */
> -	beq	interrupt_return	/* Return from exception on success */
> -
> -	/* Error */
> -	blt-	13f
> -
> -	/* Reload DAR/DSISR into r4/r5 for the DABR check below */
> -	ld	r4,_DAR(r1)
> -	ld      r5,_DSISR(r1)
> -#endif /* CONFIG_PPC_BOOK3S_64 */
> -
> -/* Here we have a page fault that hash_page can't handle. */
> -handle_page_fault:
> -11:	andis.  r0,r5,DSISR_DABRMATCH@h
> -	bne-    handle_dabr_fault
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	do_page_fault
> -	cmpdi	r3,0
> -	beq+	interrupt_return
> -	mr	r5,r3
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
> -	bl	__bad_page_fault
> -	b	interrupt_return
> -
> -/* We have a data breakpoint exception - handle it */
> -handle_dabr_fault:
> -	ld      r4,_DAR(r1)
> -	ld      r5,_DSISR(r1)
> -	addi    r3,r1,STACK_FRAME_OVERHEAD
> -	bl      do_break
> -	/*
> -	 * do_break() may have changed the NV GPRS while handling a breakpoint.
> -	 * If so, we need to restore them with their updated values.
> -	 */
> -	REST_NVGPRS(r1)
> -	b       interrupt_return
> -
> -
> -#ifdef CONFIG_PPC_BOOK3S_64
> -/* We have a page fault that hash_page could handle but HV refused
> - * the PTE insertion
> - */
> -13:	mr	r5,r3
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
> -	bl	low_hash_fault
> -	b	interrupt_return
> -#endif
> -
> -/*
> - * We come here as a result of a DSI at a point where we don't want
> - * to call hash_page, such as when we are accessing memory (possibly
> - * user memory) inside a PMU interrupt that occurred while interrupts
> - * were soft-disabled.  We want to invoke the exception handler for
> - * the access, or panic if there isn't a handler.
> - */
> -77:	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	li	r5,SIGSEGV
> -	bl	bad_page_fault
> -	b	interrupt_return
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 73b06adb6eeb..5a61182ddf75 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1512,16 +1512,40 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>   }
>   EXPORT_SYMBOL_GPL(hash_page);
>   
> -int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
> -		unsigned long msr)
> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
>   {
>   	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
>   	unsigned long flags = 0;
> -	struct mm_struct *mm = current->mm;
> -	unsigned int region_id = get_region_id(ea);
> +	struct mm_struct *mm;
> +	unsigned int region_id;
> +	int err;
> +
> +	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
> +		goto page_fault;
> +
> +	/*
> +	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> +	 * don't call hash_page, just fail the fault. This is required to
> +	 * prevent re-entrancy problems in the hash code, namely perf
> +	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> +	 * hash fault. See the comment in hash_preload().
> +	 *
> +	 * We come here as a result of a DSI at a point where we don't want
> +	 * to call hash_page, such as when we are accessing memory (possibly
> +	 * user memory) inside a PMU interrupt that occurred while interrupts
> +	 * were soft-disabled.  We want to invoke the exception handler for
> +	 * the access, or panic if there isn't a handler.
> +	 */
> +	if (unlikely(in_nmi())) {
> +		bad_page_fault(regs, ea, SIGSEGV);
> +		return 0;
> +	}
>   
> +	region_id = get_region_id(ea);
>   	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
>   		mm = &init_mm;
> +	else
> +		mm = current->mm;
>   
>   	if (dsisr & DSISR_NOHPTE)
>   		flags |= HPTE_NOHPTE_UPDATE;
> @@ -1537,13 +1561,31 @@ int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
>   	 * 2) user space access kernel space.
>   	 */
>   	access |= _PAGE_PRIVILEGED;
> -	if ((msr & MSR_PR) || (region_id == USER_REGION_ID))
> +	if (user_mode(regs) || (region_id == USER_REGION_ID))
>   		access &= ~_PAGE_PRIVILEGED;
>   
> -	if (trap == 0x400)
> +	if (regs->trap == 0x400)
>   		access |= _PAGE_EXEC;
>   
> -	return hash_page_mm(mm, ea, access, trap, flags);
> +	err = hash_page_mm(mm, ea, access, regs->trap, flags);
> +	if (unlikely(err < 0)) {
> +		// failed to insert a hash PTE due to a hypervisor error
> +		if (user_mode(regs)) {
> +			if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
> +				_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
> +			else
> +				_exception(SIGBUS, regs, BUS_ADRERR, ea);
> +		} else {
> +			bad_page_fault(regs, ea, SIGBUS);
> +		}
> +		err = 0;
> +
> +	} else if (err) {
> +page_fault:
> +		err = do_page_fault(regs, ea, dsisr);
> +	}
> +
> +	return err;
>   }
>   
>   #ifdef CONFIG_PPC_MM_SLICES
> @@ -1843,27 +1885,6 @@ void flush_hash_range(unsigned long number, int local)
>   	}
>   }
>   
> -/*
> - * low_hash_fault is called when we the low level hash code failed
> - * to instert a PTE due to an hypervisor error
> - */
> -void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
> -{
> -	enum ctx_state prev_state = exception_enter();
> -
> -	if (user_mode(regs)) {
> -#ifdef CONFIG_PPC_SUBPAGE_PROT
> -		if (rc == -2)
> -			_exception(SIGSEGV, regs, SEGV_ACCERR, address);
> -		else
> -#endif
> -			_exception(SIGBUS, regs, BUS_ADRERR, address);
> -	} else
> -		bad_page_fault(regs, address, SIGBUS);
> -
> -	exception_exit(prev_state);
> -}
> -
>   long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
>   			   unsigned long pa, unsigned long rflags,
>   			   unsigned long vflags, int psize, int ssize)
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 8961b44f350c..77a3155c77b6 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -369,7 +369,9 @@ static void sanity_check_fault(bool is_write, bool is_user,
>   #define page_fault_is_bad(__err)	(0)
>   #elif defined(CONFIG_PPC_8xx)
>   #define page_fault_is_bad(__err)	((__err) & DSISR_NOEXEC_OR_G)
> -#elif defined(CONFIG_PPC64)
> +#elif defined(CONFIG_PPC_BOOK3S_64)
> +#define page_fault_is_bad(__err)	((__err) & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH))
> +#elif defined(CONFIG_PPC_BOOK3E_64)
>   #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_64S)
>   #else
>   #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_32S)
> @@ -404,6 +406,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>   		return 0;
>   
>   	if (unlikely(page_fault_is_bad(error_code))) {
> +		if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && (error_code & DSISR_DABRMATCH))
> +			return -1;
> +
>   		if (is_user) {
>   			_exception(SIGBUS, regs, BUS_OBJERR, address);
>   			return 0;
> @@ -545,20 +550,39 @@ NOKPROBE_SYMBOL(__do_page_fault);
>   int do_page_fault(struct pt_regs *regs, unsigned long address,
>   		  unsigned long error_code)
>   {
> -	const struct exception_table_entry *entry;
>   	enum ctx_state prev_state = exception_enter();
> -	int rc = __do_page_fault(regs, address, error_code);
> -	exception_exit(prev_state);
> -	if (likely(!rc))
> -		return 0;
> +	int err;
>   
> -	entry = search_exception_tables(regs->nip);
> -	if (unlikely(!entry))
> -		return rc;
> +	err = __do_page_fault(regs, address, error_code);
> +	if (unlikely(err)) {
> +		const struct exception_table_entry *entry;
>   
> -	instruction_pointer_set(regs, extable_fixup(entry));
> +		entry = search_exception_tables(regs->nip);
> +		if (likely(entry)) {
> +			instruction_pointer_set(regs, extable_fixup(entry));
> +			err = 0;
> +		}
> +	}
>   
> -	return 0;
> +#ifdef CONFIG_PPC_BOOK3S_64

It seems like you are re-implementing handle_page_fault() inside do_page_fault(). Wouldn't it be
possible to keep do_page_fault() as-is for the moment and implement a C version of
handle_page_fault()? Something like the sketch below, perhaps.

Or just keep it in assembly? It is not that big; keeping it in assembly would keep things more
common with PPC32, and would still allow saving NV GPRS only when needed.
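
(Rough sketch only -- just the existing asm tail transcribed to C, with
do_page_fault() itself left untouched; placement is hypothetical:)

long handle_page_fault(struct pt_regs *regs, unsigned long address,
		       unsigned long error_code)
{
	long err;

	/* Data breakpoint? Hand off to do_break() on its own path */
	if (unlikely(error_code & DSISR_DABRMATCH)) {
		do_break(regs, address, error_code);
		return -1;	/* caller must restore NV GPRS */
	}

	err = do_page_fault(regs, address, error_code);
	if (err > 0) {
		/* No exception table fixup: report the bad access */
		__bad_page_fault(regs, address, err);
		err = 0;
	}
	return err;
}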

> +	/* 32 and 64e handle these errors in asm */
> +	if (unlikely(err)) {

This really looks like a hack. I'd rather see the do_break() dispatch done as early as possible,
the same as we now do on book3s/32.

> +		if (err > 0) {
> +			__bad_page_fault(regs, address, err);
> +			err = 0;
> +		} else {
> +			/*
> +			 * do_break() may change NV GPRS while handling the
> +			 * breakpoint. Return -ve to caller to do that.
> +			 */
> +			do_break(regs, address, error_code);
> +		}
> +	}
> +#endif
> +
> +	exception_exit(prev_state);
> +
> +	return err;
>   }
>   NOKPROBE_SYMBOL(do_page_fault);
>   
> 

Christophe


* Re: [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs
  2021-01-13  7:31 ` [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs Nicholas Piggin
@ 2021-01-13 14:25   ` Christophe Leroy
  2021-01-14  3:26     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:25 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:31, Nicholas Piggin wrote:
> Similar to the previous patch, this makes interrupt handler function
> types more regular so they can be wrapped with the next patch.
> 
> bad_page_fault and do_break are not performance critical.

It's a bit different between do_break() and bad_page_fault():
- do_break() is not performance critical, for sure
- for bad_page_fault() it doesn't matter, because it was not using the address param, so it
doesn't actually fetch anything extra from regs in the end.

Maybe it would be worth splitting into two patches, one for bad_page_fault() and one for do_break().

Christophe

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/bug.h             |  4 ++--
>   arch/powerpc/include/asm/debug.h           |  3 +--
>   arch/powerpc/kernel/entry_32.S             |  3 +--
>   arch/powerpc/kernel/exceptions-64e.S       |  3 +--
>   arch/powerpc/kernel/exceptions-64s.S       |  3 +--
>   arch/powerpc/kernel/head_8xx.S             |  5 ++---
>   arch/powerpc/kernel/process.c              |  7 +++----
>   arch/powerpc/kernel/traps.c                |  2 +-
>   arch/powerpc/mm/book3s64/hash_utils.c      |  4 ++--
>   arch/powerpc/mm/book3s64/slb.c             |  2 +-
>   arch/powerpc/mm/fault.c                    | 10 +++++-----
>   arch/powerpc/platforms/8xx/machine_check.c |  2 +-
>   12 files changed, 21 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index f7827e993196..4220789b9a97 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -112,8 +112,8 @@
>   
>   struct pt_regs;
>   long do_page_fault(struct pt_regs *);
> -extern void bad_page_fault(struct pt_regs *, unsigned long, int);
> -void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig);
> +void bad_page_fault(struct pt_regs *, int);
> +void __bad_page_fault(struct pt_regs *regs, int sig);
>   extern void _exception(int, struct pt_regs *, int, unsigned long);
>   extern void _exception_pkey(struct pt_regs *, unsigned long, int);
>   extern void die(const char *, struct pt_regs *, long);
> diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> index ec57daf87f40..0550eceab3ca 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -52,8 +52,7 @@ extern void do_send_trap(struct pt_regs *regs, unsigned long address,
>   			 unsigned long error_code, int brkpt);
>   #else
>   
> -extern void do_break(struct pt_regs *regs, unsigned long address,
> -		     unsigned long error_code);
> +void do_break(struct pt_regs *regs);
>   #endif
>   
>   #endif /* _ASM_POWERPC_DEBUG_H */
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index a32157ce0551..a94127eed56b 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -673,9 +673,8 @@ handle_page_fault:
>   	lwz	r0,_TRAP(r1)
>   	clrrwi	r0,r0,1
>   	stw	r0,_TRAP(r1)
> -	mr	r5,r3
> +	mr	r4,r3		/* err arg for bad_page_fault */
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	lwz	r4,_DAR(r1)
>   	bl	__bad_page_fault
>   	b	ret_from_except_full
>   
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 43e71d86dcbf..52421042a020 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1018,9 +1018,8 @@ storage_fault_common:
>   	bne-	1f
>   	b	ret_from_except_lite
>   1:	bl	save_nvgprs
> -	mr	r5,r3
> +	mr	r4,r3
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
>   	bl	__bad_page_fault
>   	b	ret_from_except
>   
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 814cff2c649e..36dea2020ec5 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
>   	GEN_COMMON h_data_storage
>   	addi    r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
> -	ld	r4,_DAR(r1)
> -	li	r5,SIGSEGV
> +	li	r4,SIGSEGV
>   	bl      bad_page_fault
>   MMU_FTR_SECTION_ELSE
>   	bl      unknown_exception
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 0b2c247cfdff..7869db974185 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -364,10 +364,9 @@ do_databreakpoint:
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	mfspr	r4,SPRN_BAR
>   	stw	r4,_DAR(r11)
> -#ifdef CONFIG_VMAP_STACK
> -	lwz	r5,_DSISR(r11)
> -#else
> +#ifndef CONFIG_VMAP_STACK
>   	mfspr	r5,SPRN_DSISR
> +	stw	r5,_DSISR(r11)
>   #endif
>   	EXC_XFER_STD(0x1c00, do_break)
>   
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index a66f435dabbf..4f0f81e9420b 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -659,11 +659,10 @@ static void do_break_handler(struct pt_regs *regs)
>   	}
>   }
>   
> -void do_break (struct pt_regs *regs, unsigned long address,
> -		    unsigned long error_code)
> +void do_break(struct pt_regs *regs)
>   {
>   	current->thread.trap_nr = TRAP_HWBKPT;
> -	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code,
> +	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, regs->dsisr,
>   			11, SIGSEGV) == NOTIFY_STOP)
>   		return;
>   
> @@ -681,7 +680,7 @@ void do_break (struct pt_regs *regs, unsigned long address,
>   		do_break_handler(regs);
>   
>   	/* Deliver the signal to userspace */
> -	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
> +	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)regs->dar);
>   }
>   #endif	/* CONFIG_PPC_ADV_DEBUG_REGS */
>   
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 3ec7b443fe6b..f3f6af3141ee 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1612,7 +1612,7 @@ void alignment_exception(struct pt_regs *regs)
>   	if (user_mode(regs))
>   		_exception(sig, regs, code, regs->dar);
>   	else
> -		bad_page_fault(regs, regs->dar, sig);
> +		bad_page_fault(regs, sig);
>   
>   bail:
>   	exception_exit(prev_state);
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 8d014924ee0d..77073a256cff 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1539,7 +1539,7 @@ long do_hash_fault(struct pt_regs *regs)
>   	 * the access, or panic if there isn't a handler.
>   	 */
>   	if (unlikely(in_nmi())) {
> -		bad_page_fault(regs, ea, SIGSEGV);
> +		bad_page_fault(regs, SIGSEGV);
>   		return 0;
>   	}
>   
> @@ -1578,7 +1578,7 @@ long do_hash_fault(struct pt_regs *regs)
>   			else
>   				_exception(SIGBUS, regs, BUS_ADRERR, ea);
>   		} else {
> -			bad_page_fault(regs, ea, SIGBUS);
> +			bad_page_fault(regs, SIGBUS);
>   		}
>   		err = 0;
>   
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index 985902ce0272..c581548b533f 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -874,7 +874,7 @@ void do_bad_slb_fault(struct pt_regs *regs)
>   		if (user_mode(regs))
>   			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
>   		else
> -			bad_page_fault(regs, regs->dar, SIGSEGV);
> +			bad_page_fault(regs, SIGSEGV);
>   	} else if (err == -EINVAL) {
>   		unrecoverable_exception(regs);
>   	} else {
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index e170501081a7..36604ff8b3ec 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -569,14 +569,14 @@ long do_page_fault(struct pt_regs *regs)
>   	/* 32 and 64e handle these errors in asm */
>   	if (unlikely(err)) {
>   		if (err > 0) {
> -			__bad_page_fault(regs, address, err);
> +			__bad_page_fault(regs, err);
>   			err = 0;
>   		} else {
>   			/*
>   			 * do_break() may change NV GPRS while handling the
>   			 * breakpoint. Return -ve to caller to do that.
>   			 */
> -			do_break(regs, address, error_code);
> +			do_break(regs);
>   		}
>   	}
>   #endif
> @@ -592,7 +592,7 @@ NOKPROBE_SYMBOL(do_page_fault);
>    * It is called from the DSI and ISI handlers in head.S and from some
>    * of the procedures in traps.c.
>    */
> -void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
> +void __bad_page_fault(struct pt_regs *regs, int sig)
>   {
>   	int is_write = page_fault_is_write(regs->dsisr);
>   
> @@ -630,7 +630,7 @@ void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>   	die("Kernel access of bad area", regs, sig);
>   }
>   
> -void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
> +void bad_page_fault(struct pt_regs *regs, int sig)
>   {
>   	const struct exception_table_entry *entry;
>   
> @@ -639,5 +639,5 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>   	if (entry)
>   		instruction_pointer_set(regs, extable_fixup(entry));
>   	else
> -		__bad_page_fault(regs, address, sig);
> +		__bad_page_fault(regs, sig);
>   }
> diff --git a/arch/powerpc/platforms/8xx/machine_check.c b/arch/powerpc/platforms/8xx/machine_check.c
> index 88dedf38eccd..656365975895 100644
> --- a/arch/powerpc/platforms/8xx/machine_check.c
> +++ b/arch/powerpc/platforms/8xx/machine_check.c
> @@ -26,7 +26,7 @@ int machine_check_8xx(struct pt_regs *regs)
>   	 * to deal with that than having a wart in the mcheck handler.
>   	 * -- BenH
>   	 */
> -	bad_page_fault(regs, regs->dar, SIGBUS);
> +	bad_page_fault(regs, SIGBUS);
>   	return 1;
>   #else
>   	return 0;
> 


* Re: [PATCH v5 06/21] powerpc: interrupt handler wrapper functions
  2021-01-13  7:32 ` [PATCH v5 06/21] powerpc: interrupt handler wrapper functions Nicholas Piggin
@ 2021-01-13 14:45   ` Christophe Leroy
  2021-01-14  3:41     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:45 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> Add wrapper functions (derived from x86 macros) for interrupt handler
> functions. This allows interrupt entry code to be written in C.

Looks like you are doing more than just that in this patch. It would be worth splitting into
several patches, I think.

I'd suggest:
- Other patches for the unrelated changes, see below for details
- One patch that brings the wrapper macros (rough shape sketched below)
- One patch that uses those macros
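
For reference, the rough shape of the wrapper macros being added here is
something like the following (a simplified sketch -- the real macros
differ in detail, and the common entry/exit work is only filled in by a
later patch in the series):

#define DEFINE_INTERRUPT_HANDLER(func)					\
static __always_inline void ____##func(struct pt_regs *regs);		\
									\
void func(struct pt_regs *regs)						\
{									\
	/* common entry work goes here (stubbed for now) */		\
	____##func(regs);						\
	/* common exit work goes here (stubbed for now) */		\
}									\
									\
static __always_inline void ____##func(struct pt_regs *regs)

so that DEFINE_INTERRUPT_HANDLER(foo) { ... } expands to an out-of-line
entry point foo() wrapped around an always-inlined body ____foo().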

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/asm-prototypes.h     |  29 ---
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 -
>   arch/powerpc/include/asm/hw_irq.h             |   9 -
>   arch/powerpc/include/asm/interrupt.h          | 218 ++++++++++++++++++
>   arch/powerpc/include/asm/time.h               |   2 +
>   arch/powerpc/kernel/dbell.c                   |  12 +-
>   arch/powerpc/kernel/exceptions-64s.S          |   7 +-
>   arch/powerpc/kernel/head_book3s_32.S          |   6 +-
>   arch/powerpc/kernel/irq.c                     |   3 +-
>   arch/powerpc/kernel/mce.c                     |   5 +-
>   arch/powerpc/kernel/syscall_64.c              |   1 +
>   arch/powerpc/kernel/tau_6xx.c                 |   2 +-
>   arch/powerpc/kernel/time.c                    |   3 +-
>   arch/powerpc/kernel/traps.c                   |  90 +++++---
>   arch/powerpc/kernel/watchdog.c                |   7 +-
>   arch/powerpc/kvm/book3s_hv.c                  |   1 +
>   arch/powerpc/kvm/book3s_hv_builtin.c          |   1 +
>   arch/powerpc/kvm/booke.c                      |   1 +
>   arch/powerpc/mm/book3s64/hash_utils.c         |  57 +++--
>   arch/powerpc/mm/book3s64/slb.c                |  29 +--
>   arch/powerpc/mm/fault.c                       |  15 +-
>   arch/powerpc/platforms/powernv/idle.c         |   1 +
>   22 files changed, 374 insertions(+), 126 deletions(-)
>   create mode 100644 arch/powerpc/include/asm/interrupt.h
> 

...

> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> new file mode 100644
> index 000000000000..60363e5eeffa
> --- /dev/null
> +++ b/arch/powerpc/include/asm/interrupt.h

...

> +/* Interrupt handlers */
> +DECLARE_INTERRUPT_HANDLER_NMI(machine_check_early);
> +DECLARE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode);
> +DECLARE_INTERRUPT_HANDLER(SMIException);
> +DECLARE_INTERRUPT_HANDLER(handle_hmi_exception);
> +DECLARE_INTERRUPT_HANDLER(instruction_breakpoint_exception);
> +DECLARE_INTERRUPT_HANDLER(RunModeException);
> +DECLARE_INTERRUPT_HANDLER(single_step_exception);
> +DECLARE_INTERRUPT_HANDLER(program_check_exception);
> +DECLARE_INTERRUPT_HANDLER(alignment_exception);
> +DECLARE_INTERRUPT_HANDLER(StackOverflow);
> +DECLARE_INTERRUPT_HANDLER(stack_overflow_exception);
> +DECLARE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception);
> +DECLARE_INTERRUPT_HANDLER(altivec_unavailable_exception);
> +DECLARE_INTERRUPT_HANDLER(vsx_unavailable_exception);
> +DECLARE_INTERRUPT_HANDLER(fp_unavailable_tm);
> +DECLARE_INTERRUPT_HANDLER(altivec_unavailable_tm);
> +DECLARE_INTERRUPT_HANDLER(vsx_unavailable_tm);
> +DECLARE_INTERRUPT_HANDLER(facility_unavailable_exception);
> +DECLARE_INTERRUPT_HANDLER_ASYNC(TAUException);
> +DECLARE_INTERRUPT_HANDLER(altivec_assist_exception);
> +DECLARE_INTERRUPT_HANDLER(unrecoverable_exception);
> +DECLARE_INTERRUPT_HANDLER(kernel_bad_stack);
> +DECLARE_INTERRUPT_HANDLER_NMI(system_reset_exception);
> +#ifdef CONFIG_PPC_BOOK3S_64
> +DECLARE_INTERRUPT_HANDLER_ASYNC(machine_check_exception);
> +#else
> +DECLARE_INTERRUPT_HANDLER_NMI(machine_check_exception);
> +#endif
> +DECLARE_INTERRUPT_HANDLER(emulation_assist_interrupt);
> +DECLARE_INTERRUPT_HANDLER_RAW(do_slb_fault);
> +DECLARE_INTERRUPT_HANDLER(do_bad_slb_fault);
> +DECLARE_INTERRUPT_HANDLER_RAW(do_hash_fault);
> +DECLARE_INTERRUPT_HANDLER_RET(do_page_fault);
> +DECLARE_INTERRUPT_HANDLER(__do_bad_page_fault);
> +DECLARE_INTERRUPT_HANDLER(do_bad_page_fault);

Missing DECLARE_INTERRUPT_HANDLER(do_break)

> +
> +DECLARE_INTERRUPT_HANDLER_ASYNC(timer_interrupt);
> +DECLARE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi);
> +DECLARE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async);
> +DECLARE_INTERRUPT_HANDLER_RAW(performance_monitor_exception);
> +DECLARE_INTERRUPT_HANDLER(WatchdogException);
> +DECLARE_INTERRUPT_HANDLER(unknown_exception);
> +DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
> +
> +void replay_system_reset(void);
> +void replay_soft_interrupts(void);
> +
> +#endif /* _ASM_POWERPC_INTERRUPT_H */
> diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
> index 8f789b597bae..8dd3cdb25338 100644
> --- a/arch/powerpc/include/asm/time.h
> +++ b/arch/powerpc/include/asm/time.h
> @@ -102,6 +102,8 @@ DECLARE_PER_CPU(u64, decrementers_next_tb);
>   /* Convert timebase ticks to nanoseconds */
>   unsigned long long tb_to_ns(unsigned long long tb_ticks);
>   
> +void timer_broadcast_interrupt(void);

This seems unrelated. I think a separate patch would be better for moving the prototypes without
making them wrappers.

> +
>   /* SPLPAR */
>   void accumulate_stolen_time(void);
>   
> diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
> index 52680cf07c9d..c0f99f8ffa7d 100644
> --- a/arch/powerpc/kernel/dbell.c
> +++ b/arch/powerpc/kernel/dbell.c
> @@ -12,14 +12,14 @@
>   #include <linux/hardirq.h>
>   
>   #include <asm/dbell.h>
> +#include <asm/interrupt.h>
>   #include <asm/irq_regs.h>
>   #include <asm/kvm_ppc.h>
>   #include <asm/trace.h>
>   
> -#ifdef CONFIG_SMP
> -

This seems unrelated; is it needed? What's the problem with having two full versions of
doorbell_exception(), as sketched below?
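
I mean keeping two complete definitions, roughly (a sketch reusing the
existing bodies):

#ifdef CONFIG_SMP
DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
{
	struct pt_regs *old_regs = set_irq_regs(regs);

	irq_enter();
	/* ... the existing SMP doorbell handling ... */
	irq_exit();
	set_irq_regs(old_regs);
}
#else
DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
{
	printk(KERN_WARNING "Received doorbell on non-smp system\n");
}
#endif /* CONFIG_SMP */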

> -void doorbell_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
>   {
> +#ifdef CONFIG_SMP
>   	struct pt_regs *old_regs = set_irq_regs(regs);
>   
>   	irq_enter();
> @@ -37,11 +37,7 @@ void doorbell_exception(struct pt_regs *regs)
>   	trace_doorbell_exit(regs);
>   	irq_exit();
>   	set_irq_regs(old_regs);
> -}
>   #else /* CONFIG_SMP */
> -void doorbell_exception(struct pt_regs *regs)
> -{
>   	printk(KERN_WARNING "Received doorbell on non-smp system\n");
> -}
>   #endif /* CONFIG_SMP */
> -
> +}
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 36dea2020ec5..8b0db807974c 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1923,7 +1923,7 @@ EXC_COMMON_BEGIN(doorbell_super_common)
>   #ifdef CONFIG_PPC_DOORBELL
>   	bl	doorbell_exception
>   #else
> -	bl	unknown_exception
> +	bl	unknown_async_exception

Unrelated to wrappers?

>   #endif
>   	b	interrupt_return
>   
> @@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
>   	GEN_COMMON h_data_storage
>   	addi    r3,r1,STACK_FRAME_OVERHEAD
>   BEGIN_MMU_FTR_SECTION
> -	li	r4,SIGSEGV
> -	bl      bad_page_fault
> +	bl      do_bad_page_fault

Is this name change related?

>   MMU_FTR_SECTION_ELSE
>   	bl      unknown_exception
>   ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
> @@ -2310,7 +2309,7 @@ EXC_COMMON_BEGIN(h_doorbell_common)
>   #ifdef CONFIG_PPC_DOORBELL
>   	bl	doorbell_exception
>   #else
> -	bl	unknown_exception
> +	bl	unknown_async_exception
>   #endif
>   	b	interrupt_return
>   
> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
> index 94ad1372c490..9b4d5432e2db 100644
> --- a/arch/powerpc/kernel/head_book3s_32.S
> +++ b/arch/powerpc/kernel/head_book3s_32.S
> @@ -238,8 +238,8 @@ __secondary_hold_acknowledge:
>   
>   /* System reset */
>   /* core99 pmac starts the seconary here by changing the vector, and
> -   putting it back to what it was (unknown_exception) when done.  */
> -	EXCEPTION(0x100, Reset, unknown_exception, EXC_XFER_STD)
> +   putting it back to what it was (unknown_async_exception) when done.  */
> +	EXCEPTION(0x100, Reset, unknown_async_exception, EXC_XFER_STD)
>   
>   /* Machine check */
>   /*
> @@ -631,7 +631,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_NEED_DTLB_SW_LRU)
>   #endif
>   
>   #ifndef CONFIG_TAU_INT
> -#define TAUException	unknown_exception
> +#define TAUException	unknown_async_exception
>   #endif
>   
>   	EXCEPTION(0x1300, Trap_13, instruction_breakpoint_exception, EXC_XFER_STD)
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 6b1eca53e36c..2055d204d08e 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -54,6 +54,7 @@
>   #include <linux/pgtable.h>
>   
>   #include <linux/uaccess.h>
> +#include <asm/interrupt.h>
>   #include <asm/io.h>
>   #include <asm/irq.h>
>   #include <asm/cache.h>
> @@ -665,7 +666,7 @@ void __do_irq(struct pt_regs *regs)
>   	irq_exit();
>   }
>   
> -void do_IRQ(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
>   {
>   	struct pt_regs *old_regs = set_irq_regs(regs);
>   	void *cursp, *irqsp, *sirqsp;
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 9f3e133b57b7..54269947113d 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -18,6 +18,7 @@
>   #include <linux/extable.h>
>   #include <linux/ftrace.h>
>   
> +#include <asm/interrupt.h>
>   #include <asm/machdep.h>
>   #include <asm/mce.h>
>   #include <asm/nmi.h>
> @@ -588,7 +589,7 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
>    *
>    * regs->nip and regs->msr contains srr0 and ssr1.
>    */
> -long notrace machine_check_early(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>   {
>   	long handled = 0;
>   	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
> @@ -722,7 +723,7 @@ long hmi_handle_debugtrig(struct pt_regs *regs)
>   /*
>    * Return values:
>    */
> -long hmi_exception_realmode(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode)
>   {	
>   	int ret;
>   
> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
> index 7c85ed04a164..dd87b2118620 100644
> --- a/arch/powerpc/kernel/syscall_64.c
> +++ b/arch/powerpc/kernel/syscall_64.c
> @@ -5,6 +5,7 @@
>   #include <asm/kup.h>
>   #include <asm/cputime.h>
>   #include <asm/hw_irq.h>
> +#include <asm/interrupt.h>
>   #include <asm/kprobes.h>
>   #include <asm/paca.h>
>   #include <asm/ptrace.h>
> diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
> index 0b4694b8d248..46b2e5de4ef5 100644
> --- a/arch/powerpc/kernel/tau_6xx.c
> +++ b/arch/powerpc/kernel/tau_6xx.c
> @@ -100,7 +100,7 @@ static void TAUupdate(int cpu)
>    * with interrupts disabled
>    */
>   
> -void TAUException(struct pt_regs * regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(TAUException)
>   {
>   	int cpu = smp_processor_id();
>   
> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> index 67feb3524460..435a251247ed 100644
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -56,6 +56,7 @@
>   #include <linux/processor.h>
>   #include <asm/trace.h>
>   
> +#include <asm/interrupt.h>
>   #include <asm/io.h>
>   #include <asm/nvram.h>
>   #include <asm/cache.h>
> @@ -570,7 +571,7 @@ void arch_irq_work_raise(void)
>    * timer_interrupt - gets called when the decrementer overflows,
>    * with interrupts disabled.
>    */
> -void timer_interrupt(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
>   {
>   	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
>   	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 9b5298c016c7..f4462b481248 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -41,6 +41,7 @@
>   #include <asm/emulated_ops.h>
>   #include <linux/uaccess.h>
>   #include <asm/debugfs.h>
> +#include <asm/interrupt.h>
>   #include <asm/io.h>
>   #include <asm/machdep.h>
>   #include <asm/rtas.h>
> @@ -430,8 +431,7 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
>   	regs->msr &= ~MSR_RI;
>   #endif
>   }
> -
> -void system_reset_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>   {
>   	unsigned long hsrr0, hsrr1;
>   	bool saved_hsrrs = false;
> @@ -516,7 +516,10 @@ void system_reset_exception(struct pt_regs *regs)
>   	this_cpu_set_ftrace_enabled(ftrace_enabled);
>   
>   	/* What should we do here? We could issue a shutdown or hard reset. */
> +
> +	return 0;
>   }
> +NOKPROBE_SYMBOL(system_reset_exception);

Is this NOKPROBE_SYMBOL() related to wrappers or just a bug fix?

>   
>   /*
>    * I/O accesses can cause machine checks on powermacs.
> @@ -788,7 +791,12 @@ int machine_check_generic(struct pt_regs *regs)
>   }
>   #endif /* everything else */
>   
> -void machine_check_exception(struct pt_regs *regs)
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
> +#else
> +DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
> +#endif
>   {
>   	int recover = 0;
>   
> @@ -838,13 +846,21 @@ void machine_check_exception(struct pt_regs *regs)
>   	if (!(regs->msr & MSR_RI))
>   		die("Unrecoverable Machine check", regs, SIGBUS);
>   
> +#ifdef CONFIG_PPC_BOOK3S_64
> +bail:
>   	return;
> +#else
> +	return 0;
>   
>   bail:
>   	if (nmi) nmi_exit();
> +
> +	return 0;
> +#endif

This looks fishy. Can't we have both variants return the same type, either long or void? See the sketch below.
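
One option (a hypothetical sketch, not something in the patch) would be
to share a long-returning body and keep only the wrapper type
conditional:

static long __machine_check_exception(struct pt_regs *regs)
{
	/* ... the existing recovery / die() logic, with one bail path ... */
	return 0;
}

#ifdef CONFIG_PPC_BOOK3S_64
DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
{
	__machine_check_exception(regs);
}
#else
DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
{
	return __machine_check_exception(regs);
}
#endif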

>   }
> +NOKPROBE_SYMBOL(machine_check_exception);

Is this NOKPROBE_SYMBOL() related to wrappers or just a bug fix?

>   
> -void SMIException(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(SMIException) /* async? */
>   {
>   	die("System Management Interrupt", regs, SIGABRT);
>   }
> @@ -1030,7 +1046,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
>   }
>   #endif /* CONFIG_VSX */
>   
> -void handle_hmi_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
>   {
>   	struct pt_regs *old_regs;
>   
> @@ -1059,7 +1075,7 @@ void handle_hmi_exception(struct pt_regs *regs)
>   	set_irq_regs(old_regs);
>   }
>   
> -void unknown_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(unknown_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1071,7 +1087,19 @@ void unknown_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void instruction_breakpoint_exception(struct pt_regs *regs)

Shouldn't unknown_async_exception() be added in a preceding patch?

> +DEFINE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception)
> +{
> +	enum ctx_state prev_state = exception_enter();
> +
> +	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
> +	       regs->nip, regs->msr, regs->trap);
> +
> +	_exception(SIGTRAP, regs, TRAP_UNK, 0);
> +
> +	exception_exit(prev_state);
> +}
> +
> +DEFINE_INTERRUPT_HANDLER(instruction_breakpoint_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1086,12 +1114,12 @@ void instruction_breakpoint_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void RunModeException(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(RunModeException)
>   {
>   	_exception(SIGTRAP, regs, TRAP_UNK, 0);
>   }
>   
> -void single_step_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(single_step_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1436,7 +1464,7 @@ static int emulate_math(struct pt_regs *regs)
>   static inline int emulate_math(struct pt_regs *regs) { return -1; }
>   #endif
>   
> -void program_check_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(program_check_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   	unsigned int reason = get_reason(regs);
> @@ -1561,14 +1589,14 @@ NOKPROBE_SYMBOL(program_check_exception);
>    * This occurs when running in hypervisor mode on POWER6 or later
>    * and an illegal instruction is encountered.
>    */
> -void emulation_assist_interrupt(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(emulation_assist_interrupt)
>   {
>   	regs->msr |= REASON_ILLEGAL;
>   	program_check_exception(regs);
>   }
>   NOKPROBE_SYMBOL(emulation_assist_interrupt);
>   
> -void alignment_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(alignment_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   	int sig, code, fixed = 0;
> @@ -1618,7 +1646,7 @@ void alignment_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void StackOverflow(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(StackOverflow)
>   {
>   	pr_crit("Kernel stack overflow in process %s[%d], r1=%lx\n",
>   		current->comm, task_pid_nr(current), regs->gpr[1]);
> @@ -1627,7 +1655,7 @@ void StackOverflow(struct pt_regs *regs)
>   	panic("kernel stack overflow");
>   }
>   
> -void stack_overflow_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(stack_overflow_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1636,7 +1664,7 @@ void stack_overflow_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void kernel_fp_unavailable_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1647,7 +1675,7 @@ void kernel_fp_unavailable_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void altivec_unavailable_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(altivec_unavailable_exception)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   
> @@ -1666,7 +1694,7 @@ void altivec_unavailable_exception(struct pt_regs *regs)
>   	exception_exit(prev_state);
>   }
>   
> -void vsx_unavailable_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(vsx_unavailable_exception)
>   {
>   	if (user_mode(regs)) {
>   		/* A user program has executed an vsx instruction,
> @@ -1697,7 +1725,7 @@ static void tm_unavailable(struct pt_regs *regs)
>   	die("Unrecoverable TM Unavailable Exception", regs, SIGABRT);
>   }
>   
> -void facility_unavailable_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
>   {
>   	static char *facility_strings[] = {
>   		[FSCR_FP_LG] = "FPU",
> @@ -1817,7 +1845,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
>   
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   
> -void fp_unavailable_tm(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(fp_unavailable_tm)
>   {
>   	/* Note:  This does not handle any kind of FP laziness. */
>   
> @@ -1850,7 +1878,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
>   	tm_recheckpoint(&current->thread);
>   }
>   
> -void altivec_unavailable_tm(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(altivec_unavailable_tm)
>   {
>   	/* See the comments in fp_unavailable_tm().  This function operates
>   	 * the same way.
> @@ -1865,7 +1893,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>   	current->thread.used_vr = 1;
>   }
>   
> -void vsx_unavailable_tm(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(vsx_unavailable_tm)
>   {
>   	/* See the comments in fp_unavailable_tm().  This works similarly,
>   	 * though we're loading both FP and VEC registers in here.
> @@ -1890,7 +1918,8 @@ void vsx_unavailable_tm(struct pt_regs *regs)
>   }
>   #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>   
> -static void performance_monitor_exception_nmi(struct pt_regs *regs)
> +#ifdef CONFIG_PPC64
> +DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
>   {
>   	nmi_enter();
>   
> @@ -1899,9 +1928,12 @@ static void performance_monitor_exception_nmi(struct pt_regs *regs)
>   	perf_irq(regs);
>   
>   	nmi_exit();
> +
> +	return 0;
>   }
> +#endif
>   
> -static void performance_monitor_exception_async(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async)
>   {
>   	irq_enter();
>   
> @@ -1912,7 +1944,7 @@ static void performance_monitor_exception_async(struct pt_regs *regs)
>   	irq_exit();
>   }
>   
> -void performance_monitor_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
>   {
>   	/*
>   	 * On 64-bit, if perf interrupts hit in a local_irq_disable
> @@ -1924,6 +1956,8 @@ void performance_monitor_exception(struct pt_regs *regs)
>   		performance_monitor_exception_nmi(regs);
>   	else
>   		performance_monitor_exception_async(regs);
> +
> +	return 0;
>   }
>   
>   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> @@ -2057,7 +2091,7 @@ NOKPROBE_SYMBOL(DebugException);
>   #endif /* CONFIG_PPC_ADV_DEBUG_REGS */
>   
>   #ifdef CONFIG_ALTIVEC
> -void altivec_assist_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(altivec_assist_exception)
>   {
>   	int err;
>   
> @@ -2199,7 +2233,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
>    * in the MSR is 0.  This indicates that SRR0/1 are live, and that
>    * we therefore lost state by taking this exception.
>    */
> -void unrecoverable_exception(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(unrecoverable_exception)
>   {
>   	pr_emerg("Unrecoverable exception %lx at %lx (msr=%lx)\n",
>   		 regs->trap, regs->nip, regs->msr);
> @@ -2219,7 +2253,7 @@ void __attribute__ ((weak)) WatchdogHandler(struct pt_regs *regs)
>   	return;
>   }
>   
> -void WatchdogException(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(WatchdogException) /* XXX NMI? async? */
>   {
>   	printk (KERN_EMERG "PowerPC Book-E Watchdog Exception\n");
>   	WatchdogHandler(regs);
> @@ -2230,7 +2264,7 @@ void WatchdogException(struct pt_regs *regs)
>    * We enter here if we discover during exception entry that we are
>    * running in supervisor mode with a userspace value in the stack pointer.
>    */
> -void kernel_bad_stack(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(kernel_bad_stack)
>   {
>   	printk(KERN_EMERG "Bad kernel stack pointer %lx at %lx\n",
>   	       regs->gpr[1], regs->nip);
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index af3c15a1d41e..824b9376ac35 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -26,6 +26,7 @@
>   #include <linux/delay.h>
>   #include <linux/smp.h>
>   
> +#include <asm/interrupt.h>
>   #include <asm/paca.h>
>   
>   /*
> @@ -247,14 +248,14 @@ static void watchdog_timer_interrupt(int cpu)
>   		watchdog_smp_panic(cpu, tb);
>   }
>   
> -void soft_nmi_interrupt(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   {
>   	unsigned long flags;
>   	int cpu = raw_smp_processor_id();
>   	u64 tb;
>   
>   	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
> -		return;
> +		return 0;
>   
>   	nmi_enter();
>   
> @@ -291,6 +292,8 @@ void soft_nmi_interrupt(struct pt_regs *regs)
>   
>   out:
>   	nmi_exit();
> +
> +	return 0;
>   }
>   
>   static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6f612d240392..3f9a229f82a2 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -53,6 +53,7 @@
>   #include <asm/cputable.h>
>   #include <asm/cacheflush.h>
>   #include <linux/uaccess.h>
> +#include <asm/interrupt.h>
>   #include <asm/io.h>
>   #include <asm/kvm_ppc.h>
>   #include <asm/kvm_book3s.h>
> diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
> index 8053efdf7ea7..10fc274bea65 100644
> --- a/arch/powerpc/kvm/book3s_hv_builtin.c
> +++ b/arch/powerpc/kvm/book3s_hv_builtin.c
> @@ -17,6 +17,7 @@
>   
>   #include <asm/asm-prototypes.h>
>   #include <asm/cputable.h>
> +#include <asm/interrupt.h>
>   #include <asm/kvm_ppc.h>
>   #include <asm/kvm_book3s.h>
>   #include <asm/archrandom.h>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 288a9820ec01..bd2bb73021d8 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -20,6 +20,7 @@
>   
>   #include <asm/cputable.h>
>   #include <linux/uaccess.h>
> +#include <asm/interrupt.h>
>   #include <asm/kvm_ppc.h>
>   #include <asm/cacheflush.h>
>   #include <asm/dbell.h>
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 77073a256cff..453afb9ae9b4 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -38,6 +38,7 @@
>   #include <linux/pgtable.h>
>   
>   #include <asm/debugfs.h>
> +#include <asm/interrupt.h>
>   #include <asm/processor.h>
>   #include <asm/mmu.h>
>   #include <asm/mmu_context.h>
> @@ -1512,7 +1513,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>   }
>   EXPORT_SYMBOL_GPL(hash_page);
>   
> -long do_hash_fault(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
>   {
>   	unsigned long ea = regs->dar;
>   	unsigned long dsisr = regs->dsisr;
> @@ -1522,27 +1523,6 @@ long do_hash_fault(struct pt_regs *regs)
>   	unsigned int region_id;
>   	long err;
>   
> -	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
> -		goto page_fault;
> -
> -	/*
> -	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> -	 * don't call hash_page, just fail the fault. This is required to
> -	 * prevent re-entrancy problems in the hash code, namely perf
> -	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> -	 * hash fault. See the comment in hash_preload().
> -	 *
> -	 * We come here as a result of a DSI at a point where we don't want
> -	 * to call hash_page, such as when we are accessing memory (possibly
> -	 * user memory) inside a PMU interrupt that occurred while interrupts
> -	 * were soft-disabled.  We want to invoke the exception handler for
> -	 * the access, or panic if there isn't a handler.
> -	 */
> -	if (unlikely(in_nmi())) {
> -		bad_page_fault(regs, SIGSEGV);
> -		return 0;
> -	}
> -
>   	region_id = get_region_id(ea);
>   	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
>   		mm = &init_mm;
> @@ -1583,13 +1563,44 @@ long do_hash_fault(struct pt_regs *regs)
>   		err = 0;
>   
>   	} else if (err) {
> -page_fault:
>   		err = do_page_fault(regs);
>   	}
>   
>   	return err;
>   }
>   
> +/*
> + * The _RAW interrupt entry checks for the in_nmi() case before
> + * running the full handler.
> + */
> +DEFINE_INTERRUPT_HANDLER_RAW(do_hash_fault)

Could we do that split into __do_hash_fault() / do_hash_fault() in a preceding patch?

> +{
> +	unsigned long dsisr = regs->dsisr;
> +
> +	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
> +		return do_page_fault(regs);
> +
> +	/*
> +	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> +	 * don't call hash_page, just fail the fault. This is required to
> +	 * prevent re-entrancy problems in the hash code, namely perf
> +	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> +	 * hash fault. See the comment in hash_preload().
> +	 *
> +	 * We come here as a result of a DSI at a point where we don't want
> +	 * to call hash_page, such as when we are accessing memory (possibly
> +	 * user memory) inside a PMU interrupt that occurred while interrupts
> +	 * were soft-disabled.  We want to invoke the exception handler for
> +	 * the access, or panic if there isn't a handler.
> +	 */
> +	if (unlikely(in_nmi())) {
> +		do_bad_page_fault(regs);
> +		return 0;
> +	}
> +
> +	return __do_hash_fault(regs);
> +}
> +
>   #ifdef CONFIG_PPC_MM_SLICES
>   static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
>   {
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index c581548b533f..0ae10adae203 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -10,6 +10,7 @@
>    */
>   
>   #include <asm/asm-prototypes.h>
> +#include <asm/interrupt.h>
>   #include <asm/mmu.h>
>   #include <asm/mmu_context.h>
>   #include <asm/paca.h>
> @@ -813,7 +814,7 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
>   	return slb_insert_entry(ea, context, flags, ssize, false);
>   }
>   
> -long do_slb_fault(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_RAW(do_slb_fault)
>   {
>   	unsigned long ea = regs->dar;
>   	unsigned long id = get_region_id(ea);
> @@ -827,17 +828,19 @@ long do_slb_fault(struct pt_regs *regs)
>   	/*
>   	 * SLB kernel faults must be very careful not to touch anything
>   	 * that is not bolted. E.g., PACA and global variables are okay,
> -	 * mm->context stuff is not.
> -	 *
> -	 * SLB user faults can access all of kernel memory, but must be
> -	 * careful not to touch things like IRQ state because it is not
> -	 * "reconciled" here. The difficulty is that we must use
> -	 * fast_exception_return to return from kernel SLB faults without
> -	 * looking at possible non-bolted memory. We could test user vs
> -	 * kernel faults in the interrupt handler asm and do a full fault,
> -	 * reconcile, ret_from_except for user faults which would make them
> -	 * first class kernel code. But for performance it's probably nicer
> -	 * if they go via fast_exception_return too.
> +	 * mm->context stuff is not. SLB user faults may access all of
> +	 * memory (and induce one recursive SLB kernel fault), so the
> +	 * kernel fault must not trample on the user fault state at those
> +	 * points.
> +	 */
> +
> +	/*
> +	 * This is a _RAW interrupt handler, so it must not touch local
> +	 * irq state, or schedule. We could test for usermode and upgrade
> +	 * to a normal process context (synchronous) interrupt for those,
> +	 * which would make them first-class kernel code and able to be
> +	 * traced and instrumented, although performance would suffer a
> +	 * bit, it would probably be a good tradeoff.

Is the comment change really related to the wrapper macros?

>   	 */
>   	if (id >= LINEAR_MAP_REGION_ID) {
>   		long err;
> @@ -866,7 +869,7 @@ long do_slb_fault(struct pt_regs *regs)
>   	}
>   }
>   
> -void do_bad_slb_fault(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER(do_bad_slb_fault)
>   {
>   	int err = regs->result;
>   
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 36604ff8b3ec..9e1cd74ebb13 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -34,6 +34,7 @@
>   #include <linux/uaccess.h>
>   
>   #include <asm/firmware.h>
> +#include <asm/interrupt.h>
>   #include <asm/page.h>
>   #include <asm/mmu.h>
>   #include <asm/mmu_context.h>
> @@ -547,7 +548,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>   }
>   NOKPROBE_SYMBOL(__do_page_fault);
>   
> -long do_page_fault(struct pt_regs *regs)
> +DEFINE_INTERRUPT_HANDLER_RET(do_page_fault)
>   {
>   	enum ctx_state prev_state = exception_enter();
>   	unsigned long address = regs->dar;
> @@ -641,3 +642,15 @@ void bad_page_fault(struct pt_regs *regs, int sig)
>   	else
>   		__bad_page_fault(regs, sig);
>   }
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +DEFINE_INTERRUPT_HANDLER(__do_bad_page_fault)
> +{
> +	__bad_page_fault(regs, SIGSEGV);
> +}
> +
> +DEFINE_INTERRUPT_HANDLER(do_bad_page_fault)
> +{
> +	bad_page_fault(regs, SIGSEGV);
> +}
> +#endif
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index e6f461812856..999997d9e9a9 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -14,6 +14,7 @@
>   
>   #include <asm/asm-prototypes.h>
>   #include <asm/firmware.h>
> +#include <asm/interrupt.h>
>   #include <asm/machdep.h>
>   #include <asm/opal.h>
>   #include <asm/cputhreads.h>
> 


* Re: [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ
  2021-01-13  7:32 ` [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ Nicholas Piggin
@ 2021-01-13 14:50   ` Christophe Leroy
  2021-01-14  3:48     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:50 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> Add context tracking to the system call handler explicitly, and remove
> _TIF_NOHZ.
> 
> This saves 35 cycles from the gettid system call cost on POWER9 with a
> CONFIG_NOHZ_FULL kernel.

35 cycles out of 100 cycles, or out of 5000? I mean, what percentage do you win?

Christophe

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/Kconfig                   |  1 -
>   arch/powerpc/include/asm/thread_info.h |  4 +---
>   arch/powerpc/kernel/ptrace/ptrace.c    |  4 ----
>   arch/powerpc/kernel/signal.c           |  4 ----
>   arch/powerpc/kernel/syscall_64.c       | 10 ++++++++++
>   5 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 107bb4319e0e..28d5a1b1510f 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -196,7 +196,6 @@ config PPC
>   	select HAVE_STACKPROTECTOR		if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
>   	select HAVE_STACKPROTECTOR		if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
>   	select HAVE_CONTEXT_TRACKING		if PPC64
> -	select HAVE_TIF_NOHZ			if PPC64
>   	select HAVE_DEBUG_KMEMLEAK
>   	select HAVE_DEBUG_STACKOVERFLOW
>   	select HAVE_DYNAMIC_FTRACE
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index 3d8a47af7a25..386d576673a1 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -94,7 +94,6 @@ void arch_setup_new_exec(void);
>   #define TIF_PATCH_PENDING	6	/* pending live patching update */
>   #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
>   #define TIF_SINGLESTEP		8	/* singlestepping active */
> -#define TIF_NOHZ		9	/* in adaptive nohz mode */
>   #define TIF_SECCOMP		10	/* secure computing */
>   #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
>   #define TIF_NOERROR		12	/* Force successful syscall return */
> @@ -128,11 +127,10 @@ void arch_setup_new_exec(void);
>   #define _TIF_UPROBE		(1<<TIF_UPROBE)
>   #define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
>   #define _TIF_EMULATE_STACK_STORE	(1<<TIF_EMULATE_STACK_STORE)
> -#define _TIF_NOHZ		(1<<TIF_NOHZ)
>   #define _TIF_SYSCALL_EMU	(1<<TIF_SYSCALL_EMU)
>   #define _TIF_SYSCALL_DOTRACE	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>   				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
> -				 _TIF_NOHZ | _TIF_SYSCALL_EMU)
> +				 _TIF_SYSCALL_EMU)
>   
>   #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
>   				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> index 3d44b73adb83..4f3d4ff3728c 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -262,8 +262,6 @@ long do_syscall_trace_enter(struct pt_regs *regs)
>   {
>   	u32 flags;
>   
> -	user_exit();
> -
>   	flags = READ_ONCE(current_thread_info()->flags) &
>   		(_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
>   
> @@ -340,8 +338,6 @@ void do_syscall_trace_leave(struct pt_regs *regs)
>   	step = test_thread_flag(TIF_SINGLESTEP);
>   	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
>   		tracehook_report_syscall_exit(regs, step);
> -
> -	user_enter();
>   }
>   
>   void __init pt_regs_check(void);
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index 53782aa60ade..9ded046edb0e 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -282,8 +282,6 @@ static void do_signal(struct task_struct *tsk)
>   
>   void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
>   {
> -	user_exit();
> -
>   	if (thread_info_flags & _TIF_UPROBE)
>   		uprobe_notify_resume(regs);
>   
> @@ -299,8 +297,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
>   		tracehook_notify_resume(regs);
>   		rseq_handle_notify_resume(NULL, regs);
>   	}
> -
> -	user_enter();
>   }
>   
>   static unsigned long get_tm_stackpointer(struct task_struct *tsk)
> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
> index dd87b2118620..d7d256a7a41f 100644
> --- a/arch/powerpc/kernel/syscall_64.c
> +++ b/arch/powerpc/kernel/syscall_64.c
> @@ -1,9 +1,11 @@
>   // SPDX-License-Identifier: GPL-2.0-or-later
>   
> +#include <linux/context_tracking.h>
>   #include <linux/err.h>
>   #include <asm/asm-prototypes.h>
>   #include <asm/kup.h>
>   #include <asm/cputime.h>
> +#include <asm/interrupt.h>
>   #include <asm/hw_irq.h>
>   #include <asm/interrupt.h>
>   #include <asm/kprobes.h>
> @@ -28,6 +30,9 @@ notrace long system_call_exception(long r3, long r4, long r5,
>   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
>   		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
>   
> +	CT_WARN_ON(ct_state() == CONTEXT_KERNEL);
> +	user_exit_irqoff();
> +
>   	trace_hardirqs_off(); /* finish reconciling */
>   
>   	if (IS_ENABLED(CONFIG_PPC_BOOK3S))
> @@ -182,6 +187,8 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   	unsigned long ti_flags;
>   	unsigned long ret = 0;
>   
> +	CT_WARN_ON(ct_state() == CONTEXT_USER);
> +
>   	kuap_check_amr();
>   
>   	regs->result = r3;
> @@ -258,8 +265,11 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   		}
>   	}
>   
> +	user_enter_irqoff();
> +
>   	/* scv need not set RI=0 because SRRs are not used */
>   	if (unlikely(!prep_irq_for_enabled_exit(!scv))) {
> +		user_exit_irqoff();
>   		local_irq_enable();
>   		goto again;
>   	}
> 


* Re: [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C
  2021-01-13  7:32 ` [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C Nicholas Piggin
@ 2021-01-13 14:54   ` Christophe Leroy
  2021-01-14  3:51     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:54 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> There is no need for this to be in asm, use the new interrupt entry wrapper.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/interrupt.h | 15 +++++++++++----
>   arch/powerpc/kernel/exceptions-64s.S | 26 --------------------------
>   2 files changed, 11 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index 34d7cca2cb2e..6eba7c489753 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -14,11 +14,14 @@ struct interrupt_state {
>   
>   static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
>   {
> -#ifdef CONFIG_PPC_BOOK3E_64
> -	state->ctx_state = exception_enter();
> -#endif
> -

Can't the above stay at the top of the function?

> +	/*
> +	 * Book3E reconciles irq soft mask in asm
> +	 */
>   #ifdef CONFIG_PPC_BOOK3S_64
> +	if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
> +		trace_hardirqs_off();
> +	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> +
>   	if (user_mode(regs)) {
>   		CT_WARN_ON(ct_state() != CONTEXT_USER);
>   		user_exit_irqoff();
> @@ -31,6 +34,10 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>   			CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
>   	}
>   #endif
> +
> +#ifdef CONFIG_PPC_BOOK3E_64
> +	state->ctx_state = exception_enter();
> +#endif
>   }
>   
>   /*
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 8b0db807974c..df4ee073386b 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -139,7 +139,6 @@ name:
>   #define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
>   #define ISTACK		.L_ISTACK_\name\()	/* Set regular kernel stack */
>   #define __ISTACK(name)	.L_ISTACK_ ## name
> -#define IRECONCILE	.L_IRECONCILE_\name\()	/* Do RECONCILE_IRQ_STATE */
>   #define IKUAP		.L_IKUAP_\name\()	/* Do KUAP lock */
>   
>   #define INT_DEFINE_BEGIN(n)						\
> @@ -203,9 +202,6 @@ do_define_int n
>   	.ifndef ISTACK
>   		ISTACK=1
>   	.endif
> -	.ifndef IRECONCILE
> -		IRECONCILE=1
> -	.endif
>   	.ifndef IKUAP
>   		IKUAP=1
>   	.endif
> @@ -653,10 +649,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
>   	.if ISTACK
>   	ACCOUNT_STOLEN_TIME
>   	.endif
> -
> -	.if IRECONCILE
> -	RECONCILE_IRQ_STATE(r10, r11)
> -	.endif
>   .endm
>   
>   /*
> @@ -935,7 +927,6 @@ INT_DEFINE_BEGIN(system_reset)
>   	 */
>   	ISET_RI=0
>   	ISTACK=0
> -	IRECONCILE=0
>   	IKVM_REAL=1
>   INT_DEFINE_END(system_reset)
>   
> @@ -1123,7 +1114,6 @@ INT_DEFINE_BEGIN(machine_check_early)
>   	ISTACK=0
>   	IDAR=1
>   	IDSISR=1
> -	IRECONCILE=0
>   	IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
>   INT_DEFINE_END(machine_check_early)
>   
> @@ -1476,7 +1466,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   INT_DEFINE_BEGIN(data_access_slb)
>   	IVEC=0x380
>   	IAREA=PACA_EXSLB
> -	IRECONCILE=0
>   	IDAR=1
>   	IKVM_SKIP=1
>   	IKVM_REAL=1
> @@ -1503,7 +1492,6 @@ MMU_FTR_SECTION_ELSE
>   	li	r3,-EFAULT
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   	std	r3,RESULT(r1)
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_bad_slb_fault
>   	b	interrupt_return
> @@ -1565,7 +1553,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   INT_DEFINE_BEGIN(instruction_access_slb)
>   	IVEC=0x480
>   	IAREA=PACA_EXSLB
> -	IRECONCILE=0
>   	IISIDE=1
>   	IDAR=1
>   #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> @@ -1594,7 +1581,6 @@ MMU_FTR_SECTION_ELSE
>   	li	r3,-EFAULT
>   ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>   	std	r3,RESULT(r1)
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	do_bad_slb_fault
>   	b	interrupt_return
> @@ -1754,7 +1740,6 @@ EXC_COMMON_BEGIN(program_check_common)
>    */
>   INT_DEFINE_BEGIN(fp_unavailable)
>   	IVEC=0x800
> -	IRECONCILE=0
>   #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   	IKVM_REAL=1
>   #endif
> @@ -1769,7 +1754,6 @@ EXC_VIRT_END(fp_unavailable, 0x4800, 0x100)
>   EXC_COMMON_BEGIN(fp_unavailable_common)
>   	GEN_COMMON fp_unavailable
>   	bne	1f			/* if from user, just load it up */
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	kernel_fp_unavailable_exception
>   0:	trap
> @@ -1788,7 +1772,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
>   	b	fast_interrupt_return
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   2:	/* User process was in a transaction */
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	fp_unavailable_tm
>   	b	interrupt_return
> @@ -1853,7 +1836,6 @@ INT_DEFINE_BEGIN(hdecrementer)
>   	IVEC=0x980
>   	IHSRR=1
>   	ISTACK=0
> -	IRECONCILE=0
>   	IKVM_REAL=1
>   	IKVM_VIRT=1
>   INT_DEFINE_END(hdecrementer)
> @@ -2227,7 +2209,6 @@ INT_DEFINE_BEGIN(hmi_exception_early)
>   	IHSRR=1
>   	IREALMODE_COMMON=1
>   	ISTACK=0
> -	IRECONCILE=0
>   	IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
>   	IKVM_REAL=1
>   INT_DEFINE_END(hmi_exception_early)
> @@ -2401,7 +2382,6 @@ EXC_COMMON_BEGIN(performance_monitor_common)
>    */
>   INT_DEFINE_BEGIN(altivec_unavailable)
>   	IVEC=0xf20
> -	IRECONCILE=0
>   #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   	IKVM_REAL=1
>   #endif
> @@ -2431,7 +2411,6 @@ BEGIN_FTR_SECTION
>   	b	fast_interrupt_return
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   2:	/* User process was in a transaction */
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	altivec_unavailable_tm
>   	b	interrupt_return
> @@ -2439,7 +2418,6 @@ BEGIN_FTR_SECTION
>   1:
>   END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
>   #endif
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	altivec_unavailable_exception
>   	b	interrupt_return
> @@ -2455,7 +2433,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
>    */
>   INT_DEFINE_BEGIN(vsx_unavailable)
>   	IVEC=0xf40
> -	IRECONCILE=0
>   #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   	IKVM_REAL=1
>   #endif
> @@ -2484,7 +2461,6 @@ BEGIN_FTR_SECTION
>   	b	load_up_vsx
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   2:	/* User process was in a transaction */
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	vsx_unavailable_tm
>   	b	interrupt_return
> @@ -2492,7 +2468,6 @@ BEGIN_FTR_SECTION
>   1:
>   END_FTR_SECTION_IFSET(CPU_FTR_VSX)
>   #endif
> -	RECONCILE_IRQ_STATE(r10, r11)
>   	addi	r3,r1,STACK_FRAME_OVERHEAD
>   	bl	vsx_unavailable_exception
>   	b	interrupt_return
> @@ -2827,7 +2802,6 @@ EXC_VIRT_NONE(0x5800, 0x100)
>   INT_DEFINE_BEGIN(soft_nmi)
>   	IVEC=0x900
>   	ISTACK=0
> -	IRECONCILE=0	/* Soft-NMI may fire under local_irq_disable */
>   INT_DEFINE_END(soft_nmi)
>   
>   /*
> 


* Re: [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function
  2021-01-13  7:32 ` [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function Nicholas Piggin
@ 2021-01-13 14:59   ` Christophe Leroy
  2021-01-14  3:51     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 14:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> This will be used by interrupt entry as well.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/cputime.h | 15 +++++++++++++++
>   arch/powerpc/kernel/syscall_64.c   | 10 +---------
>   2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h
> index ed75d1c318e3..3f61604e1fcf 100644
> --- a/arch/powerpc/include/asm/cputime.h
> +++ b/arch/powerpc/include/asm/cputime.h
> @@ -87,6 +87,18 @@ static notrace inline void account_cpu_user_exit(void)
>   	acct->starttime_user = tb;
>   }
>   
> +static notrace inline void account_stolen_time(void)
> +{
> +#ifdef CONFIG_PPC_SPLPAR
> +	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&

Aren't you already inside a CONFIG_VIRT_CPU_ACCOUNTING_NATIVE section? (The
hunk lands in the #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE branch, see the
#else further down, so the IS_ENABLED() test is always true here.)

> +	    firmware_has_feature(FW_FEATURE_SPLPAR)) {
> +		struct lppaca *lp = local_paca->lppaca_ptr;
> +
> +		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
> +			accumulate_stolen_time();
> +	}
> +#endif
> +}
>   
>   #endif /* __KERNEL__ */
>   #else /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
> @@ -96,5 +108,8 @@ static inline void account_cpu_user_entry(void)
>   static inline void account_cpu_user_exit(void)
>   {
>   }
> +static notrace inline void account_stolen_time(void)
> +{
> +}
>   #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>   #endif /* __POWERPC_CPUTIME_H */
> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
> index 42f0ad4b2fbb..32f72965da26 100644
> --- a/arch/powerpc/kernel/syscall_64.c
> +++ b/arch/powerpc/kernel/syscall_64.c
> @@ -69,15 +69,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
>   
>   	account_cpu_user_entry();
>   
> -#ifdef CONFIG_PPC_SPLPAR
> -	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
> -	    firmware_has_feature(FW_FEATURE_SPLPAR)) {
> -		struct lppaca *lp = local_paca->lppaca_ptr;
> -
> -		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
> -			accumulate_stolen_time();
> -	}
> -#endif
> +	account_stolen_time();
>   
>   	/*
>   	 * This is not required for the syscall exit path, but makes the
> 


* Re: [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C
  2021-01-13  7:32 ` [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C Nicholas Piggin
@ 2021-01-13 15:05   ` Christophe Leroy
  2021-01-14  3:58     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 15:05 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> There is no need for this to be in asm, use the new interrupt entry wrapper.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/interrupt.h |  7 +++++++
>   arch/powerpc/include/asm/ppc_asm.h   | 24 ------------------------
>   arch/powerpc/kernel/exceptions-64e.S |  1 -
>   arch/powerpc/kernel/exceptions-64s.S |  5 -----
>   4 files changed, 7 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index 6eba7c489753..e278dffe7657 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -4,6 +4,7 @@
>   
>   #include <linux/context_tracking.h>
>   #include <linux/hardirq.h>
> +#include <asm/cputime.h>
>   #include <asm/ftrace.h>
>   
>   struct interrupt_state {
> @@ -25,6 +26,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>   	if (user_mode(regs)) {
>   		CT_WARN_ON(ct_state() != CONTEXT_USER);
>   		user_exit_irqoff();
> +
> +		account_cpu_user_entry();

Are interrupts still disabled here? Otherwise you risk getting IRQ time accounted to user time.

> +		account_stolen_time();
>   	} else {
>   		/*
>   		 * CT_WARN_ON comes here via program_check_exception,
> @@ -38,6 +42,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>   #ifdef CONFIG_PPC_BOOK3E_64
>   	state->ctx_state = exception_enter();
>   #endif
> +
> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) && user_mode(regs))
> +		account_cpu_user_entry();

Isn't this interrupt_enter_prepare() function also called on PPC32?
Have you removed the ACCOUNT_CPU_USER_ENTRY() from entry_32.S?

>   }
>   
>   /*
> diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
> index cc1bca571332..3dceb64fc9af 100644
> --- a/arch/powerpc/include/asm/ppc_asm.h
> +++ b/arch/powerpc/include/asm/ppc_asm.h
> @@ -25,7 +25,6 @@
>   #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>   #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)
>   #define ACCOUNT_CPU_USER_EXIT(ptr, ra, rb)
> -#define ACCOUNT_STOLEN_TIME
>   #else
>   #define ACCOUNT_CPU_USER_ENTRY(ptr, ra, rb)				\
>   	MFTB(ra);			/* get timebase */		\
> @@ -44,29 +43,6 @@
>   	PPC_LL	ra, ACCOUNT_SYSTEM_TIME(ptr);				\
>   	add	ra,ra,rb;		/* add on to system time */	\
>   	PPC_STL	ra, ACCOUNT_SYSTEM_TIME(ptr)
> -
> -#ifdef CONFIG_PPC_SPLPAR
> -#define ACCOUNT_STOLEN_TIME						\
> -BEGIN_FW_FTR_SECTION;							\
> -	beq	33f;							\
> -	/* from user - see if there are any DTL entries to process */	\
> -	ld	r10,PACALPPACAPTR(r13);	/* get ptr to VPA */		\
> -	ld	r11,PACA_DTL_RIDX(r13);	/* get log read index */	\
> -	addi	r10,r10,LPPACA_DTLIDX;					\
> -	LDX_BE	r10,0,r10;		/* get log write index */	\
> -	cmpd	cr1,r11,r10;						\
> -	beq+	cr1,33f;						\
> -	bl	accumulate_stolen_time;				\
> -	ld	r12,_MSR(r1);						\
> -	andi.	r10,r12,MSR_PR;		/* Restore cr0 (coming from user) */ \
> -33:									\
> -END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
> -
> -#else  /* CONFIG_PPC_SPLPAR */
> -#define ACCOUNT_STOLEN_TIME
> -
> -#endif /* CONFIG_PPC_SPLPAR */
> -
>   #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>   
>   /*
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 52421042a020..87b3e74ded41 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -398,7 +398,6 @@ exc_##n##_common:							    \
>   	std	r10,_NIP(r1);		/* save SRR0 to stackframe */	    \
>   	std	r11,_MSR(r1);		/* save SRR1 to stackframe */	    \
>   	beq	2f;			/* if from kernel mode */	    \
> -	ACCOUNT_CPU_USER_ENTRY(r13,r10,r11);/* accounting (uses cr0+eq) */  \
>   2:	ld	r3,excf+EX_R10(r13);	/* get back r10 */		    \
>   	ld	r4,excf+EX_R11(r13);	/* get back r11 */		    \
>   	mfspr	r5,scratch;		/* get back r13 */		    \
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index df4ee073386b..68505e35bcf7 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -577,7 +577,6 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
>   	kuap_save_amr_and_lock r9, r10, cr1, cr0
>   	.endif
>   	beq	101f			/* if from kernel mode		*/
> -	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10)
>   BEGIN_FTR_SECTION
>   	ld	r9,IAREA+EX_PPR(r13)	/* Read PPR from paca		*/
>   	std	r9,_PPR(r1)
> @@ -645,10 +644,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
>   	ld	r11,exception_marker@toc(r2)
>   	std	r10,RESULT(r1)		/* clear regs->result		*/
>   	std	r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame	*/
> -
> -	.if ISTACK
> -	ACCOUNT_STOLEN_TIME
> -	.endif
>   .endm
>   
>   /*
> 


* Re: [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper
  2021-01-13  7:32 ` [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper Nicholas Piggin
@ 2021-01-13 15:13   ` Christophe Leroy
  2021-01-14  4:00     ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-13 15:13 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



On 13/01/2021 at 08:32, Nicholas Piggin wrote:
> This moves the common NMI entry and exit code into the interrupt handler
> wrappers.
> 
> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
> also MCE interrupts on 64e, by adding missing parts of the NMI entry to
> them.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/interrupt.h | 24 ++++++++++++++++
>   arch/powerpc/kernel/mce.c            | 11 --------
>   arch/powerpc/kernel/traps.c          | 42 +++++-----------------------
>   arch/powerpc/kernel/watchdog.c       | 10 +++----
>   4 files changed, 35 insertions(+), 52 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index e278dffe7657..01192e213f9a 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -95,14 +95,38 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct int
>   }
>   
>   struct interrupt_nmi_state {
> +#ifdef CONFIG_PPC64
> +	u8 ftrace_enabled;
> +#endif
>   };
>   
>   static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>   {
> +#ifdef CONFIG_PPC64
> +	state->ftrace_enabled = this_cpu_get_ftrace_enabled();
> +	this_cpu_set_ftrace_enabled(0);
> +#endif
> +
> +	/*
> +	 * Do not use nmi_enter() for pseries hash guest taking a real-mode
> +	 * NMI because not everything it touches is within the RMA limit.
> +	 */
> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
> +			!firmware_has_feature(FW_FEATURE_LPAR) ||
> +			radix_enabled() || (mfmsr() & MSR_DR))
> +		nmi_enter();
>   }
>   
>   static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>   {
> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
> +			!firmware_has_feature(FW_FEATURE_LPAR) ||
> +			radix_enabled() || (mfmsr() & MSR_DR))
> +		nmi_exit();
> +
> +#ifdef CONFIG_PPC64
> +	this_cpu_set_ftrace_enabled(state->ftrace_enabled);
> +#endif
>   }
>   
>   /**
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 54269947113d..51456217ec40 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -592,12 +592,6 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
>   DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>   {
>   	long handled = 0;
> -	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
> -
> -	this_cpu_set_ftrace_enabled(0);
> -	/* Do not use nmi_enter/exit for pseries hpte guest */
> -	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
> -		nmi_enter();
>   
>   	hv_nmi_check_nonrecoverable(regs);
>   
> @@ -607,11 +601,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>   	if (ppc_md.machine_check_early)
>   		handled = ppc_md.machine_check_early(regs);
>   
> -	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
> -		nmi_exit();
> -
> -	this_cpu_set_ftrace_enabled(ftrace_enabled);
> -
>   	return handled;
>   }
>   
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index b4f23e871a68..43d23232ef5c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -435,11 +435,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>   {
>   	unsigned long hsrr0, hsrr1;
>   	bool saved_hsrrs = false;
> -	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
> -
> -	this_cpu_set_ftrace_enabled(0);
> -
> -	nmi_enter();
>   
>   	/*
>   	 * System reset can interrupt code where HSRRs are live and MSR[RI]=1.
> @@ -511,10 +506,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>   		mtspr(SPRN_HSRR1, hsrr1);
>   	}
>   
> -	nmi_exit();
> -
> -	this_cpu_set_ftrace_enabled(ftrace_enabled);
> -
>   	/* What should we do here? We could issue a shutdown or hard reset. */
>   
>   	return 0;
> @@ -792,6 +783,12 @@ int machine_check_generic(struct pt_regs *regs)
>   #endif /* everything else */
>   
>   
> +/*
> + * BOOK3S_64 does not call this handler as a non-maskable interrupt
> + * (it uses its own early real-mode handler to handle the MCE proper
> + * and then raises irq_work to call this handler when interrupts are
> + * enabled).
> + */
>   #ifdef CONFIG_PPC_BOOK3S_64
>   DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
>   #else
> @@ -800,20 +797,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
>   {
>   	int recover = 0;
>   
> -	/*
> -	 * BOOK3S_64 does not call this handler as a non-maskable interrupt
> -	 * (it uses its own early real-mode handler to handle the MCE proper
> -	 * and then raises irq_work to call this handler when interrupts are
> -	 * enabled).
> -	 *
> -	 * This is silly. The BOOK3S_64 should just call a different function
> -	 * rather than expecting semantics to magically change. Something
> -	 * like 'non_nmi_machine_check_exception()', perhaps?
> -	 */
> -	const bool nmi = !IS_ENABLED(CONFIG_PPC_BOOK3S_64);
> -
> -	if (nmi) nmi_enter();
> -
>   	__this_cpu_inc(irq_stat.mce_exceptions);
>   
>   	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
> @@ -838,24 +821,17 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
>   	if (check_io_access(regs))
>   		goto bail;
>   
> -	if (nmi) nmi_exit();
> -

IIRC, not doing the nmi_exit() before the die() is problematic.

See 
https://github.com/linuxppc/linux/commit/daf00ae71dad8aa05965713c62558aeebf2df48e#diff-70077148c383252ca949063eaf1b0250620e4607b43f4ef3fd2d8f448a83ab0a
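To spell out the concern: with nmi_exit() moved into the wrapper, the fatal
path roughly becomes the following (a flow sketch, not the literal code):

	interrupt_nmi_enter_prepare(regs, &state);	/* does nmi_enter() */
	machine_check_exception(regs);			/* die() may not return, */
	interrupt_nmi_exit_prepare(regs, &state);	/* so nmi_exit() is skipped */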

>   	die("Machine check", regs, SIGBUS);
>   
>   	/* Must die if the interrupt is not recoverable */
>   	if (!(regs->msr & MSR_RI))
>   		die("Unrecoverable Machine check", regs, SIGBUS);
>   
> -#ifdef CONFIG_PPC_BOOK3S_64
>   bail:
> +#ifdef CONFIG_PPC_BOOK3S_64
>   	return;
>   #else
>   	return 0;
> -
> -bail:
> -	if (nmi) nmi_exit();
> -
> -	return 0;
>   #endif
>   }
>   NOKPROBE_SYMBOL(machine_check_exception);
> @@ -1873,14 +1849,10 @@ DEFINE_INTERRUPT_HANDLER(vsx_unavailable_tm)
>   #ifdef CONFIG_PPC64
>   DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
>   {
> -	nmi_enter();
> -
>   	__this_cpu_inc(irq_stat.pmu_irqs);
>   
>   	perf_irq(regs);
>   
> -	nmi_exit();
> -
>   	return 0;
>   }
>   #endif
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index 824b9376ac35..dc39534836a3 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -254,11 +254,12 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   	int cpu = raw_smp_processor_id();
>   	u64 tb;
>   
> +	/* should only arrive from kernel, with irqs disabled */
> +	WARN_ON_ONCE(!arch_irq_disabled_regs(regs));
> +
>   	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
>   		return 0;
>   
> -	nmi_enter();
> -
>   	__this_cpu_inc(irq_stat.soft_nmi_irqs);
>   
>   	tb = get_tb();
> @@ -266,7 +267,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   		wd_smp_lock(&flags);
>   		if (cpumask_test_cpu(cpu, &wd_smp_cpus_stuck)) {
>   			wd_smp_unlock(&flags);
> -			goto out;
> +			return 0;
>   		}
>   		set_cpu_stuck(cpu, tb);
>   
> @@ -290,9 +291,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>   	if (wd_panic_timeout_tb < 0x7fffffff)
>   		mtspr(SPRN_DEC, wd_panic_timeout_tb);
>   
> -out:
> -	nmi_exit();
> -
>   	return 0;
>   }
>   
> 


* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-13 14:12   ` Christophe Leroy
@ 2021-01-14  3:24     ` Nicholas Piggin
  2021-01-14 12:09       ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:24 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:12 am:
> 
> 
> On 13/01/2021 at 08:31, Nicholas Piggin wrote:
>> The page fault handling still has some complex logic particularly around
>> hash table handling, in asm. Implement this in C instead.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>>   arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>>   arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>>   arch/powerpc/mm/fault.c                       |  46 ++++--
>>   4 files changed, 107 insertions(+), 148 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> index 066b1d34c7bc..60a669379aa0 100644
>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>>   #define HPTE_NOHPTE_UPDATE	0x2
>>   #define HPTE_USE_KERNEL_KEY	0x4
>>   
>> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>>   extern int __hash_page_4K(unsigned long ea, unsigned long access,
>>   			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>>   			  unsigned long flags, int ssize, int subpage_prot);
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 6e53f7638737..bcb5e81d2088 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>    *
>>    * Handling:
>>    * - Hash MMU
>> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
>> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
>> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
>> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>>    *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>>    *   "non-bolted" regions, e.g., vmalloc space. However these should always be
>> - *   backed by Linux page tables.
>> + *   backed by Linux page table entries.
>>    *
>> - *   If none is found, do a Linux page fault. Linux page faults can happen in
>> - *   kernel mode due to user copy operations of course.
>> + *   If no entry is found the Linux page fault handler is invoked (by
>> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
>> + *   copy operations of course.
>>    *
>>    *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>>    *   MMU context, which may cause a DSI in the host, which must go to the
>> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>>   	GEN_COMMON data_access
>>   	ld	r4,_DAR(r1)
>>   	ld	r5,_DSISR(r1)
> 
> We have DSISR here. I think the dispatch between page fault and do_break() should be done here:
> - It would be more similar to other arches

Other sub-archs?

> - Would avoid doing it also in instruction fault

True, but it's hidden under an unlikely branch so it won't really help
the instruction fault.

> - Would avoid that -1 return which looks more like a hack.

I don't really see it as a hack: we return a code to the asm caller to
direct whether to restore registers or not; we already have this
pattern.

(I'm hoping all that might go away one day by controlling NV
regs from C if we can get good code generation, but even if not we
still have it in the interrupt returns.)
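Roughly, the shape of that pattern is (a sketch only, not the actual code):

	/*
	 * The C handler returns a code in r3; the asm caller tests it and
	 * only restores the non-volatile GPRs and dispatches do_break()
	 * when it gets -1 back.
	 */
	long do_page_fault(struct pt_regs *regs)
	{
		if (unlikely(regs->dsisr & DSISR_DABRMATCH))
			return -1;	/* asm: restore NVGPRs, call do_break() */
		/* ... normal Linux page fault handling ... */
		return 0;
	}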

That said I will give it a try here. At very least it might be a
better intermediate step.

[snip]

>> +#ifdef CONFIG_PPC_BOOK3S_64
> 
> Seems like you are re-implementing handle_page_fault() inside do_page_fault(). Wouldn't it be 
> possible to keep do_page_fault() as is for the moment and implement a C version of handle_page_fault()?

The test goes in a better place (existing unlikely branch) if we do it 
in do_page_fault.

> Or just keep it in assembly? It is not that big; keeping it in assembly would keep things more 
> common with PPC32, and would still allow saving NV GPRs only when needed.

I think it's better to go the other way and move more of the other archs 
to C (in general that is, but for this patch as I said I will try the DABR
test in asm).

Thanks,
Nick


* Re: [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs
  2021-01-13 14:25   ` Christophe Leroy
@ 2021-01-14  3:26     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:26 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:25 am:
> 
> 
> On 13/01/2021 at 08:31, Nicholas Piggin wrote:
>> Similar to the previous patch this makes interrupt handler function
>> types more regular so they can be wrapped with the next patch.
>> 
>> bad_page_fault and do_break are not performance critical.
> 
> It's a bit different between do_break() and bad_page_fault():
> - do_break() is not performance critical for sure
> - bad_page_fault(): it doesn't matter, because bad_page_fault() was not using the address param, so 
> it doesn't get anything from regs in the end.
> 
> Maybe it would be worth splitting in two patches, one for bad_page_fault() and one for do_break()

Okay I'll try it.

Thanks,
Nick


* Re: [PATCH v5 06/21] powerpc: interrupt handler wrapper functions
  2021-01-13 14:45   ` Christophe Leroy
@ 2021-01-14  3:41     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:41 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:45 am:
> 
> 
> On 13/01/2021 at 08:32, Nicholas Piggin wrote:
>> Add wrapper functions (derived from x86 macros) for interrupt handler
>> functions. This allows interrupt entry code to be written in C.
> 
> Looks like you are doing more than just that in this patch. Would be worth splitting into several 
> patches, I think.
> 
> I'd suggest:
> - Other patches for unrelated changes, see below for details
> - One patch that brings the wrapper macros
> - One patch that uses those macros
> 
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/asm-prototypes.h     |  29 ---
>>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 -
>>   arch/powerpc/include/asm/hw_irq.h             |   9 -
>>   arch/powerpc/include/asm/interrupt.h          | 218 ++++++++++++++++++
>>   arch/powerpc/include/asm/time.h               |   2 +
>>   arch/powerpc/kernel/dbell.c                   |  12 +-
>>   arch/powerpc/kernel/exceptions-64s.S          |   7 +-
>>   arch/powerpc/kernel/head_book3s_32.S          |   6 +-
>>   arch/powerpc/kernel/irq.c                     |   3 +-
>>   arch/powerpc/kernel/mce.c                     |   5 +-
>>   arch/powerpc/kernel/syscall_64.c              |   1 +
>>   arch/powerpc/kernel/tau_6xx.c                 |   2 +-
>>   arch/powerpc/kernel/time.c                    |   3 +-
>>   arch/powerpc/kernel/traps.c                   |  90 +++++---
>>   arch/powerpc/kernel/watchdog.c                |   7 +-
>>   arch/powerpc/kvm/book3s_hv.c                  |   1 +
>>   arch/powerpc/kvm/book3s_hv_builtin.c          |   1 +
>>   arch/powerpc/kvm/booke.c                      |   1 +
>>   arch/powerpc/mm/book3s64/hash_utils.c         |  57 +++--
>>   arch/powerpc/mm/book3s64/slb.c                |  29 +--
>>   arch/powerpc/mm/fault.c                       |  15 +-
>>   arch/powerpc/platforms/powernv/idle.c         |   1 +
>>   22 files changed, 374 insertions(+), 126 deletions(-)
>>   create mode 100644 arch/powerpc/include/asm/interrupt.h
>> 
> 
> ...
> 
>> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
>> new file mode 100644
>> index 000000000000..60363e5eeffa
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/interrupt.h
> 
> ...
> 
>> +/* Interrupt handlers */
>> +DECLARE_INTERRUPT_HANDLER_NMI(machine_check_early);
>> +DECLARE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode);
>> +DECLARE_INTERRUPT_HANDLER(SMIException);
>> +DECLARE_INTERRUPT_HANDLER(handle_hmi_exception);
>> +DECLARE_INTERRUPT_HANDLER(instruction_breakpoint_exception);
>> +DECLARE_INTERRUPT_HANDLER(RunModeException);
>> +DECLARE_INTERRUPT_HANDLER(single_step_exception);
>> +DECLARE_INTERRUPT_HANDLER(program_check_exception);
>> +DECLARE_INTERRUPT_HANDLER(alignment_exception);
>> +DECLARE_INTERRUPT_HANDLER(StackOverflow);
>> +DECLARE_INTERRUPT_HANDLER(stack_overflow_exception);
>> +DECLARE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception);
>> +DECLARE_INTERRUPT_HANDLER(altivec_unavailable_exception);
>> +DECLARE_INTERRUPT_HANDLER(vsx_unavailable_exception);
>> +DECLARE_INTERRUPT_HANDLER(fp_unavailable_tm);
>> +DECLARE_INTERRUPT_HANDLER(altivec_unavailable_tm);
>> +DECLARE_INTERRUPT_HANDLER(vsx_unavailable_tm);
>> +DECLARE_INTERRUPT_HANDLER(facility_unavailable_exception);
>> +DECLARE_INTERRUPT_HANDLER_ASYNC(TAUException);
>> +DECLARE_INTERRUPT_HANDLER(altivec_assist_exception);
>> +DECLARE_INTERRUPT_HANDLER(unrecoverable_exception);
>> +DECLARE_INTERRUPT_HANDLER(kernel_bad_stack);
>> +DECLARE_INTERRUPT_HANDLER_NMI(system_reset_exception);
>> +#ifdef CONFIG_PPC_BOOK3S_64
>> +DECLARE_INTERRUPT_HANDLER_ASYNC(machine_check_exception);
>> +#else
>> +DECLARE_INTERRUPT_HANDLER_NMI(machine_check_exception);
>> +#endif
>> +DECLARE_INTERRUPT_HANDLER(emulation_assist_interrupt);
>> +DECLARE_INTERRUPT_HANDLER_RAW(do_slb_fault);
>> +DECLARE_INTERRUPT_HANDLER(do_bad_slb_fault);
>> +DECLARE_INTERRUPT_HANDLER_RAW(do_hash_fault);
>> +DECLARE_INTERRUPT_HANDLER_RET(do_page_fault);
>> +DECLARE_INTERRUPT_HANDLER(__do_bad_page_fault);
>> +DECLARE_INTERRUPT_HANDLER(do_bad_page_fault);
> 
> Missing DECLARE_INTERRUPT_HANDLER(do_break)
> 
>> +
>> +DECLARE_INTERRUPT_HANDLER_ASYNC(timer_interrupt);
>> +DECLARE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi);
>> +DECLARE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async);
>> +DECLARE_INTERRUPT_HANDLER_RAW(performance_monitor_exception);
>> +DECLARE_INTERRUPT_HANDLER(WatchdogException);
>> +DECLARE_INTERRUPT_HANDLER(unknown_exception);
>> +DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
>> +
>> +void replay_system_reset(void);
>> +void replay_soft_interrupts(void);
>> +
>> +#endif /* _ASM_POWERPC_INTERRUPT_H */
>> diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
>> index 8f789b597bae..8dd3cdb25338 100644
>> --- a/arch/powerpc/include/asm/time.h
>> +++ b/arch/powerpc/include/asm/time.h
>> @@ -102,6 +102,8 @@ DECLARE_PER_CPU(u64, decrementers_next_tb);
>>   /* Convert timebase ticks to nanoseconds */
>>   unsigned long long tb_to_ns(unsigned long long tb_ticks);
>>   
>> +void timer_broadcast_interrupt(void);
> 
> This seems unrelated. I think a separate patch would be better for moving prototypes without making 
> them wrappers.

Yeah this might have just slipped in, thanks.

>> +
>>   /* SPLPAR */
>>   void accumulate_stolen_time(void);
>>   
>> diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
>> index 52680cf07c9d..c0f99f8ffa7d 100644
>> --- a/arch/powerpc/kernel/dbell.c
>> +++ b/arch/powerpc/kernel/dbell.c
>> @@ -12,14 +12,14 @@
>>   #include <linux/hardirq.h>
>>   
>>   #include <asm/dbell.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/irq_regs.h>
>>   #include <asm/kvm_ppc.h>
>>   #include <asm/trace.h>
>>   
>> -#ifdef CONFIG_SMP
>> -
> 
> This seems unrelated; is that needed? What's the problem with having two full versions of 
> doorbell_exception()?

No real problem I think, it might have been from some earlier work in 
progress. I can adjust it.

> 
>> -void doorbell_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
>>   {
>> +#ifdef CONFIG_SMP
>>   	struct pt_regs *old_regs = set_irq_regs(regs);
>>   
>>   	irq_enter();
>> @@ -37,11 +37,7 @@ void doorbell_exception(struct pt_regs *regs)
>>   	trace_doorbell_exit(regs);
>>   	irq_exit();
>>   	set_irq_regs(old_regs);
>> -}
>>   #else /* CONFIG_SMP */
>> -void doorbell_exception(struct pt_regs *regs)
>> -{
>>   	printk(KERN_WARNING "Received doorbell on non-smp system\n");
>> -}
>>   #endif /* CONFIG_SMP */
>> -
>> +}
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 36dea2020ec5..8b0db807974c 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -1923,7 +1923,7 @@ EXC_COMMON_BEGIN(doorbell_super_common)
>>   #ifdef CONFIG_PPC_DOORBELL
>>   	bl	doorbell_exception
>>   #else
>> -	bl	unknown_exception
>> +	bl	unknown_async_exception
> 
> Unrelated to wrappers?

Well there's now a difference between sync and async exceptions, but it 
could be introduced in a different patch.
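The split shows up in the declarations added by this patch, e.g.:

	DECLARE_INTERRUPT_HANDLER(unknown_exception);		/* synchronous */
	DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception); /* asynchronous */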

>>   #endif
>>   	b	interrupt_return
>>   
>> @@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
>>   	GEN_COMMON h_data_storage
>>   	addi    r3,r1,STACK_FRAME_OVERHEAD
>>   BEGIN_MMU_FTR_SECTION
>> -	li	r4,SIGSEGV
>> -	bl      bad_page_fault
>> +	bl      do_bad_page_fault
> 
> Is this name change related?

The do_ variant is a "handler", the other can be called from C.
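(As the fault.c hunk earlier in the series shows, the do_ variant is just a
thin wrapper:)

	DEFINE_INTERRUPT_HANDLER(do_bad_page_fault)
	{
		bad_page_fault(regs, SIGSEGV);
	}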

>>   MMU_FTR_SECTION_ELSE
>>   	bl      unknown_exception
>>   ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
>> @@ -2310,7 +2309,7 @@ EXC_COMMON_BEGIN(h_doorbell_common)
>>   #ifdef CONFIG_PPC_DOORBELL
>>   	bl	doorbell_exception
>>   #else
>> -	bl	unknown_exception
>> +	bl	unknown_async_exception
>>   #endif
>>   	b	interrupt_return
>>   
>> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
>> index 94ad1372c490..9b4d5432e2db 100644
>> --- a/arch/powerpc/kernel/head_book3s_32.S
>> +++ b/arch/powerpc/kernel/head_book3s_32.S
>> @@ -238,8 +238,8 @@ __secondary_hold_acknowledge:
>>   
>>   /* System reset */
>>   /* core99 pmac starts the seconary here by changing the vector, and
>> -   putting it back to what it was (unknown_exception) when done.  */
>> -	EXCEPTION(0x100, Reset, unknown_exception, EXC_XFER_STD)
>> +   putting it back to what it was (unknown_async_exception) when done.  */
>> +	EXCEPTION(0x100, Reset, unknown_async_exception, EXC_XFER_STD)
>>   
>>   /* Machine check */
>>   /*
>> @@ -631,7 +631,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_NEED_DTLB_SW_LRU)
>>   #endif
>>   
>>   #ifndef CONFIG_TAU_INT
>> -#define TAUException	unknown_exception
>> +#define TAUException	unknown_async_exception
>>   #endif
>>   
>>   	EXCEPTION(0x1300, Trap_13, instruction_breakpoint_exception, EXC_XFER_STD)
>> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
>> index 6b1eca53e36c..2055d204d08e 100644
>> --- a/arch/powerpc/kernel/irq.c
>> +++ b/arch/powerpc/kernel/irq.c
>> @@ -54,6 +54,7 @@
>>   #include <linux/pgtable.h>
>>   
>>   #include <linux/uaccess.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/io.h>
>>   #include <asm/irq.h>
>>   #include <asm/cache.h>
>> @@ -665,7 +666,7 @@ void __do_irq(struct pt_regs *regs)
>>   	irq_exit();
>>   }
>>   
>> -void do_IRQ(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(do_IRQ)
>>   {
>>   	struct pt_regs *old_regs = set_irq_regs(regs);
>>   	void *cursp, *irqsp, *sirqsp;
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index 9f3e133b57b7..54269947113d 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -18,6 +18,7 @@
>>   #include <linux/extable.h>
>>   #include <linux/ftrace.h>
>>   
>> +#include <asm/interrupt.h>
>>   #include <asm/machdep.h>
>>   #include <asm/mce.h>
>>   #include <asm/nmi.h>
>> @@ -588,7 +589,7 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
>>    *
>>    * regs->nip and regs->msr contains srr0 and ssr1.
>>    */
>> -long notrace machine_check_early(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>>   {
>>   	long handled = 0;
>>   	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
>> @@ -722,7 +723,7 @@ long hmi_handle_debugtrig(struct pt_regs *regs)
>>   /*
>>    * Return values:
>>    */
>> -long hmi_exception_realmode(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_NMI(hmi_exception_realmode)
>>   {	
>>   	int ret;
>>   
>> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
>> index 7c85ed04a164..dd87b2118620 100644
>> --- a/arch/powerpc/kernel/syscall_64.c
>> +++ b/arch/powerpc/kernel/syscall_64.c
>> @@ -5,6 +5,7 @@
>>   #include <asm/kup.h>
>>   #include <asm/cputime.h>
>>   #include <asm/hw_irq.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/kprobes.h>
>>   #include <asm/paca.h>
>>   #include <asm/ptrace.h>
>> diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
>> index 0b4694b8d248..46b2e5de4ef5 100644
>> --- a/arch/powerpc/kernel/tau_6xx.c
>> +++ b/arch/powerpc/kernel/tau_6xx.c
>> @@ -100,7 +100,7 @@ static void TAUupdate(int cpu)
>>    * with interrupts disabled
>>    */
>>   
>> -void TAUException(struct pt_regs * regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(TAUException)
>>   {
>>   	int cpu = smp_processor_id();
>>   
>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>> index 67feb3524460..435a251247ed 100644
>> --- a/arch/powerpc/kernel/time.c
>> +++ b/arch/powerpc/kernel/time.c
>> @@ -56,6 +56,7 @@
>>   #include <linux/processor.h>
>>   #include <asm/trace.h>
>>   
>> +#include <asm/interrupt.h>
>>   #include <asm/io.h>
>>   #include <asm/nvram.h>
>>   #include <asm/cache.h>
>> @@ -570,7 +571,7 @@ void arch_irq_work_raise(void)
>>    * timer_interrupt - gets called when the decrementer overflows,
>>    * with interrupts disabled.
>>    */
>> -void timer_interrupt(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
>>   {
>>   	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
>>   	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>> index 9b5298c016c7..f4462b481248 100644
>> --- a/arch/powerpc/kernel/traps.c
>> +++ b/arch/powerpc/kernel/traps.c
>> @@ -41,6 +41,7 @@
>>   #include <asm/emulated_ops.h>
>>   #include <linux/uaccess.h>
>>   #include <asm/debugfs.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/io.h>
>>   #include <asm/machdep.h>
>>   #include <asm/rtas.h>
>> @@ -430,8 +431,7 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
>>   	regs->msr &= ~MSR_RI;
>>   #endif
>>   }
>> -
>> -void system_reset_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>>   {
>>   	unsigned long hsrr0, hsrr1;
>>   	bool saved_hsrrs = false;
>> @@ -516,7 +516,10 @@ void system_reset_exception(struct pt_regs *regs)
>>   	this_cpu_set_ftrace_enabled(ftrace_enabled);
>>   
>>   	/* What should we do here? We could issue a shutdown or hard reset. */
>> +
>> +	return 0;
>>   }
>> +NOKPROBE_SYMBOL(system_reset_exception);
> 
> Is this NOKPROBE_SYMBOL() related to wrappers or just a bug fix?

Hmm, I don't remember off the top of my head now; maybe a bug fix, but
I don't know why we haven't seen it already. I thought I had tested MCE
with tracing, but maybe not carefully enough.

I will move them to another patch.

> 
>>   
>>   /*
>>    * I/O accesses can cause machine checks on powermacs.
>> @@ -788,7 +791,12 @@ int machine_check_generic(struct pt_regs *regs)
>>   }
>>   #endif /* everything else */
>>   
>> -void machine_check_exception(struct pt_regs *regs)
>> +
>> +#ifdef CONFIG_PPC_BOOK3S_64
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
>> +#else
>> +DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
>> +#endif
>>   {
>>   	int recover = 0;
>>   
>> @@ -838,13 +846,21 @@ void machine_check_exception(struct pt_regs *regs)
>>   	if (!(regs->msr & MSR_RI))
>>   		die("Unrecoverable Machine check", regs, SIGBUS);
>>   
>> +#ifdef CONFIG_PPC_BOOK3S_64
>> +bail:
>>   	return;
>> +#else
>> +	return 0;
>>   
>>   bail:
>>   	if (nmi) nmi_exit();
>> +
>> +	return 0;
>> +#endif
> 
> Looks fishy. Can't we have both returning either long or void?

All handlers could return long, then you would have lots of pointless
`return 0` and `li r3,0`. HMI exception NMI wants to return something.

Arguably HMI should handle local_irq_disable with irq_work, like MCE
does. I have noted that in the HMI interrupt comment and will get to
it eventually, which should unify this.
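Roughly, the two wrapper flavours give the handler different return types,
which is where the asymmetry comes from (an interface sketch, not the full
macro bodies):

	DECLARE_INTERRUPT_HANDLER(func);	/* void func(struct pt_regs *regs) */
	DECLARE_INTERRUPT_HANDLER_NMI(func);	/* long func(struct pt_regs *regs) */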

> 
>>   }
>> +NOKPROBE_SYMBOL(machine_check_exception);
> 
> Is this NOKPROBE_SYMBOL() related to wrappers or just a bug fix?
> 
>>   
>> -void SMIException(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(SMIException) /* async? */
>>   {
>>   	die("System Management Interrupt", regs, SIGABRT);
>>   }
>> @@ -1030,7 +1046,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
>>   }
>>   #endif /* CONFIG_VSX */
>>   
>> -void handle_hmi_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(handle_hmi_exception)
>>   {
>>   	struct pt_regs *old_regs;
>>   
>> @@ -1059,7 +1075,7 @@ void handle_hmi_exception(struct pt_regs *regs)
>>   	set_irq_regs(old_regs);
>>   }
>>   
>> -void unknown_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(unknown_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1071,7 +1087,19 @@ void unknown_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void instruction_breakpoint_exception(struct pt_regs *regs)
> 
> Shouldn't unknown_async_exception() be added in a preceding patch?

+1

>> +DEFINE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception)
>> +{
>> +	enum ctx_state prev_state = exception_enter();
>> +
>> +	printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
>> +	       regs->nip, regs->msr, regs->trap);
>> +
>> +	_exception(SIGTRAP, regs, TRAP_UNK, 0);
>> +
>> +	exception_exit(prev_state);
>> +}
>> +
>> +DEFINE_INTERRUPT_HANDLER(instruction_breakpoint_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1086,12 +1114,12 @@ void instruction_breakpoint_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void RunModeException(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(RunModeException)
>>   {
>>   	_exception(SIGTRAP, regs, TRAP_UNK, 0);
>>   }
>>   
>> -void single_step_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(single_step_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1436,7 +1464,7 @@ static int emulate_math(struct pt_regs *regs)
>>   static inline int emulate_math(struct pt_regs *regs) { return -1; }
>>   #endif
>>   
>> -void program_check_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(program_check_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   	unsigned int reason = get_reason(regs);
>> @@ -1561,14 +1589,14 @@ NOKPROBE_SYMBOL(program_check_exception);
>>    * This occurs when running in hypervisor mode on POWER6 or later
>>    * and an illegal instruction is encountered.
>>    */
>> -void emulation_assist_interrupt(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(emulation_assist_interrupt)
>>   {
>>   	regs->msr |= REASON_ILLEGAL;
>>   	program_check_exception(regs);
>>   }
>>   NOKPROBE_SYMBOL(emulation_assist_interrupt);
>>   
>> -void alignment_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(alignment_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   	int sig, code, fixed = 0;
>> @@ -1618,7 +1646,7 @@ void alignment_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void StackOverflow(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(StackOverflow)
>>   {
>>   	pr_crit("Kernel stack overflow in process %s[%d], r1=%lx\n",
>>   		current->comm, task_pid_nr(current), regs->gpr[1]);
>> @@ -1627,7 +1655,7 @@ void StackOverflow(struct pt_regs *regs)
>>   	panic("kernel stack overflow");
>>   }
>>   
>> -void stack_overflow_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(stack_overflow_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1636,7 +1664,7 @@ void stack_overflow_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void kernel_fp_unavailable_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(kernel_fp_unavailable_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1647,7 +1675,7 @@ void kernel_fp_unavailable_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void altivec_unavailable_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(altivec_unavailable_exception)
>>   {
>>   	enum ctx_state prev_state = exception_enter();
>>   
>> @@ -1666,7 +1694,7 @@ void altivec_unavailable_exception(struct pt_regs *regs)
>>   	exception_exit(prev_state);
>>   }
>>   
>> -void vsx_unavailable_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(vsx_unavailable_exception)
>>   {
>>   	if (user_mode(regs)) {
>>   		/* A user program has executed an vsx instruction,
>> @@ -1697,7 +1725,7 @@ static void tm_unavailable(struct pt_regs *regs)
>>   	die("Unrecoverable TM Unavailable Exception", regs, SIGABRT);
>>   }
>>   
>> -void facility_unavailable_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(facility_unavailable_exception)
>>   {
>>   	static char *facility_strings[] = {
>>   		[FSCR_FP_LG] = "FPU",
>> @@ -1817,7 +1845,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
>>   
>>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>>   
>> -void fp_unavailable_tm(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(fp_unavailable_tm)
>>   {
>>   	/* Note:  This does not handle any kind of FP laziness. */
>>   
>> @@ -1850,7 +1878,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
>>   	tm_recheckpoint(&current->thread);
>>   }
>>   
>> -void altivec_unavailable_tm(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(altivec_unavailable_tm)
>>   {
>>   	/* See the comments in fp_unavailable_tm().  This function operates
>>   	 * the same way.
>> @@ -1865,7 +1893,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>>   	current->thread.used_vr = 1;
>>   }
>>   
>> -void vsx_unavailable_tm(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(vsx_unavailable_tm)
>>   {
>>   	/* See the comments in fp_unavailable_tm().  This works similarly,
>>   	 * though we're loading both FP and VEC registers in here.
>> @@ -1890,7 +1918,8 @@ void vsx_unavailable_tm(struct pt_regs *regs)
>>   }
>>   #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>>   
>> -static void performance_monitor_exception_nmi(struct pt_regs *regs)
>> +#ifdef CONFIG_PPC64
>> +DEFINE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi)
>>   {
>>   	nmi_enter();
>>   
>> @@ -1899,9 +1928,12 @@ static void performance_monitor_exception_nmi(struct pt_regs *regs)
>>   	perf_irq(regs);
>>   
>>   	nmi_exit();
>> +
>> +	return 0;
>>   }
>> +#endif
>>   
>> -static void performance_monitor_exception_async(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async)
>>   {
>>   	irq_enter();
>>   
>> @@ -1912,7 +1944,7 @@ static void performance_monitor_exception_async(struct pt_regs *regs)
>>   	irq_exit();
>>   }
>>   
>> -void performance_monitor_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
>>   {
>>   	/*
>>   	 * On 64-bit, if perf interrupts hit in a local_irq_disable
>> @@ -1924,6 +1956,8 @@ void performance_monitor_exception(struct pt_regs *regs)
>>   		performance_monitor_exception_nmi(regs);
>>   	else
>>   		performance_monitor_exception_async(regs);
>> +
>> +	return 0;
>>   }
>>   
>>   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>> @@ -2057,7 +2091,7 @@ NOKPROBE_SYMBOL(DebugException);
>>   #endif /* CONFIG_PPC_ADV_DEBUG_REGS */
>>   
>>   #ifdef CONFIG_ALTIVEC
>> -void altivec_assist_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(altivec_assist_exception)
>>   {
>>   	int err;
>>   
>> @@ -2199,7 +2233,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
>>    * in the MSR is 0.  This indicates that SRR0/1 are live, and that
>>    * we therefore lost state by taking this exception.
>>    */
>> -void unrecoverable_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(unrecoverable_exception)
>>   {
>>   	pr_emerg("Unrecoverable exception %lx at %lx (msr=%lx)\n",
>>   		 regs->trap, regs->nip, regs->msr);
>> @@ -2219,7 +2253,7 @@ void __attribute__ ((weak)) WatchdogHandler(struct pt_regs *regs)
>>   	return;
>>   }
>>   
>> -void WatchdogException(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(WatchdogException) /* XXX NMI? async? */
>>   {
>>   	printk (KERN_EMERG "PowerPC Book-E Watchdog Exception\n");
>>   	WatchdogHandler(regs);
>> @@ -2230,7 +2264,7 @@ void WatchdogException(struct pt_regs *regs)
>>    * We enter here if we discover during exception entry that we are
>>    * running in supervisor mode with a userspace value in the stack pointer.
>>    */
>> -void kernel_bad_stack(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(kernel_bad_stack)
>>   {
>>   	printk(KERN_EMERG "Bad kernel stack pointer %lx at %lx\n",
>>   	       regs->gpr[1], regs->nip);
>> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
>> index af3c15a1d41e..824b9376ac35 100644
>> --- a/arch/powerpc/kernel/watchdog.c
>> +++ b/arch/powerpc/kernel/watchdog.c
>> @@ -26,6 +26,7 @@
>>   #include <linux/delay.h>
>>   #include <linux/smp.h>
>>   
>> +#include <asm/interrupt.h>
>>   #include <asm/paca.h>
>>   
>>   /*
>> @@ -247,14 +248,14 @@ static void watchdog_timer_interrupt(int cpu)
>>   		watchdog_smp_panic(cpu, tb);
>>   }
>>   
>> -void soft_nmi_interrupt(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>>   {
>>   	unsigned long flags;
>>   	int cpu = raw_smp_processor_id();
>>   	u64 tb;
>>   
>>   	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
>> -		return;
>> +		return 0;
>>   
>>   	nmi_enter();
>>   
>> @@ -291,6 +292,8 @@ void soft_nmi_interrupt(struct pt_regs *regs)
>>   
>>   out:
>>   	nmi_exit();
>> +
>> +	return 0;
>>   }
>>   
>>   static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 6f612d240392..3f9a229f82a2 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -53,6 +53,7 @@
>>   #include <asm/cputable.h>
>>   #include <asm/cacheflush.h>
>>   #include <linux/uaccess.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/io.h>
>>   #include <asm/kvm_ppc.h>
>>   #include <asm/kvm_book3s.h>
>> diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
>> index 8053efdf7ea7..10fc274bea65 100644
>> --- a/arch/powerpc/kvm/book3s_hv_builtin.c
>> +++ b/arch/powerpc/kvm/book3s_hv_builtin.c
>> @@ -17,6 +17,7 @@
>>   
>>   #include <asm/asm-prototypes.h>
>>   #include <asm/cputable.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/kvm_ppc.h>
>>   #include <asm/kvm_book3s.h>
>>   #include <asm/archrandom.h>
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 288a9820ec01..bd2bb73021d8 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -20,6 +20,7 @@
>>   
>>   #include <asm/cputable.h>
>>   #include <linux/uaccess.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/kvm_ppc.h>
>>   #include <asm/cacheflush.h>
>>   #include <asm/dbell.h>
>> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
>> index 77073a256cff..453afb9ae9b4 100644
>> --- a/arch/powerpc/mm/book3s64/hash_utils.c
>> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
>> @@ -38,6 +38,7 @@
>>   #include <linux/pgtable.h>
>>   
>>   #include <asm/debugfs.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/processor.h>
>>   #include <asm/mmu.h>
>>   #include <asm/mmu_context.h>
>> @@ -1512,7 +1513,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>>   }
>>   EXPORT_SYMBOL_GPL(hash_page);
>>   
>> -long do_hash_fault(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
>>   {
>>   	unsigned long ea = regs->dar;
>>   	unsigned long dsisr = regs->dsisr;
>> @@ -1522,27 +1523,6 @@ long do_hash_fault(struct pt_regs *regs)
>>   	unsigned int region_id;
>>   	long err;
>>   
>> -	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
>> -		goto page_fault;
>> -
>> -	/*
>> -	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
>> -	 * don't call hash_page, just fail the fault. This is required to
>> -	 * prevent re-entrancy problems in the hash code, namely perf
>> -	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
>> -	 * hash fault. See the comment in hash_preload().
>> -	 *
>> -	 * We come here as a result of a DSI at a point where we don't want
>> -	 * to call hash_page, such as when we are accessing memory (possibly
>> -	 * user memory) inside a PMU interrupt that occurred while interrupts
>> -	 * were soft-disabled.  We want to invoke the exception handler for
>> -	 * the access, or panic if there isn't a handler.
>> -	 */
>> -	if (unlikely(in_nmi())) {
>> -		bad_page_fault(regs, SIGSEGV);
>> -		return 0;
>> -	}
>> -
>>   	region_id = get_region_id(ea);
>>   	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
>>   		mm = &init_mm;
>> @@ -1583,13 +1563,44 @@ long do_hash_fault(struct pt_regs *regs)
>>   		err = 0;
>>   
>>   	} else if (err) {
>> -page_fault:
>>   		err = do_page_fault(regs);
>>   	}
>>   
>>   	return err;
>>   }
>>   
>> +/*
>> + * The _RAW interrupt entry checks for the in_nmi() case before
>> + * running the full handler.
>> + */
>> +DEFINE_INTERRUPT_HANDLER_RAW(do_hash_fault)
> 
> Could we do that split into __do_hash_fault() / do_hash_fault() in a preceding patch?

Yeah sure.

>> +{
>> +	unsigned long dsisr = regs->dsisr;
>> +
>> +	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
>> +		return do_page_fault(regs);
>> +
>> +	/*
>> +	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
>> +	 * don't call hash_page, just fail the fault. This is required to
>> +	 * prevent re-entrancy problems in the hash code, namely perf
>> +	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
>> +	 * hash fault. See the comment in hash_preload().
>> +	 *
>> +	 * We come here as a result of a DSI at a point where we don't want
>> +	 * to call hash_page, such as when we are accessing memory (possibly
>> +	 * user memory) inside a PMU interrupt that occurred while interrupts
>> +	 * were soft-disabled.  We want to invoke the exception handler for
>> +	 * the access, or panic if there isn't a handler.
>> +	 */
>> +	if (unlikely(in_nmi())) {
>> +		do_bad_page_fault(regs);
>> +		return 0;
>> +	}
>> +
>> +	return __do_hash_fault(regs);
>> +}
>> +
>>   #ifdef CONFIG_PPC_MM_SLICES
>>   static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
>>   {
>> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
>> index c581548b533f..0ae10adae203 100644
>> --- a/arch/powerpc/mm/book3s64/slb.c
>> +++ b/arch/powerpc/mm/book3s64/slb.c
>> @@ -10,6 +10,7 @@
>>    */
>>   
>>   #include <asm/asm-prototypes.h>
>> +#include <asm/interrupt.h>
>>   #include <asm/mmu.h>
>>   #include <asm/mmu_context.h>
>>   #include <asm/paca.h>
>> @@ -813,7 +814,7 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
>>   	return slb_insert_entry(ea, context, flags, ssize, false);
>>   }
>>   
>> -long do_slb_fault(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER_RAW(do_slb_fault)
>>   {
>>   	unsigned long ea = regs->dar;
>>   	unsigned long id = get_region_id(ea);
>> @@ -827,17 +828,19 @@ long do_slb_fault(struct pt_regs *regs)
>>   	/*
>>   	 * SLB kernel faults must be very careful not to touch anything
>>   	 * that is not bolted. E.g., PACA and global variables are okay,
>> -	 * mm->context stuff is not.
>> -	 *
>> -	 * SLB user faults can access all of kernel memory, but must be
>> -	 * careful not to touch things like IRQ state because it is not
>> -	 * "reconciled" here. The difficulty is that we must use
>> -	 * fast_exception_return to return from kernel SLB faults without
>> -	 * looking at possible non-bolted memory. We could test user vs
>> -	 * kernel faults in the interrupt handler asm and do a full fault,
>> -	 * reconcile, ret_from_except for user faults which would make them
>> -	 * first class kernel code. But for performance it's probably nicer
>> -	 * if they go via fast_exception_return too.
>> +	 * mm->context stuff is not. SLB user faults may access all of
>> +	 * memory (and induce one recursive SLB kernel fault), so the
>> +	 * kernel fault must not trample on the user fault state at those
>> +	 * points.
>> +	 */
>> +
>> +	/*
>> +	 * This is a _RAW interrupt handler, so it must not touch local
>> +	 * irq state, or schedule. We could test for usermode and upgrade
>> +	 * to a normal process context (synchronous) interrupt for those,
>> +	 * which would make them first-class kernel code and able to be
>> +	 * traced and instrumented, although performance would suffer a
>> +	 * bit, it would probably be a good tradeoff.
> 
> Is the comment change really related to the wrapper macros?

You're right. Aneesh wanted comments for _RAW so I added them, but 
pointed out that the restriction already exists; it's just that it 
may not have been obvious before.
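
For reference, the _RAW wrapper deliberately adds nothing around the
handler body: no soft-mask reconcile, no irq_enter(), no accounting.
Roughly this shape (a simplified sketch, modulo the exact annotations):

#define DEFINE_INTERRUPT_HANDLER_RAW(func)				\
static __always_inline long ____##func(struct pt_regs *regs);		\
									\
long func(struct pt_regs *regs)						\
{									\
	long ret;							\
									\
	/* nothing here may touch irq, ct, or accounting state */	\
	ret = ____##func(regs);						\
									\
	return ret;							\
}									\
NOKPROBE_SYMBOL(func);							\
									\
static __always_inline long ____##func(struct pt_regs *regs)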

Thanks for the good review.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ
  2021-01-13 14:50   ` Christophe Leroy
@ 2021-01-14  3:48     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:48 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:50 am:
> 
> 
> Le 13/01/2021 à 08:32, Nicholas Piggin a écrit :
>> Add context tracking to the system call handler explicitly, and remove
>> _TIF_NOHZ.
>> 
>> This saves 35 cycles on gettid system call cost on POWER9 with a
>> CONFIG_NOHZ_FULL kernel.
> 
> 35 cycles out of 100 cycles, or out of 5000 cycles? I mean, what percentage do you gain?

I can re-check when I update and retest. On the order of about 500
IIRC, so it's quite a significant proportion of the cost.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C
  2021-01-13 14:54   ` Christophe Leroy
@ 2021-01-14  3:51     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:51 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:54 am:
> 
> 
> Le 13/01/2021 à 08:32, Nicholas Piggin a écrit :
>> There is no need for this to be in asm, use the new interrupt entry wrapper.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/interrupt.h | 15 +++++++++++----
>>   arch/powerpc/kernel/exceptions-64s.S | 26 --------------------------
>>   2 files changed, 11 insertions(+), 30 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
>> index 34d7cca2cb2e..6eba7c489753 100644
>> --- a/arch/powerpc/include/asm/interrupt.h
>> +++ b/arch/powerpc/include/asm/interrupt.h
>> @@ -14,11 +14,14 @@ struct interrupt_state {
>>   
>>   static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
>>   {
>> -#ifdef CONFIG_PPC_BOOK3E_64
>> -	state->ctx_state = exception_enter();
>> -#endif
>> -
> 
> Can't the above stay at the top of the function?

It could, but I prefer to do it this way because exception_enter
needs the irq soft-mask state to be set up first. It is a 64e vs 64s
thing, but it reads better this way (and one day I hope to get 64e to
use C interrupt returns).
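
To illustrate the ordering (a simplified sketch, not the literal patch
text):

static inline void interrupt_enter_prepare(struct pt_regs *regs,
					   struct interrupt_state *state)
{
#ifdef CONFIG_PPC_BOOK3S_64
	/* reconcile the irq soft-mask state first */
	if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED)
		trace_hardirqs_off();
#endif

#ifdef CONFIG_PPC_BOOK3E_64
	/* now exception_enter() runs with the soft-mask state set up */
	state->ctx_state = exception_enter();
#endif
}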

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function
  2021-01-13 14:59   ` Christophe Leroy
@ 2021-01-14  3:51     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:51 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 12:59 am:
> 
> 
> Le 13/01/2021 à 08:32, Nicholas Piggin a écrit :
>> This will be used by interrupt entry as well.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/cputime.h | 15 +++++++++++++++
>>   arch/powerpc/kernel/syscall_64.c   | 10 +---------
>>   2 files changed, 16 insertions(+), 9 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h
>> index ed75d1c318e3..3f61604e1fcf 100644
>> --- a/arch/powerpc/include/asm/cputime.h
>> +++ b/arch/powerpc/include/asm/cputime.h
>> @@ -87,6 +87,18 @@ static notrace inline void account_cpu_user_exit(void)
>>   	acct->starttime_user = tb;
>>   }
>>   
>> +static notrace inline void account_stolen_time(void)
>> +{
>> +#ifdef CONFIG_PPC_SPLPAR
>> +	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) &&
> 
> Aren't you already inside a CONFIG_VIRT_CPU_ACCOUNTING_NATIVE section?

Yes, will fix.
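
Something like this, I think (a sketch; cputime.h is only built for
VIRT_CPU_ACCOUNTING_NATIVE, so the IS_ENABLED() test can simply go):

static notrace inline void account_stolen_time(void)
{
#ifdef CONFIG_PPC_SPLPAR
	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
		struct lppaca *lp = local_paca->lppaca_ptr;

		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
			accumulate_stolen_time();
	}
#endif
}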

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C
  2021-01-13 15:05   ` Christophe Leroy
@ 2021-01-14  3:58     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  3:58 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 1:05 am:
> 
> 
> Le 13/01/2021 à 08:32, Nicholas Piggin a écrit :
>> There is no need for this to be in asm, use the new interrupt entry wrapper.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/interrupt.h |  7 +++++++
>>   arch/powerpc/include/asm/ppc_asm.h   | 24 ------------------------
>>   arch/powerpc/kernel/exceptions-64e.S |  1 -
>>   arch/powerpc/kernel/exceptions-64s.S |  5 -----
>>   4 files changed, 7 insertions(+), 30 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
>> index 6eba7c489753..e278dffe7657 100644
>> --- a/arch/powerpc/include/asm/interrupt.h
>> +++ b/arch/powerpc/include/asm/interrupt.h
>> @@ -4,6 +4,7 @@
>>   
>>   #include <linux/context_tracking.h>
>>   #include <linux/hardirq.h>
>> +#include <asm/cputime.h>
>>   #include <asm/ftrace.h>
>>   
>>   struct interrupt_state {
>> @@ -25,6 +26,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>>   	if (user_mode(regs)) {
>>   		CT_WARN_ON(ct_state() != CONTEXT_USER);
>>   		user_exit_irqoff();
>> +
>> +		account_cpu_user_entry();
> 
> Are interrupts still disabled here? Otherwise you risk getting IRQ time accounted to user time.

Yes. Only the handlers themselves will enable interrupts, with
interrupt_cond_local_irq_enable.
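
The helper is trivial, roughly:

static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
{
	/* only enable if the interrupted context had irqs enabled */
	if (!arch_irq_disabled_regs(regs))
		local_irq_enable();
}

so nothing re-enables interrupts before the accounting in
interrupt_enter_prepare has run.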

> 
>> +		account_stolen_time();
>>   	} else {
>>   		/*
>>   		 * CT_WARN_ON comes here via program_check_exception,
>> @@ -38,6 +42,9 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
>>   #ifdef CONFIG_PPC_BOOK3E_64
>>   	state->ctx_state = exception_enter();
>>   #endif
>> +
>> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) && user_mode(regs))
>> +		account_cpu_user_entry();
> 
> Isn't this interrupt_enter_prepare() function also called on PPC32?
> Have you removed the ACCOUNT_CPU_USER_ENTRY() from entry_32.S?

Yes and no, I was thinking of 64 only :( I can make that for 64E. 32-bit
could be another patch if you want it.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper
  2021-01-13 15:13   ` Christophe Leroy
@ 2021-01-14  4:00     ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14  4:00 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 1:13 am:
> 
> 
> Le 13/01/2021 à 08:32, Nicholas Piggin a écrit :
>> This moves the common NMI entry and exit code into the interrupt handler
>> wrappers.
>> 
>> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
>> also MCE interrupts on 64e, by adding missing parts of the NMI entry to
>> them.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/interrupt.h | 24 ++++++++++++++++
>>   arch/powerpc/kernel/mce.c            | 11 --------
>>   arch/powerpc/kernel/traps.c          | 42 +++++-----------------------
>>   arch/powerpc/kernel/watchdog.c       | 10 +++----
>>   4 files changed, 35 insertions(+), 52 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
>> index e278dffe7657..01192e213f9a 100644
>> --- a/arch/powerpc/include/asm/interrupt.h
>> +++ b/arch/powerpc/include/asm/interrupt.h
>> @@ -95,14 +95,38 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs, struct int
>>   }
>>   
>>   struct interrupt_nmi_state {
>> +#ifdef CONFIG_PPC64
>> +	u8 ftrace_enabled;
>> +#endif
>>   };
>>   
>>   static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>>   {
>> +#ifdef CONFIG_PPC64
>> +	state->ftrace_enabled = this_cpu_get_ftrace_enabled();
>> +	this_cpu_set_ftrace_enabled(0);
>> +#endif
>> +
>> +	/*
>> +	 * Do not use nmi_enter() for pseries hash guest taking a real-mode
>> +	 * NMI because not everything it touches is within the RMA limit.
>> +	 */
>> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
>> +			!firmware_has_feature(FW_FEATURE_LPAR) ||
>> +			radix_enabled() || (mfmsr() & MSR_DR))
>> +		nmi_enter();
>>   }
>>   
>>   static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>>   {
>> +	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
>> +			!firmware_has_feature(FW_FEATURE_LPAR) ||
>> +			radix_enabled() || (mfmsr() & MSR_DR))
>> +		nmi_exit();
>> +
>> +#ifdef CONFIG_PPC64
>> +	this_cpu_set_ftrace_enabled(state->ftrace_enabled);
>> +#endif
>>   }
>>   
>>   /**
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index 54269947113d..51456217ec40 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -592,12 +592,6 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
>>   DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>>   {
>>   	long handled = 0;
>> -	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
>> -
>> -	this_cpu_set_ftrace_enabled(0);
>> -	/* Do not use nmi_enter/exit for pseries hpte guest */
>> -	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
>> -		nmi_enter();
>>   
>>   	hv_nmi_check_nonrecoverable(regs);
>>   
>> @@ -607,11 +601,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_early)
>>   	if (ppc_md.machine_check_early)
>>   		handled = ppc_md.machine_check_early(regs);
>>   
>> -	if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
>> -		nmi_exit();
>> -
>> -	this_cpu_set_ftrace_enabled(ftrace_enabled);
>> -
>>   	return handled;
>>   }
>>   
>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>> index b4f23e871a68..43d23232ef5c 100644
>> --- a/arch/powerpc/kernel/traps.c
>> +++ b/arch/powerpc/kernel/traps.c
>> @@ -435,11 +435,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>>   {
>>   	unsigned long hsrr0, hsrr1;
>>   	bool saved_hsrrs = false;
>> -	u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
>> -
>> -	this_cpu_set_ftrace_enabled(0);
>> -
>> -	nmi_enter();
>>   
>>   	/*
>>   	 * System reset can interrupt code where HSRRs are live and MSR[RI]=1.
>> @@ -511,10 +506,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
>>   		mtspr(SPRN_HSRR1, hsrr1);
>>   	}
>>   
>> -	nmi_exit();
>> -
>> -	this_cpu_set_ftrace_enabled(ftrace_enabled);
>> -
>>   	/* What should we do here? We could issue a shutdown or hard reset. */
>>   
>>   	return 0;
>> @@ -792,6 +783,12 @@ int machine_check_generic(struct pt_regs *regs)
>>   #endif /* everything else */
>>   
>>   
>> +/*
>> + * BOOK3S_64 does not call this handler as a non-maskable interrupt
>> + * (it uses its own early real-mode handler to handle the MCE proper
>> + * and then raises irq_work to call this handler when interrupts are
>> + * enabled).
>> + */
>>   #ifdef CONFIG_PPC_BOOK3S_64
>>   DEFINE_INTERRUPT_HANDLER_ASYNC(machine_check_exception)
>>   #else
>> @@ -800,20 +797,6 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
>>   {
>>   	int recover = 0;
>>   
>> -	/*
>> -	 * BOOK3S_64 does not call this handler as a non-maskable interrupt
>> -	 * (it uses its own early real-mode handler to handle the MCE proper
>> -	 * and then raises irq_work to call this handler when interrupts are
>> -	 * enabled).
>> -	 *
>> -	 * This is silly. The BOOK3S_64 should just call a different function
>> -	 * rather than expecting semantics to magically change. Something
>> -	 * like 'non_nmi_machine_check_exception()', perhaps?
>> -	 */
>> -	const bool nmi = !IS_ENABLED(CONFIG_PPC_BOOK3S_64);
>> -
>> -	if (nmi) nmi_enter();
>> -
>>   	__this_cpu_inc(irq_stat.mce_exceptions);
>>   
>>   	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>> @@ -838,24 +821,17 @@ DEFINE_INTERRUPT_HANDLER_NMI(machine_check_exception)
>>   	if (check_io_access(regs))
>>   		goto bail;
>>   
>> -	if (nmi) nmi_exit();
>> -
> 
> IIRC, not doing the nmi_exit() before the die() is problematic.
> 
> See 
> https://github.com/linuxppc/linux/commit/daf00ae71dad8aa05965713c62558aeebf2df48e#diff-70077148c383252ca949063eaf1b0250620e4607b43f4ef3fd2d8f448a83ab0a

Yes, good catch. Maybe putting it into an nmi_die() or having die() 
explicitly check for the NMI case might be the way to go.
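
Something like this, perhaps (hypothetical, nmi_die() does not exist
today, this is just the shape of the idea):

/* hypothetical: drop out of NMI context before dying */
static void nmi_die(const char *str, struct pt_regs *regs, long err)
{
	nmi_exit();
	die(str, regs, err);
}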

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-14  3:24     ` Nicholas Piggin
@ 2021-01-14 12:09       ` Nicholas Piggin
  2021-01-14 12:25         ` Christophe Leroy
  0 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14 12:09 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
> Excerpts from Christophe Leroy's message of January 14, 2021 12:12 am:
>> 
>> 
>> Le 13/01/2021 à 08:31, Nicholas Piggin a écrit :
>>> The page fault handling still has some complex logic particularly around
>>> hash table handling, in asm. Implement this in C instead.
>>> 
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>>>   arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>>>   arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>>>   arch/powerpc/mm/fault.c                       |  46 ++++--
>>>   4 files changed, 107 insertions(+), 148 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> index 066b1d34c7bc..60a669379aa0 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>>>   #define HPTE_NOHPTE_UPDATE	0x2
>>>   #define HPTE_USE_KERNEL_KEY	0x4
>>>   
>>> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>>>   extern int __hash_page_4K(unsigned long ea, unsigned long access,
>>>   			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>>>   			  unsigned long flags, int ssize, int subpage_prot);
>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>> index 6e53f7638737..bcb5e81d2088 100644
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>    *
>>>    * Handling:
>>>    * - Hash MMU
>>> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
>>> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
>>> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
>>> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>>>    *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>>>    *   "non-bolted" regions, e.g., vmalloc space. However these should always be
>>> - *   backed by Linux page tables.
>>> + *   backed by Linux page table entries.
>>>    *
>>> - *   If none is found, do a Linux page fault. Linux page faults can happen in
>>> - *   kernel mode due to user copy operations of course.
>>> + *   If no entry is found the Linux page fault handler is invoked (by
>>> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
>>> + *   copy operations of course.
>>>    *
>>>    *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>>>    *   MMU context, which may cause a DSI in the host, which must go to the
>>> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>>>   	GEN_COMMON data_access
>>>   	ld	r4,_DAR(r1)
>>>   	ld	r5,_DSISR(r1)
>> 
>> We have DSISR here. I think the dispatch between page fault and do_break() should be done here:
>> - It would be more similar to other arches
> 
> Other sub-archs?
> 
>> - Would avoid doing it also in instruction fault
> 
> True but it's hidden under an unlikely branch so won't really help 
> instruction fault.
> 
>> - Would avoid that -1 return which looks more like a hack.
> 
> I don't really see it as a hack, we return a code to the asm caller to
> direct whether to restore registers or not, we already have this
> pattern.
> 
> (I'm hoping all that might go away one day by controlling NV
> regs from C if we can get good code generation, but even if not we
> still have it in the interrupt returns).
> 
> That said I will give it a try here. At the very least it might be a
> better intermediate step.

Ah yes, this way doesn't work well for later patches because you end up,
e.g., with the do_break call having to call the interrupt handler
wrappers again when they actually expect to be in the asm entry state
(e.g., irq soft-mask state) when called, and return via interrupt_return
after the exit wrapper runs (which 64s uses to implement better context
tracking for example).

That could possibly be hacked up to deal with multiple interrupt 
wrappers per interrupt, but I'd rather not go backwards.

That does leave the other sub-archs with this issue, but they don't 
do so much in their handlers. 32 doesn't have soft-mask or context 
tracking to deal with, for example. We will need to fix this up though 
and unify things more.
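
For reference, the normal wrapper generated by DEFINE_INTERRUPT_HANDLER
is roughly this shape (simplified from the wrappers patch):

#define DEFINE_INTERRUPT_HANDLER(func)					\
static __always_inline void ____##func(struct pt_regs *regs);		\
									\
void func(struct pt_regs *regs)						\
{									\
	struct interrupt_state state;					\
									\
	interrupt_enter_prepare(regs, &state);				\
									\
	____##func(regs);						\
									\
	interrupt_exit_prepare(regs, &state);				\
}									\
									\
static __always_inline void ____##func(struct pt_regs *regs)

so reaching do_break via another wrapped handler would run the
enter/exit hooks twice for a single interrupt.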

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-14 12:09       ` Nicholas Piggin
@ 2021-01-14 12:25         ` Christophe Leroy
  2021-01-14 13:17           ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-14 12:25 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



Le 14/01/2021 à 13:09, Nicholas Piggin a écrit :
> Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
>> [...]
> 
> Ah yes, this way doesn't work well for later patches because you end up,
> e.g., with the do_break call having to call the interrupt handler
> wrappers again when they actually expect to be in the asm entry state
> (e.g., irq soft-mask state) when called, and return via interrupt_return
> after the exit wrapper runs (which 64s uses to implement better context
> tracking for example).
> 
> That could possibly be hacked up to deal with multiple interrupt
> wrappers per interrupt, but I'd rather not go backwards.
> 
> That does leave the other sub-archs with this issue, but they don't
> do so much in their handlers. 32 doesn't have soft-mask or context
> tracking to deal with, for example. We will need to fix this up though
> and unify things more.
> 

Not sure I understand what you mean exactly.

On the 8xx, do_break() is called by totally different exceptions:
- Exception 0x1c00 Data breakpoint ==> do_break()
- Exception 0x1300 Instruction TLB error ==> handle_page_fault()
- Exception 0x1400 Data TLB error ==> handle_page_fault()

On book3s/32, we now (after my patch, i.e. patch 1 in your series) have either do_break() or 
handle_page_fault() being called from very early in ASM.

If you do the same in book3s/64, then there is no issue with interrupt wrappers being called twice, 
is there?
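
For reference, after patch 1 the book3s/32 DSI handler does the dispatch
roughly like this (a sketch of head_book3s_32.S from memory, don't take
the exact lines as gospel):

	andis.	r0, r5, DSISR_DABRMATCH@h
	bne-	1f
	EXC_XFER_LITE(0x300, handle_page_fault)
1:	EXC_XFER_STD(0x300, do_break)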

Christophe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-14 12:25         ` Christophe Leroy
@ 2021-01-14 13:17           ` Nicholas Piggin
  2021-01-14 13:28             ` Christophe Leroy
  0 siblings, 1 reply; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-14 13:17 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
> 
> 
> Le 14/01/2021 à 13:09, Nicholas Piggin a écrit :
>> Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
>>> [...]
>> 
>> Ah yes, this way doesn't work well for later patches because you end up,
>> e.g., with the do_break call having to call the interrupt handler
>> wrappers again when they actually expect to be in the asm entry state
>> (e.g., irq soft-mask state) when called, and return via interrupt_return
>> after the exit wrapper runs (which 64s uses to implement better context
>> tracking for example).
>> 
>> That could possibly be hacked up to deal with multiple interrupt
>> wrappers per interrupt, but I'd rather not go backwards.
>> 
>> That does leave the other sub-archs with this issue, but they don't
>> do so much in their handlers. 32 doesn't have soft-mask or context
>> tracking to deal with, for example. We will need to fix this up though
>> and unify things more.
>> 
> 
> Not sure I understand what you mean exactly.
> 
> On the 8xx, do_break() is called by totally different exceptions:
> - Exception 0x1c00 Data breakpoint ==> do_break()
> - Exception 0x1300 Instruction TLB error ==> handle_page_fault()
> - Exception 0x1400 Data TLB error ==> handle_page_fault()
> 
> On book3s/32, we now (after my patch, i.e. patch 1 in your series) have either do_break() or 
> handle_page_fault() being called from very early in ASM.
> 
> If you do the same in book3s/64, then there is no issue with interrupt wrappers being called twice, 
> is there?

bad_page_fault is the problem, it has to go afterwards.

Once we have the changed 64s behaviour of do_page_fault, I don't know if 
there is any point leaving do_break in asm, is there? I guess it is neat 
to treat it quite separately; I might need to count fast-path branches...
I have done the split already anyway, so I will post it your way first.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-14 13:17           ` Nicholas Piggin
@ 2021-01-14 13:28             ` Christophe Leroy
  2021-01-15  0:25               ` Nicholas Piggin
  0 siblings, 1 reply; 44+ messages in thread
From: Christophe Leroy @ 2021-01-14 13:28 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



Le 14/01/2021 à 14:17, Nicholas Piggin a écrit :
> Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
>>
>>
>> Le 14/01/2021 à 13:09, Nicholas Piggin a écrit :
>>> [...]
>>
>> Not sure I understand what you mean exactly.
>>
>> On the 8xx, do_break() is called by totally different exceptions:
>> - Exception 0x1c00 Data breakpoint ==> do_break()
>> - Exception 0x1300 Instruction TLB error ==> handle_page_fault()
>> - Exception 0x1400 Data TLB error ==> handle_page_fault()
>>
>> On book3s/32, we now (after my patch, i.e. patch 1 in your series) have either do_break() or
>> handle_page_fault() being called from very early in ASM.
>>
>> If you do the same in book3s/64, then there is no issue with interrupt wrappers being called twice,
>> is there?
> 
> bad_page_fault is the problem, it has to go afterwards.
> 
> Once we have the changed 64s behaviour of do_page_fault, I don't know if
> there is any point leaving do_break in asm, is there? I guess it is neat
> to treat it quite separately; I might need to count fast-path branches...
> I have done the split already anyway, so I will post it your way first.
> 

As far as I understand, not-taken unlikely branches are costless (at least on book3s/32), so you 
would only suffer the cost of the logical 'and.' on the value of DSISR that you already have in a 
register. It should be in the noise.

bad_page_fault() is not in the fast path anymore, since we now handle the exception fixup at the end 
of do_page_fault(). So I think it shouldn't be a concern to call the wrapper again for bad_page_fault().

Christophe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 03/21] powerpc: remove arguments from fault handler functions
  2021-01-13  7:31 ` [PATCH v5 03/21] powerpc: remove arguments from fault handler functions Nicholas Piggin
@ 2021-01-14 14:12   ` Christophe Leroy
  0 siblings, 0 replies; 44+ messages in thread
From: Christophe Leroy @ 2021-01-14 14:12 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev



Le 13/01/2021 à 08:31, Nicholas Piggin a écrit :
> Make mm fault handlers all just take the pt_regs * argument and load
> DAR/DSISR from that. Make those that return a value return long.
> 
> This is done to make the function signatures match other handlers, which
> will help with a future patch to add wrappers. Explicit arguments could
> be added for performance but that would require more wrapper macro
> variants.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/asm-prototypes.h     |  4 ++--
>   arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
>   arch/powerpc/include/asm/bug.h                |  2 +-
>   arch/powerpc/kernel/entry_32.S                |  6 +-----
>   arch/powerpc/kernel/exceptions-64e.S          |  2 --
>   arch/powerpc/kernel/exceptions-64s.S          | 14 ++------------
>   arch/powerpc/kernel/head_40x.S                | 10 +++++-----
>   arch/powerpc/kernel/head_8xx.S                |  6 +++---
>   arch/powerpc/kernel/head_book3s_32.S          |  5 ++---
>   arch/powerpc/kernel/head_booke.h              |  4 +---
>   arch/powerpc/mm/book3s64/hash_utils.c         |  8 +++++---
>   arch/powerpc/mm/book3s64/slb.c                | 11 +++++++----
>   arch/powerpc/mm/fault.c                       |  7 ++++---
>   13 files changed, 34 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 238eacfda7b0..a32157ce0551 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -277,7 +277,7 @@ reenable_mmu:
>   	 * r3 can be different from GPR3(r1) at this point, r9 and r11
>   	 * contains the old MSR and handler address respectively,
>   	 * r4 & r5 can contain page fault arguments that need to be passed

The line above should be dropped as well (its end on the line below is dropped already)


> -	 * along as well. r0, r6-r8, r12, CCR, CTR, XER etc... are left
> +	 * r0, r4-r8, r12, CCR, CTR, XER etc... are left
>   	 * clobbered as they aren't useful past this point.
>   	 */
>   

Christophe

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
  2021-01-14 13:28             ` Christophe Leroy
@ 2021-01-15  0:25               ` Nicholas Piggin
  0 siblings, 0 replies; 44+ messages in thread
From: Nicholas Piggin @ 2021-01-15  0:25 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

Excerpts from Christophe Leroy's message of January 14, 2021 11:28 pm:
> 
> 
> Le 14/01/2021 à 14:17, Nicholas Piggin a écrit :
>> Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
>>> [...]
>> 
>> bad_page_fault is the problem, it has to go afterwards.
>> 
>> Once we have the changed 64s behaviour of do_page_fault, I don't know if
>> there is any point leaving do_break in asm, is there? I guess it is neat
>> to treat it quite separately; I might need to count fast-path branches...
>> I have done the split already anyway, so I will post it your way first.
>> 
> 
> As far as I understand, not-taken unlikely branches are costless (at least on book3s/32), so you
> would only suffer the cost of the logical 'and.' on the value of DSISR that you already have in a
> register. It should be in the noise.
> 
> bad_page_fault() is not in the fast path anymore, since we now handle the exception fixup at the end
> of do_page_fault(). So I think it shouldn't be a concern to call the wrapper again for bad_page_fault().

It's not about performance but correctness. For example, we can have interrupts 
enabled again at this time, which the interrupt wrapper does not expect.
Or the context tracking code in the entry wrapper would break if it's
called again before interrupt_return.

I think it's not too ugly to put bad_page_fault in C, and keeping do_break 
in asm still avoids a lot of your complaints.
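
Roughly what I have in mind on the C side (a sketch, the exact form may
change by the time I post it):

long do_page_fault(struct pt_regs *regs)
{
	long err;

	err = ___do_page_fault(regs, regs->dar, regs->dsisr);
	if (unlikely(err))
		bad_page_fault(regs, err);	/* fixup or die, in C */

	return err;
}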

Thanks,
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-01-15  0:27 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-13  7:31 [PATCH v5 00/21] powerpc: interrupt wrappers Nicholas Piggin
2021-01-13  7:31 ` [PATCH v5 01/21] powerpc/32s: Do DABR match out of handle_page_fault() Nicholas Piggin
2021-01-13  7:31 ` [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C Nicholas Piggin
2021-01-13 14:12   ` Christophe Leroy
2021-01-14  3:24     ` Nicholas Piggin
2021-01-14 12:09       ` Nicholas Piggin
2021-01-14 12:25         ` Christophe Leroy
2021-01-14 13:17           ` Nicholas Piggin
2021-01-14 13:28             ` Christophe Leroy
2021-01-15  0:25               ` Nicholas Piggin
2021-01-13  7:31 ` [PATCH v5 03/21] powerpc: remove arguments from fault handler functions Nicholas Piggin
2021-01-14 14:12   ` Christophe Leroy
2021-01-13  7:31 ` [PATCH v5 04/21] powerpc: bad_page_fault, do_break get registers from regs Nicholas Piggin
2021-01-13 14:25   ` Christophe Leroy
2021-01-14  3:26     ` Nicholas Piggin
2021-01-13  7:31 ` [PATCH v5 05/21] powerpc/perf: move perf irq/nmi handling details into traps.c Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 06/21] powerpc: interrupt handler wrapper functions Nicholas Piggin
2021-01-13 14:45   ` Christophe Leroy
2021-01-14  3:41     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 07/21] powerpc: add interrupt wrapper entry / exit stub functions Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 08/21] powerpc: add interrupt_cond_local_irq_enable helper Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 09/21] powerpc/64: context tracking remove _TIF_NOHZ Nicholas Piggin
2021-01-13 14:50   ` Christophe Leroy
2021-01-14  3:48     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 10/21] powerpc/64s/hash: improve context tracking of hash faults Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 11/21] powerpc/64: context tracking move to interrupt wrappers Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 12/21] powerpc/64: add context tracking to asynchronous interrupts Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 13/21] powerpc: handle irq_enter/irq_exit in interrupt handler wrappers Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 14/21] powerpc/64s: move context tracking exit to interrupt exit path Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 15/21] powerpc/64s: reconcile interrupts in C Nicholas Piggin
2021-01-13 14:54   ` Christophe Leroy
2021-01-14  3:51     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 16/21] powerpc/64: move account_stolen_time into its own function Nicholas Piggin
2021-01-13 14:59   ` Christophe Leroy
2021-01-14  3:51     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 17/21] powerpc/64: entry cpu time accounting in C Nicholas Piggin
2021-01-13 15:05   ` Christophe Leroy
2021-01-14  3:58     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 18/21] powerpc: move NMI entry/exit code into wrapper Nicholas Piggin
2021-01-13 15:13   ` Christophe Leroy
2021-01-14  4:00     ` Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 19/21] powerpc/64s: move NMI soft-mask handling to C Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 20/21] powerpc/64s: runlatch interrupt handling in C Nicholas Piggin
2021-01-13  7:32 ` [PATCH v5 21/21] powerpc/64s: power4 nap fixup " Nicholas Piggin
