All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt
@ 2021-01-23  6:12 Nicholas Piggin
  2021-01-27  8:32 ` Athira Rajeev
  2021-02-03 11:46 ` Michael Ellerman
  0 siblings, 2 replies; 3+ messages in thread
From: Nicholas Piggin @ 2021-01-23  6:12 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Athira Rajeev, Nicholas Piggin

When an asynchronous interrupt calls irq_exit, it checks for softirqs
that may have been created, and runs them. Running softirqs enables
local irqs, which can replay pending interrupts causing recursion in
replay_soft_interrupts. This abridged trace shows how this can occur:

! NIP replay_soft_interrupts
  LR  interrupt_exit_kernel_prepare
  Call Trace:
    interrupt_exit_kernel_prepare (unreliable)
    interrupt_return
  --- interrupt: ea0 at __rb_reserve_next
  NIP __rb_reserve_next
  LR __rb_reserve_next
  Call Trace:
    ring_buffer_lock_reserve
    trace_function
    function_trace_call
    ftrace_call
    __do_softirq
    irq_exit
    timer_interrupt
!   replay_soft_interrupts
    interrupt_exit_kernel_prepare
    interrupt_return
  --- interrupt: ea0 at arch_local_irq_restore

This can not be prevented easily, because softirqs must not block hard
irqs, so it has to be dealt with.

The recursion is bounded by design in the softirq code because softirq
replay disables softirqs and loops around again to check for new
softirqs created while it ran, so that's not a problem.

However it does mess up interrupt replay state, causing superfluous
interrupts when the second replay_soft_interrupts clears a pending
interrupt, leaving it still set in the first call in the 'happened'
local variable.

Fix this by not caching a copy of irqs_happened across interrupt
handler calls.

Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
It's actually very tricky / practically impossible to prevent softirqs
from being run like my previous attempt tried to do, when you really
get into details. Certainly not for a fix. I might try to attempt that
some time in future, but it would require kernel/softirq.c changes and
what we do now isn't conceptually different from architectures without
soft-masking, it's just another complexity to the complex soft-masking
code.

Anyway this hopefully fixes the PMU bug, but as I said I couldn't
reproduce it with your test case so it would be good if you could
give it a test.

Thanks,
Nick

 arch/powerpc/kernel/irq.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6b1eca53e36c..cc7a6271b6b4 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -180,13 +180,18 @@ void notrace restore_interrupts(void)
 
 void replay_soft_interrupts(void)
 {
+	struct pt_regs regs;
+
 	/*
-	 * We use local_paca rather than get_paca() to avoid all
-	 * the debug_smp_processor_id() business in this low level
-	 * function
+	 * Be careful here, calling these interrupt handlers can cause
+	 * softirqs to be raised, which they may run when calling irq_exit,
+	 * which will cause local_irq_enable() to be run, which can then
+	 * recurse into this function. Don't keep any state across
+	 * interrupt handler calls which may change underneath us.
+	 *
+	 * We use local_paca rather than get_paca() to avoid all the
+	 * debug_smp_processor_id() business in this low level function.
 	 */
-	unsigned char happened = local_paca->irq_happened;
-	struct pt_regs regs;
 
 	ppc_save_regs(&regs);
 	regs.softe = IRQS_ENABLED;
@@ -209,7 +214,7 @@ void replay_soft_interrupts(void)
 	 * This is a higher priority interrupt than the others, so
 	 * replay it first.
 	 */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (happened & PACA_IRQ_HMI)) {
+	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (local_paca->irq_happened & PACA_IRQ_HMI)) {
 		local_paca->irq_happened &= ~PACA_IRQ_HMI;
 		regs.trap = 0xe60;
 		handle_hmi_exception(&regs);
@@ -217,7 +222,7 @@ void replay_soft_interrupts(void)
 			hard_irq_disable();
 	}
 
-	if (happened & PACA_IRQ_DEC) {
+	if (local_paca->irq_happened & PACA_IRQ_DEC) {
 		local_paca->irq_happened &= ~PACA_IRQ_DEC;
 		regs.trap = 0x900;
 		timer_interrupt(&regs);
@@ -225,7 +230,7 @@ void replay_soft_interrupts(void)
 			hard_irq_disable();
 	}
 
-	if (happened & PACA_IRQ_EE) {
+	if (local_paca->irq_happened & PACA_IRQ_EE) {
 		local_paca->irq_happened &= ~PACA_IRQ_EE;
 		regs.trap = 0x500;
 		do_IRQ(&regs);
@@ -233,7 +238,7 @@ void replay_soft_interrupts(void)
 			hard_irq_disable();
 	}
 
-	if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (happened & PACA_IRQ_DBELL)) {
+	if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (local_paca->irq_happened & PACA_IRQ_DBELL)) {
 		local_paca->irq_happened &= ~PACA_IRQ_DBELL;
 		if (IS_ENABLED(CONFIG_PPC_BOOK3E))
 			regs.trap = 0x280;
@@ -245,7 +250,7 @@ void replay_soft_interrupts(void)
 	}
 
 	/* Book3E does not support soft-masking PMI interrupts */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (happened & PACA_IRQ_PMI)) {
+	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (local_paca->irq_happened & PACA_IRQ_PMI)) {
 		local_paca->irq_happened &= ~PACA_IRQ_PMI;
 		regs.trap = 0xf00;
 		performance_monitor_exception(&regs);
@@ -253,8 +258,7 @@ void replay_soft_interrupts(void)
 			hard_irq_disable();
 	}
 
-	happened = local_paca->irq_happened;
-	if (happened & ~PACA_IRQ_HARD_DIS) {
+	if (local_paca->irq_happened & ~PACA_IRQ_HARD_DIS) {
 		/*
 		 * We are responding to the next interrupt, so interrupt-off
 		 * latencies should be reset here.
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt
  2021-01-23  6:12 [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt Nicholas Piggin
@ 2021-01-27  8:32 ` Athira Rajeev
  2021-02-03 11:46 ` Michael Ellerman
  1 sibling, 0 replies; 3+ messages in thread
From: Athira Rajeev @ 2021-01-27  8:32 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev



> On 23-Jan-2021, at 11:42 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> When an asynchronous interrupt calls irq_exit, it checks for softirqs
> that may have been created, and runs them. Running softirqs enables
> local irqs, which can replay pending interrupts causing recursion in
> replay_soft_interrupts. This abridged trace shows how this can occur:
> 
> ! NIP replay_soft_interrupts
> LR  interrupt_exit_kernel_prepare
> Call Trace:
> interrupt_exit_kernel_prepare (unreliable)
> interrupt_return
> --- interrupt: ea0 at __rb_reserve_next
> NIP __rb_reserve_next
> LR __rb_reserve_next
> Call Trace:
> ring_buffer_lock_reserve
> trace_function
> function_trace_call
> ftrace_call
> __do_softirq
> irq_exit
> timer_interrupt
> !   replay_soft_interrupts
> interrupt_exit_kernel_prepare
> interrupt_return
> --- interrupt: ea0 at arch_local_irq_restore
> 
> This can not be prevented easily, because softirqs must not block hard
> irqs, so it has to be dealt with.
> 
> The recursion is bounded by design in the softirq code because softirq
> replay disables softirqs and loops around again to check for new
> softirqs created while it ran, so that's not a problem.
> 
> However it does mess up interrupt replay state, causing superfluous
> interrupts when the second replay_soft_interrupts clears a pending
> interrupt, leaving it still set in the first call in the 'happened'
> local variable.
> 
> Fix this by not caching a copy of irqs_happened across interrupt
> handler calls.
> 
> Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Thanks for the fix Nick.

Tested this below scenario where previously it was resulting in soft lockup’s with the trace described in the commit message. 
I re-tested this patch along with https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=226017
And there were no soft lockup’s.

Test scenario: My test kernel module below tries to create one of performance monitor
counter ( PMC6 ) overflow between local_irq_save/local_irq_restore. I am also configuring ftrace.

Environment :One CPU online and Bare Metal system
prerequisite for ftrace:
# cd /sys/kernel/debug/tracing
# echo 100 > buffer_percent
# echo 200000 > buffer_size_kb 
# echo ppc-tb > trace_clock
# echo function > current_tracer

Part of sample kernel test module to trigger a PMI between 
local_irq_save and local_irq_restore:

<<>>
static ulong delay = 1;
static void busy_wait(ulong time)
{
  udelay(delay);
}
static __always_inline void irq_test(void)
{
  unsigned long flags;
  local_irq_save(flags);
  trace_printk("IN IRQ TEST\n");
  mtspr(SPRN_MMCR0, 0x80000000);
  mtspr(SPRN_PMC6, 0x80000000 - 100);
  mtspr(SPRN_MMCR0, 0x6004000);
  busy_wait(delay);
  trace_printk("IN IRQ TEST DONE\n");
  local_irq_restore(flags);
  mtspr(SPRN_MMCR0, 0x80000000);
  mtspr(SPRN_PMC6, 0);
}
<<>>

With the patch, there is no soft lockup’s.

Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>

> ---
> It's actually very tricky / practically impossible to prevent softirqs
> from being run like my previous attempt tried to do, when you really
> get into details. Certainly not for a fix. I might try to attempt that
> some time in future, but it would require kernel/softirq.c changes and
> what we do now isn't conceptually different from architectures without
> soft-masking, it's just another complexity to the complex soft-masking
> code.
> 
> Anyway this hopefully fixes the PMU bug, but as I said I couldn't
> reproduce it with your test case so it would be good if you could
> give it a test.
> 
> Thanks,
> Nick
> 
> arch/powerpc/kernel/irq.c | 28 ++++++++++++++++------------
> 1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 6b1eca53e36c..cc7a6271b6b4 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -180,13 +180,18 @@ void notrace restore_interrupts(void)
> 
> void replay_soft_interrupts(void)
> {
> +	struct pt_regs regs;
> +
> 	/*
> -	 * We use local_paca rather than get_paca() to avoid all
> -	 * the debug_smp_processor_id() business in this low level
> -	 * function
> +	 * Be careful here, calling these interrupt handlers can cause
> +	 * softirqs to be raised, which they may run when calling irq_exit,
> +	 * which will cause local_irq_enable() to be run, which can then
> +	 * recurse into this function. Don't keep any state across
> +	 * interrupt handler calls which may change underneath us.
> +	 *
> +	 * We use local_paca rather than get_paca() to avoid all the
> +	 * debug_smp_processor_id() business in this low level function.
> 	 */
> -	unsigned char happened = local_paca->irq_happened;
> -	struct pt_regs regs;
> 
> 	ppc_save_regs(&regs);
> 	regs.softe = IRQS_ENABLED;
> @@ -209,7 +214,7 @@ void replay_soft_interrupts(void)
> 	 * This is a higher priority interrupt than the others, so
> 	 * replay it first.
> 	 */
> -	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (happened & PACA_IRQ_HMI)) {
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (local_paca->irq_happened & PACA_IRQ_HMI)) {
> 		local_paca->irq_happened &= ~PACA_IRQ_HMI;
> 		regs.trap = 0xe60;
> 		handle_hmi_exception(&regs);
> @@ -217,7 +222,7 @@ void replay_soft_interrupts(void)
> 			hard_irq_disable();
> 	}
> 
> -	if (happened & PACA_IRQ_DEC) {
> +	if (local_paca->irq_happened & PACA_IRQ_DEC) {
> 		local_paca->irq_happened &= ~PACA_IRQ_DEC;
> 		regs.trap = 0x900;
> 		timer_interrupt(&regs);
> @@ -225,7 +230,7 @@ void replay_soft_interrupts(void)
> 			hard_irq_disable();
> 	}
> 
> -	if (happened & PACA_IRQ_EE) {
> +	if (local_paca->irq_happened & PACA_IRQ_EE) {
> 		local_paca->irq_happened &= ~PACA_IRQ_EE;
> 		regs.trap = 0x500;
> 		do_IRQ(&regs);
> @@ -233,7 +238,7 @@ void replay_soft_interrupts(void)
> 			hard_irq_disable();
> 	}
> 
> -	if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (happened & PACA_IRQ_DBELL)) {
> +	if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (local_paca->irq_happened & PACA_IRQ_DBELL)) {
> 		local_paca->irq_happened &= ~PACA_IRQ_DBELL;
> 		if (IS_ENABLED(CONFIG_PPC_BOOK3E))
> 			regs.trap = 0x280;
> @@ -245,7 +250,7 @@ void replay_soft_interrupts(void)
> 	}
> 
> 	/* Book3E does not support soft-masking PMI interrupts */
> -	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (happened & PACA_IRQ_PMI)) {
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (local_paca->irq_happened & PACA_IRQ_PMI)) {
> 		local_paca->irq_happened &= ~PACA_IRQ_PMI;
> 		regs.trap = 0xf00;
> 		performance_monitor_exception(&regs);
> @@ -253,8 +258,7 @@ void replay_soft_interrupts(void)
> 			hard_irq_disable();
> 	}
> 
> -	happened = local_paca->irq_happened;
> -	if (happened & ~PACA_IRQ_HARD_DIS) {
> +	if (local_paca->irq_happened & ~PACA_IRQ_HARD_DIS) {
> 		/*
> 		 * We are responding to the next interrupt, so interrupt-off
> 		 * latencies should be reset here.
> -- 
> 2.23.0
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt
  2021-01-23  6:12 [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt Nicholas Piggin
  2021-01-27  8:32 ` Athira Rajeev
@ 2021-02-03 11:46 ` Michael Ellerman
  1 sibling, 0 replies; 3+ messages in thread
From: Michael Ellerman @ 2021-02-03 11:46 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Athira Rajeev

On Sat, 23 Jan 2021 16:12:44 +1000, Nicholas Piggin wrote:
> When an asynchronous interrupt calls irq_exit, it checks for softirqs
> that may have been created, and runs them. Running softirqs enables
> local irqs, which can replay pending interrupts causing recursion in
> replay_soft_interrupts. This abridged trace shows how this can occur:
> 
> ! NIP replay_soft_interrupts
>   LR  interrupt_exit_kernel_prepare
>   Call Trace:
>     interrupt_exit_kernel_prepare (unreliable)
>     interrupt_return
>   --- interrupt: ea0 at __rb_reserve_next
>   NIP __rb_reserve_next
>   LR __rb_reserve_next
>   Call Trace:
>     ring_buffer_lock_reserve
>     trace_function
>     function_trace_call
>     ftrace_call
>     __do_softirq
>     irq_exit
>     timer_interrupt
> !   replay_soft_interrupts
>     interrupt_exit_kernel_prepare
>     interrupt_return
>   --- interrupt: ea0 at arch_local_irq_restore
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt
      https://git.kernel.org/powerpc/c/4025c784c573cab7e3f84746cc82b8033923ec62

cheers

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-02-03 13:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-23  6:12 [PATCH] powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt Nicholas Piggin
2021-01-27  8:32 ` Athira Rajeev
2021-02-03 11:46 ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.