All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 14/39] powerpc/perf: move perf irq/nmi handling details into traps.c
Date: Wed, 20 Jan 2021 14:21:45 +1000	[thread overview]
Message-ID: <1611116256.ntg9xvtsyb.astroid@bobo.none> (raw)
In-Reply-To: <1611108829.0isdl3z9na.astroid@bobo.none>

Excerpts from Nicholas Piggin's message of January 20, 2021 1:09 pm:
> Excerpts from Athira Rajeev's message of January 19, 2021 8:24 pm:
>> 
>> [  883.900762] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
>> [  883.901381] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           OE     5.11.0-rc3+ #34
>> --
>> [  883.901999] NIP [c0000000000168d0] replay_soft_interrupts+0x70/0x2f0
>> [  883.902032] LR [c00000000003b2b8] interrupt_exit_kernel_prepare+0x1e8/0x240
>> [  883.902063] Call Trace:
>> [  883.902085] [c000000001c96f50] [c00000000003b2b8] interrupt_exit_kernel_prepare+0x1e8/0x240 (unreliable)
>> [  883.902139] [c000000001c96fb0] [c00000000000fd88] interrupt_return+0x158/0x200
>> [  883.902185] --- interrupt: ea0 at __rb_reserve_next+0xc0/0x5b0
>> [  883.902224] NIP:  c0000000002d8980 LR: c0000000002d897c CTR: c0000000001aad90
>> [  883.902262] REGS: c000000001c97020 TRAP: 0ea0   Tainted: G           OE      (5.11.0-rc3+)
>> [  883.902301] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28000484  XER: 20040000
>> [  883.902387] CFAR: c00000000000fe00 IRQMASK: 0 
>> --
>> [  883.902757] NIP [c0000000002d8980] __rb_reserve_next+0xc0/0x5b0
>> [  883.902786] LR [c0000000002d897c] __rb_reserve_next+0xbc/0x5b0
>> [  883.902824] --- interrupt: ea0
>> [  883.902848] [c000000001c97360] [c0000000002d8fcc] ring_buffer_lock_reserve+0x15c/0x580
>> [  883.902894] [c000000001c973f0] [c0000000002e82fc] trace_function+0x4c/0x1c0
>> [  883.902930] [c000000001c97440] [c0000000002f6f50] function_trace_call+0x140/0x190
>> [  883.902976] [c000000001c97470] [c00000000007d6f8] ftrace_call+0x4/0x44
>> [  883.903021] [c000000001c97660] [c000000000dcf70c] __do_softirq+0x15c/0x3d4
>> [  883.903066] [c000000001c97750] [c00000000015fc68] irq_exit+0x198/0x1b0
>> [  883.903102] [c000000001c97780] [c000000000dc1790] timer_interrupt+0x170/0x3b0
>> [  883.903148] [c000000001c977e0] [c000000000016994] replay_soft_interrupts+0x134/0x2f0
>> [  883.903193] [c000000001c979d0] [c00000000003b2b8] interrupt_exit_kernel_prepare+0x1e8/0x240
>> [  883.903240] [c000000001c97a30] [c00000000000fd88] interrupt_return+0x158/0x200
>> [  883.903276] --- interrupt: ea0 at arch_local_irq_restore+0x70/0xc0
> 
> You got a 0xea0 interrupt in the ftrace code. I wonder where it is 
> looping. Do you see more soft lockup messages?

We should probably fix this recursion too. I was vaguely aware of it and 
thought it might have existed with the old interrupt exit and replay 
code as well and was pretty well bounded, but I'm not entirely sure it's
okay. And now that I've thought about it a bit harder, I think there is
actualy a simple way to fix it -

[PATCH] powerpc/64: prevent replayed interrupt handlers from running
 softirqs

Running softirqs enables interrupts, which can then end up recursing
into the irq soft-mask code we're trying to adjust, including replaying
interrupts itself which may not be bounded. This abridged trace shows
how this can occur:

  NIP replay_soft_interrupts
  LR  interrupt_exit_kernel_prepare
  Call Trace:
    interrupt_exit_kernel_prepare (unreliable)
    interrupt_return
  --- interrupt: ea0 at __rb_reserve_next
  NIP __rb_reserve_next
  LR __rb_reserve_next
  Call Trace:
    ring_buffer_lock_reserve
    trace_function
    function_trace_call
    ftrace_call
    __do_softirq
    irq_exit
   timer_interrupt
   replay_soft_interrupts
   interrupt_exit_kernel_prepare
   interrupt_return
  --- interrupt: ea0 at arch_local_irq_restore

Fix this by disabling bhs (softirqs) around the interrupt replay.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/irq.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 681abb7c0507..bb0d4fc8df89 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -189,6 +189,18 @@ void replay_soft_interrupts(void)
 	unsigned char happened = local_paca->irq_happened;
 	struct pt_regs regs;
 
+	/*
+	 * Prevent softirqs from being run when an interrupt handler returns
+	 * and calls irq_exit(), because softirq processing enables interrupts.
+	 * If an interrupt is taken, it may then call replay_soft_interrupts
+	 * on its way out, which gets messy and recursive.
+	 *
+	 * softirqs created by replayed interrupts will be run at the end of
+	 * this function when bhs are enabled (if they were enabled in our
+	 * caller).
+	 */
+	local_bh_disable();
+
 	ppc_save_regs(&regs);
 	regs.softe = IRQS_ENABLED;
 
@@ -264,6 +276,8 @@ void replay_soft_interrupts(void)
 		trace_hardirqs_off();
 		goto again;
 	}
+
+	local_bh_enable();
 }
 
 notrace void arch_local_irq_restore(unsigned long mask)
-- 
2.23.0


  reply	other threads:[~2021-01-20  4:23 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-15 16:49 [PATCH v6 00/39] powerpc: interrupt wrappers Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 01/39] KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs Nicholas Piggin
2021-01-15 16:49   ` Nicholas Piggin
2021-01-16 10:38   ` Nicholas Piggin
2021-01-16 10:38     ` Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 02/39] powerpc/32s: move DABR match out of handle_page_fault Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 03/39] powerpc/64s: " Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 04/39] powerpc/64s: move the hash fault handling logic to C Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 05/39] powerpc: remove arguments from fault handler functions Nicholas Piggin
2021-01-27  6:38   ` Christophe Leroy
2021-01-28  0:05     ` Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 06/39] powerpc: do_break get registers from regs Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 07/39] powerpc: bad_page_fault " Nicholas Piggin
2021-01-15 17:09   ` Christophe Leroy
2021-01-16  0:42     ` Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 08/39] powerpc: rearrange do_page_fault error case to be inside exception_enter Nicholas Piggin
2021-01-28  9:25   ` Christophe Leroy
2021-01-15 16:49 ` [PATCH v6 09/39] powerpc/64s: move bad_page_fault handling to C Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 10/39] powerpc/64s: split do_hash_fault Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 11/39] powerpc/mm: Remove stale do_page_fault comment referring to SLB faults Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 12/39] powerpc/64s: slb comment update Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 13/39] powerpc/traps: add NOKPROBE_SYMBOL for sreset and mce Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 14/39] powerpc/perf: move perf irq/nmi handling details into traps.c Nicholas Piggin
2021-01-19 10:24   ` Athira Rajeev
2021-01-20  3:09     ` Nicholas Piggin
2021-01-20  4:21       ` Nicholas Piggin [this message]
2021-01-27  5:49       ` Athira Rajeev
2021-01-15 16:49 ` [PATCH v6 15/39] powerpc/time: move timer_broadcast_interrupt prototype to asm/time.h Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 16/39] powerpc: add and use unknown_async_exception Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 17/39] powerpc/fsl_booke/32: CacheLockingException remove args Nicholas Piggin
2021-01-15 17:14   ` Christophe Leroy
2021-01-16  0:43     ` Nicholas Piggin
2021-01-16  7:38       ` Christophe Leroy
2021-01-16 10:34         ` Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 18/39] powerpc: DebugException " Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 19/39] powerpc/cell: tidy up pervasive declarations Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 20/39] powerpc: introduce die_mce Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 21/39] powerpc/mce: ensure machine check handler always tests RI Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 22/39] powerpc: improve handling of unrecoverable system reset Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 23/39] powerpc: interrupt handler wrapper functions Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 24/39] powerpc: add interrupt wrapper entry / exit stub functions Nicholas Piggin
2021-01-15 16:49 ` [PATCH v6 25/39] powerpc: convert interrupt handlers to use wrappers Nicholas Piggin
2021-01-20 10:48   ` kernel test robot
2021-01-20 10:48     ` kernel test robot
2021-01-20 11:45   ` kernel test robot
2021-01-20 11:45     ` kernel test robot
2021-01-15 16:49 ` [PATCH v6 26/39] powerpc: add interrupt_cond_local_irq_enable helper Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 27/39] powerpc/64: context tracking remove _TIF_NOHZ Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 28/39] powerpc/64s/hash: improve context tracking of hash faults Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 29/39] powerpc/64: context tracking move to interrupt wrappers Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 30/39] powerpc/64: add context tracking to asynchronous interrupts Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 31/39] powerpc: handle irq_enter/irq_exit in interrupt handler wrappers Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 32/39] powerpc/64s: move context tracking exit to interrupt exit path Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 33/39] powerpc/64s: reconcile interrupts in C Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 34/39] powerpc/64: move account_stolen_time into its own function Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 35/39] powerpc/64: entry cpu time accounting in C Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 36/39] powerpc: move NMI entry/exit code into wrapper Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 37/39] powerpc/64s: move NMI soft-mask handling to C Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 38/39] powerpc/64s: runlatch interrupt handling in C Nicholas Piggin
2021-01-15 16:50 ` [PATCH v6 39/39] powerpc/64s: power4 nap fixup " Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1611116256.ntg9xvtsyb.astroid@bobo.none \
    --to=npiggin@gmail.com \
    --cc=atrajeev@linux.vnet.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.