All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs
@ 2021-01-20  7:50 Nicholas Piggin
  2021-01-20 18:12 ` Athira Rajeev
  2021-01-21 12:50 ` Michael Ellerman
  0 siblings, 2 replies; 4+ messages in thread
From: Nicholas Piggin @ 2021-01-20  7:50 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Running softirqs enables interrupts, which can then end up recursing
into the irq soft-mask code we're adjusting, including replaying
interrupts itself, which might be theoretically unbounded.

This abridged trace shows how this can occur:

! NIP replay_soft_interrupts
  LR  interrupt_exit_kernel_prepare
  Call Trace:
    interrupt_exit_kernel_prepare (unreliable)
    interrupt_return
  --- interrupt: ea0 at __rb_reserve_next
  NIP __rb_reserve_next
  LR __rb_reserve_next
  Call Trace:
    ring_buffer_lock_reserve
    trace_function
    function_trace_call
    ftrace_call
    __do_softirq
    irq_exit
    timer_interrupt
!   replay_soft_interrupts
    interrupt_exit_kernel_prepare
    interrupt_return
  --- interrupt: ea0 at arch_local_irq_restore

Fix this by disabling bhs (softirqs) around the interrupt replay.

I don't know that commit 3282a3da25bd ("powerpc/64: Implement soft
interrupt replay in C") actually introduced the problem. Prior to that
change, the interrupt replay seems like it should still be subect to
this recusion, however it's done after all the irq state has been fixed
up at the end of the replay, so it seems reasonable to fix back to this
commit.

Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/irq.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6b1eca53e36c..7064135f9dc3 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -188,6 +188,18 @@ void replay_soft_interrupts(void)
 	unsigned char happened = local_paca->irq_happened;
 	struct pt_regs regs;
 
+	/*
+	 * Prevent softirqs from being run when an interrupt handler returns
+	 * and calls irq_exit(), because softirq processing enables interrupts.
+	 * If an interrupt is taken, it may then call replay_soft_interrupts
+	 * on its way out, which gets messy and recursive.
+	 *
+	 * softirqs created by replayed interrupts will be run at the end of
+	 * this function when bhs are enabled (if they were enabled in our
+	 * caller).
+	 */
+	local_bh_disable();
+
 	ppc_save_regs(&regs);
 	regs.softe = IRQS_ENABLED;
 
@@ -263,6 +275,8 @@ void replay_soft_interrupts(void)
 		trace_hardirqs_off();
 		goto again;
 	}
+
+	local_bh_enable();
 }
 
 notrace void arch_local_irq_restore(unsigned long mask)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs
  2021-01-20  7:50 [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs Nicholas Piggin
@ 2021-01-20 18:12 ` Athira Rajeev
  2021-01-21 12:50 ` Michael Ellerman
  1 sibling, 0 replies; 4+ messages in thread
From: Athira Rajeev @ 2021-01-20 18:12 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/html, Size: 6264 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs
  2021-01-20  7:50 [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs Nicholas Piggin
  2021-01-20 18:12 ` Athira Rajeev
@ 2021-01-21 12:50 ` Michael Ellerman
  2021-01-22  0:33   ` Nicholas Piggin
  1 sibling, 1 reply; 4+ messages in thread
From: Michael Ellerman @ 2021-01-21 12:50 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:
> Running softirqs enables interrupts, which can then end up recursing
> into the irq soft-mask code we're adjusting, including replaying
> interrupts itself, which might be theoretically unbounded.
>
> This abridged trace shows how this can occur:
>
> ! NIP replay_soft_interrupts
>   LR  interrupt_exit_kernel_prepare
>   Call Trace:
>     interrupt_exit_kernel_prepare (unreliable)
>     interrupt_return
>   --- interrupt: ea0 at __rb_reserve_next
>   NIP __rb_reserve_next
>   LR __rb_reserve_next
>   Call Trace:
>     ring_buffer_lock_reserve
>     trace_function
>     function_trace_call
>     ftrace_call
>     __do_softirq
>     irq_exit
>     timer_interrupt
> !   replay_soft_interrupts
>     interrupt_exit_kernel_prepare
>     interrupt_return
>   --- interrupt: ea0 at arch_local_irq_restore
>
> Fix this by disabling bhs (softirqs) around the interrupt replay.
>
> I don't know that commit 3282a3da25bd ("powerpc/64: Implement soft
> interrupt replay in C") actually introduced the problem. Prior to that
> change, the interrupt replay seems like it should still be subect to
> this recusion, however it's done after all the irq state has been fixed
> up at the end of the replay, so it seems reasonable to fix back to this
> commit.

This seems very unhappy for me (on P9 bare metal):

[    0.038571] Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
[    0.040194] ------------[ cut here ]------------
[    0.040228] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:176 __local_bh_enable_ip+0x150/0x210
[    0.040263] Modules linked in:
[    0.040280] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-rc2-00008-g4899f32e4f2a #1
[    0.040321] NIP:  c000000000114bc0 LR: c0000000000172a0 CTR: c00000000002a020
[    0.040360] REGS: c00000000177f670 TRAP: 0700   Not tainted  (5.11.0-rc2-00008-g4899f32e4f2a)
[    0.040410] MSR:  9000000002021033 <SF,HV,VEC,ME,IR,DR,RI,LE>  CR: 28000224  XER: 20040000
[    0.040472] CFAR: c000000000114ae8 IRQMASK: 3
               GPR00: c0000000000172a0 c00000000177f910 c000000001783900 c000000000017290
               GPR04: 0000000000000200 4000000000000000 0000000000000002 00000001312d0000
               GPR08: 0000000000000000 c0000000016f3480 0000000000000202 0000000000000000
               GPR12: c00000000002a020 c0000000023a0000 0000000000000000 0000000000000000
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
               GPR20: 0000000000000001 00000000100051c6 0000000000000000 0000000000000009
               GPR24: 0000000000000e60 0000000000000900 0000000000000500 0000000000000a00
               GPR28: 0000000000000f00 0000000000000002 0000000000000003 0000000000000200
[    0.040824] NIP [c000000000114bc0] __local_bh_enable_ip+0x150/0x210
[    0.040863] LR [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
[    0.040904] Call Trace:
[    0.040926] [c00000000177f910] [0000000000000500] 0x500 (unreliable)
[    0.040962] [c00000000177f950] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
[    0.041008] [c00000000177fb50] [c000000000017370] arch_local_irq_restore+0x70/0xe0
[    0.041042] [c00000000177fb80] [c000000000476514] kmem_cache_alloc+0x474/0x520
[    0.041066] [c00000000177fc00] [c0000000004e394c] __d_alloc+0x4c/0x2e0
[    0.041109] [c00000000177fc50] [c0000000004e40ac] d_make_root+0x3c/0xa0
[    0.041142] [c00000000177fc80] [c000000000679ce0] ramfs_fill_super+0x80/0xb0
[    0.041186] [c00000000177fcb0] [c0000000004c1b04] get_tree_nodev+0xb4/0x130
[    0.041230] [c00000000177fcf0] [c000000000679578] ramfs_get_tree+0x28/0x40
[    0.041282] [c00000000177fd10] [c0000000004bee78] vfs_get_tree+0x48/0x120
[    0.041325] [c00000000177fd80] [c0000000004f7fe0] vfs_kern_mount.part.0+0xd0/0x130
[    0.041368] [c00000000177fdc0] [c000000001366700] mnt_init+0x1c8/0x2fc
[    0.041420] [c00000000177fe60] [c000000001366178] vfs_caches_init+0x110/0x138
[    0.041454] [c00000000177fee0] [c000000001331020] start_kernel+0x6d8/0x780
[    0.041497] [c00000000177ff90] [c00000000000d354] start_here_common+0x1c/0x5c8
[    0.041539] Instruction dump:
[    0.041555] e9490002 394a0001 91490000 e90d0028 3d42ffcc 394a4730 7d0a42aa e9490002
[    0.041608] 2c280000 394affff 91490000 4082ff30 <0fe00000> 892d0988 39400001 994d0988
[    0.041660] irq event stamp: 555
[    0.041674] hardirqs last  enabled at (553): [<c00000000047654c>] kmem_cache_alloc+0x4ac/0x520
[    0.041707] hardirqs last disabled at (554): [<c000000000017368>] arch_local_irq_restore+0x68/0xe0
[    0.041750] softirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.041778] softirqs last disabled at (555): [<c000000000016fd0>] replay_soft_interrupts+0x10/0x340
[    0.041824] ---[ end trace aa6f9769e07e43db ]---


And lots and lots of these, or similar:


[   14.369838] =============================
[   14.369839] WARNING: suspicious RCU usage
[   14.369841] 5.11.0-rc2-00008-g4899f32e4f2a #1 Tainted: G        W
[   14.369843] -----------------------------
[   14.369844] include/linux/rcupdate.h:692 rcu_read_unlock() used illegally while idle!
[   14.369846]
               other info that might help us debug this:

[   14.369848]
               rcu_scheduler_active = 2, debug_locks = 1
[   14.369850] RCU used illegally from extended quiescent state!
[   14.369851] 2 locks held by swapper/32/0:
[   14.369853]  #0: c0000000015e6fc0 (rcu_callback){....}-{0:0}, at: rcu_core+0x2e0/0x990
[   14.369864]  #1: c0000000015e6f30 (rcu_read_lock){....}-{1:3}, at: kmem_cache_free+0x7cc/0x7e0
[   14.369874]
               stack backtrace:
[   14.369876] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G        W         5.11.0-rc2-00008-g4899f32e4f2a #1
[   14.369879] Call Trace:
[   14.369880] [c000001fff557c10] [c0000000008630b8] dump_stack+0xec/0x144 (unreliable)
[   14.369886] [c000001fff557c60] [c0000000001ad2d0] lockdep_rcu_suspicious+0x124/0x144
[   14.369890] [c000001fff557cf0] [c00000000047783c] kmem_cache_free+0x2ac/0x7e0
[   14.369894] [c000001fff557db0] [c0000000004bdeac] file_free_rcu+0x5c/0xa0
[   14.369898] [c000001fff557de0] [c0000000001e214c] rcu_core+0x33c/0x990
[   14.369902] [c000001fff557e90] [c000000000f496d0] __do_softirq+0x180/0x688
[   14.369906] [c000001fff557f90] [c0000000000307bc] call_do_softirq+0x14/0x24
[   14.369911] [c000000002e1fab0] [c000000000017418] do_softirq_own_stack+0x38/0x50
[   14.369916] [c000000002e1fad0] [c000000000114a60] do_softirq+0x120/0x130
[   14.369920] [c000000002e1fb00] [c000000000114c64] __local_bh_enable_ip+0x1f4/0x210
[   14.369924] [c000000002e1fb40] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
[   14.369928] [c000000002e1fd40] [c000000000017370] arch_local_irq_restore+0x70/0xe0
[   14.369933] [c000000002e1fd70] [c000000000c87184] snooze_loop+0x64/0x2e4
[   14.369937] [c000000002e1fdb0] [c000000000c84204] cpuidle_enter_state+0x2e4/0x550
[   14.369941] [c000000002e1fe10] [c000000000c8450c] cpuidle_enter+0x4c/0x70
[   14.369946] [c000000002e1fe50] [c00000000016892c] call_cpuidle+0x4c/0x90
[   14.369949] [c000000002e1fe70] [c000000000168d74] do_idle+0x2f4/0x380
[   14.369953] [c000000002e1ff10] [c000000000169208] cpu_startup_entry+0x38/0x40
[   14.369957] [c000000002e1ff40] [c000000000053484] start_secondary+0x2a4/0x2b0
[   14.369961] [c000000002e1ff90] [c00000000000d254] start_secondary_prolog+0x10/0x14


cheers

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs
  2021-01-21 12:50 ` Michael Ellerman
@ 2021-01-22  0:33   ` Nicholas Piggin
  0 siblings, 0 replies; 4+ messages in thread
From: Nicholas Piggin @ 2021-01-22  0:33 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman

Excerpts from Michael Ellerman's message of January 21, 2021 10:50 pm:
> Nicholas Piggin <npiggin@gmail.com> writes:
>> Running softirqs enables interrupts, which can then end up recursing
>> into the irq soft-mask code we're adjusting, including replaying
>> interrupts itself, which might be theoretically unbounded.
>>
>> This abridged trace shows how this can occur:
>>
>> ! NIP replay_soft_interrupts
>>   LR  interrupt_exit_kernel_prepare
>>   Call Trace:
>>     interrupt_exit_kernel_prepare (unreliable)
>>     interrupt_return
>>   --- interrupt: ea0 at __rb_reserve_next
>>   NIP __rb_reserve_next
>>   LR __rb_reserve_next
>>   Call Trace:
>>     ring_buffer_lock_reserve
>>     trace_function
>>     function_trace_call
>>     ftrace_call
>>     __do_softirq
>>     irq_exit
>>     timer_interrupt
>> !   replay_soft_interrupts
>>     interrupt_exit_kernel_prepare
>>     interrupt_return
>>   --- interrupt: ea0 at arch_local_irq_restore
>>
>> Fix this by disabling bhs (softirqs) around the interrupt replay.
>>
>> I don't know that commit 3282a3da25bd ("powerpc/64: Implement soft
>> interrupt replay in C") actually introduced the problem. Prior to that
>> change, the interrupt replay seems like it should still be subect to
>> this recusion, however it's done after all the irq state has been fixed
>> up at the end of the replay, so it seems reasonable to fix back to this
>> commit.
> 
> This seems very unhappy for me (on P9 bare metal):

Oh, damn lockdep I always forget it has a bunch of interrupt checks.

In this case it might have a point though. Thanks, I'll try to fix it.

Thanks,
Nick

> 
> [    0.038571] Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
> [    0.040194] ------------[ cut here ]------------
> [    0.040228] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:176 __local_bh_enable_ip+0x150/0x210
> [    0.040263] Modules linked in:
> [    0.040280] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-rc2-00008-g4899f32e4f2a #1
> [    0.040321] NIP:  c000000000114bc0 LR: c0000000000172a0 CTR: c00000000002a020
> [    0.040360] REGS: c00000000177f670 TRAP: 0700   Not tainted  (5.11.0-rc2-00008-g4899f32e4f2a)
> [    0.040410] MSR:  9000000002021033 <SF,HV,VEC,ME,IR,DR,RI,LE>  CR: 28000224  XER: 20040000
> [    0.040472] CFAR: c000000000114ae8 IRQMASK: 3
>                GPR00: c0000000000172a0 c00000000177f910 c000000001783900 c000000000017290
>                GPR04: 0000000000000200 4000000000000000 0000000000000002 00000001312d0000
>                GPR08: 0000000000000000 c0000000016f3480 0000000000000202 0000000000000000
>                GPR12: c00000000002a020 c0000000023a0000 0000000000000000 0000000000000000
>                GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>                GPR20: 0000000000000001 00000000100051c6 0000000000000000 0000000000000009
>                GPR24: 0000000000000e60 0000000000000900 0000000000000500 0000000000000a00
>                GPR28: 0000000000000f00 0000000000000002 0000000000000003 0000000000000200
> [    0.040824] NIP [c000000000114bc0] __local_bh_enable_ip+0x150/0x210
> [    0.040863] LR [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [    0.040904] Call Trace:
> [    0.040926] [c00000000177f910] [0000000000000500] 0x500 (unreliable)
> [    0.040962] [c00000000177f950] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [    0.041008] [c00000000177fb50] [c000000000017370] arch_local_irq_restore+0x70/0xe0
> [    0.041042] [c00000000177fb80] [c000000000476514] kmem_cache_alloc+0x474/0x520
> [    0.041066] [c00000000177fc00] [c0000000004e394c] __d_alloc+0x4c/0x2e0
> [    0.041109] [c00000000177fc50] [c0000000004e40ac] d_make_root+0x3c/0xa0
> [    0.041142] [c00000000177fc80] [c000000000679ce0] ramfs_fill_super+0x80/0xb0
> [    0.041186] [c00000000177fcb0] [c0000000004c1b04] get_tree_nodev+0xb4/0x130
> [    0.041230] [c00000000177fcf0] [c000000000679578] ramfs_get_tree+0x28/0x40
> [    0.041282] [c00000000177fd10] [c0000000004bee78] vfs_get_tree+0x48/0x120
> [    0.041325] [c00000000177fd80] [c0000000004f7fe0] vfs_kern_mount.part.0+0xd0/0x130
> [    0.041368] [c00000000177fdc0] [c000000001366700] mnt_init+0x1c8/0x2fc
> [    0.041420] [c00000000177fe60] [c000000001366178] vfs_caches_init+0x110/0x138
> [    0.041454] [c00000000177fee0] [c000000001331020] start_kernel+0x6d8/0x780
> [    0.041497] [c00000000177ff90] [c00000000000d354] start_here_common+0x1c/0x5c8
> [    0.041539] Instruction dump:
> [    0.041555] e9490002 394a0001 91490000 e90d0028 3d42ffcc 394a4730 7d0a42aa e9490002
> [    0.041608] 2c280000 394affff 91490000 4082ff30 <0fe00000> 892d0988 39400001 994d0988
> [    0.041660] irq event stamp: 555
> [    0.041674] hardirqs last  enabled at (553): [<c00000000047654c>] kmem_cache_alloc+0x4ac/0x520
> [    0.041707] hardirqs last disabled at (554): [<c000000000017368>] arch_local_irq_restore+0x68/0xe0
> [    0.041750] softirqs last  enabled at (0): [<0000000000000000>] 0x0
> [    0.041778] softirqs last disabled at (555): [<c000000000016fd0>] replay_soft_interrupts+0x10/0x340
> [    0.041824] ---[ end trace aa6f9769e07e43db ]---
> 
> 
> And lots and lots of these, or similar:
> 
> 
> [   14.369838] =============================
> [   14.369839] WARNING: suspicious RCU usage
> [   14.369841] 5.11.0-rc2-00008-g4899f32e4f2a #1 Tainted: G        W
> [   14.369843] -----------------------------
> [   14.369844] include/linux/rcupdate.h:692 rcu_read_unlock() used illegally while idle!
> [   14.369846]
>                other info that might help us debug this:
> 
> [   14.369848]
>                rcu_scheduler_active = 2, debug_locks = 1
> [   14.369850] RCU used illegally from extended quiescent state!
> [   14.369851] 2 locks held by swapper/32/0:
> [   14.369853]  #0: c0000000015e6fc0 (rcu_callback){....}-{0:0}, at: rcu_core+0x2e0/0x990
> [   14.369864]  #1: c0000000015e6f30 (rcu_read_lock){....}-{1:3}, at: kmem_cache_free+0x7cc/0x7e0
> [   14.369874]
>                stack backtrace:
> [   14.369876] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G        W         5.11.0-rc2-00008-g4899f32e4f2a #1
> [   14.369879] Call Trace:
> [   14.369880] [c000001fff557c10] [c0000000008630b8] dump_stack+0xec/0x144 (unreliable)
> [   14.369886] [c000001fff557c60] [c0000000001ad2d0] lockdep_rcu_suspicious+0x124/0x144
> [   14.369890] [c000001fff557cf0] [c00000000047783c] kmem_cache_free+0x2ac/0x7e0
> [   14.369894] [c000001fff557db0] [c0000000004bdeac] file_free_rcu+0x5c/0xa0
> [   14.369898] [c000001fff557de0] [c0000000001e214c] rcu_core+0x33c/0x990
> [   14.369902] [c000001fff557e90] [c000000000f496d0] __do_softirq+0x180/0x688
> [   14.369906] [c000001fff557f90] [c0000000000307bc] call_do_softirq+0x14/0x24
> [   14.369911] [c000000002e1fab0] [c000000000017418] do_softirq_own_stack+0x38/0x50
> [   14.369916] [c000000002e1fad0] [c000000000114a60] do_softirq+0x120/0x130
> [   14.369920] [c000000002e1fb00] [c000000000114c64] __local_bh_enable_ip+0x1f4/0x210
> [   14.369924] [c000000002e1fb40] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [   14.369928] [c000000002e1fd40] [c000000000017370] arch_local_irq_restore+0x70/0xe0
> [   14.369933] [c000000002e1fd70] [c000000000c87184] snooze_loop+0x64/0x2e4
> [   14.369937] [c000000002e1fdb0] [c000000000c84204] cpuidle_enter_state+0x2e4/0x550
> [   14.369941] [c000000002e1fe10] [c000000000c8450c] cpuidle_enter+0x4c/0x70
> [   14.369946] [c000000002e1fe50] [c00000000016892c] call_cpuidle+0x4c/0x90
> [   14.369949] [c000000002e1fe70] [c000000000168d74] do_idle+0x2f4/0x380
> [   14.369953] [c000000002e1ff10] [c000000000169208] cpu_startup_entry+0x38/0x40
> [   14.369957] [c000000002e1ff40] [c000000000053484] start_secondary+0x2a4/0x2b0
> [   14.369961] [c000000002e1ff90] [c00000000000d254] start_secondary_prolog+0x10/0x14
> 
> 
> cheers
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-22  0:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20  7:50 [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs Nicholas Piggin
2021-01-20 18:12 ` Athira Rajeev
2021-01-21 12:50 ` Michael Ellerman
2021-01-22  0:33   ` Nicholas Piggin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.