* [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-02 12:35 ` Will Deacon 0 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-02 12:35 UTC (permalink / raw) To: linux-kernel Cc: linux-arm-kernel, contact, Will Deacon, Russell King, Greg Kroah-Hartman, Ingo Molnar, Kees Cook, Andrew Morton, stable Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the calling CPU in an infinite loop, but with interrupts and preemption enabled. From this state, userspace can continue to be scheduled, despite the system being "dead" as far as the kernel is concerned. This is easily reproducible on arm64 when booting with "nosmp" on the command line; a couple of shell scripts print out a periodic "Ping" message whilst another triggers a crash by writing to /proc/sysrq-trigger: | sysrq: Trigger a crash | Kernel panic - not syncing: sysrq triggered crash | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x148 | show_stack+0x14/0x20 | dump_stack+0xa0/0xc4 | panic+0x140/0x32c | sysrq_handle_reboot+0x0/0x20 | __handle_sysrq+0x124/0x190 | write_sysrq_trigger+0x64/0x88 | proc_reg_write+0x60/0xa8 | __vfs_write+0x18/0x40 | vfs_write+0xa4/0x1b8 | ksys_write+0x64/0xf0 | __arm64_sys_write+0x14/0x20 | el0_svc_common.constprop.0+0xb0/0x168 | el0_svc_handler+0x28/0x78 | el0_svc+0x8/0xc | Kernel Offset: disabled | CPU features: 0x0002,24002004 | Memory Limit: none | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- | Ping 2! | Ping 1! | Ping 1! | Ping 2! The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise local interrupts are disabled in 'smp_send_stop()'. Disable preemption in 'panic()' before re-enabling interrupts. Cc: Russell King <linux@armlinux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone Reported-by: Xogium <contact@xogium.me> Signed-off-by: Will Deacon <will@kernel.org> --- kernel/panic.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/panic.c b/kernel/panic.c index 47e8ebccc22b..f470a038b05b 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) * after setting panic_cpu) from invoking panic() again. */ local_irq_disable(); + preempt_disable_notrace(); /* * It's possible to come here directly from a panic-assertion and -- 2.23.0.444.g18eeb5a265-goog ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-02 12:35 ` Will Deacon 0 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-02 12:35 UTC (permalink / raw) To: linux-kernel Cc: Kees Cook, Greg Kroah-Hartman, contact, Russell King, stable, Ingo Molnar, Andrew Morton, Will Deacon, linux-arm-kernel Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the calling CPU in an infinite loop, but with interrupts and preemption enabled. From this state, userspace can continue to be scheduled, despite the system being "dead" as far as the kernel is concerned. This is easily reproducible on arm64 when booting with "nosmp" on the command line; a couple of shell scripts print out a periodic "Ping" message whilst another triggers a crash by writing to /proc/sysrq-trigger: | sysrq: Trigger a crash | Kernel panic - not syncing: sysrq triggered crash | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x148 | show_stack+0x14/0x20 | dump_stack+0xa0/0xc4 | panic+0x140/0x32c | sysrq_handle_reboot+0x0/0x20 | __handle_sysrq+0x124/0x190 | write_sysrq_trigger+0x64/0x88 | proc_reg_write+0x60/0xa8 | __vfs_write+0x18/0x40 | vfs_write+0xa4/0x1b8 | ksys_write+0x64/0xf0 | __arm64_sys_write+0x14/0x20 | el0_svc_common.constprop.0+0xb0/0x168 | el0_svc_handler+0x28/0x78 | el0_svc+0x8/0xc | Kernel Offset: disabled | CPU features: 0x0002,24002004 | Memory Limit: none | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- | Ping 2! | Ping 1! | Ping 1! | Ping 2! The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise local interrupts are disabled in 'smp_send_stop()'. Disable preemption in 'panic()' before re-enabling interrupts. Cc: Russell King <linux@armlinux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone Reported-by: Xogium <contact@xogium.me> Signed-off-by: Will Deacon <will@kernel.org> --- kernel/panic.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/panic.c b/kernel/panic.c index 47e8ebccc22b..f470a038b05b 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) * after setting panic_cpu) from invoking panic() again. */ local_irq_disable(); + preempt_disable_notrace(); /* * It's possible to come here directly from a panic-assertion and -- 2.23.0.444.g18eeb5a265-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-02 12:35 ` Will Deacon @ 2019-10-02 20:58 ` Kees Cook -1 siblings, 0 replies; 22+ messages in thread From: Kees Cook @ 2019-10-02 20:58 UTC (permalink / raw) To: Will Deacon Cc: linux-kernel, linux-arm-kernel, contact, Russell King, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, stable, Feng Tang, Petr Mladek, Steven Rostedt, Sergey Senozhatsky On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > calling CPU in an infinite loop, but with interrupts and preemption > enabled. From this state, userspace can continue to be scheduled, > despite the system being "dead" as far as the kernel is concerned. This > is easily reproducible on arm64 when booting with "nosmp" on the command > line; a couple of shell scripts print out a periodic "Ping" message > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > | sysrq: Trigger a crash > | Kernel panic - not syncing: sysrq triggered crash > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > | Hardware name: linux,dummy-virt (DT) > | Call trace: > | dump_backtrace+0x0/0x148 > | show_stack+0x14/0x20 > | dump_stack+0xa0/0xc4 > | panic+0x140/0x32c > | sysrq_handle_reboot+0x0/0x20 > | __handle_sysrq+0x124/0x190 > | write_sysrq_trigger+0x64/0x88 > | proc_reg_write+0x60/0xa8 > | __vfs_write+0x18/0x40 > | vfs_write+0xa4/0x1b8 > | ksys_write+0x64/0xf0 > | __arm64_sys_write+0x14/0x20 > | el0_svc_common.constprop.0+0xb0/0x168 > | el0_svc_handler+0x28/0x78 > | el0_svc+0x8/0xc > | Kernel Offset: disabled > | CPU features: 0x0002,24002004 > | Memory Limit: none > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > | Ping 2! > | Ping 1! > | Ping 1! > | Ping 2! > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > local interrupts are disabled in 'smp_send_stop()'. > > Disable preemption in 'panic()' before re-enabling interrupts. Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: avoid the extra noise dmesg") was trying to fix? Either way: Reviewed-by: Kees Cook <keescook@chromium.org> -Kees > > Cc: Russell King <linux@armlinux.org.uk> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: <stable@vger.kernel.org> > Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone > Reported-by: Xogium <contact@xogium.me> > Signed-off-by: Will Deacon <will@kernel.org> > --- > kernel/panic.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/panic.c b/kernel/panic.c > index 47e8ebccc22b..f470a038b05b 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > * after setting panic_cpu) from invoking panic() again. > */ > local_irq_disable(); > + preempt_disable_notrace(); > > /* > * It's possible to come here directly from a panic-assertion and > -- > 2.23.0.444.g18eeb5a265-goog > -- Kees Cook ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-02 20:58 ` Kees Cook 0 siblings, 0 replies; 22+ messages in thread From: Kees Cook @ 2019-10-02 20:58 UTC (permalink / raw) To: Will Deacon Cc: Petr Mladek, Feng Tang, Greg Kroah-Hartman, contact, linux-kernel, stable, Russell King, Sergey Senozhatsky, Ingo Molnar, Steven Rostedt, Andrew Morton, linux-arm-kernel On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > calling CPU in an infinite loop, but with interrupts and preemption > enabled. From this state, userspace can continue to be scheduled, > despite the system being "dead" as far as the kernel is concerned. This > is easily reproducible on arm64 when booting with "nosmp" on the command > line; a couple of shell scripts print out a periodic "Ping" message > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > | sysrq: Trigger a crash > | Kernel panic - not syncing: sysrq triggered crash > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > | Hardware name: linux,dummy-virt (DT) > | Call trace: > | dump_backtrace+0x0/0x148 > | show_stack+0x14/0x20 > | dump_stack+0xa0/0xc4 > | panic+0x140/0x32c > | sysrq_handle_reboot+0x0/0x20 > | __handle_sysrq+0x124/0x190 > | write_sysrq_trigger+0x64/0x88 > | proc_reg_write+0x60/0xa8 > | __vfs_write+0x18/0x40 > | vfs_write+0xa4/0x1b8 > | ksys_write+0x64/0xf0 > | __arm64_sys_write+0x14/0x20 > | el0_svc_common.constprop.0+0xb0/0x168 > | el0_svc_handler+0x28/0x78 > | el0_svc+0x8/0xc > | Kernel Offset: disabled > | CPU features: 0x0002,24002004 > | Memory Limit: none > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > | Ping 2! > | Ping 1! > | Ping 1! > | Ping 2! > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > local interrupts are disabled in 'smp_send_stop()'. > > Disable preemption in 'panic()' before re-enabling interrupts. Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: avoid the extra noise dmesg") was trying to fix? Either way: Reviewed-by: Kees Cook <keescook@chromium.org> -Kees > > Cc: Russell King <linux@armlinux.org.uk> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: <stable@vger.kernel.org> > Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone > Reported-by: Xogium <contact@xogium.me> > Signed-off-by: Will Deacon <will@kernel.org> > --- > kernel/panic.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/panic.c b/kernel/panic.c > index 47e8ebccc22b..f470a038b05b 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > * after setting panic_cpu) from invoking panic() again. > */ > local_irq_disable(); > + preempt_disable_notrace(); > > /* > * It's possible to come here directly from a panic-assertion and > -- > 2.23.0.444.g18eeb5a265-goog > -- Kees Cook _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-02 20:58 ` Kees Cook @ 2019-10-03 20:56 ` Will Deacon -1 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-03 20:56 UTC (permalink / raw) To: Kees Cook Cc: linux-kernel, linux-arm-kernel, contact, Russell King, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, stable, Feng Tang, Petr Mladek, Steven Rostedt, Sergey Senozhatsky Hi Kees, On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > calling CPU in an infinite loop, but with interrupts and preemption > > enabled. From this state, userspace can continue to be scheduled, > > despite the system being "dead" as far as the kernel is concerned. This > > is easily reproducible on arm64 when booting with "nosmp" on the command > > line; a couple of shell scripts print out a periodic "Ping" message > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > | sysrq: Trigger a crash > > | Kernel panic - not syncing: sysrq triggered crash > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > | Hardware name: linux,dummy-virt (DT) > > | Call trace: > > | dump_backtrace+0x0/0x148 > > | show_stack+0x14/0x20 > > | dump_stack+0xa0/0xc4 > > | panic+0x140/0x32c > > | sysrq_handle_reboot+0x0/0x20 > > | __handle_sysrq+0x124/0x190 > > | write_sysrq_trigger+0x64/0x88 > > | proc_reg_write+0x60/0xa8 > > | __vfs_write+0x18/0x40 > > | vfs_write+0xa4/0x1b8 > > | ksys_write+0x64/0xf0 > > | __arm64_sys_write+0x14/0x20 > > | el0_svc_common.constprop.0+0xb0/0x168 > > | el0_svc_handler+0x28/0x78 > > | el0_svc+0x8/0xc > > | Kernel Offset: disabled > > | CPU features: 0x0002,24002004 > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > | Ping 2! > > | Ping 1! > > | Ping 1! > > | Ping 2! > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > local interrupts are disabled in 'smp_send_stop()'. > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > avoid the extra noise dmesg") was trying to fix? Hmm, maybe, although that looks like it's focussed more on irq handling than preemption. I've deliberately left the irq part alone, since I think having magic sysrq work via the keyboard interrupt is desirable from the panic loop. > Reviewed-by: Kees Cook <keescook@chromium.org> Thanks! Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-03 20:56 ` Will Deacon 0 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-03 20:56 UTC (permalink / raw) To: Kees Cook Cc: Petr Mladek, Feng Tang, Greg Kroah-Hartman, contact, linux-kernel, stable, Russell King, Sergey Senozhatsky, Ingo Molnar, Steven Rostedt, Andrew Morton, linux-arm-kernel Hi Kees, On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > calling CPU in an infinite loop, but with interrupts and preemption > > enabled. From this state, userspace can continue to be scheduled, > > despite the system being "dead" as far as the kernel is concerned. This > > is easily reproducible on arm64 when booting with "nosmp" on the command > > line; a couple of shell scripts print out a periodic "Ping" message > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > | sysrq: Trigger a crash > > | Kernel panic - not syncing: sysrq triggered crash > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > | Hardware name: linux,dummy-virt (DT) > > | Call trace: > > | dump_backtrace+0x0/0x148 > > | show_stack+0x14/0x20 > > | dump_stack+0xa0/0xc4 > > | panic+0x140/0x32c > > | sysrq_handle_reboot+0x0/0x20 > > | __handle_sysrq+0x124/0x190 > > | write_sysrq_trigger+0x64/0x88 > > | proc_reg_write+0x60/0xa8 > > | __vfs_write+0x18/0x40 > > | vfs_write+0xa4/0x1b8 > > | ksys_write+0x64/0xf0 > > | __arm64_sys_write+0x14/0x20 > > | el0_svc_common.constprop.0+0xb0/0x168 > > | el0_svc_handler+0x28/0x78 > > | el0_svc+0x8/0xc > > | Kernel Offset: disabled > > | CPU features: 0x0002,24002004 > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > | Ping 2! > > | Ping 1! > > | Ping 1! > > | Ping 2! > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > local interrupts are disabled in 'smp_send_stop()'. > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > avoid the extra noise dmesg") was trying to fix? Hmm, maybe, although that looks like it's focussed more on irq handling than preemption. I've deliberately left the irq part alone, since I think having magic sysrq work via the keyboard interrupt is desirable from the panic loop. > Reviewed-by: Kees Cook <keescook@chromium.org> Thanks! Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-03 20:56 ` Will Deacon @ 2019-10-04 9:11 ` Petr Mladek -1 siblings, 0 replies; 22+ messages in thread From: Petr Mladek @ 2019-10-04 9:11 UTC (permalink / raw) To: Will Deacon Cc: Kees Cook, Russell King, Sergey Senozhatsky, Steven Rostedt, Feng Tang, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Thu 2019-10-03 21:56:34, Will Deacon wrote: > Hi Kees, > > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > > calling CPU in an infinite loop, but with interrupts and preemption > > > enabled. From this state, userspace can continue to be scheduled, > > > despite the system being "dead" as far as the kernel is concerned. This > > > is easily reproducible on arm64 when booting with "nosmp" on the command > > > line; a couple of shell scripts print out a periodic "Ping" message > > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > > > | sysrq: Trigger a crash > > > | Kernel panic - not syncing: sysrq triggered crash > > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > > | Hardware name: linux,dummy-virt (DT) > > > | Call trace: > > > | dump_backtrace+0x0/0x148 > > > | show_stack+0x14/0x20 > > > | dump_stack+0xa0/0xc4 > > > | panic+0x140/0x32c > > > | sysrq_handle_reboot+0x0/0x20 > > > | __handle_sysrq+0x124/0x190 > > > | write_sysrq_trigger+0x64/0x88 > > > | proc_reg_write+0x60/0xa8 > > > | __vfs_write+0x18/0x40 > > > | vfs_write+0xa4/0x1b8 > > > | ksys_write+0x64/0xf0 > > > | __arm64_sys_write+0x14/0x20 > > > | el0_svc_common.constprop.0+0xb0/0x168 > > > | el0_svc_handler+0x28/0x78 > > > | el0_svc+0x8/0xc > > > | Kernel Offset: disabled > > > | CPU features: 0x0002,24002004 > > > | Memory Limit: none > > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > > | Ping 2! > > > | Ping 1! > > > | Ping 1! > > > | Ping 2! > > > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > > local interrupts are disabled in 'smp_send_stop()'. > > > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > > avoid the extra noise dmesg") was trying to fix? > > Hmm, maybe, although that looks like it's focussed more on irq handling > than preemption. Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid the extra noise dmesg") is printed by wake_up() called from wake_up_klogd_work_func(). It is irq_work. Therefore disabling preemption would not prevent this. > I've deliberately left the irq part alone, since I think > having magic sysrq work via the keyboard interrupt is desirable from the > panic loop. I agree that we should keep sysrq working. One pity thing is that led_panic_blink() in leds/drivers/trigger/ledtrig-panic.c uses workqueues: + led_panic_blink() + led_trigger_event() + led_set_brightness() + schedule_work() It means that it depends on the scheduler. I guess that it does not work in many panic situations. But this patch will always block it. I agree that it is strange that userspace still works at this stage. But does it cause any real problems? Best Regards, Petr ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-04 9:11 ` Petr Mladek 0 siblings, 0 replies; 22+ messages in thread From: Petr Mladek @ 2019-10-04 9:11 UTC (permalink / raw) To: Will Deacon Cc: Feng Tang, Kees Cook, Greg Kroah-Hartman, contact, Russell King, Steven Rostedt, linux-kernel, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, linux-arm-kernel On Thu 2019-10-03 21:56:34, Will Deacon wrote: > Hi Kees, > > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > > calling CPU in an infinite loop, but with interrupts and preemption > > > enabled. From this state, userspace can continue to be scheduled, > > > despite the system being "dead" as far as the kernel is concerned. This > > > is easily reproducible on arm64 when booting with "nosmp" on the command > > > line; a couple of shell scripts print out a periodic "Ping" message > > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > > > | sysrq: Trigger a crash > > > | Kernel panic - not syncing: sysrq triggered crash > > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > > | Hardware name: linux,dummy-virt (DT) > > > | Call trace: > > > | dump_backtrace+0x0/0x148 > > > | show_stack+0x14/0x20 > > > | dump_stack+0xa0/0xc4 > > > | panic+0x140/0x32c > > > | sysrq_handle_reboot+0x0/0x20 > > > | __handle_sysrq+0x124/0x190 > > > | write_sysrq_trigger+0x64/0x88 > > > | proc_reg_write+0x60/0xa8 > > > | __vfs_write+0x18/0x40 > > > | vfs_write+0xa4/0x1b8 > > > | ksys_write+0x64/0xf0 > > > | __arm64_sys_write+0x14/0x20 > > > | el0_svc_common.constprop.0+0xb0/0x168 > > > | el0_svc_handler+0x28/0x78 > > > | el0_svc+0x8/0xc > > > | Kernel Offset: disabled > > > | CPU features: 0x0002,24002004 > > > | Memory Limit: none > > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > > | Ping 2! > > > | Ping 1! > > > | Ping 1! > > > | Ping 2! > > > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > > local interrupts are disabled in 'smp_send_stop()'. > > > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > > avoid the extra noise dmesg") was trying to fix? > > Hmm, maybe, although that looks like it's focussed more on irq handling > than preemption. Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid the extra noise dmesg") is printed by wake_up() called from wake_up_klogd_work_func(). It is irq_work. Therefore disabling preemption would not prevent this. > I've deliberately left the irq part alone, since I think > having magic sysrq work via the keyboard interrupt is desirable from the > panic loop. I agree that we should keep sysrq working. One pity thing is that led_panic_blink() in leds/drivers/trigger/ledtrig-panic.c uses workqueues: + led_panic_blink() + led_trigger_event() + led_set_brightness() + schedule_work() It means that it depends on the scheduler. I guess that it does not work in many panic situations. But this patch will always block it. I agree that it is strange that userspace still works at this stage. But does it cause any real problems? Best Regards, Petr _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-04 9:11 ` Petr Mladek @ 2019-10-04 9:29 ` Russell King - ARM Linux admin -1 siblings, 0 replies; 22+ messages in thread From: Russell King - ARM Linux admin @ 2019-10-04 9:29 UTC (permalink / raw) To: Petr Mladek Cc: Will Deacon, Kees Cook, Sergey Senozhatsky, Steven Rostedt, Feng Tang, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > Hi Kees, > > > > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > > > calling CPU in an infinite loop, but with interrupts and preemption > > > > enabled. From this state, userspace can continue to be scheduled, > > > > despite the system being "dead" as far as the kernel is concerned. This > > > > is easily reproducible on arm64 when booting with "nosmp" on the command > > > > line; a couple of shell scripts print out a periodic "Ping" message > > > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > > > > > | sysrq: Trigger a crash > > > > | Kernel panic - not syncing: sysrq triggered crash > > > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > > > | Hardware name: linux,dummy-virt (DT) > > > > | Call trace: > > > > | dump_backtrace+0x0/0x148 > > > > | show_stack+0x14/0x20 > > > > | dump_stack+0xa0/0xc4 > > > > | panic+0x140/0x32c > > > > | sysrq_handle_reboot+0x0/0x20 > > > > | __handle_sysrq+0x124/0x190 > > > > | write_sysrq_trigger+0x64/0x88 > > > > | proc_reg_write+0x60/0xa8 > > > > | __vfs_write+0x18/0x40 > > > > | vfs_write+0xa4/0x1b8 > > > > | ksys_write+0x64/0xf0 > > > > | __arm64_sys_write+0x14/0x20 > > > > | el0_svc_common.constprop.0+0xb0/0x168 > > > > | el0_svc_handler+0x28/0x78 > > > > | el0_svc+0x8/0xc > > > > | Kernel Offset: disabled > > > > | CPU features: 0x0002,24002004 > > > > | Memory Limit: none > > > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > > > | Ping 2! > > > > | Ping 1! > > > > | Ping 1! > > > > | Ping 2! > > > > > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > > > local interrupts are disabled in 'smp_send_stop()'. > > > > > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > > > avoid the extra noise dmesg") was trying to fix? > > > > Hmm, maybe, although that looks like it's focussed more on irq handling > > than preemption. > > Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid > the extra noise dmesg") is printed by wake_up() called from > wake_up_klogd_work_func(). It is irq_work. Therefore disabling > preemption would not prevent this. > > > > I've deliberately left the irq part alone, since I think > > having magic sysrq work via the keyboard interrupt is desirable from the > > panic loop. > > I agree that we should keep sysrq working. > > One pity thing is that led_panic_blink() in > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > + led_panic_blink() > + led_trigger_event() > + led_set_brightness() > + schedule_work() > > It means that it depends on the scheduler. I guess that it > does not work in many panic situations. But this patch > will always block it. > > I agree that it is strange that userspace still works at > this stage. But does it cause any real problems? Yes, there are watchdog drivers that continue to pat their watchdog after the kernel has panic'd. It makes watchdogs useless (which is exactly how this problem was discovered.) -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-04 9:29 ` Russell King - ARM Linux admin 0 siblings, 0 replies; 22+ messages in thread From: Russell King - ARM Linux admin @ 2019-10-04 9:29 UTC (permalink / raw) To: Petr Mladek Cc: Feng Tang, Kees Cook, Greg Kroah-Hartman, contact, linux-kernel, Steven Rostedt, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, Will Deacon, linux-arm-kernel On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > Hi Kees, > > > > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote: > > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote: > > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > > > > calling CPU in an infinite loop, but with interrupts and preemption > > > > enabled. From this state, userspace can continue to be scheduled, > > > > despite the system being "dead" as far as the kernel is concerned. This > > > > is easily reproducible on arm64 when booting with "nosmp" on the command > > > > line; a couple of shell scripts print out a periodic "Ping" message > > > > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > > > > > > > | sysrq: Trigger a crash > > > > | Kernel panic - not syncing: sysrq triggered crash > > > > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > > > > | Hardware name: linux,dummy-virt (DT) > > > > | Call trace: > > > > | dump_backtrace+0x0/0x148 > > > > | show_stack+0x14/0x20 > > > > | dump_stack+0xa0/0xc4 > > > > | panic+0x140/0x32c > > > > | sysrq_handle_reboot+0x0/0x20 > > > > | __handle_sysrq+0x124/0x190 > > > > | write_sysrq_trigger+0x64/0x88 > > > > | proc_reg_write+0x60/0xa8 > > > > | __vfs_write+0x18/0x40 > > > > | vfs_write+0xa4/0x1b8 > > > > | ksys_write+0x64/0xf0 > > > > | __arm64_sys_write+0x14/0x20 > > > > | el0_svc_common.constprop.0+0xb0/0x168 > > > > | el0_svc_handler+0x28/0x78 > > > > | el0_svc+0x8/0xc > > > > | Kernel Offset: disabled > > > > | CPU features: 0x0002,24002004 > > > > | Memory Limit: none > > > > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > > > > | Ping 2! > > > > | Ping 1! > > > > | Ping 1! > > > > | Ping 2! > > > > > > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > > > > local interrupts are disabled in 'smp_send_stop()'. > > > > > > > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic: > > > avoid the extra noise dmesg") was trying to fix? > > > > Hmm, maybe, although that looks like it's focussed more on irq handling > > than preemption. > > Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid > the extra noise dmesg") is printed by wake_up() called from > wake_up_klogd_work_func(). It is irq_work. Therefore disabling > preemption would not prevent this. > > > > I've deliberately left the irq part alone, since I think > > having magic sysrq work via the keyboard interrupt is desirable from the > > panic loop. > > I agree that we should keep sysrq working. > > One pity thing is that led_panic_blink() in > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > + led_panic_blink() > + led_trigger_event() > + led_set_brightness() > + schedule_work() > > It means that it depends on the scheduler. I guess that it > does not work in many panic situations. But this patch > will always block it. > > I agree that it is strange that userspace still works at > this stage. But does it cause any real problems? Yes, there are watchdog drivers that continue to pat their watchdog after the kernel has panic'd. It makes watchdogs useless (which is exactly how this problem was discovered.) -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-04 9:29 ` Russell King - ARM Linux admin @ 2019-10-04 10:49 ` Will Deacon -1 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-04 10:49 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Petr Mladek, Kees Cook, Sergey Senozhatsky, Steven Rostedt, Feng Tang, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > I've deliberately left the irq part alone, since I think > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > panic loop. > > > > I agree that we should keep sysrq working. > > > > One pity thing is that led_panic_blink() in > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > + led_panic_blink() > > + led_trigger_event() > > + led_set_brightness() > > + schedule_work() > > > > It means that it depends on the scheduler. I guess that it > > does not work in many panic situations. But this patch > > will always block it. > > > > I agree that it is strange that userspace still works at > > this stage. But does it cause any real problems? > > Yes, there are watchdog drivers that continue to pat their watchdog > after the kernel has panic'd. It makes watchdogs useless (which is > exactly how this problem was discovered.) Indeed, and I think the LED blinking is already unreliable if the brightness operation needs to sleep. For example, if the kernel isn't preemptible or the work gets queued up on a different CPU which is sitting in panic_smp_self_stop(). Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-04 10:49 ` Will Deacon 0 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-04 10:49 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Petr Mladek, Feng Tang, Kees Cook, Greg Kroah-Hartman, contact, linux-kernel, Steven Rostedt, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, linux-arm-kernel On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > I've deliberately left the irq part alone, since I think > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > panic loop. > > > > I agree that we should keep sysrq working. > > > > One pity thing is that led_panic_blink() in > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > + led_panic_blink() > > + led_trigger_event() > > + led_set_brightness() > > + schedule_work() > > > > It means that it depends on the scheduler. I guess that it > > does not work in many panic situations. But this patch > > will always block it. > > > > I agree that it is strange that userspace still works at > > this stage. But does it cause any real problems? > > Yes, there are watchdog drivers that continue to pat their watchdog > after the kernel has panic'd. It makes watchdogs useless (which is > exactly how this problem was discovered.) Indeed, and I think the LED blinking is already unreliable if the brightness operation needs to sleep. For example, if the kernel isn't preemptible or the work gets queued up on a different CPU which is sitting in panic_smp_self_stop(). Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-04 10:49 ` Will Deacon @ 2019-10-04 11:15 ` Petr Mladek -1 siblings, 0 replies; 22+ messages in thread From: Petr Mladek @ 2019-10-04 11:15 UTC (permalink / raw) To: Will Deacon Cc: Russell King - ARM Linux admin, Kees Cook, Sergey Senozhatsky, Steven Rostedt, Feng Tang, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Fri 2019-10-04 11:49:48, Will Deacon wrote: > On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > > I've deliberately left the irq part alone, since I think > > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > > panic loop. > > > > > > I agree that we should keep sysrq working. > > > > > > One pity thing is that led_panic_blink() in > > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > > > + led_panic_blink() > > > + led_trigger_event() > > > + led_set_brightness() > > > + schedule_work() > > > > > > It means that it depends on the scheduler. I guess that it > > > does not work in many panic situations. But this patch > > > will always block it. > > > > > > I agree that it is strange that userspace still works at > > > this stage. But does it cause any real problems? > > > > Yes, there are watchdog drivers that continue to pat their watchdog > > after the kernel has panic'd. It makes watchdogs useless (which is > > exactly how this problem was discovered.) > > Indeed, and I think the LED blinking is already unreliable if the > brightness operation needs to sleep. For example, if the kernel isn't > preemptible or the work gets queued up on a different CPU which is > sitting in panic_smp_self_stop(). To make it clear. I do not want to block this patch. I just wanted to point out the problem. I am not sure how the blinking is important these days. Well, I could imagine that it might be useful on some embedded devices. Another question is how many people want to end up with dead system these days. The watchdogs are likely used in data centers. I guess that automatic reboot in panic() is a better choice there. Anyway, it might make sense to remove the panic blinking code when it will not have a chance to work. Best Regards, Petr ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-04 11:15 ` Petr Mladek 0 siblings, 0 replies; 22+ messages in thread From: Petr Mladek @ 2019-10-04 11:15 UTC (permalink / raw) To: Will Deacon Cc: Feng Tang, Kees Cook, Greg Kroah-Hartman, contact, Russell King - ARM Linux admin, Steven Rostedt, linux-kernel, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, linux-arm-kernel On Fri 2019-10-04 11:49:48, Will Deacon wrote: > On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > > I've deliberately left the irq part alone, since I think > > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > > panic loop. > > > > > > I agree that we should keep sysrq working. > > > > > > One pity thing is that led_panic_blink() in > > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > > > + led_panic_blink() > > > + led_trigger_event() > > > + led_set_brightness() > > > + schedule_work() > > > > > > It means that it depends on the scheduler. I guess that it > > > does not work in many panic situations. But this patch > > > will always block it. > > > > > > I agree that it is strange that userspace still works at > > > this stage. But does it cause any real problems? > > > > Yes, there are watchdog drivers that continue to pat their watchdog > > after the kernel has panic'd. It makes watchdogs useless (which is > > exactly how this problem was discovered.) > > Indeed, and I think the LED blinking is already unreliable if the > brightness operation needs to sleep. For example, if the kernel isn't > preemptible or the work gets queued up on a different CPU which is > sitting in panic_smp_self_stop(). To make it clear. I do not want to block this patch. I just wanted to point out the problem. I am not sure how the blinking is important these days. Well, I could imagine that it might be useful on some embedded devices. Another question is how many people want to end up with dead system these days. The watchdogs are likely used in data centers. I guess that automatic reboot in panic() is a better choice there. Anyway, it might make sense to remove the panic blinking code when it will not have a chance to work. Best Regards, Petr _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-04 11:15 ` Petr Mladek @ 2019-10-04 13:51 ` Feng Tang -1 siblings, 0 replies; 22+ messages in thread From: Feng Tang @ 2019-10-04 13:51 UTC (permalink / raw) To: Petr Mladek Cc: Will Deacon, Russell King - ARM Linux admin, Kees Cook, Sergey Senozhatsky, Steven Rostedt, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Fri, Oct 04, 2019 at 01:15:21PM +0200, Petr Mladek wrote: > On Fri 2019-10-04 11:49:48, Will Deacon wrote: > > On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > > > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > > > I've deliberately left the irq part alone, since I think > > > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > > > panic loop. > > > > > > > > I agree that we should keep sysrq working. > > > > > > > > One pity thing is that led_panic_blink() in > > > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > > > > > + led_panic_blink() > > > > + led_trigger_event() > > > > + led_set_brightness() > > > > + schedule_work() > > > > > > > > It means that it depends on the scheduler. I guess that it > > > > does not work in many panic situations. But this patch > > > > will always block it. > > > > > > > > I agree that it is strange that userspace still works at > > > > this stage. But does it cause any real problems? > > > > > > Yes, there are watchdog drivers that continue to pat their watchdog > > > after the kernel has panic'd. It makes watchdogs useless (which is > > > exactly how this problem was discovered.) > > > > Indeed, and I think the LED blinking is already unreliable if the > > brightness operation needs to sleep. For example, if the kernel isn't > > preemptible or the work gets queued up on a different CPU which is > > sitting in panic_smp_self_stop(). > > To make it clear. I do not want to block this patch. I just wanted > to point out the problem. I am not sure how the blinking is important > these days. Well, I could imagine that it might be useful on some > embedded devices. When reviewing the c39ea0b9dd24 ("panic: avoid the extra noise dmesg"), there was similar discussion about what's the expectation for kernel when panic happens, as the earlier version patch is simply keeping the the local irq disabled, which may break the sysrq and the panic blink code, so we turned to handling it together with printk. > > Another question is how many people want to end up with dead system > these days. The watchdogs are likely used in data centers. I guess > that automatic reboot in panic() is a better choice there. > > Anyway, it might make sense to remove the panic blinking code when > it will not have a chance to work. I was also wondering if the panic blinking code still really works on any platforms. Thanks, Feng > > Best Regards, > Petr ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-04 13:51 ` Feng Tang 0 siblings, 0 replies; 22+ messages in thread From: Feng Tang @ 2019-10-04 13:51 UTC (permalink / raw) To: Petr Mladek Cc: Kees Cook, Greg Kroah-Hartman, contact, Russell King - ARM Linux admin, Steven Rostedt, linux-kernel, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, Will Deacon, linux-arm-kernel On Fri, Oct 04, 2019 at 01:15:21PM +0200, Petr Mladek wrote: > On Fri 2019-10-04 11:49:48, Will Deacon wrote: > > On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote: > > > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote: > > > > On Thu 2019-10-03 21:56:34, Will Deacon wrote: > > > > > I've deliberately left the irq part alone, since I think > > > > > having magic sysrq work via the keyboard interrupt is desirable from the > > > > > panic loop. > > > > > > > > I agree that we should keep sysrq working. > > > > > > > > One pity thing is that led_panic_blink() in > > > > leds/drivers/trigger/ledtrig-panic.c uses workqueues: > > > > > > > > + led_panic_blink() > > > > + led_trigger_event() > > > > + led_set_brightness() > > > > + schedule_work() > > > > > > > > It means that it depends on the scheduler. I guess that it > > > > does not work in many panic situations. But this patch > > > > will always block it. > > > > > > > > I agree that it is strange that userspace still works at > > > > this stage. But does it cause any real problems? > > > > > > Yes, there are watchdog drivers that continue to pat their watchdog > > > after the kernel has panic'd. It makes watchdogs useless (which is > > > exactly how this problem was discovered.) > > > > Indeed, and I think the LED blinking is already unreliable if the > > brightness operation needs to sleep. For example, if the kernel isn't > > preemptible or the work gets queued up on a different CPU which is > > sitting in panic_smp_self_stop(). > > To make it clear. I do not want to block this patch. I just wanted > to point out the problem. I am not sure how the blinking is important > these days. Well, I could imagine that it might be useful on some > embedded devices. When reviewing the c39ea0b9dd24 ("panic: avoid the extra noise dmesg"), there was similar discussion about what's the expectation for kernel when panic happens, as the earlier version patch is simply keeping the the local irq disabled, which may break the sysrq and the panic blink code, so we turned to handling it together with printk. > > Another question is how many people want to end up with dead system > these days. The watchdogs are likely used in data centers. I guess > that automatic reboot in panic() is a better choice there. > > Anyway, it might make sense to remove the panic blinking code when > it will not have a chance to work. I was also wondering if the panic blinking code still really works on any platforms. Thanks, Feng > > Best Regards, > Petr _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-04 10:49 ` Will Deacon @ 2019-10-07 8:02 ` Jiri Kosina -1 siblings, 0 replies; 22+ messages in thread From: Jiri Kosina @ 2019-10-07 8:02 UTC (permalink / raw) To: Will Deacon Cc: Russell King - ARM Linux admin, Petr Mladek, Kees Cook, Sergey Senozhatsky, Steven Rostedt, Feng Tang, Andrew Morton, Greg Kroah-Hartman, linux-arm-kernel, Ingo Molnar, linux-kernel, stable, contact On Fri, 4 Oct 2019, Will Deacon wrote: > Indeed, and I think the LED blinking is already unreliable if the > brightness operation needs to sleep. One thing is that led_set_brightness() can probably be forced to avoid the workqueue scheduling, by setting LED_BLINK_SW on the device (e.g. by issuing led_set_software_blink() during panic). But I am afraid this still won't solve the issue completely, as USB keyboards need workqueues for blinking the LEDs in for URB management. -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-07 8:02 ` Jiri Kosina 0 siblings, 0 replies; 22+ messages in thread From: Jiri Kosina @ 2019-10-07 8:02 UTC (permalink / raw) To: Will Deacon Cc: Petr Mladek, Feng Tang, Kees Cook, Greg Kroah-Hartman, contact, Russell King - ARM Linux admin, Steven Rostedt, linux-kernel, Sergey Senozhatsky, Ingo Molnar, stable, Andrew Morton, linux-arm-kernel On Fri, 4 Oct 2019, Will Deacon wrote: > Indeed, and I think the LED blinking is already unreliable if the > brightness operation needs to sleep. One thing is that led_set_brightness() can probably be forced to avoid the workqueue scheduling, by setting LED_BLINK_SW on the device (e.g. by issuing led_set_software_blink() during panic). But I am afraid this still won't solve the issue completely, as USB keyboards need workqueues for blinking the LEDs in for URB management. -- Jiri Kosina SUSE Labs _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-02 12:35 ` Will Deacon @ 2019-10-02 21:45 ` Andrew Morton -1 siblings, 0 replies; 22+ messages in thread From: Andrew Morton @ 2019-10-02 21:45 UTC (permalink / raw) To: Will Deacon Cc: linux-kernel, linux-arm-kernel, contact, Russell King, Greg Kroah-Hartman, Ingo Molnar, Kees Cook, stable On Wed, 2 Oct 2019 13:35:38 +0100 Will Deacon <will@kernel.org> wrote: > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > calling CPU in an infinite loop, but with interrupts and preemption > enabled. From this state, userspace can continue to be scheduled, > despite the system being "dead" as far as the kernel is concerned. This > is easily reproducible on arm64 when booting with "nosmp" on the command > line; a couple of shell scripts print out a periodic "Ping" message > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > | sysrq: Trigger a crash > | Kernel panic - not syncing: sysrq triggered crash > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > | Hardware name: linux,dummy-virt (DT) > | Call trace: > | dump_backtrace+0x0/0x148 > | show_stack+0x14/0x20 > | dump_stack+0xa0/0xc4 > | panic+0x140/0x32c > | sysrq_handle_reboot+0x0/0x20 > | __handle_sysrq+0x124/0x190 > | write_sysrq_trigger+0x64/0x88 > | proc_reg_write+0x60/0xa8 > | __vfs_write+0x18/0x40 > | vfs_write+0xa4/0x1b8 > | ksys_write+0x64/0xf0 > | __arm64_sys_write+0x14/0x20 > | el0_svc_common.constprop.0+0xb0/0x168 > | el0_svc_handler+0x28/0x78 > | el0_svc+0x8/0xc > | Kernel Offset: disabled > | CPU features: 0x0002,24002004 > | Memory Limit: none > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > | Ping 2! > | Ping 1! > | Ping 1! > | Ping 2! > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > local interrupts are disabled in 'smp_send_stop()'. > > Disable preemption in 'panic()' before re-enabling interrupts. > > ... > > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > * after setting panic_cpu) from invoking panic() again. > */ > local_irq_disable(); > + preempt_disable_notrace(); > > /* > * It's possible to come here directly from a panic-assertion and We still do a lot of stuff (kexec, kgdb, etc) after this preempt_disable() and I worry that something in there will now trigger a might_sleep() warning as a result? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-02 21:45 ` Andrew Morton 0 siblings, 0 replies; 22+ messages in thread From: Andrew Morton @ 2019-10-02 21:45 UTC (permalink / raw) To: Will Deacon Cc: Kees Cook, Greg Kroah-Hartman, contact, linux-kernel, stable, Russell King, Ingo Molnar, linux-arm-kernel On Wed, 2 Oct 2019 13:35:38 +0100 Will Deacon <will@kernel.org> wrote: > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the > calling CPU in an infinite loop, but with interrupts and preemption > enabled. From this state, userspace can continue to be scheduled, > despite the system being "dead" as far as the kernel is concerned. This > is easily reproducible on arm64 when booting with "nosmp" on the command > line; a couple of shell scripts print out a periodic "Ping" message > whilst another triggers a crash by writing to /proc/sysrq-trigger: > > | sysrq: Trigger a crash > | Kernel panic - not syncing: sysrq triggered crash > | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 > | Hardware name: linux,dummy-virt (DT) > | Call trace: > | dump_backtrace+0x0/0x148 > | show_stack+0x14/0x20 > | dump_stack+0xa0/0xc4 > | panic+0x140/0x32c > | sysrq_handle_reboot+0x0/0x20 > | __handle_sysrq+0x124/0x190 > | write_sysrq_trigger+0x64/0x88 > | proc_reg_write+0x60/0xa8 > | __vfs_write+0x18/0x40 > | vfs_write+0xa4/0x1b8 > | ksys_write+0x64/0xf0 > | __arm64_sys_write+0x14/0x20 > | el0_svc_common.constprop.0+0xb0/0x168 > | el0_svc_handler+0x28/0x78 > | el0_svc+0x8/0xc > | Kernel Offset: disabled > | CPU features: 0x0002,24002004 > | Memory Limit: none > | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- > | Ping 2! > | Ping 1! > | Ping 1! > | Ping 2! > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise > local interrupts are disabled in 'smp_send_stop()'. > > Disable preemption in 'panic()' before re-enabling interrupts. > > ... > > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > * after setting panic_cpu) from invoking panic() again. > */ > local_irq_disable(); > + preempt_disable_notrace(); > > /* > * It's possible to come here directly from a panic-assertion and We still do a lot of stuff (kexec, kgdb, etc) after this preempt_disable() and I worry that something in there will now trigger a might_sleep() warning as a result? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() 2019-10-02 21:45 ` Andrew Morton @ 2019-10-03 20:53 ` Will Deacon -1 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-03 20:53 UTC (permalink / raw) To: Andrew Morton Cc: linux-kernel, linux-arm-kernel, contact, Russell King, Greg Kroah-Hartman, Ingo Molnar, Kees Cook, stable Hi Andrew, Thanks for having a look. On Wed, Oct 02, 2019 at 02:45:58PM -0700, Andrew Morton wrote: > On Wed, 2 Oct 2019 13:35:38 +0100 Will Deacon <will@kernel.org> wrote: > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > ... > > > > --- a/kernel/panic.c > > +++ b/kernel/panic.c > > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > > * after setting panic_cpu) from invoking panic() again. > > */ > > local_irq_disable(); > > + preempt_disable_notrace(); > > > > /* > > * It's possible to come here directly from a panic-assertion and > > We still do a lot of stuff (kexec, kgdb, etc) after this > preempt_disable() and I worry that something in there will now trigger > a might_sleep() warning as a result? Given that interrupts are already disabled at this point, I don't think we'll get any additional warnings here by disabling preemption as well. Will ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] panic: Ensure preemption is disabled during panic() @ 2019-10-03 20:53 ` Will Deacon 0 siblings, 0 replies; 22+ messages in thread From: Will Deacon @ 2019-10-03 20:53 UTC (permalink / raw) To: Andrew Morton Cc: Kees Cook, Greg Kroah-Hartman, contact, linux-kernel, stable, Russell King, Ingo Molnar, linux-arm-kernel Hi Andrew, Thanks for having a look. On Wed, Oct 02, 2019 at 02:45:58PM -0700, Andrew Morton wrote: > On Wed, 2 Oct 2019 13:35:38 +0100 Will Deacon <will@kernel.org> wrote: > > Disable preemption in 'panic()' before re-enabling interrupts. > > > > ... > > > > --- a/kernel/panic.c > > +++ b/kernel/panic.c > > @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) > > * after setting panic_cpu) from invoking panic() again. > > */ > > local_irq_disable(); > > + preempt_disable_notrace(); > > > > /* > > * It's possible to come here directly from a panic-assertion and > > We still do a lot of stuff (kexec, kgdb, etc) after this > preempt_disable() and I worry that something in there will now trigger > a might_sleep() warning as a result? Given that interrupts are already disabled at this point, I don't think we'll get any additional warnings here by disabling preemption as well. Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2019-10-07 8:02 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-10-02 12:35 [PATCH] panic: Ensure preemption is disabled during panic() Will Deacon 2019-10-02 12:35 ` Will Deacon 2019-10-02 20:58 ` Kees Cook 2019-10-02 20:58 ` Kees Cook 2019-10-03 20:56 ` Will Deacon 2019-10-03 20:56 ` Will Deacon 2019-10-04 9:11 ` Petr Mladek 2019-10-04 9:11 ` Petr Mladek 2019-10-04 9:29 ` Russell King - ARM Linux admin 2019-10-04 9:29 ` Russell King - ARM Linux admin 2019-10-04 10:49 ` Will Deacon 2019-10-04 10:49 ` Will Deacon 2019-10-04 11:15 ` Petr Mladek 2019-10-04 11:15 ` Petr Mladek 2019-10-04 13:51 ` Feng Tang 2019-10-04 13:51 ` Feng Tang 2019-10-07 8:02 ` Jiri Kosina 2019-10-07 8:02 ` Jiri Kosina 2019-10-02 21:45 ` Andrew Morton 2019-10-02 21:45 ` Andrew Morton 2019-10-03 20:53 ` Will Deacon 2019-10-03 20:53 ` Will Deacon
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.