linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tty/vt: Touch NMI watchdog in vt_console_print
@ 2019-09-20  8:57 Qiujun Huang
  2019-10-04 13:12 ` Greg KH
  0 siblings, 1 reply; 2+ messages in thread
From: Qiujun Huang @ 2019-09-20  8:57 UTC (permalink / raw)
  To: gregkh
  Cc: jslaby, nico, textshell, daniel.vetter, sam, mpatocka, ghalat,
	linux-kernel, hqjagain

vt_console_print could trigger NMI watchdog in case writing slow:

[2858736.789664] NMI watchdog: Watchdog detected hard LOCKUP on cpu 23
...
[2858736.790194] CPU: 23 PID: 32504 Comm: tensorflow_mode Not tainted 4.4.131-1.el7.elrepo.x86_64 #1
[2858736.790206] Hardware name: Huawei RH2288 V3/BC11HGSB0, BIOS 3.57 02/26/2017
[2858736.790222] task: ffff881e0a191640 ti: ffff881fd73a8000 task.ti: ffff881fd73a8000
[2858736.790358] RIP: 0010:[<ffffffff810cc06e>]  [<ffffffff810cc06e>] native_queued_spin_lock_slowpath+0x15e/0x170
[2858736.790363] RSP: 0018:ffff88203f043db0  EFLAGS: 00000002
[2858736.790365] RAX: 00000000005c0101 RBX: 0000000000000246 RCX: 0000000000000001
...
[2858736.790452] Call Trace:
[2858736.790521]  <IRQ>
[2858736.790521]  [<ffffffff8118ab28>] queued_spin_lock_slowpath+0xb/0xf
[2858736.790552]  [<ffffffff8170eca7>] _raw_spin_lock_irqsave+0x37/0x40
[2858736.790653]  [<ffffffff814c80a4>] scsi_end_request+0x104/0x1d0
[2858736.790656]  [<ffffffff814c9e13>] scsi_io_completion+0x153/0x650
[2858736.790671]  [<ffffffff814c1092>] scsi_finish_command+0xd2/0x120
[2858736.790673]  [<ffffffff814c9607>] scsi_softirq_done+0x127/0x150
[2858736.790749]  [<ffffffff8131973e>] blk_done_softirq+0x8e/0xc0
[2858736.790811]  [<ffffffff810857db>] __do_softirq+0xeb/0x2f0
[2858736.790813]  [<ffffffff81085c85>] irq_exit+0xf5/0x100
[2858736.790867]  [<ffffffff81051819>] smp_call_function_single_interrupt+0x39/0x40
[2858736.790890]  [<ffffffff8171055b>] call_function_single_interrupt+0x9b/0xa0
[2858736.790973]  <EOI>
...

PID: 1793   TASK: ffff88103445c2c0  CPU: 32  COMMAND: "java"
 #0 [ffff88103fe88e38] crash_nmi_callback at ffffffff810504d7
 #1 [ffff88103fe88e48] nmi_handle at ffffffff8101c1f7
 #2 [ffff88103fe88ea0] default_do_nmi at ffffffff8101c7d0
 #3 [ffff88103fe88ec0] do_nmi at ffffffff8101c901
 #4 [ffff88103fe88ee8] end_repeat_nmi at ffffffff8171176a
    [exception RIP: cfb_imageblit+1167]
    RIP: ffffffff813bdf8f  RSP: ffff880006823380  RFLAGS: 00000046
    RAX: 0000000000000001  RBX: 0000000000000000  RCX: 0000000000000005
    RDX: 000000000000024d  RSI: 00000000ff000000  RDI: 0000000000000001
    RBP: ffff8800068233f0   R8: ffffffff81785e80   R9: ffffc9000c843168
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff882037a831ba
    R13: ffff882037a831af  R14: ffffc9000c84316c  R15: ffffc9000c843000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffff880006823380] cfb_imageblit at ffffffff813bdf8f
 #6 [ffff880006823398] bit_putcs at ffffffff813b2307
 #7 [ffff8800068234d0] bit_cursor at ffffffff813b1fc8
 #8 [ffff8800068235f0] fbcon_scroll at ffffffff813aebda
 #9 [ffff880006823650] scrup at ffffffff81442600
...

the cpu23 wait for the same blk queue_lock

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
---
 drivers/tty/vt/vt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 34aa39d..cd32d66 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -105,6 +105,7 @@
 #include <linux/ctype.h>
 #include <linux/bsearch.h>
 #include <linux/gcd.h>
+#include <linux/nmi.h>
 
 #define MAX_NR_CON_DRIVER 16
 
@@ -2978,6 +2979,7 @@ static void vt_console_print(struct console *co, const char *b, unsigned count)
 			vc->vc_pos += 2;
 			vc->vc_x++;
 		}
+		touch_nmi_watchdog();
 	}
 	if (cnt && con_is_visible(vc))
 		vc->vc_sw->con_putcs(vc, start, cnt, vc->vc_y, start_x);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] tty/vt: Touch NMI watchdog in vt_console_print
  2019-09-20  8:57 [PATCH] tty/vt: Touch NMI watchdog in vt_console_print Qiujun Huang
@ 2019-10-04 13:12 ` Greg KH
  0 siblings, 0 replies; 2+ messages in thread
From: Greg KH @ 2019-10-04 13:12 UTC (permalink / raw)
  To: Qiujun Huang
  Cc: jslaby, nico, textshell, daniel.vetter, sam, mpatocka, ghalat,
	linux-kernel

On Fri, Sep 20, 2019 at 04:57:26PM +0800, Qiujun Huang wrote:
> vt_console_print could trigger NMI watchdog in case writing slow:
> 
> [2858736.789664] NMI watchdog: Watchdog detected hard LOCKUP on cpu 23
> ...
> [2858736.790194] CPU: 23 PID: 32504 Comm: tensorflow_mode Not tainted 4.4.131-1.el7.elrepo.x86_64 #1
> [2858736.790206] Hardware name: Huawei RH2288 V3/BC11HGSB0, BIOS 3.57 02/26/2017
> [2858736.790222] task: ffff881e0a191640 ti: ffff881fd73a8000 task.ti: ffff881fd73a8000
> [2858736.790358] RIP: 0010:[<ffffffff810cc06e>]  [<ffffffff810cc06e>] native_queued_spin_lock_slowpath+0x15e/0x170
> [2858736.790363] RSP: 0018:ffff88203f043db0  EFLAGS: 00000002
> [2858736.790365] RAX: 00000000005c0101 RBX: 0000000000000246 RCX: 0000000000000001
> ...
> [2858736.790452] Call Trace:
> [2858736.790521]  <IRQ>
> [2858736.790521]  [<ffffffff8118ab28>] queued_spin_lock_slowpath+0xb/0xf
> [2858736.790552]  [<ffffffff8170eca7>] _raw_spin_lock_irqsave+0x37/0x40
> [2858736.790653]  [<ffffffff814c80a4>] scsi_end_request+0x104/0x1d0
> [2858736.790656]  [<ffffffff814c9e13>] scsi_io_completion+0x153/0x650
> [2858736.790671]  [<ffffffff814c1092>] scsi_finish_command+0xd2/0x120
> [2858736.790673]  [<ffffffff814c9607>] scsi_softirq_done+0x127/0x150
> [2858736.790749]  [<ffffffff8131973e>] blk_done_softirq+0x8e/0xc0
> [2858736.790811]  [<ffffffff810857db>] __do_softirq+0xeb/0x2f0
> [2858736.790813]  [<ffffffff81085c85>] irq_exit+0xf5/0x100
> [2858736.790867]  [<ffffffff81051819>] smp_call_function_single_interrupt+0x39/0x40
> [2858736.790890]  [<ffffffff8171055b>] call_function_single_interrupt+0x9b/0xa0
> [2858736.790973]  <EOI>
> ...
> 
> PID: 1793   TASK: ffff88103445c2c0  CPU: 32  COMMAND: "java"
>  #0 [ffff88103fe88e38] crash_nmi_callback at ffffffff810504d7
>  #1 [ffff88103fe88e48] nmi_handle at ffffffff8101c1f7
>  #2 [ffff88103fe88ea0] default_do_nmi at ffffffff8101c7d0
>  #3 [ffff88103fe88ec0] do_nmi at ffffffff8101c901
>  #4 [ffff88103fe88ee8] end_repeat_nmi at ffffffff8171176a
>     [exception RIP: cfb_imageblit+1167]
>     RIP: ffffffff813bdf8f  RSP: ffff880006823380  RFLAGS: 00000046
>     RAX: 0000000000000001  RBX: 0000000000000000  RCX: 0000000000000005
>     RDX: 000000000000024d  RSI: 00000000ff000000  RDI: 0000000000000001
>     RBP: ffff8800068233f0   R8: ffffffff81785e80   R9: ffffc9000c843168
>     R10: 0000000000000001  R11: 0000000000000000  R12: ffff882037a831ba
>     R13: ffff882037a831af  R14: ffffc9000c84316c  R15: ffffc9000c843000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> --- <NMI exception stack> ---
>  #5 [ffff880006823380] cfb_imageblit at ffffffff813bdf8f
>  #6 [ffff880006823398] bit_putcs at ffffffff813b2307
>  #7 [ffff8800068234d0] bit_cursor at ffffffff813b1fc8
>  #8 [ffff8800068235f0] fbcon_scroll at ffffffff813aebda
>  #9 [ffff880006823650] scrup at ffffffff81442600
> ...
> 
> the cpu23 wait for the same blk queue_lock

Why is this acting "slow"?

Do we want to sprinkle this type of call everywhere in the console layer
(hint, I doubt it.)

This feels like something odd is wrong with your system, not with the
tty layer...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-10-04 13:12 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-20  8:57 [PATCH] tty/vt: Touch NMI watchdog in vt_console_print Qiujun Huang
2019-10-04 13:12 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).