* printk: add pr_flush()
@ 2020-10-22 14:12 Mike Galbraith
2020-10-22 14:15 ` Sebastian Andrzej Siewior
2020-10-22 15:04 ` John Ogness
0 siblings, 2 replies; 8+ messages in thread
From: Mike Galbraith @ 2020-10-22 14:12 UTC (permalink / raw)
To: John Ogness; +Cc: Sebastian Andrzej Siewior, linux-rt-users
Greetings,
Seems this patch can cause hangs. I can reliably reproduce one such
hang by poking sysrq-b, and it goes away if I revert that.
-Mike
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: printk: add pr_flush()
2020-10-22 14:12 printk: add pr_flush() Mike Galbraith
@ 2020-10-22 14:15 ` Sebastian Andrzej Siewior
2020-10-22 14:36 ` Mike Galbraith
2020-10-22 15:04 ` John Ogness
1 sibling, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-22 14:15 UTC (permalink / raw)
To: Mike Galbraith; +Cc: John Ogness, linux-rt-users
On 2020-10-22 16:12:41 [+0200], Mike Galbraith wrote:
> Greetings,
>
> Seems this patch can cause hangs. I can reliably reproduce one such
> hang by poking sysrq-b, and it goes away if I revert that.
does it still happen if the wait argument is `false'?
> -Mike
Sebastian
* Re: printk: add pr_flush()
2020-10-22 14:15 ` Sebastian Andrzej Siewior
@ 2020-10-22 14:36 ` Mike Galbraith
2020-10-22 14:47 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2020-10-22 14:36 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: John Ogness, linux-rt-users
On Thu, 2020-10-22 at 16:15 +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-22 16:12:41 [+0200], Mike Galbraith wrote:
> > Greetings,
> >
> > Seems this patch can cause hangs. I can reliably reproduce one such
> > hang by poking sysrq-b, and it goes away if I revert that.
>
> does it still happen if the wait argument is `false'?
Yup.
-Mike
* Re: printk: add pr_flush()
2020-10-22 14:36 ` Mike Galbraith
@ 2020-10-22 14:47 ` Mike Galbraith
2020-10-22 14:52 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2020-10-22 14:47 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: John Ogness, linux-rt-users
On Thu, 2020-10-22 at 16:36 +0200, Mike Galbraith wrote:
> On Thu, 2020-10-22 at 16:15 +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-10-22 16:12:41 [+0200], Mike Galbraith wrote:
> > > Greetings,
> > >
> > > Seems this patch can cause hangs. I can reliably reproduce one such
> > > hang by poking sysrq-b, and it goes away if I revert that.
> >
> > does it still happen if the wait argument is `false'?
>
> Yup.
Um, that "yup" is predicated upon you meaning reset_on_progress, that
being the only bool.
-Mike
* Re: printk: add pr_flush()
2020-10-22 14:47 ` Mike Galbraith
@ 2020-10-22 14:52 ` Sebastian Andrzej Siewior
2020-10-23 7:23 ` John Ogness
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-22 14:52 UTC (permalink / raw)
To: Mike Galbraith; +Cc: John Ogness, linux-rt-users
On 2020-10-22 16:47:20 [+0200], Mike Galbraith wrote:
> Um, that "yup" is predicated upon you meaning reset_on_progress, that
> being the only bool.
That was my question and you answered it perfectly fine. Thank you.
I saw "hangs" on reboot while I had quite a large backlog which was not
yet printed. So that was expected. I'm currently stuck on your seqcount
report so…
> -Mike
Sebastian
* Re: printk: add pr_flush()
2020-10-22 14:12 printk: add pr_flush() Mike Galbraith
2020-10-22 14:15 ` Sebastian Andrzej Siewior
@ 2020-10-22 15:04 ` John Ogness
1 sibling, 0 replies; 8+ messages in thread
From: John Ogness @ 2020-10-22 15:04 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Sebastian Andrzej Siewior, linux-rt-users
On 2020-10-22, Mike Galbraith <efault@gmx.de> wrote:
> Seems this patch can cause hangs. I can reliably reproduce one such
> hang by poking sysrq-b, and it goes away if I revert that.
I see an RCU warning and then 30 seconds later the software watchdog
kicks in showing the ttyS0 irq handler is stuck in the D state.
I thought that kmsg_dumps for non-OOPS scenarios were always
preemptible. I will re-evaluate the possible non-OOPS kmsg_dump paths.
Thanks for reporting.
John Ogness
[-- Attachment #2: dmesg.txt --]
[ 10.120370][ T1067] ------------[ cut here ]------------
[ 10.122210][ T1067] WARNING: CPU: 0 PID: 1067 at kernel/rcu/tree_plugin.h:295 rcu_note_context_switch+0x75/0x520
[ 10.123313][ T1067] Modules linked in:
[ 10.123753][ T1067] CPU: 0 PID: 1067 Comm: irq/4-ttyS0 Not tainted 5.9.0-rt16+ #3
[ 10.124502][ T1067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 10.125529][ T1067] RIP: 0010:rcu_note_context_switch+0x75/0x520
[ 10.126220][ T1067] Code: ff 74 0f 65 8b 05 07 3f f0 7e 85 c0 0f 85 89 00 00 00 45 84 e4 75 15 65 48 8b 04 25 00 71 01 00 8b b0 bc 03 00 00 85 f6 7e 02 <0f> 0b 65 48 8b 04 25 00 71 01 00 8b 88 bc 03 00 00 851
[ 10.128323][ T1067] RSP: 0018:ffffc90000bfbbb0 EFLAGS: 00010002
[ 10.129040][ T1067] RAX: ffff88805b3db400 RBX: ffff88807e6283c0 RCX: 0000000000000001
[ 10.129837][ T1067] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
[ 10.130788][ T1067] RBP: ffffc90000bfbbe8 R08: 0000000000000001 R09: 0000000000000001
[ 10.131692][ T1067] R10: 0000000000000001 R11: ffffc90000bfbca0 R12: 0000000000000000
[ 10.132586][ T1067] R13: 0000000000000000 R14: ffff88805b3db400 R15: 0000000000000000
[ 10.133531][ T1067] FS: 0000000000000000(0000) GS:ffff88807e600000(0000) knlGS:0000000000000000
[ 10.134517][ T1067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.135247][ T1067] CR2: 000055f49044f118 CR3: 000000007f8b4001 CR4: 0000000000370ef0
[ 10.136100][ T1067] Call Trace:
[ 10.136397][ T1067] __schedule+0xb5/0x930
[ 10.136975][ T1067] ? lockdep_hardirqs_on+0x7d/0x100
[ 10.137525][ T1067] schedule+0x5a/0x100
[ 10.137932][ T1067] schedule_timeout+0x192/0x250
[ 10.138491][ T1067] ? del_timer_sync+0x120/0x120
[ 10.139044][ T1067] ? msleep+0x1c/0x50
[ 10.139511][ T1067] msleep+0x3c/0x50
[ 10.139887][ T1067] pr_flush+0xdd/0xf0
[ 10.140403][ T1067] kmsg_dump+0x4e/0x1a0
[ 10.140985][ T1067] emergency_restart+0x13/0x20
[ 10.141619][ T1067] sysrq_handle_reboot+0x27/0x30
[ 10.142173][ T1067] __handle_sysrq.cold.13+0x52/0x138
[ 10.142788][ T1067] handle_sysrq+0x1d/0x30
[ 10.143273][ T1067] serial8250_handle_irq.part.29+0x12f/0x160
[ 10.143966][ T1067] serial8250_default_handle_irq+0x42/0x60
[ 10.144636][ T1067] serial8250_interrupt+0x68/0xc0
[ 10.145287][ T1067] ? irq_thread+0xa1/0x1d0
[ 10.145825][ T1067] irq_forced_thread_fn+0x35/0x80
[ 10.146373][ T1067] irq_thread+0x10b/0x1d0
[ 10.146882][ T1067] ? irq_finalize_oneshot.part.48+0xe0/0xe0
[ 10.147601][ T1067] ? wake_threads_waitq+0x30/0x30
[ 10.148196][ T1067] ? irq_thread_dtor+0x90/0x90
[ 10.148751][ T1067] kthread+0x181/0x1a0
[ 10.149310][ T1067] ? kthread_park+0x90/0x90
[ 10.149789][ T1067] ret_from_fork+0x22/0x30
[ 10.150364][ T1067] irq event stamp: 785
[ 10.150875][ T1067] hardirqs last enabled at (784): [<ffffffff81d17325>] _raw_spin_unlock_irqrestore+0x75/0x80
[ 10.151954][ T1067] hardirqs last disabled at (785): [<ffffffff81d125dc>] __schedule+0x53c/0x930
[ 10.152918][ T1067] softirqs last enabled at (760): [<ffffffff8107bd2f>] __local_bh_enable_ip+0x7f/0x130
[ 10.153967][ T1067] softirqs last disabled at (770): [<ffffffff810fc2d5>] irq_forced_thread_fn+0x5/0x80
[ 10.154966][ T1067] ---[ end trace 0000000000000002 ]---
[ 61.851699][ T495] INFO: task rcub/0:12 blocked for more than 30 seconds.
[ 61.854535][ T495] Tainted: G W 5.9.0-rt16+ #3
[ 61.856706][ T495] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 61.857491][ T495] task:rcub/0 state:D stack:14728 pid: 12 ppid: 2 flags:0x00004000
[ 61.858230][ T495] Call Trace:
[ 61.858497][ T495] __schedule+0x2ec/0x930
[ 61.858847][ T495] ? _raw_spin_unlock_irq+0x27/0x70
[ 61.859266][ T495] schedule+0x5a/0x100
[ 61.859592][ T495] __rt_mutex_slowlock+0x93/0x1a0
[ 61.859996][ T495] rt_mutex_slowlock_locked+0xd0/0x300
[ 61.860436][ T495] rt_mutex_slowlock.constprop.31+0x72/0xb0
[ 61.860919][ T495] __rt_mutex_lock_state+0x32/0x40
[ 61.861393][ T495] rt_mutex_lock_nested+0x38/0x60
[ 61.861889][ T495] ? rcu_boost_kthread+0x140/0x6d0
[ 61.862301][ T495] ? rcu_boost_kthread+0x7a/0x6d0
[ 61.862704][ T495] rcu_boost_kthread+0x140/0x6d0
[ 61.863101][ T495] ? rcu_barrier+0x320/0x320
[ 61.863469][ T495] kthread+0x181/0x1a0
[ 61.863796][ T495] ? kthread_park+0x90/0x90
[ 61.864158][ T495] ret_from_fork+0x22/0x30
[ 61.864527][ T495] INFO: task irq/4-ttyS0:1067 blocked for more than 30 seconds.
[ 61.865183][ T495] Tainted: G W 5.9.0-rt16+ #3
[ 61.865702][ T495] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 61.866394][ T495] task:irq/4-ttyS0 state:D stack:13512 pid: 1067 ppid: 2 flags:0x00004000
[ 61.867130][ T495] Call Trace:
[ 61.867394][ T495] __schedule+0x2ec/0x930
[ 61.867744][ T495] schedule+0x5a/0x100
[ 61.868070][ T495] schedule_timeout+0x192/0x250
[ 61.868459][ T495] ? del_timer_sync+0x120/0x120
[ 61.868850][ T495] ? msleep+0x1c/0x50
[ 61.869214][ T495] msleep+0x3c/0x50
[ 61.869519][ T495] pr_flush+0xdd/0xf0
[ 61.869840][ T495] kmsg_dump+0x4e/0x1a0
[ 61.870173][ T495] emergency_restart+0x13/0x20
[ 61.870555][ T495] sysrq_handle_reboot+0x27/0x30
[ 61.870950][ T495] __handle_sysrq.cold.13+0x52/0x138
[ 61.871374][ T495] handle_sysrq+0x1d/0x30
[ 61.871836][ T495] serial8250_handle_irq.part.29+0x12f/0x160
[ 61.872320][ T495] serial8250_default_handle_irq+0x42/0x60
[ 61.872787][ T495] serial8250_interrupt+0x68/0xc0
[ 61.873236][ T495] ? irq_thread+0xa1/0x1d0
[ 61.873589][ T495] irq_forced_thread_fn+0x35/0x80
[ 61.873993][ T495] irq_thread+0x10b/0x1d0
[ 61.874436][ T495] ? irq_finalize_oneshot.part.48+0xe0/0xe0
[ 61.874910][ T495] ? wake_threads_waitq+0x30/0x30
[ 61.875314][ T495] ? irq_thread_dtor+0x90/0x90
[ 61.875696][ T495] kthread+0x181/0x1a0
[ 61.876022][ T495] ? kthread_park+0x90/0x90
[ 61.876384][ T495] ret_from_fork+0x22/0x30
[ 61.876745][ T495]
[ 61.876745][ T495] Showing all locks held in the system:
[ 61.877428][ T495] 1 lock held by rcuc/0/10:
[ 61.877787][ T495] #0: ffff88807e617e30 ((softirq_ctrl.lock).lock){+.+.}-{3:3}, at: __local_bh_disable_ip+0x11a/0x2b0
[ 61.878662][ T495] 1 lock held by rcub/0/12:
[ 61.879021][ T495] #0: ffffffff828bb488 (rcu_state){+.+.}-{0:0}, at: rcu_boost_kthread+0x140/0x6d0
[ 61.879767][ T495] 1 lock held by khungtaskd/495:
[ 61.880161][ T495] #0: ffffffff828b8e00 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x23/0x17c
[ 61.880995][ T495] 6 locks held by irq/4-ttyS0/1067:
[ 61.881486][ T495] #0: ffff88807e617e30 ((softirq_ctrl.lock).lock){+.+.}-{3:3}, at: __local_bh_disable_ip+0x11a/0x2b0
[ 61.882402][ T495] #1: ffffffff828b8e00 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xc0
[ 61.883119][ T495] #2: ffffffff828b8e00 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0x157/0x2b0
[ 61.883917][ T495] #3: ffff88805b74e8a8 (&i->lock){+.+.}-{0:0}, at: serial8250_interrupt+0x32/0xc0
[ 61.884661][ T495] #4: ffffffff828b8e00 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xc0
[ 61.885380][ T495] #5: ffffffff828b8e00 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x5/0xa0
[ 61.886108][ T495] 2 locks held by init/1084:
[ 61.886474][ T495] #0: ffff88807f4c1098 (&tty->ldisc_sem){++++}-{0:0}, at: ldsem_down_read+0x32/0x40
[ 61.887229][ T495] #1: ffffc90000303308 (&ldata->atomic_read_lock){+.+.}-{0:0}, at: n_tty_read+0xc2/0x9c0
[ 61.888021][ T495]
[ 61.888207][ T495] =============================================
[ 61.888207][ T495]
[ 68.863491][ T1054] printk: console [ttyS0]: printing thread stopped
[ 68.891970][ T1056] random: fast init done
* Re: printk: add pr_flush()
2020-10-22 14:52 ` Sebastian Andrzej Siewior
@ 2020-10-23 7:23 ` John Ogness
2020-10-23 11:08 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: John Ogness @ 2020-10-23 7:23 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Mike Galbraith; +Cc: linux-rt-users
Hi Sebastian,
The "printk: add pr_flush()" patch should be removed from PREEMPT_RT
until this issue is resolved. I am getting several independent bug
reports of reboot problems.
Note that on crash situations the full ringbuffer should still be
printed. pr_flush() was only responsible for non-crash scenarios.
John Ogness
* Re: printk: add pr_flush()
2020-10-23 7:23 ` John Ogness
@ 2020-10-23 11:08 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-23 11:08 UTC (permalink / raw)
To: John Ogness; +Cc: Mike Galbraith, linux-rt-users
On 2020-10-23 09:29:41 [+0206], John Ogness wrote:
> Hi Sebastian,
Hi,
> The "printk: add pr_flush()" patch should be removed from PREEMPT_RT
> until this issue is resolved. I am getting several independent bug
> reports of reboot problems.
>
> Note that on crash situations the full ringbuffer should still be
> printed. pr_flush() was only responsible for non-crash scenarios.
oki, dropping.
> John Ogness
Sebastian
end of thread, other threads:[~2020-10-23 11:08 UTC | newest]
Thread overview: 8+ messages
2020-10-22 14:12 printk: add pr_flush() Mike Galbraith
2020-10-22 14:15 ` Sebastian Andrzej Siewior
2020-10-22 14:36 ` Mike Galbraith
2020-10-22 14:47 ` Mike Galbraith
2020-10-22 14:52 ` Sebastian Andrzej Siewior
2020-10-23 7:23 ` John Ogness
2020-10-23 11:08 ` Sebastian Andrzej Siewior
2020-10-22 15:04 ` John Ogness