* INFO: task hung in perf_trace_event_unreg
From: syzbot @ 2018-04-02 9:20 UTC
To: linux-kernel, mingo, rostedt, syzkaller-bugs

Hello,

syzbot hit the following crash on upstream commit
0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
Linux 4.16
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd

Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount option "g\a�;e�K�>pquota"
INFO: task syz-executor3:10803 blocked for more than 120 seconds.
Not tainted 4.16.0+ #10
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor3 D20944 10803 4492 0x80000002
Call Trace:
context_switch kernel/sched/core.c:2862 [inline]
__schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
schedule+0xf5/0x430 kernel/sched/core.c:3499
schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
do_wait_for_common kernel/sched/completion.c:86 [inline]
__wait_for_common kernel/sched/completion.c:107 [inline]
wait_for_common kernel/sched/completion.c:118 [inline]
wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
__wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213
tracepoint_synchronize_unregister include/linux/tracepoint.h:80 [inline]
perf_trace_event_unreg.isra.2+0xb7/0x1f0 kernel/trace/trace_event_perf.c:161
perf_trace_destroy+0xbc/0x100 kernel/trace/trace_event_perf.c:236
tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7976
_free_event+0x3bd/0x10f0 kernel/events/core.c:4121
put_event+0x24/0x30 kernel/events/core.c:4204
perf_event_release_kernel+0x6e8/0xfc0 kernel/events/core.c:4310
perf_release+0x37/0x50 kernel/events/core.c:4320
__fput+0x327/0x7e0 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x199/0x270 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x9bb/0x1ad0 kernel/exit.c:865
do_group_exit+0x149/0x400 kernel/exit.c:968
get_signal+0x73a/0x16d0 kernel/signal.c:2469
do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809
exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x455269
RSP: 002b:00007f8976371ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000000 RBX: 000000000072bec8 RCX: 0000000000455269
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8
RBP: 000000000072bec8 R08: 0000000000000000 R09: 000000000072bea0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe793f79cf R14: 00007f89763729c0 R15: 0000000000000000

Showing all locks held in the system:
2 locks held by khungtaskd/876:
#0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
#0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60 kernel/hung_task.c:249
#1: (tasklist_lock){.+.+}, at: [<0000000006b3009f>] debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470
2 locks held by getty/4414:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4415:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4416:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4417:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4418:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4419:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
2 locks held by getty/4420:
#0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
1 lock held by syz-executor3/10803:
#0: (event_mutex){+.+.}, at: [<00000000c507b78a>] perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
4 locks held by syz-executor5/10816:
#0: (&tty->legacy_mutex){+.+.}, at: [<00000000567b7b94>] tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19
#1: (&tty->legacy_mutex/1){+.+.}, at: [<00000000567b7b94>] tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19
#2: (&tty->ldisc_sem){++++}, at: [<000000002b6b6a29>] tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:298
#3: (&o_tty->termios_rwsem/1){++++}, at: [<0000000007d9a7a4>] n_tty_flush_buffer+0x21/0x320 drivers/tty/n_tty.c:357
1 lock held by syz-executor2/10827:
#0: (event_mutex){+.+.}, at: [<00000000c507b78a>] perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
1 lock held by blkid/10832:
#0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
1 lock held by syz-executor4/10835:
#0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
1 lock held by syz-executor4/10845:
#0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
check_hung_task kernel/hung_task.c:132 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
watchdog+0x90c/0xd60 kernel/hung_task.c:249
INFO: rcu_sched self-detected stall on CPU
0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906 softirq=33205/33205 fqs=30980

(t=125000 jiffies g=17618 c=17617 q=921)
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound flush_to_ldisc
RIP: 0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733
RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217
RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2
RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e
RBP: ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd
R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15
FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0
DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764
n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline]
n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline]
__receive_buf drivers/tty/n_tty.c:1611 [inline]
n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709
n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456
tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38
receive_buf drivers/tty/tty_buffer.c:475 [inline]
flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Code: 60 12 00 00 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 42 0f b6 04 28 38 d0 7f 08 84 c0 0f 85 21 01 00 00 42 80 bc 33 60 12 00 00 82 <74> 0f e8 48 90 1e fe 4d 8d 74 24 02 e9 58 ff ff ff e8 39 90 1e


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start from beginning of the line in the email body.
* Re: INFO: task hung in perf_trace_event_unreg
From: Steven Rostedt @ 2018-04-02 13:40 UTC
To: syzbot
Cc: linux-kernel, mingo, syzkaller-bugs, Peter Zijlstra, Paul E. McKenney

On Mon, 02 Apr 2018 02:20:02 -0700
syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:

> Hello,
>
> syzbot hit the following crash on upstream commit
> 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
> Linux 4.16
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
> compiler: gcc (GCC) 7.1.1 20170620
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
> option "g\a�;e�K�>pquota"
> INFO: task syz-executor3:10803 blocked for more than 120 seconds.
> Not tainted 4.16.0+ #10
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor3 D20944 10803 4492 0x80000002
> Call Trace:
> context_switch kernel/sched/core.c:2862 [inline]
> __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
> schedule+0xf5/0x430 kernel/sched/core.c:3499
> schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
> do_wait_for_common kernel/sched/completion.c:86 [inline]
> __wait_for_common kernel/sched/completion.c:107 [inline]
> wait_for_common kernel/sched/completion.c:118 [inline]
> wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
> __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
> synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
> synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213

I don't think this is a perf issue. Looks like something is preventing
rcu_sched from completing. If there's a CPU that is running in kernel
space and never scheduling, that can cause this issue. Or if RCU
somehow missed a transition into idle or user space.

-- Steve
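Editorial illustration of the point made in the reply above: a toy model of an rcu_sched-style grace period, which can only end once every CPU has passed through a quiescent state (a context switch, idle, or user-mode execution). This is a sketch under stated assumptions, not kernel code. The names `QUIESCENT`, `grace_period_end` and the per-tick timeline format are hypothetical and exist only for this example.

```python
# Toy model only: each CPU is described by a list of per-tick states, and the
# last state is assumed to repeat forever. None of these names exist in the
# kernel; they are invented for this illustration.
QUIESCENT = {"schedule", "idle", "user"}


def grace_period_end(cpu_timeline, max_ticks):
    """Return the first tick at which every CPU has reported a quiescent
    state, or None if that never happens within max_ticks."""
    pending = set(cpu_timeline)
    for tick in range(max_ticks):
        for cpu in sorted(pending):  # sorted() copies, so discarding is safe
            timeline = cpu_timeline[cpu]
            state = timeline[min(tick, len(timeline) - 1)]
            if state in QUIESCENT:
                pending.discard(cpu)
        if not pending:
            return tick
    return None


# CPU 0 schedules on tick 1, CPU 1 is already in user mode: the wait ends.
healthy = {0: ["kernel", "schedule"], 1: ["user"]}
# CPU 0 loops in kernel space and never schedules: the waiter stays blocked.
stalled = {0: ["kernel"], 1: ["idle"]}

print(grace_period_end(healthy, 1000))
print(grace_period_end(stalled, 1000))
```

In the stalled case the caller of the (modelled) synchronize operation never gets an answer, which is the shape of the hang seen in the syz-executor3 trace: the task is not stuck in perf itself, it is waiting on a grace period that another CPU is holding up.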
* Re: INFO: task hung in perf_trace_event_unreg
From: Paul E. McKenney @ 2018-04-02 15:33 UTC
To: Steven Rostedt
Cc: syzbot, linux-kernel, mingo, syzkaller-bugs, Peter Zijlstra

On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote:
> On Mon, 02 Apr 2018 02:20:02 -0700
> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:
>
> > Hello,
> >
> > syzbot hit the following crash on upstream commit
> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
> > Linux 4.16
> > syzbot dashboard link:
> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> > Raw console output:
> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
> > Kernel config:
> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
> > compiler: gcc (GCC) 7.1.1 20170620
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
> > It will help syzbot understand when the bug is fixed. See footer for
> > details.
> > If you forward the report, please keep this part and the footer.
> >
> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
> > option "g\a�;e�K�>pquota"

Might not hurt to look into the above, though perhaps this is just syzkaller
playing around with mount options.

> > INFO: task syz-executor3:10803 blocked for more than 120 seconds.
> > Not tainted 4.16.0+ #10
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor3 D20944 10803 4492 0x80000002
> > Call Trace:
> > context_switch kernel/sched/core.c:2862 [inline]
> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
> > schedule+0xf5/0x430 kernel/sched/core.c:3499
> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
> > do_wait_for_common kernel/sched/completion.c:86 [inline]
> > __wait_for_common kernel/sched/completion.c:107 [inline]
> > wait_for_common kernel/sched/completion.c:118 [inline]
> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213
>
> I don't think this is a perf issue. Looks like something is preventing
> rcu_sched from completing. If there's a CPU that is running in kernel
> space and never scheduling, that can cause this issue. Or if RCU
> somehow missed a transition into idle or user space.

The RCU CPU stall warning below strongly supports this position ...

> -- Steve
>
[...]
> > Showing all locks held in the system:
> > 2 locks held by khungtaskd/876:
> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>]
> > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60
> > kernel/hung_task.c:249

... And two places to start looking are the two above rcu_read_lock() calls.
Especially given that khungtask shows up below.

[...]
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> > __dump_stack lib/dump_stack.c:17 [inline]
> > dump_stack+0x194/0x24d lib/dump_stack.c:53
> > nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
> > nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62
> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> > trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
> > check_hung_task kernel/hung_task.c:132 [inline]
> > check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
> > watchdog+0x90c/0xd60 kernel/hung_task.c:249
> > INFO: rcu_sched self-detected stall on CPU
> > 0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906
> > softirq=33205/33205 fqs=30980
> >
> > (t=125000 jiffies g=17618 c=17617 q=921)
> > kthread+0x33c/0x400 kernel/kthread.c:238
> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
> > Sending NMI from CPU 1 to CPUs 0:
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: events_unbound flush_to_ldisc
> > RIP: 0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733
> > RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217
> > RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2
> > RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e
> > RBP: ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd
> > R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15
> > FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0
> > DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> > Call Trace:
> > commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764
> > n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline]
> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline]
> > __receive_buf drivers/tty/n_tty.c:1611 [inline]
> > n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709
> > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
> > tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456
> > tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38
> > receive_buf drivers/tty/tty_buffer.c:475 [inline]
> > flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524
> > process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
> > worker_thread+0x223/0x1990 kernel/workqueue.c:2247
> > kthread+0x33c/0x400 kernel/kthread.c:238
> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406

And the above is another good place to look.

Thanx, Paul
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-02 15:33 ` Paul E. McKenney
@ 2018-04-02 16:04 ` Dmitry Vyukov
  2018-04-02 16:21 ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Vyukov @ 2018-04-02 16:04 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote:
>> On Mon, 02 Apr 2018 02:20:02 -0700
>> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:
>>
>> > Hello,
>> >
>> > syzbot hit the following crash on upstream commit
>> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
>> > Linux 4.16
>> > syzbot dashboard link:
>> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
>> >
>> > Unfortunately, I don't have any reproducer for this crash yet.
>> > Raw console output:
>> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
>> > Kernel config:
>> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
>> > compiler: gcc (GCC) 7.1.1 20170620
>> >
>> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
>> > It will help syzbot understand when the bug is fixed. See footer for
>> > details.
>> > If you forward the report, please keep this part and the footer.
>> >
>> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
>> > option "g �;e�K�>pquota"
>
> Might not hurt to look into the above, though perhaps this is just syzkaller
> playing around with mount options.
>
>> > INFO: task syz-executor3:10803 blocked for more than 120 seconds.
>> > Not tainted 4.16.0+ #10
>> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> > syz-executor3 D20944 10803 4492 0x80000002
>> > Call Trace:
>> > context_switch kernel/sched/core.c:2862 [inline]
>> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
>> > schedule+0xf5/0x430 kernel/sched/core.c:3499
>> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
>> > do_wait_for_common kernel/sched/completion.c:86 [inline]
>> > __wait_for_common kernel/sched/completion.c:107 [inline]
>> > wait_for_common kernel/sched/completion.c:118 [inline]
>> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
>> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
>> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
>> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213
>>
>> I don't think this is a perf issue. Looks like something is preventing
>> rcu_sched from completing. If there's a CPU that is running in kernel
>> space and never scheduling, that can cause this issue. Or if RCU
>> somehow missed a transition into idle or user space.
>
> The RCU CPU stall warning below strongly supports this position ...

I think this is this guy then:

https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40

#syz dup: INFO: rcu detected stall in __process_echoes

Looking retrospectively at the various hang/stall bugs that we have, I
think we need some kind of priority between them. I.e. we have rcu
stalls, spinlock stalls, workqueue hangs, task hangs, silent machine
hangs and maybe something else. It would be useful if they fired
deterministically according to priorities. If there is an rcu stall,
that's always detected as a CPU stall. Then if there is no RCU stall,
but a workqueue stall, then that's always detected as a workqueue stall,
etc.
Currently if we have an RCU stall (effectively a CPU stall), that can be
detected either as an RCU stall or as a task hang, producing 2 different
bug reports (which is bad).
One can say that it's only a matter of tuning timeouts, but at least
the task hung detector has a problem that if you set the timeout to X,
it can detect a hang anywhere between X and 2*X. And on one hand we need
quite a large timeout (a minute may not be enough), and on the other hand
we can't wait for an hour just to make sure that the machine is indeed
dead (these things happen every few minutes).

>> > tracepoint_synchronize_unregister include/linux/tracepoint.h:80 [inline]
>> > perf_trace_event_unreg.isra.2+0xb7/0x1f0
>> > kernel/trace/trace_event_perf.c:161
>> > perf_trace_destroy+0xbc/0x100 kernel/trace/trace_event_perf.c:236
>> > tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7976
>> > _free_event+0x3bd/0x10f0 kernel/events/core.c:4121
>> > put_event+0x24/0x30 kernel/events/core.c:4204
>> > perf_event_release_kernel+0x6e8/0xfc0 kernel/events/core.c:4310
>> > perf_release+0x37/0x50 kernel/events/core.c:4320
>> > __fput+0x327/0x7e0 fs/file_table.c:209
>> > ____fput+0x15/0x20 fs/file_table.c:243
>> > task_work_run+0x199/0x270 kernel/task_work.c:113
>> > exit_task_work include/linux/task_work.h:22 [inline]
>> > do_exit+0x9bb/0x1ad0 kernel/exit.c:865
>> > do_group_exit+0x149/0x400 kernel/exit.c:968
>> > get_signal+0x73a/0x16d0 kernel/signal.c:2469
>> > do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809
>> > exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162
>> > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>> > syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
>> > do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292
>> > entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> > RIP: 0033:0x455269
>> > RSP: 002b:00007f8976371ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> > RAX: 0000000000000000 RBX: 000000000072bec8 RCX: 0000000000455269
>> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8
>> > RBP: 000000000072bec8 R08: 0000000000000000 R09: 000000000072bea0
>> > R10: 0000000000000000 R11: 0000000000000246 R12:
0000000000000000 >> > R13: 00007ffe793f79cf R14: 00007f89763729c0 R15: 0000000000000000 >> > >> > Showing all locks held in the system: >> > 2 locks held by khungtaskd/876: >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] >> > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60 >> > kernel/hung_task.c:249 > > ... And two places to start looking are the two above rcu_read_lock() calls. > Especially given that khungtask shows up below. > >> > #1: (tasklist_lock){.+.+}, at: [<0000000006b3009f>] >> > debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470 >> > 2 locks held by getty/4414: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4415: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4416: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4417: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4418: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > 
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4419: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 2 locks held by getty/4420: >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> > 1 lock held by syz-executor3/10803: >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 >> > 4 locks held by syz-executor5/10816: >> > #0: (&tty->legacy_mutex){+.+.}, at: [<00000000567b7b94>] >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 >> > #1: (&tty->legacy_mutex/1){+.+.}, at: [<00000000567b7b94>] >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 >> > #2: (&tty->ldisc_sem){++++}, at: [<000000002b6b6a29>] >> > tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:298 >> > #3: (&o_tty->termios_rwsem/1){++++}, at: [<0000000007d9a7a4>] >> > n_tty_flush_buffer+0x21/0x320 drivers/tty/n_tty.c:357 >> > 1 lock held by syz-executor2/10827: >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 >> > 1 lock held by blkid/10832: >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> > 1 lock held by syz-executor4/10835: >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> > 1 lock held by syz-executor4/10845: >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> > >> > ============================================= >> > >> > NMI backtrace for cpu 1 >> > 
CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10 >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> > Google 01/01/2011 >> > Call Trace: >> > __dump_stack lib/dump_stack.c:17 [inline] >> > dump_stack+0x194/0x24d lib/dump_stack.c:53 >> > nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103 >> > nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62 >> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 >> > trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline] >> > check_hung_task kernel/hung_task.c:132 [inline] >> > check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline] >> > watchdog+0x90c/0xd60 kernel/hung_task.c:249 >> > INFO: rcu_sched self-detected stall on CPU >> > 0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906 >> > softirq=33205/33205 fqs=30980 >> > >> > (t=125000 jiffies g=17618 c=17617 q=921) >> > kthread+0x33c/0x400 kernel/kthread.c:238 >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 >> > Sending NMI from CPU 1 to CPUs 0: >> > NMI backtrace for cpu 0 >> > CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10 >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> > Google 01/01/2011 >> > Workqueue: events_unbound flush_to_ldisc >> > RIP: 0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733 >> > RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217 >> > RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2 >> > RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e >> > RBP: ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000 >> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd >> > R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15 >> > FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0 >> > DR0: 
0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 >> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 >> > Call Trace: >> > commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764 >> > n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline] >> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline] >> > __receive_buf drivers/tty/n_tty.c:1611 [inline] >> > n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709 >> > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744 >> > tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456 >> > tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38 >> > receive_buf drivers/tty/tty_buffer.c:475 [inline] >> > flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524 >> > process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 >> > worker_thread+0x223/0x1990 kernel/workqueue.c:2247 >> > kthread+0x33c/0x400 kernel/kthread.c:238 >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 > > And the above is another good place to look. > > Thanx, Paul > >> > Code: 60 12 00 00 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 42 0f b6 04 28 38 >> > d0 7f 08 84 c0 0f 85 21 01 00 00 42 80 bc 33 60 12 00 00 82 <74> 0f e8 48 >> > 90 1e fe 4d 8d 74 24 02 e9 58 ff ff ff e8 39 90 1e >> > >> > >> > --- >> > This bug is generated by a dumb bot. It may contain errors. >> > See https://goo.gl/tpsmEJ for details. >> > Direct all questions to syzkaller@googlegroups.com. >> > >> > syzbot will keep track of this bug report. >> > If you forgot to add the Reported-by tag, once the fix for this bug is >> > merged >> > into any tree, please reply to this email with: >> > #syz fix: exact-commit-title >> > To mark this as a duplicate of another syzbot report, please reply with: >> > #syz dup: exact-subject-of-another-report >> > If it's a one-off invalid bug report, please reply with: >> > #syz invalid >> > Note: if the crash happens again, it will cause creation of a new bug >> > report. 
>> > Note: all commands must start from beginning of the line in the email body. >> > > -- > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20180402153332.GM3948%40linux.vnet.ibm.com. > For more options, visit https://groups.google.com/d/optout. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-02 16:04 ` Dmitry Vyukov
@ 2018-04-02 16:21 ` Paul E. McKenney
  2018-04-02 16:32 ` Dmitry Vyukov
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2018-04-02 16:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 02, 2018 at 06:04:35PM +0200, Dmitry Vyukov wrote:
> On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote:
> >> On Mon, 02 Apr 2018 02:20:02 -0700
> >> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:
> >>
> >> > Hello,
> >> >
> >> > syzbot hit the following crash on upstream commit
> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
> >> > Linux 4.16
> >> > syzbot dashboard link:
> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
> >> >
> >> > Unfortunately, I don't have any reproducer for this crash yet.
> >> > Raw console output:
> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
> >> > Kernel config:
> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
> >> > compiler: gcc (GCC) 7.1.1 20170620
> >> >
> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
> >> > It will help syzbot understand when the bug is fixed. See footer for
> >> > details.
> >> > If you forward the report, please keep this part and the footer.
> >> >
> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
> >> > option "g �;e�K�>pquota"
> >
> > Might not hurt to look into the above, though perhaps this is just syzkaller
> > playing around with mount options.
> >
> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds.
> >> > Not tainted 4.16.0+ #10
> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> > syz-executor3 D20944 10803 4492 0x80000002
> >> > Call Trace:
> >> > context_switch kernel/sched/core.c:2862 [inline]
> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499
> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
> >> > do_wait_for_common kernel/sched/completion.c:86 [inline]
> >> > __wait_for_common kernel/sched/completion.c:107 [inline]
> >> > wait_for_common kernel/sched/completion.c:118 [inline]
> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213
> >>
> >> I don't think this is a perf issue. Looks like something is preventing
> >> rcu_sched from completing. If there's a CPU that is running in kernel
> >> space and never scheduling, that can cause this issue. Or if RCU
> >> somehow missed a transition into idle or user space.
> >
> > The RCU CPU stall warning below strongly supports this position ...
>
> I think this is this guy then:
>
> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40
>
> #syz dup: INFO: rcu detected stall in __process_echoes

Seems likely to me!

> Looking retrospectively at the various hang/stall bugs that we have, I
> think we need some kind of priority between them. I.e. we have rcu
> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine
> hang and maybe something else. It would be useful if they fire
> deterministically according to priorities. If there is an rcu stall,
> that's always detected as CPU stall. Then if there is no RCU stall,
> but a workqueue stall, then that's always detected as workqueue stall,
> etc.
> Currently if we have an RCU stall (effectively CPU stall), that can be
> detected either RCU stall or a task hung, producing 2 different bug
> reports (which is bad).
> One can say that it's only a matter of tuning timeouts, but at least
> task hung detector has a problem that if you set timeout to X, it can
> detect hung anywhere between X and 2*X. And on one hand we need quite
> large timeout (a minute may not be enough), and on the other hand we
> can't wait for an hour just to make sure that the machine is indeed
> dead (these things happen every few minutes).

I suppose that we could have a global variable that was set to the
priority of the complaint in question, which would suppress all
lower-priority complaints. Might need to be opt-in, though -- I would
guess that not everyone is going to be happy with one complaint suppressing
others, especially given the possibility that the two complaints might
be about different things.

Or did you have something more deft in mind?

							Thanx, Paul

> >> > tracepoint_synchronize_unregister include/linux/tracepoint.h:80 [inline]
> >> > perf_trace_event_unreg.isra.2+0xb7/0x1f0
> >> > kernel/trace/trace_event_perf.c:161
> >> > perf_trace_destroy+0xbc/0x100 kernel/trace/trace_event_perf.c:236
> >> > tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7976
> >> > _free_event+0x3bd/0x10f0 kernel/events/core.c:4121
> >> > put_event+0x24/0x30 kernel/events/core.c:4204
> >> > perf_event_release_kernel+0x6e8/0xfc0 kernel/events/core.c:4310
> >> > perf_release+0x37/0x50 kernel/events/core.c:4320
> >> > __fput+0x327/0x7e0 fs/file_table.c:209
> >> > ____fput+0x15/0x20 fs/file_table.c:243
> >> > task_work_run+0x199/0x270 kernel/task_work.c:113
> >> > exit_task_work include/linux/task_work.h:22 [inline]
> >> > do_exit+0x9bb/0x1ad0 kernel/exit.c:865
> >> > do_group_exit+0x149/0x400 kernel/exit.c:968
> >> > get_signal+0x73a/0x16d0 kernel/signal.c:2469
> >> > do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809
> >> >
exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162 > >> > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] > >> > syscall_return_slowpath arch/x86/entry/common.c:265 [inline] > >> > do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292 > >> > entry_SYSCALL_64_after_hwframe+0x42/0xb7 > >> > RIP: 0033:0x455269 > >> > RSP: 002b:00007f8976371ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > >> > RAX: 0000000000000000 RBX: 000000000072bec8 RCX: 0000000000455269 > >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8 > >> > RBP: 000000000072bec8 R08: 0000000000000000 R09: 000000000072bea0 > >> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > >> > R13: 00007ffe793f79cf R14: 00007f89763729c0 R15: 0000000000000000 > >> > > >> > Showing all locks held in the system: > >> > 2 locks held by khungtaskd/876: > >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] > >> > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] > >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60 > >> > kernel/hung_task.c:249 > > > > ... And two places to start looking are the two above rcu_read_lock() calls. > > Especially given that khungtask shows up below. 
> > > >> > #1: (tasklist_lock){.+.+}, at: [<0000000006b3009f>] > >> > debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470 > >> > 2 locks held by getty/4414: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4415: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4416: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4417: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4418: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4419: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 2 locks held by getty/4420: > >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> > #1: 
(&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> > 1 lock held by syz-executor3/10803: > >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] > >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 > >> > 4 locks held by syz-executor5/10816: > >> > #0: (&tty->legacy_mutex){+.+.}, at: [<00000000567b7b94>] > >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 > >> > #1: (&tty->legacy_mutex/1){+.+.}, at: [<00000000567b7b94>] > >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 > >> > #2: (&tty->ldisc_sem){++++}, at: [<000000002b6b6a29>] > >> > tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:298 > >> > #3: (&o_tty->termios_rwsem/1){++++}, at: [<0000000007d9a7a4>] > >> > n_tty_flush_buffer+0x21/0x320 drivers/tty/n_tty.c:357 > >> > 1 lock held by syz-executor2/10827: > >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] > >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 > >> > 1 lock held by blkid/10832: > >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> > 1 lock held by syz-executor4/10835: > >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> > 1 lock held by syz-executor4/10845: > >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> > > >> > ============================================= > >> > > >> > NMI backtrace for cpu 1 > >> > CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10 > >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > >> > Google 01/01/2011 > >> > Call Trace: > >> > __dump_stack lib/dump_stack.c:17 [inline] > >> > dump_stack+0x194/0x24d lib/dump_stack.c:53 > >> > nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103 > >> > nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62 > >> > 
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 > >> > trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline] > >> > check_hung_task kernel/hung_task.c:132 [inline] > >> > check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline] > >> > watchdog+0x90c/0xd60 kernel/hung_task.c:249 > >> > INFO: rcu_sched self-detected stall on CPU > >> > 0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906 > >> > softirq=33205/33205 fqs=30980 > >> > > >> > (t=125000 jiffies g=17618 c=17617 q=921) > >> > kthread+0x33c/0x400 kernel/kthread.c:238 > >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 > >> > Sending NMI from CPU 1 to CPUs 0: > >> > NMI backtrace for cpu 0 > >> > CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10 > >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > >> > Google 01/01/2011 > >> > Workqueue: events_unbound flush_to_ldisc > >> > RIP: 0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733 > >> > RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217 > >> > RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2 > >> > RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e > >> > RBP: ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000 > >> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd > >> > R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15 > >> > FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 > >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> > CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0 > >> > DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > >> > Call Trace: > >> > commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764 > >> > n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline] > >> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline] > >> > __receive_buf 
drivers/tty/n_tty.c:1611 [inline]
> >> > n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709
> >> > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744
> >> > tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456
> >> > tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38
> >> > receive_buf drivers/tty/tty_buffer.c:475 [inline]
> >> > flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524
> >> > process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
> >> > worker_thread+0x223/0x1990 kernel/workqueue.c:2247
> >> > kthread+0x33c/0x400 kernel/kthread.c:238
> >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
> >
> > And the above is another good place to look.
> >
> > Thanx, Paul
> >
> >> > Code: 60 12 00 00 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 42 0f b6 04 28 38
> >> > d0 7f 08 84 c0 0f 85 21 01 00 00 42 80 bc 33 60 12 00 00 82 <74> 0f e8 48
> >> > 90 1e fe 4d 8d 74 24 02 e9 58 ff ff ff e8 39 90 1e
> >> >
> >> >
> >> > ---
> >> > This bug is generated by a dumb bot. It may contain errors.
> >> > See https://goo.gl/tpsmEJ for details.
> >> > Direct all questions to syzkaller@googlegroups.com.
> >> >
> >> > syzbot will keep track of this bug report.
> >> > If you forgot to add the Reported-by tag, once the fix for this bug is
> >> > merged
> >> > into any tree, please reply to this email with:
> >> > #syz fix: exact-commit-title
> >> > To mark this as a duplicate of another syzbot report, please reply with:
> >> > #syz dup: exact-subject-of-another-report
> >> > If it's a one-off invalid bug report, please reply with:
> >> > #syz invalid
> >> > Note: if the crash happens again, it will cause creation of a new bug
> >> > report.
> >> > Note: all commands must start from beginning of the line in the email body.
> >> >
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-02 16:21 ` Paul E. McKenney
@ 2018-04-02 16:32 ` Dmitry Vyukov
  2018-04-02 16:39 ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Vyukov @ 2018-04-02 16:32 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 2, 2018 at 6:21 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Apr 02, 2018 at 06:04:35PM +0200, Dmitry Vyukov wrote:
>> On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote:
>> >> On Mon, 02 Apr 2018 02:20:02 -0700
>> >> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:
>> >>
>> >> > Hello,
>> >> >
>> >> > syzbot hit the following crash on upstream commit
>> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
>> >> > Linux 4.16
>> >> > syzbot dashboard link:
>> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
>> >> >
>> >> > Unfortunately, I don't have any reproducer for this crash yet.
>> >> > Raw console output:
>> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
>> >> > Kernel config:
>> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
>> >> > compiler: gcc (GCC) 7.1.1 20170620
>> >> >
>> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
>> >> > It will help syzbot understand when the bug is fixed. See footer for
>> >> > details.
>> >> > If you forward the report, please keep this part and the footer.
>> >> >
>> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
>> >> > option "g �;e�K�>pquota"
>> >
>> > Might not hurt to look into the above, though perhaps this is just syzkaller
>> > playing around with mount options.
>> > >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. >> >> > Not tainted 4.16.0+ #10 >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> >> > syz-executor3 D20944 10803 4492 0x80000002 >> >> > Call Trace: >> >> > context_switch kernel/sched/core.c:2862 [inline] >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] >> >> > wait_for_common kernel/sched/completion.c:118 [inline] >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 >> >> >> >> I don't think this is a perf issue. Looks like something is preventing >> >> rcu_sched from completing. If there's a CPU that is running in kernel >> >> space and never scheduling, that can cause this issue. Or if RCU >> >> somehow missed a transition into idle or user space. >> > >> > The RCU CPU stall warning below strongly supports this position ... >> >> I think this is this guy then: >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> >> #syz dup: INFO: rcu detected stall in __process_echoes > > Seems likely to me! > >> Looking retrospectively at the various hang/stall bugs that we have, I >> think we need some kind of priority between them. I.e. we have rcu >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine >> hang and maybe something else. It would be useful if they fire >> deterministically according to priorities. If there is an rcu stall, >> that's always detected as CPU stall. 
Then if there is no RCU stall, >> but a workqueue stall, then that's always detected as workqueue stall, >> etc. >> Currently if we have an RCU stall (effectively CPU stall), that can be >> detected either RCU stall or a task hung, producing 2 different bug >> reports (which is bad). >> One can say that it's only a matter of tuning timeouts, but at least >> task hung detector has a problem that if you set timeout to X, it can >> detect hung anywhere between X and 2*X. And on one hand we need quite >> large timeout (a minute may not be enough), and on the other hand we >> can't wait for an hour just to make sure that the machine is indeed >> dead (these things happen every few minutes). > > I suppose that we could have a global variable that was set to the > priority of the complaint in question, which would suppress all > lower-priority complaints. Might need to be opt-in, though -- I would > guess that not everyone is going to be happy with one complaint suppressing > others, especially given the possibility that the two complaints might > be about different things. > > Or did you have something more deft in mind? syzkaller generally looks only at the first report. One does not know if/when there will be a second one, or the second one can be induced by the first one, and we generally want clean reports on a non-tainted kernel. So we don't just need to suppress lower priority ones, we need to produce the right report first. I am thinking maybe setting: - rcu stalls at 1.5 minutes - workqueue stalls at 2 minutes - task hungs at 2.5 minutes - and no output whatsoever at 3 minutes Do I miss anything? I think at least spinlocks. Should they go before or after rcu? This will require fixing task hung. Have not yet looked at workqueue detector. Does at least RCU respect the given timeout more or less precisely? 
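
The timing concern in this message can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not kernel code: it assumes the task hung detector wakes every `timeout` seconds and reports a task whose context-switch count is unchanged between two consecutive wakeups, which is one way to arrive at the "between X and 2*X" window mentioned in the thread. The function names (`hung_task_report_delay`, `firing_windows`, `misordered`) and the `jitter` factor are hypothetical and exist only for illustration; the thresholds are the ones proposed above, converted to seconds.

```python
import math

# Hypothetical model, not kernel code: a checker wakes up every `timeout`
# seconds, samples a task's context-switch count, and reports the task once
# the count is unchanged between two consecutive wakeups.
def hung_task_report_delay(hang_start, timeout):
    """Seconds between the start of a hang and its report, in this model."""
    # The first wakeup at or after the hang only records the count, assuming
    # the task ran at some point since the previous wakeup.
    first_check = math.ceil(hang_start / timeout) * timeout
    # One period later the count is unchanged, so the hang is reported.
    return first_check + timeout - hang_start


def firing_windows(thresholds, jitter):
    """Map detector name -> (earliest, latest) firing time in seconds.

    `jitter` gives a worst-case lateness factor per detector: 1.0 means
    "fires at the threshold", 2.0 means "may fire up to 2x late".
    """
    return {name: (t, t * jitter.get(name, 1.0))
            for name, t in thresholds.items()}


def misordered(windows, order):
    """Adjacent pairs (a, b) in `order` where b may fire before a does."""
    return [(a, b) for a, b in zip(order, order[1:])
            if windows[a][1] >= windows[b][0]]


# Thresholds proposed in the message, in seconds.
proposed = {"rcu": 90, "workqueue": 120, "task_hung": 150, "no_output": 180}
order = ["rcu", "workqueue", "task_hung", "no_output"]

# A task hung detector that may fire anywhere between X and 2*X ...
print(misordered(firing_windows(proposed, {"task_hung": 2.0}), order))
# → [('task_hung', 'no_output')]

# ... versus one that fires at X.
print(misordered(firing_windows(proposed, {}), order))
# → []
```

In this model a hang that begins one second after a wakeup is reported 239 seconds later with a 120-second timeout, while one that begins one second before a wakeup is reported after 121 seconds. That spread is why staggered thresholds alone do not give a deterministic first report unless the task hung detector's window is tightened.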
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-02 16:32 ` Dmitry Vyukov
@ 2018-04-02 16:39 ` Paul E. McKenney
  2018-04-02 17:11 ` Dmitry Vyukov
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2018-04-02 16:39 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 02, 2018 at 06:32:03PM +0200, Dmitry Vyukov wrote:
> On Mon, Apr 2, 2018 at 6:21 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Mon, Apr 02, 2018 at 06:04:35PM +0200, Dmitry Vyukov wrote:
> >> On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote:
> >> >> On Mon, 02 Apr 2018 02:20:02 -0700
> >> >> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote:
> >> >>
> >> >> > Hello,
> >> >> >
> >> >> > syzbot hit the following crash on upstream commit
> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000)
> >> >> > Linux 4.16
> >> >> > syzbot dashboard link:
> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd
> >> >> >
> >> >> > Unfortunately, I don't have any reproducer for this crash yet.
> >> >> > Raw console output:
> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400
> >> >> > Kernel config:
> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459
> >> >> > compiler: gcc (GCC) 7.1.1 20170620
> >> >> >
> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com
> >> >> > It will help syzbot understand when the bug is fixed. See footer for
> >> >> > details.
> >> >> > If you forward the report, please keep this part and the footer.
> >> >> >
> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount
> >> >> > option "g �;e�K�>pquota"
> >> >
> >> > Might not hurt to look into the above, though perhaps this is just syzkaller
> >> > playing around with mount options.
> >> >
> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds.
> >> >> > Not tainted 4.16.0+ #10
> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> >> > syz-executor3 D20944 10803 4492 0x80000002
> >> >> > Call Trace:
> >> >> > context_switch kernel/sched/core.c:2862 [inline]
> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499
> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline]
> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline]
> >> >> > wait_for_common kernel/sched/completion.c:118 [inline]
> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414
> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212
> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213
> >> >>
> >> >> I don't think this is a perf issue. Looks like something is preventing
> >> >> rcu_sched from completing. If there's a CPU that is running in kernel
> >> >> space and never scheduling, that can cause this issue. Or if RCU
> >> >> somehow missed a transition into idle or user space.
> >> >
> >> > The RCU CPU stall warning below strongly supports this position ...
> >>
> >> I think this is this guy then:
> >>
> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40
> >>
> >> #syz dup: INFO: rcu detected stall in __process_echoes
> >
> > Seems likely to me!
> >
> >> Looking retrospectively at the various hang/stall bugs that we have, I
> >> think we need some kind of priority between them. I.e. we have rcu
> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine
> >> hang and maybe something else. It would be useful if they fire
> >> deterministically according to priorities. If there is an rcu stall,
> >> that's always detected as CPU stall. Then if there is no RCU stall,
> >> but a workqueue stall, then that's always detected as workqueue stall,
> >> etc.
> >> Currently if we have an RCU stall (effectively CPU stall), that can be
> >> detected either RCU stall or a task hung, producing 2 different bug
> >> reports (which is bad).
> >> One can say that it's only a matter of tuning timeouts, but at least
> >> task hung detector has a problem that if you set timeout to X, it can
> >> detect hung anywhere between X and 2*X. And on one hand we need quite
> >> large timeout (a minute may not be enough), and on the other hand we
> >> can't wait for an hour just to make sure that the machine is indeed
> >> dead (these things happen every few minutes).
> >
> > I suppose that we could have a global variable that was set to the
> > priority of the complaint in question, which would suppress all
> > lower-priority complaints. Might need to be opt-in, though -- I would
> > guess that not everyone is going to be happy with one complaint suppressing
> > others, especially given the possibility that the two complaints might
> > be about different things.
> >
> > Or did you have something more deft in mind?
>
>
> syzkaller generally looks only at the first report. One does not know
> if/when there will be a second one, or the second one can be induced
> by the first one, and we generally want clean reports on a non-tainted
> kernel. So we don't just need to suppress lower priority ones, we need
> to produce the right report first.
> I am thinking maybe setting:
> - rcu stalls at 1.5 minutes
> - workqueue stalls at 2 minutes
> - task hungs at 2.5 minutes
> - and no output whatsoever at 3 minutes
> Do I miss anything? I think at least spinlocks.
> Should they go before
> or after rcu?

That is what I know of, but the Linux kernel being what it is, there is
probably something more out there. If not now, in a few months. The
RCU CPU stall timeout can be set on the kernel-boot command line, but
you probably already knew that.

Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout
was 1.5 -seconds-. ;-)

> This will require fixing task hung. Have not yet looked at workqueue detector.
> Does at least RCU respect the given timeout more or less precisely?

Assuming that there is at least one CPU capable of taking scheduling-clock
interrupts, it should respect the timeout to within a few jiffies.

Thanx, Paul
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-02 16:39 ` Paul E. McKenney @ 2018-04-02 17:11 ` Dmitry Vyukov 2018-04-02 17:23 ` Paul E. McKenney 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Vyukov @ 2018-04-02 17:11 UTC (permalink / raw) To: Paul McKenney Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Mon, Apr 2, 2018 at 6:39 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > On Mon, Apr 02, 2018 at 06:32:03PM +0200, Dmitry Vyukov wrote: >> On Mon, Apr 2, 2018 at 6:21 PM, Paul E. McKenney >> <paulmck@linux.vnet.ibm.com> wrote: >> > On Mon, Apr 02, 2018 at 06:04:35PM +0200, Dmitry Vyukov wrote: >> >> On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney >> >> <paulmck@linux.vnet.ibm.com> wrote: >> >> > On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote: >> >> >> On Mon, 02 Apr 2018 02:20:02 -0700 >> >> >> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote: >> >> >> >> >> >> > Hello, >> >> >> > >> >> >> > syzbot hit the following crash on upstream commit >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) >> >> >> > Linux 4.16 >> >> >> > syzbot dashboard link: >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd >> >> >> > >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. >> >> >> > Raw console output: >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 >> >> >> > Kernel config: >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 >> >> >> > >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com >> >> >> > It will help syzbot understand when the bug is fixed. See footer for >> >> >> > details. >> >> >> > If you forward the report, please keep this part and the footer. 
>> >> >> > >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount >> >> >> > option "g �;e�K�>pquota" >> >> > >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller >> >> > playing around with mount options. >> >> > >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. >> >> >> > Not tainted 4.16.0+ #10 >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 >> >> >> > Call Trace: >> >> >> > context_switch kernel/sched/core.c:2862 [inline] >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel >> >> >> space and never scheduling, that can cause this issue. Or if RCU >> >> >> somehow missed a transition into idle or user space. >> >> > >> >> > The RCU CPU stall warning below strongly supports this position ... >> >> >> >> I think this is this guy then: >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes >> > >> > Seems likely to me! 
>> > >> >> Looking retrospectively at the various hang/stall bugs that we have, I >> >> think we need some kind of priority between them. I.e. we have rcu >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine >> >> hang and maybe something else. It would be useful if they fire >> >> deterministically according to priorities. If there is an rcu stall, >> >> that's always detected as CPU stall. Then if there is no RCU stall, >> >> but a workqueue stall, then that's always detected as workqueue stall, >> >> etc. >> >> Currently if we have an RCU stall (effectively CPU stall), that can be >> >> detected either RCU stall or a task hung, producing 2 different bug >> >> reports (which is bad). >> >> One can say that it's only a matter of tuning timeouts, but at least >> >> task hung detector has a problem that if you set timeout to X, it can >> >> detect hung anywhere between X and 2*X. And on one hand we need quite >> >> large timeout (a minute may not be enough), and on the other hand we >> >> can't wait for an hour just to make sure that the machine is indeed >> >> dead (these things happen every few minutes). >> > >> > I suppose that we could have a global variable that was set to the >> > priority of the complaint in question, which would suppress all >> > lower-priority complaints. Might need to be opt-in, though -- I would >> > guess that not everyone is going to be happy with one complaint suppressing >> > others, especially given the possibility that the two complaints might >> > be about different things. >> > >> > Or did you have something more deft in mind? >> >> >> syzkaller generally looks only at the first report. One does not know >> if/when there will be a second one, or the second one can be induced >> by the first one, and we generally want clean reports on a non-tainted >> kernel. So we don't just need to suppress lower priority ones, we need >> to produce the right report first. 
>> I am thinking maybe setting:
>> - rcu stalls at 1.5 minutes
>> - workqueue stalls at 2 minutes
>> - task hungs at 2.5 minutes
>> - and no output whatsoever at 3 minutes
>> Do I miss anything? I think at least spinlocks. Should they go before
>> or after rcu?
>
> That is what I know of, but the Linux kernel being what it is, there is
> probably something more out there. If not now, in a few months. The
> RCU CPU stall timeout can be set on the kernel-boot command line, but
> you probably already knew that.

Well, it's all based solely on a large number of patches and stopgaps.
If we fix main problems for today, it's already good.

> Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout
> was 1.5 -seconds-. ;-)

Have you tried to instrument every basic block with a function call to
collect coverage, check every damn memory access for validity, enable
all thinkable and unthinkable debug configs and put the insanest load
one can imagine from a swarm of parallel threads? It makes things a
bit slower ;)

>> This will require fixing task hung. Have not yet looked at workqueue detector.
>> Does at least RCU respect the given timeout more or less precisely?
>
> Assuming that there is at least one CPU capable of taking scheduling-clock
> interrupts, it should respect the timeout to within a few jiffies.

This is good!
> Thanx, Paul > >> >> >> > tracepoint_synchronize_unregister include/linux/tracepoint.h:80 [inline] >> >> >> > perf_trace_event_unreg.isra.2+0xb7/0x1f0 >> >> >> > kernel/trace/trace_event_perf.c:161 >> >> >> > perf_trace_destroy+0xbc/0x100 kernel/trace/trace_event_perf.c:236 >> >> >> > tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7976 >> >> >> > _free_event+0x3bd/0x10f0 kernel/events/core.c:4121 >> >> >> > put_event+0x24/0x30 kernel/events/core.c:4204 >> >> >> > perf_event_release_kernel+0x6e8/0xfc0 kernel/events/core.c:4310 >> >> >> > perf_release+0x37/0x50 kernel/events/core.c:4320 >> >> >> > __fput+0x327/0x7e0 fs/file_table.c:209 >> >> >> > ____fput+0x15/0x20 fs/file_table.c:243 >> >> >> > task_work_run+0x199/0x270 kernel/task_work.c:113 >> >> >> > exit_task_work include/linux/task_work.h:22 [inline] >> >> >> > do_exit+0x9bb/0x1ad0 kernel/exit.c:865 >> >> >> > do_group_exit+0x149/0x400 kernel/exit.c:968 >> >> >> > get_signal+0x73a/0x16d0 kernel/signal.c:2469 >> >> >> > do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809 >> >> >> > exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162 >> >> >> > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] >> >> >> > syscall_return_slowpath arch/x86/entry/common.c:265 [inline] >> >> >> > do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292 >> >> >> > entry_SYSCALL_64_after_hwframe+0x42/0xb7 >> >> >> > RIP: 0033:0x455269 >> >> >> > RSP: 002b:00007f8976371ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca >> >> >> > RAX: 0000000000000000 RBX: 000000000072bec8 RCX: 0000000000455269 >> >> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8 >> >> >> > RBP: 000000000072bec8 R08: 0000000000000000 R09: 000000000072bea0 >> >> >> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >> >> >> > R13: 00007ffe793f79cf R14: 00007f89763729c0 R15: 0000000000000000 >> >> >> > >> >> >> > Showing all locks held in the system: >> >> >> > 2 locks held by khungtaskd/876: >> >> >> > #0: 
(rcu_read_lock){....}, at: [<000000008f2bec4b>] >> >> >> > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] >> >> >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60 >> >> >> > kernel/hung_task.c:249 >> >> > >> >> > ... And two places to start looking are the two above rcu_read_lock() calls. >> >> > Especially given that khungtask shows up below. >> >> > >> >> >> > #1: (tasklist_lock){.+.+}, at: [<0000000006b3009f>] >> >> >> > debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470 >> >> >> > 2 locks held by getty/4414: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4415: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4416: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4417: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4418: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> 
> n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4419: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 2 locks held by getty/4420: >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 >> >> >> > 1 lock held by syz-executor3/10803: >> >> >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] >> >> >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 >> >> >> > 4 locks held by syz-executor5/10816: >> >> >> > #0: (&tty->legacy_mutex){+.+.}, at: [<00000000567b7b94>] >> >> >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 >> >> >> > #1: (&tty->legacy_mutex/1){+.+.}, at: [<00000000567b7b94>] >> >> >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 >> >> >> > #2: (&tty->ldisc_sem){++++}, at: [<000000002b6b6a29>] >> >> >> > tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:298 >> >> >> > #3: (&o_tty->termios_rwsem/1){++++}, at: [<0000000007d9a7a4>] >> >> >> > n_tty_flush_buffer+0x21/0x320 drivers/tty/n_tty.c:357 >> >> >> > 1 lock held by syz-executor2/10827: >> >> >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] >> >> >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 >> >> >> > 1 lock held by blkid/10832: >> >> >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> >> >> > 1 lock held by syz-executor4/10835: >> >> >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> >> >> > 1 lock held by syz-executor4/10845: >> >> >> > #0: 
(&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 >> >> >> > >> >> >> > ============================================= >> >> >> > >> >> >> > NMI backtrace for cpu 1 >> >> >> > CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10 >> >> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> >> >> > Google 01/01/2011 >> >> >> > Call Trace: >> >> >> > __dump_stack lib/dump_stack.c:17 [inline] >> >> >> > dump_stack+0x194/0x24d lib/dump_stack.c:53 >> >> >> > nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103 >> >> >> > nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62 >> >> >> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 >> >> >> > trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline] >> >> >> > check_hung_task kernel/hung_task.c:132 [inline] >> >> >> > check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline] >> >> >> > watchdog+0x90c/0xd60 kernel/hung_task.c:249 >> >> >> > INFO: rcu_sched self-detected stall on CPU >> >> >> > 0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906 >> >> >> > softirq=33205/33205 fqs=30980 >> >> >> > >> >> >> > (t=125000 jiffies g=17618 c=17617 q=921) >> >> >> > kthread+0x33c/0x400 kernel/kthread.c:238 >> >> >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 >> >> >> > Sending NMI from CPU 1 to CPUs 0: >> >> >> > NMI backtrace for cpu 0 >> >> >> > CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10 >> >> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> >> >> > Google 01/01/2011 >> >> >> > Workqueue: events_unbound flush_to_ldisc >> >> >> > RIP: 0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733 >> >> >> > RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217 >> >> >> > RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2 >> >> >> > RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e >> >> >> > RBP: 
ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000 >> >> >> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd >> >> >> > R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15 >> >> >> > FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 >> >> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> >> >> > CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0 >> >> >> > DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 >> >> >> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 >> >> >> > Call Trace: >> >> >> > commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764 >> >> >> > n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline] >> >> >> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline] >> >> >> > __receive_buf drivers/tty/n_tty.c:1611 [inline] >> >> >> > n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709 >> >> >> > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744 >> >> >> > tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456 >> >> >> > tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38 >> >> >> > receive_buf drivers/tty/tty_buffer.c:475 [inline] >> >> >> > flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524 >> >> >> > process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 >> >> >> > worker_thread+0x223/0x1990 kernel/workqueue.c:2247 >> >> >> > kthread+0x33c/0x400 kernel/kthread.c:238 >> >> >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 >> >> > >> >> > And the above is another good place to look. >> >> > >> >> > Thanx, Paul >> >> > >> >> >> > Code: 60 12 00 00 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 42 0f b6 04 28 38 >> >> >> > d0 7f 08 84 c0 0f 85 21 01 00 00 42 80 bc 33 60 12 00 00 82 <74> 0f e8 48 >> >> >> > 90 1e fe 4d 8d 74 24 02 e9 58 ff ff ff e8 39 90 1e >> >> >> > >> >> >> > >> >> >> > --- >> >> >> > This bug is generated by a dumb bot. It may contain errors. 
^ permalink raw reply [flat|nested] 19+ messages in thread
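The point made in this message, that a task-hung timeout of X can fire anywhere between X and 2*X, can be illustrated with a small model. This is a sketch under a stated assumption, not the kernel implementation: it assumes a checker that wakes every `timeout` seconds, where the first wake-up after the hang only records the task's state and the next wake-up reports it. The function names are hypothetical, and the RCU stall timeout of 125 seconds used in the second example is an illustrative value, not one taken from the report.

```python
import math


def hung_task_report_latency(hang_start: int, timeout: int) -> int:
    """Seconds between a task blocking and the detector reporting it.

    Simplified model (an assumption, not the kernel code): the checker
    wakes every `timeout` seconds; the first wake-up after the hang only
    records the task, and the following wake-up reports it.
    """
    first_check = (math.floor(hang_start / timeout) + 1) * timeout
    return first_check + timeout - hang_start


def first_report(hang_start: int, rcu_timeout: int, hung_timeout: int) -> str:
    """Name the detector that fires first for a stall that triggers both."""
    hung_latency = hung_task_report_latency(hang_start, hung_timeout)
    return "rcu" if rcu_timeout < hung_latency else "task_hung"


# Timeouts proposed in the thread, in seconds.
RCU_STALL = 90
TASK_HUNG = 150
SILENT_LIMIT = 180

latencies = [hung_task_report_latency(t, TASK_HUNG) for t in range(0, 600, 7)]
print(min(latencies), max(latencies))

# With a 120 s task-hung timeout and an illustrative 125 s RCU timeout,
# the first report depends only on when the stall began.
print(first_report(0, 125, 120), first_report(119, 125, 120))
```

In this model the proposed 90 s / 150 s split always reports the RCU stall first, but a task-hung report can arrive as late as 300 s, well past the 3-minute "no output" limit. That is consistent with the remark that the scheme requires fixing the task-hung detector.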
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-02 17:11 ` Dmitry Vyukov @ 2018-04-02 17:23 ` Paul E. McKenney 2018-04-09 12:54 ` Dmitry Vyukov 0 siblings, 1 reply; 19+ messages in thread From: Paul E. McKenney @ 2018-04-02 17:23 UTC (permalink / raw) To: Dmitry Vyukov Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Mon, Apr 02, 2018 at 07:11:50PM +0200, Dmitry Vyukov wrote: > On Mon, Apr 2, 2018 at 6:39 PM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > > On Mon, Apr 02, 2018 at 06:32:03PM +0200, Dmitry Vyukov wrote: > >> On Mon, Apr 2, 2018 at 6:21 PM, Paul E. McKenney > >> <paulmck@linux.vnet.ibm.com> wrote: > >> > On Mon, Apr 02, 2018 at 06:04:35PM +0200, Dmitry Vyukov wrote: > >> >> On Mon, Apr 2, 2018 at 5:33 PM, Paul E. McKenney > >> >> <paulmck@linux.vnet.ibm.com> wrote: > >> >> > On Mon, Apr 02, 2018 at 09:40:40AM -0400, Steven Rostedt wrote: > >> >> >> On Mon, 02 Apr 2018 02:20:02 -0700 > >> >> >> syzbot <syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com> wrote: > >> >> >> > >> >> >> > Hello, > >> >> >> > > >> >> >> > syzbot hit the following crash on upstream commit > >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) > >> >> >> > Linux 4.16 > >> >> >> > syzbot dashboard link: > >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd > >> >> >> > > >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. > >> >> >> > Raw console output: > >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 > >> >> >> > Kernel config: > >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 > >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 > >> >> >> > > >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: > >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com > >> >> >> > It will help syzbot understand when the bug is fixed. 
See footer for > >> >> >> > details. > >> >> >> > If you forward the report, please keep this part and the footer. > >> >> >> > > >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount > >> >> >> > option "g �;e�K�>pquota" > >> >> > > >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller > >> >> > playing around with mount options. > >> >> > > >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. > >> >> >> > Not tainted 4.16.0+ #10 > >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 > >> >> >> > Call Trace: > >> >> >> > context_switch kernel/sched/core.c:2862 [inline] > >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 > >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 > >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 > >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] > >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] > >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] > >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 > >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 > >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 > >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 > >> >> >> > >> >> >> I don't think this is a perf issue. Looks like something is preventing > >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel > >> >> >> space and never scheduling, that can cause this issue. Or if RCU > >> >> >> somehow missed a transition into idle or user space. > >> >> > > >> >> > The RCU CPU stall warning below strongly supports this position ... 
> >> >> > >> >> I think this is this guy then: > >> >> > >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 > >> >> > >> >> #syz dup: INFO: rcu detected stall in __process_echoes > >> > > >> > Seems likely to me! > >> > > >> >> Looking retrospectively at the various hang/stall bugs that we have, I > >> >> think we need some kind of priority between them. I.e. we have rcu > >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine > >> >> hang and maybe something else. It would be useful if they fire > >> >> deterministically according to priorities. If there is an rcu stall, > >> >> that's always detected as CPU stall. Then if there is no RCU stall, > >> >> but a workqueue stall, then that's always detected as workqueue stall, > >> >> etc. > >> >> Currently if we have an RCU stall (effectively CPU stall), that can be > >> >> detected either RCU stall or a task hung, producing 2 different bug > >> >> reports (which is bad). > >> >> One can say that it's only a matter of tuning timeouts, but at least > >> >> task hung detector has a problem that if you set timeout to X, it can > >> >> detect hung anywhere between X and 2*X. And on one hand we need quite > >> >> large timeout (a minute may not be enough), and on the other hand we > >> >> can't wait for an hour just to make sure that the machine is indeed > >> >> dead (these things happen every few minutes). > >> > > >> > I suppose that we could have a global variable that was set to the > >> > priority of the complaint in question, which would suppress all > >> > lower-priority complaints. Might need to be opt-in, though -- I would > >> > guess that not everyone is going to be happy with one complaint suppressing > >> > others, especially given the possibility that the two complaints might > >> > be about different things. > >> > > >> > Or did you have something more deft in mind? > >> > >> > >> syzkaller generally looks only at the first report. 
One does not know
> >> if/when there will be a second one, or the second one can be induced
> >> by the first one, and we generally want clean reports on a non-tainted
> >> kernel. So we don't just need to suppress lower priority ones, we need
> >> to produce the right report first.
> >> I am thinking maybe setting:
> >> - rcu stalls at 1.5 minutes
> >> - workqueue stalls at 2 minutes
> >> - task hungs at 2.5 minutes
> >> - and no output whatsoever at 3 minutes
> >> Do I miss anything? I think at least spinlocks. Should they go before
> >> or after rcu?
> >
> > That is what I know of, but the Linux kernel being what it is, there is
> > probably something more out there. If not now, in a few months. The
> > RCU CPU stall timeout can be set on the kernel-boot command line, but
> > you probably already knew that.
>
> Well, it's all based solely on a large number of patches and stopgaps.
> If we fix main problems for today, it's already good.

Fair enough!

> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout
> > was 1.5 -seconds-. ;-)
>
> Have you tried to instrument every basic block with a function call to
> collect coverage, check every damn memory access for validity, enable
> all thinkable and unthinkable debug configs and put the insanest load
> one can imagine from a swarm of parallel threads? It makes things a
> bit slower ;)

Given that we wouldn't have had enough CPU or memory to accommodate
all of that back in DYNIX/ptx days, I am forced to answer "no". ;-)

> >> This will require fixing task hung. Have not yet looked at workqueue detector.
> >> Does at least RCU respect the given timeout more or less precisely?
> >
> > Assuming that there is at least one CPU capable of taking scheduling-clock
> > interrupts, it should respect the timeout to within a few jiffies.
>
> This is good!
;-) Thanx, Paul > >> >> >> > tracepoint_synchronize_unregister include/linux/tracepoint.h:80 [inline] > >> >> >> > perf_trace_event_unreg.isra.2+0xb7/0x1f0 > >> >> >> > kernel/trace/trace_event_perf.c:161 > >> >> >> > perf_trace_destroy+0xbc/0x100 kernel/trace/trace_event_perf.c:236 > >> >> >> > tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7976 > >> >> >> > _free_event+0x3bd/0x10f0 kernel/events/core.c:4121 > >> >> >> > put_event+0x24/0x30 kernel/events/core.c:4204 > >> >> >> > perf_event_release_kernel+0x6e8/0xfc0 kernel/events/core.c:4310 > >> >> >> > perf_release+0x37/0x50 kernel/events/core.c:4320 > >> >> >> > __fput+0x327/0x7e0 fs/file_table.c:209 > >> >> >> > ____fput+0x15/0x20 fs/file_table.c:243 > >> >> >> > task_work_run+0x199/0x270 kernel/task_work.c:113 > >> >> >> > exit_task_work include/linux/task_work.h:22 [inline] > >> >> >> > do_exit+0x9bb/0x1ad0 kernel/exit.c:865 > >> >> >> > do_group_exit+0x149/0x400 kernel/exit.c:968 > >> >> >> > get_signal+0x73a/0x16d0 kernel/signal.c:2469 > >> >> >> > do_signal+0x90/0x1e90 arch/x86/kernel/signal.c:809 > >> >> >> > exit_to_usermode_loop+0x258/0x2f0 arch/x86/entry/common.c:162 > >> >> >> > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] > >> >> >> > syscall_return_slowpath arch/x86/entry/common.c:265 [inline] > >> >> >> > do_syscall_64+0x6ec/0x940 arch/x86/entry/common.c:292 > >> >> >> > entry_SYSCALL_64_after_hwframe+0x42/0xb7 > >> >> >> > RIP: 0033:0x455269 > >> >> >> > RSP: 002b:00007f8976371ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > >> >> >> > RAX: 0000000000000000 RBX: 000000000072bec8 RCX: 0000000000455269 > >> >> >> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bec8 > >> >> >> > RBP: 000000000072bec8 R08: 0000000000000000 R09: 000000000072bea0 > >> >> >> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > >> >> >> > R13: 00007ffe793f79cf R14: 00007f89763729c0 R15: 0000000000000000 > >> >> >> > > >> >> >> > Showing all locks held in the system: 
> >> >> >> > 2 locks held by khungtaskd/876: > >> >> >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] > >> >> >> > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] > >> >> >> > #0: (rcu_read_lock){....}, at: [<000000008f2bec4b>] watchdog+0x1c5/0xd60 > >> >> >> > kernel/hung_task.c:249 > >> >> > > >> >> > ... And two places to start looking are the two above rcu_read_lock() calls. > >> >> > Especially given that khungtask shows up below. > >> >> > > >> >> >> > #1: (tasklist_lock){.+.+}, at: [<0000000006b3009f>] > >> >> >> > debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470 > >> >> >> > 2 locks held by getty/4414: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4415: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4416: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4417: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4418: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > 
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4419: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 2 locks held by getty/4420: > >> >> >> > #0: (&tty->ldisc_sem){++++}, at: [<00000000e51437c8>] > >> >> >> > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > >> >> >> > #1: (&ldata->atomic_read_lock){+.+.}, at: [<00000000762a7320>] > >> >> >> > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > >> >> >> > 1 lock held by syz-executor3/10803: > >> >> >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] > >> >> >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 > >> >> >> > 4 locks held by syz-executor5/10816: > >> >> >> > #0: (&tty->legacy_mutex){+.+.}, at: [<00000000567b7b94>] > >> >> >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 > >> >> >> > #1: (&tty->legacy_mutex/1){+.+.}, at: [<00000000567b7b94>] > >> >> >> > tty_lock+0x5d/0x90 drivers/tty/tty_mutex.c:19 > >> >> >> > #2: (&tty->ldisc_sem){++++}, at: [<000000002b6b6a29>] > >> >> >> > tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:298 > >> >> >> > #3: (&o_tty->termios_rwsem/1){++++}, at: [<0000000007d9a7a4>] > >> >> >> > n_tty_flush_buffer+0x21/0x320 drivers/tty/n_tty.c:357 > >> >> >> > 1 lock held by syz-executor2/10827: > >> >> >> > #0: (event_mutex){+.+.}, at: [<00000000c507b78a>] > >> >> >> > perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234 > >> >> >> > 1 lock held by blkid/10832: > >> >> >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> >> >> > 1 lock held by 
syz-executor4/10835: > >> >> >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> >> >> > 1 lock held by syz-executor4/10845: > >> >> >> > #0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<000000006e2f031e>] > >> >> >> > lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 > >> >> >> > > >> >> >> > ============================================= > >> >> >> > > >> >> >> > NMI backtrace for cpu 1 > >> >> >> > CPU: 1 PID: 876 Comm: khungtaskd Not tainted 4.16.0+ #10 > >> >> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > >> >> >> > Google 01/01/2011 > >> >> >> > Call Trace: > >> >> >> > __dump_stack lib/dump_stack.c:17 [inline] > >> >> >> > dump_stack+0x194/0x24d lib/dump_stack.c:53 > >> >> >> > nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103 > >> >> >> > nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62 > >> >> >> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 > >> >> >> > trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline] > >> >> >> > check_hung_task kernel/hung_task.c:132 [inline] > >> >> >> > check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline] > >> >> >> > watchdog+0x90c/0xd60 kernel/hung_task.c:249 > >> >> >> > INFO: rcu_sched self-detected stall on CPU > >> >> >> > 0-....: (124996 ticks this GP) idle=75e/1/4611686018427387906 > >> >> >> > softirq=33205/33205 fqs=30980 > >> >> >> > > >> >> >> > (t=125000 jiffies g=17618 c=17617 q=921) > >> >> >> > kthread+0x33c/0x400 kernel/kthread.c:238 > >> >> >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 > >> >> >> > Sending NMI from CPU 1 to CPUs 0: > >> >> >> > NMI backtrace for cpu 0 > >> >> >> > CPU: 0 PID: 7457 Comm: kworker/u4:5 Not tainted 4.16.0+ #10 > >> >> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > >> >> >> > Google 01/01/2011 > >> >> >> > Workqueue: events_unbound flush_to_ldisc > >> >> >> > RIP: 
0010:__process_echoes+0x641/0x770 drivers/tty/n_tty.c:733 > >> >> >> > RSP: 0018:ffff8801af4ff078 EFLAGS: 00000217 > >> >> >> > RAX: 0000000000000000 RBX: ffffc90003673000 RCX: ffffffff8352d4c2 > >> >> >> > RDX: 0000000000000006 RSI: 1ffff10039602994 RDI: ffffc9000367515e > >> >> >> > RBP: ffff8801af4ff0e0 R08: 1ffff10035e9fdb5 R09: 0000000000000000 > >> >> >> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000625628efd > >> >> >> > R13: dffffc0000000000 R14: 0000000000000efe R15: 0000000000001b15 > >> >> >> > FS: 0000000000000000(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 > >> >> >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> >> >> > CR2: 00007ffd5bfa4ca8 CR3: 000000000846a005 CR4: 00000000001606f0 > >> >> >> > DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 > >> >> >> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 > >> >> >> > Call Trace: > >> >> >> > commit_echoes+0x147/0x1b0 drivers/tty/n_tty.c:764 > >> >> >> > n_tty_receive_char_fast drivers/tty/n_tty.c:1416 [inline] > >> >> >> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1576 [inline] > >> >> >> > __receive_buf drivers/tty/n_tty.c:1611 [inline] > >> >> >> > n_tty_receive_buf_common+0x1156/0x2520 drivers/tty/n_tty.c:1709 > >> >> >> > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744 > >> >> >> > tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:456 > >> >> >> > tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:38 > >> >> >> > receive_buf drivers/tty/tty_buffer.c:475 [inline] > >> >> >> > flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:524 > >> >> >> > process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113 > >> >> >> > worker_thread+0x223/0x1990 kernel/workqueue.c:2247 > >> >> >> > kthread+0x33c/0x400 kernel/kthread.c:238 > >> >> >> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 > >> >> > > >> >> > And the above is another good place to look. 
> >> >> > > >> >> > Thanx, Paul > >> >> > > >> >> >> > Code: 60 12 00 00 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 42 0f b6 04 28 38 > >> >> >> > d0 7f 08 84 c0 0f 85 21 01 00 00 42 80 bc 33 60 12 00 00 82 <74> 0f e8 48 > >> >> >> > 90 1e fe 4d 8d 74 24 02 e9 58 ff ff ff e8 39 90 1e > >> >> >> > > >> >> >> > > >> >> >> > --- > >> >> >> > This bug is generated by a dumb bot. It may contain errors. > >> >> >> > See https://goo.gl/tpsmEJ for details. > >> >> >> > Direct all questions to syzkaller@googlegroups.com. > >> >> >> > > >> >> >> > syzbot will keep track of this bug report. > >> >> >> > If you forgot to add the Reported-by tag, once the fix for this bug is > >> >> >> > merged > >> >> >> > into any tree, please reply to this email with: > >> >> >> > #syz fix: exact-commit-title > >> >> >> > To mark this as a duplicate of another syzbot report, please reply with: > >> >> >> > #syz dup: exact-subject-of-another-report > >> >> >> > If it's a one-off invalid bug report, please reply with: > >> >> >> > #syz invalid > >> >> >> > Note: if the crash happens again, it will cause creation of a new bug > >> >> >> > report. > >> >> >> > Note: all commands must start from beginning of the line in the email body. > >> >> >> > >> >> > > >> >> > -- > >> >> > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > >> >> > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. > >> >> > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20180402153332.GM3948%40linux.vnet.ibm.com. > >> >> > For more options, visit https://groups.google.com/d/optout. > >> >> > >> > > >> > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-02 17:23 ` Paul E. McKenney @ 2018-04-09 12:54 ` Dmitry Vyukov 2018-04-09 16:20 ` Paul E. McKenney 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Vyukov @ 2018-04-09 12:54 UTC (permalink / raw) To: Paul McKenney Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Mon, Apr 2, 2018 at 7:23 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: >> >> >> >> >> >> >> >> > Hello, >> >> >> >> > >> >> >> >> > syzbot hit the following crash on upstream commit >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) >> >> >> >> > Linux 4.16 >> >> >> >> > syzbot dashboard link: >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd >> >> >> >> > >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. >> >> >> >> > Raw console output: >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 >> >> >> >> > Kernel config: >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 >> >> >> >> > >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for >> >> >> >> > details. >> >> >> >> > If you forward the report, please keep this part and the footer. >> >> >> >> > >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount >> >> >> >> > option "g �;e�K�>pquota" >> >> >> > >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller >> >> >> > playing around with mount options. >> >> >> > >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. 
>> >> >> >> > Not tainted 4.16.0+ #10 >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 >> >> >> >> > Call Trace: >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 >> >> >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel >> >> >> >> space and never scheduling, that can cause this issue. Or if RCU >> >> >> >> somehow missed a transition into idle or user space. >> >> >> > >> >> >> > The RCU CPU stall warning below strongly supports this position ... >> >> >> >> >> >> I think this is this guy then: >> >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes >> >> > >> >> > Seems likely to me! >> >> > >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I >> >> >> think we need some kind of priority between them. I.e. we have rcu >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine >> >> >> hang and maybe something else. It would be useful if they fire >> >> >> deterministically according to priorities. 
If there is an rcu stall, >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, >> >> >> but a workqueue stall, then that's always detected as workqueue stall, >> >> >> etc. >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be >> >> >> detected either RCU stall or a task hung, producing 2 different bug >> >> >> reports (which is bad). >> >> >> One can say that it's only a matter of tuning timeouts, but at least >> >> >> task hung detector has a problem that if you set timeout to X, it can >> >> >> detect hung anywhere between X and 2*X. And on one hand we need quite >> >> >> large timeout (a minute may not be enough), and on the other hand we >> >> >> can't wait for an hour just to make sure that the machine is indeed >> >> >> dead (these things happen every few minutes). >> >> > >> >> > I suppose that we could have a global variable that was set to the >> >> > priority of the complaint in question, which would suppress all >> >> > lower-priority complaints. Might need to be opt-in, though -- I would >> >> > guess that not everyone is going to be happy with one complaint suppressing >> >> > others, especially given the possibility that the two complaints might >> >> > be about different things. >> >> > >> >> > Or did you have something more deft in mind? >> >> >> >> >> >> syzkaller generally looks only at the first report. One does not know >> >> if/when there will be a second one, or the second one can be induced >> >> by the first one, and we generally want clean reports on a non-tainted >> >> kernel. So we don't just need to suppress lower priority ones, we need >> >> to produce the right report first. >> >> I am thinking maybe setting: >> >> - rcu stalls at 1.5 minutes >> >> - workqueue stalls at 2 minutes >> >> - task hungs at 2.5 minutes >> >> - and no output whatsoever at 3 minutes >> >> Do I miss anything? I think at least spinlocks. Should they go before >> >> or after rcu? 
>> > >> > That is what I know of, but the Linux kernel being what it is, there is >> > probably something more out there. If not now, in a few months. The >> > RCU CPU stall timeout can be set on the kernel-boot command line, but >> > you probably already knew that. >> >> Well, it's all based solely on a large number of patches and stopgaps. >> If we fix main problems for today, it's already good. > > Fair enough! > >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout >> > was 1.5 -seconds-. ;-) >> >> Have you tried to instrument every basic block with a function call to >> collect coverage, check every damn memory access for validity, enable >> all thinkable and unthinkable debug configs and put the insanest load >> one can imagine from a swarm of parallel threads? It makes things a >> bit slower ;) > > Given that we wouldn't have had enough CPU or memory to accommodate > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) > >> >> This will require fixing task hung. Have not yet looked at workqueue detector. >> >> Does at least RCU respect the given timeout more or less precisely? >> > >> > Assuming that there is at least one CPU capable of taking scheduling-clock >> > interrupts, it should respect the timeout to within a few jiffies. 
Hi Paul,

Speaking of stalls and rcu, we are seeing lots of crashes that go like this:

INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: rcu_sched detected stalls on CPUs/tasks:
INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: rcu_sched detected stalls on CPUs/tasks:
INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: rcu_sched detected stalls on CPUs/tasks:

or like this:

INFO: rcu_sched self-detected stall on CPU
INFO: rcu_sched detected stalls on CPUs/tasks:
0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 softirq=57641/57641 fqs=31151
0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 softirq=57641/57641 fqs=31151
(t=125002 jiffies g=31656 c=31655 q=910)

INFO: rcu_sched self-detected stall on CPU
INFO: rcu_sched detected stalls on CPUs/tasks:
0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 softirq=65194/65194 fqs=31231
0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 softirq=65194/65194 fqs=31231
(t=125002 jiffies g=34421 c=34420 q=1119)
(detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119)

and then there is an unintelligible mess of 2 reports. Such crashes go
to trash bin, because we can't even say which function hanged. It
seems that in all cases 2 different rcu stall detection facilities
race with each other. Is it possible to make them not race?

^ permalink raw reply	[flat|nested] 19+ messages in thread
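The suggestion quoted above, a global variable set to the priority of the complaint already reported so that lower-priority complaints are suppressed, can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: the enum values, `reported_prio`, and `complaint_allowed()` are hypothetical names rather than kernel symbols, and the ordering (RCU stall above workqueue stall above task hung) simply follows the order proposed in the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical priorities (larger value wins), following the order proposed
 * in the thread: RCU stall > workqueue stall > task hung. */
enum complaint_prio {
	PRIO_NONE = 0,
	PRIO_TASK_HUNG = 1,
	PRIO_WORKQUEUE_STALL = 2,
	PRIO_RCU_STALL = 3,
};

/* Highest priority reported so far (the "global variable" in question). */
static _Atomic int reported_prio = PRIO_NONE;

/*
 * Return true if a complaint of priority @prio may be printed.
 * A complaint is suppressed only when a strictly higher-priority
 * complaint has already been reported.
 */
bool complaint_allowed(int prio)
{
	int cur = atomic_load(&reported_prio);

	while (prio >= cur) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&reported_prio, &cur, prio))
			return true;
	}
	return false;
}
```

In this sketch a detector would call `complaint_allowed()` just before printing its report. Note that this only suppresses later, lower-priority reports. It does not make the highest-priority detector fire first, which is the concern raised above about syzkaller looking only at the first report.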
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-09 12:54 ` Dmitry Vyukov
@ 2018-04-09 16:20 ` Paul E. McKenney
  2018-04-09 16:28 ` Dmitry Vyukov
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2018-04-09 16:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 09, 2018 at 02:54:20PM +0200, Dmitry Vyukov wrote:
>
> Hi Paul,
>
> Speaking of stalls and rcu, we are seeing lots of crashes that go like this:
>
> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO:
> rcu_sched detected stalls on CPUs/tasks:
> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO:
> rcu_sched detected stalls on CPUs/tasks:
> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO:
> rcu_sched detected stalls on CPUs/tasks:
>
> or like this:
>
> INFO: rcu_sched self-detected stall on CPU
> INFO: rcu_sched detected stalls on CPUs/tasks:
> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906
> softirq=57641/57641 fqs=31151
> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906
> softirq=57641/57641 fqs=31151
> (t=125002 jiffies g=31656 c=31655 q=910)
>
> INFO: rcu_sched self-detected stall on CPU
> INFO: rcu_sched detected stalls on CPUs/tasks:
> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906
> softirq=65194/65194 fqs=31231
> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906
> softirq=65194/65194 fqs=31231
> (t=125002 jiffies g=34421 c=34420 q=1119)
> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119)
>
>
> and then there is an unintelligible mess of 2 reports. Such crashes go
> to trash bin, because we can't even say which function hanged. It
> seems that in all cases 2 different rcu stall detection facilities
> race with each other. Is it possible to make them not race?

How about the following (untested, not for mainline) patch? It suppresses
all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and
rcu_preempt otherwise. Either way, this is the RCU flavor corresponding
to synchronize_rcu(). This works well in the common case where there
is almost always an RCU grace period in flight.

One reason that this patch is not for mainline is that I am working on
merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing,
at which point there won't be any races. But that might be a couple
merge windows away from now.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 381b47a68ac6..31f7818f2d63 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	struct rcu_node *rnp;
 
 	if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) ||
-	    !rcu_gp_in_progress(rsp))
+	    !rcu_gp_in_progress(rsp) || rsp != rcu_state_p)
 		return;
 	rcu_stall_kick_kthreads(rsp);
 	j = jiffies;

^ permalink raw reply related	[flat|nested] 19+ messages in thread
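To see the added `rsp != rcu_state_p` test in isolation, here is a small userspace model of the early-return condition in the patch above. It is a sketch, not kernel code: `struct flavor`, `flavors`, `main_flavor`, and `stall_check_enabled()` are hypothetical names, the single `suppress` flag stands in for `rcu_cpu_stall_suppress && !rcu_kick_kthreads`, and a !PREEMPT build is assumed so that the "main" flavor is rcu_sched.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's per-flavor RCU state. */
struct flavor {
	const char *name;
	bool gp_in_progress;	/* is a grace period currently in flight? */
};

static struct flavor flavors[] = {
	{ "rcu_sched", true },
	{ "rcu_bh",    true },
};

/* Models rcu_state_p, the flavor behind synchronize_rcu().
 * Assumption: !PREEMPT build, so this is rcu_sched. */
static struct flavor *main_flavor = &flavors[0];

/*
 * Mirrors the patched early-return test: returns true when the stall
 * check would proceed, false when the function would return early.
 */
bool stall_check_enabled(const struct flavor *f, bool suppress)
{
	if (suppress || !f->gp_in_progress || f != main_flavor)
		return false;
	return true;
}
```

With this model, only the main flavor ever reaches the stall check, so at most one flavor can produce a stall report for a given hang.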
* Re: INFO: task hung in perf_trace_event_unreg
  2018-04-09 16:20 ` Paul E. McKenney
@ 2018-04-09 16:28 ` Dmitry Vyukov
  2018-04-09 18:11 ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Vyukov @ 2018-04-09 16:28 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 9, 2018 at 6:20 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Apr 09, 2018 at 02:54:20PM +0200, Dmitry Vyukov wrote:
>> Hi Paul,
>>
>> Speaking of stalls and rcu, we are seeing lots of crashes that go like this:
>>
>> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO:
>> rcu_sched detected stalls on CPUs/tasks:
>> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO:
>> rcu_sched detected stalls on CPUs/tasks:
>> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO:
>> rcu_sched detected stalls on CPUs/tasks:
>>
>> or like this:
>>
>> INFO: rcu_sched self-detected stall on CPU
>> INFO: rcu_sched detected stalls on CPUs/tasks:
>> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906
>> softirq=57641/57641 fqs=31151
>> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906
>> softirq=57641/57641 fqs=31151
>> (t=125002 jiffies g=31656 c=31655 q=910)
>>
>> INFO: rcu_sched self-detected stall on CPU
>> INFO: rcu_sched detected stalls on CPUs/tasks:
>> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906
>> softirq=65194/65194 fqs=31231
>> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906
>> softirq=65194/65194 fqs=31231
>> (t=125002 jiffies g=34421 c=34420 q=1119)
>> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119)
>>
>>
>> and then there is an unintelligible mess of 2 reports. Such crashes go
>> to trash bin, because we can't even say which function hanged. It
>> seems that in all cases 2 different rcu stall detection facilities
>> race with each other. Is it possible to make them not race?
>
> How about the following (untested, not for mainline) patch? It suppresses
> all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and
> rcu_preempt otherwise. Either way, this is the RCU flavor corresponding
> to synchronize_rcu(). This works well in the common case where there
> is almost always an RCU grace period in flight.
>
> One reason that this patch is not for mainline is that I am working on
> merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing,
> at which point there won't be any races. But that might be a couple
> merge windows away from now.
>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 381b47a68ac6..31f7818f2d63 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
>  	struct rcu_node *rnp;
>
>  	if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) ||
> -	    !rcu_gp_in_progress(rsp))
> +	    !rcu_gp_in_progress(rsp) || rsp != rcu_state_p)
>  		return;
>  	rcu_stall_kick_kthreads(rsp);
>  	j = jiffies;

But don't they both relate to the same rcu flavor? They both say
rcu_sched. I assumed that the difference is "self-detected" vs "on
CPUs/tasks", i.e. on the current CPU vs on other CPUs.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-09 16:28 ` Dmitry Vyukov @ 2018-04-09 18:11 ` Paul E. McKenney 2018-04-10 11:13 ` Dmitry Vyukov 0 siblings, 1 reply; 19+ messages in thread From: Paul E. McKenney @ 2018-04-09 18:11 UTC (permalink / raw) To: Dmitry Vyukov Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 09, 2018 at 06:28:16PM +0200, Dmitry Vyukov wrote:
> But don't they both relate to the same rcu flavor? They both say
> rcu_sched. I assumed that the difference is "self-detected" vs "on
> CPUs/tasks", i.e. on the current CPU vs on other CPUs.

Right you are!

One approach would be to increase the value of RCU_STALL_RAT_DELAY,
which is currently two jiffies, to (say) 20 jiffies. This is in
kernel/rcu/tree.h. But this would fail on a sufficiently overloaded
system -- and the failure of the two-jiffy delay is a bit of a surprise,
given interrupts disabled and all that. Are you by any chance loaded
heavily enough to see vCPU preemption?

I could avoid at least some of these timing issues by instead using cmpxchg()
on ->jiffies_stall to allow only one CPU in, but leave the non-atomic
update to discourage overly long stall prints from running into the
next one. This is not perfect, either, and is roughly equivalent to
setting RCU_STALL_RAT_DELAY to many seconds' worth of jiffies, but
avoiding that minute's delay. But it should get rid of the duplication
in almost all cases, though it could allow a stall warning to overlap
with a later stall warning for that same grace period. Which can
already happen anyway. Also, a tens-of-seconds vCPU preemption can
still cause concurrent stall warnings, but if that is happening to you,
the concurrent stall warnings are probably the least of your problems.
Besides, we do need at least one CPU to actually report the stall, which
won't happen if that CPU's vCPU is indefinitely preempted. So there is
only so much I can do about that particular corner case.

So how does the following (untested) patch work for you?

							Thanx, Paul

------------------------------------------------------------------------

commit 6a5ab1e68f8636d8823bb5a9aee35fc44c2be866
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Apr 9 11:04:46 2018 -0700

    rcu: Exclude near-simultaneous RCU CPU stall warnings

    There is a two-jiffy delay between the time that a CPU will self-report
    an RCU CPU stall warning and the time that some other CPU will report a
    warning on behalf of the first CPU. This has worked well in the past,
    but on busy systems, it is possible for the two warnings to overlap,
    which makes interpreting them extremely difficult.

    This commit therefore uses a cmpxchg-based timing decision that
    allows only one report in a given one-minute period (assuming default
    stall-warning Kconfig parameters). This approach will of course fail
    if you are seeing minute-long vCPU preemption, but in that case the
    overlapping RCU CPU stall warnings are the least of your worries.

    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 381b47a68ac6..b7246bcbf633 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1429,8 +1429,6 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		return;
 	}
-	WRITE_ONCE(rsp->jiffies_stall,
-		   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 
 	/*
@@ -1481,6 +1479,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 			sched_show_task(current);
 		}
 	}
+	/* Rewrite if needed in case of slow consoles. */
+	if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
+		WRITE_ONCE(rsp->jiffies_stall,
+			   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
 
 	rcu_check_gp_kthread_starvation(rsp);
 
@@ -1525,6 +1527,7 @@ static void print_cpu_stall(struct rcu_state *rsp)
 	rcu_dump_cpu_stacks(rsp);
 
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
+	/* Rewrite if needed in case of slow consoles. */
 	if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
 		WRITE_ONCE(rsp->jiffies_stall,
 			   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
@@ -1548,6 +1551,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	unsigned long gpnum;
 	unsigned long gps;
 	unsigned long j;
+	unsigned long jn;
 	unsigned long js;
 	struct rcu_node *rnp;
 
@@ -1586,14 +1590,17 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	    ULONG_CMP_GE(gps, js))
 		return; /* No stall or GP completed since entering function. */
 	rnp = rdp->mynode;
+	jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
 	if (rcu_gp_in_progress(rsp) &&
-	    (READ_ONCE(rnp->qsmask) & rdp->grpmask)) {
+	    (READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
+	    cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
 
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rsp);
 
 	} else if (rcu_gp_in_progress(rsp) &&
-		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
+		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
+		   cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
 
 		/* They had a few time units to dump stack, so complain. */
 		print_other_cpu_stall(rsp, gpnum);

^ permalink raw reply related	[flat|nested] 19+ messages in thread
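To illustrate the cmpxchg()-based gating described in the commit log above,
here is a minimal userspace sketch in C11. It is not the kernel code:
jiffies_stall is a plain atomic variable standing in for rsp->jiffies_stall,
the function names (arm_stall_deadline, snapshot_stall_deadline,
claim_stall_report, demo_two_cpus) are hypothetical, and the numeric values
are arbitrary. It only shows that when two CPUs hold the same snapshot js,
at most one compare-and-exchange can succeed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for rsp->jiffies_stall: time of the next allowed report. */
static _Atomic unsigned long jiffies_stall;

/* Arm the deadline, as would happen when a grace period starts. */
static void arm_stall_deadline(unsigned long when)
{
	atomic_store(&jiffies_stall, when);
}

/* Snapshot the deadline, in the spirit of "js = READ_ONCE(rsp->jiffies_stall)". */
static unsigned long snapshot_stall_deadline(void)
{
	return atomic_load(&jiffies_stall);
}

/*
 * Try to become the single reporter for the deadline captured in js.
 * Succeeds only if nobody has replaced js with a new deadline yet,
 * which is the userspace analogue of "cmpxchg(&x, js, jn) == js".
 */
static bool claim_stall_report(unsigned long js, unsigned long jn)
{
	return atomic_compare_exchange_strong(&jiffies_stall, &js, jn);
}

/* Two "CPUs" race with the same snapshot; only the first claim wins. */
static void demo_two_cpus(void)
{
	arm_stall_deadline(1000);

	unsigned long js_self = snapshot_stall_deadline();
	unsigned long js_other = snapshot_stall_deadline();

	assert(claim_stall_report(js_self, 64000));    /* self-report proceeds */
	assert(!claim_stall_report(js_other, 64002));  /* duplicate is suppressed */
	assert(snapshot_stall_deadline() == 64000);    /* winner's deadline sticks */
}
```

Calling demo_two_cpus() from a test program exercises the race in a
deterministic order. In the real patch the loser simply skips printing, and
the later non-atomic rewrite of the deadline only covers slow consoles.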
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-09 18:11 ` Paul E. McKenney @ 2018-04-10 11:13 ` Dmitry Vyukov 2018-04-10 17:02 ` Paul E. McKenney 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Vyukov @ 2018-04-10 11:13 UTC (permalink / raw) To: Paul McKenney Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller

On Mon, Apr 9, 2018 at 8:11 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> Right you are!
>
> One approach would be to increase the value of RCU_STALL_RAT_DELAY,
> which is currently two jiffies, to (say) 20 jiffies. This is in
> kernel/rcu/tree.h. But this would fail on a sufficiently overloaded
> system -- and the failure of the two-jiffy delay is a bit of a surprise,
> given interrupts disabled and all that. Are you by any chance loaded
> heavily enough to see vCPU preemption?
>
> I could avoid at least some of these timing issues by instead using cmpxchg()
> on ->jiffies_stall to allow only one CPU in, but leave the non-atomic
> update to discourage overly long stall prints from running into the
> next one. This is not perfect, either, and is roughly equivalent to
> setting RCU_STALL_RAT_DELAY to many seconds' worth of jiffies, but
> avoiding that minute's delay. But it should get rid of the duplication
> in almost all cases, though it could allow a stall warning to overlap
> with a later stall warning for that same grace period. Which can
> already happen anyway. Also, a tens-of-seconds vCPU preemption can
> still cause concurrent stall warnings, but if that is happening to you,
> the concurrent stall warnings are probably the least of your problems.
> Besides, we do need at least one CPU to actually report the stall, which
> won't happen if that CPU's vCPU is indefinitely preempted. So there is
> only so much I can do about that particular corner case.
>
> So how does the following (untested) patch work for you?

Looks good to me. We run on VMs, so we can well have vCPU preemption.

^ permalink raw reply	[flat|nested] 19+ messages in thread
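The patch in this thread compares jiffies values with ULONG_CMP_GE() and
offsets the "other CPU" report by RCU_STALL_RAT_DELAY. The sketch below
shows why such comparisons stay correct across counter wraparound and how
small the two-jiffy window is. The macro body is reproduced from memory of
the kernel header and should be treated as an assumption; RAT_DELAY and the
two helper functions are hypothetical simplifications, not the actual
check_cpu_stall() logic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wrap-safe "a >= b" for free-running counters (assumed to match the kernel macro). */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Stand-in for RCU_STALL_RAT_DELAY, which the thread says is two jiffies. */
#define RAT_DELAY 2UL

/* True once the stalled CPU itself may report: now (j) has reached the deadline (js). */
static bool self_report_due(unsigned long j, unsigned long js)
{
	return ULONG_CMP_GE(j, js);
}

/* True once some other CPU may report on the stalled CPU's behalf. */
static bool other_report_due(unsigned long j, unsigned long js)
{
	return ULONG_CMP_GE(j, js + RAT_DELAY);
}
```

With js close to ULONG_MAX, js + RAT_DELAY wraps to a small number, yet the
unsigned subtraction still yields the right answer. The gap between the two
conditions is only RAT_DELAY ticks, so a vCPU that is preempted for longer
than that while printing can plausibly let both reports run at once, which
is the overlap reported earlier in the thread.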
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-10 11:13 ` Dmitry Vyukov @ 2018-04-10 17:02 ` Paul E. McKenney 2018-04-11 10:06 ` Dmitry Vyukov 0 siblings, 1 reply; 19+ messages in thread From: Paul E. McKenney @ 2018-04-10 17:02 UTC (permalink / raw) To: Dmitry Vyukov Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Tue, Apr 10, 2018 at 01:13:13PM +0200, Dmitry Vyukov wrote: > On Mon, Apr 9, 2018 at 8:11 PM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > > On Mon, Apr 09, 2018 at 06:28:16PM +0200, Dmitry Vyukov wrote: > >> On Mon, Apr 9, 2018 at 6:20 PM, Paul E. McKenney > >> <paulmck@linux.vnet.ibm.com> wrote: > >> > On Mon, Apr 09, 2018 at 02:54:20PM +0200, Dmitry Vyukov wrote: > >> >> On Mon, Apr 2, 2018 at 7:23 PM, Paul E. McKenney > >> >> <paulmck@linux.vnet.ibm.com> wrote: > >> >> >> >> >> >> > >> >> >> >> >> >> > Hello, > >> >> >> >> >> >> > > >> >> >> >> >> >> > syzbot hit the following crash on upstream commit > >> >> >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) > >> >> >> >> >> >> > Linux 4.16 > >> >> >> >> >> >> > syzbot dashboard link: > >> >> >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd > >> >> >> >> >> >> > > >> >> >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. > >> >> >> >> >> >> > Raw console output: > >> >> >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 > >> >> >> >> >> >> > Kernel config: > >> >> >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 > >> >> >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 > >> >> >> >> >> >> > > >> >> >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: > >> >> >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com > >> >> >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for > >> >> >> >> >> >> > details. 
> >> >> >> >> >> >> > If you forward the report, please keep this part and the footer. > >> >> >> >> >> >> > > >> >> >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount > >> >> >> >> >> >> > option "g �;e�K�>pquota" > >> >> >> >> >> > > >> >> >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller > >> >> >> >> >> > playing around with mount options. > >> >> >> >> >> > > >> >> >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. > >> >> >> >> >> >> > Not tainted 4.16.0+ #10 > >> >> >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> >> >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 > >> >> >> >> >> >> > Call Trace: > >> >> >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] > >> >> >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 > >> >> >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 > >> >> >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 > >> >> >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] > >> >> >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] > >> >> >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] > >> >> >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 > >> >> >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 > >> >> >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 > >> >> >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 > >> >> >> >> >> >> > >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing > >> >> >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel > >> >> >> >> >> >> space and never scheduling, that can cause this issue. Or if RCU > >> >> >> >> >> >> somehow missed a transition into idle or user space. 
> >> >> >> >> >> > > >> >> >> >> >> > The RCU CPU stall warning below strongly supports this position ... > >> >> >> >> >> > >> >> >> >> >> I think this is this guy then: > >> >> >> >> >> > >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 > >> >> >> >> >> > >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes > >> >> >> >> > > >> >> >> >> > Seems likely to me! > >> >> >> >> > > >> >> >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I > >> >> >> >> >> think we need some kind of priority between them. I.e. we have rcu > >> >> >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine > >> >> >> >> >> hang and maybe something else. It would be useful if they fire > >> >> >> >> >> deterministically according to priorities. If there is an rcu stall, > >> >> >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, > >> >> >> >> >> but a workqueue stall, then that's always detected as workqueue stall, > >> >> >> >> >> etc. > >> >> >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be > >> >> >> >> >> detected either RCU stall or a task hung, producing 2 different bug > >> >> >> >> >> reports (which is bad). > >> >> >> >> >> One can say that it's only a matter of tuning timeouts, but at least > >> >> >> >> >> task hung detector has a problem that if you set timeout to X, it can > >> >> >> >> >> detect hung anywhere between X and 2*X. And on one hand we need quite > >> >> >> >> >> large timeout (a minute may not be enough), and on the other hand we > >> >> >> >> >> can't wait for an hour just to make sure that the machine is indeed > >> >> >> >> >> dead (these things happen every few minutes). > >> >> >> >> > > >> >> >> >> > I suppose that we could have a global variable that was set to the > >> >> >> >> > priority of the complaint in question, which would suppress all > >> >> >> >> > lower-priority complaints. 
Might need to be opt-in, though -- I would > >> >> >> >> > guess that not everyone is going to be happy with one complaint suppressing > >> >> >> >> > others, especially given the possibility that the two complaints might > >> >> >> >> > be about different things. > >> >> >> >> > > >> >> >> >> > Or did you have something more deft in mind? > >> >> >> >> > >> >> >> >> > >> >> >> >> syzkaller generally looks only at the first report. One does not know > >> >> >> >> if/when there will be a second one, or the second one can be induced > >> >> >> >> by the first one, and we generally want clean reports on a non-tainted > >> >> >> >> kernel. So we don't just need to suppress lower priority ones, we need > >> >> >> >> to produce the right report first. > >> >> >> >> I am thinking maybe setting: > >> >> >> >> - rcu stalls at 1.5 minutes > >> >> >> >> - workqueue stalls at 2 minutes > >> >> >> >> - task hungs at 2.5 minutes > >> >> >> >> - and no output whatsoever at 3 minutes > >> >> >> >> Do I miss anything? I think at least spinlocks. Should they go before > >> >> >> >> or after rcu? > >> >> >> > > >> >> >> > That is what I know of, but the Linux kernel being what it is, there is > >> >> >> > probably something more out there. If not now, in a few months. The > >> >> >> > RCU CPU stall timeout can be set on the kernel-boot command line, but > >> >> >> > you probably already knew that. > >> >> >> > >> >> >> Well, it's all based solely on a large number of patches and stopgaps. > >> >> >> If we fix main problems for today, it's already good. > >> >> > > >> >> > Fair enough! > >> >> > > >> >> >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout > >> >> >> > was 1.5 -seconds-. 
;-) > >> >> >> > >> >> >> Have you tried to instrument every basic block with a function call to > >> >> >> collect coverage, check every damn memory access for validity, enable > >> >> >> all thinkable and unthinkable debug configs and put the insanest load > >> >> >> one can imagine from a swarm of parallel threads? It makes things a > >> >> >> bit slower ;) > >> >> > > >> >> > Given that we wouldn't have had enough CPU or memory to accommodate > >> >> > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) > >> >> > > >> >> >> >> This will require fixing task hung. Have not yet looked at workqueue detector. > >> >> >> >> Does at least RCU respect the given timeout more or less precisely? > >> >> >> > > >> >> >> > Assuming that there is at least one CPU capable of taking scheduling-clock > >> >> >> > interrupts, it should respect the timeout to within a few jiffies. > >> >> > >> >> > >> >> Hi Paul, > >> >> > >> >> Speaking of stalls and rcu, we are seeing lots of crashes that go like this: > >> >> > >> >> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: > >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: > >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: > >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> > >> >> or like this: > >> >> > >> >> INFO: rcu_sched self-detected stall on CPU > >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> softirq=57641/57641 fqs=31151 > >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> softirq=57641/57641 fqs=31151 > >> >> (t=125002 jiffies g=31656 c=31655 q=910) > >> >> > >> >> INFO: rcu_sched self-detected stall on CPU > >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> 
softirq=65194/65194 fqs=31231 > >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> softirq=65194/65194 fqs=31231 > >> >> (t=125002 jiffies g=34421 c=34420 q=1119) > >> >> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119) > >> >> > >> >> > >> >> and then there is an unintelligible mess of 2 reports. Such crashes go > >> >> to trash bin, because we can't even say which function hanged. It > >> >> seems that in all cases 2 different rcu stall detection facilities > >> >> race with each other. Is it possible to make them not race? > >> > > >> > How about the following (untested, not for mainline) patch? It suppresses > >> > all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and > >> > rcu_preempt otherwise. Either way, this is the RCU flavor corresponding > >> > to synchronize_rcu(). This works well in the common case where there > >> > is almost always an RCU grace period in flight. > >> > > >> > One reason that this patch is not for mainline is that I am working on > >> > merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing, > >> > at which point there won't be any races. But that might be a couple > >> > merge windows away from now. > >> > > >> > Thanx, Paul > >> > > >> > ------------------------------------------------------------------------ > >> > > >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > >> > index 381b47a68ac6..31f7818f2d63 100644 > >> > --- a/kernel/rcu/tree.c > >> > +++ b/kernel/rcu/tree.c > >> > @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) > >> > struct rcu_node *rnp; > >> > > >> > if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || > >> > - !rcu_gp_in_progress(rsp)) > >> > + !rcu_gp_in_progress(rsp) || rsp != rcu_state_p) > >> > return; > >> > rcu_stall_kick_kthreads(rsp); > >> > j = jiffies; > >> > >> But doesn't they both relate to the same rcu flavor? They both say > >> rcu_sched. 
I assumed that the difference is "self-detected" vs "on
> >> CPUs/tasks", i.e. on the current CPU vs on other CPUs.
> >
> > Right you are!
> >
> > One approach would be to increase the value of RCU_STALL_RAT_DELAY,
> > which is currently two jiffies to (say) 20 jiffies. This is in
> > kernel/rcu/tree.h. But this would fail on a sufficiently overloaded
> > system -- and the failure of the two-jiffy delay is a bit of a surprise,
> > given interrupts disabled and all that. Are you by any chance loaded
> > heavily enough to see vCPU preemption?
> >
> > I could avoid at least some of these timing issues instead using cmpxchg()
> > on ->jiffies_stall to allow only one CPU in, but leave the non-atomic
> > update to discourage overly long stall prints from running into the
> > next one. This is not perfect, either, and is roughly equivalent to
> > setting RCU_STALL_RAT_DELAY to many second's worth of jiffies, but
> > avoiding that minute's delay. But it should get rid of the duplication
> > in almost all cases, though it could allow a stall warning to overlap
> > with a later stall warning for that same grace period. Which can
> > already happen anyway. Also, a tens-of-seconds vCPU preemption can
> > still cause concurrent stall warnings, but if that is happening to you,
> > the concurrent stall warnings are probably the least of your problems.
> > Besides, we do need at least one CPU to actually report the stall, which
> > won't happen if that CPU's vCPU is indefinitely preempted. So there is
> > only so much I can do about that particular corner case.
> >
> > So how does the following (untested) patch work for you?
>
> Looks good to me.
>
> We run on VMs, so we can well have vCPU preemption.

Very good! Please do get me a Tested-by when you get to that point.

							Thanx, Paul

> > ------------------------------------------------------------------------
> >
> > commit 6a5ab1e68f8636d8823bb5a9aee35fc44c2be866
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Apr 9 11:04:46 2018 -0700
> >
> >     rcu: Exclude near-simultaneous RCU CPU stall warnings
> >
> >     There is a two-jiffy delay between the time that a CPU will self-report
> >     an RCU CPU stall warning and the time that some other CPU will report a
> >     warning on behalf of the first CPU. This has worked well in the past,
> >     but on busy systems, it is possible for the two warnings to overlap,
> >     which makes interpreting them extremely difficult.
> >
> >     This commit therefore uses a cmpxchg-based timing decision that
> >     allows only one report in a given one-minute period (assuming default
> >     stall-warning Kconfig parameters). This approach will of course fail
> >     if you are seeing minute-long vCPU preemption, but in that case the
> >     overlapping RCU CPU stall warnings are the least of your worries.
> >
> >     Reported-by: Dmitry Vyukov <dvyukov@google.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 381b47a68ac6..b7246bcbf633 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1429,8 +1429,6 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
> >  		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  		return;
> >  	}
> > -	WRITE_ONCE(rsp->jiffies_stall,
> > -		   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
> >  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >
> >  	/*
> > @@ -1481,6 +1479,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
> >  			sched_show_task(current);
> >  		}
> >  	}
> > +	/* Rewrite if needed in case of slow consoles. */
> > +	if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
> > +		WRITE_ONCE(rsp->jiffies_stall,
> > +			   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
> >
> >  	rcu_check_gp_kthread_starvation(rsp);
> >
> > @@ -1525,6 +1527,7 @@ static void print_cpu_stall(struct rcu_state *rsp)
> >  	rcu_dump_cpu_stacks(rsp);
> >
> >  	raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > +	/* Rewrite if needed in case of slow consoles. */
> >  	if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
> >  		WRITE_ONCE(rsp->jiffies_stall,
> >  			   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
> > @@ -1548,6 +1551,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
> >  	unsigned long gpnum;
> >  	unsigned long gps;
> >  	unsigned long j;
> > +	unsigned long jn;
> >  	unsigned long js;
> >  	struct rcu_node *rnp;
> >
> > @@ -1586,14 +1590,17 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
> >  	    ULONG_CMP_GE(gps, js))
> >  		return; /* No stall or GP completed since entering function. */
> >  	rnp = rdp->mynode;
> > +	jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
> >  	if (rcu_gp_in_progress(rsp) &&
> > -	    (READ_ONCE(rnp->qsmask) & rdp->grpmask)) {
> > +	    (READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
> > +	    cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
> >
> >  		/* We haven't checked in, so go dump stack. */
> >  		print_cpu_stall(rsp);
> >
> >  	} else if (rcu_gp_in_progress(rsp) &&
> > -		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
> > +		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
> > +		   cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
> >
> >  		/* They had a few time units to dump stack, so complain. */
> >  		print_other_cpu_stall(rsp, gpnum);
> >
>

^ permalink raw reply	[flat|nested] 19+ messages in thread
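
The patch compares jiffies against ->jiffies_stall with ULONG_CMP_GE() rather than a plain >=, because the jiffies counter wraps. A minimal userspace sketch of that wraparound-safe comparison follows. The macro is written here to match the form used in the kernel's RCU headers as best I recall it, so treat the exact definition as an assumption; stall_deadline_passed() is a hypothetical wrapper added only for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Wraparound-safe "a >= b" for free-running unsigned counters, modeled
 * on the kernel's ULONG_CMP_GE() (assumed form). The unsigned
 * subtraction wraps modulo 2^N, so the result is correct as long as the
 * two values are less than half the counter range apart.
 */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical wrapper: has the stall deadline been reached yet? */
static bool stall_deadline_passed(unsigned long now, unsigned long deadline)
{
	return ULONG_CMP_GE(now, deadline);
}
```

With a plain >= test, a deadline set just before the counter wraps would look as if it were still far in the future once the counter had wrapped to a small value. The subtraction-based form treats that case as "deadline passed".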
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-10 17:02 ` Paul E. McKenney @ 2018-04-11 10:06 ` Dmitry Vyukov 2018-04-11 19:36 ` Paul E. McKenney 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Vyukov @ 2018-04-11 10:06 UTC (permalink / raw) To: Paul McKenney Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Tue, Apr 10, 2018 at 7:02 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: >> >> >> On Mon, Apr 2, 2018 at 7:23 PM, Paul E. McKenney >> >> >> <paulmck@linux.vnet.ibm.com> wrote: >> >> >> >> >> >> >> >> >> >> >> >> >> >> > Hello, >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > syzbot hit the following crash on upstream commit >> >> >> >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) >> >> >> >> >> >> >> > Linux 4.16 >> >> >> >> >> >> >> > syzbot dashboard link: >> >> >> >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. >> >> >> >> >> >> >> > Raw console output: >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 >> >> >> >> >> >> >> > Kernel config: >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 >> >> >> >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: >> >> >> >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com >> >> >> >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for >> >> >> >> >> >> >> > details. >> >> >> >> >> >> >> > If you forward the report, please keep this part and the footer. 
>> >> >> >> >> >> >> > >> >> >> >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount >> >> >> >> >> >> >> > option "g �;e�K�>pquota" >> >> >> >> >> >> > >> >> >> >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller >> >> >> >> >> >> > playing around with mount options. >> >> >> >> >> >> > >> >> >> >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. >> >> >> >> >> >> >> > Not tainted 4.16.0+ #10 >> >> >> >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> >> >> >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 >> >> >> >> >> >> >> > Call Trace: >> >> >> >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] >> >> >> >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 >> >> >> >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 >> >> >> >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 >> >> >> >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] >> >> >> >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] >> >> >> >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] >> >> >> >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 >> >> >> >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 >> >> >> >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 >> >> >> >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 >> >> >> >> >> >> >> >> >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing >> >> >> >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel >> >> >> >> >> >> >> space and never scheduling, that can cause this issue. Or if RCU >> >> >> >> >> >> >> somehow missed a transition into idle or user space. 
>> >> >> >> >> >> > >> >> >> >> >> >> > The RCU CPU stall warning below strongly supports this position ... >> >> >> >> >> >> >> >> >> >> >> >> I think this is this guy then: >> >> >> >> >> >> >> >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> >> >> >> >> >> >> >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes >> >> >> >> >> > >> >> >> >> >> > Seems likely to me! >> >> >> >> >> > >> >> >> >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I >> >> >> >> >> >> think we need some kind of priority between them. I.e. we have rcu >> >> >> >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine >> >> >> >> >> >> hang and maybe something else. It would be useful if they fire >> >> >> >> >> >> deterministically according to priorities. If there is an rcu stall, >> >> >> >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, >> >> >> >> >> >> but a workqueue stall, then that's always detected as workqueue stall, >> >> >> >> >> >> etc. >> >> >> >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be >> >> >> >> >> >> detected either RCU stall or a task hung, producing 2 different bug >> >> >> >> >> >> reports (which is bad). >> >> >> >> >> >> One can say that it's only a matter of tuning timeouts, but at least >> >> >> >> >> >> task hung detector has a problem that if you set timeout to X, it can >> >> >> >> >> >> detect hung anywhere between X and 2*X. And on one hand we need quite >> >> >> >> >> >> large timeout (a minute may not be enough), and on the other hand we >> >> >> >> >> >> can't wait for an hour just to make sure that the machine is indeed >> >> >> >> >> >> dead (these things happen every few minutes). 
>> >> >> >> >> > >> >> >> >> >> > I suppose that we could have a global variable that was set to the >> >> >> >> >> > priority of the complaint in question, which would suppress all >> >> >> >> >> > lower-priority complaints. Might need to be opt-in, though -- I would >> >> >> >> >> > guess that not everyone is going to be happy with one complaint suppressing >> >> >> >> >> > others, especially given the possibility that the two complaints might >> >> >> >> >> > be about different things. >> >> >> >> >> > >> >> >> >> >> > Or did you have something more deft in mind? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> syzkaller generally looks only at the first report. One does not know >> >> >> >> >> if/when there will be a second one, or the second one can be induced >> >> >> >> >> by the first one, and we generally want clean reports on a non-tainted >> >> >> >> >> kernel. So we don't just need to suppress lower priority ones, we need >> >> >> >> >> to produce the right report first. >> >> >> >> >> I am thinking maybe setting: >> >> >> >> >> - rcu stalls at 1.5 minutes >> >> >> >> >> - workqueue stalls at 2 minutes >> >> >> >> >> - task hungs at 2.5 minutes >> >> >> >> >> - and no output whatsoever at 3 minutes >> >> >> >> >> Do I miss anything? I think at least spinlocks. Should they go before >> >> >> >> >> or after rcu? >> >> >> >> > >> >> >> >> > That is what I know of, but the Linux kernel being what it is, there is >> >> >> >> > probably something more out there. If not now, in a few months. The >> >> >> >> > RCU CPU stall timeout can be set on the kernel-boot command line, but >> >> >> >> > you probably already knew that. >> >> >> >> >> >> >> >> Well, it's all based solely on a large number of patches and stopgaps. >> >> >> >> If we fix main problems for today, it's already good. >> >> >> > >> >> >> > Fair enough! >> >> >> > >> >> >> >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout >> >> >> >> > was 1.5 -seconds-. 
;-) >> >> >> >> >> >> >> >> Have you tried to instrument every basic block with a function call to >> >> >> >> collect coverage, check every damn memory access for validity, enable >> >> >> >> all thinkable and unthinkable debug configs and put the insanest load >> >> >> >> one can imagine from a swarm of parallel threads? It makes things a >> >> >> >> bit slower ;) >> >> >> > >> >> >> > Given that we wouldn't have had enough CPU or memory to accommodate >> >> >> > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) >> >> >> > >> >> >> >> >> This will require fixing task hung. Have not yet looked at workqueue detector. >> >> >> >> >> Does at least RCU respect the given timeout more or less precisely? >> >> >> >> > >> >> >> >> > Assuming that there is at least one CPU capable of taking scheduling-clock >> >> >> >> > interrupts, it should respect the timeout to within a few jiffies. >> >> >> >> >> >> >> >> >> Hi Paul, >> >> >> >> >> >> Speaking of stalls and rcu, we are seeing lots of crashes that go like this: >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> >> >> >> or like this: >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 >> >> >> softirq=57641/57641 fqs=31151 >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 >> >> >> softirq=57641/57641 fqs=31151 >> >> >> (t=125002 jiffies g=31656 c=31655 q=910) >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> >> 0-....: (125000 ticks this GP) 
idle=49a/1/4611686018427387906 >> >> >> softirq=65194/65194 fqs=31231 >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 >> >> >> softirq=65194/65194 fqs=31231 >> >> >> (t=125002 jiffies g=34421 c=34420 q=1119) >> >> >> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119) >> >> >> >> >> >> >> >> >> and then there is an unintelligible mess of 2 reports. Such crashes go >> >> >> to trash bin, because we can't even say which function hanged. It >> >> >> seems that in all cases 2 different rcu stall detection facilities >> >> >> race with each other. Is it possible to make them not race? >> >> > >> >> > How about the following (untested, not for mainline) patch? It suppresses >> >> > all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and >> >> > rcu_preempt otherwise. Either way, this is the RCU flavor corresponding >> >> > to synchronize_rcu(). This works well in the common case where there >> >> > is almost always an RCU grace period in flight. >> >> > >> >> > One reason that this patch is not for mainline is that I am working on >> >> > merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing, >> >> > at which point there won't be any races. But that might be a couple >> >> > merge windows away from now. 
>> >> > >> >> > Thanx, Paul >> >> > >> >> > ------------------------------------------------------------------------ >> >> > >> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >> >> > index 381b47a68ac6..31f7818f2d63 100644 >> >> > --- a/kernel/rcu/tree.c >> >> > +++ b/kernel/rcu/tree.c >> >> > @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> >> > struct rcu_node *rnp; >> >> > >> >> > if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || >> >> > - !rcu_gp_in_progress(rsp)) >> >> > + !rcu_gp_in_progress(rsp) || rsp != rcu_state_p) >> >> > return; >> >> > rcu_stall_kick_kthreads(rsp); >> >> > j = jiffies; >> >> >> >> But doesn't they both relate to the same rcu flavor? They both say >> >> rcu_sched. I assumed that the difference is "self-detected" vs "on >> >> CPUs/tasks", i.e. on the current CPU vs on other CPUs. >> > >> > Right you are! >> > >> > One approach would be to increase the value of RCU_STALL_RAT_DELAY, >> > which is currently two jiffies to (say) 20 jiffies. This is in >> > kernel/rcu/tree.h. But this would fail on a sufficiently overloaded >> > system -- and the failure of the two-jiffy delay is a bit of a surprise, >> > given interrupts disabled and all that. Are you by any chance loaded >> > heavily enough to see vCPU preemption? >> > >> > I could avoid at least some of these timing issues instead using cmpxchg() >> > on ->jiffies_stall to allow only one CPU in, but leave the non-atomic >> > update to discourage overly long stall prints from running into the >> > next one. This is not perfect, either, and is roughly equivalent to >> > setting RCU_STALL_RAT_DELAY to many second's worth of jiffies, but >> > avoiding that minute's delay. But it should get rid of the duplication >> > in almost all cases, though it could allow a stall warning to overlap >> > with a later stall warning for that same grace period. Which can >> > already happen anyway. 
Also, a tens-of-seconds vCPU preemption can >> > still cause concurrent stall warnings, but if that is happening to you, >> > the concurrent stall warnings are probably the least of your problems. >> > Besides, we do need at least one CPU to actually report the stall, which >> > won't happen if that CPU's vCPU is indefinitely preempted. So there is >> > only so much I can do about that particular corner case. >> > >> > So how does the following (untested) patch work for you? >> >> Looks good to me. >> >> We run on VMs, so we can well have vCPU preemption. > > Very good! Please do get me a Tested-by when you get to that point. Unfortunately I don't have a good way to test it until it's submitted upstream. While we are seeing thousands of such instances, they happen episodically on a farm of test machines. But they are still harmful, especially when the system tries to reproduce a bug, because it's mid-way through and thinks it got a hook, but then suddenly boom! it gets some mess that it can't parse and now it does not know if it's still the same bug, or maybe a different bug triggered by the same program, so it does not know how to properly attribute the reproducer. You can see these cases as they happen here (under report/log links in the table): https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452 When the patch is submitted, the rate should go down. > Thanx, Paul > >> > ------------------------------------------------------------------------ >> > >> > commit 6a5ab1e68f8636d8823bb5a9aee35fc44c2be866 >> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >> > Date: Mon Apr 9 11:04:46 2018 -0700 >> > >> > rcu: Exclude near-simultaneous RCU CPU stall warnings >> > >> > There is a two-jiffy delay between the time that a CPU will self-report >> > an RCU CPU stall warning and the time that some other CPU will report a >> > warning on behalf of the first CPU. 
This has worked well in the past, >> > but on busy systems, it is possible for the two warnings to overlap, >> > which makes interpreting them extremely difficult. >> > >> > This commit therefore uses a cmpxchg-based timing decision that >> > allows only one report in a given one-minute period (assuming default >> > stall-warning Kconfig parameters). This approach will of course fail >> > if you are seeing minute-long vCPU preemption, but in that case the >> > overlapping RCU CPU stall warnings are the least of your worries. >> > >> > Reported-by: Dmitry Vyukov <dvyukov@google.com> >> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >> > >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >> > index 381b47a68ac6..b7246bcbf633 100644 >> > --- a/kernel/rcu/tree.c >> > +++ b/kernel/rcu/tree.c >> > @@ -1429,8 +1429,6 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); >> > return; >> > } >> > - WRITE_ONCE(rsp->jiffies_stall, >> > - jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); >> > >> > /* >> > @@ -1481,6 +1479,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) >> > sched_show_task(current); >> > } >> > } >> > + /* Rewrite if needed in case of slow consoles. */ >> > + if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) >> > + WRITE_ONCE(rsp->jiffies_stall, >> > + jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> > >> > rcu_check_gp_kthread_starvation(rsp); >> > >> > @@ -1525,6 +1527,7 @@ static void print_cpu_stall(struct rcu_state *rsp) >> > rcu_dump_cpu_stacks(rsp); >> > >> > raw_spin_lock_irqsave_rcu_node(rnp, flags); >> > + /* Rewrite if needed in case of slow consoles. 
*/ >> > if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) >> > WRITE_ONCE(rsp->jiffies_stall, >> > jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> > @@ -1548,6 +1551,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> > unsigned long gpnum; >> > unsigned long gps; >> > unsigned long j; >> > + unsigned long jn; >> > unsigned long js; >> > struct rcu_node *rnp; >> > >> > @@ -1586,14 +1590,17 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> > ULONG_CMP_GE(gps, js)) >> > return; /* No stall or GP completed since entering function. */ >> > rnp = rdp->mynode; >> > + jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; >> > if (rcu_gp_in_progress(rsp) && >> > - (READ_ONCE(rnp->qsmask) & rdp->grpmask)) { >> > + (READ_ONCE(rnp->qsmask) & rdp->grpmask) && >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { >> > >> > /* We haven't checked in, so go dump stack. */ >> > print_cpu_stall(rsp); >> > >> > } else if (rcu_gp_in_progress(rsp) && >> > - ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) { >> > + ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { >> > >> > /* They had a few time units to dump stack, so complain. */ >> > print_other_cpu_stall(rsp, gpnum); >> > >> > ^ permalink raw reply [flat|nested] 19+ messages in thread
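The cmpxchg() gate in the patch above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: it assumes C11 <stdatomic.h> in place of the kernel's cmpxchg(), and the names next_report, claim_report() and claim_report_example() are hypothetical stand-ins for rsp->jiffies_stall and the two call sites in check_cpu_stall(). The idea is that every would-be reporter samples the current deadline (js) and then tries to swap in a later one (jn); only the caller whose sampled value is still current succeeds, so only that caller prints.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for rsp->jiffies_stall: earliest time of the next report. */
_Atomic unsigned long next_report;

/*
 * Try to claim the right to print a report.
 * seen: the deadline this caller sampled earlier (js in the patch).
 * next: the deadline to install if the claim succeeds (jn in the patch).
 * Returns true only if next_report still held 'seen' at the time of the swap.
 */
bool claim_report(unsigned long seen, unsigned long next)
{
	return atomic_compare_exchange_strong(&next_report, &seen, next);
}

/* Usage: two would-be reporters sample the same deadline; only the first wins. */
void claim_report_example(void)
{
	atomic_store(&next_report, 100UL);
	unsigned long js = atomic_load(&next_report);

	assert(claim_report(js, js + 60));   /* first caller installs 160 */
	assert(!claim_report(js, js + 60));  /* second caller finds 160, not 100 */
	assert(atomic_load(&next_report) == 160UL);
}
```

As in the patch, the losing caller simply skips its report. The later non-atomic rewrite of the deadline ("in case of slow consoles") is not modelled in this sketch.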
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-11 10:06 ` Dmitry Vyukov @ 2018-04-11 19:36 ` Paul E. McKenney 2018-04-12 9:39 ` Dmitry Vyukov 0 siblings, 1 reply; 19+ messages in thread From: Paul E. McKenney @ 2018-04-11 19:36 UTC (permalink / raw) To: Dmitry Vyukov Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Wed, Apr 11, 2018 at 12:06:27PM +0200, Dmitry Vyukov wrote: > On Tue, Apr 10, 2018 at 7:02 PM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > >> >> >> On Mon, Apr 2, 2018 at 7:23 PM, Paul E. McKenney > >> >> >> <paulmck@linux.vnet.ibm.com> wrote: > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > Hello, > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > syzbot hit the following crash on upstream commit > >> >> >> >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) > >> >> >> >> >> >> >> > Linux 4.16 > >> >> >> >> >> >> >> > syzbot dashboard link: > >> >> >> >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. > >> >> >> >> >> >> >> > Raw console output: > >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 > >> >> >> >> >> >> >> > Kernel config: > >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 > >> >> >> >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: > >> >> >> >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com > >> >> >> >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for > >> >> >> >> >> >> >> > details. > >> >> >> >> >> >> >> > If you forward the report, please keep this part and the footer. 
> >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount > >> >> >> >> >> >> >> > option "g �;e�K�>pquota" > >> >> >> >> >> >> > > >> >> >> >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller > >> >> >> >> >> >> > playing around with mount options. > >> >> >> >> >> >> > > >> >> >> >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. > >> >> >> >> >> >> >> > Not tainted 4.16.0+ #10 > >> >> >> >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> >> >> >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 > >> >> >> >> >> >> >> > Call Trace: > >> >> >> >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] > >> >> >> >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 > >> >> >> >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 > >> >> >> >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 > >> >> >> >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] > >> >> >> >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] > >> >> >> >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] > >> >> >> >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 > >> >> >> >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 > >> >> >> >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 > >> >> >> >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing > >> >> >> >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel > >> >> >> >> >> >> >> space and never scheduling, that can cause this issue. Or if RCU > >> >> >> >> >> >> >> somehow missed a transition into idle or user space. 
> >> >> >> >> >> >> > > >> >> >> >> >> >> > The RCU CPU stall warning below strongly supports this position ... > >> >> >> >> >> >> > >> >> >> >> >> >> I think this is this guy then: > >> >> >> >> >> >> > >> >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 > >> >> >> >> >> >> > >> >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes > >> >> >> >> >> > > >> >> >> >> >> > Seems likely to me! > >> >> >> >> >> > > >> >> >> >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I > >> >> >> >> >> >> think we need some kind of priority between them. I.e. we have rcu > >> >> >> >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine > >> >> >> >> >> >> hang and maybe something else. It would be useful if they fire > >> >> >> >> >> >> deterministically according to priorities. If there is an rcu stall, > >> >> >> >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, > >> >> >> >> >> >> but a workqueue stall, then that's always detected as workqueue stall, > >> >> >> >> >> >> etc. > >> >> >> >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be > >> >> >> >> >> >> detected either RCU stall or a task hung, producing 2 different bug > >> >> >> >> >> >> reports (which is bad). > >> >> >> >> >> >> One can say that it's only a matter of tuning timeouts, but at least > >> >> >> >> >> >> task hung detector has a problem that if you set timeout to X, it can > >> >> >> >> >> >> detect hung anywhere between X and 2*X. And on one hand we need quite > >> >> >> >> >> >> large timeout (a minute may not be enough), and on the other hand we > >> >> >> >> >> >> can't wait for an hour just to make sure that the machine is indeed > >> >> >> >> >> >> dead (these things happen every few minutes). 
> >> >> >> >> >> > > >> >> >> >> >> > I suppose that we could have a global variable that was set to the > >> >> >> >> >> > priority of the complaint in question, which would suppress all > >> >> >> >> >> > lower-priority complaints. Might need to be opt-in, though -- I would > >> >> >> >> >> > guess that not everyone is going to be happy with one complaint suppressing > >> >> >> >> >> > others, especially given the possibility that the two complaints might > >> >> >> >> >> > be about different things. > >> >> >> >> >> > > >> >> >> >> >> > Or did you have something more deft in mind? > >> >> >> >> >> > >> >> >> >> >> > >> >> >> >> >> syzkaller generally looks only at the first report. One does not know > >> >> >> >> >> if/when there will be a second one, or the second one can be induced > >> >> >> >> >> by the first one, and we generally want clean reports on a non-tainted > >> >> >> >> >> kernel. So we don't just need to suppress lower priority ones, we need > >> >> >> >> >> to produce the right report first. > >> >> >> >> >> I am thinking maybe setting: > >> >> >> >> >> - rcu stalls at 1.5 minutes > >> >> >> >> >> - workqueue stalls at 2 minutes > >> >> >> >> >> - task hungs at 2.5 minutes > >> >> >> >> >> - and no output whatsoever at 3 minutes > >> >> >> >> >> Do I miss anything? I think at least spinlocks. Should they go before > >> >> >> >> >> or after rcu? > >> >> >> >> > > >> >> >> >> > That is what I know of, but the Linux kernel being what it is, there is > >> >> >> >> > probably something more out there. If not now, in a few months. The > >> >> >> >> > RCU CPU stall timeout can be set on the kernel-boot command line, but > >> >> >> >> > you probably already knew that. > >> >> >> >> > >> >> >> >> Well, it's all based solely on a large number of patches and stopgaps. > >> >> >> >> If we fix main problems for today, it's already good. > >> >> >> > > >> >> >> > Fair enough! 
> >> >> >> > > >> >> >> >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout > >> >> >> >> > was 1.5 -seconds-. ;-) > >> >> >> >> > >> >> >> >> Have you tried to instrument every basic block with a function call to > >> >> >> >> collect coverage, check every damn memory access for validity, enable > >> >> >> >> all thinkable and unthinkable debug configs and put the insanest load > >> >> >> >> one can imagine from a swarm of parallel threads? It makes things a > >> >> >> >> bit slower ;) > >> >> >> > > >> >> >> > Given that we wouldn't have had enough CPU or memory to accommodate > >> >> >> > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) > >> >> >> > > >> >> >> >> >> This will require fixing task hung. Have not yet looked at workqueue detector. > >> >> >> >> >> Does at least RCU respect the given timeout more or less precisely? > >> >> >> >> > > >> >> >> >> > Assuming that there is at least one CPU capable of taking scheduling-clock > >> >> >> >> > interrupts, it should respect the timeout to within a few jiffies. 
> >> >> >> > >> >> >> > >> >> >> Hi Paul, > >> >> >> > >> >> >> Speaking of stalls and rcu, we are seeing lots of crashes that go like this: > >> >> >> > >> >> >> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: > >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: > >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: > >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> > >> >> >> or like this: > >> >> >> > >> >> >> INFO: rcu_sched self-detected stall on CPU > >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> >> softirq=57641/57641 fqs=31151 > >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> >> softirq=57641/57641 fqs=31151 > >> >> >> (t=125002 jiffies g=31656 c=31655 q=910) > >> >> >> > >> >> >> INFO: rcu_sched self-detected stall on CPU > >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> >> softirq=65194/65194 fqs=31231 > >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> >> softirq=65194/65194 fqs=31231 > >> >> >> (t=125002 jiffies g=34421 c=34420 q=1119) > >> >> >> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119) > >> >> >> > >> >> >> > >> >> >> and then there is an unintelligible mess of 2 reports. Such crashes go > >> >> >> to trash bin, because we can't even say which function hanged. It > >> >> >> seems that in all cases 2 different rcu stall detection facilities > >> >> >> race with each other. Is it possible to make them not race? > >> >> > > >> >> > How about the following (untested, not for mainline) patch? It suppresses > >> >> > all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and > >> >> > rcu_preempt otherwise. 
Either way, this is the RCU flavor corresponding > >> >> > to synchronize_rcu(). This works well in the common case where there > >> >> > is almost always an RCU grace period in flight. > >> >> > > >> >> > One reason that this patch is not for mainline is that I am working on > >> >> > merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing, > >> >> > at which point there won't be any races. But that might be a couple > >> >> > merge windows away from now. > >> >> > > >> >> > Thanx, Paul > >> >> > > >> >> > ------------------------------------------------------------------------ > >> >> > > >> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > >> >> > index 381b47a68ac6..31f7818f2d63 100644 > >> >> > --- a/kernel/rcu/tree.c > >> >> > +++ b/kernel/rcu/tree.c > >> >> > @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) > >> >> > struct rcu_node *rnp; > >> >> > > >> >> > if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || > >> >> > - !rcu_gp_in_progress(rsp)) > >> >> > + !rcu_gp_in_progress(rsp) || rsp != rcu_state_p) > >> >> > return; > >> >> > rcu_stall_kick_kthreads(rsp); > >> >> > j = jiffies; > >> >> > >> >> But doesn't they both relate to the same rcu flavor? They both say > >> >> rcu_sched. I assumed that the difference is "self-detected" vs "on > >> >> CPUs/tasks", i.e. on the current CPU vs on other CPUs. > >> > > >> > Right you are! > >> > > >> > One approach would be to increase the value of RCU_STALL_RAT_DELAY, > >> > which is currently two jiffies to (say) 20 jiffies. This is in > >> > kernel/rcu/tree.h. But this would fail on a sufficiently overloaded > >> > system -- and the failure of the two-jiffy delay is a bit of a surprise, > >> > given interrupts disabled and all that. Are you by any chance loaded > >> > heavily enough to see vCPU preemption? 
> >> > > >> > I could avoid at least some of these timing issues instead using cmpxchg() > >> > on ->jiffies_stall to allow only one CPU in, but leave the non-atomic > >> > update to discourage overly long stall prints from running into the > >> > next one. This is not perfect, either, and is roughly equivalent to > >> > setting RCU_STALL_RAT_DELAY to many second's worth of jiffies, but > >> > avoiding that minute's delay. But it should get rid of the duplication > >> > in almost all cases, though it could allow a stall warning to overlap > >> > with a later stall warning for that same grace period. Which can > >> > already happen anyway. Also, a tens-of-seconds vCPU preemption can > >> > still cause concurrent stall warnings, but if that is happening to you, > >> > the concurrent stall warnings are probably the least of your problems. > >> > Besides, we do need at least one CPU to actually report the stall, which > >> > won't happen if that CPU's vCPU is indefinitely preempted. So there is > >> > only so much I can do about that particular corner case. > >> > > >> > So how does the following (untested) patch work for you? > >> > >> Looks good to me. > >> > >> We run on VMs, so we can well have vCPU preemption. > > > > Very good! Please do get me a Tested-by when you get to that point. > > Unfortunately I don't have a good way to test it until it's submitted > upstream. While we are seeing thousands of such instances, they happen > episodically on a farm of test machines. But they are still harmful, > especially when the system tries to reproduce a bug, because it's > mid-way through and thinks it got a hook, but then suddenly boom! it > gets some mess that it can't parse and now it does not know if it's > still the same bug, or maybe a different bug triggered by the same > program, so it does not know how to properly attribute the reproducer. 
> You can see these cases as they happen here (under report/log links in > the table): > https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452 > When the patch is submitted, the rate should go down. OK, I will bite... How do you test fixes to problems that syzkaller finds? Thanx, Paul > >> > ------------------------------------------------------------------------ > >> > > >> > commit 6a5ab1e68f8636d8823bb5a9aee35fc44c2be866 > >> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > >> > Date: Mon Apr 9 11:04:46 2018 -0700 > >> > > >> > rcu: Exclude near-simultaneous RCU CPU stall warnings > >> > > >> > There is a two-jiffy delay between the time that a CPU will self-report > >> > an RCU CPU stall warning and the time that some other CPU will report a > >> > warning on behalf of the first CPU. This has worked well in the past, > >> > but on busy systems, it is possible for the two warnings to overlap, > >> > which makes interpreting them extremely difficult. > >> > > >> > This commit therefore uses a cmpxchg-based timing decision that > >> > allows only one report in a given one-minute period (assuming default > >> > stall-warning Kconfig parameters). This approach will of course fail > >> > if you are seeing minute-long vCPU preemption, but in that case the > >> > overlapping RCU CPU stall warnings are the least of your worries. > >> > > >> > Reported-by: Dmitry Vyukov <dvyukov@google.com> > >> > Signed-off-by: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> > >> > > >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > >> > index 381b47a68ac6..b7246bcbf633 100644 > >> > --- a/kernel/rcu/tree.c > >> > +++ b/kernel/rcu/tree.c > >> > @@ -1429,8 +1429,6 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) > >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); > >> > return; > >> > } > >> > - WRITE_ONCE(rsp->jiffies_stall, > >> > - jiffies + 3 * rcu_jiffies_till_stall_check() + 3); > >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); > >> > > >> > /* > >> > @@ -1481,6 +1479,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) > >> > sched_show_task(current); > >> > } > >> > } > >> > + /* Rewrite if needed in case of slow consoles. */ > >> > + if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) > >> > + WRITE_ONCE(rsp->jiffies_stall, > >> > + jiffies + 3 * rcu_jiffies_till_stall_check() + 3); > >> > > >> > rcu_check_gp_kthread_starvation(rsp); > >> > > >> > @@ -1525,6 +1527,7 @@ static void print_cpu_stall(struct rcu_state *rsp) > >> > rcu_dump_cpu_stacks(rsp); > >> > > >> > raw_spin_lock_irqsave_rcu_node(rnp, flags); > >> > + /* Rewrite if needed in case of slow consoles. */ > >> > if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) > >> > WRITE_ONCE(rsp->jiffies_stall, > >> > jiffies + 3 * rcu_jiffies_till_stall_check() + 3); > >> > @@ -1548,6 +1551,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) > >> > unsigned long gpnum; > >> > unsigned long gps; > >> > unsigned long j; > >> > + unsigned long jn; > >> > unsigned long js; > >> > struct rcu_node *rnp; > >> > > >> > @@ -1586,14 +1590,17 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) > >> > ULONG_CMP_GE(gps, js)) > >> > return; /* No stall or GP completed since entering function. 
*/ > >> > rnp = rdp->mynode; > >> > + jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; > >> > if (rcu_gp_in_progress(rsp) && > >> > - (READ_ONCE(rnp->qsmask) & rdp->grpmask)) { > >> > + (READ_ONCE(rnp->qsmask) & rdp->grpmask) && > >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { > >> > > >> > /* We haven't checked in, so go dump stack. */ > >> > print_cpu_stall(rsp); > >> > > >> > } else if (rcu_gp_in_progress(rsp) && > >> > - ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) { > >> > + ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && > >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { > >> > > >> > /* They had a few time units to dump stack, so complain. */ > >> > print_other_cpu_stall(rsp, gpnum); > >> > > >> > > > ^ permalink raw reply [flat|nested] 19+ messages in thread
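The remark quoted above, that a hung-task timeout of X can fire anywhere between X and 2*X, can be made concrete with a small model. This is a sketch under stated assumptions, not the kernel's detector: it assumes a checker that wakes every `period` time units and reports a task only when it has seen no progress since the previous check, and that the task made progress at some point between the last check and the moment it hung. The function name hang_report_latency() is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model of a periodic hang checker.
 * hang_time: the moment the task stops making progress.
 * period:    interval between checks (the timeout X in the discussion).
 *
 * The first check strictly after hang_time still sees progress since the
 * previous check, so it only records a baseline. The check after that
 * sees no change and reports. Returns the delay from hang to report.
 */
unsigned long hang_report_latency(unsigned long hang_time, unsigned long period)
{
	unsigned long baseline_check = (hang_time / period + 1) * period;
	unsigned long report_check = baseline_check + period;

	return report_check - hang_time;
}
```

With period = 120, a hang at time 1 is reported 239 units later and a hang at time 119 is reported 121 units later, so in this model the delay always falls between X and 2*X depending on where in the checking interval the hang begins.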
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-11 19:36 ` Paul E. McKenney @ 2018-04-12 9:39 ` Dmitry Vyukov 2018-04-12 15:07 ` Paul E. McKenney 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Vyukov @ 2018-04-12 9:39 UTC (permalink / raw) To: Paul McKenney Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Wed, Apr 11, 2018 at 9:36 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: >> >> >> >> <paulmck@linux.vnet.ibm.com> wrote: >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > Hello, >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > syzbot hit the following crash on upstream commit >> >> >> >> >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) >> >> >> >> >> >> >> >> > Linux 4.16 >> >> >> >> >> >> >> >> > syzbot dashboard link: >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. >> >> >> >> >> >> >> >> > Raw console output: >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 >> >> >> >> >> >> >> >> > Kernel config: >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 >> >> >> >> >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: >> >> >> >> >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com >> >> >> >> >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for >> >> >> >> >> >> >> >> > details. >> >> >> >> >> >> >> >> > If you forward the report, please keep this part and the footer. 
>> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount >> >> >> >> >> >> >> >> > option "g �;e�K�>pquota" >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller >> >> >> >> >> >> >> > playing around with mount options. >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. >> >> >> >> >> >> >> >> > Not tainted 4.16.0+ #10 >> >> >> >> >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> >> >> >> >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 >> >> >> >> >> >> >> >> > Call Trace: >> >> >> >> >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] >> >> >> >> >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 >> >> >> >> >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 >> >> >> >> >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 >> >> >> >> >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] >> >> >> >> >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] >> >> >> >> >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] >> >> >> >> >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 >> >> >> >> >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 >> >> >> >> >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 >> >> >> >> >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing >> >> >> >> >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel >> >> >> >> >> >> >> >> space and never scheduling, that can cause this issue. Or if RCU >> >> >> >> >> >> >> >> somehow missed a transition into idle or user space. 
>> >> >> >> >> >> >> > >> >> >> >> >> >> >> > The RCU CPU stall warning below strongly supports this position ... >> >> >> >> >> >> >> >> >> >> >> >> >> >> I think this is this guy then: >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> >> >> >> >> >> >> >> >> >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes >> >> >> >> >> >> > >> >> >> >> >> >> > Seems likely to me! >> >> >> >> >> >> > >> >> >> >> >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I >> >> >> >> >> >> >> think we need some kind of priority between them. I.e. we have rcu >> >> >> >> >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine >> >> >> >> >> >> >> hang and maybe something else. It would be useful if they fire >> >> >> >> >> >> >> deterministically according to priorities. If there is an rcu stall, >> >> >> >> >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, >> >> >> >> >> >> >> but a workqueue stall, then that's always detected as workqueue stall, >> >> >> >> >> >> >> etc. >> >> >> >> >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be >> >> >> >> >> >> >> detected either RCU stall or a task hung, producing 2 different bug >> >> >> >> >> >> >> reports (which is bad). >> >> >> >> >> >> >> One can say that it's only a matter of tuning timeouts, but at least >> >> >> >> >> >> >> task hung detector has a problem that if you set timeout to X, it can >> >> >> >> >> >> >> detect hung anywhere between X and 2*X. And on one hand we need quite >> >> >> >> >> >> >> large timeout (a minute may not be enough), and on the other hand we >> >> >> >> >> >> >> can't wait for an hour just to make sure that the machine is indeed >> >> >> >> >> >> >> dead (these things happen every few minutes). 
>> >> >> >> >> >> > >> >> >> >> >> >> > I suppose that we could have a global variable that was set to the >> >> >> >> >> >> > priority of the complaint in question, which would suppress all >> >> >> >> >> >> > lower-priority complaints. Might need to be opt-in, though -- I would >> >> >> >> >> >> > guess that not everyone is going to be happy with one complaint suppressing >> >> >> >> >> >> > others, especially given the possibility that the two complaints might >> >> >> >> >> >> > be about different things. >> >> >> >> >> >> > >> >> >> >> >> >> > Or did you have something more deft in mind? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> syzkaller generally looks only at the first report. One does not know >> >> >> >> >> >> if/when there will be a second one, or the second one can be induced >> >> >> >> >> >> by the first one, and we generally want clean reports on a non-tainted >> >> >> >> >> >> kernel. So we don't just need to suppress lower priority ones, we need >> >> >> >> >> >> to produce the right report first. >> >> >> >> >> >> I am thinking maybe setting: >> >> >> >> >> >> - rcu stalls at 1.5 minutes >> >> >> >> >> >> - workqueue stalls at 2 minutes >> >> >> >> >> >> - task hungs at 2.5 minutes >> >> >> >> >> >> - and no output whatsoever at 3 minutes >> >> >> >> >> >> Do I miss anything? I think at least spinlocks. Should they go before >> >> >> >> >> >> or after rcu? >> >> >> >> >> > >> >> >> >> >> > That is what I know of, but the Linux kernel being what it is, there is >> >> >> >> >> > probably something more out there. If not now, in a few months. The >> >> >> >> >> > RCU CPU stall timeout can be set on the kernel-boot command line, but >> >> >> >> >> > you probably already knew that. >> >> >> >> >> >> >> >> >> >> Well, it's all based solely on a large number of patches and stopgaps. >> >> >> >> >> If we fix main problems for today, it's already good. >> >> >> >> > >> >> >> >> > Fair enough! 
>> >> >> >> > >> >> >> >> >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout >> >> >> >> >> > was 1.5 -seconds-. ;-) >> >> >> >> >> >> >> >> >> >> Have you tried to instrument every basic block with a function call to >> >> >> >> >> collect coverage, check every damn memory access for validity, enable >> >> >> >> >> all thinkable and unthinkable debug configs and put the insanest load >> >> >> >> >> one can imagine from a swarm of parallel threads? It makes things a >> >> >> >> >> bit slower ;) >> >> >> >> > >> >> >> >> > Given that we wouldn't have had enough CPU or memory to accommodate >> >> >> >> > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) >> >> >> >> > >> >> >> >> >> >> This will require fixing task hung. Have not yet looked at workqueue detector. >> >> >> >> >> >> Does at least RCU respect the given timeout more or less precisely? >> >> >> >> >> > >> >> >> >> >> > Assuming that there is at least one CPU capable of taking scheduling-clock >> >> >> >> >> > interrupts, it should respect the timeout to within a few jiffies. 
>> >> >> >> >> >> >> >> >> >> >> >> Hi Paul, >> >> >> >> >> >> >> >> Speaking of stalls and rcu, we are seeing lots of crashes that go like this: >> >> >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: >> >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: >> >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: >> >> >> >> rcu_sched detected stalls on CPUs/tasks: >> >> >> >> >> >> >> >> or like this: >> >> >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 >> >> >> >> softirq=57641/57641 fqs=31151 >> >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 >> >> >> >> softirq=57641/57641 fqs=31151 >> >> >> >> (t=125002 jiffies g=31656 c=31655 q=910) >> >> >> >> >> >> >> >> INFO: rcu_sched self-detected stall on CPU >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 >> >> >> >> softirq=65194/65194 fqs=31231 >> >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 >> >> >> >> softirq=65194/65194 fqs=31231 >> >> >> >> (t=125002 jiffies g=34421 c=34420 q=1119) >> >> >> >> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119) >> >> >> >> >> >> >> >> >> >> >> >> and then there is an unintelligible mess of 2 reports. Such crashes go >> >> >> >> to trash bin, because we can't even say which function hanged. It >> >> >> >> seems that in all cases 2 different rcu stall detection facilities >> >> >> >> race with each other. Is it possible to make them not race? >> >> >> > >> >> >> > How about the following (untested, not for mainline) patch? 
It suppresses >> >> >> > all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and >> >> >> > rcu_preempt otherwise. Either way, this is the RCU flavor corresponding >> >> >> > to synchronize_rcu(). This works well in the common case where there >> >> >> > is almost always an RCU grace period in flight. >> >> >> > >> >> >> > One reason that this patch is not for mainline is that I am working on >> >> >> > merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing, >> >> >> > at which point there won't be any races. But that might be a couple >> >> >> > merge windows away from now. >> >> >> > >> >> >> > Thanx, Paul >> >> >> > >> >> >> > ------------------------------------------------------------------------ >> >> >> > >> >> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >> >> >> > index 381b47a68ac6..31f7818f2d63 100644 >> >> >> > --- a/kernel/rcu/tree.c >> >> >> > +++ b/kernel/rcu/tree.c >> >> >> > @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> >> >> > struct rcu_node *rnp; >> >> >> > >> >> >> > if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || >> >> >> > - !rcu_gp_in_progress(rsp)) >> >> >> > + !rcu_gp_in_progress(rsp) || rsp != rcu_state_p) >> >> >> > return; >> >> >> > rcu_stall_kick_kthreads(rsp); >> >> >> > j = jiffies; >> >> >> >> >> >> But doesn't they both relate to the same rcu flavor? They both say >> >> >> rcu_sched. I assumed that the difference is "self-detected" vs "on >> >> >> CPUs/tasks", i.e. on the current CPU vs on other CPUs. >> >> > >> >> > Right you are! >> >> > >> >> > One approach would be to increase the value of RCU_STALL_RAT_DELAY, >> >> > which is currently two jiffies to (say) 20 jiffies. This is in >> >> > kernel/rcu/tree.h. But this would fail on a sufficiently overloaded >> >> > system -- and the failure of the two-jiffy delay is a bit of a surprise, >> >> > given interrupts disabled and all that. 
Are you by any chance loaded >> >> > heavily enough to see vCPU preemption? >> >> > >> >> > I could avoid at least some of these timing issues instead using cmpxchg() >> >> > on ->jiffies_stall to allow only one CPU in, but leave the non-atomic >> >> > update to discourage overly long stall prints from running into the >> >> > next one. This is not perfect, either, and is roughly equivalent to >> >> > setting RCU_STALL_RAT_DELAY to many second's worth of jiffies, but >> >> > avoiding that minute's delay. But it should get rid of the duplication >> >> > in almost all cases, though it could allow a stall warning to overlap >> >> > with a later stall warning for that same grace period. Which can >> >> > already happen anyway. Also, a tens-of-seconds vCPU preemption can >> >> > still cause concurrent stall warnings, but if that is happening to you, >> >> > the concurrent stall warnings are probably the least of your problems. >> >> > Besides, we do need at least one CPU to actually report the stall, which >> >> > won't happen if that CPU's vCPU is indefinitely preempted. So there is >> >> > only so much I can do about that particular corner case. >> >> > >> >> > So how does the following (untested) patch work for you? >> >> >> >> Looks good to me. >> >> >> >> We run on VMs, so we can well have vCPU preemption. >> > >> > Very good! Please do get me a Tested-by when you get to that point. >> >> Unfortunately I don't have a good way to test it until it's submitted >> upstream. While we are seeing thousands of such instances, they happen >> episodically on a farm of test machines. But they are still harmful, >> especially when the system tries to reproduce a bug, because it's >> mid-way through and thinks it got a hook, but then suddenly boom! it >> gets some mess that it can't parse and now it does not know if it's >> still the same bug, or maybe a different bug triggered by the same >> program, so it does not know how to properly attribute the reproducer. 
>> You can see these cases as they happen here (under report/log links in >> the table): >> https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452 >> When the patch is submitted, the rate should go down. > > OK, I will bite... How do you test fixes to problems that syzkaller finds? I don't. I can't. No one can test that many fixes. Normally syzbot provides reproducers for bugs. Then you have 2 choices: (1) test it yourself (if you debugged it, you probably already have everything setup for this), or (2) ask syzbot to test the patch on this particular reproducer. Some bugs don't have reproducers. Then you either localize the bug and write a test, or go with the old good "it must be correct, right?". Even for the second case, syzbot will notify if the bug happens again after the fix is landed, or it's silent, then presumably the fix indeed fixed the bug. Now, this is not a syzbot bug (syzbot reports bugs itself from own email address). This is more like you looked at somebody else dmsg and like "oh, this looks bad, let me copy-paste and report it". So can also go with the old good "it must be correct, right?" and assess how well it goes after few weeks when it reaches syzbot, or someone needs to write a test for rcu. This could have been handled with some kind of "cluster-wide" test, but I don't see how it is feasible. See this for details: https://groups.google.com/d/msg/syzkaller-bugs/7ucgCkAJKSk/skZjgavRAQAJ Especially the part that someone will need to go through and triage hundreds of crashes and assess that they are not related to the new patch, and do something with then afterwards. >> >> > ------------------------------------------------------------------------ >> >> > >> >> > commit 6a5ab1e68f8636d8823bb5a9aee35fc44c2be866 >> >> > Author: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> >> >> > Date: Mon Apr 9 11:04:46 2018 -0700 >> >> > >> >> > rcu: Exclude near-simultaneous RCU CPU stall warnings >> >> > >> >> > There is a two-jiffy delay between the time that a CPU will self-report >> >> > an RCU CPU stall warning and the time that some other CPU will report a >> >> > warning on behalf of the first CPU. This has worked well in the past, >> >> > but on busy systems, it is possible for the two warnings to overlap, >> >> > which makes interpreting them extremely difficult. >> >> > >> >> > This commit therefore uses a cmpxchg-based timing decision that >> >> > allows only one report in a given one-minute period (assuming default >> >> > stall-warning Kconfig parameters). This approach will of course fail >> >> > if you are seeing minute-long vCPU preemption, but in that case the >> >> > overlapping RCU CPU stall warnings are the least of your worries. >> >> > >> >> > Reported-by: Dmitry Vyukov <dvyukov@google.com> >> >> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >> >> > >> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >> >> > index 381b47a68ac6..b7246bcbf633 100644 >> >> > --- a/kernel/rcu/tree.c >> >> > +++ b/kernel/rcu/tree.c >> >> > @@ -1429,8 +1429,6 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) >> >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); >> >> > return; >> >> > } >> >> > - WRITE_ONCE(rsp->jiffies_stall, >> >> > - jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> >> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); >> >> > >> >> > /* >> >> > @@ -1481,6 +1479,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum) >> >> > sched_show_task(current); >> >> > } >> >> > } >> >> > + /* Rewrite if needed in case of slow consoles. 
*/ >> >> > + if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) >> >> > + WRITE_ONCE(rsp->jiffies_stall, >> >> > + jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> >> > >> >> > rcu_check_gp_kthread_starvation(rsp); >> >> > >> >> > @@ -1525,6 +1527,7 @@ static void print_cpu_stall(struct rcu_state *rsp) >> >> > rcu_dump_cpu_stacks(rsp); >> >> > >> >> > raw_spin_lock_irqsave_rcu_node(rnp, flags); >> >> > + /* Rewrite if needed in case of slow consoles. */ >> >> > if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) >> >> > WRITE_ONCE(rsp->jiffies_stall, >> >> > jiffies + 3 * rcu_jiffies_till_stall_check() + 3); >> >> > @@ -1548,6 +1551,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> >> > unsigned long gpnum; >> >> > unsigned long gps; >> >> > unsigned long j; >> >> > + unsigned long jn; >> >> > unsigned long js; >> >> > struct rcu_node *rnp; >> >> > >> >> > @@ -1586,14 +1590,17 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) >> >> > ULONG_CMP_GE(gps, js)) >> >> > return; /* No stall or GP completed since entering function. */ >> >> > rnp = rdp->mynode; >> >> > + jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; >> >> > if (rcu_gp_in_progress(rsp) && >> >> > - (READ_ONCE(rnp->qsmask) & rdp->grpmask)) { >> >> > + (READ_ONCE(rnp->qsmask) & rdp->grpmask) && >> >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { >> >> > >> >> > /* We haven't checked in, so go dump stack. */ >> >> > print_cpu_stall(rsp); >> >> > >> >> > } else if (rcu_gp_in_progress(rsp) && >> >> > - ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) { >> >> > + ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && >> >> > + cmpxchg(&rsp->jiffies_stall, js, jn) == js) { >> >> > >> >> > /* They had a few time units to dump stack, so complain. */ >> >> > print_other_cpu_stall(rsp, gpnum); >> >> > >> >> >> > >> > ^ permalink raw reply [flat|nested] 19+ messages in thread
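For readers following the patch quoted above, here is a minimal userspace sketch of the idea behind the cmpxchg()-based gate: every CPU that decides a warning is due tries to move the shared deadline forward from the value it sampled, and only the one whose sample is still current gets to print. This is an illustration under assumptions, not the kernel code. C11 atomics stand in for the kernel's cmpxchg(), claim_stall_report() and stall_gate_demo() are hypothetical names, and the value 21000 is an arbitrary stand-in for rcu_jiffies_till_stall_check(). ULONG_CMP_GE() is written the way the kernel defines it, as a wraparound-safe comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Wraparound-safe "a >= b" for free-running counters, as in the kernel's ULONG_CMP_GE(). */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Stand-in for rsp->jiffies_stall: the time at which the next warning is due. */
static _Atomic unsigned long jiffies_stall;

/*
 * Hypothetical helper (not a kernel function).  "js" is the deadline this
 * CPU sampled earlier and "jn" is the next deadline it proposes.  Returns
 * true only if the deadline has passed and nobody else has moved it since
 * it was sampled.
 */
bool claim_stall_report(unsigned long now, unsigned long js, unsigned long jn)
{
	if (!ULONG_CMP_GE(now, js))
		return false;	/* Deadline not reached yet. */
	/* Succeeds for exactly one of the callers that sampled the same js. */
	return atomic_compare_exchange_strong(&jiffies_stall, &js, jn);
}

/* Two "CPUs" sample the same deadline; returns how many of them may print. */
int stall_gate_demo(void)
{
	unsigned long js, jn;
	int winners = 0;

	atomic_store(&jiffies_stall, 1000UL);
	js = atomic_load(&jiffies_stall);
	jn = js + 3 * 21000UL + 3;	/* 21000 is an arbitrary illustrative value. */

	winners += claim_stall_report(1000UL, js, jn);		/* The stalled CPU itself. */
	winners += claim_stall_report(1002UL, js, jn + 2);	/* Another CPU, two jiffies later. */

	/* The loser's proposed deadline was discarded. */
	assert(atomic_load(&jiffies_stall) == jn);
	return winners;
}
```

Calling stall_gate_demo() from a test program returns the number of callers that won the exchange. The sketch leaves out the later non-atomic rewrite of the deadline that the patch keeps for slow consoles.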
* Re: INFO: task hung in perf_trace_event_unreg 2018-04-12 9:39 ` Dmitry Vyukov @ 2018-04-12 15:07 ` Paul E. McKenney 0 siblings, 0 replies; 19+ messages in thread From: Paul E. McKenney @ 2018-04-12 15:07 UTC (permalink / raw) To: Dmitry Vyukov Cc: Steven Rostedt, syzbot, LKML, Ingo Molnar, syzkaller-bugs, Peter Zijlstra, syzkaller On Thu, Apr 12, 2018 at 11:39:42AM +0200, Dmitry Vyukov wrote: > On Wed, Apr 11, 2018 at 9:36 PM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > >> >> >> >> <paulmck@linux.vnet.ibm.com> wrote: > >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > Hello, > >> >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> >> > syzbot hit the following crash on upstream commit > >> >> >> >> >> >> >> >> > 0adb32858b0bddf4ada5f364a84ed60b196dbcda (Sun Apr 1 21:20:27 2018 +0000) > >> >> >> >> >> >> >> >> > Linux 4.16 > >> >> >> >> >> >> >> >> > syzbot dashboard link: > >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/bug?extid=2dbc55da20fa246378fd > >> >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> >> > Unfortunately, I don't have any reproducer for this crash yet. > >> >> >> >> >> >> >> >> > Raw console output: > >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/log.txt?id=5487937873510400 > >> >> >> >> >> >> >> >> > Kernel config: > >> >> >> >> >> >> >> >> > https://syzkaller.appspot.com/x/.config?id=-2374466361298166459 > >> >> >> >> >> >> >> >> > compiler: gcc (GCC) 7.1.1 20170620 > >> >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit: > >> >> >> >> >> >> >> >> > Reported-by: syzbot+2dbc55da20fa246378fd@syzkaller.appspotmail.com > >> >> >> >> >> >> >> >> > It will help syzbot understand when the bug is fixed. See footer for > >> >> >> >> >> >> >> >> > details. > >> >> >> >> >> >> >> >> > If you forward the report, please keep this part and the footer. 
> >> >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> >> > REISERFS warning (device loop4): super-6502 reiserfs_getopt: unknown mount > >> >> >> >> >> >> >> >> > option "g �;e�K�>pquota" > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > Might not hurt to look into the above, though perhaps this is just syzkaller > >> >> >> >> >> >> >> > playing around with mount options. > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> >> > INFO: task syz-executor3:10803 blocked for more than 120 seconds. > >> >> >> >> >> >> >> >> > Not tainted 4.16.0+ #10 > >> >> >> >> >> >> >> >> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> >> >> >> >> >> >> >> > syz-executor3 D20944 10803 4492 0x80000002 > >> >> >> >> >> >> >> >> > Call Trace: > >> >> >> >> >> >> >> >> > context_switch kernel/sched/core.c:2862 [inline] > >> >> >> >> >> >> >> >> > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 > >> >> >> >> >> >> >> >> > schedule+0xf5/0x430 kernel/sched/core.c:3499 > >> >> >> >> >> >> >> >> > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 > >> >> >> >> >> >> >> >> > do_wait_for_common kernel/sched/completion.c:86 [inline] > >> >> >> >> >> >> >> >> > __wait_for_common kernel/sched/completion.c:107 [inline] > >> >> >> >> >> >> >> >> > wait_for_common kernel/sched/completion.c:118 [inline] > >> >> >> >> >> >> >> >> > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 > >> >> >> >> >> >> >> >> > __wait_rcu_gp+0x221/0x340 kernel/rcu/update.c:414 > >> >> >> >> >> >> >> >> > synchronize_sched.part.64+0xac/0x100 kernel/rcu/tree.c:3212 > >> >> >> >> >> >> >> >> > synchronize_sched+0x76/0xf0 kernel/rcu/tree.c:3213 > >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> I don't think this is a perf issue. Looks like something is preventing > >> >> >> >> >> >> >> >> rcu_sched from completing. If there's a CPU that is running in kernel > >> >> >> >> >> >> >> >> space and never scheduling, that can cause this issue. 
Or if RCU > >> >> >> >> >> >> >> >> somehow missed a transition into idle or user space. > >> >> >> >> >> >> >> > > >> >> >> >> >> >> >> > The RCU CPU stall warning below strongly supports this position ... > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> I think this is this guy then: > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> #syz dup: INFO: rcu detected stall in __process_echoes > >> >> >> >> >> >> > > >> >> >> >> >> >> > Seems likely to me! > >> >> >> >> >> >> > > >> >> >> >> >> >> >> Looking retrospectively at the various hang/stall bugs that we have, I > >> >> >> >> >> >> >> think we need some kind of priority between them. I.e. we have rcu > >> >> >> >> >> >> >> stalls, spinlock stalls, workqueue hangs, task hangs, silent machine > >> >> >> >> >> >> >> hang and maybe something else. It would be useful if they fire > >> >> >> >> >> >> >> deterministically according to priorities. If there is an rcu stall, > >> >> >> >> >> >> >> that's always detected as CPU stall. Then if there is no RCU stall, > >> >> >> >> >> >> >> but a workqueue stall, then that's always detected as workqueue stall, > >> >> >> >> >> >> >> etc. > >> >> >> >> >> >> >> Currently if we have an RCU stall (effectively CPU stall), that can be > >> >> >> >> >> >> >> detected either RCU stall or a task hung, producing 2 different bug > >> >> >> >> >> >> >> reports (which is bad). > >> >> >> >> >> >> >> One can say that it's only a matter of tuning timeouts, but at least > >> >> >> >> >> >> >> task hung detector has a problem that if you set timeout to X, it can > >> >> >> >> >> >> >> detect hung anywhere between X and 2*X. 
And on one hand we need quite > >> >> >> >> >> >> >> large timeout (a minute may not be enough), and on the other hand we > >> >> >> >> >> >> >> can't wait for an hour just to make sure that the machine is indeed > >> >> >> >> >> >> >> dead (these things happen every few minutes). > >> >> >> >> >> >> > > >> >> >> >> >> >> > I suppose that we could have a global variable that was set to the > >> >> >> >> >> >> > priority of the complaint in question, which would suppress all > >> >> >> >> >> >> > lower-priority complaints. Might need to be opt-in, though -- I would > >> >> >> >> >> >> > guess that not everyone is going to be happy with one complaint suppressing > >> >> >> >> >> >> > others, especially given the possibility that the two complaints might > >> >> >> >> >> >> > be about different things. > >> >> >> >> >> >> > > >> >> >> >> >> >> > Or did you have something more deft in mind? > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> syzkaller generally looks only at the first report. One does not know > >> >> >> >> >> >> if/when there will be a second one, or the second one can be induced > >> >> >> >> >> >> by the first one, and we generally want clean reports on a non-tainted > >> >> >> >> >> >> kernel. So we don't just need to suppress lower priority ones, we need > >> >> >> >> >> >> to produce the right report first. > >> >> >> >> >> >> I am thinking maybe setting: > >> >> >> >> >> >> - rcu stalls at 1.5 minutes > >> >> >> >> >> >> - workqueue stalls at 2 minutes > >> >> >> >> >> >> - task hungs at 2.5 minutes > >> >> >> >> >> >> - and no output whatsoever at 3 minutes > >> >> >> >> >> >> Do I miss anything? I think at least spinlocks. Should they go before > >> >> >> >> >> >> or after rcu? > >> >> >> >> >> > > >> >> >> >> >> > That is what I know of, but the Linux kernel being what it is, there is > >> >> >> >> >> > probably something more out there. If not now, in a few months. 
The > >> >> >> >> >> > RCU CPU stall timeout can be set on the kernel-boot command line, but > >> >> >> >> >> > you probably already knew that. > >> >> >> >> >> > >> >> >> >> >> Well, it's all based solely on a large number of patches and stopgaps. > >> >> >> >> >> If we fix main problems for today, it's already good. > >> >> >> >> > > >> >> >> >> > Fair enough! > >> >> >> >> > > >> >> >> >> >> > Just for comparison, back in DYNIX/ptx days the RCU CPU stall timeout > >> >> >> >> >> > was 1.5 -seconds-. ;-) > >> >> >> >> >> > >> >> >> >> >> Have you tried to instrument every basic block with a function call to > >> >> >> >> >> collect coverage, check every damn memory access for validity, enable > >> >> >> >> >> all thinkable and unthinkable debug configs and put the insanest load > >> >> >> >> >> one can imagine from a swarm of parallel threads? It makes things a > >> >> >> >> >> bit slower ;) > >> >> >> >> > > >> >> >> >> > Given that we wouldn't have had enough CPU or memory to accommodate > >> >> >> >> > all of that back in DYNIX/ptx days, I am forced to answer "no". ;-) > >> >> >> >> > > >> >> >> >> >> >> This will require fixing task hung. Have not yet looked at workqueue detector. > >> >> >> >> >> >> Does at least RCU respect the given timeout more or less precisely? > >> >> >> >> >> > > >> >> >> >> >> > Assuming that there is at least one CPU capable of taking scheduling-clock > >> >> >> >> >> > interrupts, it should respect the timeout to within a few jiffies. 
> >> >> >> >> > >> >> >> >> > >> >> >> >> Hi Paul, > >> >> >> >> > >> >> >> >> Speaking of stalls and rcu, we are seeing lots of crashes that go like this: > >> >> >> >> > >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 404.992530] INFO: > >> >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 454.347448] INFO: > >> >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> >> INFO: rcu_sched self-detected stall on CPU[ 396.073634] INFO: > >> >> >> >> rcu_sched detected stalls on CPUs/tasks: > >> >> >> >> > >> >> >> >> or like this: > >> >> >> >> > >> >> >> >> INFO: rcu_sched self-detected stall on CPU > >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> >> >> softirq=57641/57641 fqs=31151 > >> >> >> >> 0-....: (125000 ticks this GP) idle=0ba/1/4611686018427387906 > >> >> >> >> softirq=57641/57641 fqs=31151 > >> >> >> >> (t=125002 jiffies g=31656 c=31655 q=910) > >> >> >> >> > >> >> >> >> INFO: rcu_sched self-detected stall on CPU > >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> >> >> softirq=65194/65194 fqs=31231 > >> >> >> >> 0-....: (125000 ticks this GP) idle=49a/1/4611686018427387906 > >> >> >> >> softirq=65194/65194 fqs=31231 > >> >> >> >> (t=125002 jiffies g=34421 c=34420 q=1119) > >> >> >> >> (detected by 1, t=125002 jiffies, g=34421, c=34420, q=1119) > >> >> >> >> > >> >> >> >> > >> >> >> >> and then there is an unintelligible mess of 2 reports. Such crashes go > >> >> >> >> to trash bin, because we can't even say which function hanged. It > >> >> >> >> seems that in all cases 2 different rcu stall detection facilities > >> >> >> >> race with each other. Is it possible to make them not race? > >> >> >> > > >> >> >> > How about the following (untested, not for mainline) patch? 
It suppresses > >> >> >> > all but the "main" RCU flavor, which is rcu_sched for !PREEMPT builds and > >> >> >> > rcu_preempt otherwise. Either way, this is the RCU flavor corresponding > >> >> >> > to synchronize_rcu(). This works well in the common case where there > >> >> >> > is almost always an RCU grace period in flight. > >> >> >> > > >> >> >> > One reason that this patch is not for mainline is that I am working on > >> >> >> > merging the RCU-bh, RCU-preempt, and RCU-sched flavors into one thing, > >> >> >> > at which point there won't be any races. But that might be a couple > >> >> >> > merge windows away from now. > >> >> >> > > >> >> >> > Thanx, Paul > >> >> >> > > >> >> >> > ------------------------------------------------------------------------ > >> >> >> > > >> >> >> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > >> >> >> > index 381b47a68ac6..31f7818f2d63 100644 > >> >> >> > --- a/kernel/rcu/tree.c > >> >> >> > +++ b/kernel/rcu/tree.c > >> >> >> > @@ -1552,7 +1552,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) > >> >> >> > struct rcu_node *rnp; > >> >> >> > > >> >> >> > if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || > >> >> >> > - !rcu_gp_in_progress(rsp)) > >> >> >> > + !rcu_gp_in_progress(rsp) || rsp != rcu_state_p) > >> >> >> > return; > >> >> >> > rcu_stall_kick_kthreads(rsp); > >> >> >> > j = jiffies; > >> >> >> > >> >> >> But doesn't they both relate to the same rcu flavor? They both say > >> >> >> rcu_sched. I assumed that the difference is "self-detected" vs "on > >> >> >> CPUs/tasks", i.e. on the current CPU vs on other CPUs. > >> >> > > >> >> > Right you are! > >> >> > > >> >> > One approach would be to increase the value of RCU_STALL_RAT_DELAY, > >> >> > which is currently two jiffies to (say) 20 jiffies. This is in > >> >> > kernel/rcu/tree.h. 
But this would fail on a sufficiently overloaded > >> >> > system -- and the failure of the two-jiffy delay is a bit of a surprise, > >> >> > given interrupts disabled and all that. Are you by any chance loaded > >> >> > heavily enough to see vCPU preemption? > >> >> > > >> >> > I could avoid at least some of these timing issues instead using cmpxchg() > >> >> > on ->jiffies_stall to allow only one CPU in, but leave the non-atomic > >> >> > update to discourage overly long stall prints from running into the > >> >> > next one. This is not perfect, either, and is roughly equivalent to > >> >> > setting RCU_STALL_RAT_DELAY to many second's worth of jiffies, but > >> >> > avoiding that minute's delay. But it should get rid of the duplication > >> >> > in almost all cases, though it could allow a stall warning to overlap > >> >> > with a later stall warning for that same grace period. Which can > >> >> > already happen anyway. Also, a tens-of-seconds vCPU preemption can > >> >> > still cause concurrent stall warnings, but if that is happening to you, > >> >> > the concurrent stall warnings are probably the least of your problems. > >> >> > Besides, we do need at least one CPU to actually report the stall, which > >> >> > won't happen if that CPU's vCPU is indefinitely preempted. So there is > >> >> > only so much I can do about that particular corner case. > >> >> > > >> >> > So how does the following (untested) patch work for you? > >> >> > >> >> Looks good to me. > >> >> > >> >> We run on VMs, so we can well have vCPU preemption. > >> > > >> > Very good! Please do get me a Tested-by when you get to that point. > >> > >> Unfortunately I don't have a good way to test it until it's submitted > >> upstream. While we are seeing thousands of such instances, they happen > >> episodically on a farm of test machines. 
But they are still harmful, > >> especially when the system tries to reproduce a bug, because it's > >> mid-way through and thinks it got a hook, but then suddenly boom! it > >> gets some mess that it can't parse and now it does not know if it's > >> still the same bug, or maybe a different bug triggered by the same > >> program, so it does not know how to properly attribute the reproducer. > >> You can see these cases as they happen here (under report/log links in > >> the table): > >> https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452 > >> When the patch is submitted, the rate should go down. > > > > OK, I will bite... How do you test fixes to problems that syzkaller finds? > > I don't. I can't. No one can test that many fixes. > > Normally syzbot provides reproducers for bugs. Then you have 2 > choices: (1) test it yourself (if you debugged it, you probably > already have everything setup for this), or (2) ask syzbot to test the > patch on this particular reproducer. > Some bugs don't have reproducers. Then you either localize the bug and > write a test, or go with the old good "it must be correct, right?". > Even for the second case, syzbot will notify if the bug happens again > after the fix is landed, or it's silent, then presumably the fix > indeed fixed the bug. > > Now, this is not a syzbot bug (syzbot reports bugs itself from own > email address). This is more like you looked at somebody else dmsg and > like "oh, this looks bad, let me copy-paste and report it". > So can also go with the old good "it must be correct, right?" and > assess how well it goes after few weeks when it reaches syzbot, or > someone needs to write a test for rcu. > > This could have been handled with some kind of "cluster-wide" test, > but I don't see how it is feasible. 
See this for details:
> https://groups.google.com/d/msg/syzkaller-bugs/7ucgCkAJKSk/skZjgavRAQAJ
> Especially the part that someone will need to go through and triage
> hundreds of crashes and assess that they are not related to the new
> patch, and do something with them afterwards.

Fair enough, and apologies for the hassle.

I don't expect that the patch will be controversial, so it should go
into the next merge window.

							Thanx, Paul
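As a side note on the remark quoted earlier in this thread that the task-hung detector with timeout X can fire anywhere between X and 2*X, the following sketch models one way such a spread can arise. It assumes a simplified checker that wakes once per period and reports a task only when its progress counter is unchanged since the previous wakeup. That model and the name hung_task_report_latency() are assumptions made for illustration, not the kernel's implementation.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not kernel code): the checker wakes at
 * times period, 2*period, ... and reports a task whose progress counter did
 * not change since the previous wakeup.  "block_time" is when the task stops
 * making progress.  Returns the delay between blocking and the report.
 */
unsigned long hung_task_report_latency(unsigned long period,
				       unsigned long block_time)
{
	unsigned long first_scan;
	unsigned long report_scan;

	assert(period > 0);

	/*
	 * The first scan at or after the block still sees progress made since
	 * the previous scan, so it only records the counter.
	 */
	first_scan = ((block_time + period - 1) / period) * period;

	/* The following scan sees an unchanged counter and reports. */
	report_scan = first_scan + period;

	return report_scan - block_time;
}
```

In this model a task that blocks exactly at a scan is reported one period later, while a task that blocks just after a scan waits almost two periods, which is why tightening the ordering between detectors by timeouts alone is awkward.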
Thread overview: 19+ messages
2018-04-02  9:20 INFO: task hung in perf_trace_event_unreg syzbot
2018-04-02 13:40 ` Steven Rostedt
2018-04-02 15:33 ` Paul E. McKenney
2018-04-02 16:04 ` Dmitry Vyukov
2018-04-02 16:21 ` Paul E. McKenney
2018-04-02 16:32 ` Dmitry Vyukov
2018-04-02 16:39 ` Paul E. McKenney
2018-04-02 17:11 ` Dmitry Vyukov
2018-04-02 17:23 ` Paul E. McKenney
2018-04-09 12:54 ` Dmitry Vyukov
2018-04-09 16:20 ` Paul E. McKenney
2018-04-09 16:28 ` Dmitry Vyukov
2018-04-09 18:11 ` Paul E. McKenney
2018-04-10 11:13 ` Dmitry Vyukov
2018-04-10 17:02 ` Paul E. McKenney
2018-04-11 10:06 ` Dmitry Vyukov
2018-04-11 19:36 ` Paul E. McKenney
2018-04-12  9:39 ` Dmitry Vyukov
2018-04-12 15:07 ` Paul E. McKenney