From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752107AbdLSQVe (ORCPT ); Tue, 19 Dec 2017 11:21:34 -0500
Received: from mail-qt0-f195.google.com ([209.85.216.195]:36338 "EHLO mail-qt0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750801AbdLSQV3 (ORCPT ); Tue, 19 Dec 2017 11:21:29 -0500
X-Google-Smtp-Source: ACJfBot75BVX+2tgWpJSLL1QmzowPVIyTyB82kAKvgvbaOlt/ZI9Oe1nxy13XPyZ6wy//8xGK5fF1VKcWjjuqDKuwOM=
MIME-Version: 1.0
In-Reply-To:
References: <001a1143fd00a8cc790560b0b552@google.com>
From: Xin Long
Date: Wed, 20 Dec 2017 00:21:27 +0800
Message-ID:
Subject: Re: INFO: task hung in bpf_exit_net
To: Dmitry Vyukov
Cc: syzbot, LKML, Ingo Molnar, Peter Zijlstra, syzkaller-bugs@googlegroups.com, David Miller, David Ahern, Florian Westphal, Daniel Borkmann, jakub.kicinski@netronome.com, mschiffer@universe-factory.net, Vladislav Yasevich, Jiri Benc, netdev
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 19, 2017 at 8:47 PM, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 1:36 PM, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 7ceb97a071e80f1b5e4cd5a36de135612a836388
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default
>> minimum of 512
>> INFO: task kworker/u4:0:5 blocked for more than 120 seconds.
>> Not tainted 4.15.0-rc2-next-20171205+ #59
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> kworker/u4:0 D15808 5 2 0x80000000
>> Workqueue: netns cleanup_net
>> Call Trace:
>> context_switch kernel/sched/core.c:2800 [inline]
>> __schedule+0x8eb/0x2060 kernel/sched/core.c:3376
>> schedule+0xf5/0x430 kernel/sched/core.c:3435
>> schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
>> __mutex_lock_common kernel/locking/mutex.c:833 [inline]
>> __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
>> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>> rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
>> tc_action_net_exit include/net/act_api.h:125 [inline]
>> bpf_exit_net+0x1a2/0x340 net/sched/act_bpf.c:408
>> ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:142
>> cleanup_net+0x5c7/0xb60 net/core/net_namespace.c:484
>> process_one_work+0xbfd/0x1bc0 kernel/workqueue.c:2113
>> worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>> kthread+0x37a/0x440 kernel/kthread.c:238
>> ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:517
>> [...]
>> Call Trace:
>> serial_in drivers/tty/serial/8250/8250.h:111 [inline]
>> wait_for_xmitr+0x93/0x1e0 drivers/tty/serial/8250/8250_port.c:2033

I saw this call trace in both the 'bpf_exit_net task hung' and the
'cleanup_net task hung' reports. Note that when the CPU is here, it is
still holding rtnl_lock in both cases: one is in
nl80211_dump_interface(), and the other is in dev_ioctl().

I noticed this patch:

commit 54f19b4a679149130f78413c421a5780e90a9d0a
Author: Jiri Olsa
Date: Wed Sep 21 16:43:15 2016 +0200

    tty/serial/8250: Touch NMI watchdog in wait_for_xmitr

It means that, before this patch, the watchdog timeout could be
triggered here, and the patch fixed that by restarting the NMI watchdog
timeout with a call to touch_nmi_watchdog().

But the patch missed that rtnl_lock is still held at this point, so
other threads may still be reported as hung while they wait to acquire
rtnl_lock().

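To illustrate the point above, here is a toy discrete-time model in Python. It is not kernel code, only a sketch under simplifying assumptions: the function name simulate() and its parameters are made up for illustration, a single shared timeout stands in for both the NMI watchdog and the hung-task timeout (they are separate mechanisms in the kernel), and one holder keeps the lock for the whole run while one waiter blocks on it from tick 0.

```python
def simulate(hold_ticks, timeout, touch_every):
    """Toy model: does the lock holder or the blocked waiter exceed `timeout`?

    hold_ticks:  how long the holder keeps the lock (e.g. stuck printing)
    timeout:     ticks without progress before a task is flagged
    touch_every: holder resets its own timer every N ticks (0 = never),
                 loosely standing in for touch_nmi_watchdog()
    Returns (holder_flagged, waiter_flagged).
    """
    holder_last_touch = 0   # last tick at which the holder reset its timer
    waiter_blocked_since = 0  # the waiter never gets to reset anything
    holder_flagged = False
    waiter_flagged = False

    for now in range(1, hold_ticks + 1):
        # Only the holder benefits from touching the watchdog.
        if touch_every and now % touch_every == 0:
            holder_last_touch = now
        if now - holder_last_touch > timeout:
            holder_flagged = True
        # The waiter has been blocked on the lock the whole time.
        if now - waiter_blocked_since > timeout:
            waiter_flagged = True

    return holder_flagged, waiter_flagged


if __name__ == "__main__":
    # Holder touches its watchdog regularly but holds the lock too long.
    print(simulate(hold_ticks=300, timeout=120, touch_every=10))
```

In this model, touching the watchdog keeps the holder from being flagged, but the waiter is flagged anyway once the lock has been held longer than the timeout, which is the gap described above.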
>> serial8250_console_putchar+0x1f/0x60
>> drivers/tty/serial/8250/8250_port.c:3170
>> uart_console_write+0xac/0xe0 drivers/tty/serial/serial_core.c:1858
>> serial8250_console_write+0x647/0xa20
>> drivers/tty/serial/8250/8250_port.c:3236
>> univ8250_console_write+0x5f/0x70 drivers/tty/serial/8250/8250_core.c:590
>> call_console_drivers kernel/printk/printk.c:1574 [inline]
>> console_unlock+0x788/0xd70 kernel/printk/printk.c:2233
>> vprintk_emit+0x4ad/0x590 kernel/printk/printk.c:1757
>> vprintk_default+0x28/0x30 kernel/printk/printk.c:1796
>> vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
>> printk+0xaa/0xca kernel/printk/printk.c:1829
>> nla_parse+0x374/0x3d0 lib/nlattr.c:257
>> nlmsg_parse include/net/netlink.h:398 [inline]
>> nl80211_dump_wiphy_parse.isra.37.constprop.83+0x138/0x5c0
>> net/wireless/nl80211.c:1920
>> nl80211_dump_interface+0x596/0x820 net/wireless/nl80211.c:2660
>> genl_lock_dumpit+0x68/0x90 net/netlink/genetlink.c:480
>> netlink_dump+0x48c/0xce0 net/netlink/af_netlink.c:2186
>> __netlink_dump_start+0x4f0/0x6d0 net/netlink/af_netlink.c:2283
>> genl_family_rcv_msg+0xd27/0xfc0 net/netlink/genetlink.c:548
>> genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
>> netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2405
>> genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>> netlink_unicast_kernel net/netlink/af_netlink.c:1272 [inline]
>> netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1298
>> netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1861
>> sock_sendmsg_nosec net/socket.c:636 [inline]
>> sock_sendmsg+0xca/0x110 net/socket.c:646
>> sock_write_iter+0x320/0x5e0 net/socket.c:915
>> call_write_iter include/linux/fs.h:1776 [inline]
>> new_sync_write fs/read_write.c:469 [inline]
>> __vfs_write+0x68a/0x970 fs/read_write.c:482
>> vfs_write+0x18f/0x510 fs/read_write.c:544
>> SYSC_write fs/read_write.c:589 [inline]
>> SyS_write+0xef/0x220 fs/read_write.c:581
>> entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x4529d9
>> RSP:
002b:00007f6d52e3ec58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 00007f6d52e3f700 RCX: 00000000004529d9
>> RDX: 0000000000000024 RSI: 0000000020454000 RDI: 0000000000000016
>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>> R13: 0000000000a6f7ff R14: 00007f6d52e3f9c0 R15: 0000000000000000
>> Code: 24 d9 00 00 00 49 8d 7c 24 40 48 b8 00 00 00 00 00 fc ff df 48 89 fa
>> 48 c1 ea 03 d3 e3 80 3c 02 00 75 17 41 03 5c 24 40 89 da ec <5b> 0f b6 c0 41
>> 5c 5d c3 e8 38 b0 18 ff eb c2 e8 91 b0 18 ff eb
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkaller@googlegroups.com.
>> Please credit me with: Reported-by: syzbot
>>
>> syzbot will keep track of this bug report.
>> Once a fix for this bug is merged into any tree, reply to this email with:
>> #syz fix: exact-commit-title
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "syzkaller-bugs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to syzkaller-bugs+unsubscribe@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/syzkaller-bugs/001a1143fd00a8cc790560b0b552%40google.com.
>> For more options, visit https://groups.google.com/d/optout.
>
>
> This looks like +rtnetlink issue.