linux-usb.vger.kernel.org archive mirror
* INFO: task hung in wdm_flush
@ 2019-08-12 12:18 syzbot
  2019-11-19  9:14 ` Bjørn Mork
  0 siblings, 1 reply; 18+ messages in thread
From: syzbot @ 2019-08-12 12:18 UTC
  To: andreyknvl, baijiaju1990, bigeasy, colin.king, gregkh,
	linux-kernel, linux-usb, syzkaller-bugs, yuehaibing

Hello,

syzbot found the following crash on:

HEAD commit:    e96407b4 usb-fuzzer: main usb gadget fuzzer driver
git tree:       https://github.com/google/kasan.git usb-fuzzer
console output: https://syzkaller.appspot.com/x/log.txt?x=1046c6ee600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=cfa2c18fb6a8068e
dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1299132c600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=176e6d8c600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com

INFO: task syz-executor121:1726 blocked for more than 143 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28520  1726   1724 0x80004006
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  get_signal+0x466/0x23d0 kernel/signal.c:2728
  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x401520
Code: 6e 65 54 61 62 6c 65 00 67 65 74 63 6f 6e 00 5f 69 6e 69 74 00 69 73  
5f 73 65 6c 69 6e 75 78 5f 65 6e 61 62 6c 65 64 00 73 65 <63> 75 72 69 74  
79 5f 67 65 74 65 6e 66 6f 72 63 65 00 67 65 74 5f
RSP: 002b:00007ffd59c75df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000401520
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd59c75e10
RBP: 00000000006cc018 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000402540
R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor121:1731 blocked for more than 143 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28520  1731   1730 0x80004006
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  get_signal+0x466/0x23d0 kernel/signal.c:2728
  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4417e9
Code: 65 64 2e 0a 44 69 64 20 79 6f 75 20 64 6f 20 61 20 22 6d 61 6b 65 20  
69 6e 73 74 61 6c 6c 22 3f 0a 53 75 67 67 65 73 74 65 64 <20> 61 63 74 69  
6f 6e 3a 20 72 75 6e 20 72 73 79 73 6c 6f 67 64 20
RSP: 002b:00007ffd59c75ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00000000004417e9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000006cc018 R08: 000000000000000f R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402540
R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor121:1732 blocked for more than 143 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28520  1732   1728 0x80004006
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  get_signal+0x466/0x23d0 kernel/signal.c:2728
  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x401520
Code: 00 00 3d 02 00 00 46 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 10  
01 00 00 2f 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 20 01 00 00 00 00
RSP: 002b:00007ffd59c75df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000401520
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd59c75e10
RBP: 00000000006cc018 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000402540
R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor121:1733 blocked for more than 144 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28376  1733   1725 0x80000002
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  __do_sys_exit_group kernel/exit.c:993 [inline]
  __se_sys_exit_group kernel/exit.c:991 [inline]
  __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:991
  do_syscall_64+0xb7/0x580 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440438
Code: 61 74 68 3e 5d 0a 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 5b  
2d 75 3c 6e 75 6d 62 65 72 3e 5d 0a 54 6f 20 72 75 6e 20 <72> 73 79 73 6c  
6f 67 64 20 69 6e 20 6e 61 74 69 76 65 20 6d 6f 64
RSP: 002b:00007ffd59c75e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440438
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bff70 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d2180 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor121:1734 blocked for more than 144 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28248  1734   1729 0x80004006
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  get_signal+0x466/0x23d0 kernel/signal.c:2728
  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4417e9
Code: 65 64 2e 0a 44 69 64 20 79 6f 75 20 64 6f 20 61 20 22 6d 61 6b 65 20  
69 6e 73 74 61 6c 6c 22 3f 0a 53 75 67 67 65 73 74 65 64 <20> 61 63 74 69  
6f 6e 3a 20 72 75 6e 20 72 73 79 73 6c 6f 67 64 20
RSP: 002b:00007ffd59c75ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00000000004417e9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000006cc018 R08: 000000000000000f R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402540
R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor121:1736 blocked for more than 144 seconds.
       Not tainted 5.3.0-rc2+ #25
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor121 D28520  1736   1727 0x80004006
Call Trace:
  schedule+0x9a/0x250 kernel/sched/core.c:3944
  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
  filp_close+0xb4/0x160 fs/open.c:1166
  close_files fs/file.c:388 [inline]
  put_files_struct fs/file.c:416 [inline]
  put_files_struct+0x1d8/0x2e0 fs/file.c:413
  exit_files+0x7e/0xa0 fs/file.c:445
  do_exit+0x8bc/0x2c50 kernel/exit.c:873
  do_group_exit+0x125/0x340 kernel/exit.c:982
  get_signal+0x466/0x23d0 kernel/signal.c:2728
  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4417e9
Code: 65 64 2e 0a 44 69 64 20 79 6f 75 20 64 6f 20 61 20 22 6d 61 6b 65 20  
69 6e 73 74 61 6c 6c 22 3f 0a 53 75 67 67 65 73 74 65 64 <20> 61 63 74 69  
6f 6e 3a 20 72 75 6e 20 72 73 79 73 6c 6f 67 64 20
RSP: 002b:00007ffd59c75ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00000000004417e9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000006cc018 R08: 000000000000000f R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402540
R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000

Showing all locks held in the system:
1 lock held by khungtaskd/23:
  #0: 00000000743497a3 (rcu_read_lock){....}, at:  
debug_show_all_locks+0x53/0x269 kernel/locking/lockdep.c:5254
1 lock held by rsyslogd/1602:
  #0: 00000000988125b0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xe3/0x100  
fs/file.c:801
2 locks held by getty/1693:
  #0: 0000000047c29258 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 00000000527dfb3a (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1694:
  #0: 000000003a351c46 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 00000000d8d75c5b (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1695:
  #0: 00000000e15b15bf (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 000000004d294c18 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1696:
  #0: 0000000051d028a3 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 0000000038c23150 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1697:
  #0: 000000001b33f7ab (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 00000000f5955915 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1698:
  #0: 000000007ef217e0 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 00000000bc876517 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156
2 locks held by getty/1699:
  #0: 000000000ee3efd4 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:272
  #1: 000000006bc64f89 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x223/0x1ae0 drivers/tty/n_tty.c:2156

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 23 Comm: khungtaskd Not tainted 5.3.0-rc2+ #25
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  nmi_cpu_backtrace.cold+0x55/0x96 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x1b0/0x1c7 lib/nmi_backtrace.c:62
  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
  watchdog+0x9a4/0xe50 kernel/hung_task.c:289
  kthread+0x318/0x420 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt  
arch/x86/include/asm/irqflags.h:60 [inline]
NMI backtrace for cpu 1 skipped: idling at arch_safe_halt  
arch/x86/include/asm/irqflags.h:103 [inline]
NMI backtrace for cpu 1 skipped: idling at default_idle+0x28/0x2e0  
arch/x86/kernel/process.c:580


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: INFO: task hung in wdm_flush
  2019-08-12 12:18 INFO: task hung in wdm_flush syzbot
@ 2019-11-19  9:14 ` Bjørn Mork
  2019-11-19 10:31   ` Oliver Neukum
  0 siblings, 1 reply; 18+ messages in thread
From: Bjørn Mork @ 2019-11-19  9:14 UTC
  To: syzbot
  Cc: andreyknvl, baijiaju1990, bigeasy, colin.king, gregkh,
	linux-kernel, linux-usb, syzkaller-bugs, yuehaibing

syzbot <syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com> writes:

> INFO: task syz-executor121:1726 blocked for more than 143 seconds.
>       Not tainted 5.3.0-rc2+ #25
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor121 D28520  1726   1724 0x80004006
> Call Trace:
>  schedule+0x9a/0x250 kernel/sched/core.c:3944
>  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
>  filp_close+0xb4/0x160 fs/open.c:1166
>  close_files fs/file.c:388 [inline]
>  put_files_struct fs/file.c:416 [inline]
>  put_files_struct+0x1d8/0x2e0 fs/file.c:413
>  exit_files+0x7e/0xa0 fs/file.c:445
>  do_exit+0x8bc/0x2c50 kernel/exit.c:873
>  do_group_exit+0x125/0x340 kernel/exit.c:982
>  get_signal+0x466/0x23d0 kernel/signal.c:2728
>  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
>  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
>  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x401520
> Code: 6e 65 54 61 62 6c 65 00 67 65 74 63 6f 6e 00 5f 69 6e 69 74 00
> 69 73 5f 73 65 6c 69 6e 75 78 5f 65 6e 61 62 6c 65 64 00 73 65 <63> 75
> 72 69 74 79 5f 67 65 74 65 6e 66 6f 72 63 65 00 67 65 74 5f
> RSP: 002b:00007ffd59c75df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
> RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000401520
> RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd59c75e10
> RBP: 00000000006cc018 R08: 0000000000000000 R09: 000000000000000f
> R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000402540
> R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000


Thanks to Eric for reminding me of this one.  I did look briefly at it
before, and meant to revisit it for a more thorough analysis.  And
forgot, of course...

Anyway, I believe this is not a bug.

wdm_flush will wait forever for the IN_USE flag to be cleared or the
DISCONNECTING flag to be set. The only way you can avoid this is by
creating a device that works normally up to a point and then completely
ignores all messages, but without resetting or disconnecting. It is
obviously possible to create such a device. But I think the current
error handling is more than sufficient, unless you show me some way to
abuse this or reproduce the issue with a real device.
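
A minimal userspace sketch of the wait described above, not the driver
code. It assumes a model in which flush blocks until an IN_USE flag is
cleared or a DISCONNECTING flag is set. The class name `WdmModel`, its
method names and the `timeout` parameter are hypothetical and exist only
so that the example terminates; per the description above, the driver's
wait has no timeout.

```python
import threading


class WdmModel:
    """Toy model of the two flags discussed in this thread (not kernel code)."""

    def __init__(self):
        self.in_use = False          # models the IN_USE flag
        self.disconnecting = False   # models the DISCONNECTING flag
        self._cond = threading.Condition()

    def submit_write(self):
        # A write marks the device busy until its completion is reported.
        with self._cond:
            self.in_use = True

    def flush(self, timeout=None):
        # Block until IN_USE is cleared or DISCONNECTING is set.
        # timeout=None models the unbounded wait; a number bounds it
        # (hypothetical, only so this example can finish).
        with self._cond:
            return self._cond.wait_for(
                lambda: not self.in_use or self.disconnecting,
                timeout=timeout,
            )

    def disconnect(self):
        # Unplugging sets DISCONNECTING and wakes any sleeping waiter.
        with self._cond:
            self.disconnecting = True
            self._cond.notify_all()


dev = WdmModel()
dev.submit_write()                    # the device never completes this write
stuck = not dev.flush(timeout=0.2)    # neither flag changes, so the wait expires
dev.disconnect()                      # unplugging sets DISCONNECTING
released = dev.flush(timeout=0.2)     # the wait condition now holds
print(stuck, released)
```

In this model the only exits from `flush` are a completion that clears
IN_USE or a disconnect, which is why a device that stays attached but
silent leaves the caller blocked.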

Just disconnect the malfunctioning device and throw it away.


Bjørn


* Re: INFO: task hung in wdm_flush
  2019-11-19  9:14 ` Bjørn Mork
@ 2019-11-19 10:31   ` Oliver Neukum
  2019-11-19 11:34     ` Bjørn Mork
  0 siblings, 1 reply; 18+ messages in thread
From: Oliver Neukum @ 2019-11-19 10:31 UTC
  To: Bjørn Mork, syzbot
  Cc: andreyknvl, baijiaju1990, bigeasy, colin.king, gregkh,
	linux-kernel, linux-usb, syzkaller-bugs, yuehaibing

Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
> syzbot <syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com> writes:
> 
> > INFO: task syz-executor121:1726 blocked for more than 143 seconds.
> >       Not tainted 5.3.0-rc2+ #25
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor121 D28520  1726   1724 0x80004006
> > Call Trace:
> >  schedule+0x9a/0x250 kernel/sched/core.c:3944
> >  wdm_flush+0x20c/0x370 drivers/usb/class/cdc-wdm.c:590
> >  filp_close+0xb4/0x160 fs/open.c:1166
> >  close_files fs/file.c:388 [inline]
> >  put_files_struct fs/file.c:416 [inline]
> >  put_files_struct+0x1d8/0x2e0 fs/file.c:413
> >  exit_files+0x7e/0xa0 fs/file.c:445
> >  do_exit+0x8bc/0x2c50 kernel/exit.c:873
> >  do_group_exit+0x125/0x340 kernel/exit.c:982
> >  get_signal+0x466/0x23d0 kernel/signal.c:2728
> >  do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
> >  exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
> >  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> >  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> >  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x401520
> > Code: 6e 65 54 61 62 6c 65 00 67 65 74 63 6f 6e 00 5f 69 6e 69 74 00
> > 69 73 5f 73 65 6c 69 6e 75 78 5f 65 6e 61 62 6c 65 64 00 73 65 <63> 75
> > 72 69 74 79 5f 67 65 74 65 6e 66 6f 72 63 65 00 67 65 74 5f
> > RSP: 002b:00007ffd59c75df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
> > RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000401520
> > RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd59c75e10
> > RBP: 00000000006cc018 R08: 0000000000000000 R09: 000000000000000f
> > R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000402540
> > R13: 00000000004025d0 R14: 0000000000000000 R15: 0000000000000000
> 
> 
> Thanks to Eric for reminding me of this one.  I did look briefly at it
> before, and meant to revisit it for a more thorough analysis.  And
> forgot, of course...
> 
> Anyway, I believe this is not a bug.
> 
> wdm_flush will wait forever for the IN_USE flag to be cleared or the

Damn. Too obvious. So you think we simply have pending output that does
just not complete?

> DISCONNECTING flag to be set. The only way you can avoid this is by
> creating a device that works normally up to a point and then completely
> ignores all messages,

Devices may crash. I don't think we can ignore that case.

>  but without resetting or disconnecting. It is
> obviously possible to create such a device. But I think the current
> error handling is more than sufficient, unless you show me some way to
> abuse this or reproduce the issue with a real device.

Malicious devices are real. Potentially at least.
But you are right, we need not bend over to handle them well, but we
ought to be able to handle them.

	Regards
		Oliver



* Re: INFO: task hung in wdm_flush
  2019-11-19 10:31   ` Oliver Neukum
@ 2019-11-19 11:34     ` Bjørn Mork
  2019-11-23  6:52       ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Bjørn Mork @ 2019-11-19 11:34 UTC
  To: Oliver Neukum
  Cc: syzbot, andreyknvl, baijiaju1990, bigeasy, colin.king, gregkh,
	linux-kernel, linux-usb, syzkaller-bugs, yuehaibing

Oliver Neukum <oneukum@suse.de> writes:
> Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
>
>> Anyway, I believe this is not a bug.
>> 
>> wdm_flush will wait forever for the IN_USE flag to be cleared or the
>
> Damn. Too obvious. So you think we simply have pending output that does
> just not complete?

I do miss a lot of stuff so I might be wrong, but I can't see any other
way this can happen.  The out_callback will unconditionally clear the
IN_USE flag and wake up the wait_queue.
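
A hedged userspace sketch of that completion path, not the driver's code.
`out_callback` is a stand-in named after the callback mentioned above;
the `Device` class, `wait_until_idle` and the timer delay are assumptions
made for illustration. It shows that a waiter returns only once the
callback runs, and that nothing else clears the flag if the completion
never arrives.

```python
import threading


class Device:
    """Toy stand-in for the driver state (hypothetical, not kernel code)."""

    def __init__(self):
        self.in_use = True                 # a write is pending
        self.wait_queue = threading.Condition()

    def out_callback(self):
        # Models the completion handler: clear IN_USE unconditionally,
        # then wake everyone sleeping on the wait queue.
        with self.wait_queue:
            self.in_use = False
            self.wait_queue.notify_all()

    def wait_until_idle(self, timeout):
        # Returns True if IN_USE was cleared before the timeout expired.
        with self.wait_queue:
            return self.wait_queue.wait_for(lambda: not self.in_use, timeout)


# Case 1: the completion arrives, so the waiter is woken.
responsive = Device()
threading.Timer(0.05, responsive.out_callback).start()
woken = responsive.wait_until_idle(timeout=2.0)

# Case 2: the device ignores the request and the callback never runs.
silent = Device()
still_waiting = not silent.wait_until_idle(timeout=0.2)

print(woken, still_waiting)
```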

>> DISCONNECTING flag to be set. The only way you can avoid this is by
>> creating a device that works normally up to a point and then completely
>> ignores all messages,
>
> Devices may crash. I don't think we can ignore that case.

Sure, but I've never seen that happen without the device falling off the
bus.  Which is a disconnect.

But I am all for handling this *if* someone reproduces it with a real
device.  I just don't think it's worth the effort if it's only a
theoretical problem.

>>  but without resetting or disconnecting. It is
>> obviously possible to create such a device. But I think the current
>> error handling is more than sufficient, unless you show me some way to
>> abuse this or reproduce the issue with a real device.
>
> Malicious devices are real. Potentially at least.
> But you are right, we need not bend over to handle them well, but we
> ought to be able to handle them.

Sure, we need to handle malicious devices.  But only if they can be used
for real harm.

This warning requires physical access and is only slightly annoying.
Like a USB device making loud farting sounds.  You'd just disconnect the
device.  No need for Linux to detect the sound and handle it
automatically, I think.


Bjørn


* Re: INFO: task hung in wdm_flush
  2019-11-19 11:34     ` Bjørn Mork
@ 2019-11-23  6:52       ` Dmitry Vyukov
  2020-02-10 10:06         ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2019-11-23  6:52 UTC
  To: Bjørn Mork
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing

On Tue, Nov 19, 2019 at 12:34 PM Bjørn Mork <bjorn@mork.no> wrote:
>
> Oliver Neukum <oneukum@suse.de> writes:
> > Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
> >
> >> Anyway, I believe this is not a bug.
> >>
> >> wdm_flush will wait forever for the IN_USE flag to be cleared or the
> >
> > Damn. Too obvious. So you think we simply have pending output that does
> > just not complete?
>
> I do miss a lot of stuff so I might be wrong, but I can't see any other
> way this can happen.  The out_callback will unconditionally clear the
> IN_USE flag and wake up the wait_queue.
>
> >> DISCONNECTING flag to be set. The only way you can avoid this is by
> >> creating a device that works normally up to a point and then completely
> >> ignores all messages,
> >
> > Devices may crash. I don't think we can ignore that case.
>
> Sure, but I've never seen that happen without the device falling off the
> bus.  Which is a disconnect.
>
> But I am all for handling this *if* someone reproduces it with a real
> device.  I just don't think it's worth the effort if it's only a
> theoretical problem.
>
> >>  but without resetting or disconnecting. It is
> >> obviously possible to create such a device. But I think the current
> >> error handling is more than sufficient, unless you show me some way to
> >> abuse this or reproduce the issue with a real device.
> >
> > Malicious devices are real. Potentially at least.
> > But you are right, we need not bend over to handle them well, but we
> > ought to be able to handle them.
>
> Sure, we need to handle malicious devices.  But only if they can be used
> for real harm.
>
> This warning requires physical access and is only slightly annoying.
> Like a USB device making loud farting sounds.  You'd just disconnect the
> device.  No need for Linux to detect the sound and handle it
> automatically, I think.

Hi Bjørn,

Besides the production use you are referring to, there are 2 cases we
should take into account as well:
1. Testing.
Any kernel testing system needs a binary criterion for detecting kernel
bugs. It seems right to detect unkillable hung tasks as kernel bugs.
Which means that we need to resolve this in some way regardless of the
production scenario.
2. Reliable killing of processes.
It's a very important property that an admin or script can reliably
kill whatever process/container they need to kill for whatever reason.
This case results in an unkillable process, which means scripts will
fail, automated systems will misbehave, admins will waste time (if
they are qualified to resolve this at all).
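
One way to picture the binary criterion from point 1 is a check over the
console log for the hung-task message format seen in the report at the
top of this thread. This is an illustrative sketch only: the function
name `find_hung_tasks` and the regular expression are assumptions, and
this is not a description of how syzbot itself is implemented.

```python
import re

# Matches lines such as:
#   INFO: task syz-executor121:1726 blocked for more than 143 seconds.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than "
    r"(?P<secs>\d+) seconds\."
)


def find_hung_tasks(console_log):
    """Return (comm, pid, seconds) for every hung-task line in the log."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(console_log)
    ]


sample = """\
INFO: task syz-executor121:1726 blocked for more than 143 seconds.
       Not tainted 5.3.0-rc2+ #25
INFO: task syz-executor121:1733 blocked for more than 144 seconds.
"""

hung = find_hung_tasks(sample)
is_bug = bool(hung)   # binary verdict: any hung task counts as a failure
print(hung, is_bug)
```

A check of this kind cannot tell a misbehaving device from a driver
defect, which is why the report still has to be resolved one way or the
other.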


* Re: INFO: task hung in wdm_flush
  2019-11-23  6:52       ` Dmitry Vyukov
@ 2020-02-10 10:06         ` Dmitry Vyukov
  2020-02-10 10:09           ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-10 10:06 UTC
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On Sat, Nov 23, 2019 at 7:52 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Nov 19, 2019 at 12:34 PM Bjørn Mork <bjorn@mork.no> wrote:
> >
> > Oliver Neukum <oneukum@suse.de> writes:
> > > Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
> > >
> > >> Anyway, I believe this is not a bug.
> > >>
> > >> wdm_flush will wait forever for the IN_USE flag to be cleared or the
> > >
> > > Damn. Too obvious. So you think we simply have pending output that does
> > > just not complete?
> >
> > I do miss a lot of stuff so I might be wrong, but I can't see any other
> > way this can happen.  The out_callback will unconditionally clear the
> > IN_USE flag and wake up the wait_queue.
> >
> > >> DISCONNECTING flag to be set. The only way you can avoid this is by
> > >> creating a device that works normally up to a point and then completely
> > >> ignores all messages,
> > >
> > > Devices may crash. I don't think we can ignore that case.
> >
> > Sure, but I've never seen that happen without the device falling off the
> > bus.  Which is a disconnect.
> >
> > But I am all for handling this *if* someone reproduces it with a real
> > device.  I just don't think it's worth the effort if it's only a
> > theoretical problem.
> >
> > >>  but without resetting or disconnecting. It is
> > >> obviously possible to create such a device. But I think the current
> > >> error handling is more than sufficient, unless you show me some way to
> > >> abuse this or reproduce the issue with a real device.
> > >
> > > Malicious devices are real. Potentially at least.
> > > But you are right, we need not bend over to handle them well, but we
> > > ought to be able to handle them.
> >
> > Sure, we need to handle malicious devices.  But only if they can be used
> > for real harm.
> >
> > This warning requires physical access and is only slightly annoying.
> > Like a USB device making loud farting sounds.  You'd just disconnect the
> > device.  No need for Linux to detect the sound and handle it
> > automatically, I think.
>
> Hi Bjørn,
>
> Besides the production use you are referring to, there are 2 cases we
> should take into account as well:
> 1. Testing.
> Any kernel testing system needs a binary criterion for detecting kernel
> bugs. It seems right to detect unkillable hung tasks as kernel bugs.
> Which means that we need to resolve this in some way regardless of the
> production scenario.
> 2. Reliable killing of processes.
> It's a very important property that an admin or script can reliably
> kill whatever process/container they need to kill for whatever reason.
> This case results in an unkillable process, which means scripts will
> fail, automated systems will misbehave, admins will waste time (if
> they are qualified to resolve this at all).

On Mon, Feb 10, 2020 at 11:00 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello.
>
> Will you check whether patch testing is working? I tried
>
>   #syz test: https://github.com/google/kasan.git usb-fuzzer
>
> but the reproducer did not trigger a crash either "with a patch"
> or "without a patch", even though the dashboard is still adding crashes.
> I suspect something is wrong. Is it possible that the reproducer is
> trying to test a bug which was already fixed, while a different new
> bug is still reported as the same bug?


* Re: INFO: task hung in wdm_flush
  2020-02-10 10:06         ` Dmitry Vyukov
@ 2020-02-10 10:09           ` Dmitry Vyukov
  2020-02-10 12:46             ` Tetsuo Handa
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-10 10:09 UTC
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On Mon, Feb 10, 2020 at 11:06 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > Oliver Neukum <oneukum@suse.de> writes:
> > > > Am Dienstag, den 19.11.2019, 10:14 +0100 schrieb Bjørn Mork:
> > > >
> > > >> Anyway, I believe this is not a bug.
> > > >>
> > > >> wdm_flush will wait forever for the IN_USE flag to be cleared or the
> > > >
> > > > Damn. Too obvious. So you think we simply have pending output that does
> > > > just not complete?
> > >
> > > I do miss a lot of stuff so I might be wrong, but I can't see any other
> > > way this can happen.  The out_callback will unconditionally clear the
> > > IN_USE flag and wake up the wait_queue.
> > >
> > > >> DISCONNECTING flag to be set. The only way you can avoid this is by
> > > >> creating a device that works normally up to a point and then completely
> > > >> ignores all messages,
> > > >
> > > > Devices may crash. I don't think we can ignore that case.
> > >
> > > Sure, but I've never seen that happen without the device falling off the
> > > bus.  Which is a disconnect.
> > >
> > > But I am all for handling this *if* someone reproduces it with a real
> > > device.  I just don't think it's worth the effort if it's only a
> > > theoretical problem.
> > >
> > > >>  but without resetting or disconnecting. It is
> > > >> obviously possible to create such a device. But I think the current
> > > >> error handling is more than sufficient, unless you show me some way to
> > > >> abuse this or reproduce the issue with a real device.
> > > >
> > > > Malicious devices are real. Potentially at least.
> > > > But you are right, we need not bend over to handle them well, but we
> > > > ought to be able to handle them.
> > >
> > > Sure, we need to handle malicious devices.  But only if they can be used
> > > for real harm.
> > >
> > > This warning requires physical access and is only slightly annoying.
> > > Like a USB device making loud farting sounds.  You'd just disconnect the
> > > device.  No need for Linux to detect the sound and handle it
> > > automatically, I think.
> >
> > Hi Bjørn,
> >
> > Besides the production use you are referring to, there are 2 cases we
> > should take into account as well:
> > 1. Testing.
> > Any kernel testing system needs a binary criterion for detecting kernel
> > bugs. It seems right to detect unkillable hung tasks as kernel bugs.
> > Which means that we need to resolve this in some way regardless of the
> > production scenario.
> > 2. Reliable killing of processes.
> > It's a very important property that an admin or script can reliably
> > kill whatever process/container they need to kill for whatever reason.
> > This case results in an unkillable process, which means scripts will
> > fail, automated systems will misbehave, admins will waste time (if
> > they are qualified to resolve this at all).
>
> On Mon, Feb 10, 2020 at 11:00 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > Hello.
> >
> > Will you check whether patch testing is working? I tried
> >
> >   #syz test: https://github.com/google/kasan.git usb-fuzzer
> >
> > but the reproducer did not trigger crash for both "with a patch"
> > and "without a patch", despite dashboard is still adding crashes.
> > I suspect something is wrong. Is it possible that reproducer is
> > trying to test a bug which was already fixed but a different new
> > bug is still reported as the same bug?

Hi Tetsuo,

The simplest and fastest thing you can try is to request testing on
another, simpler bug. I have not seen any other signals suggesting that
patch testing in general is somehow broken.

You may also try the exact commit on which the bug was reported, because
usb-fuzzer is a tracking branch and things may change there.

If the old bug was fixed but syzbot is not aware of it, then new bugs
piling into the same bucket is exactly what will happen. So that's
definitely possible.


* Re: INFO: task hung in wdm_flush
  2020-02-10 10:09           ` Dmitry Vyukov
@ 2020-02-10 12:46             ` Tetsuo Handa
  2020-02-10 15:04               ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2020-02-10 12:46 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On 2020/02/10 19:09, Dmitry Vyukov wrote:
> You may also try on the exact commit the bug was reported, because
> usb-fuzzer is tracking branch, things may change there.

OK. I explicitly tried

  #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23

but syzbot still cannot reproduce this bug using the reproducer...

On 2020/02/10 21:02, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Note: testing is done by a robot and is best-effort only.
> 

Anyway, I suspect that we forget to wake up all waiters after clearing the
WDM_IN_USE bit, because sometimes multiple threads are reported as hung.

On 2020/02/10 15:27, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=117c3ae9e00000
> 
> Note: testing is done by a robot and is best-effort only.
> 

On 2020/02/10 15:55, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=13b3f6e9e00000
> 
> Note: testing is done by a robot and is best-effort only.
> 

On 2020/02/10 16:21, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=115026b5e00000
> 
> Note: testing is done by a robot and is best-effort only.
> 

On 2020/02/10 16:44, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=17285431e00000
> 
> Note: testing is done by a robot and is best-effort only.
> 

On 2020/02/10 17:05, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger crash:
> 
> Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Note: testing is done by a robot and is best-effort only.
> 



* Re: INFO: task hung in wdm_flush
  2020-02-10 12:46             ` Tetsuo Handa
@ 2020-02-10 15:04               ` Dmitry Vyukov
  2020-02-10 15:06                 ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-10 15:04 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On Mon, Feb 10, 2020 at 1:46 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/02/10 19:09, Dmitry Vyukov wrote:
> > You may also try on the exact commit the bug was reported, because
> > usb-fuzzer is tracking branch, things may change there.
>
> OK. I explicitly tried
>
>   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
>
> but syzbot still cannot reproduce this bug using the reproducer...
>
> On 2020/02/10 21:02, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
> Anyway, I'm just suspecting that we are forgetting to wake up all waiters
> after clearing WDM_IN_USE bit because sometimes multiple threads are reported
> as hung.
>
> On 2020/02/10 15:27, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=117c3ae9e00000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
> On 2020/02/10 15:55, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=13b3f6e9e00000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
> On 2020/02/10 16:21, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=115026b5e00000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
> On 2020/02/10 16:44, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=17285431e00000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
> On 2020/02/10 17:05, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger crash:
> >
> > Reported-and-tested-by: syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         e5cd56e9 usb: gadget: add raw-gadget interface
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162
> > dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Note: testing is done by a robot and is best-effort only.



On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/02/10 21:46, Tetsuo Handa wrote:
> > On 2020/02/10 19:09, Dmitry Vyukov wrote:
> >> You may also try on the exact commit the bug was reported, because
> >> usb-fuzzer is tracking branch, things may change there.
> >
> > OK. I explicitly tried
> >
> >   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
> >
> > but syzbot still cannot reproduce this bug using the reproducer...
>
> It seems that there is non-trivial difference between kernel config in dashboard
> and kernel config in "syz test:" mails. Maybe that's the cause...


* Re: INFO: task hung in wdm_flush
  2020-02-10 15:04               ` Dmitry Vyukov
@ 2020-02-10 15:06                 ` Dmitry Vyukov
  2020-02-10 15:21                   ` Tetsuo Handa
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-10 15:06 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > On 2020/02/10 21:46, Tetsuo Handa wrote:
> > > On 2020/02/10 19:09, Dmitry Vyukov wrote:
> > >> You may also try on the exact commit the bug was reported, because
> > >> usb-fuzzer is tracking branch, things may change there.
> > >
> > > OK. I explicitly tried
> > >
> > >   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
> > >
> > > but syzbot still cannot reproduce this bug using the reproducer...
> >
> > It seems that there is non-trivial difference between kernel config in dashboard
> > and kernel config in "syz test:" mails. Maybe that's the cause...


syzkaller runs oldconfig when building any kernels:
https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
Is that difference what oldconfig produces?


* Re: INFO: task hung in wdm_flush
  2020-02-10 15:06                 ` Dmitry Vyukov
@ 2020-02-10 15:21                   ` Tetsuo Handa
  2020-02-11 13:55                     ` Tetsuo Handa
  2020-02-11 14:01                     ` Dmitry Vyukov
  0 siblings, 2 replies; 18+ messages in thread
From: Tetsuo Handa @ 2020-02-10 15:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On 2020/02/11 0:06, Dmitry Vyukov wrote:
>> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>
>>> On 2020/02/10 21:46, Tetsuo Handa wrote:
>>>> On 2020/02/10 19:09, Dmitry Vyukov wrote:
>>>>> You may also try on the exact commit the bug was reported, because
>>>>> usb-fuzzer is tracking branch, things may change there.
>>>>
>>>> OK. I explicitly tried
>>>>
>>>>   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
>>>>
>>>> but syzbot still cannot reproduce this bug using the reproducer...
>>>
>>> It seems that there is non-trivial difference between kernel config in dashboard
>>> and kernel config in "syz test:" mails. Maybe that's the cause...
> 
> 
> syzkaller runs oldconfig when building any kernels:
> https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
> Is that difference what oldconfig produces?
> 

Here is the diff (with "#" lines excluded) between dashboard and "syz test:" mails.
I feel this difference is bigger than what simple oldconfig would cause.

$ curl 'https://syzkaller.appspot.com/text?tag=KernelConfig&x=8cff427cc8996115' | sort > dashboard
$ curl 'https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162' | sort > syz-test
$ diff -u dashboard syz-test | grep -vF '#' | grep '^[+-]'
--- dashboard   2020-02-11 00:19:14.793977153 +0900
+++ syz-test    2020-02-11 00:19:15.659977108 +0900
-CONFIG_BLK_DEV_LOOP_MIN_COUNT=16
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
-CONFIG_BUG_ON_DATA_CORRUPTION=y
-CONFIG_DEBUG_CREDENTIALS=y
-CONFIG_DEBUG_PER_CPU_MAPS=y
-CONFIG_DEBUG_PLIST=y
-CONFIG_DEBUG_SG=y
-CONFIG_DEBUG_VIRTUAL=y
+CONFIG_DEVMEM=y
+CONFIG_DEVPORT=y
+CONFIG_DMA_OF=y
-CONFIG_DYNAMIC_DEBUG=y
-CONFIG_DYNAMIC_MEMORY_LAYOUT=y
+CONFIG_HID_REDRAGON=y
+CONFIG_IRQCHIP=y
-CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
+CONFIG_LSM="yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
-CONFIG_MAC80211_HWSIM=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_SERIAL=y
+CONFIG_NET_TC_SKB_EXT=y
+CONFIG_OF=y
+CONFIG_OF_ADDRESS=y
+CONFIG_OF_GPIO=y
+CONFIG_OF_IOMMU=y
+CONFIG_OF_IRQ=y
+CONFIG_OF_KOBJ=y
+CONFIG_OF_MDIO=y
+CONFIG_OF_NET=y
-CONFIG_PGTABLE_LEVELS=5
+CONFIG_PGTABLE_LEVELS=4
+CONFIG_PWRSEQ_EMMC=y
+CONFIG_PWRSEQ_SIMPLE=y
+CONFIG_RTLWIFI_DEBUG=y
-CONFIG_SECURITYFS=y
+CONFIG_STRICT_DEVMEM=y
+CONFIG_THERMAL_OF=y
+CONFIG_USB_CHIPIDEA_OF=y
+CONFIG_USB_DWC3_OF_SIMPLE=y
-CONFIG_USB_RAW_GADGET=y
+CONFIG_USB_SNP_UDC_PLAT=y
-CONFIG_VIRTIO_BLK_SCSI=y
-CONFIG_VIRT_WIFI=y
-CONFIG_X86_5LEVEL=y
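
The comparison above can also be sketched in Python, for anyone who wants to
post-process the result. This is only a rough equivalent of the
curl/sort/diff pipeline: it assumes both configs are already available as
text, it drops comment lines such as "# CONFIG_FOO is not set", and the names
parse_config() and config_diff() as well as the short sample strings are made
up for illustration (they are not part of syzkaller).

```python
def parse_config(text):
    """Return {option: value} for each NAME=value line, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '# CONFIG_FOO is not set' are ignored
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def config_diff(old_text, new_text):
    """List '-NAME=old' / '+NAME=new' entries for options that differ."""
    old, new = parse_config(old_text), parse_config(new_text)
    out = []
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) == new.get(name):
            continue
        if name in old:
            out.append(f"-{name}={old[name]}")
        if name in new:
            out.append(f"+{name}={new[name]}")
    return out


# Illustrative excerpts only, not the real configs.
dashboard = """\
CONFIG_PGTABLE_LEVELS=5
CONFIG_USB_RAW_GADGET=y
# CONFIG_MAGIC_SYSRQ is not set
CONFIG_USB=y
"""
syz_test = """\
CONFIG_PGTABLE_LEVELS=4
# CONFIG_USB_RAW_GADGET is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_USB=y
"""

for entry in config_diff(dashboard, syz_test):
    print(entry)
```

Options that are identical in both files (CONFIG_USB in the sample) produce no
output, so only real differences remain to be inspected.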


* Re: INFO: task hung in wdm_flush
  2020-02-10 15:21                   ` Tetsuo Handa
@ 2020-02-11 13:55                     ` Tetsuo Handa
  2020-02-11 14:11                       ` Dmitry Vyukov
  2020-02-11 14:01                     ` Dmitry Vyukov
  1 sibling, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2020-02-11 13:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On 2020/02/11 0:21, Tetsuo Handa wrote:
> On 2020/02/11 0:06, Dmitry Vyukov wrote:
>>> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
>>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>
>>>> On 2020/02/10 21:46, Tetsuo Handa wrote:
>>>>> On 2020/02/10 19:09, Dmitry Vyukov wrote:
>>>>>> You may also try on the exact commit the bug was reported, because
>>>>>> usb-fuzzer is tracking branch, things may change there.
>>>>>
>>>>> OK. I explicitly tried
>>>>>
>>>>>   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
>>>>>
>>>>> but syzbot still cannot reproduce this bug using the reproducer...
>>>>
>>>> It seems that there is non-trivial difference between kernel config in dashboard
>>>> and kernel config in "syz test:" mails. Maybe that's the cause...
>>
>>
>> syzkaller runs oldconfig when building any kernels:
>> https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
>> Is that difference what oldconfig produces?
>>
> 
> Here is the diff (with "#" lines excluded) between dashboard and "syz test:" mails.
> I feel this difference is bigger than what simple oldconfig would cause.
> 

I explicitly tried a commit as of the first report (instead of the latest report)

  #syz test: https://github.com/google/kasan.git e96407b497622d03f088bcf17d2c8c5a1ab066c8

and syzbot reproduced this bug using the reproducer. Therefore, it seems that the
kernel config used for "syz test:" was inappropriate, but "syz test:" failed to detect
that. Since there might be changes which fixed different bugs (and in order to confirm that
the proposed patch applies cleanly to the current kernel without causing other problems), I guess
that people tend to test using the latest commit (instead of the commit as of the first report).

I suggest that "syz test:" retest without the proposed patch when a run with the proposed patch
does not reproduce the bug. If retesting without the proposed patch does not reproduce the bug
either, we can figure out that something is wrong (maybe the bug is difficult to reproduce, maybe
the bug was already fixed, maybe the kernel config was inappropriate, maybe something else).



Regarding the bug for this report, debug printk() reported that WDM_IN_USE was not cleared
for some reason. While we need to investigate why WDM_IN_USE was not cleared, I guess that
wdm_write() should clear WDM_IN_USE upon error
( https://syzkaller.appspot.com/x/patch.diff?x=17ec7ee9e00000 ) so that we will surely
wake up somebody potentially waiting on WDM_IN_USE.
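
The idea in the paragraph above can be illustrated with a small userspace
model. This is only a sketch in Python using threading.Condition as a
stand-in for the wait queue; the class WdmModel, its methods and the
clear_on_error switch are hypothetical names, and it does not reproduce the
actual control flow of wdm_write() or wdm_flush().

```python
import threading


class WdmModel:
    """Toy stand-in for the driver state: one 'in use' flag plus a wait queue."""

    def __init__(self, clear_on_error):
        self.clear_on_error = clear_on_error
        self.in_use = False
        self.cond = threading.Condition()

    def write(self, submit_ok):
        """Mark the device busy, then pretend to submit a request."""
        with self.cond:
            self.in_use = True
        if submit_ok:
            return 0  # a completion callback would clear the flag later
        if self.clear_on_error:
            with self.cond:
                self.in_use = False
                self.cond.notify_all()  # wake anyone waiting in flush()
        return -1

    def flush(self, timeout):
        """Wait until the flag is clear; False means we gave up waiting."""
        with self.cond:
            return self.cond.wait_for(lambda: not self.in_use, timeout=timeout)


for fixed in (False, True):
    dev = WdmModel(clear_on_error=fixed)
    dev.write(submit_ok=False)
    print("clear_on_error =", fixed, "-> flush returned:", dev.flush(timeout=0.2))
```

In the model, a failed write that leaves the flag set makes flush() run into
its timeout, while clearing the flag and notifying on the error path lets
flush() return at once. The timeout exists only so that the model terminates.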

[   38.587596][ T2807] wdm_flush: file=ffff8881d488bb80 flags=2
[   40.214039][ T2807] wdm_flush: file=ffff8881d63fb400 flags=2
[   40.304390][ T2842] wdm_flush: file=ffff8881d5e22500 flags=0
[   40.371742][ T2869] wdm_flush: file=ffff8881d4964c80 flags=0
[   40.429954][ T2844] wdm_flush: file=ffff8881d5937b80 flags=0
[   40.461538][ T2858] wdm_flush: file=ffff8881d488b400 flags=0
[   40.464909][ T2863] wdm_flush: file=ffff8881d488ea00 flags=0
[   41.576761][ T2896] wdm_flush: file=ffff8881d43dea00 flags=2
[   41.949941][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
[   43.760828][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
[   43.857364][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
[   43.857501][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
[   43.866560][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
[   43.876210][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
[   72.308895][ T2909] INFO: task syz-executor.0:2909 blocked for more than 30 seconds.
[   72.316860][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
[   74.228916][ T2906] INFO: task syz-executor.1:2906 blocked for more than 30 seconds.
[   74.228921][ T2911] INFO: task syz-executor.3:2911 blocked for more than 30 seconds.
[   74.228935][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
[   74.236949][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
[   74.236991][ T2904] INFO: task syz-executor.4:2904 blocked for more than 30 seconds.
[   74.245459][ T2897] INFO: task syz-executor.2:2897 blocked for more than 30 seconds.
[   74.251305][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
[   74.257129][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
[   74.257951][ T2899] INFO: task syz-executor.5:2899 blocked for more than 30 seconds.
[   74.294465][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
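
To correlate the hung-task messages with the flags values in debug output like
the above, a short Python sketch may help. It assumes the two line formats
shown in this mail, that flags is printed in decimal, and that WDM_IN_USE
corresponds to bit 1 (so that flags=2 means "in use"); the bit position is an
assumption and should be checked against drivers/usb/class/cdc-wdm.c. The
function name hung_with_in_use() is made up for illustration.

```python
import re

# Assumption: WDM_IN_USE is bit index 1, i.e. mask 0x2.
WDM_IN_USE_BIT = 1

FLUSH_RE = re.compile(r"\[\s*T(\d+)\] wdm_flush: file=\w+ flags=(\d+)")
HUNG_RE = re.compile(r"INFO: task \S+:(\d+) blocked for more than")


def hung_with_in_use(log):
    """Map each hung task ID to whether its last seen flags had IN_USE set."""
    last_flags = {}
    result = {}
    for line in log.splitlines():
        m = FLUSH_RE.search(line)
        if m:
            last_flags[int(m.group(1))] = int(m.group(2))
            continue
        m = HUNG_RE.search(line)
        if m:
            tid = int(m.group(1))
            flags = last_flags.get(tid, 0)
            result[tid] = bool(flags & (1 << WDM_IN_USE_BIT))
    return result


# A few lines copied from the output above.
sample = """\
[   40.304390][ T2842] wdm_flush: file=ffff8881d5e22500 flags=0
[   41.949941][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
[   43.866560][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
[   72.308895][ T2909] INFO: task syz-executor.0:2909 blocked for more than 30 seconds.
[   72.316860][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
[   74.228916][ T2906] INFO: task syz-executor.1:2906 blocked for more than 30 seconds.
"""

print(hung_with_in_use(sample))
```

Tasks that called wdm_flush() but were never reported as hung (T2842 in the
sample) do not appear in the result.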



* Re: INFO: task hung in wdm_flush
  2020-02-10 15:21                   ` Tetsuo Handa
  2020-02-11 13:55                     ` Tetsuo Handa
@ 2020-02-11 14:01                     ` Dmitry Vyukov
  1 sibling, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-11 14:01 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On Mon, Feb 10, 2020 at 4:22 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/02/11 0:06, Dmitry Vyukov wrote:
> >> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
> >> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>>
> >>> On 2020/02/10 21:46, Tetsuo Handa wrote:
> >>>> On 2020/02/10 19:09, Dmitry Vyukov wrote:
> >>>>> You may also try on the exact commit the bug was reported, because
> >>>>> usb-fuzzer is tracking branch, things may change there.
> >>>>
> >>>> OK. I explicitly tried
> >>>>
> >>>>   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
> >>>>
> >>>> but syzbot still cannot reproduce this bug using the reproducer...
> >>>
> >>> It seems that there is non-trivial difference between kernel config in dashboard
> >>> and kernel config in "syz test:" mails. Maybe that's the cause...
> >
> >
> > syzkaller runs oldconfig when building any kernels:
> > https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
> > Is that difference what oldconfig produces?
> >
>
> Here is the diff (with "#" lines excluded) between dashboard and "syz test:" mails.
> I feel this difference is bigger than what simple oldconfig would cause.
>
> $ curl 'https://syzkaller.appspot.com/text?tag=KernelConfig&x=8cff427cc8996115' | sort > dashboard

I think you took a wrong config as a base.
This 8cff427cc8996115 was only used for crashes without reproducers as
far as I see, so it can't be used for patch testing.
I would expect the one used for last patch testing is this one:
https://syzkaller.appspot.com/text?tag=KernelConfig&x=8847e5384a16f66a
associated with this crash:
ci2-upstream-usb  2019/09/23 13:26  https://github.com/google/kasan.git usb-fuzzer  e0bd8d79d96e88f3

I checked at least CONFIG_DYNAMIC_DEBUG, and it matches what was used
for patch testing.
So everything seems right to me as far as I see.



> $ curl 'https://syzkaller.appspot.com/x/.config?x=c372cdb7140fc162' | sort > syz-test
> $ diff -u dashboard syz-test | grep -vF '#' | grep '^[+-]'
> --- dashboard   2020-02-11 00:19:14.793977153 +0900
> +++ syz-test    2020-02-11 00:19:15.659977108 +0900
> -CONFIG_BLK_DEV_LOOP_MIN_COUNT=16
> +CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
> -CONFIG_BUG_ON_DATA_CORRUPTION=y
> -CONFIG_DEBUG_CREDENTIALS=y
> -CONFIG_DEBUG_PER_CPU_MAPS=y
> -CONFIG_DEBUG_PLIST=y
> -CONFIG_DEBUG_SG=y
> -CONFIG_DEBUG_VIRTUAL=y
> +CONFIG_DEVMEM=y
> +CONFIG_DEVPORT=y
> +CONFIG_DMA_OF=y
> -CONFIG_DYNAMIC_DEBUG=y
> -CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> +CONFIG_HID_REDRAGON=y
> +CONFIG_IRQCHIP=y
> -CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
> +CONFIG_LSM="yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
> -CONFIG_MAC80211_HWSIM=y
> +CONFIG_MAGIC_SYSRQ=y
> +CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
> +CONFIG_MAGIC_SYSRQ_SERIAL=y
> +CONFIG_NET_TC_SKB_EXT=y
> +CONFIG_OF=y
> +CONFIG_OF_ADDRESS=y
> +CONFIG_OF_GPIO=y
> +CONFIG_OF_IOMMU=y
> +CONFIG_OF_IRQ=y
> +CONFIG_OF_KOBJ=y
> +CONFIG_OF_MDIO=y
> +CONFIG_OF_NET=y
> -CONFIG_PGTABLE_LEVELS=5
> +CONFIG_PGTABLE_LEVELS=4
> +CONFIG_PWRSEQ_EMMC=y
> +CONFIG_PWRSEQ_SIMPLE=y
> +CONFIG_RTLWIFI_DEBUG=y
> -CONFIG_SECURITYFS=y
> +CONFIG_STRICT_DEVMEM=y
> +CONFIG_THERMAL_OF=y
> +CONFIG_USB_CHIPIDEA_OF=y
> +CONFIG_USB_DWC3_OF_SIMPLE=y
> -CONFIG_USB_RAW_GADGET=y
> +CONFIG_USB_SNP_UDC_PLAT=y
> -CONFIG_VIRTIO_BLK_SCSI=y
> -CONFIG_VIRT_WIFI=y
> -CONFIG_X86_5LEVEL=y


* Re: INFO: task hung in wdm_flush
  2020-02-11 13:55                     ` Tetsuo Handa
@ 2020-02-11 14:11                       ` Dmitry Vyukov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2020-02-11 14:11 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Oliver Neukum, syzbot, Andrey Konovalov, Jia-Ju Bai,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman, LKML,
	USB list, syzkaller-bugs, yuehaibing, Bjørn Mork

On Tue, Feb 11, 2020 at 2:55 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/02/11 0:21, Tetsuo Handa wrote:
> > On 2020/02/11 0:06, Dmitry Vyukov wrote:
> >>> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
> >>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>>>
> >>>> On 2020/02/10 21:46, Tetsuo Handa wrote:
> >>>>> On 2020/02/10 19:09, Dmitry Vyukov wrote:
> >>>>>> You may also try on the exact commit the bug was reported, because
> >>>>>> usb-fuzzer is tracking branch, things may change there.
> >>>>>
> >>>>> OK. I explicitly tried
> >>>>>
> >>>>>   #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
> >>>>>
> >>>>> but syzbot still cannot reproduce this bug using the reproducer...
> >>>>
> >>>> It seems that there is non-trivial difference between kernel config in dashboard
> >>>> and kernel config in "syz test:" mails. Maybe that's the cause...
> >>
> >>
> >> syzkaller runs oldconfig when building any kernels:
> >> https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
> >> Is that difference what oldconfig produces?
> >>
> >
> > Here is the diff (with "#" lines excluded) between dashboard and "syz test:" mails.
> > I feel this difference is bigger than what simple oldconfig would cause.
> >
>
> I explicitly tried a commit as of the first report (instead of the latest report)
>
>   #syz test: https://github.com/google/kasan.git e96407b497622d03f088bcf17d2c8c5a1ab066c8
>
> and syzbot reproduced this bug using the reproducer. Therefore, it seems that differences
> in the kernel config used for "syz test:" was inappropriate but "syz test:" failed to detect
> it. Since there might be changes which fixed different bugs (and in order to confirm that
> proposed patch cleanly applies to the current kernel without causing other problems), I guess
> that people tend to test using the latest commit (instead of a commit as of the first report).
>
> I suggest "syz test:" to retest without proposed patch when proposed patch did not reproduce
> the bug. If retesting without proposed patch did not reproduce the bug, we can figure out that
> something is wrong (maybe the bug is difficult to reproduce, maybe the bug was already fixed,
> maybe kernel config was inappropriate, maybe something else).

This is already possible, right? One can request any single test run as
they see fit.
Chaining tests into complex workflows won't necessarily make things
simpler. It will be hard to explain what exactly happened and why.
Also, consider the case where a reproducer is flaky: it did not crash
with the patch, but crashed without the patch (just because it's flaky).
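
To put a rough number on the flakiness concern, here is a back-of-the-envelope
sketch in Python. It assumes that runs are independent and that the bug
triggers with a fixed probability per run; the value 0.3 and the run counts
are made up, since neither the trigger rate of this reproducer nor the number
of runs syzbot performs is stated in this thread.

```python
def miss_probability(p_trigger, runs):
    """Chance that a bug which triggers with probability p_trigger per run
    stays silent in every one of `runs` independent runs."""
    return (1.0 - p_trigger) ** runs


# Hypothetical trigger rate of 30% per run.
for runs in (1, 3, 10):
    print(runs, "run(s): miss probability", round(miss_probability(0.3, runs), 3))
```

With a low trigger rate and few runs, a "did not trigger" result on a patched
kernel is weak evidence on its own, and a single comparison run without the
patch can differ by chance alone.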


> Regarding the bug for this report, debug printk() reported that WDM_IN_USE was not cleared
> for some reason. While we need to investigate why WDM_IN_USE was not cleared, I guess that
> wdm_write() should clear WDM_IN_USE upon error
> ( https://syzkaller.appspot.com/x/patch.diff?x=17ec7ee9e00000 ) so that we will surely
> wake up somebody potentially waiting on WDM_IN_USE.
>
> [   38.587596][ T2807] wdm_flush: file=ffff8881d488bb80 flags=2
> [   40.214039][ T2807] wdm_flush: file=ffff8881d63fb400 flags=2
> [   40.304390][ T2842] wdm_flush: file=ffff8881d5e22500 flags=0
> [   40.371742][ T2869] wdm_flush: file=ffff8881d4964c80 flags=0
> [   40.429954][ T2844] wdm_flush: file=ffff8881d5937b80 flags=0
> [   40.461538][ T2858] wdm_flush: file=ffff8881d488b400 flags=0
> [   40.464909][ T2863] wdm_flush: file=ffff8881d488ea00 flags=0
> [   41.576761][ T2896] wdm_flush: file=ffff8881d43dea00 flags=2
> [   41.949941][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
> [   43.760828][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
> [   43.857364][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
> [   43.857501][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
> [   43.866560][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
> [   43.876210][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
> [   72.308895][ T2909] INFO: task syz-executor.0:2909 blocked for more than 30 seconds.
> [   72.316860][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
> [   74.228916][ T2906] INFO: task syz-executor.1:2906 blocked for more than 30 seconds.
> [   74.228921][ T2911] INFO: task syz-executor.3:2911 blocked for more than 30 seconds.
> [   74.228935][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
> [   74.236949][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
> [   74.236991][ T2904] INFO: task syz-executor.4:2904 blocked for more than 30 seconds.
> [   74.245459][ T2897] INFO: task syz-executor.2:2897 blocked for more than 30 seconds.
> [   74.251305][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
> [   74.257129][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
> [   74.257951][ T2899] INFO: task syz-executor.5:2899 blocked for more than 30 seconds.
> [   74.294465][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
>
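
The suggestion quoted above (clear the in-use flag on the write error path so
that a waiter is always woken) can be illustrated with a minimal userspace
sketch. It is not the cdc-wdm code: struct fake_wdm, fake_write() and the
submit_fails parameter are hypothetical stand-ins, a mutex and condition
variable replace the kernel wait queue, and the in_use field plays the role
of WDM_IN_USE.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver state (assumption). */
struct fake_wdm {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	bool in_use;	/* plays the role of WDM_IN_USE */
};

#define FAKE_WDM_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/*
 * Mark the device busy, then "submit" the request. If submission fails,
 * no completion handler will ever run, so the error path itself must
 * clear the flag and wake anybody waiting on it.
 */
static int fake_write(struct fake_wdm *dev, bool submit_fails)
{
	int rc;

	pthread_mutex_lock(&dev->lock);
	dev->in_use = true;
	pthread_mutex_unlock(&dev->lock);

	/* Simulated submission result; a real driver would submit an URB. */
	rc = submit_fails ? -EIO : 0;
	if (rc < 0) {
		pthread_mutex_lock(&dev->lock);
		dev->in_use = false;
		pthread_cond_broadcast(&dev->wait);
		pthread_mutex_unlock(&dev->lock);
	}
	return rc;
}
```

On success the flag deliberately stays set, since in this sketch a completion
handler (not shown) would be the one to clear it.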


* Re: INFO: task hung in wdm_flush
  2019-11-21 11:07   ` Oliver Neukum
@ 2019-11-22  9:11     ` Dmitry Vyukov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2019-11-22  9:11 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: syzbot, bjorn, USB list, syzkaller-bugs, Andrey Konovalov

On Thu, Nov 21, 2019 at 12:07 PM Oliver Neukum <oneukum@suse.com> wrote:
>
> On Wednesday, 20.11.2019, at 14:40 -0800, syzbot wrote:
> > Hello,
> >
> > syzbot tried to test the proposed patch but build/boot failed:
> >
> > failed to apply patch:
> > checking file drivers/usb/class/cdc-wdm.c
> > Hunk #1 FAILED at 587.
> > Hunk #2 FAILED at 596.
> > 2 out of 2 hunks FAILED
>
> This is unexpected.
> >
> >
> >
> > Tested on:
> >
> > commit:         e96407b4 usb-fuzzer: main usb gadget fuzzer driver
> > git tree:       https://github.com/google/kasan.git
>
> If I do a git am on the branch usb-fuzzer-usb-testing-2019.11.19,
> the patch applies. Which branch do I need to backport to?

Hi Oliver,

You give the exact tree/base commit when you ask syzbot to test. It does
not do any second guessing.
Well, if you provide a tree+branch then it's a bit like chasing a moving
target because HEAD can be updated in the meantime, but you can also give it
tree+commit, and then it's 100% fixed.


* Re: INFO: task hung in wdm_flush
  2019-11-20 22:40 ` syzbot
@ 2019-11-21 11:07   ` Oliver Neukum
  2019-11-22  9:11     ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Oliver Neukum @ 2019-11-21 11:07 UTC (permalink / raw)
  To: syzbot, bjorn, linux-usb, syzkaller-bugs; +Cc: andreyknvl

On Wednesday, 20.11.2019, at 14:40 -0800, syzbot wrote:
> Hello,
> 
> syzbot tried to test the proposed patch but build/boot failed:
> 
> failed to apply patch:
> checking file drivers/usb/class/cdc-wdm.c
> Hunk #1 FAILED at 587.
> Hunk #2 FAILED at 596.
> 2 out of 2 hunks FAILED

This is unexpected.
> 
> 
> 
> Tested on:
> 
> commit:         e96407b4 usb-fuzzer: main usb gadget fuzzer driver
> git tree:       https://github.com/google/kasan.git

If I do a git am on the branch usb-fuzzer-usb-testing-2019.11.19,
the patch applies. Which branch do I need to backport to?

	Regards
		Oliver



* Re: INFO: task hung in wdm_flush
  2019-11-19 13:21 Oliver Neukum
@ 2019-11-20 22:40 ` syzbot
  2019-11-21 11:07   ` Oliver Neukum
  0 siblings, 1 reply; 18+ messages in thread
From: syzbot @ 2019-11-20 22:40 UTC (permalink / raw)
  To: bjorn, linux-usb, oneukum, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but build/boot failed:

failed to apply patch:
checking file drivers/usb/class/cdc-wdm.c
Hunk #1 FAILED at 587.
Hunk #2 FAILED at 596.
2 out of 2 hunks FAILED



Tested on:

commit:         e96407b4 usb-fuzzer: main usb gadget fuzzer driver
git tree:       https://github.com/google/kasan.git
dashboard link: https://syzkaller.appspot.com/bug?extid=854768b99f19e89d7f81
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1779956ae00000



* Re: INFO: task hung in wdm_flush
@ 2019-11-19 13:21 Oliver Neukum
  2019-11-20 22:40 ` syzbot
  0 siblings, 1 reply; 18+ messages in thread
From: Oliver Neukum @ 2019-11-19 13:21 UTC (permalink / raw)
  To: syzbot; +Cc: Bjørn Mork, linux-usb

#syz test: https://github.com/google/kasan.git e96407b4

From d3d9edf17e33889e0fc4238f3d03a2dce7af30e1 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum@suse.com>
Date: Tue, 19 Nov 2019 14:09:41 +0100
Subject: [PATCH] cdc-wdm: add timeout in wdm_flush()

wdm_flush() will wait forever for IO to end. If a device
happens to crash exactly at that time and becomes unresponsive or
turns rogue and malicious exactly at that time, we get unkillable
tasks. The solution is to add a sensible timeout.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
---
 drivers/usb/class/cdc-wdm.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index f9f7c8a5e091..17de5c88a325 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -587,8 +587,9 @@ static ssize_t wdm_read
 static int wdm_flush(struct file *file, fl_owner_t id)
 {
 	struct wdm_device *desc = file->private_data;
+	int timeout;
 
-	wait_event(desc->wait,
+	timeout = wait_event_timeout(desc->wait,
 			/*
 			 * needs both flags. We cannot do with one
 			 * because resetting it would cause a race
@@ -596,7 +597,14 @@ static int wdm_flush(struct file *file, fl_owner_t id)
 			 * a disconnect
 			 */
 			!test_bit(WDM_IN_USE, &desc->flags) ||
-			test_bit(WDM_DISCONNECTING, &desc->flags));
+			test_bit(WDM_DISCONNECTING, &desc->flags),
+			/* pulled out of thin air */
+			30 * HZ);
+
+	if (!timeout) {
+		usb_kill_urb(desc->command);
+		return -EIO;
+	}
 
 	/* cannot dereference desc->intf if WDM_DISCONNECTING */
 	if (test_bit(WDM_DISCONNECTING, &desc->flags))
-- 
2.16.4
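
The bounded wait described in the patch above can be approximated in
userspace as a rough sketch. This is not kernel code: struct fake_dev,
flush_with_timeout() and the millisecond parameter are hypothetical, and
pthread_cond_timedwait() stands in for wait_event_timeout(). Returning
-ENODEV for the disconnect case is an assumption made for the sketch; the
patch context above does not show that return value.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for struct wdm_device (assumption). */
struct fake_dev {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	bool in_use;		/* plays the role of WDM_IN_USE */
	bool disconnecting;	/* plays the role of WDM_DISCONNECTING */
};

#define FAKE_DEV_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, false }

/*
 * Wait until the device is idle or going away, but never longer than
 * timeout_ms. Returns 0 when idle, -EIO on timeout and -ENODEV when the
 * device is disconnecting (the last value is this sketch's choice).
 */
static int flush_with_timeout(struct fake_dev *dev, long timeout_ms)
{
	struct timespec deadline;
	int rc = 0;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&dev->lock);
	while (dev->in_use && !dev->disconnecting) {
		if (pthread_cond_timedwait(&dev->wait, &dev->lock,
					   &deadline) == ETIMEDOUT) {
			/* Re-check once so a late wakeup is not reported as a timeout. */
			if (dev->in_use && !dev->disconnecting)
				rc = -EIO;
			break;
		}
	}
	if (rc == 0 && dev->disconnecting)
		rc = -ENODEV;
	pthread_mutex_unlock(&dev->lock);
	return rc;
}
```

A caller that sets in_use and never clears it gets -EIO back after the
timeout instead of blocking forever, which is the behaviour the patch aims
for in wdm_flush().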



end of thread, other threads:[~2020-02-11 14:11 UTC | newest]

Thread overview: 18+ messages
2019-08-12 12:18 INFO: task hung in wdm_flush syzbot
2019-11-19  9:14 ` Bjørn Mork
2019-11-19 10:31   ` Oliver Neukum
2019-11-19 11:34     ` Bjørn Mork
2019-11-23  6:52       ` Dmitry Vyukov
2020-02-10 10:06         ` Dmitry Vyukov
2020-02-10 10:09           ` Dmitry Vyukov
2020-02-10 12:46             ` Tetsuo Handa
2020-02-10 15:04               ` Dmitry Vyukov
2020-02-10 15:06                 ` Dmitry Vyukov
2020-02-10 15:21                   ` Tetsuo Handa
2020-02-11 13:55                     ` Tetsuo Handa
2020-02-11 14:11                       ` Dmitry Vyukov
2020-02-11 14:01                     ` Dmitry Vyukov
2019-11-19 13:21 Oliver Neukum
2019-11-20 22:40 ` syzbot
2019-11-21 11:07   ` Oliver Neukum
2019-11-22  9:11     ` Dmitry Vyukov
