linux-block.vger.kernel.org archive mirror
* INFO: task hung in nbd_ioctl
@ 2021-09-16  2:43 Hao Sun
  0 siblings, 0 replies; 13+ messages in thread
From: Hao Sun @ 2021-09-16  2:43 UTC
  To: Jens Axboe, linux-kernel; +Cc: Josef Bacik, linux-block, nbd

Hello,

While fuzzing the latest Linux kernel with Healer, I triggered the
following hung-task report.

HEAD commit: 6880fa6c5660 Linux 5.15-rc1
git tree: upstream
console output:
https://drive.google.com/file/d/1LfSHVsXZBF1k8KjBkz5OauavDE0rMs7D/view?usp=sharing
kernel config: https://drive.google.com/file/d/1rUzyMbe5vcs6khA3tL9EHTLJvsUdWcgB/view?usp=sharing

Sorry, I don't have a reproducer for this hang; I hope the symbolized
report helps.
If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <sunhao.th@gmail.com>

INFO: task syz-executor:24965 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc1 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:27880 pid:24965 ppid: 24302 flags:0x00004004
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xcd9/0x2530 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425
 __mutex_lock_common kernel/locking/mutex.c:669 [inline]
 __mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729
 nbd_start_device_ioctl drivers/block/nbd.c:1361 [inline]
 __nbd_ioctl drivers/block/nbd.c:1422 [inline]
 nbd_ioctl+0x58b/0x9c0 drivers/block/nbd.c:1462
 blkdev_ioctl+0x2a4/0x720 block/ioctl.c:589
 block_ioctl+0xfa/0x140 block/fops.c:477
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4739cd
RSP: 002b:00007fd1b9ddec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 00000000004739cd
RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000008
RBP: 00000000004ebd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0a0
R13: 00007ffd8476ee1f R14: 00007ffd8476efc0 R15: 00007fd1b9ddedc0
INFO: task syz-executor:24976 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc1 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:28400 pid:24976 ppid: 24302 flags:0x00000004
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xcd9/0x2530 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 blk_mq_freeze_queue_wait+0x114/0x160 block/blk-mq.c:151
 nbd_add_socket+0x102/0x7c0 drivers/block/nbd.c:1050
 __nbd_ioctl drivers/block/nbd.c:1405 [inline]
 nbd_ioctl+0x391/0x9c0 drivers/block/nbd.c:1462
 blkdev_ioctl+0x2a4/0x720 block/ioctl.c:589
 block_ioctl+0xfa/0x140 block/fops.c:477
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4739cd
RSP: 002b:00007fd1b9d7bc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c2c8 RCX: 00000000004739cd
RDX: 0000000000000006 RSI: 000000000000ab00 RDI: 0000000000000004
RBP: 00000000004ebd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c2c8
R13: 00007ffd8476ee1f R14: 00007ffd8476efc0 R15: 00007fd1b9d7bdc0

Showing all locks held in the system:
1 lock held by khungtaskd/39:
 #0: ffffffff8b97e9a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446
1 lock held by in:imklog/15673:
 #0: ffff88801eeab570 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990
1 lock held by syz-executor/24965:
 #0: ffff88801a0f4208 (&nbd->config_lock){+.+.}-{3:3}, at: nbd_start_device_ioctl drivers/block/nbd.c:1361 [inline]
 #0: ffff88801a0f4208 (&nbd->config_lock){+.+.}-{3:3}, at: __nbd_ioctl drivers/block/nbd.c:1422 [inline]
 #0: ffff88801a0f4208 (&nbd->config_lock){+.+.}-{3:3}, at: nbd_ioctl+0x58b/0x9c0 drivers/block/nbd.c:1462
1 lock held by syz-executor/24976:
 #0: ffff88801a0f4208 (&nbd->config_lock){+.+.}-{3:3}, at: nbd_ioctl+0x14f/0x9c0 drivers/block/nbd.c:1455

=============================================

NMI backtrace for cpu 2
CPU: 2 PID: 39 Comm: khungtaskd Not tainted 5.15.0-rc1 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1e1/0x220 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline]
 watchdog+0xcc8/0x1010 kernel/hung_task.c:295
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Sending NMI from CPU 2 to CPUs 0-1,3:
NMI backtrace for cpu 1
CPU: 1 PID: 15674 Comm: rs:main Q:Reg Not tainted 5.15.0-rc1 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:__lock_acquire+0xdc5/0x57e0 kernel/locking/lockdep.c:4885
Code: bc e9 0d e9 f4 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 f2
48 c1 ea 03 80 3c 02 00 0f 85 76 2a 00 00 49 81 3e 40 34 f0 8e <0f> 84
16 f3 ff ff 83 fd 01 0f 87 1e f3 ff ff 89 eb 0f 87 3d 39 00
RSP: 0018:ffffc90007fbf628 EFLAGS: 00000087
RAX: dffffc0000000000 RBX: 1ffff92000ff7ef5 RCX: 0000000000000000
RDX: 1ffffffff1757c40 RSI: 0000000000000000 RDI: ffffffff8babe200
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: ffff888135c32a0b R11: ffffed1026b86541 R12: 0000000000000000
R13: ffff88810287d580 R14: ffffffff8babe200 R15: 0000000000000000
FS:  00007fe40dfd0700(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000056f1d8 CR3: 000000001911a000 CR4: 0000000000350ee0
Call Trace:
 lock_acquire kernel/locking/lockdep.c:5625 [inline]
 lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5590
 fs_reclaim_acquire+0xd2/0x160 mm/page_alloc.c:4556
 might_alloc include/linux/sched/mm.h:198 [inline]
 slab_pre_alloc_hook mm/slab.h:492 [inline]
 slab_alloc_node mm/slub.c:3120 [inline]
 slab_alloc mm/slub.c:3214 [inline]
 kmem_cache_alloc+0x42/0x340 mm/slub.c:3219
 kmem_cache_zalloc include/linux/slab.h:711 [inline]
 jbd2_alloc_handle include/linux/jbd2.h:1603 [inline]
 new_handle fs/jbd2/transaction.c:481 [inline]
 jbd2__journal_start fs/jbd2/transaction.c:508 [inline]
 jbd2__journal_start+0x191/0x920 fs/jbd2/transaction.c:490
 __ext4_journal_start_sb+0x3a8/0x4a0 fs/ext4/ext4_jbd2.c:105
 __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline]
 ext4_da_write_begin+0x4c5/0x1180 fs/ext4/inode.c:3002
 generic_perform_write+0x1fe/0x510 mm/filemap.c:3770
 ext4_buffered_write_iter+0x206/0x4c0 fs/ext4/file.c:269
 ext4_file_write_iter+0x42e/0x14a0 fs/ext4/file.c:680
 call_write_iter include/linux/fs.h:2163 [inline]
 new_sync_write+0x414/0x640 fs/read_write.c:507
 vfs_write+0x67a/0xae0 fs/read_write.c:594
 ksys_write+0x12d/0x250 fs/read_write.c:647
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fe410a141cd
Code: c2 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31
c3 48 83 ec 08 e8 ae fc ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b
3c 24 48 89 c2 e8 f7 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fe40dfcf590 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe404027260 RCX: 00007fe410a141cd
RDX: 000000000000005c RSI: 00007fe404027260 RDI: 0000000000000009
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 00007fe404026fe0
R13: 00007fe40dfcf5b0 R14: 000055dcd82aa800 R15: 000000000000005c
NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
NMI backtrace for cpu 0 skipped: idling at default_idle+0xb/0x10 arch/x86/kernel/process.c:716
NMI backtrace for cpu 3
CPU: 3 PID: 3017 Comm: systemd-journal Not tainted 5.15.0-rc1 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:lockdep_hardirqs_off+0x3b/0xd0 kernel/locking/lockdep.c:4372
Code: 2b 47 cf 76 a9 00 00 f0 00 55 53 48 89 fb 74 49 8b 15 f9 2f f2
06 85 d2 74 0e 65 8b 05 6a 4e cf 76 85 c0 75 4e 5b 5d c3 9c 58 <f6> c4
02 74 eb e8 5b fa ac fa 85 c0 74 ed 8b 05 b9 46 3b 04 85 c0
RSP: 0018:ffffc90000edf900 EFLAGS: 00000046
RAX: 0000000000000046 RBX: ffffffff81ccee3d RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90000edf9b0 R08: ffffffff817c49c9 R09: 0000000000000000
R10: 0000000000000007 R11: ffffed1026ba6541 R12: 0000000000000200
R13: 0000000000000000 R14: ffff888109853900 R15: 0000000000000000
FS:  00007fdba43168c0(0000) GS:ffff888135d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdb9ff28000 CR3: 000000001bdb3000 CR4: 0000000000350ee0
Call Trace:
 trace_hardirqs_off+0x13/0x1b0 kernel/trace/trace_preemptirq.c:76
 seqcount_lockdep_reader_access include/linux/seqlock.h:102 [inline]
 set_root+0x39d/0x560 fs/namei.c:940
 nd_jump_root+0x38d/0x520 fs/namei.c:961
 path_init+0xf81/0x1700 fs/namei.c:2359
 path_openat+0x18e/0x2710 fs/namei.c:3556
 do_filp_open+0x1c1/0x290 fs/namei.c:3588
 do_sys_openat2+0x61b/0x9a0 fs/open.c:1200
 do_sys_open+0xc3/0x140 fs/open.c:1216
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdba38a685d
Code: bb 20 00 00 75 10 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31
c3 48 83 ec 08 e8 1e f6 ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b
3c 24 48 89 c2 e8 67 f6 ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffd7e798f70 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007ffd7e799280 RCX: 00007fdba38a685d
RDX: 00000000000001a0 RSI: 0000000000080042 RDI: 000055ef99869060
RBP: 000000000000000d R08: 000000000000ffc0 R09: 00000000ffffffff
R10: 0000000000000069 R11: 0000000000000293 R12: 00000000ffffffff
R13: 000055ef99865040 R14: 00007ffd7e799240 R15: 000055ef99870c40
----------------
Code disassembly (best guess), 1 bytes skipped:
   0:   e9 0d e9 f4 01          jmpq   0x1f4e912
   5:   00 00                   add    %al,(%rax)
   7:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
   e:   fc ff df
  11:   4c 89 f2                mov    %r14,%rdx
  14:   48 c1 ea 03             shr    $0x3,%rdx
  18:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
  1c:   0f 85 76 2a 00 00       jne    0x2a98
  22:   49 81 3e 40 34 f0 8e    cmpq   $0xffffffff8ef03440,(%r14)
* 29:   0f 84 16 f3 ff ff       je     0xfffff345 <-- trapping instruction
  2f:   83 fd 01                cmp    $0x1,%ebp
  32:   0f 87 1e f3 ff ff       ja     0xfffff356
  38:   89 eb                   mov    %ebp,%ebx
  3a:   0f                      .byte 0xf
  3b:   87                      .byte 0x87
  3c:   3d                      .byte 0x3d
  3d:   39 00                   cmp    %eax,(%rax)
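
Reading the register dumps above: both blocked tasks entered the kernel
through ioctl (ORIG_RAX 0x10), and RSI holds the request code. The
following is a minimal sketch for decoding it, assuming the legacy NBD
ioctl numbering from the kernel's include/uapi/linux/nbd.h (type byte
0xab, no size or direction bits). The helper name nbd_decode_ioctl is
hypothetical and not part of any kernel or library API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy NBD ioctl names indexed by command number; each one is defined
 * as _IO(0xab, nr) in the kernel's include/uapi/linux/nbd.h. */
static const char *const nbd_ioctl_names[] = {
    "NBD_SET_SOCK",    "NBD_SET_BLKSIZE",     "NBD_SET_SIZE",
    "NBD_DO_IT",       "NBD_CLEAR_SOCK",      "NBD_CLEAR_QUE",
    "NBD_PRINT_DEBUG", "NBD_SET_SIZE_BLOCKS", "NBD_DISCONNECT",
    "NBD_SET_TIMEOUT", "NBD_SET_FLAGS",
};

/* Hypothetical helper: decode the RSI value from an ioctl() register dump.
 * On success, stores the request name in *name and returns true; returns
 * false if the value is not a legacy NBD request. */
bool nbd_decode_ioctl(uint64_t rsi, const char **name)
{
    assert(name != NULL);

    uint64_t type = (rsi >> 8) & 0xff; /* ioctl "magic" byte */
    uint64_t nr = rsi & 0xff;          /* command number */

    /* _IO() requests carry no direction or size bits above bit 15. */
    if ((rsi >> 16) != 0 || type != 0xab)
        return false;
    if (nr >= sizeof(nbd_ioctl_names) / sizeof(nbd_ioctl_names[0]))
        return false;

    *name = nbd_ioctl_names[nr];
    return true;
}
```

Applied to the dumps above, 0xab03 (pid 24965) decodes to NBD_DO_IT and
0xab00 (pid 24976) to NBD_SET_SOCK, which agrees with the
nbd_start_device_ioctl and nbd_add_socket frames in the two call traces.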

* INFO: task hung in nbd_ioctl
@ 2021-09-18  1:34 Hao Sun
  0 siblings, 0 replies; 13+ messages in thread
From: Hao Sun @ 2021-09-18  1:34 UTC
  To: Jens Axboe, Linux Kernel Mailing List; +Cc: Josef Bacik, linux-block, nbd

Hello,

While fuzzing the latest Linux kernel with Healer, I triggered the
following hung-task report.

HEAD commit: ff1ffd71d5f0 Merge tag 'hyperv-fixes-signed-20210915
git tree: upstream
console output:
https://drive.google.com/file/d/1Htx96ZZ5dAxLIr-4jNJ62iQdstmHnliH/view?usp=sharing
kernel config: https://drive.google.com/file/d/1zXpDhs-IdE7tX17B7MhaYP0VGUfP6m9B/view?usp=sharing

Sorry, I don't have a reproducer for this hang; I hope the symbolized
report helps.
If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <sunhao.th@gmail.com>

INFO: task syz-executor:25816 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc1+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:27872 pid:25816 ppid: 24814 flags:0x00000004
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xcd9/0x2530 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425
 __mutex_lock_common kernel/locking/mutex.c:669 [inline]
 __mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729
 nbd_ioctl+0x14f/0x9c0 drivers/block/nbd.c:1455
 blkdev_ioctl+0x2a4/0x720 block/ioctl.c:589
 block_ioctl+0xfa/0x140 block/fops.c:477
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4739cd
RSP: 002b:00007fe430645c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 00000000004739cd
RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000007
RBP: 00000000004ebd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0a0
R13: 00007ffcaa94abdf R14: 00007ffcaa94ad80 R15: 00007fe430645dc0
INFO: task syz-executor:25822 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc1+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:28400 pid:25822 ppid: 24814 flags:0x00000004
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xcd9/0x2530 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 blk_mq_freeze_queue_wait+0x114/0x160 block/blk-mq.c:151
 nbd_add_socket+0x102/0x7c0 drivers/block/nbd.c:1050
 __nbd_ioctl drivers/block/nbd.c:1405 [inline]
 nbd_ioctl+0x391/0x9c0 drivers/block/nbd.c:1462
 blkdev_ioctl+0x2a4/0x720 block/ioctl.c:589
 block_ioctl+0xfa/0x140 block/fops.c:477
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4739cd
RSP: 002b:00007fe430603c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c210 RCX: 00000000004739cd
RDX: 0000000000000005 RSI: 000000000000ab00 RDI: 0000000000000004
RBP: 00000000004ebd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c210
R13: 00007ffcaa94abdf R14: 00007ffcaa94ad80 R15: 00007fe430603dc0
INFO: task syz-executor:25823 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc1+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:27408 pid:25823 ppid: 24814 flags:0x00000004
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xcd9/0x2530 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 blk_queue_enter+0x956/0xdb0 block/blk-core.c:462
 bio_queue_enter block/blk-core.c:477 [inline]
 __submit_bio_noacct_mq block/blk-core.c:989 [inline]
 submit_bio_noacct+0xd32/0x1460 block/blk-core.c:1031
 submit_bio+0x10a/0x460 block/blk-core.c:1093
 submit_bio_wait+0x106/0x230 block/bio.c:1248
 blkdev_issue_flush+0xd7/0x120 block/blk-flush.c:458
 blkdev_fsync+0x8e/0xd0 block/fops.c:420
 vfs_fsync_range+0x13a/0x220 fs/sync.c:200
 vfs_fsync fs/sync.c:214 [inline]
 do_fsync+0x4d/0x90 fs/sync.c:224
 __do_sys_fsync fs/sync.c:232 [inline]
 __se_sys_fsync fs/sync.c:230 [inline]
 __x64_sys_fsync+0x2f/0x40 fs/sync.c:230
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x2000014c
RSP: 002b:00007fe4305e2bb8 EFLAGS: 00000213 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 000000002000014c
RDX: 0000000000004c01 RSI: 0000000000000003 RDI: 0000000000000003
RBP: 00000000000000b9 R08: 0000000000000005 R09: 0000000000000006
R10: 0000000000000007 R11: 0000000000000213 R12: 000000000000000b
R13: 000000000000000c R14: 000000000000000d R15: 00007fe4305e2dc0

Showing all locks held in the system:
1 lock held by khungtaskd/39:
 #0: ffffffff8b97e9a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446
1 lock held by in:imklog/6298:
 #0: ffff88801c9d19f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990
3 locks held by kworker/u8:2/6743:
1 lock held by syz-executor/25816:
 #0: ffff88801ae63208 (&nbd->config_lock){+.+.}-{3:3}, at: nbd_ioctl+0x14f/0x9c0 drivers/block/nbd.c:1455
1 lock held by syz-executor/25822:
 #0: ffff88801ae63208 (&nbd->config_lock){+.+.}-{3:3}, at: nbd_ioctl+0x14f/0x9c0 drivers/block/nbd.c:1455

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 39 Comm: khungtaskd Not tainted 5.15.0-rc1+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1e1/0x220 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline]
 watchdog+0xcc8/0x1010 kernel/hung_task.c:295
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Sending NMI from CPU 1 to CPUs 0,2-3:
NMI backtrace for cpu 0
CPU: 0 PID: 3022 Comm: systemd-journal Not tainted 5.15.0-rc1+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:__orc_find+0x0/0xf0 arch/x86/kernel/unwind_orc.c:35
Code: 7f 8b e8 63 a6 c8 02 e9 60 fb ff ff e8 b9 96 8a 00 e9 cf fb ff
ff cc cc cc cc 48 8b 07 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 <41> 57
89 d0 41 56 41 55 41 54 4c 8d 64 87 fc 55 53 48 83 ec 10 85
RSP: 0018:ffffc9000121f980 EFLAGS: 00000212
RAX: 000000000002c858 RBX: 1ffff92000243f39 RCX: ffffffff81bdbadf
RDX: 000000000000000b RSI: ffffffff8df4036c RDI: ffffffff8d82b228
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8df4036c
R10: ffffc9000121fadf R11: 0000000000086088 R12: ffffc9000121fac8
R13: ffffc9000121fab5 R14: ffffc9000121fa80 R15: ffffffff81bdbadf
FS:  00007f13812868c0(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f137ce35000 CR3: 0000000018bf1000 CR4: 0000000000350ef0
Call Trace:
 orc_find arch/x86/kernel/unwind_orc.c:173 [inline]
 unwind_next_frame+0x33a/0x1770 arch/x86/kernel/unwind_orc.c:443
 arch_stack_walk+0x7d/0xe0 arch/x86/kernel/stacktrace.c:25
 stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:121
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:360
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free mm/kasan/common.c:328 [inline]
 __kasan_slab_free+0x100/0x140 mm/kasan/common.c:374
 kasan_slab_free include/linux/kasan.h:230 [inline]
 slab_free_hook mm/slub.c:1700 [inline]
 slab_free_freelist_hook mm/slub.c:1725 [inline]
 slab_free mm/slub.c:3483 [inline]
 kmem_cache_free+0xa0/0x670 mm/slub.c:3499
 putname+0xfe/0x140 fs/namei.c:270
 do_mkdirat+0x18a/0x2b0 fs/namei.c:3920
 __do_sys_mkdir fs/namei.c:3931 [inline]
 __se_sys_mkdir fs/namei.c:3929 [inline]
 __x64_sys_mkdir+0x61/0x80 fs/namei.c:3929
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f1380542687
Code: 00 b8 ff ff ff ff c3 0f 1f 40 00 48 8b 05 09 d8 2b 00 64 c7 00
5f 00 00 00 b8 ff ff ff ff c3 0f 1f 40 00 b8 53 00 00 00 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d e1 d7 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffcb3a1e5b8 EFLAGS: 00000293 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00007ffcb3a214d0 RCX: 00007f1380542687
RDX: 00007f1380fb3a00 RSI: 00000000000001ed RDI: 00005567696f38a0
RBP: 00007ffcb3a1e5f0 R08: 000000000000c000 R09: 0000000000000000
R10: 0000000000000069 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffcb3a214d0 R15: 00007ffcb3a1eae0
NMI backtrace for cpu 3 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
NMI backtrace for cpu 3 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
NMI backtrace for cpu 3 skipped: idling at default_idle+0xb/0x10 arch/x86/kernel/process.c:716
NMI backtrace for cpu 2
CPU: 2 PID: 6743 Comm: kworker/u8:2 Not tainted 5.15.0-rc1+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Workqueue: netns cleanup_net
RIP: 0010:check_wait_context kernel/locking/lockdep.c:4693 [inline]
RIP: 0010:__lock_acquire+0x4b6/0x57e0 kernel/locking/lockdep.c:4965
Code: 06 49 81 c7 40 fd cf 8f 45 84 e4 0f 84 f8 02 00 00 48 8d 7d 21
48 b8 00 00 00 00 00 fc ff df 48 89 f9 48 c1 e9 03 0f b6 04 01 <48> 89
f9 83 e1 07 38 c8 7f 08 84 c0 0f 85 62 33 00 00 44 0f b6 4d
RSP: 0018:ffffc9000308f8f0 EFLAGS: 00000012
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 1ffff11005adb152
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88802d6d8a91
RBP: ffff88802d6d8a70 R08: 0000000000000001 R09: fffffbfff1f9ff25
R10: ffffffff8fcff927 R11: fffffbfff1f9ff24 R12: 0000000000000002
R13: ffff88802d6d8000 R14: ffffffff8b97e9a0 R15: ffffffff8fd00280
FS:  0000000000000000(0000) GS:ffff888063f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002a6ac48 CR3: 000000000b68e000 CR4: 0000000000350ee0
Call Trace:
 lock_acquire kernel/locking/lockdep.c:5625 [inline]
 lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5590
 rcu_lock_acquire include/linux/rcupdate.h:267 [inline]
 rcu_read_lock include/linux/rcupdate.h:687 [inline]
 inet_twsk_purge+0x117/0x7b0 net/ipv4/inet_timewait_sock.c:268
 ops_exit_list.isra.0+0x103/0x150 net/core/net_namespace.c:171
 cleanup_net+0x511/0xa90 net/core/net_namespace.c:591
 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297
 worker_thread+0x90/0xed0 kernel/workqueue.c:2444
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
----------------
Code disassembly (best guess):
   0: 7f 8b                jg     0xffffff8d
   2: e8 63 a6 c8 02        callq  0x2c8a66a
   7: e9 60 fb ff ff        jmpq   0xfffffb6c
   c: e8 b9 96 8a 00        callq  0x8a96ca
  11: e9 cf fb ff ff        jmpq   0xfffffbe5
  16: cc                    int3
  17: cc                    int3
  18: cc                    int3
  19: cc                    int3
  1a: 48 8b 07              mov    (%rdi),%rax
  1d: c3                    retq
  1e: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
  25: 00 00 00 00
  29: 90                    nop
* 2a: 41 57                push   %r15 <-- trapping instruction
  2c: 89 d0                mov    %edx,%eax
  2e: 41 56                push   %r14
  30: 41 55                push   %r13
  32: 41 54                push   %r12
  34: 4c 8d 64 87 fc        lea    -0x4(%rdi,%rax,4),%r12
  39: 55                    push   %rbp
  3a: 53                    push   %rbx
  3b: 48 83 ec 10          sub    $0x10,%rsp
  3f: 85                    .byte 0x85
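
ORIG_RAX in each user-mode register dump above is the x86-64 system
call number, so it shows which call each task was blocked in. The lookup
below is a small sketch limited to the numbers that occur in this
report; the table and the helper name syscall_name_from_orig_rax are
illustrative only and not a complete or official mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct syscall_entry {
    uint64_t nr;
    const char *name;
};

/* x86-64 system call numbers seen in this report (not a complete table). */
static const struct syscall_entry seen_syscalls[] = {
    { 0x10, "ioctl" },
    { 0x4a, "fsync" },
    { 0x53, "mkdir" },
};

/* Hypothetical helper: look up the ORIG_RAX value from a register dump.
 * Stores the system call name in *name and returns true if it is listed
 * in the table above, otherwise returns false. */
bool syscall_name_from_orig_rax(uint64_t orig_rax, const char **name)
{
    assert(name != NULL);

    size_t count = sizeof(seen_syscalls) / sizeof(seen_syscalls[0]);
    for (size_t i = 0; i < count; i++) {
        if (seen_syscalls[i].nr == orig_rax) {
            *name = seen_syscalls[i].name;
            return true;
        }
    }
    return false;
}
```

By this reading, pids 25816 and 25822 are blocked in ioctl() and pid
25823 in fsync(). The fsync caller is waiting in blk_queue_enter while
pid 25822 is waiting in blk_mq_freeze_queue_wait, as the call traces
show.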

* Re: INFO: task hung in nbd_ioctl
  2019-10-30  8:39     ` Wouter Verhelst
@ 2019-10-30  8:41       ` Wouter Verhelst
  0 siblings, 0 replies; 13+ messages in thread
From: Wouter Verhelst @ 2019-10-30  8:41 UTC
  To: Richard W.M. Jones
  Cc: Mike Christie, syzbot, axboe, josef, linux-block, linux-kernel,
	nbd, syzkaller-bugs

On Wed, Oct 30, 2019 at 10:39:57AM +0200, Wouter Verhelst wrote:
> On Thu, Oct 17, 2019 at 03:03:30PM +0100, Richard W.M. Jones wrote:
> > On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> > > Hey Josef and nbd list,
> > > 
> > > I had a question about if there are any socket family restrictions for nbd?
> > 
> > In normal circumstances, in userspace, the NBD protocol would only be
> > used over AF_UNIX or AF_INET/AF_INET6.
> 
> Note that someone once also did work to make it work over SCTP. I
> incorporated the patch into nbd-client and nbd-server, but never
> actually tested it myself. I have no way of knowing if it even still
> works anymore...

Actually, I meant SDP (as you pointed out downthread). Sorry for the
confusion ;-)

(I should probably kick that out though, indeed)

-- 
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard

* Re: INFO: task hung in nbd_ioctl
  2019-10-17 14:03   ` Richard W.M. Jones
  2019-10-17 15:47     ` Mike Christie
@ 2019-10-30  8:39     ` Wouter Verhelst
  2019-10-30  8:41       ` Wouter Verhelst
  1 sibling, 1 reply; 13+ messages in thread
From: Wouter Verhelst @ 2019-10-30  8:39 UTC
  To: Richard W.M. Jones
  Cc: Mike Christie, syzbot, axboe, josef, linux-block, linux-kernel,
	nbd, syzkaller-bugs

On Thu, Oct 17, 2019 at 03:03:30PM +0100, Richard W.M. Jones wrote:
> On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> > Hey Josef and nbd list,
> > 
> > I had a question about if there are any socket family restrictions for nbd?
> 
> In normal circumstances, in userspace, the NBD protocol would only be
> used over AF_UNIX or AF_INET/AF_INET6.

Note that someone once also did work to make it work over SCTP. I
incorporated the patch into nbd-client and nbd-server, but never
actually tested it myself. I have no way of knowing if it even still
works anymore...

[...]
-- 
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard

* Re: INFO: task hung in nbd_ioctl
  2019-10-17 16:49           ` Richard W.M. Jones
@ 2019-10-17 21:26             ` Mike Christie
  0 siblings, 0 replies; 13+ messages in thread
From: Mike Christie @ 2019-10-17 21:26 UTC
  To: Richard W.M. Jones, syzbot, axboe, josef, linux-block,
	linux-kernel, nbd, syzkaller-bugs

On 10/17/2019 11:49 AM, Richard W.M. Jones wrote:
> On Thu, Oct 17, 2019 at 09:36:34AM -0700, Eric Biggers wrote:
>> On Thu, Oct 17, 2019 at 05:28:29PM +0100, Richard W.M. Jones wrote:
>>> On Thu, Oct 17, 2019 at 10:47:59AM -0500, Mike Christie wrote:
>>>> On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
>>>>> On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
>>>>>> Hey Josef and nbd list,
>>>>>>
>>>>>> I had a question about if there are any socket family restrictions for nbd?
>>>>>
>>>>> In normal circumstances, in userspace, the NBD protocol would only be
>>>>> used over AF_UNIX or AF_INET/AF_INET6.
>>>>>
>>>>> There's a bit of confusion because netlink is used by nbd-client to
>>>>> configure the NBD device, setting things like block size and timeouts
>>>>> (instead of ioctl which is deprecated).  I think you don't mean this
>>>>> use of netlink?
>>>>
>>>> I didn't. It looks like it is just a bad test.
>>>>
>>>> For the automated test in this thread the test created a AF_NETLINK
>>>> socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
>>>> for the NBD_DO_IT ioctl.
>>>>
>>>> I was not sure if the test creator picked any old socket and it just
>>>> happened to pick one nbd never supported, or it was trying to simulate
>>>> sockets that did not support the shutdown method.
>>>>
>>>> I attached the automated test that got run (test.c).
>>>
>>> I'd say it sounds like a bad test, but I'm not familiar with syzkaller
>>> nor how / from where it generates these tests.  Did someone report a
>>> bug and then syzkaller wrote this test?
>>
>> It's an automatically generated fuzz test.
>>
>> There's rarely any such thing as a "bad" fuzz test.  If userspace
>> can do something that causes the kernel to crash or hang, it's a
>> kernel bug, with very few exceptions (e.g. like writing to
>> /dev/mem).
>>
>> If there are cases that aren't supported, like sockets that don't
>> support a certain function or whatever, then the code needs to check
>> for those cases and return an error, not hang the kernel.
> 
> Oh I see.  In that case I agree, although I believe this is a
> root-only API and root has a lot of ways to crash the kernel, but sure
> it could be fixed to restrict sockets to one of:
> 
>  - AF_LOCAL or AF_UNIX
>  - AF_INET or AF_INET6
>  - AF_INET*_SDP (? no idea what this is, but it's used by nbd-client)
> 

This one was for an InfiniBand-related socket family that never made it
upstream.

It did support the shutdown callout, so I made my patch check that the
passed-in socket supports shutdown instead of hard-coding the family
names, in case some user is still relying on it.
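
The check described above can be modelled in userspace. This is only a
sketch under stated assumptions: the model_* types and functions are
hypothetical stand-ins for the kernel's struct socket, struct proto_ops
and its sock_no_shutdown() stub, and the actual patch may be structured
differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_sock;

/* Hypothetical stand-in for the kernel's struct proto_ops. */
struct model_proto_ops {
    int (*shutdown)(struct model_sock *sock, int how);
};

/* Hypothetical stand-in for the kernel's struct socket. */
struct model_sock {
    const struct model_proto_ops *ops;
};

/* Models sock_no_shutdown(): the stub used by socket families that do
 * not implement shutdown. */
int model_no_shutdown(struct model_sock *sock, int how)
{
    (void)sock;
    (void)how;
    return -EOPNOTSUPP;
}

/* Models a family that does implement shutdown. */
int model_stream_shutdown(struct model_sock *sock, int how)
{
    (void)sock;
    (void)how;
    return 0;
}

/* Capability check instead of a family allow-list: reject any socket
 * whose shutdown operation is the "not supported" stub.
 * Returns 0 if the socket may be attached, -EINVAL otherwise. */
int model_check_socket(const struct model_sock *sock)
{
    assert(sock != NULL && sock->ops != NULL);

    if (sock->ops->shutdown == model_no_shutdown)
        return -EINVAL;
    return 0;
}
```

Testing the capability rather than the family name means an out-of-tree
family that does implement shutdown keeps working, which is the reason
given above for not hard-coding the names.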


* Re: INFO: task hung in nbd_ioctl
  2019-10-17 16:36         ` Eric Biggers
@ 2019-10-17 16:49           ` Richard W.M. Jones
  2019-10-17 21:26             ` Mike Christie
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2019-10-17 16:49 UTC
  To: Mike Christie, syzbot, axboe, josef, linux-block, linux-kernel,
	nbd, syzkaller-bugs

On Thu, Oct 17, 2019 at 09:36:34AM -0700, Eric Biggers wrote:
> On Thu, Oct 17, 2019 at 05:28:29PM +0100, Richard W.M. Jones wrote:
> > On Thu, Oct 17, 2019 at 10:47:59AM -0500, Mike Christie wrote:
> > > On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
> > > > On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> > > >> Hey Josef and nbd list,
> > > >>
> > > >> I had a question about if there are any socket family restrictions for nbd?
> > > > 
> > > > In normal circumstances, in userspace, the NBD protocol would only be
> > > > used over AF_UNIX or AF_INET/AF_INET6.
> > > > 
> > > > There's a bit of confusion because netlink is used by nbd-client to
> > > > configure the NBD device, setting things like block size and timeouts
> > > > (instead of ioctl which is deprecated).  I think you don't mean this
> > > > use of netlink?
> > > 
> > > I didn't. It looks like it is just a bad test.
> > > 
> > > For the automated test in this thread the test created a AF_NETLINK
> > > socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
> > > for the NBD_DO_IT ioctl.
> > > 
> > > I was not sure if the test creator picked any old socket and it just
> > > happened to pick one nbd never supported, or it was trying to simulate
> > > sockets that did not support the shutdown method.
> > > 
> > > I attached the automated test that got run (test.c).
> > 
> > I'd say it sounds like a bad test, but I'm not familiar with syzkaller
> > nor how / from where it generates these tests.  Did someone report a
> > bug and then syzkaller wrote this test?
>
> It's an automatically generated fuzz test.
>
> There's rarely any such thing as a "bad" fuzz test.  If userspace
> can do something that causes the kernel to crash or hang, it's a
> kernel bug, with very few exceptions (e.g. writing to
> /dev/mem).
>
> If there are cases that aren't supported, like sockets that don't
> support a certain function or whatever, then the code needs to check
> for those cases and return an error, not hang the kernel.

Oh, I see.  In that case I agree, although I believe this is a
root-only API and root has plenty of ways to crash the kernel.  Still,
it could be fixed to restrict sockets to one of:

 - AF_LOCAL or AF_UNIX
 - AF_INET or AF_INET6
 - AF_INET*_SDP (? no idea what this is, but it's used by nbd-client)

Here are some ways NBD is used in real code:

libnbd$ git grep AF_
fuzzing/libnbd-fuzz-wrapper.c:  if (socketpair (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, sv) == -1) {
generator/states-connect-socket-activation.c:  s = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
generator/states-connect-socket-activation.c:  addr.sun_family = AF_UNIX;
generator/states-connect.c:  fd = socket (AF_UNIX, SOCK_STREAM|SOCK_NONBLOCK|SOCK_CLOEXEC, 0);
generator/states-connect.c:  struct sockaddr_un sun = { .sun_family = AF_UNIX };
generator/states-connect.c:  if (socketpair (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, sv) == -1) {


nbdkit$ git grep AF_
plugins/info/info.c:  case AF_INET:
plugins/info/info.c:    if (inet_ntop (AF_INET, &addr->sin_addr,
plugins/info/info.c:  case AF_INET6:
plugins/info/info.c:    if (inet_ntop (AF_INET6, &addr6->sin6_addr,
plugins/info/info.c:  case AF_UNIX:
plugins/nbd/nbd-standalone.c:  struct sockaddr_un sock = { .sun_family = AF_UNIX };
plugins/nbd/nbd-standalone.c:  fd = socket (AF_UNIX, SOCK_STREAM, 0);
server/sockets.c:  sock = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
server/sockets.c:  sock = set_cloexec (socket (AF_UNIX, SOCK_STREAM, 0));
server/sockets.c:  addr.sun_family = AF_UNIX;
tests/test-layers.c:  if (socketpair (AF_LOCAL, SOCK_STREAM, 0, sfd) == -1) {
tests/test-socket-activation.c:  sock = socket (AF_UNIX, SOCK_STREAM /* NB do not use SOCK_CLOEXEC */, 0);
tests/test-socket-activation.c:  addr.sun_family = AF_UNIX;
tests/test-socket-activation.c:  sock = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
tests/web-server.c:  listen_sock = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0);
tests/web-server.c:  addr.sun_family = AF_UNIX;

nbd$ git grep AF_
gznbd/gznbd.c:  if(socketpair(AF_UNIX, SOCK_STREAM, 0, pr)){
nbd-client.c:           if (ai->ai_family == AF_INET)
nbd-client.c:                   ai->ai_family = AF_INET_SDP;
nbd-client.c:           else (ai->ai_family == AF_INET6)
nbd-client.c:                   ai->ai_family = AF_INET6_SDP;
nbd-client.c:   un_addr.sun_family = AF_UNIX;
nbd-client.c:   if ((sock = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
nbd-client.c:           if (socketpair(AF_UNIX, SOCK_STREAM, 0, plainfd) < 0)
nbd-server.c:   if(netaddr.ss_family == AF_UNIX) {
nbd-server.c:           client->clientaddr.ss_family = AF_UNIX;
nbd-server.c:                   if(client->clientaddr.ss_family == AF_UNIX) {
nbd-server.c:                           assert((ai->ai_family == AF_INET) || (ai->ai_family == AF_INET6));
nbd-server.c:                           if(ai->ai_family == AF_INET) {
nbd-server.c:                           } else if(ai->ai_family == AF_INET6) {
nbd-server.c:   socketpair(AF_UNIX, SOCK_STREAM, 0, sockets);
nbd-server.c:   sa.sun_family = AF_UNIX;
nbd-server.c:   sock = socket(AF_UNIX, SOCK_STREAM, 0);
nbdsrv.c:       int addrlen = addr->sa_family == AF_INET ? 4 : 16;
nbdsrv.c:               assert(addr->sa_family == AF_INET || addr->sa_family == AF_INET6);
nbdsrv.c:                       case AF_INET:
nbdsrv.c:                       case AF_INET6:
tests/code/trim.c:      socketpair(AF_UNIX, SOCK_STREAM, AF_UNIX, spair);
tests/run/nbd-tester-client.c:          if (socketpair(AF_UNIX, SOCK_STREAM, 0, plainfd) < 0) {
tests/run/nbd-tester-client.c:  if ((sock = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
tests/run/nbd-tester-client.c:  addr.sun_family = AF_UNIX;
tests/run/nbd-tester-client.c:  addr.sin_family = AF_INET;
tests/run/nbd-tester-client.c:  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {


qemu-nbd is a bit hard to grep like this, but it only supports
Unix domain sockets or TCP/IP.


Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


* Re: INFO: task hung in nbd_ioctl
  2019-10-17 16:28       ` Richard W.M. Jones
@ 2019-10-17 16:36         ` Eric Biggers
  2019-10-17 16:49           ` Richard W.M. Jones
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Biggers @ 2019-10-17 16:36 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Mike Christie, syzbot, axboe, josef, linux-block, linux-kernel,
	nbd, syzkaller-bugs

On Thu, Oct 17, 2019 at 05:28:29PM +0100, Richard W.M. Jones wrote:
> On Thu, Oct 17, 2019 at 10:47:59AM -0500, Mike Christie wrote:
> > On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
> > > On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> > >> Hey Josef and nbd list,
> > >>
> > >> I had a question about if there are any socket family restrictions for nbd?
> > > 
> > > In normal circumstances, in userspace, the NBD protocol would only be
> > > used over AF_UNIX or AF_INET/AF_INET6.
> > > 
> > > There's a bit of confusion because netlink is used by nbd-client to
> > > configure the NBD device, setting things like block size and timeouts
> > > (instead of ioctl which is deprecated).  I think you don't mean this
> > > use of netlink?
> > 
> > I didn't. It looks like it is just a bad test.
> > 
> > For the automated test in this thread the test created an AF_NETLINK
> > socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
> > for the NBD_DO_IT ioctl.
> > 
> > I was not sure if the test creator picked any old socket and it just
> > happened to pick one nbd never supported, or it was trying to simulate
> > sockets that did not support the shutdown method.
> > 
> > I attached the automated test that got run (test.c).
> 
> I'd say it sounds like a bad test, but I'm not familiar with syzkaller
> nor how / from where it generates these tests.  Did someone report a
> bug and then syzkaller wrote this test?
> 
> Rich.
> 
> > > 
> > >> The bug here is that some socket families do not support the
> > >> sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
> > >> their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
> > >> nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
> > >> flush_workqueue call, so for socket families like AF_NETLINK in this bug
> > >> we hang like we see below.
> > >>
> > >> I can just remove the flush_workqueue call in that code path since it's
> > >> not needed there, but it leaves the original bug my patch was hitting
> > >> where we leave the recv_work running which can then result in leaked
> > >> resources, or possible use after free crashes and you still get the hang
> > >> if you remove the module.
> > >>
> > >> It looks like we have used kernel_sock_shutdown for a while so I thought
> > >> we might never have supported sockets that did not support the callout.
> > >> Is that correct? If so then I can just add a check for this in
> > >> nbd_add_socket and fix that bug too.
> > > 
> > > Rich.
> > > 

It's an automatically generated fuzz test.

There's rarely any such thing as a "bad" fuzz test.  If userspace can do
something that causes the kernel to crash or hang, it's a kernel bug, with very
few exceptions (e.g. writing to /dev/mem).

If there are cases that aren't supported, like sockets that don't support a
certain function or whatever, then the code needs to check for those cases and
return an error, not hang the kernel.
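
The unsupported-operation case from the quoted analysis (a socket
family whose shutdown callout returns -EOPNOTSUPP) can be probed from
userspace.  A minimal sketch, assuming Linux; probe_shutdown is a
made-up helper name, and the result for any given family depends on
the running kernel.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 if shutdown() succeeds, the errno
 * value if it fails, or -1 if the socket cannot be created. */
int probe_shutdown(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);
    if (fd == -1)
        return -1;

    int result = 0;
    if (shutdown(fd, SHUT_RDWR) == -1)
        result = errno;   /* save before close() can overwrite it */

    close(fd);
    return result;
}
```

On kernels where netlink sockets do not implement shutdown,
probe_shutdown(AF_NETLINK, SOCK_RAW, 0) should come back with
EOPNOTSUPP, which is the condition a driver would need to detect and
reject up front instead of waiting on the socket forever.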

- Eric


* Re: INFO: task hung in nbd_ioctl
  2019-10-17 15:47     ` Mike Christie
@ 2019-10-17 16:28       ` Richard W.M. Jones
  2019-10-17 16:36         ` Eric Biggers
  0 siblings, 1 reply; 13+ messages in thread
From: Richard W.M. Jones @ 2019-10-17 16:28 UTC (permalink / raw)
  To: Mike Christie
  Cc: syzbot, axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

On Thu, Oct 17, 2019 at 10:47:59AM -0500, Mike Christie wrote:
> On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
> > On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> >> Hey Josef and nbd list,
> >>
> >> I had a question about if there are any socket family restrictions for nbd?
> > 
> > In normal circumstances, in userspace, the NBD protocol would only be
> > used over AF_UNIX or AF_INET/AF_INET6.
> > 
> > There's a bit of confusion because netlink is used by nbd-client to
> > configure the NBD device, setting things like block size and timeouts
> > (instead of ioctl which is deprecated).  I think you don't mean this
> > use of netlink?
> 
> I didn't. It looks like it is just a bad test.
> 
> For the automated test in this thread the test created an AF_NETLINK
> socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
> for the NBD_DO_IT ioctl.
> 
> I was not sure if the test creator picked any old socket and it just
> happened to pick one nbd never supported, or it was trying to simulate
> sockets that did not support the shutdown method.
> 
> I attached the automated test that got run (test.c).

I'd say it sounds like a bad test, but I'm not familiar with syzkaller
nor how / from where it generates these tests.  Did someone report a
bug and then syzkaller wrote this test?

Rich.

> > 
> >> The bug here is that some socket families do not support the
> >> sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
> >> their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
> >> nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
> >> flush_workqueue call, so for socket families like AF_NETLINK in this bug
> >> we hang like we see below.
> >>
> >> I can just remove the flush_workqueue call in that code path since it's
> >> not needed there, but it leaves the original bug my patch was hitting
> >> where we leave the recv_work running which can then result in leaked
> >> resources, or possible use after free crashes and you still get the hang
> >> if you remove the module.
> >>
> >> It looks like we have used kernel_sock_shutdown for a while so I thought
> >> we might never have supported sockets that did not support the callout.
> >> Is that correct? If so then I can just add a check for this in
> >> nbd_add_socket and fix that bug too.
> > 
> > Rich.
> > 
> >> On 09/30/2019 05:39 PM, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzbot found the following crash on:
> >>>
> >>> HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
> >>> git tree:       linux-next
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
> >>> dashboard link:
> >>> https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
> >>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
> >>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000
> >>>
> >>> The bug was bisected to:
> >>>
> >>> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> >>> Author: Mike Christie <mchristi@redhat.com>
> >>> Date:   Sun Aug 4 19:10:06 2019 +0000
> >>>
> >>>     nbd: fix max number of supported devs
> >>>
> >>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
> >>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
> >>> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> >>>
> >>> INFO: task syz-executor390:8778 can't die for more than 143 seconds.
> >>> syz-executor390 D27432  8778   8777 0x00004004
> >>> Call Trace:
> >>>  context_switch kernel/sched/core.c:3384 [inline]
> >>>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
> >>>  schedule+0xd9/0x260 kernel/sched/core.c:4132
> >>>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
> >>>  do_wait_for_common kernel/sched/completion.c:83 [inline]
> >>>  __wait_for_common kernel/sched/completion.c:104 [inline]
> >>>  wait_for_common kernel/sched/completion.c:115 [inline]
> >>>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >>>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
> >>>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
> >>>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
> >>>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
> >>>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
> >>>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
> >>>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
> >>>  vfs_ioctl fs/ioctl.c:47 [inline]
> >>>  file_ioctl fs/ioctl.c:539 [inline]
> >>>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
> >>>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
> >>>  __do_sys_ioctl fs/ioctl.c:750 [inline]
> >>>  __se_sys_ioctl fs/ioctl.c:748 [inline]
> >>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
> >>>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>> RIP: 0033:0x4452d9
> >>> Code: Bad RIP value.
> >>> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> >>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> >>> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> >>> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> >>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> >>> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> >>> INFO: task syz-executor390:8778 blocked for more than 143 seconds.
> >>>       Not tainted 5.3.0-next-20190926 #0
> >>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>> syz-executor390 D27432  8778   8777 0x00004004
> >>> Call Trace:
> >>>  context_switch kernel/sched/core.c:3384 [inline]
> >>>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
> >>>  schedule+0xd9/0x260 kernel/sched/core.c:4132
> >>>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
> >>>  do_wait_for_common kernel/sched/completion.c:83 [inline]
> >>>  __wait_for_common kernel/sched/completion.c:104 [inline]
> >>>  wait_for_common kernel/sched/completion.c:115 [inline]
> >>>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >>>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
> >>>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
> >>>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
> >>>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
> >>>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
> >>>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
> >>>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
> >>>  vfs_ioctl fs/ioctl.c:47 [inline]
> >>>  file_ioctl fs/ioctl.c:539 [inline]
> >>>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
> >>>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
> >>>  __do_sys_ioctl fs/ioctl.c:750 [inline]
> >>>  __se_sys_ioctl fs/ioctl.c:748 [inline]
> >>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
> >>>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
> >>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>> RIP: 0033:0x4452d9
> >>> Code: Bad RIP value.
> >>> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> >>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> >>> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> >>> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> >>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> >>> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> >>>
> >>> Showing all locks held in the system:
> >>> 1 lock held by khungtaskd/1066:
> >>>  #0: ffffffff88faad80 (rcu_read_lock){....}, at:
> >>> debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
> >>> 2 locks held by kworker/u5:0/1525:
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> __write_once_size include/linux/compiler.h:226 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> set_work_data kernel/workqueue.c:620 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
> >>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> >>> process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
> >>>  #1: ffff8880a63b7dc0 ((work_completion)(&args->work)){+.+.}, at:
> >>> process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
> >>> 1 lock held by rsyslogd/8659:
> >>> 2 locks held by getty/8749:
> >>>  #0: ffff888098c08090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f112e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8750:
> >>>  #0: ffff88808f10b090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8751:
> >>>  #0: ffff88809a6be090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8752:
> >>>  #0: ffff8880a48af090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f352e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8753:
> >>>  #0: ffff88808c599090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8754:
> >>>  #0: ffff88808f1a8090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f392e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>> 2 locks held by getty/8755:
> >>>  #0: ffff88809ab33090 (&tty->ldisc_sem){++++}, at:
> >>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >>>  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at:
> >>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> >>>
> >>> =============================================
> >>>
> >>> NMI backtrace for cpu 1
> >>> CPU: 1 PID: 1066 Comm: khungtaskd Not tainted 5.3.0-next-20190926 #0
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >>> Google 01/01/2011
> >>> Call Trace:
> >>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> >>>  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >>>  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >>>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >>>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >>>  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
> >>>  watchdog+0xc99/0x1360 kernel/hung_task.c:353
> >>>  kthread+0x361/0x430 kernel/kthread.c:255
> >>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> >>> Sending NMI from CPU 1 to CPUs 0:
> >>> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10
> >>> arch/x86/include/asm/irqflags.h:60
> >>>
> >>>
> >>> ---
> >>> This bug is generated by a bot. It may contain errors.
> >>> See https://goo.gl/tpsmEJ for more information about syzbot.
> >>> syzbot engineers can be reached at syzkaller@googlegroups.com.
> >>>
> >>> syzbot will keep track of this bug report. See:
> >>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >>> For information about bisection process see:
> >>> https://goo.gl/tpsmEJ#bisection
> >>> syzbot can test patches for this bug, for details see:
> >>> https://goo.gl/tpsmEJ#testing-patches
> > 
> 

> // autogenerated by syzkaller (https://github.com/google/syzkaller)
> 
> #define _GNU_SOURCE
> 
> #include <dirent.h>
> #include <endian.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <setjmp.h>
> #include <signal.h>
> #include <stdarg.h>
> #include <stdbool.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/prctl.h>
> #include <sys/stat.h>
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <time.h>
> #include <unistd.h>
> 
> static __thread int skip_segv;
> static __thread jmp_buf segv_env;
> 
> static void segv_handler(int sig, siginfo_t* info, void* ctx)
> {
>   uintptr_t addr = (uintptr_t)info->si_addr;
>   const uintptr_t prog_start = 1 << 20;
>   const uintptr_t prog_end = 100 << 20;
>   if (__atomic_load_n(&skip_segv, __ATOMIC_RELAXED) &&
>       (addr < prog_start || addr > prog_end)) {
>     _longjmp(segv_env, 1);
>   }
>   exit(sig);
> }
> 
> static void install_segv_handler(void)
> {
>   struct sigaction sa;
>   memset(&sa, 0, sizeof(sa));
>   sa.sa_handler = SIG_IGN;
>   syscall(SYS_rt_sigaction, 0x20, &sa, NULL, 8);
>   syscall(SYS_rt_sigaction, 0x21, &sa, NULL, 8);
>   memset(&sa, 0, sizeof(sa));
>   sa.sa_sigaction = segv_handler;
>   sa.sa_flags = SA_NODEFER | SA_SIGINFO;
>   sigaction(SIGSEGV, &sa, NULL);
>   sigaction(SIGBUS, &sa, NULL);
> }
> 
> #define NONFAILING(...)                                                        \
>   {                                                                            \
>     __atomic_fetch_add(&skip_segv, 1, __ATOMIC_SEQ_CST);                       \
>     if (_setjmp(segv_env) == 0) {                                              \
>       __VA_ARGS__;                                                             \
>     }                                                                          \
>     __atomic_fetch_sub(&skip_segv, 1, __ATOMIC_SEQ_CST);                       \
>   }
> 
> static void sleep_ms(uint64_t ms)
> {
>   usleep(ms * 1000);
> }
> 
> static uint64_t current_time_ms(void)
> {
>   struct timespec ts;
>   if (clock_gettime(CLOCK_MONOTONIC, &ts))
>     exit(1);
>   return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
> }
> 
> static bool write_file(const char* file, const char* what, ...)
> {
>   char buf[1024];
>   va_list args;
>   va_start(args, what);
>   vsnprintf(buf, sizeof(buf), what, args);
>   va_end(args);
>   buf[sizeof(buf) - 1] = 0;
>   int len = strlen(buf);
>   int fd = open(file, O_WRONLY | O_CLOEXEC);
>   if (fd == -1)
>     return false;
>   if (write(fd, buf, len) != len) {
>     int err = errno;
>     close(fd);
>     errno = err;
>     return false;
>   }
>   close(fd);
>   return true;
> }
> 
> static long syz_open_dev(volatile long a0, volatile long a1, volatile long a2)
> {
>   if (a0 == 0xc || a0 == 0xb) {
>     char buf[128];
>     sprintf(buf, "/dev/%s/%d:%d", a0 == 0xc ? "char" : "block", (uint8_t)a1,
>             (uint8_t)a2);
>     return open(buf, O_RDWR, 0);
>   } else {
>     char buf[1024];
>     char* hash;
>     NONFAILING(strncpy(buf, (char*)a0, sizeof(buf) - 1));
>     buf[sizeof(buf) - 1] = 0;
>     while ((hash = strchr(buf, '#'))) {
>       *hash = '0' + (char)(a1 % 10);
>       a1 /= 10;
>     }
>     return open(buf, a2, 0);
>   }
> }
> 
> static void kill_and_wait(int pid, int* status)
> {
>   kill(-pid, SIGKILL);
>   kill(pid, SIGKILL);
>   int i;
>   for (i = 0; i < 100; i++) {
>     if (waitpid(-1, status, WNOHANG | __WALL) == pid)
>       return;
>     usleep(1000);
>   }
>   DIR* dir = opendir("/sys/fs/fuse/connections");
>   if (dir) {
>     for (;;) {
>       struct dirent* ent = readdir(dir);
>       if (!ent)
>         break;
>       if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
>         continue;
>       char abort[300];
>       snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
>                ent->d_name);
>       int fd = open(abort, O_WRONLY);
>       if (fd == -1) {
>         continue;
>       }
>       if (write(fd, abort, 1) < 0) {
>       }
>       close(fd);
>     }
>     closedir(dir);
>   } else {
>   }
>   while (waitpid(-1, status, __WALL) != pid) {
>   }
> }
> 
> static void setup_test()
> {
>   prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
>   setpgrp();
>   write_file("/proc/self/oom_score_adj", "1000");
> }
> 
> static void execute_one(void);
> 
> #define WAIT_FLAGS __WALL
> 
> static void loop(void)
> {
>   int iter;
>   for (iter = 0; iter < 1; iter++) {
>     int pid = fork();
>     if (pid < 0)
>       exit(1);
>     if (pid == 0) {
>       setup_test();
>       execute_one();
>       exit(0);
>     }
>     int status = 0;
>     uint64_t start = current_time_ms();
>     for (;;) {
>       if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
>         break;
>       sleep_ms(1);
>       if (current_time_ms() - start < 5 * 1000)
>         continue;
>       kill_and_wait(pid, &status);
>       break;
>     }
>   }
> }
> 
> uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};
> 
> void execute_one(void)
> {
>   intptr_t res = 0;
>   res = syscall(__NR_socket, 0x10, 2, 2);
>   if (res != -1)
>     r[0] = res;
>   NONFAILING(memcpy((void*)0x20000080, "/dev/nbd#\000", 10));
>   res = syz_open_dev(0x20000080, 0, 0);
>   if (res != -1)
>     r[1] = res;
>   res = syz_open_dev(0, 0, 0);
>   if (res != -1)
>     r[2] = res;
>   syscall(__NR_ioctl, r[2], 0xab00, r[0]);
>   syscall(__NR_ioctl, r[1], 0xab03, 0);
> }
> int main(void)
> {
>   syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
>   install_segv_handler();
>   loop();
>   return 0;
> }




* Re: INFO: task hung in nbd_ioctl
  2019-10-17 14:03   ` Richard W.M. Jones
@ 2019-10-17 15:47     ` Mike Christie
  2019-10-17 16:28       ` Richard W.M. Jones
  2019-10-30  8:39     ` Wouter Verhelst
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Christie @ 2019-10-17 15:47 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: syzbot, axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 11884 bytes --]

On 10/17/2019 09:03 AM, Richard W.M. Jones wrote:
> On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
>> Hey Josef and nbd list,
>>
>> I had a question about if there are any socket family restrictions for nbd?
> 
> In normal circumstances, in userspace, the NBD protocol would only be
> used over AF_UNIX or AF_INET/AF_INET6.
> 
> There's a bit of confusion because netlink is used by nbd-client to
> configure the NBD device, setting things like block size and timeouts
> (instead of ioctl which is deprecated).  I think you don't mean this
> use of netlink?

I didn't. It looks like it is just a bad test.

For the automated test in this thread the test created an AF_NETLINK
socket and passed it into the NBD_SET_SOCK ioctl. That is what got used
for the NBD_DO_IT ioctl.
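
The raw numbers in the reproducer can be matched to those names.  A
small sketch, assuming the Linux UAPI header <linux/nbd.h> is
installed; the REPRO_* names are made up for illustration and only
restate the literals from test.c (0xab03 is also the RSI value in the
hung task's register dump).

```c
#include <sys/ioctl.h>   /* _IO(), needed by the NBD_* macros */
#include <sys/socket.h>  /* AF_NETLINK, SOCK_DGRAM */
#include <linux/nbd.h>   /* NBD_SET_SOCK, NBD_DO_IT */

/* Literals copied from the syzkaller reproducer (test.c). */
enum {
    REPRO_FAMILY   = 0x10,   /* socket(0x10, 2, 2) */
    REPRO_SOCKTYPE = 2,
    REPRO_SET_SOCK = 0xab00, /* ioctl(r[2], 0xab00, r[0]) */
    REPRO_DO_IT    = 0xab03  /* ioctl(r[1], 0xab03, 0) */
};

/* Compile-time checks: the build fails if any literal does not match
 * the symbolic constant it is believed to stand for. */
_Static_assert(REPRO_FAMILY == AF_NETLINK, "0x10 is AF_NETLINK");
_Static_assert(REPRO_SOCKTYPE == SOCK_DGRAM, "2 is SOCK_DGRAM");
_Static_assert(REPRO_SET_SOCK == NBD_SET_SOCK, "0xab00 is NBD_SET_SOCK");
_Static_assert(REPRO_DO_IT == NBD_DO_IT, "0xab03 is NBD_DO_IT");
```

If this compiles, the reproducer's sequence reads as: open a netlink
socket, set it on the device with NBD_SET_SOCK, then start the device
with NBD_DO_IT.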

I was not sure if the test creator picked any old socket and it just
happened to pick one nbd never supported, or it was trying to simulate
sockets that did not support the shutdown method.

I attached the automated test that got run (test.c).

> 
>> The bug here is that some socket families do not support the
>> sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
>> their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
>> nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
>> flush_workqueue call, so for socket families like AF_NETLINK in this bug
>> we hang like we see below.
>>
>> I can just remove the flush_workqueue call in that code path since it's
>> not needed there, but it leaves the original bug my patch was hitting
>> where we leave the recv_work running which can then result in leaked
>> resources, or possible use after free crashes and you still get the hang
>> if you remove the module.
>>
>> It looks like we have used kernel_sock_shutdown for a while so I thought
>> we might never have supported sockets that did not support the callout.
>> Is that correct? If so then I can just add a check for this in
>> nbd_add_socket and fix that bug too.
> 
> Rich.
> 
>> On 09/30/2019 05:39 PM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
>>> git tree:       linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
>>> dashboard link:
>>> https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000
>>>
>>> The bug was bisected to:
>>>
>>> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
>>> Author: Mike Christie <mchristi@redhat.com>
>>> Date:   Sun Aug 4 19:10:06 2019 +0000
>>>
>>>     nbd: fix max number of supported devs
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
>>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
>>> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
>>>
>>> INFO: task syz-executor390:8778 can't die for more than 143 seconds.
>>> syz-executor390 D27432  8778   8777 0x00004004
>>> Call Trace:
>>>  context_switch kernel/sched/core.c:3384 [inline]
>>>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>>>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>>>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>>>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>>>  __wait_for_common kernel/sched/completion.c:104 [inline]
>>>  wait_for_common kernel/sched/completion.c:115 [inline]
>>>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>>>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>>>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>>>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>>>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>>>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>>>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>>>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>>>  vfs_ioctl fs/ioctl.c:47 [inline]
>>>  file_ioctl fs/ioctl.c:539 [inline]
>>>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>>>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>>>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>>>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x4452d9
>>> Code: Bad RIP value.
>>> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
>>> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
>>> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
>>> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
>>> INFO: task syz-executor390:8778 blocked for more than 143 seconds.
>>>       Not tainted 5.3.0-next-20190926 #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> syz-executor390 D27432  8778   8777 0x00004004
>>> Call Trace:
>>>  context_switch kernel/sched/core.c:3384 [inline]
>>>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>>>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>>>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>>>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>>>  __wait_for_common kernel/sched/completion.c:104 [inline]
>>>  wait_for_common kernel/sched/completion.c:115 [inline]
>>>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>>>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>>>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>>>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>>>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>>>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>>>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>>>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>>>  vfs_ioctl fs/ioctl.c:47 [inline]
>>>  file_ioctl fs/ioctl.c:539 [inline]
>>>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>>>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>>>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>>>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>>>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>>>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x4452d9
>>> Code: Bad RIP value.
>>> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
>>> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
>>> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
>>> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
>>>
>>> Showing all locks held in the system:
>>> 1 lock held by khungtaskd/1066:
>>>  #0: ffffffff88faad80 (rcu_read_lock){....}, at:
>>> debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
>>> 2 locks held by kworker/u5:0/1525:
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> __write_once_size include/linux/compiler.h:226 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> set_work_data kernel/workqueue.c:620 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
>>>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
>>> process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
>>>  #1: ffff8880a63b7dc0 ((work_completion)(&args->work)){+.+.}, at:
>>> process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
>>> 1 lock held by rsyslogd/8659:
>>> 2 locks held by getty/8749:
>>>  #0: ffff888098c08090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f112e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8750:
>>>  #0: ffff88808f10b090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8751:
>>>  #0: ffff88809a6be090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8752:
>>>  #0: ffff8880a48af090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f352e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8753:
>>>  #0: ffff88808c599090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8754:
>>>  #0: ffff88808f1a8090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f392e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>> 2 locks held by getty/8755:
>>>  #0: ffff88809ab33090 (&tty->ldisc_sem){++++}, at:
>>> ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>>>  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at:
>>> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
>>>
>>> =============================================
>>>
>>> NMI backtrace for cpu 1
>>> CPU: 1 PID: 1066 Comm: khungtaskd Not tainted 5.3.0-next-20190926 #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>>>  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
>>>  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
>>>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>>>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>>>  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
>>>  watchdog+0xc99/0x1360 kernel/hung_task.c:353
>>>  kthread+0x361/0x430 kernel/kthread.c:255
>>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>> Sending NMI from CPU 1 to CPUs 0:
>>> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10
>>> arch/x86/include/asm/irqflags.h:60
>>>
>>>
>>> ---
>>> This bug is generated by a bot. It may contain errors.
>>> See https://goo.gl/tpsmEJ for more information about syzbot.
>>> syzbot engineers can be reached at syzkaller@googlegroups.com.
>>>
>>> syzbot will keep track of this bug report. See:
>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>> For information about bisection process see:
>>> https://goo.gl/tpsmEJ#bisection
>>> syzbot can test patches for this bug, for details see:
>>> https://goo.gl/tpsmEJ#testing-patches
> 


[-- Attachment #2: test.c --]
[-- Type: text/x-csrc, Size: 5247 bytes --]

// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static __thread int skip_segv;
static __thread jmp_buf segv_env;

static void segv_handler(int sig, siginfo_t* info, void* ctx)
{
  uintptr_t addr = (uintptr_t)info->si_addr;
  const uintptr_t prog_start = 1 << 20;
  const uintptr_t prog_end = 100 << 20;
  if (__atomic_load_n(&skip_segv, __ATOMIC_RELAXED) &&
      (addr < prog_start || addr > prog_end)) {
    _longjmp(segv_env, 1);
  }
  exit(sig);
}

static void install_segv_handler(void)
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SIG_IGN;
  syscall(SYS_rt_sigaction, 0x20, &sa, NULL, 8);
  syscall(SYS_rt_sigaction, 0x21, &sa, NULL, 8);
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_NODEFER | SA_SIGINFO;
  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGBUS, &sa, NULL);
}

#define NONFAILING(...)                                                        \
  {                                                                            \
    __atomic_fetch_add(&skip_segv, 1, __ATOMIC_SEQ_CST);                       \
    if (_setjmp(segv_env) == 0) {                                              \
      __VA_ARGS__;                                                             \
    }                                                                          \
    __atomic_fetch_sub(&skip_segv, 1, __ATOMIC_SEQ_CST);                       \
  }

static void sleep_ms(uint64_t ms)
{
  usleep(ms * 1000);
}

static uint64_t current_time_ms(void)
{
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts))
    exit(1);
  return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static bool write_file(const char* file, const char* what, ...)
{
  char buf[1024];
  va_list args;
  va_start(args, what);
  vsnprintf(buf, sizeof(buf), what, args);
  va_end(args);
  buf[sizeof(buf) - 1] = 0;
  int len = strlen(buf);
  int fd = open(file, O_WRONLY | O_CLOEXEC);
  if (fd == -1)
    return false;
  if (write(fd, buf, len) != len) {
    int err = errno;
    close(fd);
    errno = err;
    return false;
  }
  close(fd);
  return true;
}

static long syz_open_dev(volatile long a0, volatile long a1, volatile long a2)
{
  if (a0 == 0xc || a0 == 0xb) {
    char buf[128];
    sprintf(buf, "/dev/%s/%d:%d", a0 == 0xc ? "char" : "block", (uint8_t)a1,
            (uint8_t)a2);
    return open(buf, O_RDWR, 0);
  } else {
    char buf[1024];
    char* hash;
    NONFAILING(strncpy(buf, (char*)a0, sizeof(buf) - 1));
    buf[sizeof(buf) - 1] = 0;
    while ((hash = strchr(buf, '#'))) {
      *hash = '0' + (char)(a1 % 10);
      a1 /= 10;
    }
    return open(buf, a2, 0);
  }
}

static void kill_and_wait(int pid, int* status)
{
  kill(-pid, SIGKILL);
  kill(pid, SIGKILL);
  int i;
  for (i = 0; i < 100; i++) {
    if (waitpid(-1, status, WNOHANG | __WALL) == pid)
      return;
    usleep(1000);
  }
  DIR* dir = opendir("/sys/fs/fuse/connections");
  if (dir) {
    for (;;) {
      struct dirent* ent = readdir(dir);
      if (!ent)
        break;
      if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
        continue;
      char abort[300];
      snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
               ent->d_name);
      int fd = open(abort, O_WRONLY);
      if (fd == -1) {
        continue;
      }
      if (write(fd, abort, 1) < 0) {
      }
      close(fd);
    }
    closedir(dir);
  } else {
  }
  while (waitpid(-1, status, __WALL) != pid) {
  }
}

static void setup_test()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  write_file("/proc/self/oom_score_adj", "1000");
}

static void execute_one(void);

#define WAIT_FLAGS __WALL

static void loop(void)
{
  int iter;
  for (iter = 0; iter < 1; iter++) {
    int pid = fork();
    if (pid < 0)
      exit(1);
    if (pid == 0) {
      setup_test();
      execute_one();
      exit(0);
    }
    int status = 0;
    uint64_t start = current_time_ms();
    for (;;) {
      if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
        break;
      sleep_ms(1);
      if (current_time_ms() - start < 5 * 1000)
        continue;
      kill_and_wait(pid, &status);
      break;
    }
  }
}

uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};

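void execute_one(void)
{
  intptr_t res = 0;
  // socket(AF_NETLINK, SOCK_DGRAM, NETLINK_USERSOCK)
  res = syscall(__NR_socket, 0x10, 2, 2);
  if (res != -1)
    r[0] = res;
  // Place the device path template in the fixed mapping set up by main().
  NONFAILING(memcpy((void*)0x20000080, "/dev/nbd#\000", 10));
  res = syz_open_dev(0x20000080, 0, 0);
  if (res != -1)
    r[1] = res;
  res = syz_open_dev(0, 0, 0);
  if (res != -1)
    r[2] = res;
  // 0xab00 is NBD_SET_SOCK: pass socket r[0] to the device fd r[2].
  syscall(__NR_ioctl, r[2], 0xab00, r[0]);
  // 0xab03 is NBD_DO_IT: the ioctl that hangs (RSI is 0xab03 in the report).
  syscall(__NR_ioctl, r[1], 0xab03, 0);
}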
int main(void)
{
  syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
  install_segv_handler();
  loop();
  return 0;
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: INFO: task hung in nbd_ioctl
  2019-10-01 21:19 ` Mike Christie
@ 2019-10-17 14:03   ` Richard W.M. Jones
  2019-10-17 15:47     ` Mike Christie
  2019-10-30  8:39     ` Wouter Verhelst
  0 siblings, 2 replies; 13+ messages in thread
From: Richard W.M. Jones @ 2019-10-17 14:03 UTC (permalink / raw)
  To: Mike Christie
  Cc: syzbot, axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

On Tue, Oct 01, 2019 at 04:19:25PM -0500, Mike Christie wrote:
> Hey Josef and nbd list,
> 
> I had a question about if there are any socket family restrictions for nbd?

In normal circumstances, in userspace, the NBD protocol would only be
used over AF_UNIX or AF_INET/AF_INET6.

There's a bit of confusion because netlink is used by nbd-client to
configure the NBD device, setting things like block size and timeouts
(instead of ioctl which is deprecated).  I think you don't mean this
use of netlink?

> The bug here is that some socket families do not support the
> sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
> their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
> nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
> flush_workqueue call, so for socket families like AF_NETLINK in this bug
> we hang like we see below.
> 
> I can just remove the flush_workqueue call in that code path since it's
> not needed there, but it leaves the original bug my patch was hitting
> where we leave the recv_work running, which can then result in leaked
> resources or possible use-after-free crashes, and you still get the hang
> if you remove the module.
> 
> It looks like we have used kernel_sock_shutdown for a while so I thought
> we might never have supported sockets that did not support the callout.
> Is that correct? If so then I can just add a check for this in
> nbd_add_socket and fix that bug too.

Rich.

> On 09/30/2019 05:39 PM, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000
> > 
> > The bug was bisected to:
> > 
> > commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> > Author: Mike Christie <mchristi@redhat.com>
> > Date:   Sun Aug 4 19:10:06 2019 +0000
> > 
> >     nbd: fix max number of supported devs
> > 
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
> > Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> > 
> > INFO: task syz-executor390:8778 can't die for more than 143 seconds.
> > syz-executor390 D27432  8778   8777 0x00004004
> > Call Trace:
> >  context_switch kernel/sched/core.c:3384 [inline]
> >  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
> >  schedule+0xd9/0x260 kernel/sched/core.c:4132
> >  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
> >  do_wait_for_common kernel/sched/completion.c:83 [inline]
> >  __wait_for_common kernel/sched/completion.c:104 [inline]
> >  wait_for_common kernel/sched/completion.c:115 [inline]
> >  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
> >  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
> >  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
> >  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
> >  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
> >  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
> >  block_ioctl+0xee/0x130 fs/block_dev.c:1954
> >  vfs_ioctl fs/ioctl.c:47 [inline]
> >  file_ioctl fs/ioctl.c:539 [inline]
> >  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
> >  __do_sys_ioctl fs/ioctl.c:750 [inline]
> >  __se_sys_ioctl fs/ioctl.c:748 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
> >  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x4452d9
> > Code: Bad RIP value.
> > RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> > RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> > RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> > R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> > INFO: task syz-executor390:8778 blocked for more than 143 seconds.
> >       Not tainted 5.3.0-next-20190926 #0
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor390 D27432  8778   8777 0x00004004
> > Call Trace:
> >  context_switch kernel/sched/core.c:3384 [inline]
> >  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
> >  schedule+0xd9/0x260 kernel/sched/core.c:4132
> >  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
> >  do_wait_for_common kernel/sched/completion.c:83 [inline]
> >  __wait_for_common kernel/sched/completion.c:104 [inline]
> >  wait_for_common kernel/sched/completion.c:115 [inline]
> >  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
> >  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
> >  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
> >  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
> >  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
> >  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
> >  block_ioctl+0xee/0x130 fs/block_dev.c:1954
> >  vfs_ioctl fs/ioctl.c:47 [inline]
> >  file_ioctl fs/ioctl.c:539 [inline]
> >  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
> >  __do_sys_ioctl fs/ioctl.c:750 [inline]
> >  __se_sys_ioctl fs/ioctl.c:748 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
> >  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x4452d9
> > Code: Bad RIP value.
> > RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> > RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> > RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> > R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> > 
> > Showing all locks held in the system:
> > 1 lock held by khungtaskd/1066:
> >  #0: ffffffff88faad80 (rcu_read_lock){....}, at:
> > debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
> > 2 locks held by kworker/u5:0/1525:
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > __write_once_size include/linux/compiler.h:226 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > set_work_data kernel/workqueue.c:620 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
> >  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at:
> > process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
> >  #1: ffff8880a63b7dc0 ((work_completion)(&args->work)){+.+.}, at:
> > process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
> > 1 lock held by rsyslogd/8659:
> > 2 locks held by getty/8749:
> >  #0: ffff888098c08090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f112e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8750:
> >  #0: ffff88808f10b090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8751:
> >  #0: ffff88809a6be090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8752:
> >  #0: ffff8880a48af090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f352e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8753:
> >  #0: ffff88808c599090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8754:
> >  #0: ffff88808f1a8090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f392e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 2 locks held by getty/8755:
> >  #0: ffff88809ab33090 (&tty->ldisc_sem){++++}, at:
> > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
> >  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> > 
> > =============================================
> > 
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 1066 Comm: khungtaskd Not tainted 5.3.0-next-20190926 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> >  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
> >  watchdog+0xc99/0x1360 kernel/hung_task.c:353
> >  kthread+0x361/0x430 kernel/kthread.c:255
> >  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 1 to CPUs 0:
> > NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10
> > arch/x86/include/asm/irqflags.h:60
> > 
> > 
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > 
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > For information about bisection process see:
> > https://goo.gl/tpsmEJ#bisection
> > syzbot can test patches for this bug, for details see:
> > https://goo.gl/tpsmEJ#testing-patches

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: INFO: task hung in nbd_ioctl
  2019-09-30 22:39 syzbot
  2019-10-01 17:48 ` Mike Christie
@ 2019-10-01 21:19 ` Mike Christie
  2019-10-17 14:03   ` Richard W.M. Jones
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Christie @ 2019-10-01 21:19 UTC (permalink / raw)
  To: syzbot, axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

Hey Josef and nbd list,

I had a question about if there are any socket family restrictions for nbd?

The bug here is that some socket families do not support the
sock->ops->shutdown callout, and when nbd calls kernel_sock_shutdown
their callout returns -EOPNOTSUPP. That then leaves recv_work stuck in
nbd_read_stat -> sock_xmit -> sock_recvmsg. My patch added a
flush_workqueue call, so for socket families like AF_NETLINK in this bug
we hang like we see below.

I can just remove the flush_workqueue call in that code path since it's
not needed there, but it leaves the original bug my patch was hitting
where we leave the recv_work running, which can then result in leaked
resources or possible use-after-free crashes, and you still get the hang
if you remove the module.

It looks like we have used kernel_sock_shutdown for a while so I thought
we might never have supported sockets that did not support the callout.
Is that correct? If so then I can just add a check for this in
nbd_add_socket and fix that bug too.
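
The unsupported shutdown callout can also be observed from userspace,
since shutdown(2) goes through the same sock->ops->shutdown callout. The
following is a minimal sketch, not nbd code: the three helper names are
hypothetical, and the literal protocol value 2 (NETLINK_USERSOCK) is
taken from the reproducer's socket(0x10, 2, 2) call.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if shutdown(2) succeeded on fd, otherwise the errno it set. */
int shutdown_errno(int fd)
{
    return shutdown(fd, SHUT_RDWR) == 0 ? 0 : errno;
}

/*
 * Probe a throwaway AF_NETLINK socket (hypothetical helper).
 * Returns -1 if the socket could not be created, otherwise the result
 * of shutdown_errno() on it.
 */
int netlink_shutdown_errno(void)
{
    /* 2 is NETLINK_USERSOCK, the protocol the reproducer passes. */
    int fd = socket(AF_NETLINK, SOCK_DGRAM, 2);
    int ret;

    if (fd < 0)
        return -1;
    ret = shutdown_errno(fd);
    close(fd);
    return ret;
}

/*
 * Same probe on one end of a connected AF_UNIX pair (hypothetical helper).
 * Returns -1 if the pair could not be created.
 */
int unix_shutdown_errno(void)
{
    int sv[2];
    int ret;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    ret = shutdown_errno(sv[0]);
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

Given the description above, the expectation is that
netlink_shutdown_errno() reports EOPNOTSUPP while unix_shutdown_errno()
reports 0. Note that the probe is destructive (it really shuts the socket
down), so it is only meant for throwaway sockets like the ones created
here.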


On 09/30/2019 05:39 PM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
> dashboard link:
> https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000
> 
> The bug was bisected to:
> 
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> Author: Mike Christie <mchristi@redhat.com>
> Date:   Sun Aug 4 19:10:06 2019 +0000
> 
>     nbd: fix max number of supported devs
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> 
> INFO: task syz-executor390:8778 can't die for more than 143 seconds.
> syz-executor390 D27432  8778   8777 0x00004004
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>  __wait_for_common kernel/sched/completion.c:104 [inline]
>  wait_for_common kernel/sched/completion.c:115 [inline]
>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>  vfs_ioctl fs/ioctl.c:47 [inline]
>  file_ioctl fs/ioctl.c:539 [inline]
>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4452d9
> Code: Bad RIP value.
> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> INFO: task syz-executor390:8778 blocked for more than 143 seconds.
>       Not tainted 5.3.0-next-20190926 #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor390 D27432  8778   8777 0x00004004
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>  __wait_for_common kernel/sched/completion.c:104 [inline]
>  wait_for_common kernel/sched/completion.c:115 [inline]
>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>  vfs_ioctl fs/ioctl.c:47 [inline]
>  file_ioctl fs/ioctl.c:539 [inline]
>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4452d9
> Code: Bad RIP value.
> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> 
> Showing all locks held in the system:
> 1 lock held by khungtaskd/1066:
>  #0: ffffffff88faad80 (rcu_read_lock){....}, at: debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
> 2 locks held by kworker/u5:0/1525:
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: __write_once_size include/linux/compiler.h:226 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: set_work_data kernel/workqueue.c:620 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
>  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
>  #1: ffff8880a63b7dc0 ((work_completion)(&args->work)){+.+.}, at: process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
> 1 lock held by rsyslogd/8659:
> 2 locks held by getty/8749:
>  #0: ffff888098c08090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f112e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8750:
>  #0: ffff88808f10b090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8751:
>  #0: ffff88809a6be090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8752:
>  #0: ffff8880a48af090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f352e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8753:
>  #0: ffff88808c599090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8754:
>  #0: ffff88808f1a8090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f392e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8755:
>  #0: ffff88809ab33090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
>  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 
> =============================================
> 
> NMI backtrace for cpu 1
> CPU: 1 PID: 1066 Comm: khungtaskd Not tainted 5.3.0-next-20190926 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
>  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
>  watchdog+0xc99/0x1360 kernel/hung_task.c:353
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see:
> https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches



* Re: INFO: task hung in nbd_ioctl
  2019-09-30 22:39 syzbot
@ 2019-10-01 17:48 ` Mike Christie
  2019-10-01 21:19 ` Mike Christie
  1 sibling, 0 replies; 13+ messages in thread
From: Mike Christie @ 2019-10-01 17:48 UTC (permalink / raw)
  To: syzbot, axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

On 09/30/2019 05:39 PM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
> dashboard link:
> https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000
> 
> The bug was bisected to:
> 
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> Author: Mike Christie <mchristi@redhat.com>
> Date:   Sun Aug 4 19:10:06 2019 +0000
> 
>     nbd: fix max number of supported devs
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> 
> INFO: task syz-executor390:8778 can't die for more than 143 seconds.
> syz-executor390 D27432  8778   8777 0x00004004
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>  __wait_for_common kernel/sched/completion.c:104 [inline]
>  wait_for_common kernel/sched/completion.c:115 [inline]
>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>  vfs_ioctl fs/ioctl.c:47 [inline]
>  file_ioctl fs/ioctl.c:539 [inline]
>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4452d9
> Code: Bad RIP value.
> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> INFO: task syz-executor390:8778 blocked for more than 143 seconds.
>       Not tainted 5.3.0-next-20190926 #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor390 D27432  8778   8777 0x00004004
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
>  schedule+0xd9/0x260 kernel/sched/core.c:4132
>  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>  __wait_for_common kernel/sched/completion.c:104 [inline]
>  wait_for_common kernel/sched/completion.c:115 [inline]
>  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
>  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
>  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
>  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
>  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
>  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
>  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
>  block_ioctl+0xee/0x130 fs/block_dev.c:1954
>  vfs_ioctl fs/ioctl.c:47 [inline]
>  file_ioctl fs/ioctl.c:539 [inline]
>  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
>  __do_sys_ioctl fs/ioctl.c:750 [inline]
>  __se_sys_ioctl fs/ioctl.c:748 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
>  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4452d9
> Code: Bad RIP value.
> RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
> RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
> RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
> R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
> 

I will send a fix for this.

I had assumed that, for every socket type, kernel_sock_shutdown would
break us out of the sock_recvmsg call, but it looks like that's not the case.
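
As a userspace illustration of the assumption described above (not the
kernel code path, and not the fix itself), the sketch below blocks a
thread in recv() on an AF_UNIX stream socket and then shuts that socket
down from another thread. The helper names recv_once and
shutdown_wakes_reader are hypothetical, and the AF_UNIX socketpair is
only an assumed stand-in for a socket type on which shutdown does wake a
blocked receiver; other socket families may behave differently.

```python
import socket
import threading
import time


def recv_once(sock, result):
    # Blocks until data arrives, EOF is seen, or the socket is shut down.
    result.append(sock.recv(4096))


def shutdown_wakes_reader(timeout=5.0):
    # Assumption: an AF_UNIX stream pair stands in for a socket type on
    # which shutdown wakes a blocked receiver; nbd is not involved here.
    reader, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    result = []
    worker = threading.Thread(target=recv_once, args=(reader, result))
    worker.start()
    try:
        time.sleep(0.2)  # best effort: let the worker block in recv() first
        reader.shutdown(socket.SHUT_RDWR)
        worker.join(timeout)
        woke = not worker.is_alive()
    finally:
        peer.close()  # fallback so the worker cannot stay blocked forever
        worker.join()
        reader.close()
    return woke, result
```

On Linux, shutdown_wakes_reader() is expected to report that the worker
woke up with an empty read. A socket type whose shutdown does not wake a
blocked receive would leave the worker stuck until the fallback close,
which is roughly the userspace analogue of the flush_workqueue hang in
the trace above.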


* INFO: task hung in nbd_ioctl
@ 2019-09-30 22:39 syzbot
  2019-10-01 17:48 ` Mike Christie
  2019-10-01 21:19 ` Mike Christie
  0 siblings, 2 replies; 13+ messages in thread
From: syzbot @ 2019-09-30 22:39 UTC (permalink / raw)
  To: axboe, josef, linux-block, linux-kernel, mchristi, nbd, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    bb2aee77 Add linux-next specific files for 20190926
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13385ca3600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e60af4ac5a01e964
dashboard link: https://syzkaller.appspot.com/bug?extid=24c12fa8d218ed26011a
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12abc2a3600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11712c05600000

The bug was bisected to:

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
Author: Mike Christie <mchristi@redhat.com>
Date:   Sun Aug 4 19:10:06 2019 +0000

     nbd: fix max number of supported devs

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226f3c5600000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=1126f3c5600000
console output: https://syzkaller.appspot.com/x/log.txt?x=1626f3c5600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")

INFO: task syz-executor390:8778 can't die for more than 143 seconds.
syz-executor390 D27432  8778   8777 0x00004004
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
  schedule+0xd9/0x260 kernel/sched/core.c:4132
  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
  do_wait_for_common kernel/sched/completion.c:83 [inline]
  __wait_for_common kernel/sched/completion.c:104 [inline]
  wait_for_common kernel/sched/completion.c:115 [inline]
  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
  block_ioctl+0xee/0x130 fs/block_dev.c:1954
  vfs_ioctl fs/ioctl.c:47 [inline]
  file_ioctl fs/ioctl.c:539 [inline]
  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
  __do_sys_ioctl fs/ioctl.c:750 [inline]
  __se_sys_ioctl fs/ioctl.c:748 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4452d9
Code: Bad RIP value.
RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000
INFO: task syz-executor390:8778 blocked for more than 143 seconds.
       Not tainted 5.3.0-next-20190926 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor390 D27432  8778   8777 0x00004004
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x828/0x1c20 kernel/sched/core.c:4065
  schedule+0xd9/0x260 kernel/sched/core.c:4132
  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1871
  do_wait_for_common kernel/sched/completion.c:83 [inline]
  __wait_for_common kernel/sched/completion.c:104 [inline]
  wait_for_common kernel/sched/completion.c:115 [inline]
  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
  flush_workqueue+0x40f/0x14c0 kernel/workqueue.c:2826
  nbd_start_device_ioctl drivers/block/nbd.c:1272 [inline]
  __nbd_ioctl drivers/block/nbd.c:1347 [inline]
  nbd_ioctl+0xb2e/0xc44 drivers/block/nbd.c:1387
  __blkdev_driver_ioctl block/ioctl.c:304 [inline]
  blkdev_ioctl+0xedb/0x1c20 block/ioctl.c:606
  block_ioctl+0xee/0x130 fs/block_dev.c:1954
  vfs_ioctl fs/ioctl.c:47 [inline]
  file_ioctl fs/ioctl.c:539 [inline]
  do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
  __do_sys_ioctl fs/ioctl.c:750 [inline]
  __se_sys_ioctl fs/ioctl.c:748 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4452d9
Code: Bad RIP value.
RSP: 002b:00007ffde928d288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004452d9
RDX: 0000000000000000 RSI: 000000000000ab03 RDI: 0000000000000004
RBP: 0000000000000000 R08: 00000000004025b0 R09: 00000000004025b0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402520
R13: 00000000004025b0 R14: 0000000000000000 R15: 0000000000000000

Showing all locks held in the system:
1 lock held by khungtaskd/1066:
  #0: ffffffff88faad80 (rcu_read_lock){....}, at: debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
2 locks held by kworker/u5:0/1525:
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: __write_once_size include/linux/compiler.h:226 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: set_work_data kernel/workqueue.c:620 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
  #0: ffff8880923d0d28 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
  #1: ffff8880a63b7dc0 ((work_completion)(&args->work)){+.+.}, at: process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
1 lock held by rsyslogd/8659:
2 locks held by getty/8749:
  #0: ffff888098c08090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f112e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8750:
  #0: ffff88808f10b090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8751:
  #0: ffff88809a6be090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f192e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8752:
  #0: ffff8880a48af090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f352e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8753:
  #0: ffff88808c599090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8754:
  #0: ffff88808f1a8090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f392e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
2 locks held by getty/8755:
  #0: ffff88809ab33090 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 1066 Comm: khungtaskd Not tainted 5.3.0-next-20190926 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
  check_hung_uninterruptible_tasks kernel/hung_task.c:269 [inline]
  watchdog+0xc99/0x1360 kernel/hung_task.c:353
  kthread+0x361/0x430 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


end of thread, other threads:[~2021-09-18  1:35 UTC | newest]

Thread overview: 13+ messages
2021-09-16  2:43 INFO: task hung in nbd_ioctl Hao Sun
  -- strict thread matches above, loose matches on Subject: below --
2021-09-18  1:34 Hao Sun
2019-09-30 22:39 syzbot
2019-10-01 17:48 ` Mike Christie
2019-10-01 21:19 ` Mike Christie
2019-10-17 14:03   ` Richard W.M. Jones
2019-10-17 15:47     ` Mike Christie
2019-10-17 16:28       ` Richard W.M. Jones
2019-10-17 16:36         ` Eric Biggers
2019-10-17 16:49           ` Richard W.M. Jones
2019-10-17 21:26             ` Mike Christie
2019-10-30  8:39     ` Wouter Verhelst
2019-10-30  8:41       ` Wouter Verhelst
