linux-fsdevel.vger.kernel.org archive mirror
* INFO: task hung in fuse_reverse_inval_entry
@ 2018-07-23  7:59 syzbot
  2018-07-23  8:11 ` Dmitry Vyukov
  2019-11-07 13:42 ` syzbot
  0 siblings, 2 replies; 16+ messages in thread
From: syzbot @ 2018-07-23  7:59 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, miklos, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
INFO: task syz-executor842:4559 blocked for more than 140 seconds.
       Not tainted 4.18.0-rc6+ #160
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor842 D23528  4559   4556 0x00000004
Call Trace:
  context_switch kernel/sched/core.c:2853 [inline]
  __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
  schedule+0xfb/0x450 kernel/sched/core.c:3545
  __rwsem_down_write_failed_common+0x95d/0x1630 kernel/locking/rwsem-xadd.c:566
  rwsem_down_write_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:595
  call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
  __down_write arch/x86/include/asm/rwsem.h:142 [inline]
  down_write+0xaa/0x130 kernel/locking/rwsem.c:72
  inode_lock include/linux/fs.h:715 [inline]
  fuse_reverse_inval_entry+0xae/0x6d0 fs/fuse/dir.c:969
  fuse_notify_inval_entry fs/fuse/dev.c:1491 [inline]
  fuse_notify fs/fuse/dev.c:1764 [inline]
  fuse_dev_do_write+0x2b97/0x3700 fs/fuse/dev.c:1848
  fuse_dev_write+0x19a/0x240 fs/fuse/dev.c:1928
  call_write_iter include/linux/fs.h:1793 [inline]
  new_sync_write fs/read_write.c:474 [inline]
  __vfs_write+0x6c6/0x9f0 fs/read_write.c:487
  vfs_write+0x1f8/0x560 fs/read_write.c:549
  ksys_write+0x101/0x260 fs/read_write.c:598
  __do_sys_write fs/read_write.c:610 [inline]
  __se_sys_write fs/read_write.c:607 [inline]
  __x64_sys_write+0x73/0xb0 fs/read_write.c:607
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445869
Code: Bad RIP value.
RSP: 002b:00007ffa2ef7fda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 0000000000445869
RDX: 0000000000000029 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0030656c69662f2e
R13: 64695f70756f7267 R14: 2f30656c69662f2e R15: 0000000000000001
INFO: task syz-executor842:4560 blocked for more than 140 seconds.
       Not tainted 4.18.0-rc6+ #160
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor842 D26008  4560   4556 0x00000004
Call Trace:
  context_switch kernel/sched/core.c:2853 [inline]
  __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
  schedule+0xfb/0x450 kernel/sched/core.c:3545
  request_wait_answer+0x4c8/0x920 fs/fuse/dev.c:463
  __fuse_request_send+0x12a/0x1d0 fs/fuse/dev.c:483
  fuse_request_send+0x62/0xa0 fs/fuse/dev.c:496
  fuse_simple_request+0x33d/0x730 fs/fuse/dev.c:554
  fuse_lookup_name+0x3ee/0x830 fs/fuse/dir.c:323
  fuse_lookup+0xf9/0x4c0 fs/fuse/dir.c:360
  __lookup_hash+0x12e/0x190 fs/namei.c:1505
  filename_create+0x1e5/0x5b0 fs/namei.c:3646
  user_path_create fs/namei.c:3703 [inline]
  do_mkdirat+0xda/0x310 fs/namei.c:3842
  __do_sys_mkdirat fs/namei.c:3861 [inline]
  __se_sys_mkdirat fs/namei.c:3859 [inline]
  __x64_sys_mkdirat+0x76/0xb0 fs/namei.c:3859
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445869
Code: Bad RIP value.
RSP: 002b:00007ffa2ef5eda8 EFLAGS: 00000297 ORIG_RAX: 0000000000000102
RAX: ffffffffffffffda RBX: 00000000006dac3c RCX: 0000000000445869
RDX: 0000000000000000 RSI: 0000000020000500 RDI: 00000000ffffff9c
RBP: 00000000006dac38 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 0030656c69662f2e
R13: 64695f70756f7267 R14: 2f30656c69662f2e R15: 0000000000000001

Showing all locks held in the system:
1 lock held by khungtaskd/901:
  #0: (____ptrval____) (rcu_read_lock){....}, at: debug_show_all_locks+0xd0/0x428 kernel/locking/lockdep.c:4461
1 lock held by rsyslogd/4441:
  #0: (____ptrval____) (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1bb/0x200 fs/file.c:766
2 locks held by getty/4531:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4532:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4533:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4534:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4535:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4536:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/4537:
  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by syz-executor842/4559:
  #0: (____ptrval____) (&fc->killsb){.+.+}, at: fuse_notify_inval_entry fs/fuse/dev.c:1488 [inline]
  #0: (____ptrval____) (&fc->killsb){.+.+}, at: fuse_notify fs/fuse/dev.c:1764 [inline]
  #0: (____ptrval____) (&fc->killsb){.+.+}, at: fuse_dev_do_write+0x2b2d/0x3700 fs/fuse/dev.c:1848
  #1: (____ptrval____) (&type->i_mutex_dir_key#4){+.+.}, at: inode_lock include/linux/fs.h:715 [inline]
  #1: (____ptrval____) (&type->i_mutex_dir_key#4){+.+.}, at: fuse_reverse_inval_entry+0xae/0x6d0 fs/fuse/dir.c:969
3 locks held by syz-executor842/4560:
  #0: (____ptrval____) (sb_writers#9){.+.+}, at: sb_start_write include/linux/fs.h:1554 [inline]
  #0: (____ptrval____) (sb_writers#9){.+.+}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:386
  #1: (____ptrval____) (&type->i_mutex_dir_key#3/1){+.+.}, at: inode_lock_nested include/linux/fs.h:750 [inline]
  #1: (____ptrval____) (&type->i_mutex_dir_key#3/1){+.+.}, at: filename_create+0x1b2/0x5b0 fs/namei.c:3645
  #2: (____ptrval____) (&fi->mutex){+.+.}, at: fuse_lock_inode+0xaf/0xe0 fs/fuse/inode.c:363

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 901 Comm: khungtaskd Not tainted 4.18.0-rc6+ #160
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
  check_hung_uninterruptible_tasks kernel/hung_task.c:196 [inline]
  watchdog+0x9c4/0xf80 kernel/hung_task.c:252
  kthread+0x345/0x410 kernel/kthread.c:246
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23  7:59 INFO: task hung in fuse_reverse_inval_entry syzbot
@ 2018-07-23  8:11 ` Dmitry Vyukov
  2018-07-23 12:12   ` Miklos Szeredi
  2019-11-07 13:42 ` syzbot
  1 sibling, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-23  8:11 UTC (permalink / raw)
  To: linux-fsdevel, Miklos Szeredi; +Cc: LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 9:59 AM, syzbot
<syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000


Hi fuse maintainers,

We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
understand this is mostly working as intended (see the parts about
deadlocks in Documentation/filesystems/fuse.txt). The intended way to
resolve this is aborting connections via fusectl, right? The doc says
"Under the fuse control filesystem each connection has a directory
named by a unique number". The question is: if I start a process and
this process can mount fuse, how do I kill it? I mean: totally and
certainly get rid of it right away? How do I find these unique numbers
for the mounts it created? Taking into account that there is usually
no operator attached to each server, I wonder if the kernel could
somehow auto-abort fuse on kill. E.g. if all processes holding the
fuse fd are killed, it would be reasonable to abort the fuse
connection and auto-resolve the deadlocks. Is that possible?
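
A minimal sketch, assuming fusectl is mounted at
/sys/fs/fuse/connections, of the blunt approach a test harness could
take: abort every listed connection by writing to its "abort" file.
The helper name and the "abort everything" policy are illustrative,
not something proposed in this thread.

#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Abort every fuse connection listed in the fuse control filesystem. */
static void abort_all_fuse_conns(void)
{
	const char *ctl = "/sys/fs/fuse/connections";
	struct dirent *ent;
	char path[256];
	DIR *dir = opendir(ctl);

	if (!dir)
		return;
	while ((ent = readdir(dir)) != NULL) {
		if (ent->d_name[0] == '.')
			continue;
		/* Any write to a connection's "abort" file aborts it. */
		snprintf(path, sizeof(path), "%s/%s/abort", ctl, ent->d_name);
		int fd = open(path, O_WRONLY);
		if (fd >= 0) {
			write(fd, "1", 1);
			close(fd);
		}
	}
	closedir(dir);
}

int main(void)
{
	abort_all_fuse_conns();
	return 0;
}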



> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com
>
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> INFO: task syz-executor842:4559 blocked for more than 140 seconds.
>       Not tainted 4.18.0-rc6+ #160
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor842 D23528  4559   4556 0x00000004
> Call Trace:
>  context_switch kernel/sched/core.c:2853 [inline]
>  __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
>  schedule+0xfb/0x450 kernel/sched/core.c:3545
>  __rwsem_down_write_failed_common+0x95d/0x1630
> kernel/locking/rwsem-xadd.c:566
>  rwsem_down_write_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:595
>  call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
>  __down_write arch/x86/include/asm/rwsem.h:142 [inline]
>  down_write+0xaa/0x130 kernel/locking/rwsem.c:72
>  inode_lock include/linux/fs.h:715 [inline]
>  fuse_reverse_inval_entry+0xae/0x6d0 fs/fuse/dir.c:969
>  fuse_notify_inval_entry fs/fuse/dev.c:1491 [inline]
>  fuse_notify fs/fuse/dev.c:1764 [inline]
>  fuse_dev_do_write+0x2b97/0x3700 fs/fuse/dev.c:1848
>  fuse_dev_write+0x19a/0x240 fs/fuse/dev.c:1928
>  call_write_iter include/linux/fs.h:1793 [inline]
>  new_sync_write fs/read_write.c:474 [inline]
>  __vfs_write+0x6c6/0x9f0 fs/read_write.c:487
>  vfs_write+0x1f8/0x560 fs/read_write.c:549
>  ksys_write+0x101/0x260 fs/read_write.c:598
>  __do_sys_write fs/read_write.c:610 [inline]
>  __se_sys_write fs/read_write.c:607 [inline]
>  __x64_sys_write+0x73/0xb0 fs/read_write.c:607
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x445869
> Code: Bad RIP value.
> RSP: 002b:00007ffa2ef7fda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 0000000000445869
> RDX: 0000000000000029 RSI: 00000000200000c0 RDI: 0000000000000003
> RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0030656c69662f2e
> R13: 64695f70756f7267 R14: 2f30656c69662f2e R15: 0000000000000001
> INFO: task syz-executor842:4560 blocked for more than 140 seconds.
>       Not tainted 4.18.0-rc6+ #160
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor842 D26008  4560   4556 0x00000004
> Call Trace:
>  context_switch kernel/sched/core.c:2853 [inline]
>  __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
>  schedule+0xfb/0x450 kernel/sched/core.c:3545
>  request_wait_answer+0x4c8/0x920 fs/fuse/dev.c:463
>  __fuse_request_send+0x12a/0x1d0 fs/fuse/dev.c:483
>  fuse_request_send+0x62/0xa0 fs/fuse/dev.c:496
>  fuse_simple_request+0x33d/0x730 fs/fuse/dev.c:554
>  fuse_lookup_name+0x3ee/0x830 fs/fuse/dir.c:323
>  fuse_lookup+0xf9/0x4c0 fs/fuse/dir.c:360
>  __lookup_hash+0x12e/0x190 fs/namei.c:1505
>  filename_create+0x1e5/0x5b0 fs/namei.c:3646
>  user_path_create fs/namei.c:3703 [inline]
>  do_mkdirat+0xda/0x310 fs/namei.c:3842
>  __do_sys_mkdirat fs/namei.c:3861 [inline]
>  __se_sys_mkdirat fs/namei.c:3859 [inline]
>  __x64_sys_mkdirat+0x76/0xb0 fs/namei.c:3859
>  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x445869
> Code: Bad RIP value.
> RSP: 002b:00007ffa2ef5eda8 EFLAGS: 00000297 ORIG_RAX: 0000000000000102
> RAX: ffffffffffffffda RBX: 00000000006dac3c RCX: 0000000000445869
> RDX: 0000000000000000 RSI: 0000000020000500 RDI: 00000000ffffff9c
> RBP: 00000000006dac38 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000297 R12: 0030656c69662f2e
> R13: 64695f70756f7267 R14: 2f30656c69662f2e R15: 0000000000000001
>
> Showing all locks held in the system:
> 1 lock held by khungtaskd/901:
>  #0: (____ptrval____) (rcu_read_lock){....}, at:
> debug_show_all_locks+0xd0/0x428 kernel/locking/lockdep.c:4461
> 1 lock held by rsyslogd/4441:
>  #0: (____ptrval____) (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1bb/0x200
> fs/file.c:766
> 2 locks held by getty/4531:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4532:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4533:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4534:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4535:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4536:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by getty/4537:
>  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
> 2 locks held by syz-executor842/4559:
>  #0: (____ptrval____) (&fc->killsb){.+.+}, at: fuse_notify_inval_entry
> fs/fuse/dev.c:1488 [inline]
>  #0: (____ptrval____) (&fc->killsb){.+.+}, at: fuse_notify
> fs/fuse/dev.c:1764 [inline]
>  #0: (____ptrval____) (&fc->killsb){.+.+}, at:
> fuse_dev_do_write+0x2b2d/0x3700 fs/fuse/dev.c:1848
>  #1: (____ptrval____) (&type->i_mutex_dir_key#4){+.+.}, at: inode_lock
> include/linux/fs.h:715 [inline]
>  #1: (____ptrval____) (&type->i_mutex_dir_key#4){+.+.}, at:
> fuse_reverse_inval_entry+0xae/0x6d0 fs/fuse/dir.c:969
> 3 locks held by syz-executor842/4560:
>  #0: (____ptrval____) (sb_writers#9){.+.+}, at: sb_start_write
> include/linux/fs.h:1554 [inline]
>  #0: (____ptrval____) (sb_writers#9){.+.+}, at: mnt_want_write+0x3f/0xc0
> fs/namespace.c:386
>  #1: (____ptrval____) (&type->i_mutex_dir_key#3/1){+.+.}, at:
> inode_lock_nested include/linux/fs.h:750 [inline]
>  #1: (____ptrval____) (&type->i_mutex_dir_key#3/1){+.+.}, at:
> filename_create+0x1b2/0x5b0 fs/namei.c:3645
>  #2: (____ptrval____) (&fi->mutex){+.+.}, at: fuse_lock_inode+0xaf/0xe0
> fs/fuse/inode.c:363
>
> =============================================
>
> NMI backtrace for cpu 1
> CPU: 1 PID: 901 Comm: khungtaskd Not tainted 4.18.0-rc6+ #160
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:196 [inline]
>  watchdog+0x9c4/0xf80 kernel/hung_task.c:252
>  kthread+0x345/0x410 kernel/kthread.c:246
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0x6/0x10
> arch/x86/include/asm/irqflags.h:54
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/000000000000bc17b60571a60434%40google.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23  8:11 ` Dmitry Vyukov
@ 2018-07-23 12:12   ` Miklos Szeredi
  2018-07-23 12:22     ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-23 12:12 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 10:11 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>
>
> Hi fuse maintainers,
>
> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
> understand this is mostly working-as-intended (parts about deadlocks
> in Documentation/filesystems/fuse.txt). The intended way to resolve
> this is aborting connections via fusectl, right?

Yes.  An alternative is "umount -f".
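
A minimal sketch of the same thing from C, with a hypothetical mount
point; MNT_FORCE is what "umount -f" uses and, for fuse, makes the
filesystem abort outstanding requests before detaching:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* Equivalent of "umount -f /mnt/fuse"; the path is made up. */
	if (umount2("/mnt/fuse", MNT_FORCE) != 0)
		perror("umount2");
	return 0;
}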

> The doc says "Under
> the fuse control filesystem each connection has a directory named by a
> unique number". The question is: if I start a process and this process
> can mount fuse, how do I kill it? I mean: totally and certainly get
> rid of it right away? How do I find these unique numbers for the
> mounts it created?

It is the device number found in st_dev for the mount.  Besides doing
stat(2), it is also possible to find the device number by reading
/proc/$PID/mountinfo (third field).
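
A minimal sketch of that lookup, with a hypothetical mount point:
stat(2) the mount and split st_dev into the major:minor pair that also
appears as the third field of /proc/$PID/mountinfo.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

int main(void)
{
	struct stat st;

	if (stat("/mnt/fuse", &st) != 0) {
		perror("stat");
		return 1;
	}
	/* Same number as the third field of /proc/$PID/mountinfo. */
	printf("st_dev = %u:%u\n", major(st.st_dev), minor(st.st_dev));
	return 0;
}

For the common case where the fuse super block sits on an anonymous
device (major 0), this is also the number the fusectl connection
directory is named after; treat that correspondence as an assumption
here rather than something spelled out in this thread.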

> Taking into account that there is usually no
> operator attached to each server, I wonder if kernel could somehow
> auto-abort fuse on kill?

That depends on what the fuse server is sleeping on.  If it's trying
to acquire an inode lock (e.g. via unlink(2)), which is the classical
way to deadlock a fuse filesystem, then it will go into an
uninterruptible sleep.  There's no way that process can be killed
except by forcing a release of the offending lock, which can only be
done by aborting the request that is being performed while the lock is
held.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 12:12   ` Miklos Szeredi
@ 2018-07-23 12:22     ` Dmitry Vyukov
  2018-07-23 12:33       ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-23 12:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 2:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Jul 23, 2018 at 10:11 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>
>>
>> Hi fuse maintainers,
>>
>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>> understand this is mostly working-as-intended (parts about deadlocks
>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>> this is aborting connections via fusectl, right?
>
> Yes.  Alternative is with "umount -f".
>
>> The doc says "Under
>> the fuse control filesystem each connection has a directory named by a
>> unique number". The question is: if I start a process and this process
>> can mount fuse, how do I kill it? I mean: totally and certainly get
>> rid of it right away? How do I find these unique numbers for the
>> mounts it created?
>
> It is the device number found in st_dev for the mount.  Other than
> doing stat(2) it is possible to find out the device number by reading
> /proc/$PID/mountinfo  (third field).

Thanks. I will try to figure out fusectl connection numbers and see if
it's possible to integrate aborting into syzkaller.

>> Taking into account that there is usually no
>> operator attached to each server, I wonder if kernel could somehow
>> auto-abort fuse on kill?
>
> Depends on what the fuse server is sleeping on.   If it's trying to
> acquire an inode lock (e.g. unlink(2)), which is classical way to
> deadlock a fuse filesystem, then it will go into an uninterruptible
> sleep.  There's no way in which that process can be killed except to
> force a release of the offending lock, which can only be done by
> aborting the request that is being performed while holding that lock.

I understand that it is not killed today, but I am asking if we can
make it killable. It's all code that we can change, and if a human
operator can do it, it can be done purely programmatically on kill
too, right?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 12:22     ` Dmitry Vyukov
@ 2018-07-23 12:33       ` Miklos Szeredi
  2018-07-23 12:46         ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-23 12:33 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 2:22 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 2:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Jul 23, 2018 at 10:11 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following crash on:
>>>>
>>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>
>>>
>>> Hi fuse maintainers,
>>>
>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>> understand this is mostly working-as-intended (parts about deadlocks
>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>> this is aborting connections via fusectl, right?
>>
>> Yes.  Alternative is with "umount -f".
>>
>>> The doc says "Under
>>> the fuse control filesystem each connection has a directory named by a
>>> unique number". The question is: if I start a process and this process
>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>> rid of it right away? How do I find these unique numbers for the
>>> mounts it created?
>>
>> It is the device number found in st_dev for the mount.  Other than
>> doing stat(2) it is possible to find out the device number by reading
>> /proc/$PID/mountinfo  (third field).
>
> Thanks. I will try to figure out fusectl connection numbers and see if
> it's possible to integrate aborting into syzkaller.
>
>>> Taking into account that there is usually no
>>> operator attached to each server, I wonder if kernel could somehow
>>> auto-abort fuse on kill?
>>
>> Depends on what the fuse server is sleeping on.   If it's trying to
>> acquire an inode lock (e.g. unlink(2)), which is classical way to
>> deadlock a fuse filesystem, then it will go into an uninterruptible
>> sleep.  There's no way in which that process can be killed except to
>> force a release of the offending lock, which can only be done by
>> aborting the request that is being performed while holding that lock.
>
> I understand that it is not killed today, but I am asking if we can
> make it killable. It's all code that we can change, and if a human
> operator can do it, it can be done pure programmatically on kill too,
> right?

Hmm, you mean if a process is in an uninterruptible sleep trying to
acquire a lock on a fuse filesystem and is killed, then the fuse
filesystem should be aborted?

Even if we managed to implement that, it would be a large
backward-incompatibility risk.

I don't argue that it can be done, but I would definitely argue about
*whether* it should be done.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 12:33       ` Miklos Szeredi
@ 2018-07-23 12:46         ` Dmitry Vyukov
  2018-07-23 13:05           ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-23 12:46 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 2:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>>>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following crash on:
>>>>>
>>>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>>>> git tree:       upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>>
>>>>
>>>> Hi fuse maintainers,
>>>>
>>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>>> understand this is mostly working-as-intended (parts about deadlocks
>>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>>> this is aborting connections via fusectl, right?
>>>
>>> Yes.  Alternative is with "umount -f".
>>>
>>>> The doc says "Under
>>>> the fuse control filesystem each connection has a directory named by a
>>>> unique number". The question is: if I start a process and this process
>>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>>> rid of it right away? How do I find these unique numbers for the
>>>> mounts it created?
>>>
>>> It is the device number found in st_dev for the mount.  Other than
>>> doing stat(2) it is possible to find out the device number by reading
>>> /proc/$PID/mountinfo  (third field).
>>
>> Thanks. I will try to figure out fusectl connection numbers and see if
>> it's possible to integrate aborting into syzkaller.
>>
>>>> Taking into account that there is usually no
>>>> operator attached to each server, I wonder if kernel could somehow
>>>> auto-abort fuse on kill?
>>>
>>> Depends on what the fuse server is sleeping on.   If it's trying to
>>> acquire an inode lock (e.g. unlink(2)), which is classical way to
>>> deadlock a fuse filesystem, then it will go into an uninterruptible
>>> sleep.  There's no way in which that process can be killed except to
>>> force a release of the offending lock, which can only be done by
>>> aborting the request that is being performed while holding that lock.
>>
>> I understand that it is not killed today, but I am asking if we can
>> make it killable. It's all code that we can change, and if a human
>> operator can do it, it can be done pure programmatically on kill too,
>> right?
>
> Hmm, you mean if a process is in an uninterruptible sleep trying to
> acquire a lock on a fuse filesystem and is killed, then the fuse
> filesystem should be aborted?
>
> Even if we'd manage to implement that, it's a large backward
> incompatibility risk.
>
> I don't argue that it can be done, but I would definitely argue *if*
> it should be done.


I understand that we should abort only if we are sure that it's
actually deadlocked and there is no other way.
So if a fuse-user process is blocked on a fuse lock, then we probably
should do nothing. However, if the fuse server is killed, then perhaps
we could abort the connection at that point. Namely, if a process that
has a fuse fd open is killed and it is the only process that shares
this fd, then we could abort the connection on arrival of the kill
signal (rather than wait until all of its threads finish and then
start closing all fds, which is where we get the deadlock -- some of
its threads won't finish). I don't know if such a synchronous kill
hook is available, though. If several processes share the same fuse
fd, then we could close the fd in each process on SIGKILL arrival;
then, when all of these processes are killed, the fuse fd will be
closed and we can abort the connection, which will un-deadlock all of
these processes.
Does this sound at all reasonable?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 12:46         ` Dmitry Vyukov
@ 2018-07-23 13:05           ` Miklos Szeredi
  2018-07-23 13:37             ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-23 13:05 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 2:46 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 2:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>>>>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following crash on:
>>>>>>
>>>>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>>>
>>>>>
>>>>> Hi fuse maintainers,
>>>>>
>>>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>>>> understand this is mostly working-as-intended (parts about deadlocks
>>>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>>>> this is aborting connections via fusectl, right?
>>>>
>>>> Yes.  Alternative is with "umount -f".
>>>>
>>>>> The doc says "Under
>>>>> the fuse control filesystem each connection has a directory named by a
>>>>> unique number". The question is: if I start a process and this process
>>>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>>>> rid of it right away? How do I find these unique numbers for the
>>>>> mounts it created?
>>>>
>>>> It is the device number found in st_dev for the mount.  Other than
>>>> doing stat(2) it is possible to find out the device number by reading
>>>> /proc/$PID/mountinfo  (third field).
>>>
>>> Thanks. I will try to figure out fusectl connection numbers and see if
>>> it's possible to integrate aborting into syzkaller.
>>>
>>>>> Taking into account that there is usually no
>>>>> operator attached to each server, I wonder if kernel could somehow
>>>>> auto-abort fuse on kill?
>>>>
>>>> Depends on what the fuse server is sleeping on.   If it's trying to
>>>> acquire an inode lock (e.g. unlink(2)), which is classical way to
>>>> deadlock a fuse filesystem, then it will go into an uninterruptible
>>>> sleep.  There's no way in which that process can be killed except to
>>>> force a release of the offending lock, which can only be done by
>>>> aborting the request that is being performed while holding that lock.
>>>
>>> I understand that it is not killed today, but I am asking if we can
>>> make it killable. It's all code that we can change, and if a human
>>> operator can do it, it can be done pure programmatically on kill too,
>>> right?
>>
>> Hmm, you mean if a process is in an uninterruptible sleep trying to
>> acquire a lock on a fuse filesystem and is killed, then the fuse
>> filesystem should be aborted?
>>
>> Even if we'd manage to implement that, it's a large backward
>> incompatibility risk.
>>
>> I don't argue that it can be done, but I would definitely argue *if*
>> it should be done.
>
>
> I understand that we should abort only if we are sure that it's
> actually deadlocked and there is no other way.
> So if fuse-user process is blocked on fuse lock, then we probably
> should do nothing. However, if the fuse-server is killed, then perhaps
> we could abort the connection at that point. Namely, if a process that
> has a fuse fd open is killed and it is the only process that shared
> this fd, then we could abort the connection on arrival of the kill
> signal (rather than wait untill all it's threads finish and then start
> closing all fd's, this is where we get the deadlock -- some of its
> threads won't finish). I don't know if such synchronous kill hook is
> available, though. If several processes shared the same fuse fd, then
> we could close the fd in each process on SIGKILL arrival, then when
> all of these processes are killed, fuse fd will be closed and we can
> abort the connection, which will un-deadlock all of these processes.
> Does this look any reasonable?

The biggest conceptual problem: your definition of a fuse server is
weak. Take the following example: process A is holding the fuse
device fd and is forwarding requests and replies to/from process B via
a pipe. So basically A is just a proxy that does nothing interesting;
the "real" server is B.  But according to your definition B is not a
server, only A is.

And this is just a simple example; parts of the server might be on
different machines, etc.  It's impossible to automatically detect
whether a process is acting as a fuse server or not.

We could let the fuse server itself notify the kernel that it's a fuse
server.  That might help in cases where the deadlock is accidental,
but obviously not when it is caused by a malicious agent.  I'm not
sure it's worth the effort.  Also, I have no idea how the respective
maintainers would take the idea of "kill hooks"...  It would probably
be a lot of work for little gain.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 13:05           ` Miklos Szeredi
@ 2018-07-23 13:37             ` Dmitry Vyukov
  2018-07-23 15:09               ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-23 13:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following crash on:
>>>>>>>
>>>>>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>>>>>> git tree:       upstream
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>>>>
>>>>>>
>>>>>> Hi fuse maintainers,
>>>>>>
>>>>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>>>>> understand this is mostly working-as-intended (parts about deadlocks
>>>>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>>>>> this is aborting connections via fusectl, right?
>>>>>
>>>>> Yes.  Alternative is with "umount -f".
>>>>>
>>>>>> The doc says "Under
>>>>>> the fuse control filesystem each connection has a directory named by a
>>>>>> unique number". The question is: if I start a process and this process
>>>>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>>>>> rid of it right away? How do I find these unique numbers for the
>>>>>> mounts it created?
>>>>>
>>>>> It is the device number found in st_dev for the mount.  Other than
>>>>> doing stat(2) it is possible to find out the device number by reading
>>>>> /proc/$PID/mountinfo  (third field).
>>>>
>>>> Thanks. I will try to figure out fusectl connection numbers and see if
>>>> it's possible to integrate aborting into syzkaller.
>>>>
>>>>>> Taking into account that there is usually no
>>>>>> operator attached to each server, I wonder if kernel could somehow
>>>>>> auto-abort fuse on kill?
>>>>>
>>>>> Depends on what the fuse server is sleeping on.   If it's trying to
>>>>> acquire an inode lock (e.g. unlink(2)), which is classical way to
>>>>> deadlock a fuse filesystem, then it will go into an uninterruptible
>>>>> sleep.  There's no way in which that process can be killed except to
>>>>> force a release of the offending lock, which can only be done by
>>>>> aborting the request that is being performed while holding that lock.
>>>>
>>>> I understand that it is not killed today, but I am asking if we can
>>>> make it killable. It's all code that we can change, and if a human
>>>> operator can do it, it can be done pure programmatically on kill too,
>>>> right?
>>>
>>> Hmm, you mean if a process is in an uninterruptible sleep trying to
>>> acquire a lock on a fuse filesystem and is killed, then the fuse
>>> filesystem should be aborted?
>>>
>>> Even if we'd manage to implement that, it's a large backward
>>> incompatibility risk.
>>>
>>> I don't argue that it can be done, but I would definitely argue *if*
>>> it should be done.
>>
>>
>> I understand that we should abort only if we are sure that it's
>> actually deadlocked and there is no other way.
>> So if fuse-user process is blocked on fuse lock, then we probably
>> should do nothing. However, if the fuse-server is killed, then perhaps
>> we could abort the connection at that point. Namely, if a process that
>> has a fuse fd open is killed and it is the only process that shared
>> this fd, then we could abort the connection on arrival of the kill
>> signal (rather than wait untill all it's threads finish and then start
>> closing all fd's, this is where we get the deadlock -- some of its
>> threads won't finish). I don't know if such synchronous kill hook is
>> available, though. If several processes shared the same fuse fd, then
>> we could close the fd in each process on SIGKILL arrival, then when
>> all of these processes are killed, fuse fd will be closed and we can
>> abort the connection, which will un-deadlock all of these processes.
>> Does this look any reasonable?
>
> Biggest conceptual problem: your definition of fuse-server is weak.
> Take the following example: process A is holding the fuse device fd
> and is forwarding requests and replies to/from process B via a pipe.
> So basically A is just a proxy that does nothing interesting, the
> "real" server is B.  But according to your definition B is not a
> server, only A is.

I proposed to abort the fuse connection when all fuse device fds are
"killed" (i.e. all processes that have the fd open are killed). So if
_only_ process B is killed, then, yes, it will still hang. However, if
A is killed, or both A and B (say, a process group, everything inside
a pid namespace, etc.), then the deadlock will be auto-resolved
without human intervention.

> And this is just a simple example, parts of the server might be on
> different machines, etc...  It's impossible to automatically detect if
> a process is acting as a fuse server or not.

It does not seem that we need a precise definition. If no one can ever
write anything into the fd, we can safely abort the connection (?). If
we don't, then either the process exits normally and the connection is
doomed anyway, so there is no difference in behavior, or we get a
deadlock.

> We could let the fuse server itself notify the kernel that it's a fuse
> server.  That might help in the cases where the deadlock is
> accidental, but obviously not in the case when done by a malicious
> agent.  I'm not sure it's worth the effort.   Also I have no idea how
> the respective maintainers would take the idea of "kill hooks"...   It
> would probably be a lot of work for little gain.

What looks wrong to me here is that fuse is the only (?) subsystem in
the kernel that stops SIGKILL from working and requires a complex
custom dance performed by a human operator (which is not necessary
there at all). Say, if a process has opened a socket, whatever, I
don't need to locate and abort something in a socketctl fs, just
SIGKILL. If a process has opened a file, I don't need to locate the fd
in /proc and abort it, just SIGKILL. If a process has created an ipc
object, I don't need to do any special dance, just SIGKILL. fuse is
somehow very special; if we get more such cases, it definitely won't
scale.
I understand that there can be implementation difficulties, but
fundamentally that's how things should work -- choose target
processes, kill, done, right?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 13:37             ` Dmitry Vyukov
@ 2018-07-23 15:09               ` Miklos Szeredi
  2018-07-23 15:19                 ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-23 15:09 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

>> Biggest conceptual problem: your definition of fuse-server is weak.
>> Take the following example: process A is holding the fuse device fd
>> and is forwarding requests and replies to/from process B via a pipe.
>> So basically A is just a proxy that does nothing interesting, the
>> "real" server is B.  But according to your definition B is not a
>> server, only A is.
>
> I proposed to abort fuse conn when all fuse device fd's are "killed"
> (all processes having the fd opened are killed). So if _only_ process
> B is killed, then, yes, it will still hang. However if A is killed or
> both A and B (say, process group, everything inside of pid namespace,
> etc) then the deadlock will be autoresolved without human
> intervention.

Okay, so you're saying:

1) when a process gets SIGKILL and is in uninterruptible sleep, mark
the process as doomed
2) for a particular fuse instance, find the set of fuse device fd
references that are in non-doomed tasks; if there are none, then abort
the fuse instance

Right?

The above is not an implementation proposal, just an attempt to get us
on the same page regarding the concept.

>> And this is just a simple example, parts of the server might be on
>> different machines, etc...  It's impossible to automatically detect if
>> a process is acting as a fuse server or not.
>
> It does not seem we need the precise definition. If no one ever can
> write anything into the fd, we can safely abort the connection (?).

Seems to me so.

> If
> we don't, we can either get that the process exits normally and the
> connection is doomed anyway, so no difference in behavior, or we can
> get a deadlock.
>
>> We could let the fuse server itself notify the kernel that it's a fuse
>> server.  That might help in the cases where the deadlock is
>> accidental, but obviously not in the case when done by a malicious
>> agent.  I'm not sure it's worth the effort.   Also I have no idea how
>> the respective maintainers would take the idea of "kill hooks"...   It
>> would probably be a lot of work for little gain.
>
> What looks wrong to me here is that fuse is only (?) subsystem in
> kernel that stops SIGKILL from working and requires complex custom
> dance performed by a human operator (which is not necessary there at
> all). Say, if a process has opened a socket, whatever, I don't need to
> locate and abort something in socketctl fs, just SIGKILL. If a
> processes has opened a file, I don't need to locate the fd in /proc
> and abort it, just SIGKILL. If a process has created an ipc object, I
> don't need to do any special dance, just SIGKILL. fuse is somehow very
> special, if we have more such cases, it definitely won't scale.
> I understand that there can be implementation difficulties, but
> fundamentally that's how things should work -- choose target
> processes, kill, done, right?

Yes, it would be nice.

But I'm not sure it will fly due to implementation difficulties.  It's
definitely not a high-prio feature for me at the moment, but I'll
happily accept patches.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 15:09               ` Miklos Szeredi
@ 2018-07-23 15:19                 ` Dmitry Vyukov
  2018-07-24 15:17                   ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-23 15:19 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 5:09 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>>> Biggest conceptual problem: your definition of fuse-server is weak.
>>> Take the following example: process A is holding the fuse device fd
>>> and is forwarding requests and replies to/from process B via a pipe.
>>> So basically A is just a proxy that does nothing interesting, the
>>> "real" server is B.  But according to your definition B is not a
>>> server, only A is.
>>
>> I proposed to abort fuse conn when all fuse device fd's are "killed"
>> (all processes having the fd opened are killed). So if _only_ process
>> B is killed, then, yes, it will still hang. However if A is killed or
>> both A and B (say, process group, everything inside of pid namespace,
>> etc) then the deadlock will be autoresolved without human
>> intervention.
>
> Okay, so you're saying:
>
> 1) when process gets SIGKILL and is uninterruptible sleep mark process as doomed
> 2) for a particular fuse instance find set of fuse device fd
> references that are in non-doomed tasks; if there are none then abort
> fuse instance
>
> Right?


Yes, something like this.
Perhaps checking for "uninterruptible sleep" is excessive. If it has
SIGKILL pending, it's pretty much doomed already. This info should
already be available for tasks.
I'm not saying that it's better, but what I described was the other
way around: when a task is killed it drops its references to all
opened fuse fds; when the last fd is dropped, the connection can be
aborted.


> The above is not an implementation proposal, just to get us on the
> same page regarding the concept.
>
>>> And this is just a simple example, parts of the server might be on
>>> different machines, etc...  It's impossible to automatically detect if
>>> a process is acting as a fuse server or not.
>>
>> It does not seem we need the precise definition. If no one ever can
>> write anything into the fd, we can safely abort the connection (?).
>
> Seems to me so.
>
>> If
>> we don't, we can either get that the process exits normally and the
>> connection is doomed anyway, so no difference in behavior, or we can
>> get a deadlock.
>>
>>> We could let the fuse server itself notify the kernel that it's a fuse
>>> server.  That might help in the cases where the deadlock is
>>> accidental, but obviously not in the case when done by a malicious
>>> agent.  I'm not sure it's worth the effort.   Also I have no idea how
>>> the respective maintainers would take the idea of "kill hooks"...   It
>>> would probably be a lot of work for little gain.
>>
>> What looks wrong to me here is that fuse is only (?) subsystem in
>> kernel that stops SIGKILL from working and requires complex custom
>> dance performed by a human operator (which is not necessary there at
>> all). Say, if a process has opened a socket, whatever, I don't need to
>> locate and abort something in socketctl fs, just SIGKILL. If a
>> processes has opened a file, I don't need to locate the fd in /proc
>> and abort it, just SIGKILL. If a process has created an ipc object, I
>> don't need to do any special dance, just SIGKILL. fuse is somehow very
>> special, if we have more such cases, it definitely won't scale.
>> I understand that there can be implementation difficulties, but
>> fundamentally that's how things should work -- choose target
>> processes, kill, done, right?
>
> Yes, it would be nice.
>
> But I'm not sure it will fly due to implementation difficulties.  It's
> definitely not  a high prio feature currently for me, but I'll happily
> accept patches.

I see. Thanks for bearing with me.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23 15:19                 ` Dmitry Vyukov
@ 2018-07-24 15:17                   ` Miklos Szeredi
  2018-07-25  9:12                     ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-24 15:17 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Mon, Jul 23, 2018 at 5:19 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 5:09 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>>>> Biggest conceptual problem: your definition of fuse-server is weak.
>>>> Take the following example: process A is holding the fuse device fd
>>>> and is forwarding requests and replies to/from process B via a pipe.
>>>> So basically A is just a proxy that does nothing interesting, the
>>>> "real" server is B.  But according to your definition B is not a
>>>> server, only A is.
>>>
>>> I proposed to abort fuse conn when all fuse device fd's are "killed"
>>> (all processes having the fd opened are killed). So if _only_ process
>>> B is killed, then, yes, it will still hang. However if A is killed or
>>> both A and B (say, process group, everything inside of pid namespace,
>>> etc) then the deadlock will be autoresolved without human
>>> intervention.
>>
>> Okay, so you're saying:
>>
>> 1) when process gets SIGKILL and is uninterruptible sleep mark process as doomed
>> 2) for a particular fuse instance find set of fuse device fd
>> references that are in non-doomed tasks; if there are none then abort
>> fuse instance
>>
>> Right?
>
>
> Yes, something like this.
> Perhaps checking for "uninterruptible sleep" is excessive. If it has
> SIGKILL pending it's pretty much doomed already. This info should be
> already available for tasks.
> Not saying that it's better, but what I described was the other way
> around: when a task killed it drops a reference to all opened fuse
> fds, when the last fd is dropped, the connection can be aborted.

struct task_struct {
[...]
    struct files_struct        *files;
[...]
};

struct files_struct {
[...]
    struct fdtable __rcu *fdt;
[...]
};

struct fdtable {
[...]
    struct file __rcu **fd;      /* current fd array */
[...]
};

So there we have an array of pointers to struct files.  Suppose we'd
magically be able to find files that point to fuse devices upon
receiving SIGKILL, what would we do with them?  We can't close them:
other tasks might still be pointing to the same files_struct.

We could do a global search for non-doomed tasks referencing the same
fuse device, but I have no clue how we'd go about doing that without
racing with forks, fd sending, etc...
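
A minimal sketch of what such a per-task scan could look like, using
iterate_fd() from fs/file.c and assuming (for illustration only, untested)
that a fuse device file is recognized by comparing f_op against
fuse_dev_operations:

static int is_fuse_dev(const void *p, struct file *file, unsigned fd)
{
    /* assumption: a /dev/fuse file is identified by its f_op */
    return file->f_op == &fuse_dev_operations;
}

static bool task_holds_fuse_dev(struct task_struct *tsk)
{
    bool held = false;

    task_lock(tsk);        /* stabilizes tsk->files */
    if (tsk->files)
        held = iterate_fd(tsk->files, 0, is_fuse_dev, NULL) != 0;
    task_unlock(tsk);

    return held;
}

Even with that, deciding that *no* non-doomed task holds such a file would
need a walk over all tasks, which is exactly where the races above come in.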

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-24 15:17                   ` Miklos Szeredi
@ 2018-07-25  9:12                     ` Dmitry Vyukov
  2018-07-26  8:44                       ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2018-07-25  9:12 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Tue, Jul 24, 2018 at 5:17 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> Biggest conceptual problem: your definition of fuse-server is weak.
>>>>> Take the following example: process A is holding the fuse device fd
>>>>> and is forwarding requests and replies to/from process B via a pipe.
>>>>> So basically A is just a proxy that does nothing interesting, the
>>>>> "real" server is B.  But according to your definition B is not a
>>>>> server, only A is.
>>>>
>>>> I proposed to abort fuse conn when all fuse device fd's are "killed"
>>>> (all processes having the fd opened are killed). So if _only_ process
>>>> B is killed, then, yes, it will still hang. However if A is killed or
>>>> both A and B (say, process group, everything inside of pid namespace,
>>>> etc) then the deadlock will be autoresolved without human
>>>> intervention.
>>>
>>> Okay, so you're saying:
>>>
>>> 1) when a process gets SIGKILL and is in uninterruptible sleep, mark the process as doomed
>>> 2) for a particular fuse instance, find the set of fuse device fd
>>> references that are in non-doomed tasks; if there are none, then abort the
>>> fuse instance
>>>
>>> Right?
>>
>>
>> Yes, something like this.
>> Perhaps checking for "uninterruptible sleep" is excessive. If it has
>> SIGKILL pending, it's pretty much doomed already. This info should
>> already be available for tasks.
>> Not saying that it's better, but what I described was the other way
>> around: when a task is killed it drops its references to all opened fuse
>> fds, and when the last fd is dropped, the connection can be aborted.
>
> struct task_struct {
> [...]
>     struct files_struct        *files;
> [...]
> };
>
> struct files_struct {
> [...]
>     struct fdtable __rcu *fdt;
> [...]
> };
>
> struct fdtable {
> [...]
>     struct file __rcu **fd;      /* current fd array */
> [...]
> };
>
> So there we have an array of pointers to struct files.  Suppose we'd
> magically be able to find files that point to fuse devices upon
> receiving SIGKILL, what would we do with them?  We can't close them:
> other tasks might still be pointing to the same files_struct.
>
> We could do a global search for non-doomed tasks referencing the same
> fuse device, but I have no clue how we'd go about doing that without
> racing with forks, fd sending, etc...


Good questions for which I don't have answers.

Maybe more waits in fuse need to be interruptible? E.g. request_wait_answer?
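
A minimal sketch of that direction, assuming the 4.18-era structure of
request_wait_answer() in fs/fuse/dev.c, where the last-resort wait is a
plain (unkillable) wait_event(); the helper name is hypothetical:

static int request_wait_answer_killable(struct fuse_req *req)
{
    /* today the last resort is wait_event(), i.e. not even SIGKILL
     * gets the caller out of here; a killable wait instead returns
     * -ERESTARTSYS on a fatal signal */
    return wait_event_killable(req->waitq,
                               test_bit(FR_FINISHED, &req->flags));
}

A nonzero return would mean the caller gives up on the server's reply,
which is where the serialization questions below come in.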

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-25  9:12                     ` Dmitry Vyukov
@ 2018-07-26  8:44                       ` Miklos Szeredi
  2018-07-26  9:12                         ` Miklos Szeredi
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-26  8:44 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Wed, Jul 25, 2018 at 11:12 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Jul 24, 2018 at 5:17 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>> Biggest conceptual problem: your definition of fuse-server is weak.
>>>>>> Take the following example: process A is holding the fuse device fd
>>>>>> and is forwarding requests and replies to/from process B via a pipe.
>>>>>> So basically A is just a proxy that does nothing interesting, the
>>>>>> "real" server is B.  But according to your definition B is not a
>>>>>> server, only A is.
>>>>>
>>>>> I proposed to abort fuse conn when all fuse device fd's are "killed"
>>>>> (all processes having the fd opened are killed). So if _only_ process
>>>>> B is killed, then, yes, it will still hang. However if A is killed or
>>>>> both A and B (say, process group, everything inside of pid namespace,
>>>>> etc) then the deadlock will be autoresolved without human
>>>>> intervention.
>>>>
>>>> Okay, so you're saying:
>>>>
>>>> 1) when a process gets SIGKILL and is in uninterruptible sleep, mark the process as doomed
>>>> 2) for a particular fuse instance, find the set of fuse device fd
>>>> references that are in non-doomed tasks; if there are none, then abort the
>>>> fuse instance
>>>>
>>>> Right?
>>>
>>>
>>> Yes, something like this.
>>> Perhaps checking for "uninterruptible sleep" is excessive. If it has
>>> SIGKILL pending, it's pretty much doomed already. This info should
>>> already be available for tasks.
>>> Not saying that it's better, but what I described was the other way
>>> around: when a task is killed it drops its references to all opened fuse
>>> fds, and when the last fd is dropped, the connection can be aborted.
>>
>> struct task_struct {
>> [...]
>>     struct files_struct        *files;
>> [...]
>> };
>>
>> struct files_struct {
>> [...]
>>     struct fdtable __rcu *fdt;
>> [...]
>> };
>>
>> struct fdtable {
>> [...]
>>     struct file __rcu **fd;      /* current fd array */
>> [...]
>> };
>>
>> So there we have an array of pointers to struct files.  Suppose we'd
>> magically be able to find files that point to fuse devices upon
>> receiving SIGKILL, what would we do with them?  We can't close them:
>> other tasks might still be pointing to the same files_struct.
>>
>> We could do a global search for non-doomed tasks referencing the same
>> fuse device, but I have no clue how we'd go about doing that without
>> racing with forks, fd sending, etc...
>
>
> Good questions for which I don't have answers.
>
> Maybe more waits in fuse need to be interruptible? E.g. request_wait_answer?

That's an interesting aspect.  Making request_wait_answer always
killable would help with the issue you raise (killing the set of processes
taking part in the deadlock should resolve the deadlock), but it breaks
another aspect of the interface.

Namely that userspace filesystems expect some serialization from
kernel when performing operations.  If we allow killing of a process
in the middle of an fs operation, then that serialization is no longer
there, which can break the server.

One solution to that is to duplicate all locking in the server
(libfuse normally), but it would not solve the issue for legacy
libfuse or legacy non-libfuse servers.  It would also be difficult to
test.  Also it doesn't solve the problem of killing the server, as
that alone doesn't resolve the deadlock.
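
A minimal sketch of what duplicating that locking in the server could
look like, as generic pthreads code; srv_dir and srv_unlink_locked are
hypothetical names, and nothing here is libfuse API:

#include <pthread.h>

/* hypothetical server-side directory object */
struct srv_dir {
    pthread_mutex_t lock;    /* the server's own copy of the per-directory
                              * serialization the kernel used to provide */
    /* ... entries ... */
};

static int srv_unlink_locked(struct srv_dir *parent, const char *name);

static int srv_unlink(struct srv_dir *parent, const char *name)
{
    int err;

    pthread_mutex_lock(&parent->lock);
    err = srv_unlink_locked(parent, name);    /* hypothetical helper */
    pthread_mutex_unlock(&parent->lock);

    return err;
}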

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-26  8:44                       ` Miklos Szeredi
@ 2018-07-26  9:12                         ` Miklos Szeredi
  2018-11-02 19:31                           ` Dmitry Vyukov
  0 siblings, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2018-07-26  9:12 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Thu, Jul 26, 2018 at 10:44 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jul 25, 2018 at 11:12 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Tue, Jul 24, 2018 at 5:17 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

>> Maybe more waits in fuse need to be interruptible? E.g. request_wait_answer?
>
> That's an interesting aspect.  Making request_wait_answer always
> killable would help with the issue you raise (killing the set of processes
> taking part in the deadlock should resolve the deadlock), but it breaks
> another aspect of the interface.
>
> Namely that userspace filesystems expect some serialization from
> kernel when performing operations.  If we allow killing of a process
> in the middle of an fs operation, then that serialization is no longer
> there, which can break the server.
>
> One solution to that is to duplicate all locking in the server
> (libfuse normally), but it would not solve the issue for legacy
> libfuse or legacy non-libfuse servers.  It would also be difficult to
> test.  Also it doesn't solve the problem of killing the server, as
> that alone doesn't resolve the deadlock.

Umm, we can actually do better.  Duplicate all VFS locking in the
fuse kernel implementation: when killing a task that has an
outstanding request, return immediately (which releases the VFS-level
lock and hence resolves the deadlock) but hold onto our own lock
until the reply from the userspace server comes back.

Need to think about the details; this might not be easy to do
properly.  Notably, memory management locks (page->lock, mmap_sem,
etc.) are notoriously tricky.
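
A minimal sketch of that idea, with hypothetical names (nothing here is
existing fuse code): a fuse-internal per-inode lock implemented as a
semaphore, so that the reply or abort path, running in a different task,
can release it after the killed caller has already returned and dropped
the VFS inode lock:

#include <linux/semaphore.h>

/* fuse's shadow of the VFS inode lock; a semaphore rather than a mutex
 * because the task that releases it (the reply/abort path) need not be
 * the task that acquired it */
struct fuse_shadow_lock {
    struct semaphore sem;
};

/* caller side, before the request is sent to the server */
static int fuse_shadow_lock_acquire(struct fuse_shadow_lock *sl)
{
    return down_killable(&sl->sem);    /* 0, or -EINTR on SIGKILL */
}

/* reply (or connection abort) side: only here is the shadow lock
 * dropped, even if the original caller was killed long ago */
static void fuse_shadow_lock_release(struct fuse_shadow_lock *sl)
{
    up(&sl->sem);
}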

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-26  9:12                         ` Miklos Szeredi
@ 2018-11-02 19:31                           ` Dmitry Vyukov
  0 siblings, 0 replies; 16+ messages in thread
From: Dmitry Vyukov @ 2018-11-02 19:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot

On Thu, Jul 26, 2018 at 11:12 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Jul 26, 2018 at 10:44 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jul 25, 2018 at 11:12 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Tue, Jul 24, 2018 at 5:17 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>>> Maybe more waits in fuse need to be interruptible? E.g. request_wait_answer?
>>
>> That's an interesting aspect.  Making request_wait_answer always
>> killable would help with the issue you raise (killing the set of processes
>> taking part in the deadlock should resolve the deadlock), but it breaks
>> another aspect of the interface.
>>
>> Namely that userspace filesystems expect some serialization from
>> kernel when performing operations.  If we allow killing of a process
>> in the middle of an fs operation, then that serialization is no longer
>> there, which can break the server.
>>
>> One solution to that is to duplicate all locking in the server
>> (libfuse normally), but it would not solve the issue for legacy
>> libfuse or legacy non-libfuse servers.  It would also be difficult to
>> test.  Also it doesn't solve the problem of killing the server, as
>> that alone doesn't resolve the deadlock.
>
> Umm, we can actually do better.  Duplicate all VFS locking in the
> fuse kernel implementation: when killing a task that has an
> outstanding request, return immediately (which releases the VFS-level
> lock and hence resolves the deadlock) but hold onto our own lock
> until the reply from the userspace server comes back.
>
> Need to think about the details; this might not be easy to do
> properly.  Notably, memory management locks (page->lock, mmap_sem,
> etc.) are notoriously tricky.

Hi Miklos,

Any updates on this?

syzbot recently found this hang in fuse, which looks real (totally unkillable):
https://syzkaller.appspot.com/bug?id=0d08132d6dac82ae63b7b8d4a9d027d30b46167d

but this one still happens, and it's hard to tell if it's real or not:
https://syzkaller.appspot.com/bug?id=76f8203fef423375d230f14b8f5b45617ab945e2

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: INFO: task hung in fuse_reverse_inval_entry
  2018-07-23  7:59 INFO: task hung in fuse_reverse_inval_entry syzbot
  2018-07-23  8:11 ` Dmitry Vyukov
@ 2019-11-07 13:42 ` syzbot
  1 sibling, 0 replies; 16+ messages in thread
From: syzbot @ 2019-11-07 13:42 UTC (permalink / raw)
  To: dvyukov, ktkhai, linux-fsdevel, linux-kernel, miklos, mszeredi,
	syzkaller-bugs

syzbot suspects this bug was fixed by commit:

commit c59fd85e4fd07fdf0ab523a5e9734f5338d6aa19
Author: Kirill Tkhai <ktkhai@virtuozzo.com>
Date:   Tue Sep 11 10:11:56 2018 +0000

     fuse: change interrupt requests allocation algorithm

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15518db2600000
start commit:   d72e90f3 Linux 4.18-rc6
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000

If the result looks correct, please mark the bug fixed by replying with:

#syz fix: fuse: change interrupt requests allocation algorithm

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-11-07 13:43 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-23  7:59 INFO: task hung in fuse_reverse_inval_entry syzbot
2018-07-23  8:11 ` Dmitry Vyukov
2018-07-23 12:12   ` Miklos Szeredi
2018-07-23 12:22     ` Dmitry Vyukov
2018-07-23 12:33       ` Miklos Szeredi
2018-07-23 12:46         ` Dmitry Vyukov
2018-07-23 13:05           ` Miklos Szeredi
2018-07-23 13:37             ` Dmitry Vyukov
2018-07-23 15:09               ` Miklos Szeredi
2018-07-23 15:19                 ` Dmitry Vyukov
2018-07-24 15:17                   ` Miklos Szeredi
2018-07-25  9:12                     ` Dmitry Vyukov
2018-07-26  8:44                       ` Miklos Szeredi
2018-07-26  9:12                         ` Miklos Szeredi
2018-11-02 19:31                           ` Dmitry Vyukov
2019-11-07 13:42 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).