All of lore.kernel.org
 help / color / mirror / Atom feed
* INFO: task hung in p9_fd_close
@ 2019-08-30 19:28 syzbot
  2019-09-21 16:19 ` syzbot
  0 siblings, 1 reply; 12+ messages in thread
From: syzbot @ 2019-08-30 19:28 UTC (permalink / raw)
  To: asmadeus, davem, ericvh, linux-kernel, lucho, netdev,
	syzkaller-bugs, v9fs-developer

Hello,

syzbot found the following crash on:

HEAD commit:    6525771f Merge tag 'arc-5.3-rc7' of git://git.kernel.org/p..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1118a71e600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=58485246ad14eafe
dashboard link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33
compiler:       clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1125ee12600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com

INFO: task syz-executor.1:13699 blocked for more than 143 seconds.
       Not tainted 5.3.0-rc6+ #94
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.1  D28888 13699   9148 0x00004004
Call Trace:
  context_switch kernel/sched/core.c:3254 [inline]
  __schedule+0x877/0xc50 kernel/sched/core.c:3880
  schedule+0x131/0x1e0 kernel/sched/core.c:3947
  schedule_timeout+0x46/0x240 kernel/time/timer.c:1783
  do_wait_for_common+0x2e7/0x4d0 kernel/sched/completion.c:83
  __wait_for_common kernel/sched/completion.c:104 [inline]
  wait_for_common kernel/sched/completion.c:115 [inline]
  wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
  __flush_work+0xd4/0x150 kernel/workqueue.c:3040
  __cancel_work_timer+0x420/0x570 kernel/workqueue.c:3127
  cancel_work_sync+0x17/0x20 kernel/workqueue.c:3163
  p9_conn_destroy net/9p/trans_fd.c:868 [inline]
  p9_fd_close+0x297/0x3c0 net/9p/trans_fd.c:898
  p9_client_create+0x974/0xee0 net/9p/client.c:1068
  v9fs_session_init+0x192/0x18e0 fs/9p/v9fs.c:406
  v9fs_mount+0x82/0x810 fs/9p/vfs_super.c:120
  legacy_get_tree+0xf9/0x1a0 fs/fs_context.c:661
  vfs_get_tree+0x8f/0x380 fs/super.c:1413
  do_new_mount fs/namespace.c:2791 [inline]
  do_mount+0x169d/0x2490 fs/namespace.c:3111
  ksys_mount+0xcc/0x100 fs/namespace.c:3320
  __do_sys_mount fs/namespace.c:3334 [inline]
  __se_sys_mount fs/namespace.c:3331 [inline]
  __x64_sys_mount+0xbf/0xd0 fs/namespace.c:3331
  do_syscall_64+0xfe/0x140 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459879
Code: 8b 44 24 18 48 8b 4c 24 30 48 83 c1 08 48 89 0c 24 48 89 44 24 08 48  
c7 44 24 10 10 00 00 00 e8 0d da fa ff 48 8b 44 24 18 48 <89> 44 24 40 48  
8b 6c 24 20 48 83 c4 28 c3 e8 14 b9 ff ff eb 82 cc
RSP: 002b:00007f6b4dda7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000459879
RDX: 0000000020000140 RSI: 0000000020000000 RDI: 0000000000000000
RBP: 000000000075c118 R08: 0000000020000480 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6b4dda86d4
R13: 00000000004c5e2f R14: 00000000004da930 R15: 00000000ffffffff
INFO: lockdep is turned off.
NMI backtrace for cpu 0
CPU: 0 PID: 1057 Comm: khungtaskd Not tainted 5.3.0-rc6+ #94
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113
  nmi_cpu_backtrace+0xaf/0x1a0 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x174/0x290 lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x10/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace+0x17/0x20 include/linux/nmi.h:146
  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
  watchdog+0xbb9/0xbd0 kernel/hung_task.c:289
  kthread+0x332/0x350 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x10  
arch/x86/include/asm/irqflags.h:60


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: INFO: task hung in p9_fd_close
  2019-08-30 19:28 INFO: task hung in p9_fd_close syzbot
@ 2019-09-21 16:19 ` syzbot
  2022-08-26 15:27   ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Tetsuo Handa
  0 siblings, 1 reply; 12+ messages in thread
From: syzbot @ 2019-09-21 16:19 UTC (permalink / raw)
  To: asmadeus, davem, ericvh, linux-kernel, lucho, netdev,
	syzkaller-bugs, v9fs-developer

syzbot has found a reproducer for the following crash on:

HEAD commit:    f97c81dc Merge tag 'armsoc-late' of git://git.kernel.org/p..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=171a1555600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61f948934213449f
dashboard link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33
compiler:       clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1641d429600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11eff9ad600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com

INFO: task syz-executor635:8085 blocked for more than 143 seconds.
       Not tainted 5.3.0+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor635 D27848  8085   8070 0x00004004
Call Trace:
  context_switch kernel/sched/core.c:3384 [inline]
  __schedule+0x82e/0xc50 kernel/sched/core.c:4056
  schedule+0x131/0x1e0 kernel/sched/core.c:4123
  schedule_timeout+0x46/0x240 kernel/time/timer.c:1869
  do_wait_for_common+0x2e7/0x4d0 kernel/sched/completion.c:83
  __wait_for_common kernel/sched/completion.c:104 [inline]
  wait_for_common kernel/sched/completion.c:115 [inline]
  wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
  __flush_work+0xd4/0x150 kernel/workqueue.c:3040
  __cancel_work_timer+0x420/0x570 kernel/workqueue.c:3127
  cancel_work_sync+0x17/0x20 kernel/workqueue.c:3163
  p9_conn_destroy net/9p/trans_fd.c:868 [inline]
  p9_fd_close+0x297/0x3c0 net/9p/trans_fd.c:898
  p9_client_create+0x974/0xee0 net/9p/client.c:1068
  v9fs_session_init+0x192/0x18e0 fs/9p/v9fs.c:406
  v9fs_mount+0x82/0x860 fs/9p/vfs_super.c:124
  legacy_get_tree+0xf9/0x1a0 fs/fs_context.c:659
  vfs_get_tree+0x8f/0x380 fs/super.c:1542
  do_new_mount fs/namespace.c:2825 [inline]
  do_mount+0x16c7/0x2510 fs/namespace.c:3145
  ksys_mount+0xcc/0x100 fs/namespace.c:3354
  __do_sys_mount fs/namespace.c:3368 [inline]
  __se_sys_mount fs/namespace.c:3365 [inline]
  __x64_sys_mount+0xbf/0xd0 fs/namespace.c:3365
  do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x447909
Code: e8 5c 14 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 9b 0c fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007eff1aea0db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00000000006ddc78 RCX: 0000000000447909
RDX: 00000000200005c0 RSI: 0000000020000540 RDI: 0000000000000000
RBP: 00000000006ddc70 R08: 0000000020000680 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006ddc7c
R13: 00007ffc4e6377ef R14: 00007eff1aea19c0 R15: 000000000000002d

Showing all locks held in the system:
2 locks held by kworker/1:1/22:
  #0: ffff8880aa4278e8 ((wq_completion)events){+.+.}, at: spin_unlock_irq  
include/linux/spinlock.h:388 [inline]
  #0: ffff8880aa4278e8 ((wq_completion)events){+.+.}, at:  
process_one_work+0x75d/0x10e0 kernel/workqueue.c:2242
  #1: ffff8880a9a3fd78 ((work_completion)(&m->wq)){+.+.}, at:  
process_one_work+0x79f/0x10e0 kernel/workqueue.c:2244
1 lock held by khungtaskd/1053:
  #0: ffffffff888d3900 (rcu_read_lock){....}, at: rcu_lock_acquire+0x4/0x30  
include/linux/rcupdate.h:207
1 lock held by rsyslogd/7959:
  #0: ffff8880a99f1e20 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x243/0x2e0  
fs/file.c:801
2 locks held by getty/8049:
  #0: ffff8880a602d450 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f212e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8050:
  #0: ffff8880a902ac50 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f092e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8051:
  #0: ffff88809c0f1750 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f312e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8052:
  #0: ffff88809e344b90 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f152e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8053:
  #0: ffff88809e344310 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f1d2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8054:
  #0: ffff88809c8aa410 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f2d2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156
2 locks held by getty/8055:
  #0: ffff8880a0267310 (&tty->ldisc_sem){++++}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
  #1: ffffc90005f012e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 1053 Comm: khungtaskd Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113
  nmi_cpu_backtrace+0xaf/0x1a0 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x174/0x290 lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x10/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_all_cpu_backtrace+0x17/0x20 include/linux/nmi.h:146
  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
  watchdog+0xbb9/0xbd0 kernel/hung_task.c:289
  kthread+0x332/0x350 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
RIP: 0010:check_preemption_disabled+0x6/0x2a0 lib/smp_processor_id.c:13
Code: 90 90 90 90 55 48 89 e5 e8 a7 b2 2e fe 48 c7 c7 79 91 38 88 48 c7 c6  
10 72 4e 88 e8 04 00 00 00 5d c3 66 90 55 48 89 e5 41 57 <41> 56 41 55 41  
54 53 48 83 ec 10 49 89 f6 49 89 ff e8 74 b2 2e fe
RSP: 0018:ffff8880aeb09ee0 EFLAGS: 00000093
RAX: ffffffff83448fc9 RBX: 0000000000000004 RCX: ffff8880a98c2340
RDX: 0000000000000000 RSI: ffffffff884e7210 RDI: ffffffff88389179
RBP: ffff8880aeb09ee8 R08: ffffffff816701e0 R09: fffffbfff117c3cd
R10: fffffbfff117c3cd R11: 0000000000000000 R12: 1ffff11015d64eaf
R13: ffff8880aeb00000 R14: dffffc0000000000 R15: ffff8880aeb2757c
FS:  0000000000000000(0000) GS:ffff8880aeb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 00000000a3d38000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  <IRQ>
  debug_smp_processor_id+0x1c/0x20 lib/smp_processor_id.c:57
  tick_nohz_stop_idle kernel/time/tick-sched.c:535 [inline]
  tick_nohz_irq_enter kernel/time/tick-sched.c:1255 [inline]
  tick_irq_enter+0xbd/0x3e0 kernel/time/tick-sched.c:1274
  irq_enter+0x52/0xc0 kernel/softirq.c:354
  scheduler_ipi+0xb3/0x4a0 kernel/sched/core.c:2337
  smp_reschedule_interrupt+0x7a/0xa0 arch/x86/kernel/smp.c:244
  reschedule_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:854
  </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: 3c fa eb ae 89 d9 80 e1 07 80 c1 03 38 c1 7c ba 48 89 df e8 94 ab 3c  
fa eb b0 90 90 e9 07 00 00 00 0f 00 2d b6 99 52 00 fb f4 <c3> 90 e9 07 00  
00 00 0f 00 2d a6 99 52 00 f4 c3 90 90 55 48 89 e5
RSP: 0018:ffff8880a98cfe10 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff02
RAX: 1ffffffff1115179 RBX: ffff8880a98c2340 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffffffff812ba81a RDI: ffff8880a98c2b80
RBP: ffff8880a98cfe18 R08: ffff8880a98c2b98 R09: ffffed1015318469
R10: ffffed1015318469 R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff11015318468 R14: dffffc0000000000 R15: 1ffffffff1115177
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571
  default_idle_call+0x59/0xa0 kernel/sched/idle.c:94
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x140/0x670 kernel/sched/idle.c:264
  cpu_startup_entry+0x25/0x30 kernel/sched/idle.c:356
  start_secondary+0x384/0x410 arch/x86/kernel/smpboot.c:264
  secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write
  2019-09-21 16:19 ` syzbot
@ 2022-08-26 15:27   ` Tetsuo Handa
  2022-08-27  6:11     ` [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set Tetsuo Handa
  2022-10-06 14:55     ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Christian Schoenebeck
  0 siblings, 2 replies; 12+ messages in thread
From: Tetsuo Handa @ 2022-08-26 15:27 UTC (permalink / raw)
  To: Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	Christian Schoenebeck
  Cc: syzbot, v9fs-developer, syzkaller-bugs, netdev

syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.

Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
the file descriptor refers to a pipe. In other words, pipe file descriptor
needs to be handled as if socket file descriptor.

We somehow need to interrupt kernel_read()/kernel_write() on pipes.

A minimal change, which this patch is doing, is to set O_NONBLOCK flag
 from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
of regular files. But this approach changes O_NONBLOCK flag on userspace-
supplied file descriptors (which might break userspace programs), and
O_NONBLOCK flag could be changed by userspace. It would be possible to set
O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
remains small race window for clearing O_NONBLOCK flag.

If we don't want to manipulate O_NONBLOCK flag, we might be able to
surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
processed by kernel threads which process global system_wq workqueue,
signals could not be delivered from remote threads when p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
needed if we count on signals for making kernel_read()/kernel_write()
non-blocking.

Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
Reported-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
---
Although syzbot tested that this patch solves hung task problem, syzbot
cannot verify that this patch will not break functionality of p9 users.
Please test before applying this patch.

 net/9p/trans_fd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index e758978b44be..9870597da583 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -821,11 +821,13 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)
 		goto out_free_ts;
 	if (!(ts->rd->f_mode & FMODE_READ))
 		goto out_put_rd;
+	ts->rd->f_flags |= O_NONBLOCK;
 	ts->wr = fget(wfd);
 	if (!ts->wr)
 		goto out_put_rd;
 	if (!(ts->wr->f_mode & FMODE_WRITE))
 		goto out_put_wr;
+	ts->wr->f_flags |= O_NONBLOCK;
 
 	client->trans = ts;
 	client->status = Connected;
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-08-26 15:27   ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Tetsuo Handa
@ 2022-08-27  6:11     ` Tetsuo Handa
  2022-09-01 15:23       ` Christian Schoenebeck
  2022-10-06 14:55     ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Christian Schoenebeck
  1 sibling, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2022-08-27  6:11 UTC (permalink / raw)
  To: Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet,
	Christian Schoenebeck
  Cc: syzbot, v9fs-developer, syzkaller-bugs, netdev

syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.

Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_{read,write}(). However, since p9_fd_open() does
not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_{read,write}() when the file
descriptor refers to a pipe. In other words, pipe file descriptor needs
to be handled as if socket file descriptor. We somehow need to interrupt
kernel_{read,write}() on pipes.

If we can tolerate "possibility of breaking userspace program by setting
O_NONBLOCK flag on userspace-supplied file descriptors" and "possibility
of race window that userspace program clears O_NONBLOCK flag between after
automatically setting O_NONBLOCK flag and before calling
kernel_{read,write}()", we could automatically set O_NONBLOCK flag
immediately before calling kernel_{read,write}().

A different approach, which this patch is doing, is to surround
kernel_{read,write}() with set_thread_flag(TIF_SIGPENDING) and
recalc_sigpending(). This might be ugly and bit costly, but does not
touch userspace-supplied file descriptors.

Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
Reported-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
---
Although syzbot tested that this patch solves hung task problem, syzbot
cannot verify that this patch will not break functionality of p9 users.
Please test before applying this patch.

 net/9p/trans_fd.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index e758978b44be..e2f4e3245a80 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -256,11 +256,13 @@ static int p9_fd_read(struct p9_client *client, void *v, int len)
 	if (!ts)
 		return -EREMOTEIO;
 
-	if (!(ts->rd->f_flags & O_NONBLOCK))
-		p9_debug(P9_DEBUG_ERROR, "blocking read ...\n");
-
 	pos = ts->rd->f_pos;
+	/* Force non-blocking read() even without O_NONBLOCK. */
+	set_thread_flag(TIF_SIGPENDING);
 	ret = kernel_read(ts->rd, v, len, &pos);
+	spin_lock_irq(&current->sighand->siglock);
+	recalc_sigpending();
+	spin_unlock_irq(&current->sighand->siglock);
 	if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN)
 		client->status = Disconnected;
 	return ret;
@@ -423,10 +425,12 @@ static int p9_fd_write(struct p9_client *client, void *v, int len)
 	if (!ts)
 		return -EREMOTEIO;
 
-	if (!(ts->wr->f_flags & O_NONBLOCK))
-		p9_debug(P9_DEBUG_ERROR, "blocking write ...\n");
-
+	/* Force non-blocking write() even without O_NONBLOCK. */
+	set_thread_flag(TIF_SIGPENDING);
 	ret = kernel_write(ts->wr, v, len, &ts->wr->f_pos);
+	spin_lock_irq(&current->sighand->siglock);
+	recalc_sigpending();
+	spin_unlock_irq(&current->sighand->siglock);
 	if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN)
 		client->status = Disconnected;
 	return ret;
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-08-27  6:11     ` [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set Tetsuo Handa
@ 2022-09-01 15:23       ` Christian Schoenebeck
  2022-09-01 22:25         ` Tetsuo Handa
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Schoenebeck @ 2022-09-01 15:23 UTC (permalink / raw)
  To: Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Tetsuo Handa
  Cc: syzbot, v9fs-developer, syzkaller-bugs, netdev

On Samstag, 27. August 2022 08:11:48 CEST Tetsuo Handa wrote:
> syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
>  from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
> started kernel_read() from p9_fd_read() from p9_read_work() and/or
> kernel_write() from p9_fd_write() from p9_write_work() requests.
> 
> Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
> need to interrupt kernel_{read,write}(). However, since p9_fd_open() does
> not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
> p9_mux_poll_stop() needs to interrupt kernel_{read,write}() when the file
> descriptor refers to a pipe. In other words, pipe file descriptor needs
> to be handled as if socket file descriptor. We somehow need to interrupt
> kernel_{read,write}() on pipes.
> 
> If we can tolerate "possibility of breaking userspace program by setting
> O_NONBLOCK flag on userspace-supplied file descriptors" and "possibility
> of race window that userspace program clears O_NONBLOCK flag between after
> automatically setting O_NONBLOCK flag and before calling
> kernel_{read,write}()", we could automatically set O_NONBLOCK flag
> immediately before calling kernel_{read,write}().
> 
> A different approach, which this patch is doing, is to surround
> kernel_{read,write}() with set_thread_flag(TIF_SIGPENDING) and
> recalc_sigpending(). This might be ugly and bit costly, but does not
> touch userspace-supplied file descriptors.

So the intention in this alternative approach is to allow user space apps  
still being able to perform blocking I/O, while at the same time making the 
kernel thread interruptible to fix this hung task issue, correct?

> Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
> Reported-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Tested-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
> ---
> Although syzbot tested that this patch solves hung task problem, syzbot
> cannot verify that this patch will not break functionality of p9 users.
> Please test before applying this patch.
> 
>  net/9p/trans_fd.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index e758978b44be..e2f4e3245a80 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -256,11 +256,13 @@ static int p9_fd_read(struct p9_client *client, void
> *v, int len) if (!ts)
>  		return -EREMOTEIO;
> 
> -	if (!(ts->rd->f_flags & O_NONBLOCK))
> -		p9_debug(P9_DEBUG_ERROR, "blocking read ...\n");
> -
>  	pos = ts->rd->f_pos;
> +	/* Force non-blocking read() even without O_NONBLOCK. */
> +	set_thread_flag(TIF_SIGPENDING);
>  	ret = kernel_read(ts->rd, v, len, &pos);
> +	spin_lock_irq(&current->sighand->siglock);
> +	recalc_sigpending();
> +	spin_unlock_irq(&current->sighand->siglock);

Is the recalc_sigpending() block here actually needed? The TIF_SIGPENDING flag 
is already cleared by net/9p/client.c, no?

>  	if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN)
>  		client->status = Disconnected;
>  	return ret;
> @@ -423,10 +425,12 @@ static int p9_fd_write(struct p9_client *client, void
> *v, int len) if (!ts)
>  		return -EREMOTEIO;
> 
> -	if (!(ts->wr->f_flags & O_NONBLOCK))
> -		p9_debug(P9_DEBUG_ERROR, "blocking write ...\n");
> -
> +	/* Force non-blocking write() even without O_NONBLOCK. */
> +	set_thread_flag(TIF_SIGPENDING);
>  	ret = kernel_write(ts->wr, v, len, &ts->wr->f_pos);
> +	spin_lock_irq(&current->sighand->siglock);
> +	recalc_sigpending();
> +	spin_unlock_irq(&current->sighand->siglock);
>  	if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN)
>  		client->status = Disconnected;
>  	return ret;



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-09-01 15:23       ` Christian Schoenebeck
@ 2022-09-01 22:25         ` Tetsuo Handa
  2022-09-03 23:39           ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2022-09-01 22:25 UTC (permalink / raw)
  To: Christian Schoenebeck, Eric Van Hensbergen, Latchesar Ionkov,
	Dominique Martinet
  Cc: syzbot, v9fs-developer, syzkaller-bugs, netdev, linux-fsdevel

On 2022/09/02 0:23, Christian Schoenebeck wrote:
> So the intention in this alternative approach is to allow user space apps  
> still being able to perform blocking I/O, while at the same time making the 
> kernel thread interruptible to fix this hung task issue, correct?

Making the kernel thread "non-blocking" (rather than "interruptible") in order
not to be blocked on I/O on pipes.

Since kernel threads by default do not receive signals, being "interruptible"
or "killable" does not help (except for silencing khungtaskd warning). Being
"non-blocking" like I/O on sockets helps.

>> --- a/net/9p/trans_fd.c
>> +++ b/net/9p/trans_fd.c
>> @@ -256,11 +256,13 @@ static int p9_fd_read(struct p9_client *client, void
>> *v, int len) if (!ts)
>>  		return -EREMOTEIO;
>>
>> -	if (!(ts->rd->f_flags & O_NONBLOCK))
>> -		p9_debug(P9_DEBUG_ERROR, "blocking read ...\n");
>> -
>>  	pos = ts->rd->f_pos;
>> +	/* Force non-blocking read() even without O_NONBLOCK. */
>> +	set_thread_flag(TIF_SIGPENDING);
>>  	ret = kernel_read(ts->rd, v, len, &pos);
>> +	spin_lock_irq(&current->sighand->siglock);
>> +	recalc_sigpending();
>> +	spin_unlock_irq(&current->sighand->siglock);
> 
> Is the recalc_sigpending() block here actually needed? The TIF_SIGPENDING flag 
> is already cleared by net/9p/client.c, no?

This is actually needed.

The thread which processes this function is a kernel workqueue thread which
is supposed to process other functions (which might call "interruptible"
functions even if signals are not received by default).

The thread which currently clearing the TIF_SIGPENDING flag is a user process
(which are calling "killable" functions from syscall context but effectively
"uninterruptible" due to clearing the TIF_SIGPENDING flag and retrying).
By the way, clearing the TIF_SIGPENDING flag before retrying "killable" functions
(like p9_client_rpc() does) is very bad and needs to be avoided...


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-09-01 22:25         ` Tetsuo Handa
@ 2022-09-03 23:39           ` Dominique Martinet
  2022-09-04  0:27             ` Tetsuo Handa
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2022-09-03 23:39 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Christian Schoenebeck, Eric Van Hensbergen, Latchesar Ionkov,
	syzbot, v9fs-developer, syzkaller-bugs, netdev, linux-fsdevel

Thanks for the patch and sorry for the slow reply

v1 vs v2: my take is that I think v1 is easier to understand, and if you
pass a fd to be used as kernel end for 9p you shouldn't also be messing
with it so it's fair game to make it O_NONBLOCK -- we're reading and
writing to these things, the fds shouldn't be used by the caller after
the mount syscall.

Is there any reason you spent time working on v2, or is that just
theorical for not messing with userland fd ?

unless there's any reason I'll try to find time to test v1 and queue it
for 6.1

Tetsuo Handa wrote on Fri, Sep 02, 2022 at 07:25:30AM +0900:
> On 2022/09/02 0:23, Christian Schoenebeck wrote:
> > So the intention in this alternative approach is to allow user space apps  
> > still being able to perform blocking I/O, while at the same time making the 
> > kernel thread interruptible to fix this hung task issue, correct?
> 
> Making the kernel thread "non-blocking" (rather than "interruptible") in order
> not to be blocked on I/O on pipes.
> 
> Since kernel threads by default do not receive signals, being "interruptible"
> or "killable" does not help (except for silencing khungtaskd warning). Being
> "non-blocking" like I/O on sockets helps.

I'm still not 100% sure I understand the root of the deadlock, but I can
agree the worker thread shouldn't block.

We seem to check for EAGAIN where kernel_read/write end up being called
and there's a poll for scheduling so it -should- work, but I assume this
hasn't been tested much and might take a bit of time to test.


> The thread which currently clearing the TIF_SIGPENDING flag is a user process
> (which are calling "killable" functions from syscall context but effectively
> "uninterruptible" due to clearing the TIF_SIGPENDING flag and retrying).
> By the way, clearing the TIF_SIGPENDING flag before retrying "killable" functions
> (like p9_client_rpc() does) is very bad and needs to be avoided...

Yes, I really wish we could make this go away.
I started work to make the cancel (flush) asynchronous, but it
introduced a regression I never had (and still don't have) time to
figure out... If you have motivation to take over, the patches I sent
are here:
https://lore.kernel.org/all/20181217110111.GB17466@nautica/T/
(unfortunately some refactoring happened and they no longer apply, but
the logic should be mostly sane)


Thanks,
--
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-09-03 23:39           ` Dominique Martinet
@ 2022-09-04  0:27             ` Tetsuo Handa
  2022-10-07  1:40               ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2022-09-04  0:27 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Christian Schoenebeck, Eric Van Hensbergen, Latchesar Ionkov,
	syzbot, v9fs-developer, syzkaller-bugs, netdev, linux-fsdevel

On 2022/09/04 8:39, Dominique Martinet wrote:
> Is there any reason you spent time working on v2, or is that just
> theorical for not messing with userland fd ?

Just theoretical for not messing with userland fd, for programs generated
by fuzzers might use fds passed to the mount() syscall. I imagined that
syzbot again reports this problem when it started playing with fcntl().

For robustness, not messing with userland fd is the better. ;-)

> 
> unless there's any reason I'll try to find time to test v1 and queue it
> for 6.1

OK.

> We seem to check for EAGAIN where kernel_read/write end up being called
> and there's a poll for scheduling so it -should- work, but I assume this
> hasn't been tested much and might take a bit of time to test.

Right. Since the I/O in kernel side is poll based multiplexing, forcing
non-blocking I/O -should- work. (But I can't test e.g. changes in CPU time
usage because I don't have environment to test. I assume that poll based
multiplexing saves us from doing busy looping.)

We are currently checking for ERESTARTSYS and EAGAIN. The former is for
non-socket fds which do not have O_NONBLOCK flag, and the latter is for
socket fds which have O_NONBLOCK flag. If we enforce O_NONBLOCK flag,
the former will become redundant. I think we can remove the former check
after you tested that setting O_NONBLOCK flag on non-socket fds does not break.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write
  2022-08-26 15:27   ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Tetsuo Handa
  2022-08-27  6:11     ` [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set Tetsuo Handa
@ 2022-10-06 14:55     ` Christian Schoenebeck
  2022-10-07  1:03       ` Dominique Martinet
  1 sibling, 1 reply; 12+ messages in thread
From: Christian Schoenebeck @ 2022-10-06 14:55 UTC (permalink / raw)
  To: Dominique Martinet, Tetsuo Handa
  Cc: Eric Van Hensbergen, Latchesar Ionkov, syzbot, v9fs-developer,
	syzkaller-bugs, netdev

On Freitag, 26. August 2022 17:27:46 CEST Tetsuo Handa wrote:
> syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
>  from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
> started kernel_read() from p9_fd_read() from p9_read_work() and/or
> kernel_write() from p9_fd_write() from p9_write_work() requests.
> 
> Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
> need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
> does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
> p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
> the file descriptor refers to a pipe. In other words, pipe file descriptor
> needs to be handled as if socket file descriptor.
> 
> We somehow need to interrupt kernel_read()/kernel_write() on pipes.
> 
> A minimal change, which this patch is doing, is to set O_NONBLOCK flag
>  from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
> of regular files. But this approach changes O_NONBLOCK flag on userspace-
> supplied file descriptors (which might break userspace programs), and
> O_NONBLOCK flag could be changed by userspace. It would be possible to set
> O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
> remains small race window for clearing O_NONBLOCK flag.
> 
> If we don't want to manipulate O_NONBLOCK flag, we might be able to
> surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
> and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
> processed by kernel threads which process global system_wq workqueue,
> signals could not be delivered from remote threads when p9_mux_poll_stop()
>  from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
> set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
> needed if we count on signals for making kernel_read()/kernel_write()
> non-blocking.
> 
> Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
> Reported-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Tested-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
> ---
> Although syzbot tested that this patch solves hung task problem, syzbot
> cannot verify that this patch will not break functionality of p9 users.
> Please test before applying this patch.
> 
>  net/9p/trans_fd.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

I would also prefer this simpler v1 instead of v2 for now. One nitpicking ...

> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index e758978b44be..9870597da583 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -821,11 +821,13 @@ static int p9_fd_open(struct p9_client *client, int
> rfd, int wfd) goto out_free_ts;
>  	if (!(ts->rd->f_mode & FMODE_READ))
>  		goto out_put_rd;
> +	ts->rd->f_flags |= O_NONBLOCK;

... I think this deserves a short comment like:

    /* prevent hung task with pipes */

Anyway,

Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>

>  	ts->wr = fget(wfd);
>  	if (!ts->wr)
>  		goto out_put_rd;
>  	if (!(ts->wr->f_mode & FMODE_WRITE))
>  		goto out_put_wr;
> +	ts->wr->f_flags |= O_NONBLOCK;
> 
>  	client->trans = ts;
>  	client->status = Connected;




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write
  2022-10-06 14:55     ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Christian Schoenebeck
@ 2022-10-07  1:03       ` Dominique Martinet
  0 siblings, 0 replies; 12+ messages in thread
From: Dominique Martinet @ 2022-10-07  1:03 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: Tetsuo Handa, Eric Van Hensbergen, Latchesar Ionkov, syzbot,
	v9fs-developer, syzkaller-bugs, netdev

Christian Schoenebeck wrote on Thu, Oct 06, 2022 at 04:55:23PM +0200:
> > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> > index e758978b44be..9870597da583 100644
> > --- a/net/9p/trans_fd.c
> > +++ b/net/9p/trans_fd.c
> > @@ -821,11 +821,13 @@ static int p9_fd_open(struct p9_client *client, int
> > rfd, int wfd) goto out_free_ts;
> >  	if (!(ts->rd->f_mode & FMODE_READ))
> >  		goto out_put_rd;
> > +	ts->rd->f_flags |= O_NONBLOCK;
> 
> ... I think this deserves a short comment like:
> 
>     /* prevent hung task with pipes */

Good point, I've sneaked in this comment:
    /* prevent workers from hanging on IO when fd is a pipe */

https://github.com/martinetd/linux/commit/ef575281b21e9a34dfae544a187c6aac2ae424a9


> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>

Thank you!

--
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-09-04  0:27             ` Tetsuo Handa
@ 2022-10-07  1:40               ` Dominique Martinet
  2022-10-07 11:52                 ` Tetsuo Handa
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2022-10-07  1:40 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Christian Schoenebeck, Eric Van Hensbergen, Latchesar Ionkov,
	syzbot, v9fs-developer, syzkaller-bugs, netdev, linux-fsdevel

Tetsuo Handa wrote on Sun, Sep 04, 2022 at 09:27:22AM +0900:
> On 2022/09/04 8:39, Dominique Martinet wrote:
> > Is there any reason you spent time working on v2, or is that just
> > theorical for not messing with userland fd ?
> 
> Just theoretical for not messing with userland fd, for programs generated
> by fuzzers might use fds passed to the mount() syscall. I imagined that
> syzbot again reports this problem when it started playing with fcntl().
> 
> For robustness, not messing with userland fd is the better. ;-)

By the way digging this back made me think about this a bit again.
My opinion hasn't really changed that if you want to shoot yourself in
the foot I don't think we're crossing any priviledge boundary here, but
we could probably prevent it by saying the mount call with close that fd
and somehow steal it? (drop the fget, close_fd after get_file perhaps?)

That should address your concern about robustess and syzbot will no
longer be able to play with fcntl on "our" end of the pipe. I think it's
fair to say that once you pass it to the kernel all bets are off, so
closing it for the userspace application could make sense, and the mount
already survives when short processes do the mount call and immediately
exit so it's not like we need that fd to be open...


What do you think?

(either way would be for 6.2, the patch is already good enough as is for
me)
--
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
  2022-10-07  1:40               ` Dominique Martinet
@ 2022-10-07 11:52                 ` Tetsuo Handa
  0 siblings, 0 replies; 12+ messages in thread
From: Tetsuo Handa @ 2022-10-07 11:52 UTC (permalink / raw)
  To: Dominique Martinet, Alexander Viro
  Cc: Christian Schoenebeck, Eric Van Hensbergen, Latchesar Ionkov,
	syzbot, v9fs-developer, syzkaller-bugs, netdev, linux-fsdevel

On 2022/10/07 10:40, Dominique Martinet wrote:
> Tetsuo Handa wrote on Sun, Sep 04, 2022 at 09:27:22AM +0900:
>> On 2022/09/04 8:39, Dominique Martinet wrote:
>>> Is there any reason you spent time working on v2, or is that just
>>> theorical for not messing with userland fd ?
>>
>> Just theoretical for not messing with userland fd, for programs generated
>> by fuzzers might use fds passed to the mount() syscall. I imagined that
>> syzbot again reports this problem when it started playing with fcntl().
>>
>> For robustness, not messing with userland fd is the better. ;-)
> 
> By the way digging this back made me think about this a bit again.
> My opinion hasn't really changed that if you want to shoot yourself in
> the foot I don't think we're crossing any priviledge boundary here, but
> we could probably prevent it by saying the mount call with close that fd
> and somehow steal it? (drop the fget, close_fd after get_file perhaps?)
> 
> That should address your concern about robustess and syzbot will no
> longer be able to play with fcntl on "our" end of the pipe. I think it's
> fair to say that once you pass it to the kernel all bets are off, so
> closing it for the userspace application could make sense, and the mount
> already survives when short processes do the mount call and immediately
> exit so it's not like we need that fd to be open...
> 
> 
> What do you think?

I found that pipe is using alloc_file_clone() which allocates "struct file"
instead of just incrementing "struct file"->f_count.

Then, can we add EXPORT_SYMBOL_GPL(alloc_file_clone) to fs/file_table.c and
use it like

  struct file *f;

  ts->rd = fget(rfd);
  if (!ts->rd)
    goto out_free_ts;
  if (!(ts->rd->f_mode & FMODE_READ))
    goto out_put_rd;
  f = alloc_file_clone(ts->rd, ts->rd->f_flags | O_NONBLOCK, ts->rd->f_op);
  if (IS_ERR(f))
    goto out_put_rd;
  fput(ts->rd);
  ts->rd = f;

  ts->wr = fget(wfd);
  if (!ts->wr)
    goto out_put_rd;
  if (!(ts->wr->f_mode & FMODE_WRITE))
    goto out_put_wr;
  f = alloc_file_clone(ts->wr, ts->wr->f_flags | O_NONBLOCK, ts->wr->f_op);
  if (IS_ERR(f))
    goto out_put_wr;
  fput(ts->wr);
  ts->wr = f;

 from p9_fd_open() for cloning "struct file" with O_NONBLOCK flag added?
Just an idea. I don't know whether alloc_file_clone() arguments are correct...


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-10-07 11:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-30 19:28 INFO: task hung in p9_fd_close syzbot
2019-09-21 16:19 ` syzbot
2022-08-26 15:27   ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Tetsuo Handa
2022-08-27  6:11     ` [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set Tetsuo Handa
2022-09-01 15:23       ` Christian Schoenebeck
2022-09-01 22:25         ` Tetsuo Handa
2022-09-03 23:39           ` Dominique Martinet
2022-09-04  0:27             ` Tetsuo Handa
2022-10-07  1:40               ` Dominique Martinet
2022-10-07 11:52                 ` Tetsuo Handa
2022-10-06 14:55     ` [PATCH] 9p/trans_fd: always use O_NONBLOCK read/write Christian Schoenebeck
2022-10-07  1:03       ` Dominique Martinet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.