* INFO: task hung in pipe_release (2)
@ 2020-08-02 18:26 syzbot
  2020-08-07 10:35 ` splice: infinite busy loop lockup bug Tetsuo Handa
  2020-08-13  3:57 ` syzbot
  0 siblings, 2 replies; 22+ messages in thread
From: syzbot @ 2020-08-02 18:26 UTC (permalink / raw)
To: linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    6ba1b005 Merge tag 'asm-generic-fixes-5.8' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16a5a3b0900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=84f076779e989e69
dashboard link: https://syzkaller.appspot.com/bug?extid=61acc40a49a3e46e25ea
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=142ae224900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com

INFO: task syz-executor.0:7075 blocked for more than 143 seconds.
      Not tainted 5.8.0-rc7-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D28288  7075   6828 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:3458 [inline]
 __schedule+0x8ea/0x2210 kernel/sched/core.c:4219
 schedule+0xd0/0x2a0 kernel/sched/core.c:4294
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4353
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10d0 kernel/locking/mutex.c:1103
 __pipe_lock fs/pipe.c:87 [inline]
 pipe_release+0x49/0x320 fs/pipe.c:722
 __fput+0x33c/0x880 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:135
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop arch/x86/entry/common.c:239 [inline]
 __prepare_exit_to_usermode+0x1e9/0x1f0 arch/x86/entry/common.c:269
 do_syscall_64+0x6c/0xe0 arch/x86/entry/common.c:393
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x415ee1
Code: Bad RIP value.
RSP: 002b:00007ffd249bc0c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000415ee1 RDX: 0000000000000000 RSI: 00000000007904d0 RDI: 0000000000000006 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 00007ffd249bc1b0 R11: 0000000000000293 R12: 00000000007905a8 R13: 00000000000ffc79 R14: ffffffffffffffff R15: 000000000078bfac Showing all locks held in the system: 1 lock held by khungtaskd/1153: #0: ffffffff89bc11c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5779 1 lock held by in:imklog/6695: #0: ffff8880979b6670 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826 1 lock held by syz-executor.0/7075: #0: ffff8880a8e02068 (&pipe->mutex/1){+.+.}-{3:3}, at: __pipe_lock fs/pipe.c:87 [inline] #0: ffff8880a8e02068 (&pipe->mutex/1){+.+.}-{3:3}, at: pipe_release+0x49/0x320 fs/pipe.c:722 2 locks held by syz-executor.0/7077: ============================================= NMI backtrace for cpu 0 CPU: 0 PID: 1153 Comm: khungtaskd Not tainted 5.8.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x1b3/0x223 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:209 [inline] watchdog+0xd7d/0x1000 kernel/hung_task.c:295 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 7077 Comm: syz-executor.0 Not tainted 5.8.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:iov_iter_alignment+0x39e/0x850 lib/iov_iter.c:1236 Code: 48 8b 44 24 18 80 38 00 0f 85 f4 
03 00 00 4d 8b 7d 18 44 89 e3 48 89 d8 48 c1 e0 04 4c 01 f8 48 8d 78 0c 48 89 fa 48 c1 ea 03 <42> 0f b6 0c 32 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85 RSP: 0018:ffffc9000206f868 EFLAGS: 00000a07 RAX: ffff8880a8124c00 RBX: 0000000000000000 RCX: ffffffff8395bf6f RDX: 1ffff11015024981 RSI: ffffffff8395be14 RDI: ffff8880a8124c0c RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8880994e63c7 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffffc9000206fc68 R14: dffffc0000000000 R15: ffff8880a8124c00 FS: 00007fba306d0700(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9573761000 CR3: 0000000091a0d000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ext4_unaligned_io fs/ext4/file.c:175 [inline] ext4_dio_write_iter fs/ext4/file.c:465 [inline] ext4_file_write_iter+0x345/0x13d0 fs/ext4/file.c:653 call_write_iter include/linux/fs.h:1908 [inline] do_iter_readv_writev+0x567/0x780 fs/read_write.c:713 do_iter_write+0x188/0x5f0 fs/read_write.c:1018 vfs_iter_write+0x70/0xa0 fs/read_write.c:1059 iter_file_splice_write+0x721/0xbe0 fs/splice.c:750 do_splice_from fs/splice.c:846 [inline] do_splice+0xbcd/0x1820 fs/splice.c:1144 __do_sys_splice fs/splice.c:1419 [inline] __se_sys_splice fs/splice.c:1401 [inline] __x64_sys_splice+0x198/0x250 fs/splice.c:1401 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c369 Code: 8d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fba306cfc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000113 RAX: ffffffffffffffda RBX: 0000000000032240 RCX: 000000000045c369 RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000005 RBP: 
000000000078bff8 R08: 000000000000ffe0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000078bfac R13: 00007ffd249bc04f R14: 00007fba306d09c0 R15: 000000000078bfac --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches ^ permalink raw reply [flat|nested] 22+ messages in thread
* splice: infinite busy loop lockup bug
  2020-08-02 18:26 INFO: task hung in pipe_release (2) syzbot
@ 2020-08-07 10:35 ` Tetsuo Handa
  2020-08-07 12:27   ` Al Viro
  2020-08-13  3:57 ` syzbot
  1 sibling, 1 reply; 22+ messages in thread
From: Tetsuo Handa @ 2020-08-07 10:35 UTC (permalink / raw)
To: syzkaller-bugs, viro; +Cc: syzbot, linux-fsdevel, linux-kernel

syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
call_write_iter() from do_iter_readv_writev() from do_iter_write() from
vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
with pipe->mutex held.

The reason of falling into infinite busy loop is that iter_file_splice_write()
for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
while for_each_bvec() cannot handle .bv_len == 0.

--- a/fs/splice.c
+++ b/fs/splice.c
@@ -747,6 +747,14 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
 		}
 		iov_iter_bvec(&from, WRITE, array, n, sd.total_len - left);
+		if (!strncmp(current->comm, "syz-executor", 12)) {
+			int i;
+			printk("Starting vfs_write_iter from.type=%d from.iov_offset=%zu from.count=%zu n=%u sd.total_len=%zu left=%zu\n",
+			       from.type, from.iov_offset, from.count, n, sd.total_len, left);
+			for (i = 0; i < n; i++)
+				printk(" array[%u]: bv_page=%px bv_len=%u bv_offset=%u\n",
+				       i, array[i].bv_page, array[i].bv_len, array[i].bv_offset);
+		}
 		ret = vfs_iter_write(out, &from, &sd.pos, 0);
 		if (ret <= 0)
 			break;

When splice() from pipe to file works.

[   31.704915][ T6552] Starting vfs_write_iter from.type=17 from.iov_offset=0 from.count=4096 n=1 sd.total_len=65504 left=61408
[   31.709098][ T6552]  array[0]: bv_page=ffffea000870a7c0 bv_len=4096 bv_offset=0

When splice() from pipe to file falls into infinite busy loop.
[   31.717178][ T6553] Starting vfs_write_iter from.type=17 from.iov_offset=0 from.count=4096 n=2 sd.total_len=65504 left=61408
[   31.720983][ T6553]  array[0]: bv_page=ffffea0008706680 bv_len=0 bv_offset=0
[   31.723565][ T6553]  array[1]: bv_page=ffffea00086f4e80 bv_len=4096 bv_offset=0

Is it normal behavior that an empty page is linked to pipe's array?
If yes, don't we need to skip empty pages when iter_file_splice_write()
fills in "struct bio_vec *array" ?

[1] https://syzkaller.appspot.com/bug?id=2ccac875e85dc852911a0b5b788ada82dc0a081e

^ permalink raw reply	[flat|nested] 22+ messages in thread
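The stall Tetsuo describes can be modeled in user space. The sketch below is a simplified, hypothetical model of the bvec-advance bookkeeping (the `struct seg` type and `consume()` function are illustrative, not the kernel's actual `for_each_bvec()`/`bvec_iter_advance()` code): the iterator only moves to the next segment while bytes are being consumed, so a zero-length segment yields a step of 0 bytes and is never skipped, which is the busy loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec segment: only the length matters. */
struct seg {
	size_t len;
};

/*
 * Simplified model of the bvec advance loop: each pass consumes
 * min(bytes left in current segment, remaining), and moves to the
 * next segment only when bytes were actually consumed.
 *
 * Returns the number of passes needed to consume "remaining" bytes,
 * or -1 if no forward progress was made within "max_iters" passes
 * (the modeled busy loop).
 */
static int consume(const struct seg *segs, int nsegs, size_t remaining,
		   int max_iters)
{
	int i = 0, iters = 0;
	size_t done = 0;	/* bytes already consumed within segs[i] */

	while (remaining && i < nsegs) {
		size_t step = segs[i].len - done;

		if (step > remaining)
			step = remaining;
		if (++iters > max_iters)
			return -1;	/* stuck: segs[i].len == 0 made step == 0 */
		done += step;
		remaining -= step;
		/*
		 * The segment is skipped only while bytes are consumed,
		 * so a zero-length segment (step == 0) never advances i.
		 */
		if (step && done == segs[i].len) {
			i++;
			done = 0;
		}
	}
	return iters;
}
```

With two 4096-byte segments the loop finishes in two passes; prepend a zero-length segment, as in the `array[0]: bv_len=0` output above, and the model makes no progress at all.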
* Re: splice: infinite busy loop lockup bug
  2020-08-07 10:35 ` splice: infinite busy loop lockup bug Tetsuo Handa
@ 2020-08-07 12:27   ` Al Viro
  2020-08-07 12:34     ` Tetsuo Handa
  2020-08-07 12:38     ` Al Viro
  0 siblings, 2 replies; 22+ messages in thread
From: Al Viro @ 2020-08-07 12:27 UTC (permalink / raw)
To: Tetsuo Handa
Cc: syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei

On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
> with pipe->mutex held.
>
> The reason of falling into infinite busy loop is that iter_file_splice_write()
> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
> while for_each_bvec() cannot handle .bv_len == 0.

broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
unless I'm misreading it...

Zero-length segments are not disallowed; it's not all that hard to filter them
out in iter_file_splice_write(), but the intent had always been to have
iterate_all_kinds() et.al. able to cope with those.

How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW?
There might be something else fishy going on...

^ permalink raw reply	[flat|nested] 22+ messages in thread
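Al's "filter them out in iter_file_splice_write()" option can be sketched in isolation. This is a hypothetical user-space stand-in, not the committed fix: the `struct pbvec` type and `filter_zero_len()` name are illustrative, and the real code would do the filtering while copying pipe buffers into the bio_vec array rather than as a separate pass.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct bio_vec; only the length matters here. */
struct pbvec {
	unsigned int len;
	unsigned int offset;
};

/*
 * Compact the segment array in place, dropping len == 0 entries so a
 * downstream iterator never sees them.  Returns the new segment count.
 */
static unsigned int filter_zero_len(struct pbvec *vec, unsigned int n)
{
	unsigned int i, out = 0;

	for (i = 0; i < n; i++)
		if (vec[i].len)
			vec[out++] = vec[i];
	return out;
}
```

Applied to the case in the report ({len=0}, {len=4096}), this would hand the iterator a single 4096-byte segment, sidestepping the busy loop regardless of whether the iterator itself learns to cope with empty segments.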
* Re: splice: infinite busy loop lockup bug
  2020-08-07 12:27 ` Al Viro
@ 2020-08-07 12:34   ` Tetsuo Handa
  2020-09-01  0:51     ` Qian Cai
  2020-08-07 12:38   ` Al Viro
  1 sibling, 1 reply; 22+ messages in thread
From: Tetsuo Handa @ 2020-08-07 12:34 UTC (permalink / raw)
To: Al Viro; +Cc: syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei

On 2020/08/07 21:27, Al Viro wrote:
> On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
>> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
>> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
>> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
>> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
>> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
>> with pipe->mutex held.
>>
>> The reason of falling into infinite busy loop is that iter_file_splice_write()
>> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
>> while for_each_bvec() cannot handle .bv_len == 0.
>
> broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
> unless I'm misreading it...
>
> Zero-length segments are not disallowed; it's not all that hard to filter them
> out in iter_file_splice_write(), but the intent had always been to have
> iterate_all_kinds() et.al. able to cope with those.
>
> How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW?
> There might be something else fishy going on...
>

OK. Indeed writing to empty pipe which returns -EFAULT allows an empty
page to be linked to pipe's array.

Now, I've just found a simple reproducer, and confirmed that this bug is
a local lockup DoS by unprivileged user. Please fix.
----------
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	static char buffer[4096];
	const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600);
	int pipe_fd[2] = { EOF, EOF };
	pipe(pipe_fd);
	write(pipe_fd[1], NULL, 4096);
	write(pipe_fd[1], buffer, 4096);
	splice(pipe_fd[0], NULL, fd, NULL, 65536, 0);
	return 0;
}
----------

[  125.598898][    C0] rcu: INFO: rcu_sched self-detected stall on CPU
[  125.601072][    C0] rcu:     0-....: (20171 ticks this GP) idle=526/1/0x4000000000000000 softirq=7918/7918 fqs=5136
[  125.604874][    C0]  (t=21006 jiffies g=9341 q=30)
[  125.606512][    C0] NMI backtrace for cpu 0
[  125.607931][    C0] CPU: 0 PID: 2792 Comm: a.out Not tainted 5.8.0+ #793
[  125.610948][    C0] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
[  125.614938][    C0] Call Trace:
[  125.616049][    C0]  <IRQ>
[  125.617010][    C0]  dump_stack+0x5e/0x7a
[  125.618370][    C0]  nmi_cpu_backtrace.cold.7+0x14/0x52
[  125.620148][    C0]  ? lapic_can_unplug_cpu.cold.39+0x3a/0x3a
[  125.622074][    C0]  nmi_trigger_cpumask_backtrace+0x92/0x9f
[  125.624154][    C0]  arch_trigger_cpumask_backtrace+0x14/0x20
[  125.626102][    C0]  rcu_dump_cpu_stacks+0xa0/0xd0
[  125.627919][    C0]  rcu_sched_clock_irq.cold.95+0x121/0x39c
[  125.629833][    C0]  ? acct_account_cputime+0x17/0x20
[  125.631534][    C0]  ? account_system_index_time+0x8a/0xa0
[  125.633422][    C0]  update_process_times+0x23/0x60
[  125.635070][    C0]  tick_sched_handle.isra.22+0x20/0x60
[  125.636870][    C0]  tick_sched_timer+0x68/0x80
[  125.638403][    C0]  ?
tick_sched_handle.isra.22+0x60/0x60
[  125.640588][    C0]  __hrtimer_run_queues+0xf9/0x1a0
[  125.642591][    C0]  hrtimer_interrupt+0xfc/0x210
[  125.645033][    C0]  __sysvec_apic_timer_interrupt+0x4c/0x60
[  125.647292][    C0]  asm_call_on_stack+0xf/0x20
[  125.649192][    C0]  </IRQ>
[  125.650501][    C0]  sysvec_apic_timer_interrupt+0x75/0x80
[  125.652900][    C0]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  125.655487][    C0] RIP: 0010:iov_iter_copy_from_user_atomic+0x19b/0x350
[  125.658124][    C0] Code: 89 45 d0 48 c1 e6 06 48 03 37 4d 8d 3c 09 4c 89 cf e8 d9 e5 ff ff 48 8b 45 d0 45 39 eb 0f 87 35 01 00 00 49 8b 4a 18 4d 89 f9 <45> 29 dd 45 01 d8 75 12 eb 19 41 83 c4 01 41 29 c0 74 10 44 89 e0
[  125.666132][    C0] RSP: 0018:ffffa6cdc1237aa8 EFLAGS: 00000246
[  125.668557][    C0] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff945035a25100
[  125.671576][    C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff945035a25100
[  125.674851][    C0] RBP: ffffa6cdc1237ad8 R08: 0000000000000000 R09: ffff945028a80000
[  125.677989][    C0] R10: ffffa6cdc1237de0 R11: 0000000000000000 R12: 0000000000000000
[  125.680990][    C0] R13: 0000000000001000 R14: 0000000000001000 R15: 0000000000001000
[  125.684006][    C0]  iomap_write_actor+0xbe/0x190
[  125.685982][    C0]  ? iomap_write_begin+0x460/0x460
[  125.688031][    C0]  iomap_apply+0xf4/0x1a0
[  125.689810][    C0]  ? iomap_write_begin+0x460/0x460
[  125.691826][    C0]  iomap_file_buffered_write+0x69/0x90
[  125.698598][    C0]  ?
iomap_write_begin+0x460/0x460
[  125.705341][    C0]  xfs_file_buffered_aio_write+0xc2/0x2c0
[  125.707780][    C0]  xfs_file_write_iter+0xa3/0xc0
[  125.709802][    C0]  do_iter_readv_writev+0x15b/0x1c0
[  125.712496][    C0]  do_iter_write+0x81/0x190
[  125.715245][    C0]  vfs_iter_write+0x14/0x20
[  125.717221][    C0]  iter_file_splice_write+0x288/0x3e0
[  125.719340][    C0]  do_splice_from+0x1a/0x40
[  125.721175][    C0]  do_splice+0x2e5/0x620
[  125.722950][    C0]  __x64_sys_splice+0x97/0x100
[  125.724937][    C0]  do_syscall_64+0x31/0x40
[  125.726766][    C0]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  125.729388][    C0] RIP: 0033:0x7f7515dd91c3
[  125.731246][    C0] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d fd dd 2c 00 00 75 13 49 89 ca b8 13 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b d4 00 00 48 89 04 24
[  125.749632][    C0] RSP: 002b:00007ffd553cde18 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[  125.753713][    C0] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7515dd91c3
[  125.756867][    C0] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004
[  125.759872][    C0] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000
[  125.763023][    C0] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005a6
[  125.766225][    C0] R13: 00007ffd553cdf10 R14: 0000000000000000 R15: 0000000000000000

^ permalink raw reply	[flat|nested] 22+ messages in thread
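Why the reproducer's first write() matters: per Tetsuo's observation, on the affected kernels the pipe buffer slot is queued (head advanced, length zero) before user data is copied, so a copy that faults on the NULL address leaves a zero-length buffer in the ring. The following is a minimal user-space model of that ordering, with hypothetical names (`struct pbuf`, `model_pipe_write`), not the real fs/pipe.c code:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of one pipe buffer slot. */
struct pbuf {
	unsigned int len;
	int queued;
};

#define MODEL_EFAULT 14

/*
 * Models the ordering being discussed: the slot is queued with len = 0
 * before the user copy happens.  If the copy faults (modeled here by a
 * NULL source), the function bails out and the zero-length slot stays
 * queued in the ring.
 */
static long model_pipe_write(struct pbuf *ring, unsigned int *head,
			     const char *ubuf, size_t count)
{
	struct pbuf *buf = &ring[(*head)++];	/* queue the slot first */

	buf->queued = 1;
	buf->len = 0;
	if (!ubuf)			/* stands in for the user copy faulting */
		return -MODEL_EFAULT;	/* slot remains queued with len == 0 */
	buf->len = count;		/* successful copy records the length */
	return count;
}
```

Running the reproducer's sequence through this model (a NULL write, then a real 4096-byte write) leaves slot 0 queued with length 0 and slot 1 with length 4096, matching the `array[0]: bv_len=0`, `array[1]: bv_len=4096` printk output earlier in the thread.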
* Re: splice: infinite busy loop lockup bug
  2020-08-07 12:34 ` Tetsuo Handa
@ 2020-09-01  0:51   ` Qian Cai
  2020-09-01  1:09     ` Al Viro
  2020-09-01  1:10     ` Ming Lei
  0 siblings, 2 replies; 22+ messages in thread
From: Qian Cai @ 2020-09-01 0:51 UTC (permalink / raw)
To: Tetsuo Handa
Cc: Al Viro, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei, paulmck

On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote:
> On 2020/08/07 21:27, Al Viro wrote:
> > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
> >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
> >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
> >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
> >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
> >> with pipe->mutex held.
> >>
> >> The reason of falling into infinite busy loop is that iter_file_splice_write()
> >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
> >> while for_each_bvec() cannot handle .bv_len == 0.
> >
> > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
> > unless I'm misreading it...

I have been chasing something similar for a while as in,

https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/

In my case, it seems the endless loop happens in iterate_iovec() instead where
I put a debug patch here,

--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -33,6 +33,7 @@
 		if (unlikely(!__v.iov_len)) \
 			continue; \
 		__v.iov_base = __p->iov_base; \
+		printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \
 		left = (STEP); \
 		__v.iov_len -= left; \
 		skip = __v.iov_len; \

and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the
soft-lockups and a dead system,

[ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423

Thoughts?
> > > > Zero-length segments are not disallowed; it's not all that hard to filter them > > out in iter_file_splice_write(), but the intent had always been to have > > iterate_all_kinds() et.al. able to cope with those. > > > > How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW? > > There might be something else fishy going on... > > > > OK. Indeed writing to empty pipe which returns -EFAULT allows an empty > page to be linked to pipe's array. > > Now, I've just found a simple reproducer, and confirmed that this bug is > a local lockup DoS by unprivileged user. Please fix. > > ---------- > #define _GNU_SOURCE > #include <stdio.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <unistd.h> > > int main(int argc, char *argv[]) > { > static char buffer[4096]; > const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600); > int pipe_fd[2] = { EOF, EOF }; > pipe(pipe_fd); > write(pipe_fd[1], NULL, 4096); > write(pipe_fd[1], buffer, 4096); > splice(pipe_fd[0], NULL, fd, NULL, 65536, 0); > return 0; > } > ---------- > > [ 125.598898][ C0] rcu: INFO: rcu_sched self-detected stall on CPU > [ 125.601072][ C0] rcu: 0-....: (20171 ticks this GP) idle=526/1/0x4000000000000000 softirq=7918/7918 fqs=5136 > [ 125.604874][ C0] (t=21006 jiffies g=9341 q=30) > [ 125.606512][ C0] NMI backtrace for cpu 0 > [ 125.607931][ C0] CPU: 0 PID: 2792 Comm: a.out Not tainted 5.8.0+ #793 > [ 125.610948][ C0] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 > [ 125.614938][ C0] Call Trace: > [ 125.616049][ C0] <IRQ> > [ 125.617010][ C0] dump_stack+0x5e/0x7a > [ 125.618370][ C0] nmi_cpu_backtrace.cold.7+0x14/0x52 > [ 125.620148][ C0] ? 
lapic_can_unplug_cpu.cold.39+0x3a/0x3a > [ 125.622074][ C0] nmi_trigger_cpumask_backtrace+0x92/0x9f > [ 125.624154][ C0] arch_trigger_cpumask_backtrace+0x14/0x20 > [ 125.626102][ C0] rcu_dump_cpu_stacks+0xa0/0xd0 > [ 125.627919][ C0] rcu_sched_clock_irq.cold.95+0x121/0x39c > [ 125.629833][ C0] ? acct_account_cputime+0x17/0x20 > [ 125.631534][ C0] ? account_system_index_time+0x8a/0xa0 > [ 125.633422][ C0] update_process_times+0x23/0x60 > [ 125.635070][ C0] tick_sched_handle.isra.22+0x20/0x60 > [ 125.636870][ C0] tick_sched_timer+0x68/0x80 > [ 125.638403][ C0] ? tick_sched_handle.isra.22+0x60/0x60 > [ 125.640588][ C0] __hrtimer_run_queues+0xf9/0x1a0 > [ 125.642591][ C0] hrtimer_interrupt+0xfc/0x210 > [ 125.645033][ C0] __sysvec_apic_timer_interrupt+0x4c/0x60 > [ 125.647292][ C0] asm_call_on_stack+0xf/0x20 > [ 125.649192][ C0] </IRQ> > [ 125.650501][ C0] sysvec_apic_timer_interrupt+0x75/0x80 > [ 125.652900][ C0] asm_sysvec_apic_timer_interrupt+0x12/0x20 > [ 125.655487][ C0] RIP: 0010:iov_iter_copy_from_user_atomic+0x19b/0x350 > [ 125.658124][ C0] Code: 89 45 d0 48 c1 e6 06 48 03 37 4d 8d 3c 09 4c 89 cf e8 d9 e5 ff ff 48 8b 45 d0 45 39 eb 0f 87 35 01 00 00 49 8b 4a 18 4d 89 f9 <45> 29 dd 45 01 d8 75 12 eb 19 41 83 c4 01 41 29 c0 74 10 44 89 e0 > [ 125.666132][ C0] RSP: 0018:ffffa6cdc1237aa8 EFLAGS: 00000246 > [ 125.668557][ C0] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff945035a25100 > [ 125.671576][ C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff945035a25100 > [ 125.674851][ C0] RBP: ffffa6cdc1237ad8 R08: 0000000000000000 R09: ffff945028a80000 > [ 125.677989][ C0] R10: ffffa6cdc1237de0 R11: 0000000000000000 R12: 0000000000000000 > [ 125.680990][ C0] R13: 0000000000001000 R14: 0000000000001000 R15: 0000000000001000 > [ 125.684006][ C0] iomap_write_actor+0xbe/0x190 > [ 125.685982][ C0] ? iomap_write_begin+0x460/0x460 > [ 125.688031][ C0] iomap_apply+0xf4/0x1a0 > [ 125.689810][ C0] ? 
iomap_write_begin+0x460/0x460 > [ 125.691826][ C0] iomap_file_buffered_write+0x69/0x90 > [ 125.698598][ C0] ? iomap_write_begin+0x460/0x460 > [ 125.705341][ C0] xfs_file_buffered_aio_write+0xc2/0x2c0 > [ 125.707780][ C0] xfs_file_write_iter+0xa3/0xc0 > [ 125.709802][ C0] do_iter_readv_writev+0x15b/0x1c0 > [ 125.712496][ C0] do_iter_write+0x81/0x190 > [ 125.715245][ C0] vfs_iter_write+0x14/0x20 > [ 125.717221][ C0] iter_file_splice_write+0x288/0x3e0 > [ 125.719340][ C0] do_splice_from+0x1a/0x40 > [ 125.721175][ C0] do_splice+0x2e5/0x620 > [ 125.722950][ C0] __x64_sys_splice+0x97/0x100 > [ 125.724937][ C0] do_syscall_64+0x31/0x40 > [ 125.726766][ C0] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 125.729388][ C0] RIP: 0033:0x7f7515dd91c3 > [ 125.731246][ C0] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d fd dd 2c 00 00 75 13 49 89 ca b8 13 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b d4 00 00 48 89 04 24 > [ 125.749632][ C0] RSP: 002b:00007ffd553cde18 EFLAGS: 00000246 ORIG_RAX: 0000000000000113 > [ 125.753713][ C0] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7515dd91c3 > [ 125.756867][ C0] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004 > [ 125.759872][ C0] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000 > [ 125.763023][ C0] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005a6 > [ 125.766225][ C0] R13: 00007ffd553cdf10 R14: 0000000000000000 R15: 0000000000000000 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug
  2020-09-01  0:51 ` Qian Cai
@ 2020-09-01  1:09   ` Al Viro
  2020-09-01  1:47     ` Qian Cai
  2020-09-01  3:32     ` Qian Cai
  1 sibling, 2 replies; 22+ messages in thread
From: Al Viro @ 2020-09-01 1:09 UTC (permalink / raw)
To: Qian Cai
Cc: Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei, paulmck

On Mon, Aug 31, 2020 at 08:51:32PM -0400, Qian Cai wrote:
> On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote:
> > On 2020/08/07 21:27, Al Viro wrote:
> > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
> > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
> > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
> > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
> > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
> > >> with pipe->mutex held.
> > >>
> > >> The reason of falling into infinite busy loop is that iter_file_splice_write()
> > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
> > >> while for_each_bvec() cannot handle .bv_len == 0.
> > >
> > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
> > > unless I'm misreading it...
>
> I have been chasing something similar for a while as in,
>
> https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/
>
> In my case, it seems the endless loop happens in iterate_iovec() instead where
> I put a debug patch here,
>
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -33,6 +33,7 @@
>  		if (unlikely(!__v.iov_len)) \
>  			continue; \
>  		__v.iov_base = __p->iov_base; \
> +		printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \
>  		left = (STEP); \
>  		__v.iov_len -= left; \
>  		skip = __v.iov_len; \
>
> and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the
> soft-lockups and a dead system,
>
> [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423
>
> Thoughts?

Er...  Where does that size come from?  If that's generic_perform_write(),
I'd like to see pos, offset and bytes at the time of call...  ->iov_offset would
also be interesting to see (along with the entire iovec array, really).

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug
  2020-09-01  1:09 ` Al Viro
@ 2020-09-01  1:47   ` Qian Cai
  0 siblings, 0 replies; 22+ messages in thread
From: Qian Cai @ 2020-09-01 1:47 UTC (permalink / raw)
To: Al Viro
Cc: Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei, paulmck

On Tue, Sep 01, 2020 at 02:09:28AM +0100, Al Viro wrote:
> On Mon, Aug 31, 2020 at 08:51:32PM -0400, Qian Cai wrote:
> > On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote:
> > > On 2020/08/07 21:27, Al Viro wrote:
> > > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote:
> > > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from
> > > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from
> > > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from
> > > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> > > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop
> > > >> with pipe->mutex held.
> > > >>
> > > >> The reason of falling into infinite busy loop is that iter_file_splice_write()
> > > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0
> > > >> while for_each_bvec() cannot handle .bv_len == 0.
> > > >
> > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()",
> > > > unless I'm misreading it...
> > > > I have been chasing something similar for a while as in, > > > > https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/ > > > > In my case, it seems the endless loop happens in iterate_iovec() instead where > > I put a debug patch here, > > > > --- a/lib/iov_iter.c > > +++ b/lib/iov_iter.c > > @@ -33,6 +33,7 @@ > > if (unlikely(!__v.iov_len)) \ > > continue; \ > > __v.iov_base = __p->iov_base; \ > > + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \ > > left = (STEP); \ > > __v.iov_len -= left; \ > > skip = __v.iov_len; \ > > > > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the > > soft-lockups and a dead system, > > > > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 > > > > Thoughts? > > Er... Where does that size come from? If that's generic_perform_write(), > I'd like to see pos, offset and bytes at the time of call... ->iov_offset would > also be interesting to see (along with the entire iovec array, really). Yes, generic_perform_write(). I'll see if I can capture more information. [ 2867.463013][T217919] ITER_IOVEC left = 0, n = 2209 [ 2867.466154][T217971] ITER_IOVEC left = 0, n = 2093279 [ 2867.903689][T217971] ITER_IOVEC left = 0, n = 2093259 [ 2867.928514][T217971] ITER_IOVEC left = 0, n = 2093239 [ 2867.952450][T217971] ITER_IOVEC left = 0, n = 2090980 [ 2867.976585][T217971] ITER_IOVEC left = 0, n = 2090960 [ 2869.219459][T217774] futex_wake_op: trinity-c61 tries to shift op by -1; fix this program [ 2870.005178][T218110] futex_wake_op: trinity-c9 tries to shift op by -1; fix this program [ 2870.297607][T218213] futex_wake_op: trinity-c4 tries to shift op by -836; fix this program [ 2870.338118][T218213] futex_wake_op: trinity-c4 tries to shift op by -836; fix this program [-- MARK -- Mon Aug 31 18:50:00 2020] [ 2893.870387][ C62] watchdog: BUG: soft lockup - CPU#62 stuck for 23s! 
[trinity-c5:218155]
[ 2893.912341][ C62] Modules linked in: nls_ascii nls_cp437 vfat fat kvm_intel kvm irqbypass efivars ip_tables x_tables sd_mod ahci bnx2x libahci mdio libata firmware_class dm_mirror dm_region_hash dm_log dm_mod efivarfs
[ 2894.003787][ C62] irq event stamp: 55378
[ 2894.022398][ C62] hardirqs last enabled at (55377): [<ffffffff95a00c42>] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2894.070770][ C62] hardirqs last disabled at (55378): [<ffffffff959b111d>] irqentry_enter+0x1d/0x50
[ 2894.112794][ C62] softirqs last enabled at (50602): [<ffffffff95c0070f>] __do_softirq+0x70f/0xa9f
[ 2894.154580][ C62] softirqs last disabled at (49393): [<ffffffff95a00ec2>] asm_call_on_stack+0x12/0x20
[ 2894.197654][ C62] CPU: 62 PID: 218155 Comm: trinity-c5 Not tainted 5.9.0-rc2-next-20200828+ #4
[ 2894.239807][ C62] Hardware name: HP Synergy 480 Gen9/Synergy 480 Gen9 Compute Module, BIOS I37 10/21/2019
[ 2894.284894][ C62] RIP: 0010:iov_iter_copy_from_user_atomic+0x598/0xab0
[ 2894.316037][ C62] Code: 42 0f b6 14 08 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 7c 03 00 00 48 8d 7e 08 8b 46 0c 48 89 fa 48 c1 ea 03 01 e8 <42> 0f b6 14 0a 84 d2 74 09 80 fa 03 0f 8e 0e 03 00 00 44 8b 6e 08
[ 2894.408763][ C62] RSP: 0018:ffffc9000fb7f848 EFLAGS: 00000246
[ 2894.440034][ C62] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888fce295100
[ 2894.480312][ C62] RDX: 1ffff111f9c52a21 RSI: ffff888fce295100 RDI: ffff888fce295108
[ 2894.517551][ C62] RBP: 0000000000000000 R08: fffff52001f6ffa1 R09: dffffc0000000000
[ 2894.556460][ C62] R10: ffff889055c1f000 R11: 0000000000000400 R12: ffffc9000fb7fcf0
[ 2894.593928][ C62] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400
[ 2894.630979][ C62] FS:  00007ff89140a740(0000) GS:ffff88905fd80000(0000) knlGS:0000000000000000
[ 2894.673104][ C62] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2894.702914][ C62] CR2: 00007ff8906242fc CR3: 000000101bb76004 CR4: 00000000003706e0
[ 2894.740628][ C62] DR0: 00007ff891250000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2894.778016][ C62] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2894.815393][ C62] Call Trace:
[ 2894.830531][ C62]  ? shmem_write_begin+0xc4/0x1b0
[ 2894.853643][ C62]  generic_perform_write+0x2ac/0x410
[ 2894.878266][ C62]  ? filemap_check_errors+0xe0/0xe0
[ 2894.901557][ C62]  ? file_update_time+0x215/0x380
[ 2894.925234][ C62]  ? update_time+0xa0/0xa0
[ 2894.947683][ C62]  ? down_write+0xdb/0x150
[ 2894.969843][ C62]  __generic_file_write_iter+0x2fe/0x4f0
[ 2894.995527][ C62]  ? rcu_read_unlock+0x50/0x50
[ 2895.017683][ C62]  generic_file_write_iter+0x2ee/0x520
[ 2895.042981][ C62]  ? __generic_file_write_iter+0x4f0/0x4f0
[ 2895.069612][ C62]  ? __mutex_lock+0x4af/0x1390
[ 2895.092240][ C62]  do_iter_readv_writev+0x388/0x6f0
[ 2895.115794][ C62]  ? lockdep_hardirqs_on_prepare+0x33e/0x4e0
[ 2895.143303][ C62]  ? default_llseek+0x240/0x240
[ 2895.165522][ C62]  ? rcu_read_lock_bh_held+0xc0/0xc0
[ 2895.190122][ C62]  do_iter_write+0x130/0x5f0
[ 2895.211226][ C62]  iter_file_splice_write+0x54c/0xa40
[ 2895.235917][ C62]  ? page_cache_pipe_buf_try_steal+0x1e0/0x1e0
[ 2895.264415][ C62]  ? rcu_read_lock_any_held+0xdb/0x100
[ 2895.290249][ C62]  do_splice+0x86c/0x1440
[ 2895.310155][ C62]  ? syscall_enter_from_user_mode+0x1b/0x230
[ 2895.337531][ C62]  ? direct_splice_actor+0x100/0x100
[ 2895.362348][ C62]  __x64_sys_splice+0x151/0x200
[ 2895.384677][ C62]  do_syscall_64+0x33/0x40
[ 2895.405223][ C62]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2895.433800][ C62] RIP: 0033:0x7ff890d246ed
[ 2895.456812][ C62] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 57 2c 00 f7 d8 64 89 01 48
[ 2895.553070][ C62] RSP: 002b:00007fff8a6acfb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000113
[ 2895.592859][ C62] RAX: ffffffffffffffda RBX: 0000000000000113 RCX: 00007ff890d246ed
[ 2895.630744][ C62] RDX: 000000000000015b RSI: 0000000000000000 RDI: 000000000000016b
[ 2895.668870][ C62] RBP: 0000000000000113 R08: 0000000000000400 R09: 0000000000000000
[ 2895.707052][ C62] R10: 00007ff88f30a000 R11: 0000000000000246 R12: 0000000000000002
[ 2895.744716][ C62] R13: 00007ff8913ee058 R14: 00007ff89140a6c0 R15: 00007ff8913ee000
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-09-01 1:09 ` Al Viro 2020-09-01 1:47 ` Qian Cai @ 2020-09-01 3:32 ` Qian Cai 2020-09-01 4:07 ` Al Viro 1 sibling, 1 reply; 22+ messages in thread From: Qian Cai @ 2020-09-01 3:32 UTC (permalink / raw) To: Al Viro Cc: Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, tom.leiming, paulmck On Tue, Sep 01, 2020 at 02:09:28AM +0100, Al Viro wrote: > On Mon, Aug 31, 2020 at 08:51:32PM -0400, Qian Cai wrote: > > On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote: > > > On 2020/08/07 21:27, Al Viro wrote: > > > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > > > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > > > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > > > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > > > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from > > > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > > > >> with pipe->mutex held. > > > >> > > > >> The reason of falling into infinite busy loop is that iter_file_splice_write() > > > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > > > >> while for_each_bvec() cannot handle .bv_len == 0. > > > > > > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > > > > unless I'm misreading it... 
> > > > I have been chasing something similar for a while as in, > > > > https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/ > > > > In my case, it seems the endless loop happens in iterate_iovec() instead where > > I put a debug patch here, > > > > --- a/lib/iov_iter.c > > +++ b/lib/iov_iter.c > > @@ -33,6 +33,7 @@ > > if (unlikely(!__v.iov_len)) \ > > continue; \ > > __v.iov_base = __p->iov_base; \ > > + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \ > > left = (STEP); \ > > __v.iov_len -= left; \ > > skip = __v.iov_len; \ > > > > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the > > soft-lockups and a dead system, > > > > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 > > > > Thoughts? > > Er... Where does that size come from? If that's generic_perform_write(), > I'd like to see pos, offset and bytes at the time of call... ->iov_offset would > also be interesting to see (along with the entire iovec array, really). 
I used a new debug patch, but I am not sure how to capture this without printk_ratelimited() because there are so many call sites,

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d812eef23a32..214b93c3d8a8 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -33,6 +33,7 @@
 		if (unlikely(!__v.iov_len)) \
 			continue; \
 		__v.iov_base = __p->iov_base; \
+		printk_ratelimited("ITER_IOVEC left = %zu, n = %zu, iov_offset = %zu, iov_base = %px, iov_len = %lu\n", left, n, i->iov_offset, __p->iov_base, __p->iov_len); \
 		left = (STEP); \
 		__v.iov_len -= left; \
 		skip = __v.iov_len; \
diff --git a/mm/filemap.c b/mm/filemap.c
index 1aaea26556cc..202b0ab28adf 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3408,6 +3408,7 @@ ssize_t generic_perform_write(struct file *file,
 		if (mapping_writably_mapped(mapping))
 			flush_dcache_page(page);
 
+		printk_ratelimited("pos = %lld, offset = %ld, bytes = %ld\n", pos, offset, bytes);
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 		flush_dcache_page(page);

Al, does it help? If not, can you cook a debug patch instead?
[ 587.765400][T21348] pos = 468152746240, offset = 256, bytes = 3840
[ 587.765424][T21348] pos = 468152750080, offset = 0, bytes = 4096
[ 587.765439][T21348] pos = 468152754176, offset = 0, bytes = 4096
[ 587.765455][T21348] pos = 468152758272, offset = 0, bytes = 4096
[ 591.235409][T22038] ITER_IOVEC left = 0, n = 22476968, iov_offset = 0, iov_base = 00007ff12570c000, iov_len = 20
[ 591.313456][T22038] ITER_IOVEC left = 0, n = 22476948, iov_offset = 0, iov_base = 00007ff123250000, iov_len = 20
[ 591.363679][T22038] ITER_IOVEC left = 0, n = 22476928, iov_offset = 0, iov_base = 00007ff123650000, iov_len = 6370
[ 591.413217][T22038] ITER_IOVEC left = 0, n = 22470558, iov_offset = 0, iov_base = 00007ff123850000, iov_len = 20
[ 591.459754][T22038] ITER_IOVEC left = 0, n = 22470538, iov_offset = 0, iov_base = 00007ff12570c000, iov_len = 376
[ 591.507748][T22038] ITER_IOVEC left = 0, n = 22470162, iov_offset = 0, iov_base = 00007ff12570c000, iov_len = 3473
[ 591.557395][T22038] ITER_IOVEC left = 0, n = 22466689, iov_offset = 0, iov_base = 00007ff123250000, iov_len = 20
[ 591.605295][T22038] ITER_IOVEC left = 0, n = 22466669, iov_offset = 0, iov_base = 00007ff12570a000, iov_len = 3392
[ 591.653045][T22038] ITER_IOVEC left = 0, n = 22463277, iov_offset = 0, iov_base = 00007ff123250000, iov_len = 1329038
[ 591.705324][T22038] ITER_IOVEC left = 0, n = 21134239, iov_offset = 0, iov_base = 00007ff123850000, iov_len = 20
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-09-01 3:32 ` Qian Cai @ 2020-09-01 4:07 ` Al Viro 0 siblings, 0 replies; 22+ messages in thread From: Al Viro @ 2020-09-01 4:07 UTC (permalink / raw) To: Qian Cai Cc: Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, tom.leiming, paulmck On Mon, Aug 31, 2020 at 11:32:28PM -0400, Qian Cai wrote:
> I used a new debug patch but not sure how to capture without
> printk_ratelimited() because the call sites are large,

	if (!strcmp(current->comm, "bugger"))
		printk(KERN_ERR....

and call the binary you are running ./bugger... And I'd slap such printk into the beginning of iterate_iovec() as well, if not into the entry of iov_iter_copy_from_user_atomic(). That BS value of n must've come from somewhere; it should expand to 'bytes'. What we have in the beginning is

	const struct iovec *iov;
	struct iovec v;
	size_t skip = i->iov_offset;
	size_t left;
	size_t wanted = bytes;

	iov = i->iov;
	__v.iov_len = min(bytes, iov->iov_len - skip);
	if (likely(__v.iov_len)) {
		__v.iov_base = iov->iov_base + skip;
		left = copyin((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len);
		__v.iov_len -= left;
		skip += __v.iov_len;
		bytes -= __v.iov_len;
	} else {
		left = 0;
	}

and something leaves you with bytes bumped to 22476968. What was in that first iovec? Incidentally, what's in 'wanted'? And... Timestamps don't look like that crap has come from generic_perform_write() - it's about 4 seconds later. While we are at it, there are other users of iterate_all_kinds(), and some of them can very well get large sizes; they are not copying anything (iov_iter_advance(), for starters). There that kind of values would be just fine; are you sure those printks came from iov_iter_copy_from_user_atomic()? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-09-01 0:51 ` Qian Cai 2020-09-01 1:09 ` Al Viro @ 2020-09-01 1:10 ` Ming Lei 2020-09-01 14:22 ` Qian Cai 1 sibling, 1 reply; 22+ messages in thread From: Ming Lei @ 2020-09-01 1:10 UTC (permalink / raw) To: Qian Cai Cc: Tetsuo Handa, Al Viro, syzkaller-bugs, syzbot, Linux FS Devel, Linux Kernel Mailing List, Ming Lei, paulmck On Tue, Sep 1, 2020 at 8:53 AM Qian Cai <cai@lca.pw> wrote: > > On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote: > > On 2020/08/07 21:27, Al Viro wrote: > > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from > > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > > >> with pipe->mutex held. > > >> > > >> The reason of falling into infinite busy loop is that iter_file_splice_write() > > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > > >> while for_each_bvec() cannot handle .bv_len == 0. > > > > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > > > unless I'm misreading it... 
> > I have been chasing something similar for a while as in, > > https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/ > > In my case, it seems the endless loop happens in iterate_iovec() instead where > I put a debug patch here, > > --- a/lib/iov_iter.c > +++ b/lib/iov_iter.c > @@ -33,6 +33,7 @@ > if (unlikely(!__v.iov_len)) \ > continue; \ > __v.iov_base = __p->iov_base; \ > + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \ > left = (STEP); \ > __v.iov_len -= left; \ > skip = __v.iov_len; \ > > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the > soft-lockups and a dead system, > > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 > > Thoughts? Does the following patch make a difference for you? https://lore.kernel.org/linux-block/20200817100055.2495905-1-ming.lei@redhat.com/ thanks, Ming Lei ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-09-01 1:10 ` Ming Lei @ 2020-09-01 14:22 ` Qian Cai 0 siblings, 0 replies; 22+ messages in thread From: Qian Cai @ 2020-09-01 14:22 UTC (permalink / raw) To: Ming Lei Cc: Tetsuo Handa, Al Viro, syzkaller-bugs, syzbot, Linux FS Devel, Linux Kernel Mailing List, Ming Lei, paulmck On Tue, Sep 01, 2020 at 09:10:05AM +0800, Ming Lei wrote: > On Tue, Sep 1, 2020 at 8:53 AM Qian Cai <cai@lca.pw> wrote: > > > > On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote: > > > On 2020/08/07 21:27, Al Viro wrote: > > > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > > > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > > > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > > > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > > > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from > > > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > > > >> with pipe->mutex held. > > > >> > > > >> The reason of falling into infinite busy loop is that iter_file_splice_write() > > > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > > > >> while for_each_bvec() cannot handle .bv_len == 0. > > > > > > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > > > > unless I'm misreading it... 
> > > > I have been chasing something similar for a while as in, > > > > https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@lca.pw/ > > > > In my case, it seems the endless loop happens in iterate_iovec() instead where > > I put a debug patch here, > > > > --- a/lib/iov_iter.c > > +++ b/lib/iov_iter.c > > @@ -33,6 +33,7 @@ > > if (unlikely(!__v.iov_len)) \ > > continue; \ > > __v.iov_base = __p->iov_base; \ > > + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \ > > left = (STEP); \ > > __v.iov_len -= left; \ > > skip = __v.iov_len; \ > > > > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the > > soft-lockups and a dead system, > > > > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 > > > > Thoughts? > > Does the following patch make a difference for you? > > https://lore.kernel.org/linux-block/20200817100055.2495905-1-ming.lei@redhat.com/ Yes, it does. I could no longer be able to reproduce the soft lockups with the patch applied. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 12:27 ` Al Viro 2020-08-07 12:34 ` Tetsuo Handa @ 2020-08-07 12:38 ` Al Viro 2020-08-07 13:41 ` Ming Lei 1 sibling, 1 reply; 22+ messages in thread From: Al Viro @ 2020-08-07 12:38 UTC (permalink / raw) To: Tetsuo Handa Cc: syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Ming Lei On Fri, Aug 07, 2020 at 01:27:27PM +0100, Al Viro wrote: > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > > syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > > iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > > ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > > call_write_iter() from do_iter_readv_writev() from do_iter_write() from > > vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > > with pipe->mutex held. > > > > The reason of falling into infinite busy loop is that iter_file_splice_write() > > for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > > while for_each_bvec() cannot handle .bv_len == 0. > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > unless I'm misreading it... > > Zero-length segments are not disallowed; it's not all that hard to filter them > out in iter_file_splice_write(), but the intent had always been to have > iterate_all_kinds() et.al. able to cope with those. > > How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW? > There might be something else fishy going on... FWIW, my preference would be to have for_each_bvec() advance past zero-length segments; I'll need to go through its uses elsewhere in the tree first, though (after I grab some sleep), ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 12:38 ` Al Viro @ 2020-08-07 13:41 ` Ming Lei 2020-08-07 14:11 ` Matthew Wilcox 2020-08-07 14:17 ` Tetsuo Handa 0 siblings, 2 replies; 22+ messages in thread From: Ming Lei @ 2020-08-07 13:41 UTC (permalink / raw) To: Al Viro; +Cc: Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel On Fri, Aug 07, 2020 at 01:38:54PM +0100, Al Viro wrote: > On Fri, Aug 07, 2020 at 01:27:27PM +0100, Al Viro wrote: > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > > > syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > > > iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > > > ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > > > call_write_iter() from do_iter_readv_writev() from do_iter_write() from > > > vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > > > with pipe->mutex held. > > > > > > The reason of falling into infinite busy loop is that iter_file_splice_write() > > > for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > > > while for_each_bvec() cannot handle .bv_len == 0. > > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > > unless I'm misreading it... > > > > Zero-length segments are not disallowed; it's not all that hard to filter them > > out in iter_file_splice_write(), but the intent had always been to have > > iterate_all_kinds() et.al. able to cope with those. > > > > How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW? > > There might be something else fishy going on... > > FWIW, my preference would be to have for_each_bvec() advance past zero-length > segments; I'll need to go through its uses elsewhere in the tree first, though > (after I grab some sleep), Usually block layer doesn't allow/support zero bvec, however we can make for_each_bvec() to support it only. 
Tetsuo, can you try the following patch?

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index ac0c7299d5b8..b03c793dd28d 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -117,11 +117,19 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv,
 	return true;
 }
 
+static inline void bvec_iter_skip_zero_vec(const struct bio_vec *bv,
+		struct bvec_iter *iter)
+{
+	iter->bi_idx++;
+	iter->bi_bvec_done = 0;
+}
+
 #define for_each_bvec(bvl, bio_vec, iter, start)		\
 	for (iter = (start);					\
 	     (iter).bi_size &&					\
-		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \
-	     bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len))
+		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \
+	     (bvl).bv_len ? bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len) : \
+		bvec_iter_skip_zero_vec((bio_vec), &(iter)))
 
 /* for iterating one bio from start to end */
 #define BVEC_ITER_ALL_INIT (struct bvec_iter) \

Thanks, Ming ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 13:41 ` Ming Lei @ 2020-08-07 14:11 ` Matthew Wilcox 2020-08-07 15:11 ` Tetsuo Handa 2020-08-09 2:31 ` Ming Lei 2020-08-07 14:17 ` Tetsuo Handa 1 sibling, 2 replies; 22+ messages in thread From: Matthew Wilcox @ 2020-08-07 14:11 UTC (permalink / raw) To: Ming Lei Cc: Al Viro, Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel On Fri, Aug 07, 2020 at 09:41:14PM +0800, Ming Lei wrote: > On Fri, Aug 07, 2020 at 01:38:54PM +0100, Al Viro wrote: > > FWIW, my preference would be to have for_each_bvec() advance past zero-length > > segments; I'll need to go through its uses elsewhere in the tree first, though > > (after I grab some sleep), > > Usually block layer doesn't allow/support zero bvec, however we can make > for_each_bvec() to support it only. > > Tetsuo, can you try the following patch? > > diff --git a/include/linux/bvec.h b/include/linux/bvec.h > index ac0c7299d5b8..b03c793dd28d 100644 > --- a/include/linux/bvec.h > +++ b/include/linux/bvec.h > @@ -117,11 +117,19 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv, > return true; > } > > +static inline void bvec_iter_skip_zero_vec(const struct bio_vec *bv, > + struct bvec_iter *iter) > +{ > + iter->bi_idx++; > + iter->bi_bvec_done = 0; > +} > + > #define for_each_bvec(bvl, bio_vec, iter, start) \ > for (iter = (start); \ > (iter).bi_size && \ > - ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > - bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) > + ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > + (bvl).bv_len ? bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len) : \ > + bvec_iter_skip_zero_vec((bio_vec), &(iter))) Uhm, bvec_iter_advance() already skips over zero length bio_vecs. while (bytes && bytes >= bv[idx].bv_len) { bytes -= bv[idx].bv_len; idx++; } The problem is when the _first_ bio_vec is zero length. 
Maybe something more like this (which doesn't even compile, but hopefully makes my point):

@@ -86,12 +86,24 @@ struct bvec_iter_all {
 	(mp_bvec_iter_page((bvec), (iter)) +	\
 	 mp_bvec_iter_page_idx((bvec), (iter)))
 
-#define bvec_iter_bvec(bvec, iter)			\
-((struct bio_vec) {					\
-	.bv_page	= bvec_iter_page((bvec), (iter)),	\
-	.bv_len		= bvec_iter_len((bvec), (iter)),	\
-	.bv_offset	= bvec_iter_offset((bvec), (iter)),	\
-})
+static inline bool bvec_iter_bvec(struct bio_vec *bv, struct bio_vec *bvec,
+		struct bvec_iter *iter)
+{
+	unsigned int idx = iter->bi_idx;
+
+	if (!iter->bi_size)
+		return false;
+
+	while (!bv[idx].bv_len)
+		idx++;
+	iter->bi_idx = idx;
+
+	bv->bv_page = bvec_iter_page(bvec, *iter);
+	bv->bv_len = bvec_iter_len(bvec, *iter);
+	bv->bv_offset = bvec_iter_offset(bvec, *iter);
+
+	return true;
+}
 
 static inline bool bvec_iter_advance(const struct bio_vec *bv,
 		struct bvec_iter *iter, unsigned bytes)
@@ -119,8 +131,7 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv,
 
 #define for_each_bvec(bvl, bio_vec, iter, start)	\
 	for (iter = (start);				\
-	     (iter).bi_size &&				\
-		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \
+	     bvec_iter_bvec(&(bvl), (bio_vec), &(iter));	\
 	     bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len))
 
 /* for iterating one bio from start to end */

(I find the whole bvec handling a mess of confusing macros and would welcome more of it being inline functions, in general). ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 14:11 ` Matthew Wilcox @ 2020-08-07 15:11 ` Tetsuo Handa 0 siblings, 0 replies; 22+ messages in thread From: Tetsuo Handa @ 2020-08-07 15:11 UTC (permalink / raw) To: Matthew Wilcox, Ming Lei Cc: Al Viro, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel, Dmitry Vyukov On 2020/08/07 23:11, Matthew Wilcox wrote: > (I find the whole bvec handling a mess of confusing macros and would > welcome more of it being inline functions, in general). Indeed. Inlined functions will be more useful than macros when syzbot calculates the location of the source code from an address for reporting. I spent a lot of time working out where RIP: 0010:iov_iter_alignment+0x39e/0x850 lib/iov_iter.c:1236 falls within the complicated macros. If inlined line numbers were available, I could have narrowed down the location of the infinite loop faster... ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 14:11 ` Matthew Wilcox 2020-08-07 15:11 ` Tetsuo Handa @ 2020-08-09 2:31 ` Ming Lei 2020-08-09 2:49 ` Ming Lei 1 sibling, 1 reply; 22+ messages in thread From: Ming Lei @ 2020-08-09 2:31 UTC (permalink / raw) To: Matthew Wilcox Cc: Al Viro, Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel On Fri, Aug 07, 2020 at 03:11:48PM +0100, Matthew Wilcox wrote: > On Fri, Aug 07, 2020 at 09:41:14PM +0800, Ming Lei wrote: > > On Fri, Aug 07, 2020 at 01:38:54PM +0100, Al Viro wrote: > > > FWIW, my preference would be to have for_each_bvec() advance past zero-length > > > segments; I'll need to go through its uses elsewhere in the tree first, though > > > (after I grab some sleep), > > > > Usually block layer doesn't allow/support zero bvec, however we can make > > for_each_bvec() to support it only. > > > > Tetsuo, can you try the following patch? > > > > diff --git a/include/linux/bvec.h b/include/linux/bvec.h > > index ac0c7299d5b8..b03c793dd28d 100644 > > --- a/include/linux/bvec.h > > +++ b/include/linux/bvec.h > > @@ -117,11 +117,19 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv, > > return true; > > } > > > > +static inline void bvec_iter_skip_zero_vec(const struct bio_vec *bv, > > + struct bvec_iter *iter) > > +{ > > + iter->bi_idx++; > > + iter->bi_bvec_done = 0; > > +} > > + > > #define for_each_bvec(bvl, bio_vec, iter, start) \ > > for (iter = (start); \ > > (iter).bi_size && \ > > - ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > > - bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) > > + ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > > + (bvl).bv_len ? bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len) : \ > > + bvec_iter_skip_zero_vec((bio_vec), &(iter))) > > Uhm, bvec_iter_advance() already skips over zero length bio_vecs. 
> > while (bytes && bytes >= bv[idx].bv_len) { > bytes -= bv[idx].bv_len; > idx++; > } The issue is that zero (bvl).bv_len passed to bvec_iter_advance(), so the iterator can't move on. And I tried to avoid change to bvec_iter_advance() since this exact issue only exists on for_each_bvec, and block layer won't support/allow zero-length bvec. > > The problem is when the _first_ bio_vec is zero length. It can be any zero-length bvec during the iterating. > Maybe something more > like this (which doesn't even compile, but hopefully makes my point): > > @@ -86,12 +86,24 @@ struct bvec_iter_all { > (mp_bvec_iter_page((bvec), (iter)) + \ > mp_bvec_iter_page_idx((bvec), (iter))) > > -#define bvec_iter_bvec(bvec, iter) \ > -((struct bio_vec) { \ > - .bv_page = bvec_iter_page((bvec), (iter)), \ > - .bv_len = bvec_iter_len((bvec), (iter)), \ > - .bv_offset = bvec_iter_offset((bvec), (iter)), \ > -}) > +static inline bool bvec_iter_bvec(struct bio_vec *bv, struct bio_vec *bvec, > + struct bvec_iter *iter) > +{ > + unsigned int idx = iter->bi_idx; > + > + if (!iter->bi_size) > + return false; > + > + while (!bv[idx].bv_len) > + idx++; > + iter->bi_idx = idx; > + > + bv->bv_page = bvec_iter_page(bvec, *iter); > + bv->bv_len = bvec_iter_len(bvec, *iter); > + bv->bv_offset = bvec_iter_offset(bvec, *iter); > + > + return true; > +} > > static inline bool bvec_iter_advance(const struct bio_vec *bv, > struct bvec_iter *iter, unsigned bytes) > @@ -119,8 +131,7 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv, > > #define for_each_bvec(bvl, bio_vec, iter, start) \ > for (iter = (start); \ > - (iter).bi_size && \ > - ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > + bvec_iter_bvec(&(bvl), (bio_vec), &(iter)); \ > bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) > > /* for iterating one bio from start to end */ > > (I find the whole bvec handling a mess of confusing macros and would > welcome more of it being inline functions, in general). 
The above change may introduce more code duplication. Meanwhile, it can't work because (bvl).bv_len isn't taken into account in bvec_iter_bvec(), so how can the iterator advance? Thanks, Ming ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-09 2:31 ` Ming Lei @ 2020-08-09 2:49 ` Ming Lei 0 siblings, 0 replies; 22+ messages in thread From: Ming Lei @ 2020-08-09 2:49 UTC (permalink / raw) To: Matthew Wilcox Cc: Al Viro, Tetsuo Handa, syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel On Sun, Aug 09, 2020 at 10:31:23AM +0800, Ming Lei wrote: > On Fri, Aug 07, 2020 at 03:11:48PM +0100, Matthew Wilcox wrote: > > On Fri, Aug 07, 2020 at 09:41:14PM +0800, Ming Lei wrote: > > > On Fri, Aug 07, 2020 at 01:38:54PM +0100, Al Viro wrote: > > > > FWIW, my preference would be to have for_each_bvec() advance past zero-length > > > > segments; I'll need to go through its uses elsewhere in the tree first, though > > > > (after I grab some sleep), > > > > > > Usually block layer doesn't allow/support zero bvec, however we can make > > > for_each_bvec() to support it only. > > > > > > Tetsuo, can you try the following patch? > > > > > > diff --git a/include/linux/bvec.h b/include/linux/bvec.h > > > index ac0c7299d5b8..b03c793dd28d 100644 > > > --- a/include/linux/bvec.h > > > +++ b/include/linux/bvec.h > > > @@ -117,11 +117,19 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv, > > > return true; > > > } > > > > > > +static inline void bvec_iter_skip_zero_vec(const struct bio_vec *bv, > > > + struct bvec_iter *iter) > > > +{ > > > + iter->bi_idx++; > > > + iter->bi_bvec_done = 0; > > > +} > > > + > > > #define for_each_bvec(bvl, bio_vec, iter, start) \ > > > for (iter = (start); \ > > > (iter).bi_size && \ > > > - ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > > > - bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) > > > + ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > > > + (bvl).bv_len ? bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len) : \ > > > + bvec_iter_skip_zero_vec((bio_vec), &(iter))) > > > > Uhm, bvec_iter_advance() already skips over zero length bio_vecs. 
> > > > while (bytes && bytes >= bv[idx].bv_len) { > > bytes -= bv[idx].bv_len; > > idx++; > > } > > The issue is that zero (bvl).bv_len passed to bvec_iter_advance(), so > the iterator can't move on. > > And I tried to avoid change to bvec_iter_advance() since this exact > issue only exists on for_each_bvec, and block layer won't support/allow > zero-length bvec. > > > > > The problem is when the _first_ bio_vec is zero length. > > It can be any zero-length bvec during the iterating. > > > Maybe something more > > like this (which doesn't even compile, but hopefully makes my point): > > > > @@ -86,12 +86,24 @@ struct bvec_iter_all { > > (mp_bvec_iter_page((bvec), (iter)) + \ > > mp_bvec_iter_page_idx((bvec), (iter))) > > > > -#define bvec_iter_bvec(bvec, iter) \ > > -((struct bio_vec) { \ > > - .bv_page = bvec_iter_page((bvec), (iter)), \ > > - .bv_len = bvec_iter_len((bvec), (iter)), \ > > - .bv_offset = bvec_iter_offset((bvec), (iter)), \ > > -}) > > +static inline bool bvec_iter_bvec(struct bio_vec *bv, struct bio_vec *bvec, > > + struct bvec_iter *iter) > > +{ > > + unsigned int idx = iter->bi_idx; > > + > > + if (!iter->bi_size) > > + return false; > > + > > + while (!bv[idx].bv_len) > > + idx++; > > + iter->bi_idx = idx; > > + > > + bv->bv_page = bvec_iter_page(bvec, *iter); > > + bv->bv_len = bvec_iter_len(bvec, *iter); > > + bv->bv_offset = bvec_iter_offset(bvec, *iter); > > + > > + return true; > > +} > > > > static inline bool bvec_iter_advance(const struct bio_vec *bv, > > struct bvec_iter *iter, unsigned bytes) > > @@ -119,8 +131,7 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv, > > > > #define for_each_bvec(bvl, bio_vec, iter, start) \ > > for (iter = (start); \ > > - (iter).bi_size && \ > > - ((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \ > > + bvec_iter_bvec(&(bvl), (bio_vec), &(iter)); \ > > bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len)) > > > > /* for iterating one bio from start to end */ > > > > (I find the whole bvec 
handling a mess of confusing macros and would > > welcome more of it being inline functions, in general). > > The above change may bring more code duplication. Meantime, it can't work because (bvl).bv_len isn't taken into account into bvec_iter_bvec(), then how can the iterator advance? Oops, it looks like you change bvec_iter_bvec() only to skip zero-length bvecs, and that way might work (though ->bi_bvec_done still isn't reset). However, the change is ugly, because the iterator is not supposed to be updated in bvec_iter_bvec(). Also, the block layer code doesn't require such a change. BTW, I agree on switching to inline functions if performance isn't affected. Thanks, Ming ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: splice: infinite busy loop lockup bug 2020-08-07 13:41 ` Ming Lei 2020-08-07 14:11 ` Matthew Wilcox @ 2020-08-07 14:17 ` Tetsuo Handa 1 sibling, 0 replies; 22+ messages in thread From: Tetsuo Handa @ 2020-08-07 14:17 UTC (permalink / raw) To: Ming Lei, Al Viro; +Cc: syzkaller-bugs, syzbot, linux-fsdevel, linux-kernel On 2020/08/07 22:41, Ming Lei wrote: >> FWIW, my preference would be to have for_each_bvec() advance past zero-length >> segments; I'll need to go through its uses elsewhere in the tree first, though >> (after I grab some sleep), > > Usually block layer doesn't allow/support zero bvec, however we can make > for_each_bvec() to support it only. > > Tetsuo, can you try the following patch? Yes, this patch solves the lockup. Thank you. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: INFO: task hung in pipe_release (2) 2020-08-02 18:26 INFO: task hung in pipe_release (2) syzbot 2020-08-07 10:35 ` splice: infinite busy loop lockup bug Tetsuo Handa 2020-08-13 3:57 ` syzbot @ 2020-08-13 3:57 ` syzbot 1 sibling, 0 replies; 22+ messages in thread From: syzbot @ 2020-08-13 3:57 UTC (permalink / raw) To: James.Bottomley, amanieu, arnd, benh, bfields, borntraeger, bp, catalin.marinas, chris, christian, corbet, cyphar, dalias, davem, deller, dvyukov, fenghua.yu, geert, gor, heiko.carstens, hpa, ink, jcmvbkbc, jhogan, jlayton, kvalo, linux-alpha, linux-api, linux-arch, linux-arm-kernel, linux-fsdevel, linux-ia64, linux-kernel, linux-m68k, linux-mips, linux-parisc, linux-s390, linux-sh, linux-xtensa, linux, linux, linuxppc-dev, luis.f.correia, luto, martink, mattst88, ming.lei, ming.lei, mingo, monstr

syzbot has bisected this issue to:

commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179
Author: Aleksa Sarai <cyphar@cyphar.com>
Date:   Sat Jan 18 12:07:59 2020 +0000

    open: introduce openat2(2) syscall

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=164e716a900000
start commit:   6ba1b005 Merge tag 'asm-generic-fixes-5.8' of git://git.ke..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=154e716a900000
console output: https://syzkaller.appspot.com/x/log.txt?x=114e716a900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=84f076779e989e69
dashboard link: https://syzkaller.appspot.com/bug?extid=61acc40a49a3e46e25ea
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=142ae224900000

Reported-by: syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com
Fixes: fddb5d430ad9 ("open: introduce openat2(2) syscall")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2020-09-01 14:24 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-02 18:26 INFO: task hung in pipe_release (2) syzbot
2020-08-07 10:35 ` splice: infinite busy loop lockup bug Tetsuo Handa
2020-08-07 12:27   ` Al Viro
2020-08-07 12:34     ` Tetsuo Handa
2020-09-01  0:51       ` Qian Cai
2020-09-01  1:09         ` Al Viro
2020-09-01  1:47           ` Qian Cai
2020-09-01  3:32             ` Qian Cai
2020-09-01  4:07               ` Al Viro
2020-09-01  1:10         ` Ming Lei
2020-09-01 14:22           ` Qian Cai
2020-08-07 12:38   ` Al Viro
2020-08-07 13:41     ` Ming Lei
2020-08-07 14:11       ` Matthew Wilcox
2020-08-07 15:11         ` Tetsuo Handa
2020-08-09  2:31           ` Ming Lei
2020-08-09  2:49             ` Ming Lei
2020-08-07 14:17       ` Tetsuo Handa
2020-08-13  3:57 ` INFO: task hung in pipe_release (2) syzbot
2020-08-13  3:57   ` syzbot
2020-08-13  3:57   ` syzbot
2020-08-13  3:57   ` syzbot