* possible deadlock in dquot_commit
@ 2021-02-10 11:25 syzbot
  2021-02-11 11:37 ` Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: syzbot @ 2021-02-10 11:25 UTC
To: jack, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com

loop1: detected capacity change from 4096 to 0
EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
======================================================
WARNING: possible circular locking dependency detected
5.11.0-rc6-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.1/16170 is trying to acquire lock:
ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476

but task is already holding lock:
ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ei->i_data_sem/2){++++}-{3:3}:
       down_read+0x95/0x440 kernel/locking/rwsem.c:1353
       ext4_map_blocks+0x381/0x17d0 fs/ext4/inode.c:560
       ext4_getblk+0x13c/0x670 fs/ext4/inode.c:847
       ext4_bread+0x29/0x210 fs/ext4/inode.c:899
       ext4_quota_write+0x26b/0x670 fs/ext4/super.c:6557
       write_blk+0x12e/0x220 fs/quota/quota_tree.c:73
       get_free_dqblk+0xff/0x2d0 fs/quota/quota_tree.c:102
       do_insert_tree+0x79c/0x1180 fs/quota/quota_tree.c:309
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       dq_insert_tree fs/quota/quota_tree.c:366 [inline]
       qtree_write_dquot+0x3b7/0x580 fs/quota/quota_tree.c:385
       v2_write_dquot+0x11c/0x250 fs/quota/quota_v2.c:353
       dquot_acquire+0x2c5/0x590 fs/quota/dquot.c:443
       ext4_acquire_dquot+0x254/0x3b0 fs/ext4/super.c:6216
       dqget+0x678/0x1080 fs/quota/dquot.c:901
       __dquot_initialize+0x560/0xbe0 fs/quota/dquot.c:1479
       ext4_create+0x8b/0x4c0 fs/ext4/namei.c:2606
       lookup_open.isra.0+0xf85/0x1350 fs/namei.c:3106
       open_last_lookups fs/namei.c:3180 [inline]
       path_openat+0x96d/0x2730 fs/namei.c:3368
       do_filp_open+0x17e/0x3c0 fs/namei.c:3398
       do_sys_openat2+0x16d/0x420 fs/open.c:1172
       do_sys_open fs/open.c:1188 [inline]
       __do_sys_creat fs/open.c:1262 [inline]
       __se_sys_creat fs/open.c:1256 [inline]
       __x64_sys_creat+0xc9/0x120 fs/open.c:1256
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
       down_read+0x95/0x440 kernel/locking/rwsem.c:1353
       v2_read_dquot+0x49/0x120 fs/quota/quota_v2.c:327
       dquot_acquire+0x12e/0x590 fs/quota/dquot.c:434
       ext4_acquire_dquot+0x254/0x3b0 fs/ext4/super.c:6216
       dqget+0x678/0x1080 fs/quota/dquot.c:901
       __dquot_initialize+0x560/0xbe0 fs/quota/dquot.c:1479
       ext4_create+0x8b/0x4c0 fs/ext4/namei.c:2606
       lookup_open.isra.0+0xf85/0x1350 fs/namei.c:3106
       open_last_lookups fs/namei.c:3180 [inline]
       path_openat+0x96d/0x2730 fs/namei.c:3368
       do_filp_open+0x17e/0x3c0 fs/namei.c:3398
       do_sys_openat2+0x16d/0x420 fs/open.c:1172
       do_sys_open fs/open.c:1188 [inline]
       __do_sys_creat fs/open.c:1262 [inline]
       __se_sys_creat fs/open.c:1256 [inline]
       __x64_sys_creat+0xc9/0x120 fs/open.c:1256
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&dquot->dq_lock){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:2868 [inline]
       check_prevs_add kernel/locking/lockdep.c:2993 [inline]
       validate_chain kernel/locking/lockdep.c:3608 [inline]
       __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
       lock_acquire kernel/locking/lockdep.c:5442 [inline]
       lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
       __mutex_lock_common kernel/locking/mutex.c:956 [inline]
       __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
       dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
       ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200
       ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline]
       ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242
       mark_dquot_dirty fs/quota/dquot.c:347 [inline]
       mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
       __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709
       dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
       dquot_alloc_space include/linux/quotaops.h:310 [inline]
       dquot_alloc_block include/linux/quotaops.h:334 [inline]
       ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937
       ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238
       ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637
       _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793
       ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077
       ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202
       ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961
       generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412
       ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270
       ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write+0x426/0x650 fs/read_write.c:518
       vfs_write+0x791/0xa30 fs/read_write.c:605
       ksys_write+0x12d/0x250 fs/read_write.c:658
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

Chain exists of:
  &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_data_sem/2);
                               lock(&s->s_dquot.dqio_sem);
                               lock(&ei->i_data_sem/2);
  lock(&dquot->dq_lock);

 *** DEADLOCK ***

5 locks held by syz-executor.1/16170:
 #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
 #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
 #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
 #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
 #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
 #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
 #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671

stack backtrace:
CPU: 0 PID: 16170 Comm: syz-executor.1 Not tainted 5.11.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2117
 check_prev_add kernel/locking/lockdep.c:2868 [inline]
 check_prevs_add kernel/locking/lockdep.c:2993 [inline]
 validate_chain kernel/locking/lockdep.c:3608 [inline]
 __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
 lock_acquire kernel/locking/lockdep.c:5442 [inline]
 lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
 __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
 dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
 ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200
 ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline]
 ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242
 mark_dquot_dirty fs/quota/dquot.c:347 [inline]
 mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
 __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709
 dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
 dquot_alloc_space include/linux/quotaops.h:310 [inline]
 dquot_alloc_block include/linux/quotaops.h:334 [inline]
 ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937
 ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238
 ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637
 _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793
 ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077
 ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202
 ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961
 generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412
 ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270
 ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664
 call_write_iter include/linux/fs.h:1901 [inline]
 new_sync_write+0x426/0x650 fs/read_write.c:518
 vfs_write+0x791/0xa30 fs/read_write.c:605
 ksys_write+0x12d/0x250 fs/read_write.c:658
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465b09
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f8097ffc188 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465b09
RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003
RBP: 00000000004b069f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 00007ffefc77f01f R14: 00007f8097ffc300 R15: 0000000000022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
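For readers less familiar with lockdep output, the report's "Chain exists of" line can be read as a directed graph of lock-order edges. The following is only a toy sketch in Python, not lockdep's actual algorithm or data structures: the `LockGraph` class and its method names are hypothetical, and the lock names are shortened from the report. It records the two edges established on the creat() path (chains #1 and #2) and then shows that the edge added on the write() path (chain #0) closes a cycle.

```python
from collections import defaultdict


class LockGraph:
    """Toy lock-order graph: an edge A -> B means B was taken while A was held."""

    def __init__(self):
        self.edges = defaultdict(set)

    def reaches(self, src, dst):
        """Depth-first search: is dst reachable from src over recorded edges?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record held -> new and return True if that edge closes a cycle."""
        closes_cycle = self.reaches(new, held)
        self.edges[held].add(new)
        return closes_cycle


graph = LockGraph()
# creat() path from the report: dq_lock, then dqio_sem, then i_data_sem/2.
first = graph.acquire("dq_lock", "dqio_sem")
second = graph.acquire("dqio_sem", "i_data_sem/2")
# write() path from the report: dq_lock requested while i_data_sem/2 is held.
third = graph.acquire("i_data_sem/2", "dq_lock")
print(first, second, third)
```

Only the third acquisition is flagged, which corresponds to the point where the report says the task "is trying to acquire lock" `dq_lock` while "already holding" `i_data_sem/2`.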
* Re: possible deadlock in dquot_commit
  2021-02-10 11:25 possible deadlock in dquot_commit syzbot
@ 2021-02-11 11:37 ` Jan Kara
  2021-02-11 11:47   ` Dmitry Vyukov
  2021-08-09 12:54 ` [syzbot] " syzbot
       [not found] ` <20210810041100.3271-1-hdanton@sina.com>
  2 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2021-02-11 11:37 UTC
To: syzbot; +Cc: jack, linux-kernel, syzkaller-bugs

On Wed 10-02-21 03:25:22, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
>
> loop1: detected capacity change from 4096 to 0
> EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.11.0-rc6-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor.1/16170 is trying to acquire lock:
> ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
>
> but task is already holding lock:
> ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
>
> which lock already depends on the new lock.

<snip>

All snipped stacktraces look perfectly fine and the lock dependencies are as
expected.

> 5 locks held by syz-executor.1/16170:
>  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
>  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
>  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
>  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
>  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
>  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
>  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671

This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
ext4_block_write_begin(). This suggests that the write has been happening
directly to the quota file (or that lockdep annotation of the inode went
wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
so writing it should not be possible. We also don't allow clearing this
flag on used quota file. Finally I'd checked lockdep annotation and
everything looks correct. So at this point the best theory I have is that a
filesystem has been suitably corrupted and quota file supposed to be
inaccessible from userspace got exposed but I'd expect other problems to
hit first in that case. Anyway without a reproducer I have no more ideas...

								Honza

--
Jan Kara <jack@suse.com>
SUSE Labs, CR
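Jan's point about the `/2` suffix can be illustrated with a small model in which each lock class is keyed by a (name, subclass) pair, so that `i_data_sem` of an ordinary file and `i_data_sem` of a quota file are different nodes. This is a hedged sketch only: `would_deadlock` is a hypothetical helper, the value 2 for the quota subclass is taken from the `/2` in the report, and the value 0 for the ordinary subclass is an assumption made for illustration.

```python
I_DATA_SEM_NORMAL = 0  # assumed subclass for ordinary inodes (illustrative)
I_DATA_SEM_QUOTA = 2   # subclass shown as "/2" in the report


def would_deadlock(edges, held, new):
    """Return True if taking `new` while holding `held` closes a cycle."""
    stack, seen = [new], set()
    while stack:
        node = stack.pop()
        if node == held:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return False


# Ordering already recorded on the creat() path, keyed by (name, subclass).
edges = {
    ("dq_lock", 0): {("dqio_sem", 0)},
    ("dqio_sem", 0): {("i_data_sem", I_DATA_SEM_QUOTA)},
}

# Allocating blocks for an ordinary file: i_data_sem/0 held, dq_lock wanted.
regular = would_deadlock(edges, ("i_data_sem", I_DATA_SEM_NORMAL), ("dq_lock", 0))
# Allocating blocks for a quota file: i_data_sem/2 held, dq_lock wanted.
quota = would_deadlock(edges, ("i_data_sem", I_DATA_SEM_QUOTA), ("dq_lock", 0))
print(regular, quota)
```

In this model only the quota-subclass case closes the cycle, which is why the held `i_data_sem/2` in the write() path stands out: an ordinary buffered write would be expected to hold the normal subclass instead.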
* Re: possible deadlock in dquot_commit
  2021-02-11 11:37 ` Jan Kara
@ 2021-02-11 11:47   ` Dmitry Vyukov
  2021-02-11 15:47     ` Jan Kara
  2021-02-11 21:46     ` Theodore Ts'o
  0 siblings, 2 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2021-02-11 11:47 UTC
To: Jan Kara; +Cc: syzbot, Jan Kara, LKML, syzkaller-bugs

On Thu, Feb 11, 2021 at 12:37 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 10-02-21 03:25:22, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> >
> > loop1: detected capacity change from 4096 to 0
> > EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.11.0-rc6-syzkaller #0 Not tainted
> > ------------------------------------------------------
> > syz-executor.1/16170 is trying to acquire lock:
> > ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> >
> > but task is already holding lock:
> > ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> >
> > which lock already depends on the new lock.
>
> <snip>
>
> All snipped stacktraces look perfectly fine and the lock dependencies are as
> expected.
>
> > 5 locks held by syz-executor.1/16170:
> >  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
> >  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
> >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
> >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
> >  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
> >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671
>
> This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> ext4_block_write_begin(). This suggests that the write has been happening
> directly to the quota file (or that lockdep annotation of the inode went
> wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> so writing it should not be possible. We also don't allow clearing this
> flag on used quota file. Finally I'd checked lockdep annotation and
> everything looks correct. So at this point the best theory I have is that a
> filesystem has been suitably corrupted and quota file supposed to be
> inaccessible from userspace got exposed but I'd expect other problems to
> hit first in that case. Anyway without a reproducer I have no more ideas...

There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
I don't know why it did not pop up on upstream yet; there are lots of potential
reasons for this.
> Honza > > > > > stack backtrace: > > CPU: 0 PID: 16170 Comm: syz-executor.1 Not tainted 5.11.0-rc6-syzkaller #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:79 [inline] > > dump_stack+0x107/0x163 lib/dump_stack.c:120 > > check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2117 > > check_prev_add kernel/locking/lockdep.c:2868 [inline] > > check_prevs_add kernel/locking/lockdep.c:2993 [inline] > > validate_chain kernel/locking/lockdep.c:3608 [inline] > > __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832 > > lock_acquire kernel/locking/lockdep.c:5442 [inline] > > lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407 > > __mutex_lock_common kernel/locking/mutex.c:956 [inline] > > __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103 > > dquot_commit+0x4d/0x420 fs/quota/dquot.c:476 > > ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200 > > ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline] > > ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242 > > mark_dquot_dirty fs/quota/dquot.c:347 [inline] > > mark_all_dquot_dirty fs/quota/dquot.c:385 [inline] > > __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709 > > dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline] > > dquot_alloc_space include/linux/quotaops.h:310 [inline] > > dquot_alloc_block include/linux/quotaops.h:334 [inline] > > ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937 > > ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238 > > ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637 > > _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793 > > ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077 > > ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202 > > ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961 > > generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412 > > ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270 > > ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664 > > 
call_write_iter include/linux/fs.h:1901 [inline] > > new_sync_write+0x426/0x650 fs/read_write.c:518 > > vfs_write+0x791/0xa30 fs/read_write.c:605 > > ksys_write+0x12d/0x250 fs/read_write.c:658 > > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > RIP: 0033:0x465b09 > > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007f8097ffc188 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > > RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465b09 > > RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003 > > RBP: 00000000004b069f R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 > > R13: 00007ffefc77f01f R14: 00007f8097ffc300 R15: 0000000000022000 > > > > > > --- > > This report is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkaller@googlegroups.com. > > > > syzbot will keep track of this issue. See: > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR > > -- > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20210211113718.GM19070%40quack2.suse.cz. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible deadlock in dquot_commit
  2021-02-11 11:47   ` Dmitry Vyukov
@ 2021-02-11 15:47     ` Jan Kara
  2021-02-11 21:46     ` Theodore Ts'o
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2021-02-11 15:47 UTC
To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs

On Thu 11-02-21 12:47:18, Dmitry Vyukov wrote:
> On Thu, Feb 11, 2021 at 12:37 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 10-02-21 03:25:22, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> > >
> > > loop1: detected capacity change from 4096 to 0
> > > EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 5.11.0-rc6-syzkaller #0 Not tainted
> > > ------------------------------------------------------
> > > syz-executor.1/16170 is trying to acquire lock:
> > > ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> > >
> > > but task is already holding lock:
> > > ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> > >
> > > which lock already depends on the new lock.
> >
> > <snip>
> >
> > All snipped stacktraces look perfectly fine and the lock dependencies are as
> > expected.
> >
> > > 5 locks held by syz-executor.1/16170:
> > >  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
> > >  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
> > >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
> > >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
> > >  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> > >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
> > >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671
> >
> > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> > ext4_block_write_begin(). This suggests that the write has been happening
> > directly to the quota file (or that lockdep annotation of the inode went
> > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> > so writing it should not be possible. We also don't allow clearing this
> > flag on used quota file. Finally I'd checked lockdep annotation and
> > everything looks correct. So at this point the best theory I have is that a
> > filesystem has been suitably corrupted and quota file supposed to be
> > inaccessible from userspace got exposed but I'd expect other problems to
> > hit first in that case. Anyway without a reproducer I have no more ideas...
>
> There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> I don't know why it did not pop up on upstream yet; there are lots of potential
> reasons for this.

OK, so I've checked the fs images generated by the syzkaller reproducer and
they indeed have QUOTA feature enabled.
Also inodes used by quota files are not marked as allocated so there is some
potential for surprises. But all the possible paths I could think of seem to
be covered and return EFSCORRUPTED. Also note that the reproducer didn't
trigger the lockdep splat for me so the problem still isn't clear to me.

								Honza

--
Jan Kara <jack@suse.com>
SUSE Labs, CR
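The kind of check Jan describes (whether an image has the QUOTA feature enabled) can be approximated by reading a few superblock fields directly. The sketch below is an illustration under stated assumptions, not the procedure Jan used: it relies on the ext4 on-disk layout as I understand it (superblock at byte 1024, `s_magic` at offset 0x38, `s_feature_ro_compat` at 0x64, `s_usr_quota_inum` and `s_grp_quota_inum` at 0x240 and 0x244, and the QUOTA bit being 0x0100), and `quota_info` is a hypothetical helper name. The demo superblock is synthetic and is not the reproducer's image. Checking whether the quota inodes are marked allocated would additionally require reading the inode bitmaps, which is not covered here.

```python
import struct

SUPERBLOCK_OFFSET = 1024               # primary superblock starts 1024 bytes in
EXT4_MAGIC = 0xEF53                    # s_magic, offset 0x38 in the superblock
EXT4_FEATURE_RO_COMPAT_QUOTA = 0x0100  # bit in s_feature_ro_compat (offset 0x64)


def quota_info(image: bytes) -> dict:
    """Report the QUOTA feature bit and the quota inode numbers of an image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 1024:
        raise ValueError("image too short to contain a superblock")
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT4_MAGIC:
        raise ValueError("superblock magic mismatch")
    (ro_compat,) = struct.unpack_from("<I", sb, 0x64)
    usr_inum, grp_inum = struct.unpack_from("<II", sb, 0x240)
    return {
        "quota_feature": bool(ro_compat & EXT4_FEATURE_RO_COMPAT_QUOTA),
        "usr_quota_inum": usr_inum,
        "grp_quota_inum": grp_inum,
    }


# Synthetic superblock for demonstration only (not taken from the reproducer).
demo = bytearray(2048)
struct.pack_into("<H", demo, SUPERBLOCK_OFFSET + 0x38, EXT4_MAGIC)
struct.pack_into("<I", demo, SUPERBLOCK_OFFSET + 0x64, EXT4_FEATURE_RO_COMPAT_QUOTA)
struct.pack_into("<II", demo, SUPERBLOCK_OFFSET + 0x240, 3, 4)
info = quota_info(bytes(demo))
print(info)
```

On a real image the same function could be applied to the first 2048 bytes read from the file; tools such as dumpe2fs report the same fields and remain the more reliable way to inspect an actual filesystem.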
* Re: possible deadlock in dquot_commit 2021-02-11 11:47 ` Dmitry Vyukov 2021-02-11 15:47 ` Jan Kara @ 2021-02-11 21:46 ` Theodore Ts'o 2021-02-12 11:01 ` Dmitry Vyukov 1 sibling, 1 reply; 15+ messages in thread From: Theodore Ts'o @ 2021-02-11 21:46 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs On Thu, Feb 11, 2021 at 12:47:18PM +0100, Dmitry Vyukov wrote: > > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e., > > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from > > ext4_block_write_begin(). This suggests that the write has been happening > > directly to the quota file (or that lockdep annotation of the inode went > > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag > > so writing it should not be possible. We also don't allow clearing this > > flag on used quota file. Finally I'd checked lockdep annotation and > > everything looks correct. So at this point the best theory I have is that a > > filesystem has been suitably corrupted and quota file supposed to be > > inaccessible from userspace got exposed but I'd expect other problems to > > hit first in that case. Anyway without a reproducer I have no more ideas... > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help. > I don't why it did not pop up on upstream yet, there lots of potential > reasons for this. The 4.19 version of the syzbot report has a very different stack trace. Instead of it being related to an apparent write to the quota file, it is apparently caused by a call to rmdir: dump_stack+0x22c/0x33e lib/dump_stack.c:118 print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1221 ... __mutex_lock+0xd7/0x13f0 kernel/locking/mutex.c:1072 dquot_commit+0x4d/0x400 fs/quota/dquot.c:469 ext4_write_dquot+0x1f2/0x2a0 fs/ext4/super.c:5644 ... ext4_evict_inode+0x933/0x1830 fs/ext4/inode.c:298 evict+0x2ed/0x780 fs/inode.c:559 iput_final fs/inode.c:1555 [inline] ... 
vfs_rmdir fs/namei.c:3865 [inline] do_rmdir+0x3af/0x420 fs/namei.c:3943 __do_sys_unlinkat fs/namei.c:4105 [inline] __se_sys_unlinkat fs/namei.c:4099 [inline] __x64_sys_unlinkat+0xdf/0x120 fs/namei.c:4099 do_syscall_64+0xf9/0x670 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x49/0xbe Which leads me to another apparent contradiction. Looking at the C reproducer source code, and running the C reproducer under "strace -ff", there is never any attempt to run rmdir() on the corrupted file system that is mounted. I saw none, either when running the C reproducer or when reading its source code. Looking at the code, I did see a number of things which seemed to be bugs; procid never gets incremented, so all of the threads only operate on /dev/loop0, and each call to the execute() function tries to set up two file systems on /dev/loop0. So each thread that runs creates a temp file, binds it to /dev/loop0, and then creates another temp file, tries to bind it to /dev/loop0 (which will fail), and tries to mount /dev/loop0 (again) on the same mount point (which will succeed). I'm not sure if this is just some insanity that was consed up by the fuzzer... or I'm wondering if this was an unfaithful translation of the syzbot repro to C. Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro? - Ted ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible deadlock in dquot_commit 2021-02-11 21:46 ` Theodore Ts'o @ 2021-02-12 11:01 ` Dmitry Vyukov 2021-02-12 16:10 ` Theodore Ts'o 0 siblings, 1 reply; 15+ messages in thread From: Dmitry Vyukov @ 2021-02-12 11:01 UTC (permalink / raw) To: Theodore Ts'o Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller On Thu, Feb 11, 2021 at 10:46 PM Theodore Ts'o <tytso@mit.edu> wrote: > > On Thu, Feb 11, 2021 at 12:47:18PM +0100, Dmitry Vyukov wrote: > > > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e., > > > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from > > > ext4_block_write_begin(). This suggests that the write has been happening > > > directly to the quota file (or that lockdep annotation of the inode went > > > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag > > > so writing it should not be possible. We also don't allow clearing this > > > flag on used quota file. Finally I'd checked lockdep annotation and > > > everything looks correct. So at this point the best theory I have is that a > > > filesystem has been suitably corrupted and quota file supposed to be > > > inaccessible from userspace got exposed but I'd expect other problems to > > > hit first in that case. Anyway without a reproducer I have no more ideas... > > > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help. > > I don't why it did not pop up on upstream yet, there lots of potential > > reasons for this. > > The 4.19 version of the syzbot report has a very different stack > trace. Instead of it being related to an apparent write to the quota > file, it is apparently caused by a call to rmdir: > > dump_stack+0x22c/0x33e lib/dump_stack.c:118 > print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1221 > ... > __mutex_lock+0xd7/0x13f0 kernel/locking/mutex.c:1072 > dquot_commit+0x4d/0x400 fs/quota/dquot.c:469 > ext4_write_dquot+0x1f2/0x2a0 fs/ext4/super.c:5644 > ... 
> ext4_evict_inode+0x933/0x1830 fs/ext4/inode.c:298 > evict+0x2ed/0x780 fs/inode.c:559 > iput_final fs/inode.c:1555 [inline] > ... > vfs_rmdir fs/namei.c:3865 [inline] > do_rmdir+0x3af/0x420 fs/namei.c:3943 > __do_sys_unlinkat fs/namei.c:4105 [inline] > __se_sys_unlinkat fs/namei.c:4099 [inline] > __x64_sys_unlinkat+0xdf/0x120 fs/namei.c:4099 > do_syscall_64+0xf9/0x670 arch/x86/entry/common.c:293 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Which leads me to another apparent contradiction. Looking at the C > reproducer source code, and running the C reproducer under "strace > -ff", there is never any attempt to run rmdir() on the corrupted file > system that is mounted. Neither as observed by my running the C > reproducer, or by looking at the C reproducer source code. > > Looking at the code, I did see a number of things which seemed to be > bugs; procid never gets incremented, so all of the threads only > operate on /dev/loop0, and each call to the execute() function tries > to setup two file systems on /dev/loop0. So the each thread to run > creates a temp file, binds it to /dev/loop0, and then creates another > temp file, tries to bind it to /dev/loop0 (which will fail), tries to > mount /dev/loop0 (again) on the samee mount point (which will > succeed). > > I'm not sure if this is just some insanity that was consed up by the > fuzzer... or I'm wondering if this was an unfaithful translation of > the syzbot repro to C. Am I correct in understanding that when syzbot > is running, it uses the syzbot repro, and not the C repro? Hi Ted, The 4.19 reproducer may reproduce something else; you know better. I just want to answer the points about syzkaller reproducers. FTR, the 4.19 reproducer is here: https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02 > there is never any attempt to run rmdir() on the corrupted file system that is mounted.
A recursive rmdir happens implicitly as part of test cleanup; you can see the rmdir call in the remove_dir function in the C reproducer: https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000 > procid never gets incremented, so all of the threads only operate on /dev/loop0 This is intentional. procid is supposed to "isolate" parallel test processes (if any). This reproducer does not use parallel test processes, thus procid has a constant value. > Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro? It tries both. It first tries to interpret the "syzkaller program", as was done when the bug was triggered during fuzzing. Then it tries to convert it to a corresponding stand-alone C program and confirms that it still triggers the bug. If it provides a C reproducer, it means that it did trigger the bug using this exact C program on a freshly booted kernel (and the provided kernel oops is the corresponding oops obtained with this exact program). If it fails to reproduce the bug with a C reproducer, then it provides only the "syzkaller program", so as not to mislead developers. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible deadlock in dquot_commit 2021-02-12 11:01 ` Dmitry Vyukov @ 2021-02-12 16:10 ` Theodore Ts'o 2021-02-15 12:50 ` Dmitry Vyukov 0 siblings, 1 reply; 15+ messages in thread From: Theodore Ts'o @ 2021-02-12 16:10 UTC (permalink / raw) To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller On Fri, Feb 12, 2021 at 12:01:51PM +0100, Dmitry Vyukov wrote: > > > > > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help. > > > I don't why it did not pop up on upstream yet, there lots of potential > > > reasons for this. > > > > The 4.19 version of the syzbot report has a very different stack > > trace. Instead of it being related to an apparent write to the quota > > file, it is apparently caused by a call to rmdir: > > > > The 4.19 reproducer may reproducer something else, you know better. I > just want to answer points re syzkaller reproducers. FTR the 4.19 > reproducer/reproducer is here: > https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02 Yes, I know. That was my point. I don't think it's useful for debugging the upstream dquot_commit syzbot report (for which we don't have a reproducer yet). > > there is never any attempt to run rmdir() on the corrupted file system that is mounted. > > Recursive rmdir happens as part of test cleanup implicitly, you can > see rmdir call in remove_dir function in the C reproducer: > https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000 That rmdir() removes the mountpoint, which is *not* the fuzzed file system which has the quota feature enabled. > > procid never gets incremented, so all of the threads only operate on /dev/loop0 > > This is intentional. procid is supposed to "isolate" parallel test > processes (if any). This reproducer does not use parallel test > processes, thus procid has constant value. Um...
yes it does: int main(void) { syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul); syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); use_temporary_dir(); loop(); return 0; } and what is loop? static void loop(void) { int iter = 0; for (;; iter++) { ... reset_loop(); int pid = fork(); if (pid < 0) exit(1); if (pid == 0) { if (chdir(cwdbuf)) exit(1); setup_test(); execute_one(); exit(0); } ... remove_dir(cwdbuf); } } > > Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro? > > It tries both. If first tries to interpret "syzkaller program" as it > was done when the bug was triggered during fuzzing. But then it tries > to convert it to a corresponding stand-alone C program and confirms > that it still triggers the bug. If it provides a C reproducer, it > means that it did trigger the bug using this exact C program on a > freshly booted kernel (and the provided kernel oops is the > corresponding oops obtained on this exact program). > If it fails to reproduce the bug with a C reproducer, then it provides > only the "syzkaller program" to not mislead developers. Well, looking at the C reproducer, it doesn't reproduce on upstream, and the stack trace makes no sense to me. The rmdir() executes at the end of the test, as part of the cleanup, and looking at the syzkaller console, the stack trace involving rmdir happens *early* while test threads are still trying to mount the file system. - Ted ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible deadlock in dquot_commit 2021-02-12 16:10 ` Theodore Ts'o @ 2021-02-15 12:50 ` Dmitry Vyukov 0 siblings, 0 replies; 15+ messages in thread From: Dmitry Vyukov @ 2021-02-15 12:50 UTC (permalink / raw) To: Theodore Ts'o Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller On Fri, Feb 12, 2021 at 5:10 PM Theodore Ts'o <tytso@mit.edu> wrote: > > >From: Theodore Ts'o <tytso@mit.edu> > > On Fri, Feb 12, 2021 at 12:01:51PM +0100, Dmitry Vyukov wrote: > > > > > > > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help. > > > > I don't why it did not pop up on upstream yet, there lots of potential > > > > reasons for this. > > > > > > The 4.19 version of the syzbot report has a very different stack > > > trace. Instead of it being related to an apparent write to the quota > > > file, it is apparently caused by a call to rmdir: > > > > > > > The 4.19 reproducer may reproducer something else, you know better. I > > just want to answer points re syzkaller reproducers. FTR the 4.19 > > reproducer/reproducer is here: > > https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02 > > Yes, I know. That was my point. I don't think it's useful for > debugging the upstream dquot_commit syzbot report (for which we don't > have a reproducer yet). > > > > there is never any attempt to run rmdir() on the corrupted file system that is mounted. > > > > Recursive rmdir happens as part of test cleanup implicitly, you can > > see rmdir call in remove_dir function in the C reproducer: > > https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000 > > That rmdir() removes the mountpoint, which is *not* the fuzzed file > system which has the quota feature enabled. remove_dir function is recursive, so rmdir should be called for all subdirectories starting from the deepest ones. At least that was the intention. Do you see it's not working this way? That would be something to fix. 
> > > procid never gets incremented, so all of the threads only operate on /dev/loop0 > > > > This is intentional. procid is supposed to "isolate" parallel test > > processes (if any). This reproducer does not use parallel test > > processes, thus procid has constant value. > > Um... yes it does: There is waitpid before remove_dir. So these are sequential test processes, not parallel. > int main(void) > { > syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); > syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul); > syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); > use_temporary_dir(); > loop(); > return 0; > } > > and what is loop? > > static void loop(void) > { > int iter = 0; > for (;; iter++) { > ... > reset_loop(); > int pid = fork(); > if (pid < 0) > exit(1); > if (pid == 0) { > if (chdir(cwdbuf)) > exit(1); > setup_test(); > execute_one(); > exit(0); > } > ... > remove_dir(cwdbuf); > } > } > > > > Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro? > > > > It tries both. If first tries to interpret "syzkaller program" as it > > was done when the bug was triggered during fuzzing. But then it tries > > to convert it to a corresponding stand-alone C program and confirms > > that it still triggers the bug. If it provides a C reproducer, it > > means that it did trigger the bug using this exact C program on a > > freshly booted kernel (and the provided kernel oops is the > > corresponding oops obtained on this exact program). > > If it fails to reproduce the bug with a C reproducer, then it provides > > only the "syzkaller program" to not mislead developers. > > Well, looking at the C reproducer, it doesn't reproduce on upstream, > and the stack trace makes no sense to me. 
The rmdir() executes at the > end of the test, as part of the cleanup, and looking at the syzkaller > console, the stack trace involving rmdir happens *early* while test > threads are still trying to mount the file system. So my assumption that the 4.19 reproducer for a somewhat similar-looking bug might also reproduce this upstream bug was wrong, then. ^ permalink raw reply [flat|nested] 15+ messages in thread
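The "sequential, not parallel" point above can be sketched in C. This is only an illustration of the fork/waitpid ordering being discussed, not the generated reproducer: run_test_body() and run_sequential_tests() are hypothetical names, and the blocking waitpid() is a simplification assumed here for brevity.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the reproducer's execute_one(). */
static void run_test_body(void)
{
    /* A real reproducer would issue the fuzzed syscalls here. */
}

/*
 * Run `iterations` test processes strictly one after another.
 * Returns how many children exited cleanly, or -1 on a fork/wait error.
 */
static int run_sequential_tests(int iterations)
{
    int reaped = 0;

    for (int iter = 0; iter < iterations; iter++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            run_test_body();
            _exit(0);
        }

        int status = 0;
        /* Block until this child is gone; only then may cleanup run. */
        if (waitpid(pid, &status, 0) != pid)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            reaped++;

        /* Per-iteration cleanup (remove_dir() in the reproducer) goes here. */
    }
    return reaped;
}
```

Because each child is reaped before the next fork(), at most one test process exists at a time, which is why a constant procid is enough in this shape of loop.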
* Re: [syzbot] possible deadlock in dquot_commit 2021-02-10 11:25 possible deadlock in dquot_commit syzbot 2021-02-11 11:37 ` Jan Kara @ 2021-08-09 12:54 ` syzbot 2021-08-09 14:52 ` Jan Kara 2021-10-07 8:44 ` Jan Kara [not found] ` <20210810041100.3271-1-hdanton@sina.com> 2 siblings, 2 replies; 15+ messages in thread From: syzbot @ 2021-08-09 12:54 UTC (permalink / raw) To: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso syzbot has found a reproducer for the following issue on: HEAD commit: 66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000 kernel config: https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324 dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com loop0: detected capacity change from 0 to 4096 EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback. ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc4-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor211/9242 is trying to acquire lock: ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474 but task is already holding lock: ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 which lock already depends on the new lock. 
the existing dependency chain (in reverse order) is: -> #2 (&ei->i_data_sem/2){++++}-{3:3}: lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 down_read+0x3b/0x50 kernel/locking/rwsem.c:1353 ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561 ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848 ext4_bread+0x2a/0x170 fs/ext4/inode.c:900 ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602 write_blk fs/quota/quota_tree.c:64 [inline] get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93 do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 dq_insert_tree fs/quota/quota_tree.c:357 [inline] qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376 v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358 dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441 ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261 dqget+0x999/0xdc0 fs/quota/dquot.c:899 __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477 ext4_create+0xb0/0x550 fs/ext4/namei.c:2731 lookup_open fs/namei.c:3228 [inline] open_last_lookups fs/namei.c:3298 [inline] path_openat+0x13b7/0x36b0 fs/namei.c:3504 do_filp_open+0x253/0x4d0 fs/namei.c:3534 do_sys_openat2+0x124/0x460 fs/open.c:1204 do_sys_open fs/open.c:1220 [inline] __do_sys_creat fs/open.c:1294 [inline] __se_sys_creat fs/open.c:1288 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1288 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}: lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 down_read+0x3b/0x50 kernel/locking/rwsem.c:1353 v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332 dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432 ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261 dqget+0x999/0xdc0 fs/quota/dquot.c:899 __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477 ext4_create+0xb0/0x550 
fs/ext4/namei.c:2731 lookup_open fs/namei.c:3228 [inline] open_last_lookups fs/namei.c:3298 [inline] path_openat+0x13b7/0x36b0 fs/namei.c:3504 do_filp_open+0x253/0x4d0 fs/namei.c:3534 do_sys_openat2+0x124/0x460 fs/open.c:1204 do_sys_open fs/open.c:1220 [inline] __do_sys_creat fs/open.c:1294 [inline] __se_sys_creat fs/open.c:1288 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1288 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&dquot->dq_lock){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3051 [inline] check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174 validate_chain kernel/locking/lockdep.c:3789 [inline] __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015 lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959 __mutex_lock kernel/locking/mutex.c:1104 [inline] mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119 dquot_commit+0x57/0x360 fs/quota/dquot.c:474 ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245 mark_dquot_dirty fs/quota/dquot.c:345 [inline] mark_all_dquot_dirty fs/quota/dquot.c:383 [inline] __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707 dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline] dquot_alloc_space include/linux/quotaops.h:310 [inline] dquot_alloc_block include/linux/quotaops.h:334 [inline] ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638 _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794 ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077 ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498 ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960 generic_perform_write+0x262/0x580 mm/filemap.c:3656 ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269 ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519 call_write_iter 
include/linux/fs.h:2114 [inline] new_sync_write fs/read_write.c:518 [inline] vfs_write+0xa39/0xc90 fs/read_write.c:605 ksys_write+0x171/0x2a0 fs/read_write.c:658 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem/2); lock(&s->s_dquot.dqio_sem); lock(&ei->i_data_sem/2); lock(&dquot->dq_lock); *** DEADLOCK *** 4 locks held by syz-executor211/9242: #0: ffff88802ff60460 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0x21b/0xc90 fs/read_write.c:601 #1: ffff88803a304058 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:774 [inline] #1: ffff88803a304058 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: ext4_buffered_write_iter+0xaf/0x590 fs/ext4/file.c:263 #2: ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 #3: ffffffff8c840518 (dquot_srcu){....}-{0:0}, at: rcu_lock_acquire+0x5/0x30 include/linux/rcupdate.h:266 stack backtrace: CPU: 1 PID: 9242 Comm: syz-executor211 Not tainted 5.14.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1ae/0x29f lib/dump_stack.c:105 print_circular_bug+0xb17/0xdc0 kernel/locking/lockdep.c:2009 check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2131 check_prev_add kernel/locking/lockdep.c:3051 [inline] check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174 validate_chain kernel/locking/lockdep.c:3789 [inline] __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015 lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959 __mutex_lock kernel/locking/mutex.c:1104 [inline] 
mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119 dquot_commit+0x57/0x360 fs/quota/dquot.c:474 ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245 mark_dquot_dirty fs/quota/dquot.c:345 [inline] mark_all_dquot_dirty fs/quota/dquot.c:383 [inline] __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707 dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline] dquot_alloc_space include/linux/quotaops.h:310 [inline] dquot_alloc_block include/linux/quotaops.h:334 [inline] ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638 _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794 ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077 ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498 ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960 generic_perform_write+0x262/0x580 mm/filemap.c:3656 ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269 ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519 call_write_iter include/linux/fs.h:2114 [inline] new_sync_write fs/read_write.c:518 [inline] vfs_write+0xa39/0xc90 fs/read_write.c:605 ksys_write+0x171/0x2a0 fs/read_write.c:658 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x445219 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd8864cd18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004885e9 RCX: 0000000000445219 RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003 RBP: 0000000020010500 R08: 00007ffd8864cd40 R09: 00007ffd8864cd40 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000020010000 R13: 0030656c69662f2e R14: 00007ffd8864cd50 R15: 000000000000004d ^ permalink raw reply [flat|nested] 15+ 
messages in thread
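The dependency chain in the report above (&dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2, then an attempt to take dq_lock while holding i_data_sem/2) can be modelled with a tiny lock-order graph. This is a minimal userspace sketch of the general idea only; it is not how lockdep is implemented, and the enum names and record_dependency() are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical lock classes, named after the ones in the report. */
enum { DQ_LOCK, DQIO_SEM, I_DATA_SEM_QUOTA, NR_LOCKS };

/* edge[a][b] is true once lock b has been taken while holding lock a. */
static bool edge[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCKS; next++)
        if (edge[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record "wanted is acquired while held is held".
 * Returns false (and records nothing) if that would close a cycle.
 */
static bool record_dependency(int held, int wanted)
{
    bool seen[NR_LOCKS] = { false };

    if (reachable(wanted, held, seen))
        return false;
    edge[held][wanted] = true;
    return true;
}
```

Feeding it the three acquisitions from the report in order (dq_lock then dqio_sem, dqio_sem then i_data_sem/2, i_data_sem/2 then dq_lock) accepts the first two and rejects the third, which corresponds to the point where the warning fires.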
* Re: [syzbot] possible deadlock in dquot_commit 2021-08-09 12:54 ` [syzbot] " syzbot @ 2021-08-09 14:52 ` Jan Kara 2021-08-09 17:43 ` syzbot 2021-10-07 8:44 ` Jan Kara 1 sibling, 1 reply; 15+ messages in thread From: Jan Kara @ 2021-08-09 14:52 UTC (permalink / raw) To: syzbot Cc: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso [-- Attachment #1: Type: text/plain, Size: 1817 bytes --] On Mon 09-08-21 05:54:27, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit: 66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000 > kernel config: https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324 > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2 > compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000 > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com > > loop0: detected capacity change from 0 to 4096 > EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback. > ====================================================== > WARNING: possible circular locking dependency detected > 5.14.0-rc4-syzkaller #0 Not tainted > ------------------------------------------------------ > syz-executor211/9242 is trying to acquire lock: > ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474 > > but task is already holding lock: > ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 > > which lock already depends on the new lock. Hmm, looks like hidden quota file got linked from directory hierarchy. 
Attached patch should fix this. #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 902e7f373fff2476b53824264c12e4e76c7ec02a Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR [-- Attachment #2: 0001-ext4-Make-sure-quota-files-are-not-grabbed-accidenta.patch --] [-- Type: text/x-patch, Size: 1798 bytes --] From 6efc0878a8c8f498eb138bdb57fad8a6c85d115c Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@suse.cz> Date: Mon, 9 Aug 2021 16:09:27 +0200 Subject: [PATCH] ext4: Make sure quota files are not grabbed accidentally If an ext4 filesystem is corrupted so that quota files are linked from the directory hierarchy, bad things can happen. E.g. quota files can get corrupted or deleted. Make sure we are not grabbing quota file inodes when we expect normal inodes. Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext4/inode.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d8de607849df..2c33c795c4a7 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4603,6 +4603,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, struct ext4_iloc iloc; struct ext4_inode *raw_inode; struct ext4_inode_info *ei; + struct ext4_super_block *es = EXT4_SB(sb)->s_es; struct inode *inode; journal_t *journal = EXT4_SB(sb)->s_journal; long ret; @@ -4613,9 +4614,12 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, projid_t i_projid; if ((!(flags & EXT4_IGET_SPECIAL) && - (ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)) || + ((ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) || + ino == le32_to_cpu(es->s_usr_quota_inum) || + ino == le32_to_cpu(es->s_grp_quota_inum) || + ino == le32_to_cpu(es->s_prj_quota_inum))) || (ino < EXT4_ROOT_INO) || - (ino > le32_to_cpu(es->s_inodes_count))) { + (ino > le32_to_cpu(es->s_inodes_count))) { if (flags & EXT4_IGET_HANDLE) return ERR_PTR(-ESTALE); __ext4_error(sb,
function, line, false, EFSCORRUPTED, 0, -- 2.26.2 ^ permalink raw reply related [flat|nested] 15+ messages in thread
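The condition the patch above adds to __ext4_iget() can be read more easily as a standalone predicate. The following is a userspace sketch under stated assumptions: struct sb_model and ino_is_rejected() are hypothetical, the superblock fields are assumed to be already converted to CPU byte order, and `special` stands in for the EXT4_IGET_SPECIAL flag. It mirrors the boolean structure of the patched check and nothing else.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROOT_INO 2u /* value of EXT4_ROOT_INO */

/* Hypothetical, simplified view of the superblock fields the check reads. */
struct sb_model {
    uint32_t first_ino;      /* first non-reserved inode number */
    uint32_t inodes_count;   /* total number of inodes */
    uint32_t usr_quota_inum; /* 0 when the quota type is not configured */
    uint32_t grp_quota_inum;
    uint32_t prj_quota_inum;
};

/* Returns true when a lookup of `ino` should be refused as corruption. */
static bool ino_is_rejected(const struct sb_model *sb, uint32_t ino, bool special)
{
    /* Ordinary lookups may not hit reserved inodes or quota file inodes. */
    if (!special &&
        ((ino < sb->first_ino && ino != ROOT_INO) ||
         ino == sb->usr_quota_inum ||
         ino == sb->grp_quota_inum ||
         ino == sb->prj_quota_inum))
        return true;

    /* Range checks apply to every caller, special or not. */
    return ino < ROOT_INO || ino > sb->inodes_count;
}
```

With this shape, a directory entry that points at a quota inode is refused on the normal lookup path, while callers that pass the special flag (as the quota code is expected to) can still load the same inode.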
* Re: [syzbot] possible deadlock in dquot_commit 2021-08-09 14:52 ` Jan Kara @ 2021-08-09 17:43 ` syzbot 0 siblings, 0 replies; 15+ messages in thread From: syzbot @ 2021-08-09 17:43 UTC (permalink / raw) To: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: possible deadlock in dquot_commit EXT4-fs warning (device loop4): ext4_enable_quotas:6478: Failed to enable quota tracking (type=1, err=-22). Please run e2fsck to fix. EXT4-fs (loop4): mount failed ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc4-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.4/28771 is trying to acquire lock: ffff88803941cea8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474 but task is already holding lock: ffff8880463d2a58 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 which lock already depends on the new lock. 
the existing dependency chain (in reverse order) is: -> #2 (&ei->i_data_sem/2){++++}-{3:3}: lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 down_read+0x3b/0x50 kernel/locking/rwsem.c:1353 ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561 ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848 ext4_bread+0x2a/0x170 fs/ext4/inode.c:900 ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602 write_blk fs/quota/quota_tree.c:64 [inline] get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93 do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 dq_insert_tree fs/quota/quota_tree.c:357 [inline] qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376 v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358 dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441 ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261 dqget+0x999/0xdc0 fs/quota/dquot.c:899 __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477 ext4_create+0xb0/0x550 fs/ext4/namei.c:2731 lookup_open fs/namei.c:3228 [inline] open_last_lookups fs/namei.c:3298 [inline] path_openat+0x13b7/0x36b0 fs/namei.c:3504 do_filp_open+0x253/0x4d0 fs/namei.c:3534 do_sys_openat2+0x124/0x460 fs/open.c:1204 do_sys_open fs/open.c:1220 [inline] __do_sys_creat fs/open.c:1294 [inline] __se_sys_creat fs/open.c:1288 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1288 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}: lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 down_read+0x3b/0x50 kernel/locking/rwsem.c:1353 v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332 dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432 ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261 dqget+0x999/0xdc0 fs/quota/dquot.c:899 __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477 ext4_create+0xb0/0x550 
fs/ext4/namei.c:2731 lookup_open fs/namei.c:3228 [inline] open_last_lookups fs/namei.c:3298 [inline] path_openat+0x13b7/0x36b0 fs/namei.c:3504 do_filp_open+0x253/0x4d0 fs/namei.c:3534 do_sys_openat2+0x124/0x460 fs/open.c:1204 do_sys_open fs/open.c:1220 [inline] __do_sys_creat fs/open.c:1294 [inline] __se_sys_creat fs/open.c:1288 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1288 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&dquot->dq_lock){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3051 [inline] check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174 __lo Tested on: commit: 902e7f37 Merge tag 'net-5.14-rc5' of git://git.kernel... git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=125d6b79300000 kernel config: https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324 dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1 patch: https://syzkaller.appspot.com/x/patch.diff?x=173c0ee9300000 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot] possible deadlock in dquot_commit 2021-08-09 12:54 ` [syzbot] " syzbot 2021-08-09 14:52 ` Jan Kara @ 2021-10-07 8:44 ` Jan Kara 2021-10-07 13:50 ` syzbot 1 sibling, 1 reply; 15+ messages in thread From: Jan Kara @ 2021-10-07 8:44 UTC (permalink / raw) To: syzbot; +Cc: dvyukov, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso [-- Attachment #1: Type: text/plain, Size: 2032 bytes --] On Mon 09-08-21 05:54:27, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit: 66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000 > kernel config: https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324 > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2 > compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000 > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com > > loop0: detected capacity change from 0 to 4096 > EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback. > ====================================================== > WARNING: possible circular locking dependency detected > 5.14.0-rc4-syzkaller #0 Not tainted > ------------------------------------------------------ > syz-executor211/9242 is trying to acquire lock: > ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474 > > but task is already holding lock: > ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 > > which lock already depends on the new lock. 
I've got back to this and I have one more idea what could be causing this
and why I'm not able to reproduce. I think we free some inode with
i_data_sem locking class set to 2 and when the inode gets reused (which
doesn't happen in my VM for some reason) for a normal file, problems
trigger. Let's try attached patch.

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 60a9483534ed0d99090a2ee1d4bb0b8179195f51

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-ext4-Make-sure-to-reset-inode-lockdep-class-when-quo.patch --]
[-- Type: text/x-patch, Size: 1440 bytes --]

From 2c0f00967aecfbc03216feb5a6d7286346268932 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 7 Oct 2021 10:30:46 +0200
Subject: [PATCH] ext4: Make sure to reset inode lockdep class when quota
 enabling fails

When we succeed in enabling some quota type but fail to enable another
one with quota feature, we correctly disable all enabled quota types.
However we forget to reset i_data_sem lockdep class. When the inode gets
freed and reused, it will inherit this lockdep class (i_data_sem is
initialized only when a slab is created) and thus eventually lockdep
barfs about possible deadlocks.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index fbe9cae63786..70b5fcbd351a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6355,8 +6355,19 @@ int ext4_enable_quotas(struct super_block *sb)
 					"Failed to enable quota tracking "
 					"(type=%d, err=%d). Please run "
 					"e2fsck to fix.", type, err);
-				for (type--; type >= 0; type--)
+				for (type--; type >= 0; type--) {
+					struct inode *inode;
+
+					inode = sb_dqopt(sb)->files[type];
+					if (inode)
+						inode = igrab(inode);
 					dquot_quota_off(sb, type);
+					if (inode) {
+						lockdep_set_quota_inode(inode,
+							I_DATA_SEM_NORMAL);
+						iput(inode);
+					}
+				}
 
 				return err;
 			}
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread
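Editorial illustration (not part of the original message): the commit message above attributes the report to a lock class that survives inode reuse, because i_data_sem is initialized only when the slab is created. The following is a minimal user-space sketch of that idea in C, not kernel code. Everything named toy_* is hypothetical, the fixed-size pool only stands in for a slab cache whose constructor runs once per object, and the reset_class flag loosely models the cleanup that the patch adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subclass numbers; 2 mirrors "i_data_sem/2" in the report. */
enum toy_sem_class { TOY_SEM_NORMAL = 0, TOY_SEM_QUOTA = 2 };

struct toy_inode {
	enum toy_sem_class sem_class;	/* models the lockdep subclass */
	int in_use;
};

#define TOY_POOL_SIZE 4
static struct toy_inode toy_pool[TOY_POOL_SIZE];
static int toy_pool_constructed;

/* Runs once per object, like a slab constructor; never again on reuse. */
static void toy_pool_construct(void)
{
	for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
		toy_pool[i].sem_class = TOY_SEM_NORMAL;
		toy_pool[i].in_use = 0;
	}
	toy_pool_constructed = 1;
}

/* Hands out the first free object without touching sem_class. */
struct toy_inode *toy_alloc(void)
{
	if (!toy_pool_constructed)
		toy_pool_construct();
	for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
		if (!toy_pool[i].in_use) {
			toy_pool[i].in_use = 1;
			return &toy_pool[i];
		}
	}
	return NULL;
}

/* reset_class != 0 loosely models the cleanup that the patch adds. */
void toy_free(struct toy_inode *inode, int reset_class)
{
	assert(inode != NULL && inode->in_use);
	if (reset_class)
		inode->sem_class = TOY_SEM_NORMAL;
	inode->in_use = 0;
}
```

In this sketch, an object that was marked TOY_SEM_QUOTA and then released with reset_class == 0 comes back from toy_alloc() still carrying the quota class, which is the stale state the patch is meant to prevent. Releasing it with reset_class != 0 returns it to TOY_SEM_NORMAL before reuse.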
* Re: [syzbot] possible deadlock in dquot_commit
  2021-10-07  8:44   ` Jan Kara
@ 2021-10-07 13:50     ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2021-10-07 13:50 UTC (permalink / raw)
  To: dvyukov, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com

Tested on:

commit:         60a94835 Merge tag 'warning-fixes-20211005' of git://g..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=74f6ab826fb913cd
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=162860d0b00000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 15+ messages in thread
[parent not found: <20210810041100.3271-1-hdanton@sina.com>]
* Re: [syzbot] possible deadlock in dquot_commit [not found] ` <20210810041100.3271-1-hdanton@sina.com> @ 2021-08-10 9:21 ` Jan Kara [not found] ` <20210811041232.2449-1-hdanton@sina.com> 1 sibling, 0 replies; 15+ messages in thread From: Jan Kara @ 2021-08-10 9:21 UTC (permalink / raw) To: Hillf Danton Cc: syzbot, dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso On Tue 10-08-21 12:11:00, Hillf Danton wrote: > On Mon, 09 Aug 2021 05:54:27 -0700 > > syzbot has found a reproducer for the following issue on: > > > > HEAD commit: 66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324 > > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2 > > compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000 > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com > > > > loop0: detected capacity change from 0 to 4096 > > EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback. > > ====================================================== > > WARNING: possible circular locking dependency detected > > 5.14.0-rc4-syzkaller #0 Not tainted > > ------------------------------------------------------ > > syz-executor211/9242 is trying to acquire lock: > > ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474 > > > > but task is already holding lock: > > ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631 > > > > which lock already depends on the new lock. 
> > > > > > the existing dependency chain (in reverse order) is: > > > > -> #2 (&ei->i_data_sem/2){++++}-{3:3}: > > lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 > > down_read+0x3b/0x50 kernel/locking/rwsem.c:1353 > > ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561 > > ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848 > > ext4_bread+0x2a/0x170 fs/ext4/inode.c:900 > > ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602 > > write_blk fs/quota/quota_tree.c:64 [inline] > > get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93 > > do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300 > > do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 > > do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 > > do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331 > > dq_insert_tree fs/quota/quota_tree.c:357 [inline] > > qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376 > > v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358 > > dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441 > > Mark1, see below. 
>
> >        ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
> >        dqget+0x999/0xdc0 fs/quota/dquot.c:899
> >        __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
> >        ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
> >        lookup_open fs/namei.c:3228 [inline]
> >        open_last_lookups fs/namei.c:3298 [inline]
> >        path_openat+0x13b7/0x36b0 fs/namei.c:3504
> >        do_filp_open+0x253/0x4d0 fs/namei.c:3534
> >        do_sys_openat2+0x124/0x460 fs/open.c:1204
> >        do_sys_open fs/open.c:1220 [inline]
> >        __do_sys_creat fs/open.c:1294 [inline]
> >        __se_sys_creat fs/open.c:1288 [inline]
> >        __x64_sys_creat+0x11f/0x160 fs/open.c:1288
> >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >        do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >        entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > -> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> >        lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
> >        down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
> >        v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332
> >        dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432
>
> What boggles mind is both this line and the above line at Mark1 are under
>
> 430 	mutex_lock(&dquot->dq_lock);
>
> Is it likely?

I'm not quite sure what you are asking about but yes, dquot_acquire() grabs
dquot->dq_lock, then e.g. v2_write_dquot() acquires dqio_sem, then
ext4_map_blocks() acquires i_data_sem/2 (special lock subclass for quota
files).

What is unexpected is the #0 trace where i_data_sem/2 is acquired
by ext4_map_blocks() called from ext4_write_begin(). That shows that
normal write(2) call was able to operate on quota file which is certainly
wrong. My patch closed one path how this could happen and I'm puzzled how
else this could happen. I'll try to reproduce the issue (I've already tried
but so far failed) and see if I can find out more.

Honza > > dqget+0x999/0xdc0 fs/quota/dquot.c:899 > > __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477 > > ext4_create+0xb0/0x550 fs/ext4/namei.c:2731 > > lookup_open fs/namei.c:3228 [inline] > > open_last_lookups fs/namei.c:3298 [inline] > > path_openat+0x13b7/0x36b0 fs/namei.c:3504 > > do_filp_open+0x253/0x4d0 fs/namei.c:3534 > > do_sys_openat2+0x124/0x460 fs/open.c:1204 > > do_sys_open fs/open.c:1220 [inline] > > __do_sys_creat fs/open.c:1294 [inline] > > __se_sys_creat fs/open.c:1288 [inline] > > __x64_sys_creat+0x11f/0x160 fs/open.c:1288 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > -> #0 (&dquot->dq_lock){+.+.}-{3:3}: > > check_prev_add kernel/locking/lockdep.c:3051 [inline] > > check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174 > > validate_chain kernel/locking/lockdep.c:3789 [inline] > > __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015 > > lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625 > > __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959 > > __mutex_lock kernel/locking/mutex.c:1104 [inline] > > mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119 > > dquot_commit+0x57/0x360 fs/quota/dquot.c:474 > > ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245 > > mark_dquot_dirty fs/quota/dquot.c:345 [inline] > > mark_all_dquot_dirty fs/quota/dquot.c:383 [inline] > > __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707 > > dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline] > > dquot_alloc_space include/linux/quotaops.h:310 [inline] > > dquot_alloc_block include/linux/quotaops.h:334 [inline] > > ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477 > > ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245 > > ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638 > > _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794 > > ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077 > > 
ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498 > > ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960 > > generic_perform_write+0x262/0x580 mm/filemap.c:3656 > > ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269 > > ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519 > > call_write_iter include/linux/fs.h:2114 [inline] > > new_sync_write fs/read_write.c:518 [inline] > > vfs_write+0xa39/0xc90 fs/read_write.c:605 > > ksys_write+0x171/0x2a0 fs/read_write.c:658 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > other info that might help us debug this: > > > > Chain exists of: > > &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2 > > > > Possible unsafe locking scenario: > > > > CPU0 CPU1 > > ---- ---- > > lock(&ei->i_data_sem/2); > > lock(&s->s_dquot.dqio_sem); > > lock(&ei->i_data_sem/2); > > lock(&dquot->dq_lock); > > > > *** DEADLOCK *** -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 15+ messages in thread
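Editorial illustration (not part of the original message): the reply above describes the expected order dq_lock, then dqio_sem, then i_data_sem/2, and the report flags the #0 trace because it takes dq_lock while i_data_sem/2 is held. A minimal user-space sketch in C of how such a cycle can be detected on a dependency graph follows. It is a simplification, not the kernel's lockdep algorithm, and the toy_* names and TOY_* constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical class ids for the three locks named in the report. */
enum { TOY_DQ_LOCK, TOY_DQIO_SEM, TOY_I_DATA_SEM_QUOTA, TOY_NR_CLASSES };

/* toy_dep[a][b] is true when "b was acquired while a was held" was seen. */
static bool toy_dep[TOY_NR_CLASSES][TOY_NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool toy_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < TOY_NR_CLASSES; next++) {
		if (toy_dep[from][next] && !seen[next] &&
		    toy_reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that 'acquired' is taken while 'held' is held.
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
bool toy_add_dependency(int held, int acquired)
{
	bool seen[TOY_NR_CLASSES] = { false };

	assert(held >= 0 && held < TOY_NR_CLASSES);
	assert(acquired >= 0 && acquired < TOY_NR_CLASSES);
	if (toy_reachable(acquired, held, seen))
		return false;
	toy_dep[held][acquired] = true;
	return true;
}
```

Feeding the chain from the report into this sketch, the edges TOY_DQ_LOCK to TOY_DQIO_SEM and TOY_DQIO_SEM to TOY_I_DATA_SEM_QUOTA are accepted, while the edge TOY_I_DATA_SEM_QUOTA to TOY_DQ_LOCK is rejected because TOY_I_DATA_SEM_QUOTA is already reachable from TOY_DQ_LOCK.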
[parent not found: <20210811041232.2449-1-hdanton@sina.com>]
* Re: [syzbot] possible deadlock in dquot_commit
       [not found]     ` <20210811041232.2449-1-hdanton@sina.com>
@ 2021-08-12 13:55       ` Jan Kara
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kara @ 2021-08-12 13:55 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Jan Kara, syzbot, dvyukov, linux-kernel, syzkaller-bugs,
	syzkaller, tytso

On Wed 11-08-21 12:12:32, Hillf Danton wrote:
> On Tue, 10 Aug 2021 11:21:42 +0200 Jan Kara wrote:
> >
> >I'm not quite sure what you are asking about but yes, dquot_acquire() grabs
>
> It is hard to understand the rooms in mutex for two lock owners.
>
> >dquot->dq_lock, then e.g. v2_write_dquot() acquires dqio_sem, then
> >ext4_map_blocks() acquires i_data_sem/2 (special lock subclass for quota
> >files).
> >
> >What is unexpected is the #0 trace where i_data_sem/2 is acquired
> >by ext4_map_blocks() called from ext4_write_begin(). That shows that
> >normal write(2) call was able to operate on quota file which is certainly
> >wrong.
>
> The change below can test your theory.
> >
> >My patch closed one path how this could happen and I'm puzzled how
> >else this could happen. I'll try to reproduce the issue (I've already tried
> >but so far failed) and see if I can find out more.
>
> Actually there is one check for quota file near 100 lines of code lower,
> and copy it to just before taking i_data_sem to avoid writing the file of
> wrong type.
>
> Now only for thoughts.
>
> +++ x/fs/ext4/inode.c
> @@ -616,6 +616,8 @@ found:
>  		if (!(flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN))
>  			return retval;
>  
> +	if (ext4_is_quota_file(inode))
> +		return -EINVAL;
>  	/*
>  	 * Here we clear m_flags because after allocating an new extent,
>  	 * it will be set again.

This would be certainly wrong. ext4_map_blocks() is used for accessing and
allocating blocks for quota file. It is ext4_write_begin() that should not
be called for the quota file.

I've run the reproducer here for a couple of hours but the problem didn't
trigger for me. Strange.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread
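Editorial illustration (not part of the original message): the objection above is about where a quota-file check may live. A shared block-mapping helper is also used by the filesystem's own quota writes, so rejecting quota files there would break them, whereas the user-facing write entry point is the path that should refuse. The following user-space sketch in C shows that layering argument only. All toy_* names are hypothetical and none of this is ext4 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_file {
	bool is_quota_file;
};

/*
 * Shared helper used by both paths. guard_in_helper models the check
 * proposed in the quoted diff (reject quota files at mapping time).
 */
int toy_map_blocks(const struct toy_file *f, bool guard_in_helper)
{
	assert(f != NULL);
	if (guard_in_helper && f->is_quota_file)
		return -EINVAL;
	return 0;
}

/* Internal path: the filesystem writing its own quota file. */
int toy_quota_write(const struct toy_file *f, bool guard_in_helper)
{
	return toy_map_blocks(f, guard_in_helper);
}

/* User-facing path: this is where a quota file should be refused. */
int toy_user_write(const struct toy_file *f)
{
	assert(f != NULL);
	if (f->is_quota_file)
		return -EINVAL;
	return toy_map_blocks(f, false);
}
```

With the guard in the helper, toy_quota_write() on a quota file fails, which corresponds to the breakage the reply warns about. With the guard only in toy_user_write(), internal quota writes still succeed and only user writes to a quota file are rejected.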
end of thread, other threads:[~2021-10-07 13:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-10 11:25 possible deadlock in dquot_commit syzbot
2021-02-11 11:37 ` Jan Kara
2021-02-11 11:47 ` Dmitry Vyukov
2021-02-11 15:47 ` Jan Kara
2021-02-11 21:46 ` Theodore Ts'o
2021-02-12 11:01 ` Dmitry Vyukov
2021-02-12 16:10 ` Theodore Ts'o
2021-02-15 12:50 ` Dmitry Vyukov
2021-08-09 12:54 ` [syzbot] " syzbot
2021-08-09 14:52 ` Jan Kara
2021-08-09 17:43 ` syzbot
2021-10-07 8:44 ` Jan Kara
2021-10-07 13:50 ` syzbot
[not found] ` <20210810041100.3271-1-hdanton@sina.com>
2021-08-10 9:21 ` Jan Kara
[not found] ` <20210811041232.2449-1-hdanton@sina.com>
2021-08-12 13:55 ` Jan Kara