* possible deadlock in dquot_commit
@ 2021-02-10 11:25 syzbot
  2021-02-11 11:37 ` Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: syzbot @ 2021-02-10 11:25 UTC
  To: jack, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com

loop1: detected capacity change from 4096 to 0
EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
======================================================
WARNING: possible circular locking dependency detected
5.11.0-rc6-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.1/16170 is trying to acquire lock:
ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476

but task is already holding lock:
ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ei->i_data_sem/2){++++}-{3:3}:
       down_read+0x95/0x440 kernel/locking/rwsem.c:1353
       ext4_map_blocks+0x381/0x17d0 fs/ext4/inode.c:560
       ext4_getblk+0x13c/0x670 fs/ext4/inode.c:847
       ext4_bread+0x29/0x210 fs/ext4/inode.c:899
       ext4_quota_write+0x26b/0x670 fs/ext4/super.c:6557
       write_blk+0x12e/0x220 fs/quota/quota_tree.c:73
       get_free_dqblk+0xff/0x2d0 fs/quota/quota_tree.c:102
       do_insert_tree+0x79c/0x1180 fs/quota/quota_tree.c:309
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       do_insert_tree+0xf77/0x1180 fs/quota/quota_tree.c:340
       dq_insert_tree fs/quota/quota_tree.c:366 [inline]
       qtree_write_dquot+0x3b7/0x580 fs/quota/quota_tree.c:385
       v2_write_dquot+0x11c/0x250 fs/quota/quota_v2.c:353
       dquot_acquire+0x2c5/0x590 fs/quota/dquot.c:443
       ext4_acquire_dquot+0x254/0x3b0 fs/ext4/super.c:6216
       dqget+0x678/0x1080 fs/quota/dquot.c:901
       __dquot_initialize+0x560/0xbe0 fs/quota/dquot.c:1479
       ext4_create+0x8b/0x4c0 fs/ext4/namei.c:2606
       lookup_open.isra.0+0xf85/0x1350 fs/namei.c:3106
       open_last_lookups fs/namei.c:3180 [inline]
       path_openat+0x96d/0x2730 fs/namei.c:3368
       do_filp_open+0x17e/0x3c0 fs/namei.c:3398
       do_sys_openat2+0x16d/0x420 fs/open.c:1172
       do_sys_open fs/open.c:1188 [inline]
       __do_sys_creat fs/open.c:1262 [inline]
       __se_sys_creat fs/open.c:1256 [inline]
       __x64_sys_creat+0xc9/0x120 fs/open.c:1256
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
       down_read+0x95/0x440 kernel/locking/rwsem.c:1353
       v2_read_dquot+0x49/0x120 fs/quota/quota_v2.c:327
       dquot_acquire+0x12e/0x590 fs/quota/dquot.c:434
       ext4_acquire_dquot+0x254/0x3b0 fs/ext4/super.c:6216
       dqget+0x678/0x1080 fs/quota/dquot.c:901
       __dquot_initialize+0x560/0xbe0 fs/quota/dquot.c:1479
       ext4_create+0x8b/0x4c0 fs/ext4/namei.c:2606
       lookup_open.isra.0+0xf85/0x1350 fs/namei.c:3106
       open_last_lookups fs/namei.c:3180 [inline]
       path_openat+0x96d/0x2730 fs/namei.c:3368
       do_filp_open+0x17e/0x3c0 fs/namei.c:3398
       do_sys_openat2+0x16d/0x420 fs/open.c:1172
       do_sys_open fs/open.c:1188 [inline]
       __do_sys_creat fs/open.c:1262 [inline]
       __se_sys_creat fs/open.c:1256 [inline]
       __x64_sys_creat+0xc9/0x120 fs/open.c:1256
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&dquot->dq_lock){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:2868 [inline]
       check_prevs_add kernel/locking/lockdep.c:2993 [inline]
       validate_chain kernel/locking/lockdep.c:3608 [inline]
       __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
       lock_acquire kernel/locking/lockdep.c:5442 [inline]
       lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
       __mutex_lock_common kernel/locking/mutex.c:956 [inline]
       __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
       dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
       ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200
       ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline]
       ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242
       mark_dquot_dirty fs/quota/dquot.c:347 [inline]
       mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
       __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709
       dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
       dquot_alloc_space include/linux/quotaops.h:310 [inline]
       dquot_alloc_block include/linux/quotaops.h:334 [inline]
       ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937
       ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238
       ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637
       _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793
       ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077
       ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202
       ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961
       generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412
       ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270
       ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write+0x426/0x650 fs/read_write.c:518
       vfs_write+0x791/0xa30 fs/read_write.c:605
       ksys_write+0x12d/0x250 fs/read_write.c:658
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

Chain exists of:
  &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_data_sem/2);
                               lock(&s->s_dquot.dqio_sem);
                               lock(&ei->i_data_sem/2);
  lock(&dquot->dq_lock);

 *** DEADLOCK ***

5 locks held by syz-executor.1/16170:
 #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
 #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
 #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
 #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
 #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
 #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
 #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671

stack backtrace:
CPU: 0 PID: 16170 Comm: syz-executor.1 Not tainted 5.11.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2117
 check_prev_add kernel/locking/lockdep.c:2868 [inline]
 check_prevs_add kernel/locking/lockdep.c:2993 [inline]
 validate_chain kernel/locking/lockdep.c:3608 [inline]
 __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
 lock_acquire kernel/locking/lockdep.c:5442 [inline]
 lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
 __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
 dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
 ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200
 ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline]
 ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242
 mark_dquot_dirty fs/quota/dquot.c:347 [inline]
 mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
 __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709
 dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
 dquot_alloc_space include/linux/quotaops.h:310 [inline]
 dquot_alloc_block include/linux/quotaops.h:334 [inline]
 ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937
 ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238
 ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637
 _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793
 ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077
 ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202
 ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961
 generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412
 ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270
 ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664
 call_write_iter include/linux/fs.h:1901 [inline]
 new_sync_write+0x426/0x650 fs/read_write.c:518
 vfs_write+0x791/0xa30 fs/read_write.c:605
 ksys_write+0x12d/0x250 fs/read_write.c:658
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465b09
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f8097ffc188 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465b09
RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003
RBP: 00000000004b069f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 00007ffefc77f01f R14: 00007f8097ffc300 R15: 0000000000022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


* Re: possible deadlock in dquot_commit
  2021-02-10 11:25 possible deadlock in dquot_commit syzbot
@ 2021-02-11 11:37 ` Jan Kara
  2021-02-11 11:47   ` Dmitry Vyukov
  2021-08-09 12:54 ` [syzbot] " syzbot
       [not found] ` <20210810041100.3271-1-hdanton@sina.com>
  2 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2021-02-11 11:37 UTC
  To: syzbot; +Cc: jack, linux-kernel, syzkaller-bugs

On Wed 10-02-21 03:25:22, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> 
> loop1: detected capacity change from 4096 to 0
> EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.11.0-rc6-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor.1/16170 is trying to acquire lock:
> ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> 
> but task is already holding lock:
> ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> 
> which lock already depends on the new lock.

<snip>

All snipped stacktraces look perfectly fine and the lock dependencies are as
expected.

> 5 locks held by syz-executor.1/16170:
>  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
>  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
>  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
>  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
>  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
>  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
>  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671

This actually looks problematic: we acquired &ei->i_data_sem/2 (i.e., the
I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
ext4_block_write_begin(). This suggests that the write was going directly
to the quota file (or that the lockdep annotation of the inode went wrong
somewhere). Now, we normally protect quota files with the IMMUTABLE flag, so
writing to them should not be possible. We also don't allow clearing this
flag on a quota file that is in use. Finally, I've checked the lockdep
annotation and everything looks correct. So at this point the best theory I
have is that the filesystem has been suitably corrupted and a quota file
that is supposed to be inaccessible from userspace got exposed, but I'd
expect other problems to hit first in that case. Anyway, without a
reproducer I have no more ideas...
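
For readers less used to lockdep output, the reported cycle can be
reproduced with a toy model. The following is only a sketch in Python, not
kernel code: the lock names are copied from the report, while
LockOrderGraph, acquire() and find_path() are hypothetical names made up
for this illustration. Real lockdep tracks lock classes, subclasses and
read/write state in far more detail than this.

```python
from collections import defaultdict


def find_path(edges, src, dst, seen=None):
    """Return the list of locks leading from src to dst along recorded
    'held -> acquired next' edges, or None if dst is unreachable."""
    if seen is None:
        seen = set()
    if src == dst:
        return [src]
    seen.add(src)
    for nxt in edges[src]:
        if nxt not in seen:
            rest = find_path(edges, nxt, dst, seen)
            if rest is not None:
                return [src] + rest
    return None


class LockOrderGraph:
    """Toy lock-order validator (illustrative only, not how lockdep is built)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        If `held` is already reachable from `new`, taking `new` now would
        close a cycle; return that existing chain. Otherwise record the
        edge and return None.
        """
        chain = find_path(self.edges, new, held)
        if chain is not None:
            return chain
        self.edges[held].append(new)
        return None


graph = LockOrderGraph()
# Dependency #1 in the report: dqio_sem taken under dq_lock (dquot_acquire).
graph.acquire("&dquot->dq_lock", "&s->s_dquot.dqio_sem")
# Dependency #2: i_data_sem/2 taken under dqio_sem (quota file write).
graph.acquire("&s->s_dquot.dqio_sem", "&ei->i_data_sem/2")
# Dependency #0: the write path holds i_data_sem/2 and wants dq_lock.
cycle = graph.acquire("&ei->i_data_sem/2", "&dquot->dq_lock")
print(" --> ".join(cycle))
```

The printed chain is the same one the report lists under "Chain exists of:".
The model also shows why the subclass matters: the cycle only closes
because the lock held in the write path is the /2 (quota) subclass of
i_data_sem rather than the subclass used for ordinary files.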

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: possible deadlock in dquot_commit
  2021-02-11 11:37 ` Jan Kara
@ 2021-02-11 11:47   ` Dmitry Vyukov
  2021-02-11 15:47     ` Jan Kara
  2021-02-11 21:46     ` Theodore Ts'o
  0 siblings, 2 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2021-02-11 11:47 UTC
  To: Jan Kara; +Cc: syzbot, Jan Kara, LKML, syzkaller-bugs

On Thu, Feb 11, 2021 at 12:37 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 10-02-21 03:25:22, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> >
> > loop1: detected capacity change from 4096 to 0
> > EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.11.0-rc6-syzkaller #0 Not tainted
> > ------------------------------------------------------
> > syz-executor.1/16170 is trying to acquire lock:
> > ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> >
> > but task is already holding lock:
> > ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> >
> > which lock already depends on the new lock.
>
> <snip>
>
> All snipped stacktraces look perfectly fine and the lock dependencies are as
> expected.
>
> > 5 locks held by syz-executor.1/16170:
> >  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
> >  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
> >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
> >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
> >  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
> >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671
>
> This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> ext4_block_write_begin(). This suggests that the write has been happening
> directly to the quota file (or that lockdep annotation of the inode went
> wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> so writing it should not be possible. We also don't allow clearing this
> flag on used quota file. Finally I'd checked lockdep annotation and
> everything looks correct. So at this point the best theory I have is that a
> filesystem has been suitably corrupted and quota file supposed to be
> inaccessible from userspace got exposed but I'd expect other problems to
> hit first in that case. Anyway without a reproducer I have no more ideas...

There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
I don't know why it has not popped up on upstream yet; there are lots of
potential reasons for this.



* Re: possible deadlock in dquot_commit
  2021-02-11 11:47   ` Dmitry Vyukov
@ 2021-02-11 15:47     ` Jan Kara
  2021-02-11 21:46     ` Theodore Ts'o
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2021-02-11 15:47 UTC
  To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs

On Thu 11-02-21 12:47:18, Dmitry Vyukov wrote:
> On Thu, Feb 11, 2021 at 12:37 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 10-02-21 03:25:22, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    1e0d27fc Merge branch 'akpm' (patches from Andrew)
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=101cf2f8d00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=e83e68d0a6aba5f6
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> > >
> > > loop1: detected capacity change from 4096 to 0
> > > EXT4-fs (loop1): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 5.11.0-rc6-syzkaller #0 Not tainted
> > > ------------------------------------------------------
> > > syz-executor.1/16170 is trying to acquire lock:
> > > ffff8880795f5b28 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> > >
> > > but task is already holding lock:
> > > ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> > >
> > > which lock already depends on the new lock.
> >
> > <snip>
> >
> > All snipped stacktraces look perfectly fine and the lock dependencies are as
> > expected.
> >
> > > 5 locks held by syz-executor.1/16170:
> > >  #0: ffff88802ad18b70 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
> > >  #1: ffff88802fbec460 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x12d/0x250 fs/read_write.c:658
> > >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock include/linux/fs.h:773 [inline]
> > >  #2: ffff88807960b648 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: ext4_buffered_write_iter+0xb6/0x4d0 fs/ext4/file.c:264
> > >  #3: ffff88807960b438 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x5e1/0x17d0 fs/ext4/inode.c:630
> > >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: i_dquot fs/quota/dquot.c:926 [inline]
> > >  #4: ffffffff8bf1be58 (dquot_srcu){....}-{0:0}, at: __dquot_alloc_space+0x1b4/0xb60 fs/quota/dquot.c:1671
> >
> > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> > ext4_block_write_begin(). This suggests that the write has been happening
> > directly to the quota file (or that lockdep annotation of the inode went
> > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> > so writing it should not be possible. We also don't allow clearing this
> > flag on used quota file. Finally I'd checked lockdep annotation and
> > everything looks correct. So at this point the best theory I have is that a
> > filesystem has been suitably corrupted and quota file supposed to be
> > inaccessible from userspace got exposed but I'd expect other problems to
> > hit first in that case. Anyway without a reproducer I have no more ideas...
> 
> There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> I don't know why it has not popped up on upstream yet; there are lots of
> potential reasons for this.

OK, so I've checked the fs images generated by the syzkaller reproducer and
they indeed have the QUOTA feature enabled. Also, the inodes used by the
quota files are not marked as allocated, so there is some potential for
surprises. But all the possible paths I could think of seem to be covered
and return EFSCORRUPTED. Also note that the reproducer didn't trigger the
lockdep splat for me, so the problem still isn't clear to me.
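
For anyone who wants to repeat the first of these checks on an image, the
following is a rough sketch in Python. It is not the procedure that was
actually used here. It assumes the usual ext4 on-disk layout (primary
superblock at byte 1024, s_magic at offset 0x38, s_feature_ro_compat at
offset 0x64, RO_COMPAT_QUOTA = 0x0100); has_quota_feature() is a
hypothetical helper name. It does not look at the inode bitmaps, so it says
nothing about whether the quota inodes are marked as allocated.

```python
import struct

SUPERBLOCK_OFFSET = 1024            # primary superblock starts at byte 1024
S_MAGIC_OFFSET = 0x38               # __le16 s_magic
S_FEATURE_RO_COMPAT_OFFSET = 0x64   # __le32 s_feature_ro_compat
EXT4_SUPER_MAGIC = 0xEF53
RO_COMPAT_QUOTA = 0x0100            # quota stored in hidden inodes


def has_quota_feature(image: bytes) -> bool:
    """Return True if the superblock in `image` advertises the QUOTA feature.

    Raises ValueError if the buffer is too short or the magic does not match.
    """
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < S_FEATURE_RO_COMPAT_OFFSET + 4:
        raise ValueError("image too short to contain a superblock")
    (magic,) = struct.unpack_from("<H", sb, S_MAGIC_OFFSET)
    if magic != EXT4_SUPER_MAGIC:
        raise ValueError("ext2/3/4 superblock magic not found")
    (ro_compat,) = struct.unpack_from("<I", sb, S_FEATURE_RO_COMPAT_OFFSET)
    return bool(ro_compat & RO_COMPAT_QUOTA)
```

It only needs the first 2 KiB of the image, e.g. the result of
open(path, "rb").read(2048) for some image file path.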

								Honza

> > >
> > > stack backtrace:
> > > CPU: 0 PID: 16170 Comm: syz-executor.1 Not tainted 5.11.0-rc6-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > Call Trace:
> > >  __dump_stack lib/dump_stack.c:79 [inline]
> > >  dump_stack+0x107/0x163 lib/dump_stack.c:120
> > >  check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2117
> > >  check_prev_add kernel/locking/lockdep.c:2868 [inline]
> > >  check_prevs_add kernel/locking/lockdep.c:2993 [inline]
> > >  validate_chain kernel/locking/lockdep.c:3608 [inline]
> > >  __lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
> > >  lock_acquire kernel/locking/lockdep.c:5442 [inline]
> > >  lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
> > >  __mutex_lock_common kernel/locking/mutex.c:956 [inline]
> > >  __mutex_lock+0x134/0x1110 kernel/locking/mutex.c:1103
> > >  dquot_commit+0x4d/0x420 fs/quota/dquot.c:476
> > >  ext4_write_dquot+0x24e/0x310 fs/ext4/super.c:6200
> > >  ext4_mark_dquot_dirty fs/ext4/super.c:6248 [inline]
> > >  ext4_mark_dquot_dirty+0x111/0x1b0 fs/ext4/super.c:6242
> > >  mark_dquot_dirty fs/quota/dquot.c:347 [inline]
> > >  mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
> > >  __dquot_alloc_space+0x5d4/0xb60 fs/quota/dquot.c:1709
> > >  dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
> > >  dquot_alloc_space include/linux/quotaops.h:310 [inline]
> > >  dquot_alloc_block include/linux/quotaops.h:334 [inline]
> > >  ext4_mb_new_blocks+0x5a9/0x51a0 fs/ext4/mballoc.c:4937
> > >  ext4_ext_map_blocks+0x20da/0x5fb0 fs/ext4/extents.c:4238
> > >  ext4_map_blocks+0x653/0x17d0 fs/ext4/inode.c:637
> > >  _ext4_get_block+0x241/0x590 fs/ext4/inode.c:793
> > >  ext4_block_write_begin+0x4f8/0x1190 fs/ext4/inode.c:1077
> > >  ext4_write_begin+0x4b5/0x14b0 fs/ext4/inode.c:1202
> > >  ext4_da_write_begin+0x672/0x1150 fs/ext4/inode.c:2961
> > >  generic_perform_write+0x20a/0x4f0 mm/filemap.c:3412
> > >  ext4_buffered_write_iter+0x244/0x4d0 fs/ext4/file.c:270
> > >  ext4_file_write_iter+0x423/0x14d0 fs/ext4/file.c:664
> > >  call_write_iter include/linux/fs.h:1901 [inline]
> > >  new_sync_write+0x426/0x650 fs/read_write.c:518
> > >  vfs_write+0x791/0xa30 fs/read_write.c:605
> > >  ksys_write+0x12d/0x250 fs/read_write.c:658
> > >  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > RIP: 0033:0x465b09
> > > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> > > RSP: 002b:00007f8097ffc188 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > > RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465b09
> > > RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003
> > > RBP: 00000000004b069f R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
> > > R13: 00007ffefc77f01f R14: 00007f8097ffc300 R15: 0000000000022000
> > >
> > >
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > >
> > --
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR
> >
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in dquot_commit
  2021-02-11 11:47   ` Dmitry Vyukov
  2021-02-11 15:47     ` Jan Kara
@ 2021-02-11 21:46     ` Theodore Ts'o
  2021-02-12 11:01       ` Dmitry Vyukov
  1 sibling, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2021-02-11 21:46 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs

On Thu, Feb 11, 2021 at 12:47:18PM +0100, Dmitry Vyukov wrote:
> > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> > ext4_block_write_begin(). This suggests that the write has been happening
> > directly to the quota file (or that lockdep annotation of the inode went
> > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> > so writing it should not be possible. We also don't allow clearing this
> > flag on used quota file. Finally I'd checked lockdep annotation and
> > everything looks correct. So at this point the best theory I have is that a
> > filesystem has been suitably corrupted and quota file supposed to be
> > inaccessible from userspace got exposed but I'd expect other problems to
> > hit first in that case. Anyway without a reproducer I have no more ideas...
> 
> There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> I don't why it did not pop up on upstream yet, there lots of potential
> reasons for this.

The 4.19 version of the syzbot report has a very different stack
trace.  Instead of it being related to an apparent write to the quota
file, it is apparently caused by a call to rmdir:

 dump_stack+0x22c/0x33e lib/dump_stack.c:118
 print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1221
   ...
 __mutex_lock+0xd7/0x13f0 kernel/locking/mutex.c:1072
 dquot_commit+0x4d/0x400 fs/quota/dquot.c:469
 ext4_write_dquot+0x1f2/0x2a0 fs/ext4/super.c:5644
   ...
 ext4_evict_inode+0x933/0x1830 fs/ext4/inode.c:298
 evict+0x2ed/0x780 fs/inode.c:559
 iput_final fs/inode.c:1555 [inline]
   ...
 vfs_rmdir fs/namei.c:3865 [inline]
 do_rmdir+0x3af/0x420 fs/namei.c:3943
 __do_sys_unlinkat fs/namei.c:4105 [inline]
 __se_sys_unlinkat fs/namei.c:4099 [inline]
 __x64_sys_unlinkat+0xdf/0x120 fs/namei.c:4099
 do_syscall_64+0xf9/0x670 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Which leads me to another apparent contradiction.  Neither reading the
C reproducer source code nor running the C reproducer under "strace
-ff" shows any attempt to run rmdir() on the corrupted file system
that is mounted.

Looking at the code, I did see a number of things which seemed to be
bugs; procid never gets incremented, so all of the threads only
operate on /dev/loop0, and each call to the execute() function tries
to set up two file systems on /dev/loop0.  So each thread that runs
creates a temp file, binds it to /dev/loop0, and then creates another
temp file, tries to bind it to /dev/loop0 (which will fail), and tries
to mount /dev/loop0 (again) on the same mount point (which will
succeed).

I'm not sure if this is just some insanity that was consed up by the
fuzzer... or I'm wondering if this was an unfaithful translation of
the syzbot repro to C.  Am I correct in understanding that when syzbot
is running, it uses the syzbot repro, and not the C repro?

   	       	    	       	      - Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in dquot_commit
  2021-02-11 21:46     ` Theodore Ts'o
@ 2021-02-12 11:01       ` Dmitry Vyukov
  2021-02-12 16:10         ` Theodore Ts'o
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Vyukov @ 2021-02-12 11:01 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller

On Thu, Feb 11, 2021 at 10:46 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Thu, Feb 11, 2021 at 12:47:18PM +0100, Dmitry Vyukov wrote:
> > > This actually looks problematic: We acquired &ei->i_data_sem/2 (i.e.,
> > > I_DATA_SEM_QUOTA subclass) in ext4_map_blocks() called from
> > > ext4_block_write_begin(). This suggests that the write has been happening
> > > directly to the quota file (or that lockdep annotation of the inode went
> > > wrong somewhere). Now we normally protect quota files with IMMUTABLE flag
> > > so writing it should not be possible. We also don't allow clearing this
> > > flag on used quota file. Finally I'd checked lockdep annotation and
> > > everything looks correct. So at this point the best theory I have is that a
> > > filesystem has been suitably corrupted and quota file supposed to be
> > > inaccessible from userspace got exposed but I'd expect other problems to
> > > hit first in that case. Anyway without a reproducer I have no more ideas...
> >
> > There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> > I don't why it did not pop up on upstream yet, there lots of potential
> > reasons for this.
>
> The 4.19 version of the syzbot report has a very different stack
> trace.  Instead of it being related to an apparent write to the quota
> file, it is apparently caused by a call to rmdir:
>
>  dump_stack+0x22c/0x33e lib/dump_stack.c:118
>  print_circular_bug.constprop.0.cold+0x2d7/0x41e kernel/locking/lockdep.c:1221
>    ...
>  __mutex_lock+0xd7/0x13f0 kernel/locking/mutex.c:1072
>  dquot_commit+0x4d/0x400 fs/quota/dquot.c:469
>  ext4_write_dquot+0x1f2/0x2a0 fs/ext4/super.c:5644
>    ...
>  ext4_evict_inode+0x933/0x1830 fs/ext4/inode.c:298
>  evict+0x2ed/0x780 fs/inode.c:559
>  iput_final fs/inode.c:1555 [inline]
>    ...
>  vfs_rmdir fs/namei.c:3865 [inline]
>  do_rmdir+0x3af/0x420 fs/namei.c:3943
>  __do_sys_unlinkat fs/namei.c:4105 [inline]
>  __se_sys_unlinkat fs/namei.c:4099 [inline]
>  __x64_sys_unlinkat+0xdf/0x120 fs/namei.c:4099
>  do_syscall_64+0xf9/0x670 arch/x86/entry/common.c:293
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Which leads me to another apparent contradiction.  Looking at the C
> reproducer source code, and running the C reproducer under "strace
> -ff", there is never any attempt to run rmdir() on the corrupted file
> system that is mounted.  Neither as observed by my running the C
> reproducer, or by looking at the C reproducer source code.
>
> Looking at the code, I did see a number of things which seemed to be
> bugs; procid never gets incremented, so all of the threads only
> operate on /dev/loop0, and each call to the execute() function tries
> to setup two file systems on /dev/loop0.  So the each thread to run
> creates a temp file, binds it to /dev/loop0, and then creates another
> temp file, tries to bind it to /dev/loop0 (which will fail), tries to
> mount /dev/loop0 (again) on the samee mount point (which will
> succeed).
>
> I'm not sure if this is just some insanity that was consed up by the
> fuzzer... or I'm wondering if this was an unfaithful translation of
> the syzbot repro to C.  Am I correct in understanding that when syzbot
> is running, it uses the syzbot repro, and not the C repro?

Hi Ted,

The 4.19 reproducer may reproduce something else, you know better. I
just want to answer the points re syzkaller reproducers. FTR the 4.19
report/reproducer is here:
https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02

> there is never any attempt to run rmdir() on the corrupted file system that is mounted.

Recursive rmdir happens as part of test cleanup implicitly, you can
see rmdir call in remove_dir function in the C reproducer:
https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000

> procid never gets incremented, so all of the threads only operate on /dev/loop0

This is intentional. procid is supposed to "isolate" parallel test
processes (if any). This reproducer does not use parallel test
processes, thus procid has constant value.

> Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro?

It tries both. It first tries to interpret the "syzkaller program" the
same way it was done when the bug was triggered during fuzzing. Then it
tries to convert it to a corresponding stand-alone C program and
confirms that it still triggers the bug. If it provides a C reproducer,
it means that it did trigger the bug using this exact C program on a
freshly booted kernel (and the provided kernel oops is the
corresponding oops obtained on this exact program).
If it fails to reproduce the bug with a C reproducer, then it provides
only the "syzkaller program" so as not to mislead developers.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in dquot_commit
  2021-02-12 11:01       ` Dmitry Vyukov
@ 2021-02-12 16:10         ` Theodore Ts'o
  2021-02-15 12:50           ` Dmitry Vyukov
  0 siblings, 1 reply; 15+ messages in thread
From: Theodore Ts'o @ 2021-02-12 16:10 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller

On Fri, Feb 12, 2021 at 12:01:51PM +0100, Dmitry Vyukov wrote:
> > >
> > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> > > I don't why it did not pop up on upstream yet, there lots of potential
> > > reasons for this.
> >
> > The 4.19 version of the syzbot report has a very different stack
> > trace.  Instead of it being related to an apparent write to the quota
> > file, it is apparently caused by a call to rmdir:
> >
>
> The 4.19 reproducer may reproducer something else, you know better. I
> just want to answer points re syzkaller reproducers. FTR the 4.19
> reproducer/reproducer is here:
> https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02

Yes, I know.  That was my point.  I don't think it's useful for
debugging the upstream dquot_commit syzbot report (for which we don't
have a reproducer yet).

> > there is never any attempt to run rmdir() on the corrupted file system that is mounted.
> 
> Recursive rmdir happens as part of test cleanup implicitly, you can
> see rmdir call in remove_dir function in the C reproducer:
> https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000

That rmdir() removes the mountpoint, which is *not* the fuzzed file
system which has the quota feature enabled.

> > procid never gets incremented, so all of the threads only operate on /dev/loop0
> 
> This is intentional. procid is supposed to "isolate" parallel test
> processes (if any). This reproducer does not use parallel test
> processes, thus procid has constant value.

Um... yes it does:

int main(void)
{
  syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
  syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
  syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
  use_temporary_dir();
  loop();
  return 0;
}

and what is loop?

static void loop(void)
{
  int iter = 0;
  for (;; iter++) {
    ...
    reset_loop();
    int pid = fork();
    if (pid < 0)
      exit(1);
    if (pid == 0) {
      if (chdir(cwdbuf))
        exit(1);
      setup_test();
      execute_one();
      exit(0);
    }
    ...
    remove_dir(cwdbuf);
  }
}

> > Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro?
> 
> It tries both. If first tries to interpret "syzkaller program" as it
> was done when the bug was triggered during fuzzing. But then it tries
> to convert it to a corresponding stand-alone C program and confirms
> that it still triggers the bug. If it provides a C reproducer, it
> means that it did trigger the bug using this exact C program on a
> freshly booted kernel (and the provided kernel oops is the
> corresponding oops obtained on this exact program).
> If it fails to reproduce the bug with a C reproducer, then it provides
> only the "syzkaller program" to not mislead developers.

Well, looking at the C reproducer, it doesn't reproduce on upstream,
and the stack trace makes no sense to me.  The rmdir() executes at the
end of the test, as part of the cleanup, and looking at the syzkaller
console, the stack trace involving rmdir happens *early* while test
threads are still trying to mount the file system.

	    	  	    	      - Ted

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in dquot_commit
  2021-02-12 16:10         ` Theodore Ts'o
@ 2021-02-15 12:50           ` Dmitry Vyukov
  0 siblings, 0 replies; 15+ messages in thread
From: Dmitry Vyukov @ 2021-02-15 12:50 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, syzbot, Jan Kara, LKML, syzkaller-bugs, syzkaller

On Fri, Feb 12, 2021 at 5:10 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
>  >From: Theodore Ts'o <tytso@mit.edu>
>
> On Fri, Feb 12, 2021 at 12:01:51PM +0100, Dmitry Vyukov wrote:
> > > >
> > > > There is a reproducer for 4.19 available on the dashboard. Maybe it will help.
> > > > I don't why it did not pop up on upstream yet, there lots of potential
> > > > reasons for this.
> > >
> > > The 4.19 version of the syzbot report has a very different stack
> > > trace.  Instead of it being related to an apparent write to the quota
> > > file, it is apparently caused by a call to rmdir:
> > >
> >
> > The 4.19 reproducer may reproducer something else, you know better. I
> > just want to answer points re syzkaller reproducers. FTR the 4.19
> > reproducer/reproducer is here:
> > https://syzkaller.appspot.com/bug?id=b6cacc9fa48fea07154b8797236727de981c1e02
>
> Yes, I know.  That was my point.  I don't think it's useful for
> debugging the upstream dquot_commit syzbot report (for which we don't
> have a reproducer yet).
>
> > > there is never any attempt to run rmdir() on the corrupted file system that is mounted.
> >
> > Recursive rmdir happens as part of test cleanup implicitly, you can
> > see rmdir call in remove_dir function in the C reproducer:
> > https://syzkaller.appspot.com/text?tag=ReproC&x=12caea37900000
>
> That rmdir() removes the mountpoint, which is *not* the fuzzed file
> system which has the quota feature enabled.

The remove_dir function is recursive, so rmdir should be called for all
subdirectories, starting from the deepest ones. At least that was the
intention. Do you see that it's not working this way? That would be
something to fix.
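
To make "starting from the deepest ones" concrete, here is a minimal
sketch of a depth-first (children before parents) removal. It is not
the reproducer's actual remove_dir(); remove_tree() is a hypothetical
name, and the sketch assumes a POSIX system and does not handle mount
points at all.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: removes `path` and everything below it,
 * children before parents. Returns the number of directories removed
 * with rmdir(), or -1 on the first error.
 */
static int remove_tree(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    int removed = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
            continue;

        char child[4096];
        int n = snprintf(child, sizeof(child), "%s/%s", path, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(child))
            goto fail;

        struct stat st;
        if (lstat(child, &st))
            goto fail;

        if (S_ISDIR(st.st_mode)) {
            /* Recurse first, so the deepest directories go first. */
            int sub = remove_tree(child);
            if (sub < 0)
                goto fail;
            removed += sub;
        } else if (unlink(child)) {
            goto fail;
        }
    }
    closedir(dir);

    /* Only now is the directory itself empty and removable. */
    if (rmdir(path))
        return -1;
    return removed + 1;

fail:
    closedir(dir);
    return -1;
}
```

With this ordering, rmdir() reaches a directory inside a mounted file
system only if the walk actually descends into that mount, which is
the point under discussion here.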

> > > procid never gets incremented, so all of the threads only operate on /dev/loop0
> >
> > This is intentional. procid is supposed to "isolate" parallel test
> > processes (if any). This reproducer does not use parallel test
> > processes, thus procid has constant value.
>
> Um... yes it does:

There is a waitpid before remove_dir. So these are sequential test
processes, not parallel.
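
As an illustration of the sequential pattern, here is a simplified
sketch. It is not the reproducer's loop (the relevant part is elided
with "..." in the quoted code); run_sequential() is a hypothetical
name, and the child body is a stand-in for setup_test() and
execute_one().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical sketch: fork one child per iteration and wait for it
 * before starting the next one. Returns the number of children that
 * were reaped with the expected exit code, or -1 on error.
 */
static int run_sequential(int iterations)
{
    assert(iterations >= 0);

    int reaped = 0;
    for (int iter = 0; iter < iterations; iter++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(iter & 0x7f); /* child: stand-in for the test body */

        /* The parent blocks here until this iteration's child exits. */
        int status = 0;
        if (waitpid(pid, &status, 0) != pid)
            return -1;

        /* At this point no child may be left running. */
        errno = 0;
        if (waitpid(-1, NULL, WNOHANG) != -1 || errno != ECHILD)
            return -1;

        if (WIFEXITED(status) && WEXITSTATUS(status) == (iter & 0x7f))
            reaped++;

        /* Cleanup such as remove_dir() would run here. */
    }
    return reaped;
}
```

Because the parent waits before it forks again, at most one test
process exists at any time, so a constant procid never has two
processes sharing a loop device concurrently.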

> int main(void)
> {
>   syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>   syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
>   syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>   use_temporary_dir();
>   loop();
>   return 0;
> }
>
> and what is loop?
>
> static void loop(void)
> {
>   int iter = 0;
>   for (;; iter++) {
>         ...
>     reset_loop();
>     int pid = fork();
>     if (pid < 0)
>       exit(1);
>     if (pid == 0) {
>       if (chdir(cwdbuf))
>         exit(1);
>       setup_test();
>       execute_one();
>       exit(0);
>     }
>     ...
>     remove_dir(cwdbuf);
>   }
> }
>
> > > Am I correct in understanding that when syzbot is running, it uses the syzbot repro, and not the C repro?
> >
> > It tries both. If first tries to interpret "syzkaller program" as it
> > was done when the bug was triggered during fuzzing. But then it tries
> > to convert it to a corresponding stand-alone C program and confirms
> > that it still triggers the bug. If it provides a C reproducer, it
> > means that it did trigger the bug using this exact C program on a
> > freshly booted kernel (and the provided kernel oops is the
> > corresponding oops obtained on this exact program).
> > If it fails to reproduce the bug with a C reproducer, then it provides
> > only the "syzkaller program" to not mislead developers.
>
> Well, looking at the C reproducer, it doesn't reproduce on upstream,
> and the stack trace makes no sense to me.  The rmdir() executes at the
> end of the test, as part of the cleanup, and looking at the syzkaller
> console, the stack trace involving rmdir happens *early* while test
> threads are still trying to mount the file system.

So my assumption that the 4.19 reproducer for a somewhat similar-looking
bug might also reproduce this upstream bug was false.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] possible deadlock in dquot_commit
  2021-02-10 11:25 possible deadlock in dquot_commit syzbot
  2021-02-11 11:37 ` Jan Kara
@ 2021-08-09 12:54 ` syzbot
  2021-08-09 14:52   ` Jan Kara
  2021-10-07  8:44   ` Jan Kara
       [not found] ` <20210810041100.3271-1-hdanton@sina.com>
  2 siblings, 2 replies; 15+ messages in thread
From: syzbot @ 2021-08-09 12:54 UTC (permalink / raw)
  To: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

syzbot has found a reproducer for the following issue on:

HEAD commit:    66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com

loop0: detected capacity change from 0 to 4096
EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc4-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor211/9242 is trying to acquire lock:
ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474

but task is already holding lock:
ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ei->i_data_sem/2){++++}-{3:3}:
       lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
       down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
       ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561
       ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
       ext4_bread+0x2a/0x170 fs/ext4/inode.c:900
       ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602
       write_blk fs/quota/quota_tree.c:64 [inline]
       get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93
       do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       dq_insert_tree fs/quota/quota_tree.c:357 [inline]
       qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376
       v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358
       dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441
       ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
       dqget+0x999/0xdc0 fs/quota/dquot.c:899
       __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
       ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
       lookup_open fs/namei.c:3228 [inline]
       open_last_lookups fs/namei.c:3298 [inline]
       path_openat+0x13b7/0x36b0 fs/namei.c:3504
       do_filp_open+0x253/0x4d0 fs/namei.c:3534
       do_sys_openat2+0x124/0x460 fs/open.c:1204
       do_sys_open fs/open.c:1220 [inline]
       __do_sys_creat fs/open.c:1294 [inline]
       __se_sys_creat fs/open.c:1288 [inline]
       __x64_sys_creat+0x11f/0x160 fs/open.c:1288
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
       lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
       down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
       v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332
       dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432
       ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
       dqget+0x999/0xdc0 fs/quota/dquot.c:899
       __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
       ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
       lookup_open fs/namei.c:3228 [inline]
       open_last_lookups fs/namei.c:3298 [inline]
       path_openat+0x13b7/0x36b0 fs/namei.c:3504
       do_filp_open+0x253/0x4d0 fs/namei.c:3534
       do_sys_openat2+0x124/0x460 fs/open.c:1204
       do_sys_open fs/open.c:1220 [inline]
       __do_sys_creat fs/open.c:1294 [inline]
       __se_sys_creat fs/open.c:1288 [inline]
       __x64_sys_creat+0x11f/0x160 fs/open.c:1288
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&dquot->dq_lock){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3051 [inline]
       check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174
       validate_chain kernel/locking/lockdep.c:3789 [inline]
       __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015
       lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
       __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959
       __mutex_lock kernel/locking/mutex.c:1104 [inline]
       mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119
       dquot_commit+0x57/0x360 fs/quota/dquot.c:474
       ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245
       mark_dquot_dirty fs/quota/dquot.c:345 [inline]
       mark_all_dquot_dirty fs/quota/dquot.c:383 [inline]
       __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707
       dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
       dquot_alloc_space include/linux/quotaops.h:310 [inline]
       dquot_alloc_block include/linux/quotaops.h:334 [inline]
       ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477
       ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245
       ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
       _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794
       ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077
       ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498
       ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960
       generic_perform_write+0x262/0x580 mm/filemap.c:3656
       ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269
       ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519
       call_write_iter include/linux/fs.h:2114 [inline]
       new_sync_write fs/read_write.c:518 [inline]
       vfs_write+0xa39/0xc90 fs/read_write.c:605
       ksys_write+0x171/0x2a0 fs/read_write.c:658
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_data_sem/2);
                               lock(&s->s_dquot.dqio_sem);
                               lock(&ei->i_data_sem/2);
  lock(&dquot->dq_lock);

 *** DEADLOCK ***

4 locks held by syz-executor211/9242:
 #0: ffff88802ff60460 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0x21b/0xc90 fs/read_write.c:601
 #1: ffff88803a304058 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:774 [inline]
 #1: ffff88803a304058 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: ext4_buffered_write_iter+0xaf/0x590 fs/ext4/file.c:263
 #2: ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631
 #3: ffffffff8c840518 (dquot_srcu){....}-{0:0}, at: rcu_lock_acquire+0x5/0x30 include/linux/rcupdate.h:266

stack backtrace:
CPU: 1 PID: 9242 Comm: syz-executor211 Not tainted 5.14.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1ae/0x29f lib/dump_stack.c:105
 print_circular_bug+0xb17/0xdc0 kernel/locking/lockdep.c:2009
 check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2131
 check_prev_add kernel/locking/lockdep.c:3051 [inline]
 check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174
 validate_chain kernel/locking/lockdep.c:3789 [inline]
 __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015
 lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
 __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959
 __mutex_lock kernel/locking/mutex.c:1104 [inline]
 mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119
 dquot_commit+0x57/0x360 fs/quota/dquot.c:474
 ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245
 mark_dquot_dirty fs/quota/dquot.c:345 [inline]
 mark_all_dquot_dirty fs/quota/dquot.c:383 [inline]
 __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707
 dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
 dquot_alloc_space include/linux/quotaops.h:310 [inline]
 dquot_alloc_block include/linux/quotaops.h:334 [inline]
 ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477
 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245
 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
 _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794
 ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077
 ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498
 ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960
 generic_perform_write+0x262/0x580 mm/filemap.c:3656
 ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269
 ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519
 call_write_iter include/linux/fs.h:2114 [inline]
 new_sync_write fs/read_write.c:518 [inline]
 vfs_write+0xa39/0xc90 fs/read_write.c:605
 ksys_write+0x171/0x2a0 fs/read_write.c:658
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x445219
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd8864cd18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000004885e9 RCX: 0000000000445219
RDX: 000000000d4ba0ff RSI: 00000000200009c0 RDI: 0000000000000003
RBP: 0000000020010500 R08: 00007ffd8864cd40 R09: 00007ffd8864cd40
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000020010000
R13: 0030656c69662f2e R14: 00007ffd8864cd50 R15: 000000000000004d
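
The dependency chain in the report above can be restated as a small
lock-order graph. The following is a minimal sketch of the idea and
not lockdep's implementation: every "A held while acquiring B" pair is
an edge, and a new edge is refused if B can already reach A. The enum
and function names are hypothetical and merely mirror the three lock
classes named in the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three lock classes in the report. */
enum lock_class { DQ_LOCK, DQIO_SEM, I_DATA_SEM_QUOTA, NR_LOCK_CLASSES };

/* edge[a][b] is true once "a held while acquiring b" was recorded. */
static bool edge[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Depth-first search: can `to` be reached from `from` via recorded edges? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCK_CLASSES; next++)
        if (edge[from][next] && !seen[next] && reaches(next, to, seen))
            return true;
    return false;
}

/*
 * Record the dependency held -> wanted. Returns false and records
 * nothing if the new edge would close a cycle.
 */
static bool record_dependency(int held, int wanted)
{
    assert(held >= 0 && held < NR_LOCK_CLASSES);
    assert(wanted >= 0 && wanted < NR_LOCK_CLASSES);

    bool seen[NR_LOCK_CLASSES] = { false };
    if (reaches(wanted, held, seen))
        return false;
    edge[held][wanted] = true;
    return true;
}
```

Recording dq_lock -> dqio_sem and dqio_sem -> i_data_sem/2 (chain
entries #1 and #2) succeeds. The write path then asks for
i_data_sem/2 -> dq_lock (entry #0), which the sketch refuses because
dq_lock already reaches i_data_sem/2.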


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] possible deadlock in dquot_commit
  2021-08-09 12:54 ` [syzbot] " syzbot
@ 2021-08-09 14:52   ` Jan Kara
  2021-08-09 17:43     ` syzbot
  2021-10-07  8:44   ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Kara @ 2021-08-09 14:52 UTC (permalink / raw)
  To: syzbot
  Cc: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

[-- Attachment #1: Type: text/plain, Size: 1817 bytes --]

On Mon 09-08-21 05:54:27, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324
> dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> 
> loop0: detected capacity change from 0 to 4096
> EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.14.0-rc4-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor211/9242 is trying to acquire lock:
> ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474
> 
> but task is already holding lock:
> ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631
> 
> which lock already depends on the new lock.

Hmm, it looks like a hidden quota file got linked from the directory
hierarchy. The attached patch should fix this.

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 902e7f373fff2476b53824264c12e4e76c7ec02a

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-ext4-Make-sure-quota-files-are-not-grabbed-accidenta.patch --]
[-- Type: text/x-patch, Size: 1798 bytes --]

From 6efc0878a8c8f498eb138bdb57fad8a6c85d115c Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 9 Aug 2021 16:09:27 +0200
Subject: [PATCH] ext4: Make sure quota files are not grabbed accidentally

If an ext4 filesystem is corrupted so that quota files are linked from
the directory hierarchy, bad things can happen. E.g. quota files can get
corrupted or deleted. Make sure we are not grabbing quota file inodes
when we expect normal inodes.

Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d8de607849df..2c33c795c4a7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4603,6 +4603,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 	struct ext4_iloc iloc;
 	struct ext4_inode *raw_inode;
 	struct ext4_inode_info *ei;
+	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
 	struct inode *inode;
 	journal_t *journal = EXT4_SB(sb)->s_journal;
 	long ret;
@@ -4613,9 +4614,12 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 	projid_t i_projid;
 
 	if ((!(flags & EXT4_IGET_SPECIAL) &&
-	     (ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)) ||
+	     ((ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) ||
+	      ino == le32_to_cpu(es->s_usr_quota_inum) ||
+	      ino == le32_to_cpu(es->s_grp_quota_inum) ||
+	      ino == le32_to_cpu(es->s_prj_quota_inum))) ||
 	    (ino < EXT4_ROOT_INO) ||
-	    (ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count))) {
+	    (ino > le32_to_cpu(es->s_inodes_count))) {
 		if (flags & EXT4_IGET_HANDLE)
 			return ERR_PTR(-ESTALE);
 		__ext4_error(sb, function, line, false, EFSCORRUPTED, 0,
-- 
2.26.2


* Re: [syzbot] possible deadlock in dquot_commit
  2021-08-09 14:52   ` Jan Kara
@ 2021-08-09 17:43     ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2021-08-09 17:43 UTC (permalink / raw)
  To: dvyukov, jack, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in dquot_commit

EXT4-fs warning (device loop4): ext4_enable_quotas:6478: Failed to enable quota tracking (type=1, err=-22). Please run e2fsck to fix.
EXT4-fs (loop4): mount failed
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc4-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.4/28771 is trying to acquire lock:
ffff88803941cea8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474

but task is already holding lock:
ffff8880463d2a58 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ei->i_data_sem/2){++++}-{3:3}:
       lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
       down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
       ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561
       ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
       ext4_bread+0x2a/0x170 fs/ext4/inode.c:900
       ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602
       write_blk fs/quota/quota_tree.c:64 [inline]
       get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93
       do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
       dq_insert_tree fs/quota/quota_tree.c:357 [inline]
       qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376
       v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358
       dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441
       ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
       dqget+0x999/0xdc0 fs/quota/dquot.c:899
       __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
       ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
       lookup_open fs/namei.c:3228 [inline]
       open_last_lookups fs/namei.c:3298 [inline]
       path_openat+0x13b7/0x36b0 fs/namei.c:3504
       do_filp_open+0x253/0x4d0 fs/namei.c:3534
       do_sys_openat2+0x124/0x460 fs/open.c:1204
       do_sys_open fs/open.c:1220 [inline]
       __do_sys_creat fs/open.c:1294 [inline]
       __se_sys_creat fs/open.c:1288 [inline]
       __x64_sys_creat+0x11f/0x160 fs/open.c:1288
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
       lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
       down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
       v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332
       dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432
       ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
       dqget+0x999/0xdc0 fs/quota/dquot.c:899
       __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
       ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
       lookup_open fs/namei.c:3228 [inline]
       open_last_lookups fs/namei.c:3298 [inline]
       path_openat+0x13b7/0x36b0 fs/namei.c:3504
       do_filp_open+0x253/0x4d0 fs/namei.c:3534
       do_sys_openat2+0x124/0x460 fs/open.c:1204
       do_sys_open fs/open.c:1220 [inline]
       __do_sys_creat fs/open.c:1294 [inline]
       __se_sys_creat fs/open.c:1288 [inline]
       __x64_sys_creat+0x11f/0x160 fs/open.c:1288
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&dquot->dq_lock){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3051 [inline]
       check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174
       __lo


Tested on:

commit:         902e7f37 Merge tag 'net-5.14-rc5' of git://git.kernel...
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=125d6b79300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1
patch:          https://syzkaller.appspot.com/x/patch.diff?x=173c0ee9300000


* Re: [syzbot] possible deadlock in dquot_commit
       [not found] ` <20210810041100.3271-1-hdanton@sina.com>
@ 2021-08-10  9:21   ` Jan Kara
       [not found]   ` <20210811041232.2449-1-hdanton@sina.com>
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2021-08-10  9:21 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, dvyukov, jack, jack, linux-kernel, syzkaller-bugs,
	syzkaller, tytso

On Tue 10-08-21 12:11:00, Hillf Danton wrote:
> On Mon, 09 Aug 2021 05:54:27 -0700
> > syzbot has found a reproducer for the following issue on:
> > 
> > HEAD commit:    66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> > compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> > 
> > loop0: detected capacity change from 0 to 4096
> > EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.14.0-rc4-syzkaller #0 Not tainted
> > ------------------------------------------------------
> > syz-executor211/9242 is trying to acquire lock:
> > ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474
> > 
> > but task is already holding lock:
> > ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631
> > 
> > which lock already depends on the new lock.
> > 
> > 
> > the existing dependency chain (in reverse order) is:
> > 
> > -> #2 (&ei->i_data_sem/2){++++}-{3:3}:
> >        lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
> >        down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
> >        ext4_map_blocks+0x266/0x1cb0 fs/ext4/inode.c:561
> >        ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
> >        ext4_bread+0x2a/0x170 fs/ext4/inode.c:900
> >        ext4_quota_write+0x2c7/0x5b0 fs/ext4/super.c:6602
> >        write_blk fs/quota/quota_tree.c:64 [inline]
> >        get_free_dqblk+0x33a/0x660 fs/quota/quota_tree.c:93
> >        do_insert_tree+0x24c/0x1d30 fs/quota/quota_tree.c:300
> >        do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
> >        do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
> >        do_insert_tree+0x659/0x1d30 fs/quota/quota_tree.c:331
> >        dq_insert_tree fs/quota/quota_tree.c:357 [inline]
> >        qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:376
> >        v2_write_dquot+0x110/0x1a0 fs/quota/quota_v2.c:358
> >        dquot_acquire+0x2d7/0x5b0 fs/quota/dquot.c:441
> 
> Mark1, see below.
> 
> >        ext4_acquire_dquot+0x2e0/0x400 fs/ext4/super.c:6261
> >        dqget+0x999/0xdc0 fs/quota/dquot.c:899
> >        __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
> >        ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
> >        lookup_open fs/namei.c:3228 [inline]
> >        open_last_lookups fs/namei.c:3298 [inline]
> >        path_openat+0x13b7/0x36b0 fs/namei.c:3504
> >        do_filp_open+0x253/0x4d0 fs/namei.c:3534
> >        do_sys_openat2+0x124/0x460 fs/open.c:1204
> >        do_sys_open fs/open.c:1220 [inline]
> >        __do_sys_creat fs/open.c:1294 [inline]
> >        __se_sys_creat fs/open.c:1288 [inline]
> >        __x64_sys_creat+0x11f/0x160 fs/open.c:1288
> >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >        do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >        entry_SYSCALL_64_after_hwframe+0x44/0xae
> > 
> > -> #1 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> >        lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
> >        down_read+0x3b/0x50 kernel/locking/rwsem.c:1353
> >        v2_read_dquot+0x4a/0x100 fs/quota/quota_v2.c:332
> >        dquot_acquire+0x144/0x5b0 fs/quota/dquot.c:432
> 
> What boggles mind is both this line and the above line at Mark1 are under
> 
> 430	mutex_lock(&dquot->dq_lock);
> 
> Is it likely?

I'm not quite sure what you are asking about but yes, dquot_acquire() grabs
dquot->dq_lock, then e.g. v2_write_dquot() acquires dqio_sem, then
ext4_map_blocks() acquires i_data_sem/2 (special lock subclass for quota
files). What is unexpected is the #0 trace where i_data_sem/2 is acquired
by ext4_map_blocks() called from ext4_write_begin(). That shows that
normal write(2) call was able to operate on quota file which is certainly
wrong. My patch closed one path how this could happen and I'm puzzled how
else this could happen. I'll try to reproduce the issue (I've already tried
but so far failed) and see if I can find out more.

									Honza

> >        dqget+0x999/0xdc0 fs/quota/dquot.c:899
> >        __dquot_initialize+0x291/0xd40 fs/quota/dquot.c:1477
> >        ext4_create+0xb0/0x550 fs/ext4/namei.c:2731
> >        lookup_open fs/namei.c:3228 [inline]
> >        open_last_lookups fs/namei.c:3298 [inline]
> >        path_openat+0x13b7/0x36b0 fs/namei.c:3504
> >        do_filp_open+0x253/0x4d0 fs/namei.c:3534
> >        do_sys_openat2+0x124/0x460 fs/open.c:1204
> >        do_sys_open fs/open.c:1220 [inline]
> >        __do_sys_creat fs/open.c:1294 [inline]
> >        __se_sys_creat fs/open.c:1288 [inline]
> >        __x64_sys_creat+0x11f/0x160 fs/open.c:1288
> >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >        do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >        entry_SYSCALL_64_after_hwframe+0x44/0xae
> > 
> > -> #0 (&dquot->dq_lock){+.+.}-{3:3}:
> >        check_prev_add kernel/locking/lockdep.c:3051 [inline]
> >        check_prevs_add+0x4f9/0x5b30 kernel/locking/lockdep.c:3174
> >        validate_chain kernel/locking/lockdep.c:3789 [inline]
> >        __lock_acquire+0x4476/0x6100 kernel/locking/lockdep.c:5015
> >        lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5625
> >        __mutex_lock_common+0x1ad/0x3770 kernel/locking/mutex.c:959
> >        __mutex_lock kernel/locking/mutex.c:1104 [inline]
> >        mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:1119
> >        dquot_commit+0x57/0x360 fs/quota/dquot.c:474
> >        ext4_write_dquot+0x1e4/0x2b0 fs/ext4/super.c:6245
> >        mark_dquot_dirty fs/quota/dquot.c:345 [inline]
> >        mark_all_dquot_dirty fs/quota/dquot.c:383 [inline]
> >        __dquot_alloc_space+0xa18/0x1020 fs/quota/dquot.c:1707
> >        dquot_alloc_space_nodirty include/linux/quotaops.h:297 [inline]
> >        dquot_alloc_space include/linux/quotaops.h:310 [inline]
> >        dquot_alloc_block include/linux/quotaops.h:334 [inline]
> >        ext4_mb_new_blocks+0xe85/0x2470 fs/ext4/mballoc.c:5477
> >        ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4245
> >        ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
> >        _ext4_get_block+0x24b/0x710 fs/ext4/inode.c:794
> >        ext4_block_write_begin+0x63a/0x1250 fs/ext4/inode.c:1077
> >        ext4_write_begin+0x5cc/0x1350 fs/ext4/ext4_jbd2.h:498
> >        ext4_da_write_begin+0x384/0x10c0 fs/ext4/inode.c:2960
> >        generic_perform_write+0x262/0x580 mm/filemap.c:3656
> >        ext4_buffered_write_iter+0x41c/0x590 fs/ext4/file.c:269
> >        ext4_file_write_iter+0x8f7/0x1b90 fs/ext4/file.c:519
> >        call_write_iter include/linux/fs.h:2114 [inline]
> >        new_sync_write fs/read_write.c:518 [inline]
> >        vfs_write+0xa39/0xc90 fs/read_write.c:605
> >        ksys_write+0x171/0x2a0 fs/read_write.c:658
> >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >        do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >        entry_SYSCALL_64_after_hwframe+0x44/0xae
> > 
> > other info that might help us debug this:
> > 
> > Chain exists of:
> >   &dquot->dq_lock --> &s->s_dquot.dqio_sem --> &ei->i_data_sem/2
> > 
> >  Possible unsafe locking scenario:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&ei->i_data_sem/2);
> >                                lock(&s->s_dquot.dqio_sem);
> >                                lock(&ei->i_data_sem/2);
> >   lock(&dquot->dq_lock);
> > 
> >  *** DEADLOCK ***
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
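
The ordering discussed in this message (dq_lock, then dqio_sem, then
i_data_sem/2) and the inverted acquisition in the #0 trace can be modelled as a
small dependency graph. The sketch below is a toy illustration of the idea
behind the report, not how lockdep is implemented: every name in it (struct
lock_graph, record_dependency(), the enum values) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical identifiers for the three lock classes named in the report. */
enum { DQ_LOCK, DQIO_SEM, I_DATA_SEM_2, NLOCKS };

/* edge[a][b] is true when b has been acquired while a was held. */
struct lock_graph {
	bool edge[NLOCKS][NLOCKS];
};

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NLOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, without recording the edge, when 'held' is already
 * reachable from 'acquired', i.e. when the new edge would close a cycle.
 */
static bool record_dependency(struct lock_graph *g, int held, int acquired)
{
	bool seen[NLOCKS];

	assert(held >= 0 && held < NLOCKS);
	assert(acquired >= 0 && acquired < NLOCKS);

	memset(seen, 0, sizeof(seen));
	if (reachable(g, acquired, held, seen))
		return false;
	g->edge[held][acquired] = true;
	return true;
}
```

Recording DQ_LOCK -> DQIO_SEM and DQIO_SEM -> I_DATA_SEM_2 succeeds; a later
attempt to record I_DATA_SEM_2 -> DQ_LOCK is refused because it would complete
the circle shown in the "Chain exists of" line.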

* Re: [syzbot] possible deadlock in dquot_commit
       [not found]   ` <20210811041232.2449-1-hdanton@sina.com>
@ 2021-08-12 13:55     ` Jan Kara
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kara @ 2021-08-12 13:55 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Jan Kara, syzbot, dvyukov, linux-kernel, syzkaller-bugs,
	syzkaller, tytso

On Wed 11-08-21 12:12:32, Hillf Danton wrote:
> On Tue, 10 Aug 2021 11:21:42 +0200 Jan Kara wrote:
> >
> >I'm not quite sure what you are asking about but yes, dquot_acquire() grabs
> 
> It is hard to understand the rooms in mutex for two lock owners.
> 
> >dquot->dq_lock, then e.g. v2_write_dquot() acquires dqio_sem, then
> >ext4_map_blocks() acquires i_data_sem/2 (special lock subclass for quota
> >files).
> >
> >What is unexpected is the #0 trace where i_data_sem/2 is acquired
> >by ext4_map_blocks() called from ext4_write_begin(). That shows that
> >normal write(2) call was able to operate on quota file which is certainly
> >wrong.
> 
> The change below can test your theory.
> >
> >My patch closed one path how this could happen and I'm puzzled how
> >else this could happen. I'll try to reproduce the issue (I've already tried
> >but so far failed) and see if I can find out more.
> 
> Actually there is one check for quota file near 100 lines of code lower,
> and copy it to just before taking i_data_sem to avoid writing the file of
> wrong type.
> 
> Now only for thoughts.
> 
> +++ x/fs/ext4/inode.c
> @@ -616,6 +616,8 @@ found:
>  		if (!(flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN))
>  			return retval;
>  
> +	if (ext4_is_quota_file(inode))
> +		return -EINVAL;
>  	/*
>  	 * Here we clear m_flags because after allocating an new extent,
>  	 * it will be set again.

This would be certainly wrong. ext4_map_blocks() is used for accessing and
allocating blocks for quota file. It is ext4_write_begin() that should not
be called for the quota file. I've run the reproducer here for a couple of
hours but the problem didn't trigger for me. Strange.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [syzbot] possible deadlock in dquot_commit
  2021-08-09 12:54 ` [syzbot] " syzbot
  2021-08-09 14:52   ` Jan Kara
@ 2021-10-07  8:44   ` Jan Kara
  2021-10-07 13:50     ` syzbot
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Kara @ 2021-10-07  8:44 UTC (permalink / raw)
  To: syzbot; +Cc: dvyukov, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

[-- Attachment #1: Type: text/plain, Size: 2032 bytes --]

On Mon 09-08-21 05:54:27, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    66745863ecde Merge tag 'char-misc-5.14-rc5' of git://git.k..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13edca6e300000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=702bfdfbf389c324
> dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
> compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.1
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15aeba6e300000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17a609e6300000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
> 
> loop0: detected capacity change from 0 to 4096
> EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue. Quota mode: writeback.
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.14.0-rc4-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor211/9242 is trying to acquire lock:
> ffff88803a37ece8 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_commit+0x57/0x360 fs/quota/dquot.c:474
> 
> but task is already holding lock:
> ffff88803a303e48 (&ei->i_data_sem/2){++++}-{3:3}, at: ext4_map_blocks+0x9e5/0x1cb0 fs/ext4/inode.c:631
> 
> which lock already depends on the new lock.

I've got back to this and I have one more idea what could be causing this
and why I'm not able to reproduce. I think we free some inode with
i_data_sem locking class set to 2 and when the inode gets reused (which
doesn't happen in my VM for some reason) for a normal file, problems trigger.
Let's try attached patch.

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 60a9483534ed0d99090a2ee1d4bb0b8179195f51

									Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-ext4-Make-sure-to-reset-inode-lockdep-class-when-quo.patch --]
[-- Type: text/x-patch, Size: 1440 bytes --]

From 2c0f00967aecfbc03216feb5a6d7286346268932 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 7 Oct 2021 10:30:46 +0200
Subject: [PATCH] ext4: Make sure to reset inode lockdep class when quota
 enabling fails

When we succeed in enabling some quota type but fail to enable another
one with quota feature, we correctly disable all enabled quota types.
However we forget to reset i_data_sem lockdep class. When the inode gets
freed and reused, it will inherit this lockdep class (i_data_sem is
initialized only when a slab is created) and thus eventually lockdep
barfs about possible deadlocks.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/super.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index fbe9cae63786..70b5fcbd351a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6355,8 +6355,19 @@ int ext4_enable_quotas(struct super_block *sb)
 					"Failed to enable quota tracking "
 					"(type=%d, err=%d). Please run "
 					"e2fsck to fix.", type, err);
-				for (type--; type >= 0; type--)
+				for (type--; type >= 0; type--) {
+					struct inode *inode;
+
+					inode = sb_dqopt(sb)->files[type];
+					if (inode)
+						inode = igrab(inode);
 					dquot_quota_off(sb, type);
+					if (inode) {
+						lockdep_set_quota_inode(inode,
+							I_DATA_SEM_NORMAL);
+						iput(inode);
+					}
+				}
 
 				return err;
 			}
-- 
2.26.2
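
The commit message above relies on the fact that a constructor-initialized
field survives a free/reuse cycle. A minimal user-space sketch of that effect
follows, assuming a one-slot cache whose constructor runs only once. All names
(struct sketch_cache, sketch_alloc(), sketch_free(), SKETCH_SEM_*) are
hypothetical, and sem_class merely stands in for the lockdep subclass of
i_data_sem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the normal and quota lock subclasses. */
enum { SKETCH_SEM_NORMAL = 0, SKETCH_SEM_QUOTA = 2 };

struct sketch_inode {
	int sem_class; /* stands in for the lockdep subclass of i_data_sem */
	int in_use;
};

/* A one-slot "cache": the constructor step runs once per slot, not per alloc. */
struct sketch_cache {
	struct sketch_inode slot;
	int constructed;
};

static struct sketch_inode *sketch_alloc(struct sketch_cache *c)
{
	assert(c != NULL);

	if (c->slot.in_use)
		return NULL;
	if (!c->constructed) {
		/* Constructor: only executed the first time the slot is used. */
		c->slot.sem_class = SKETCH_SEM_NORMAL;
		c->constructed = 1;
	}
	c->slot.in_use = 1;
	return &c->slot;
}

static void sketch_free(struct sketch_inode *inode)
{
	assert(inode != NULL);
	/* sem_class is deliberately left untouched, as in the scenario described. */
	inode->in_use = 0;
}
```

If the class is set to SKETCH_SEM_QUOTA and the object is freed without
resetting it, the next sketch_alloc() hands back an object that still carries
the quota class. Resetting the class before the free, which is what the patch
does on the error path, avoids that.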


* Re: [syzbot] possible deadlock in dquot_commit
  2021-10-07  8:44   ` Jan Kara
@ 2021-10-07 13:50     ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2021-10-07 13:50 UTC (permalink / raw)
  To: dvyukov, jack, linux-kernel, syzkaller-bugs, syzkaller, tytso

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com

Tested on:

commit:         60a94835 Merge tag 'warning-fixes-20211005' of git://g..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=74f6ab826fb913cd
dashboard link: https://syzkaller.appspot.com/bug?extid=3b6f9218b1301ddda3e2
compiler:       Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=162860d0b00000

Note: testing is done by a robot and is best-effort only.

end of thread, other threads:[~2021-10-07 13:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
2021-02-10 11:25 possible deadlock in dquot_commit syzbot
2021-02-11 11:37 ` Jan Kara
2021-02-11 11:47   ` Dmitry Vyukov
2021-02-11 15:47     ` Jan Kara
2021-02-11 21:46     ` Theodore Ts'o
2021-02-12 11:01       ` Dmitry Vyukov
2021-02-12 16:10         ` Theodore Ts'o
2021-02-15 12:50           ` Dmitry Vyukov
2021-08-09 12:54 ` [syzbot] " syzbot
2021-08-09 14:52   ` Jan Kara
2021-08-09 17:43     ` syzbot
2021-10-07  8:44   ` Jan Kara
2021-10-07 13:50     ` syzbot
     [not found] ` <20210810041100.3271-1-hdanton@sina.com>
2021-08-10  9:21   ` Jan Kara
     [not found]   ` <20210811041232.2449-1-hdanton@sina.com>
2021-08-12 13:55     ` Jan Kara
