linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
@ 2020-10-09 17:29 Ritesh Harjani
  2020-10-09 20:12 ` Eric Whitney
  2020-10-12 11:03 ` Jan Kara
  0 siblings, 2 replies; 6+ messages in thread
From: Ritesh Harjani @ 2020-10-09 17:29 UTC (permalink / raw)
  To: linux-ext4; +Cc: Jan Kara, Eric Whitney, Theodore Ts'o

Hello All,

While running generic/219 fstests on a 1k blocksize on x86 box, I see
below dmesg warning msg and generic/219 fails. I haven't yet analyzed
it, but I remember I have seen such warnings before as well in my testing.
I was wondering if others have also seen it in their testing or not and
if this is any known issue?


Here it goes, 219.dmesg file.


[14459.933253] run fstests generic/219 at 2020-10-08 20:03:27
[14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. 
Opts: acl,user_xattr,block_validity,usrquota,grpquota
[14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null)
[14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null)
[14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. 
Opts: usrquota
[14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null)
[14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null)

[14464.149407] ======================================================
[14464.149407] WARNING: possible circular locking dependency detected
[14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted
[14464.149407] ------------------------------------------------------
[14464.149407] chown/23358 is trying to acquire lock:
[14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: 
ext4_map_blocks+0xd1/0x640
[14464.149407]
                but task is already holding lock:
[14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: 
v2_write_dquot+0x2d/0xb0
[14464.149407]
                which lock already depends on the new lock.

[14464.149407]
                the existing dependency chain (in reverse order) is:
[14464.149407]
                -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}:
[14464.149407]        down_read+0x41/0x200
[14464.149407]        v2_read_dquot+0x23/0x60
[14464.149407]        dquot_acquire+0x4c/0x100
[14464.149407]        ext4_acquire_dquot+0x72/0xc0
[14464.149407]        dqget+0x24a/0x4a0
[14464.149407]        __dquot_initialize+0x1f0/0x330
[14464.149407]        ext4_create+0x38/0x190
[14464.149407]        lookup_open+0x4cd/0x630
[14464.149407]        path_openat+0x2c4/0xa10
[14464.149407]        do_filp_open+0x91/0x100
[14464.149407]        do_sys_openat2+0x20d/0x2d0
[14464.149407]        do_sys_open+0x44/0x80
[14464.149407]        do_syscall_64+0x33/0x40
[14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[14464.149407]
                -> #1 (&dquot->dq_lock){+.+.}-{3:3}:
[14464.149407]        __mutex_lock+0xaa/0x9c0
[14464.149407]        dquot_commit+0x23/0xf0
[14464.149407]        ext4_write_dquot+0x78/0xb0
[14464.149407]        __dquot_alloc_space+0x151/0x310
[14464.149407]        ext4_mb_new_blocks+0x1f6/0x12e0
[14464.149407]        ext4_ext_map_blocks+0x9c3/0x1470
[14464.149407]        ext4_map_blocks+0xf4/0x640
[14464.149407]        _ext4_get_block+0x90/0x110
[14464.149407]        ext4_block_write_begin+0x15f/0x5a0
[14464.149407]        ext4_write_begin+0x267/0x5b0
[14464.149407]        generic_perform_write+0xc2/0x1e0
[14464.149407]        ext4_buffered_write_iter+0x8b/0x130
[14464.149407]        ext4_file_write_iter+0x6c/0x6d0
[14464.149407]        new_sync_write+0x122/0x1b0
[14464.149407]        vfs_write+0x1ca/0x230
[14464.149407]        ksys_pwrite64+0x68/0xa0
[14464.149407]        do_syscall_64+0x33/0x40
[14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[14464.149407]
                -> #0 (&ei->i_data_sem){++++}-{3:3}:
[14464.149407]        __lock_acquire+0x147b/0x2830
[14464.149407]        lock_acquire+0xc6/0x3a0
[14464.149407]        down_write+0x40/0x110
[14464.149407]        ext4_map_blocks+0xd1/0x640
[14464.149407]        ext4_getblk+0x54/0x1c0
[14464.149407]        ext4_bread+0x1f/0xd0
[14464.149407]        ext4_quota_write+0xbf/0x280
[14464.149407]        write_blk+0x35/0x70
[14464.149407]        get_free_dqblk+0x42/0xa0
[14464.149407]        do_insert_tree+0x1d6/0x480
[14464.149407]        do_insert_tree+0x464/0x480
[14464.149407]        do_insert_tree+0x218/0x480
[14464.149407]        do_insert_tree+0x218/0x480
[14464.149407]        qtree_write_dquot+0x79/0x1b0
[14464.149407]        v2_write_dquot+0x52/0xb0
[14464.149407]        dquot_acquire+0x8e/0x100
[14464.149407]        ext4_acquire_dquot+0x72/0xc0
[14464.149407]        dqget+0x24a/0x4a0
[14464.149407]        dquot_transfer+0xfe/0x140
[14464.149407]        ext4_setattr+0x134/0x9c0
[14464.149407]        notify_change+0x34b/0x490
[14464.149407]        chown_common+0x96/0x150
[14464.149407]        do_fchownat+0x8d/0xe0
[14464.149407]        __x64_sys_fchownat+0x21/0x30
[14464.149407]        do_syscall_64+0x33/0x40
[14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[14464.149407]
                other info that might help us debug this:

[14464.149407] Chain exists of:
                  &ei->i_data_sem --> &dquot->dq_lock --> 
&s->s_dquot.dqio_sem

[14464.149407]  Possible unsafe locking scenario:

[14464.149407]        CPU0                    CPU1
[14464.149407]        ----                    ----
[14464.149407]   lock(&s->s_dquot.dqio_sem);
[14464.149407]                                lock(&dquot->dq_lock);
[14464.149407]                                lock(&s->s_dquot.dqio_sem);
[14464.149407]   lock(&ei->i_data_sem);
[14464.149407]
                 *** DEADLOCK ***

[14464.149407] 6 locks held by chown/23358:
[14464.149407]  #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: 
mnt_want_write+0x20/0x50
[14464.149407]  #1: ffff8b9b8ff64bc8 
(&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150
[14464.149407]  #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: 
start_this_handle+0x1ad/0x690
[14464.149407]  #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: 
ext4_setattr+0x129/0x9c0
[14464.149407]  #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: 
dquot_acquire+0x23/0x100
[14464.149407]  #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, 
at: v2_write_dquot+0x2d/0xb0
[14464.149407]
                stack backtrace:
[14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted 
5.9.0-rc8-next-20201008 #16
[14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.13.0-1ubuntu1 04/01/2014
[14464.149407] Call Trace:
[14464.149407]  dump_stack+0x77/0x97
[14464.149407]  check_noncircular+0x132/0x150
[14464.149407]  ? sched_clock_local+0x12/0x80
[14464.149407]  __lock_acquire+0x147b/0x2830
[14464.149407]  lock_acquire+0xc6/0x3a0
[14464.149407]  ? ext4_map_blocks+0xd1/0x640
[14464.149407]  down_write+0x40/0x110
[14464.149407]  ? ext4_map_blocks+0xd1/0x640
[14464.149407]  ext4_map_blocks+0xd1/0x640
[14464.149407]  ? sched_clock_local+0x12/0x80
[14464.149407]  ext4_getblk+0x54/0x1c0
[14464.149407]  ext4_bread+0x1f/0xd0
[14464.149407]  ext4_quota_write+0xbf/0x280
[14464.149407]  write_blk+0x35/0x70
[14464.149407]  get_free_dqblk+0x42/0xa0
[14464.149407]  do_insert_tree+0x1d6/0x480
[14464.149407]  ? ext4_quota_read+0x9d/0x110
[14464.149407]  do_insert_tree+0x464/0x480
[14464.149407]  ? ext4_quota_read+0x9d/0x110
[14464.149407]  do_insert_tree+0x218/0x480
[14464.149407]  ? ext4_quota_read+0x9d/0x110
[14464.149407]  do_insert_tree+0x218/0x480
[14464.149407]  ? __kmalloc+0x319/0x350
[14464.149407]  qtree_write_dquot+0x79/0x1b0
[14464.149407]  v2_write_dquot+0x52/0xb0
[14464.149407]  dquot_acquire+0x8e/0x100
[14464.149407]  ext4_acquire_dquot+0x72/0xc0
[14464.149407]  dqget+0x24a/0x4a0
[14464.149407]  dquot_transfer+0xfe/0x140
[14464.149407]  ext4_setattr+0x134/0x9c0
[14464.149407]  notify_change+0x34b/0x490
[14464.149407]  ? chown_common+0x96/0x150
[14464.149407]  chown_common+0x96/0x150
[14464.149407]  ? preempt_count_add+0x49/0xa0
[14464.149407]  do_fchownat+0x8d/0xe0
[14464.149407]  __x64_sys_fchownat+0x21/0x30
[14464.149407]  do_syscall_64+0x33/0x40
[14464.149407]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[14464.149407] RIP: 0033:0x7efd3d2666ca
[14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48
[14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: 
0000000000000104
[14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: 
00007efd3d2666ca
[14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: 
00000000ffffff9c
[14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: 
00000000ffffffff
[14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: 
00005598d654e1f0
[14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: 
00000000ffffffff
[14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. 
Opts: grpquota
[14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null)
[14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
  2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani
@ 2020-10-09 20:12 ` Eric Whitney
  2020-10-09 20:24   ` Darrick J. Wong
  2020-10-12 11:03 ` Jan Kara
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Whitney @ 2020-10-09 20:12 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o

* Ritesh Harjani <riteshh@linux.ibm.com>:
> Hello All,
> 
> While running generic/219 fstests on a 1k blocksize on x86 box, I see
> below dmesg warning msg and generic/219 fails. I haven't yet analyzed
> it, but I remember I have seen such warnings before as well in my testing.
> I was wondering if others have also seen it in their testing or not and
> if this is any known issue?
> 
> 
> Here it goes, 219.dmesg file.
> 
> 
> [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27
> [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: acl,user_xattr,block_validity,usrquota,grpquota
> [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: usrquota
> [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null)
> 
> [14464.149407] ======================================================
> [14464.149407] WARNING: possible circular locking dependency detected
> [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted
> [14464.149407] ------------------------------------------------------
> [14464.149407] chown/23358 is trying to acquire lock:
> [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at:
> ext4_map_blocks+0xd1/0x640
> [14464.149407]
>                but task is already holding lock:
> [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> v2_write_dquot+0x2d/0xb0
> [14464.149407]
>                which lock already depends on the new lock.
> 
> [14464.149407]
>                the existing dependency chain (in reverse order) is:
> [14464.149407]
>                -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> [14464.149407]        down_read+0x41/0x200
> [14464.149407]        v2_read_dquot+0x23/0x60
> [14464.149407]        dquot_acquire+0x4c/0x100
> [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> [14464.149407]        dqget+0x24a/0x4a0
> [14464.149407]        __dquot_initialize+0x1f0/0x330
> [14464.149407]        ext4_create+0x38/0x190
> [14464.149407]        lookup_open+0x4cd/0x630
> [14464.149407]        path_openat+0x2c4/0xa10
> [14464.149407]        do_filp_open+0x91/0x100
> [14464.149407]        do_sys_openat2+0x20d/0x2d0
> [14464.149407]        do_sys_open+0x44/0x80
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                -> #1 (&dquot->dq_lock){+.+.}-{3:3}:
> [14464.149407]        __mutex_lock+0xaa/0x9c0
> [14464.149407]        dquot_commit+0x23/0xf0
> [14464.149407]        ext4_write_dquot+0x78/0xb0
> [14464.149407]        __dquot_alloc_space+0x151/0x310
> [14464.149407]        ext4_mb_new_blocks+0x1f6/0x12e0
> [14464.149407]        ext4_ext_map_blocks+0x9c3/0x1470
> [14464.149407]        ext4_map_blocks+0xf4/0x640
> [14464.149407]        _ext4_get_block+0x90/0x110
> [14464.149407]        ext4_block_write_begin+0x15f/0x5a0
> [14464.149407]        ext4_write_begin+0x267/0x5b0
> [14464.149407]        generic_perform_write+0xc2/0x1e0
> [14464.149407]        ext4_buffered_write_iter+0x8b/0x130
> [14464.149407]        ext4_file_write_iter+0x6c/0x6d0
> [14464.149407]        new_sync_write+0x122/0x1b0
> [14464.149407]        vfs_write+0x1ca/0x230
> [14464.149407]        ksys_pwrite64+0x68/0xa0
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                -> #0 (&ei->i_data_sem){++++}-{3:3}:
> [14464.149407]        __lock_acquire+0x147b/0x2830
> [14464.149407]        lock_acquire+0xc6/0x3a0
> [14464.149407]        down_write+0x40/0x110
> [14464.149407]        ext4_map_blocks+0xd1/0x640
> [14464.149407]        ext4_getblk+0x54/0x1c0
> [14464.149407]        ext4_bread+0x1f/0xd0
> [14464.149407]        ext4_quota_write+0xbf/0x280
> [14464.149407]        write_blk+0x35/0x70
> [14464.149407]        get_free_dqblk+0x42/0xa0
> [14464.149407]        do_insert_tree+0x1d6/0x480
> [14464.149407]        do_insert_tree+0x464/0x480
> [14464.149407]        do_insert_tree+0x218/0x480
> [14464.149407]        do_insert_tree+0x218/0x480
> [14464.149407]        qtree_write_dquot+0x79/0x1b0
> [14464.149407]        v2_write_dquot+0x52/0xb0
> [14464.149407]        dquot_acquire+0x8e/0x100
> [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> [14464.149407]        dqget+0x24a/0x4a0
> [14464.149407]        dquot_transfer+0xfe/0x140
> [14464.149407]        ext4_setattr+0x134/0x9c0
> [14464.149407]        notify_change+0x34b/0x490
> [14464.149407]        chown_common+0x96/0x150
> [14464.149407]        do_fchownat+0x8d/0xe0
> [14464.149407]        __x64_sys_fchownat+0x21/0x30
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                other info that might help us debug this:
> 
> [14464.149407] Chain exists of:
>                  &ei->i_data_sem --> &dquot->dq_lock -->
> &s->s_dquot.dqio_sem
> 
> [14464.149407]  Possible unsafe locking scenario:
> 
> [14464.149407]        CPU0                    CPU1
> [14464.149407]        ----                    ----
> [14464.149407]   lock(&s->s_dquot.dqio_sem);
> [14464.149407]                                lock(&dquot->dq_lock);
> [14464.149407]                                lock(&s->s_dquot.dqio_sem);
> [14464.149407]   lock(&ei->i_data_sem);
> [14464.149407]
>                 *** DEADLOCK ***
> 
> [14464.149407] 6 locks held by chown/23358:
> [14464.149407]  #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at:
> mnt_want_write+0x20/0x50
> [14464.149407]  #1: ffff8b9b8ff64bc8
> (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150
> [14464.149407]  #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at:
> start_this_handle+0x1ad/0x690
> [14464.149407]  #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_setattr+0x129/0x9c0
> [14464.149407]  #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at:
> dquot_acquire+0x23/0x100
> [14464.149407]  #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> v2_write_dquot+0x2d/0xb0
> [14464.149407]
>                stack backtrace:
> [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted
> 5.9.0-rc8-next-20201008 #16
> [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1 04/01/2014
> [14464.149407] Call Trace:
> [14464.149407]  dump_stack+0x77/0x97
> [14464.149407]  check_noncircular+0x132/0x150
> [14464.149407]  ? sched_clock_local+0x12/0x80
> [14464.149407]  __lock_acquire+0x147b/0x2830
> [14464.149407]  lock_acquire+0xc6/0x3a0
> [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> [14464.149407]  down_write+0x40/0x110
> [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> [14464.149407]  ext4_map_blocks+0xd1/0x640
> [14464.149407]  ? sched_clock_local+0x12/0x80
> [14464.149407]  ext4_getblk+0x54/0x1c0
> [14464.149407]  ext4_bread+0x1f/0xd0
> [14464.149407]  ext4_quota_write+0xbf/0x280
> [14464.149407]  write_blk+0x35/0x70
> [14464.149407]  get_free_dqblk+0x42/0xa0
> [14464.149407]  do_insert_tree+0x1d6/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x464/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x218/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x218/0x480
> [14464.149407]  ? __kmalloc+0x319/0x350
> [14464.149407]  qtree_write_dquot+0x79/0x1b0
> [14464.149407]  v2_write_dquot+0x52/0xb0
> [14464.149407]  dquot_acquire+0x8e/0x100
> [14464.149407]  ext4_acquire_dquot+0x72/0xc0
> [14464.149407]  dqget+0x24a/0x4a0
> [14464.149407]  dquot_transfer+0xfe/0x140
> [14464.149407]  ext4_setattr+0x134/0x9c0
> [14464.149407]  notify_change+0x34b/0x490
> [14464.149407]  ? chown_common+0x96/0x150
> [14464.149407]  chown_common+0x96/0x150
> [14464.149407]  ? preempt_count_add+0x49/0xa0
> [14464.149407]  do_fchownat+0x8d/0xe0
> [14464.149407]  __x64_sys_fchownat+0x21/0x30
> [14464.149407]  do_syscall_64+0x33/0x40
> [14464.149407]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407] RIP: 0033:0x7efd3d2666ca
> [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66
> 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48>
> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48
> [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000104
> [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX:
> 00007efd3d2666ca
> [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI:
> 00000000ffffff9c
> [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09:
> 00000000ffffffff
> [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12:
> 00005598d654e1f0
> [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15:
> 00000000ffffffff
> [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: grpquota
> [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null)

Hi Ritesh:

I've seen that exact failure on 1k at least as far back as 5.0.  It fails
on every 1k run for me using the test appliance.  I've been treating this
as a "known problem, likely harmless, no fix likely anytime soon" for so long
I unfortunately don't remember what the original analysis was.  IIRC,
Jan Kara took a look at this one back when I first encountered it.

Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
  2020-10-09 20:12 ` Eric Whitney
@ 2020-10-09 20:24   ` Darrick J. Wong
  2020-10-12 11:35     ` Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2020-10-09 20:24 UTC (permalink / raw)
  To: Eric Whitney; +Cc: Ritesh Harjani, linux-ext4, Jan Kara, Theodore Ts'o

On Fri, Oct 09, 2020 at 04:12:38PM -0400, Eric Whitney wrote:
> * Ritesh Harjani <riteshh@linux.ibm.com>:
> > Hello All,
> > 
> > While running generic/219 fstests on a 1k blocksize on x86 box, I see
> > below dmesg warning msg and generic/219 fails. I haven't yet analyzed
> > it, but I remember I have seen such warnings before as well in my testing.
> > I was wondering if others have also seen it in their testing or not and
> > if this is any known issue?
> > 
> > 
> > Here it goes, 219.dmesg file.
> > 
> > 
> > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27
> > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > Opts: acl,user_xattr,block_validity,usrquota,grpquota
> > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null)
> > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null)
> > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > Opts: usrquota
> > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null)
> > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null)
> > 
> > [14464.149407] ======================================================
> > [14464.149407] WARNING: possible circular locking dependency detected
> > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted
> > [14464.149407] ------------------------------------------------------
> > [14464.149407] chown/23358 is trying to acquire lock:
> > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at:
> > ext4_map_blocks+0xd1/0x640
> > [14464.149407]
> >                but task is already holding lock:
> > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> > v2_write_dquot+0x2d/0xb0
> > [14464.149407]
> >                which lock already depends on the new lock.
> > 
> > [14464.149407]
> >                the existing dependency chain (in reverse order) is:
> > [14464.149407]
> >                -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> > [14464.149407]        down_read+0x41/0x200
> > [14464.149407]        v2_read_dquot+0x23/0x60
> > [14464.149407]        dquot_acquire+0x4c/0x100
> > [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> > [14464.149407]        dqget+0x24a/0x4a0
> > [14464.149407]        __dquot_initialize+0x1f0/0x330
> > [14464.149407]        ext4_create+0x38/0x190
> > [14464.149407]        lookup_open+0x4cd/0x630
> > [14464.149407]        path_openat+0x2c4/0xa10
> > [14464.149407]        do_filp_open+0x91/0x100
> > [14464.149407]        do_sys_openat2+0x20d/0x2d0
> > [14464.149407]        do_sys_open+0x44/0x80
> > [14464.149407]        do_syscall_64+0x33/0x40
> > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [14464.149407]
> >                -> #1 (&dquot->dq_lock){+.+.}-{3:3}:
> > [14464.149407]        __mutex_lock+0xaa/0x9c0
> > [14464.149407]        dquot_commit+0x23/0xf0
> > [14464.149407]        ext4_write_dquot+0x78/0xb0
> > [14464.149407]        __dquot_alloc_space+0x151/0x310
> > [14464.149407]        ext4_mb_new_blocks+0x1f6/0x12e0
> > [14464.149407]        ext4_ext_map_blocks+0x9c3/0x1470
> > [14464.149407]        ext4_map_blocks+0xf4/0x640
> > [14464.149407]        _ext4_get_block+0x90/0x110
> > [14464.149407]        ext4_block_write_begin+0x15f/0x5a0
> > [14464.149407]        ext4_write_begin+0x267/0x5b0
> > [14464.149407]        generic_perform_write+0xc2/0x1e0
> > [14464.149407]        ext4_buffered_write_iter+0x8b/0x130
> > [14464.149407]        ext4_file_write_iter+0x6c/0x6d0
> > [14464.149407]        new_sync_write+0x122/0x1b0
> > [14464.149407]        vfs_write+0x1ca/0x230
> > [14464.149407]        ksys_pwrite64+0x68/0xa0
> > [14464.149407]        do_syscall_64+0x33/0x40
> > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [14464.149407]
> >                -> #0 (&ei->i_data_sem){++++}-{3:3}:
> > [14464.149407]        __lock_acquire+0x147b/0x2830
> > [14464.149407]        lock_acquire+0xc6/0x3a0
> > [14464.149407]        down_write+0x40/0x110
> > [14464.149407]        ext4_map_blocks+0xd1/0x640
> > [14464.149407]        ext4_getblk+0x54/0x1c0
> > [14464.149407]        ext4_bread+0x1f/0xd0
> > [14464.149407]        ext4_quota_write+0xbf/0x280
> > [14464.149407]        write_blk+0x35/0x70
> > [14464.149407]        get_free_dqblk+0x42/0xa0
> > [14464.149407]        do_insert_tree+0x1d6/0x480
> > [14464.149407]        do_insert_tree+0x464/0x480
> > [14464.149407]        do_insert_tree+0x218/0x480
> > [14464.149407]        do_insert_tree+0x218/0x480
> > [14464.149407]        qtree_write_dquot+0x79/0x1b0
> > [14464.149407]        v2_write_dquot+0x52/0xb0
> > [14464.149407]        dquot_acquire+0x8e/0x100
> > [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> > [14464.149407]        dqget+0x24a/0x4a0
> > [14464.149407]        dquot_transfer+0xfe/0x140
> > [14464.149407]        ext4_setattr+0x134/0x9c0
> > [14464.149407]        notify_change+0x34b/0x490
> > [14464.149407]        chown_common+0x96/0x150
> > [14464.149407]        do_fchownat+0x8d/0xe0
> > [14464.149407]        __x64_sys_fchownat+0x21/0x30
> > [14464.149407]        do_syscall_64+0x33/0x40
> > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [14464.149407]
> >                other info that might help us debug this:
> > 
> > [14464.149407] Chain exists of:
> >                  &ei->i_data_sem --> &dquot->dq_lock -->
> > &s->s_dquot.dqio_sem
> > 
> > [14464.149407]  Possible unsafe locking scenario:
> > 
> > [14464.149407]        CPU0                    CPU1
> > [14464.149407]        ----                    ----
> > [14464.149407]   lock(&s->s_dquot.dqio_sem);
> > [14464.149407]                                lock(&dquot->dq_lock);
> > [14464.149407]                                lock(&s->s_dquot.dqio_sem);
> > [14464.149407]   lock(&ei->i_data_sem);

It's complaining that chown on a regular file takes i_data_sem, then
takes dq_lock to update the quotas, and then takes the i_data_sem of the
quota file.  This /should/ be harmless as long as quota files themselves
are not accounted for in the quota file, but you could also shut up
lockdep by doing something like:

	static struct lock_class_key ext4_dquot_lock_class;

and then when initializing the incore quota inodes:

	lockdep_set_class(&inode->i_data_sem, &ext4_dquot_lock_class);

That way lockdep always knows that the quota inodes are "special".  Even
better, I think this also means that lockdep actually /will/ call this
out specifically if someone screws up and tries to start a regular write
after taking the quota lock.

The downside is that now you have special inodes that are not like the
others.

--D

> > [14464.149407]
> >                 *** DEADLOCK ***
> > 
> > [14464.149407] 6 locks held by chown/23358:
> > [14464.149407]  #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at:
> > mnt_want_write+0x20/0x50
> > [14464.149407]  #1: ffff8b9b8ff64bc8
> > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150
> > [14464.149407]  #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at:
> > start_this_handle+0x1ad/0x690
> > [14464.149407]  #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at:
> > ext4_setattr+0x129/0x9c0
> > [14464.149407]  #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at:
> > dquot_acquire+0x23/0x100
> > [14464.149407]  #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> > v2_write_dquot+0x2d/0xb0
> > [14464.149407]
> >                stack backtrace:
> > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted
> > 5.9.0-rc8-next-20201008 #16
> > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > 1.13.0-1ubuntu1 04/01/2014
> > [14464.149407] Call Trace:
> > [14464.149407]  dump_stack+0x77/0x97
> > [14464.149407]  check_noncircular+0x132/0x150
> > [14464.149407]  ? sched_clock_local+0x12/0x80
> > [14464.149407]  __lock_acquire+0x147b/0x2830
> > [14464.149407]  lock_acquire+0xc6/0x3a0
> > [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> > [14464.149407]  down_write+0x40/0x110
> > [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> > [14464.149407]  ext4_map_blocks+0xd1/0x640
> > [14464.149407]  ? sched_clock_local+0x12/0x80
> > [14464.149407]  ext4_getblk+0x54/0x1c0
> > [14464.149407]  ext4_bread+0x1f/0xd0
> > [14464.149407]  ext4_quota_write+0xbf/0x280
> > [14464.149407]  write_blk+0x35/0x70
> > [14464.149407]  get_free_dqblk+0x42/0xa0
> > [14464.149407]  do_insert_tree+0x1d6/0x480
> > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > [14464.149407]  do_insert_tree+0x464/0x480
> > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > [14464.149407]  do_insert_tree+0x218/0x480
> > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > [14464.149407]  do_insert_tree+0x218/0x480
> > [14464.149407]  ? __kmalloc+0x319/0x350
> > [14464.149407]  qtree_write_dquot+0x79/0x1b0
> > [14464.149407]  v2_write_dquot+0x52/0xb0
> > [14464.149407]  dquot_acquire+0x8e/0x100
> > [14464.149407]  ext4_acquire_dquot+0x72/0xc0
> > [14464.149407]  dqget+0x24a/0x4a0
> > [14464.149407]  dquot_transfer+0xfe/0x140
> > [14464.149407]  ext4_setattr+0x134/0x9c0
> > [14464.149407]  notify_change+0x34b/0x490
> > [14464.149407]  ? chown_common+0x96/0x150
> > [14464.149407]  chown_common+0x96/0x150
> > [14464.149407]  ? preempt_count_add+0x49/0xa0
> > [14464.149407]  do_fchownat+0x8d/0xe0
> > [14464.149407]  __x64_sys_fchownat+0x21/0x30
> > [14464.149407]  do_syscall_64+0x33/0x40
> > [14464.149407]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [14464.149407] RIP: 0033:0x7efd3d2666ca
> > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66
> > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48>
> > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48
> > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX:
> > 0000000000000104
> > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX:
> > 00007efd3d2666ca
> > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI:
> > 00000000ffffff9c
> > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09:
> > 00000000ffffffff
> > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12:
> > 00005598d654e1f0
> > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15:
> > 00000000ffffffff
> > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > Opts: grpquota
> > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null)
> > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null)
> 
> Hi Ritesh:
> 
> I've seen that exact failure on 1k at least as far back as 5.0.  It fails
> on every 1k run for me using the test appliance.  I've been treating this
> as a "known problem, likely harmless, no fix likely anytime soon" for so long
> I unfortunately don't remember what the original analysis was.  IIRC,
> Jan Kara took a look at this one back when I first encountered it.
> 
> Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
  2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani
  2020-10-09 20:12 ` Eric Whitney
@ 2020-10-12 11:03 ` Jan Kara
  2020-10-12 11:50   ` Jan Kara
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Kara @ 2020-10-12 11:03 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o

Hi!

On Fri 09-10-20 22:59:13, Ritesh Harjani wrote:
> While running generic/219 fstests on a 1k blocksize on x86 box, I see
> below dmesg warning msg and generic/219 fails. I haven't yet analyzed
> it, but I remember I have seen such warnings before as well in my testing.
> I was wondering if others have also seen it in their testing or not and
> if this is any known issue?

I don't remember seeing this lately. What mkfs options do you use for your
test? I can see below that mount options are apparently:

acl,user_xattr,block_validity,usrquota,grpquota

> Here it goes, 219.dmesg file.
> 
> [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27
> [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: acl,user_xattr,block_validity,usrquota,grpquota
> [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: usrquota
> [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null)
> 
> [14464.149407] ======================================================
> [14464.149407] WARNING: possible circular locking dependency detected
> [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted
> [14464.149407] ------------------------------------------------------
> [14464.149407] chown/23358 is trying to acquire lock:
> [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at:
> ext4_map_blocks+0xd1/0x640
> [14464.149407]
>                but task is already holding lock:
> [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> v2_write_dquot+0x2d/0xb0
> [14464.149407]
>                which lock already depends on the new lock.

Umm, this seems to show that somehow the lockdep annotation for the
i_data_sem on the quota file didn't get set properly. i_data_sem on the
quota file should have I_DATA_SEM_QUOTA (2) subclass but in this report we
can see that there's no subclass set on ei->i_data_sem. We do set the
subclass in ext4_quota_on() or ext4_enable_quota(). Strange...

								Honza

> [14464.149407]
>                the existing dependency chain (in reverse order) is:
> [14464.149407]
>                -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> [14464.149407]        down_read+0x41/0x200
> [14464.149407]        v2_read_dquot+0x23/0x60
> [14464.149407]        dquot_acquire+0x4c/0x100
> [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> [14464.149407]        dqget+0x24a/0x4a0
> [14464.149407]        __dquot_initialize+0x1f0/0x330
> [14464.149407]        ext4_create+0x38/0x190
> [14464.149407]        lookup_open+0x4cd/0x630
> [14464.149407]        path_openat+0x2c4/0xa10
> [14464.149407]        do_filp_open+0x91/0x100
> [14464.149407]        do_sys_openat2+0x20d/0x2d0
> [14464.149407]        do_sys_open+0x44/0x80
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                -> #1 (&dquot->dq_lock){+.+.}-{3:3}:
> [14464.149407]        __mutex_lock+0xaa/0x9c0
> [14464.149407]        dquot_commit+0x23/0xf0
> [14464.149407]        ext4_write_dquot+0x78/0xb0
> [14464.149407]        __dquot_alloc_space+0x151/0x310
> [14464.149407]        ext4_mb_new_blocks+0x1f6/0x12e0
> [14464.149407]        ext4_ext_map_blocks+0x9c3/0x1470
> [14464.149407]        ext4_map_blocks+0xf4/0x640
> [14464.149407]        _ext4_get_block+0x90/0x110
> [14464.149407]        ext4_block_write_begin+0x15f/0x5a0
> [14464.149407]        ext4_write_begin+0x267/0x5b0
> [14464.149407]        generic_perform_write+0xc2/0x1e0
> [14464.149407]        ext4_buffered_write_iter+0x8b/0x130
> [14464.149407]        ext4_file_write_iter+0x6c/0x6d0
> [14464.149407]        new_sync_write+0x122/0x1b0
> [14464.149407]        vfs_write+0x1ca/0x230
> [14464.149407]        ksys_pwrite64+0x68/0xa0
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                -> #0 (&ei->i_data_sem){++++}-{3:3}:
> [14464.149407]        __lock_acquire+0x147b/0x2830
> [14464.149407]        lock_acquire+0xc6/0x3a0
> [14464.149407]        down_write+0x40/0x110
> [14464.149407]        ext4_map_blocks+0xd1/0x640
> [14464.149407]        ext4_getblk+0x54/0x1c0
> [14464.149407]        ext4_bread+0x1f/0xd0
> [14464.149407]        ext4_quota_write+0xbf/0x280
> [14464.149407]        write_blk+0x35/0x70
> [14464.149407]        get_free_dqblk+0x42/0xa0
> [14464.149407]        do_insert_tree+0x1d6/0x480
> [14464.149407]        do_insert_tree+0x464/0x480
> [14464.149407]        do_insert_tree+0x218/0x480
> [14464.149407]        do_insert_tree+0x218/0x480
> [14464.149407]        qtree_write_dquot+0x79/0x1b0
> [14464.149407]        v2_write_dquot+0x52/0xb0
> [14464.149407]        dquot_acquire+0x8e/0x100
> [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> [14464.149407]        dqget+0x24a/0x4a0
> [14464.149407]        dquot_transfer+0xfe/0x140
> [14464.149407]        ext4_setattr+0x134/0x9c0
> [14464.149407]        notify_change+0x34b/0x490
> [14464.149407]        chown_common+0x96/0x150
> [14464.149407]        do_fchownat+0x8d/0xe0
> [14464.149407]        __x64_sys_fchownat+0x21/0x30
> [14464.149407]        do_syscall_64+0x33/0x40
> [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407]
>                other info that might help us debug this:
> 
> [14464.149407] Chain exists of:
>                  &ei->i_data_sem --> &dquot->dq_lock -->
> &s->s_dquot.dqio_sem
> 
> [14464.149407]  Possible unsafe locking scenario:
> 
> [14464.149407]        CPU0                    CPU1
> [14464.149407]        ----                    ----
> [14464.149407]   lock(&s->s_dquot.dqio_sem);
> [14464.149407]                                lock(&dquot->dq_lock);
> [14464.149407]                                lock(&s->s_dquot.dqio_sem);
> [14464.149407]   lock(&ei->i_data_sem);
> [14464.149407]
>                 *** DEADLOCK ***
> 
> [14464.149407] 6 locks held by chown/23358:
> [14464.149407]  #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at:
> mnt_want_write+0x20/0x50
> [14464.149407]  #1: ffff8b9b8ff64bc8
> (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150
> [14464.149407]  #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at:
> start_this_handle+0x1ad/0x690
> [14464.149407]  #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_setattr+0x129/0x9c0
> [14464.149407]  #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at:
> dquot_acquire+0x23/0x100
> [14464.149407]  #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> v2_write_dquot+0x2d/0xb0
> [14464.149407]
>                stack backtrace:
> [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted
> 5.9.0-rc8-next-20201008 #16
> [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1 04/01/2014
> [14464.149407] Call Trace:
> [14464.149407]  dump_stack+0x77/0x97
> [14464.149407]  check_noncircular+0x132/0x150
> [14464.149407]  ? sched_clock_local+0x12/0x80
> [14464.149407]  __lock_acquire+0x147b/0x2830
> [14464.149407]  lock_acquire+0xc6/0x3a0
> [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> [14464.149407]  down_write+0x40/0x110
> [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> [14464.149407]  ext4_map_blocks+0xd1/0x640
> [14464.149407]  ? sched_clock_local+0x12/0x80
> [14464.149407]  ext4_getblk+0x54/0x1c0
> [14464.149407]  ext4_bread+0x1f/0xd0
> [14464.149407]  ext4_quota_write+0xbf/0x280
> [14464.149407]  write_blk+0x35/0x70
> [14464.149407]  get_free_dqblk+0x42/0xa0
> [14464.149407]  do_insert_tree+0x1d6/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x464/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x218/0x480
> [14464.149407]  ? ext4_quota_read+0x9d/0x110
> [14464.149407]  do_insert_tree+0x218/0x480
> [14464.149407]  ? __kmalloc+0x319/0x350
> [14464.149407]  qtree_write_dquot+0x79/0x1b0
> [14464.149407]  v2_write_dquot+0x52/0xb0
> [14464.149407]  dquot_acquire+0x8e/0x100
> [14464.149407]  ext4_acquire_dquot+0x72/0xc0
> [14464.149407]  dqget+0x24a/0x4a0
> [14464.149407]  dquot_transfer+0xfe/0x140
> [14464.149407]  ext4_setattr+0x134/0x9c0
> [14464.149407]  notify_change+0x34b/0x490
> [14464.149407]  ? chown_common+0x96/0x150
> [14464.149407]  chown_common+0x96/0x150
> [14464.149407]  ? preempt_count_add+0x49/0xa0
> [14464.149407]  do_fchownat+0x8d/0xe0
> [14464.149407]  __x64_sys_fchownat+0x21/0x30
> [14464.149407]  do_syscall_64+0x33/0x40
> [14464.149407]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [14464.149407] RIP: 0033:0x7efd3d2666ca
> [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66
> 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48>
> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48
> [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX:
> 0000000000000104
> [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX:
> 00007efd3d2666ca
> [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI:
> 00000000ffffff9c
> [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09:
> 00000000ffffffff
> [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12:
> 00005598d654e1f0
> [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15:
> 00000000ffffffff
> [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> Opts: grpquota
> [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null)
> [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null)
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
  2020-10-09 20:24   ` Darrick J. Wong
@ 2020-10-12 11:35     ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2020-10-12 11:35 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Eric Whitney, Ritesh Harjani, linux-ext4, Jan Kara, Theodore Ts'o

On Fri 09-10-20 13:24:08, Darrick J. Wong wrote:
> On Fri, Oct 09, 2020 at 04:12:38PM -0400, Eric Whitney wrote:
> > * Ritesh Harjani <riteshh@linux.ibm.com>:
> > > Hello All,
> > > 
> > > While running generic/219 fstests on a 1k blocksize on x86 box, I see
> > > below dmesg warning msg and generic/219 fails. I haven't yet analyzed
> > > it, but I remember I have seen such warnings before as well in my testing.
> > > I was wondering if others have also seen it in their testing or not and
> > > if this is any known issue?
> > > 
> > > 
> > > Here it goes, 219.dmesg file.
> > > 
> > > 
> > > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27
> > > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > > Opts: acl,user_xattr,block_validity,usrquota,grpquota
> > > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null)
> > > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null)
> > > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > > Opts: usrquota
> > > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null)
> > > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null)
> > > 
> > > [14464.149407] ======================================================
> > > [14464.149407] WARNING: possible circular locking dependency detected
> > > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted
> > > [14464.149407] ------------------------------------------------------
> > > [14464.149407] chown/23358 is trying to acquire lock:
> > > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at:
> > > ext4_map_blocks+0xd1/0x640
> > > [14464.149407]
> > >                but task is already holding lock:
> > > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> > > v2_write_dquot+0x2d/0xb0
> > > [14464.149407]
> > >                which lock already depends on the new lock.
> > > 
> > > [14464.149407]
> > >                the existing dependency chain (in reverse order) is:
> > > [14464.149407]
> > >                -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}:
> > > [14464.149407]        down_read+0x41/0x200
> > > [14464.149407]        v2_read_dquot+0x23/0x60
> > > [14464.149407]        dquot_acquire+0x4c/0x100
> > > [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> > > [14464.149407]        dqget+0x24a/0x4a0
> > > [14464.149407]        __dquot_initialize+0x1f0/0x330
> > > [14464.149407]        ext4_create+0x38/0x190
> > > [14464.149407]        lookup_open+0x4cd/0x630
> > > [14464.149407]        path_openat+0x2c4/0xa10
> > > [14464.149407]        do_filp_open+0x91/0x100
> > > [14464.149407]        do_sys_openat2+0x20d/0x2d0
> > > [14464.149407]        do_sys_open+0x44/0x80
> > > [14464.149407]        do_syscall_64+0x33/0x40
> > > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > [14464.149407]
> > >                -> #1 (&dquot->dq_lock){+.+.}-{3:3}:
> > > [14464.149407]        __mutex_lock+0xaa/0x9c0
> > > [14464.149407]        dquot_commit+0x23/0xf0
> > > [14464.149407]        ext4_write_dquot+0x78/0xb0
> > > [14464.149407]        __dquot_alloc_space+0x151/0x310
> > > [14464.149407]        ext4_mb_new_blocks+0x1f6/0x12e0
> > > [14464.149407]        ext4_ext_map_blocks+0x9c3/0x1470
> > > [14464.149407]        ext4_map_blocks+0xf4/0x640
> > > [14464.149407]        _ext4_get_block+0x90/0x110
> > > [14464.149407]        ext4_block_write_begin+0x15f/0x5a0
> > > [14464.149407]        ext4_write_begin+0x267/0x5b0
> > > [14464.149407]        generic_perform_write+0xc2/0x1e0
> > > [14464.149407]        ext4_buffered_write_iter+0x8b/0x130
> > > [14464.149407]        ext4_file_write_iter+0x6c/0x6d0
> > > [14464.149407]        new_sync_write+0x122/0x1b0
> > > [14464.149407]        vfs_write+0x1ca/0x230
> > > [14464.149407]        ksys_pwrite64+0x68/0xa0
> > > [14464.149407]        do_syscall_64+0x33/0x40
> > > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > [14464.149407]
> > >                -> #0 (&ei->i_data_sem){++++}-{3:3}:
> > > [14464.149407]        __lock_acquire+0x147b/0x2830
> > > [14464.149407]        lock_acquire+0xc6/0x3a0
> > > [14464.149407]        down_write+0x40/0x110
> > > [14464.149407]        ext4_map_blocks+0xd1/0x640
> > > [14464.149407]        ext4_getblk+0x54/0x1c0
> > > [14464.149407]        ext4_bread+0x1f/0xd0
> > > [14464.149407]        ext4_quota_write+0xbf/0x280
> > > [14464.149407]        write_blk+0x35/0x70
> > > [14464.149407]        get_free_dqblk+0x42/0xa0
> > > [14464.149407]        do_insert_tree+0x1d6/0x480
> > > [14464.149407]        do_insert_tree+0x464/0x480
> > > [14464.149407]        do_insert_tree+0x218/0x480
> > > [14464.149407]        do_insert_tree+0x218/0x480
> > > [14464.149407]        qtree_write_dquot+0x79/0x1b0
> > > [14464.149407]        v2_write_dquot+0x52/0xb0
> > > [14464.149407]        dquot_acquire+0x8e/0x100
> > > [14464.149407]        ext4_acquire_dquot+0x72/0xc0
> > > [14464.149407]        dqget+0x24a/0x4a0
> > > [14464.149407]        dquot_transfer+0xfe/0x140
> > > [14464.149407]        ext4_setattr+0x134/0x9c0
> > > [14464.149407]        notify_change+0x34b/0x490
> > > [14464.149407]        chown_common+0x96/0x150
> > > [14464.149407]        do_fchownat+0x8d/0xe0
> > > [14464.149407]        __x64_sys_fchownat+0x21/0x30
> > > [14464.149407]        do_syscall_64+0x33/0x40
> > > [14464.149407]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > [14464.149407]
> > >                other info that might help us debug this:
> > > 
> > > [14464.149407] Chain exists of:
> > >                  &ei->i_data_sem --> &dquot->dq_lock -->
> > > &s->s_dquot.dqio_sem
> > > 
> > > [14464.149407]  Possible unsafe locking scenario:
> > > 
> > > [14464.149407]        CPU0                    CPU1
> > > [14464.149407]        ----                    ----
> > > [14464.149407]   lock(&s->s_dquot.dqio_sem);
> > > [14464.149407]                                lock(&dquot->dq_lock);
> > > [14464.149407]                                lock(&s->s_dquot.dqio_sem);
> > > [14464.149407]   lock(&ei->i_data_sem);
> 
> It's complaining that chown on a regular file takes i_data_sem, then
> takes dq_lock to update the quotas, and then takes the i_data_sem of the
> quota file.  This /should/ be harmless as long as quota files themselves
> are not accounted for in the quota file, but you could also shut up
> lockdep by doing something like:
> 
> 	static struct lock_class_key ext4_dquot_lock_class;
> 
> and then when initializing the incore quota inodes:
> 
> 	lockdep_set_class(&inode->i_data_sem, &ext4_dquot_lock_class);
> 
> That way lockdep always knows that the quota inodes are "special".  Even
> better, I think this also means that lockdep actually /will/ call this
> out specifically if someone screws up and tries to start a regular write
> after taking the quota lock.
> 
> The downside is that now you have special inodes that are not like the
> others.

Yeah, we already do this in ext4_quota_on() but somehow this doesn't seem
to work for the 1k case which is puzzling...

								Honza

> > > [14464.149407]
> > >                 *** DEADLOCK ***
> > > 
> > > [14464.149407] 6 locks held by chown/23358:
> > > [14464.149407]  #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at:
> > > mnt_want_write+0x20/0x50
> > > [14464.149407]  #1: ffff8b9b8ff64bc8
> > > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150
> > > [14464.149407]  #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at:
> > > start_this_handle+0x1ad/0x690
> > > [14464.149407]  #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at:
> > > ext4_setattr+0x129/0x9c0
> > > [14464.149407]  #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at:
> > > dquot_acquire+0x23/0x100
> > > [14464.149407]  #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at:
> > > v2_write_dquot+0x2d/0xb0
> > > [14464.149407]
> > >                stack backtrace:
> > > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted
> > > 5.9.0-rc8-next-20201008 #16
> > > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > 1.13.0-1ubuntu1 04/01/2014
> > > [14464.149407] Call Trace:
> > > [14464.149407]  dump_stack+0x77/0x97
> > > [14464.149407]  check_noncircular+0x132/0x150
> > > [14464.149407]  ? sched_clock_local+0x12/0x80
> > > [14464.149407]  __lock_acquire+0x147b/0x2830
> > > [14464.149407]  lock_acquire+0xc6/0x3a0
> > > [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> > > [14464.149407]  down_write+0x40/0x110
> > > [14464.149407]  ? ext4_map_blocks+0xd1/0x640
> > > [14464.149407]  ext4_map_blocks+0xd1/0x640
> > > [14464.149407]  ? sched_clock_local+0x12/0x80
> > > [14464.149407]  ext4_getblk+0x54/0x1c0
> > > [14464.149407]  ext4_bread+0x1f/0xd0
> > > [14464.149407]  ext4_quota_write+0xbf/0x280
> > > [14464.149407]  write_blk+0x35/0x70
> > > [14464.149407]  get_free_dqblk+0x42/0xa0
> > > [14464.149407]  do_insert_tree+0x1d6/0x480
> > > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > > [14464.149407]  do_insert_tree+0x464/0x480
> > > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > > [14464.149407]  do_insert_tree+0x218/0x480
> > > [14464.149407]  ? ext4_quota_read+0x9d/0x110
> > > [14464.149407]  do_insert_tree+0x218/0x480
> > > [14464.149407]  ? __kmalloc+0x319/0x350
> > > [14464.149407]  qtree_write_dquot+0x79/0x1b0
> > > [14464.149407]  v2_write_dquot+0x52/0xb0
> > > [14464.149407]  dquot_acquire+0x8e/0x100
> > > [14464.149407]  ext4_acquire_dquot+0x72/0xc0
> > > [14464.149407]  dqget+0x24a/0x4a0
> > > [14464.149407]  dquot_transfer+0xfe/0x140
> > > [14464.149407]  ext4_setattr+0x134/0x9c0
> > > [14464.149407]  notify_change+0x34b/0x490
> > > [14464.149407]  ? chown_common+0x96/0x150
> > > [14464.149407]  chown_common+0x96/0x150
> > > [14464.149407]  ? preempt_count_add+0x49/0xa0
> > > [14464.149407]  do_fchownat+0x8d/0xe0
> > > [14464.149407]  __x64_sys_fchownat+0x21/0x30
> > > [14464.149407]  do_syscall_64+0x33/0x40
> > > [14464.149407]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > [14464.149407] RIP: 0033:0x7efd3d2666ca
> > > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66
> > > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48>
> > > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48
> > > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX:
> > > 0000000000000104
> > > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX:
> > > 00007efd3d2666ca
> > > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI:
> > > 00000000ffffff9c
> > > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09:
> > > 00000000ffffffff
> > > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12:
> > > 00005598d654e1f0
> > > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15:
> > > 00000000ffffffff
> > > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode.
> > > Opts: grpquota
> > > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null)
> > > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null)
> > 
> > Hi Ritesh:
> > 
> > I've seen that exact failure on 1k at least as far back as 5.0.  It fails
> > on every 1k run for me using the test appliance.  I've been treating this
> > as a "known problem, likely harmless, no fix likely anytime soon" for so long
> > I unfortunately don't remember what the original analysis was.  IIRC,
> > Jan Kara took a look at this one back when I first encountered it.
> > 
> > Eric
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected"
  2020-10-12 11:03 ` Jan Kara
@ 2020-10-12 11:50   ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2020-10-12 11:50 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o

On Mon 12-10-20 13:03:47, Jan Kara wrote:
> Hi!
> 
> On Fri 09-10-20 22:59:13, Ritesh Harjani wrote:
> > While running generic/219 fstests on a 1k blocksize on x86 box, I see
> > below dmesg warning msg and generic/219 fails. I haven't yet analyzed
> > it, but I remember I have seen such warnings before as well in my testing.
> > I was wondering if others have also seen it in their testing or not and
> > if this is any known issue?
> 
> I don't remember seeing this lately. What mkfs options do you use for your
> test? I can see below that mount options are apparently:
> 
> acl,user_xattr,block_validity,usrquota,grpquota

BTW, I've tried and the problem indeed does *not* reproduce in my test KVM
setup with 1k block size and default mkfs options. But it does reproduce
with mkfs option "-O quota". So it indeed seems the lockdep annotation is
somehow broken in this config. I'll have a look...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-12 11:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani
2020-10-09 20:12 ` Eric Whitney
2020-10-09 20:24   ` Darrick J. Wong
2020-10-12 11:35     ` Jan Kara
2020-10-12 11:03 ` Jan Kara
2020-10-12 11:50   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).