* xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" @ 2020-10-09 17:29 Ritesh Harjani 2020-10-09 20:12 ` Eric Whitney 2020-10-12 11:03 ` Jan Kara 0 siblings, 2 replies; 6+ messages in thread From: Ritesh Harjani @ 2020-10-09 17:29 UTC (permalink / raw) To: linux-ext4; +Cc: Jan Kara, Eric Whitney, Theodore Ts'o Hello All, While running generic/219 fstests on a 1k blocksize on x86 box, I see below dmesg warning msg and generic/219 fails. I haven't yet analyzed it, but I remember I have seen such warnings before as well in my testing. I was wondering if others have also seen it in their testing or not and if this is any known issue? Here it goes, 219.dmesg file. [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27 [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr,block_validity,usrquota,grpquota [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null) [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null) [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: usrquota [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null) [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null) [14464.149407] ====================================================== [14464.149407] WARNING: possible circular locking dependency detected [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted [14464.149407] ------------------------------------------------------ [14464.149407] chown/23358 is trying to acquire lock: [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: ext4_map_blocks+0xd1/0x640 [14464.149407] but task is already holding lock: [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: v2_write_dquot+0x2d/0xb0 [14464.149407] which lock already depends on the new lock. [14464.149407] the existing dependency chain (in reverse order) is: [14464.149407] -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}: [14464.149407] down_read+0x41/0x200 [14464.149407] v2_read_dquot+0x23/0x60 [14464.149407] dquot_acquire+0x4c/0x100 [14464.149407] ext4_acquire_dquot+0x72/0xc0 [14464.149407] dqget+0x24a/0x4a0 [14464.149407] __dquot_initialize+0x1f0/0x330 [14464.149407] ext4_create+0x38/0x190 [14464.149407] lookup_open+0x4cd/0x630 [14464.149407] path_openat+0x2c4/0xa10 [14464.149407] do_filp_open+0x91/0x100 [14464.149407] do_sys_openat2+0x20d/0x2d0 [14464.149407] do_sys_open+0x44/0x80 [14464.149407] do_syscall_64+0x33/0x40 [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [14464.149407] -> #1 (&dquot->dq_lock){+.+.}-{3:3}: [14464.149407] __mutex_lock+0xaa/0x9c0 [14464.149407] dquot_commit+0x23/0xf0 [14464.149407] ext4_write_dquot+0x78/0xb0 [14464.149407] __dquot_alloc_space+0x151/0x310 [14464.149407] ext4_mb_new_blocks+0x1f6/0x12e0 [14464.149407] ext4_ext_map_blocks+0x9c3/0x1470 [14464.149407] ext4_map_blocks+0xf4/0x640 [14464.149407] _ext4_get_block+0x90/0x110 [14464.149407] ext4_block_write_begin+0x15f/0x5a0 [14464.149407] ext4_write_begin+0x267/0x5b0 [14464.149407] generic_perform_write+0xc2/0x1e0 [14464.149407] ext4_buffered_write_iter+0x8b/0x130 [14464.149407] ext4_file_write_iter+0x6c/0x6d0 [14464.149407] new_sync_write+0x122/0x1b0 [14464.149407] vfs_write+0x1ca/0x230 [14464.149407] ksys_pwrite64+0x68/0xa0 [14464.149407] do_syscall_64+0x33/0x40 [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [14464.149407] -> #0 (&ei->i_data_sem){++++}-{3:3}: [14464.149407] __lock_acquire+0x147b/0x2830 [14464.149407] lock_acquire+0xc6/0x3a0 [14464.149407] down_write+0x40/0x110 [14464.149407] ext4_map_blocks+0xd1/0x640 [14464.149407] ext4_getblk+0x54/0x1c0 [14464.149407] ext4_bread+0x1f/0xd0 [14464.149407] ext4_quota_write+0xbf/0x280 [14464.149407] write_blk+0x35/0x70 [14464.149407] get_free_dqblk+0x42/0xa0 [14464.149407] do_insert_tree+0x1d6/0x480 [14464.149407] do_insert_tree+0x464/0x480 [14464.149407] do_insert_tree+0x218/0x480 [14464.149407] do_insert_tree+0x218/0x480 [14464.149407] qtree_write_dquot+0x79/0x1b0 [14464.149407] v2_write_dquot+0x52/0xb0 [14464.149407] dquot_acquire+0x8e/0x100 [14464.149407] ext4_acquire_dquot+0x72/0xc0 [14464.149407] dqget+0x24a/0x4a0 [14464.149407] dquot_transfer+0xfe/0x140 [14464.149407] ext4_setattr+0x134/0x9c0 [14464.149407] notify_change+0x34b/0x490 [14464.149407] chown_common+0x96/0x150 [14464.149407] do_fchownat+0x8d/0xe0 [14464.149407] __x64_sys_fchownat+0x21/0x30 [14464.149407] do_syscall_64+0x33/0x40 [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [14464.149407] other info that might help us debug this: [14464.149407] Chain exists of: &ei->i_data_sem --> &dquot->dq_lock --> &s->s_dquot.dqio_sem [14464.149407] Possible unsafe locking scenario: [14464.149407] CPU0 CPU1 [14464.149407] ---- ---- [14464.149407] lock(&s->s_dquot.dqio_sem); [14464.149407] lock(&dquot->dq_lock); [14464.149407] lock(&s->s_dquot.dqio_sem); [14464.149407] lock(&ei->i_data_sem); [14464.149407] *** DEADLOCK *** [14464.149407] 6 locks held by chown/23358: [14464.149407] #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: mnt_want_write+0x20/0x50 [14464.149407] #1: ffff8b9b8ff64bc8 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150 [14464.149407] #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0x1ad/0x690 [14464.149407] #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: ext4_setattr+0x129/0x9c0 [14464.149407] #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: dquot_acquire+0x23/0x100 [14464.149407] #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: v2_write_dquot+0x2d/0xb0 [14464.149407] stack backtrace: [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted 5.9.0-rc8-next-20201008 #16 [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 [14464.149407] Call Trace: [14464.149407] dump_stack+0x77/0x97 [14464.149407] check_noncircular+0x132/0x150 [14464.149407] ? sched_clock_local+0x12/0x80 [14464.149407] __lock_acquire+0x147b/0x2830 [14464.149407] lock_acquire+0xc6/0x3a0 [14464.149407] ? ext4_map_blocks+0xd1/0x640 [14464.149407] down_write+0x40/0x110 [14464.149407] ? ext4_map_blocks+0xd1/0x640 [14464.149407] ext4_map_blocks+0xd1/0x640 [14464.149407] ? sched_clock_local+0x12/0x80 [14464.149407] ext4_getblk+0x54/0x1c0 [14464.149407] ext4_bread+0x1f/0xd0 [14464.149407] ext4_quota_write+0xbf/0x280 [14464.149407] write_blk+0x35/0x70 [14464.149407] get_free_dqblk+0x42/0xa0 [14464.149407] do_insert_tree+0x1d6/0x480 [14464.149407] ? ext4_quota_read+0x9d/0x110 [14464.149407] do_insert_tree+0x464/0x480 [14464.149407] ? ext4_quota_read+0x9d/0x110 [14464.149407] do_insert_tree+0x218/0x480 [14464.149407] ? ext4_quota_read+0x9d/0x110 [14464.149407] do_insert_tree+0x218/0x480 [14464.149407] ? __kmalloc+0x319/0x350 [14464.149407] qtree_write_dquot+0x79/0x1b0 [14464.149407] v2_write_dquot+0x52/0xb0 [14464.149407] dquot_acquire+0x8e/0x100 [14464.149407] ext4_acquire_dquot+0x72/0xc0 [14464.149407] dqget+0x24a/0x4a0 [14464.149407] dquot_transfer+0xfe/0x140 [14464.149407] ext4_setattr+0x134/0x9c0 [14464.149407] notify_change+0x34b/0x490 [14464.149407] ? chown_common+0x96/0x150 [14464.149407] chown_common+0x96/0x150 [14464.149407] ? preempt_count_add+0x49/0xa0 [14464.149407] do_fchownat+0x8d/0xe0 [14464.149407] __x64_sys_fchownat+0x21/0x30 [14464.149407] do_syscall_64+0x33/0x40 [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [14464.149407] RIP: 0033:0x7efd3d2666ca [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48 [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: 0000000000000104 [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: 00007efd3d2666ca [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: 00000000ffffff9c [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: 00000000ffffffff [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: 00005598d654e1f0 [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: 00000000ffffffff [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: grpquota [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null) [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" 2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani @ 2020-10-09 20:12 ` Eric Whitney 2020-10-09 20:24 ` Darrick J. Wong 2020-10-12 11:03 ` Jan Kara 1 sibling, 1 reply; 6+ messages in thread From: Eric Whitney @ 2020-10-09 20:12 UTC (permalink / raw) To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o * Ritesh Harjani <riteshh@linux.ibm.com>: > Hello All, > > While running generic/219 fstests on a 1k blocksize on x86 box, I see > below dmesg warning msg and generic/219 fails. I haven't yet analyzed > it, but I remember I have seen such warnings before as well in my testing. > I was wondering if others have also seen it in their testing or not and > if this is any known issue? > > > Here it goes, 219.dmesg file. > > > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27 > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: acl,user_xattr,block_validity,usrquota,grpquota > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: usrquota > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14464.149407] ====================================================== > [14464.149407] WARNING: possible circular locking dependency detected > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted > [14464.149407] ------------------------------------------------------ > [14464.149407] chown/23358 is trying to acquire lock: > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: > ext4_map_blocks+0xd1/0x640 > [14464.149407] > but task is already holding lock: > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > v2_write_dquot+0x2d/0xb0 > [14464.149407] > which lock already depends on the new lock. > > [14464.149407] > the existing dependency chain (in reverse order) is: > [14464.149407] > -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}: > [14464.149407] down_read+0x41/0x200 > [14464.149407] v2_read_dquot+0x23/0x60 > [14464.149407] dquot_acquire+0x4c/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] __dquot_initialize+0x1f0/0x330 > [14464.149407] ext4_create+0x38/0x190 > [14464.149407] lookup_open+0x4cd/0x630 > [14464.149407] path_openat+0x2c4/0xa10 > [14464.149407] do_filp_open+0x91/0x100 > [14464.149407] do_sys_openat2+0x20d/0x2d0 > [14464.149407] do_sys_open+0x44/0x80 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > -> #1 (&dquot->dq_lock){+.+.}-{3:3}: > [14464.149407] __mutex_lock+0xaa/0x9c0 > [14464.149407] dquot_commit+0x23/0xf0 > [14464.149407] ext4_write_dquot+0x78/0xb0 > [14464.149407] __dquot_alloc_space+0x151/0x310 > [14464.149407] ext4_mb_new_blocks+0x1f6/0x12e0 > [14464.149407] ext4_ext_map_blocks+0x9c3/0x1470 > [14464.149407] ext4_map_blocks+0xf4/0x640 > [14464.149407] _ext4_get_block+0x90/0x110 > [14464.149407] ext4_block_write_begin+0x15f/0x5a0 > [14464.149407] ext4_write_begin+0x267/0x5b0 > [14464.149407] generic_perform_write+0xc2/0x1e0 > [14464.149407] ext4_buffered_write_iter+0x8b/0x130 > [14464.149407] ext4_file_write_iter+0x6c/0x6d0 > [14464.149407] new_sync_write+0x122/0x1b0 > [14464.149407] vfs_write+0x1ca/0x230 > [14464.149407] ksys_pwrite64+0x68/0xa0 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > -> #0 (&ei->i_data_sem){++++}-{3:3}: > [14464.149407] __lock_acquire+0x147b/0x2830 > [14464.149407] lock_acquire+0xc6/0x3a0 > [14464.149407] down_write+0x40/0x110 > [14464.149407] ext4_map_blocks+0xd1/0x640 > [14464.149407] ext4_getblk+0x54/0x1c0 > [14464.149407] ext4_bread+0x1f/0xd0 > [14464.149407] ext4_quota_write+0xbf/0x280 > [14464.149407] write_blk+0x35/0x70 > [14464.149407] get_free_dqblk+0x42/0xa0 > [14464.149407] do_insert_tree+0x1d6/0x480 > [14464.149407] do_insert_tree+0x464/0x480 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] qtree_write_dquot+0x79/0x1b0 > [14464.149407] v2_write_dquot+0x52/0xb0 > [14464.149407] dquot_acquire+0x8e/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] dquot_transfer+0xfe/0x140 > [14464.149407] ext4_setattr+0x134/0x9c0 > [14464.149407] notify_change+0x34b/0x490 > [14464.149407] chown_common+0x96/0x150 > [14464.149407] do_fchownat+0x8d/0xe0 > [14464.149407] __x64_sys_fchownat+0x21/0x30 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > other info that might help us debug this: > > [14464.149407] Chain exists of: > &ei->i_data_sem --> &dquot->dq_lock --> > &s->s_dquot.dqio_sem > > [14464.149407] Possible unsafe locking scenario: > > [14464.149407] CPU0 CPU1 > [14464.149407] ---- ---- > [14464.149407] lock(&s->s_dquot.dqio_sem); > [14464.149407] lock(&dquot->dq_lock); > [14464.149407] lock(&s->s_dquot.dqio_sem); > [14464.149407] lock(&ei->i_data_sem); > [14464.149407] > *** DEADLOCK *** > > [14464.149407] 6 locks held by chown/23358: > [14464.149407] #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: > mnt_want_write+0x20/0x50 > [14464.149407] #1: ffff8b9b8ff64bc8 > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150 > [14464.149407] #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: > start_this_handle+0x1ad/0x690 > [14464.149407] #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: > ext4_setattr+0x129/0x9c0 > [14464.149407] #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: > dquot_acquire+0x23/0x100 > [14464.149407] #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > v2_write_dquot+0x2d/0xb0 > [14464.149407] > stack backtrace: > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted > 5.9.0-rc8-next-20201008 #16 > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.13.0-1ubuntu1 04/01/2014 > [14464.149407] Call Trace: > [14464.149407] dump_stack+0x77/0x97 > [14464.149407] check_noncircular+0x132/0x150 > [14464.149407] ? sched_clock_local+0x12/0x80 > [14464.149407] __lock_acquire+0x147b/0x2830 > [14464.149407] lock_acquire+0xc6/0x3a0 > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > [14464.149407] down_write+0x40/0x110 > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > [14464.149407] ext4_map_blocks+0xd1/0x640 > [14464.149407] ? sched_clock_local+0x12/0x80 > [14464.149407] ext4_getblk+0x54/0x1c0 > [14464.149407] ext4_bread+0x1f/0xd0 > [14464.149407] ext4_quota_write+0xbf/0x280 > [14464.149407] write_blk+0x35/0x70 > [14464.149407] get_free_dqblk+0x42/0xa0 > [14464.149407] do_insert_tree+0x1d6/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x464/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] ? __kmalloc+0x319/0x350 > [14464.149407] qtree_write_dquot+0x79/0x1b0 > [14464.149407] v2_write_dquot+0x52/0xb0 > [14464.149407] dquot_acquire+0x8e/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] dquot_transfer+0xfe/0x140 > [14464.149407] ext4_setattr+0x134/0x9c0 > [14464.149407] notify_change+0x34b/0x490 > [14464.149407] ? chown_common+0x96/0x150 > [14464.149407] chown_common+0x96/0x150 > [14464.149407] ? preempt_count_add+0x49/0xa0 > [14464.149407] do_fchownat+0x8d/0xe0 > [14464.149407] __x64_sys_fchownat+0x21/0x30 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] RIP: 0033:0x7efd3d2666ca > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48> > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48 > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: > 0000000000000104 > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: > 00007efd3d2666ca > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: > 00000000ffffff9c > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: > 00000000ffffffff > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: > 00005598d654e1f0 > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: > 00000000ffffffff > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: grpquota > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null) > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null) Hi Ritesh: I've seen that exact failure on 1k at least as far back as 5.0. It fails on every 1k run for me using the test appliance. I've been treating this as a "known problem, likely harmless, no fix likely anytime soon" for so long I unfortunately don't remember what the original analysis was. IIRC, Jan Kara took a look at this one back when I first encountered it. Eric ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" 2020-10-09 20:12 ` Eric Whitney @ 2020-10-09 20:24 ` Darrick J. Wong 2020-10-12 11:35 ` Jan Kara 0 siblings, 1 reply; 6+ messages in thread From: Darrick J. Wong @ 2020-10-09 20:24 UTC (permalink / raw) To: Eric Whitney; +Cc: Ritesh Harjani, linux-ext4, Jan Kara, Theodore Ts'o On Fri, Oct 09, 2020 at 04:12:38PM -0400, Eric Whitney wrote: > * Ritesh Harjani <riteshh@linux.ibm.com>: > > Hello All, > > > > While running generic/219 fstests on a 1k blocksize on x86 box, I see > > below dmesg warning msg and generic/219 fails. I haven't yet analyzed > > it, but I remember I have seen such warnings before as well in my testing. > > I was wondering if others have also seen it in their testing or not and > > if this is any known issue? > > > > > > Here it goes, 219.dmesg file. > > > > > > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27 > > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > Opts: acl,user_xattr,block_validity,usrquota,grpquota > > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > Opts: usrquota > > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null) > > > > [14464.149407] ====================================================== > > [14464.149407] WARNING: possible circular locking dependency detected > > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted > > [14464.149407] ------------------------------------------------------ > > [14464.149407] chown/23358 is trying to acquire lock: > > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: > > ext4_map_blocks+0xd1/0x640 > > [14464.149407] > > but task is already holding lock: > > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > > v2_write_dquot+0x2d/0xb0 > > [14464.149407] > > which lock already depends on the new lock. > > > > [14464.149407] > > the existing dependency chain (in reverse order) is: > > [14464.149407] > > -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}: > > [14464.149407] down_read+0x41/0x200 > > [14464.149407] v2_read_dquot+0x23/0x60 > > [14464.149407] dquot_acquire+0x4c/0x100 > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > [14464.149407] dqget+0x24a/0x4a0 > > [14464.149407] __dquot_initialize+0x1f0/0x330 > > [14464.149407] ext4_create+0x38/0x190 > > [14464.149407] lookup_open+0x4cd/0x630 > > [14464.149407] path_openat+0x2c4/0xa10 > > [14464.149407] do_filp_open+0x91/0x100 > > [14464.149407] do_sys_openat2+0x20d/0x2d0 > > [14464.149407] do_sys_open+0x44/0x80 > > [14464.149407] do_syscall_64+0x33/0x40 > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [14464.149407] > > -> #1 (&dquot->dq_lock){+.+.}-{3:3}: > > [14464.149407] __mutex_lock+0xaa/0x9c0 > > [14464.149407] dquot_commit+0x23/0xf0 > > [14464.149407] ext4_write_dquot+0x78/0xb0 > > [14464.149407] __dquot_alloc_space+0x151/0x310 > > [14464.149407] ext4_mb_new_blocks+0x1f6/0x12e0 > > [14464.149407] ext4_ext_map_blocks+0x9c3/0x1470 > > [14464.149407] ext4_map_blocks+0xf4/0x640 > > [14464.149407] _ext4_get_block+0x90/0x110 > > [14464.149407] ext4_block_write_begin+0x15f/0x5a0 > > [14464.149407] ext4_write_begin+0x267/0x5b0 > > [14464.149407] generic_perform_write+0xc2/0x1e0 > > [14464.149407] ext4_buffered_write_iter+0x8b/0x130 > > [14464.149407] ext4_file_write_iter+0x6c/0x6d0 > > [14464.149407] new_sync_write+0x122/0x1b0 > > [14464.149407] vfs_write+0x1ca/0x230 > > [14464.149407] ksys_pwrite64+0x68/0xa0 > > [14464.149407] do_syscall_64+0x33/0x40 > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [14464.149407] > > -> #0 (&ei->i_data_sem){++++}-{3:3}: > > [14464.149407] __lock_acquire+0x147b/0x2830 > > [14464.149407] lock_acquire+0xc6/0x3a0 > > [14464.149407] down_write+0x40/0x110 > > [14464.149407] ext4_map_blocks+0xd1/0x640 > > [14464.149407] ext4_getblk+0x54/0x1c0 > > [14464.149407] ext4_bread+0x1f/0xd0 > > [14464.149407] ext4_quota_write+0xbf/0x280 > > [14464.149407] write_blk+0x35/0x70 > > [14464.149407] get_free_dqblk+0x42/0xa0 > > [14464.149407] do_insert_tree+0x1d6/0x480 > > [14464.149407] do_insert_tree+0x464/0x480 > > [14464.149407] do_insert_tree+0x218/0x480 > > [14464.149407] do_insert_tree+0x218/0x480 > > [14464.149407] qtree_write_dquot+0x79/0x1b0 > > [14464.149407] v2_write_dquot+0x52/0xb0 > > [14464.149407] dquot_acquire+0x8e/0x100 > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > [14464.149407] dqget+0x24a/0x4a0 > > [14464.149407] dquot_transfer+0xfe/0x140 > > [14464.149407] ext4_setattr+0x134/0x9c0 > > [14464.149407] notify_change+0x34b/0x490 > > [14464.149407] chown_common+0x96/0x150 > > [14464.149407] do_fchownat+0x8d/0xe0 > > [14464.149407] __x64_sys_fchownat+0x21/0x30 > > [14464.149407] do_syscall_64+0x33/0x40 > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [14464.149407] > > other info that might help us debug this: > > > > [14464.149407] Chain exists of: > > &ei->i_data_sem --> &dquot->dq_lock --> > > &s->s_dquot.dqio_sem > > > > [14464.149407] Possible unsafe locking scenario: > > > > [14464.149407] CPU0 CPU1 > > [14464.149407] ---- ---- > > [14464.149407] lock(&s->s_dquot.dqio_sem); > > [14464.149407] lock(&dquot->dq_lock); > > [14464.149407] lock(&s->s_dquot.dqio_sem); > > [14464.149407] lock(&ei->i_data_sem); It's complaining that chown on a regular file takes i_data_sem, then takes dq_lock to update the quotas, and then takes the i_data_sem of the quota file. This /should/ be harmless as long as quota files themselves are not accounted for in the quota file, but you could also shut up lockdep by doing something like: static struct lock_class_key ext4_dquot_lock_class; and then when initializing the incore quota inodes: lockdep_set_class(&inode->i_data_sem, &ext4_dquot_lock_class); That way lockdep always knows that the quota inodes are "special". Even better, I think this also means that lockdep actually /will/ call this out specifically if someone screws up and tries to start a regular write after taking the quota lock. The downside is that now you have special inodes that are not like the others. --D > > [14464.149407] > > *** DEADLOCK *** > > > > [14464.149407] 6 locks held by chown/23358: > > [14464.149407] #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: > > mnt_want_write+0x20/0x50 > > [14464.149407] #1: ffff8b9b8ff64bc8 > > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150 > > [14464.149407] #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: > > start_this_handle+0x1ad/0x690 > > [14464.149407] #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: > > ext4_setattr+0x129/0x9c0 > > [14464.149407] #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: > > dquot_acquire+0x23/0x100 > > [14464.149407] #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > > v2_write_dquot+0x2d/0xb0 > > [14464.149407] > > stack backtrace: > > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted > > 5.9.0-rc8-next-20201008 #16 > > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > 1.13.0-1ubuntu1 04/01/2014 > > [14464.149407] Call Trace: > > [14464.149407] dump_stack+0x77/0x97 > > [14464.149407] check_noncircular+0x132/0x150 > > [14464.149407] ? sched_clock_local+0x12/0x80 > > [14464.149407] __lock_acquire+0x147b/0x2830 > > [14464.149407] lock_acquire+0xc6/0x3a0 > > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > > [14464.149407] down_write+0x40/0x110 > > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > > [14464.149407] ext4_map_blocks+0xd1/0x640 > > [14464.149407] ? sched_clock_local+0x12/0x80 > > [14464.149407] ext4_getblk+0x54/0x1c0 > > [14464.149407] ext4_bread+0x1f/0xd0 > > [14464.149407] ext4_quota_write+0xbf/0x280 > > [14464.149407] write_blk+0x35/0x70 > > [14464.149407] get_free_dqblk+0x42/0xa0 > > [14464.149407] do_insert_tree+0x1d6/0x480 > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > [14464.149407] do_insert_tree+0x464/0x480 > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > [14464.149407] do_insert_tree+0x218/0x480 > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > [14464.149407] do_insert_tree+0x218/0x480 > > [14464.149407] ? __kmalloc+0x319/0x350 > > [14464.149407] qtree_write_dquot+0x79/0x1b0 > > [14464.149407] v2_write_dquot+0x52/0xb0 > > [14464.149407] dquot_acquire+0x8e/0x100 > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > [14464.149407] dqget+0x24a/0x4a0 > > [14464.149407] dquot_transfer+0xfe/0x140 > > [14464.149407] ext4_setattr+0x134/0x9c0 > > [14464.149407] notify_change+0x34b/0x490 > > [14464.149407] ? chown_common+0x96/0x150 > > [14464.149407] chown_common+0x96/0x150 > > [14464.149407] ? preempt_count_add+0x49/0xa0 > > [14464.149407] do_fchownat+0x8d/0xe0 > > [14464.149407] __x64_sys_fchownat+0x21/0x30 > > [14464.149407] do_syscall_64+0x33/0x40 > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [14464.149407] RIP: 0033:0x7efd3d2666ca > > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 > > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48> > > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48 > > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: > > 0000000000000104 > > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: > > 00007efd3d2666ca > > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: > > 00000000ffffff9c > > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: > > 00000000ffffffff > > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: > > 00005598d654e1f0 > > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: > > 00000000ffffffff > > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > Opts: grpquota > > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null) > > Hi Ritesh: > > I've seen that exact failure on 1k at least as far back as 5.0. It fails > on every 1k run for me using the test appliance. I've been treating this > as a "known problem, likely harmless, no fix likely anytime soon" for so long > I unfortunately don't remember what the original analysis was. IIRC, > Jan Kara took a look at this one back when I first encountered it. > > Eric ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" 2020-10-09 20:24 ` Darrick J. Wong @ 2020-10-12 11:35 ` Jan Kara 0 siblings, 0 replies; 6+ messages in thread From: Jan Kara @ 2020-10-12 11:35 UTC (permalink / raw) To: Darrick J. Wong Cc: Eric Whitney, Ritesh Harjani, linux-ext4, Jan Kara, Theodore Ts'o On Fri 09-10-20 13:24:08, Darrick J. Wong wrote: > On Fri, Oct 09, 2020 at 04:12:38PM -0400, Eric Whitney wrote: > > * Ritesh Harjani <riteshh@linux.ibm.com>: > > > Hello All, > > > > > > While running generic/219 fstests on a 1k blocksize on x86 box, I see > > > below dmesg warning msg and generic/219 fails. I haven't yet analyzed > > > it, but I remember I have seen such warnings before as well in my testing. > > > I was wondering if others have also seen it in their testing or not and > > > if this is any known issue? > > > > > > > > > Here it goes, 219.dmesg file. > > > > > > > > > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27 > > > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > > Opts: acl,user_xattr,block_validity,usrquota,grpquota > > > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null) > > > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null) > > > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > > Opts: usrquota > > > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null) > > > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null) > > > > > > [14464.149407] ====================================================== > > > [14464.149407] WARNING: possible circular locking dependency detected > > > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted > > > [14464.149407] ------------------------------------------------------ > > > [14464.149407] chown/23358 is trying to acquire lock: > > > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: > > > ext4_map_blocks+0xd1/0x640 > > > [14464.149407] > > > but task is already holding lock: > > > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > > > v2_write_dquot+0x2d/0xb0 > > > [14464.149407] > > > which lock already depends on the new lock. > > > > > > [14464.149407] > > > the existing dependency chain (in reverse order) is: > > > [14464.149407] > > > -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}: > > > [14464.149407] down_read+0x41/0x200 > > > [14464.149407] v2_read_dquot+0x23/0x60 > > > [14464.149407] dquot_acquire+0x4c/0x100 > > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > > [14464.149407] dqget+0x24a/0x4a0 > > > [14464.149407] __dquot_initialize+0x1f0/0x330 > > > [14464.149407] ext4_create+0x38/0x190 > > > [14464.149407] lookup_open+0x4cd/0x630 > > > [14464.149407] path_openat+0x2c4/0xa10 > > > [14464.149407] do_filp_open+0x91/0x100 > > > [14464.149407] do_sys_openat2+0x20d/0x2d0 > > > [14464.149407] do_sys_open+0x44/0x80 > > > [14464.149407] do_syscall_64+0x33/0x40 > > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > [14464.149407] > > > -> #1 (&dquot->dq_lock){+.+.}-{3:3}: > > > [14464.149407] __mutex_lock+0xaa/0x9c0 > > > [14464.149407] dquot_commit+0x23/0xf0 > > > [14464.149407] ext4_write_dquot+0x78/0xb0 > > > [14464.149407] __dquot_alloc_space+0x151/0x310 > > > [14464.149407] ext4_mb_new_blocks+0x1f6/0x12e0 > > > [14464.149407] ext4_ext_map_blocks+0x9c3/0x1470 > > > [14464.149407] ext4_map_blocks+0xf4/0x640 > > > [14464.149407] _ext4_get_block+0x90/0x110 > > > [14464.149407] ext4_block_write_begin+0x15f/0x5a0 > > > [14464.149407] ext4_write_begin+0x267/0x5b0 > > > [14464.149407] generic_perform_write+0xc2/0x1e0 > > > [14464.149407] ext4_buffered_write_iter+0x8b/0x130 > > > [14464.149407] ext4_file_write_iter+0x6c/0x6d0 > > > [14464.149407] new_sync_write+0x122/0x1b0 > > > [14464.149407] vfs_write+0x1ca/0x230 > > > [14464.149407] ksys_pwrite64+0x68/0xa0 > > > [14464.149407] do_syscall_64+0x33/0x40 > > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > [14464.149407] > > > -> #0 (&ei->i_data_sem){++++}-{3:3}: > > > [14464.149407] __lock_acquire+0x147b/0x2830 > > > [14464.149407] lock_acquire+0xc6/0x3a0 > > > [14464.149407] down_write+0x40/0x110 > > > [14464.149407] ext4_map_blocks+0xd1/0x640 > > > [14464.149407] ext4_getblk+0x54/0x1c0 > > > [14464.149407] ext4_bread+0x1f/0xd0 > > > [14464.149407] ext4_quota_write+0xbf/0x280 > > > [14464.149407] write_blk+0x35/0x70 > > > [14464.149407] get_free_dqblk+0x42/0xa0 > > > [14464.149407] do_insert_tree+0x1d6/0x480 > > > [14464.149407] do_insert_tree+0x464/0x480 > > > [14464.149407] do_insert_tree+0x218/0x480 > > > [14464.149407] do_insert_tree+0x218/0x480 > > > [14464.149407] qtree_write_dquot+0x79/0x1b0 > > > [14464.149407] v2_write_dquot+0x52/0xb0 > > > [14464.149407] dquot_acquire+0x8e/0x100 > > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > > [14464.149407] dqget+0x24a/0x4a0 > > > [14464.149407] dquot_transfer+0xfe/0x140 > > > [14464.149407] ext4_setattr+0x134/0x9c0 > > > [14464.149407] notify_change+0x34b/0x490 > > > [14464.149407] chown_common+0x96/0x150 > > > [14464.149407] do_fchownat+0x8d/0xe0 > > > [14464.149407] __x64_sys_fchownat+0x21/0x30 > > > [14464.149407] do_syscall_64+0x33/0x40 > > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > [14464.149407] > > > other info that might help us debug this: > > > > > > [14464.149407] Chain exists of: > > > &ei->i_data_sem --> &dquot->dq_lock --> > > > &s->s_dquot.dqio_sem > > > > > > [14464.149407] Possible unsafe locking scenario: > > > > > > [14464.149407] CPU0 CPU1 > > > [14464.149407] ---- ---- > > > [14464.149407] lock(&s->s_dquot.dqio_sem); > > > [14464.149407] lock(&dquot->dq_lock); > > > [14464.149407] lock(&s->s_dquot.dqio_sem); > > > [14464.149407] lock(&ei->i_data_sem); > > It's complaining that chown on a regular file takes i_data_sem, then > takes dq_lock to update the quotas, and then takes the i_data_sem of the > quota file. This /should/ be harmless as long as quota files themselves > are not accounted for in the quota file, but you could also shut up > lockdep by doing something like: > > static struct lock_class_key ext4_dquot_lock_class; > > and then when initializing the incore quota inodes: > > lockdep_set_class(&inode->i_data_sem, &ext4_dquot_lock_class); > > That way lockdep always knows that the quota inodes are "special". Even > better, I think this also means that lockdep actually /will/ call this > out specifically if someone screws up and tries to start a regular write > after taking the quota lock. > > The downside is that now you have special inodes that are not like the > others. Yeah, we already do this in ext4_quota_on() but somehow this doesn't seem to work for the 1k case which is puzzling... Honza > > > [14464.149407] > > > *** DEADLOCK *** > > > > > > [14464.149407] 6 locks held by chown/23358: > > > [14464.149407] #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: > > > mnt_want_write+0x20/0x50 > > > [14464.149407] #1: ffff8b9b8ff64bc8 > > > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150 > > > [14464.149407] #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: > > > start_this_handle+0x1ad/0x690 > > > [14464.149407] #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: > > > ext4_setattr+0x129/0x9c0 > > > [14464.149407] #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: > > > dquot_acquire+0x23/0x100 > > > [14464.149407] #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > > > v2_write_dquot+0x2d/0xb0 > > > [14464.149407] > > > stack backtrace: > > > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted > > > 5.9.0-rc8-next-20201008 #16 > > > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > 1.13.0-1ubuntu1 04/01/2014 > > > [14464.149407] Call Trace: > > > [14464.149407] dump_stack+0x77/0x97 > > > [14464.149407] check_noncircular+0x132/0x150 > > > [14464.149407] ? sched_clock_local+0x12/0x80 > > > [14464.149407] __lock_acquire+0x147b/0x2830 > > > [14464.149407] lock_acquire+0xc6/0x3a0 > > > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > > > [14464.149407] down_write+0x40/0x110 > > > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > > > [14464.149407] ext4_map_blocks+0xd1/0x640 > > > [14464.149407] ? sched_clock_local+0x12/0x80 > > > [14464.149407] ext4_getblk+0x54/0x1c0 > > > [14464.149407] ext4_bread+0x1f/0xd0 > > > [14464.149407] ext4_quota_write+0xbf/0x280 > > > [14464.149407] write_blk+0x35/0x70 > > > [14464.149407] get_free_dqblk+0x42/0xa0 > > > [14464.149407] do_insert_tree+0x1d6/0x480 > > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > > [14464.149407] do_insert_tree+0x464/0x480 > > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > > [14464.149407] do_insert_tree+0x218/0x480 > > > [14464.149407] ? ext4_quota_read+0x9d/0x110 > > > [14464.149407] do_insert_tree+0x218/0x480 > > > [14464.149407] ? __kmalloc+0x319/0x350 > > > [14464.149407] qtree_write_dquot+0x79/0x1b0 > > > [14464.149407] v2_write_dquot+0x52/0xb0 > > > [14464.149407] dquot_acquire+0x8e/0x100 > > > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > > > [14464.149407] dqget+0x24a/0x4a0 > > > [14464.149407] dquot_transfer+0xfe/0x140 > > > [14464.149407] ext4_setattr+0x134/0x9c0 > > > [14464.149407] notify_change+0x34b/0x490 > > > [14464.149407] ? chown_common+0x96/0x150 > > > [14464.149407] chown_common+0x96/0x150 > > > [14464.149407] ? preempt_count_add+0x49/0xa0 > > > [14464.149407] do_fchownat+0x8d/0xe0 > > > [14464.149407] __x64_sys_fchownat+0x21/0x30 > > > [14464.149407] do_syscall_64+0x33/0x40 > > > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > > [14464.149407] RIP: 0033:0x7efd3d2666ca > > > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 > > > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48> > > > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48 > > > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: > > > 0000000000000104 > > > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: > > > 00007efd3d2666ca > > > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: > > > 00000000ffffff9c > > > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: > > > 00000000ffffffff > > > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: > > > 00005598d654e1f0 > > > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: > > > 00000000ffffffff > > > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. > > > Opts: grpquota > > > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null) > > > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null) > > > > Hi Ritesh: > > > > I've seen that exact failure on 1k at least as far back as 5.0. It fails > > on every 1k run for me using the test appliance. I've been treating this > > as a "known problem, likely harmless, no fix likely anytime soon" for so long > > I unfortunately don't remember what the original analysis was. IIRC, > > Jan Kara took a look at this one back when I first encountered it. > > > > Eric -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" 2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani 2020-10-09 20:12 ` Eric Whitney @ 2020-10-12 11:03 ` Jan Kara 2020-10-12 11:50 ` Jan Kara 1 sibling, 1 reply; 6+ messages in thread From: Jan Kara @ 2020-10-12 11:03 UTC (permalink / raw) To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o Hi! On Fri 09-10-20 22:59:13, Ritesh Harjani wrote: > While running generic/219 fstests on a 1k blocksize on x86 box, I see > below dmesg warning msg and generic/219 fails. I haven't yet analyzed > it, but I remember I have seen such warnings before as well in my testing. > I was wondering if others have also seen it in their testing or not and > if this is any known issue? I don't remember seeing this lately. What mkfs options do you use for your test? I can see below that mount options are apparently: acl,user_xattr,block_validity,usrquota,grpquota > Here it goes, 219.dmesg file. > > [14459.933253] run fstests generic/219 at 2020-10-08 20:03:27 > [14462.947295] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: acl,user_xattr,block_validity,usrquota,grpquota > [14462.992869] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.017058] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.727095] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: usrquota > [14463.837130] EXT4-fs (vdc): re-mounted. Opts: (null) > [14463.874293] EXT4-fs (vdc): re-mounted. Opts: (null) > > [14464.149407] ====================================================== > [14464.149407] WARNING: possible circular locking dependency detected > [14464.149407] 5.9.0-rc8-next-20201008 #16 Not tainted > [14464.149407] ------------------------------------------------------ > [14464.149407] chown/23358 is trying to acquire lock: > [14464.149407] ffff8b9b8ff60be0 (&ei->i_data_sem){++++}-{3:3}, at: > ext4_map_blocks+0xd1/0x640 > [14464.149407] > but task is already holding lock: > [14464.149407] ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > v2_write_dquot+0x2d/0xb0 > [14464.149407] > which lock already depends on the new lock. Umm, this seems to show that somehow the lockdep annotation for the i_data_sem on the quota file didn't get set properly. i_data_sem on the quota file should have I_DATA_SEM_QUOTA (2) subclass but in this report we can see that there's no subclass set on ei->i_data_sem. We do set the subclass in ext4_quota_on() or ext4_enable_quota(). Strange... Honza > [14464.149407] > the existing dependency chain (in reverse order) is: > [14464.149407] > -> #2 (&s->s_dquot.dqio_sem){++++}-{3:3}: > [14464.149407] down_read+0x41/0x200 > [14464.149407] v2_read_dquot+0x23/0x60 > [14464.149407] dquot_acquire+0x4c/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] __dquot_initialize+0x1f0/0x330 > [14464.149407] ext4_create+0x38/0x190 > [14464.149407] lookup_open+0x4cd/0x630 > [14464.149407] path_openat+0x2c4/0xa10 > [14464.149407] do_filp_open+0x91/0x100 > [14464.149407] do_sys_openat2+0x20d/0x2d0 > [14464.149407] do_sys_open+0x44/0x80 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > -> #1 (&dquot->dq_lock){+.+.}-{3:3}: > [14464.149407] __mutex_lock+0xaa/0x9c0 > [14464.149407] dquot_commit+0x23/0xf0 > [14464.149407] ext4_write_dquot+0x78/0xb0 > [14464.149407] __dquot_alloc_space+0x151/0x310 > [14464.149407] ext4_mb_new_blocks+0x1f6/0x12e0 > [14464.149407] ext4_ext_map_blocks+0x9c3/0x1470 > [14464.149407] ext4_map_blocks+0xf4/0x640 > [14464.149407] _ext4_get_block+0x90/0x110 > [14464.149407] ext4_block_write_begin+0x15f/0x5a0 > [14464.149407] ext4_write_begin+0x267/0x5b0 > [14464.149407] generic_perform_write+0xc2/0x1e0 > [14464.149407] ext4_buffered_write_iter+0x8b/0x130 > [14464.149407] ext4_file_write_iter+0x6c/0x6d0 > [14464.149407] new_sync_write+0x122/0x1b0 > [14464.149407] vfs_write+0x1ca/0x230 > [14464.149407] ksys_pwrite64+0x68/0xa0 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > -> #0 (&ei->i_data_sem){++++}-{3:3}: > [14464.149407] __lock_acquire+0x147b/0x2830 > [14464.149407] lock_acquire+0xc6/0x3a0 > [14464.149407] down_write+0x40/0x110 > [14464.149407] ext4_map_blocks+0xd1/0x640 > [14464.149407] ext4_getblk+0x54/0x1c0 > [14464.149407] ext4_bread+0x1f/0xd0 > [14464.149407] ext4_quota_write+0xbf/0x280 > [14464.149407] write_blk+0x35/0x70 > [14464.149407] get_free_dqblk+0x42/0xa0 > [14464.149407] do_insert_tree+0x1d6/0x480 > [14464.149407] do_insert_tree+0x464/0x480 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] qtree_write_dquot+0x79/0x1b0 > [14464.149407] v2_write_dquot+0x52/0xb0 > [14464.149407] dquot_acquire+0x8e/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] dquot_transfer+0xfe/0x140 > [14464.149407] ext4_setattr+0x134/0x9c0 > [14464.149407] notify_change+0x34b/0x490 > [14464.149407] chown_common+0x96/0x150 > [14464.149407] do_fchownat+0x8d/0xe0 > [14464.149407] __x64_sys_fchownat+0x21/0x30 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] > other info that might help us debug this: > > [14464.149407] Chain exists of: > &ei->i_data_sem --> &dquot->dq_lock --> > &s->s_dquot.dqio_sem > > [14464.149407] Possible unsafe locking scenario: > > [14464.149407] CPU0 CPU1 > [14464.149407] ---- ---- > [14464.149407] lock(&s->s_dquot.dqio_sem); > [14464.149407] lock(&dquot->dq_lock); > [14464.149407] lock(&s->s_dquot.dqio_sem); > [14464.149407] lock(&ei->i_data_sem); > [14464.149407] > *** DEADLOCK *** > > [14464.149407] 6 locks held by chown/23358: > [14464.149407] #0: ffff8b9b8963c480 (sb_writers#3){++++}-{0:0}, at: > mnt_want_write+0x20/0x50 > [14464.149407] #1: ffff8b9b8ff64bc8 > (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: chown_common+0x85/0x150 > [14464.149407] #2: ffff8b9b670c48d8 (jbd2_handle){++++}-{0:0}, at: > start_this_handle+0x1ad/0x690 > [14464.149407] #3: ffff8b9b8ff648d0 (&ei->xattr_sem){++++}-{3:3}, at: > ext4_setattr+0x129/0x9c0 > [14464.149407] #4: ffff8b9b1313f5f0 (&dquot->dq_lock){+.+.}-{3:3}, at: > dquot_acquire+0x23/0x100 > [14464.149407] #5: ffff8b9b8963c208 (&s->s_dquot.dqio_sem){++++}-{3:3}, at: > v2_write_dquot+0x2d/0xb0 > [14464.149407] > stack backtrace: > [14464.149407] CPU: 5 PID: 23358 Comm: chown Not tainted > 5.9.0-rc8-next-20201008 #16 > [14464.149407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.13.0-1ubuntu1 04/01/2014 > [14464.149407] Call Trace: > [14464.149407] dump_stack+0x77/0x97 > [14464.149407] check_noncircular+0x132/0x150 > [14464.149407] ? sched_clock_local+0x12/0x80 > [14464.149407] __lock_acquire+0x147b/0x2830 > [14464.149407] lock_acquire+0xc6/0x3a0 > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > [14464.149407] down_write+0x40/0x110 > [14464.149407] ? ext4_map_blocks+0xd1/0x640 > [14464.149407] ext4_map_blocks+0xd1/0x640 > [14464.149407] ? sched_clock_local+0x12/0x80 > [14464.149407] ext4_getblk+0x54/0x1c0 > [14464.149407] ext4_bread+0x1f/0xd0 > [14464.149407] ext4_quota_write+0xbf/0x280 > [14464.149407] write_blk+0x35/0x70 > [14464.149407] get_free_dqblk+0x42/0xa0 > [14464.149407] do_insert_tree+0x1d6/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x464/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] ? ext4_quota_read+0x9d/0x110 > [14464.149407] do_insert_tree+0x218/0x480 > [14464.149407] ? __kmalloc+0x319/0x350 > [14464.149407] qtree_write_dquot+0x79/0x1b0 > [14464.149407] v2_write_dquot+0x52/0xb0 > [14464.149407] dquot_acquire+0x8e/0x100 > [14464.149407] ext4_acquire_dquot+0x72/0xc0 > [14464.149407] dqget+0x24a/0x4a0 > [14464.149407] dquot_transfer+0xfe/0x140 > [14464.149407] ext4_setattr+0x134/0x9c0 > [14464.149407] notify_change+0x34b/0x490 > [14464.149407] ? chown_common+0x96/0x150 > [14464.149407] chown_common+0x96/0x150 > [14464.149407] ? preempt_count_add+0x49/0xa0 > [14464.149407] do_fchownat+0x8d/0xe0 > [14464.149407] __x64_sys_fchownat+0x21/0x30 > [14464.149407] do_syscall_64+0x33/0x40 > [14464.149407] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [14464.149407] RIP: 0033:0x7efd3d2666ca > [14464.149407] Code: 48 8b 0d c9 f7 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 > 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 00 0f 05 <48> > 3d 01 f0 ff ff 73 01 c3 48 8b 0d 96 f7 0c 00 f7 d8 64 89 01 48 > [14464.149407] RSP: 002b:00007ffdbe6ae418 EFLAGS: 00000206 ORIG_RAX: > 0000000000000104 > [14464.149407] RAX: ffffffffffffffda RBX: 00007ffdbe6ae650 RCX: > 00007efd3d2666ca > [14464.149407] RDX: 0000000000007ab7 RSI: 00005598d654ef60 RDI: > 00000000ffffff9c > [14464.149407] RBP: 00000000ffffff9c R08: 0000000000000000 R09: > 00000000ffffffff > [14464.149407] R10: 00000000ffffffff R11: 0000000000000206 R12: > 00005598d654e1f0 > [14464.149407] R13: 00005598d654e268 R14: 0000000000007ab7 R15: > 00000000ffffffff > [14465.070206] EXT4-fs (vdc): mounted filesystem with ordered data mode. > Opts: grpquota > [14465.165678] EXT4-fs (vdc): re-mounted. Opts: (null) > [14465.201132] EXT4-fs (vdc): re-mounted. Opts: (null) -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" 2020-10-12 11:03 ` Jan Kara @ 2020-10-12 11:50 ` Jan Kara 0 siblings, 0 replies; 6+ messages in thread From: Jan Kara @ 2020-10-12 11:50 UTC (permalink / raw) To: Ritesh Harjani; +Cc: linux-ext4, Jan Kara, Eric Whitney, Theodore Ts'o On Mon 12-10-20 13:03:47, Jan Kara wrote: > Hi! > > On Fri 09-10-20 22:59:13, Ritesh Harjani wrote: > > While running generic/219 fstests on a 1k blocksize on x86 box, I see > > below dmesg warning msg and generic/219 fails. I haven't yet analyzed > > it, but I remember I have seen such warnings before as well in my testing. > > I was wondering if others have also seen it in their testing or not and > > if this is any known issue? > > I don't remember seeing this lately. What mkfs options do you use for your > test? I can see below that mount options are apparently: > > acl,user_xattr,block_validity,usrquota,grpquota BTW, I've tried and the problem indeed does *not* reproduce in my test KVM setup with 1k block size and default mkfs options. But it does reproduce with mkfs option "-O quota". So it indeed seems the lockdep annotation is somehow broken in this config. I'll have a look... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-10-12 11:50 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-09 17:29 xfstests global-ext4/1k generic/219 failure due to dmesg warning of "circular locking dependency detected" Ritesh Harjani 2020-10-09 20:12 ` Eric Whitney 2020-10-09 20:24 ` Darrick J. Wong 2020-10-12 11:35 ` Jan Kara 2020-10-12 11:03 ` Jan Kara 2020-10-12 11:50 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).