From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: dm-devel@redhat.com
Subject: Re: 5.6-rc1 lockdep warning during dm-thin shutdown
Date: Thu, 20 Feb 2020 18:45:19 -0500 [thread overview]
Message-ID: <20200220234519.GA620489@mit.edu> (raw)
In-Reply-To: <20200213205014.GA349527@mit.edu>
Ping? It looks like I'm still seeing across the board failures for
generic/347 in 5.6-rc2 and it looks like it's pretty clearly a dm-thin
issue.
If this isn't going to be fixed soon, that's fine. I'm just going to
blacklist generic/347 in kvm-xfstests and gce-xfstests.
Thanks,
- Ted
On Thu, Feb 13, 2020 at 03:50:14PM -0500, Theodore Y. Ts'o wrote:
> I noticed this while running xfstests generic/347. The lockdep report is below:
>
> [ 3742.062744] run fstests generic/347 at 2020-02-13 13:18:03
> [ 3742.431761] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: acl,user_xattr,block_validity
> [ 3743.696630] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: acl,user_xattr,block_validity
> [ 3749.384483] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
> [ 3750.853156] device-mapper: thin: 253:4: switching pool to out-of-data-space (queue IO) mode
> [ 3812.123242] device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode
> [ 3812.138217] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 378 starting block 128192)
> [ 3812.150804] buffer_io_error: 708 callbacks suppressed
> [ 3812.150809] Buffer I/O error on device dm-5, logical block 128000
> [ 3812.162553] Buffer I/O error on device dm-5, logical block 128001
> [ 3812.169205] Buffer I/O error on device dm-5, logical block 128002
> [ 3812.175793] Buffer I/O error on device dm-5, logical block 128003
> [ 3812.183686] Buffer I/O error on device dm-5, logical block 128004
> [ 3812.191467] Buffer I/O error on device dm-5, logical block 128005
> [ 3812.199311] Buffer I/O error on device dm-5, logical block 128006
> [ 3812.207204] Buffer I/O error on device dm-5, logical block 128007
> [ 3812.214915] Buffer I/O error on device dm-5, logical block 128008
> [ 3812.222699] Buffer I/O error on device dm-5, logical block 128009
> [ 3812.240796] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 379 starting block 128448)
> [ 3812.264563] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 380 starting block 128704)
> [ 3812.287052] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 381 starting block 128960)
> [ 3812.308866] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 382 starting block 129216)
> [ 3812.330810] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 383 starting block 129472)
> [ 3812.352898] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 384 starting block 129728)
> [ 3812.373989] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 385 starting block 129984)
> [ 3812.395096] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 386 starting block 130240)
> [ 3812.416946] EXT4-fs warning (device dm-5): ext4_end_bio:347: I/O error 3 writing to inode 387 starting block 130496)
> [ 3813.736289] device-mapper: thin: 253:4: switching pool to write mode
> [ 3813.736304] device-mapper: thin: 253:4: growing the data device from 1000 to 1200 blocks
> [ 3813.848386] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
>
> [ 3814.349795] ======================================================
> [ 3814.356234] WARNING: possible circular locking dependency detected
> [ 3814.362533] 5.5.0-xfstests-07635-gc96dceeabf76 #1477 Not tainted
> [ 3814.368654] ------------------------------------------------------
> [ 3814.374947] dmsetup/10207 is trying to acquire lock:
> [ 3814.380079] ffffffff85489a78 (shrinker_rwsem){++++}, at: unregister_shrinker+0x22/0x80
> [ 3814.388115]
> but task is already holding lock:
> [ 3814.394061] ffff9c184c8c95d0 (&pmd->root_lock){++++}, at: dm_pool_metadata_close+0xba/0x120
> [ 3814.402530]
> which lock already depends on the new lock.
>
> [ 3814.411867]
> the existing dependency chain (in reverse order) is:
> [ 3814.419464]
> -> #3 (&pmd->root_lock){++++}:
> [ 3814.425321] lock_acquire+0x9a/0x160
> [ 3814.429545] down_read+0x41/0x200
> [ 3814.433542] dm_thin_find_block+0x2e/0x80
> [ 3814.438590] thin_map+0x172/0x270
> [ 3814.442548] __map_bio+0x42/0x210
> [ 3814.446501] __split_and_process_non_flush+0xd6/0x150
> [ 3814.452292] __split_and_process_bio+0xe0/0x380
> [ 3814.457588] dm_process_bio+0x90/0x2b0
> [ 3814.462838] dm_make_request+0x54/0x120
> [ 3814.467596] generic_make_request+0xd8/0x400
> [ 3814.472508] submit_bio+0x3c/0x140
> [ 3814.476587] submit_bh_wbc+0x16d/0x190
> [ 3814.480975] ext4_read_block_bitmap_nowait+0x41d/0x670
> [ 3814.486896] ext4_mb_init_cache+0x163/0x760
> [ 3814.491799] ext4_mb_init_group+0x19a/0x260
> [ 3814.496759] ext4_mb_good_group+0x16a/0x180
> [ 3814.501587] ext4_mb_regular_allocator+0x28f/0x400
> [ 3814.507036] ext4_mb_new_blocks+0x57e/0xbe0
> [ 3814.512016] ext4_ext_map_blocks+0xb32/0xf30
> [ 3814.516992] ext4_map_blocks+0xef/0x580
> [ 3814.521743] mpage_map_and_submit_extent+0xe7/0x510
> [ 3814.527259] ext4_writepages+0x895/0xb10
> [ 3814.531971] do_writepages+0x41/0xe0
> [ 3814.536199] __filemap_fdatawrite_range+0xd2/0x120
> [ 3814.541752] file_write_and_wait_range+0x71/0xc0
> [ 3814.547100] ext4_sync_file+0x130/0x4b0
> [ 3814.551580] do_fsync+0x38/0x70
> [ 3814.555367] __x64_sys_fsync+0x10/0x20
> [ 3814.559760] do_syscall_64+0x50/0x1f0
> [ 3814.564195] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.569882]
> -> #2 (&ei->i_data_sem){++++}:
> [ 3814.575580] lock_acquire+0x9a/0x160
> [ 3814.580109] down_write+0x40/0x110
> [ 3814.584145] ext4_truncate+0x3ce/0x510
> [ 3814.588529] ext4_evict_inode+0x29a/0x5c0
> [ 3814.593184] evict+0xd0/0x1a0
> [ 3814.596952] vfs_rmdir+0x167/0x190
> [ 3814.600990] do_rmdir+0x145/0x1b0
> [ 3814.605303] do_syscall_64+0x50/0x1f0
> [ 3814.609613] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.615301]
> -> #1 (jbd2_handle){++++}:
> [ 3814.620744] lock_acquire+0x9a/0x160
> [ 3814.625088] start_this_handle+0x226/0x680
> [ 3814.629824] jbd2__journal_start+0xf7/0x270
> [ 3814.634856] __ext4_journal_start_sb+0x85/0x1b0
> [ 3814.640051] ext4_release_dquot+0x68/0xc0
> [ 3814.644716] dqput.part.0+0x7f/0x210
> [ 3814.648931] __dquot_drop+0x69/0x90
> [ 3814.653495] ext4_clear_inode+0x35/0x90
> [ 3814.658107] ext4_evict_inode+0x6f/0x5c0
> [ 3814.662767] evict+0xd0/0x1a0
> [ 3814.666365] dispose_list+0x48/0x70
> [ 3814.673422] prune_icache_sb+0x54/0x80
> [ 3814.677808] super_cache_scan+0x124/0x1a0
> [ 3814.682454] do_shrink_slab+0x171/0x3a0
> [ 3814.687036] shrink_slab+0x22c/0x2b0
> [ 3814.691322] drop_slab_node+0x2e/0x50
> [ 3814.695628] drop_slab+0x34/0x60
> [ 3814.699511] drop_caches_sysctl_handler+0x70/0x80
> [ 3814.704859] proc_sys_call_handler+0x19d/0x1c0
> [ 3814.709944] vfs_write+0xde/0x1b0
> [ 3814.713911] ksys_write+0x68/0xe0
> [ 3814.718207] do_syscall_64+0x50/0x1f0
> [ 3814.722512] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.728895]
> -> #0 (shrinker_rwsem){++++}:
> [ 3814.734504] check_prev_add+0x8e/0x9f0
> [ 3814.738897] __lock_acquire+0xcdf/0xf80
> [ 3814.743431] lock_acquire+0x9a/0x160
> [ 3814.747651] down_write+0x40/0x110
> [ 3814.751806] unregister_shrinker+0x22/0x80
> [ 3814.756646] dm_bufio_client_destroy+0xbf/0x220
> [ 3814.761922] dm_block_manager_destroy+0x11/0x20
> [ 3814.767098] dm_pool_metadata_close+0x113/0x120
> [ 3814.772503] __pool_dec+0x96/0x120
> [ 3814.776547] pool_dtr+0x2d/0x70
> [ 3814.780574] dm_table_destroy+0x58/0x110
> [ 3814.785137] __dm_destroy+0x159/0x1b0
> [ 3814.789444] dev_remove+0xd3/0x110
> [ 3814.793482] ctl_ioctl+0x1ab/0x3b0
> [ 3814.797543] dm_ctl_ioctl+0xa/0x10
> [ 3814.801608] ksys_ioctl+0x82/0xc0
> [ 3814.805593] __x64_sys_ioctl+0x16/0x20
> [ 3814.809991] do_syscall_64+0x50/0x1f0
> [ 3814.814401] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3814.820226]
> other info that might help us debug this:
>
> [ 3814.828343] Chain exists of:
> shrinker_rwsem --> &ei->i_data_sem --> &pmd->root_lock
>
> [ 3814.839823] Possible unsafe locking scenario:
>
> [ 3814.845864] CPU0 CPU1
> [ 3814.850625] ---- ----
> [ 3814.855274] lock(&pmd->root_lock);
> [ 3814.858962] lock(&ei->i_data_sem);
> [ 3814.865215] lock(&pmd->root_lock);
> [ 3814.871686] lock(shrinker_rwsem);
> [ 3814.875377]
> *** DEADLOCK ***
>
> [ 3814.881412] 2 locks held by dmsetup/10207:
> [ 3814.885637] #0: ffffffff86fc05f0 (&dm_thin_pool_table.mutex){+.+.}, at: pool_dtr+0x1d/0x70
> [ 3814.894114] #1: ffff9c184c8c95d0 (&pmd->root_lock){++++}, at: dm_pool_metadata_close+0xba/0x120
> [ 3814.903068]
> stack backtrace:
> [ 3814.907544] CPU: 1 PID: 10207 Comm: dmsetup Not tainted 5.5.0-xfstests-07635-gc96dceeabf76 #1477
> [ 3814.916451] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [ 3814.925915] Call Trace:
> [ 3814.928485] dump_stack+0x71/0xa0
> [ 3814.931929] check_noncircular+0x162/0x180
> [ 3814.936144] check_prev_add+0x8e/0x9f0
> [ 3814.940010] __lock_acquire+0xcdf/0xf80
> [ 3814.943964] lock_acquire+0x9a/0x160
> [ 3814.947733] ? unregister_shrinker+0x22/0x80
> [ 3814.952123] down_write+0x40/0x110
> [ 3814.955650] ? unregister_shrinker+0x22/0x80
> [ 3814.960035] unregister_shrinker+0x22/0x80
> [ 3814.964248] dm_bufio_client_destroy+0xbf/0x220
> [ 3814.968898] dm_block_manager_destroy+0x11/0x20
> [ 3814.973551] dm_pool_metadata_close+0x113/0x120
> [ 3814.978307] __pool_dec+0x96/0x120
> [ 3814.981829] pool_dtr+0x2d/0x70
> [ 3814.985086] dm_table_destroy+0x58/0x110
> [ 3814.989156] __dm_destroy+0x159/0x1b0
> [ 3814.992939] dev_remove+0xd3/0x110
> [ 3814.996494] ? remove_all+0x30/0x30
> [ 3815.000136] ctl_ioctl+0x1ab/0x3b0
> [ 3815.003672] dm_ctl_ioctl+0xa/0x10
> [ 3815.007187] ksys_ioctl+0x82/0xc0
> [ 3815.010614] __x64_sys_ioctl+0x16/0x20
> [ 3815.014474] do_syscall_64+0x50/0x1f0
> [ 3815.018257] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3815.023440] RIP: 0033:0x7f2c9d82e427
> [ 3815.027131] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
> [ 3815.046014] RSP: 002b:00007ffee7570578 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
> [ 3815.053715] RAX: ffffffffffffffda RBX: 00007f2c9d90ca7f RCX: 00007f2c9d82e427
> [ 3815.061052] RDX: 00005625880b2b40 RSI: 00000000c138fd04 RDI: 0000000000000003
> [ 3815.068302] RBP: 00007ffee7570630 R08: 00007f2c9d95b310 R09: 00007ffee75703e0
> [ 3815.076018] R10: 00007f2c9d95a65a R11: 0000000000000202 R12: 00007f2c9d95a65a
> [ 3815.083398] R13: 00007f2c9d95a65a R14: 00007f2c9d95a65a R15: 00007f2c9d95a65a
next prev parent reply other threads:[~2020-02-20 23:45 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-13 20:50 5.6-rc1 lockdep warning during dm-thin shutdown Theodore Y. Ts'o
2020-02-20 23:45 ` Theodore Y. Ts'o [this message]
2020-02-23 19:54 ` [PATCH] dm thin metadata: fix lockdep complaint Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200220234519.GA620489@mit.edu \
--to=tytso@mit.edu \
--cc=dm-devel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.