From: "Theodore Ts'o" <tytso@mit.edu>
To: Byungchul Park <byungchul.park@lge.com>
Cc: torvalds@linux-foundation.org, damien.lemoal@opensource.wdc.com, linux-ide@vger.kernel.org, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, mingo@redhat.com, linux-kernel@vger.kernel.org, peterz@infradead.org, will@kernel.org, tglx@linutronix.de, rostedt@goodmis.org, joel@joelfernandes.org, sashal@kernel.org, daniel.vetter@ffwll.ch, chris@chris-wilson.co.uk, duyuyang@gmail.com, johannes.berg@intel.com, tj@kernel.org, willy@infradead.org, david@fromorbit.com, amir73il@gmail.com, bfields@fieldses.org, gregkh@linuxfoundation.org, kernel-team@lge.com, linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org, minchan@kernel.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com, sj@kernel.org, jglisse@redhat.com, dennis@kernel.org, cl@linux.com, penberg@kernel.org, rientjes@google.com, vbabka@suse.cz, ngupta@vflare.org, linux-block@vger.kernel.org, paolo.valente@linaro.org, josef@toxicpanda.com, linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz, jack@suse.com, jlayton@kernel.org, dan.j.williams@intel.com, hch@infradead.org, djwong@kernel.org, dri-devel@lists.freedesktop.org, airlied@linux.ie, rodrigosiqueiramelo@gmail.com, melissa.srw@gmail.com, hamohammed.sa@gmail.com, 42.hyeyoo@gmail.com
Subject: Re: [PATCH RFC v6 00/21] DEPT(Dependency Tracker)
Date: Mon, 9 May 2022 17:05:23 -0400
Message-ID: <YnmCE2iwa0MSqocr@mit.edu>
In-Reply-To: <1651652269-15342-1-git-send-email-byungchul.park@lge.com>

I tried DEPT-v6 applied against 5.18-rc5, and it reported the following false positive.

The reason why it's nonsense is that in context A's [W] wait:

[ 1538.545054] [W] folio_wait_bit_common(pglocked:0):
[ 1538.545370] [<ffffffff81259944>] __filemap_get_folio+0x3e4/0x420
[ 1538.545763] stacktrace:
[ 1538.545928] folio_wait_bit_common+0x2fa/0x460
[ 1538.546248] __filemap_get_folio+0x3e4/0x420
[ 1538.546558] pagecache_get_page+0x11/0x40
[ 1538.546852] ext4_mb_init_group+0x80/0x2e0
[ 1538.547152] ext4_mb_good_group_nolock+0x2a3/0x2d0
...

we're reading the block allocation bitmap into the page cache. This does not correspond to a real inode, and so we don't actually take ei->i_data_sem on the pseudo-inode used.

In contrast, in context B's [W] and [E] stack traces, the folio_wait_bit is clearly associated with a page which is mapped to a real inode:

[ 1538.553656] [W] down_write(&ei->i_data_sem:0):
[ 1538.553948] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.554320] stacktrace:
[ 1538.554485] ext4_map_blocks+0x17b/0x680
[ 1538.554772] mpage_map_and_submit_extent+0xef/0x530
[ 1538.555122] ext4_writepages+0x798/0x990
[ 1538.555409] do_writepages+0xcf/0x1c0
[ 1538.555682] __writeback_single_inode+0x58/0x3f0
[ 1538.556014] writeback_sb_inodes+0x210/0x540
...
[ 1538.558621] [E] folio_wake_bit(pglocked:0):
[ 1538.558896] [<ffffffff814418c0>] ext4_bio_write_page+0x400/0x560
[ 1538.559290] stacktrace:
[ 1538.559455] ext4_bio_write_page+0x400/0x560
[ 1538.559765] mpage_submit_page+0x5c/0x80
[ 1538.560051] mpage_map_and_submit_buffers+0x15a/0x250
[ 1538.560409] mpage_map_and_submit_extent+0x134/0x530
[ 1538.560764] ext4_writepages+0x798/0x990
[ 1538.561057] do_writepages+0xcf/0x1c0
[ 1538.561329] __writeback_single_inode+0x58/0x3f0
...

In any case, this will ***never*** deadlock, and it's due to DEPT fundamentally not understanding that waiting on different pages may involve pages that come from completely different inodes, and so there is zero possible chance this would ever deadlock.

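The argument above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not how DEPT (or lockdep) is implemented, and the owner labels `dir_inode`, `file_inode`, and `bitmap_pseudo_inode`, as well as the helpers `build_edges` and `has_cycle`, are hypothetical names chosen for illustration. It records "holds X, then waits for Y" edges from the two contexts in the report and checks for a cycle twice: once keying objects only by kind (every page lock is the same `pglocked` class), and once keying them by kind and owning inode.

```python
from collections import defaultdict

# Each observation is (object held, object waited for).
# An object is (kind, owner); the owner labels are hypothetical.
OBSERVATIONS = [
    # Context A: mkdir holds i_data_sem, then waits for the bitmap page lock.
    (("i_data_sem", "dir_inode"), ("pglocked", "bitmap_pseudo_inode")),
    # Context B: writeback holds a data page lock, then waits for i_data_sem.
    (("pglocked", "file_inode"), ("i_data_sem", "file_inode")),
]


def build_edges(observations, key):
    """Map every object to its class with `key` and collect held->waited edges."""
    return {(key(held), key(waited)) for held, waited in observations}


def has_cycle(edges):
    """Depth-first search for a cycle in the directed dependency graph."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
    nodes = {node for edge in edges for node in edge}
    state = {}  # node -> "active" (on the DFS stack) or "done"

    def visit(node):
        state[node] = "active"
        for nxt in graph.get(node, ()):
            if state.get(nxt) == "active":
                return True  # back edge: a circular dependency
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(node) for node in nodes if node not in state)


# Coarse classes: one class per kind, regardless of which inode owns the page.
coarse_report = has_cycle(build_edges(OBSERVATIONS, key=lambda obj: obj[0]))
# Fine classes: the owning inode is part of the class.
fine_report = has_cycle(build_edges(OBSERVATIONS, key=lambda obj: obj))
```

With the coarse key the two edges close a loop (`i_data_sem` to `pglocked` and back), which matches the shape of the report. With the owner included, the bitmap page lock has no outgoing edge, so no cycle exists in this model.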
I suspect there will be similar false positives for tests (or userspace) that use the copy_file_range(2) or sendfile(2) system calls.

I've included the full DEPT log report below.

- Ted

generic/011 [20:11:16][ 1533.411773] run fstests generic/011 at 2022-05-07 20:11:16
[ 1533.509603] DEPT_INFO_ONCE: Need to expand the ring buffer.
[ 1536.910044] DEPT_INFO_ONCE: Pool(wait) is empty.
[ 1538.533315] ===================================================
[ 1538.533793] DEPT: Circular dependency has been detected.
[ 1538.534199] 5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571 Not tainted
[ 1538.534645] ---------------------------------------------------
[ 1538.535035] summary
[ 1538.535177] ---------------------------------------------------
[ 1538.535567] *** DEADLOCK ***
[ 1538.535567]
[ 1538.535854] context A
[ 1538.536008] [S] down_write(&ei->i_data_sem:0)
[ 1538.536323] [W] folio_wait_bit_common(pglocked:0)
[ 1538.536655] [E] up_write(&ei->i_data_sem:0)
[ 1538.536958]
[ 1538.537063] context B
[ 1538.537216] [S] (unknown)(pglocked:0)
[ 1538.537480] [W] down_write(&ei->i_data_sem:0)
[ 1538.537789] [E] folio_wake_bit(pglocked:0)
[ 1538.538082]
[ 1538.538184] [S]: start of the event context
[ 1538.538460] [W]: the wait blocked
[ 1538.538680] [E]: the event not reachable
[ 1538.538939] ---------------------------------------------------
[ 1538.539327] context A's detail
[ 1538.539530] ---------------------------------------------------
[ 1538.539918] context A
[ 1538.540072] [S] down_write(&ei->i_data_sem:0)
[ 1538.540382] [W] folio_wait_bit_common(pglocked:0)
[ 1538.540712] [E] up_write(&ei->i_data_sem:0)
[ 1538.541015]
[ 1538.541119] [S] down_write(&ei->i_data_sem:0):
[ 1538.541410] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.541782] stacktrace:
[ 1538.541946] ext4_map_blocks+0x17b/0x680
[ 1538.542234] ext4_getblk+0x5f/0x1f0
[ 1538.542493] ext4_bread+0xc/0x70
[ 1538.542736] ext4_append+0x48/0xf0
[ 1538.542991] ext4_init_new_dir+0xc8/0x160
[ 1538.543284] ext4_mkdir+0x19a/0x320
[ 1538.543542] vfs_mkdir+0x83/0xe0
[ 1538.543788] do_mkdirat+0x8c/0x130
[ 1538.544042] __x64_sys_mkdir+0x29/0x30
[ 1538.544319] do_syscall_64+0x40/0x90
[ 1538.544584] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1538.544949]
[ 1538.545054] [W] folio_wait_bit_common(pglocked:0):
[ 1538.545370] [<ffffffff81259944>] __filemap_get_folio+0x3e4/0x420
[ 1538.545763] stacktrace:
[ 1538.545928] folio_wait_bit_common+0x2fa/0x460
[ 1538.546248] __filemap_get_folio+0x3e4/0x420
[ 1538.546558] pagecache_get_page+0x11/0x40
[ 1538.546852] ext4_mb_init_group+0x80/0x2e0
[ 1538.547152] ext4_mb_good_group_nolock+0x2a3/0x2d0
[ 1538.547496] ext4_mb_regular_allocator+0x391/0x780
[ 1538.547840] ext4_mb_new_blocks+0x44e/0x720
[ 1538.548145] ext4_ext_map_blocks+0x7f1/0xd00
[ 1538.548455] ext4_map_blocks+0x19e/0x680
[ 1538.548743] ext4_getblk+0x5f/0x1f0
[ 1538.549006] ext4_bread+0xc/0x70
[ 1538.549250] ext4_append+0x48/0xf0
[ 1538.549505] ext4_init_new_dir+0xc8/0x160
[ 1538.549798] ext4_mkdir+0x19a/0x320
[ 1538.550058] vfs_mkdir+0x83/0xe0
[ 1538.550302] do_mkdirat+0x8c/0x130
[ 1538.550557]
[ 1538.550660] [E] up_write(&ei->i_data_sem:0):
[ 1538.550940] (N/A)
[ 1538.551071] ---------------------------------------------------
[ 1538.551459] context B's detail
[ 1538.551662] ---------------------------------------------------
[ 1538.552047] context B
[ 1538.552202] [S] (unknown)(pglocked:0)
[ 1538.552466] [W] down_write(&ei->i_data_sem:0)
[ 1538.552775] [E] folio_wake_bit(pglocked:0)
[ 1538.553071]
[ 1538.553174] [S] (unknown)(pglocked:0):
[ 1538.553422] (N/A)
[ 1538.553553]
[ 1538.553656] [W] down_write(&ei->i_data_sem:0):
[ 1538.553948] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.554320] stacktrace:
[ 1538.554485] ext4_map_blocks+0x17b/0x680
[ 1538.554772] mpage_map_and_submit_extent+0xef/0x530
[ 1538.555122] ext4_writepages+0x798/0x990
[ 1538.555409] do_writepages+0xcf/0x1c0
[ 1538.555682] __writeback_single_inode+0x58/0x3f0
[ 1538.556014] writeback_sb_inodes+0x210/0x540
[ 1538.556324] __writeback_inodes_wb+0x4c/0xe0
[ 1538.556635] wb_writeback+0x298/0x450
[ 1538.556911] wb_do_writeback+0x29e/0x320
[ 1538.557199] wb_workfn+0x6a/0x2c0
[ 1538.557447] process_one_work+0x302/0x650
[ 1538.557743] worker_thread+0x55/0x400
[ 1538.558013] kthread+0xf0/0x120
[ 1538.558251] ret_from_fork+0x1f/0x30
[ 1538.558518]
[ 1538.558621] [E] folio_wake_bit(pglocked:0):
[ 1538.558896] [<ffffffff814418c0>] ext4_bio_write_page+0x400/0x560
[ 1538.559290] stacktrace:
[ 1538.559455] ext4_bio_write_page+0x400/0x560
[ 1538.559765] mpage_submit_page+0x5c/0x80
[ 1538.560051] mpage_map_and_submit_buffers+0x15a/0x250
[ 1538.560409] mpage_map_and_submit_extent+0x134/0x530
[ 1538.560764] ext4_writepages+0x798/0x990
[ 1538.561057] do_writepages+0xcf/0x1c0
[ 1538.561329] __writeback_single_inode+0x58/0x3f0
[ 1538.561662] writeback_sb_inodes+0x210/0x540
[ 1538.561973] __writeback_inodes_wb+0x4c/0xe0
[ 1538.562283] wb_writeback+0x298/0x450
[ 1538.562555] wb_do_writeback+0x29e/0x320
[ 1538.562842] wb_workfn+0x6a/0x2c0
[ 1538.563095] process_one_work+0x302/0x650
[ 1538.563387] worker_thread+0x55/0x400
[ 1538.563658] kthread+0xf0/0x120
[ 1538.563895] ret_from_fork+0x1f/0x30
[ 1538.564161] ---------------------------------------------------
[ 1538.564548] information that might be helpful
[ 1538.564832] ---------------------------------------------------
[ 1538.565223] CPU: 1 PID: 46539 Comm: dirstress Not tainted 5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571
[ 1538.565854] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1538.566394] Call Trace:
[ 1538.566559] <TASK>
[ 1538.566701] dump_stack_lvl+0x4f/0x68
[ 1538.566945] print_circle.cold+0x15b/0x169
[ 1538.567218] ? print_circle+0xe0/0xe0
[ 1538.567461] cb_check_dl+0x55/0x60
[ 1538.567687] bfs+0xd5/0x1b0
[ 1538.567874] add_dep+0xd3/0x1a0
[ 1538.568083] ? __filemap_get_folio+0x3e4/0x420
[ 1538.568374] add_wait+0xe3/0x250
[ 1538.568590] ? __filemap_get_folio+0x3e4/0x420
[ 1538.568886] dept_wait_split_map+0xb1/0x130
[ 1538.569163] folio_wait_bit_common+0x2fa/0x460
[ 1538.569456] ? lock_is_held_type+0xfc/0x130
[ 1538.569733] __filemap_get_folio+0x3e4/0x420
[ 1538.570013] ? __lock_release+0x1b2/0x2c0
[ 1538.570278] pagecache_get_page+0x11/0x40
[ 1538.570543] ext4_mb_init_group+0x80/0x2e0
[ 1538.570813] ? ext4_get_group_desc+0xb2/0x200
[ 1538.571102] ext4_mb_good_group_nolock+0x2a3/0x2d0
[ 1538.571418] ext4_mb_regular_allocator+0x391/0x780
[ 1538.571733] ? rcu_read_lock_sched_held+0x3f/0x70
[ 1538.572044] ? trace_kmem_cache_alloc+0x2c/0xd0
[ 1538.572343] ? kmem_cache_alloc+0x1f7/0x3f0
[ 1538.572618] ext4_mb_new_blocks+0x44e/0x720
[ 1538.572896] ext4_ext_map_blocks+0x7f1/0xd00
[ 1538.573179] ? find_held_lock+0x2b/0x80
[ 1538.573434] ext4_map_blocks+0x19e/0x680
[ 1538.573693] ext4_getblk+0x5f/0x1f0
[ 1538.573927] ext4_bread+0xc/0x70
[ 1538.574141] ext4_append+0x48/0xf0
[ 1538.574369] ext4_init_new_dir+0xc8/0x160
[ 1538.574634] ext4_mkdir+0x19a/0x320
[ 1538.574866] vfs_mkdir+0x83/0xe0
[ 1538.575082] do_mkdirat+0x8c/0x130
[ 1538.575308] __x64_sys_mkdir+0x29/0x30
[ 1538.575557] do_syscall_64+0x40/0x90
[ 1538.575795] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1538.576128] RIP: 0033:0x7f0960466b07
[ 1538.576367] Code: 1f 40 00 48 8b 05 89 f3 0c 00 64 c7 00 5f 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 53 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 59 f3 0c 00 f7 d8 64 89 01 48
[ 1538.577576] RSP: 002b:00007ffd0fa955a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 1538.578069] RAX: ffffffffffffffda RBX: 0000000000000239 RCX: 00007f0960466b07
[ 1538.578533] RDX: 0000000000000000 RSI: 00000000000001ff RDI: 00007ffd0fa955d0
[ 1538.578995] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000010
[ 1538.579458] R10: 00007ffd0fa95345 R11: 0000000000000246 R12: 00000000000003e8
[ 1538.579923] R13: 0000000000000000 R14: 00007ffd0fa955d0 R15: 00007ffd0fa95dd0
[ 1538.580389] </TASK>
[ 1540.581382] EXT4-fs (vdb): mounted filesystem with ordered data mode. Quota mode: none.
 [20:11:24] 8s

P.S. Later on the console, the test ground to a halt because DEPT started WARNING over and over and over again....

[ 3129.686102] DEPT_WARN_ON: dt->ecxt_held_pos == DEPT_MAX_ECXT_HELD
[ 3129.686396] ? __might_fault+0x32/0x80
[ 3129.686660] WARNING: CPU: 1 PID: 107320 at kernel/dependency/dept.c:1537 add_ecxt+0x1c0/0x1d0
[ 3129.687040] ? __might_fault+0x32/0x80
[ 3129.687282] CPU: 1 PID: 107320 Comm: aio-stress Tainted: G W 5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571

with multiple CPUs completely spamming the serial console. This should probably be a WARN_ON_ONCE, or something that disables DEPT entirely, since apparently there won't be any useful DEPT reports (or any useful kernel work, for that matter) happening after this.
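
The behavior suggested in the P.S. (warn once, then switch the tracker off) can be modeled in user space as follows. This is a minimal sketch, not DEPT code: the class `Tracker`, its `MAX_HELD` limit, and the method `add_held` are hypothetical stand-ins for the per-task table whose overflow triggers the `DEPT_WARN_ON` above, and the tiny limit is chosen only to keep the example short.

```python
class Tracker:
    """Toy model of a tracker with a fixed-size table of held event contexts."""

    MAX_HELD = 4  # hypothetical limit; stands in for DEPT_MAX_ECXT_HELD

    def __init__(self):
        self.enabled = True
        self.held = []
        self.warnings = []

    def add_held(self, name):
        """Record a held context; on overflow warn exactly once and disable."""
        if not self.enabled:
            return False  # already disabled: stay silent instead of warning again
        if len(self.held) == self.MAX_HELD:
            self.warnings.append("held-context table full, disabling tracker")
            self.enabled = False
            return False
        self.held.append(name)
        return True


tracker = Tracker()
# Ten attempts against a table of four entries.
results = [tracker.add_held(f"ctx{i}") for i in range(10)]
```

In this model the fifth call emits the single warning and disables the tracker, and the remaining calls return without logging, so the console is not flooded.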