From: "Theodore Ts'o" <tytso@mit.edu>
To: Byungchul Park <byungchul.park@lge.com>
Cc: torvalds@linux-foundation.org, damien.lemoal@opensource.wdc.com,
	linux-ide@vger.kernel.org, adilger.kernel@dilger.ca,
	linux-ext4@vger.kernel.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	will@kernel.org, tglx@linutronix.de, rostedt@goodmis.org,
	joel@joelfernandes.org, sashal@kernel.org,
	daniel.vetter@ffwll.ch, chris@chris-wilson.co.uk,
	duyuyang@gmail.com, johannes.berg@intel.com, tj@kernel.org,
	willy@infradead.org, david@fromorbit.com, amir73il@gmail.com,
	bfields@fieldses.org, gregkh@linuxfoundation.org,
	kernel-team@lge.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, mhocko@kernel.org, minchan@kernel.org,
	hannes@cmpxchg.org, vdavydov.dev@gmail.com, sj@kernel.org,
	jglisse@redhat.com, dennis@kernel.org, cl@linux.com,
	penberg@kernel.org, rientjes@google.com, vbabka@suse.cz,
	ngupta@vflare.org, linux-block@vger.kernel.org,
	paolo.valente@linaro.org, josef@toxicpanda.com,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	jack@suse.cz, jack@suse.com, jlayton@kernel.org,
	dan.j.williams@intel.com, hch@infradead.org, djwong@kernel.org,
	dri-devel@lists.freedesktop.org, airlied@linux.ie,
	rodrigosiqueiramelo@gmail.com, melissa.srw@gmail.com,
	hamohammed.sa@gmail.com, 42.hyeyoo@gmail.com
Subject: Re: [PATCH RFC v6 00/21] DEPT(Dependency Tracker)
Date: Mon, 9 May 2022 17:05:23 -0400
Message-ID: <YnmCE2iwa0MSqocr@mit.edu>
In-Reply-To: <1651652269-15342-1-git-send-email-byungchul.park@lge.com>

I tried DEPT-v6 applied against 5.18-rc5, and it reported the
following false positive.

The reason why it's nonsense is that in context A's [W] wait:

[ 1538.545054] [W] folio_wait_bit_common(pglocked:0):
[ 1538.545370] [<ffffffff81259944>] __filemap_get_folio+0x3e4/0x420
[ 1538.545763] stacktrace:
[ 1538.545928]       folio_wait_bit_common+0x2fa/0x460
[ 1538.546248]       __filemap_get_folio+0x3e4/0x420
[ 1538.546558]       pagecache_get_page+0x11/0x40
[ 1538.546852]       ext4_mb_init_group+0x80/0x2e0
[ 1538.547152]       ext4_mb_good_group_nolock+0x2a3/0x2d0

... we're reading the block allocation bitmap into the page cache.
This does not correspond to a real inode, and so we don't actually
take ei->i_data_sem on the pseudo-inode used for this.

In contrast, in context B's [W] and [E] stack traces, the
folio_wait_bit is clearly associated with a page which is mapped to a
real inode:

[ 1538.553656] [W] down_write(&ei->i_data_sem:0):
[ 1538.553948] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.554320] stacktrace:
[ 1538.554485]       ext4_map_blocks+0x17b/0x680
[ 1538.554772]       mpage_map_and_submit_extent+0xef/0x530
[ 1538.555122]       ext4_writepages+0x798/0x990
[ 1538.555409]       do_writepages+0xcf/0x1c0
[ 1538.555682]       __writeback_single_inode+0x58/0x3f0
[ 1538.556014]       writeback_sb_inodes+0x210/0x540
  		     ...

[ 1538.558621] [E] folio_wake_bit(pglocked:0):
[ 1538.558896] [<ffffffff814418c0>] ext4_bio_write_page+0x400/0x560
[ 1538.559290] stacktrace:
[ 1538.559455]       ext4_bio_write_page+0x400/0x560
[ 1538.559765]       mpage_submit_page+0x5c/0x80
[ 1538.560051]       mpage_map_and_submit_buffers+0x15a/0x250
[ 1538.560409]       mpage_map_and_submit_extent+0x134/0x530
[ 1538.560764]       ext4_writepages+0x798/0x990
[ 1538.561057]       do_writepages+0xcf/0x1c0
[ 1538.561329]       __writeback_single_inode+0x58/0x3f0
		...


In any case, this will ***never*** deadlock, and the report is due to
DEPT fundamentally not understanding that the pages being waited on
may belong to completely different inodes, and so there is zero
possible chance this would ever deadlock.
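
To make the argument concrete, here is a toy userspace model (Python,
not DEPT code) of the two dependencies in the report.  It is only a
sketch under stated assumptions: the owner labels "new_dir",
"buddy_cache" and "real_inode" are made-up names for illustration, and
the edge direction (class held -> class waited on) is a simplification
of what a dependency tracker records.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on the DFS stack / done
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:   # back edge: we found a cycle
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


# Each observation: (class held, its owner, class waited on, its owner).
# The owner labels are hypothetical; they stand for "which inode".
observations = [
    ("i_data_sem", "new_dir", "pglocked", "buddy_cache"),    # context A
    ("pglocked", "real_inode", "i_data_sem", "real_inode"),  # context B
]


def edges(obs, key):
    """Build graph edges, naming each node with the given key function."""
    return [(key(h, ho), key(w, wo)) for h, ho, w, wo in obs]


# One class per wait name, regardless of which inode the page belongs to.
by_name = edges(observations, lambda name, owner: name)
# Classes split by owner, so pages of different inodes stay distinct.
by_owner = edges(observations, lambda name, owner: (name, owner))

print(has_cycle(by_name))    # True: the reported "circular dependency"
print(has_cycle(by_owner))   # False: the two page waits never meet
```

With a single "pglocked" class the two contexts close a loop; once the
page waits are told apart by the inode they belong to, the loop
disappears, which is the point being made above.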

I suspect there will be similar false positives for tests (or
userspace) that use the copy_file_range(2) or sendfile(2) system calls.

I've included the full DEPT log report below.

						- Ted

generic/011		[20:11:16][ 1533.411773] run fstests generic/011 at 2022-05-07 20:11:16
[ 1533.509603] DEPT_INFO_ONCE: Need to expand the ring buffer.
[ 1536.910044] DEPT_INFO_ONCE: Pool(wait) is empty.
[ 1538.533315] ===================================================
[ 1538.533793] DEPT: Circular dependency has been detected.
[ 1538.534199] 5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571 Not tainted
[ 1538.534645] ---------------------------------------------------
[ 1538.535035] summary
[ 1538.535177] ---------------------------------------------------
[ 1538.535567] *** DEADLOCK ***
[ 1538.535567] 
[ 1538.535854] context A
[ 1538.536008]     [S] down_write(&ei->i_data_sem:0)
[ 1538.536323]     [W] folio_wait_bit_common(pglocked:0)
[ 1538.536655]     [E] up_write(&ei->i_data_sem:0)
[ 1538.536958] 
[ 1538.537063] context B
[ 1538.537216]     [S] (unknown)(pglocked:0)
[ 1538.537480]     [W] down_write(&ei->i_data_sem:0)
[ 1538.537789]     [E] folio_wake_bit(pglocked:0)
[ 1538.538082] 
[ 1538.538184] [S]: start of the event context
[ 1538.538460] [W]: the wait blocked
[ 1538.538680] [E]: the event not reachable
[ 1538.538939] ---------------------------------------------------
[ 1538.539327] context A's detail
[ 1538.539530] ---------------------------------------------------
[ 1538.539918] context A
[ 1538.540072]     [S] down_write(&ei->i_data_sem:0)
[ 1538.540382]     [W] folio_wait_bit_common(pglocked:0)
[ 1538.540712]     [E] up_write(&ei->i_data_sem:0)
[ 1538.541015] 
[ 1538.541119] [S] down_write(&ei->i_data_sem:0):
[ 1538.541410] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.541782] stacktrace:
[ 1538.541946]       ext4_map_blocks+0x17b/0x680
[ 1538.542234]       ext4_getblk+0x5f/0x1f0
[ 1538.542493]       ext4_bread+0xc/0x70
[ 1538.542736]       ext4_append+0x48/0xf0
[ 1538.542991]       ext4_init_new_dir+0xc8/0x160
[ 1538.543284]       ext4_mkdir+0x19a/0x320
[ 1538.543542]       vfs_mkdir+0x83/0xe0
[ 1538.543788]       do_mkdirat+0x8c/0x130
[ 1538.544042]       __x64_sys_mkdir+0x29/0x30
[ 1538.544319]       do_syscall_64+0x40/0x90
[ 1538.544584]       entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1538.544949] 
[ 1538.545054] [W] folio_wait_bit_common(pglocked:0):
[ 1538.545370] [<ffffffff81259944>] __filemap_get_folio+0x3e4/0x420
[ 1538.545763] stacktrace:
[ 1538.545928]       folio_wait_bit_common+0x2fa/0x460
[ 1538.546248]       __filemap_get_folio+0x3e4/0x420
[ 1538.546558]       pagecache_get_page+0x11/0x40
[ 1538.546852]       ext4_mb_init_group+0x80/0x2e0
[ 1538.547152]       ext4_mb_good_group_nolock+0x2a3/0x2d0
[ 1538.547496]       ext4_mb_regular_allocator+0x391/0x780
[ 1538.547840]       ext4_mb_new_blocks+0x44e/0x720
[ 1538.548145]       ext4_ext_map_blocks+0x7f1/0xd00
[ 1538.548455]       ext4_map_blocks+0x19e/0x680
[ 1538.548743]       ext4_getblk+0x5f/0x1f0
[ 1538.549006]       ext4_bread+0xc/0x70
[ 1538.549250]       ext4_append+0x48/0xf0
[ 1538.549505]       ext4_init_new_dir+0xc8/0x160
[ 1538.549798]       ext4_mkdir+0x19a/0x320
[ 1538.550058]       vfs_mkdir+0x83/0xe0
[ 1538.550302]       do_mkdirat+0x8c/0x130
[ 1538.550557] 
[ 1538.550660] [E] up_write(&ei->i_data_sem:0):
[ 1538.550940] (N/A)
[ 1538.551071] ---------------------------------------------------
[ 1538.551459] context B's detail
[ 1538.551662] ---------------------------------------------------
[ 1538.552047] context B
[ 1538.552202]     [S] (unknown)(pglocked:0)
[ 1538.552466]     [W] down_write(&ei->i_data_sem:0)
[ 1538.552775]     [E] folio_wake_bit(pglocked:0)
[ 1538.553071] 
[ 1538.553174] [S] (unknown)(pglocked:0):
[ 1538.553422] (N/A)
[ 1538.553553] 
[ 1538.553656] [W] down_write(&ei->i_data_sem:0):
[ 1538.553948] [<ffffffff8141c01b>] ext4_map_blocks+0x17b/0x680
[ 1538.554320] stacktrace:
[ 1538.554485]       ext4_map_blocks+0x17b/0x680
[ 1538.554772]       mpage_map_and_submit_extent+0xef/0x530
[ 1538.555122]       ext4_writepages+0x798/0x990
[ 1538.555409]       do_writepages+0xcf/0x1c0
[ 1538.555682]       __writeback_single_inode+0x58/0x3f0
[ 1538.556014]       writeback_sb_inodes+0x210/0x540
[ 1538.556324]       __writeback_inodes_wb+0x4c/0xe0
[ 1538.556635]       wb_writeback+0x298/0x450
[ 1538.556911]       wb_do_writeback+0x29e/0x320
[ 1538.557199]       wb_workfn+0x6a/0x2c0
[ 1538.557447]       process_one_work+0x302/0x650
[ 1538.557743]       worker_thread+0x55/0x400
[ 1538.558013]       kthread+0xf0/0x120
[ 1538.558251]       ret_from_fork+0x1f/0x30
[ 1538.558518] 
[ 1538.558621] [E] folio_wake_bit(pglocked:0):
[ 1538.558896] [<ffffffff814418c0>] ext4_bio_write_page+0x400/0x560
[ 1538.559290] stacktrace:
[ 1538.559455]       ext4_bio_write_page+0x400/0x560
[ 1538.559765]       mpage_submit_page+0x5c/0x80
[ 1538.560051]       mpage_map_and_submit_buffers+0x15a/0x250
[ 1538.560409]       mpage_map_and_submit_extent+0x134/0x530
[ 1538.560764]       ext4_writepages+0x798/0x990
[ 1538.561057]       do_writepages+0xcf/0x1c0
[ 1538.561329]       __writeback_single_inode+0x58/0x3f0
[ 1538.561662]       writeback_sb_inodes+0x210/0x540
[ 1538.561973]       __writeback_inodes_wb+0x4c/0xe0
[ 1538.562283]       wb_writeback+0x298/0x450
[ 1538.562555]       wb_do_writeback+0x29e/0x320
[ 1538.562842]       wb_workfn+0x6a/0x2c0
[ 1538.563095]       process_one_work+0x302/0x650
[ 1538.563387]       worker_thread+0x55/0x400
[ 1538.563658]       kthread+0xf0/0x120
[ 1538.563895]       ret_from_fork+0x1f/0x30
[ 1538.564161] ---------------------------------------------------
[ 1538.564548] information that might be helpful
[ 1538.564832] ---------------------------------------------------
[ 1538.565223] CPU: 1 PID: 46539 Comm: dirstress Not tainted 5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571
[ 1538.565854] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1538.566394] Call Trace:
[ 1538.566559]  <TASK>
[ 1538.566701]  dump_stack_lvl+0x4f/0x68
[ 1538.566945]  print_circle.cold+0x15b/0x169
[ 1538.567218]  ? print_circle+0xe0/0xe0
[ 1538.567461]  cb_check_dl+0x55/0x60
[ 1538.567687]  bfs+0xd5/0x1b0
[ 1538.567874]  add_dep+0xd3/0x1a0
[ 1538.568083]  ? __filemap_get_folio+0x3e4/0x420
[ 1538.568374]  add_wait+0xe3/0x250
[ 1538.568590]  ? __filemap_get_folio+0x3e4/0x420
[ 1538.568886]  dept_wait_split_map+0xb1/0x130
[ 1538.569163]  folio_wait_bit_common+0x2fa/0x460
[ 1538.569456]  ? lock_is_held_type+0xfc/0x130
[ 1538.569733]  __filemap_get_folio+0x3e4/0x420
[ 1538.570013]  ? __lock_release+0x1b2/0x2c0
[ 1538.570278]  pagecache_get_page+0x11/0x40
[ 1538.570543]  ext4_mb_init_group+0x80/0x2e0
[ 1538.570813]  ? ext4_get_group_desc+0xb2/0x200
[ 1538.571102]  ext4_mb_good_group_nolock+0x2a3/0x2d0
[ 1538.571418]  ext4_mb_regular_allocator+0x391/0x780
[ 1538.571733]  ? rcu_read_lock_sched_held+0x3f/0x70
[ 1538.572044]  ? trace_kmem_cache_alloc+0x2c/0xd0
[ 1538.572343]  ? kmem_cache_alloc+0x1f7/0x3f0
[ 1538.572618]  ext4_mb_new_blocks+0x44e/0x720
[ 1538.572896]  ext4_ext_map_blocks+0x7f1/0xd00
[ 1538.573179]  ? find_held_lock+0x2b/0x80
[ 1538.573434]  ext4_map_blocks+0x19e/0x680
[ 1538.573693]  ext4_getblk+0x5f/0x1f0
[ 1538.573927]  ext4_bread+0xc/0x70
[ 1538.574141]  ext4_append+0x48/0xf0
[ 1538.574369]  ext4_init_new_dir+0xc8/0x160
[ 1538.574634]  ext4_mkdir+0x19a/0x320
[ 1538.574866]  vfs_mkdir+0x83/0xe0
[ 1538.575082]  do_mkdirat+0x8c/0x130
[ 1538.575308]  __x64_sys_mkdir+0x29/0x30
[ 1538.575557]  do_syscall_64+0x40/0x90
[ 1538.575795]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1538.576128] RIP: 0033:0x7f0960466b07
[ 1538.576367] Code: 1f 40 00 48 8b 05 89 f3 0c 00 64 c7 00 5f 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 53 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 59 f3 0c 00 f7 d8 64 89 01 48
[ 1538.577576] RSP: 002b:00007ffd0fa955a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 1538.578069] RAX: ffffffffffffffda RBX: 0000000000000239 RCX: 00007f0960466b07
[ 1538.578533] RDX: 0000000000000000 RSI: 00000000000001ff RDI: 00007ffd0fa955d0
[ 1538.578995] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000010
[ 1538.579458] R10: 00007ffd0fa95345 R11: 0000000000000246 R12: 00000000000003e8
[ 1538.579923] R13: 0000000000000000 R14: 00007ffd0fa955d0 R15: 00007ffd0fa95dd0
[ 1538.580389]  </TASK>
[ 1540.581382] EXT4-fs (vdb): mounted filesystem with ordered data mode. Quota mode: none.
 [20:11:24] 8s


P.S.  Later on the console, the test ground to a halt because DEPT
started WARNING over and over and over again....

[ 3129.686102] DEPT_WARN_ON: dt->ecxt_held_pos == DEPT_MAX_ECXT_HELD
[ 3129.686396]  ? __might_fault+0x32/0x80
[ 3129.686660] WARNING: CPU: 1 PID: 107320 at kernel/dependency/dept.c:1537 add_ecxt+0x1c0/0x1d0
[ 3129.687040]  ? __might_fault+0x32/0x80
[ 3129.687282] CPU: 1 PID: 107320 Comm: aio-stress Tainted: G        W         5.18.0-rc5-xfstests-dept-00021-g8d3d751c9964 #571

with multiple CPUs completely spamming the serial console.  This
should probably be a WARN_ON_ONCE, or something that disables DEPT
entirely, since apparently there won't be any useful DEPT reports (or
any useful kernel work, for that matter) happening after this.
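
As a rough illustration of the "warn once, then turn the checker off"
behaviour suggested here, the following is a minimal userspace sketch
in Python.  It is not DEPT code: the Tracker class, its max_held
parameter and the warning text are hypothetical stand-ins for a
fixed-size table of held contexts like the one the DEPT_WARN_ON above
complains about.

```python
class Tracker:
    """Toy stand-in for a checker with a fixed-size table of held contexts."""

    def __init__(self, max_held):
        self.max_held = max_held   # hypothetical capacity of the table
        self.held = []
        self.enabled = True
        self.warnings = []         # collected instead of printed, for clarity

    def add(self, ctx):
        """Record ctx; return True if it was tracked, False otherwise."""
        if not self.enabled:
            return False           # tracker is off: do nothing, stay quiet
        if len(self.held) == self.max_held:
            # Warn exactly once, then switch the tracker off for good.
            self.warnings.append("held-context table full; disabling tracker")
            self.enabled = False
            return False
        self.held.append(ctx)
        return True


tracker = Tracker(max_held=2)
results = [tracker.add(n) for n in range(5)]
print(results)                 # [True, True, False, False, False]
print(len(tracker.warnings))   # 1
```

The overflow is reported a single time and every later call returns
quietly, so the console is not flooded and the rest of the system can
keep doing useful work.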

