* [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline @ 2023-03-14 9:06 Ye Bin 2023-03-14 16:31 ` Darrick J. Wong 0 siblings, 1 reply; 4+ messages in thread From: Ye Bin @ 2023-03-14 9:06 UTC (permalink / raw) To: djwong, linux-xfs; +Cc: linux-kernel, Ye Bin From: Ye Bin <yebin10@huawei.com> There's a issue when do cpu offline test: CPU: 48 PID: 1168152 Comm: umount Kdump: loaded Tainted: G L pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) pc : assfail+0x8c/0xb4 lr : assfail+0x38/0xb4 sp : ffffa00033ce7c40 x29: ffffa00033ce7c40 x28: ffffa00014794f30 x27: ffffa00014f6ca20 x26: 1fffe0120b2e2030 x25: ffff009059710188 x24: ffff00886c0a4650 x23: 1fffe0110d8148ca x22: ffff009059710180 x21: ffffa00015155680 x20: ffff00886c0a4000 x19: 0000000000000001 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000007 x14: 1fffe00304cef265 x13: ffff00182642b200 x12: ffff8012d37757bf x11: 1fffe012d37757be x10: ffff8012d37757be x9 : ffffa00010603a0c x8 : 0000000041b58ab3 x7 : ffff94000679cf44 x6 : 00000000ffffffc0 x5 : 0000000000000021 x4 : 00000000ffffffca x3 : 1ffff40002a27ee1 x2 : 0000000000000004 x1 : 0000000000000000 x0 : ffffa0001513f000 Call trace: assfail+0x8c/0xb4 xfs_destroy_percpu_counters+0x98/0xa4 xfs_fs_put_super+0x1a0/0x2a4 generic_shutdown_super+0x104/0x2c0 kill_block_super+0x8c/0xf4 deactivate_locked_super+0xa4/0x164 deactivate_super+0xb0/0xdc cleanup_mnt+0x29c/0x3ec __cleanup_mnt+0x1c/0x30 task_work_run+0xe0/0x200 do_notify_resume+0x244/0x320 work_pending+0xc/0xa0 We analyzed the data in vmcore is correct. But triggered above issue. As f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") commit describes there is a small race window between the online CPUs traversal of percpu_counter_sum and the CPU offline callback. This means percpu_counter_sum() may return incorrect result during cpu offline. To solve above issue use percpu_counter_sum_all() interface to make sure result is correct to prevent false triggering of assertions. Signed-off-by: Ye Bin <yebin10@huawei.com> --- fs/xfs/xfs_super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 2479b5cbd75e..c0ce66f966ee 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1076,7 +1076,7 @@ xfs_destroy_percpu_counters( percpu_counter_destroy(&mp->m_ifree); percpu_counter_destroy(&mp->m_fdblocks); ASSERT(xfs_is_shutdown(mp) || - percpu_counter_sum(&mp->m_delalloc_blks) == 0); + percpu_counter_sum_all(&mp->m_delalloc_blks) == 0); percpu_counter_destroy(&mp->m_delalloc_blks); percpu_counter_destroy(&mp->m_frextents); } -- 2.31.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline 2023-03-14 9:06 [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline Ye Bin @ 2023-03-14 16:31 ` Darrick J. Wong 2023-03-14 22:13 ` Dave Chinner 0 siblings, 1 reply; 4+ messages in thread From: Darrick J. Wong @ 2023-03-14 16:31 UTC (permalink / raw) To: Ye Bin; +Cc: linux-xfs, linux-kernel, Ye Bin On Tue, Mar 14, 2023 at 05:06:49PM +0800, Ye Bin wrote: > From: Ye Bin <yebin10@huawei.com> > > There's a issue when do cpu offline test: > CPU: 48 PID: 1168152 Comm: umount Kdump: loaded Tainted: G L > pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) > pc : assfail+0x8c/0xb4 > lr : assfail+0x38/0xb4 > sp : ffffa00033ce7c40 > x29: ffffa00033ce7c40 x28: ffffa00014794f30 > x27: ffffa00014f6ca20 x26: 1fffe0120b2e2030 > x25: ffff009059710188 x24: ffff00886c0a4650 > x23: 1fffe0110d8148ca x22: ffff009059710180 > x21: ffffa00015155680 x20: ffff00886c0a4000 > x19: 0000000000000001 x18: 0000000000000000 > x17: 0000000000000000 x16: 0000000000000000 > x15: 0000000000000007 x14: 1fffe00304cef265 > x13: ffff00182642b200 x12: ffff8012d37757bf > x11: 1fffe012d37757be x10: ffff8012d37757be > x9 : ffffa00010603a0c x8 : 0000000041b58ab3 > x7 : ffff94000679cf44 x6 : 00000000ffffffc0 > x5 : 0000000000000021 x4 : 00000000ffffffca > x3 : 1ffff40002a27ee1 x2 : 0000000000000004 > x1 : 0000000000000000 x0 : ffffa0001513f000 > Call trace: > assfail+0x8c/0xb4 > xfs_destroy_percpu_counters+0x98/0xa4 > xfs_fs_put_super+0x1a0/0x2a4 > generic_shutdown_super+0x104/0x2c0 > kill_block_super+0x8c/0xf4 > deactivate_locked_super+0xa4/0x164 > deactivate_super+0xb0/0xdc > cleanup_mnt+0x29c/0x3ec > __cleanup_mnt+0x1c/0x30 > task_work_run+0xe0/0x200 > do_notify_resume+0x244/0x320 > work_pending+0xc/0xa0 > > We analyzed the data in vmcore is correct. But triggered above issue. > As f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") > commit describes there is a small race window between the online CPUs traversal > of percpu_counter_sum and the CPU offline callback. This means percpu_counter_sum() > may return incorrect result during cpu offline. > To solve above issue use percpu_counter_sum_all() interface to make sure > result is correct to prevent false triggering of assertions. How about the other percpu_counter_sum callsites inside XFS? Some of them are involved in writing ondisk metadata (xfs_log_sb) or doing correctness checks (fs/xfs/scrub/*); shouldn't those also be using the _all variant? --D > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/xfs/xfs_super.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c > index 2479b5cbd75e..c0ce66f966ee 100644 > --- a/fs/xfs/xfs_super.c > +++ b/fs/xfs/xfs_super.c > @@ -1076,7 +1076,7 @@ xfs_destroy_percpu_counters( > percpu_counter_destroy(&mp->m_ifree); > percpu_counter_destroy(&mp->m_fdblocks); > ASSERT(xfs_is_shutdown(mp) || > - percpu_counter_sum(&mp->m_delalloc_blks) == 0); > + percpu_counter_sum_all(&mp->m_delalloc_blks) == 0); > percpu_counter_destroy(&mp->m_delalloc_blks); > percpu_counter_destroy(&mp->m_frextents); > } > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline 2023-03-14 16:31 ` Darrick J. Wong @ 2023-03-14 22:13 ` Dave Chinner 2023-03-15 3:06 ` Darrick J. Wong 0 siblings, 1 reply; 4+ messages in thread From: Dave Chinner @ 2023-03-14 22:13 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Ye Bin, linux-xfs, linux-kernel, Ye Bin On Tue, Mar 14, 2023 at 09:31:00AM -0700, Darrick J. Wong wrote: > On Tue, Mar 14, 2023 at 05:06:49PM +0800, Ye Bin wrote: > > From: Ye Bin <yebin10@huawei.com> > > > > There's a issue when do cpu offline test: > > CPU: 48 PID: 1168152 Comm: umount Kdump: loaded Tainted: G L > > pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) > > pc : assfail+0x8c/0xb4 > > lr : assfail+0x38/0xb4 > > sp : ffffa00033ce7c40 > > x29: ffffa00033ce7c40 x28: ffffa00014794f30 > > x27: ffffa00014f6ca20 x26: 1fffe0120b2e2030 > > x25: ffff009059710188 x24: ffff00886c0a4650 > > x23: 1fffe0110d8148ca x22: ffff009059710180 > > x21: ffffa00015155680 x20: ffff00886c0a4000 > > x19: 0000000000000001 x18: 0000000000000000 > > x17: 0000000000000000 x16: 0000000000000000 > > x15: 0000000000000007 x14: 1fffe00304cef265 > > x13: ffff00182642b200 x12: ffff8012d37757bf > > x11: 1fffe012d37757be x10: ffff8012d37757be > > x9 : ffffa00010603a0c x8 : 0000000041b58ab3 > > x7 : ffff94000679cf44 x6 : 00000000ffffffc0 > > x5 : 0000000000000021 x4 : 00000000ffffffca > > x3 : 1ffff40002a27ee1 x2 : 0000000000000004 > > x1 : 0000000000000000 x0 : ffffa0001513f000 > > Call trace: > > assfail+0x8c/0xb4 > > xfs_destroy_percpu_counters+0x98/0xa4 > > xfs_fs_put_super+0x1a0/0x2a4 > > generic_shutdown_super+0x104/0x2c0 > > kill_block_super+0x8c/0xf4 > > deactivate_locked_super+0xa4/0x164 > > deactivate_super+0xb0/0xdc > > cleanup_mnt+0x29c/0x3ec > > __cleanup_mnt+0x1c/0x30 > > task_work_run+0xe0/0x200 > > do_notify_resume+0x244/0x320 > > work_pending+0xc/0xa0 > > > > We analyzed the data in vmcore is correct. But triggered above issue. > > As f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") > > commit describes there is a small race window between the online CPUs traversal > > of percpu_counter_sum and the CPU offline callback. This means percpu_counter_sum() > > may return incorrect result during cpu offline. > > To solve above issue use percpu_counter_sum_all() interface to make sure > > result is correct to prevent false triggering of assertions. > > How about the other percpu_counter_sum callsites inside XFS? Some of > them are involved in writing ondisk metadata (xfs_log_sb) or doing > correctness checks (fs/xfs/scrub/*); shouldn't those also be using the > _all variant? Ugh. I kinda wish that the percpu_counter_sum_all() patch had been cc'd to lists for subsystems that use percpu_counter_sum() extensively, or just to people who have modified that code in the past. The problem is that it uses cpu_possible_mask, which means it walks all possible CPUs that can be added to the system even if the CPUs aren't physically present. That can be a lot in the case of systems that can have cpu-capable nodes hotplugged into them, and that makes percpu_counter_sum_all() excitingly expensive for no good reason. AFAICT, if we are trying to close a race condition between iterating online CPUs not summing dying CPUs and the cpu dead notification updating the sum, then shouldn't we be using cpu_mask_or(cpu_online_mask, cpu_dying_mask) for summing iteration rather than just cpu_online_mask? i.e. when a CPU is being taken down, it gets added to the cpu_dying_mask, then removed from the cpu_online_mask, then the offline notifications are run (i.e. the percpu counter dead callback), and when the CPU reaches the CPUHP_TEARDOWN_CPU state, it is removed from the cpu_dying_mask because it is now dead and all the "cpu dying" callbacks have been run. Except when a hotplug event is being processed, cpu_dying_mask will be empty, hence there is little change in summing overhead. But it will close the summing race condition when a CPU is being offlined... That, I think, is the solution we want for XFS. Having the percpu counters just do the right thing is far better than always having to wonder if summation interface we are using is correct in the face of CPU hotplug. I'll put a patchset together to do: 1. fix percpu_counter_sum() to include the dying mask in it's iteration. This should fix the XFS issue. 2. change the only user of percpu_counter_sum_all() to only use percpu_counter_sum() because percpu_counter_sum_all() is now redundant. 3. remove percpu_counter_sum_all() because it is unused. Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline 2023-03-14 22:13 ` Dave Chinner @ 2023-03-15 3:06 ` Darrick J. Wong 0 siblings, 0 replies; 4+ messages in thread From: Darrick J. Wong @ 2023-03-15 3:06 UTC (permalink / raw) To: Dave Chinner; +Cc: Ye Bin, linux-xfs, linux-kernel, Ye Bin [add lfsdevel to cc to spread the, um, love] TLDR: percpu_counter_sum doesn't add in the values from CPUs in the dying mask. As a result, the summation can race with cpu hotunplug and return the wrong values. On Wed, Mar 15, 2023 at 09:13:05AM +1100, Dave Chinner wrote: > On Tue, Mar 14, 2023 at 09:31:00AM -0700, Darrick J. Wong wrote: > > On Tue, Mar 14, 2023 at 05:06:49PM +0800, Ye Bin wrote: > > > From: Ye Bin <yebin10@huawei.com> > > > > > > There's a issue when do cpu offline test: > > > CPU: 48 PID: 1168152 Comm: umount Kdump: loaded Tainted: G L > > > pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) > > > pc : assfail+0x8c/0xb4 > > > lr : assfail+0x38/0xb4 > > > sp : ffffa00033ce7c40 > > > x29: ffffa00033ce7c40 x28: ffffa00014794f30 > > > x27: ffffa00014f6ca20 x26: 1fffe0120b2e2030 > > > x25: ffff009059710188 x24: ffff00886c0a4650 > > > x23: 1fffe0110d8148ca x22: ffff009059710180 > > > x21: ffffa00015155680 x20: ffff00886c0a4000 > > > x19: 0000000000000001 x18: 0000000000000000 > > > x17: 0000000000000000 x16: 0000000000000000 > > > x15: 0000000000000007 x14: 1fffe00304cef265 > > > x13: ffff00182642b200 x12: ffff8012d37757bf > > > x11: 1fffe012d37757be x10: ffff8012d37757be > > > x9 : ffffa00010603a0c x8 : 0000000041b58ab3 > > > x7 : ffff94000679cf44 x6 : 00000000ffffffc0 > > > x5 : 0000000000000021 x4 : 00000000ffffffca > > > x3 : 1ffff40002a27ee1 x2 : 0000000000000004 > > > x1 : 0000000000000000 x0 : ffffa0001513f000 > > > Call trace: > > > assfail+0x8c/0xb4 > > > xfs_destroy_percpu_counters+0x98/0xa4 > > > xfs_fs_put_super+0x1a0/0x2a4 > > > generic_shutdown_super+0x104/0x2c0 > > > kill_block_super+0x8c/0xf4 > > > deactivate_locked_super+0xa4/0x164 > > > deactivate_super+0xb0/0xdc > > > cleanup_mnt+0x29c/0x3ec > > > __cleanup_mnt+0x1c/0x30 > > > task_work_run+0xe0/0x200 > > > do_notify_resume+0x244/0x320 > > > work_pending+0xc/0xa0 > > > > > > We analyzed the data in vmcore is correct. But triggered above issue. > > > As f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") > > > commit describes there is a small race window between the online CPUs traversal > > > of percpu_counter_sum and the CPU offline callback. This means percpu_counter_sum() > > > may return incorrect result during cpu offline. > > > To solve above issue use percpu_counter_sum_all() interface to make sure > > > result is correct to prevent false triggering of assertions. > > > > How about the other percpu_counter_sum callsites inside XFS? Some of > > them are involved in writing ondisk metadata (xfs_log_sb) or doing > > correctness checks (fs/xfs/scrub/*); shouldn't those also be using the > > _all variant? > > Ugh. I kinda wish that the percpu_counter_sum_all() patch had been > cc'd to lists for subsystems that use percpu_counter_sum() > extensively, or just to people who have modified that code in the > past. > > The problem is that it uses cpu_possible_mask, which means it > walks all possible CPUs that can be added to the system even if the > CPUs aren't physically present. That can be a lot in the case of > systems that can have cpu-capable nodes hotplugged into them, and > that makes percpu_counter_sum_all() excitingly expensive for no good > reason. > > AFAICT, if we are trying to close a race condition between iterating > online CPUs not summing dying CPUs and the cpu dead notification > updating the sum, then shouldn't we be using > cpu_mask_or(cpu_online_mask, cpu_dying_mask) for summing iteration > rather than just cpu_online_mask? > > i.e. when a CPU is being taken down, it gets added to the > cpu_dying_mask, then removed from the cpu_online_mask, then the > offline notifications are run (i.e. the percpu counter dead > callback), and when the CPU reaches the CPUHP_TEARDOWN_CPU state, > it is removed from the cpu_dying_mask because it is now dead and all > the "cpu dying" callbacks have been run. > > Except when a hotplug event is being processed, cpu_dying_mask will > be empty, hence there is little change in summing overhead. But it > will close the summing race condition when a CPU is being > offlined... > > That, I think, is the solution we want for XFS. Having the percpu > counters just do the right thing is far better than always having to > wonder if summation interface we are using is correct in the face of > CPU hotplug. I'll put a patchset together to do: > > 1. fix percpu_counter_sum() to include the dying mask in it's > iteration. This should fix the XFS issue. I took a quick look at ext4 and btrfs usage of percpu_counter_sum. I /think/ they're less impacted because most of the usage seems to be for things like statfs which are inherently racy. That said, mixing in the dying mask sounds like a cheap fix. > 2. change the only user of percpu_counter_sum_all() to only use > percpu_counter_sum() because percpu_counter_sum_all() is now > redundant. > 3. remove percpu_counter_sum_all() because it is unused. Sounds reasonable to /me. :) --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-03-15 3:06 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-03-14 9:06 [PATCH] xfs: fix possible assert failed in xfs_fs_put_super() when do cpu offline Ye Bin 2023-03-14 16:31 ` Darrick J. Wong 2023-03-14 22:13 ` Dave Chinner 2023-03-15 3:06 ` Darrick J. Wong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).