* [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.4-rc5+
@ 2023-06-11 12:48 Zorro Lang
  2023-06-11 13:40 ` Zorro Lang
  2023-06-11 23:01 ` Dave Chinner
  0 siblings, 2 replies; 31+ messages in thread
From: Zorro Lang @ 2023-06-11 12:48 UTC (permalink / raw)
  To: linux-xfs

Hi,

When I ran fstests regression tests this weekend on the latest linux
v6.4-rc5+ (HEAD=64569520920a3ca5d456ddd9f4f95fc6ea9b8b45), nearly all
testing jobs on xfs hung on generic/051 (blocked for more than 24 hours).
They hang regardless of block size (1k or 4k), device type (regular disk
or pmem), architecture, or mkfs/mount options.
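For reference, a minimal sketch of how one such job can be reproduced with
fstests; the device paths and local.config values below are placeholders,
not the exact configuration of the jobs that hung:

```shell
# Hypothetical reproduction sketch -- device paths and mkfs options are
# assumptions, not the exact job configuration from this report.
cd xfstests-dev

cat > local.config <<'EOF'
export TEST_DEV=/dev/sda2            # assumed test device
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/sda3         # assumed scratch device
export SCRATCH_MNT=/mnt/scratch
export FSTYP=xfs
export MKFS_OPTIONS="-b size=1024"   # hang seen with both 1k and 4k
EOF

# generic/051 runs fsstress while repeatedly shutting down the filesystem
./check generic/051
```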

One of the console logs is below (a bit long). The call trace doesn't contain
any xfs functions, so it might not be an xfs bug, but it can't be reproduced
on ext4.

Thanks,
Zorro

[ 3058.375251] run fstests generic/051 at 2023-06-10 05:24:53 
[ 3060.823847] XFS (sda2): Mounting V5 Filesystem d29dc9fd-f34e-412d-ad57-660089a07846 
[ 3060.861421] XFS (sda2): Ending clean mount 
[ 3060.890745] XFS (sda2): User initiated shutdown received. 
[ 3060.896445] XFS (sda2): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x4b/0x180 [xfs] (fs/xfs/xfs_fsops.c:483).  Shutting down filesystem. 
[ 3060.910587] XFS (sda2): Please unmount the filesystem and rectify the problem(s) 
[ 3060.935623] XFS (sda2): Unmounting Filesystem d29dc9fd-f34e-412d-ad57-660089a07846 
[ 3063.187766] XFS (sda2): Mounting V5 Filesystem 2e3103da-304e-4b5d-889a-9107fbfac9c4 
[ 3063.266929] XFS (sda2): Ending clean mount 
[-- MARK -- Sat Jun 10 09:25:00 2023] 
[ 3089.010524] restraintd[2605]: *** Current Time: Sat Jun 10 05:25:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3094.716728] XFS (sda2): Unmounting Filesystem 2e3103da-304e-4b5d-889a-9107fbfac9c4 
[ 3094.953685] XFS (sda2): Mounting V5 Filesystem 2e3103da-304e-4b5d-889a-9107fbfac9c4 
[ 3094.999476] XFS (sda2): Ending clean mount 
[ 3130.092264] XFS (sda2): User initiated shutdown received. 
[ 3130.097725] XFS (sda2): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xe1/0x180 [xfs] (fs/xfs/xfs_fsops.c:486).  Shutting down filesystem. 
[ 3130.111026] XFS (sda2): Please unmount the filesystem and rectify the problem(s) 
[ 3149.011385] restraintd[2605]: *** Current Time: Sat Jun 10 05:26:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3209.009949] restraintd[2605]: *** Current Time: Sat Jun 10 05:27:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3269.009872] restraintd[2605]: *** Current Time: Sat Jun 10 05:28:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3327.273357] INFO: task fsstress:570762 blocked for more than 122 seconds. 
[ 3327.280311]       Not tainted 6.4.0-rc5+ #1 
[ 3327.284525] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3327.292384] task:fsstress        state:D stack:21664 pid:570762 ppid:570760 flags:0x00000002 
[ 3327.300853] Call Trace: 
[ 3327.303331]  <TASK> 
[ 3327.305463]  __schedule+0x79c/0x1d70 
[ 3327.309086]  ? __pfx___schedule+0x10/0x10 
[ 3327.313125]  ? __lock_acquire+0xbaa/0x1be0 
[ 3327.317257]  ? __lock_release+0x485/0x960 
[ 3327.321302]  schedule+0x130/0x220 
[ 3327.324653]  schedule_timeout+0x22a/0x270 
[ 3327.328691]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3327.333245]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3327.338603]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3327.342991]  __wait_for_common+0x1ca/0x5c0 
[ 3327.347113]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3327.351677]  ? __pfx___wait_for_common+0x10/0x10 
[ 3327.356334]  wait_for_completion_state+0x1d/0x40 
[ 3327.360976]  coredump_wait+0x502/0x750 
[ 3327.364760]  do_coredump+0x2fa/0x1c10 
[ 3327.368447]  ? __pfx___lock_release+0x10/0x10 
[ 3327.372825]  ? __pfx___lock_acquired+0x10/0x10 
[ 3327.377289]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3327.381676]  ? __pfx_do_coredump+0x10/0x10 
[ 3327.385801]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3327.390882]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3327.395179]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3327.400171]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3327.405272]  ? __lock_release+0x485/0x960 
[ 3327.409317]  ? __wait_for_common+0x9e/0x5c0 
[ 3327.413527]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3327.417922]  get_signal+0x12f8/0x1990 
[ 3327.421620]  ? do_send_specific+0xf8/0x280 
[ 3327.425748]  ? __pfx_get_signal+0x10/0x10 
[ 3327.429786]  ? do_send_specific+0x110/0x280 
[ 3327.433996]  ? __pfx_do_send_specific+0x10/0x10 
[ 3327.438558]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3327.443286]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3327.448627]  ? do_tkill+0x186/0x200 
[ 3327.452180]  exit_to_user_mode_loop+0xac/0x160 
[ 3327.456652]  exit_to_user_mode_prepare+0x142/0x150 
[ 3327.461468]  syscall_exit_to_user_mode+0x19/0x50 
[ 3327.466113]  do_syscall_64+0x69/0x90 
[ 3327.469723]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3327.474106]  ? do_syscall_64+0x69/0x90 
[ 3327.477886]  ? do_syscall_64+0x69/0x90 
[ 3327.481662]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3327.486047]  ? do_syscall_64+0x69/0x90 
[ 3327.489828]  ? asm_exc_page_fault+0x22/0x30 
[ 3327.494039]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3327.498422]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3327.503492] RIP: 0033:0x7fe5cfaa157c 
[ 3327.507097] RSP: 002b:00007fffe9f3c6e0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3327.514689] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3327.521840] RDX: 0000000000000006 RSI: 000000000008b58a RDI: 000000000008b58a 
[ 3327.528989] RBP: 000000000008b58a R08: 00007fffe9f3c7b0 R09: 00007fe5cfbb14e0 
[ 3327.536161] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3327.543317] R13: 8f5c28f5c28f5c29 R14: 000000000040ac10 R15: 00007fe5cfce86c0 
[ 3327.550487]  </TASK> 
[ 3327.552721] INFO: task fsstress:570763 blocked for more than 123 seconds. 
[ 3327.559533]       Not tainted 6.4.0-rc5+ #1 
[ 3327.563743] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3327.571586] task:fsstress        state:D stack:22832 pid:570763 ppid:570760 flags:0x00000002 
[ 3327.580042] Call Trace: 
[ 3327.582517]  <TASK> 
[ 3327.584643]  __schedule+0x79c/0x1d70 
[ 3327.588248]  ? __pfx___schedule+0x10/0x10 
[ 3327.592282]  ? __lock_acquire+0xbaa/0x1be0 
[ 3327.596407]  ? __lock_release+0x485/0x960 
[ 3327.600446]  schedule+0x130/0x220 
[ 3327.603784]  schedule_timeout+0x22a/0x270 
[ 3327.607813]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3327.612361]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3327.617706]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3327.622091]  __wait_for_common+0x1ca/0x5c0 
[ 3327.626210]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3327.630767]  ? __pfx___wait_for_common+0x10/0x10 
[ 3327.635423]  wait_for_completion_state+0x1d/0x40 
[ 3327.640060]  coredump_wait+0x502/0x750 
[ 3327.643844]  do_coredump+0x2fa/0x1c10 
[ 3327.647533]  ? __pfx___lock_release+0x10/0x10 
[ 3327.651910]  ? __pfx___lock_acquired+0x10/0x10 
[ 3327.656372]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3327.660750]  ? __pfx_do_coredump+0x10/0x10 
[ 3327.664875]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3327.669954]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3327.674252]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3327.679242]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3327.684332]  ? __lock_release+0x485/0x960 
[ 3327.688379]  ? __wait_for_common+0x9e/0x5c0 
[ 3327.692581]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3327.696969]  get_signal+0x12f8/0x1990 
[ 3327.700668]  ? do_send_specific+0xf8/0x280 
[ 3327.704793]  ? __pfx_get_signal+0x10/0x10 
[ 3327.708823]  ? do_send_specific+0x110/0x280 
[ 3327.713032]  ? __pfx_do_send_specific+0x10/0x10 
[ 3327.717591]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3327.722316]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3327.727649]  ? do_tkill+0x186/0x200 
[ 3327.731185]  exit_to_user_mode_loop+0xac/0x160 
[ 3327.735654]  exit_to_user_mode_prepare+0x142/0x150 
[ 3327.740465]  syscall_exit_to_user_mode+0x19/0x50 
[ 3327.745099]  do_syscall_64+0x69/0x90 
[ 3327.748698]  ? __task_pid_nr_ns+0x105/0x3c0 
[ 3327.752916]  ? do_syscall_64+0x69/0x90 
[ 3327.756687]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3327.761066]  ? do_syscall_64+0x69/0x90 
[ 3327.764844]  ? asm_exc_page_fault+0x22/0x30 
[ 3327.769045]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3327.773425]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3327.778506] RIP: 0033:0x7fe5cfaa157c 
[ 3327.782105] RSP: 002b:00007fffe9f3c610 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3327.789689] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3327.796839] RDX: 0000000000000006 RSI: 000000000008b58b RDI: 000000000008b58b 
[ 3327.803990] RBP: 000000000008b58b R08: 00007fffe9f3c6e0 R09: 00007fe5cfbb14e0 
[ 3327.811148] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3327.818296] R13: 8f5c28f5c28f5c29 R14: 000000000040df50 R15: 00007fe5cfce86c0 
[ 3327.825468]  </TASK> 
[ 3327.827679]  
[ 3327.827679] Showing all locks held in the system: 
[ 3327.833882] 1 lock held by rcu_tasks_kthre/13: 
[ 3327.838359]  #0: ffffffffb972c000 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3327.847964] 1 lock held by rcu_tasks_rude_/14: 
[ 3327.852426]  #0: ffffffffb972bd20 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3327.862456] 1 lock held by rcu_tasks_trace/15: 
[ 3327.866918]  #0: ffffffffb972b9e0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3327.877070] 1 lock held by khungtaskd/865: 
[ 3327.881183]  #0: ffffffffb972ccc0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x340 
[ 3327.890188] 1 lock held by systemd-journal/1644: 
[ 3327.894823]  #0: ff11000f34a04ad8 (&rq->__lock){-.-.}-{2:2}, at: finish_task_switch.isra.0+0x146/0xb40 
[ 3327.904184]  
[ 3327.905700] ============================================= 
[ 3327.905700]  
[ 3329.010807] restraintd[2605]: *** Current Time: Sat Jun 10 05:29:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[-- MARK -- Sat Jun 10 09:30:00 2023] 
[ 3389.009908] restraintd[2605]: *** Current Time: Sat Jun 10 05:30:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3450.152225] INFO: task fsstress:570762 blocked for more than 245 seconds. 
[ 3450.159038]       Not tainted 6.4.0-rc5+ #1 
[ 3450.163251] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3450.171105] task:fsstress        state:D stack:21664 pid:570762 ppid:570760 flags:0x00000002 
[ 3450.179569] Call Trace: 
[ 3450.182049]  <TASK> 
[ 3450.184182]  __schedule+0x79c/0x1d70 
[ 3450.187791]  ? __pfx___schedule+0x10/0x10 
[ 3450.191823]  ? __lock_acquire+0xbaa/0x1be0 
[ 3450.195958]  ? __lock_release+0x485/0x960 
[ 3450.200004]  schedule+0x130/0x220 
[ 3450.203352]  schedule_timeout+0x22a/0x270 
[ 3450.207391]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3450.211948]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3450.217305]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3450.221693]  __wait_for_common+0x1ca/0x5c0 
[ 3450.225814]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3450.230379]  ? __pfx___wait_for_common+0x10/0x10 
[ 3450.235035]  wait_for_completion_state+0x1d/0x40 
[ 3450.239679]  coredump_wait+0x502/0x750 
[ 3450.243471]  do_coredump+0x2fa/0x1c10 
[ 3450.247160]  ? __pfx___lock_release+0x10/0x10 
[ 3450.251545]  ? __pfx___lock_acquired+0x10/0x10 
[ 3450.256021]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3450.260399]  ? __pfx_do_coredump+0x10/0x10 
[ 3450.264518]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3450.269602]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3450.273899]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3450.278890]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3450.283991]  ? __lock_release+0x485/0x960 
[ 3450.288036]  ? __wait_for_common+0x9e/0x5c0 
[ 3450.292240]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3450.296629]  get_signal+0x12f8/0x1990 
[ 3450.300328]  ? do_send_specific+0xf8/0x280 
[ 3450.304445]  ? __pfx_get_signal+0x10/0x10 
[ 3450.308474]  ? do_send_specific+0x110/0x280 
[ 3450.312677]  ? __pfx_do_send_specific+0x10/0x10 
[ 3450.317236]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3450.321971]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3450.327311]  ? do_tkill+0x186/0x200 
[ 3450.330846]  exit_to_user_mode_loop+0xac/0x160 
[ 3450.335319]  exit_to_user_mode_prepare+0x142/0x150 
[ 3450.340133]  syscall_exit_to_user_mode+0x19/0x50 
[ 3450.344770]  do_syscall_64+0x69/0x90 
[ 3450.348374]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3450.352753]  ? do_syscall_64+0x69/0x90 
[ 3450.356524]  ? do_syscall_64+0x69/0x90 
[ 3450.360299]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3450.364678]  ? do_syscall_64+0x69/0x90 
[ 3450.368448]  ? asm_exc_page_fault+0x22/0x30 
[ 3450.372649]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3450.377028]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3450.382101] RIP: 0033:0x7fe5cfaa157c 
[ 3450.385706] RSP: 002b:00007fffe9f3c6e0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3450.393294] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3450.400443] RDX: 0000000000000006 RSI: 000000000008b58a RDI: 000000000008b58a 
[ 3450.407596] RBP: 000000000008b58a R08: 00007fffe9f3c7b0 R09: 00007fe5cfbb14e0 
[ 3450.414751] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3450.421902] R13: 8f5c28f5c28f5c29 R14: 000000000040ac10 R15: 00007fe5cfce86c0 
[ 3450.429089]  </TASK> 
[ 3450.431296] INFO: task fsstress:570763 blocked for more than 246 seconds. 
[ 3450.438112]       Not tainted 6.4.0-rc5+ #1 
[ 3450.442313] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3450.450154] task:fsstress        state:D stack:22832 pid:570763 ppid:570760 flags:0x00000002 
[ 3450.458613] Call Trace: 
[ 3450.461086]  <TASK> 
[ 3450.463217]  __schedule+0x79c/0x1d70 
[ 3450.466820]  ? __pfx___schedule+0x10/0x10 
[ 3450.470853]  ? __lock_acquire+0xbaa/0x1be0 
[ 3450.474978]  ? __lock_release+0x485/0x960 
[ 3450.479018]  schedule+0x130/0x220 
[ 3450.482361]  schedule_timeout+0x22a/0x270 
[ 3450.486389]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3450.490941]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3450.496289]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3450.500669]  __wait_for_common+0x1ca/0x5c0 
[ 3450.504795]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3450.509359]  ? __pfx___wait_for_common+0x10/0x10 
[ 3450.514014]  wait_for_completion_state+0x1d/0x40 
[ 3450.518648]  coredump_wait+0x502/0x750 
[ 3450.522429]  do_coredump+0x2fa/0x1c10 
[ 3450.526117]  ? __pfx___lock_release+0x10/0x10 
[ 3450.530495]  ? __pfx___lock_acquired+0x10/0x10 
[ 3450.534959]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3450.539341]  ? __pfx_do_coredump+0x10/0x10 
[ 3450.543472]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3450.548545]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3450.552845]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3450.557835]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3450.562930]  ? __lock_release+0x485/0x960 
[ 3450.566977]  ? __wait_for_common+0x9e/0x5c0 
[ 3450.571182]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3450.575579]  get_signal+0x12f8/0x1990 
[ 3450.579277]  ? do_send_specific+0xf8/0x280 
[ 3450.583406]  ? __pfx_get_signal+0x10/0x10 
[ 3450.587442]  ? do_send_specific+0x110/0x280 
[ 3450.591654]  ? __pfx_do_send_specific+0x10/0x10 
[ 3450.596215]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3450.600939]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3450.606273]  ? do_tkill+0x186/0x200 
[ 3450.609810]  exit_to_user_mode_loop+0xac/0x160 
[ 3450.614278]  exit_to_user_mode_prepare+0x142/0x150 
[ 3450.619100]  syscall_exit_to_user_mode+0x19/0x50 
[ 3450.623744]  do_syscall_64+0x69/0x90 
[ 3450.627348]  ? __task_pid_nr_ns+0x105/0x3c0 
[ 3450.631569]  ? do_syscall_64+0x69/0x90 
[ 3450.635349]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3450.639738]  ? do_syscall_64+0x69/0x90 
[ 3450.643514]  ? asm_exc_page_fault+0x22/0x30 
[ 3450.647727]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3450.652113]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3450.657186] RIP: 0033:0x7fe5cfaa157c 
[ 3450.660784] RSP: 002b:00007fffe9f3c610 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3450.668376] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3450.675531] RDX: 0000000000000006 RSI: 000000000008b58b RDI: 000000000008b58b 
[ 3450.682682] RBP: 000000000008b58b R08: 00007fffe9f3c6e0 R09: 00007fe5cfbb14e0 
[ 3450.689835] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3450.696985] R13: 8f5c28f5c28f5c29 R14: 000000000040df50 R15: 00007fe5cfce86c0 
[ 3450.704151]  </TASK> 
[ 3450.706362]  
[ 3450.706362] Showing all locks held in the system: 
[ 3450.712570] 1 lock held by rcu_tasks_kthre/13: 
[ 3450.717037]  #0: ffffffffb972c000 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3450.726639] 1 lock held by rcu_tasks_rude_/14: 
[ 3450.731106]  #0: ffffffffb972bd20 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3450.741139] 1 lock held by rcu_tasks_trace/15: 
[ 3450.745604]  #0: ffffffffb972b9e0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3450.755747] 1 lock held by khungtaskd/865: 
[ 3450.759872]  #0: ffffffffb972ccc0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x340 
[ 3450.768868] 1 lock held by systemd-journal/1644: 
[ 3450.773512]  #0: ff11000f34a04ad8 (&rq->__lock){-.-.}-{2:2}, at: ep_send_events+0xea/0xa00 
[ 3450.781825]  
[ 3450.783348] ============================================= 
[ 3450.783348]  
[ 3449.011268] restraintd[2605]: *** Current Time: Sat Jun 10 05:31:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3509.010466] restraintd[2605]: *** Current Time: Sat Jun 10 05:32:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3569.010481] restraintd[2605]: *** Current Time: Sat Jun 10 05:33:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3573.031190] INFO: task fsstress:570762 blocked for more than 368 seconds. 
[ 3573.038002]       Not tainted 6.4.0-rc5+ #1 
[ 3573.042222] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3573.050069] task:fsstress        state:D stack:21664 pid:570762 ppid:570760 flags:0x00000002 
[ 3573.058534] Call Trace: 
[ 3573.061014]  <TASK> 
[ 3573.063148]  __schedule+0x79c/0x1d70 
[ 3573.066757]  ? __pfx___schedule+0x10/0x10 
[ 3573.070792]  ? __lock_acquire+0xbaa/0x1be0 
[ 3573.074926]  ? __lock_release+0x485/0x960 
[ 3573.078973]  schedule+0x130/0x220 
[ 3573.082321]  schedule_timeout+0x22a/0x270 
[ 3573.086357]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3573.090911]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3573.096257]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3573.100639]  __wait_for_common+0x1ca/0x5c0 
[ 3573.104759]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3573.109314]  ? __pfx___wait_for_common+0x10/0x10 
[ 3573.113971]  wait_for_completion_state+0x1d/0x40 
[ 3573.118609]  coredump_wait+0x502/0x750 
[ 3573.122390]  do_coredump+0x2fa/0x1c10 
[ 3573.126079]  ? __pfx___lock_release+0x10/0x10 
[ 3573.130458]  ? __pfx___lock_acquired+0x10/0x10 
[ 3573.134921]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3573.139306]  ? __pfx_do_coredump+0x10/0x10 
[ 3573.143425]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3573.148504]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3573.152801]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3573.157792]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3573.162883]  ? __lock_release+0x485/0x960 
[ 3573.166929]  ? __wait_for_common+0x9e/0x5c0 
[ 3573.171136]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3573.175526]  get_signal+0x12f8/0x1990 
[ 3573.179223]  ? do_send_specific+0xf8/0x280 
[ 3573.183344]  ? __pfx_get_signal+0x10/0x10 
[ 3573.187380]  ? do_send_specific+0x110/0x280 
[ 3573.191582]  ? __pfx_do_send_specific+0x10/0x10 
[ 3573.196137]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3573.200865]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3573.206196]  ? do_tkill+0x186/0x200 
[ 3573.209739]  exit_to_user_mode_loop+0xac/0x160 
[ 3573.214205]  exit_to_user_mode_prepare+0x142/0x150 
[ 3573.219027]  syscall_exit_to_user_mode+0x19/0x50 
[ 3573.223665]  do_syscall_64+0x69/0x90 
[ 3573.227264]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3573.231649]  ? do_syscall_64+0x69/0x90 
[ 3573.235422]  ? do_syscall_64+0x69/0x90 
[ 3573.239196]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3573.243575]  ? do_syscall_64+0x69/0x90 
[ 3573.247345]  ? asm_exc_page_fault+0x22/0x30 
[ 3573.251553]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3573.255934]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3573.261000] RIP: 0033:0x7fe5cfaa157c 
[ 3573.264609] RSP: 002b:00007fffe9f3c6e0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3573.272198] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3573.279347] RDX: 0000000000000006 RSI: 000000000008b58a RDI: 000000000008b58a 
[ 3573.286499] RBP: 000000000008b58a R08: 00007fffe9f3c7b0 R09: 00007fe5cfbb14e0 
[ 3573.293650] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3573.300799] R13: 8f5c28f5c28f5c29 R14: 000000000040ac10 R15: 00007fe5cfce86c0 
[ 3573.307968]  </TASK> 
[ 3573.310183] INFO: task fsstress:570763 blocked for more than 368 seconds. 
[ 3573.316995]       Not tainted 6.4.0-rc5+ #1 
[ 3573.321200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3573.329044] task:fsstress        state:D stack:22832 pid:570763 ppid:570760 flags:0x00000002 
[ 3573.337503] Call Trace: 
[ 3573.339971]  <TASK> 
[ 3573.342100]  __schedule+0x79c/0x1d70 
[ 3573.345707]  ? __pfx___schedule+0x10/0x10 
[ 3573.349739]  ? __lock_acquire+0xbaa/0x1be0 
[ 3573.353868]  ? __lock_release+0x485/0x960 
[ 3573.357906]  schedule+0x130/0x220 
[ 3573.361252]  schedule_timeout+0x22a/0x270 
[ 3573.365286]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3573.369837]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3573.375182]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3573.379565]  __wait_for_common+0x1ca/0x5c0 
[ 3573.383679]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3573.388232]  ? __pfx___wait_for_common+0x10/0x10 
[ 3573.392890]  wait_for_completion_state+0x1d/0x40 
[ 3573.397527]  coredump_wait+0x502/0x750 
[ 3573.401307]  do_coredump+0x2fa/0x1c10 
[ 3573.404988]  ? __pfx___lock_release+0x10/0x10 
[ 3573.409368]  ? __pfx___lock_acquired+0x10/0x10 
[ 3573.413830]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3573.418206]  ? __pfx_do_coredump+0x10/0x10 
[ 3573.422329]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3573.427405]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3573.431702]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3573.436691]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3573.441782]  ? __lock_release+0x485/0x960 
[ 3573.445830]  ? __wait_for_common+0x9e/0x5c0 
[ 3573.450031]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3573.454419]  get_signal+0x12f8/0x1990 
[ 3573.458117]  ? do_send_specific+0xf8/0x280 
[ 3573.462233]  ? __pfx_get_signal+0x10/0x10 
[ 3573.466263]  ? do_send_specific+0x110/0x280 
[ 3573.470466]  ? __pfx_do_send_specific+0x10/0x10 
[ 3573.475027]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3573.479757]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3573.485090]  ? do_tkill+0x186/0x200 
[ 3573.488625]  exit_to_user_mode_loop+0xac/0x160 
[ 3573.493098]  exit_to_user_mode_prepare+0x142/0x150 
[ 3573.497913]  syscall_exit_to_user_mode+0x19/0x50 
[ 3573.502550]  do_syscall_64+0x69/0x90 
[ 3573.506145]  ? __task_pid_nr_ns+0x105/0x3c0 
[ 3573.510357]  ? do_syscall_64+0x69/0x90 
[ 3573.514127]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3573.518506]  ? do_syscall_64+0x69/0x90 
[ 3573.522273]  ? asm_exc_page_fault+0x22/0x30 
[ 3573.526478]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3573.530856]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3573.535929] RIP: 0033:0x7fe5cfaa157c 
[ 3573.539534] RSP: 002b:00007fffe9f3c610 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3573.547125] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3573.554276] RDX: 0000000000000006 RSI: 000000000008b58b RDI: 000000000008b58b 
[ 3573.561420] RBP: 000000000008b58b R08: 00007fffe9f3c6e0 R09: 00007fe5cfbb14e0 
[ 3573.568575] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3573.575726] R13: 8f5c28f5c28f5c29 R14: 000000000040df50 R15: 00007fe5cfce86c0 
[ 3573.582896]  </TASK> 
[ 3573.585108]  
[ 3573.585108] Showing all locks held in the system: 
[ 3573.591316] 1 lock held by rcu_tasks_kthre/13: 
[ 3573.595776]  #0: ffffffffb972c000 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3573.605379] 1 lock held by rcu_tasks_rude_/14: 
[ 3573.609845]  #0: ffffffffb972bd20 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3573.619875] 1 lock held by rcu_tasks_trace/15: 
[ 3573.624347]  #0: ffffffffb972b9e0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3573.634490] 1 lock held by khungtaskd/865: 
[ 3573.638611]  #0: ffffffffb972ccc0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x340 
[ 3573.647618]  
[ 3573.649139] ============================================= 
[ 3573.649139]  
[ 3629.010212] restraintd[2605]: *** Current Time: Sat Jun 10 05:34:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[-- MARK -- Sat Jun 10 09:35:00 2023] 
[ 3689.011047] restraintd[2605]: *** Current Time: Sat Jun 10 05:35:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3695.910116] INFO: task fsstress:570762 blocked for more than 491 seconds. 
[ 3695.916930]       Not tainted 6.4.0-rc5+ #1 
[ 3695.921139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3695.928985] task:fsstress        state:D stack:21664 pid:570762 ppid:570760 flags:0x00000002 
[ 3695.937451] Call Trace: 
[ 3695.939919]  <TASK> 
[ 3695.942046]  __schedule+0x79c/0x1d70 
[ 3695.945655]  ? __pfx___schedule+0x10/0x10 
[ 3695.949689]  ? __lock_acquire+0xbaa/0x1be0 
[ 3695.953824]  ? __lock_release+0x485/0x960 
[ 3695.957866]  schedule+0x130/0x220 
[ 3695.961205]  schedule_timeout+0x22a/0x270 
[ 3695.965244]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3695.969805]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3695.975161]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3695.979549]  __wait_for_common+0x1ca/0x5c0 
[ 3695.983669]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3695.988233]  ? __pfx___wait_for_common+0x10/0x10 
[ 3695.992891]  wait_for_completion_state+0x1d/0x40 
[ 3695.997537]  coredump_wait+0x502/0x750 
[ 3696.001328]  do_coredump+0x2fa/0x1c10 
[ 3696.005013]  ? __pfx___lock_release+0x10/0x10 
[ 3696.009393]  ? __pfx___lock_acquired+0x10/0x10 
[ 3696.013868]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3696.018248]  ? __pfx_do_coredump+0x10/0x10 
[ 3696.022373]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3696.027449]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3696.031742]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3696.036733]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3696.041826]  ? __lock_release+0x485/0x960 
[ 3696.045869]  ? __wait_for_common+0x9e/0x5c0 
[ 3696.050069]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3696.054460]  get_signal+0x12f8/0x1990 
[ 3696.058157]  ? do_send_specific+0xf8/0x280 
[ 3696.062276]  ? __pfx_get_signal+0x10/0x10 
[ 3696.066314]  ? do_send_specific+0x110/0x280 
[ 3696.070527]  ? __pfx_do_send_specific+0x10/0x10 
[ 3696.075087]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3696.079814]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3696.085147]  ? do_tkill+0x186/0x200 
[ 3696.088682]  exit_to_user_mode_loop+0xac/0x160 
[ 3696.093152]  exit_to_user_mode_prepare+0x142/0x150 
[ 3696.097966]  syscall_exit_to_user_mode+0x19/0x50 
[ 3696.102606]  do_syscall_64+0x69/0x90 
[ 3696.106205]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3696.110590]  ? do_syscall_64+0x69/0x90 
[ 3696.114362]  ? do_syscall_64+0x69/0x90 
[ 3696.118137]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3696.122515]  ? do_syscall_64+0x69/0x90 
[ 3696.126288]  ? asm_exc_page_fault+0x22/0x30 
[ 3696.130497]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3696.134874]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3696.139952] RIP: 0033:0x7fe5cfaa157c 
[ 3696.143546] RSP: 002b:00007fffe9f3c6e0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3696.151129] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3696.158281] RDX: 0000000000000006 RSI: 000000000008b58a RDI: 000000000008b58a 
[ 3696.165431] RBP: 000000000008b58a R08: 00007fffe9f3c7b0 R09: 00007fe5cfbb14e0 
[ 3696.172585] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3696.179732] R13: 8f5c28f5c28f5c29 R14: 000000000040ac10 R15: 00007fe5cfce86c0 
[ 3696.186900]  </TASK> 
[ 3696.189108] INFO: task fsstress:570763 blocked for more than 491 seconds. 
[ 3696.195919]       Not tainted 6.4.0-rc5+ #1 
[ 3696.200126] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3696.207965] task:fsstress        state:D stack:22832 pid:570763 ppid:570760 flags:0x00000002 
[ 3696.216424] Call Trace: 
[ 3696.218894]  <TASK> 
[ 3696.221025]  __schedule+0x79c/0x1d70 
[ 3696.224632]  ? __pfx___schedule+0x10/0x10 
[ 3696.228666]  ? __lock_acquire+0xbaa/0x1be0 
[ 3696.232791]  ? __lock_release+0x485/0x960 
[ 3696.236832]  schedule+0x130/0x220 
[ 3696.240175]  schedule_timeout+0x22a/0x270 
[ 3696.244202]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3696.248754]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3696.254097]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3696.258480]  __wait_for_common+0x1ca/0x5c0 
[ 3696.262594]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3696.267150]  ? __pfx___wait_for_common+0x10/0x10 
[ 3696.271806]  wait_for_completion_state+0x1d/0x40 
[ 3696.276442]  coredump_wait+0x502/0x750 
[ 3696.280223]  do_coredump+0x2fa/0x1c10 
[ 3696.283904]  ? __pfx___lock_release+0x10/0x10 
[ 3696.288283]  ? __pfx___lock_acquired+0x10/0x10 
[ 3696.292746]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3696.297122]  ? __pfx_do_coredump+0x10/0x10 
[ 3696.301243]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3696.306321]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3696.310616]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3696.315600]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3696.320691]  ? __lock_release+0x485/0x960 
[ 3696.324736]  ? __wait_for_common+0x9e/0x5c0 
[ 3696.328948]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3696.333337]  get_signal+0x12f8/0x1990 
[ 3696.337032]  ? do_send_specific+0xf8/0x280 
[ 3696.341151]  ? __pfx_get_signal+0x10/0x10 
[ 3696.345187]  ? do_send_specific+0x110/0x280 
[ 3696.349392]  ? __pfx_do_send_specific+0x10/0x10 
[ 3696.353963]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3696.358691]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3696.364035]  ? do_tkill+0x186/0x200 
[ 3696.367566]  exit_to_user_mode_loop+0xac/0x160 
[ 3696.372038]  exit_to_user_mode_prepare+0x142/0x150 
[ 3696.376855]  syscall_exit_to_user_mode+0x19/0x50 
[ 3696.381492]  do_syscall_64+0x69/0x90 
[ 3696.385086]  ? __task_pid_nr_ns+0x105/0x3c0 
[ 3696.389299]  ? do_syscall_64+0x69/0x90 
[ 3696.393070]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3696.397446]  ? do_syscall_64+0x69/0x90 
[ 3696.401217]  ? asm_exc_page_fault+0x22/0x30 
[ 3696.405421]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3696.409799]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3696.414873] RIP: 0033:0x7fe5cfaa157c 
[ 3696.418471] RSP: 002b:00007fffe9f3c610 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3696.426053] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3696.433205] RDX: 0000000000000006 RSI: 000000000008b58b RDI: 000000000008b58b 
[ 3696.440353] RBP: 000000000008b58b R08: 00007fffe9f3c6e0 R09: 00007fe5cfbb14e0 
[ 3696.447505] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3696.454653] R13: 8f5c28f5c28f5c29 R14: 000000000040df50 R15: 00007fe5cfce86c0 
[ 3696.461826]  </TASK> 
[ 3696.464034]  
[ 3696.464034] Showing all locks held in the system: 
[ 3696.470235] 1 lock held by rcu_tasks_kthre/13: 
[ 3696.474703]  #0: ffffffffb972c000 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3696.484298] 1 lock held by rcu_tasks_rude_/14: 
[ 3696.488757]  #0: ffffffffb972bd20 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3696.498790] 1 lock held by rcu_tasks_trace/15: 
[ 3696.503249]  #0: ffffffffb972b9e0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3696.513394] 1 lock held by khungtaskd/865: 
[ 3696.517516]  #0: ffffffffb972ccc0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x340 
[ 3696.526526]  
[ 3696.528044] ============================================= 
[ 3696.528044]  
[ 3749.010553] restraintd[2605]: *** Current Time: Sat Jun 10 05:36:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3809.009825] restraintd[2605]: *** Current Time: Sat Jun 10 05:37:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3818.789022] INFO: task fsstress:570762 blocked for more than 614 seconds. 
[ 3818.795829]       Not tainted 6.4.0-rc5+ #1 
[ 3818.800036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3818.807891] task:fsstress        state:D stack:21664 pid:570762 ppid:570760 flags:0x00000002 
[ 3818.816352] Call Trace: 
[ 3818.818821]  <TASK> 
[ 3818.820946]  __schedule+0x79c/0x1d70 
[ 3818.824557]  ? __pfx___schedule+0x10/0x10 
[ 3818.828592]  ? __lock_acquire+0xbaa/0x1be0 
[ 3818.832725]  ? __lock_release+0x485/0x960 
[ 3818.836772]  schedule+0x130/0x220 
[ 3818.840120]  schedule_timeout+0x22a/0x270 
[ 3818.844150]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3818.848707]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3818.854062]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3818.858452]  __wait_for_common+0x1ca/0x5c0 
[ 3818.862572]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3818.867132]  ? __pfx___wait_for_common+0x10/0x10 
[ 3818.871789]  wait_for_completion_state+0x1d/0x40 
[ 3818.876428]  coredump_wait+0x502/0x750 
[ 3818.880221]  do_coredump+0x2fa/0x1c10 
[ 3818.883908]  ? __pfx___lock_release+0x10/0x10 
[ 3818.888294]  ? __pfx___lock_acquired+0x10/0x10 
[ 3818.892768]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3818.897152]  ? __pfx_do_coredump+0x10/0x10 
[ 3818.901280]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3818.906359]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3818.910657]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3818.915642]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3818.920740]  ? __lock_release+0x485/0x960 
[ 3818.924784]  ? __wait_for_common+0x9e/0x5c0 
[ 3818.928992]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3818.933382]  get_signal+0x12f8/0x1990 
[ 3818.937079]  ? do_send_specific+0xf8/0x280 
[ 3818.941205]  ? __pfx_get_signal+0x10/0x10 
[ 3818.945244]  ? do_send_specific+0x110/0x280 
[ 3818.949457]  ? __pfx_do_send_specific+0x10/0x10 
[ 3818.954018]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3818.958746]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3818.964086]  ? do_tkill+0x186/0x200 
[ 3818.967617]  exit_to_user_mode_loop+0xac/0x160 
[ 3818.972082]  exit_to_user_mode_prepare+0x142/0x150 
[ 3818.976904]  syscall_exit_to_user_mode+0x19/0x50 
[ 3818.981544]  do_syscall_64+0x69/0x90 
[ 3818.985146]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3818.989530]  ? do_syscall_64+0x69/0x90 
[ 3818.993312]  ? do_syscall_64+0x69/0x90 
[ 3818.997081]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3819.001465]  ? do_syscall_64+0x69/0x90 
[ 3819.005242]  ? asm_exc_page_fault+0x22/0x30 
[ 3819.009453]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3819.013839]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3819.018912] RIP: 0033:0x7fe5cfaa157c 
[ 3819.022515] RSP: 002b:00007fffe9f3c6e0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3819.030103] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3819.037257] RDX: 0000000000000006 RSI: 000000000008b58a RDI: 000000000008b58a 
[ 3819.044415] RBP: 000000000008b58a R08: 00007fffe9f3c7b0 R09: 00007fe5cfbb14e0 
[ 3819.051564] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3819.058709] R13: 8f5c28f5c28f5c29 R14: 000000000040ac10 R15: 00007fe5cfce86c0 
[ 3819.065896]  </TASK> 
[ 3819.068107] INFO: task fsstress:570763 blocked for more than 614 seconds. 
[ 3819.074916]       Not tainted 6.4.0-rc5+ #1 
[ 3819.079121] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3819.086961] task:fsstress        state:D stack:22832 pid:570763 ppid:570760 flags:0x00000002 
[ 3819.095411] Call Trace: 
[ 3819.097897]  <TASK> 
[ 3819.100025]  __schedule+0x79c/0x1d70 
[ 3819.103629]  ? __pfx___schedule+0x10/0x10 
[ 3819.107660]  ? __lock_acquire+0xbaa/0x1be0 
[ 3819.111787]  ? __lock_release+0x485/0x960 
[ 3819.115828]  schedule+0x130/0x220 
[ 3819.119172]  schedule_timeout+0x22a/0x270 
[ 3819.123208]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3819.127760]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20 
[ 3819.133107]  ? _raw_spin_unlock_irq+0x2a/0x50 
[ 3819.137490]  __wait_for_common+0x1ca/0x5c0 
[ 3819.141612]  ? __pfx_schedule_timeout+0x10/0x10 
[ 3819.146176]  ? __pfx___wait_for_common+0x10/0x10 
[ 3819.150829]  wait_for_completion_state+0x1d/0x40 
[ 3819.155464]  coredump_wait+0x502/0x750 
[ 3819.159248]  do_coredump+0x2fa/0x1c10 
[ 3819.162933]  ? __pfx___lock_release+0x10/0x10 
[ 3819.167312]  ? __pfx___lock_acquired+0x10/0x10 
[ 3819.171779]  ? do_raw_spin_trylock+0xb5/0x180 
[ 3819.176162]  ? __pfx_do_coredump+0x10/0x10 
[ 3819.180287]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3819.185364]  ? do_raw_spin_unlock+0x55/0x1f0 
[ 3819.189661]  ? _raw_spin_unlock_irqrestore+0x42/0x70 
[ 3819.194654]  ? __debug_check_no_obj_freed+0x1e8/0x3c0 
[ 3819.199747]  ? __lock_release+0x485/0x960 
[ 3819.203795]  ? __wait_for_common+0x9e/0x5c0 
[ 3819.207998]  ? __pfx_dequeue_signal+0x10/0x10 
[ 3819.212397]  get_signal+0x12f8/0x1990 
[ 3819.216094]  ? do_send_specific+0xf8/0x280 
[ 3819.220222]  ? __pfx_get_signal+0x10/0x10 
[ 3819.224258]  ? do_send_specific+0x110/0x280 
[ 3819.228470]  ? __pfx_do_send_specific+0x10/0x10 
[ 3819.233032]  arch_do_signal_or_restart+0x77/0x2f0 
[ 3819.237761]  ? __pfx_arch_do_signal_or_restart+0x10/0x10 
[ 3819.243101]  ? do_tkill+0x186/0x200 
[ 3819.246636]  exit_to_user_mode_loop+0xac/0x160 
[ 3819.251104]  exit_to_user_mode_prepare+0x142/0x150 
[ 3819.255927]  syscall_exit_to_user_mode+0x19/0x50 
[ 3819.260571]  do_syscall_64+0x69/0x90 
[ 3819.264173]  ? __task_pid_nr_ns+0x105/0x3c0 
[ 3819.268387]  ? do_syscall_64+0x69/0x90 
[ 3819.272164]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3819.276554]  ? do_syscall_64+0x69/0x90 
[ 3819.280332]  ? asm_exc_page_fault+0x22/0x30 
[ 3819.284544]  ? lockdep_hardirqs_on+0x79/0x100 
[ 3819.288930]  entry_SYSCALL_64_after_hwframe+0x72/0xdc 
[ 3819.294005] RIP: 0033:0x7fe5cfaa157c 
[ 3819.297609] RSP: 002b:00007fffe9f3c610 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea 
[ 3819.305200] RAX: 0000000000000000 RBX: 00007fe5cfce8740 RCX: 00007fe5cfaa157c 
[ 3819.312356] RDX: 0000000000000006 RSI: 000000000008b58b RDI: 000000000008b58b 
[ 3819.319508] RBP: 000000000008b58b R08: 00007fffe9f3c6e0 R09: 00007fe5cfbb14e0 
[ 3819.326662] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006 
[ 3819.333813] R13: 8f5c28f5c28f5c29 R14: 000000000040df50 R15: 00007fe5cfce86c0 
[ 3819.340992]  </TASK> 
[ 3819.343201] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings 
[ 3819.351395]  
[ 3819.351395] Showing all locks held in the system: 
[ 3819.357596] 1 lock held by rcu_tasks_kthre/13: 
[ 3819.362067]  #0: ffffffffb972c000 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3819.371670] 1 lock held by rcu_tasks_rude_/14: 
[ 3819.376142]  #0: ffffffffb972bd20 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3819.386170] 1 lock held by rcu_tasks_trace/15: 
[ 3819.390638]  #0: ffffffffb972b9e0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0x5c0 
[ 3819.400783] 1 lock held by khungtaskd/865: 
[ 3819.404901]  #0: ffffffffb972ccc0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x340 
[ 3819.413920]  
[ 3819.415443] ============================================= 
[ 3819.415443]  
[ 3869.010274] restraintd[2605]: *** Current Time: Sat Jun 10 05:38:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[ 3929.010015] restraintd[2605]: *** Current Time: Sat Jun 10 05:39:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
...
...
[100469.009966] restraintd[2605]: *** Current Time: Sun Jun 11 08:28:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023 
[100529.010810] restraintd[2605]: *** Current Time: Sun Jun 11 08:29:25 2023  Localwatchdog at: Mon Jun 12 04:37:25 2023
...
...


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+
  2023-06-11 12:48 [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+ Zorro Lang
@ 2023-06-11 13:40 ` Zorro Lang
  2023-06-11 23:01 ` Dave Chinner
  1 sibling, 0 replies; 31+ messages in thread
From: Zorro Lang @ 2023-06-11 13:40 UTC (permalink / raw)
  To: linux-xfs

On Sun, Jun 11, 2023 at 08:48:36PM +0800, Zorro Lang wrote:
> Hi,
> 
> When I tried to run fstests regression tests this weekend on the latest
> linux v6.5-rc5+ (HEAD=64569520920a3ca5d456ddd9f4f95fc6ea9b8b45), nearly
> all testing jobs on xfs hang on generic/051 (more than 24 hours, still
> blocked). No matter whether 1k or 4k block size, regular disk or pmem
> device, any architecture, or any mkfs/mount options, they all hang there.
> 
> One console log is below (a bit long); the call trace doesn't contain any
> xfs functions, so it might not be an xfs bug, but it can't be reproduced
> on ext4.

Overlayfs can reproduce this bug too, when it's based on xfs.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+
  2023-06-11 12:48 [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+ Zorro Lang
  2023-06-11 13:40 ` Zorro Lang
@ 2023-06-11 23:01 ` Dave Chinner
  2023-06-12  1:51   ` Darrick J. Wong
  1 sibling, 1 reply; 31+ messages in thread
From: Dave Chinner @ 2023-06-11 23:01 UTC (permalink / raw)
  To: Zorro Lang; +Cc: linux-xfs

On Sun, Jun 11, 2023 at 08:48:36PM +0800, Zorro Lang wrote:
> Hi,
> 
> When I tried to run fstests regression tests this weekend on the latest
> linux v6.5-rc5+ (HEAD=64569520920a3ca5d456ddd9f4f95fc6ea9b8b45), nearly
> all testing jobs on xfs hang on generic/051 (more than 24 hours, still
> blocked). No matter whether 1k or 4k block size, regular disk or pmem
> device, any architecture, or any mkfs/mount options, they all hang there.

Yup, I started seeing this on upgrade to 6.5-rc5, too. xfs/079
generates it, because the fsstress process is crashing when the
XFS filesystem shuts down (maybe a SIGBUS from an mmap page fault?).
I don't know how reproducible it is yet; these only showed up in my
Thursday night testing, so I haven't had a chance to triage it yet.

> One console log is below (a bit long); the call trace doesn't contain any
> xfs functions, so it might not be an xfs bug, but it can't be reproduced
> on ext4.

AFAICT, the coredump is being done on the root drive (where fsstress
is being executed from), not the XFS test/scratch devices that
fsstress processes are exercising. I have ext3 root drives for my
test machines, so at this point I'm not sure that this is even a
filesystem related regression. i.e. it may be a recent regression in
the coredump or signal handling code....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+
  2023-06-11 23:01 ` Dave Chinner
@ 2023-06-12  1:51   ` Darrick J. Wong
  2023-06-12  2:21     ` [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+) Dave Chinner
  0 siblings, 1 reply; 31+ messages in thread
From: Darrick J. Wong @ 2023-06-12  1:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Zorro Lang, linux-xfs

On Mon, Jun 12, 2023 at 09:01:19AM +1000, Dave Chinner wrote:
> On Sun, Jun 11, 2023 at 08:48:36PM +0800, Zorro Lang wrote:
> > Hi,
> > 
> > When I tried to run fstests regression tests this weekend on the latest
> > linux v6.5-rc5+ (HEAD=64569520920a3ca5d456ddd9f4f95fc6ea9b8b45), nearly
> > all testing jobs on xfs hang on generic/051 (more than 24 hours, still
> > blocked). No matter whether 1k or 4k block size, regular disk or pmem
> > device, any architecture, or any mkfs/mount options, they all hang there.
> 
> Yup, I started seeing this on upgrade to 6.5-rc5, too. xfs/079
> generates it, because the fsstress process is crashing when the
> XFS filesystem shuts down (maybe a SIGBUS from an mmap page fault?).
> I don't know how reproducible it is yet; these only showed up in my
> Thursday night testing, so I haven't had a chance to triage it yet.
> 
> > One console log is below (a bit long); the call trace doesn't contain
> > any xfs functions, so it might not be an xfs bug, but it can't be
> > reproduced on ext4.
> 
> AFAICT, the coredump is being done on the root drive (where fsstress
> is being executed from), not the XFS test/scratch devices that
> fsstress processes are exercising. I have ext3 root drives for my
> test machines, so at this point I'm not sure that this is even a
> filesystem related regression. i.e. it may be a recent regression in
> the coredump or signal handling code....

Willy was complaining about the same thing on Friday.  Oddly I've not
seen any problems with coredumps on 6.4-rc5, so .... <shrug>

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  1:51   ` Darrick J. Wong
@ 2023-06-12  2:21     ` Dave Chinner
  2023-06-12  3:14       ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Chinner @ 2023-06-12  2:21 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Zorro Lang, linux-xfs, Eric W. Biederman, Mike Christie,
	Michael S. Tsirkin, Linus Torvalds, linux-kernel

On Sun, Jun 11, 2023 at 06:51:45PM -0700, Darrick J. Wong wrote:
> On Mon, Jun 12, 2023 at 09:01:19AM +1000, Dave Chinner wrote:
> > On Sun, Jun 11, 2023 at 08:48:36PM +0800, Zorro Lang wrote:
> > > Hi,
> > > 
> > > When I tried to run fstests regression tests this weekend on the latest
> > > linux v6.5-rc5+ (HEAD=64569520920a3ca5d456ddd9f4f95fc6ea9b8b45), nearly
> > > all testing jobs on xfs hang on generic/051 (more than 24 hours, still
> > > blocked). No matter whether 1k or 4k block size, regular disk or pmem
> > > device, any architecture, or any mkfs/mount options, they all hang there.
> > 
> > Yup, I started seeing this on upgrade to 6.5-rc5, too. xfs/079
> > generates it, because the fsstress process is crashing when the
> > XFS filesystem shuts down (maybe a SIGBUS from an mmap page fault?).
> > I don't know how reproducible it is yet; these only showed up in my
> > Thursday night testing, so I haven't had a chance to triage it yet.
> > 
> > > One console log is below (a bit long); the call trace doesn't contain
> > > any xfs functions, so it might not be an xfs bug, but it can't be
> > > reproduced on ext4.
> > 
> > AFAICT, the coredump is being done on the root drive (where fsstress
> > is being executed from), not the XFS test/scratch devices that
> > fsstress processes are exercising. I have ext3 root drives for my
> > test machines, so at this point I'm not sure that this is even a
> > filesystem related regression. i.e. it may be a recent regression in
> > the coredump or signal handling code....
> 
> Willy was complaining about the same thing on Friday.  Oddly I've not
> seen any problems with coredumps on 6.4-rc5, so .... <shrug>

A quick check against xfs/179 (the case I was seeing) indicates that
it is a regression between v6.4-rc4 and v6.4-rc5.

Looking at the kernel/ changelog, I suspect the regression was
introduced by this commit:

commit f9010dbdce911ee1f1af1398a24b1f9f992e0080
Author: Mike Christie <michael.christie@oracle.com>
Date:   Thu Jun 1 13:32:32 2023 -0500

    fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
    
    When switching from kthreads to vhost_tasks, two bugs were added:
    1. The vhost worker tasks now show up as processes, so scripts doing
    ps or ps a would now incorrectly detect the vhost task as another
    process.  2. kthreads disabled freezing by setting PF_NOFREEZE, but
    vhost tasks didn't disable it or add support for it.
    
    To fix both bugs, this switches the vhost task to be thread in the
    process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
    get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
    SIGKILL/STOP support is required because CLONE_THREAD requires
    CLONE_SIGHAND which requires those 2 signals to be supported.
    
    This is a modified version of the patch written by Mike Christie
    <michael.christie@oracle.com> which was a modified version of patch
    originally written by Linus.
    
    Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
    Including ignoring signals, setting up the register state, and having
    get_signal return instead of calling do_group_exit.
    
    Tidied up the vhost_task abstraction so that the definition of
    vhost_task only needs to be visible inside of vhost_task.c.  Making
    it easier to review the code and tell what needs to be done where.
    As part of this the main loop has been moved from vhost_worker into
    vhost_task_fn.  vhost_worker now returns true if work was done.
    
    The main loop has been updated to call get_signal which handles
    SIGSTOP, freezing, and collects the message that tells the thread to
    exit as part of process exit.  This collection clears
    __fatal_signal_pending.  This collection is not guaranteed to
    clear signal_pending() so clear that explicitly so the schedule()
    sleeps.
    
    For now the vhost thread continues to exist and run work until the
    last file descriptor is closed and the release function is called as
    part of freeing struct file.  To avoid hangs in the coredump
    rendezvous and when killing threads in a multi-threaded exec, the
    coredump code and de_thread have been modified to ignore vhost threads.
    
    Removing the special case for exec appears to require teaching
    vhost_dev_flush how to directly complete transactions in case
    the vhost thread is no longer running.
    
    Removing the special case for coredump rendezvous requires either the
    above fix needed for exec or moving the coredump rendezvous into
    get_signal.
    
    Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Co-developed-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Mike Christie <michael.christie@oracle.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

I'm just building a v6.5-rc5 with this one commit reverted to test
it right now.....

<a few minutes later>

Yup, 6.5-rc5 with that patch reverted doesn't hang at all. That's
the problem.

I guess the regression fix needs a regression fix....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  2:21     ` [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+) Dave Chinner
@ 2023-06-12  3:14       ` Linus Torvalds
  2023-06-12  5:16         ` Dave Chinner
                           ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12  3:14 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1928 bytes --]

On Sun, Jun 11, 2023 at 7:22 PM Dave Chinner <david@fromorbit.com> wrote:
>
> I guess the regression fix needs a regression fix....

Yup.

From the description of the problem, it sounds like this happens on
real hardware, no vhost anywhere?

Or maybe Darrick (who doesn't see the issue) is running on raw
hardware, and you and Zorro are running in a virtual environment?

It sounds like zap_other_threads() and coredump_task_exit() do not
agree about the core_state->nr_threads counting, which is part of what
changed there.

[ Goes off to look ]

Hmm. Both seem to be using the same test for

    (t->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER

which I don't love - I don't think io_uring threads should participate
in core dumping either, so I think the test could just be

    (t->flags & PF_IO_WORKER)

but that shouldn't be the issue here.

But according to

  https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/

it's clearly hanging in wait_for_completion_state() in
coredump_wait(), so it really looks like some confusion about that
core_waiters (aka core_state->nr_threads) count.

Oh. Humm. Mike changed that initial rough patch of mine, and I had
also moved the "if you don't participate in core dumps" test up past
the "do_coredump()" logic.

And I think it's horribly *wrong* for a thread that doesn't get
counted for core-dumping to go into do_coredump(), because then it
will set the "core_state" to possibly be the core-state of the vhost
thread that isn't even counted.

So *maybe* this attached patch might fix it? I haven't thought very
deeply about this, but vhost workers most definitely shouldn't call
do_coredump(), since they are then not counted.

(And again, I think we should just check that PF_IO_WORKER bit, not
use this more complex test, but that's a separate and bigger change).

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 554 bytes --]

 kernel/signal.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index 2547fa73bde5..a1e11ee8537c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2847,6 +2847,10 @@ bool get_signal(struct ksignal *ksig)
 		 */
 		current->flags |= PF_SIGNALED;
 
+		/* vhost workers don't participate in core dumps */
+		if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
+			goto out;
+
 		if (sig_kernel_coredump(signr)) {
 			if (print_fatal_signals)
 				print_fatal_signal(ksig->info.si_signo);

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  3:14       ` Linus Torvalds
@ 2023-06-12  5:16         ` Dave Chinner
  2023-06-12  5:34           ` Linus Torvalds
  2023-06-12  6:36         ` Zorro Lang
  2023-06-12 15:36         ` Darrick J. Wong
  2 siblings, 1 reply; 31+ messages in thread
From: Dave Chinner @ 2023-06-12  5:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Darrick J. Wong, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 08:14:25PM -0700, Linus Torvalds wrote:
> On Sun, Jun 11, 2023 at 7:22 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > I guess the regression fix needs a regression fix....
> 
> Yup.
> 
> From the description of the problem, it sounds like this happens on
> real hardware, no vhost anywhere?
>
> Or maybe Darrick (who doesn't see the issue) is running on raw
> hardware, and you and Zorro are running in a virtual environment?

I'm testing inside VMs and seeing it, I can't speak for anyone else.

....

> So *maybe* this attached patch might fix it? I haven't thought very
> deeply about this, but vhost workers most definitely shouldn't call
> do_coredump(), since they are then not counted.
> 
> (And again, I think we should just check that PF_IO_WORKER bit, not
> use this more complex test, but that's a separate and bigger change).
> 
>               Linus

>  kernel/signal.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 2547fa73bde5..a1e11ee8537c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2847,6 +2847,10 @@ bool get_signal(struct ksignal *ksig)
>  		 */
>  		current->flags |= PF_SIGNALED;
>  
> +		/* vhost workers don't participate in core dumps */
> +		if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
> +			goto out;
> +
>  		if (sig_kernel_coredump(signr)) {
>  			if (print_fatal_signals)
>  				print_fatal_signal(ksig->info.si_signo);


That would appear to make things worse. mkfs.xfs hung in Z state on
exit and never returned to the shell. Also, multiple processes are
livelocked like this:

 Sending NMI from CPU 0 to CPUs 1-3:
 NMI backtrace for cpu 2
 CPU: 2 PID: 3409 Comm: pmlogger_farm Not tainted 6.4.0-rc5-dgc+ #1822
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
 RIP: 0010:uprobe_deny_signal+0x5/0x90
 Code: 48 c7 c1 c4 64 62 82 48 c7 c7 d1 64 62 82 e8 b2 39 ec ff e9 70 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 <55> 31 4
 RSP: 0018:ffffc900023abdf0 EFLAGS: 00000202
 RAX: 0000000000000004 RBX: ffff888103b127c0 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000296 RDI: ffffc900023abe70
 RBP: ffffc900023abe60 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000000 R11: ffff88813bd2ccf0 R12: ffff888103b127c0
 R13: ffffc900023abe70 R14: ffff888110413700 R15: ffff888103d26e80
 FS:  00007f35497a4740(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 00007ffd4ca0ce80 CR3: 000000010f7d1000 CR4: 00000000000006e0
 Call Trace:
  <NMI>
  ? show_regs+0x61/0x70
  ? nmi_cpu_backtrace+0x88/0xf0
  ? nmi_cpu_backtrace_handler+0x11/0x20
  ? nmi_handle+0x57/0x150
  ? default_do_nmi+0x49/0x240
  ? exc_nmi+0xf4/0x110
  ? end_repeat_nmi+0x16/0x31
  ? uprobe_deny_signal+0x5/0x90
  ? uprobe_deny_signal+0x5/0x90
  ? uprobe_deny_signal+0x5/0x90
  </NMI>
  <TASK>
  ? get_signal+0x94/0x9b0
  ? signal_setup_done+0x66/0x190
  arch_do_signal_or_restart+0x2f/0x260
  exit_to_user_mode_prepare+0x181/0x1c0
  syscall_exit_to_user_mode+0x16/0x40
  do_syscall_64+0x40/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0023:0xffff888103b127c0
 Code: Unable to access opcode bytes at 0xffff888103b12796.
 RSP: 002b:00007ffd4ca0d0ac EFLAGS: 00000202 ORIG_RAX: 000000000000003d
 RAX: 0000000000000009 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 00007ffd4d20bb9c RDI: 00000000ffffffff
 RBP: 00007ffd4d20bb9c R08: 0000000000000002 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
 R13: 00007ffd4d20bba0 R14: 00005604571fc380 R15: 0000000000000001
  </TASK>
 NMI backtrace for cpu 3
 CPU: 3 PID: 3526 Comm: pmlogger_check Not tainted 6.4.0-rc5-dgc+ #1822
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
 RIP: 0010:fixup_exception+0x72/0x260
 Code: 14 0f 87 03 02 00 00 ff 24 d5 98 67 22 82 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 41 81 cd 00 00 00 40 4d 63 ed 4d 89 6c 24 50 <31> c0 9
 RSP: 0018:ffffc9000275bb58 EFLAGS: 00000083
 RAX: 000000000000000f RBX: ffffffff827d0a4c RCX: ffffffff810c5f95
 RDX: 000000000000000f RSI: ffffffff827d0a4c RDI: ffffc9000275bb28
 RBP: ffffc9000275bb80 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000275bc78
 R13: 000000000000000e R14: 000000008f5ded3f R15: 0000000000000000
 FS:  00007f56a36de740(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 000000008f5ded3f CR3: 000000010dcde000 CR4: 00000000000006e0
 Call Trace:
  <NMI>
  ? show_regs+0x61/0x70
  ? nmi_cpu_backtrace+0x88/0xf0
  ? nmi_cpu_backtrace_handler+0x11/0x20
  ? nmi_handle+0x57/0x150
  ? default_do_nmi+0x49/0x240
  ? exc_nmi+0xf4/0x110
  ? end_repeat_nmi+0x16/0x31
  ? copy_fpstate_to_sigframe+0x1c5/0x3a0
  ? fixup_exception+0x72/0x260
  ? fixup_exception+0x72/0x260
  ? fixup_exception+0x72/0x260
  </NMI>
  <TASK>
  kernelmode_fixup_or_oops+0x49/0x120
  __bad_area_nosemaphore+0x15a/0x230
  ? __bad_area+0x57/0x80
  bad_area_nosemaphore+0x16/0x20
  exc_page_fault+0x323/0x880
  asm_exc_page_fault+0x27/0x30
 RIP: 0010:copy_fpstate_to_sigframe+0x1c5/0x3a0
 Code: 45 89 bc 24 40 25 00 00 f0 41 80 64 24 01 bf e9 f5 fe ff ff be 3c 00 00 00 48 c7 c7 77 9c 5f 82 e8 00 2a 23 00 31 c0 0f 1f 00 <49> 0f 1
 RSP: 0018:ffffc9000275bd28 EFLAGS: 00010246
 RAX: 000000000000000e RBX: 000000008f5de7ec RCX: ffffc9000275bda8
 RDX: 000000008f5ded40 RSI: 000000000000003c RDI: ffffffff825f9c77
 RBP: ffffc9000275bd98 R08: ffffc9000275be30 R09: 0000000000000001
 R10: 0000000000000000 R11: ffffc90000138ff8 R12: ffff8881106527c0
 R13: 000000008f5deb40 R14: ffff888110654d40 R15: ffff88810a653f40
  ? copy_fpstate_to_sigframe+0x1c0/0x3a0
  ? __might_sleep+0x42/0x70
  get_sigframe+0xcd/0x2b0
  ia32_setup_frame+0x61/0x230
  arch_do_signal_or_restart+0x1d1/0x260
  exit_to_user_mode_prepare+0x181/0x1c0
  irqentry_exit_to_user_mode+0x9/0x30
  irqentry_exit+0x33/0x40
  exc_page_fault+0x1b6/0x880
  asm_exc_page_fault+0x27/0x30
 RIP: 0023:0x106527c0
 Code: Unable to access opcode bytes at 0x10652796.
 RSP: 002b:000000008f5ded6c EFLAGS: 00010202
 RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 00007ffd8f5df2ec RDI: 00000000ffffffff
 RBP: 00007ffd8f5df2ec R08: 0000000000000000 R09: 00005558962eb526
 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
 R13: 00007ffd8f5df2f0 R14: 00005558962b5e60 R15: 0000000000000001
  </TASK>


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  5:16         ` Dave Chinner
@ 2023-06-12  5:34           ` Linus Torvalds
  2023-06-12  5:49             ` Dave Chinner
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12  5:34 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 10:16 PM Dave Chinner <david@fromorbit.com> wrote:
>
> > +             /* vhost workers don't participate in core dumps */
> > +             if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
> > +                     goto out;
> > +
>
> That would appear to make things worse. mkfs.xfs hung in Z state on
> exit and never returned to the shell.

Well, duh, that's because I'm a complete nincompoop who just copied
the condition from the other cases, but those other cases were for
testing the "this is *not* a vhost worker".

Here the logic - as per the comment I added - was supposed to be "is
this a vhost worker".

So that "!=" should obviously have been a "==".

Not that I'm at all convinced that that will fix the problem you are
seeing, but at least it shouldn't have made things worse like getting
the condition completely wrong did.

                    Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  5:34           ` Linus Torvalds
@ 2023-06-12  5:49             ` Dave Chinner
  2023-06-12  6:11               ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Chinner @ 2023-06-12  5:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Darrick J. Wong, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 10:34:29PM -0700, Linus Torvalds wrote:
> On Sun, Jun 11, 2023 at 10:16 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > > +             /* vhost workers don't participate in core dumps */
> > > +             if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
> > > +                     goto out;
> > > +
> >
> > That would appear to make things worse. mkfs.xfs hung in Z state on
> > exit and never returned to the shell.
> 
> Well, duh, that's because I'm a complete nincompoop who just copied
> the condition from the other cases, but those other cases were for
> testing the "this is *not* a vhost worker".
> 
> Here the logic - as per the comment I added - was supposed to be "is
> this a vhost worker".
> 
> So that "!=" should obviously have been a "==".

Same as without the condition - all the fsstress tasks hang in
do_coredump().

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  5:49             ` Dave Chinner
@ 2023-06-12  6:11               ` Linus Torvalds
  2023-06-12  8:45                 ` Eric W. Biederman
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12  6:11 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Darrick J. Wong, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 10:49 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sun, Jun 11, 2023 at 10:34:29PM -0700, Linus Torvalds wrote:
> >
> > So that "!=" should obviously have been a "==".
>
> Same as without the condition - all the fsstress tasks hang in
> do_coredump().

Ok, that at least makes sense. Your "it made things worse" made me go
"What?" until I noticed the stupid backwards test.

I'm not seeing anything else that looks odd in that commit
f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps
regression").

Let's see if somebody else goes "Ahh" when they wake up tomorrow...

              Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  3:14       ` Linus Torvalds
  2023-06-12  5:16         ` Dave Chinner
@ 2023-06-12  6:36         ` Zorro Lang
  2023-06-12 15:36         ` Darrick J. Wong
  2 siblings, 0 replies; 31+ messages in thread
From: Zorro Lang @ 2023-06-12  6:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Darrick J. Wong, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 08:14:25PM -0700, Linus Torvalds wrote:
> On Sun, Jun 11, 2023 at 7:22 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > I guess the regression fix needs a regression fix....
> 
> Yup.
> 
> From the description of the problem, it sounds like this happens on
> real hardware, no vhost anywhere?
> 
> Or maybe Darrick (who doesn't see the issue) is running on raw
> hardware, and you and Zorro are running in a virtual environment?

I tested both virtual environments and raw hardware.

We have a pool of test machines which contains lots of different systems,
including real machines, KVM guests, and other kinds of VMs, covering different
arches (aarch64, s390x, ppc64le and x86_64) and different kinds of
storage (virt, hardware RAID, generic scsi disk, pmem, etc.). They all hang
on fstests generic/051.

I remember Darrick saying he tested with ~160 VMs (needs confirmation
from him).

So this issue might not be related to VMs vs. real machines. Hmm... maybe it's
related to some kernel config? If Darrick would like to provide his .config
file, I can diff it against mine to check the differences.

Thanks,
Zorro

> 
> It sounds like zap_other_threads() and coredump_task_exit() do not
> agree about the core_state->nr_threads counting, which is part of what
> changed there.
> 
> [ Goes off to look ]
> 
> Hmm. Both seem to be using the same test for
> 
>     (t->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER
> 
> which I don't love - I don't think io_uring threads should participate
> in core dumping either, so I think the test could just be
> 
>     (t->flags & PF_IO_WORKER)
> 
> but that shouldn't be the issue here.
> 
> But according to
> 
>   https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/
> 
> it's clearly hanging in wait_for_completion_state() in
> coredump_wait(), so it really looks like some confusion about that
> core_waiters (aka core_state->nr_threads) count.
> 
> Oh. Humm. Mike changed that initial rough patch of mine, and I had
> moved the "if you don't participate in core dumps" test up also past
> the "do_coredump()" logic.
> 
> And I think it's horribly *wrong* for a thread that doesn't get
> counted for core-dumping to go into do_coredump(), because then it
> will set the "core_state" to possibly be the core-state of the vhost
> thread that isn't even counted.
> 
> So *maybe* this attached patch might fix it? I haven't thought very
> deeply about this, but vhost workers most definitely shouldn't call
> do_coredump(), since they are then not counted.
> 
> (And again, I think we should just check that PF_IO_WORKER bit, not
> use this more complex test, but that's a separate and bigger change).
> 
>               Linus



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  6:11               ` Linus Torvalds
@ 2023-06-12  8:45                 ` Eric W. Biederman
  2023-06-12  9:30                   ` Zorro Lang
  2023-06-12 11:27                   ` Dave Chinner
  0 siblings, 2 replies; 31+ messages in thread
From: Eric W. Biederman @ 2023-06-12  8:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Darrick J. Wong, Zorro Lang, linux-xfs,
	Mike Christie, Michael S. Tsirkin, linux-kernel

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, Jun 11, 2023 at 10:49 PM Dave Chinner <david@fromorbit.com> wrote:
>>
>> On Sun, Jun 11, 2023 at 10:34:29PM -0700, Linus Torvalds wrote:
>> >
>> > So that "!=" should obviously have been a "==".
>>
>> Same as without the condition - all the fsstress tasks hang in
>> do_coredump().
>
> Ok, that at least makes sense. Your "it made things worse" made me go
> "What?" until I noticed the stupid backwards test.
>
> I'm not seeing anything else that looks odd in that commit
> f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps
> regression").
>
> Let's see if somebody else goes "Ahh" when they wake up tomorrow...

It feels like there have been about half a dozen bugs pointed out in
that version of the patch.  I am going to have to sleep before I can get
as far as "Ahh".

One thing that really stands out for me is:

if (test_if_loop_should_continue) {
	set_current_state(TASK_INTERRUPTIBLE);
	schedule();
}

/* elsewhere */
llist_add(...);
wake_up_process()

So it is possible for the code to sleep indefinitely, waiting for a
wake-up that has already come, because set_current_state and the test
are done in the wrong order.
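
The conventional kernel fix for that pattern is to publish the task state
before re-checking the condition, so a wake-up arriving between the check and
schedule() cannot be lost. A rough sketch of the idiom (not the actual
vhost_worker code):

```c
/* Lost-wakeup-safe idiom, sketched against the snippet above.
 * set_current_state() comes before the condition re-check: a concurrent
 * llist_add() + wake_up_process() then either makes the re-check see the
 * new work, or moves the task back to TASK_RUNNING so the subsequent
 * schedule() returns immediately instead of sleeping forever. */
for (;;) {
	set_current_state(TASK_INTERRUPTIBLE);	/* publish state first */
	if (test_if_loop_should_continue) {	/* then re-check */
		__set_current_state(TASK_RUNNING);
		break;
	}
	schedule();	/* safe: the wake-up cannot slip through */
}
```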

Unfortunately I don't see what would affect a coredump on a process that
does not trigger the vhost_worker code.



About the only thing I can imagine is if io_uring is involved.  Some of
the PF_IO_WORKER code was changed, and the test
"((t->flags & (PF_USER_WORKER | PF_IO_WORKER)) != PF_USER_WORKER)"
was sprinkled around.

That is the only code outside of vhost specific code that was changed.


Is io_uring involved in the cases that hang?


Eric


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  8:45                 ` Eric W. Biederman
@ 2023-06-12  9:30                   ` Zorro Lang
  2023-06-12 11:27                   ` Dave Chinner
  1 sibling, 0 replies; 31+ messages in thread
From: Zorro Lang @ 2023-06-12  9:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Dave Chinner, Darrick J. Wong, Zorro Lang,
	linux-xfs, Mike Christie, Michael S. Tsirkin, linux-kernel

On Mon, Jun 12, 2023 at 03:45:12AM -0500, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > On Sun, Jun 11, 2023 at 10:49 PM Dave Chinner <david@fromorbit.com> wrote:
> >>
> >> On Sun, Jun 11, 2023 at 10:34:29PM -0700, Linus Torvalds wrote:
> >> >
> >> > So that "!=" should obviously have been a "==".
> >>
> >> Same as without the condition - all the fsstress tasks hang in
> >> do_coredump().
> >
> > Ok, that at least makes sense. Your "it made things worse" made me go
> > "What?" until I noticed the stupid backwards test.
> >
> > I'm not seeing anything else that looks odd in that commit
> > f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps
> > regression").
> >
> > Let's see if somebody else goes "Ahh" when they wake up tomorrow...
> 
> It feels like there have been about half a dozen bugs pointed out in
> that version of the patch.  I am going to have to sleep before I can get
> as far as "Ahh"
> 
> One thing that really stands out for me is.
> 
> if (test_if_loop_should_continue) {
> 	set_current_state(TASK_INTERRUPTIBLE);
>         schedule();
> }
> 
> /* elsewhere */
> llist_add(...);
> wake_up_process()
> 
> So it is possible that the code can sleep indefinitely waiting for a
> wake-up that has already come, because the order of set_current_state
> and the test are in the wrong order.
> 
> Unfortunately I don't see what would affect a coredump on a process that
> does not trigger the vhost_worker code.
> 
> 
> 
> About the only thing I can imagine is if io_uring is involved.  Some of
> the PF_IO_WORKER code was changed, and the test
> "((t->flags & (PF_USER_WORKER | PF_IO_WORKER)) != PF_USER_WORKER)"
> was sprinkled around.
> 
> That is the only code outside of vhost specific code that was changed.
> 
> 
> Is io_uring involved in the cases that hang?

Oh, right, I enabled io_uring in fstests' fsstress.c, and I built the kernel
with CONFIG_IO_URING=y. If Darrick (who said he didn't hit this issue) didn't
enable io_uring, that might mean it's io_uring related.

> 
> 
> Eric
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  8:45                 ` Eric W. Biederman
  2023-06-12  9:30                   ` Zorro Lang
@ 2023-06-12 11:27                   ` Dave Chinner
  1 sibling, 0 replies; 31+ messages in thread
From: Dave Chinner @ 2023-06-12 11:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Darrick J. Wong, Zorro Lang, linux-xfs,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Mon, Jun 12, 2023 at 03:45:12AM -0500, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > On Sun, Jun 11, 2023 at 10:49 PM Dave Chinner <david@fromorbit.com> wrote:
> >>
> >> On Sun, Jun 11, 2023 at 10:34:29PM -0700, Linus Torvalds wrote:
> >> >
> >> > So that "!=" should obviously have been a "==".
> >>
> >> Same as without the condition - all the fsstress tasks hang in
> >> do_coredump().
> >
> > Ok, that at least makes sense. Your "it made things worse" made me go
> > "What?" until I noticed the stupid backwards test.
> >
> > I'm not seeing anything else that looks odd in that commit
> > f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps
> > regression").
> >
> > Let's see if somebody else goes "Ahh" when they wake up tomorrow...

....

> About the only thing I can imagine is if io_uring is involved.  Some of
> the PF_IO_WORKER code was changed, and the test
> "((t->flags & (PF_USER_WORKER | PF_IO_WORKER)) != PF_USER_WORKER)"
> was sprinkled around.
> 
> That is the only code outside of vhost specific code that was changed.
> 
> 
> Is io_uring involved in the cases that hang?

Yes. fsstress randomly uses io_uring for the ops that it runs.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12  3:14       ` Linus Torvalds
  2023-06-12  5:16         ` Dave Chinner
  2023-06-12  6:36         ` Zorro Lang
@ 2023-06-12 15:36         ` Darrick J. Wong
  2023-06-12 15:52           ` Eric W. Biederman
  2023-06-12 15:56           ` Linus Torvalds
  2 siblings, 2 replies; 31+ messages in thread
From: Darrick J. Wong @ 2023-06-12 15:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Sun, Jun 11, 2023 at 08:14:25PM -0700, Linus Torvalds wrote:
> On Sun, Jun 11, 2023 at 7:22 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > I guess the regression fix needs a regression fix....
> 
> Yup.
> 
> From the description of the problem, it sounds like this happens on
> real hardware, no vhost anywhere?
> 
> Or maybe Darrick (who doesn't see the issue) is running on raw
> hardware, and you and Zorro are running in a virtual environment?

Ahah, it turns out that liburing-dev isn't installed on the test fleet,
so fstests didn't get built with io_uring support.  That probably
explains why I don't see any of these hangs.

Oh.  I can't *install* the debian liburing-dev package because it has
a versioned dependency on linux-libc-dev >= 5.1, which isn't compatible
with me having a linux-libc-dev-djwong package that contains the uapi
headers for the latest upstream kernel and Replaces: linux-libc-dev.
So either I have to create a dummy linux-libc-dev with adequate version
number that pulls in my own libc header package, or rename that package.

<sigh> It's going to take me a while to research how best to split this
stupid knot.

--D

> It sounds like zap_other_threads() and coredump_task_exit() do not
> agree about the core_state->nr_threads counting, which is part of what
> changed there.
> 
> [ Goes off to look ]
> 
> Hmm. Both seem to be using the same test for
> 
>     (t->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER
> 
> which I don't love - I don't think io_uring threads should participate
> in core dumping either, so I think the test could just be
> 
>     (t->flags & PF_IO_WORKER)
> 
> but that shouldn't be the issue here.
> 
> But according to
> 
>   https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/
> 
> it's clearly hanging in wait_for_completion_state() in
> coredump_wait(), so it really looks like some confusion about that
> core_waiters (aka core_state->nr_threads) count.
> 
> Oh. Humm. Mike changed that initial rough patch of mine, and I had
> moved the "if you don't participate in core dumps" test up also past
> the "do_coredump()" logic.
> 
> And I think it's horribly *wrong* for a thread that doesn't get
> counted for core-dumping to go into do_coredump(), because then it
> will set the "core_state" to possibly be the core-state of the vhost
> thread that isn't even counted.
> 
> So *maybe* this attached patch might fix it? I haven't thought very
> deeply about this, but vhost workers most definitely shouldn't call
> do_coredump(), since they are then not counted.
> 
> (And again, I think we should just check that PF_IO_WORKER bit, not
> use this more complex test, but that's a separate and bigger change).
> 
>               Linus

>  kernel/signal.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 2547fa73bde5..a1e11ee8537c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2847,6 +2847,10 @@ bool get_signal(struct ksignal *ksig)
>  		 */
>  		current->flags |= PF_SIGNALED;
>  
> +		/* vhost workers don't participate in core dups */
> +		if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
> +			goto out;
> +
>  		if (sig_kernel_coredump(signr)) {
>  			if (print_fatal_signals)
>  				print_fatal_signal(ksig->info.si_signo);


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 15:36         ` Darrick J. Wong
@ 2023-06-12 15:52           ` Eric W. Biederman
  2023-06-12 15:56           ` Linus Torvalds
  1 sibling, 0 replies; 31+ messages in thread
From: Eric W. Biederman @ 2023-06-12 15:52 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Linus Torvalds, Dave Chinner, Zorro Lang, linux-xfs,
	Mike Christie, Michael S. Tsirkin, linux-kernel


Can someone who can reproduce the hang run this test patch?

I am currently drawing a blank looking at the changes, so I am
proposing some debug code to help us narrow things down.

Can someone who can reproduce this run the code below?

The tests reproducing this don't appear to use /dev/vhost-net or
/dev/vhost-vsock.  So if the WARN_ON's trigger, it is a good sign
that the code connected to the WARN_ON's is wrong.

If the WARN_ON's don't trigger, I suspect the code in kernel/fork.c.

But as I said, staring at the code I don't see anything wrong.

Eric


diff --git a/fs/coredump.c b/fs/coredump.c
index 88740c51b942..e9acf0a2d2f0 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -374,6 +374,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 			/* The vhost_worker does not particpate in coredumps */
 			if ((t->flags & (PF_USER_WORKER | PF_IO_WORKER)) != PF_USER_WORKER)
 				nr++;
+			else WARN_ON_ONCE(true);
 		}
 	}
 
diff --git a/kernel/exit.c b/kernel/exit.c
index edb50b4c9972..56002a58ec33 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -437,6 +437,7 @@ static void coredump_task_exit(struct task_struct *tsk)
 		}
 		__set_current_state(TASK_RUNNING);
 	}
+	else if (core_state) WARN_ON_ONCE(true);
 }
 
 #ifdef CONFIG_MEMCG
diff --git a/kernel/signal.c b/kernel/signal.c
index 2547fa73bde5..1be27dbbce62 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1371,6 +1371,7 @@ int zap_other_threads(struct task_struct *p)
 		/* Don't require de_thread to wait for the vhost_worker */
 		if ((t->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
 			count++;
+		else WARN_ON_ONCE(true);
 
 		/* Don't bother with already dead threads */
 		if (t->exit_state)

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 15:36         ` Darrick J. Wong
  2023-06-12 15:52           ` Eric W. Biederman
@ 2023-06-12 15:56           ` Linus Torvalds
  2023-06-12 16:27             ` Jens Axboe
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 15:56 UTC (permalink / raw)
  To: Darrick J. Wong, Jens Axboe
  Cc: Dave Chinner, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On Mon, Jun 12, 2023 at 8:36 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> > Or maybe Darrick (who doesn't see the issue) is running on raw
> > hardware, and you and Zorro are running in a virtual environment?
>
> Ahah, it turns out that liburing-dev isn't installed on the test fleet,
> so fstests didn't get built with io_uring support.  That probably
> explains why I don't see any of these hangs.
>
> Oh.  I can't *install* the debian liburing-dev package because it has
> a versioned dependency on linux-libc-dev >= 5.1, which isn't compatible
> with me having a linux-libc-dev-djwong package that contains the uapi
> headers for the latest upstream kernel and Replaces: linux-libc-dev.
> So either I have to create a dummy linux-libc-dev with adequate version
> number that pulls in my own libc header package, or rename that package.
>
> <sigh> It's going to take me a while to research how best to split this
> stupid knot.

Oh, no, that's great. It explains why you don't see the problem, and
Dave and Zorro do. Perfect.

No need for you to install any liburing packages, at least for this
issue. You'll probably want it eventually just for test coverage, but
for now it's the smoking gun we wanted - I was looking at why vhost
would be impacted, because that commit so intentionally *tried* to not
do anything at all to io_uring.

But it obviously failed. Which then in turn explains the bug.

Not that I see exactly where it went wrong yet, but at least we're
looking at the right thing. Adding Jens to the participants, in case
he sees what goes wrong.

Jens, commit f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix
freezer/ps regression") seems to have broken core dumping with
io_uring threads, even though it tried very hard not to. See

  https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/

for the beginning of this thread.

Honestly, that "try to not change io_uring" was my least favorite part
of that patch, because I really think we want to try to aim for these
user helper threads having as much infrastructure in common as
possible. And when it comes to core dumps, I do not believe that
waiting for the io_uring thread adds anything to the end result,
because the only reason we wait for it is to put in the thread
register state into the core dump, and for kernel helper threads, that
information just isn't useful. It's going to be the state that caused
the thread to be created, not anything that is worth saving in a core
dump for.

So I'd actually prefer to just simplify the logic entirely, and say
"PF_USER_WORKER tasks do not participate in core dumps, end of story".
io_uring didn't _care_, so including them wasn't a pain, but if the
vhost exit case can be delayed, I'd rather just say "let's do this
thing for both io_uring and vhost, and not split those two cases up".

Anyway, I don't see exactly what goes wrong, but I feel better just
from this having been narrowed down to io_uring threads. I suspect
Jens actually might even have a core-dumping test-case somewhere,
since core dumping was a thing that io_uring ended up having some
issues with at one point.

           Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 15:56           ` Linus Torvalds
@ 2023-06-12 16:27             ` Jens Axboe
  2023-06-12 16:38               ` Jens Axboe
  0 siblings, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 16:27 UTC (permalink / raw)
  To: Linus Torvalds, Darrick J. Wong
  Cc: Dave Chinner, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On 6/12/23 9:56 AM, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 8:36 AM Darrick J. Wong <djwong@kernel.org> wrote:
>>
>>> Or maybe Darrick (who doesn't see the issue) is running on raw
>>> hardware, and you and Zorro are running in a virtual environment?
>>
>> Ahah, it turns out that liburing-dev isn't installed on the test fleet,
>> so fstests didn't get built with io_uring support.  That probably
>> explains why I don't see any of these hangs.
>>
>> Oh.  I can't *install* the debian liburing-dev package because it has
>> a versioned dependency on linux-libc-dev >= 5.1, which isn't compatible
>> with me having a linux-libc-dev-djwong package that contains the uapi
>> headers for the latest upstream kernel and Replaces: linux-libc-dev.
>> So either I have to create a dummy linux-libc-dev with adequate version
>> number that pulls in my own libc header package, or rename that package.
>>
>> <sigh> It's going to take me a while to research how best to split this
>> stupid knot.
> 
> Oh, no, that's great. It explains why you don't see the problem, and
> Dave and Zorro do. Perfect.
> 
> No need for you to install any liburing packages, at least for this
> issue. You'll probably want it eventually just for test coverage, but
> for now it's the smoking gun we wanted - I was looking at why vhost
> would be impacted, because that commit so intentionally *tried* to not
> do anything at all to io_uring.
> 
> But it obviously failed. Which then in turn explains the bug.
> 
> Not that I see exactly where it went wrong yet, but at least we're
> looking at the right thing. Adding Jens to the participants, in case
> he sees what goes wrong.
> 
> Jens, commit f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix
> freezer/ps regression") seems to have broken core dumping with
> io_uring threads, even though it tried very hard not to. See
> 
>   https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/
> 
> for the beginning of this thread.
> 
> Honestly, that "try to not change io_uring" was my least favorite part
> of that patch, because I really think we want to try to aim for these
> user helper threads having as much infrastructure in common as
> possible. And when it comes to core dumps, I do not believe that
> waiting for the io_uring thread adds anything to the end result,
> because the only reason we wait for it is to put in the thread
> register state into the core dump, and for kernel helper threads, that
> information just isn't useful. It's going to be the state that caused
> the thread to be created, not anything that is worth saving in a core
> dump for.
> 
> So I'd actually prefer to just simplify the logic entirely, and say
> "PF_USER_WORKER tasks do not participate in core dumps, end of story".
> io_uring didn't _care_, so including them wasn't a pain, but if the
> vhost exit case can be delayed, I'd rather just say "let's do this
> thing for both io_uring and vhost, and not split those two cases up".
> 
> Anyway, I don't see exactly what goes wrong, but I feel better just
> from this having been narrowed down to io_uring threads. I suspect
> Jens actually might even have a core-dumping test-case somewhere,
> since core dumping was a thing that io_uring ended up having some
> issues with at one point.

I'll take a look - at the exact same time as your email, someone just
reported this issue separately on the liburing GH page as well. Tried
myself, and yup, anything that ends up spawning an io-wq worker and then
core dumps will now get stuck:

[  136.271250] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  136.271711] task:ih              state:D stack:0     pid:736   ppid:727    flags:0x00000004
[  136.272218] Call trace:
[  136.272353]  __switch_to+0xb0/0xc8
[  136.272555]  __schedule+0x528/0x584
[  136.272757]  schedule+0x4c/0x90
[  136.272936]  schedule_timeout+0x30/0xdc
[  136.273179]  __wait_for_common+0x8c/0x118
[  136.273407]  wait_for_completion_state+0x1c/0x30
[  136.273686]  do_coredump+0x334/0x1000
[  136.273898]  get_signal+0x19c/0x5d8
[  136.274108]  do_notify_resume+0x10c/0xa0c
[  136.274346]  el0_da+0x50/0x5c
[  136.274555]  el0t_64_sync_handler+0xb8/0x134
[  136.274812]  el0t_64_sync+0x168/0x16c

Not good... I don't immediately see what the issue is, but I'll poke
shortly after a few meetings.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:27             ` Jens Axboe
@ 2023-06-12 16:38               ` Jens Axboe
  2023-06-12 16:42                 ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 16:38 UTC (permalink / raw)
  To: Linus Torvalds, Darrick J. Wong
  Cc: Dave Chinner, Zorro Lang, linux-xfs, Eric W. Biederman,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On 6/12/23 10:27 AM, Jens Axboe wrote:
> On 6/12/23 9:56 AM, Linus Torvalds wrote:
>> On Mon, Jun 12, 2023 at 8:36 AM Darrick J. Wong <djwong@kernel.org> wrote:
>>>
>>>> Or maybe Darrick (who doesn't see the issue) is running on raw
>>>> hardware, and you and Zorro are running in a virtual environment?
>>>
>>> Ahah, it turns out that liburing-dev isn't installed on the test fleet,
>>> so fstests didn't get built with io_uring support.  That probably
>>> explains why I don't see any of these hangs.
>>>
>>> Oh.  I can't *install* the debian liburing-dev package because it has
>>> a versioned dependency on linux-libc-dev >= 5.1, which isn't compatible
>>> with me having a linux-libc-dev-djwong package that contains the uapi
>>> headers for the latest upstream kernel and Replaces: linux-libc-dev.
>>> So either I have to create a dummy linux-libc-dev with adequate version
>>> number that pulls in my own libc header package, or rename that package.
>>>
>>> <sigh> It's going to take me a while to research how best to split this
>>> stupid knot.
>>
>> Oh, no, that's great. It explains why you don't see the problem, and
>> Dave and Zorro do. Perfect.
>>
>> No need for you to install any liburing packages, at least for this
>> issue. You'll probably want it eventually just for test coverage, but
>> for now it's the smoking gun we wanted - I was looking at why vhost
>> would be impacted, because that commit so intentionally *tried* to not
>> do anything at all to io_uring.
>>
>> But it obviously failed. Which then in turn explains the bug.
>>
>> Not that I see exactly where it went wrong yet, but at least we're
>> looking at the right thing. Adding Jens to the participants, in case
>> he sees what goes wrong.
>>
>> Jens, commit f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix
>> freezer/ps regression") seems to have broken core dumping with
>> io_uring threads, even though it tried very hard not to. See
>>
>>   https://lore.kernel.org/all/20230611124836.whfktwaumnefm5z5@zlang-mailbox/
>>
>> for the beginning of this thread.
>>
>> Honestly, that "try to not change io_uring" was my least favorite part
>> of that patch, because I really think we want to try to aim for these
>> user helper threads having as much infrastructure in common as
>> possible. And when it comes to core dumps, I do not believe that
>> waiting for the io_uring thread adds anything to the end result,
>> because the only reason we wait for it is to put in the thread
>> register state into the core dump, and for kernel helper threads, that
>> information just isn't useful. It's going to be the state that caused
>> the thread to be created, not anything that is worth saving in a core
>> dump for.
>>
>> So I'd actually prefer to just simplify the logic entirely, and say
>> "PF_USER_WORKER tasks do not participate in core dumps, end of story".
>> io_uring didn't _care_, so including them wasn't a pain, but if the
>> vhost exit case can be delayed, I'd rather just say "let's do this
>> thing for both io_uring and vhost, and not split those two cases up".
>>
>> Anyway, I don't see exactly what goes wrong, but I feel better just
>> from this having been narrowed down to io_uring threads. I suspect
>> Jens actually might even have a core-dumping test-case somewhere,
>> since core dumping was a thing that io_uring ended up having some
>> issues with at one point.
> 
> I'll take a look - at the exact same time as your email, someone just
> reported this issue separately on the liburing GH page as well. Tried
> myself, and yup, anything that ends up spawning an io-wq worker and then
> core dumps will now get stuck:
> 
> [  136.271250] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  136.271711] task:ih              state:D stack:0     pid:736   ppid:727    flags:0x00000004
> [  136.272218] Call trace:
> [  136.272353]  __switch_to+0xb0/0xc8
> [  136.272555]  __schedule+0x528/0x584
> [  136.272757]  schedule+0x4c/0x90
> [  136.272936]  schedule_timeout+0x30/0xdc
> [  136.273179]  __wait_for_common+0x8c/0x118
> [  136.273407]  wait_for_completion_state+0x1c/0x30
> [  136.273686]  do_coredump+0x334/0x1000
> [  136.273898]  get_signal+0x19c/0x5d8
> [  136.274108]  do_notify_resume+0x10c/0xa0c
> [  136.274346]  el0_da+0x50/0x5c
> [  136.274555]  el0t_64_sync_handler+0xb8/0x134
> [  136.274812]  el0t_64_sync+0x168/0x16c
> 
> Not good... I don't immediately see what the issue is, but I'll poke
> shortly after a few meetings.

Quick peek would suggest that it's because io-wq clears PF_IO_WORKER on
exit, and now we fail the check in coredump_task_exit() that was added.

From my quick recollection, this is to avoid hitting the schedule out
callback on exit. But I could be totally wrong... In any case, I'd be
surprised if this isn't why it got broken by Mike's patch.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:38               ` Jens Axboe
@ 2023-06-12 16:42                 ` Linus Torvalds
  2023-06-12 16:45                   ` Jens Axboe
  2023-06-12 16:45                   ` Linus Torvalds
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 16:42 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 9:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Quick peek would suggest that it's because io-wq clears PF_IO_WORKER on
> exit, and now we fail the check in coredump_task_exit() that was added.

Oh, that makes sense.

Well, it makes sense for the bug, but that whole

        preempt_disable();
        current->flags &= ~PF_IO_WORKER;
        preempt_enable();

thing in io_worker_exit() does *not* make sense to me.

Does removing those three lines make things "JustWork(tm)"?

               Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:42                 ` Linus Torvalds
@ 2023-06-12 16:45                   ` Jens Axboe
  2023-06-12 16:57                     ` Linus Torvalds
  2023-06-12 16:45                   ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 16:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On 6/12/23 10:42 AM, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 9:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Quick peek would suggest that it's because io-wq clears PF_IO_WORKER on
>> exit, and now we fail the check in coredump_task_exit() that was added.
> 
> Oh, that makes sense.
> 
> Well, it makes sense for the bug, but that whole
> 
>         preempt_disable();
>         current->flags &= ~PF_IO_WORKER;
>         preempt_enable();
> 
> thing in io_worker_exit() does *not* make sense to me.
> 
> Does removing those three lines make things "JustWork(tm)"?

You snipped the suspicion in my reply on why that exists, to avoid
io_wq_worker_sleeping() triggering. But arguably this should be covered
by the RUNNING flag. I'll poke at it shortly, and yeah then we should
just remove the PF_IO_WORKER clearing.

Or maybe I'm just smoking crack and it's no longer needed at all. Will
test and send a patch.

-- 
Jens Axboe



* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:42                 ` Linus Torvalds
  2023-06-12 16:45                   ` Jens Axboe
@ 2023-06-12 16:45                   ` Linus Torvalds
  1 sibling, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 16:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 9:42 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Well, it makes sense for the bug, but that whole
>
>         preempt_disable();
>         current->flags &= ~PF_IO_WORKER;
>         preempt_enable();
>
> thing in io_worker_exit() does *not* make sense to me.

Oh, it looks like that goes back to the original io_worker
implementation, when io_worker_start() would set the PF_IO_WORKER bit,
and io_worker_exit() would clear it, and they were otherwise created
like normal kthreads.

It *looks* all entirely historical to me, and removing those three
lines as left-over baggage smells like the right thing.

              Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:45                   ` Jens Axboe
@ 2023-06-12 16:57                     ` Linus Torvalds
  2023-06-12 17:11                       ` Eric W. Biederman
  2023-06-12 17:29                       ` Jens Axboe
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 16:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 9:45 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> You snipped the suspicion in my reply on why that exists, to avoid
> io_wq_worker_sleeping() triggering.

I'm not seeing why triggering io_wq_worker_sleeping() should even be a
problem in the first place.

I suspect that is entirely historical too, and has to do with how it
used to do that

        struct io_worker *worker = kthread_data(tsk);
        struct io_wqe *wqe = worker->wqe;

back in the bad old days of kthreads.

But yeah, I don't know that code.

              Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:57                     ` Linus Torvalds
@ 2023-06-12 17:11                       ` Eric W. Biederman
  2023-06-12 17:30                         ` Jens Axboe
  2023-06-12 17:29                       ` Jens Axboe
  1 sibling, 1 reply; 31+ messages in thread
From: Eric W. Biederman @ 2023-06-12 17:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Mike Christie, Michael S. Tsirkin, linux-kernel

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Jun 12, 2023 at 9:45 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> You snipped the suspicion in my reply on why that exists, to avoid
>> io_wq_worker_sleeping() triggering.
>
> I'm not seeing why triggering io_wq_worker_sleeping() should even be a
> problem in the first place.
>
> I suspect that is entirely historical too, and has to do with how it
> used to do that
>
>         struct io_worker *worker = kthread_data(tsk);
>         struct io_wqe *wqe = worker->wqe;
>
> back in the bad old days of kthreads.
>
> But yeah, I don't know that code.

If it is a problem it looks like the thread shutdown can clear
"worker->flags & IO_WORKER_F_UP" rather than
"current->flags & PF_IO_WORKER".

I don't see how it makes sense for the load balancing logic for
a per-process thread pool to be running at that point.

Eric


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 16:57                     ` Linus Torvalds
  2023-06-12 17:11                       ` Eric W. Biederman
@ 2023-06-12 17:29                       ` Jens Axboe
  2023-06-12 17:51                         ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 17:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On 6/12/23 10:57 AM, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 9:45 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> You snipped the suspicion in my reply on why that exists, to avoid
>> io_wq_worker_sleeping() triggering.
> 
> I'm not seeing why triggering io_wq_worker_sleeping() should even be a
> problem in the first place.
> 
> I suspect that is entirely historical too, and has to do with how it
> used to do that
> 
>         struct io_worker *worker = kthread_data(tsk);
>         struct io_wqe *wqe = worker->wqe;
> 
> back in the bad old days of kthreads.
> 
> But yeah, I don't know that code.

Looks fine to me to just kill it indeed, whatever we did need this
for is definitely no longer the case. I _think_ we used to have
something in the worker exit that would potentially sleep which
is why we killed it before doing that, now it just looks like dead
code.

-- 
Jens Axboe




* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 17:11                       ` Eric W. Biederman
@ 2023-06-12 17:30                         ` Jens Axboe
  0 siblings, 0 replies; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 17:30 UTC (permalink / raw)
  To: Eric W. Biederman, Linus Torvalds
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Mike Christie, Michael S. Tsirkin, linux-kernel

On 6/12/23 11:11 AM, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
>> On Mon, Jun 12, 2023 at 9:45 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> You snipped the suspicion in my reply on why that exists, to avoid
>>> io_wq_worker_sleeping() triggering.
>>
>> I'm not seeing why triggering io_wq_worker_sleeping() should even be a
>> problem in the first place.
>>
>> I suspect that is entirely historical too, and has to do with how it
>> used to do that
>>
>>         struct io_worker *worker = kthread_data(tsk);
>>         struct io_wqe *wqe = worker->wqe;
>>
>> back in the bad old days of kthreads.
>>
>> But yeah, I don't know that code.
> 
> If it is a problem it looks like the thread shutdown can clear
> "worker->flags & IO_WORKER_F_UP" rather than
> "current->flags & PF_IO_WORKER".
> 
> I don't see how it makes sense for the load balancing logic for
> a per-process thread pool to be running at that point.

Yep that was my thinking too, if we did need it, we could fiddle with
the UP flag instead. But as per the previous reply, it should be able to
just get removed at this point.

-- 
Jens Axboe



* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 17:29                       ` Jens Axboe
@ 2023-06-12 17:51                         ` Linus Torvalds
  2023-06-12 17:53                           ` Jens Axboe
  2023-06-12 18:34                           ` Linus Torvalds
  0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 17:51 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 10:29 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Looks fine to me to just kill it indeed, whatever we did need this
> for is definitely no longer the case. I _think_ we used to have
> something in the worker exit that would potentially sleep which
> is why we killed it before doing that, now it just looks like dead
> code.

Ok, can you (and the fsstress people) confirm that this
whitespace-damaged patch fixes the coredump issue:


  --- a/io_uring/io-wq.c
  +++ b/io_uring/io-wq.c
  @@ -221,9 +221,6 @@ static void io_worker_exit(..
        raw_spin_unlock(&wq->lock);
        io_wq_dec_running(worker);
        worker->flags = 0;
  -     preempt_disable();
  -     current->flags &= ~PF_IO_WORKER;
  -     preempt_enable();

        kfree_rcu(worker, rcu);
        io_worker_ref_put(wq);

Jens, I think that the two lines above there, ie the whole

        io_wq_dec_running(worker);
        worker->flags = 0;

thing may actually be the (partial?) reason for those PF_IO_WORKER
games. It's basically doing "now I'm doing stats by hand", and I
wonder if now it decrements the running worker one time too much?

Ie when the finally *dead* worker schedules away, never to return,
that's when that io_wq_worker_sleeping() case triggers and decrements
things one more time.

So there might be some bookkeeping reason for those games, but it
looks like if that's the case, then that

        worker->flags = 0;

will have already taken care of it.

I wonder if those two lines could just be removed too. But I think
that's separate from the "let's fix the core dumping" issue.

           Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 17:51                         ` Linus Torvalds
@ 2023-06-12 17:53                           ` Jens Axboe
  2023-06-12 17:56                             ` Linus Torvalds
  2023-06-12 18:34                           ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2023-06-12 17:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On 6/12/23 11:51 AM, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 10:29 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Looks fine to me to just kill it indeed, whatever we did need this
>> for is definitely no longer the case. I _think_ we used to have
>> something in the worker exit that would potentially sleep which
>> is why we killed it before doing that, now it just looks like dead
>> code.
> 
> Ok, can you (and the fsstress people) confirm that this
> whitespace-damaged patch fixes the coredump issue:
> 
> 
>   --- a/io_uring/io-wq.c
>   +++ b/io_uring/io-wq.c
>   @@ -221,9 +221,6 @@ static void io_worker_exit(..
>         raw_spin_unlock(&wq->lock);
>         io_wq_dec_running(worker);
>         worker->flags = 0;
>   -     preempt_disable();
>   -     current->flags &= ~PF_IO_WORKER;
>   -     preempt_enable();
> 
>         kfree_rcu(worker, rcu);
>         io_worker_ref_put(wq);

Yep, it fixes it on my end and it passes my basic tests as well.

> Jens, I think that the two lines above there, ie the whole
> 
>         io_wq_dec_running(worker);
>         worker->flags = 0;
> 
> thing may actually be the (partial?) reason for those PF_IO_WORKER
> games. It's basically doing "now I'm doing stats by hand", and I
> wonder if now it decrements the running worker one time too much?
> 
> Ie when the finally *dead* worker schedules away, never to return,
> that's when that io_wq_worker_sleeping() case triggers and decrements
> things one more time.
> 
> So there might be some bookkeeping reason for those games, but it
> looks like if that's the case, then that
> 
>         worker->flags = 0;
> 
> will have already taken care of it.
> 
> I wonder if those two lines could just be removed too. But I think
> that's separate from the "let's fix the core dumping" issue.

Something like that was/is my worry. Let me add some tracing to confirm
it's fine, don't fully trust just my inspection of it. I'll send out the
patch separately once done, and then would be great if you can just pick
it up so it won't have to wait until I need to send a pull later in the
week. Particularly as I have nothing planned for 6.4 unless something
else comes up of course.

-- 
Jens Axboe



* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 17:53                           ` Jens Axboe
@ 2023-06-12 17:56                             ` Linus Torvalds
  0 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 17:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 10:54 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Something like that was/is my worry. Let me add some tracing to confirm
> it's fine, don't fully trust just my inspection of it. I'll send out the
> patch separately once done, and then would be great if you can just pick
> it up so it won't have to wait until I need to send a pull later in the
> week. Particularly as I have nothing planned for 6.4 unless something
> else comes up of course.

Ack, sounds good.

                   Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 17:51                         ` Linus Torvalds
  2023-06-12 17:53                           ` Jens Axboe
@ 2023-06-12 18:34                           ` Linus Torvalds
  2023-06-12 23:33                             ` Dave Chinner
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2023-06-12 18:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Darrick J. Wong, Dave Chinner, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 10:51 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ok, can you (and the fsstress people) confirm that this
> whitespace-damaged patch fixes the coredump issue:

The proper patch from Jens is now in my tree, and I'm archiving this
thread on the assumption that it's all good.

Dave/Zorro - if you still see any issues with that patch in place, holler.

                   Linus


* Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)
  2023-06-12 18:34                           ` Linus Torvalds
@ 2023-06-12 23:33                             ` Dave Chinner
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Chinner @ 2023-06-12 23:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Darrick J. Wong, Zorro Lang, linux-xfs,
	Eric W. Biederman, Mike Christie, Michael S. Tsirkin,
	linux-kernel

On Mon, Jun 12, 2023 at 11:34:28AM -0700, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 10:51 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Ok, can you (and the fsstress people) confirm that this
> > whitespace-damaged patch fixes the coredump issue:
> 
> The proper patch from Jens is now in my tree, and I'm archiving this
> thread on the assumption that it's all good.
> 
> Dave/Zorro - if you still see any issues with that patch in place, holler.

The fix seems to work. The reproducer just ran through 50 times
without fail; it was previously always failing on the first or
second cycle.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


end of thread, other threads:[~2023-06-12 23:33 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-11 12:48 [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+ Zorro Lang
2023-06-11 13:40 ` Zorro Lang
2023-06-11 23:01 ` Dave Chinner
2023-06-12  1:51   ` Darrick J. Wong
2023-06-12  2:21     ` [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+) Dave Chinner
2023-06-12  3:14       ` Linus Torvalds
2023-06-12  5:16         ` Dave Chinner
2023-06-12  5:34           ` Linus Torvalds
2023-06-12  5:49             ` Dave Chinner
2023-06-12  6:11               ` Linus Torvalds
2023-06-12  8:45                 ` Eric W. Biederman
2023-06-12  9:30                   ` Zorro Lang
2023-06-12 11:27                   ` Dave Chinner
2023-06-12  6:36         ` Zorro Lang
2023-06-12 15:36         ` Darrick J. Wong
2023-06-12 15:52           ` Eric W. Biederman
2023-06-12 15:56           ` Linus Torvalds
2023-06-12 16:27             ` Jens Axboe
2023-06-12 16:38               ` Jens Axboe
2023-06-12 16:42                 ` Linus Torvalds
2023-06-12 16:45                   ` Jens Axboe
2023-06-12 16:57                     ` Linus Torvalds
2023-06-12 17:11                       ` Eric W. Biederman
2023-06-12 17:30                         ` Jens Axboe
2023-06-12 17:29                       ` Jens Axboe
2023-06-12 17:51                         ` Linus Torvalds
2023-06-12 17:53                           ` Jens Axboe
2023-06-12 17:56                             ` Linus Torvalds
2023-06-12 18:34                           ` Linus Torvalds
2023-06-12 23:33                             ` Dave Chinner
2023-06-12 16:45                   ` Linus Torvalds
