* [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
@ 2022-02-23  9:51 Sidong Yang
  2022-02-24 18:48 ` David Sterba
  0 siblings, 1 reply; 5+ messages in thread
From: Sidong Yang @ 2022-02-23  9:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Sidong Yang

The patch e804861bd4e6 by Kawasaki resolves the deadlock between quota
disable and the qgroup rescan worker, but there is a similar deadlock
case: it involves enabling or disabling quota while creating or removing
a qgroup. It can be reproduced with the simple script below.

for i in {1..100}
do
    btrfs quota enable /mnt &
    btrfs qgroup create 1/0 /mnt &
    btrfs qgroup destroy 1/0 /mnt &
    btrfs quota disable /mnt &
done

This script simply enables quota, creates and destroys a qgroup, and
disables quota, 100 times. Enabling quota starts the rescan worker, which
commits a transaction and waits in wait_for_commit(). transaction_kthread
wakes up for the commit and tries to attach the transaction, but there is
another current transaction. That transaction comes from another command
that destroys a qgroup, and destroying the qgroup is blocked on
qgroup_ioctl_lock, which is held by the thread disabling quota.
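
In other words, the circular wait looks roughly like this (my reading of
the traces below, simplified):

    btrfs quota disable:  holds qgroup_ioctl_lock, waits for the rescan
                          worker to complete
    rescan worker:        waits in wait_for_commit() for its commit
    transaction_kthread:  woken for that commit, waits for the current
                          transaction (opened by 'btrfs qgroup destroy')
    btrfs qgroup destroy: blocked on qgroup_ioctl_lock, closing the cycle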

An example report of the deadlock:

[  363.661448] INFO: task kworker/u16:4:295 blocked for more than 120 seconds.
[  363.661582]       Not tainted 5.17.0-rc4+ #16
[  363.661659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.661744] task:kworker/u16:4   state:D stack:    0 pid:  295 ppid:     2 flags:0x00004000
[  363.661762] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
[  363.661936] Call Trace:
[  363.661949]  <TASK>
[  363.661958]  __schedule+0x2e5/0xbb0
[  363.662002]  ? btrfs_free_path+0x27/0x30 [btrfs]
[  363.662094]  ? mutex_lock+0x13/0x40
[  363.662106]  schedule+0x58/0xd0
[  363.662116]  btrfs_commit_transaction+0x2dc/0xb40 [btrfs]
[  363.662250]  ? wait_woken+0x60/0x60
[  363.662271]  btrfs_qgroup_rescan_worker+0x3cb/0x600 [btrfs]
[  363.662419]  btrfs_work_helper+0xc8/0x330 [btrfs]
[  363.662551]  process_one_work+0x21a/0x3f0
[  363.662588]  worker_thread+0x4a/0x3b0
[  363.662600]  ? process_one_work+0x3f0/0x3f0
[  363.662609]  kthread+0xfd/0x130
[  363.662618]  ? kthread_complete_and_exit+0x20/0x20
[  363.662628]  ret_from_fork+0x1f/0x30
[  363.662659]  </TASK>
[  363.662691] INFO: task btrfs-transacti:1158 blocked for more than 120 seconds.
[  363.662765]       Not tainted 5.17.0-rc4+ #16
[  363.662809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.662880] task:btrfs-transacti state:D stack:    0 pid: 1158 ppid:     2 flags:0x00004000
[  363.662889] Call Trace:
[  363.662892]  <TASK>
[  363.662896]  __schedule+0x2e5/0xbb0
[  363.662906]  ? _raw_spin_lock_irqsave+0x2a/0x60
[  363.662925]  schedule+0x58/0xd0
[  363.662942]  wait_current_trans+0xd2/0x130 [btrfs]
[  363.663046]  ? wait_woken+0x60/0x60
[  363.663055]  start_transaction+0x33c/0x600 [btrfs]
[  363.663159]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
[  363.663268]  transaction_kthread+0xb5/0x1b0 [btrfs]
[  363.663368]  ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
[  363.663465]  kthread+0xfd/0x130
[  363.663475]  ? kthread_complete_and_exit+0x20/0x20
[  363.663484]  ret_from_fork+0x1f/0x30
[  363.663498]  </TASK>
[  363.663503] INFO: task btrfs:81196 blocked for more than 120 seconds.
[  363.663568]       Not tainted 5.17.0-rc4+ #16
[  363.663612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.663693] task:btrfs           state:D stack:    0 pid:81196 ppid:     1 flags:0x00000000
[  363.663702] Call Trace:
[  363.663705]  <TASK>
[  363.663709]  __schedule+0x2e5/0xbb0
[  363.663721]  schedule+0x58/0xd0
[  363.663729]  rwsem_down_write_slowpath+0x310/0x5b0
[  363.663748]  ? __check_object_size+0x130/0x150
[  363.663770]  down_write+0x41/0x50
[  363.663780]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
[  363.663918]  ? debug_smp_processor_id+0x17/0x20
[  363.663932]  ? fpregs_assert_state_consistent+0x23/0x50
[  363.663963]  __x64_sys_ioctl+0x8e/0xc0
[  363.663981]  ? __x64_sys_ioctl+0x8e/0xc0
[  363.663990]  do_syscall_64+0x38/0xc0
[  363.663998]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  363.664006] RIP: 0033:0x7f1082add50b
[  363.664014] RSP: 002b:00007fffbfd1ba98 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[  363.664022] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1082add50b
[  363.664028] RDX: 00007fffbfd1bab0 RSI: 00000000c0109428 RDI: 0000000000000003
[  363.664032] RBP: 0000000000000003 R08: 000055e4263142a0 R09: 000000000000007c
[  363.664036] R10: 00007f1082bb1be0 R11: 0000000000000206 R12: 00007fffbfd1c723
[  363.664040] R13: 0000000000000001 R14: 000055e42615408d R15: 00007fffbfd1bc68
[  363.664049]  </TASK>
[  363.664053] INFO: task btrfs:81197 blocked for more than 120 seconds.
[  363.664117]       Not tainted 5.17.0-rc4+ #16
[  363.664160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.664231] task:btrfs           state:D stack:    0 pid:81197 ppid:     1 flags:0x00000000
[  363.664239] Call Trace:
[  363.664241]  <TASK>
[  363.664245]  __schedule+0x2e5/0xbb0
[  363.664257]  schedule+0x58/0xd0
[  363.664265]  rwsem_down_write_slowpath+0x310/0x5b0
[  363.664274]  ? __check_object_size+0x130/0x150
[  363.664282]  down_write+0x41/0x50
[  363.664292]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
[  363.664430]  ? debug_smp_processor_id+0x17/0x20
[  363.664442]  ? fpregs_assert_state_consistent+0x23/0x50
[  363.664453]  __x64_sys_ioctl+0x8e/0xc0
[  363.664462]  ? __x64_sys_ioctl+0x8e/0xc0
[  363.664470]  do_syscall_64+0x38/0xc0
[  363.664478]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  363.664484] RIP: 0033:0x7ff1752ac50b
[  363.664489] RSP: 002b:00007ffc0cb56eb8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[  363.664495] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1752ac50b
[  363.664500] RDX: 00007ffc0cb56ed0 RSI: 00000000c0109428 RDI: 0000000000000003
[  363.664503] RBP: 0000000000000003 R08: 000055d0dcf182a0 R09: 000000000000007c
[  363.664507] R10: 00007ff175380be0 R11: 0000000000000206 R12: 00007ffc0cb58723
[  363.664520] R13: 0000000000000001 R14: 000055d0db04708d R15: 00007ffc0cb57088
[  363.664528]  </TASK>
[  363.664532] INFO: task btrfs:81204 blocked for more than 120 seconds.
[  363.664596]       Not tainted 5.17.0-rc4+ #16
[  363.664639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.664710] task:btrfs           state:D stack:    0 pid:81204 ppid:     1 flags:0x00004000
[  363.664717] Call Trace:
[  363.664720]  <TASK>
[  363.664723]  __schedule+0x2e5/0xbb0
[  363.664735]  schedule+0x58/0xd0
[  363.664743]  schedule_timeout+0x1f3/0x290
[  363.664754]  ? __mutex_lock.isra.0+0x8f/0x4c0
[  363.664765]  wait_for_completion+0x8b/0xf0
[  363.664776]  btrfs_qgroup_wait_for_completion+0x62/0x70 [btrfs]
[  363.664995]  btrfs_quota_disable+0x51/0x320 [btrfs]
[  363.665136]  btrfs_ioctl+0x2106/0x2f40 [btrfs]
[  363.665385]  ? debug_smp_processor_id+0x17/0x20
[  363.665402]  ? fpregs_assert_state_consistent+0x23/0x50
[  363.665417]  __x64_sys_ioctl+0x8e/0xc0
[  363.665428]  ? __x64_sys_ioctl+0x8e/0xc0
[  363.665439]  do_syscall_64+0x38/0xc0
[  363.665450]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  363.665459] RIP: 0033:0x7f9d7462050b
[  363.665466] RSP: 002b:00007ffc1de68558 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[  363.665475] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d7462050b
[  363.665480] RDX: 00007ffc1de68570 RSI: 00000000c0109428 RDI: 0000000000000003
[  363.665486] RBP: 0000000000000003 R08: 00005629e953b2a0 R09: 000000000000007c
[  363.665492] R10: 00007f9d746f4be0 R11: 0000000000000206 R12: 00007ffc1de69723
[  363.665498] R13: 0000000000000001 R14: 00005629e8e5708d R15: 00007ffc1de68728
[  363.665510]  </TASK>

To resolve this issue, the thread disabling quota should unlock
qgroup_ioctl_lock before waiting for rescan completion. This patch moves
btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.

Signed-off-by: Sidong Yang <realwakka@gmail.com>
---
Hi, I found this deadlock while testing with a simple script.
With this patch, the deadlock seems to be resolved, but I don't know much
about how transactions work and I'm afraid it may have other side effects.
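
For reference, with this patch applied the relevant part of
btrfs_quota_disable() would look roughly like this (a sketch from my
reading of the code, error handling and the rest of the function omitted):

	mutex_lock(&fs_info->qgroup_ioctl_lock);
	if (!fs_info->quota_root)
		goto out;	/* quota is already disabled */
	...
	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
	mutex_unlock(&fs_info->qgroup_ioctl_lock);

	/* wait for the rescan worker without holding qgroup_ioctl_lock */
	btrfs_qgroup_wait_for_completion(fs_info, false);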
---
 fs/btrfs/qgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2c0dd6b8a80c..85e5b3572dda 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1219,8 +1219,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	 * deadlock with transaction by the qgroup rescan worker.
 	 */
 	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
-	btrfs_qgroup_wait_for_completion(fs_info, false);
 	mutex_unlock(&fs_info->qgroup_ioctl_lock);
+	btrfs_qgroup_wait_for_completion(fs_info, false);
 
 	/*
 	 * 1 For the root item
-- 
2.25.1



* Re: [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-23  9:51 [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup Sidong Yang
@ 2022-02-24 18:48 ` David Sterba
  2022-02-25  8:49   ` Sidong Yang
  0 siblings, 1 reply; 5+ messages in thread
From: David Sterba @ 2022-02-24 18:48 UTC (permalink / raw)
  To: Sidong Yang; +Cc: linux-btrfs, shinichiro.kawasaki

On Wed, Feb 23, 2022 at 09:51:13AM +0000, Sidong Yang wrote:
> The patch e804861bd4e6 by Kawasaki

Added to CC

> resolves the deadlock between quota
> disable and the qgroup rescan worker, but there is a similar deadlock
> case: it involves enabling or disabling quota while creating or removing
> a qgroup. It can be reproduced with the simple script below.
> 
> for i in {1..100}
> do
>     btrfs quota enable /mnt &
>     btrfs qgroup create 1/0 /mnt &
>     btrfs qgroup destroy 1/0 /mnt &
>     btrfs quota disable /mnt &
> done
> 
> This script simply enables quota, creates and destroys a qgroup, and
> disables quota, 100 times. Enabling quota starts the rescan worker, which
> commits a transaction and waits in wait_for_commit(). transaction_kthread
> wakes up for the commit and tries to attach the transaction, but there is
> another current transaction. That transaction comes from another command
> that destroys a qgroup, and destroying the qgroup is blocked on
> qgroup_ioctl_lock, which is held by the thread disabling quota.

That looks like the cause.

> An example report of the deadlock:
> 
> [  363.661448] INFO: task kworker/u16:4:295 blocked for more than 120 seconds.
> [  363.661582]       Not tainted 5.17.0-rc4+ #16
> [  363.661659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  363.661744] task:kworker/u16:4   state:D stack:    0 pid:  295 ppid:     2 flags:0x00004000
> [  363.661762] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
> [  363.661936] Call Trace:
> [  363.661949]  <TASK>
> [  363.661958]  __schedule+0x2e5/0xbb0
> [  363.662002]  ? btrfs_free_path+0x27/0x30 [btrfs]
> [  363.662094]  ? mutex_lock+0x13/0x40
> [  363.662106]  schedule+0x58/0xd0
> [  363.662116]  btrfs_commit_transaction+0x2dc/0xb40 [btrfs]
> [  363.662250]  ? wait_woken+0x60/0x60
> [  363.662271]  btrfs_qgroup_rescan_worker+0x3cb/0x600 [btrfs]
> [  363.662419]  btrfs_work_helper+0xc8/0x330 [btrfs]
> [  363.662551]  process_one_work+0x21a/0x3f0
> [  363.662588]  worker_thread+0x4a/0x3b0
> [  363.662600]  ? process_one_work+0x3f0/0x3f0
> [  363.662609]  kthread+0xfd/0x130
> [  363.662618]  ? kthread_complete_and_exit+0x20/0x20
> [  363.662628]  ret_from_fork+0x1f/0x30
> [  363.662659]  </TASK>
> [  363.662691] INFO: task btrfs-transacti:1158 blocked for more than 120 seconds.
> [  363.662765]       Not tainted 5.17.0-rc4+ #16
> [  363.662809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  363.662880] task:btrfs-transacti state:D stack:    0 pid: 1158 ppid:     2 flags:0x00004000
> [  363.662889] Call Trace:
> [  363.662892]  <TASK>
> [  363.662896]  __schedule+0x2e5/0xbb0
> [  363.662906]  ? _raw_spin_lock_irqsave+0x2a/0x60
> [  363.662925]  schedule+0x58/0xd0
> [  363.662942]  wait_current_trans+0xd2/0x130 [btrfs]
> [  363.663046]  ? wait_woken+0x60/0x60
> [  363.663055]  start_transaction+0x33c/0x600 [btrfs]
> [  363.663159]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
> [  363.663268]  transaction_kthread+0xb5/0x1b0 [btrfs]
> [  363.663368]  ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
> [  363.663465]  kthread+0xfd/0x130
> [  363.663475]  ? kthread_complete_and_exit+0x20/0x20
> [  363.663484]  ret_from_fork+0x1f/0x30
> [  363.663498]  </TASK>
> [  363.663503] INFO: task btrfs:81196 blocked for more than 120 seconds.
> [  363.663568]       Not tainted 5.17.0-rc4+ #16
> [  363.663612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  363.663693] task:btrfs           state:D stack:    0 pid:81196 ppid:     1 flags:0x00000000
> [  363.663702] Call Trace:
> [  363.663705]  <TASK>
> [  363.663709]  __schedule+0x2e5/0xbb0
> [  363.663721]  schedule+0x58/0xd0
> [  363.663729]  rwsem_down_write_slowpath+0x310/0x5b0
> [  363.663748]  ? __check_object_size+0x130/0x150
> [  363.663770]  down_write+0x41/0x50
> [  363.663780]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> [  363.663918]  ? debug_smp_processor_id+0x17/0x20
> [  363.663932]  ? fpregs_assert_state_consistent+0x23/0x50
> [  363.663963]  __x64_sys_ioctl+0x8e/0xc0
> [  363.663981]  ? __x64_sys_ioctl+0x8e/0xc0
> [  363.663990]  do_syscall_64+0x38/0xc0
> [  363.663998]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  363.664006] RIP: 0033:0x7f1082add50b
> [  363.664014] RSP: 002b:00007fffbfd1ba98 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [  363.664022] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1082add50b
> [  363.664028] RDX: 00007fffbfd1bab0 RSI: 00000000c0109428 RDI: 0000000000000003
> [  363.664032] RBP: 0000000000000003 R08: 000055e4263142a0 R09: 000000000000007c
> [  363.664036] R10: 00007f1082bb1be0 R11: 0000000000000206 R12: 00007fffbfd1c723
> [  363.664040] R13: 0000000000000001 R14: 000055e42615408d R15: 00007fffbfd1bc68
> [  363.664049]  </TASK>
> [  363.664053] INFO: task btrfs:81197 blocked for more than 120 seconds.
> [  363.664117]       Not tainted 5.17.0-rc4+ #16
> [  363.664160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  363.664231] task:btrfs           state:D stack:    0 pid:81197 ppid:     1 flags:0x00000000
> [  363.664239] Call Trace:
> [  363.664241]  <TASK>
> [  363.664245]  __schedule+0x2e5/0xbb0
> [  363.664257]  schedule+0x58/0xd0
> [  363.664265]  rwsem_down_write_slowpath+0x310/0x5b0
> [  363.664274]  ? __check_object_size+0x130/0x150
> [  363.664282]  down_write+0x41/0x50
> [  363.664292]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> [  363.664430]  ? debug_smp_processor_id+0x17/0x20
> [  363.664442]  ? fpregs_assert_state_consistent+0x23/0x50
> [  363.664453]  __x64_sys_ioctl+0x8e/0xc0
> [  363.664462]  ? __x64_sys_ioctl+0x8e/0xc0
> [  363.664470]  do_syscall_64+0x38/0xc0
> [  363.664478]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  363.664484] RIP: 0033:0x7ff1752ac50b
> [  363.664489] RSP: 002b:00007ffc0cb56eb8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [  363.664495] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1752ac50b
> [  363.664500] RDX: 00007ffc0cb56ed0 RSI: 00000000c0109428 RDI: 0000000000000003
> [  363.664503] RBP: 0000000000000003 R08: 000055d0dcf182a0 R09: 000000000000007c
> [  363.664507] R10: 00007ff175380be0 R11: 0000000000000206 R12: 00007ffc0cb58723
> [  363.664520] R13: 0000000000000001 R14: 000055d0db04708d R15: 00007ffc0cb57088
> [  363.664528]  </TASK>
> [  363.664532] INFO: task btrfs:81204 blocked for more than 120 seconds.
> [  363.664596]       Not tainted 5.17.0-rc4+ #16
> [  363.664639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  363.664710] task:btrfs           state:D stack:    0 pid:81204 ppid:     1 flags:0x00004000
> [  363.664717] Call Trace:
> [  363.664720]  <TASK>
> [  363.664723]  __schedule+0x2e5/0xbb0
> [  363.664735]  schedule+0x58/0xd0
> [  363.664743]  schedule_timeout+0x1f3/0x290
> [  363.664754]  ? __mutex_lock.isra.0+0x8f/0x4c0
> [  363.664765]  wait_for_completion+0x8b/0xf0
> [  363.664776]  btrfs_qgroup_wait_for_completion+0x62/0x70 [btrfs]
> [  363.664995]  btrfs_quota_disable+0x51/0x320 [btrfs]
> [  363.665136]  btrfs_ioctl+0x2106/0x2f40 [btrfs]
> [  363.665385]  ? debug_smp_processor_id+0x17/0x20
> [  363.665402]  ? fpregs_assert_state_consistent+0x23/0x50
> [  363.665417]  __x64_sys_ioctl+0x8e/0xc0
> [  363.665428]  ? __x64_sys_ioctl+0x8e/0xc0
> [  363.665439]  do_syscall_64+0x38/0xc0
> [  363.665450]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  363.665459] RIP: 0033:0x7f9d7462050b
> [  363.665466] RSP: 002b:00007ffc1de68558 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [  363.665475] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d7462050b
> [  363.665480] RDX: 00007ffc1de68570 RSI: 00000000c0109428 RDI: 0000000000000003
> [  363.665486] RBP: 0000000000000003 R08: 00005629e953b2a0 R09: 000000000000007c
> [  363.665492] R10: 00007f9d746f4be0 R11: 0000000000000206 R12: 00007ffc1de69723
> [  363.665498] R13: 0000000000000001 R14: 00005629e8e5708d R15: 00007ffc1de68728
> [  363.665510]  </TASK>
> 
> To resolve this issue, the thread disabling quota should unlock
> qgroup_ioctl_lock before waiting for rescan completion. This patch moves
> btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
> 
> Signed-off-by: Sidong Yang <realwakka@gmail.com>
> ---
> Hi, I found this deadlock while testing with a simple script.
> With this patch, the deadlock seems to be resolved, but I don't know much
> about how transactions work and I'm afraid it may have other side effects.

I had a quick look and did not see anything obvious; the qgroup waiting
is done in other contexts without any apparent conditions or other
restrictions.

> ---
>  fs/btrfs/qgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 2c0dd6b8a80c..85e5b3572dda 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1219,8 +1219,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
>  	 * deadlock with transaction by the qgroup rescan worker.
>  	 */
>  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> -	btrfs_qgroup_wait_for_completion(fs_info, false);
>  	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> +	btrfs_qgroup_wait_for_completion(fs_info, false);

Yeah waiting with a mutex held can lead to problems unless the locks are
compatible in that way.


* Re: [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-24 18:48 ` David Sterba
@ 2022-02-25  8:49   ` Sidong Yang
  2022-02-25 14:38     ` Shinichiro Kawasaki
  0 siblings, 1 reply; 5+ messages in thread
From: Sidong Yang @ 2022-02-25  8:49 UTC (permalink / raw)
  To: dsterba, linux-btrfs, shinichiro.kawasaki

On Thu, Feb 24, 2022 at 07:48:03PM +0100, David Sterba wrote:

Hi, David.
Thanks for the reply.

> On Wed, Feb 23, 2022 at 09:51:13AM +0000, Sidong Yang wrote:
> > The patch e804861bd4e6 by Kawasaki
> 
> Added to CC

Thanks, I'll add CC for next patch.
> 
> > resolves the deadlock between quota
> > disable and the qgroup rescan worker, but there is a similar deadlock
> > case: it involves enabling or disabling quota while creating or removing
> > a qgroup. It can be reproduced with the simple script below.
> > 
> > for i in {1..100}
> > do
> >     btrfs quota enable /mnt &
> >     btrfs qgroup create 1/0 /mnt &
> >     btrfs qgroup destroy 1/0 /mnt &
> >     btrfs quota disable /mnt &
> > done
> > 
> > This script simply enables quota, creates and destroys a qgroup, and
> > disables quota, 100 times. Enabling quota starts the rescan worker, which
> > commits a transaction and waits in wait_for_commit(). transaction_kthread
> > wakes up for the commit and tries to attach the transaction, but there is
> > another current transaction. That transaction comes from another command
> > that destroys a qgroup, and destroying the qgroup is blocked on
> > qgroup_ioctl_lock, which is held by the thread disabling quota.
> 
> That looks like the cause.

I agree.
> 
> > An example report of the deadlock:
> > 
> > [  363.661448] INFO: task kworker/u16:4:295 blocked for more than 120 seconds.
> > [  363.661582]       Not tainted 5.17.0-rc4+ #16
> > [  363.661659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  363.661744] task:kworker/u16:4   state:D stack:    0 pid:  295 ppid:     2 flags:0x00004000
> > [  363.661762] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
> > [  363.661936] Call Trace:
> > [  363.661949]  <TASK>
> > [  363.661958]  __schedule+0x2e5/0xbb0
> > [  363.662002]  ? btrfs_free_path+0x27/0x30 [btrfs]
> > [  363.662094]  ? mutex_lock+0x13/0x40
> > [  363.662106]  schedule+0x58/0xd0
> > [  363.662116]  btrfs_commit_transaction+0x2dc/0xb40 [btrfs]
> > [  363.662250]  ? wait_woken+0x60/0x60
> > [  363.662271]  btrfs_qgroup_rescan_worker+0x3cb/0x600 [btrfs]
> > [  363.662419]  btrfs_work_helper+0xc8/0x330 [btrfs]
> > [  363.662551]  process_one_work+0x21a/0x3f0
> > [  363.662588]  worker_thread+0x4a/0x3b0
> > [  363.662600]  ? process_one_work+0x3f0/0x3f0
> > [  363.662609]  kthread+0xfd/0x130
> > [  363.662618]  ? kthread_complete_and_exit+0x20/0x20
> > [  363.662628]  ret_from_fork+0x1f/0x30
> > [  363.662659]  </TASK>
> > [  363.662691] INFO: task btrfs-transacti:1158 blocked for more than 120 seconds.
> > [  363.662765]       Not tainted 5.17.0-rc4+ #16
> > [  363.662809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  363.662880] task:btrfs-transacti state:D stack:    0 pid: 1158 ppid:     2 flags:0x00004000
> > [  363.662889] Call Trace:
> > [  363.662892]  <TASK>
> > [  363.662896]  __schedule+0x2e5/0xbb0
> > [  363.662906]  ? _raw_spin_lock_irqsave+0x2a/0x60
> > [  363.662925]  schedule+0x58/0xd0
> > [  363.662942]  wait_current_trans+0xd2/0x130 [btrfs]
> > [  363.663046]  ? wait_woken+0x60/0x60
> > [  363.663055]  start_transaction+0x33c/0x600 [btrfs]
> > [  363.663159]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
> > [  363.663268]  transaction_kthread+0xb5/0x1b0 [btrfs]
> > [  363.663368]  ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
> > [  363.663465]  kthread+0xfd/0x130
> > [  363.663475]  ? kthread_complete_and_exit+0x20/0x20
> > [  363.663484]  ret_from_fork+0x1f/0x30
> > [  363.663498]  </TASK>
> > [  363.663503] INFO: task btrfs:81196 blocked for more than 120 seconds.
> > [  363.663568]       Not tainted 5.17.0-rc4+ #16
> > [  363.663612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  363.663693] task:btrfs           state:D stack:    0 pid:81196 ppid:     1 flags:0x00000000
> > [  363.663702] Call Trace:
> > [  363.663705]  <TASK>
> > [  363.663709]  __schedule+0x2e5/0xbb0
> > [  363.663721]  schedule+0x58/0xd0
> > [  363.663729]  rwsem_down_write_slowpath+0x310/0x5b0
> > [  363.663748]  ? __check_object_size+0x130/0x150
> > [  363.663770]  down_write+0x41/0x50
> > [  363.663780]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > [  363.663918]  ? debug_smp_processor_id+0x17/0x20
> > [  363.663932]  ? fpregs_assert_state_consistent+0x23/0x50
> > [  363.663963]  __x64_sys_ioctl+0x8e/0xc0
> > [  363.663981]  ? __x64_sys_ioctl+0x8e/0xc0
> > [  363.663990]  do_syscall_64+0x38/0xc0
> > [  363.663998]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  363.664006] RIP: 0033:0x7f1082add50b
> > [  363.664014] RSP: 002b:00007fffbfd1ba98 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > [  363.664022] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1082add50b
> > [  363.664028] RDX: 00007fffbfd1bab0 RSI: 00000000c0109428 RDI: 0000000000000003
> > [  363.664032] RBP: 0000000000000003 R08: 000055e4263142a0 R09: 000000000000007c
> > [  363.664036] R10: 00007f1082bb1be0 R11: 0000000000000206 R12: 00007fffbfd1c723
> > [  363.664040] R13: 0000000000000001 R14: 000055e42615408d R15: 00007fffbfd1bc68
> > [  363.664049]  </TASK>
> > [  363.664053] INFO: task btrfs:81197 blocked for more than 120 seconds.
> > [  363.664117]       Not tainted 5.17.0-rc4+ #16
> > [  363.664160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  363.664231] task:btrfs           state:D stack:    0 pid:81197 ppid:     1 flags:0x00000000
> > [  363.664239] Call Trace:
> > [  363.664241]  <TASK>
> > [  363.664245]  __schedule+0x2e5/0xbb0
> > [  363.664257]  schedule+0x58/0xd0
> > [  363.664265]  rwsem_down_write_slowpath+0x310/0x5b0
> > [  363.664274]  ? __check_object_size+0x130/0x150
> > [  363.664282]  down_write+0x41/0x50
> > [  363.664292]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > [  363.664430]  ? debug_smp_processor_id+0x17/0x20
> > [  363.664442]  ? fpregs_assert_state_consistent+0x23/0x50
> > [  363.664453]  __x64_sys_ioctl+0x8e/0xc0
> > [  363.664462]  ? __x64_sys_ioctl+0x8e/0xc0
> > [  363.664470]  do_syscall_64+0x38/0xc0
> > [  363.664478]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  363.664484] RIP: 0033:0x7ff1752ac50b
> > [  363.664489] RSP: 002b:00007ffc0cb56eb8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > [  363.664495] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1752ac50b
> > [  363.664500] RDX: 00007ffc0cb56ed0 RSI: 00000000c0109428 RDI: 0000000000000003
> > [  363.664503] RBP: 0000000000000003 R08: 000055d0dcf182a0 R09: 000000000000007c
> > [  363.664507] R10: 00007ff175380be0 R11: 0000000000000206 R12: 00007ffc0cb58723
> > [  363.664520] R13: 0000000000000001 R14: 000055d0db04708d R15: 00007ffc0cb57088
> > [  363.664528]  </TASK>
> > [  363.664532] INFO: task btrfs:81204 blocked for more than 120 seconds.
> > [  363.664596]       Not tainted 5.17.0-rc4+ #16
> > [  363.664639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  363.664710] task:btrfs           state:D stack:    0 pid:81204 ppid:     1 flags:0x00004000
> > [  363.664717] Call Trace:
> > [  363.664720]  <TASK>
> > [  363.664723]  __schedule+0x2e5/0xbb0
> > [  363.664735]  schedule+0x58/0xd0
> > [  363.664743]  schedule_timeout+0x1f3/0x290
> > [  363.664754]  ? __mutex_lock.isra.0+0x8f/0x4c0
> > [  363.664765]  wait_for_completion+0x8b/0xf0
> > [  363.664776]  btrfs_qgroup_wait_for_completion+0x62/0x70 [btrfs]
> > [  363.664995]  btrfs_quota_disable+0x51/0x320 [btrfs]
> > [  363.665136]  btrfs_ioctl+0x2106/0x2f40 [btrfs]
> > [  363.665385]  ? debug_smp_processor_id+0x17/0x20
> > [  363.665402]  ? fpregs_assert_state_consistent+0x23/0x50
> > [  363.665417]  __x64_sys_ioctl+0x8e/0xc0
> > [  363.665428]  ? __x64_sys_ioctl+0x8e/0xc0
> > [  363.665439]  do_syscall_64+0x38/0xc0
> > [  363.665450]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  363.665459] RIP: 0033:0x7f9d7462050b
> > [  363.665466] RSP: 002b:00007ffc1de68558 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > [  363.665475] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d7462050b
> > [  363.665480] RDX: 00007ffc1de68570 RSI: 00000000c0109428 RDI: 0000000000000003
> > [  363.665486] RBP: 0000000000000003 R08: 00005629e953b2a0 R09: 000000000000007c
> > [  363.665492] R10: 00007f9d746f4be0 R11: 0000000000000206 R12: 00007ffc1de69723
> > [  363.665498] R13: 0000000000000001 R14: 00005629e8e5708d R15: 00007ffc1de68728
> > [  363.665510]  </TASK>
> > 
> > To resolve this issue, the thread disabling quota should unlock
> > qgroup_ioctl_lock before waiting for rescan completion. This patch moves
> > btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
> > 
> > Signed-off-by: Sidong Yang <realwakka@gmail.com>
> > ---
> > Hi, I found this deadlock while testing with a simple script.
> > With this patch, the deadlock seems to be resolved, but I don't know much
> > about how transactions work and I'm afraid it may have other side effects.
> 
> I had a quick look and did not see anything obvious; the qgroup waiting
> is done in other contexts without any apparent conditions or other
> restrictions.

Yeah, I think it doesn't have side effects so far. But as a newbie, I'm
not sure I understood you correctly. I guess you mean that there are no
other cases where something takes qgroup_ioctl_lock and the thread
removing a qgroup could deadlock again. Is that right?

> 
> > ---
> >  fs/btrfs/qgroup.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > index 2c0dd6b8a80c..85e5b3572dda 100644
> > --- a/fs/btrfs/qgroup.c
> > +++ b/fs/btrfs/qgroup.c
> > @@ -1219,8 +1219,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
> >  	 * deadlock with transaction by the qgroup rescan worker.
> >  	 */
> >  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> > -	btrfs_qgroup_wait_for_completion(fs_info, false);
> >  	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > +	btrfs_qgroup_wait_for_completion(fs_info, false);
> 
> Yeah waiting with a mutex held can lead to problems unless the locks are
> compatible in that way.

I think this lock is needed for checking quota_root. Would it be better
to unlock right after checking it? That would make the critical section
shorter.


* Re: [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-25  8:49   ` Sidong Yang
@ 2022-02-25 14:38     ` Shinichiro Kawasaki
  2022-02-26 10:15       ` Sidong Yang
  0 siblings, 1 reply; 5+ messages in thread
From: Shinichiro Kawasaki @ 2022-02-25 14:38 UTC (permalink / raw)
  To: Sidong Yang
  Cc: dsterba, linux-btrfs, Naohiro Aota, Johannes Thumshirn, Damien Le Moal

On Feb 25, 2022 / 08:49, Sidong Yang wrote:
> On Thu, Feb 24, 2022 at 07:48:03PM +0100, David Sterba wrote:
> 
> Hi, David.
> Thanks for the reply.
> 
> > On Wed, Feb 23, 2022 at 09:51:13AM +0000, Sidong Yang wrote:
> > > The patch e804861bd4e6 by Kawasaki
> > 
> > Added to CC
> 
> Thanks, I'll add CC for next patch.

Hi Sidong, David, thank you for letting me know.

> > 
> > > resolves the deadlock between quota
> > > disable and the qgroup rescan worker, but there is a similar deadlock
> > > case: it involves enabling or disabling quota while creating or removing
> > > a qgroup. It can be reproduced with the simple script below.
> > > 
> > > for i in {1..100}
> > > do
> > >     btrfs quota enable /mnt &
> > >     btrfs qgroup create 1/0 /mnt &
> > >     btrfs qgroup destroy 1/0 /mnt &
> > >     btrfs quota disable /mnt &
> > > done
> > > 
> > > This script simply enables quota, creates and destroys a qgroup, and
> > > disables quota, 100 times. Enabling quota starts the rescan worker, which
> > > commits a transaction and waits in wait_for_commit(). transaction_kthread
> > > wakes up for the commit and tries to attach the transaction, but there is
> > > another current transaction. That transaction comes from another command
> > > that destroys a qgroup, and destroying the qgroup is blocked on
> > > qgroup_ioctl_lock, which is held by the thread disabling quota.
> > 
> > That looks like the cause.
> 
> I agree.

I ran the simple script, recreated the deadlock on my machine, and took a
look at the details. I think the deadlock happens as Sidong explained.

> > 
> > > An example report of the deadlock:
> > > 
> > > [  363.661448] INFO: task kworker/u16:4:295 blocked for more than 120 seconds.
> > > [  363.661582]       Not tainted 5.17.0-rc4+ #16
> > > [  363.661659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  363.661744] task:kworker/u16:4   state:D stack:    0 pid:  295 ppid:     2 flags:0x00004000
> > > [  363.661762] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
> > > [  363.661936] Call Trace:
> > > [  363.661949]  <TASK>
> > > [  363.661958]  __schedule+0x2e5/0xbb0
> > > [  363.662002]  ? btrfs_free_path+0x27/0x30 [btrfs]
> > > [  363.662094]  ? mutex_lock+0x13/0x40
> > > [  363.662106]  schedule+0x58/0xd0
> > > [  363.662116]  btrfs_commit_transaction+0x2dc/0xb40 [btrfs]
> > > [  363.662250]  ? wait_woken+0x60/0x60
> > > [  363.662271]  btrfs_qgroup_rescan_worker+0x3cb/0x600 [btrfs]
> > > [  363.662419]  btrfs_work_helper+0xc8/0x330 [btrfs]
> > > [  363.662551]  process_one_work+0x21a/0x3f0
> > > [  363.662588]  worker_thread+0x4a/0x3b0
> > > [  363.662600]  ? process_one_work+0x3f0/0x3f0
> > > [  363.662609]  kthread+0xfd/0x130
> > > [  363.662618]  ? kthread_complete_and_exit+0x20/0x20
> > > [  363.662628]  ret_from_fork+0x1f/0x30
> > > [  363.662659]  </TASK>
> > > [  363.662691] INFO: task btrfs-transacti:1158 blocked for more than 120 seconds.
> > > [  363.662765]       Not tainted 5.17.0-rc4+ #16
> > > [  363.662809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  363.662880] task:btrfs-transacti state:D stack:    0 pid: 1158 ppid:     2 flags:0x00004000
> > > [  363.662889] Call Trace:
> > > [  363.662892]  <TASK>
> > > [  363.662896]  __schedule+0x2e5/0xbb0
> > > [  363.662906]  ? _raw_spin_lock_irqsave+0x2a/0x60
> > > [  363.662925]  schedule+0x58/0xd0
> > > [  363.662942]  wait_current_trans+0xd2/0x130 [btrfs]
> > > [  363.663046]  ? wait_woken+0x60/0x60
> > > [  363.663055]  start_transaction+0x33c/0x600 [btrfs]
> > > [  363.663159]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
> > > [  363.663268]  transaction_kthread+0xb5/0x1b0 [btrfs]
> > > [  363.663368]  ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
> > > [  363.663465]  kthread+0xfd/0x130
> > > [  363.663475]  ? kthread_complete_and_exit+0x20/0x20
> > > [  363.663484]  ret_from_fork+0x1f/0x30
> > > [  363.663498]  </TASK>
> > > [  363.663503] INFO: task btrfs:81196 blocked for more than 120 seconds.
> > > [  363.663568]       Not tainted 5.17.0-rc4+ #16
> > > [  363.663612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  363.663693] task:btrfs           state:D stack:    0 pid:81196 ppid:     1 flags:0x00000000
> > > [  363.663702] Call Trace:
> > > [  363.663705]  <TASK>
> > > [  363.663709]  __schedule+0x2e5/0xbb0
> > > [  363.663721]  schedule+0x58/0xd0
> > > [  363.663729]  rwsem_down_write_slowpath+0x310/0x5b0
> > > [  363.663748]  ? __check_object_size+0x130/0x150
> > > [  363.663770]  down_write+0x41/0x50
> > > [  363.663780]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > > [  363.663918]  ? debug_smp_processor_id+0x17/0x20
> > > [  363.663932]  ? fpregs_assert_state_consistent+0x23/0x50
> > > [  363.663963]  __x64_sys_ioctl+0x8e/0xc0
> > > [  363.663981]  ? __x64_sys_ioctl+0x8e/0xc0
> > > [  363.663990]  do_syscall_64+0x38/0xc0
> > > [  363.663998]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [  363.664006] RIP: 0033:0x7f1082add50b
> > > [  363.664014] RSP: 002b:00007fffbfd1ba98 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > [  363.664022] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1082add50b
> > > [  363.664028] RDX: 00007fffbfd1bab0 RSI: 00000000c0109428 RDI: 0000000000000003
> > > [  363.664032] RBP: 0000000000000003 R08: 000055e4263142a0 R09: 000000000000007c
> > > [  363.664036] R10: 00007f1082bb1be0 R11: 0000000000000206 R12: 00007fffbfd1c723
> > > [  363.664040] R13: 0000000000000001 R14: 000055e42615408d R15: 00007fffbfd1bc68
> > > [  363.664049]  </TASK>
> > > [  363.664053] INFO: task btrfs:81197 blocked for more than 120 seconds.
> > > [  363.664117]       Not tainted 5.17.0-rc4+ #16
> > > [  363.664160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  363.664231] task:btrfs           state:D stack:    0 pid:81197 ppid:     1 flags:0x00000000
> > > [  363.664239] Call Trace:
> > > [  363.664241]  <TASK>
> > > [  363.664245]  __schedule+0x2e5/0xbb0
> > > [  363.664257]  schedule+0x58/0xd0
> > > [  363.664265]  rwsem_down_write_slowpath+0x310/0x5b0
> > > [  363.664274]  ? __check_object_size+0x130/0x150
> > > [  363.664282]  down_write+0x41/0x50
> > > [  363.664292]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > > [  363.664430]  ? debug_smp_processor_id+0x17/0x20
> > > [  363.664442]  ? fpregs_assert_state_consistent+0x23/0x50
> > > [  363.664453]  __x64_sys_ioctl+0x8e/0xc0
> > > [  363.664462]  ? __x64_sys_ioctl+0x8e/0xc0
> > > [  363.664470]  do_syscall_64+0x38/0xc0
> > > [  363.664478]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [  363.664484] RIP: 0033:0x7ff1752ac50b
> > > [  363.664489] RSP: 002b:00007ffc0cb56eb8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > [  363.664495] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1752ac50b
> > > [  363.664500] RDX: 00007ffc0cb56ed0 RSI: 00000000c0109428 RDI: 0000000000000003
> > > [  363.664503] RBP: 0000000000000003 R08: 000055d0dcf182a0 R09: 000000000000007c
> > > [  363.664507] R10: 00007ff175380be0 R11: 0000000000000206 R12: 00007ffc0cb58723
> > > [  363.664520] R13: 0000000000000001 R14: 000055d0db04708d R15: 00007ffc0cb57088
> > > [  363.664528]  </TASK>
> > > [  363.664532] INFO: task btrfs:81204 blocked for more than 120 seconds.
> > > [  363.664596]       Not tainted 5.17.0-rc4+ #16
> > > [  363.664639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  363.664710] task:btrfs           state:D stack:    0 pid:81204 ppid:     1 flags:0x00004000
> > > [  363.664717] Call Trace:
> > > [  363.664720]  <TASK>
> > > [  363.664723]  __schedule+0x2e5/0xbb0
> > > [  363.664735]  schedule+0x58/0xd0
> > > [  363.664743]  schedule_timeout+0x1f3/0x290
> > > [  363.664754]  ? __mutex_lock.isra.0+0x8f/0x4c0
> > > [  363.664765]  wait_for_completion+0x8b/0xf0
> > > [  363.664776]  btrfs_qgroup_wait_for_completion+0x62/0x70 [btrfs]
> > > [  363.664995]  btrfs_quota_disable+0x51/0x320 [btrfs]
> > > [  363.665136]  btrfs_ioctl+0x2106/0x2f40 [btrfs]
> > > [  363.665385]  ? debug_smp_processor_id+0x17/0x20
> > > [  363.665402]  ? fpregs_assert_state_consistent+0x23/0x50
> > > [  363.665417]  __x64_sys_ioctl+0x8e/0xc0
> > > [  363.665428]  ? __x64_sys_ioctl+0x8e/0xc0
> > > [  363.665439]  do_syscall_64+0x38/0xc0
> > > [  363.665450]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > [  363.665459] RIP: 0033:0x7f9d7462050b
> > > [  363.665466] RSP: 002b:00007ffc1de68558 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > [  363.665475] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d7462050b
> > > [  363.665480] RDX: 00007ffc1de68570 RSI: 00000000c0109428 RDI: 0000000000000003
> > > [  363.665486] RBP: 0000000000000003 R08: 00005629e953b2a0 R09: 000000000000007c
> > > [  363.665492] R10: 00007f9d746f4be0 R11: 0000000000000206 R12: 00007ffc1de69723
> > > [  363.665498] R13: 0000000000000001 R14: 00005629e8e5708d R15: 00007ffc1de68728
> > > [  363.665510]  </TASK>
> > > 
> > > To resolve this issue, the thread disabling quota should unlock
> > > qgroup_ioctl_lock before waiting for rescan completion. This patch moves
> > > btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
> > > 
> > > Signed-off-by: Sidong Yang <realwakka@gmail.com>
> > > ---
> > > Hi, I found this deadlock while testing with a simple script.
> > > With this patch, the deadlock seems to be resolved, but I don't know much
> > > about how transactions work and I'm afraid it may have other side effects.
> > 
> > I had a quick look and did not see anything obvious; the qgroup waiting
> > is done in other contexts without any apparent conditions or other
> > restrictions.
> 
> Yeah, I think it doesn't have side effects so far. But as a newbie, I'm
> not sure I understood you correctly. I guess you mean that there are no
> other cases where something takes qgroup_ioctl_lock and the thread
> removing a qgroup could deadlock again. Is that right?
> 
> > 
> > > ---
> > >  fs/btrfs/qgroup.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > > index 2c0dd6b8a80c..85e5b3572dda 100644
> > > --- a/fs/btrfs/qgroup.c
> > > +++ b/fs/btrfs/qgroup.c
> > > @@ -1219,8 +1219,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
> > >  	 * deadlock with transaction by the qgroup rescan worker.
> > >  	 */
> > >  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> > > -	btrfs_qgroup_wait_for_completion(fs_info, false);
> > >  	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > > +	btrfs_qgroup_wait_for_completion(fs_info, false);
> > 
> > Yeah waiting with a mutex held can lead to problems unless the locks are
> > compatible in that way.

Yes, I missed that point... I put the btrfs_qgroup_wait_for_completion() call
within the qgroup_ioctl_lock guard expecting it to prevent the qgroup rescan
worker from being started by the quota enable ioctl. But now I don't think
such a guard is required. Both the quota enable and quota disable ioctls are
guarded by fs_info->subvol_sem. On top of that, qgroup_rescan_init() checks
the BTRFS_FS_QUOTA_ENABLED bit before starting rescan workers.
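
If I read the ioctl code correctly, btrfs_ioctl_quota_ctl() does roughly
the following (a simplified sketch), so quota enable and disable cannot
run concurrently:

	down_write(&fs_info->subvol_sem);
	switch (sa->cmd) {
	case BTRFS_QUOTA_CTL_ENABLE:
		ret = btrfs_quota_enable(fs_info);
		break;
	case BTRFS_QUOTA_CTL_DISABLE:
		ret = btrfs_quota_disable(fs_info);
		break;
	default:
		ret = -EINVAL;
		break;
	}
	up_write(&fs_info->subvol_sem);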

> 
> I think this lock is needed for checking quota_root. Would it be better
> to unlock right after checking it? That would make the critical section
> shorter.

Agreed. Since clear_bit() is atomic, it can be moved out of the
qgroup_ioctl_lock guard, I think.
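
Something like the following ordering is what I have in mind (just a
sketch of the idea we are discussing, error paths and the rest of the
function omitted):

	mutex_lock(&fs_info->qgroup_ioctl_lock);
	if (!fs_info->quota_root) {
		mutex_unlock(&fs_info->qgroup_ioctl_lock);
		return 0;
	}
	mutex_unlock(&fs_info->qgroup_ioctl_lock);

	/* clear_bit() is atomic, no need to hold qgroup_ioctl_lock here */
	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
	btrfs_qgroup_wait_for_completion(fs_info, false);

	/* the rest of quota disable re-takes the lock as needed */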

With these changes, I ran the simple script and confirmed the deadlock
disappeared, as Sidong reported.

-- 
Best Regards,
Shin'ichiro Kawasaki


* Re: [RFC PATCH] btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
  2022-02-25 14:38     ` Shinichiro Kawasaki
@ 2022-02-26 10:15       ` Sidong Yang
  0 siblings, 0 replies; 5+ messages in thread
From: Sidong Yang @ 2022-02-26 10:15 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: dsterba, linux-btrfs, Naohiro Aota, Johannes Thumshirn, Damien Le Moal

On Fri, Feb 25, 2022 at 02:38:23PM +0000, Shinichiro Kawasaki wrote:
> On Feb 25, 2022 / 08:49, Sidong Yang wrote:
> > On Thu, Feb 24, 2022 at 07:48:03PM +0100, David Sterba wrote:
> > 
> > Hi, David.
> > Thanks for the reply.
> > 
> > > On Wed, Feb 23, 2022 at 09:51:13AM +0000, Sidong Yang wrote:
> > > > The patch e804861bd4e6 by Kawasaki
> > > 
> > > Added to CC
> > 
> > Thanks, I'll add CC for next patch.
> 
> Hi Sidong, David, thank you for letting me know.

Hi, thanks for the comments.
> 
> > > 
> > > > resolves the deadlock between quota
> > > > disable and the qgroup rescan worker, but there is a similar deadlock
> > > > case: it involves enabling or disabling quota while creating or removing
> > > > a qgroup. It can be reproduced with the simple script below.
> > > > 
> > > > for i in {1..100}
> > > > do
> > > >     btrfs quota enable /mnt &
> > > >     btrfs qgroup create 1/0 /mnt &
> > > >     btrfs qgroup destroy 1/0 /mnt &
> > > >     btrfs quota disable /mnt &
> > > > done
> > > > 
> > > > This script simply enables quota, creates and destroys a qgroup, and
> > > > disables quota, 100 times. Enabling quota starts the rescan worker, which
> > > > commits a transaction and waits in wait_for_commit(). transaction_kthread
> > > > wakes up for the commit and tries to attach the transaction, but there is
> > > > another current transaction. That transaction comes from another command
> > > > that destroys a qgroup, and destroying the qgroup is blocked on
> > > > qgroup_ioctl_lock, which is held by the thread disabling quota.
> > > 
> > > That looks like the cause.
> > 
> > I agree.
> 
> I ran the simple script, recreated the deadlock on my machine, and took a
> look at the details. I think the deadlock happens as Sidong explained.
> 
> > > 
> > > > An example report of the deadlock:
> > > > 
> > > > [  363.661448] INFO: task kworker/u16:4:295 blocked for more than 120 seconds.
> > > > [  363.661582]       Not tainted 5.17.0-rc4+ #16
> > > > [  363.661659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  363.661744] task:kworker/u16:4   state:D stack:    0 pid:  295 ppid:     2 flags:0x00004000
> > > > [  363.661762] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
> > > > [  363.661936] Call Trace:
> > > > [  363.661949]  <TASK>
> > > > [  363.661958]  __schedule+0x2e5/0xbb0
> > > > [  363.662002]  ? btrfs_free_path+0x27/0x30 [btrfs]
> > > > [  363.662094]  ? mutex_lock+0x13/0x40
> > > > [  363.662106]  schedule+0x58/0xd0
> > > > [  363.662116]  btrfs_commit_transaction+0x2dc/0xb40 [btrfs]
> > > > [  363.662250]  ? wait_woken+0x60/0x60
> > > > [  363.662271]  btrfs_qgroup_rescan_worker+0x3cb/0x600 [btrfs]
> > > > [  363.662419]  btrfs_work_helper+0xc8/0x330 [btrfs]
> > > > [  363.662551]  process_one_work+0x21a/0x3f0
> > > > [  363.662588]  worker_thread+0x4a/0x3b0
> > > > [  363.662600]  ? process_one_work+0x3f0/0x3f0
> > > > [  363.662609]  kthread+0xfd/0x130
> > > > [  363.662618]  ? kthread_complete_and_exit+0x20/0x20
> > > > [  363.662628]  ret_from_fork+0x1f/0x30
> > > > [  363.662659]  </TASK>
> > > > [  363.662691] INFO: task btrfs-transacti:1158 blocked for more than 120 seconds.
> > > > [  363.662765]       Not tainted 5.17.0-rc4+ #16
> > > > [  363.662809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  363.662880] task:btrfs-transacti state:D stack:    0 pid: 1158 ppid:     2 flags:0x00004000
> > > > [  363.662889] Call Trace:
> > > > [  363.662892]  <TASK>
> > > > [  363.662896]  __schedule+0x2e5/0xbb0
> > > > [  363.662906]  ? _raw_spin_lock_irqsave+0x2a/0x60
> > > > [  363.662925]  schedule+0x58/0xd0
> > > > [  363.662942]  wait_current_trans+0xd2/0x130 [btrfs]
> > > > [  363.663046]  ? wait_woken+0x60/0x60
> > > > [  363.663055]  start_transaction+0x33c/0x600 [btrfs]
> > > > [  363.663159]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
> > > > [  363.663268]  transaction_kthread+0xb5/0x1b0 [btrfs]
> > > > [  363.663368]  ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
> > > > [  363.663465]  kthread+0xfd/0x130
> > > > [  363.663475]  ? kthread_complete_and_exit+0x20/0x20
> > > > [  363.663484]  ret_from_fork+0x1f/0x30
> > > > [  363.663498]  </TASK>
> > > > [  363.663503] INFO: task btrfs:81196 blocked for more than 120 seconds.
> > > > [  363.663568]       Not tainted 5.17.0-rc4+ #16
> > > > [  363.663612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  363.663693] task:btrfs           state:D stack:    0 pid:81196 ppid:     1 flags:0x00000000
> > > > [  363.663702] Call Trace:
> > > > [  363.663705]  <TASK>
> > > > [  363.663709]  __schedule+0x2e5/0xbb0
> > > > [  363.663721]  schedule+0x58/0xd0
> > > > [  363.663729]  rwsem_down_write_slowpath+0x310/0x5b0
> > > > [  363.663748]  ? __check_object_size+0x130/0x150
> > > > [  363.663770]  down_write+0x41/0x50
> > > > [  363.663780]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > > > [  363.663918]  ? debug_smp_processor_id+0x17/0x20
> > > > [  363.663932]  ? fpregs_assert_state_consistent+0x23/0x50
> > > > [  363.663963]  __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.663981]  ? __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.663990]  do_syscall_64+0x38/0xc0
> > > > [  363.663998]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [  363.664006] RIP: 0033:0x7f1082add50b
> > > > [  363.664014] RSP: 002b:00007fffbfd1ba98 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > > [  363.664022] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1082add50b
> > > > [  363.664028] RDX: 00007fffbfd1bab0 RSI: 00000000c0109428 RDI: 0000000000000003
> > > > [  363.664032] RBP: 0000000000000003 R08: 000055e4263142a0 R09: 000000000000007c
> > > > [  363.664036] R10: 00007f1082bb1be0 R11: 0000000000000206 R12: 00007fffbfd1c723
> > > > [  363.664040] R13: 0000000000000001 R14: 000055e42615408d R15: 00007fffbfd1bc68
> > > > [  363.664049]  </TASK>
> > > > [  363.664053] INFO: task btrfs:81197 blocked for more than 120 seconds.
> > > > [  363.664117]       Not tainted 5.17.0-rc4+ #16
> > > > [  363.664160] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  363.664231] task:btrfs           state:D stack:    0 pid:81197 ppid:     1 flags:0x00000000
> > > > [  363.664239] Call Trace:
> > > > [  363.664241]  <TASK>
> > > > [  363.664245]  __schedule+0x2e5/0xbb0
> > > > [  363.664257]  schedule+0x58/0xd0
> > > > [  363.664265]  rwsem_down_write_slowpath+0x310/0x5b0
> > > > [  363.664274]  ? __check_object_size+0x130/0x150
> > > > [  363.664282]  down_write+0x41/0x50
> > > > [  363.664292]  btrfs_ioctl+0x20e6/0x2f40 [btrfs]
> > > > [  363.664430]  ? debug_smp_processor_id+0x17/0x20
> > > > [  363.664442]  ? fpregs_assert_state_consistent+0x23/0x50
> > > > [  363.664453]  __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.664462]  ? __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.664470]  do_syscall_64+0x38/0xc0
> > > > [  363.664478]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [  363.664484] RIP: 0033:0x7ff1752ac50b
> > > > [  363.664489] RSP: 002b:00007ffc0cb56eb8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > > [  363.664495] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff1752ac50b
> > > > [  363.664500] RDX: 00007ffc0cb56ed0 RSI: 00000000c0109428 RDI: 0000000000000003
> > > > [  363.664503] RBP: 0000000000000003 R08: 000055d0dcf182a0 R09: 000000000000007c
> > > > [  363.664507] R10: 00007ff175380be0 R11: 0000000000000206 R12: 00007ffc0cb58723
> > > > [  363.664520] R13: 0000000000000001 R14: 000055d0db04708d R15: 00007ffc0cb57088
> > > > [  363.664528]  </TASK>
> > > > [  363.664532] INFO: task btrfs:81204 blocked for more than 120 seconds.
> > > > [  363.664596]       Not tainted 5.17.0-rc4+ #16
> > > > [  363.664639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  363.664710] task:btrfs           state:D stack:    0 pid:81204 ppid:     1 flags:0x00004000
> > > > [  363.664717] Call Trace:
> > > > [  363.664720]  <TASK>
> > > > [  363.664723]  __schedule+0x2e5/0xbb0
> > > > [  363.664735]  schedule+0x58/0xd0
> > > > [  363.664743]  schedule_timeout+0x1f3/0x290
> > > > [  363.664754]  ? __mutex_lock.isra.0+0x8f/0x4c0
> > > > [  363.664765]  wait_for_completion+0x8b/0xf0
> > > > [  363.664776]  btrfs_qgroup_wait_for_completion+0x62/0x70 [btrfs]
> > > > [  363.664995]  btrfs_quota_disable+0x51/0x320 [btrfs]
> > > > [  363.665136]  btrfs_ioctl+0x2106/0x2f40 [btrfs]
> > > > [  363.665385]  ? debug_smp_processor_id+0x17/0x20
> > > > [  363.665402]  ? fpregs_assert_state_consistent+0x23/0x50
> > > > [  363.665417]  __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.665428]  ? __x64_sys_ioctl+0x8e/0xc0
> > > > [  363.665439]  do_syscall_64+0x38/0xc0
> > > > [  363.665450]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [  363.665459] RIP: 0033:0x7f9d7462050b
> > > > [  363.665466] RSP: 002b:00007ffc1de68558 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > > > [  363.665475] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d7462050b
> > > > [  363.665480] RDX: 00007ffc1de68570 RSI: 00000000c0109428 RDI: 0000000000000003
> > > > [  363.665486] RBP: 0000000000000003 R08: 00005629e953b2a0 R09: 000000000000007c
> > > > [  363.665492] R10: 00007f9d746f4be0 R11: 0000000000000206 R12: 00007ffc1de69723
> > > > [  363.665498] R13: 0000000000000001 R14: 00005629e8e5708d R15: 00007ffc1de68728
> > > > [  363.665510]  </TASK>
> > > > 
> > > > To resolve this issue, the thread disabling quota should unlock
> > > > qgroup_ioctl_lock before waiting for rescan completion. This patch moves
> > > > btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
> > > > 
> > > > Signed-off-by: Sidong Yang <realwakka@gmail.com>
> > > > ---
> > > > Hi, I found this deadlock while testing with a simple script.
> > > > With this patch, the deadlock seems to be resolved, but I don't know much
> > > > about how transactions work and I'm afraid it may have other side effects.
> > > 
> > > I had a quick look and did not see anything obvious; the qgroup waiting
> > > is done in other contexts without any apparent conditions or other
> > > restrictions.
> > 
> > Yeah, I think it doesn't have side effects so far. But as a newbie, I'm
> > not sure I understood you correctly. I guess you mean that there are no
> > other cases where something takes qgroup_ioctl_lock and the thread
> > removing a qgroup could deadlock again. Is that right?
> > 
> > > 
> > > > ---
> > > >  fs/btrfs/qgroup.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > > > index 2c0dd6b8a80c..85e5b3572dda 100644
> > > > --- a/fs/btrfs/qgroup.c
> > > > +++ b/fs/btrfs/qgroup.c
> > > > @@ -1219,8 +1219,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
> > > >  	 * deadlock with transaction by the qgroup rescan worker.
> > > >  	 */
> > > >  	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> > > > -	btrfs_qgroup_wait_for_completion(fs_info, false);
> > > >  	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > > > +	btrfs_qgroup_wait_for_completion(fs_info, false);
> > > 
> > > Yeah waiting with a mutex held can lead to problems unless the locks are
> > > compatible in that way.
> 
> Yes, I missed that point... I put the btrfs_qgroup_wait_for_completion() call
> within the qgroup_ioctl_lock guard expecting it to prevent the qgroup rescan
> worker from being started by the quota enable ioctl. But now I don't think
> such a guard is required. Both the quota enable and quota disable ioctls are
> guarded by fs_info->subvol_sem. On top of that, qgroup_rescan_init() checks
> the BTRFS_FS_QUOTA_ENABLED bit before starting rescan workers.

Understood, fs_info->subvol_sem already guards the quota enable/disable
ioctls.

> 
> > 
> > I think this lock is needed for checking quota_root. Would it be better
> > to unlock right after checking it? That would make the critical section
> > shorter.
> 
> Agreed. Since clear_bit() is atomic, it can be moved out of the
> qgroup_ioctl_lock guard, I think.

Yeah, I'll apply that in the next version.
> 
> With these changes, I ran the simple script and confirmed the deadlock
> disappeared, as Sidong reported.

Thanks,
Sidong
> 
> -- 
> Best Regards,
> Shin'ichiro Kawasaki

