linux-btrfs.vger.kernel.org archive mirror
* [PATCH] btrfs: qgroup: Fix tree dead lock caused by backref walk during snapshot dropping
@ 2018-12-10  8:24 Qu Wenruo
From: Qu Wenruo @ 2018-12-10  8:24 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
The system may lock up during snapshot dropping with quota enabled.

The backtrace would be:

  btrfs-cleaner   D    0  4062      2 0x80000000
  Call Trace:
   schedule+0x32/0x90
   btrfs_tree_read_lock+0x93/0x130 [btrfs]
   find_parent_nodes+0x29b/0x1170 [btrfs]
   btrfs_find_all_roots_safe+0xa8/0x120 [btrfs]
   btrfs_find_all_roots+0x57/0x70 [btrfs]
   btrfs_qgroup_trace_extent_post+0x37/0x70 [btrfs]
   btrfs_qgroup_trace_leaf_items+0x10b/0x140 [btrfs]
   btrfs_qgroup_trace_subtree+0xc8/0xe0 [btrfs]
   do_walk_down+0x541/0x5e3 [btrfs]
   walk_down_tree+0xab/0xe7 [btrfs]
   btrfs_drop_snapshot+0x356/0x71a [btrfs]
   btrfs_clean_one_deleted_snapshot+0xb8/0xf0 [btrfs]
   cleaner_kthread+0x12b/0x160 [btrfs]
   kthread+0x112/0x130
   ret_from_fork+0x27/0x50

[CAUSE]
When dropping snapshots with qgroup enabled, we trigger a backref
walk.

However, a backref walk at that point is dangerous: if one of the
parent nodes is write-locked by another thread, we can deadlock.

For example:

           FS 260     FS 261 (Dropped)
            node A        node B
           /      \      /      \
       node C      node D      node E
      /   \         /  \        /     \
  leaf F|leaf G|leaf H|leaf I|leaf J|leaf K

The lock sequence would be:

      Thread A (cleaner)             |       Thread B (other writer)
-----------------------------------------------------------------------
write_lock(B)                        |
write_lock(D)                        |
^^^ called by walk_down_tree()       |
                                     |       write_lock(A)
                                     |       write_lock(D) << Stall
read_lock(H) << for backref walk     |
read_lock(D) << lock owner is        |
                the same thread A    |
                so read lock is OK   |
read_lock(A) << Stall                |

So thread A holds the write lock on D and needs the read lock on A
before it can unlock, while thread B holds the write lock on A and
needs the lock on D before it can unlock.

This causes a deadlock.

[FIX]
Stop doing the dangerous backref walk at snapshot dropping time.

Reported-by: David Sterba <dsterba@suse.cz>
Fixes: 1152651a0817 ("btrfs: qgroup: account shared subtrees during snapshot delete")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---

This patch needs "btrfs: qgroup: Allow btrfs_qgroup_extent_record::old_roots
unpopulated at insert time" as a dependency, which allows us to skip
the backref walk.
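
For illustration only, the lock sequence in [CAUSE] can be replayed in a
small user-space model. This is a sketch and not btrfs code: every name
below (lock_state, model_write_lock(), model_read_lock(),
model_deadlocked(), replay_snapshot_drop()) is hypothetical, and the model
assumes each lock has at most one write owner and each thread waits on at
most one lock. It only tracks who owns what and who waits on what, then
checks the wait-for chain for a cycle.

```c
#include <stdbool.h>

/* Hypothetical model of the [CAUSE] lock sequence; not btrfs locking code. */
enum { NO_THREAD = -1, NO_LOCK = -1 };
enum { THREAD_A, THREAD_B, NR_THREADS };          /* cleaner, other writer */
enum { NODE_A, NODE_B, NODE_D, LEAF_H, NR_LOCKS };

struct lock_state {
	int write_owner[NR_LOCKS];   /* thread holding the write lock */
	int waiting_for[NR_THREADS]; /* lock the thread is stalled on */
};

static void lock_state_init(struct lock_state *s)
{
	for (int i = 0; i < NR_LOCKS; i++)
		s->write_owner[i] = NO_THREAD;
	for (int i = 0; i < NR_THREADS; i++)
		s->waiting_for[i] = NO_LOCK;
}

/* Take the write lock if it is free, otherwise record the stall. */
static bool model_write_lock(struct lock_state *s, int thread, int lock)
{
	if (s->write_owner[lock] == NO_THREAD) {
		s->write_owner[lock] = thread;
		return true;
	}
	s->waiting_for[thread] = lock;
	return false;
}

/* A read lock is OK if there is no writer or the writer is this thread. */
static bool model_read_lock(struct lock_state *s, int thread, int lock)
{
	int owner = s->write_owner[lock];

	if (owner == NO_THREAD || owner == thread)
		return true;
	s->waiting_for[thread] = lock;
	return false;
}

/* Follow thread -> awaited lock -> its owner; a loop back is a deadlock. */
static bool model_deadlocked(const struct lock_state *s, int thread)
{
	int cur = thread;

	for (int step = 0; step < NR_THREADS; step++) {
		int lock = s->waiting_for[cur];

		if (lock == NO_LOCK)
			return false;
		cur = s->write_owner[lock];
		if (cur == NO_THREAD)
			return false;
		if (cur == thread)
			return true;
	}
	return false;
}

/* Replay the table from [CAUSE], with or without the backref walk. */
static bool replay_snapshot_drop(bool with_backref_walk)
{
	struct lock_state s;

	lock_state_init(&s);
	model_write_lock(&s, THREAD_A, NODE_B);  /* walk_down_tree() */
	model_write_lock(&s, THREAD_A, NODE_D);
	model_write_lock(&s, THREAD_B, NODE_A);
	model_write_lock(&s, THREAD_B, NODE_D);  /* stalls on thread A */
	if (with_backref_walk) {
		model_read_lock(&s, THREAD_A, LEAF_H);
		model_read_lock(&s, THREAD_A, NODE_D); /* same owner, OK */
		model_read_lock(&s, THREAD_A, NODE_A); /* stalls on thread B */
	}
	return model_deadlocked(&s, THREAD_A);
}
```

In this model replay_snapshot_drop(true) reports a cycle between the two
threads, while replay_snapshot_drop(false) does not: without the backref
walk thread A never waits on node A, so thread B is only stalled until
thread A releases node D.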
---
 fs/btrfs/qgroup.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 257c557e3aaa..b75dcc75ef89 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2186,16 +2186,16 @@ int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans,
 			btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
 			path->locks[level] = BTRFS_READ_LOCK_BLOCKING;
 
-			ret = btrfs_qgroup_trace_extent(trans, child_bytenr,
-							fs_info->nodesize,
-							GFP_NOFS);
+			ret = qgroup_trace_extent(trans, child_bytenr,
+						  fs_info->nodesize, GFP_NOFS,
+						  false);
 			if (ret)
 				goto out;
 		}
 
 		if (level == 0) {
-			ret = btrfs_qgroup_trace_leaf_items(trans,
-							    path->nodes[level]);
+			ret = qgroup_trace_leaf_items(trans, path->nodes[level],
+						      false);
 			if (ret)
 				goto out;
 
-- 
2.19.2

