linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5][v2] Relocation and backref resolution fixes
@ 2020-03-20 18:34 Josef Bacik
  2020-03-20 18:34 ` [PATCH 1/5] btrfs: reorder reservation before reloc root selection Josef Bacik
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

v1->v2:
- reworded the first patch.
- Added "btrfs: restart snapshot delete if we have to end the transaction".
  Zygo still was able to hit the backref walking+snapshot delete deadlock
  because the original fix is still a little racey.  We can already restart
  snapshot deletes, just drop our path if we have to end the transaction so
  we're not holding locks across trans handles.

===================== Original email =====================================
These are standalone fixes that came out of my debugging Zygo's problems.  The
first two address a problem with how we handle restarting relocation.
Previously this rarely happened, because if it had people would have complained.
The restart logic was broken in a few subtle ways, and these two patches address
those issues.

The third patch just boggles my mind.  We were recording reloc roots based on
their current bytenr.  This worked fine if we never restarted, but broke if we
had to lookup a ref to a reloc root that we found on the tree.  This is because
that would point at the commit root of the reloc root, but if we had modified
the reloc root we'd no longer be able to find it.

And finally the last one was a weird deadlock that Zygo's insane test rig found,
as he runs the dedup thing while balancing and deleting snapshots, which made
this thing fall out.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/5] btrfs: reorder reservation before reloc root selection
  2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
@ 2020-03-20 18:34 ` Josef Bacik
  2020-03-20 18:34 ` [PATCH 2/5] btrfs: restart relocate_tree_blocks properly Josef Bacik
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Since we're not only checking for metadata reservations but also if we
need to throttle our delayed ref generation, reorder
reserve_metadat_bytes() above the select_reloc_root() call in
reserve_metadat_bytes().  The reason we want this is because
select_reloc_root() will mess with the backref cache, and if we're going
to bail we want to be able to cleanly remove this node from the backref
cache and come back along to regenerate it.  Move it up so this is the
first thing we do to make restarting cleaner.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 3ccc126d0df3..df33649c592c 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3134,6 +3134,14 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 	if (!node)
 		return 0;
 
+	/*
+	 * If we fail here we want to drop our backref_node because we are going
+	 * to start over and regenerate the tree for it.
+	 */
+	ret = reserve_metadata_space(trans, rc, node);
+	if (ret)
+		goto out;
+
 	BUG_ON(node->processed);
 	root = select_one_root(node);
 	if (root == ERR_PTR(-ENOENT)) {
@@ -3141,12 +3149,6 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 		goto out;
 	}
 
-	if (!root || test_bit(BTRFS_ROOT_REF_COWS, &root->state)) {
-		ret = reserve_metadata_space(trans, rc, node);
-		if (ret)
-			goto out;
-	}
-
 	if (root) {
 		if (test_bit(BTRFS_ROOT_REF_COWS, &root->state)) {
 			BUG_ON(node->new_bytenr);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/5] btrfs: restart relocate_tree_blocks properly
  2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
  2020-03-20 18:34 ` [PATCH 1/5] btrfs: reorder reservation before reloc root selection Josef Bacik
@ 2020-03-20 18:34 ` Josef Bacik
  2020-03-20 18:34 ` [PATCH 3/5] btrfs: track reloc roots based on their commit_root bytenr Josef Bacik
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

There are two bugs here, but fixing them independently would just result
in pain if you happened to bisect between the two patches.

First is how we handle the -EAGAIN from relocate_tree_block().  We don't
set error, unless we happen to be the first node, which makes 0 sense, I
have no idea what the code was trying to accomplish here.

We in fact _do_ want err set here so that we know we need to restart in
relocate_block_group().  Also we need finish_pending_nodes() to not
actually call link_to_upper(), because we didn't actually relocate the
block.

And then if we do get -EAGAIN we do not want to set our backref cache
last_trans to the one before ours.  This would force us to update our
backref cache if we didn't cross trans id's, which would mean we'd have
some nodes updated to their new_bytenr, but still able to find their old
bytenr because we're searching the same commit root as the last time we
went through relocate_tree_blocks.

Fixing these two things keeps us from panicing when we start breaking
out of relocate_tree_blocks() either for delayed ref flushing or enospc.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index df33649c592c..66a344df4f05 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3226,9 +3226,8 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans,
 		ret = relocate_tree_block(trans, rc, node, &block->key,
 					  path);
 		if (ret < 0) {
-			if (ret != -EAGAIN || &block->rb_node == rb_first(blocks))
-				err = ret;
-			goto out;
+			err = ret;
+			break;
 		}
 	}
 out:
@@ -4204,12 +4203,6 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
 		if (!RB_EMPTY_ROOT(&blocks)) {
 			ret = relocate_tree_blocks(trans, rc, &blocks);
 			if (ret < 0) {
-				/*
-				 * if we fail to relocate tree blocks, force to update
-				 * backref cache when committing transaction.
-				 */
-				rc->backref_cache.last_trans = trans->transid - 1;
-
 				if (ret != -EAGAIN) {
 					err = ret;
 					break;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/5] btrfs: track reloc roots based on their commit_root bytenr
  2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
  2020-03-20 18:34 ` [PATCH 1/5] btrfs: reorder reservation before reloc root selection Josef Bacik
  2020-03-20 18:34 ` [PATCH 2/5] btrfs: restart relocate_tree_blocks properly Josef Bacik
@ 2020-03-20 18:34 ` Josef Bacik
  2020-03-20 18:34 ` [PATCH 4/5] btrfs: do not resolve backrefs for roots that are being deleted Josef Bacik
  2020-03-20 18:34 ` [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction Josef Bacik
  4 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We always search the commit root of the extent tree for looking up
back references, however we track the reloc roots based on their current
bytenr.

This is wrong, if we commit the transaction between relocating tree
blocks we could end up in this code in build_backref_tree

if (key.objectid == key.offset) {
	/*
	 * Only root blocks of reloc trees use
	 backref
	 * pointing to itself.
	 */
	root = find_reloc_root(rc, cur->bytenr);
	ASSERT(root);
	cur->root = root;
	break;
}

find_reloc_root() is looking based on the bytenr we had in the commit
root, but if we've cow'ed this reloc root we will not find that bytenr,
and we will trip over the ASSERT(root).

Fix this by using the commit_root->start bytenr for indexing the commit
root.  Then we change the __update_reloc_root() caller to be used when
we switch the commit root for the reloc root during commit.

This fixes the panic I was seeing when we started throttling relocation
for delayed refs.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 66a344df4f05..45268e50cb17 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1355,7 +1355,7 @@ static int __must_check __add_reloc_root(struct btrfs_root *root)
 	if (!node)
 		return -ENOMEM;
 
-	node->bytenr = root->node->start;
+	node->bytenr = root->commit_root->start;
 	node->data = root;
 
 	spin_lock(&rc->reloc_root_tree.lock);
@@ -1386,10 +1386,11 @@ static void __del_reloc_root(struct btrfs_root *root)
 	if (rc && root->node) {
 		spin_lock(&rc->reloc_root_tree.lock);
 		rb_node = tree_search(&rc->reloc_root_tree.rb_root,
-				      root->node->start);
+				      root->commit_root->start);
 		if (rb_node) {
 			node = rb_entry(rb_node, struct mapping_node, rb_node);
 			rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root);
+			RB_CLEAR_NODE(&node->rb_node);
 		}
 		spin_unlock(&rc->reloc_root_tree.lock);
 		if (!node)
@@ -1407,7 +1408,7 @@ static void __del_reloc_root(struct btrfs_root *root)
  * helper to update the 'address of tree root -> reloc tree'
  * mapping
  */
-static int __update_reloc_root(struct btrfs_root *root, u64 new_bytenr)
+static int __update_reloc_root(struct btrfs_root *root)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct rb_node *rb_node;
@@ -1416,7 +1417,7 @@ static int __update_reloc_root(struct btrfs_root *root, u64 new_bytenr)
 
 	spin_lock(&rc->reloc_root_tree.lock);
 	rb_node = tree_search(&rc->reloc_root_tree.rb_root,
-			      root->node->start);
+			      root->commit_root->start);
 	if (rb_node) {
 		node = rb_entry(rb_node, struct mapping_node, rb_node);
 		rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root);
@@ -1428,7 +1429,7 @@ static int __update_reloc_root(struct btrfs_root *root, u64 new_bytenr)
 	BUG_ON((struct btrfs_root *)node->data != root);
 
 	spin_lock(&rc->reloc_root_tree.lock);
-	node->bytenr = new_bytenr;
+	node->bytenr = root->node->start;
 	rb_node = tree_insert(&rc->reloc_root_tree.rb_root,
 			      node->bytenr, &node->rb_node);
 	spin_unlock(&rc->reloc_root_tree.lock);
@@ -1587,6 +1588,7 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	}
 
 	if (reloc_root->commit_root != reloc_root->node) {
+		__update_reloc_root(reloc_root);
 		btrfs_set_root_node(root_item, reloc_root->node);
 		free_extent_buffer(reloc_root->commit_root);
 		reloc_root->commit_root = btrfs_root_node(reloc_root);
@@ -4789,11 +4791,6 @@ int btrfs_reloc_cow_block(struct btrfs_trans_handle *trans,
 	BUG_ON(rc->stage == UPDATE_DATA_PTRS &&
 	       root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID);
 
-	if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
-		if (buf == root->node)
-			__update_reloc_root(root, cow->start);
-	}
-
 	level = btrfs_header_level(buf);
 	if (btrfs_header_generation(buf) <=
 	    btrfs_root_last_snapshot(&root->root_item))
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/5] btrfs: do not resolve backrefs for roots that are being deleted
  2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
                   ` (2 preceding siblings ...)
  2020-03-20 18:34 ` [PATCH 3/5] btrfs: track reloc roots based on their commit_root bytenr Josef Bacik
@ 2020-03-20 18:34 ` Josef Bacik
  2020-03-20 18:34 ` [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction Josef Bacik
  4 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Zygo reported a deadlock where a task was stuck in the inode logical
resolve code.  The deadlock looks like this

Task 1
btrfs_ioctl_logical_to_ino
->iterate_inodes_from_logical
 ->iterate_extent_inodes
  ->path->search_commit_root isn't set, so a transaction is started
    ->resolve_indirect_ref for a root that's being deleted
      ->search for our key, attempt to lock a node, DEADLOCK

Task 2
btrfs_drop_snapshot
->walk down to a leaf, lock it, walk up, lock node
 ->end transaction
  ->start transaction
    -> wait_cur_trans

Task 3
btrfs_commit_transaction
->wait_event(cur_trans->write_wait, num_writers == 1) DEADLOCK

We are holding a transaction open in btrfs_ioctl_logical_to_ino while we
try to resolve our references.  btrfs_drop_snapshot() holds onto its
locks while it stops and starts transaction handles, because it assumes
nobody is going to touch the root now.  Commit just does what commit
does, waiting for the writers to finish, blocking any new trans handles
from starting.

Fix this by making the backref code not try to resolve backrefs of roots
that are currently being deleted.  This will keep us from walking into a
snapshot that's currently being deleted.

This problem was harder to hit before because we rarely broke out of the
snapshot delete halfway through, but with my delayed ref throttling code
it happened much more often.  However we've always been able to do this,
so it's not a new problem.

Fixes: 8da6d5815c59 ("Btrfs: added btrfs_find_all_roots()")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/backref.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 9d0f87df2c35..0dcc11644be4 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -523,6 +523,12 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 		goto out_free;
 	}
 
+	if (!path->search_commit_root &&
+	    test_bit(BTRFS_ROOT_DELETING, &root->state)) {
+		ret = -ENOENT;
+		goto out;
+	}
+
 	if (btrfs_is_testing(fs_info)) {
 		ret = -ENOENT;
 		goto out;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction
  2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
                   ` (3 preceding siblings ...)
  2020-03-20 18:34 ` [PATCH 4/5] btrfs: do not resolve backrefs for roots that are being deleted Josef Bacik
@ 2020-03-20 18:34 ` Josef Bacik
  2020-03-20 19:39   ` David Sterba
  2020-10-28 22:51   ` Holger Hoffstätte
  4 siblings, 2 replies; 9+ messages in thread
From: Josef Bacik @ 2020-03-20 18:34 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This is to fully fix the deadlock described in

btrfs: do not resolve backrefs for roots that are being deleted

Holding write locks on our deleted snapshot across trans handles will
just lead to sadness, and our backref lookup code is going to want to
still process dropped snapshots for things like qgroup accounting.

Fix this by simply dropping our path before we restart our transaction,
and picking back up from our drop_progress key.  This is less efficient
obviously, but it also doesn't deadlock, so it feels like a reasonable
trade off.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2925b3ad77a1..bfb413747283 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5257,6 +5257,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	 * already dropped.
 	 */
 	set_bit(BTRFS_ROOT_DELETING, &root->state);
+again:
 	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
 		level = btrfs_header_level(root->node);
 		path->nodes[level] = btrfs_lock_root_node(root);
@@ -5269,7 +5270,9 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
 		memcpy(&wc->update_progress, &key,
 		       sizeof(wc->update_progress));
+		memcpy(&wc->drop_progress, &key, sizeof(key));
 
+		wc->drop_level = root_item->drop_level;
 		level = root_item->drop_level;
 		BUG_ON(level == 0);
 		path->lowest_level = level;
@@ -5362,6 +5365,18 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 				goto out_end_trans;
 			}
 
+			/*
+			 * We used to keep the path open until we completed the
+			 * snapshot delete.  However this can deadlock with
+			 * things like backref walking that may want to resolve
+			 * references that still point to this deleted root.  We
+			 * already have the ability to restart snapshot
+			 * deletions on mount, so just clear our walk_control,
+			 * drop the path, and go to the beginning and re-lookup
+			 * our drop_progress key and continue from there.
+			 */
+			memset(wc, 0, sizeof(*wc));
+			btrfs_release_path(path);
 			btrfs_end_transaction_throttle(trans);
 			if (!for_reloc && btrfs_need_cleaner_sleep(fs_info)) {
 				btrfs_debug(fs_info,
@@ -5377,6 +5392,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 			}
 			if (block_rsv)
 				trans->block_rsv = block_rsv;
+			goto again;
 		}
 	}
 	btrfs_release_path(path);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction
  2020-03-20 18:34 ` [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction Josef Bacik
@ 2020-03-20 19:39   ` David Sterba
  2020-10-28 22:51   ` Holger Hoffstätte
  1 sibling, 0 replies; 9+ messages in thread
From: David Sterba @ 2020-03-20 19:39 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Mar 20, 2020 at 02:34:36PM -0400, Josef Bacik wrote:
> @@ -5377,6 +5392,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  			}
>  			if (block_rsv)
>  				trans->block_rsv = block_rsv;
> +			goto again;

This hunk does not apply cleanly and there's no block_rsv around in
current misc-next + this series. I get this as automatic merge
resolution but that's obviously wrong and I don't see how can I fix it.

  3166                 ret = add_to_free_space_tree(trans, bytenr, num_bytes);                                                                                                                                     
  3167                 if (ret) {                                                                                                                                                                                  
  3168                         btrfs_abort_transaction(trans, ret);                                                                                                                                                
  3169                         goto out;                                                                                                                                                                           
  3170                 }                                                                                                                                                                                           
  3171                                                                                                                                                                                                             
  3172                 ret = btrfs_update_block_group(trans, bytenr, num_bytes, 0);                                                                                                                                
  3173                 if (ret) {                                                                                                                                                                                  
  3174                         btrfs_abort_transaction(trans, ret);                                                                                                                                                
  3175                         goto out;                                                                                                                                                                           
+ 3176                         goto again;                                                                                                                                                                         
  3177                 }                                                                                                                                                                                           
  3178         }                                                                                                                                                                                                   
  3179         btrfs_release_path(path);                                                                                                                                                                           
  3180                                                                                                                                                                                                             
  3181 out:                                                                                                                                                                                                        
  3182         btrfs_free_path(path);                                                                                                                                                                              
  3183         return ret;                                                                                                                                                                                         
  3184 }

So I guess it's because of some other patches in your tree. I'm about to
push misc-next with patches 1-4, so you can have a look.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction
  2020-03-20 18:34 ` [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction Josef Bacik
  2020-03-20 19:39   ` David Sterba
@ 2020-10-28 22:51   ` Holger Hoffstätte
  2020-11-20  8:48     ` Holger Hoffstätte
  1 sibling, 1 reply; 9+ messages in thread
From: Holger Hoffstätte @ 2020-10-28 22:51 UTC (permalink / raw)
  To: linux-btrfs, Josef Bacik

On 2020-03-20 19:34, Josef Bacik wrote:
> This is to fully fix the deadlock described in
> 
> btrfs: do not resolve backrefs for roots that are being deleted
> 
> Holding write locks on our deleted snapshot across trans handles will
> just lead to sadness, and our backref lookup code is going to want to
> still process dropped snapshots for things like qgroup accounting.
> 
> Fix this by simply dropping our path before we restart our transaction,
> and picking back up from our drop_progress key.  This is less efficient
> obviously, but it also doesn't deadlock, so it feels like a reasonable
> trade off.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/extent-tree.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 2925b3ad77a1..bfb413747283 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5257,6 +5257,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   	 * already dropped.
>   	 */
>   	set_bit(BTRFS_ROOT_DELETING, &root->state);
> +again:
>   	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>   		level = btrfs_header_level(root->node);
>   		path->nodes[level] = btrfs_lock_root_node(root);
> @@ -5269,7 +5270,9 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
>   		memcpy(&wc->update_progress, &key,
>   		       sizeof(wc->update_progress));
> +		memcpy(&wc->drop_progress, &key, sizeof(key));
>   
> +		wc->drop_level = root_item->drop_level;
>   		level = root_item->drop_level;
>   		BUG_ON(level == 0);
>   		path->lowest_level = level;
> @@ -5362,6 +5365,18 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   				goto out_end_trans;
>   			}
>   
> +			/*
> +			 * We used to keep the path open until we completed the
> +			 * snapshot delete.  However this can deadlock with
> +			 * things like backref walking that may want to resolve
> +			 * references that still point to this deleted root.  We
> +			 * already have the ability to restart snapshot
> +			 * deletions on mount, so just clear our walk_control,
> +			 * drop the path, and go to the beginning and re-lookup
> +			 * our drop_progress key and continue from there.
> +			 */
> +			memset(wc, 0, sizeof(*wc));
> +			btrfs_release_path(path);
>   			btrfs_end_transaction_throttle(trans);
>   			if (!for_reloc && btrfs_need_cleaner_sleep(fs_info)) {
>   				btrfs_debug(fs_info,
> @@ -5377,6 +5392,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   			}
>   			if (block_rsv)
>   				trans->block_rsv = block_rsv;
> +			goto again;
>   		}
>   	}
>   	btrfs_release_path(path);
> 

Josef,

the above fix still seems to be missing, apparently since Dave couldn't merge it
properly at the time (see [1]). Is this still needed? There were several long
discussions about balance loops and it would be great to get this fixed once and
for all. It applies and (seems to?) work fine in 5.9 (at least it hasn't eaten
anything here so far) but if it's not needed anymore then all the better.

thanks
Holger

[1] https://lore.kernel.org/linux-btrfs/20200320193927.GH12659@twin.jikos.cz/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction
  2020-10-28 22:51   ` Holger Hoffstätte
@ 2020-11-20  8:48     ` Holger Hoffstätte
  0 siblings, 0 replies; 9+ messages in thread
From: Holger Hoffstätte @ 2020-11-20  8:48 UTC (permalink / raw)
  To: linux-btrfs


Trying again out-of-thread, but I haven't seen any answers to this yet..

On 2020-03-20 19:34, Josef Bacik wrote:
> This is to fully fix the deadlock described in
> 
> btrfs: do not resolve backrefs for roots that are being deleted
> 
> Holding write locks on our deleted snapshot across trans handles will
> just lead to sadness, and our backref lookup code is going to want to
> still process dropped snapshots for things like qgroup accounting.
> 
> Fix this by simply dropping our path before we restart our transaction,
> and picking back up from our drop_progress key.  This is less efficient
> obviously, but it also doesn't deadlock, so it feels like a reasonable
> trade off.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/extent-tree.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 2925b3ad77a1..bfb413747283 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5257,6 +5257,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   	 * already dropped.
>   	 */
>   	set_bit(BTRFS_ROOT_DELETING, &root->state);
> +again:
>   	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>   		level = btrfs_header_level(root->node);
>   		path->nodes[level] = btrfs_lock_root_node(root);
> @@ -5269,7 +5270,9 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
>   		memcpy(&wc->update_progress, &key,
>   		       sizeof(wc->update_progress));
> +		memcpy(&wc->drop_progress, &key, sizeof(key));
>   
> +		wc->drop_level = root_item->drop_level;
>   		level = root_item->drop_level;
>   		BUG_ON(level == 0);
>   		path->lowest_level = level;
> @@ -5362,6 +5365,18 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   				goto out_end_trans;
>   			}
>   
> +			/*
> +			 * We used to keep the path open until we completed the
> +			 * snapshot delete.  However this can deadlock with
> +			 * things like backref walking that may want to resolve
> +			 * references that still point to this deleted root.  We
> +			 * already have the ability to restart snapshot
> +			 * deletions on mount, so just clear our walk_control,
> +			 * drop the path, and go to the beginning and re-lookup
> +			 * our drop_progress key and continue from there.
> +			 */
> +			memset(wc, 0, sizeof(*wc));
> +			btrfs_release_path(path);
>   			btrfs_end_transaction_throttle(trans);
>   			if (!for_reloc && btrfs_need_cleaner_sleep(fs_info)) {
>   				btrfs_debug(fs_info,
> @@ -5377,6 +5392,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>   			}
>   			if (block_rsv)
>   				trans->block_rsv = block_rsv;
> +			goto again;
>   		}
>   	}
>   	btrfs_release_path(path);
> 

Josef,

the above fix still seems to be missing, apparently since Dave couldn't merge it
properly at the time (see [1]). Is this still needed? There were several long
discussions about balance loops and it would be great to get this fixed once and
for all. It applies and (seems to?) work fine in 5.9 (at least it hasn't eaten
anything here so far) but if it's not needed anymore then all the better.

thanks
Holger

[1] https://lore.kernel.org/linux-btrfs/20200320193927.GH12659@twin.jikos.cz/


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-11-20  8:48 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-20 18:34 [PATCH 0/5][v2] Relocation and backref resolution fixes Josef Bacik
2020-03-20 18:34 ` [PATCH 1/5] btrfs: reorder reservation before reloc root selection Josef Bacik
2020-03-20 18:34 ` [PATCH 2/5] btrfs: restart relocate_tree_blocks properly Josef Bacik
2020-03-20 18:34 ` [PATCH 3/5] btrfs: track reloc roots based on their commit_root bytenr Josef Bacik
2020-03-20 18:34 ` [PATCH 4/5] btrfs: do not resolve backrefs for roots that are being deleted Josef Bacik
2020-03-20 18:34 ` [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction Josef Bacik
2020-03-20 19:39   ` David Sterba
2020-10-28 22:51   ` Holger Hoffstätte
2020-11-20  8:48     ` Holger Hoffstätte

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).