All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] Various fixes
@ 2019-12-06 14:37 Josef Bacik
  2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

These were discovered while reworking how we handle root refs.  They are all
relatively straightforward and mostly deal with error cases, with the exception
of

[PATCH 1/5] btrfs: drop log root for dropped roots
[PATCH 4/5] btrfs: skip log replay on orphaned roots

These two are pretty important and were uncovered with my fsstress patch.  The
first fixes a space leak in the case that we delete a subvol that has a tree log
attached to it.  The leak does not persist across mounts so it's not too bad,
but still pretty important.  The second patch I've only seen in production once
in the last 90 days, but could keep us from mounting if we have a subvol that
was deleted with a tree log that we didn't finish deleting.

The rest of these are just for various error conditions and are less important,
but should be safe enough to send along now if desired.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
@ 2019-12-06 14:37 ` Josef Bacik
  2019-12-06 15:02   ` Filipe Manana
  2019-12-06 15:03   ` Nikolay Borisov
  2019-12-06 14:37 ` [PATCH 2/5] btrfs: don't BUG_ON in create_subvol Josef Bacik
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we fsync on a subvolume and create a log root for that volume, and
then later delete that subvolume we'll never clean up its log root.  Fix
this by making switch_commit_roots free the log for any dropped roots we
encounter.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index cfc08ef9b876..55d8fd68775a 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -147,13 +147,14 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
 	}
 }
 
-static noinline void switch_commit_roots(struct btrfs_transaction *trans)
+static noinline void switch_commit_roots(struct btrfs_trans_handle *trans)
 {
+	struct btrfs_transaction *cur_trans = trans->transaction;
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_root *root, *tmp;
 
 	down_write(&fs_info->commit_root_sem);
-	list_for_each_entry_safe(root, tmp, &trans->switch_commits,
+	list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
 				 dirty_list) {
 		list_del_init(&root->dirty_list);
 		free_extent_buffer(root->commit_root);
@@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans)
 	}
 
 	/* We can free old roots now. */
-	spin_lock(&trans->dropped_roots_lock);
-	while (!list_empty(&trans->dropped_roots)) {
-		root = list_first_entry(&trans->dropped_roots,
+	spin_lock(&cur_trans->dropped_roots_lock);
+	while (!list_empty(&cur_trans->dropped_roots)) {
+		root = list_first_entry(&cur_trans->dropped_roots,
 					struct btrfs_root, root_list);
 		list_del_init(&root->root_list);
-		spin_unlock(&trans->dropped_roots_lock);
+		spin_unlock(&cur_trans->dropped_roots_lock);
+		btrfs_free_log(trans, root);
 		btrfs_drop_and_free_fs_root(fs_info, root);
-		spin_lock(&trans->dropped_roots_lock);
+		spin_lock(&cur_trans->dropped_roots_lock);
 	}
-	spin_unlock(&trans->dropped_roots_lock);
+	spin_unlock(&cur_trans->dropped_roots_lock);
 	up_write(&fs_info->commit_root_sem);
 }
 
@@ -1421,7 +1423,7 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
 	ret = commit_cowonly_roots(trans);
 	if (ret)
 		goto out;
-	switch_commit_roots(trans->transaction);
+	switch_commit_roots(trans);
 	ret = btrfs_write_and_wait_transaction(trans);
 	if (ret)
 		btrfs_handle_fs_error(fs_info, ret,
@@ -2301,7 +2303,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	list_add_tail(&fs_info->chunk_root->dirty_list,
 		      &cur_trans->switch_commits);
 
-	switch_commit_roots(cur_trans);
+	switch_commit_roots(trans);
 
 	ASSERT(list_empty(&cur_trans->dirty_bgs));
 	ASSERT(list_empty(&cur_trans->io_bgs));
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/5] btrfs: don't BUG_ON in create_subvol
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
  2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
@ 2019-12-06 14:37 ` Josef Bacik
  2019-12-06 15:03   ` Filipe Manana
  2019-12-09 10:49   ` Johannes Thumshirn
  2019-12-06 14:37 ` [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate Josef Bacik
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can just abort the transaction here, and in fact do that for every
other failure in this function except these two cases.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ioctl.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a1ee0b775e65..375befdecc19 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -704,11 +704,17 @@ static noinline int create_subvol(struct inode *dir,
 
 	btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
 	ret = btrfs_update_inode(trans, root, dir);
-	BUG_ON(ret);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto fail;
+	}
 
 	ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
 				 btrfs_ino(BTRFS_I(dir)), index, name, namelen);
-	BUG_ON(ret);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto fail;
+	}
 
 	ret = btrfs_uuid_tree_add(trans, root_item->uuid,
 				  BTRFS_UUID_KEY_SUBVOL, objectid);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
  2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
  2019-12-06 14:37 ` [PATCH 2/5] btrfs: don't BUG_ON in create_subvol Josef Bacik
@ 2019-12-06 14:37 ` Josef Bacik
  2019-12-06 15:13   ` Filipe Manana
  2019-12-06 16:39   ` [PATCH 3/5][v2] " Josef Bacik
  2019-12-06 14:37 ` [PATCH 4/5] btrfs: skip log replay on orphaned roots Josef Bacik
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
uuid tree we'll just continue and do btrfs_next_item().  However we've
done a btrfs_release_path() at this point and no longer have a valid
path.  So increment the key and go back and do a normal search.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/uuid-tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
index 91caab63bdf5..8871e0bb3b69 100644
--- a/fs/btrfs/uuid-tree.c
+++ b/fs/btrfs/uuid-tree.c
@@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
 				}
 				if (ret < 0 && ret != -ENOENT)
 					goto out;
+				key.objectid++;
+				goto again_search_slot;
 			}
 			item_size -= sizeof(subid_le);
 			offset += sizeof(subid_le);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/5] btrfs: skip log replay on orphaned roots
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
                   ` (2 preceding siblings ...)
  2019-12-06 14:37 ` [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate Josef Bacik
@ 2019-12-06 14:37 ` Josef Bacik
  2019-12-06 15:23   ` Filipe Manana
  2019-12-06 14:37 ` [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root Josef Bacik
  2019-12-09 18:16 ` [PATCH 0/5] Various fixes David Sterba
  5 siblings, 1 reply; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

My fsstress modifications coupled with generic/475 uncovered a failure
to mount and replay the log if we hit a orphaned root.  We do not want
to replay the log for an orphan root, but it's completely legitimate to
have an orphaned root with a log attached.  Fix this by simply skipping
replaying the log.  We still need to pin it's root node so that we do
not overwrite it while replaying other logs, as we re-read the log root
at every stage of the replay.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/tree-log.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 6f757361db53..4bbb4fd490b5 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -6294,9 +6294,28 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
 		wc.replay_dest = btrfs_read_fs_root_no_name(fs_info, &tmp_key);
 		if (IS_ERR(wc.replay_dest)) {
 			ret = PTR_ERR(wc.replay_dest);
+
+			/*
+			 * We didn't find the subvol, likely because it was
+			 * deleted.  This is ok, simply skip this log and go to
+			 * the next one.
+			 *
+			 * We need to exclude the root because we can't have
+			 * other log replays overwriting this log as we'll read
+			 * it back in a few more times.  This will keep our
+			 * block from being modified, and we'll just bail for
+			 * each subsequent pass.
+			 */
+			if (ret == -ENOENT)
+				ret = btrfs_pin_extent_for_log_replay(fs_info,
+							log->node->start,
+							log->node->len);
 			free_extent_buffer(log->node);
 			free_extent_buffer(log->commit_root);
 			kfree(log);
+
+			if (!ret)
+				goto next;
 			btrfs_handle_fs_error(fs_info, ret,
 				"Couldn't read target root for tree log recovery.");
 			goto error;
@@ -6328,7 +6347,6 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
 						  &root->highest_objectid);
 		}
 
-		key.offset = found_key.offset - 1;
 		wc.replay_dest->log_root = NULL;
 		free_extent_buffer(log->node);
 		free_extent_buffer(log->commit_root);
@@ -6336,9 +6354,10 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
 
 		if (ret)
 			goto error;
-
+next:
 		if (found_key.offset == 0)
 			break;
+		key.offset = found_key.offset - 1;
 	}
 	btrfs_release_path(path);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
                   ` (3 preceding siblings ...)
  2019-12-06 14:37 ` [PATCH 4/5] btrfs: skip log replay on orphaned roots Josef Bacik
@ 2019-12-06 14:37 ` Josef Bacik
  2019-12-06 15:24   ` Filipe Manana
  2019-12-09 10:58   ` Johannes Thumshirn
  2019-12-09 18:16 ` [PATCH 0/5] Various fixes David Sterba
  5 siblings, 2 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 14:37 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we fail to read the fs root corresponding with a reloc root we'll
just break out and free the reloc roots.  But we remove our current
reloc_root from this list higher up, which means we'll leak this
reloc_root.  Fix this by adding ourselves back to the reloc_roots list
so we are properly cleaned up.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/relocation.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index a857fc8271d2..c5fcddad1c15 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4554,6 +4554,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
 		fs_root = read_fs_root(fs_info, reloc_root->root_key.offset);
 		if (IS_ERR(fs_root)) {
 			err = PTR_ERR(fs_root);
+			list_add_tail(&reloc_root->root_list, &reloc_roots);
 			goto out_free;
 		}
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
@ 2019-12-06 15:02   ` Filipe Manana
  2019-12-06 15:03   ` Nikolay Borisov
  1 sibling, 0 replies; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 15:02 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 2:38 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> If we fsync on a subvolume and create a log root for that volume, and
> then later delete that subvolume we'll never clean up its log root.  Fix
> this by making switch_commit_roots free the log for any dropped roots we
> encounter.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/transaction.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index cfc08ef9b876..55d8fd68775a 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
>         }
>  }
>
> -static noinline void switch_commit_roots(struct btrfs_transaction *trans)
> +static noinline void switch_commit_roots(struct btrfs_trans_handle *trans)
>  {
> +       struct btrfs_transaction *cur_trans = trans->transaction;
>         struct btrfs_fs_info *fs_info = trans->fs_info;
>         struct btrfs_root *root, *tmp;
>
>         down_write(&fs_info->commit_root_sem);
> -       list_for_each_entry_safe(root, tmp, &trans->switch_commits,
> +       list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
>                                  dirty_list) {
>                 list_del_init(&root->dirty_list);
>                 free_extent_buffer(root->commit_root);
> @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans)
>         }
>
>         /* We can free old roots now. */
> -       spin_lock(&trans->dropped_roots_lock);
> -       while (!list_empty(&trans->dropped_roots)) {
> -               root = list_first_entry(&trans->dropped_roots,
> +       spin_lock(&cur_trans->dropped_roots_lock);
> +       while (!list_empty(&cur_trans->dropped_roots)) {
> +               root = list_first_entry(&cur_trans->dropped_roots,
>                                         struct btrfs_root, root_list);
>                 list_del_init(&root->root_list);
> -               spin_unlock(&trans->dropped_roots_lock);
> +               spin_unlock(&cur_trans->dropped_roots_lock);
> +               btrfs_free_log(trans, root);
>                 btrfs_drop_and_free_fs_root(fs_info, root);
> -               spin_lock(&trans->dropped_roots_lock);
> +               spin_lock(&cur_trans->dropped_roots_lock);
>         }
> -       spin_unlock(&trans->dropped_roots_lock);
> +       spin_unlock(&cur_trans->dropped_roots_lock);
>         up_write(&fs_info->commit_root_sem);
>  }
>
> @@ -1421,7 +1423,7 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
>         ret = commit_cowonly_roots(trans);
>         if (ret)
>                 goto out;
> -       switch_commit_roots(trans->transaction);
> +       switch_commit_roots(trans);
>         ret = btrfs_write_and_wait_transaction(trans);
>         if (ret)
>                 btrfs_handle_fs_error(fs_info, ret,
> @@ -2301,7 +2303,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>         list_add_tail(&fs_info->chunk_root->dirty_list,
>                       &cur_trans->switch_commits);
>
> -       switch_commit_roots(cur_trans);
> +       switch_commit_roots(trans);
>
>         ASSERT(list_empty(&cur_trans->dirty_bgs));
>         ASSERT(list_empty(&cur_trans->io_bgs));
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
  2019-12-06 15:02   ` Filipe Manana
@ 2019-12-06 15:03   ` Nikolay Borisov
  2019-12-09 17:39     ` David Sterba
  2019-12-10 20:05     ` Josef Bacik
  1 sibling, 2 replies; 24+ messages in thread
From: Nikolay Borisov @ 2019-12-06 15:03 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 6.12.19 г. 16:37 ч., Josef Bacik wrote:
> If we fsync on a subvolume and create a log root for that volume, and
> then later delete that subvolume we'll never clean up its log root.  Fix
> this by making switch_commit_roots free the log for any dropped roots we
> encounter.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/transaction.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index cfc08ef9b876..55d8fd68775a 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
>  	}
>  }
>  
> -static noinline void switch_commit_roots(struct btrfs_transaction *trans)
> +static noinline void switch_commit_roots(struct btrfs_trans_handle *trans)
>  {
> +	struct btrfs_transaction *cur_trans = trans->transaction;
>  	struct btrfs_fs_info *fs_info = trans->fs_info;
>  	struct btrfs_root *root, *tmp;
>  
>  	down_write(&fs_info->commit_root_sem);
> -	list_for_each_entry_safe(root, tmp, &trans->switch_commits,
> +	list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
>  				 dirty_list) {
>  		list_del_init(&root->dirty_list);
>  		free_extent_buffer(root->commit_root);
> @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans)
>  	}
>  
>  	/* We can free old roots now. */
> -	spin_lock(&trans->dropped_roots_lock);
> -	while (!list_empty(&trans->dropped_roots)) {
> -		root = list_first_entry(&trans->dropped_roots,
> +	spin_lock(&cur_trans->dropped_roots_lock);
> +	while (!list_empty(&cur_trans->dropped_roots)) {
> +		root = list_first_entry(&cur_trans->dropped_roots,
>  					struct btrfs_root, root_list);
>  		list_del_init(&root->root_list);
> -		spin_unlock(&trans->dropped_roots_lock);
> +		spin_unlock(&cur_trans->dropped_roots_lock);
> +		btrfs_free_log(trans, root);

THis patch should really have been this line and converting
switch_commit_roots to taking a trans handle another patch. Otherwise
this is lost in the mechanical refactoring.

>  		btrfs_drop_and_free_fs_root(fs_info, root);
> -		spin_lock(&trans->dropped_roots_lock);
> +		spin_lock(&cur_trans->dropped_roots_lock);
>  	}
> -	spin_unlock(&trans->dropped_roots_lock);
> +	spin_unlock(&cur_trans->dropped_roots_lock);
>  	up_write(&fs_info->commit_root_sem);
>  }
>  
> @@ -1421,7 +1423,7 @@ static int qgroup_account_snapshot(struct btrfs_trans_handle *trans,
>  	ret = commit_cowonly_roots(trans);
>  	if (ret)
>  		goto out;
> -	switch_commit_roots(trans->transaction);
> +	switch_commit_roots(trans);
>  	ret = btrfs_write_and_wait_transaction(trans);
>  	if (ret)
>  		btrfs_handle_fs_error(fs_info, ret,
> @@ -2301,7 +2303,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>  	list_add_tail(&fs_info->chunk_root->dirty_list,
>  		      &cur_trans->switch_commits);
>  
> -	switch_commit_roots(cur_trans);
> +	switch_commit_roots(trans);
>  
>  	ASSERT(list_empty(&cur_trans->dirty_bgs));
>  	ASSERT(list_empty(&cur_trans->io_bgs));
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] btrfs: don't BUG_ON in create_subvol
  2019-12-06 14:37 ` [PATCH 2/5] btrfs: don't BUG_ON in create_subvol Josef Bacik
@ 2019-12-06 15:03   ` Filipe Manana
  2019-12-09 10:49   ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 15:03 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 2:38 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> We can just abort the transaction here, and in fact do that for every
> other failure in this function except these two cases.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/ioctl.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index a1ee0b775e65..375befdecc19 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -704,11 +704,17 @@ static noinline int create_subvol(struct inode *dir,
>
>         btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
>         ret = btrfs_update_inode(trans, root, dir);
> -       BUG_ON(ret);
> +       if (ret) {
> +               btrfs_abort_transaction(trans, ret);
> +               goto fail;
> +       }
>
>         ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
>                                  btrfs_ino(BTRFS_I(dir)), index, name, namelen);
> -       BUG_ON(ret);
> +       if (ret) {
> +               btrfs_abort_transaction(trans, ret);
> +               goto fail;
> +       }
>
>         ret = btrfs_uuid_tree_add(trans, root_item->uuid,
>                                   BTRFS_UUID_KEY_SUBVOL, objectid);
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 14:37 ` [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate Josef Bacik
@ 2019-12-06 15:13   ` Filipe Manana
  2019-12-06 15:17     ` Josef Bacik
  2019-12-06 16:39   ` [PATCH 3/5][v2] " Josef Bacik
  1 sibling, 1 reply; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 15:13 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 2:38 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
> uuid tree we'll just continue and do btrfs_next_item().  However we've
> done a btrfs_release_path() at this point and no longer have a valid
> path.  So increment the key and go back and do a normal search.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/uuid-tree.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
> index 91caab63bdf5..8871e0bb3b69 100644
> --- a/fs/btrfs/uuid-tree.c
> +++ b/fs/btrfs/uuid-tree.c
> @@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
>                                 }
>                                 if (ret < 0 && ret != -ENOENT)
>                                         goto out;
> +                               key.objectid++;

Why not key.offset++ instead?
By incrementing the objectid it seems we can skip the key for another
subvolume with an uuid having the same value for its first 8 bytes as
the current one, no?

thanks

> +                               goto again_search_slot;
>                         }
>                         item_size -= sizeof(subid_le);
>                         offset += sizeof(subid_le);
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 15:13   ` Filipe Manana
@ 2019-12-06 15:17     ` Josef Bacik
  0 siblings, 0 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 15:17 UTC (permalink / raw)
  To: Filipe Manana; +Cc: Josef Bacik, linux-btrfs, kernel-team

On Fri, Dec 06, 2019 at 03:13:21PM +0000, Filipe Manana wrote:
> On Fri, Dec 6, 2019 at 2:38 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
> > uuid tree we'll just continue and do btrfs_next_item().  However we've
> > done a btrfs_release_path() at this point and no longer have a valid
> > path.  So increment the key and go back and do a normal search.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/btrfs/uuid-tree.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
> > index 91caab63bdf5..8871e0bb3b69 100644
> > --- a/fs/btrfs/uuid-tree.c
> > +++ b/fs/btrfs/uuid-tree.c
> > @@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
> >                                 }
> >                                 if (ret < 0 && ret != -ENOENT)
> >                                         goto out;
> > +                               key.objectid++;
> 
> Why not key.offset++ instead?
> By incrementing the objectid it seems we can skip the key for another
> subvolume with an uuid having the same value for its first 8 bytes as
> the current one, no?

Oops you're right, I had it in my head the objectid was the subvolid.  I'll fix
this, thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/5] btrfs: skip log replay on orphaned roots
  2019-12-06 14:37 ` [PATCH 4/5] btrfs: skip log replay on orphaned roots Josef Bacik
@ 2019-12-06 15:23   ` Filipe Manana
  0 siblings, 0 replies; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 15:23 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 2:38 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> My fsstress modifications coupled with generic/475 uncovered a failure
> to mount and replay the log if we hit a orphaned root.  We do not want
> to replay the log for an orphan root, but it's completely legitimate to
> have an orphaned root with a log attached.  Fix this by simply skipping
> replaying the log.  We still need to pin it's root node so that we do
> not overwrite it while replaying other logs, as we re-read the log root
> at every stage of the replay.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/tree-log.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 6f757361db53..4bbb4fd490b5 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -6294,9 +6294,28 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
>                 wc.replay_dest = btrfs_read_fs_root_no_name(fs_info, &tmp_key);
>                 if (IS_ERR(wc.replay_dest)) {
>                         ret = PTR_ERR(wc.replay_dest);
> +
> +                       /*
> +                        * We didn't find the subvol, likely because it was
> +                        * deleted.  This is ok, simply skip this log and go to
> +                        * the next one.
> +                        *
> +                        * We need to exclude the root because we can't have
> +                        * other log replays overwriting this log as we'll read
> +                        * it back in a few more times.  This will keep our
> +                        * block from being modified, and we'll just bail for
> +                        * each subsequent pass.
> +                        */
> +                       if (ret == -ENOENT)
> +                               ret = btrfs_pin_extent_for_log_replay(fs_info,
> +                                                       log->node->start,
> +                                                       log->node->len);
>                         free_extent_buffer(log->node);
>                         free_extent_buffer(log->commit_root);
>                         kfree(log);
> +
> +                       if (!ret)
> +                               goto next;
>                         btrfs_handle_fs_error(fs_info, ret,
>                                 "Couldn't read target root for tree log recovery.");
>                         goto error;
> @@ -6328,7 +6347,6 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
>                                                   &root->highest_objectid);
>                 }
>
> -               key.offset = found_key.offset - 1;
>                 wc.replay_dest->log_root = NULL;
>                 free_extent_buffer(log->node);
>                 free_extent_buffer(log->commit_root);
> @@ -6336,9 +6354,10 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
>
>                 if (ret)
>                         goto error;
> -
> +next:
>                 if (found_key.offset == 0)
>                         break;
> +               key.offset = found_key.offset - 1;
>         }
>         btrfs_release_path(path);
>
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root
  2019-12-06 14:37 ` [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root Josef Bacik
@ 2019-12-06 15:24   ` Filipe Manana
  2019-12-09 10:58   ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 15:24 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 2:39 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> If we fail to read the fs root corresponding with a reloc root we'll
> just break out and free the reloc roots.  But we remove our current
> reloc_root from this list higher up, which means we'll leak this
> reloc_root.  Fix this by adding ourselves back to the reloc_roots list
> so we are properly cleaned up.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/relocation.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index a857fc8271d2..c5fcddad1c15 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -4554,6 +4554,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
>                 fs_root = read_fs_root(fs_info, reloc_root->root_key.offset);
>                 if (IS_ERR(fs_root)) {
>                         err = PTR_ERR(fs_root);
> +                       list_add_tail(&reloc_root->root_list, &reloc_roots);
>                         goto out_free;
>                 }
>
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/5][v2] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 14:37 ` [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate Josef Bacik
  2019-12-06 15:13   ` Filipe Manana
@ 2019-12-06 16:39   ` Josef Bacik
  2019-12-06 16:46     ` Filipe Manana
  2019-12-09 10:52     ` Johannes Thumshirn
  1 sibling, 2 replies; 24+ messages in thread
From: Josef Bacik @ 2019-12-06 16:39 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
uuid tree we'll just continue and do btrfs_next_item().  However we've
done a btrfs_release_path() at this point and no longer have a valid
path.  So increment the key and go back and do a normal search.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
v1->v2:
- increase key.offset instead of key.objectid so we don't skip over a bunch of
  keys.

 fs/btrfs/uuid-tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
index 91caab63bdf5..76b84f2397b1 100644
--- a/fs/btrfs/uuid-tree.c
+++ b/fs/btrfs/uuid-tree.c
@@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
 				}
 				if (ret < 0 && ret != -ENOENT)
 					goto out;
+				key.offset++;
+				goto again_search_slot;
 			}
 			item_size -= sizeof(subid_le);
 			offset += sizeof(subid_le);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5][v2] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 16:39   ` [PATCH 3/5][v2] " Josef Bacik
@ 2019-12-06 16:46     ` Filipe Manana
  2019-12-09 10:52     ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Filipe Manana @ 2019-12-06 16:46 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 6, 2019 at 4:39 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
> uuid tree we'll just continue and do btrfs_next_item().  However we've
> done a btrfs_release_path() at this point and no longer have a valid
> path.  So increment the key and go back and do a normal search.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Now it looks good, thanks.

> ---
> v1->v2:
> - increase key.offset instead of key.objectid so we don't skip over a bunch of
>   keys.
>
>  fs/btrfs/uuid-tree.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
> index 91caab63bdf5..76b84f2397b1 100644
> --- a/fs/btrfs/uuid-tree.c
> +++ b/fs/btrfs/uuid-tree.c
> @@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
>                                 }
>                                 if (ret < 0 && ret != -ENOENT)
>                                         goto out;
> +                               key.offset++;
> +                               goto again_search_slot;
>                         }
>                         item_size -= sizeof(subid_le);
>                         offset += sizeof(subid_le);
> --
> 2.23.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] btrfs: don't BUG_ON in create_subvol
  2019-12-06 14:37 ` [PATCH 2/5] btrfs: don't BUG_ON in create_subvol Josef Bacik
  2019-12-06 15:03   ` Filipe Manana
@ 2019-12-09 10:49   ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-12-09 10:49 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5][v2] btrfs: handle ENOENT in btrfs_uuid_tree_iterate
  2019-12-06 16:39   ` [PATCH 3/5][v2] " Josef Bacik
  2019-12-06 16:46     ` Filipe Manana
@ 2019-12-09 10:52     ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-12-09 10:52 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 06/12/2019 17:39, Josef Bacik wrote:
> If we get an -ENOENT back from btrfs_uuid_iter_rem when iterating the
> uuid tree we'll just continue and do btrfs_next_item().  However we've
> done a btrfs_release_path() at this point and no longer have a valid
> path.  So increment the key and go back and do a normal search.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> v1->v2:
> - increase key.offset instead of key.objectid so we don't skip over a bunch of
>   keys.
> 
>  fs/btrfs/uuid-tree.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
> index 91caab63bdf5..76b84f2397b1 100644
> --- a/fs/btrfs/uuid-tree.c
> +++ b/fs/btrfs/uuid-tree.c
> @@ -324,6 +324,8 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
>  				}
>  				if (ret < 0 && ret != -ENOENT)
>  					goto out;
> +				key.offset++;
> +				goto again_search_slot;
>  			}
>  			item_size -= sizeof(subid_le);
>  			offset += sizeof(subid_le);
> 

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root
  2019-12-06 14:37 ` [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root Josef Bacik
  2019-12-06 15:24   ` Filipe Manana
@ 2019-12-09 10:58   ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-12-09 10:58 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-06 15:03   ` Nikolay Borisov
@ 2019-12-09 17:39     ` David Sterba
  2019-12-10 20:05     ` Josef Bacik
  1 sibling, 0 replies; 24+ messages in thread
From: David Sterba @ 2019-12-09 17:39 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Josef Bacik, linux-btrfs, kernel-team

On Fri, Dec 06, 2019 at 05:03:36PM +0200, Nikolay Borisov wrote:
> On 6.12.19 г. 16:37 ч., Josef Bacik wrote:
> > If we fsync on a subvolume and create a log root for that volume, and
> > then later delete that subvolume we'll never clean up its log root.  Fix
> > this by making switch_commit_roots free the log for any dropped roots we
> > encounter.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/btrfs/transaction.c | 22 ++++++++++++----------
> >  1 file changed, 12 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> > index cfc08ef9b876..55d8fd68775a 100644
> > --- a/fs/btrfs/transaction.c
> > +++ b/fs/btrfs/transaction.c
> > @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
> >  	}
> >  }
> >  
> > -static noinline void switch_commit_roots(struct btrfs_transaction *trans)
> > +static noinline void switch_commit_roots(struct btrfs_trans_handle *trans)
> >  {
> > +	struct btrfs_transaction *cur_trans = trans->transaction;
> >  	struct btrfs_fs_info *fs_info = trans->fs_info;
> >  	struct btrfs_root *root, *tmp;
> >  
> >  	down_write(&fs_info->commit_root_sem);
> > -	list_for_each_entry_safe(root, tmp, &trans->switch_commits,
> > +	list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
> >  				 dirty_list) {
> >  		list_del_init(&root->dirty_list);
> >  		free_extent_buffer(root->commit_root);
> > @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans)
> >  	}
> >  
> >  	/* We can free old roots now. */
> > -	spin_lock(&trans->dropped_roots_lock);
> > -	while (!list_empty(&trans->dropped_roots)) {
> > -		root = list_first_entry(&trans->dropped_roots,
> > +	spin_lock(&cur_trans->dropped_roots_lock);
> > +	while (!list_empty(&cur_trans->dropped_roots)) {
> > +		root = list_first_entry(&cur_trans->dropped_roots,
> >  					struct btrfs_root, root_list);
> >  		list_del_init(&root->root_list);
> > -		spin_unlock(&trans->dropped_roots_lock);
> > +		spin_unlock(&cur_trans->dropped_roots_lock);
> > +		btrfs_free_log(trans, root);
> 
> THis patch should really have been this line and converting
> switch_commit_roots to taking a trans handle another patch. Otherwise
> this is lost in the mechanical refactoring.

Agreed, as this is a bugfix, we want it backported to older threes so
minimizing the potential conflicts is desired. For that the order 1)
fix 2) cleanup, seems appropriate.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] Various fixes
  2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
                   ` (4 preceding siblings ...)
  2019-12-06 14:37 ` [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root Josef Bacik
@ 2019-12-09 18:16 ` David Sterba
  5 siblings, 0 replies; 24+ messages in thread
From: David Sterba @ 2019-12-09 18:16 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Fri, Dec 06, 2019 at 09:37:13AM -0500, Josef Bacik wrote:
> These were discovered while reworking how we handle root refs.  They are all
> relatively straightforward and mostly deal with error cases, with the exception
> of
> 
> [PATCH 1/5] btrfs: drop log root for dropped roots
> [PATCH 4/5] btrfs: skip log replay on orphaned roots
> 
> These two are pretty important and were uncovered with my fsstress patch.  The
> first fixes a space leak in the case that we delete a subvol that has a tree log
> attached to it.  The leak does not persist across mounts so it's not too bad,
> but still pretty important.  The second patch I've only seen in production once
> in the last 90 days, but could keep us from mounting if we have a subvol that
> was deleted with a tree log that we didn't finish deleting.
> 
> The rest of these are just for various error conditions and are less important,
> but should be safe enough to send along now if desired.  Thanks,

2-5 added to misc-next, 1 has some comments. Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-06 15:03   ` Nikolay Borisov
  2019-12-09 17:39     ` David Sterba
@ 2019-12-10 20:05     ` Josef Bacik
  2019-12-10 21:19       ` Nikolay Borisov
  1 sibling, 1 reply; 24+ messages in thread
From: Josef Bacik @ 2019-12-10 20:05 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 12/6/19 10:03 AM, Nikolay Borisov wrote:
> 
> 
> On 6.12.19 г. 16:37 ч., Josef Bacik wrote:
>> If we fsync on a subvolume and create a log root for that volume, and
>> then later delete that subvolume we'll never clean up its log root.  Fix
>> this by making switch_commit_roots free the log for any dropped roots we
>> encounter.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/transaction.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index cfc08ef9b876..55d8fd68775a 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
>>   	}
>>   }
>>   
>> -static noinline void switch_commit_roots(struct btrfs_transaction *trans)
>> +static noinline void switch_commit_roots(struct btrfs_trans_handle *trans)
>>   {
>> +	struct btrfs_transaction *cur_trans = trans->transaction;
>>   	struct btrfs_fs_info *fs_info = trans->fs_info;
>>   	struct btrfs_root *root, *tmp;
>>   
>>   	down_write(&fs_info->commit_root_sem);
>> -	list_for_each_entry_safe(root, tmp, &trans->switch_commits,
>> +	list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
>>   				 dirty_list) {
>>   		list_del_init(&root->dirty_list);
>>   		free_extent_buffer(root->commit_root);
>> @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans)
>>   	}
>>   
>>   	/* We can free old roots now. */
>> -	spin_lock(&trans->dropped_roots_lock);
>> -	while (!list_empty(&trans->dropped_roots)) {
>> -		root = list_first_entry(&trans->dropped_roots,
>> +	spin_lock(&cur_trans->dropped_roots_lock);
>> +	while (!list_empty(&cur_trans->dropped_roots)) {
>> +		root = list_first_entry(&cur_trans->dropped_roots,
>>   					struct btrfs_root, root_list);
>>   		list_del_init(&root->root_list);
>> -		spin_unlock(&trans->dropped_roots_lock);
>> +		spin_unlock(&cur_trans->dropped_roots_lock);
>> +		btrfs_free_log(trans, root);
> 
> THis patch should really have been this line and converting
> switch_commit_roots to taking a trans handle another patch. Otherwise
> this is lost in the mechanical refactoring.
> 

We need the trans handle to even call btrfs_free_log, we're just fixing it so 
the trans handle can be passed in, making its separate is just superfluous.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-10 20:05     ` Josef Bacik
@ 2019-12-10 21:19       ` Nikolay Borisov
  2019-12-10 21:28         ` Josef Bacik
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-12-10 21:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 10.12.19 г. 22:05 ч., Josef Bacik wrote:
> On 12/6/19 10:03 AM, Nikolay Borisov wrote:
>>
>>
>> On 6.12.19 г. 16:37 ч., Josef Bacik wrote:
>>> If we fsync on a subvolume and create a log root for that volume, and
>>> then later delete that subvolume we'll never clean up its log root.  Fix
>>> this by making switch_commit_roots free the log for any dropped roots we
>>> encounter.
>>>
>>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>>> ---
>>>   fs/btrfs/transaction.c | 22 ++++++++++++----------
>>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>> index cfc08ef9b876..55d8fd68775a 100644
>>> --- a/fs/btrfs/transaction.c
>>> +++ b/fs/btrfs/transaction.c
>>> @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct
>>> btrfs_transaction *transaction)
>>>       }
>>>   }
>>>   -static noinline void switch_commit_roots(struct btrfs_transaction
>>> *trans)
>>> +static noinline void switch_commit_roots(struct btrfs_trans_handle
>>> *trans)
>>>   {
>>> +    struct btrfs_transaction *cur_trans = trans->transaction;
>>>       struct btrfs_fs_info *fs_info = trans->fs_info;
>>>       struct btrfs_root *root, *tmp;
>>>         down_write(&fs_info->commit_root_sem);
>>> -    list_for_each_entry_safe(root, tmp, &trans->switch_commits,
>>> +    list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
>>>                    dirty_list) {
>>>           list_del_init(&root->dirty_list);
>>>           free_extent_buffer(root->commit_root);
>>> @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct
>>> btrfs_transaction *trans)
>>>       }
>>>         /* We can free old roots now. */
>>> -    spin_lock(&trans->dropped_roots_lock);
>>> -    while (!list_empty(&trans->dropped_roots)) {
>>> -        root = list_first_entry(&trans->dropped_roots,
>>> +    spin_lock(&cur_trans->dropped_roots_lock);
>>> +    while (!list_empty(&cur_trans->dropped_roots)) {
>>> +        root = list_first_entry(&cur_trans->dropped_roots,
>>>                       struct btrfs_root, root_list);
>>>           list_del_init(&root->root_list);
>>> -        spin_unlock(&trans->dropped_roots_lock);
>>> +        spin_unlock(&cur_trans->dropped_roots_lock);
>>> +        btrfs_free_log(trans, root);
>>
>> THis patch should really have been this line and converting
>> switch_commit_roots to taking a trans handle another patch. Otherwise
>> this is lost in the mechanical refactoring.
>>
> 
> We need the trans handle to even call btrfs_free_log, we're just fixing
> it so the trans handle can be passed in, making its separate is just
> superfluous.  Thanks,

Actually no because callees handle the case when trans is not passed
(i.e. walk_log_tree and walk_(up|down)_log_tree. If passing valid
trances changes the call logic then this needs to be explained in the
changelog. And there is currently one caller calling that function
without a trans - btrfs_drop_and_free_fs_root in BTRFS_FS_STATE_ERROR case.

> 
> Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-10 21:19       ` Nikolay Borisov
@ 2019-12-10 21:28         ` Josef Bacik
  2019-12-13 15:17           ` David Sterba
  0 siblings, 1 reply; 24+ messages in thread
From: Josef Bacik @ 2019-12-10 21:28 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 12/10/19 4:19 PM, Nikolay Borisov wrote:
> 
> 
> On 10.12.19 г. 22:05 ч., Josef Bacik wrote:
>> On 12/6/19 10:03 AM, Nikolay Borisov wrote:
>>>
>>>
>>> On 6.12.19 г. 16:37 ч., Josef Bacik wrote:
>>>> If we fsync on a subvolume and create a log root for that volume, and
>>>> then later delete that subvolume we'll never clean up its log root.  Fix
>>>> this by making switch_commit_roots free the log for any dropped roots we
>>>> encounter.
>>>>
>>>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>>>> ---
>>>>    fs/btrfs/transaction.c | 22 ++++++++++++----------
>>>>    1 file changed, 12 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>>> index cfc08ef9b876..55d8fd68775a 100644
>>>> --- a/fs/btrfs/transaction.c
>>>> +++ b/fs/btrfs/transaction.c
>>>> @@ -147,13 +147,14 @@ void btrfs_put_transaction(struct
>>>> btrfs_transaction *transaction)
>>>>        }
>>>>    }
>>>>    -static noinline void switch_commit_roots(struct btrfs_transaction
>>>> *trans)
>>>> +static noinline void switch_commit_roots(struct btrfs_trans_handle
>>>> *trans)
>>>>    {
>>>> +    struct btrfs_transaction *cur_trans = trans->transaction;
>>>>        struct btrfs_fs_info *fs_info = trans->fs_info;
>>>>        struct btrfs_root *root, *tmp;
>>>>          down_write(&fs_info->commit_root_sem);
>>>> -    list_for_each_entry_safe(root, tmp, &trans->switch_commits,
>>>> +    list_for_each_entry_safe(root, tmp, &cur_trans->switch_commits,
>>>>                     dirty_list) {
>>>>            list_del_init(&root->dirty_list);
>>>>            free_extent_buffer(root->commit_root);
>>>> @@ -165,16 +166,17 @@ static noinline void switch_commit_roots(struct
>>>> btrfs_transaction *trans)
>>>>        }
>>>>          /* We can free old roots now. */
>>>> -    spin_lock(&trans->dropped_roots_lock);
>>>> -    while (!list_empty(&trans->dropped_roots)) {
>>>> -        root = list_first_entry(&trans->dropped_roots,
>>>> +    spin_lock(&cur_trans->dropped_roots_lock);
>>>> +    while (!list_empty(&cur_trans->dropped_roots)) {
>>>> +        root = list_first_entry(&cur_trans->dropped_roots,
>>>>                        struct btrfs_root, root_list);
>>>>            list_del_init(&root->root_list);
>>>> -        spin_unlock(&trans->dropped_roots_lock);
>>>> +        spin_unlock(&cur_trans->dropped_roots_lock);
>>>> +        btrfs_free_log(trans, root);
>>>
>>> THis patch should really have been this line and converting
>>> switch_commit_roots to taking a trans handle another patch. Otherwise
>>> this is lost in the mechanical refactoring.
>>>
>>
>> We need the trans handle to even call btrfs_free_log, we're just fixing
>> it so the trans handle can be passed in, making its separate is just
>> superfluous.  Thanks,
> 
> Actually no because callees handle the case when trans is not passed
> (i.e. walk_log_tree and walk_(up|down)_log_tree. If passing valid
> trances changes the call logic then this needs to be explained in the
> changelog. And there is currently one caller calling that function
> without a trans - btrfs_drop_and_free_fs_root in BTRFS_FS_STATE_ERROR case.
> 

Yeah, it's clear that NULL is used only in the error case.  I'm not going to 
explain the entirety of how the log tree works in a basic fix for not freeing up 
a tree log when we should be doing it.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/5] btrfs: drop log root for dropped roots
  2019-12-10 21:28         ` Josef Bacik
@ 2019-12-13 15:17           ` David Sterba
  0 siblings, 0 replies; 24+ messages in thread
From: David Sterba @ 2019-12-13 15:17 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Nikolay Borisov, linux-btrfs, kernel-team

On Tue, Dec 10, 2019 at 04:28:17PM -0500, Josef Bacik wrote:
> >>>> If we fsync on a subvolume and create a log root for that volume, and
> >>>> then later delete that subvolume we'll never clean up its log root.  Fix
> >>>> this by making switch_commit_roots free the log for any dropped roots we
> >>>> encounter.
> >>>> +        btrfs_free_log(trans, root);
> >>>
> >>> THis patch should really have been this line and converting
> >>> switch_commit_roots to taking a trans handle another patch. Otherwise
> >>> this is lost in the mechanical refactoring.
> >>
> >> We need the trans handle to even call btrfs_free_log, we're just fixing
> >> it so the trans handle can be passed in, making its separate is just
> >> superfluous.  Thanks,
> > 
> > Actually no because callees handle the case when trans is not passed
> > (i.e. walk_log_tree and walk_(up|down)_log_tree. If passing valid
> > trances changes the call logic then this needs to be explained in the
> > changelog. And there is currently one caller calling that function
> > without a trans - btrfs_drop_and_free_fs_root in BTRFS_FS_STATE_ERROR case.
> 
> Yeah, it's clear that NULL is used only in the error case.  I'm not going to 
> explain the entirety of how the log tree works in a basic fix for not freeing up 
> a tree log when we should be doing it.  Thanks,

Sure you can at least note that the parameter type needs to be changed,
so we don't have to spend too much time looking for the real fix.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-12-13 20:38 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-06 14:37 [PATCH 0/5] Various fixes Josef Bacik
2019-12-06 14:37 ` [PATCH 1/5] btrfs: drop log root for dropped roots Josef Bacik
2019-12-06 15:02   ` Filipe Manana
2019-12-06 15:03   ` Nikolay Borisov
2019-12-09 17:39     ` David Sterba
2019-12-10 20:05     ` Josef Bacik
2019-12-10 21:19       ` Nikolay Borisov
2019-12-10 21:28         ` Josef Bacik
2019-12-13 15:17           ` David Sterba
2019-12-06 14:37 ` [PATCH 2/5] btrfs: don't BUG_ON in create_subvol Josef Bacik
2019-12-06 15:03   ` Filipe Manana
2019-12-09 10:49   ` Johannes Thumshirn
2019-12-06 14:37 ` [PATCH 3/5] btrfs: handle ENOENT in btrfs_uuid_tree_iterate Josef Bacik
2019-12-06 15:13   ` Filipe Manana
2019-12-06 15:17     ` Josef Bacik
2019-12-06 16:39   ` [PATCH 3/5][v2] " Josef Bacik
2019-12-06 16:46     ` Filipe Manana
2019-12-09 10:52     ` Johannes Thumshirn
2019-12-06 14:37 ` [PATCH 4/5] btrfs: skip log replay on orphaned roots Josef Bacik
2019-12-06 15:23   ` Filipe Manana
2019-12-06 14:37 ` [PATCH 5/5] btrfs: do not leak reloc root if we fail to read the fs root Josef Bacik
2019-12-06 15:24   ` Filipe Manana
2019-12-09 10:58   ` Johannes Thumshirn
2019-12-09 18:16 ` [PATCH 0/5] Various fixes David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.