linux-btrfs.vger.kernel.org archive mirror
* [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting
@ 2021-12-15 20:43 Josef Bacik
  2021-12-15 20:43 ` [PATCH 1/9] btrfs: remove BUG_ON(ret) in alloc_reserved_tree_block Josef Bacik
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

Hello,

This is the kernel side of the support for the GC trees and no longer tracking
metadata reference counts.

For the GC tree we're only implementing offloading the truncate to the GC tree
for now.  As new support is added we'll add code for the garbage collection for
each of the new operations.  Truncate was picked because it's simple enough to
do, gets us a nice latency win on normal workloads, and is a quick way to
validate that the GC tree is doing what it's supposed to.

This also disables the reference counting of metadata blocks.  Snapshotting and
everything reference counting related to metadata has been disabled, and will be
turned back on as the code needed to support those operations is added back.

This survives xfstests without blowing up.  Thanks,

Josef

Josef Bacik (9):
  btrfs: remove BUG_ON(ret) in alloc_reserved_tree_block
  btrfs: add a alloc_reserved_extent helper
  btrfs: remove `last_ref` from the extent freeing code
  btrfs: add a do_free_extent_accounting helper
  btrfs: don't do backref modification for metadata for extent tree v2
  btrfs: add definitions and read support for the garbage collection
    tree
  btrfs: add a btrfs_first_item helper
  btrfs: turn evict_refill_and_join into a real helper
  btrfs: add garbage collection tree support

 fs/btrfs/Makefile               |   2 +-
 fs/btrfs/ctree.c                |  23 ++++
 fs/btrfs/ctree.h                |  11 +-
 fs/btrfs/disk-io.c              |  14 +-
 fs/btrfs/extent-tree.c          | 154 +++++++++++-----------
 fs/btrfs/gc-tree.c              | 223 ++++++++++++++++++++++++++++++++
 fs/btrfs/gc-tree.h              |  15 +++
 fs/btrfs/inode.c                |  65 +++-------
 fs/btrfs/print-tree.c           |   4 +
 fs/btrfs/space-info.c           |   4 +-
 fs/btrfs/transaction.c          |  52 ++++++++
 fs/btrfs/transaction.h          |   2 +
 include/uapi/linux/btrfs_tree.h |   6 +
 13 files changed, 441 insertions(+), 134 deletions(-)
 create mode 100644 fs/btrfs/gc-tree.c
 create mode 100644 fs/btrfs/gc-tree.h

-- 
2.26.3



* [PATCH 1/9] btrfs: remove BUG_ON(ret) in alloc_reserved_tree_block
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 2/9] btrfs: add a alloc_reserved_extent helper Josef Bacik
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

Switch this to an ASSERT() and return the error in the normal case.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8cb67df5acef..3715ee1f0a08 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4762,9 +4762,10 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	ret = btrfs_update_block_group(trans, extent_key.objectid,
 				       fs_info->nodesize, true);
 	if (ret) { /* -ENOENT, logic error */
+		ASSERT(!ret);
 		btrfs_err(fs_info, "update block group failed for %llu %llu",
 			extent_key.objectid, extent_key.offset);
-		BUG();
+		return ret;
 	}
 
 	trace_btrfs_reserved_extent_alloc(fs_info, extent_key.objectid,
-- 
2.26.3
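
The error-handling shape in patch 1/9 (assert in debug builds, but
still log and return the error otherwise) can be sketched in userspace
C.  This is only an illustration under stated assumptions, not the
kernel code: DEBUG_BUILD, MY_ASSERT(), update_group() and alloc_block()
are hypothetical stand-ins for a debug config option, ASSERT(),
btrfs_update_block_group() and alloc_reserved_tree_block().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for ASSERT(): only active in debug builds. */
#ifdef DEBUG_BUILD
#define MY_ASSERT(expr) assert(expr)
#else
#define MY_ASSERT(expr) ((void)0)
#endif

/* Hypothetical stand-in for btrfs_update_block_group(); fails on request. */
static int update_group(int should_fail)
{
	return should_fail ? -ENOENT : 0;
}

/* Mirrors the shape of the patched error path: assert, log, return. */
static int alloc_block(int should_fail)
{
	int ret = update_group(should_fail);

	if (ret) {
		MY_ASSERT(!ret);	/* trips only when DEBUG_BUILD is defined */
		fprintf(stderr, "update block group failed: %d\n", ret);
		return ret;		/* the old code called BUG() here */
	}
	return 0;
}
```

Without DEBUG_BUILD defined, alloc_block(1) logs the failure and hands
-ENOENT back to the caller instead of halting.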



* [PATCH 2/9] btrfs: add a alloc_reserved_extent helper
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
  2021-12-15 20:43 ` [PATCH 1/9] btrfs: remove BUG_ON(ret) in alloc_reserved_tree_block Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 3/9] btrfs: remove `last_ref` from the extent freeing code Josef Bacik
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

We duplicate this logic for both data and metadata.  At this point we've
already done our type-specific extent root operations; this is just
doing the accounting and removing the space from the free space tree.
Extract this common logic out into a helper.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 56 ++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 32 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3715ee1f0a08..832cbcd52fea 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4605,6 +4605,28 @@ int btrfs_pin_reserved_extent(struct btrfs_trans_handle *trans, u64 start,
 	return ret;
 }
 
+static int alloc_reserved_extent(struct btrfs_trans_handle *trans, u64 bytenr,
+				 u64 num_bytes)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	int ret;
+
+	ret = remove_from_free_space_tree(trans, bytenr, num_bytes);
+	if (ret)
+		return ret;
+
+	ret = btrfs_update_block_group(trans, bytenr, num_bytes, true);
+	if (ret) {
+		ASSERT(!ret);
+		btrfs_err(fs_info, "update block group failed for %llu %llu",
+			  bytenr, num_bytes);
+		return ret;
+	}
+
+	trace_btrfs_reserved_extent_alloc(fs_info, bytenr, num_bytes);
+	return 0;
+}
+
 static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
 				      u64 parent, u64 root_objectid,
 				      u64 flags, u64 owner, u64 offset,
@@ -4665,18 +4687,7 @@ static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
 
-	ret = remove_from_free_space_tree(trans, ins->objectid, ins->offset);
-	if (ret)
-		return ret;
-
-	ret = btrfs_update_block_group(trans, ins->objectid, ins->offset, true);
-	if (ret) { /* -ENOENT, logic error */
-		btrfs_err(fs_info, "update block group failed for %llu %llu",
-			ins->objectid, ins->offset);
-		BUG();
-	}
-	trace_btrfs_reserved_extent_alloc(fs_info, ins->objectid, ins->offset);
-	return ret;
+	return alloc_reserved_extent(trans, ins->objectid, ins->offset);
 }
 
 static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
@@ -4694,7 +4705,6 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	struct extent_buffer *leaf;
 	struct btrfs_delayed_tree_ref *ref;
 	u32 size = sizeof(*extent_item) + sizeof(*iref);
-	u64 num_bytes;
 	u64 flags = extent_op->flags_to_set;
 	bool skinny_metadata = btrfs_fs_incompat(fs_info, SKINNY_METADATA);
 
@@ -4704,12 +4714,10 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	if (skinny_metadata) {
 		extent_key.offset = ref->level;
 		extent_key.type = BTRFS_METADATA_ITEM_KEY;
-		num_bytes = fs_info->nodesize;
 	} else {
 		extent_key.offset = node->num_bytes;
 		extent_key.type = BTRFS_EXTENT_ITEM_KEY;
 		size += sizeof(*block_info);
-		num_bytes = node->num_bytes;
 	}
 
 	path = btrfs_alloc_path();
@@ -4754,23 +4762,7 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
 
-	ret = remove_from_free_space_tree(trans, extent_key.objectid,
-					  num_bytes);
-	if (ret)
-		return ret;
-
-	ret = btrfs_update_block_group(trans, extent_key.objectid,
-				       fs_info->nodesize, true);
-	if (ret) { /* -ENOENT, logic error */
-		ASSERT(!ret);
-		btrfs_err(fs_info, "update block group failed for %llu %llu",
-			extent_key.objectid, extent_key.offset);
-		return ret;
-	}
-
-	trace_btrfs_reserved_extent_alloc(fs_info, extent_key.objectid,
-					  fs_info->nodesize);
-	return ret;
+	return alloc_reserved_extent(trans, node->bytenr, fs_info->nodesize);
 }
 
 int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
-- 
2.26.3



* [PATCH 3/9] btrfs: remove `last_ref` from the extent freeing code
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
  2021-12-15 20:43 ` [PATCH 1/9] btrfs: remove BUG_ON(ret) in alloc_reserved_tree_block Josef Bacik
  2021-12-15 20:43 ` [PATCH 2/9] btrfs: add a alloc_reserved_extent helper Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 4/9] btrfs: add a do_free_extent_accounting helper Josef Bacik
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

This is a remnant of the work I did for qgroups a long time ago to only
run for a block when we had dropped the last ref.  We haven't done that
for years, but the code remains.  Drop this remnant.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 832cbcd52fea..4bd238ae0753 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -598,7 +598,7 @@ static noinline int insert_extent_data_ref(struct btrfs_trans_handle *trans,
 static noinline int remove_extent_data_ref(struct btrfs_trans_handle *trans,
 					   struct btrfs_root *root,
 					   struct btrfs_path *path,
-					   int refs_to_drop, int *last_ref)
+					   int refs_to_drop)
 {
 	struct btrfs_key key;
 	struct btrfs_extent_data_ref *ref1 = NULL;
@@ -631,7 +631,6 @@ static noinline int remove_extent_data_ref(struct btrfs_trans_handle *trans,
 
 	if (num_refs == 0) {
 		ret = btrfs_del_item(trans, root, path);
-		*last_ref = 1;
 	} else {
 		if (key.type == BTRFS_EXTENT_DATA_REF_KEY)
 			btrfs_set_extent_data_ref_count(leaf, ref1, num_refs);
@@ -1072,8 +1071,7 @@ static noinline_for_stack
 void update_inline_extent_backref(struct btrfs_path *path,
 				  struct btrfs_extent_inline_ref *iref,
 				  int refs_to_mod,
-				  struct btrfs_delayed_extent_op *extent_op,
-				  int *last_ref)
+				  struct btrfs_delayed_extent_op *extent_op)
 {
 	struct extent_buffer *leaf = path->nodes[0];
 	struct btrfs_extent_item *ei;
@@ -1121,7 +1119,6 @@ void update_inline_extent_backref(struct btrfs_path *path,
 		else
 			btrfs_set_shared_data_ref_count(leaf, sref, refs);
 	} else {
-		*last_ref = 1;
 		size =  btrfs_extent_inline_ref_size(type);
 		item_size = btrfs_item_size(leaf, path->slots[0]);
 		ptr = (unsigned long)iref;
@@ -1167,7 +1164,7 @@ int insert_inline_extent_backref(struct btrfs_trans_handle *trans,
 			return -EUCLEAN;
 		}
 		update_inline_extent_backref(path, iref, refs_to_add,
-					     extent_op, NULL);
+					     extent_op);
 	} else if (ret == -ENOENT) {
 		setup_inline_extent_backref(trans->fs_info, path, iref, parent,
 					    root_objectid, owner, offset,
@@ -1181,21 +1178,17 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
 				 struct btrfs_root *root,
 				 struct btrfs_path *path,
 				 struct btrfs_extent_inline_ref *iref,
-				 int refs_to_drop, int is_data, int *last_ref)
+				 int refs_to_drop, int is_data)
 {
 	int ret = 0;
 
 	BUG_ON(!is_data && refs_to_drop != 1);
-	if (iref) {
-		update_inline_extent_backref(path, iref, -refs_to_drop, NULL,
-					     last_ref);
-	} else if (is_data) {
-		ret = remove_extent_data_ref(trans, root, path, refs_to_drop,
-					     last_ref);
-	} else {
-		*last_ref = 1;
+	if (iref)
+		update_inline_extent_backref(path, iref, -refs_to_drop, NULL);
+	else if (is_data)
+		ret = remove_extent_data_ref(trans, root, path, refs_to_drop);
+	else
 		ret = btrfs_del_item(trans, root, path);
-	}
 	return ret;
 }
 
@@ -2943,7 +2936,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	u64 refs;
 	u64 bytenr = node->bytenr;
 	u64 num_bytes = node->num_bytes;
-	int last_ref = 0;
 	bool skinny_metadata = btrfs_fs_incompat(info, SKINNY_METADATA);
 
 	extent_root = btrfs_extent_root(info, bytenr);
@@ -3010,8 +3002,7 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			}
 			/* Must be SHARED_* item, remove the backref first */
 			ret = remove_extent_backref(trans, extent_root, path,
-						    NULL, refs_to_drop, is_data,
-						    &last_ref);
+						    NULL, refs_to_drop, is_data);
 			if (ret) {
 				btrfs_abort_transaction(trans, ret);
 				goto out;
@@ -3136,8 +3127,7 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 		}
 		if (found_extent) {
 			ret = remove_extent_backref(trans, extent_root, path,
-						    iref, refs_to_drop, is_data,
-						    &last_ref);
+						    iref, refs_to_drop, is_data);
 			if (ret) {
 				btrfs_abort_transaction(trans, ret);
 				goto out;
@@ -3182,7 +3172,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			}
 		}
 
-		last_ref = 1;
 		ret = btrfs_del_items(trans, extent_root, path, path->slots[0],
 				      num_to_del);
 		if (ret) {
-- 
2.26.3



* [PATCH 4/9] btrfs: add a do_free_extent_accounting helper
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (2 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 3/9] btrfs: remove `last_ref` from the extent freeing code Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 5/9] btrfs: don't do backref modification for metadata for extent tree v2 Josef Bacik
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

__btrfs_free_extent() does all of the hard work of updating the extent
ref items, and then at the end if we dropped the extent completely it
does the cleanup accounting work.  We're going to only want to do that
work for metadata with extent tree v2, so extract this bit into its own
helper.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 53 ++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4bd238ae0753..0c1988a7f845 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2855,6 +2855,35 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans)
 	return 0;
 }
 
+static int do_free_extent_accounting(struct btrfs_trans_handle *trans,
+				     u64 bytenr, u64 num_bytes, bool is_data)
+{
+	int ret;
+
+	if (is_data) {
+		struct btrfs_root *csum_root;
+		csum_root = btrfs_csum_root(trans->fs_info, bytenr);
+		ret = btrfs_del_csums(trans, csum_root, bytenr,
+				      num_bytes);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			return ret;
+		}
+	}
+
+	ret = add_to_free_space_tree(trans, bytenr, num_bytes);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
+
+	ret = btrfs_update_block_group(trans, bytenr, num_bytes, false);
+	if (ret)
+		btrfs_abort_transaction(trans, ret);
+
+	return ret;
+}
+
 /*
  * Drop one or more refs of @node.
  *
@@ -3180,28 +3209,8 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 		}
 		btrfs_release_path(path);
 
-		if (is_data) {
-			struct btrfs_root *csum_root;
-			csum_root = btrfs_csum_root(info, bytenr);
-			ret = btrfs_del_csums(trans, csum_root, bytenr,
-					      num_bytes);
-			if (ret) {
-				btrfs_abort_transaction(trans, ret);
-				goto out;
-			}
-		}
-
-		ret = add_to_free_space_tree(trans, bytenr, num_bytes);
-		if (ret) {
-			btrfs_abort_transaction(trans, ret);
-			goto out;
-		}
-
-		ret = btrfs_update_block_group(trans, bytenr, num_bytes, false);
-		if (ret) {
-			btrfs_abort_transaction(trans, ret);
-			goto out;
-		}
+		ret = do_free_extent_accounting(trans, bytenr, num_bytes,
+						is_data);
 	}
 	btrfs_release_path(path);
 
-- 
2.26.3



* [PATCH 5/9] btrfs: don't do backref modification for metadata for extent tree v2
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (3 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 4/9] btrfs: add a do_free_extent_accounting helper Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 6/9] btrfs: add definitions and read support for the garbage collection tree Josef Bacik
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

For extent tree v2 we will no longer track references for metadata in
the extent tree.  Make changes at the alloc and free sides so the proper
accounting is done but skip the extent tree modification parts.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0c1988a7f845..369489394660 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2957,7 +2957,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	struct btrfs_extent_item *ei;
 	struct btrfs_extent_inline_ref *iref;
 	int ret;
-	int is_data;
 	int extent_slot = 0;
 	int found_extent = 0;
 	int num_to_del = 1;
@@ -2966,6 +2965,11 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	u64 bytenr = node->bytenr;
 	u64 num_bytes = node->num_bytes;
 	bool skinny_metadata = btrfs_fs_incompat(info, SKINNY_METADATA);
+	bool is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
+
+	if (btrfs_fs_incompat(info, EXTENT_TREE_V2) && !is_data)
+		return do_free_extent_accounting(trans, bytenr, num_bytes,
+						 is_data);
 
 	extent_root = btrfs_extent_root(info, bytenr);
 	ASSERT(extent_root);
@@ -2974,8 +2978,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	if (!path)
 		return -ENOMEM;
 
-	is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
-
 	if (!is_data && refs_to_drop != 1) {
 		btrfs_crit(info,
 "invalid refs_to_drop, dropping more than 1 refs for tree block %llu refs_to_drop %u",
@@ -4706,6 +4708,9 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	u64 flags = extent_op->flags_to_set;
 	bool skinny_metadata = btrfs_fs_incompat(fs_info, SKINNY_METADATA);
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		goto out;
+
 	ref = btrfs_delayed_node_to_tree_ref(node);
 
 	extent_key.objectid = node->bytenr;
@@ -4759,7 +4764,7 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
-
+out:
 	return alloc_reserved_extent(trans, node->bytenr, fs_info->nodesize);
 }
 
-- 
2.26.3
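
The control flow of patch 5/9 can be sketched with a toy model: when
the v2 flag is set and the extent is metadata, the free path skips the
extent tree update and goes straight to the accounting.  Everything
here is hypothetical (struct toy_fs and the toy_* functions are not
btrfs structures or APIs), and the model assumes the extent had a
single reference so that freeing always reaches the accounting step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy filesystem state; none of these are btrfs fields. */
struct toy_fs {
	bool extent_tree_v2;	/* models the EXTENT_TREE_V2 incompat flag */
	int extent_items;	/* models per-extent items in the extent tree */
	uint64_t free_bytes;	/* models the free space tree */
};

/* Accounting that has to happen for every freed extent. */
static void toy_free_accounting(struct toy_fs *fs, uint64_t num_bytes)
{
	fs->free_bytes += num_bytes;
}

static void toy_free_extent(struct toy_fs *fs, uint64_t num_bytes, bool is_data)
{
	/* v2 metadata: no extent item was ever inserted, so only account. */
	if (fs->extent_tree_v2 && !is_data) {
		toy_free_accounting(fs, num_bytes);
		return;
	}

	/* Everything else still removes its extent item first. */
	fs->extent_items--;
	toy_free_accounting(fs, num_bytes);
}
```

The allocation side in the patch is symmetric: it jumps past the extent
item insertion and only calls the shared accounting helper.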



* [PATCH 6/9] btrfs: add definitions and read support for the garbage collection tree
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (4 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 5/9] btrfs: don't do backref modification for metadata for extent tree v2 Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 7/9] btrfs: add a btrfs_first_item helper Josef Bacik
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

This adds the on disk definitions for the garbage collection tree and
the code to load it on mount.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c              | 6 ++++++
 fs/btrfs/print-tree.c           | 4 ++++
 include/uapi/linux/btrfs_tree.h | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2a70f61345aa..98b37850d614 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2668,6 +2668,12 @@ static int load_global_roots(struct btrfs_root *tree_root)
 	ret = load_global_roots_objectid(tree_root, path,
 					 BTRFS_FREE_SPACE_TREE_OBJECTID,
 					 "free space");
+	if (ret)
+		goto out;
+	if (!btrfs_fs_incompat(tree_root->fs_info, EXTENT_TREE_V2))
+		goto out;
+	ret = load_global_roots_objectid(tree_root, path,
+					 BTRFS_GC_TREE_OBJECTID, "gc");
 out:
 	btrfs_free_path(path);
 	return ret;
diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index 524fdb0ddd74..7fa202105e97 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -24,6 +24,7 @@ static const struct root_name_map root_map[] = {
 	{ BTRFS_UUID_TREE_OBJECTID,		"UUID_TREE"		},
 	{ BTRFS_FREE_SPACE_TREE_OBJECTID,	"FREE_SPACE_TREE"	},
 	{ BTRFS_BLOCK_GROUP_TREE_OBJECTID,	"BLOCK_GROUP_TREE"	},
+	{ BTRFS_GC_TREE_OBJECTID,		"GC_TREE"		},
 	{ BTRFS_DATA_RELOC_TREE_OBJECTID,	"DATA_RELOC_TREE"	},
 };
 
@@ -348,6 +349,9 @@ void btrfs_print_leaf(struct extent_buffer *l)
 			print_uuid_item(l, btrfs_item_ptr_offset(l, i),
 					btrfs_item_size(l, i));
 			break;
+		case BTRFS_GC_INODE_ITEM_KEY:
+			pr_info("\t\tgc inode item\n");
+			break;
 		}
 	}
 }
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 854df92520a1..690b01e0138b 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -56,6 +56,9 @@
 /* holds the block group items for extent tree v2. */
 #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
 
+/* holds the garbage collection items for extent tree v2. */
+#define BTRFS_GC_TREE_OBJECTID 12ULL
+
 /* device stats in the device tree */
 #define BTRFS_DEV_STATS_OBJECTID 0ULL
 
@@ -147,6 +150,9 @@
 #define BTRFS_ORPHAN_ITEM_KEY		48
 /* reserve 2-15 close to the inode for later flexibility */
 
+/* The garbage collection items. */
+#define BTRFS_GC_INODE_ITEM_KEY		49
+
 /*
  * dir items are the name -> inode pointers in a directory.  There is one
  * for every name in a directory.  BTRFS_DIR_LOG_ITEM_KEY is no longer used
-- 
2.26.3



* [PATCH 7/9] btrfs: add a btrfs_first_item helper
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (5 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 6/9] btrfs: add definitions and read support for the garbage collection tree Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 8/9] btrfs: turn evict_refill_and_join into a real helper Josef Bacik
  2021-12-15 20:43 ` [PATCH 9/9] btrfs: add garbage collection tree support Josef Bacik
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

The GC tree stuff is going to use this helper and it'll make the code a
bit cleaner to abstract this into a helper.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c | 23 +++++++++++++++++++++++
 fs/btrfs/ctree.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 781537692a4a..efb413a6db0c 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -4775,3 +4775,26 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
 	}
 	return 1;
 }
+
+/**
+ * btrfs_first_item - search the given root for the first item.
+ * @root: the root to search.
+ * @path: the path to use for the search.
+ * @return: 0 if it found something, 1 if nothing was found and < 0 on error.
+ *
+ * Search down and find the first item in a tree.  If the root is empty return
+ * 1, otherwise we'll return 0 or < 0 if there was an error.
+ */
+int btrfs_first_item(struct btrfs_root *root, struct btrfs_path *path)
+{
+	struct btrfs_key key = {};
+	int ret;
+
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret > 0) {
+		if (btrfs_header_nritems(path->nodes[0]) == 0)
+			return 1;
+		ret = 0;
+	}
+	return ret;
+}
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2a5ed393eb21..6bcf112f9872 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2892,6 +2892,7 @@ void btrfs_wait_for_snapshot_creation(struct btrfs_root *root);
 int btrfs_bin_search(struct extent_buffer *eb, const struct btrfs_key *key,
 		     int *slot);
 int __pure btrfs_comp_cpu_keys(const struct btrfs_key *k1, const struct btrfs_key *k2);
+int btrfs_first_item(struct btrfs_root *root, struct btrfs_path *path);
 int btrfs_previous_item(struct btrfs_root *root,
 			struct btrfs_path *path, u64 min_objectid,
 			int type);
-- 
2.26.3
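
The return-value handling in btrfs_first_item() from patch 7/9 can be
sketched on a toy sorted array.  Searching for the smallest possible
key either hits it exactly (0) or reports "not found" (1) with the slot
pointing at where it would be inserted, which is the first item unless
the leaf is empty.  struct toy_leaf, toy_search_slot() and
toy_first_item() are hypothetical names for this illustration; error
returns (< 0) are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy leaf: a sorted array of keys, not an extent buffer. */
struct toy_leaf {
	const uint64_t *keys;
	size_t nritems;
};

/*
 * Toy search using the convention the helper relies on: 0 on an exact
 * match, 1 if the key is absent (*slot is where it would be inserted).
 */
static int toy_search_slot(const struct toy_leaf *leaf, uint64_t key,
			   size_t *slot)
{
	size_t lo = 0, hi = leaf->nritems;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (leaf->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	*slot = lo;
	return (lo < leaf->nritems && leaf->keys[lo] == key) ? 0 : 1;
}

/* Returns 0 with *slot at the first item, or 1 if the leaf is empty. */
static int toy_first_item(const struct toy_leaf *leaf, size_t *slot)
{
	int ret = toy_search_slot(leaf, 0, slot);

	if (ret > 0) {
		if (leaf->nritems == 0)
			return 1;
		ret = 0;	/* "not found" still left us on the first item */
	}
	return ret;
}
```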



* [PATCH 8/9] btrfs: turn evict_refill_and_join into a real helper
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (6 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 7/9] btrfs: add a btrfs_first_item helper Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  2021-12-15 20:43 ` [PATCH 9/9] btrfs: add garbage collection tree support Josef Bacik
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC
  To: linux-btrfs, kernel-team

We are going to be using this same mechanism for garbage collection as
evict uses.  Rename the flush state to reflect the role it will play in
GC from now on, move the helper to transaction.c, rename it, and make
it public.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h       |  2 +-
 fs/btrfs/inode.c       | 52 ++----------------------------------------
 fs/btrfs/space-info.c  |  4 ++--
 fs/btrfs/transaction.c | 49 +++++++++++++++++++++++++++++++++++++++
 fs/btrfs/transaction.h |  2 ++
 5 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6bcf112f9872..720ea66e37c1 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2830,7 +2830,7 @@ enum btrfs_reserve_flush_enum {
 	 * - Running delalloc and waiting for ordered extents
 	 * - Allocating a new chunk
 	 */
-	BTRFS_RESERVE_FLUSH_EVICT,
+	BTRFS_RESERVE_FLUSH_GC,
 
 	/*
 	 * Flush space by above mentioned methods and by:
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3d590a96f5d0..cc4e077686c3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5147,54 +5147,6 @@ static void evict_inode_truncate_pages(struct inode *inode)
 	spin_unlock(&io_tree->lock);
 }
 
-static struct btrfs_trans_handle *evict_refill_and_join(struct btrfs_root *root,
-							struct btrfs_block_rsv *rsv)
-{
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct btrfs_trans_handle *trans;
-	u64 delayed_refs_extra = btrfs_calc_insert_metadata_size(fs_info, 1);
-	int ret;
-
-	/*
-	 * Eviction should be taking place at some place safe because of our
-	 * delayed iputs.  However the normal flushing code will run delayed
-	 * iputs, so we cannot use FLUSH_ALL otherwise we'll deadlock.
-	 *
-	 * We reserve the delayed_refs_extra here again because we can't use
-	 * btrfs_start_transaction(root, 0) for the same deadlocky reason as
-	 * above.  We reserve our extra bit here because we generate a ton of
-	 * delayed refs activity by truncating.
-	 *
-	 * BTRFS_RESERVE_FLUSH_EVICT will steal from the global_rsv if it can,
-	 * if we fail to make this reservation we can re-try without the
-	 * delayed_refs_extra so we can make some forward progress.
-	 */
-	ret = btrfs_block_rsv_refill(fs_info, rsv, rsv->size + delayed_refs_extra,
-				     BTRFS_RESERVE_FLUSH_EVICT);
-	if (ret) {
-		ret = btrfs_block_rsv_refill(fs_info, rsv, rsv->size,
-					     BTRFS_RESERVE_FLUSH_EVICT);
-		if (ret) {
-			btrfs_warn(fs_info,
-				   "could not allocate space for delete; will truncate on mount");
-			return ERR_PTR(-ENOSPC);
-		}
-		delayed_refs_extra = 0;
-	}
-
-	trans = btrfs_join_transaction(root);
-	if (IS_ERR(trans))
-		return trans;
-
-	if (delayed_refs_extra) {
-		trans->block_rsv = &fs_info->trans_block_rsv;
-		trans->bytes_reserved = delayed_refs_extra;
-		btrfs_block_rsv_migrate(rsv, trans->block_rsv,
-					delayed_refs_extra, 1);
-	}
-	return trans;
-}
-
 void btrfs_evict_inode(struct inode *inode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -5265,7 +5217,7 @@ void btrfs_evict_inode(struct inode *inode)
 			.min_type = 0,
 		};
 
-		trans = evict_refill_and_join(root, rsv);
+		trans = btrfs_gc_rsv_refill_and_join(root, rsv);
 		if (IS_ERR(trans))
 			goto free_rsv;
 
@@ -5290,7 +5242,7 @@ void btrfs_evict_inode(struct inode *inode)
 	 * If it turns out that we are dropping too many of these, we might want
 	 * to add a mechanism for retrying these after a commit.
 	 */
-	trans = evict_refill_and_join(root, rsv);
+	trans = btrfs_gc_rsv_refill_and_join(root, rsv);
 	if (!IS_ERR(trans)) {
 		trans->block_rsv = rsv;
 		btrfs_orphan_del(trans, BTRFS_I(inode));
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 79fe0ad17acf..5c4834b591cd 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1398,7 +1398,7 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						priority_flush_states,
 						ARRAY_SIZE(priority_flush_states));
 		break;
-	case BTRFS_RESERVE_FLUSH_EVICT:
+	case BTRFS_RESERVE_FLUSH_GC:
 		priority_reclaim_metadata_space(fs_info, space_info, ticket,
 						evict_flush_states,
 						ARRAY_SIZE(evict_flush_states));
@@ -1456,7 +1456,7 @@ static inline void maybe_clamp_preempt(struct btrfs_fs_info *fs_info,
 static inline bool can_steal(enum btrfs_reserve_flush_enum flush)
 {
 	return (flush == BTRFS_RESERVE_FLUSH_ALL_STEAL ||
-		flush == BTRFS_RESERVE_FLUSH_EVICT);
+		flush == BTRFS_RESERVE_FLUSH_GC);
 }
 
 /**
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 0b73b3ad1e57..5a5a72a32e76 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -857,6 +857,55 @@ static noinline void wait_for_commit(struct btrfs_transaction *commit,
 	wait_event(commit->commit_wait, commit->state >= min_state);
 }
 
+/**
+ * btrfs_gc_rsv_refill_and_join - refill a block rsv and join transaction for gc
+ * @root: the root we're modifying
+ * @rsv: the rsv we're refilling
+ * @return: trans handle with a refilled block_rsv
+ *
+ * Inode eviction or GC will be taking place somewhere safe because of either
+ * delayed iputs or the GC threads.  However the normal flushing behavior
+ * may want to wait on eviction or GC in order to reclaim some space.
+ *
+ * This refills the rsv, and also adds some extra for the delayed refs that may
+ * be generated by the operation.  If it cannot get the delayed refs reservation
+ * it'll reduce the reservation so we can possibly make progress.
+ *
+ * This will also steal from the global reserve if it needs to.
+ */
+struct btrfs_trans_handle *btrfs_gc_rsv_refill_and_join(struct btrfs_root *root,
+							struct btrfs_block_rsv *rsv)
+{
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct btrfs_trans_handle *trans;
+	u64 delayed_refs_extra = btrfs_calc_insert_metadata_size(fs_info, 1);
+	int ret;
+
+	ret = btrfs_block_rsv_refill(fs_info, rsv, rsv->size + delayed_refs_extra,
+				     BTRFS_RESERVE_FLUSH_GC);
+	if (ret) {
+		ret = btrfs_block_rsv_refill(fs_info, rsv, rsv->size,
+					     BTRFS_RESERVE_FLUSH_GC);
+		if (ret) {
+			btrfs_warn(fs_info,
+				   "could not allocate space for delete; will truncate on mount");
+			return ERR_PTR(-ENOSPC);
+		}
+		delayed_refs_extra = 0;
+	}
+
+	trans = btrfs_join_transaction(root);
+	if (IS_ERR(trans))
+		return trans;
+
+	if (delayed_refs_extra) {
+		trans->block_rsv = &fs_info->trans_block_rsv;
+		trans->bytes_reserved = delayed_refs_extra;
+		btrfs_block_rsv_migrate(rsv, trans->block_rsv,
+					delayed_refs_extra, 1);
+	}
+	return trans;
+}
 int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid)
 {
 	struct btrfs_transaction *cur_trans = NULL, *t;
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 1852ed9de7fd..2aac8aaeddba 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -232,5 +232,7 @@ void btrfs_apply_pending_changes(struct btrfs_fs_info *fs_info);
 void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
 			    struct btrfs_root *root);
 void btrfs_trans_release_chunk_metadata(struct btrfs_trans_handle *trans);
+struct btrfs_trans_handle *btrfs_gc_rsv_refill_and_join(struct btrfs_root *root,
+							struct btrfs_block_rsv *rsv);
 
 #endif
-- 
2.26.3
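
The two-step reservation in btrfs_gc_rsv_refill_and_join() can be modelled
outside the kernel.  The following is only a user-space sketch under stated
assumptions: toy_pool, toy_rsv, toy_refill() and toy_refill_and_join() are
hypothetical names invented for illustration, and the refill is treated as a
simple all-or-nothing top-up rather than the real space-info flushing logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a block reservation. */
struct toy_rsv {
	uint64_t size;		/* target size of the reservation */
	uint64_t reserved;	/* bytes currently held */
};

/* Hypothetical pool of free metadata space. */
struct toy_pool {
	uint64_t free_bytes;
};

/* Top up rsv to at least min_reserved, all or nothing; false models ENOSPC. */
static bool toy_refill(struct toy_pool *pool, struct toy_rsv *rsv,
		       uint64_t min_reserved)
{
	uint64_t need;

	if (rsv->reserved >= min_reserved)
		return true;
	need = min_reserved - rsv->reserved;
	if (need > pool->free_bytes)
		return false;
	pool->free_bytes -= need;
	rsv->reserved += need;
	return true;
}

/*
 * First ask for size + extra.  If that fails, fall back to size alone and
 * give up on the extra.  Returns -1 if even the fallback fails, otherwise
 * the number of bytes migrated into trans_rsv (possibly 0).
 */
static int64_t toy_refill_and_join(struct toy_pool *pool, struct toy_rsv *rsv,
				   struct toy_rsv *trans_rsv, uint64_t extra)
{
	if (!toy_refill(pool, rsv, rsv->size + extra)) {
		if (!toy_refill(pool, rsv, rsv->size))
			return -1;
		extra = 0;
	}
	/* Hand the extra over to the transaction's reservation. */
	rsv->reserved -= extra;
	trans_rsv->reserved += extra;
	return (int64_t)extra;
}
```

With 100 free bytes, a 60 byte rsv and 30 bytes of extra, the first attempt
succeeds and 30 bytes move to the transaction.  With only 70 free bytes the
fallback reserves 60 and migrates nothing.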



* [PATCH 9/9] btrfs: add garbage collection tree support
  2021-12-15 20:43 [PATCH 0/9] btrfs: extent-tree-v2, gc and no meta ref counting Josef Bacik
                   ` (7 preceding siblings ...)
  2021-12-15 20:43 ` [PATCH 8/9] btrfs: turn evict_refill_and_join into a real helper Josef Bacik
@ 2021-12-15 20:43 ` Josef Bacik
  8 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2021-12-15 20:43 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This patch adds support for loading the GC tree and for running the
inode garbage collection work.  Every time a transaction is committed
we kick off workers to process any items in the GC tree.  Currently the
only item type is the inode item, which handles deleting an inode's
items once the inode has been unlinked.
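
As a rough illustration of the scheduling described above, here is a
user-space sketch, not the kernel code.  It assumes an inode is mapped to a
GC root by its number modulo the number of roots, and that a per-root flag
keeps at most one worker active per root.  All toy_* names are hypothetical,
and C11 atomic_flag stands in for the kernel's test_and_set_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_NR_ROOTS 4

/* Hypothetical GC root; the flag plays the role of a "GC running" bit. */
struct toy_gc_root {
	atomic_flag gc_running;
};

/* Put every root into the "not running" state. */
static void toy_gc_init(struct toy_gc_root *roots, unsigned int nr_roots)
{
	for (unsigned int i = 0; i < nr_roots; i++)
		atomic_flag_clear(&roots[i].gc_running);
}

/* Spread inodes across the GC roots by inode number. */
static unsigned int toy_gc_root_index(uint64_t ino, unsigned int nr_roots)
{
	assert(nr_roots > 0);
	return (unsigned int)(ino % nr_roots);
}

/*
 * Called after each "commit": claim every root that has no worker yet.
 * Returns how many roots were newly claimed; a real implementation would
 * queue one work item per claimed root.
 */
static unsigned int toy_queue_gc_work(struct toy_gc_root *roots,
				      unsigned int nr_roots)
{
	unsigned int queued = 0;

	for (unsigned int i = 0; i < nr_roots; i++) {
		/* Already set means a worker is still busy on this root. */
		if (atomic_flag_test_and_set(&roots[i].gc_running))
			continue;
		queued++;
	}
	return queued;
}

/* A worker calls this when it has finished with its root. */
static void toy_gc_done(struct toy_gc_root *root)
{
	atomic_flag_clear(&root->gc_running);
}
```

Calling toy_queue_gc_work() twice in a row claims all roots the first time
and none the second time, until a worker calls toy_gc_done().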

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/Makefile      |   2 +-
 fs/btrfs/ctree.h       |   8 ++
 fs/btrfs/disk-io.c     |   8 +-
 fs/btrfs/gc-tree.c     | 223 +++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/gc-tree.h     |  15 +++
 fs/btrfs/inode.c       |  13 +++
 fs/btrfs/transaction.c |   3 +
 7 files changed, 270 insertions(+), 2 deletions(-)
 create mode 100644 fs/btrfs/gc-tree.c
 create mode 100644 fs/btrfs/gc-tree.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 3dcf9bcc2326..514f117d253c 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -30,7 +30,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
-	   subpage.o tree-mod-log.o
+	   subpage.o tree-mod-log.o gc-tree.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 720ea66e37c1..eb0715602948 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -855,6 +855,9 @@ struct btrfs_fs_info {
 	struct btrfs_workqueue *fixup_workers;
 	struct btrfs_workqueue *delayed_workers;
 
+	/* Used to run the GC work. */
+	struct btrfs_workqueue *gc_workers;
+
 	struct task_struct *transaction_kthread;
 	struct task_struct *cleaner_kthread;
 	u32 thread_pool_size;
@@ -1002,6 +1005,9 @@ struct btrfs_fs_info {
 
 	struct semaphore uuid_tree_rescan_sem;
 
+	/* Used to run GC in the background. */
+	struct work_struct gc_work;
+
 	/* Used to reclaim the metadata space in the background. */
 	struct work_struct async_reclaim_work;
 	struct work_struct async_data_reclaim_work;
@@ -1137,6 +1143,8 @@ enum {
 	BTRFS_ROOT_QGROUP_FLUSHING,
 	/* We started the orphan cleanup for this root. */
 	BTRFS_ROOT_ORPHAN_CLEANUP,
+	/* GC is happening on this root. */
+	BTRFS_ROOT_GC_RUNNING,
 };
 
 /*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 98b37850d614..aefe1edacd57 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2268,6 +2268,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 	 */
 	btrfs_destroy_workqueue(fs_info->endio_meta_workers);
 	btrfs_destroy_workqueue(fs_info->endio_meta_write_workers);
+	btrfs_destroy_workqueue(fs_info->gc_workers);
 }
 
 static void free_root_extent_buffers(struct btrfs_root *root)
@@ -2477,6 +2478,8 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
 	fs_info->discard_ctl.discard_workers =
 		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+	fs_info->gc_workers =
+		btrfs_alloc_workqueue(fs_info, "garbage-collect", flags, max_active, 1);
 
 	if (!(fs_info->workers && fs_info->delalloc_workers &&
 	      fs_info->flush_workers &&
@@ -2487,7 +2490,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 	      fs_info->caching_workers && fs_info->readahead_workers &&
 	      fs_info->fixup_workers && fs_info->delayed_workers &&
 	      fs_info->qgroup_rescan_workers &&
-	      fs_info->discard_ctl.discard_workers)) {
+	      fs_info->discard_ctl.discard_workers && fs_info->gc_workers)) {
 		return -ENOMEM;
 	}
 
@@ -4588,6 +4591,9 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	 */
 	kthread_park(fs_info->cleaner_kthread);
 
+	/* Wait for any queued gc work to finish. */
+	btrfs_flush_workqueue(fs_info->gc_workers);
+
 	/* wait for the qgroup rescan worker to stop */
 	btrfs_qgroup_wait_for_completion(fs_info, false);
 
diff --git a/fs/btrfs/gc-tree.c b/fs/btrfs/gc-tree.c
new file mode 100644
index 000000000000..7df7236f805c
--- /dev/null
+++ b/fs/btrfs/gc-tree.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "ctree.h"
+#include "gc-tree.h"
+#include "btrfs_inode.h"
+#include "disk-io.h"
+#include "transaction.h"
+#include "inode-item.h"
+
+struct gc_work {
+	struct btrfs_work work;
+	struct btrfs_root *root;
+};
+
+static struct btrfs_root *inode_gc_root(struct btrfs_inode *inode)
+{
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
+	struct btrfs_key key = {
+		.objectid = BTRFS_GC_TREE_OBJECTID,
+		.type = BTRFS_ROOT_ITEM_KEY,
+		.offset = btrfs_ino(inode) % fs_info->nr_global_roots,
+	};
+
+	return btrfs_global_root(fs_info, &key);
+}
+
+static int add_gc_item(struct btrfs_root *root, struct btrfs_key *key,
+		       struct btrfs_block_rsv *rsv)
+{
+	struct btrfs_path *path;
+	struct btrfs_trans_handle *trans;
+	int ret = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	trans = btrfs_gc_rsv_refill_and_join(root, rsv);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out;
+	}
+
+	trans->block_rsv = rsv;
+	ret = btrfs_insert_empty_item(trans, root, path, key, 0);
+	trans->block_rsv = &root->fs_info->trans_block_rsv;
+	btrfs_end_transaction(trans);
+out:
+	btrfs_free_path(path);
+	return ret;
+}
+
+static void delete_gc_item(struct btrfs_root *root, struct btrfs_path *path,
+			   struct btrfs_block_rsv *rsv, struct btrfs_key *key)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	trans = btrfs_gc_rsv_refill_and_join(root, rsv);
+	if (IS_ERR(trans))
+		return;
+
+	ret = btrfs_search_slot(trans, root, key, path, -1, 1);
+	if (ret > 0)
+		ret = -ENOENT;
+	if (!ret)
+		btrfs_del_item(trans, root, path);
+	/* Always release the path and end the transaction, even on error. */
+	btrfs_release_path(path);
+	btrfs_end_transaction(trans);
+}
+
+static int gc_inode(struct btrfs_fs_info *fs_info, struct btrfs_block_rsv *rsv,
+		    struct btrfs_key *key)
+{
+	struct btrfs_root *root = btrfs_get_fs_root(fs_info, key->objectid, true);
+	struct btrfs_trans_handle *trans;
+	int ret = 0;
+
+	if (IS_ERR(root)) {
+		ret = PTR_ERR(root);
+
+		/* We are deleting this subvolume, just delete the GC item for it. */
+		if (ret == -ENOENT)
+			return 0;
+
+		btrfs_err(fs_info, "failed to look up root during gc %llu: %d",
+			  key->objectid, ret);
+		return ret;
+	}
+
+	do {
+		struct btrfs_truncate_control control = {
+			.ino = key->offset,
+			.new_size = 0,
+			.min_type = 0,
+		};
+
+		trans = btrfs_gc_rsv_refill_and_join(root, rsv);
+		if (IS_ERR(trans)) {
+			ret = PTR_ERR(trans);
+			break;
+		}
+
+		trans->block_rsv = rsv;
+
+		ret = btrfs_truncate_inode_items(trans, root, &control);
+
+		trans->block_rsv = &fs_info->trans_block_rsv;
+		btrfs_end_transaction(trans);
+		btrfs_btree_balance_dirty(fs_info);
+	} while (ret == -ENOSPC || ret == -EAGAIN);
+
+	btrfs_put_root(root);
+	return ret;
+}
+
+static void gc_work_fn(struct btrfs_work *work)
+{
+	struct gc_work *gc_work = container_of(work, struct gc_work, work);
+	struct btrfs_root *root = gc_work->root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct btrfs_path *path;
+	struct btrfs_block_rsv *rsv;
+	int ret;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		goto out;
+
+	rsv = btrfs_alloc_block_rsv(fs_info, BTRFS_BLOCK_RSV_TEMP);
+	if (!rsv)
+		goto out_path;
+	rsv->size = btrfs_calc_metadata_size(fs_info, 1);
+	rsv->failfast = 1;
+
+	while (!btrfs_fs_closing(fs_info) &&
+	       !btrfs_first_item(root, path)) {
+		struct btrfs_key key;
+
+		btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+		btrfs_release_path(path);
+
+		switch (key.type) {
+		case BTRFS_GC_INODE_ITEM_KEY:
+			ret = gc_inode(root->fs_info, rsv, &key);
+			break;
+		default:
+			ASSERT(0);
+			ret = -EINVAL;
+			break;
+		}
+
+		if (!ret)
+			delete_gc_item(root, path, rsv, &key);
+	}
+	btrfs_free_block_rsv(fs_info, rsv);
+out_path:
+	btrfs_free_path(path);
+out:
+	clear_bit(BTRFS_ROOT_GC_RUNNING, &root->state);
+	kfree(gc_work);
+}
+
+/**
+ * btrfs_queue_gc_work - queue work for non-empty GC roots.
+ * @fs_info: The fs_info for the file system.
+ *
+ * This walks through all of the garbage collection roots and schedules the
+ * work structs to chew through their work.
+ */
+void btrfs_queue_gc_work(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_root *root;
+	struct gc_work *gc_work;
+	struct btrfs_key key = {
+		.objectid = BTRFS_GC_TREE_OBJECTID,
+		.type = BTRFS_ROOT_ITEM_KEY,
+	};
+	int nr_global_roots = fs_info->nr_global_roots;
+	int i;
+
+	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		return;
+
+	if (btrfs_fs_closing(fs_info))
+		return;
+
+	for (i = 0; i < nr_global_roots; i++) {
+		key.offset = i;
+		root = btrfs_global_root(fs_info, &key);
+		if (test_and_set_bit(BTRFS_ROOT_GC_RUNNING, &root->state))
+			continue;
+		gc_work = kmalloc(sizeof(struct gc_work), GFP_KERNEL);
+		if (!gc_work) {
+			clear_bit(BTRFS_ROOT_GC_RUNNING, &root->state);
+			continue;
+		}
+		gc_work->root = root;
+		btrfs_init_work(&gc_work->work, gc_work_fn, NULL, NULL);
+		btrfs_queue_work(fs_info->gc_workers, &gc_work->work);
+	}
+}
+
+/**
+ * btrfs_add_inode_gc_item - add a gc item for an inode that needs to be removed.
+ * @inode: The inode that needs to have a gc item added.
+ * @rsv: The block rsv to use for the reservation.
+ *
+ * This adds the gc item for the given inode.  This must be called during evict
+ * to make sure nobody else is going to access this inode.
+ */
+int btrfs_add_inode_gc_item(struct btrfs_inode *inode,
+			    struct btrfs_block_rsv *rsv)
+{
+	struct btrfs_key key = {
+		.objectid = inode->root->root_key.objectid,
+		.type = BTRFS_GC_INODE_ITEM_KEY,
+		.offset = btrfs_ino(inode),
+	};
+
+	return add_gc_item(inode_gc_root(inode), &key, rsv);
+}
diff --git a/fs/btrfs/gc-tree.h b/fs/btrfs/gc-tree.h
new file mode 100644
index 000000000000..d744f45f8c8e
--- /dev/null
+++ b/fs/btrfs/gc-tree.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BTRFS_GC_TREE_H
+#define BTRFS_GC_TREE_H
+
+struct btrfs_fs_info;
+struct btrfs_inode;
+struct btrfs_block_rsv;
+
+void btrfs_queue_gc_work(struct btrfs_fs_info *fs_info);
+int btrfs_add_inode_gc_item(struct btrfs_inode *inode,
+			    struct btrfs_block_rsv *rsv);
+
+#endif
+
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cc4e077686c3..6fadf28608f1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -55,6 +55,7 @@
 #include "zoned.h"
 #include "subpage.h"
 #include "inode-item.h"
+#include "gc-tree.h"
 
 struct btrfs_iget_args {
 	u64 ino;
@@ -5207,6 +5208,17 @@ void btrfs_evict_inode(struct inode *inode)
 	rsv->size = btrfs_calc_metadata_size(fs_info, 1);
 	rsv->failfast = 1;
 
+	/*
+	 * If we have extent tree v2 enabled, insert our gc item and we're done,
+	 * remove the orphan item if we succeeded.
+	 */
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		ret = btrfs_add_inode_gc_item(BTRFS_I(inode), rsv);
+		if (ret)
+			goto free_rsv;
+		goto delete_orphan;
+	}
+
 	btrfs_i_size_write(BTRFS_I(inode), 0);
 
 	while (1) {
@@ -5242,6 +5254,7 @@ void btrfs_evict_inode(struct inode *inode)
 	 * If it turns out that we are dropping too many of these, we might want
 	 * to add a mechanism for retrying these after a commit.
 	 */
+delete_orphan:
 	trans = btrfs_gc_rsv_refill_and_join(root, rsv);
 	if (!IS_ERR(trans)) {
 		trans->block_rsv = rsv;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 5a5a72a32e76..7742786ecdb4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -22,6 +22,7 @@
 #include "block-group.h"
 #include "space-info.h"
 #include "zoned.h"
+#include "gc-tree.h"
 
 #define BTRFS_ROOT_TRANS_TAG 0
 
@@ -2420,6 +2421,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	btrfs_put_transaction(cur_trans);
 	btrfs_put_transaction(cur_trans);
 
+	btrfs_queue_gc_work(fs_info);
+
 	if (trans->type & __TRANS_FREEZABLE)
 		sb_end_intwrite(fs_info->sb);
 
-- 
2.26.3
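
The worker loop in gc_work_fn() follows a simple pattern: take the first
item in the tree, dispatch on its type, and delete the item once it has
been handled.  Below is a user-space sketch of that pattern.  The toy_*
names and the fixed-size array standing in for the tree are assumptions made
for illustration.  Unlike the patch, this sketch stops at the first failing
item instead of looking it up again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_gc_type { TOY_GC_INODE = 1 };

/* Hypothetical GC item key: owning subvolume, item type, inode number. */
struct toy_gc_key {
	uint64_t objectid;
	int type;
	uint64_t offset;
};

/* Hypothetical GC tree: a small array kept in key order. */
struct toy_gc_tree {
	struct toy_gc_key items[8];
	size_t nr;
};

/* Stand-in for the per-inode work; counts calls, returns 0 on success. */
static int toy_gc_inode(const struct toy_gc_key *key, uint64_t *truncated)
{
	(void)key;
	(*truncated)++;
	return 0;
}

/* Remove the first item by shifting the remaining items down. */
static void toy_delete_first(struct toy_gc_tree *tree)
{
	assert(tree->nr > 0);
	for (size_t i = 1; i < tree->nr; i++)
		tree->items[i - 1] = tree->items[i];
	tree->nr--;
}

/* Process items until the tree is empty, we are closing, or one fails. */
static int toy_gc_drain(struct toy_gc_tree *tree, const bool *closing,
			uint64_t *truncated)
{
	while (!*closing && tree->nr > 0) {
		struct toy_gc_key key = tree->items[0];
		int ret;

		switch (key.type) {
		case TOY_GC_INODE:
			ret = toy_gc_inode(&key, truncated);
			break;
		default:
			/* Unknown item type: leave it in place and report. */
			ret = -1;
			break;
		}
		if (ret)
			return ret;
		toy_delete_first(tree);
	}
	return 0;
}
```

Deleting the item only after its handler succeeds means an interrupted run
leaves the item in the tree, so a later run can pick it up again.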

