All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: REF_COWS bit rework
@ 2020-05-13  6:16 Qu Wenruo
  2020-05-13  6:16 ` [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE Qu Wenruo
  2020-05-13  6:16 ` [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree Qu Wenruo
  0 siblings, 2 replies; 10+ messages in thread
From: Qu Wenruo @ 2020-05-13  6:16 UTC (permalink / raw)
  To: linux-btrfs

This small patchset reworks the REF_COWS bit, by renaming it, and remove
that bit for data relocation root.

The basic idea of such rework is to reduce the confusion caused by the
name REF_COWS.
With the new bit called SHAREABLE, it should be clear that no user can
really create snapshot for data reloc tree, thus its tree blocks
shouldn't be shareable.

This would make data balance for reloc tree a little simpler.

Qu Wenruo (2):
  btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE
  btrfs: Don't set SHAREABLE flag for data reloc tree

 fs/btrfs/backref.c     |  4 ++--
 fs/btrfs/backref.h     |  2 +-
 fs/btrfs/block-rsv.c   |  2 +-
 fs/btrfs/ctree.c       | 26 +++++++++++++-------------
 fs/btrfs/ctree.h       | 25 +++++++++++++++++++++++--
 fs/btrfs/disk-io.c     | 27 ++++++++++++++++++++-------
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/file.c        |  2 +-
 fs/btrfs/inode.c       | 27 ++++++++++++++++-----------
 fs/btrfs/ioctl.c       |  2 +-
 fs/btrfs/relocation.c  | 42 +++++++++++++++++-------------------------
 fs/btrfs/transaction.c | 12 ++++++------
 fs/btrfs/tree-defrag.c |  2 +-
 13 files changed, 103 insertions(+), 72 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE
  2020-05-13  6:16 [PATCH 0/2] btrfs: REF_COWS bit rework Qu Wenruo
@ 2020-05-13  6:16 ` Qu Wenruo
  2020-05-13 14:18   ` David Sterba
  2020-05-13  6:16 ` [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree Qu Wenruo
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2020-05-13  6:16 UTC (permalink / raw)
  To: linux-btrfs

The name BTRFS_ROOT_REF_COWS is not helpful to show what it really
means.

In fact, that bit can only be set to those trees:
- Subvolume roots
- Data reloc root
- Reloc roots for above roots

All other trees won't get this bit set.
So just by the result, it is obvious that, roots with this bit set can
have tree blocks shared with other trees.
Either shared by snapshots, or by reloc roots (an special snapshot
created by relocation).

This patch will rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE to
make it easier to understand, and update all comment mentioning
"reference counted" to follow the rename.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/backref.c     |  4 ++--
 fs/btrfs/backref.h     |  2 +-
 fs/btrfs/block-rsv.c   |  2 +-
 fs/btrfs/ctree.c       | 26 +++++++++++++-------------
 fs/btrfs/ctree.h       | 24 ++++++++++++++++++++++--
 fs/btrfs/disk-io.c     | 13 +++++++------
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/file.c        |  2 +-
 fs/btrfs/inode.c       | 20 ++++++++++++--------
 fs/btrfs/ioctl.c       |  2 +-
 fs/btrfs/relocation.c  | 24 +++++++++++++-----------
 fs/btrfs/transaction.c | 12 ++++++------
 fs/btrfs/tree-defrag.c |  2 +-
 13 files changed, 81 insertions(+), 54 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 60a69f7c0b36..f1ff92eb14f3 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2702,7 +2702,7 @@ static int handle_indirect_tree_backref(struct btrfs_backref_cache *cache,
 	root = btrfs_get_fs_root(fs_info, &root_key, false);
 	if (IS_ERR(root))
 		return PTR_ERR(root);
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		cur->cowonly = 1;
 
 	if (btrfs_root_level(&root->root_item) == cur->level) {
@@ -2789,7 +2789,7 @@ static int handle_indirect_tree_backref(struct btrfs_backref_cache *cache,
 				goto out;
 			}
 			upper->owner = btrfs_header_owner(eb);
-			if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+			if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 				upper->cowonly = 1;
 
 			/*
diff --git a/fs/btrfs/backref.h b/fs/btrfs/backref.h
index 18393cc05ca2..ff705cc564a9 100644
--- a/fs/btrfs/backref.h
+++ b/fs/btrfs/backref.h
@@ -184,7 +184,7 @@ struct btrfs_backref_node {
 	struct extent_buffer *eb;
 	/* Level of the tree block */
 	unsigned int level:8;
-	/* Is the block in non-reference counted tree */
+	/* Is the block in a non-shareable tree */
 	unsigned int cowonly:1;
 	/* 1 if no child node is in the cache */
 	unsigned int lowest:1;
diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index dbba53e712e6..7e1549a84fcc 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -458,7 +458,7 @@ static struct btrfs_block_rsv *get_block_rsv(
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct btrfs_block_rsv *block_rsv = NULL;
 
-	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) ||
+	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
 	    (root == fs_info->csum_root && trans->adding_csums) ||
 	    (root == fs_info->uuid_root))
 		block_rsv = trans->block_rsv;
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 6c28efe5b14a..90a9e238ead8 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -144,9 +144,10 @@ struct extent_buffer *btrfs_root_node(struct btrfs_root *root)
 	return eb;
 }
 
-/* cowonly root (everything not a reference counted cow subvolume), just get
- * put onto a simple dirty list.  transaction.c walks this to make sure they
- * get properly updated on disk.
+/*
+ * Cowonly root (not shareable trees, everything not subvolume roots or reloc
+ * roots), just get put onto a simple dirty list.  transaction.c walks this to
+ * make sure they get properly updated on disk.
  */
 static void add_root_to_dirty_list(struct btrfs_root *root)
 {
@@ -185,9 +186,9 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
 	int level;
 	struct btrfs_disk_key disk_key;
 
-	WARN_ON(test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	WARN_ON(test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 		trans->transid != fs_info->running_transaction->transid);
-	WARN_ON(test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	WARN_ON(test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 		trans->transid != root->last_trans);
 
 	level = btrfs_header_level(buf);
@@ -826,12 +827,11 @@ int btrfs_block_can_be_shared(struct btrfs_root *root,
 			      struct extent_buffer *buf)
 {
 	/*
-	 * Tree blocks not in reference counted trees and tree roots
-	 * are never shared. If a block was allocated after the last
-	 * snapshot and the block was not allocated by tree relocation,
-	 * we know the block is not shared.
+	 * Tree blocks not in shareable trees and tree roots are never shared.
+	 * If a block was allocated after the last snapshot and the block was
+	 * not allocated by tree relocation, we know the block is not shared.
 	 */
-	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 	    buf != root->node && buf != root->commit_root &&
 	    (btrfs_header_generation(buf) <=
 	     btrfs_root_last_snapshot(&root->root_item) ||
@@ -1024,9 +1024,9 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 
 	btrfs_assert_tree_locked(buf);
 
-	WARN_ON(test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	WARN_ON(test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 		trans->transid != fs_info->running_transaction->transid);
-	WARN_ON(test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	WARN_ON(test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 		trans->transid != root->last_trans);
 
 	level = btrfs_header_level(buf);
@@ -1065,7 +1065,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 		return ret;
 	}
 
-	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state)) {
+	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
 		ret = btrfs_reloc_cow_block(trans, root, buf, cow);
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 03ea7370aea7..65c09aea4cb9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -969,7 +969,27 @@ enum {
 	 * is used to tell us when more checks are required
 	 */
 	BTRFS_ROOT_IN_TRANS_SETUP,
-	BTRFS_ROOT_REF_COWS,
+
+	/*
+	 * If tree blocks of this root can be shared between other roots.
+	 * Only subvolume trees and their reloc trees have this bit set.
+	 * Conflicts with TRACK_DIRTY bit.
+	 *
+	 * This affects two things:
+	 * - How balance works
+	 *   For shareable roots, btrfs needs to use reloc tree and do path
+	 *   replacement for balance, and needs various pre/post hooks for
+	 *   snapshot creation to handle them.
+	 *
+	 *   While for non shareable trees, we just simply do a tree search
+	 *   with COW.
+	 *
+	 * - How dirty roots are tracked
+	 *   For shareable roots, btrfs_record_root_in_trans() is needed to
+	 *   trace them, while non subvolume roots have TRACK_DIRTY bit, they
+	 *   don't need manual record.
+	 */
+	BTRFS_ROOT_SHAREABLE,
 	BTRFS_ROOT_TRACK_DIRTY,
 	BTRFS_ROOT_IN_RADIX,
 	BTRFS_ROOT_ORPHAN_ITEM_INSERTED,
@@ -1055,7 +1075,7 @@ struct btrfs_root {
 	struct btrfs_key defrag_progress;
 	struct btrfs_key defrag_max;
 
-	/* the dirty list is only used by non-reference counted roots */
+	/* the dirty list is only used by non-shareable roots */
 	struct list_head dirty_list;
 
 	struct list_head root_list;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8ad451695d49..4dd3206c1ace 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1275,12 +1275,13 @@ static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
 	root->root_key.offset = BTRFS_TREE_LOG_OBJECTID;
 
 	/*
-	 * DON'T set REF_COWS for log trees
+	 * DON'T set SHAREABLE bit for log trees.
 	 *
-	 * log trees do not get reference counted because they go away
-	 * before a real commit is actually done.  They do store pointers
-	 * to file data extents, and those reference counts still get
-	 * updated (along with back refs to the log tree).
+	 * User has no way to create snapshot for log trees, and they go away
+	 * before a real commit is actually done.
+	 *
+	 * They do store pointers to file data extents, and those reference
+	 * counts still get updated (along with back refs to the log tree).
 	 */
 
 	leaf = btrfs_alloc_tree_block(trans, root, 0, BTRFS_TREE_LOG_OBJECTID,
@@ -1419,7 +1420,7 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
 		goto fail;
 
 	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
-		set_bit(BTRFS_ROOT_REF_COWS, &root->state);
+		set_bit(BTRFS_ROOT_SHAREABLE, &root->state);
 		btrfs_check_and_init_root_item(&root->root_item);
 	}
 
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index faa585d54eb7..36d8014ce452 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2442,7 +2442,7 @@ static int __btrfs_mod_ref(struct btrfs_trans_handle *trans,
 	nritems = btrfs_header_nritems(buf);
 	level = btrfs_header_level(buf);
 
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state) && level == 0)
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state) && level == 0)
 		return 0;
 
 	if (full_backref)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 719e68ab552c..606c2f3c1a38 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -775,7 +775,7 @@ int __btrfs_drop_extents(struct btrfs_trans_handle *trans,
 	if (start >= BTRFS_I(inode)->disk_i_size && !replace_extent)
 		modify_tree = 0;
 
-	update_refs = (test_bit(BTRFS_ROOT_REF_COWS, &root->state) ||
+	update_refs = (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
 		       root == fs_info->tree_root);
 	while (1) {
 		recow = 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5d567082f95a..13bbb6b0d495 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4107,11 +4107,13 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 	BUG_ON(new_size > 0 && min_type != BTRFS_EXTENT_DATA_KEY);
 
 	/*
-	 * for non-free space inodes and ref cows, we want to back off from
-	 * time to time
+	 * for non-free space inodes and non-shareable roots, we want to back
+	 * off from time to time.
+	 * This means all inodes in subvolume roots, reloc roots, and data
+	 * reloc roots.
 	 */
 	if (!btrfs_is_free_space_inode(BTRFS_I(inode)) &&
-	    test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	    test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		be_nice = true;
 
 	path = btrfs_alloc_path();
@@ -4127,8 +4129,10 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 	 * We want to drop from the next block forward in case this new size is
 	 * not block aligned since we will be keeping the last block of the
 	 * extent just the way it is.
+	 * This covers all inodes, including free space cache inodes, and btree
+	 * inode.
 	 */
-	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) ||
+	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
 	    root == fs_info->tree_root)
 		btrfs_drop_extent_cache(BTRFS_I(inode), ALIGN(new_size,
 					fs_info->sectorsize),
@@ -4240,7 +4244,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 							 extent_num_bytes);
 				num_dec = (orig_num_bytes -
 					   extent_num_bytes);
-				if (test_bit(BTRFS_ROOT_REF_COWS,
+				if (test_bit(BTRFS_ROOT_SHAREABLE,
 					     &root->state) &&
 				    extent_start != 0)
 					inode_sub_bytes(inode, num_dec);
@@ -4256,7 +4260,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 				num_dec = btrfs_file_extent_num_bytes(leaf, fi);
 				if (extent_start != 0) {
 					found_extent = 1;
-					if (test_bit(BTRFS_ROOT_REF_COWS,
+					if (test_bit(BTRFS_ROOT_SHAREABLE,
 						     &root->state))
 						inode_sub_bytes(inode, num_dec);
 				}
@@ -4292,7 +4296,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 				clear_len = fs_info->sectorsize;
 			}
 
-			if (test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+			if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 				inode_sub_bytes(inode, item_end + 1 - new_size);
 		}
 delete:
@@ -4333,7 +4337,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 		should_throttle = false;
 
 		if (found_extent &&
-		    (test_bit(BTRFS_ROOT_REF_COWS, &root->state) ||
+		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
 		     root == fs_info->tree_root)) {
 			struct btrfs_ref ref = { 0 };
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 40b729dce91c..709d9446896a 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -750,7 +750,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	int ret;
 	bool snapshot_force_cow = false;
 
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		return -EINVAL;
 
 	if (atomic_read(&root->nr_swapfiles)) {
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index f25deca18a5d..437b782c57e6 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -321,7 +321,7 @@ int btrfs_should_ignore_reloc_root(struct btrfs_root *root)
 {
 	struct btrfs_root *reloc_root;
 
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		return 0;
 
 	/* This root has been merged with its reloc tree, we can ignore it */
@@ -808,7 +808,7 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans,
 
 	reloc_root = btrfs_read_tree_root(fs_info->tree_root, &root_key);
 	BUG_ON(IS_ERR(reloc_root));
-	set_bit(BTRFS_ROOT_REF_COWS, &reloc_root->state);
+	set_bit(BTRFS_ROOT_SHAREABLE, &reloc_root->state);
 	reloc_root->last_trans = trans->transid;
 	return reloc_root;
 }
@@ -2018,7 +2018,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 		next = walk_up_backref(next, edges, &index);
 		root = next->root;
 		BUG_ON(!root);
-		BUG_ON(!test_bit(BTRFS_ROOT_REF_COWS, &root->state));
+		BUG_ON(!test_bit(BTRFS_ROOT_SHAREABLE, &root->state));
 
 		if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
 			record_reloc_root_in_trans(trans, root);
@@ -2062,10 +2062,12 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
 }
 
 /*
- * select a tree root for relocation. return NULL if the block
- * is reference counted. we should use do_relocation() in this
- * case. return a tree root pointer if the block isn't reference
- * counted. return -ENOENT if the block is root of reloc tree.
+ * Select a tree root for relocation.
+ *
+ * Return NULL if the block is not shareable. we should use do_relocation() in this
+ * case.
+ * Return a tree root pointer if the block is shareable.
+ * Return -ENOENT if the block is root of reloc tree.
  */
 static noinline_for_stack
 struct btrfs_root *select_one_root(struct btrfs_backref_node *node)
@@ -2083,8 +2085,8 @@ struct btrfs_root *select_one_root(struct btrfs_backref_node *node)
 		root = next->root;
 		BUG_ON(!root);
 
-		/* no other choice for non-references counted tree */
-		if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+		/* no other choice for non-shareable tree */
+		if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 			return root;
 
 		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID)
@@ -2480,7 +2482,7 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans,
 	}
 
 	if (root) {
-		if (test_bit(BTRFS_ROOT_REF_COWS, &root->state)) {
+		if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) {
 			BUG_ON(node->new_bytenr);
 			BUG_ON(!list_empty(&node->list));
 			btrfs_record_root_in_trans(trans, root);
@@ -3765,7 +3767,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
 			goto out;
 		}
 
-		set_bit(BTRFS_ROOT_REF_COWS, &reloc_root->state);
+		set_bit(BTRFS_ROOT_SHAREABLE, &reloc_root->state);
 		list_add(&reloc_root->root_list, &reloc_roots);
 
 		if (btrfs_root_refs(&reloc_root->root_item) > 0) {
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 96eb313a5080..a04b5442ac39 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -349,10 +349,10 @@ static noinline int join_transaction(struct btrfs_fs_info *fs_info,
 }
 
 /*
- * this does all the record keeping required to make sure that a reference
- * counted root is properly recorded in a given transaction.  This is required
+ * This does all the record keeping required to make sure that a shareable
+ * root is properly recorded in a given transaction.  This is required
  * to make sure the old root from before we joined the transaction is deleted
- * when the transaction commits
+ * when the transaction commits.
  */
 static int record_root_in_trans(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root,
@@ -360,7 +360,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 
-	if ((test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+	if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
 	    root->last_trans < trans->transid) || force) {
 		WARN_ON(root == fs_info->extent_root);
 		WARN_ON(!force && root->commit_root != root->node);
@@ -439,7 +439,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		return 0;
 
 	/*
@@ -504,7 +504,7 @@ static inline bool need_reserve_reloc_root(struct btrfs_root *root)
 	struct btrfs_fs_info *fs_info = root->fs_info;
 
 	if (!fs_info->reloc_ctl ||
-	    !test_bit(BTRFS_ROOT_REF_COWS, &root->state) ||
+	    !test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
 	    root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
 	    root->reloc_root)
 		return false;
diff --git a/fs/btrfs/tree-defrag.c b/fs/btrfs/tree-defrag.c
index 5f9e2dd413af..16c3a6d2586d 100644
--- a/fs/btrfs/tree-defrag.c
+++ b/fs/btrfs/tree-defrag.c
@@ -35,7 +35,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 		goto out;
 	}
 
-	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
+	if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
 		goto out;
 
 	path = btrfs_alloc_path();
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13  6:16 [PATCH 0/2] btrfs: REF_COWS bit rework Qu Wenruo
  2020-05-13  6:16 ` [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE Qu Wenruo
@ 2020-05-13  6:16 ` Qu Wenruo
  2020-05-13  6:44   ` Nikolay Borisov
  2020-05-13 14:01   ` David Sterba
  1 sibling, 2 replies; 10+ messages in thread
From: Qu Wenruo @ 2020-05-13  6:16 UTC (permalink / raw)
  To: linux-btrfs

SHAREABLE flag is set for subvolumes because users can create snapshot
for subvolumes, thus sharing tree blocks of them.

But users can't access data reloc tree as it's only an internal tree for
data relocation, thus it doesn't need the full path replacement treat at
all.

This patch will make data reloc tree just a non-shareable tree, and add
btrfs_fs_info::dreloc_root for data reloc tree, so relocation code can
grab it from fs_info directly.

This would slightly improve tree relocation, as now data reloc tree
can go through regular COW routine to get relocated, without bothering
the complex tree reloc tree routine.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h      |  1 +
 fs/btrfs/disk-io.c    | 14 +++++++++++++-
 fs/btrfs/inode.c      |  7 ++++---
 fs/btrfs/relocation.c | 18 ++++--------------
 4 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 65c09aea4cb9..a9690f438a15 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -582,6 +582,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *quota_root;
 	struct btrfs_root *uuid_root;
 	struct btrfs_root *free_space_root;
+	struct btrfs_root *dreloc_root;
 
 	/* the log root tree is a directory of all the other log roots */
 	struct btrfs_root *log_root_tree;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4dd3206c1ace..7355ebc648c5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1419,7 +1419,8 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
 	if (ret)
 		goto fail;
 
-	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
+	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
+	    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
 		set_bit(BTRFS_ROOT_SHAREABLE, &root->state);
 		btrfs_check_and_init_root_item(&root->root_item);
 	}
@@ -1525,6 +1526,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 	btrfs_put_root(fs_info->uuid_root);
 	btrfs_put_root(fs_info->free_space_root);
 	btrfs_put_root(fs_info->fs_root);
+	btrfs_put_root(fs_info->dreloc_root);
 	btrfs_check_leaked_roots(fs_info);
 	btrfs_extent_buffer_leak_debug_check(fs_info);
 	kfree(fs_info->super_copy);
@@ -1981,6 +1983,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
 	free_root_extent_buffers(info->quota_root);
 	free_root_extent_buffers(info->uuid_root);
 	free_root_extent_buffers(info->fs_root);
+	free_root_extent_buffers(info->dreloc_root);
 	if (free_chunk_root)
 		free_root_extent_buffers(info->chunk_root);
 	free_root_extent_buffers(info->free_space_root);
@@ -2287,6 +2290,15 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
 	fs_info->csum_root = root;
 
+	location.objectid = BTRFS_DATA_RELOC_TREE_OBJECTID;
+	root = btrfs_get_fs_root(fs_info, &location, true);
+	if (IS_ERR(root)) {
+		ret = PTR_ERR(root);
+		goto out;
+	}
+	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+	fs_info->dreloc_root = root;
+
 	location.objectid = BTRFS_QUOTA_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (!IS_ERR(root)) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 13bbb6b0d495..6b7ba20eee52 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4133,7 +4133,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 	 * inode.
 	 */
 	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
-	    root == fs_info->tree_root)
+	    root == fs_info->tree_root || root == fs_info->dreloc_root)
 		btrfs_drop_extent_cache(BTRFS_I(inode), ALIGN(new_size,
 					fs_info->sectorsize),
 					(u64)-1, 0);
@@ -4338,7 +4338,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 
 		if (found_extent &&
 		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
-		     root == fs_info->tree_root)) {
+		     root == fs_info->tree_root ||
+		     root == fs_info->dreloc_root)) {
 			struct btrfs_ref ref = { 0 };
 
 			bytes_deleted += extent_num_bytes;
@@ -5046,7 +5047,7 @@ void btrfs_evict_inode(struct inode *inode)
 		btrfs_end_transaction(trans);
 	}
 
-	if (!(root == fs_info->tree_root ||
+	if (!(root == fs_info->tree_root || root == fs_info->dreloc_root ||
 	      root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID))
 		btrfs_return_ino(root, btrfs_ino(BTRFS_I(inode)));
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 437b782c57e6..cb45ae92f15d 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1087,7 +1087,8 @@ int replace_file_extents(struct btrfs_trans_handle *trans,
 		 * if we are modifying block in fs tree, wait for readpage
 		 * to complete and drop the extent cache
 		 */
-		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) {
+		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
+		    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
 			if (first) {
 				inode = find_next_inode(root, key.objectid);
 				first = 0;
@@ -3470,15 +3471,11 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
 {
 	struct inode *inode = NULL;
 	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root;
+	struct btrfs_root *root = fs_info->dreloc_root;
 	struct btrfs_key key;
 	u64 objectid;
 	int err = 0;
 
-	root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
-	if (IS_ERR(root))
-		return ERR_CAST(root);
-
 	trans = btrfs_start_transaction(root, 6);
 	if (IS_ERR(trans)) {
 		btrfs_put_root(root);
@@ -3501,7 +3498,6 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
 
 	err = btrfs_orphan_add(trans, BTRFS_I(inode));
 out:
-	btrfs_put_root(root);
 	btrfs_end_transaction(trans);
 	btrfs_btree_balance_dirty(fs_info);
 	if (err) {
@@ -3870,13 +3866,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
 
 	if (err == 0) {
 		/* cleanup orphan inode in data relocation tree */
-		fs_root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
-		if (IS_ERR(fs_root)) {
-			err = PTR_ERR(fs_root);
-		} else {
-			err = btrfs_orphan_cleanup(fs_root);
-			btrfs_put_root(fs_root);
-		}
+		err = btrfs_orphan_cleanup(fs_info->dreloc_root);
 	}
 	return err;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13  6:16 ` [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree Qu Wenruo
@ 2020-05-13  6:44   ` Nikolay Borisov
  2020-05-13  6:49     ` Qu Wenruo
  2020-05-13 14:01   ` David Sterba
  1 sibling, 1 reply; 10+ messages in thread
From: Nikolay Borisov @ 2020-05-13  6:44 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 13.05.20 г. 9:16 ч., Qu Wenruo wrote:
> SHAREABLE flag is set for subvolumes because users can create snapshot
> for subvolumes, thus sharing tree blocks of them.
> 
> But users can't access data reloc tree as it's only an internal tree for
> data relocation, thus it doesn't need the full path replacement treat at
> all.
> 
> This patch will make data reloc tree just a non-shareable tree, and add
> btrfs_fs_info::dreloc_root for data reloc tree, so relocation code can
> grab it from fs_info directly.
> 
> This would slightly improve tree relocation, as now data reloc tree
> can go through regular COW routine to get relocated, without bothering
> the complex tree reloc tree routine.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ctree.h      |  1 +
>  fs/btrfs/disk-io.c    | 14 +++++++++++++-
>  fs/btrfs/inode.c      |  7 ++++---
>  fs/btrfs/relocation.c | 18 ++++--------------
>  4 files changed, 22 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 65c09aea4cb9..a9690f438a15 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -582,6 +582,7 @@ struct btrfs_fs_info {
>  	struct btrfs_root *quota_root;
>  	struct btrfs_root *uuid_root;
>  	struct btrfs_root *free_space_root;
> +	struct btrfs_root *dreloc_root;

Rename this to data_reloc_root or simply reloc_root, dreloc is rather
cryptic.
>  
>  	/* the log root tree is a directory of all the other log roots */
>  	struct btrfs_root *log_root_tree;
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 4dd3206c1ace..7355ebc648c5 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1419,7 +1419,8 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
>  	if (ret)
>  		goto fail;
>  
> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
> +	    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
>  		set_bit(BTRFS_ROOT_SHAREABLE, &root->state);
>  		btrfs_check_and_init_root_item(&root->root_item);
>  	}
> @@ -1525,6 +1526,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
>  	btrfs_put_root(fs_info->uuid_root);
>  	btrfs_put_root(fs_info->free_space_root);
>  	btrfs_put_root(fs_info->fs_root);
> +	btrfs_put_root(fs_info->dreloc_root);
>  	btrfs_check_leaked_roots(fs_info);
>  	btrfs_extent_buffer_leak_debug_check(fs_info);
>  	kfree(fs_info->super_copy);
> @@ -1981,6 +1983,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
>  	free_root_extent_buffers(info->quota_root);
>  	free_root_extent_buffers(info->uuid_root);
>  	free_root_extent_buffers(info->fs_root);
> +	free_root_extent_buffers(info->dreloc_root);
>  	if (free_chunk_root)
>  		free_root_extent_buffers(info->chunk_root);
>  	free_root_extent_buffers(info->free_space_root);
> @@ -2287,6 +2290,15 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
>  	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
>  	fs_info->csum_root = root;
>  
> +	location.objectid = BTRFS_DATA_RELOC_TREE_OBJECTID;
> +	root = btrfs_get_fs_root(fs_info, &location, true);
> +	if (IS_ERR(root)) {
> +		ret = PTR_ERR(root);
> +		goto out;
> +	}
> +	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
> +	fs_info->dreloc_root = root;
> +
>  	location.objectid = BTRFS_QUOTA_TREE_OBJECTID;
>  	root = btrfs_read_tree_root(tree_root, &location);
>  	if (!IS_ERR(root)) {
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 13bbb6b0d495..6b7ba20eee52 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4133,7 +4133,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>  	 * inode.
>  	 */
>  	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
> -	    root == fs_info->tree_root)
> +	    root == fs_info->tree_root || root == fs_info->dreloc_root)
>  		btrfs_drop_extent_cache(BTRFS_I(inode), ALIGN(new_size,
>  					fs_info->sectorsize),
>  					(u64)-1, 0);
> @@ -4338,7 +4338,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>  
>  		if (found_extent &&
>  		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
> -		     root == fs_info->tree_root)) {
> +		     root == fs_info->tree_root ||
> +		     root == fs_info->dreloc_root)) {
>  			struct btrfs_ref ref = { 0 };
>  
>  			bytes_deleted += extent_num_bytes;
> @@ -5046,7 +5047,7 @@ void btrfs_evict_inode(struct inode *inode)
>  		btrfs_end_transaction(trans);
>  	}
>  
> -	if (!(root == fs_info->tree_root ||
> +	if (!(root == fs_info->tree_root || root == fs_info->dreloc_root ||
>  	      root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID))
>  		btrfs_return_ino(root, btrfs_ino(BTRFS_I(inode)));
>  
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 437b782c57e6..cb45ae92f15d 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1087,7 +1087,8 @@ int replace_file_extents(struct btrfs_trans_handle *trans,
>  		 * if we are modifying block in fs tree, wait for readpage
>  		 * to complete and drop the extent cache
>  		 */
> -		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) {
> +		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
> +		    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
>  			if (first) {
>  				inode = find_next_inode(root, key.objectid);
>  				first = 0;
> @@ -3470,15 +3471,11 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>  {
>  	struct inode *inode = NULL;
>  	struct btrfs_trans_handle *trans;
> -	struct btrfs_root *root;
> +	struct btrfs_root *root = fs_info->dreloc_root;

So why haven't you added code in btrfs_get_fs_root to quickly return a
refcounted root and instead reference it without incrementing the refcount?

>  	struct btrfs_key key;
>  	u64 objectid;
>  	int err = 0;
>  
> -	root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
> -	if (IS_ERR(root))
> -		return ERR_CAST(root);
> -
>  	trans = btrfs_start_transaction(root, 6);
>  	if (IS_ERR(trans)) {
>  		btrfs_put_root(root);
> @@ -3501,7 +3498,6 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>  
>  	err = btrfs_orphan_add(trans, BTRFS_I(inode));
>  out:
> -	btrfs_put_root(root);
>  	btrfs_end_transaction(trans);
>  	btrfs_btree_balance_dirty(fs_info);
>  	if (err) {
> @@ -3870,13 +3866,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
>  
>  	if (err == 0) {
>  		/* cleanup orphan inode in data relocation tree */
> -		fs_root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
> -		if (IS_ERR(fs_root)) {
> -			err = PTR_ERR(fs_root);
> -		} else {
> -			err = btrfs_orphan_cleanup(fs_root);
> -			btrfs_put_root(fs_root);
> -		}
> +		err = btrfs_orphan_cleanup(fs_info->dreloc_root);
>  	}
>  	return err;
>  }
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13  6:44   ` Nikolay Borisov
@ 2020-05-13  6:49     ` Qu Wenruo
  2020-05-13 13:51       ` David Sterba
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2020-05-13  6:49 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2020/5/13 下午2:44, Nikolay Borisov wrote:
>
>
> On 13.05.20 г. 9:16 ч., Qu Wenruo wrote:
>> SHAREABLE flag is set for subvolumes because users can create snapshot
>> for subvolumes, thus sharing tree blocks of them.
>>
>> But users can't access data reloc tree as it's only an internal tree for
>> data relocation, thus it doesn't need the full path replacement treat at
>> all.
>>
>> This patch will make data reloc tree just a non-shareable tree, and add
>> btrfs_fs_info::dreloc_root for data reloc tree, so relocation code can
>> grab it from fs_info directly.
>>
>> This would slightly improve tree relocation, as now data reloc tree
>> can go through regular COW routine to get relocated, without bothering
>> the complex tree reloc tree routine.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/ctree.h      |  1 +
>>  fs/btrfs/disk-io.c    | 14 +++++++++++++-
>>  fs/btrfs/inode.c      |  7 ++++---
>>  fs/btrfs/relocation.c | 18 ++++--------------
>>  4 files changed, 22 insertions(+), 18 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 65c09aea4cb9..a9690f438a15 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -582,6 +582,7 @@ struct btrfs_fs_info {
>>  	struct btrfs_root *quota_root;
>>  	struct btrfs_root *uuid_root;
>>  	struct btrfs_root *free_space_root;
>> +	struct btrfs_root *dreloc_root;
>
> Rename this to data_reloc_root or simply reloc_root, dreloc is rather
> cryptic.

Data_reloc_root looks good.
Reloc_root would easily conflict with tree reloc tree.

>>
>>  	/* the log root tree is a directory of all the other log roots */
>>  	struct btrfs_root *log_root_tree;
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 4dd3206c1ace..7355ebc648c5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1419,7 +1419,8 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
>>  	if (ret)
>>  		goto fail;
>>
>> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
>> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
>> +	    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
>>  		set_bit(BTRFS_ROOT_SHAREABLE, &root->state);
>>  		btrfs_check_and_init_root_item(&root->root_item);
>>  	}
>> @@ -1525,6 +1526,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
>>  	btrfs_put_root(fs_info->uuid_root);
>>  	btrfs_put_root(fs_info->free_space_root);
>>  	btrfs_put_root(fs_info->fs_root);
>> +	btrfs_put_root(fs_info->dreloc_root);
>>  	btrfs_check_leaked_roots(fs_info);
>>  	btrfs_extent_buffer_leak_debug_check(fs_info);
>>  	kfree(fs_info->super_copy);
>> @@ -1981,6 +1983,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
>>  	free_root_extent_buffers(info->quota_root);
>>  	free_root_extent_buffers(info->uuid_root);
>>  	free_root_extent_buffers(info->fs_root);
>> +	free_root_extent_buffers(info->dreloc_root);
>>  	if (free_chunk_root)
>>  		free_root_extent_buffers(info->chunk_root);
>>  	free_root_extent_buffers(info->free_space_root);
>> @@ -2287,6 +2290,15 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
>>  	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
>>  	fs_info->csum_root = root;
>>
>> +	location.objectid = BTRFS_DATA_RELOC_TREE_OBJECTID;
>> +	root = btrfs_get_fs_root(fs_info, &location, true);
>> +	if (IS_ERR(root)) {
>> +		ret = PTR_ERR(root);
>> +		goto out;
>> +	}
>> +	set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
>> +	fs_info->dreloc_root = root;
>> +
>>  	location.objectid = BTRFS_QUOTA_TREE_OBJECTID;
>>  	root = btrfs_read_tree_root(tree_root, &location);
>>  	if (!IS_ERR(root)) {
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 13bbb6b0d495..6b7ba20eee52 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -4133,7 +4133,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>>  	 * inode.
>>  	 */
>>  	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
>> -	    root == fs_info->tree_root)
>> +	    root == fs_info->tree_root || root == fs_info->dreloc_root)
>>  		btrfs_drop_extent_cache(BTRFS_I(inode), ALIGN(new_size,
>>  					fs_info->sectorsize),
>>  					(u64)-1, 0);
>> @@ -4338,7 +4338,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>>
>>  		if (found_extent &&
>>  		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
>> -		     root == fs_info->tree_root)) {
>> +		     root == fs_info->tree_root ||
>> +		     root == fs_info->dreloc_root)) {
>>  			struct btrfs_ref ref = { 0 };
>>
>>  			bytes_deleted += extent_num_bytes;
>> @@ -5046,7 +5047,7 @@ void btrfs_evict_inode(struct inode *inode)
>>  		btrfs_end_transaction(trans);
>>  	}
>>
>> -	if (!(root == fs_info->tree_root ||
>> +	if (!(root == fs_info->tree_root || root == fs_info->dreloc_root ||
>>  	      root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID))
>>  		btrfs_return_ino(root, btrfs_ino(BTRFS_I(inode)));
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 437b782c57e6..cb45ae92f15d 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -1087,7 +1087,8 @@ int replace_file_extents(struct btrfs_trans_handle *trans,
>>  		 * if we are modifying block in fs tree, wait for readpage
>>  		 * to complete and drop the extent cache
>>  		 */
>> -		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) {
>> +		if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
>> +		    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
>>  			if (first) {
>>  				inode = find_next_inode(root, key.objectid);
>>  				first = 0;
>> @@ -3470,15 +3471,11 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>>  {
>>  	struct inode *inode = NULL;
>>  	struct btrfs_trans_handle *trans;
>> -	struct btrfs_root *root;
>> +	struct btrfs_root *root = fs_info->dreloc_root;
>
> So why haven't you added code in btrfs_get_fs_root to quickly return a
> refcounted root and instead reference it without incrementing the refcount?

This is exactly the same as how we handle fs_root.
And since the lifespan of data reloc root will be the same as the fs, I
don't think there is any need for it to be grabbed each time we need it.

Thanks,
Qu
>
>>  	struct btrfs_key key;
>>  	u64 objectid;
>>  	int err = 0;
>>
>> -	root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
>> -	if (IS_ERR(root))
>> -		return ERR_CAST(root);
>> -
>>  	trans = btrfs_start_transaction(root, 6);
>>  	if (IS_ERR(trans)) {
>>  		btrfs_put_root(root);
>> @@ -3501,7 +3498,6 @@ struct inode *create_reloc_inode(struct btrfs_fs_info *fs_info,
>>
>>  	err = btrfs_orphan_add(trans, BTRFS_I(inode));
>>  out:
>> -	btrfs_put_root(root);
>>  	btrfs_end_transaction(trans);
>>  	btrfs_btree_balance_dirty(fs_info);
>>  	if (err) {
>> @@ -3870,13 +3866,7 @@ int btrfs_recover_relocation(struct btrfs_root *root)
>>
>>  	if (err == 0) {
>>  		/* cleanup orphan inode in data relocation tree */
>> -		fs_root = read_fs_root(fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
>> -		if (IS_ERR(fs_root)) {
>> -			err = PTR_ERR(fs_root);
>> -		} else {
>> -			err = btrfs_orphan_cleanup(fs_root);
>> -			btrfs_put_root(fs_root);
>> -		}
>> +		err = btrfs_orphan_cleanup(fs_info->dreloc_root);
>>  	}
>>  	return err;
>>  }
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13  6:49     ` Qu Wenruo
@ 2020-05-13 13:51       ` David Sterba
  2020-05-14  0:01         ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2020-05-13 13:51 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Nikolay Borisov, Qu Wenruo, linux-btrfs

On Wed, May 13, 2020 at 02:49:11PM +0800, Qu Wenruo wrote:
> >>  {
> >>  	struct inode *inode = NULL;
> >>  	struct btrfs_trans_handle *trans;
> >> -	struct btrfs_root *root;
> >> +	struct btrfs_root *root = fs_info->dreloc_root;
> >
> > So why haven't you added code in btrfs_get_fs_root to quickly return a
> > refcounted root and instead reference it without incrementing the refcount?
> 
> This is exactly the same as how we handle fs_root.
> And since the lifespan of data reloc root will be the same as the fs, I
> don't think there is any need for it to be grabbed each time we need it.

I'd vote for some consistency in the refcounting, ie. even if it's for
the same life span as the filesystem, set the reference.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13  6:16 ` [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree Qu Wenruo
  2020-05-13  6:44   ` Nikolay Borisov
@ 2020-05-13 14:01   ` David Sterba
  2020-05-14  0:06     ` Qu Wenruo
  1 sibling, 1 reply; 10+ messages in thread
From: David Sterba @ 2020-05-13 14:01 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, May 13, 2020 at 02:16:11PM +0800, Qu Wenruo wrote:
> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
> +	    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {

>  	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
> -	    root == fs_info->tree_root)
> +	    root == fs_info->tree_root || root == fs_info->dreloc_root)

>  		if (found_extent &&
>  		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
> -		     root == fs_info->tree_root)) {
> +		     root == fs_info->tree_root ||
> +		     root == fs_info->dreloc_root)) {

Lots of repeated conditions, though not all of them exactly the same. I
was thinking about adding some helpers but don't have a good suggestion.

The concern is about too much special casing, it's eg tree_root +
data_reloc_tree, or SHAREABLE + tree_reloc + data_reloc, etc. The
helpers should capture the common semantics of the condition and also
reduce the surface for future errors.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE
  2020-05-13  6:16 ` [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE Qu Wenruo
@ 2020-05-13 14:18   ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2020-05-13 14:18 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, May 13, 2020 at 02:16:10PM +0800, Qu Wenruo wrote:
> The name BTRFS_ROOT_REF_COWS is not helpful to show what it really
> means.
> 
> In fact, that bit can only be set to those trees:
> - Subvolume roots
> - Data reloc root
> - Reloc roots for above roots
> 
> All other trees won't get this bit set.
> So just by the result, it is obvious that, roots with this bit set can
> have tree blocks shared with other trees.
> Either shared by snapshots, or by reloc roots (an special snapshot
> created by relocation).
> 
> This patch will rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE to
> make it easier to understand, and update all comment mentioning
> "reference counted" to follow the rename.

The new name sounds good to me.

> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1275,12 +1275,13 @@ static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
>  	root->root_key.offset = BTRFS_TREE_LOG_OBJECTID;
>  
>  	/*
> -	 * DON'T set REF_COWS for log trees
> +	 * DON'T set SHAREABLE bit for log trees.
>  	 *
> -	 * log trees do not get reference counted because they go away
> -	 * before a real commit is actually done.  They do store pointers
> -	 * to file data extents, and those reference counts still get
> -	 * updated (along with back refs to the log tree).
> +	 * User has no way to create snapshot for log trees, and they go away
> +	 * before a real commit is actually done.

I think that refering to 'user' is confusing as the reference counting
or the sharing is an internal mechanics, there's no existing interface
for users to directly manipulate the log trees.

> +	 *
> +	 * They do store pointers to file data extents, and those reference
> +	 * counts still get updated (along with back refs to the log tree).
>  	 */

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13 13:51       ` David Sterba
@ 2020-05-14  0:01         ` Qu Wenruo
  0 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2020-05-14  0:01 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2020/5/13 下午9:51, David Sterba wrote:
> On Wed, May 13, 2020 at 02:49:11PM +0800, Qu Wenruo wrote:
>>>>  {
>>>>  	struct inode *inode = NULL;
>>>>  	struct btrfs_trans_handle *trans;
>>>> -	struct btrfs_root *root;
>>>> +	struct btrfs_root *root = fs_info->dreloc_root;
>>>
>>> So why haven't you added code in btrfs_get_fs_root to quickly return a
>>> refcounted root and instead reference it without incrementing the refcount?
>>
>> This is exactly the same as how we handle fs_root.
>> And since the lifespan of data reloc root will be the same as the fs, I
>> don't think there is any need for it to be grabbed each time we need it.
>
> I'd vote for some consistency in the refcounting, ie. even if it's for
> the same life span as the filesystem, set the reference.
>
OK, I'll call grab and put in next version.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree
  2020-05-13 14:01   ` David Sterba
@ 2020-05-14  0:06     ` Qu Wenruo
  0 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2020-05-14  0:06 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1621 bytes --]



On 2020/5/13 下午10:01, David Sterba wrote:
> On Wed, May 13, 2020 at 02:16:11PM +0800, Qu Wenruo wrote:
>> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
>> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
>> +	    root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
> 
>>  	if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
>> -	    root == fs_info->tree_root)
>> +	    root == fs_info->tree_root || root == fs_info->dreloc_root)
> 
>>  		if (found_extent &&
>>  		    (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
>> -		     root == fs_info->tree_root)) {
>> +		     root == fs_info->tree_root ||
>> +		     root == fs_info->dreloc_root)) {
> 
> Lots of repeated conditions, though not all of them exactly the same. I
> was thinking about adding some helpers but don't have a good suggestion.

The good news is, only inode truncating cares that much.
All other locations won't bother at all.

The concern here is, INODE_ITEM can exist in trees without SHAREABLE
(REF_COWS).
The original exception is root tree (v1 space cache uses INODE_ITEM).

As we don't set SHAREABLE flag for data reloc tree now, it's adding the
exception.

I could find a way to make it more readable, by separating the regular
SHAREABLE check and two exceptions and wrap it into a function.

Thanks,
Qu

> 
> The concern is about too much special casing, it's eg tree_root +
> data_reloc_tree, or SHAREABLE + tree_reloc + data_reloc, etc. The
> helpers should capture the common semantics of the condition and also
> reduce the surface for future errors.
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-05-14  0:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13  6:16 [PATCH 0/2] btrfs: REF_COWS bit rework Qu Wenruo
2020-05-13  6:16 ` [PATCH 1/2] btrfs: Rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE Qu Wenruo
2020-05-13 14:18   ` David Sterba
2020-05-13  6:16 ` [PATCH 2/2] btrfs: Don't set SHAREABLE flag for data reloc tree Qu Wenruo
2020-05-13  6:44   ` Nikolay Borisov
2020-05-13  6:49     ` Qu Wenruo
2020-05-13 13:51       ` David Sterba
2020-05-14  0:01         ` Qu Wenruo
2020-05-13 14:01   ` David Sterba
2020-05-14  0:06     ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.