* [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots
@ 2022-03-07 22:10 Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 01/19] btrfs-progs: add support for loading the block group root Josef Bacik
                   ` (19 more replies)
  0 siblings, 20 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

v4->v5:
- Rebase onto devel, depends on "btrfs-progs: cleanup btrfs_item* accessors".
- Dropped the various patches that have already been merged into -progs.

v3->v4:
- Rebase onto devel, depends on the v3 prep patches that were sent on December
  1st, which have the rest of the "don't access ->*_root" patches.
- I think I screwed up the versioning of this, but I lost the other submission,
  so call this v3.

v1->v2:
- These depend on the v3 of the prep patches (it's marked as v2 because I'm
  stupid, but the second v2 posting I sent.)
- I've moved the global root rb tree patches into this series to differentiate
  them from the actual fixes in the prep series.

--- Original email ---
Hello,

These patches are the first chunk of the extent tree v2 format changes.  This
includes the separate block group root which will hold all of the block group
items.  This also includes the global root support, which is the work to allow
us to have multiple extent, csum, and free space trees in the same file system.

The goals of these two changes are straightforward.  For the block group root,
on very large file systems the block group items are very widely separated,
which means it takes a very long time to mount the file system on large, slow
disks.  Putting the block group items in their own root will allow us to densely
populate the tree and dramatically reduce mount times in these cases.
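
As a rough illustration of the density argument (not part of the patches), the
sketch below estimates how many leaves hold the block group items when they are
packed into their own tree.  The three size constants are assumptions about the
on-disk format (header, item header, and block group item sizes), and every
demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed on-disk sizes in bytes; illustrative constants, not read from a header. */
#define DEMO_HEADER_SIZE	101u	/* struct btrfs_header */
#define DEMO_ITEM_SIZE		25u	/* struct btrfs_item: key + offset + size */
#define DEMO_BG_ITEM_SIZE	24u	/* struct btrfs_block_group_item */

/* How many block group items fit in one leaf of the given nodesize. */
static uint32_t demo_bg_items_per_leaf(uint32_t nodesize)
{
	/* The leaf must be able to hold at least one item. */
	assert(nodesize >= DEMO_HEADER_SIZE + DEMO_ITEM_SIZE + DEMO_BG_ITEM_SIZE);
	return (nodesize - DEMO_HEADER_SIZE) /
	       (DEMO_ITEM_SIZE + DEMO_BG_ITEM_SIZE);
}

/* Leaves to read when the items are packed back to back in their own tree. */
static uint64_t demo_dense_leaves(uint64_t nr_block_groups, uint32_t nodesize)
{
	uint32_t per_leaf = demo_bg_items_per_leaf(nodesize);

	return (nr_block_groups + per_leaf - 1) / per_leaf;
}

/*
 * Worst case for the old layout: every block group item sits in a
 * different extent tree leaf, so each one costs a separate read.
 */
static uint64_t demo_sparse_leaves(uint64_t nr_block_groups)
{
	return nr_block_groups;
}
```

Under these assumed sizes a 16KiB nodesize holds 332 block group items per leaf,
so 1024 block groups fit in 4 leaves, against up to 1024 leaf reads when each
item lives in a different extent tree leaf.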

The global roots change is motivated by lock contention on the root nodes of
these global roots.  I've had to make many changes to how we run delayed refs to
speed up things like the transaction commit because of all the delayed refs
going into one tree and contending on the root node of the extent tree.  By the
same token you can have heavy lock contention on the csum roots when writing to
many files.  Allowing for multiple roots will let us spread the lock contention
load around.
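
To make the idea concrete, here is a minimal sketch of one way work could be
spread across several roots: each block group carries an id naming the set of
global roots it uses, and a byte offset is routed through its block group.
This is an assumption made for illustration, with hypothetical demo_* names and
a simplified structure, not the layout these patches define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified block group; not the btrfs-progs structure. */
struct demo_block_group {
	uint64_t start;
	uint64_t length;
	uint64_t global_root_id;	/* which set of global roots it uses */
};

/*
 * Return the global root id of the block group containing bytenr, or -1 if
 * no block group covers it.  A linear scan keeps the sketch short.
 */
static int64_t demo_global_root_id(const struct demo_block_group *bgs,
				   size_t nr, uint64_t bytenr)
{
	size_t i;

	assert(bgs || nr == 0);
	for (i = 0; i < nr; i++) {
		if (bytenr >= bgs[i].start &&
		    bytenr - bgs[i].start < bgs[i].length)
			return (int64_t)bgs[i].global_root_id;
	}
	return -1;
}
```

Two writers touching block groups with different ids would then lock different
root nodes instead of contending on a single one.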

I have disabled a few key features, namely balance and qgroups.  There will be
more to come as I make more and more invasive changes, and then they will slowly
be re-enabled as the work is added.  These are disabled to avoid a bunch of work
that would be thrown away by future changes.

These patches have passed xfstests without panicking, though they clearly fail
a lot of tests because of the disabled features.  I've also run them through
fsperf to validate that there are no major performance regressions.

WARNING: there are many more format changes planned; this is just the first
batch.  If you want to test then please feel free, but know that the format is
still in flux.  Thanks,

Josef

Josef Bacik (19):
  btrfs-progs: add support for loading the block group root
  btrfs-progs: add print support for the block group tree
  btrfs-progs: mkfs: use the btrfs_block_group_root helper
  btrfs-progs: check-lowmem: use the btrfs_block_group_root helper
  btrfs-progs: handle no bg item in extent tree for free space tree
  btrfs-progs: mkfs: add support for the block group tree
  btrfs-progs: check: add block group tree support
  btrfs-progs: qgroup-verify: scan extents based on block groups
  btrfs-progs: check: make free space tree validation extent tree v2
    aware
  btrfs-progs: check: add helper to reinit the root based on a key
  btrfs-progs: check: handle the block group tree properly
  btrfs-progs: set the number of global roots in the super block
  btrfs-progs: handle the per-block group global root id
  btrfs-progs: add a btrfs_delete_and_free_root helper
  btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware
  btrfs-progs: make btrfs_create_tree take a key for the root key
  btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2
  btrfs-progs: mkfs: create the global root's
  btrfs-progs: check: don't do the root item check for extent tree v2

 check/main.c                    | 233 +++++++++++++++++--------------
 check/mode-lowmem.c             |  12 +-
 check/qgroup-verify.c           |  32 +++--
 cmds/inspect-dump-tree.c        |  30 +++-
 common/repair.c                 |   3 +
 kernel-shared/ctree.h           |   9 +-
 kernel-shared/disk-io.c         | 235 ++++++++++++++++++++++++--------
 kernel-shared/disk-io.h         |  15 +-
 kernel-shared/extent-tree.c     |  32 ++++-
 kernel-shared/free-space-tree.c |  72 +++++-----
 kernel-shared/print-tree.c      |  23 +++-
 kernel-shared/transaction.c     |   2 +
 mkfs/common.c                   |  94 ++++++++++---
 mkfs/common.h                   |  12 ++
 mkfs/main.c                     |  93 ++++++++++++-
 15 files changed, 658 insertions(+), 239 deletions(-)

-- 
2.26.3



* [PATCH v5 01/19] btrfs-progs: add support for loading the block group root
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 02/19] btrfs-progs: add print support for the block group tree Josef Bacik
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This adds the ability to load the block group root, and makes sure the
block group root is recorded correctly whenever the super block and its
backup roots are updated.
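
A simplified model of the source selection, for illustration only: the
location of the block group root comes from a backup root slot when one was
requested, and from the super block otherwise.  The demo_* types below are
hypothetical stand-ins, not the btrfs-progs structures or accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical triple describing where a tree root lives on disk. */
struct demo_root_ptr {
	uint64_t bytenr;
	uint64_t generation;
	uint8_t level;
};

/* Hypothetical stand-ins for a backup root slot and the super block. */
struct demo_backup {
	struct demo_root_ptr block_group_root;
};

struct demo_super {
	struct demo_root_ptr block_group_root;
};

/*
 * Pick the block group root pointer: from the backup slot when one was
 * requested (backup != NULL), otherwise from the super block.
 */
static struct demo_root_ptr demo_pick_bg_root(const struct demo_super *sb,
					      const struct demo_backup *backup)
{
	assert(sb);
	return backup ? backup->block_group_root : sb->block_group_root;
}
```

Passing NULL for the backup models a normal open; passing a slot models opening
with a backup root.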

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/ctree.h       |   1 +
 kernel-shared/disk-io.c     | 161 +++++++++++++++++++++++++++---------
 kernel-shared/disk-io.h     |  10 ++-
 kernel-shared/extent-tree.c |   8 +-
 kernel-shared/transaction.c |   2 +
 5 files changed, 138 insertions(+), 44 deletions(-)

diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index addfafc7..b12dbff1 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -1201,6 +1201,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *dev_root;
 	struct btrfs_root *quota_root;
 	struct btrfs_root *uuid_root;
+	struct btrfs_root *block_group_root;
 
 	struct rb_root global_roots_tree;
 	struct rb_root fs_root_tree;
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 0434ed7d..3d1157ad 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -857,6 +857,9 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
 		root = btrfs_global_root(fs_info, location);
 		return root ? root : ERR_PTR(-ENOENT);
 	}
+	if (location->objectid == BTRFS_BLOCK_GROUP_TREE_OBJECTID)
+		return fs_info->block_group_root ? fs_info->block_group_root :
+						ERR_PTR(-ENOENT);
 
 	BUG_ON(location->objectid == BTRFS_TREE_RELOC_OBJECTID);
 
@@ -895,6 +898,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 	free(fs_info->chunk_root);
 	free(fs_info->dev_root);
 	free(fs_info->uuid_root);
+	free(fs_info->block_group_root);
 	free(fs_info->super_copy);
 	free(fs_info->log_root_tree);
 	free(fs_info);
@@ -913,10 +917,12 @@ struct btrfs_fs_info *btrfs_new_fs_info(int writable, u64 sb_bytenr)
 	fs_info->dev_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->quota_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->uuid_root = calloc(1, sizeof(struct btrfs_root));
+	fs_info->block_group_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->super_copy = calloc(1, BTRFS_SUPER_INFO_SIZE);
 
 	if (!fs_info->tree_root || !fs_info->chunk_root || !fs_info->dev_root ||
-	    !fs_info->quota_root || !fs_info->uuid_root || !fs_info->super_copy)
+	    !fs_info->quota_root || !fs_info->uuid_root ||
+	    !fs_info->block_group_root || !fs_info->super_copy)
 		goto free_all;
 
 	extent_io_tree_init(&fs_info->extent_cache);
@@ -1040,7 +1046,7 @@ static int read_root_or_create_block(struct btrfs_fs_info *fs_info,
 static inline bool maybe_load_block_groups(struct btrfs_fs_info *fs_info,
 					   u64 flags)
 {
-	struct btrfs_root *root = btrfs_extent_root(fs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(fs_info);
 
 	if (flags & OPEN_CTREE_NO_BLOCK_GROUPS)
 		return false;
@@ -1051,7 +1057,6 @@ static inline bool maybe_load_block_groups(struct btrfs_fs_info *fs_info,
 	return false;
 }
 
-
 static int load_global_roots_objectid(struct btrfs_fs_info *fs_info,
 				      struct btrfs_path *path, u64 objectid,
 				      unsigned flags, char *str)
@@ -1202,43 +1207,99 @@ out:
 	return ret;
 }
 
-int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
-			  unsigned flags)
+static int load_important_roots(struct btrfs_fs_info *fs_info,
+				u64 root_tree_bytenr, unsigned flags)
 {
 	struct btrfs_super_block *sb = fs_info->super_copy;
+	struct btrfs_root_backup *backup = NULL;
 	struct btrfs_root *root;
-	struct btrfs_key key;
-	u64 generation;
+	u64 bytenr, gen;
 	int level;
+	int index = -1;
 	int ret;
 
-	root = fs_info->tree_root;
-	btrfs_setup_root(root, fs_info, BTRFS_ROOT_TREE_OBJECTID);
-	generation = btrfs_super_generation(sb);
-	level = btrfs_super_root_level(sb);
-
-	if (!root_tree_bytenr && !(flags & OPEN_CTREE_BACKUP_ROOT)) {
-		root_tree_bytenr = btrfs_super_root(sb);
-	} else if (flags & OPEN_CTREE_BACKUP_ROOT) {
-		struct btrfs_root_backup *backup;
-		int index = find_best_backup_root(sb);
+	if (flags & OPEN_CTREE_BACKUP_ROOT) {
+		index = find_best_backup_root(sb);
 		if (index >= BTRFS_NUM_BACKUP_ROOTS) {
 			fprintf(stderr, "Invalid backup root number\n");
 			return -EIO;
 		}
-		backup = fs_info->super_copy->super_roots + index;
-		root_tree_bytenr = btrfs_backup_tree_root(backup);
-		generation = btrfs_backup_tree_root_gen(backup);
+		backup = sb->super_roots + index;
+	}
+
+	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		free(fs_info->block_group_root);
+		fs_info->block_group_root = NULL;
+		goto tree_root;
+	}
+
+	if (backup) {
+		bytenr = btrfs_backup_block_group_root(backup);
+		gen = btrfs_backup_block_group_root_gen(backup);
+		level = btrfs_backup_block_group_root_level(backup);
+	} else {
+		bytenr = btrfs_super_block_group_root(sb);
+		gen = btrfs_super_block_group_root_generation(sb);
+		level = btrfs_super_block_group_root_level(sb);
+	}
+	root = fs_info->block_group_root;
+	btrfs_setup_root(root, fs_info, BTRFS_BLOCK_GROUP_TREE_OBJECTID);
+
+	ret = read_root_node(fs_info, root, bytenr, gen, level);
+	if (ret) {
+		fprintf(stderr, "Couldn't read block group root\n");
+		return -EIO;
+	}
+
+	if (maybe_load_block_groups(fs_info, flags)) {
+		int ret = btrfs_read_block_groups(fs_info);
+		if (ret < 0 && ret != -ENOENT) {
+			errno = -ret;
+			error("failed to read block groups: %m");
+			return ret;
+		}
+	}
+
+tree_root:
+	if (backup) {
+		bytenr = btrfs_backup_tree_root(backup);
+		gen = btrfs_backup_tree_root_gen(backup);
 		level = btrfs_backup_tree_root_level(backup);
+	} else {
+		if (root_tree_bytenr)
+			bytenr = root_tree_bytenr;
+		else
+			bytenr = btrfs_super_root(sb);
+		gen = btrfs_super_generation(sb);
+		level = btrfs_super_root_level(sb);
 	}
 
-	ret = read_root_node(fs_info, root, root_tree_bytenr, generation,
-			     level);
+	fs_info->generation = gen;
+	fs_info->last_trans_committed = gen;
+	root = fs_info->tree_root;
+	btrfs_setup_root(root, fs_info, BTRFS_ROOT_TREE_OBJECTID);
+
+	ret = read_root_node(fs_info, root, bytenr, gen, level);
 	if (ret) {
 		fprintf(stderr, "Couldn't read tree root\n");
 		return -EIO;
 	}
 
+	return 0;
+}
+
+int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
+			  unsigned flags)
+{
+	struct btrfs_super_block *sb = fs_info->super_copy;
+	struct btrfs_root *root = fs_info->tree_root;
+	struct btrfs_key key;
+	int ret;
+
+	ret = load_important_roots(fs_info, root_tree_bytenr, flags);
+	if (ret)
+		return ret;
+
 	ret = load_global_roots(fs_info, flags);
 	if (ret)
 		return ret;
@@ -1276,9 +1337,8 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 			return -EIO;
 	}
 
-	fs_info->generation = generation;
-	fs_info->last_trans_committed = generation;
-	if (maybe_load_block_groups(fs_info, flags)) {
+	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2) &&
+	    maybe_load_block_groups(fs_info, flags)) {
 		ret = btrfs_read_block_groups(fs_info);
 		/*
 		 * If we don't find any blockgroups (ENOENT) we're either
@@ -1321,6 +1381,8 @@ static void release_global_roots(struct btrfs_fs_info *fs_info)
 void btrfs_release_all_roots(struct btrfs_fs_info *fs_info)
 {
 	release_global_roots(fs_info);
+	if (fs_info->block_group_root)
+		free_extent_buffer(fs_info->block_group_root->node);
 	if (fs_info->quota_root)
 		free_extent_buffer(fs_info->quota_root->node);
 	if (fs_info->dev_root)
@@ -2066,8 +2128,6 @@ static int write_dev_supers(struct btrfs_fs_info *fs_info,
 static void backup_super_roots(struct btrfs_fs_info *info)
 {
 	struct btrfs_root_backup *root_backup;
-	struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
-	struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
 	int next_backup;
 	int last_backup;
 
@@ -2099,11 +2159,6 @@ static void backup_super_roots(struct btrfs_fs_info *info)
 	btrfs_set_backup_chunk_root_level(root_backup,
 			       btrfs_header_level(info->chunk_root->node));
 
-	btrfs_set_backup_extent_root(root_backup, extent_root->node->start);
-	btrfs_set_backup_extent_root_gen(root_backup,
-			       btrfs_header_generation(extent_root->node));
-	btrfs_set_backup_extent_root_level(root_backup,
-			       btrfs_header_level(extent_root->node));
 	/*
 	 * we might commit during log recovery, which happens before we set
 	 * the fs_root.  Make sure it is valid before we fill it in.
@@ -2123,18 +2178,37 @@ static void backup_super_roots(struct btrfs_fs_info *info)
 	btrfs_set_backup_dev_root_level(root_backup,
 				       btrfs_header_level(info->dev_root->node));
 
-	btrfs_set_backup_csum_root(root_backup, csum_root->node->start);
-	btrfs_set_backup_csum_root_gen(root_backup,
-			       btrfs_header_generation(csum_root->node));
-	btrfs_set_backup_csum_root_level(root_backup,
-			       btrfs_header_level(csum_root->node));
-
 	btrfs_set_backup_total_bytes(root_backup,
 			     btrfs_super_total_bytes(info->super_copy));
 	btrfs_set_backup_bytes_used(root_backup,
 			     btrfs_super_bytes_used(info->super_copy));
 	btrfs_set_backup_num_devices(root_backup,
 			     btrfs_super_num_devices(info->super_copy));
+
+	if (btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
+		btrfs_set_backup_block_group_root(root_backup,
+				info->block_group_root->node->start);
+		btrfs_set_backup_block_group_root_gen(root_backup,
+			btrfs_header_generation(info->block_group_root->node));
+		btrfs_set_backup_block_group_root_level(root_backup,
+			btrfs_header_level(info->block_group_root->node));
+	} else {
+		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
+		struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
+
+		btrfs_set_backup_csum_root(root_backup, csum_root->node->start);
+		btrfs_set_backup_csum_root_gen(root_backup,
+				btrfs_header_generation(csum_root->node));
+		btrfs_set_backup_csum_root_level(root_backup,
+				btrfs_header_level(csum_root->node));
+
+		btrfs_set_backup_extent_root(root_backup,
+					     extent_root->node->start);
+		btrfs_set_backup_extent_root_gen(root_backup,
+			btrfs_header_generation(extent_root->node));
+		btrfs_set_backup_extent_root_level(root_backup,
+			btrfs_header_level(extent_root->node));
+	}
 }
 
 int write_all_supers(struct btrfs_fs_info *fs_info)
@@ -2181,7 +2255,7 @@ int write_ctree_super(struct btrfs_trans_handle *trans)
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_root *tree_root = fs_info->tree_root;
 	struct btrfs_root *chunk_root = fs_info->chunk_root;
-
+	struct btrfs_root *block_group_root = fs_info->block_group_root;
 	if (fs_info->readonly)
 		return 0;
 
@@ -2198,6 +2272,15 @@ int write_ctree_super(struct btrfs_trans_handle *trans)
 	btrfs_set_super_chunk_root_generation(fs_info->super_copy,
 				btrfs_header_generation(chunk_root->node));
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		btrfs_set_super_block_group_root(fs_info->super_copy,
+						 block_group_root->node->start);
+		btrfs_set_super_block_group_root_generation(fs_info->super_copy,
+				btrfs_header_generation(block_group_root->node));
+		btrfs_set_super_block_group_root_level(fs_info->super_copy,
+				btrfs_header_level(block_group_root->node));
+	}
+
 	ret = write_all_supers(fs_info);
 	if (ret)
 		fprintf(stderr, "failed to write new super block err %d\n", ret);
diff --git a/kernel-shared/disk-io.h b/kernel-shared/disk-io.h
index d55ced1e..1e9044f8 100644
--- a/kernel-shared/disk-io.h
+++ b/kernel-shared/disk-io.h
@@ -223,9 +223,17 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     u64 objectid);
 struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info, u64 bytenr);
 struct btrfs_root *btrfs_extent_root(struct btrfs_fs_info *fs_inf, u64 bytenr);
-struct btrfs_root *btrfs_block_group_root(struct btrfs_fs_info *fs_info);
 struct btrfs_root *btrfs_global_root(struct btrfs_fs_info *fs_info,
 				     struct btrfs_key *key);
 int btrfs_global_root_insert(struct btrfs_fs_info *fs_info,
 			     struct btrfs_root *root);
+
+static inline struct btrfs_root *btrfs_block_group_root(
+						struct btrfs_fs_info *fs_info)
+{
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		return fs_info->block_group_root;
+	return btrfs_extent_root(fs_info, 0);
+}
+
 #endif
diff --git a/kernel-shared/extent-tree.c b/kernel-shared/extent-tree.c
index e36745ca..b2b99d4f 100644
--- a/kernel-shared/extent-tree.c
+++ b/kernel-shared/extent-tree.c
@@ -1540,7 +1540,7 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
 {
 	int ret;
 	struct btrfs_fs_info *fs_info = trans->fs_info;
-	struct btrfs_root *root = btrfs_extent_root(fs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(fs_info);
 	unsigned long bi;
 	struct btrfs_block_group_item bgi;
 	struct extent_buffer *leaf;
@@ -2731,7 +2731,7 @@ int btrfs_read_block_groups(struct btrfs_fs_info *fs_info)
 	int ret;
 	struct btrfs_key key;
 
-	root = btrfs_extent_root(fs_info, 0);
+	root = btrfs_block_group_root(fs_info);
 	key.objectid = 0;
 	key.offset = 0;
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
@@ -2812,7 +2812,7 @@ static int insert_block_group_item(struct btrfs_trans_handle *trans,
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
 	key.offset = block_group->length;
 
-	root = btrfs_extent_root(fs_info, 0);
+	root = btrfs_block_group_root(fs_info);
 	return btrfs_insert_item(trans, root, &key, &bgi, sizeof(bgi));
 }
 
@@ -2929,7 +2929,7 @@ static int remove_block_group_item(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_key key;
-	struct btrfs_root *root = btrfs_extent_root(fs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(fs_info);
 	int ret = 0;
 
 	key.objectid = block_group->start;
diff --git a/kernel-shared/transaction.c b/kernel-shared/transaction.c
index 5b991651..02012266 100644
--- a/kernel-shared/transaction.c
+++ b/kernel-shared/transaction.c
@@ -185,6 +185,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		goto commit_tree;
 	if (root == root->fs_info->chunk_root)
 		goto commit_tree;
+	if (root == root->fs_info->block_group_root)
+		goto commit_tree;
 
 	free_extent_buffer(root->commit_root);
 	root->commit_root = NULL;
-- 
2.26.3



* [PATCH v5 02/19] btrfs-progs: add print support for the block group tree
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 01/19] btrfs-progs: add support for loading the block group root Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 03/19] btrfs-progs: mkfs: use the btrfs_block_group_root helper Josef Bacik
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Add the appropriate support to the print tree and dump tree code to spit
out the block group tree.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 cmds/inspect-dump-tree.c   | 30 +++++++++++++++++++++++++++++-
 kernel-shared/print-tree.c | 23 +++++++++++++++++++----
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/cmds/inspect-dump-tree.c b/cmds/inspect-dump-tree.c
index 6332b46d..daa7f925 100644
--- a/cmds/inspect-dump-tree.c
+++ b/cmds/inspect-dump-tree.c
@@ -83,8 +83,14 @@ out:
 
 static void print_old_roots(struct btrfs_super_block *super)
 {
+	const char *extent_tree_str = "extent root";
 	struct btrfs_root_backup *backup;
 	int i;
+	bool extent_tree_v2 = (btrfs_super_incompat_flags(super) &
+		BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2);
+
+	if (extent_tree_v2)
+		extent_tree_str = "block group root";
 
 	for (i = 0; i < BTRFS_NUM_BACKUP_ROOTS; i++) {
 		backup = super->super_roots + i;
@@ -93,7 +99,7 @@ static void print_old_roots(struct btrfs_super_block *super)
 		       (unsigned long long)btrfs_backup_tree_root_gen(backup),
 		       (unsigned long long)btrfs_backup_tree_root(backup));
 
-		printf("\t\textent root gen %llu block %llu\n",
+		printf("\t\t%s gen %llu block %llu\n", extent_tree_str,
 		       (unsigned long long)btrfs_backup_extent_root_gen(backup),
 		       (unsigned long long)btrfs_backup_extent_root(backup));
 
@@ -510,6 +516,11 @@ static int cmd_inspect_dump_tree(const struct cmd_struct *cmd,
 				       info->log_root_tree->node->start,
 					btrfs_header_level(
 						info->log_root_tree->node));
+			if (info->block_group_root)
+				printf("block group tree: %llu level %d\n",
+				       info->block_group_root->node->start,
+					btrfs_header_level(
+						info->block_group_root->node));
 		} else {
 			if (info->tree_root->node) {
 				printf("root tree\n");
@@ -528,6 +539,12 @@ static int cmd_inspect_dump_tree(const struct cmd_struct *cmd,
 				btrfs_print_tree(info->log_root_tree->node,
 					BTRFS_PRINT_TREE_FOLLOW | print_mode);
 			}
+
+			if (info->block_group_root) {
+				printf("block group tree\n");
+				btrfs_print_tree(info->block_group_root->node,
+					BTRFS_PRINT_TREE_FOLLOW | print_mode);
+			}
 		}
 	}
 	tree_root_scan = info->tree_root;
@@ -573,6 +590,17 @@ again:
 		goto close_root;
 	}
 
+	if (tree_id && tree_id == BTRFS_BLOCK_GROUP_TREE_OBJECTID) {
+		if (!info->block_group_root) {
+			error("cannot print block group tree, invalid pointer");
+			goto close_root;
+		}
+		printf("block group tree\n");
+		btrfs_print_tree(info->block_group_root->node,
+					BTRFS_PRINT_TREE_FOLLOW | print_mode);
+		goto close_root;
+	}
+
 	key.offset = 0;
 	key.objectid = 0;
 	key.type = BTRFS_ROOT_ITEM_KEY;
diff --git a/kernel-shared/print-tree.c b/kernel-shared/print-tree.c
index 717be5d5..978d92bc 100644
--- a/kernel-shared/print-tree.c
+++ b/kernel-shared/print-tree.c
@@ -1872,8 +1872,14 @@ static int empty_backup(struct btrfs_root_backup *backup)
 	return 0;
 }
 
-static void print_root_backup(struct btrfs_root_backup *backup)
+static void print_root_backup(struct btrfs_root_backup *backup,
+			      bool extent_tree_v2)
 {
+	const char *extent_tree_str = "backup_extent_root";
+
+	if (extent_tree_v2)
+		extent_tree_str = "backup_block_group_root";
+
 	printf("\t\tbackup_tree_root:\t%llu\tgen: %llu\tlevel: %d\n",
 			btrfs_backup_tree_root(backup),
 			btrfs_backup_tree_root_gen(backup),
@@ -1882,7 +1888,8 @@ static void print_root_backup(struct btrfs_root_backup *backup)
 			btrfs_backup_chunk_root(backup),
 			btrfs_backup_chunk_root_gen(backup),
 			btrfs_backup_chunk_root_level(backup));
-	printf("\t\tbackup_extent_root:\t%llu\tgen: %llu\tlevel: %d\n",
+	printf("\t\t%s:\t%llu\tgen: %llu\tlevel: %d\n",
+			extent_tree_str,
 			btrfs_backup_extent_root(backup),
 			btrfs_backup_extent_root_gen(backup),
 			btrfs_backup_extent_root_level(backup));
@@ -1894,7 +1901,7 @@ static void print_root_backup(struct btrfs_root_backup *backup)
 			btrfs_backup_dev_root(backup),
 			btrfs_backup_dev_root_gen(backup),
 			btrfs_backup_dev_root_level(backup));
-	printf("\t\tbackup_csum_root:\t%llu\tgen: %llu\tlevel: %d\n",
+	printf("\t\tcsum_root:\t%llu\tgen: %llu\tlevel: %d\n",
 			btrfs_backup_csum_root(backup),
 			btrfs_backup_csum_root_gen(backup),
 			btrfs_backup_csum_root_level(backup));
@@ -1912,12 +1919,14 @@ static void print_backup_roots(struct btrfs_super_block *sb)
 {
 	struct btrfs_root_backup *backup;
 	int i;
+	bool extent_tree_v2 = (btrfs_super_incompat_flags(sb) &
+		BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2);
 
 	for (i = 0; i < BTRFS_NUM_BACKUP_ROOTS; i++) {
 		backup = sb->super_roots + i;
 		if (!empty_backup(backup)) {
 			printf("\tbackup %d:\n", i);
-			print_root_backup(backup);
+			print_root_backup(backup, extent_tree_v2);
 		}
 	}
 }
@@ -2034,6 +2043,12 @@ void btrfs_print_superblock(struct btrfs_super_block *sb, int full)
 	       (unsigned long long)btrfs_super_cache_generation(sb));
 	printf("uuid_tree_generation\t%llu\n",
 	       (unsigned long long)btrfs_super_uuid_tree_generation(sb));
+	printf("block_group_root\t%llu\n",
+	       (unsigned long long)btrfs_super_block_group_root(sb));
+	printf("block_group_root_generation\t%llu\n",
+	       (unsigned long long)btrfs_super_block_group_root_generation(sb));
+	printf("block_group_root_level\t%llu\n",
+	       (unsigned long long)btrfs_super_block_group_root_level(sb));
 
 	uuid_unparse(sb->dev_item.uuid, buf);
 	printf("dev_item.uuid\t\t%s\n", buf);
-- 
2.26.3



* [PATCH v5 03/19] btrfs-progs: mkfs: use the btrfs_block_group_root helper
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 01/19] btrfs-progs: add support for loading the block group root Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 02/19] btrfs-progs: add print support for the block group tree Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 04/19] btrfs-progs: check-lowmem: " Josef Bacik
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Instead of accessing the extent root directly for modifying block
groups, use the helper which will do the correct thing based on the
flags of the file system.
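
For readers without the tree at hand, the helper's behaviour can be modelled
in isolation as below.  This is a stand-alone sketch with hypothetical demo_*
types and a plain bool in place of the incompat flag check; it is not the
btrfs-progs definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree root. */
struct demo_root {
	const char *name;
};

/* Hypothetical stand-in for the fs info; the bool models the incompat flag. */
struct demo_fs_info {
	bool extent_tree_v2;
	struct demo_root *extent_root;
	struct demo_root *block_group_root;
};

/*
 * Return the root that holds block group items: the dedicated block group
 * root with extent tree v2, the extent root otherwise.
 */
static struct demo_root *demo_block_group_root(struct demo_fs_info *fs_info)
{
	assert(fs_info);
	if (fs_info->extent_tree_v2)
		return fs_info->block_group_root;
	return fs_info->extent_root;
}
```

Callers that insert, update, or remove block group items then need no flag
checks of their own.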

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c | 4 ++--
 mkfs/main.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/check/main.c b/check/main.c
index 39cb1ce5..6ddfd18a 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9426,6 +9426,7 @@ static int reinit_global_roots(struct btrfs_trans_handle *trans, u64 objectid)
 
 static int reinit_extent_tree(struct btrfs_trans_handle *trans, bool pin)
 {
+	struct btrfs_root *bg_root = btrfs_block_group_root(trans->fs_info);
 	u64 start = 0;
 	int ret;
 
@@ -9499,7 +9500,6 @@ again:
 	while (1) {
 		struct btrfs_block_group_item bgi;
 		struct btrfs_block_group *cache;
-		struct btrfs_root *extent_root = btrfs_extent_root(gfs_info, 0);
 		struct btrfs_key key;
 
 		cache = btrfs_lookup_first_block_group(gfs_info, start);
@@ -9513,7 +9513,7 @@ again:
 		key.objectid = cache->start;
 		key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
 		key.offset = cache->length;
-		ret = btrfs_insert_item(trans, extent_root, &key, &bgi,
+		ret = btrfs_insert_item(trans, bg_root, &key, &bgi,
 					sizeof(bgi));
 		if (ret) {
 			fprintf(stderr, "Error adding block group\n");
diff --git a/mkfs/main.c b/mkfs/main.c
index 3dd06979..20dc0436 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -596,7 +596,7 @@ static int cleanup_temp_chunks(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_trans_handle *trans = NULL;
 	struct btrfs_block_group_item *bgi;
-	struct btrfs_root *root = btrfs_extent_root(fs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(fs_info);
 	struct btrfs_key key;
 	struct btrfs_key found_key;
 	struct btrfs_path path;
-- 
2.26.3



* [PATCH v5 04/19] btrfs-progs: check-lowmem: use the btrfs_block_group_root helper
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (2 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 03/19] btrfs-progs: mkfs: use the btrfs_block_group_root helper Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 05/19] btrfs-progs: handle no bg item in extent tree for free space tree Josef Bacik
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When we're messing with block group items, use the
btrfs_block_group_root() helper to get the correct root to search; it
does the right thing based on the file system flags.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/mode-lowmem.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 99d04945..8535e684 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -266,7 +266,7 @@ static int modify_block_group_cache(struct btrfs_block_group *block_group, int c
  */
 static int modify_block_groups_cache(u64 flags, int cache)
 {
-	struct btrfs_root *root = btrfs_extent_root(gfs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(gfs_info);
 	struct btrfs_key key;
 	struct btrfs_path path;
 	struct btrfs_block_group *bg_cache;
@@ -331,7 +331,7 @@ static int clear_block_groups_full(u64 flags)
 static int create_chunk_and_block_group(u64 flags, u64 *start, u64 *nbytes)
 {
 	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = btrfs_extent_root(gfs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(gfs_info);
 	int ret;
 
 	if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
@@ -419,7 +419,7 @@ static int is_chunk_almost_full(u64 start)
 {
 	struct btrfs_path path;
 	struct btrfs_key key;
-	struct btrfs_root *root = btrfs_extent_root(gfs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(gfs_info);
 	struct btrfs_block_group_item *bi;
 	struct btrfs_block_group_item bg_item;
 	struct extent_buffer *eb;
@@ -4601,7 +4601,7 @@ next:
 static int find_block_group_item(struct btrfs_path *path, u64 bytenr, u64 len,
 				 u64 type)
 {
-	struct btrfs_root *root = btrfs_extent_root(gfs_info, 0);
+	struct btrfs_root *root = btrfs_block_group_root(gfs_info);
 	struct btrfs_block_group_item bgi;
 	struct btrfs_key key;
 	int ret;
-- 
2.26.3



* [PATCH v5 05/19] btrfs-progs: handle no bg item in extent tree for free space tree
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (3 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 04/19] btrfs-progs: check-lowmem: " Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 06/19] btrfs-progs: mkfs: add support for the block group tree Josef Bacik
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have an ASSERT(ret == 0) when populating the free space tree as we
should at least find the block group item with extent tree v1.  However
with v2 we no longer have the block group item in the extent tree, so
fix the population logic to handle an empty block group (which occurs
during mkfs) and only assert if ret != 0 and we don't have extent tree
v2 turned on.
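
As a rough illustration of why hoisting the start/end setup lets the empty
case fall through to the tail check, here is a minimal standalone sketch.
The struct extent_range type and free_bytes_in_block_group() are
hypothetical stand-ins, not btrfs-progs code; they only mimic the shape of
the loop and the "done:" handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent item covering [start, start + len). */
struct extent_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Sum the free bytes in the block group [bg_start, bg_start + bg_len),
 * given its allocated extents sorted by start.  With nr == 0 the loop is
 * skipped and the tail check reports the whole block group as free.
 */
static uint64_t free_bytes_in_block_group(uint64_t bg_start, uint64_t bg_len,
					  const struct extent_range *extents,
					  size_t nr)
{
	/* Set up before any search result is looked at, as in the patch. */
	uint64_t start = bg_start;
	uint64_t end = bg_start + bg_len;
	uint64_t free_bytes = 0;
	size_t i;

	assert(nr == 0 || extents != NULL);

	for (i = 0; i < nr; i++) {
		/* Gap between the previous extent and this one is free. */
		if (start < extents[i].start)
			free_bytes += extents[i].start - start;
		start = extents[i].start + extents[i].len;
	}

	/* Equivalent of the "done:" label: account for the tail. */
	if (start < end)
		free_bytes += end - start;
	return free_bytes;
}
```

Calling this with no extents returns the full block group length, which is
the mkfs case the commit message describes.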

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/free-space-tree.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel-shared/free-space-tree.c b/kernel-shared/free-space-tree.c
index 0fdf5004..896bd3a2 100644
--- a/kernel-shared/free-space-tree.c
+++ b/kernel-shared/free-space-tree.c
@@ -1057,6 +1057,9 @@ int populate_free_space_tree(struct btrfs_trans_handle *trans,
 	if (ret)
 		goto out;
 
+	start = block_group->start;
+	end = block_group->start + block_group->length;
+
 	/*
 	 * Iterate through all of the extent and metadata items in this block
 	 * group, adding the free space between them and the free space at the
@@ -1071,10 +1074,11 @@ int populate_free_space_tree(struct btrfs_trans_handle *trans,
 	ret = btrfs_search_slot_for_read(extent_root, &key, path, 1, 0);
 	if (ret < 0)
 		goto out;
-	ASSERT(ret == 0);
+	if (ret > 0) {
+		ASSERT(btrfs_fs_incompat(trans->fs_info, EXTENT_TREE_V2));
+		goto done;
+	}
 
-	start = block_group->start;
-	end = block_group->start + block_group->length;
 	while (1) {
 		btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
 
@@ -1106,6 +1110,7 @@ int populate_free_space_tree(struct btrfs_trans_handle *trans,
 		if (ret)
 			break;
 	}
+done:
 	if (start < end) {
 		ret = __add_to_free_space_tree(trans, block_group, path2,
 				start, end - start);
-- 
2.26.3


* [PATCH v5 06/19] btrfs-progs: mkfs: add support for the block group tree
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (4 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 05/19] btrfs-progs: handle no bg item in extent tree for free space tree Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 07/19] btrfs-progs: check: add block group tree support Josef Bacik
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Add the extent tree v2 table with the block group tree as a root, and
then create the empty root and use the proper root for cleaning up the
temporary block groups.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 mkfs/common.c | 93 ++++++++++++++++++++++++++++++++++++++++-----------
 mkfs/common.h | 12 +++++++
 mkfs/main.c   |  5 +++
 3 files changed, 91 insertions(+), 19 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index 11d92c8b..aa65543b 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -39,6 +39,7 @@ static u64 reference_root_table[] = {
 	[MKFS_FS_TREE]		=	BTRFS_FS_TREE_OBJECTID,
 	[MKFS_CSUM_TREE]	=	BTRFS_CSUM_TREE_OBJECTID,
 	[MKFS_FREE_SPACE_TREE]	=	BTRFS_FREE_SPACE_TREE_OBJECTID,
+	[MKFS_BLOCK_GROUP_TREE]	=	BTRFS_BLOCK_GROUP_TREE_OBJECTID,
 };
 
 static int btrfs_write_empty_tree(int fd, struct btrfs_mkfs_config *cfg,
@@ -97,7 +98,8 @@ static int btrfs_create_tree_root(int fd, struct btrfs_mkfs_config *cfg,
 
 	for (i = 0; i < blocks_nr; i++) {
 		blk = blocks[i];
-		if (blk == MKFS_ROOT_TREE || blk == MKFS_CHUNK_TREE)
+		if (blk == MKFS_ROOT_TREE || blk == MKFS_CHUNK_TREE ||
+		    blk == MKFS_BLOCK_GROUP_TREE)
 			continue;
 
 		btrfs_set_root_bytenr(&root_item, cfg->blocks[blk]);
@@ -187,6 +189,50 @@ static int create_free_space_tree(int fd, struct btrfs_mkfs_config *cfg,
 	return 0;
 }
 
+static void write_block_group_item(struct extent_buffer *buf, u32 nr,
+				   u64 objectid, u64 offset, u64 used,
+				   u32 itemoff)
+{
+	struct btrfs_block_group_item *bg_item;
+	struct btrfs_disk_key disk_key;
+
+	btrfs_set_disk_key_objectid(&disk_key, objectid);
+	btrfs_set_disk_key_offset(&disk_key, offset);
+	btrfs_set_disk_key_type(&disk_key, BTRFS_BLOCK_GROUP_ITEM_KEY);
+	btrfs_set_item_key(buf, &disk_key, nr);
+	btrfs_set_item_offset(buf, nr, itemoff);
+	btrfs_set_item_size(buf, nr, sizeof(*bg_item));
+
+	bg_item = btrfs_item_ptr(buf, nr, struct btrfs_block_group_item);
+	btrfs_set_block_group_used(buf, bg_item, used);
+	btrfs_set_block_group_flags(buf, bg_item, BTRFS_BLOCK_GROUP_SYSTEM);
+	btrfs_set_block_group_chunk_objectid(buf, bg_item,
+					     BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+}
+
+static int create_block_group_tree(int fd, struct btrfs_mkfs_config *cfg,
+				   struct extent_buffer *buf,
+				   u64 bg_offset, u64 bg_size, u64 bg_used)
+{
+	int ret;
+
+	memset(buf->data + sizeof(struct btrfs_header), 0,
+		cfg->nodesize - sizeof(struct btrfs_header));
+	write_block_group_item(buf, 0, bg_offset, bg_size, bg_used,
+			       cfg->leaf_data_size -
+			       sizeof(struct btrfs_block_group_item));
+	btrfs_set_header_bytenr(buf, cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
+	btrfs_set_header_owner(buf, BTRFS_BLOCK_GROUP_TREE_OBJECTID);
+	btrfs_set_header_nritems(buf, 1);
+	csum_tree_block_size(buf, btrfs_csum_type_size(cfg->csum_type), 0,
+			     cfg->csum_type);
+	ret = pwrite(fd, buf->data, cfg->nodesize,
+		     cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
+	if (ret != cfg->nodesize)
+		return ret < 0 ? -errno : -EIO;
+	return 0;
+}
+
 /*
  * @fs_uuid - if NULL, generates a UUID, returns back the new filesystem UUID
  *
@@ -239,11 +285,19 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	bool add_block_group = true;
 	bool free_space_tree = !!(cfg->runtime_features &
 				  BTRFS_RUNTIME_FEATURE_FREE_SPACE_TREE);
+	bool extent_tree_v2 = !!(cfg->features &
+				 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2);
 
 	/* Don't include the free space tree in the blocks to process. */
 	if (!free_space_tree)
 		blocks_nr--;
 
+	if (extent_tree_v2) {
+		blocks = extent_tree_v2_blocks;
+		blocks_nr = ARRAY_SIZE(extent_tree_v2_blocks);
+		add_block_group = false;
+	}
+
 	if ((cfg->features & BTRFS_FEATURE_INCOMPAT_ZONED)) {
 		system_group_offset = cfg->zone_size * BTRFS_NR_SB_LOG_ZONES;
 		system_group_size = cfg->zone_size;
@@ -300,6 +354,12 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 		btrfs_set_super_compat_ro_flags(&super, ro_flags);
 		btrfs_set_super_cache_generation(&super, 0);
 	}
+	if (extent_tree_v2) {
+		btrfs_set_super_block_group_root(&super,
+						 cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
+		btrfs_set_super_block_group_root_generation(&super, 1);
+		btrfs_set_super_block_group_root_level(&super, 0);
+	}
 	if (cfg->label)
 		__strncpy_null(super.label, cfg->label, BTRFS_LABEL_SIZE - 1);
 
@@ -331,25 +391,12 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 
 		/* Add the block group item for our temporary chunk. */
 		if (cfg->blocks[blk] > system_group_offset && add_block_group) {
-			struct btrfs_block_group_item *bg_item;
-
+			itemoff -= sizeof(struct btrfs_block_group_item);
+			write_block_group_item(buf, nritems,
+					       system_group_offset,
+					       system_group_size, total_used,
+					       itemoff);
 			add_block_group = false;
-
-			itemoff -= sizeof(*bg_item);
-			btrfs_set_disk_key_objectid(&disk_key, system_group_offset);
-			btrfs_set_disk_key_offset(&disk_key, system_group_size);
-			btrfs_set_disk_key_type(&disk_key, BTRFS_BLOCK_GROUP_ITEM_KEY);
-			btrfs_set_item_key(buf, &disk_key, nritems);
-			btrfs_set_item_offset(buf, nritems, itemoff);
-			btrfs_set_item_size(buf, nritems, sizeof(*bg_item));
-
-			bg_item = btrfs_item_ptr(buf, nritems,
-						 struct btrfs_block_group_item);
-			btrfs_set_block_group_used(buf, bg_item, total_used);
-			btrfs_set_block_group_flags(buf, bg_item,
-						    BTRFS_BLOCK_GROUP_SYSTEM);
-			btrfs_set_block_group_chunk_objectid(buf, bg_item,
-					BTRFS_FIRST_CHUNK_TREE_OBJECTID);
 			nritems++;
 		}
 
@@ -565,6 +612,14 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 			goto out;
 	}
 
+	if (extent_tree_v2) {
+		ret = create_block_group_tree(fd, cfg, buf,
+					      system_group_offset,
+					      system_group_size, total_used);
+		if (ret)
+			goto out;
+	}
+
 	/* and write out the super block */
 	memset(buf->data, 0, BTRFS_SUPER_INFO_SIZE);
 	memcpy(buf->data, &super, sizeof(super));
diff --git a/mkfs/common.h b/mkfs/common.h
index 428cd366..3533e114 100644
--- a/mkfs/common.h
+++ b/mkfs/common.h
@@ -51,6 +51,7 @@ enum btrfs_mkfs_block {
 	MKFS_FS_TREE,
 	MKFS_CSUM_TREE,
 	MKFS_FREE_SPACE_TREE,
+	MKFS_BLOCK_GROUP_TREE,
 	MKFS_BLOCK_COUNT
 };
 
@@ -69,6 +70,17 @@ static const enum btrfs_mkfs_block extent_tree_v1_blocks[] = {
 	MKFS_FREE_SPACE_TREE,
 };
 
+static const enum btrfs_mkfs_block extent_tree_v2_blocks[] = {
+	MKFS_ROOT_TREE,
+	MKFS_EXTENT_TREE,
+	MKFS_CHUNK_TREE,
+	MKFS_DEV_TREE,
+	MKFS_FS_TREE,
+	MKFS_CSUM_TREE,
+	MKFS_FREE_SPACE_TREE,
+	MKFS_BLOCK_GROUP_TREE,
+};
+
 struct btrfs_mkfs_config {
 	/* Label of the new filesystem */
 	const char *label;
diff --git a/mkfs/main.c b/mkfs/main.c
index 20dc0436..7f79ba1a 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -299,6 +299,11 @@ static int recow_roots(struct btrfs_trans_handle *trans,
 	ret = __recow_root(trans, info->dev_root);
 	if (ret)
 		return ret;
+	if (btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
+		ret = __recow_root(trans, info->block_group_root);
+		if (ret)
+			return ret;
+	}
 	ret = recow_global_roots(trans);
 	if (ret)
 		return ret;
-- 
2.26.3


* [PATCH v5 07/19] btrfs-progs: check: add block group tree support
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (5 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 06/19] btrfs-progs: mkfs: add support for the block group tree Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 08/19] btrfs-progs: qgroup-verify: scan extents based on block groups Josef Bacik
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This makes the appropriate changes to enable block group tree checking
for both lowmem and normal check modes.  This is relatively
straightforward: we simply need to use the helper to get the right root
for dealing with block groups.
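
The ownership rule that check_type_with_root() ends up enforcing for block
group items can be sketched on its own.  This is only an illustration: the
TOY_* ids and block_group_item_root_ok() are made up for the example and
are not the real btrfs object ids or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative root ids for this sketch only; not taken from the patch. */
#define TOY_EXTENT_TREE_ID	2
#define TOY_BLOCK_GROUP_TREE_ID	11

/*
 * A block group item must live in the block group tree when extent tree
 * v2 is enabled, and in the extent tree otherwise.
 */
static bool block_group_item_root_ok(uint64_t rootid, bool extent_tree_v2)
{
	uint64_t expected = extent_tree_v2 ? TOY_BLOCK_GROUP_TREE_ID :
					     TOY_EXTENT_TREE_ID;

	return rootid == expected;
}
```

A block group item found in the extent tree is therefore valid on a v1
file system and an error on a v2 one.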

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c        | 21 ++++++++++++++++++++-
 check/mode-lowmem.c |  4 ++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/check/main.c b/check/main.c
index 6ddfd18a..a0f4ee91 100644
--- a/check/main.c
+++ b/check/main.c
@@ -6268,10 +6268,17 @@ static int check_type_with_root(u64 rootid, u8 key_type)
 		break;
 	case BTRFS_EXTENT_ITEM_KEY:
 	case BTRFS_METADATA_ITEM_KEY:
-	case BTRFS_BLOCK_GROUP_ITEM_KEY:
 		if (rootid != BTRFS_EXTENT_TREE_OBJECTID)
 			goto err;
 		break;
+	case BTRFS_BLOCK_GROUP_ITEM_KEY:
+		if (btrfs_fs_incompat(gfs_info, EXTENT_TREE_V2)) {
+			if (rootid != BTRFS_BLOCK_GROUP_TREE_OBJECTID)
+				goto err;
+		} else if (rootid != BTRFS_EXTENT_TREE_OBJECTID) {
+			goto err;
+		}
+		break;
 	case BTRFS_ROOT_ITEM_KEY:
 		if (rootid != BTRFS_ROOT_TREE_OBJECTID)
 			goto err;
@@ -9492,6 +9499,18 @@ again:
 		return ret;
 	}
 
+	/*
+	 * If we are extent tree v2 then we can reinit the block group root as
+	 * well.
+	 */
+	if (btrfs_fs_incompat(gfs_info, EXTENT_TREE_V2)) {
+		ret = btrfs_fsck_reinit_root(trans, gfs_info->block_group_root);
+		if (ret) {
+			fprintf(stderr, "block group initialization failed\n");
+			return ret;
+		}
+	}
+
 	/*
 	 * Now we have all the in-memory block groups setup so we can make
 	 * allocations properly, and the metadata we care about is safe since we
diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 8535e684..68c1adfd 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -5540,7 +5540,7 @@ int check_chunks_and_extents_lowmem(void)
 	key.offset = 0;
 	key.type = BTRFS_ROOT_ITEM_KEY;
 
-	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	ret = btrfs_search_slot(NULL, gfs_info->tree_root, &key, &path, 0, 0);
 	if (ret) {
 		error("cannot find extent tree in tree_root");
 		goto out;
@@ -5575,7 +5575,7 @@ int check_chunks_and_extents_lowmem(void)
 		if (ret)
 			goto out;
 next:
-		ret = btrfs_next_item(root, &path);
+		ret = btrfs_next_item(gfs_info->tree_root, &path);
 		if (ret)
 			goto out;
 	}
-- 
2.26.3


* [PATCH v5 08/19] btrfs-progs: qgroup-verify: scan extents based on block groups
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (6 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 07/19] btrfs-progs: check: add block group tree support Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 09/19] btrfs-progs: check: make free space tree validation extent tree v2 aware Josef Bacik
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When we switch to per-block group extent roots we'll need to scan each
individual extent root.  To make this easier in the future, go ahead and
scan the extents using the range of each block group.
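
A small sketch of the range arithmetic involved, assuming (as the calls in
this patch suggest) that the scan takes an inclusive end.  struct bg_range,
bg_last_byte() and scanned_bytes() are hypothetical names for this example
only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified block group covering [start, start + length). */
struct bg_range {
	uint64_t start;
	uint64_t length;
};

/* Inclusive last byte of the block group, as handed to a range scan. */
static uint64_t bg_last_byte(const struct bg_range *bg)
{
	assert(bg->length > 0);
	return bg->start + bg->length - 1;
}

/* Total bytes covered when scanning each block group's range in turn. */
static uint64_t scanned_bytes(const struct bg_range *bgs, int nr)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < nr; i++)
		total += bg_last_byte(&bgs[i]) - bgs[i].start + 1;
	return total;
}
```

Unlike a single scan over 0..~0ULL, the per-block group loop never looks at
ranges that fall between block groups.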

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/qgroup-verify.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/check/qgroup-verify.c b/check/qgroup-verify.c
index 2c05f875..6e012c1f 100644
--- a/check/qgroup-verify.c
+++ b/check/qgroup-verify.c
@@ -1400,6 +1400,7 @@ static bool is_bad_qgroup(struct qgroup_count *count)
  */
 int qgroup_verify_all(struct btrfs_fs_info *info)
 {
+	struct rb_node *n;
 	int ret;
 	bool found_err = false;
 	bool skip_err = false;
@@ -1430,10 +1431,17 @@ int qgroup_verify_all(struct btrfs_fs_info *info)
 	/*
 	 * Put all extent refs into our rbtree
 	 */
-	ret = scan_extents(info, 0, ~0ULL);
-	if (ret) {
-		fprintf(stderr, "ERROR: while scanning extent tree: %d\n", ret);
-		goto out;
+	for (n = rb_first(&info->block_group_cache_tree); n; n = rb_next(n)) {
+		struct btrfs_block_group *bg;
+
+		bg = rb_entry(n, struct btrfs_block_group, cache_node);
+		ret = scan_extents(info, bg->start,
+				   bg->start + bg->length - 1);
+		if (ret) {
+			fprintf(stderr, "ERROR: while scanning extent tree: %d\n",
+				ret);
+			goto out;
+		}
 	}
 
 	ret = map_implied_refs(info);
@@ -1507,6 +1515,7 @@ static void print_subvol_info(u64 subvolid, u64 bytenr, u64 num_bytes,
 
 int print_extent_state(struct btrfs_fs_info *info, u64 subvol)
 {
+	struct rb_node *n;
 	int ret;
 
 	tree_blocks = ulist_alloc(0);
@@ -1519,10 +1528,17 @@ int print_extent_state(struct btrfs_fs_info *info, u64 subvol)
 	/*
 	 * Put all extent refs into our rbtree
 	 */
-	ret = scan_extents(info, 0, ~0ULL);
-	if (ret) {
-		fprintf(stderr, "ERROR: while scanning extent tree: %d\n", ret);
-		goto out;
+	for (n = rb_first(&info->block_group_cache_tree); n; n = rb_next(n)) {
+		struct btrfs_block_group *bg;
+
+		bg = rb_entry(n, struct btrfs_block_group, cache_node);
+		ret = scan_extents(info, bg->start,
+				   bg->start + bg->length - 1);
+		if (ret) {
+			fprintf(stderr, "ERROR: while scanning extent tree: %d\n",
+				ret);
+			goto out;
+		}
 	}
 
 	ret = map_implied_refs(info);
-- 
2.26.3


* [PATCH v5 09/19] btrfs-progs: check: make free space tree validation extent tree v2 aware
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (7 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 08/19] btrfs-progs: qgroup-verify: scan extents based on block groups Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 10/19] btrfs-progs: check: add helper to reinit the root based on a key Josef Bacik
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The free space tree needs to be validated against all referenced blocks
in the file system, so use the btrfs_mark_used_blocks() helper to build
the set of used ranges that the free space tree and free space cache are
checked against.  This will do the right thing for both extent tree v1
and extent tree v2.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c | 90 ++++++++++++++++++----------------------------------
 1 file changed, 31 insertions(+), 59 deletions(-)

diff --git a/check/main.c b/check/main.c
index a0f4ee91..9d090fdc 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5637,72 +5637,38 @@ static int check_cache_range(struct btrfs_root *root,
 }
 
 static int verify_space_cache(struct btrfs_root *root,
-			      struct btrfs_block_group *cache)
+			      struct btrfs_block_group *cache,
+			      struct extent_io_tree *used)
 {
-	struct btrfs_path path;
-	struct extent_buffer *leaf;
-	struct btrfs_key key;
-	u64 last;
+	u64 start, end, last_end, bg_end;
 	int ret = 0;
 
-	root = btrfs_extent_root(root->fs_info, cache->start);
+	start = cache->start;
+	bg_end = cache->start + cache->length;
+	last_end = start;
 
-	last = max_t(u64, cache->start, BTRFS_SUPER_INFO_OFFSET);
-
-	btrfs_init_path(&path);
-	key.objectid = last;
-	key.offset = 0;
-	key.type = BTRFS_EXTENT_ITEM_KEY;
-	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
-	if (ret < 0)
-		goto out;
-	ret = 0;
-	while (1) {
-		if (path.slots[0] >= btrfs_header_nritems(path.nodes[0])) {
-			ret = btrfs_next_leaf(root, &path);
-			if (ret < 0)
-				goto out;
-			if (ret > 0) {
-				ret = 0;
-				break;
-			}
-		}
-		leaf = path.nodes[0];
-		btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
-		if (key.objectid >= cache->start + cache->length)
+	while (start < bg_end) {
+		ret = find_first_extent_bit(used, cache->start, &start, &end,
+					    EXTENT_DIRTY);
+		if (ret || start >= bg_end) {
+			ret = 0;
 			break;
-		if (key.type != BTRFS_EXTENT_ITEM_KEY &&
-		    key.type != BTRFS_METADATA_ITEM_KEY) {
-			path.slots[0]++;
-			continue;
 		}
-
-		if (last == key.objectid) {
-			if (key.type == BTRFS_EXTENT_ITEM_KEY)
-				last = key.objectid + key.offset;
-			else
-				last = key.objectid + gfs_info->nodesize;
-			path.slots[0]++;
-			continue;
+		if (last_end < start) {
+			ret = check_cache_range(root, cache, last_end,
+						start - last_end);
+			if (ret)
+				return ret;
 		}
-
-		ret = check_cache_range(root, cache, last,
-					key.objectid - last);
-		if (ret)
-			break;
-		if (key.type == BTRFS_EXTENT_ITEM_KEY)
-			last = key.objectid + key.offset;
-		else
-			last = key.objectid + gfs_info->nodesize;
-		path.slots[0]++;
+		end = min(end, bg_end - 1);
+		clear_extent_dirty(used, start, end);
+		start = end + 1;
+		last_end = start;
 	}
 
-	if (last < cache->start + cache->length)
-		ret = check_cache_range(root, cache, last,
-					cache->start + cache->length - last);
-
-out:
-	btrfs_release_path(&path);
+	if (last_end < bg_end)
+		ret = check_cache_range(root, cache, last_end,
+					bg_end - last_end);
 
 	if (!ret &&
 	    !RB_EMPTY_ROOT(&cache->free_space_ctl->free_space_offset)) {
@@ -5716,11 +5682,17 @@ out:
 
 static int check_space_cache(struct btrfs_root *root)
 {
+	struct extent_io_tree used;
 	struct btrfs_block_group *cache;
 	u64 start = BTRFS_SUPER_INFO_OFFSET + BTRFS_SUPER_INFO_SIZE;
 	int ret;
 	int error = 0;
 
+	extent_io_tree_init(&used);
+	ret = btrfs_mark_used_blocks(gfs_info, &used);
+	if (ret)
+		return ret;
+
 	while (1) {
 		ctx.item_count++;
 		cache = btrfs_lookup_first_block_group(gfs_info, start);
@@ -5765,14 +5737,14 @@ static int check_space_cache(struct btrfs_root *root)
 				continue;
 		}
 
-		ret = verify_space_cache(root, cache);
+		ret = verify_space_cache(root, cache, &used);
 		if (ret) {
 			fprintf(stderr, "cache appears valid but isn't %llu\n",
 				cache->start);
 			error++;
 		}
 	}
-
+	extent_io_tree_cleanup(&used);
 	return error ? -EINVAL : 0;
 }
 
-- 
2.26.3


* [PATCH v5 10/19] btrfs-progs: check: add helper to reinit the root based on a key
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (8 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 09/19] btrfs-progs: check: make free space tree validation extent tree v2 aware Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 11/19] btrfs-progs: check: handle the block group tree properly Josef Bacik
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

In the case of per-bg roots we may be missing the root items.  To
re-initialize them we want to add the root item as well as allocate the
empty block.  To achieve this extract out the reinit root logic to a
helper that just takes the root key and then does the appropriate work
to allocate an empty root and update the root item.  Fix the normal
reinit root helper to use this new helper.
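
The insert-or-update pattern the new helper relies on can be shown with a
toy in-memory table.  Everything here (struct toy_table, toy_insert_empty(),
toy_reinit_root_item()) is hypothetical and only mirrors the control flow:
try to insert an empty item, treat -EEXIST as "reuse the existing one", and
then overwrite the fields that matter.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ITEMS 8

/* Hypothetical, trimmed-down root item. */
struct toy_root_item {
	uint64_t bytenr;
	uint64_t generation;
	uint32_t refs;
	uint32_t flags;
};

struct toy_table {
	uint64_t keys[TOY_MAX_ITEMS];
	struct toy_root_item items[TOY_MAX_ITEMS];
	int nr;
};

/*
 * Returns 0 and a zeroed new slot, -EEXIST and the existing slot, or
 * -ENOSPC if the table is full.
 */
static int toy_insert_empty(struct toy_table *t, uint64_t key, int *slot)
{
	int i;

	assert(t->nr <= TOY_MAX_ITEMS);
	for (i = 0; i < t->nr; i++) {
		if (t->keys[i] == key) {
			*slot = i;
			return -EEXIST;
		}
	}
	if (t->nr == TOY_MAX_ITEMS)
		return -ENOSPC;
	*slot = t->nr;
	t->keys[t->nr] = key;
	t->items[t->nr] = (struct toy_root_item){0};
	t->nr++;
	return 0;
}

/* Point the item for @key at a fresh empty block, creating it if needed. */
static int toy_reinit_root_item(struct toy_table *t, uint64_t key,
				uint64_t new_bytenr, uint64_t transid)
{
	int slot;
	int ret = toy_insert_empty(t, key, &slot);

	if (ret && ret != -EEXIST)
		return ret;
	/* Fields not set here (flags in this toy) keep their old values. */
	t->items[slot].bytenr = new_bytenr;
	t->items[slot].generation = transid;
	t->items[slot].refs = 1;
	return 0;
}
```

Either path leaves exactly one item for the key, so the caller does not need
to know whether the root item was missing beforehand.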

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c | 88 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 58 insertions(+), 30 deletions(-)

diff --git a/check/main.c b/check/main.c
index 9d090fdc..5b700350 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9129,29 +9129,34 @@ static int do_check_chunks_and_extents(void)
 	return ret;
 }
 
-static int btrfs_fsck_reinit_root(struct btrfs_trans_handle *trans,
-				  struct btrfs_root *root)
+static struct extent_buffer *btrfs_fsck_clear_root(
+					struct btrfs_trans_handle *trans,
+					struct btrfs_key *key)
 {
+	struct btrfs_root_item ri = {};
+	struct btrfs_path *path;
 	struct extent_buffer *c;
-	struct extent_buffer *old = root->node;
-	int level;
+	struct btrfs_disk_key disk_key = {};
 	int ret;
-	struct btrfs_disk_key disk_key = {0,0,0};
 
-	level = 0;
+	path = btrfs_alloc_path();
+	if (!path)
+		return ERR_PTR(-ENOMEM);
 
-	c = btrfs_alloc_free_block(trans, root, gfs_info->nodesize,
-				   root->root_key.objectid,
-				   &disk_key, level, 0, 0);
-	if (IS_ERR(c))
-		return PTR_ERR(c);
+	c = btrfs_alloc_free_block(trans, gfs_info->tree_root,
+				   gfs_info->nodesize, key->objectid,
+				   &disk_key, 0, 0, 0);
+	if (IS_ERR(c)) {
+		btrfs_free_path(path);
+		return c;
+	}
 
 	memset_extent_buffer(c, 0, 0, sizeof(struct btrfs_header));
-	btrfs_set_header_level(c, level);
+	btrfs_set_header_level(c, 0);
 	btrfs_set_header_bytenr(c, c->start);
 	btrfs_set_header_generation(c, trans->transid);
 	btrfs_set_header_backref_rev(c, BTRFS_MIXED_BACKREF_REV);
-	btrfs_set_header_owner(c, root->root_key.objectid);
+	btrfs_set_header_owner(c, key->objectid);
 
 	write_extent_buffer(c, gfs_info->fs_devices->metadata_uuid,
 			    btrfs_header_fsid(), BTRFS_FSID_SIZE);
@@ -9161,25 +9166,48 @@ static int btrfs_fsck_reinit_root(struct btrfs_trans_handle *trans,
 			    BTRFS_UUID_SIZE);
 
 	btrfs_mark_buffer_dirty(c);
+
 	/*
-	 * this case can happen in the following case:
-	 *
-	 * reinit reloc data root, this is because we skip pin
-	 * down reloc data tree before which means we can allocate
-	 * same block bytenr here.
+	 * The root item may not exist, try to insert an empty one so it exists,
+	 * otherwise simply update the existing one with the correct settings.
 	 */
-	if (old->start == c->start) {
-		btrfs_set_root_generation(&root->root_item,
-					  trans->transid);
-		root->root_item.level = btrfs_header_level(root->node);
-		ret = btrfs_update_root(trans, gfs_info->tree_root,
-					&root->root_key, &root->root_item);
-		if (ret) {
-			free_extent_buffer(c);
-			return ret;
-		}
-	}
-	free_extent_buffer(old);
+	ret = btrfs_insert_empty_item(trans, gfs_info->tree_root, path, key,
+				      sizeof(ri));
+	if (ret == -EEXIST) {
+		read_extent_buffer(path->nodes[0], &ri,
+				   btrfs_item_ptr_offset(path->nodes[0],
+							 path->slots[0]),
+				   sizeof(ri));
+	} else if (ret) {
+		btrfs_free_path(path);
+		free_extent_buffer(c);
+		return ERR_PTR(ret);
+	}
+	btrfs_set_root_bytenr(&ri, c->start);
+	btrfs_set_root_generation(&ri, trans->transid);
+	btrfs_set_root_refs(&ri, 1);
+	btrfs_set_root_used(&ri, c->len);
+	btrfs_set_root_generation_v2(&ri, trans->transid);
+
+	write_extent_buffer(path->nodes[0], &ri,
+			    btrfs_item_ptr_offset(path->nodes[0],
+						  path->slots[0]),
+			    sizeof(ri));
+	btrfs_mark_buffer_dirty(path->nodes[0]);
+	btrfs_free_path(path);
+	return c;
+}
+
+static int btrfs_fsck_reinit_root(struct btrfs_trans_handle *trans,
+				  struct btrfs_root *root)
+{
+	struct extent_buffer *c;
+
+	c = btrfs_fsck_clear_root(trans, &root->root_key);
+	if (IS_ERR(c))
+		return PTR_ERR(c);
+
+	free_extent_buffer(root->node);
 	root->node = c;
 	add_root_to_dirty_list(root);
 	return 0;
-- 
2.26.3


* [PATCH v5 11/19] btrfs-progs: check: handle the block group tree properly
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (9 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 10/19] btrfs-progs: check: add helper to reinit the root based on a key Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block Josef Bacik
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We need to make sure we process the block group root, and mark its
blocks as used for the free space tree checking.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c    | 27 +++++++++++++++++----------
 common/repair.c |  3 +++
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/check/main.c b/check/main.c
index 5b700350..45065989 100644
--- a/check/main.c
+++ b/check/main.c
@@ -8947,6 +8947,18 @@ out:
 	return ret;
 }
 
+static int load_super_root(struct list_head *head, struct btrfs_root *root)
+{
+	u8 level;
+
+	if (!root)
+		return 0;
+
+	level = btrfs_header_level(root->node);
+	return add_root_item_to_list(head, root->root_key.objectid,
+				     root->node->start, 0, level, 0, NULL);
+}
+
 static int check_chunks_and_extents(void)
 {
 	struct rb_root dev_cache;
@@ -8965,9 +8977,7 @@ static int check_chunks_and_extents(void)
 	int bits_nr;
 	struct list_head dropping_trees;
 	struct list_head normal_trees;
-	struct btrfs_root *root1;
 	struct btrfs_root *root;
-	u8 level;
 
 	root = gfs_info->fs_root;
 	dev_cache = RB_ROOT;
@@ -9000,16 +9010,13 @@ static int check_chunks_and_extents(void)
 	}
 
 again:
-	root1 = gfs_info->tree_root;
-	level = btrfs_header_level(root1->node);
-	ret = add_root_item_to_list(&normal_trees, root1->root_key.objectid,
-				    root1->node->start, 0, level, 0, NULL);
+	ret = load_super_root(&normal_trees, gfs_info->tree_root);
+	if (ret < 0)
+		goto out;
+	ret = load_super_root(&normal_trees, gfs_info->chunk_root);
 	if (ret < 0)
 		goto out;
-	root1 = gfs_info->chunk_root;
-	level = btrfs_header_level(root1->node);
-	ret = add_root_item_to_list(&normal_trees, root1->root_key.objectid,
-				    root1->node->start, 0, level, 0, NULL);
+	ret = load_super_root(&normal_trees, gfs_info->block_group_root);
 	if (ret < 0)
 		goto out;
 
diff --git a/common/repair.c b/common/repair.c
index a73949b0..37a6943f 100644
--- a/common/repair.c
+++ b/common/repair.c
@@ -149,6 +149,9 @@ int btrfs_mark_used_tree_blocks(struct btrfs_fs_info *fs_info,
 	ret = traverse_tree_blocks(tree, fs_info->chunk_root->node, 0);
 	if (!ret)
 		ret = traverse_tree_blocks(tree, fs_info->tree_root->node, 1);
+	if (!ret && fs_info->block_group_root)
+		ret = traverse_tree_blocks(tree,
+					   fs_info->block_group_root->node, 0);
 	return ret;
 }
 
-- 
2.26.3


* [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (10 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 11/19] btrfs-progs: check: handle the block group tree properly Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-08 16:19   ` David Sterba
  2022-03-07 22:10 ` [PATCH v5 13/19] btrfs-progs: handle the per-block group global root id Josef Bacik
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

In order to make sure the file system is consistent we need to record
the number of global roots we should have in the super block.  We could
infer this from the number of global roots we find, but that could
lead to interesting fuzzing problems, so add a source of truth to the
super block in order to make it easier to verify the file system is
consistent.
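
Since the new field is carved out of the reserved area, the overall size of
the structure must not change.  A toy version of that invariant, with a
heavily trimmed and purely hypothetical layout (not the real
btrfs_super_block), looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed tail of a super block before the change. */
struct toy_super_before {
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t block_group_root_level;
	uint8_t reserved8[7];
	uint64_t reserved[25];
};

/* Same tail after adding nr_global_roots. */
struct toy_super_after {
	uint64_t nr_global_roots;	/* new field */
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t block_group_root_level;
	uint8_t reserved8[7];
	uint64_t reserved[24];		/* one slot given up for the new field */
};

/* The on-disk size must be identical before and after. */
_Static_assert(sizeof(struct toy_super_before) ==
	       sizeof(struct toy_super_after),
	       "adding a field must shrink the reserved area to match");
```

Note that in this toy layout the fields following the new one shift by 8
bytes, which is only acceptable while the format is still unreleased.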

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/ctree.h   | 6 +++++-
 kernel-shared/disk-io.c | 4 ++++
 mkfs/common.c           | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index b12dbff1..90de7a65 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -463,13 +463,15 @@ struct btrfs_super_block {
 
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
 
+	__le64 nr_global_roots;
+
 	__le64 block_group_root;
 	__le64 block_group_root_generation;
 	u8 block_group_root_level;
 
 	/* future expansion */
 	u8 reserved8[7];
-	__le64 reserved[25];
+	__le64 reserved[24];
 	u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE];
 	struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS];
 	/* Padded to 4096 bytes */
@@ -2372,6 +2374,8 @@ BTRFS_SETGET_STACK_FUNCS(super_block_group_root_generation,
 			 block_group_root_generation, 64);
 BTRFS_SETGET_STACK_FUNCS(super_block_group_root_level,
 			 struct btrfs_super_block, block_group_root_level, 8);
+BTRFS_SETGET_STACK_FUNCS(super_nr_global_roots, struct btrfs_super_block,
+			 nr_global_roots, 64);
 
 static inline unsigned long btrfs_leaf_data(struct extent_buffer *l)
 {
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 3d1157ad..fcef6e97 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -1620,6 +1620,10 @@ static struct btrfs_fs_info *__open_ctree_fd(int fp, struct open_ctree_flags *oc
 	if (ret)
 		goto out_devices;
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		fs_info->nr_global_roots =
+			btrfs_super_nr_global_roots(fs_info->super_copy);
+
 	/*
 	 * fs_info->zone_size (and zoned) are not known before reading the
 	 * chunk tree, so it's 0 at this point. But, fs_info->zoned == 0
diff --git a/mkfs/common.c b/mkfs/common.c
index aa65543b..eac8c46c 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -355,6 +355,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 		btrfs_set_super_cache_generation(&super, 0);
 	}
 	if (extent_tree_v2) {
+		btrfs_set_super_nr_global_roots(&super, 1);
 		btrfs_set_super_block_group_root(&super,
 						 cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
 		btrfs_set_super_block_group_root_generation(&super, 1);
-- 
2.26.3


* [PATCH v5 13/19] btrfs-progs: handle the per-block group global root id
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (11 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:10 ` [PATCH v5 14/19] btrfs-progs: add a btrfs_delete_and_free_root helper Josef Bacik
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We will now be using block_group->chunk_objectid to point at the global
root id for this particular block group.  For now we'll assign this by
taking the block group's offset, scaled down to 1G (or 128M) units,
modulo the number of global roots, and update the block_group_item
accordingly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/ctree.h           |  2 ++
 kernel-shared/disk-io.c         | 24 ++++++++++++++++++++++--
 kernel-shared/disk-io.h         |  1 +
 kernel-shared/extent-tree.c     | 24 ++++++++++++++++++++++--
 kernel-shared/free-space-tree.c |  3 +++
 5 files changed, 50 insertions(+), 4 deletions(-)
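
Not part of the patch: a minimal standalone sketch of the assignment
described in the commit message, assuming the same thresholds as
calculate_global_root_id() in this patch (1G buckets, or 128M buckets when
the file system is 10G or smaller).  The sketch_* names and the plain
uint64_t arguments are made up for illustration and do not exist in
btrfs-progs.

```c
#include <stdint.h>

#define SKETCH_SZ_128M	(128ULL * 1024 * 1024)
#define SKETCH_SZ_1G	(1024ULL * 1024 * 1024)

/*
 * Hypothetical mirror of the extent tree v2 branch: bucket the block group
 * start by 1G (128M on small file systems), then wrap around the number of
 * global roots.  nr_global_roots must be non-zero.
 */
static uint64_t sketch_global_root_id(uint64_t total_bytes, uint64_t bg_start,
				      uint64_t nr_global_roots)
{
	uint64_t div = SKETCH_SZ_1G;

	if (total_bytes <= 10 * SKETCH_SZ_1G)
		div = SKETCH_SZ_128M;

	return (bg_start / div) % nr_global_roots;
}
```

With 4 global roots on a 100G file system, a block group starting at 5G
lands in root 1, and consecutive 1G buckets rotate through the roots.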

diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index 90de7a65..d79f49c9 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -1190,6 +1190,8 @@ struct btrfs_block_group {
 	 */
 	u64 alloc_offset;
 	u64 write_offset;
+
+	u64 global_root_id;
 };
 
 struct btrfs_device;
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index fcef6e97..59c46946 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -806,13 +806,33 @@ struct btrfs_root *btrfs_global_root(struct btrfs_fs_info *fs_info,
 	return NULL;
 }
 
+u64 btrfs_global_root_id(struct btrfs_fs_info *fs_info, u64 bytenr)
+{
+	struct btrfs_block_group *block_group;
+	u64 ret = 0;
+
+	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		return ret;
+
+	/*
+	 * We use this because we won't have this many global roots, and -1 is
+	 * special, so we need something that'll not be found if we have any
+	 * errors from here on.
+	 */
+	ret = BTRFS_LAST_FREE_OBJECTID;
+	block_group = btrfs_lookup_first_block_group(fs_info, bytenr);
+	if (block_group)
+		ret = block_group->global_root_id;
+	return ret;
+}
+
 struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info,
 				   u64 bytenr)
 {
 	struct btrfs_key key = {
 		.objectid = BTRFS_CSUM_TREE_OBJECTID,
 		.type = BTRFS_ROOT_ITEM_KEY,
-		.offset = 0,
+		.offset = btrfs_global_root_id(fs_info, bytenr),
 	};
 
 	return btrfs_global_root(fs_info, &key);
@@ -824,7 +844,7 @@ struct btrfs_root *btrfs_extent_root(struct btrfs_fs_info *fs_info,
 	struct btrfs_key key = {
 		.objectid = BTRFS_EXTENT_TREE_OBJECTID,
 		.type = BTRFS_ROOT_ITEM_KEY,
-		.offset = 0,
+		.offset = btrfs_global_root_id(fs_info, bytenr),
 	};
 
 	return btrfs_global_root(fs_info, &key);
diff --git a/kernel-shared/disk-io.h b/kernel-shared/disk-io.h
index 1e9044f8..81d8670f 100644
--- a/kernel-shared/disk-io.h
+++ b/kernel-shared/disk-io.h
@@ -225,6 +225,7 @@ struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info, u64 bytenr);
 struct btrfs_root *btrfs_extent_root(struct btrfs_fs_info *fs_inf, u64 bytenr);
 struct btrfs_root *btrfs_global_root(struct btrfs_fs_info *fs_info,
 				     struct btrfs_key *key);
+u64 btrfs_global_root_id(struct btrfs_fs_info *fs_info, u64 bytenr);
 int btrfs_global_root_insert(struct btrfs_fs_info *fs_info,
 			     struct btrfs_root *root);
 
diff --git a/kernel-shared/extent-tree.c b/kernel-shared/extent-tree.c
index b2b99d4f..697a8a1e 100644
--- a/kernel-shared/extent-tree.c
+++ b/kernel-shared/extent-tree.c
@@ -1561,7 +1561,7 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
 	btrfs_set_stack_block_group_used(&bgi, cache->used);
 	btrfs_set_stack_block_group_flags(&bgi, cache->flags);
 	btrfs_set_stack_block_group_chunk_objectid(&bgi,
-			BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+						   cache->global_root_id);
 	write_extent_buffer(leaf, &bgi, bi, sizeof(bgi));
 	btrfs_mark_buffer_dirty(leaf);
 fail:
@@ -2658,6 +2658,7 @@ static int read_block_group_item(struct btrfs_block_group *cache,
 			   sizeof(bgi));
 	cache->used = btrfs_stack_block_group_used(&bgi);
 	cache->flags = btrfs_stack_block_group_flags(&bgi);
+	cache->global_root_id = btrfs_stack_block_group_chunk_objectid(&bgi);
 
 	return 0;
 }
@@ -2765,6 +2766,24 @@ error:
 	return ret;
 }
 
+/*
+ * For extent tree v2 we use block_group_item->chunk_objectid to point at our
+ * global root id.  For v1 it's always set to BTRFS_FIRST_CHUNK_TREE_OBJECTID.
+ */
+static u64 calculate_global_root_id(struct btrfs_fs_info *fs_info, u64 offset)
+{
+	u64 div = SZ_1G;
+
+	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		return BTRFS_FIRST_CHUNK_TREE_OBJECTID;
+
+	/* On a smaller fs (<= 10G), index based on 128M instead of 1G. */
+	if (btrfs_super_total_bytes(fs_info->super_copy) <= (SZ_1G * 10ULL))
+		div = SZ_128M;
+
+	return (div_u64(offset, div) % fs_info->nr_global_roots);
+}
+
 struct btrfs_block_group *
 btrfs_add_block_group(struct btrfs_fs_info *fs_info, u64 bytes_used, u64 type,
 		      u64 chunk_offset, u64 size)
@@ -2776,6 +2795,7 @@ btrfs_add_block_group(struct btrfs_fs_info *fs_info, u64 bytes_used, u64 type,
 	BUG_ON(!cache);
 	cache->start = chunk_offset;
 	cache->length = size;
+	cache->global_root_id = calculate_global_root_id(fs_info, chunk_offset);
 
 	ret = btrfs_load_block_group_zone_info(fs_info, cache);
 	BUG_ON(ret);
@@ -2806,7 +2826,7 @@ static int insert_block_group_item(struct btrfs_trans_handle *trans,
 
 	btrfs_set_stack_block_group_used(&bgi, block_group->used);
 	btrfs_set_stack_block_group_chunk_objectid(&bgi,
-				BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+						   block_group->global_root_id);
 	btrfs_set_stack_block_group_flags(&bgi, block_group->flags);
 	key.objectid = block_group->start;
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
diff --git a/kernel-shared/free-space-tree.c b/kernel-shared/free-space-tree.c
index 896bd3a2..a82865d3 100644
--- a/kernel-shared/free-space-tree.c
+++ b/kernel-shared/free-space-tree.c
@@ -34,6 +34,9 @@ static struct btrfs_root *btrfs_free_space_root(struct btrfs_fs_info *fs_info,
 		.offset = 0,
 	};
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+		key.offset = block_group->global_root_id;
+
 	return btrfs_global_root(fs_info, &key);
 }
 
-- 
2.26.3



* [PATCH v5 14/19] btrfs-progs: add a btrfs_delete_and_free_root helper
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (12 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 13/19] btrfs-progs: handle the per-block group global root id Josef Bacik
@ 2022-03-07 22:10 ` Josef Bacik
  2022-03-07 22:11 ` [PATCH v5 15/19] btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware Josef Bacik
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:10 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The free space tree code already does this, but we need it for cleaning
up per block group roots.  Abstract this code out into a helper so that
we can use it in multiple places in the future.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/disk-io.c         | 25 +++++++++++++++++++++++++
 kernel-shared/disk-io.h         |  2 ++
 kernel-shared/free-space-tree.c | 24 +++---------------------
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 59c46946..f3ddf9e3 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -2399,6 +2399,31 @@ int btrfs_set_buffer_uptodate(struct extent_buffer *eb)
 	return set_extent_buffer_uptodate(eb);
 }
 
+int btrfs_delete_and_free_root(struct btrfs_trans_handle *trans,
+			       struct btrfs_root *root)
+{
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct btrfs_root *tree_root = fs_info->tree_root;
+	int ret;
+
+	ret = btrfs_del_root(trans, tree_root, &root->root_key);
+	if (ret)
+		return ret;
+
+	list_del(&root->dirty_list);
+	ret = clean_tree_block(root->node);
+	if (ret)
+		return ret;
+	ret = btrfs_free_tree_block(trans, root, root->node, 0, 1);
+	if (ret)
+		return ret;
+	rb_erase(&root->rb_node, &fs_info->global_roots_tree);
+	free_extent_buffer(root->node);
+	free_extent_buffer(root->commit_root);
+	kfree(root);
+	return 0;
+}
+
 struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     struct btrfs_fs_info *fs_info,
 				     u64 objectid)
diff --git a/kernel-shared/disk-io.h b/kernel-shared/disk-io.h
index 81d8670f..1e97b9ac 100644
--- a/kernel-shared/disk-io.h
+++ b/kernel-shared/disk-io.h
@@ -221,6 +221,8 @@ int btrfs_fs_roots_compare_roots(struct rb_node *node1, struct rb_node *node2);
 struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     struct btrfs_fs_info *fs_info,
 				     u64 objectid);
+int btrfs_delete_and_free_root(struct btrfs_trans_handle *trans,
+			       struct btrfs_root *root);
 struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info, u64 bytenr);
 struct btrfs_root *btrfs_extent_root(struct btrfs_fs_info *fs_inf, u64 bytenr);
 struct btrfs_root *btrfs_global_root(struct btrfs_fs_info *fs_info,
diff --git a/kernel-shared/free-space-tree.c b/kernel-shared/free-space-tree.c
index a82865d3..0a13b1d6 100644
--- a/kernel-shared/free-space-tree.c
+++ b/kernel-shared/free-space-tree.c
@@ -1257,27 +1257,9 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info)
 	if (ret)
 		goto abort;
 
-	ret = btrfs_del_root(trans, tree_root, &free_space_root->root_key);
-	if (ret)
-		goto abort;
-
-	list_del(&free_space_root->dirty_list);
-
-	ret = clean_tree_block(free_space_root->node);
-	if (ret)
-		goto abort;
-	ret = btrfs_free_tree_block(trans, free_space_root,
-				    free_space_root->node, 0, 1);
-	if (ret)
-		goto abort;
-
-	rb_erase(&free_space_root->rb_node, &fs_info->global_roots_tree);
-	free_extent_buffer(free_space_root->node);
-	free_extent_buffer(free_space_root->commit_root);
-	kfree(free_space_root);
-
-	ret = btrfs_commit_transaction(trans, tree_root);
-
+	ret = btrfs_delete_and_free_root(trans, free_space_root);
+	if (!ret)
+		ret = btrfs_commit_transaction(trans, tree_root);
 abort:
 	return ret;
 }
-- 
2.26.3



* [PATCH v5 15/19] btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (13 preceding siblings ...)
  2022-03-07 22:10 ` [PATCH v5 14/19] btrfs-progs: add a btrfs_delete_and_free_root helper Josef Bacik
@ 2022-03-07 22:11 ` Josef Bacik
  2022-03-07 22:11 ` [PATCH v5 16/19] btrfs-progs: make btrfs_create_tree take a key for the root key Josef Bacik
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:11 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

With extent tree v2 we'll have multiple free space trees, and we can't
just unset the feature flags for the free space tree.  Fix this to loop
through all of the free space trees and clear them out properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/free-space-tree.c | 37 ++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 10 deletions(-)
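
Not part of the patch: a small sketch of the loop shape used here, walking
root ids 0..nr_global_roots-1.  It uses a mock lookup table instead of
btrfs_global_root(), and it adds a NULL check that the patch itself does
not have; whether the real lookup can return NULL at this point is an
assumption of this sketch.  All sketch_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_root {
	uint64_t id;
	int cleared;
};

/* Hypothetical stand-in for btrfs_global_root(): NULL if the root is missing */
static struct sketch_root *sketch_lookup(struct sketch_root **roots,
					 uint64_t nr, uint64_t id)
{
	return id < nr ? roots[id] : NULL;
}

/* Clear every per-id free space root, stopping at the first missing one */
static int sketch_clear_all(struct sketch_root **roots,
			    uint64_t nr_global_roots)
{
	for (uint64_t id = 0; id < nr_global_roots; id++) {
		struct sketch_root *root;

		root = sketch_lookup(roots, nr_global_roots, id);
		if (!root)
			return -ENOENT;
		root->cleared = 1;	/* stands in for clear_free_space_tree() */
	}
	return 0;
}
```

Roots cleared before the failure stay cleared, so a caller of something
like this would still need to abort the transaction on error.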

diff --git a/kernel-shared/free-space-tree.c b/kernel-shared/free-space-tree.c
index 0a13b1d6..7ac75c20 100644
--- a/kernel-shared/free-space-tree.c
+++ b/kernel-shared/free-space-tree.c
@@ -1248,18 +1248,35 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info)
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	features = btrfs_super_compat_ro_flags(fs_info->super_copy);
-	features &= ~(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID |
-		      BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE);
-	btrfs_set_super_compat_ro_flags(fs_info->super_copy, features);
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		struct btrfs_key key = {
+			.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID,
+			.type = BTRFS_ROOT_ITEM_KEY,
+			.offset = 0,
+		};
+
+		while (key.offset < fs_info->nr_global_roots) {
+			free_space_root = btrfs_global_root(fs_info, &key);
+			ret = clear_free_space_tree(trans, free_space_root);
+			if (ret)
+				goto abort;
+			key.offset++;
+		}
+	} else {
+		features = btrfs_super_compat_ro_flags(fs_info->super_copy);
+		features &= ~(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID |
+			      BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE);
+		btrfs_set_super_compat_ro_flags(fs_info->super_copy, features);
 
-	ret = clear_free_space_tree(trans, free_space_root);
-	if (ret)
-		goto abort;
+		ret = clear_free_space_tree(trans, free_space_root);
+		if (ret)
+			goto abort;
 
-	ret = btrfs_delete_and_free_root(trans, free_space_root);
-	if (!ret)
-		ret = btrfs_commit_transaction(trans, tree_root);
+		ret = btrfs_delete_and_free_root(trans, free_space_root);
+		if (ret)
+			goto abort;
+	}
+	ret = btrfs_commit_transaction(trans, tree_root);
 abort:
 	return ret;
 }
-- 
2.26.3



* [PATCH v5 16/19] btrfs-progs: make btrfs_create_tree take a key for the root key
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (14 preceding siblings ...)
  2022-03-07 22:11 ` [PATCH v5 15/19] btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware Josef Bacik
@ 2022-03-07 22:11 ` Josef Bacik
  2022-03-07 22:11 ` [PATCH v5 17/19] btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2 Josef Bacik
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:11 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We're going to start creating global roots from mkfs, and we need to have
an offset set for the root key.  Make btrfs_create_tree() take a key for
the root_key instead of just the objectid so we can set up these new
style roots properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 kernel-shared/disk-io.c         | 21 ++++++++-------------
 kernel-shared/disk-io.h         |  2 +-
 kernel-shared/free-space-tree.c |  7 +++++--
 mkfs/main.c                     | 13 ++++++++++---
 4 files changed, 24 insertions(+), 19 deletions(-)
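
Not part of the patch: the converted callers rely on C designated
initializers leaving unnamed members zeroed, so the existing roots keep
offset 0 without spelling it out, while a global root can pass its id in
the offset.  A sketch with a simplified key struct (sketch_key and both
helpers are hypothetical; 132 is the value of BTRFS_ROOT_ITEM_KEY):

```c
#include <stdint.h>

#define SKETCH_ROOT_ITEM_KEY 132

struct sketch_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Key for a single-instance root: offset is implicitly zero */
static struct sketch_key sketch_plain_root_key(uint64_t objectid)
{
	struct sketch_key key = {
		.objectid = objectid,
		.type = SKETCH_ROOT_ITEM_KEY,
	};

	return key;
}

/* Key for one of several global roots: offset carries the global root id */
static struct sketch_key sketch_global_root_key(uint64_t objectid,
						uint64_t root_id)
{
	struct sketch_key key = {
		.objectid = objectid,
		.type = SKETCH_ROOT_ITEM_KEY,
		.offset = root_id,
	};

	return key;
}
```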

diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index f3ddf9e3..4964cd38 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -2426,25 +2426,22 @@ int btrfs_delete_and_free_root(struct btrfs_trans_handle *trans,
 
 struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     struct btrfs_fs_info *fs_info,
-				     u64 objectid)
+				     struct btrfs_key *key)
 {
 	struct extent_buffer *leaf;
 	struct btrfs_root *tree_root = fs_info->tree_root;
 	struct btrfs_root *root;
-	struct btrfs_key key;
 	int ret = 0;
 
 	root = kzalloc(sizeof(*root), GFP_KERNEL);
 	if (!root)
 		return ERR_PTR(-ENOMEM);
 
-	btrfs_setup_root(root, fs_info, objectid);
-	root->root_key.objectid = objectid;
-	root->root_key.type = BTRFS_ROOT_ITEM_KEY;
-	root->root_key.offset = 0;
+	btrfs_setup_root(root, fs_info, key->objectid);
+	memcpy(&root->root_key, key, sizeof(struct btrfs_key));
 
-	leaf = btrfs_alloc_free_block(trans, root, fs_info->nodesize, objectid,
-			NULL, 0, 0, 0);
+	leaf = btrfs_alloc_free_block(trans, root, fs_info->nodesize,
+				      root->root_key.objectid, NULL, 0, 0, 0);
 	if (IS_ERR(leaf)) {
 		ret = PTR_ERR(leaf);
 		leaf = NULL;
@@ -2455,7 +2452,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	btrfs_set_header_bytenr(leaf, leaf->start);
 	btrfs_set_header_generation(leaf, trans->transid);
 	btrfs_set_header_backref_rev(leaf, BTRFS_MIXED_BACKREF_REV);
-	btrfs_set_header_owner(leaf, objectid);
+	btrfs_set_header_owner(leaf, root->root_key.objectid);
 	root->node = leaf;
 	write_extent_buffer(leaf, fs_info->fs_devices->metadata_uuid,
 			    btrfs_header_fsid(), BTRFS_FSID_SIZE);
@@ -2480,10 +2477,8 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	memset(root->root_item.uuid, 0, BTRFS_UUID_SIZE);
 	root->root_item.drop_level = 0;
 
-	key.objectid = objectid;
-	key.type = BTRFS_ROOT_ITEM_KEY;
-	key.offset = 0;
-	ret = btrfs_insert_root(trans, tree_root, &key, &root->root_item);
+	ret = btrfs_insert_root(trans, tree_root, &root->root_key,
+				&root->root_item);
 	if (ret)
 		goto fail;
 
diff --git a/kernel-shared/disk-io.h b/kernel-shared/disk-io.h
index 1e97b9ac..e07141a9 100644
--- a/kernel-shared/disk-io.h
+++ b/kernel-shared/disk-io.h
@@ -220,7 +220,7 @@ int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb);
 int btrfs_fs_roots_compare_roots(struct rb_node *node1, struct rb_node *node2);
 struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     struct btrfs_fs_info *fs_info,
-				     u64 objectid);
+				     struct btrfs_key *key);
 int btrfs_delete_and_free_root(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root);
 struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info, u64 bytenr);
diff --git a/kernel-shared/free-space-tree.c b/kernel-shared/free-space-tree.c
index 7ac75c20..03eb0ed2 100644
--- a/kernel-shared/free-space-tree.c
+++ b/kernel-shared/free-space-tree.c
@@ -1475,14 +1475,17 @@ int btrfs_create_free_space_tree(struct btrfs_fs_info *fs_info)
 	struct btrfs_root *free_space_root;
 	struct btrfs_block_group *block_group;
 	u64 start = BTRFS_SUPER_INFO_OFFSET + BTRFS_SUPER_INFO_SIZE;
+	struct btrfs_key root_key = {
+		.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID,
+		.type = BTRFS_ROOT_ITEM_KEY,
+	};
 	int ret;
 
 	trans = btrfs_start_transaction(tree_root, 0);
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	free_space_root = btrfs_create_tree(trans, fs_info,
-					    BTRFS_FREE_SPACE_TREE_OBJECTID);
+	free_space_root = btrfs_create_tree(trans, fs_info, &root_key);
 	if (IS_ERR(free_space_root)) {
 		ret = PTR_ERR(free_space_root);
 		goto abort;
diff --git a/mkfs/main.c b/mkfs/main.c
index 7f79ba1a..19535604 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -717,12 +717,15 @@ static int create_data_reloc_tree(struct btrfs_trans_handle *trans)
 	struct btrfs_inode_item *inode;
 	struct btrfs_root *root;
 	struct btrfs_path path;
-	struct btrfs_key key;
+	struct btrfs_key key = {
+		.objectid = BTRFS_DATA_RELOC_TREE_OBJECTID,
+		.type = BTRFS_ROOT_ITEM_KEY,
+	};
 	u64 ino = BTRFS_FIRST_FREE_OBJECTID;
 	char *name = "..";
 	int ret;
 
-	root = btrfs_create_tree(trans, fs_info, BTRFS_DATA_RELOC_TREE_OBJECTID);
+	root = btrfs_create_tree(trans, fs_info, &key);
 	if (IS_ERR(root)) {
 		ret = PTR_ERR(root);
 		goto out;
@@ -782,10 +785,14 @@ static int create_uuid_tree(struct btrfs_trans_handle *trans)
 {
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_root *root;
+	struct btrfs_key key = {
+		.objectid = BTRFS_UUID_TREE_OBJECTID,
+		.type = BTRFS_ROOT_ITEM_KEY,
+	};
 	int ret = 0;
 
 	ASSERT(fs_info->uuid_root == NULL);
-	root = btrfs_create_tree(trans, fs_info, BTRFS_UUID_TREE_OBJECTID);
+	root = btrfs_create_tree(trans, fs_info, &key);
 	if (IS_ERR(root)) {
 		ret = PTR_ERR(root);
 		goto out;
-- 
2.26.3



* [PATCH v5 17/19] btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (15 preceding siblings ...)
  2022-03-07 22:11 ` [PATCH v5 16/19] btrfs-progs: make btrfs_create_tree take a key for the root key Josef Bacik
@ 2022-03-07 22:11 ` Josef Bacik
  2022-03-07 22:11 ` [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's Josef Bacik
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:11 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Our initial block group will use global root id 0 with extent tree v2,
so adjust the helper to take the chunk_objectid as an argument, as we'll
set this to 0 for extent tree v2 and then
BTRFS_FIRST_CHUNK_TREE_OBJECTID for extent tree v1.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 mkfs/common.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index eac8c46c..75680d03 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -191,7 +191,7 @@ static int create_free_space_tree(int fd, struct btrfs_mkfs_config *cfg,
 
 static void write_block_group_item(struct extent_buffer *buf, u32 nr,
 				   u64 objectid, u64 offset, u64 used,
-				   u32 itemoff)
+				   u64 chunk_objectid, u32 itemoff)
 {
 	struct btrfs_block_group_item *bg_item;
 	struct btrfs_disk_key disk_key;
@@ -206,8 +206,7 @@ static void write_block_group_item(struct extent_buffer *buf, u32 nr,
 	bg_item = btrfs_item_ptr(buf, nr, struct btrfs_block_group_item);
 	btrfs_set_block_group_used(buf, bg_item, used);
 	btrfs_set_block_group_flags(buf, bg_item, BTRFS_BLOCK_GROUP_SYSTEM);
-	btrfs_set_block_group_chunk_objectid(buf, bg_item,
-					     BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+	btrfs_set_block_group_chunk_objectid(buf, bg_item, chunk_objectid);
 }
 
 static int create_block_group_tree(int fd, struct btrfs_mkfs_config *cfg,
@@ -218,7 +217,7 @@ static int create_block_group_tree(int fd, struct btrfs_mkfs_config *cfg,
 
 	memset(buf->data + sizeof(struct btrfs_header), 0,
 		cfg->nodesize - sizeof(struct btrfs_header));
-	write_block_group_item(buf, 0, bg_offset, bg_size, bg_used,
+	write_block_group_item(buf, 0, bg_offset, bg_size, bg_used, 0,
 			       cfg->leaf_data_size -
 			       sizeof(struct btrfs_block_group_item));
 	btrfs_set_header_bytenr(buf, cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
@@ -396,6 +395,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 			write_block_group_item(buf, nritems,
 					       system_group_offset,
 					       system_group_size, total_used,
+					       BTRFS_FIRST_CHUNK_TREE_OBJECTID,
 					       itemoff);
 			add_block_group = false;
 			nritems++;
-- 
2.26.3



* [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (16 preceding siblings ...)
  2022-03-07 22:11 ` [PATCH v5 17/19] btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2 Josef Bacik
@ 2022-03-07 22:11 ` Josef Bacik
  2022-03-09 18:35   ` David Sterba
  2022-03-07 22:11 ` [PATCH v5 19/19] btrfs-progs: check: don't do the root item check for extent tree v2 Josef Bacik
  2022-03-09 18:48 ` [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots David Sterba
  19 siblings, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:11 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have all of the supporting code, add the ability to create
all of the global roots for an extent tree v2 fs.  This will default to
nr_cpu's, but also allow the user to specify how many global roots they
would like.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 mkfs/main.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)
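
Not part of the patch: a sketch of which root keys create_global_roots()
ends up asking for.  The loop starts at 1, presumably because the id 0
roots already exist from make_btrfs(); that reading is an assumption.  The
struct and the sketch_* names are hypothetical, and 2, 7 and 10 are the
extent, csum and free space tree objectids.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ROOT_ITEM_KEY 132

struct sketch_root_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* extent, csum and free space tree objectids, in creation order */
static const uint64_t sketch_tree_ids[] = { 2, 7, 10 };

/*
 * Fill keys[] with the root keys for global root ids 1..nr_global_roots-1
 * and return how many were written (at most max).
 */
static size_t sketch_global_root_keys(int nr_global_roots,
				      struct sketch_root_key *keys, size_t max)
{
	size_t n = 0;

	for (int i = 1; i < nr_global_roots; i++) {
		for (size_t t = 0; t < 3; t++) {
			if (n == max)
				return n;
			keys[n].objectid = sketch_tree_ids[t];
			keys[n].type = SKETCH_ROOT_ITEM_KEY;
			keys[n].offset = (uint64_t)i;
			n++;
		}
	}
	return n;
}
```

So nr_global_roots = 3 yields six new roots, and nr_global_roots = 1
yields none.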

diff --git a/mkfs/main.c b/mkfs/main.c
index 19535604..a603ec58 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -810,6 +810,53 @@ out:
 	return ret;
 }
 
+static int create_global_root(struct btrfs_trans_handle *trans, u64 objectid,
+			      int root_id)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	struct btrfs_root *root;
+	struct btrfs_key key = {
+		.objectid = objectid,
+		.type = BTRFS_ROOT_ITEM_KEY,
+		.offset = root_id,
+	};
+	int ret = 0;
+
+	root = btrfs_create_tree(trans, fs_info, &key);
+	if (IS_ERR(root)) {
+		ret = PTR_ERR(root);
+		goto out;
+	}
+	ret = btrfs_global_root_insert(fs_info, root);
+out:
+	if (ret)
+		btrfs_abort_transaction(trans, ret);
+	return ret;
+}
+
+static int create_global_roots(struct btrfs_trans_handle *trans,
+			       int nr_global_roots)
+{
+	int ret, i;
+
+	for (i = 1; i < nr_global_roots; i++) {
+		ret = create_global_root(trans, BTRFS_EXTENT_TREE_OBJECTID, i);
+		if (ret)
+			return ret;
+		ret = create_global_root(trans, BTRFS_CSUM_TREE_OBJECTID, i);
+		if (ret)
+			return ret;
+		ret = create_global_root(trans, BTRFS_FREE_SPACE_TREE_OBJECTID, i);
+		if (ret)
+			return ret;
+	}
+
+	btrfs_set_super_nr_global_roots(trans->fs_info->super_copy,
+					nr_global_roots);
+
+	return 0;
+}
+
 static int insert_qgroup_items(struct btrfs_trans_handle *trans,
 			       struct btrfs_fs_info *fs_info,
 			       u64 qgroupid)
@@ -966,13 +1013,18 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	struct btrfs_mkfs_config mkfs_cfg;
 	enum btrfs_csum_type csum_type = BTRFS_CSUM_TYPE_CRC32;
 	u64 system_group_size;
+	int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
 
 	crc32c_optimization_init();
 	btrfs_config_init();
 
 	while(1) {
 		int c;
-		enum { GETOPT_VAL_SHRINK = 257, GETOPT_VAL_CHECKSUM };
+		enum {
+			GETOPT_VAL_SHRINK = 257,
+			GETOPT_VAL_CHECKSUM,
+			GETOPT_VAL_GLOBAL_ROOTS,
+		};
 		static const struct option long_options[] = {
 			{ "byte-count", required_argument, NULL, 'b' },
 			{ "csum", required_argument, NULL,
@@ -996,6 +1048,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 			{ "quiet", 0, NULL, 'q' },
 			{ "verbose", 0, NULL, 'v' },
 			{ "shrink", no_argument, NULL, GETOPT_VAL_SHRINK },
+#if EXPERIMENTAL
+			{ "num-global-roots", required_argument, NULL, GETOPT_VAL_GLOBAL_ROOTS },
+#endif
 			{ "help", no_argument, NULL, GETOPT_VAL_HELP },
 			{ NULL, 0, NULL, 0}
 		};
@@ -1100,6 +1155,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 			case GETOPT_VAL_CHECKSUM:
 				csum_type = parse_csum_type(optarg);
 				break;
+			case GETOPT_VAL_GLOBAL_ROOTS:
+				nr_global_roots = (int)arg_strtou64(optarg);
+				break;
 			case GETOPT_VAL_HELP:
 			default:
 				print_usage(c != GETOPT_VAL_HELP);
@@ -1239,6 +1297,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	if (features & BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) {
 		features |= BTRFS_FEATURE_INCOMPAT_NO_HOLES;
 		runtime_features |= BTRFS_RUNTIME_FEATURE_FREE_SPACE_TREE;
+
+		if (!nr_global_roots) {
+			error("you must set a non-zero num-global-roots value");
+			exit(1);
+		}
 	}
 
 	if (zoned) {
@@ -1467,6 +1530,14 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		goto error;
 	}
 
+	if (features & BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) {
+		ret = create_global_roots(trans, nr_global_roots);
+		if (ret) {
+			error("failed to create global roots: %d", ret);
+			goto error;
+		}
+	}
+
 	ret = make_root_dir(trans, root);
 	if (ret) {
 		error("failed to setup the root directory: %d", ret);
-- 
2.26.3



* [PATCH v5 19/19] btrfs-progs: check: don't do the root item check for extent tree v2
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (17 preceding siblings ...)
  2022-03-07 22:11 ` [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's Josef Bacik
@ 2022-03-07 22:11 ` Josef Bacik
  2022-03-09 18:48 ` [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots David Sterba
  19 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-07 22:11 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

With the current set of changes we could probably do this check, but it
would involve changing the code quite a bit, and in the future we're not
going to track the metadata in the extent tree at all.  Since this check
was for a very old kernel just skip it for extent tree v2.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 check/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/check/main.c b/check/main.c
index 45065989..6bedd648 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9860,6 +9860,9 @@ static int repair_root_items(void)
 	int bad_roots = 0;
 	int need_trans = 0;
 
+	if (btrfs_fs_incompat(gfs_info, EXTENT_TREE_V2))
+		return 0;
+
 	btrfs_init_path(&path);
 
 	ret = build_roots_info_cache();
-- 
2.26.3



* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-07 22:10 ` [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block Josef Bacik
@ 2022-03-08 16:19   ` David Sterba
  2022-03-08 16:41     ` Johannes Thumshirn
  0 siblings, 1 reply; 29+ messages in thread
From: David Sterba @ 2022-03-08 16:19 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Mon, Mar 07, 2022 at 05:10:57PM -0500, Josef Bacik wrote:
> In order to make sure the file system is consistent we need to record
> the number of global roots we should have in the super block.  We could
> infer this from the number of global roots we find, however this could
> lead to interesting fuzzing problems, so add a source of truth to the
> super block in order to make it easier to verify the file system is
> consistent.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  kernel-shared/ctree.h   | 6 +++++-
>  kernel-shared/disk-io.c | 4 ++++
>  mkfs/common.c           | 1 +
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
> index b12dbff1..90de7a65 100644
> --- a/kernel-shared/ctree.h
> +++ b/kernel-shared/ctree.h
> @@ -463,13 +463,15 @@ struct btrfs_super_block {
>  
>  	u8 metadata_uuid[BTRFS_FSID_SIZE];
>  
> +	__le64 nr_global_roots;
> +

Shouldn't this be added after the last item?

>  	__le64 block_group_root;
>  	__le64 block_group_root_generation;
>  	u8 block_group_root_level;
>  
>  	/* future expansion */
>  	u8 reserved8[7];
> -	__le64 reserved[25];
> +	__le64 reserved[24];
>  	u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE];
>  	struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS];
>  	/* Padded to 4096 bytes */


* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-08 16:19   ` David Sterba
@ 2022-03-08 16:41     ` Johannes Thumshirn
  2022-03-09 17:05       ` David Sterba
  0 siblings, 1 reply; 29+ messages in thread
From: Johannes Thumshirn @ 2022-03-08 16:41 UTC (permalink / raw)
  To: dsterba, Josef Bacik; +Cc: linux-btrfs, kernel-team

On 08/03/2022 17:23, David Sterba wrote: 
>>  	u8 metadata_uuid[BTRFS_FSID_SIZE];
>>  
>> +	__le64 nr_global_roots;
>> +
> 
> Shouldn't this be added after the last item?
> 
>>  	__le64 block_group_root;
>>  	__le64 block_group_root_generation;
>>  	u8 block_group_root_level;
>>  
>>  	/* future expansion */
>>  	u8 reserved8[7];
>> -	__le64 reserved[25];
>> +	__le64 reserved[24];

Or at least inside one of these reserved fields.


* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-08 16:41     ` Johannes Thumshirn
@ 2022-03-09 17:05       ` David Sterba
  2022-03-09 21:22         ` Josef Bacik
  0 siblings, 1 reply; 29+ messages in thread
From: David Sterba @ 2022-03-09 17:05 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: dsterba, Josef Bacik, linux-btrfs, kernel-team

On Tue, Mar 08, 2022 at 04:41:44PM +0000, Johannes Thumshirn wrote:
> On 08/03/2022 17:23, David Sterba wrote: 
> >>  	u8 metadata_uuid[BTRFS_FSID_SIZE];
> >>  
> >> +	__le64 nr_global_roots;
> >> +
> > 
> > Shouldn't this be added after the last item?
> > 
> >>  	__le64 block_group_root;
> >>  	__le64 block_group_root_generation;
> >>  	u8 block_group_root_level;
> >>  
> >>  	/* future expansion */
> >>  	u8 reserved8[7];
> >> -	__le64 reserved[25];
> >> +	__le64 reserved[24];
> 
> Or at least inside one of these reserved fields.

OTOH, it's still experimental so we don't expect backward compatibility
yet so it should be ok to change for now.


* Re: [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's
  2022-03-07 22:11 ` [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's Josef Bacik
@ 2022-03-09 18:35   ` David Sterba
  2022-03-09 21:21     ` Josef Bacik
  0 siblings, 1 reply; 29+ messages in thread
From: David Sterba @ 2022-03-09 18:35 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Mon, Mar 07, 2022 at 05:11:03PM -0500, Josef Bacik wrote:
> Now that we have all of the supporting code, add the ability to create
> all of the global roots for an extent tree v2 fs.  This will default to
> nr_cpu's, but also allow the user to specify how many global roots they
> would like.

Why is the number of online cpus a good default? Or how should a user know
what's a good number? It resembles the allocation groups on xfs that
are set at mkfs time and once the filesystem is grown the size remains
but the number explodes and becomes problematic if the old/new sizes
are disproportionate. We have more flexibility in btrfs with the resize
so we could afford to set the initial number based rather on the device
size and then a rebalance after resize can adjust that again. Maybe
there's something in kernel taking care of that, I don't know.

* Re: [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots
  2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
                   ` (18 preceding siblings ...)
  2022-03-07 22:11 ` [PATCH v5 19/19] btrfs-progs: check: don't do the root item check for extent tree v2 Josef Bacik
@ 2022-03-09 18:48 ` David Sterba
  19 siblings, 0 replies; 29+ messages in thread
From: David Sterba @ 2022-03-09 18:48 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Mon, Mar 07, 2022 at 05:10:45PM -0500, Josef Bacik wrote:
> v4->v5:
> - Rebase onto devel, depends on "btrfs-progs: cleanup btrfs_item* accessors".
> - Dropped the various patches that have already been merged into -progs.
> 
> v3->v4:
> - Rebase onto devel, depends on the v3 prep patches that were sent on December
>   1st which has the rest of the "don't access ->*_root" patches.
> - I think I screwed up the versioning of this, but I lost the other submission,
>   so call this v3.
> 
> v1->v2:
> - These depend on the v3 of the prep patches (it's marked as v2 because I'm
>   stupid, but the second v2 posting I sent.)
> - I've moved the global root rb tree patches into this series to differentiate
>   them from the actual fixes in the prep series.
> 
> --- Original email ---
> Hello,
> 
> These patches are the first chunk of the extent tree v2 format changes.  This
> includes the separate block group root which will hold all of the block group
> items.  This also includes the global root support, which is the work to allow
> us to have multiple extent, csum, and free space trees in the same file system.
> 
> The goals of these two changes are straightforward.  For the block group root, on
> very large file systems the block group items are very widely separated, which
> means it takes a very long time to mount the file system on large, slow disks.
> Putting the block group items in their own root will allow us to densely
> populate the tree and dramatically reduce mount times in these cases.
> 
> The global roots change is motivated by lock contention on the root nodes of
> these global roots.  I've had to make many changes to how we run delayed refs to
> speed up things like the transaction commit because of all the delayed refs
> going into one tree and contending on the root node of the extent tree.  By the
> same token you can have heavy lock contention on the csum roots when writing to
> many files.  Allowing for multiple roots will let us spread the lock contention
> load around.
> 
> I have disabled a few key features, namely balance and qgroups.  There will be
> more to come as I make more and more invasive changes, and then they will slowly
> be re-enabled as the work is added.  These are disabled to avoid a bunch of work
> that would be thrown away by future changes.
> 
> These patches have passed xfstests without panicking, but clearly failing a lot
> of tests because of the disabled features.  I've also run it through fsperf to
> validate that there are no major performance regressions.
> 
> WARNING: there are many more format changes planned, this is just the first
> batch.  If you want to test then please feel free, but know that the format is
> still in flux.  Thanks,
> 
> Josef
> 
> Josef Bacik (19):
>   btrfs-progs: add support for loading the block group root
>   btrfs-progs: add print support for the block group tree
>   btrfs-progs: mkfs: use the btrfs_block_group_root helper
>   btrfs-progs: check-lowmem: use the btrfs_block_group_root helper
>   btrfs-progs: handle no bg item in extent tree for free space tree
>   btrfs-progs: mkfs: add support for the block group tree
>   btrfs-progs: check: add block group tree support
>   btrfs-progs: qgroup-verify: scan extents based on block groups
>   btrfs-progs: check: make free space tree validation extent tree v2
>     aware
>   btrfs-progs: check: add helper to reinit the root based on a key
>   btrfs-progs: check: handle the block group tree properly
>   btrfs-progs: set the number of global roots in the super block
>   btrfs-progs: handle the per-block group global root id
>   btrfs-progs: add a btrfs_delete_and_free_root helper
>   btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware
>   btrfs-progs: make btrfs_create_tree take a key for the root key
>   btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2
>   btrfs-progs: mkfs: create the global root's
>   btrfs-progs: check: don't do the root item check for extent tree v2

Added to devel, thanks.

* Re: [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's
  2022-03-09 18:35   ` David Sterba
@ 2022-03-09 21:21     ` Josef Bacik
  0 siblings, 0 replies; 29+ messages in thread
From: Josef Bacik @ 2022-03-09 21:21 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On Wed, Mar 09, 2022 at 07:35:34PM +0100, David Sterba wrote:
> On Mon, Mar 07, 2022 at 05:11:03PM -0500, Josef Bacik wrote:
> > Now that we have all of the supporting code, add the ability to create
> > all of the global roots for an extent tree v2 fs.  This will default to
> > nr_cpu's, but also allow the user to specify how many global roots they
> > would like.
> 
> Why is the number of online cpus a good default? Or how should a user know
> what's a good number? It resembles the allocation groups on xfs that
> are set at mkfs time and once the filesystem is grown the size remains
> but the number explodes and becomes problematic if the old/new sizes
> are disproportionate. We have more flexibility in btrfs with the resize
> so we could afford to set the initial number based rather on the device
> size and then a rebalance after resize can adjust that again. Maybe
> there's something in kernel taking care of that, I don't know.

Right now I have no idea what a good number is, so I'm defaulting to NR_CPUS.  I
*think* this is a good idea because generally we want to spread the locking
pain, so hopefully NR_CPU's is good enough for most people?
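
As a rough sketch of the default described above (not the code from the
patch): a user-supplied count wins, otherwise the number of online CPUs is
used. The helper name pick_nr_global_roots() and the "0 means not specified"
convention are assumptions made for illustration only.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a btrfs-progs function.
 *
 * user_requested == 0 is taken to mean "the user did not specify a count".
 * In that case fall back to the number of online CPUs, and never return
 * fewer than one global root.
 */
uint64_t pick_nr_global_roots(uint64_t user_requested)
{
	long ncpus;

	if (user_requested > 0)
		return user_requested;

	/* _SC_NPROCESSORS_ONLN is provided by glibc on Linux. */
	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	if (ncpus < 1)
		return 1;

	return (uint64_t)ncpus;
}
```

With this shape, a result from benchmarking can be passed straight through,
while an unspecified value follows the CPU count of the machine running mkfs.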

We allow setting your own number if a user does the benchmarking to find their
ideal global roots number (*cough*Zygo*cough*).  And it'll be easy enough to add
new global roots since we tie the block group to the global root at create time.
Thanks,

Josef

* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-09 17:05       ` David Sterba
@ 2022-03-09 21:22         ` Josef Bacik
  2022-06-14 12:15           ` Qu Wenruo
  0 siblings, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2022-03-09 21:22 UTC (permalink / raw)
  To: dsterba, Johannes Thumshirn, linux-btrfs, kernel-team

On Wed, Mar 09, 2022 at 06:05:53PM +0100, David Sterba wrote:
> On Tue, Mar 08, 2022 at 04:41:44PM +0000, Johannes Thumshirn wrote:
> > On 08/03/2022 17:23, David Sterba wrote: 
> > >>  	u8 metadata_uuid[BTRFS_FSID_SIZE];
> > >>  
> > >> +	__le64 nr_global_roots;
> > >> +
> > > 
> > > Shouldn't this be added after the last item?
> > > 
> > >>  	__le64 block_group_root;
> > >>  	__le64 block_group_root_generation;
> > >>  	u8 block_group_root_level;
> > >>  
> > >>  	/* future expansion */
> > >>  	u8 reserved8[7];
> > >> -	__le64 reserved[25];
> > >> +	__le64 reserved[24];
> > 
> > Or at least inside one of these reserved fields.
> 
> OTOH, it's still experimental so we don't expect backward compatibility
> yet so it should be ok to change for now.

I did it this way because it's all still experimental and it makes more sense
for it to be before the new root stuff.  Thanks,

Josef

* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-03-09 21:22         ` Josef Bacik
@ 2022-06-14 12:15           ` Qu Wenruo
  2022-06-14 12:47             ` Qu Wenruo
  0 siblings, 1 reply; 29+ messages in thread
From: Qu Wenruo @ 2022-06-14 12:15 UTC (permalink / raw)
  To: Josef Bacik, dsterba, Johannes Thumshirn, linux-btrfs, kernel-team



On 2022/3/10 05:22, Josef Bacik wrote:
> On Wed, Mar 09, 2022 at 06:05:53PM +0100, David Sterba wrote:
>> On Tue, Mar 08, 2022 at 04:41:44PM +0000, Johannes Thumshirn wrote:
>>> On 08/03/2022 17:23, David Sterba wrote:
>>>>>   	u8 metadata_uuid[BTRFS_FSID_SIZE];
>>>>>
>>>>> +	__le64 nr_global_roots;
>>>>> +
>>>>
>>>> Shouldn't this be added after the last item?
>>>>
>>>>>   	__le64 block_group_root;
>>>>>   	__le64 block_group_root_generation;
>>>>>   	u8 block_group_root_level;
>>>>>
>>>>>   	/* future expansion */
>>>>>   	u8 reserved8[7];
>>>>> -	__le64 reserved[25];
>>>>> +	__le64 reserved[24];
>>>
>>> Or at least inside one of these reserved fields.
>>
>> OTOH, it's still experimental so we don't expect backward compatibility
>> yet so it should be ok to change for now.
>
> I did it this way because it's all still experimental and it makes more sense
> for it to be before the new root stuff.  Thanks,

I'd say, please don't.

It's making anyone who wants to add a new member to the super block miserable.

Everyone is going to add new members from the reserved members, but
such an insert into the existing members is destructive.
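
To make the layout concern concrete, here is a small model of the tail of
the super block. The struct names are hypothetical, the real structure has
many more members ahead of metadata_uuid, and uint64_t/uint8_t stand in for
__le64/u8. It only shows that both variants keep the total size, while the
mid-struct insert moves every member that follows it.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_FSID_SIZE 16	/* stand-in for BTRFS_FSID_SIZE */

/* Tail of the super block before the patch: 25 reserved slots. */
struct sb_tail_before {
	uint8_t  metadata_uuid[MODEL_FSID_SIZE];
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t  block_group_root_level;
	uint8_t  reserved8[7];
	uint64_t reserved[25];
} __attribute__((packed));

/* As in the quoted diff: new member inserted before block_group_root. */
struct sb_tail_inserted {
	uint8_t  metadata_uuid[MODEL_FSID_SIZE];
	uint64_t nr_global_roots;
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t  block_group_root_level;
	uint8_t  reserved8[7];
	uint64_t reserved[24];
} __attribute__((packed));

/* Alternative from review: carve the member out of the reserved area. */
struct sb_tail_from_reserved {
	uint8_t  metadata_uuid[MODEL_FSID_SIZE];
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t  block_group_root_level;
	uint8_t  reserved8[7];
	uint64_t nr_global_roots;
	uint64_t reserved[24];
} __attribute__((packed));

/* Both variants keep the overall size unchanged... */
_Static_assert(sizeof(struct sb_tail_inserted) ==
	       sizeof(struct sb_tail_before), "size must not change");
_Static_assert(sizeof(struct sb_tail_from_reserved) ==
	       sizeof(struct sb_tail_before), "size must not change");

/* ...but only the reserved-slot variant leaves existing offsets alone. */
_Static_assert(offsetof(struct sb_tail_from_reserved, block_group_root) ==
	       offsetof(struct sb_tail_before, block_group_root),
	       "existing member keeps its offset");
_Static_assert(offsetof(struct sb_tail_inserted, block_group_root) ==
	       offsetof(struct sb_tail_before, block_group_root) + 8,
	       "existing member shifted by the inserted 8 bytes");
```

In this model the shifted members are only the block group root fields,
which are themselves still experimental.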

Furthermore, if the new member is going to be merged way before extent
tree v2 part, how do we solve the conflicts?

(The new member I want to introduce is just to indicate how many bytes
we have reserved at the beginning of each device, with a new RO compat
flag).

Thanks,
Qu
>
> Josef

* Re: [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block
  2022-06-14 12:15           ` Qu Wenruo
@ 2022-06-14 12:47             ` Qu Wenruo
  0 siblings, 0 replies; 29+ messages in thread
From: Qu Wenruo @ 2022-06-14 12:47 UTC (permalink / raw)
  To: Josef Bacik, dsterba, Johannes Thumshirn, linux-btrfs, kernel-team



On 2022/6/14 20:15, Qu Wenruo wrote:
>
>
> On 2022/3/10 05:22, Josef Bacik wrote:
>> On Wed, Mar 09, 2022 at 06:05:53PM +0100, David Sterba wrote:
>>> On Tue, Mar 08, 2022 at 04:41:44PM +0000, Johannes Thumshirn wrote:
>>>> On 08/03/2022 17:23, David Sterba wrote:
>>>>>>       u8 metadata_uuid[BTRFS_FSID_SIZE];
>>>>>>
>>>>>> +    __le64 nr_global_roots;
>>>>>> +
>>>>>
>>>>> Shouldn't this be added after the last item?
>>>>>
>>>>>>       __le64 block_group_root;
>>>>>>       __le64 block_group_root_generation;
>>>>>>       u8 block_group_root_level;
>>>>>>
>>>>>>       /* future expansion */
>>>>>>       u8 reserved8[7];
>>>>>> -    __le64 reserved[25];
>>>>>> +    __le64 reserved[24];
>>>>
>>>> Or at least inside one of these reserved fields.
>>>
>>> OTOH, it's still experimental so we don't expect backward compatibility
>>> yet so it should be ok to change for now.
>>
>> I did it this way because it's all still experimental and it makes
>> more sense
>> for it to be before the new root stuff.  Thanks,
>
> I'd say, please don't.
>
> It's making anyone who wants to add a new member to the super block miserable.
>
> Everyone is going to add new members from the reserved members, but
> such an insert into the existing members is destructive.
>
> Furthermore, if the new member is going to be merged way before extent
> tree v2 part, how do we solve the conflicts?
>
> (The new member I want to introduce is just to indicate how many bytes
> we have reserved at the beginning of each device, with a new RO compat
> flag).

My bad, the main problem is not shuffling the members of extent tree v2,
but the super block getting out of sync between kernel and btrfs-progs.

Anyway, I'd use the padding[] for my new members, to avoid possible
out-of-sync problems.

Thanks,
Qu
>
> Thanks,
> Qu
>>
>> Josef

end of thread (last message 2022-06-14 12:47 UTC)

Thread overview: 29+ messages
2022-03-07 22:10 [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots Josef Bacik
2022-03-07 22:10 ` [PATCH v5 01/19] btrfs-progs: add support for loading the block group root Josef Bacik
2022-03-07 22:10 ` [PATCH v5 02/19] btrfs-progs: add print support for the block group tree Josef Bacik
2022-03-07 22:10 ` [PATCH v5 03/19] btrfs-progs: mkfs: use the btrfs_block_group_root helper Josef Bacik
2022-03-07 22:10 ` [PATCH v5 04/19] btrfs-progs: check-lowmem: " Josef Bacik
2022-03-07 22:10 ` [PATCH v5 05/19] btrfs-progs: handle no bg item in extent tree for free space tree Josef Bacik
2022-03-07 22:10 ` [PATCH v5 06/19] btrfs-progs: mkfs: add support for the block group tree Josef Bacik
2022-03-07 22:10 ` [PATCH v5 07/19] btrfs-progs: check: add block group tree support Josef Bacik
2022-03-07 22:10 ` [PATCH v5 08/19] btrfs-progs: qgroup-verify: scan extents based on block groups Josef Bacik
2022-03-07 22:10 ` [PATCH v5 09/19] btrfs-progs: check: make free space tree validation extent tree v2 aware Josef Bacik
2022-03-07 22:10 ` [PATCH v5 10/19] btrfs-progs: check: add helper to reinit the root based on a key Josef Bacik
2022-03-07 22:10 ` [PATCH v5 11/19] btrfs-progs: check: handle the block group tree properly Josef Bacik
2022-03-07 22:10 ` [PATCH v5 12/19] btrfs-progs: set the number of global roots in the super block Josef Bacik
2022-03-08 16:19   ` David Sterba
2022-03-08 16:41     ` Johannes Thumshirn
2022-03-09 17:05       ` David Sterba
2022-03-09 21:22         ` Josef Bacik
2022-06-14 12:15           ` Qu Wenruo
2022-06-14 12:47             ` Qu Wenruo
2022-03-07 22:10 ` [PATCH v5 13/19] btrfs-progs: handle the per-block group global root id Josef Bacik
2022-03-07 22:10 ` [PATCH v5 14/19] btrfs-progs: add a btrfs_delete_and_free_root helper Josef Bacik
2022-03-07 22:11 ` [PATCH v5 15/19] btrfs-progs: make btrfs_clear_free_space_tree extent tree v2 aware Josef Bacik
2022-03-07 22:11 ` [PATCH v5 16/19] btrfs-progs: make btrfs_create_tree take a key for the root key Josef Bacik
2022-03-07 22:11 ` [PATCH v5 17/19] btrfs-progs: mkfs: set chunk_item_objectid properly for extent tree v2 Josef Bacik
2022-03-07 22:11 ` [PATCH v5 18/19] btrfs-progs: mkfs: create the global root's Josef Bacik
2022-03-09 18:35   ` David Sterba
2022-03-09 21:21     ` Josef Bacik
2022-03-07 22:11 ` [PATCH v5 19/19] btrfs-progs: check: don't do the root item check for extent tree v2 Josef Bacik
2022-03-09 18:48 ` [PATCH v5 00/19] btrfs-progs: extent tree v2 support, global roots David Sterba