linux-btrfs.vger.kernel.org archive mirror
* [PATCH 0/3] btrfs-progs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
@ 2022-07-13  7:57 Qu Wenruo
  2022-07-13  7:57 ` [PATCH 1/3] btrfs-progs: mkfs: dynamically modify mkfs blocks array Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Qu Wenruo @ 2022-07-13  7:57 UTC (permalink / raw)
  To: linux-btrfs

The block group tree idea was introduced over SEVEN years ago to greatly
reduce mount time for large filesystems.

And 4 years ago, it was decided to let extent-tree-v2 implement the
feature.

However, extent tree v2 still doesn't have a consistent on-disk format,
nor any implementation of the real extent items, nor any tests for its
independent sub-features.

I strongly doubt if that's a correct decision, especially considering
there is really no dependency from extent tree v2 on this block group
tree feature.

Not to mention this is against the common idea on progressive
improvement.

So now is the time to revive the independent block group compat RO flag.

[CHANGE FROM EXTENT-TREE-V2]
- Don't store block group root into super block
  There is no special reason for block group root to be stored in super
  block.

- Separate block-group-tree as a compat RO flag from extent-tree-v2
  The change to the extent tree does not affect read-only operations.
  No reason to make it incompat.

- Fix a bug in extent-tree-v2 which doesn't initialize block group item
  correctly.
  Since we're re-using the existing block group item structure, we
  should properly initialize chunk_objectid to 256, or the tree-checker
  will reject it.

- Dynamically arrange the mkfs_block array
  Instead of a completely new array dedicated to extent-tree-v2, now we
  have proper helpers to add/delete blocks from the array on-the-fly.

[TODO]
- Add btrfstune support to convert to block-group-tree feature
  and back.
  This was supported in the previous submission, but due to the new
  changes introduced by extent-tree-v2, I need to revisit the
  conversion tool.

  And due to recent inspiration from csum conversion, I will
  use the method where both trees co-exist during the conversion,
  instead of the old single-transaction conversion.

Qu Wenruo (3):
  btrfs-progs: mkfs: dynamically modify mkfs blocks array
  btrfs-progs: don't save block group root into super block
  btrfs-progs: separate block group tree from extent tree v2

 check/main.c               |   8 +--
 cmds/inspect-dump-tree.c   |  11 ----
 common/fsfeatures.c        |   8 +++
 common/fsfeatures.h        |   2 +
 kernel-shared/ctree.h      |  35 +++---------
 kernel-shared/disk-io.c    |  77 ++++++-------------------
 kernel-shared/disk-io.h    |   2 +-
 kernel-shared/print-tree.c |  11 +---
 mkfs/common.c              | 113 ++++++++++++++++++++++++++++++-------
 mkfs/common.h              |  20 +------
 mkfs/main.c                |   3 +-
 11 files changed, 138 insertions(+), 152 deletions(-)

-- 
2.37.0



* [PATCH 1/3] btrfs-progs: mkfs: dynamically modify mkfs blocks array
  2022-07-13  7:57 [PATCH 0/3] btrfs-progs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
@ 2022-07-13  7:57 ` Qu Wenruo
  2022-07-13  7:57 ` [PATCH 2/3] btrfs-progs: don't save block group root into super block Qu Wenruo
  2022-07-13  7:57 ` [PATCH 3/3] btrfs-progs: separate block group tree from extent tree v2 Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2022-07-13  7:57 UTC (permalink / raw)
  To: linux-btrfs

In mkfs_btrfs(), we have a btrfs_mkfs_block array to store how many tree
blocks we need to reserve for the initial btrfs image.

Currently we have two very similar arrays, extent_tree_v1_blocks and
extent_tree_v2_blocks.

The only difference is just v2 has an extra block for block group tree.

This patch will add two helpers, mkfs_blocks_add() and
mkfs_blocks_remove() to properly add/remove one block dynamically from
the array.

This allows 3 things:

- Merge extent_tree_v1_blocks and extent_tree_v2_blocks into one array
  The new array will be the same as extent_tree_v1_blocks.
  For extent-tree-v2, we just dynamically add MKFS_BLOCK_GROUP_TREE.

- Remove free space tree block on-demand
  This only works for extent-tree-v1 case, as v2 has a hard requirement
  on free space tree.
  But this still makes the code much cleaner, without any special hacks.

- Allow future expansion without introducing a new array
  I really wonder why this was not properly done in the extent-tree-v2
  preparation patches.
  We should not allow bad practice to sneak in just because it's some
  preparation patches for a larger feature.
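
As a standalone illustration of the two helpers described above, here is
a minimal sketch of insertion into and removal from a sorted,
duplicate-free array.  The names (enum blk, blocks_add(), blocks_remove())
are hypothetical stand-ins, not the btrfs-progs identifiers.  Note that
memmove() takes a byte count, so the number of elements is scaled by the
element size.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for enum btrfs_mkfs_block. */
enum blk {
	BLK_ROOT, BLK_EXTENT, BLK_CHUNK, BLK_DEV, BLK_FS, BLK_CSUM,
	BLK_FREE_SPACE, BLK_BLOCK_GROUP,
	BLK_COUNT	/* Capacity every array passed below must have. */
};

/* Insert @to_add into the sorted array, keeping order and uniqueness. */
static void blocks_add(enum blk *blocks, int *nr, enum blk to_add)
{
	int i;

	for (i = 0; i < *nr; i++) {
		if (blocks[i] == to_add)
			return;		/* Already present. */
		if (blocks[i] > to_add)
			break;		/* Found the insertion slot. */
	}
	assert(*nr < BLK_COUNT);
	/* Shift the tail one slot right; the size is in bytes. */
	memmove(blocks + i + 1, blocks + i, (*nr - i) * sizeof(*blocks));
	blocks[i] = to_add;
	(*nr)++;
}

/* Remove @to_remove from the sorted array if it is present. */
static void blocks_remove(enum blk *blocks, int *nr, enum blk to_remove)
{
	int i;

	for (i = 0; i < *nr; i++) {
		if (blocks[i] != to_remove)
			continue;
		/* Shift the tail one slot left; the size is in bytes. */
		memmove(blocks + i, blocks + i + 1,
			(*nr - i - 1) * sizeof(*blocks));
		(*nr)--;
		return;
	}
}
```

Starting from the seven default blocks, adding BLK_BLOCK_GROUP appends it
at the end, and removing BLK_FREE_SPACE afterwards closes the gap without
disturbing the order.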

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 mkfs/common.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++-----
 mkfs/common.h | 20 +++-----------
 2 files changed, 69 insertions(+), 24 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index 218854491c14..d5a49ca11cde 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -260,6 +260,60 @@ next:
 	__builtin_unreachable();
 }
 
+/*
+ * Add @to_add into the @blocks array.
+ *
+ * The @blocks should already be in ascending order and no duplicate.
+ */
+static void mkfs_blocks_add(enum btrfs_mkfs_block *blocks, int *blocks_nr,
+			    enum btrfs_mkfs_block to_add)
+{
+	int i;
+
+	for (i = 0; i < *blocks_nr; i++) {
+		/* The target is already in the array. */
+		if (blocks[i] == to_add)
+			return;
+
+		/*
+		 * We find the first one past @to_add, move the array one slot
+		 * right, insert a new one.
+		 */
+		if (blocks[i] > to_add) {
+			memmove(blocks + i + 1, blocks + i, (*blocks_nr - i) * sizeof(*blocks));
+			blocks[i] = to_add;
+			(*blocks_nr)++;
+			return;
+		}
+		/* Current one still smaller than @to_add, go to next slot. */
+	}
+	/* All slots iterated and not match, insert into the last slot. */
+	blocks[i] = to_add;
+	(*blocks_nr)++;
+	return;
+}
+
+/*
+ * Remove @to_remove from the @blocks array.
+ *
+ * The @blocks should already be in ascending order and no duplicate.
+ */
+static void mkfs_blocks_remove(enum btrfs_mkfs_block *blocks, int *blocks_nr,
+			       enum btrfs_mkfs_block to_remove)
+{
+	int i;
+
+	for (i = 0; i < *blocks_nr; i++) {
+		/* Found the target, move the array one slot left. */
+		if (blocks[i] == to_remove) {
+			memmove(blocks + i, blocks + i + 1, (*blocks_nr - i - 1) * sizeof(*blocks));
+			(*blocks_nr)--;
+		}
+	}
+	/* Nothing found, exit directly. */
+	return;
+}
+
 /*
  * @fs_uuid - if NULL, generates a UUID, returns back the new filesystem UUID
  *
@@ -290,12 +344,12 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	struct btrfs_chunk *chunk;
 	struct btrfs_dev_item *dev_item;
 	struct btrfs_dev_extent *dev_extent;
-	const enum btrfs_mkfs_block *blocks = extent_tree_v1_blocks;
+	enum btrfs_mkfs_block blocks[MKFS_BLOCK_COUNT];
 	u8 chunk_tree_uuid[BTRFS_UUID_SIZE];
 	u8 *ptr;
 	int i;
 	int ret;
-	int blocks_nr = ARRAY_SIZE(extent_tree_v1_blocks);
+	int blocks_nr;
 	int blk;
 	u32 itemoff;
 	u32 nritems = 0;
@@ -315,16 +369,21 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	bool extent_tree_v2 = !!(cfg->features &
 				 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2);
 
-	/* Don't include the free space tree in the blocks to process. */
-	if (!free_space_tree)
-		blocks_nr--;
+	memcpy(blocks, default_blocks,
+	       sizeof(enum btrfs_mkfs_block) * ARRAY_SIZE(default_blocks));
+	blocks_nr = ARRAY_SIZE(default_blocks);
 
+	/* Extent tree v2 needs an extra block for block group tree.*/
 	if (extent_tree_v2) {
-		blocks = extent_tree_v2_blocks;
-		blocks_nr = ARRAY_SIZE(extent_tree_v2_blocks);
+		mkfs_blocks_add(blocks, &blocks_nr, MKFS_BLOCK_GROUP_TREE);
 		add_block_group = false;
 	}
 
+	/* Don't include the free space tree in the blocks to process. */
+	if (!free_space_tree)
+		mkfs_blocks_remove(blocks, &blocks_nr, MKFS_FREE_SPACE_TREE);
+
+
 	if ((cfg->features & BTRFS_FEATURE_INCOMPAT_ZONED)) {
 		system_group_offset = zoned_system_group_offset(cfg->zone_size);
 		system_group_size = cfg->zone_size;
diff --git a/mkfs/common.h b/mkfs/common.h
index 3533e114e81c..47b14cdae2f3 100644
--- a/mkfs/common.h
+++ b/mkfs/common.h
@@ -52,25 +52,12 @@ enum btrfs_mkfs_block {
 	MKFS_CSUM_TREE,
 	MKFS_FREE_SPACE_TREE,
 	MKFS_BLOCK_GROUP_TREE,
-	MKFS_BLOCK_COUNT
-};
-
-static const enum btrfs_mkfs_block extent_tree_v1_blocks[] = {
-	MKFS_ROOT_TREE,
-	MKFS_EXTENT_TREE,
-	MKFS_CHUNK_TREE,
-	MKFS_DEV_TREE,
-	MKFS_FS_TREE,
-	MKFS_CSUM_TREE,
 
-	/*
-	 * Since the free space tree is optional with v1 it must always be last
-	 * in this array.
-	 */
-	MKFS_FREE_SPACE_TREE,
+	/* MKFS_BLOCK_COUNT should be the max blocks we can have at mkfs time. */
+	MKFS_BLOCK_COUNT
 };
 
-static const enum btrfs_mkfs_block extent_tree_v2_blocks[] = {
+static const enum btrfs_mkfs_block default_blocks[] = {
 	MKFS_ROOT_TREE,
 	MKFS_EXTENT_TREE,
 	MKFS_CHUNK_TREE,
@@ -78,7 +65,6 @@ static const enum btrfs_mkfs_block extent_tree_v2_blocks[] = {
 	MKFS_FS_TREE,
 	MKFS_CSUM_TREE,
 	MKFS_FREE_SPACE_TREE,
-	MKFS_BLOCK_GROUP_TREE,
 };
 
 struct btrfs_mkfs_config {
-- 
2.37.0



* [PATCH 2/3] btrfs-progs: don't save block group root into super block
  2022-07-13  7:57 [PATCH 0/3] btrfs-progs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
  2022-07-13  7:57 ` [PATCH 1/3] btrfs-progs: mkfs: dynamically modify mkfs blocks array Qu Wenruo
@ 2022-07-13  7:57 ` Qu Wenruo
  2022-07-13  7:57 ` [PATCH 3/3] btrfs-progs: separate block group tree from extent tree v2 Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2022-07-13  7:57 UTC (permalink / raw)
  To: linux-btrfs

The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.

My initial proposal years ago just added a new tree root id and loaded it
from the tree root, just like what we do for the quota/free space
tree/uuid/extent roots.

But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.

Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:

- Chunk root
  Needed for bootstrap.

- Tree root
  Really the entrance of all trees.

- Log root
  This is special as log root has to be updated out of existing
  transaction mechanism.

There is no reason at all to put the block group root into the super
block: the block group tree is updated at the same time as the old extent
tree, so there is no need for extra bootstrap/out-of-transaction updates.

So just move block group root from super block into tree root.
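
As a quick sanity check on the super block layout, the sketch below
compares the tail fields before and after this change: the three block
group root fields plus the old padding occupy 8 + 8 + 1 + 7 + 24 * 8 = 216
bytes, the same as 27 * 8 for the new reserved array.  The struct names
are hypothetical, only the tail of the super block is modelled, and
uint64_t/uint8_t stand in for __le64/u8.  The packed attribute assumes
GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the super block as laid out by the extent-tree-v2 patches. */
struct tail_with_bg_root {
	uint64_t block_group_root;
	uint64_t block_group_root_generation;
	uint8_t block_group_root_level;
	uint8_t reserved8[7];
	uint64_t reserved[24];
} __attribute__((__packed__));

/* Tail of the super block after folding everything back into reserved. */
struct tail_reserved_only {
	uint64_t reserved[27];
} __attribute__((__packed__));

/* Both layouts must cover the same number of bytes. */
static_assert(sizeof(struct tail_with_bg_root) ==
	      sizeof(struct tail_reserved_only),
	      "reserved area must keep the super block size unchanged");
```

If the sizes differed, the existing BUILD_ASSERT on the size of
struct btrfs_super_block would be expected to catch it at build time.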

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 cmds/inspect-dump-tree.c   | 11 ------
 kernel-shared/ctree.h      | 26 +------------
 kernel-shared/disk-io.c    | 75 ++++++++------------------------------
 kernel-shared/print-tree.c |  6 ---
 mkfs/common.c              | 11 ++----
 5 files changed, 20 insertions(+), 109 deletions(-)

diff --git a/cmds/inspect-dump-tree.c b/cmds/inspect-dump-tree.c
index 73ffd57eb13d..6374f137f7fb 100644
--- a/cmds/inspect-dump-tree.c
+++ b/cmds/inspect-dump-tree.c
@@ -517,11 +517,6 @@ static int cmd_inspect_dump_tree(const struct cmd_struct *cmd,
 				       info->log_root_tree->node->start,
 					btrfs_header_level(
 						info->log_root_tree->node));
-			if (info->block_group_root)
-				printf("block group tree: %llu level %d\n",
-				       info->block_group_root->node->start,
-					btrfs_header_level(
-						info->block_group_root->node));
 		} else {
 			if (info->tree_root->node) {
 				printf("root tree\n");
@@ -540,12 +535,6 @@ static int cmd_inspect_dump_tree(const struct cmd_struct *cmd,
 				btrfs_print_tree(info->log_root_tree->node,
 					BTRFS_PRINT_TREE_FOLLOW | print_mode);
 			}
-
-			if (info->block_group_root) {
-				printf("block group tree\n");
-				btrfs_print_tree(info->block_group_root->node,
-					BTRFS_PRINT_TREE_FOLLOW | print_mode);
-			}
 		}
 	}
 	tree_root_scan = info->tree_root;
diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index fc8b61eda829..c12076202577 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -457,13 +457,7 @@ struct btrfs_super_block {
 
 	__le64 nr_global_roots;
 
-	__le64 block_group_root;
-	__le64 block_group_root_generation;
-	u8 block_group_root_level;
-
-	/* future expansion */
-	u8 reserved8[7];
-	__le64 reserved[24];
+	__le64 reserved[27];
 	u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE];
 	struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS];
 	/* Padded to 4096 bytes */
@@ -2304,17 +2298,6 @@ BTRFS_SETGET_STACK_FUNCS(backup_bytes_used, struct btrfs_root_backup,
 BTRFS_SETGET_STACK_FUNCS(backup_num_devices, struct btrfs_root_backup,
 		   num_devices, 64);
 
-/*
- * Extent tree v2 doesn't have a global csum or extent root, so we use the
- * extent root slot for the block group root.
- */
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root, struct btrfs_root_backup,
-		   extent_root, 64);
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root_gen, struct btrfs_root_backup,
-		   extent_root_gen, 64);
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root_level, struct btrfs_root_backup,
-		   extent_root_level, 8);
-
 /* struct btrfs_super_block */
 
 BTRFS_SETGET_STACK_FUNCS(super_bytenr, struct btrfs_super_block, bytenr, 64);
@@ -2365,13 +2348,6 @@ BTRFS_SETGET_STACK_FUNCS(super_cache_generation, struct btrfs_super_block,
 BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block,
 			 uuid_tree_generation, 64);
 BTRFS_SETGET_STACK_FUNCS(super_magic, struct btrfs_super_block, magic, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root, struct btrfs_super_block,
-			 block_group_root, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root_generation,
-			 struct btrfs_super_block,
-			 block_group_root_generation, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root_level,
-			 struct btrfs_super_block, block_group_root_level, 8);
 BTRFS_SETGET_STACK_FUNCS(super_nr_global_roots, struct btrfs_super_block,
 			 nr_global_roots, 64);
 
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 26b1c9aa192a..80db5976cc3f 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -1209,33 +1209,9 @@ static int load_important_roots(struct btrfs_fs_info *fs_info,
 		goto tree_root;
 	}
 
-	if (backup) {
-		bytenr = btrfs_backup_block_group_root(backup);
-		gen = btrfs_backup_block_group_root_gen(backup);
-		level = btrfs_backup_block_group_root_level(backup);
-	} else {
-		bytenr = btrfs_super_block_group_root(sb);
-		gen = btrfs_super_block_group_root_generation(sb);
-		level = btrfs_super_block_group_root_level(sb);
-	}
 	root = fs_info->block_group_root;
 	btrfs_setup_root(root, fs_info, BTRFS_BLOCK_GROUP_TREE_OBJECTID);
 
-	ret = read_root_node(fs_info, root, bytenr, gen, level);
-	if (ret) {
-		fprintf(stderr, "Couldn't read block group root\n");
-		return -EIO;
-	}
-
-	if (maybe_load_block_groups(fs_info, flags)) {
-		int ret = btrfs_read_block_groups(fs_info);
-		if (ret < 0 && ret != -ENOENT) {
-			errno = -ret;
-			error("failed to read block groups: %m");
-			return ret;
-		}
-	}
-
 tree_root:
 	if (backup) {
 		bytenr = btrfs_backup_tree_root(backup);
@@ -1280,6 +1256,17 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 	if (ret)
 		return ret;
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		ret = find_and_setup_root(root, fs_info,
+				BTRFS_BLOCK_GROUP_TREE_OBJECTID,
+				fs_info->block_group_root);
+		if (ret) {
+			error("Couldn't load block group tree\n");
+			return -EIO;
+		}
+		fs_info->block_group_root->track_dirty = 1;
+	}
+
 	ret = find_and_setup_root(root, fs_info, BTRFS_DEV_TREE_OBJECTID,
 				  fs_info->dev_root);
 	if (ret) {
@@ -1288,6 +1275,7 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 	}
 	fs_info->dev_root->track_dirty = 1;
 
+
 	ret = find_and_setup_root(root, fs_info, BTRFS_UUID_TREE_OBJECTID,
 				  fs_info->uuid_root);
 	if (ret) {
@@ -1313,8 +1301,7 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 			return -EIO;
 	}
 
-	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2) &&
-	    maybe_load_block_groups(fs_info, flags)) {
+	if (maybe_load_block_groups(fs_info, flags)) {
 		ret = btrfs_read_block_groups(fs_info);
 		/*
 		 * If we don't find any blockgroups (ENOENT) we're either
@@ -1834,20 +1821,6 @@ int btrfs_check_super(struct btrfs_super_block *sb, unsigned sbflags)
 		goto error_out;
 	}
 
-	if (btrfs_super_incompat_flags(sb) & BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) {
-		if (btrfs_super_block_group_root_level(sb) >= BTRFS_MAX_LEVEL) {
-			error("block_group_root level too big: %d >= %d",
-			      btrfs_super_block_group_root_level(sb),
-			      BTRFS_MAX_LEVEL);
-			goto error_out;
-		}
-		if (!IS_ALIGNED(btrfs_super_block_group_root(sb), 4096)) {
-			error("block_group_root block unaligned: %llu",
-			      btrfs_super_block_group_root(sb));
-			goto error_out;
-		}
-	}
-
 	if (btrfs_super_incompat_flags(sb) & BTRFS_FEATURE_INCOMPAT_METADATA_UUID)
 		metadata_uuid = sb->metadata_uuid;
 	else
@@ -2165,16 +2138,9 @@ static void backup_super_roots(struct btrfs_fs_info *info)
 	btrfs_set_backup_num_devices(root_backup,
 			     btrfs_super_num_devices(info->super_copy));
 
-	if (btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
-		btrfs_set_backup_block_group_root(root_backup,
-				info->block_group_root->node->start);
-		btrfs_set_backup_block_group_root_gen(root_backup,
-			btrfs_header_generation(info->block_group_root->node));
-		btrfs_set_backup_block_group_root_level(root_backup,
-			btrfs_header_level(info->block_group_root->node));
-	} else {
-		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
+	if (!btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
 		struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
+		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
 
 		btrfs_set_backup_csum_root(root_backup, csum_root->node->start);
 		btrfs_set_backup_csum_root_gen(root_backup,
@@ -2235,7 +2201,7 @@ int write_ctree_super(struct btrfs_trans_handle *trans)
 	struct btrfs_fs_info *fs_info = trans->fs_info;
 	struct btrfs_root *tree_root = fs_info->tree_root;
 	struct btrfs_root *chunk_root = fs_info->chunk_root;
-	struct btrfs_root *block_group_root = fs_info->block_group_root;
+
 	if (fs_info->readonly)
 		return 0;
 
@@ -2252,15 +2218,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans)
 	btrfs_set_super_chunk_root_generation(fs_info->super_copy,
 				btrfs_header_generation(chunk_root->node));
 
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
-		btrfs_set_super_block_group_root(fs_info->super_copy,
-						 block_group_root->node->start);
-		btrfs_set_super_block_group_root_generation(fs_info->super_copy,
-				btrfs_header_generation(block_group_root->node));
-		btrfs_set_super_block_group_root_level(fs_info->super_copy,
-				btrfs_header_level(block_group_root->node));
-	}
-
 	ret = write_all_supers(fs_info);
 	if (ret)
 		fprintf(stderr, "failed to write new super block err %d\n", ret);
diff --git a/kernel-shared/print-tree.c b/kernel-shared/print-tree.c
index a5886ff602ee..bffe30b405c7 100644
--- a/kernel-shared/print-tree.c
+++ b/kernel-shared/print-tree.c
@@ -2046,12 +2046,6 @@ void btrfs_print_superblock(struct btrfs_super_block *sb, int full)
 	       (unsigned long long)btrfs_super_cache_generation(sb));
 	printf("uuid_tree_generation\t%llu\n",
 	       (unsigned long long)btrfs_super_uuid_tree_generation(sb));
-	printf("block_group_root\t%llu\n",
-	       (unsigned long long)btrfs_super_block_group_root(sb));
-	printf("block_group_root_generation\t%llu\n",
-	       (unsigned long long)btrfs_super_block_group_root_generation(sb));
-	printf("block_group_root_level\t%llu\n",
-	       (unsigned long long)btrfs_super_block_group_root_level(sb));
 
 	uuid_unparse(sb->dev_item.uuid, buf);
 	printf("dev_item.uuid\t\t%s\n", buf);
diff --git a/mkfs/common.c b/mkfs/common.c
index d5a49ca11cde..b72338551dfb 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -98,8 +98,7 @@ static int btrfs_create_tree_root(int fd, struct btrfs_mkfs_config *cfg,
 
 	for (i = 0; i < blocks_nr; i++) {
 		blk = blocks[i];
-		if (blk == MKFS_ROOT_TREE || blk == MKFS_CHUNK_TREE ||
-		    blk == MKFS_BLOCK_GROUP_TREE)
+		if (blk == MKFS_ROOT_TREE || blk == MKFS_CHUNK_TREE)
 			continue;
 
 		btrfs_set_root_bytenr(&root_item, cfg->blocks[blk]);
@@ -440,13 +439,9 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 		btrfs_set_super_compat_ro_flags(&super, ro_flags);
 		btrfs_set_super_cache_generation(&super, 0);
 	}
-	if (extent_tree_v2) {
+	if (extent_tree_v2)
 		btrfs_set_super_nr_global_roots(&super, 1);
-		btrfs_set_super_block_group_root(&super,
-						 cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
-		btrfs_set_super_block_group_root_generation(&super, 1);
-		btrfs_set_super_block_group_root_level(&super, 0);
-	}
+
 	if (cfg->label)
 		__strncpy_null(super.label, cfg->label, BTRFS_LABEL_SIZE - 1);
 
-- 
2.37.0



* [PATCH 3/3] btrfs-progs: separate block group tree from extent tree v2
  2022-07-13  7:57 [PATCH 0/3] btrfs-progs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
  2022-07-13  7:57 ` [PATCH 1/3] btrfs-progs: mkfs: dynamically modify mkfs blocks array Qu Wenruo
  2022-07-13  7:57 ` [PATCH 2/3] btrfs-progs: don't save block group root into super block Qu Wenruo
@ 2022-07-13  7:57 ` Qu Wenruo
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2022-07-13  7:57 UTC (permalink / raw)
  To: linux-btrfs

The block group tree is a completely standalone feature, and it has
been over 5 years since it was initially introduced to solve the long
mount time.

I don't really want to waste another 5 years waiting for a feature which
may or may not work, and whose preparation patches were definitely not
properly reviewed.

So this patch will separate the block group tree feature into a
standalone compat RO feature.

There is a catch in mkfs create_block_group_tree(): the current
tree-checker only accepts block group items with a valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.

This patch also fixes the above-mentioned problem so the kernel can mount
the resulting filesystem correctly.

Now mkfs/fsck should be able to handle the fs with block group tree.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/main.c               |  8 ++------
 common/fsfeatures.c        |  8 ++++++++
 common/fsfeatures.h        |  2 ++
 kernel-shared/ctree.h      |  9 ++++++++-
 kernel-shared/disk-io.c    |  4 ++--
 kernel-shared/disk-io.h    |  2 +-
 kernel-shared/print-tree.c |  5 ++---
 mkfs/common.c              | 31 ++++++++++++++++++++++++-------
 mkfs/main.c                |  3 ++-
 9 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/check/main.c b/check/main.c
index 4f7ab8b29309..02abbd5289f9 100644
--- a/check/main.c
+++ b/check/main.c
@@ -6293,7 +6293,7 @@ static int check_type_with_root(u64 rootid, u8 key_type)
 			goto err;
 		break;
 	case BTRFS_BLOCK_GROUP_ITEM_KEY:
-		if (btrfs_fs_incompat(gfs_info, EXTENT_TREE_V2)) {
+		if (btrfs_fs_compat_ro(gfs_info, BLOCK_GROUP_TREE)) {
 			if (rootid != BTRFS_BLOCK_GROUP_TREE_OBJECTID)
 				goto err;
 		} else if (rootid != BTRFS_EXTENT_TREE_OBJECTID) {
@@ -9071,10 +9071,6 @@ again:
 	ret = load_super_root(&normal_trees, gfs_info->chunk_root);
 	if (ret < 0)
 		goto out;
-	ret = load_super_root(&normal_trees, gfs_info->block_group_root);
-	if (ret < 0)
-		goto out;
-
 	ret = parse_tree_roots(&normal_trees, &dropping_trees);
 	if (ret < 0)
 		goto out;
@@ -9574,7 +9570,7 @@ again:
 	 * If we are extent tree v2 then we can reint the block group root as
 	 * well.
 	 */
-	if (btrfs_fs_incompat(gfs_info, EXTENT_TREE_V2)) {
+	if (btrfs_fs_compat_ro(gfs_info, BLOCK_GROUP_TREE)) {
 		ret = btrfs_fsck_reinit_root(trans, gfs_info->block_group_root);
 		if (ret) {
 			fprintf(stderr, "block group initialization failed\n");
diff --git a/common/fsfeatures.c b/common/fsfeatures.c
index 23a92c21a2cc..90704959b13b 100644
--- a/common/fsfeatures.c
+++ b/common/fsfeatures.c
@@ -172,6 +172,14 @@ static const struct btrfs_feature runtime_features[] = {
 		VERSION_TO_STRING2(safe, 4,9),
 		VERSION_TO_STRING2(default, 5,15),
 		.desc		= "free space tree (space_cache=v2)"
+	}, {
+		.name		= "block-group-tree",
+		.flag		= BTRFS_RUNTIME_FEATURE_BLOCK_GROUP_TREE,
+		.sysfs_name = "block_group_tree",
+		VERSION_TO_STRING2(compat, 6,0),
+		VERSION_NULL(safe),
+		VERSION_NULL(default),
+		.desc		= "block group tree to reduce mount time"
 	},
 	/* Keep this one last */
 	{
diff --git a/common/fsfeatures.h b/common/fsfeatures.h
index 9e39c667b900..a8d77fd4da05 100644
--- a/common/fsfeatures.h
+++ b/common/fsfeatures.h
@@ -45,6 +45,8 @@
 
 #define BTRFS_RUNTIME_FEATURE_QUOTA		(1ULL << 0)
 #define BTRFS_RUNTIME_FEATURE_FREE_SPACE_TREE	(1ULL << 1)
+#define BTRFS_RUNTIME_FEATURE_BLOCK_GROUP_TREE	(1ULL << 2)
+
 
 void btrfs_list_all_fs_features(u64 mask_disallowed);
 void btrfs_list_all_runtime_features(u64 mask_disallowed);
diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index c12076202577..d8909b3fdf20 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -479,6 +479,12 @@ BUILD_ASSERT(sizeof(struct btrfs_super_block) == BTRFS_SUPER_INFO_SIZE);
  */
 #define BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID	(1ULL << 1)
 
+/*
+ * Save all block group items into a dedicated block group tree, to greatly
+ * reduce mount time for large fs.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 5)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
@@ -508,7 +514,8 @@ BUILD_ASSERT(sizeof(struct btrfs_super_block) == BTRFS_SUPER_INFO_SIZE);
  */
 #define BTRFS_FEATURE_COMPAT_RO_SUPP			\
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
-	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID)
+	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID| \
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
 
 #if EXPERIMENTAL
 #define BTRFS_FEATURE_INCOMPAT_SUPP			\
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 80db5976cc3f..6eeb5ecd1d59 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -1203,7 +1203,7 @@ static int load_important_roots(struct btrfs_fs_info *fs_info,
 		backup = sb->super_roots + index;
 	}
 
-	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+	if (!btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE)) {
 		free(fs_info->block_group_root);
 		fs_info->block_group_root = NULL;
 		goto tree_root;
@@ -1256,7 +1256,7 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 	if (ret)
 		return ret;
 
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+	if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE)) {
 		ret = find_and_setup_root(root, fs_info,
 				BTRFS_BLOCK_GROUP_TREE_OBJECTID,
 				fs_info->block_group_root);
diff --git a/kernel-shared/disk-io.h b/kernel-shared/disk-io.h
index bba97fc1a814..6c8eaa2bd13d 100644
--- a/kernel-shared/disk-io.h
+++ b/kernel-shared/disk-io.h
@@ -232,7 +232,7 @@ int btrfs_global_root_insert(struct btrfs_fs_info *fs_info,
 static inline struct btrfs_root *btrfs_block_group_root(
 						struct btrfs_fs_info *fs_info)
 {
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+	if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE))
 		return fs_info->block_group_root;
 	return btrfs_extent_root(fs_info, 0);
 }
diff --git a/kernel-shared/print-tree.c b/kernel-shared/print-tree.c
index bffe30b405c7..b2ee77c2fb73 100644
--- a/kernel-shared/print-tree.c
+++ b/kernel-shared/print-tree.c
@@ -1668,6 +1668,7 @@ struct readable_flag_entry {
 static struct readable_flag_entry compat_ro_flags_array[] = {
 	DEF_COMPAT_RO_FLAG_ENTRY(FREE_SPACE_TREE),
 	DEF_COMPAT_RO_FLAG_ENTRY(FREE_SPACE_TREE_VALID),
+	DEF_COMPAT_RO_FLAG_ENTRY(BLOCK_GROUP_TREE),
 };
 static const int compat_ro_flags_num = sizeof(compat_ro_flags_array) /
 				       sizeof(struct readable_flag_entry);
@@ -1754,9 +1755,7 @@ static void print_readable_compat_ro_flag(u64 flag)
 	 */
 	return __print_readable_flag(flag, compat_ro_flags_array,
 				     compat_ro_flags_num,
-				     BTRFS_FEATURE_COMPAT_RO_SUPP |
-				     BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |
-				     BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID);
+				     BTRFS_FEATURE_COMPAT_RO_SUPP);
 }
 
 static void print_readable_incompat_flag(u64 flag)
diff --git a/mkfs/common.c b/mkfs/common.c
index b72338551dfb..cb616f13ef9b 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -75,6 +75,8 @@ static int btrfs_create_tree_root(int fd, struct btrfs_mkfs_config *cfg,
 	int blk;
 	int i;
 	u8 uuid[BTRFS_UUID_SIZE];
+	bool block_group_tree = !!(cfg->runtime_features &
+				   BTRFS_RUNTIME_FEATURE_BLOCK_GROUP_TREE);
 
 	memset(buf->data + sizeof(struct btrfs_header), 0,
 		cfg->nodesize - sizeof(struct btrfs_header));
@@ -101,6 +103,9 @@ static int btrfs_create_tree_root(int fd, struct btrfs_mkfs_config *cfg,
 		if (blk == MKFS_ROOT_TREE || blk == MKFS_CHUNK_TREE)
 			continue;
 
+		if (!block_group_tree && blk == MKFS_BLOCK_GROUP_TREE)
+			continue;
+
 		btrfs_set_root_bytenr(&root_item, cfg->blocks[blk]);
 		btrfs_set_disk_key_objectid(&disk_key,
 			reference_root_table[blk]);
@@ -216,7 +221,8 @@ static int create_block_group_tree(int fd, struct btrfs_mkfs_config *cfg,
 
 	memset(buf->data + sizeof(struct btrfs_header), 0,
 		cfg->nodesize - sizeof(struct btrfs_header));
-	write_block_group_item(buf, 0, bg_offset, bg_size, bg_used, 0,
+	write_block_group_item(buf, 0, bg_offset, bg_size, bg_used,
+			       BTRFS_FIRST_CHUNK_TREE_OBJECTID,
 			       cfg->leaf_data_size -
 			       sizeof(struct btrfs_block_group_item));
 	btrfs_set_header_bytenr(buf, cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
@@ -357,6 +363,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	u32 array_size;
 	u32 item_size;
 	u64 total_used = 0;
+	u64 ro_flags = 0;
 	int skinny_metadata = !!(cfg->features &
 				 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA);
 	u64 num_bytes;
@@ -365,6 +372,8 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	bool add_block_group = true;
 	bool free_space_tree = !!(cfg->runtime_features &
 				  BTRFS_RUNTIME_FEATURE_FREE_SPACE_TREE);
+	bool block_group_tree = !!(cfg->runtime_features &
+				   BTRFS_RUNTIME_FEATURE_BLOCK_GROUP_TREE);
 	bool extent_tree_v2 = !!(cfg->features &
 				 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2);
 
@@ -372,8 +381,13 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	       sizeof(enum btrfs_mkfs_block) * ARRAY_SIZE(default_blocks));
 	blocks_nr = ARRAY_SIZE(default_blocks);
 
-	/* Extent tree v2 needs an extra block for block group tree.*/
-	if (extent_tree_v2) {
+	/*
+	 * Add one new block for block group tree.
+	 * And for block group tree, we don't need to add block group item
+	 * into extent tree, the item will be handled in block group tree
+	 * initialization.
+	 */
+	if (block_group_tree) {
 		mkfs_blocks_add(blocks, &blocks_nr, MKFS_BLOCK_GROUP_TREE);
 		add_block_group = false;
 	}
@@ -433,12 +447,15 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 		btrfs_set_super_cache_generation(&super, -1);
 	btrfs_set_super_incompat_flags(&super, cfg->features);
 	if (free_space_tree) {
-		u64 ro_flags = BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |
-			BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID;
+		ro_flags |= (BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |
+			     BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID);
 
-		btrfs_set_super_compat_ro_flags(&super, ro_flags);
 		btrfs_set_super_cache_generation(&super, 0);
 	}
+	if (block_group_tree)
+		ro_flags |= BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE;
+	btrfs_set_super_compat_ro_flags(&super, ro_flags);
+
 	if (extent_tree_v2)
 		btrfs_set_super_nr_global_roots(&super, 1);
 
@@ -695,7 +712,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 			goto out;
 	}
 
-	if (extent_tree_v2) {
+	if (block_group_tree) {
 		ret = create_block_group_tree(fd, cfg, buf,
 					      system_group_offset,
 					      system_group_size, total_used);
diff --git a/mkfs/main.c b/mkfs/main.c
index ce096d362171..518ce0fd7523 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -299,7 +299,8 @@ static int recow_roots(struct btrfs_trans_handle *trans,
 	ret = __recow_root(trans, info->dev_root);
 	if (ret)
 		return ret;
-        if (btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
+
+	if (btrfs_fs_compat_ro(info, BLOCK_GROUP_TREE)) {
 		ret = __recow_root(trans, info->block_group_root);
 		if (ret)
 			return ret;
-- 
2.37.0

