* [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
@ 2022-07-20  5:06 Qu Wenruo
  2022-07-20  5:06 ` [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Qu Wenruo @ 2022-07-20  5:06 UTC (permalink / raw)
  To: linux-btrfs

[Changelog]
v2:
- Rebased to latest misc-next
  This fixes some random crashes not related to btrfs.

- Fix some missing conversions due to a bad branch
  I got my code messed up due to some bad local branch naming.
  The previous version sent to the ML lacked some essential conversions.

  Now it can properly pass a full fstests run with block group tree.

This is the kernel part to revive the block-group-tree feature.

Thankfully, unlike btrfs-progs, the changes to the kernel are much
smaller, and we can re-use most of the infrastructure from the
extent-tree-v2 preparation patches.

But there are still some changes needed:

- Enhance unsupported compat RO flags handling
  Extent tree is only needed for read-write operations, and for
  unsupported compat RO flags, we should not do any write into the fs.

  So this patch makes the kernel skip the block group items search
  if there are any unsupported compat RO flags.

  This really makes the incoming block-group-tree feature compat RO.

  Unfortunately, we need that patch to be backported, or older kernels
  will still reject RO mounts of fses with the block-group-tree feature.

- Don't store block group root into super block
  There is no special reason for block group root to be stored in super
  block.
  We should review those preparation patches with more scrutiny.


For the mount time reduction introduced by this patchset, the old data
should still be correct, as the on-disk format has not changed:
https://lwn.net/Articles/801990/


Qu Wenruo (3):
  btrfs: enhance unsupported compat RO flags handling
  btrfs: don't save block group root into super block
  btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2

 fs/btrfs/block-group.c     | 11 ++++++++-
 fs/btrfs/block-rsv.c       |  1 +
 fs/btrfs/ctree.h           | 30 +++--------------------
 fs/btrfs/disk-io.c         | 50 +++++++++++++++-----------------------
 fs/btrfs/disk-io.h         |  2 +-
 fs/btrfs/super.c           |  9 +++++++
 fs/btrfs/sysfs.c           |  2 ++
 fs/btrfs/transaction.c     |  8 ------
 include/uapi/linux/btrfs.h |  6 +++++
 9 files changed, 53 insertions(+), 66 deletions(-)

-- 
2.37.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling
  2022-07-20  5:06 [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
@ 2022-07-20  5:06 ` Qu Wenruo
  2022-07-20 10:20   ` Nikolay Borisov
  2022-07-20  5:07 ` [PATCH v2 2/3] btrfs: don't save block group root into super block Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2022-07-20  5:06 UTC (permalink / raw)
  To: linux-btrfs; +Cc: stable

Currently there are two corner cases where compat RO flags are not
handled correctly:

- Remount
  We can still mount the fs RO with compat RO flags, then remount it RW.
  We should not allow any write into a fs with unsupported RO flags.

- Still try to search block group items
  In fact, a behavior/on-disk format change to the extent tree should
  not need a full incompat flag.

  And since we can ensure a fs with unsupported RO flags never gets any
  writes (with the above case fixed), we can even skip the block group
  items search at mount time.

This patch enhances the unsupported compat RO flags handling by:

- Rejecting RW remount if there are unsupported compat RO flags

- Going to dummy block group items directly for unsupported compat RO
  flags
  In fact, only changes to chunk/subvolume/root/csum trees should need
  incompat flags.

The latter part should allow future changes to the extent tree to be
compat RO flags.

Thus this patch also needs to be backported to all stable trees.
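
The two checks described above can be sketched in userspace C. This is
only an illustration of the mask logic, not the kernel code: the names
SUPPORTED_COMPAT_RO, unsupported_compat_ro(), may_remount_rw() and
must_use_dummy_block_groups() are hypothetical, and the supported mask
is a placeholder value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder mask of compat RO bits this "kernel" understands (assumption). */
#define SUPPORTED_COMPAT_RO ((1ULL << 0) | (1ULL << 1) | (1ULL << 2))

/* Return only the compat RO bits that are set on disk but not understood. */
uint64_t unsupported_compat_ro(uint64_t on_disk_flags)
{
	return on_disk_flags & ~SUPPORTED_COMPAT_RO;
}

/* A RW remount is allowed only when every compat RO bit is understood. */
bool may_remount_rw(uint64_t on_disk_flags)
{
	return unsupported_compat_ro(on_disk_flags) == 0;
}

/*
 * Without an extent root, or with unknown compat RO bits, the fs can
 * never become RW, so dummy block groups are sufficient.
 */
bool must_use_dummy_block_groups(bool have_extent_root, uint64_t on_disk_flags)
{
	return !have_extent_root || unsupported_compat_ro(on_disk_flags) != 0;
}
```

With this placeholder mask, a fs carrying bit 5 could still be mounted
RO with dummy block groups, but a RW remount would be refused.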

Cc: stable@vger.kernel.org
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/block-group.c | 11 ++++++++++-
 fs/btrfs/super.c       |  9 +++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index c9475219c70c..88d23d6760f0 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2206,7 +2206,16 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
 	int need_clear = 0;
 	u64 cache_gen;
 
-	if (!root)
+	/*
+	 * Either no extent root (with ibadroots rescue option) or we have
+	 * unsupported RO options. The fs can never be mounted RW, so no
+	 * need to waste time searching block group items.
+	 *
+	 * This also allows new extent tree related changes to be RO compat,
+	 * no need for a full incompat flag.
+	 */
+	if (!root || (btrfs_super_compat_ro_flags(info->super_copy) &
+		      ~BTRFS_FEATURE_COMPAT_RO_SUPP))
 		return fill_dummy_bgs(info);
 
 	key.objectid = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4c7089b1681b..7d3213e67fb5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2110,6 +2110,15 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			ret = -EINVAL;
 			goto restore;
 		}
+		if (btrfs_super_compat_ro_flags(fs_info->super_copy) &
+		    ~BTRFS_FEATURE_COMPAT_RO_SUPP) {
+			btrfs_err(fs_info,
+		"can not remount read-write due to unsupported optional flags 0x%llx",
+				btrfs_super_compat_ro_flags(fs_info->super_copy) &
+				~BTRFS_FEATURE_COMPAT_RO_SUPP);
+			ret = -EINVAL;
+			goto restore;
+		}
 		if (fs_info->fs_devices->rw_devices == 0) {
 			ret = -EACCES;
 			goto restore;
-- 
2.37.0



* [PATCH v2 2/3] btrfs: don't save block group root into super block
  2022-07-20  5:06 [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
  2022-07-20  5:06 ` [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling Qu Wenruo
@ 2022-07-20  5:07 ` Qu Wenruo
  2022-07-20  5:07 ` [PATCH v2 3/3] btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 Qu Wenruo
  2022-07-26 17:59 ` [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 David Sterba
  3 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2022-07-20  5:07 UTC (permalink / raw)
  To: linux-btrfs

The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.

My initial proposal years ago just added a new tree rootid, and loaded it
from tree root, just like what we did for quota/free space tree/uuid/extent
roots.
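
As a rough sketch of what "load it from tree root" means: the tree root
holds one root item per tree, found by a key of (objectid, ROOT_ITEM_KEY,
0). The code below stands in for that lookup with a linear search over an
array. The struct layouts and find_root() are hypothetical, and the
numeric constants should be treated as illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; treat the numeric values as assumptions. */
#define ROOT_ITEM_KEY			132
#define DEV_TREE_OBJECTID		4ULL
#define BLOCK_GROUP_TREE_OBJECTID	11ULL

struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Simplified root item: just the key and where the tree's root node lives. */
struct root_item {
	struct key key;
	uint64_t bytenr;
};

/* Linear search standing in for a b-tree lookup in the tree root. */
const struct root_item *find_root(const struct root_item *items, size_t n,
				  uint64_t objectid)
{
	for (size_t i = 0; i < n; i++) {
		const struct key *k = &items[i].key;

		if (k->objectid == objectid && k->type == ROOT_ITEM_KEY &&
		    k->offset == 0)
			return &items[i];
	}
	return NULL;	/* caller decides whether a missing root is fatal */
}
```

A missing root simply yields NULL here, which mirrors how the caller can
either fail the mount or continue under a rescue option.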

But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.

Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:

- Chunk root
  Needed for bootstrap.

- Tree root
  Really the entrance of all trees.

- Log root
  This is special as log root has to be updated out of existing
  transaction mechanism.

There is no reason at all to put block group root into super blocks;
the block group tree is updated at the same time as the old extent tree,
with no need for an extra bootstrap/out-of-transaction update.

So just move block group root from super block into tree root.
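
The super block size must not change, so the bytes freed by the three
removed fields have to be absorbed by the reserved arrays. The arithmetic
can be checked with a small sketch; struct old_tail and struct new_tail
are hypothetical names covering only the affected part of the super
block, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Affected tail of the old layout: block group root fields plus padding. */
struct old_tail {
	uint64_t block_group_root;		/* 8 bytes */
	uint64_t block_group_root_generation;	/* 8 bytes */
	uint8_t block_group_root_level;		/* 1 byte */
	uint8_t reserved8[7];			/* 7 bytes */
	uint64_t reserved[25];			/* 200 bytes */
} __attribute__((packed));

/* Same region after the patch: only reserved space remains. */
struct new_tail {
	uint8_t reserved8[8];			/* 8 bytes */
	uint64_t reserved[27];			/* 216 bytes */
} __attribute__((packed));

/* 8 + 8 + 1 + 7 + 200 == 8 + 216 == 224 */
static_assert(sizeof(struct old_tail) == sizeof(struct new_tail),
	      "reserved space must absorb the removed fields exactly");
```

The 17 removed bytes become one extra reserved8 byte and two extra
reserved __le64 slots, so the overall structure size stays the same.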

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/block-rsv.c   |  1 +
 fs/btrfs/ctree.h       | 27 ++-------------------------
 fs/btrfs/disk-io.c     | 40 ++++++++++++++++++++--------------------
 fs/btrfs/transaction.c |  8 --------
 4 files changed, 23 insertions(+), 53 deletions(-)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 06be0644dd37..6ce704d3bdd2 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -424,6 +424,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
 	case BTRFS_CSUM_TREE_OBJECTID:
 	case BTRFS_EXTENT_TREE_OBJECTID:
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
+	case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
 		root->block_rsv = &fs_info->delayed_refs_rsv;
 		break;
 	case BTRFS_ROOT_TREE_OBJECTID:
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4db85b9dc7ed..7a1ff777f61b 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -280,14 +280,9 @@ struct btrfs_super_block {
 	/* the UUID written into btree blocks */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
 
-	/* Extent tree v2 */
-	__le64 block_group_root;
-	__le64 block_group_root_generation;
-	u8 block_group_root_level;
-
 	/* future expansion */
-	u8 reserved8[7];
-	__le64 reserved[25];
+	u8 reserved8[8];
+	__le64 reserved[27];
 	u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE];
 	struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS];
 
@@ -2392,17 +2387,6 @@ BTRFS_SETGET_STACK_FUNCS(backup_bytes_used, struct btrfs_root_backup,
 BTRFS_SETGET_STACK_FUNCS(backup_num_devices, struct btrfs_root_backup,
 		   num_devices, 64);
 
-/*
- * For extent tree v2 we overload the extent root with the block group root, as
- * we will have multiple extent roots.
- */
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root, struct btrfs_root_backup,
-			 extent_root, 64);
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root_gen, struct btrfs_root_backup,
-			 extent_root_gen, 64);
-BTRFS_SETGET_STACK_FUNCS(backup_block_group_root_level,
-			 struct btrfs_root_backup, extent_root_level, 8);
-
 /* struct btrfs_balance_item */
 BTRFS_SETGET_FUNCS(balance_flags, struct btrfs_balance_item, flags, 64);
 
@@ -2535,13 +2519,6 @@ BTRFS_SETGET_STACK_FUNCS(super_cache_generation, struct btrfs_super_block,
 BTRFS_SETGET_STACK_FUNCS(super_magic, struct btrfs_super_block, magic, 64);
 BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block,
 			 uuid_tree_generation, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root, struct btrfs_super_block,
-			 block_group_root, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root_generation,
-			 struct btrfs_super_block,
-			 block_group_root_generation, 64);
-BTRFS_SETGET_STACK_FUNCS(super_block_group_root_level, struct btrfs_super_block,
-			 block_group_root_level, 8);
 
 int btrfs_super_csum_size(const struct btrfs_super_block *s);
 const char *btrfs_super_csum_name(u16 csum_type);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3fac429cf8a4..91d443755174 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1608,6 +1608,9 @@ static struct btrfs_root *btrfs_get_global_root(struct btrfs_fs_info *fs_info,
 	if (objectid == BTRFS_UUID_TREE_OBJECTID)
 		return btrfs_grab_root(fs_info->uuid_root) ?
 			fs_info->uuid_root : ERR_PTR(-ENOENT);
+	if (objectid == BTRFS_BLOCK_GROUP_TREE_OBJECTID)
+		return btrfs_grab_root(fs_info->block_group_root) ?
+			fs_info->block_group_root : ERR_PTR(-ENOENT);
 	if (objectid == BTRFS_FREE_SPACE_TREE_OBJECTID) {
 		struct btrfs_root *root = btrfs_global_root(fs_info, &key);
 
@@ -2064,14 +2067,7 @@ static void backup_super_roots(struct btrfs_fs_info *info)
 	btrfs_set_backup_chunk_root_level(root_backup,
 			       btrfs_header_level(info->chunk_root->node));
 
-	if (btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
-		btrfs_set_backup_block_group_root(root_backup,
-					info->block_group_root->node->start);
-		btrfs_set_backup_block_group_root_gen(root_backup,
-			btrfs_header_generation(info->block_group_root->node));
-		btrfs_set_backup_block_group_root_level(root_backup,
-			btrfs_header_level(info->block_group_root->node));
-	} else {
+	if (!btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
 		struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
 		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
 
@@ -2613,10 +2609,24 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	if (ret)
 		return ret;
 
-	location.objectid = BTRFS_DEV_TREE_OBJECTID;
 	location.type = BTRFS_ROOT_ITEM_KEY;
 	location.offset = 0;
 
+	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+		location.objectid = BTRFS_BLOCK_GROUP_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root)) {
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+				ret = PTR_ERR(root);
+				goto out;
+			}
+		} else {
+			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+			fs_info->block_group_root = root;
+		}
+	}
+
+	location.objectid = BTRFS_DEV_TREE_OBJECTID;
 	root = btrfs_read_tree_root(tree_root, &location);
 	if (IS_ERR(root)) {
 		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
@@ -2944,17 +2954,7 @@ static int load_important_roots(struct btrfs_fs_info *fs_info)
 		btrfs_warn(fs_info, "couldn't read tree root");
 		return ret;
 	}
-
-	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
-		return 0;
-
-	bytenr = btrfs_super_block_group_root(sb);
-	gen = btrfs_super_block_group_root_generation(sb);
-	level = btrfs_super_block_group_root_level(sb);
-	ret = load_super_root(fs_info->block_group_root, bytenr, gen, level);
-	if (ret)
-		btrfs_warn(fs_info, "couldn't read block group root");
-	return ret;
+	return 0;
 }
 
 static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 0bec10740ad3..8fab3b274957 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1912,14 +1912,6 @@ static void update_super_roots(struct btrfs_fs_info *fs_info)
 		super->cache_generation = 0;
 	if (test_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, &fs_info->flags))
 		super->uuid_tree_generation = root_item->generation;
-
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
-		root_item = &fs_info->block_group_root->root_item;
-
-		super->block_group_root = root_item->bytenr;
-		super->block_group_root_generation = root_item->generation;
-		super->block_group_root_level = root_item->level;
-	}
 }
 
 int btrfs_transaction_in_commit(struct btrfs_fs_info *info)
-- 
2.37.0



* [PATCH v2 3/3] btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2
  2022-07-20  5:06 [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
  2022-07-20  5:06 ` [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling Qu Wenruo
  2022-07-20  5:07 ` [PATCH v2 2/3] btrfs: don't save block group root into super block Qu Wenruo
@ 2022-07-20  5:07 ` Qu Wenruo
  2022-07-26 17:59 ` [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 David Sterba
  3 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2022-07-20  5:07 UTC (permalink / raw)
  To: linux-btrfs

The problem of long mount time caused by the block group item search has
been known for over 7 years, and the block group tree solution was
proposed at least 5 years ago.

There is really no need to bind this feature to extent tree v2; just
introduce a compat RO flag, BLOCK_GROUP_TREE, to correctly solve the
problem.

All the code handling block group root is already in the upstream
kernel, thus this patch really only needs to introduce the new compat RO
flag.
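
The effect of the new flag on root selection can be sketched as follows.
The bit value is the one used in the uapi hunk of this patch; struct
fs_state, has_block_group_tree() and block_group_items_root() are
hypothetical userspace stand-ins, not the kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit value as it appears in the uapi header hunk of this patch. */
#define COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 5)

enum root_kind { EXTENT_ROOT, BLOCK_GROUP_ROOT };

/* Hypothetical stand-in for the per-fs state kept by the kernel. */
struct fs_state {
	uint64_t compat_ro_flags;
};

bool has_block_group_tree(const struct fs_state *fs)
{
	return (fs->compat_ro_flags & COMPAT_RO_BLOCK_GROUP_TREE) != 0;
}

/*
 * Block group items live in the dedicated tree when the flag is set,
 * otherwise they stay in the extent tree.
 */
enum root_kind block_group_items_root(const struct fs_state *fs)
{
	return has_block_group_tree(fs) ? BLOCK_GROUP_ROOT : EXTENT_ROOT;
}
```

Keying the choice on a compat RO bit rather than an incompat bit is what
lets kernels that do not know the feature still mount the fs read-only.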

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h           |  3 ++-
 fs/btrfs/disk-io.c         | 14 ++------------
 fs/btrfs/disk-io.h         |  2 +-
 fs/btrfs/sysfs.c           |  2 ++
 include/uapi/linux/btrfs.h |  6 ++++++
 5 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7a1ff777f61b..94268c07dbfe 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -302,7 +302,8 @@ static_assert(sizeof(struct btrfs_super_block) == BTRFS_SUPER_INFO_SIZE);
 #define BTRFS_FEATURE_COMPAT_RO_SUPP			\
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE |	\
 	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
-	 BTRFS_FEATURE_COMPAT_RO_VERITY)
+	 BTRFS_FEATURE_COMPAT_RO_VERITY |		\
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
 
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 91d443755174..b925bb443e0d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2067,7 +2067,7 @@ static void backup_super_roots(struct btrfs_fs_info *info)
 	btrfs_set_backup_chunk_root_level(root_backup,
 			       btrfs_header_level(info->chunk_root->node));
 
-	if (!btrfs_fs_incompat(info, EXTENT_TREE_V2)) {
+	if (!btrfs_fs_compat_ro(info, BLOCK_GROUP_TREE)) {
 		struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
 		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
 
@@ -2612,7 +2612,7 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	location.type = BTRFS_ROOT_ITEM_KEY;
 	location.offset = 0;
 
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
+	if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE)) {
 		location.objectid = BTRFS_BLOCK_GROUP_TREE_OBJECTID;
 		root = btrfs_read_tree_root(tree_root, &location);
 		if (IS_ERR(root)) {
@@ -2966,16 +2966,6 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
 	int ret = 0;
 	int i;
 
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
-		struct btrfs_root *root;
-
-		root = btrfs_alloc_root(fs_info, BTRFS_BLOCK_GROUP_TREE_OBJECTID,
-					GFP_KERNEL);
-		if (!root)
-			return -ENOMEM;
-		fs_info->block_group_root = root;
-	}
-
 	for (i = 0; i < BTRFS_NUM_BACKUP_ROOTS; i++) {
 		if (handle_error) {
 			if (!IS_ERR(tree_root->node))
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 8993b428e09c..e7d289b0ed14 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -103,7 +103,7 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root)
 
 static inline struct btrfs_root *btrfs_block_group_root(struct btrfs_fs_info *fs_info)
 {
-	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
+	if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE))
 		return fs_info->block_group_root;
 	return btrfs_extent_root(fs_info, 0);
 }
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index d5d0717fd09a..b2eb6d40b21d 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -286,6 +286,7 @@ BTRFS_FEAT_ATTR_INCOMPAT(skinny_metadata, SKINNY_METADATA);
 BTRFS_FEAT_ATTR_INCOMPAT(no_holes, NO_HOLES);
 BTRFS_FEAT_ATTR_INCOMPAT(metadata_uuid, METADATA_UUID);
 BTRFS_FEAT_ATTR_COMPAT_RO(free_space_tree, FREE_SPACE_TREE);
+BTRFS_FEAT_ATTR_COMPAT_RO(block_group_tree, BLOCK_GROUP_TREE);
 BTRFS_FEAT_ATTR_INCOMPAT(raid1c34, RAID1C34);
 #ifdef CONFIG_BLK_DEV_ZONED
 BTRFS_FEAT_ATTR_INCOMPAT(zoned, ZONED);
@@ -316,6 +317,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = {
 	BTRFS_FEAT_ATTR_PTR(no_holes),
 	BTRFS_FEAT_ATTR_PTR(metadata_uuid),
 	BTRFS_FEAT_ATTR_PTR(free_space_tree),
+	BTRFS_FEAT_ATTR_PTR(block_group_tree),
 	BTRFS_FEAT_ATTR_PTR(raid1c34),
 #ifdef CONFIG_BLK_DEV_ZONED
 	BTRFS_FEAT_ATTR_PTR(zoned),
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index f54dc91e4025..5f79610e1c72 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -290,6 +290,12 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID	(1ULL << 1)
 #define BTRFS_FEATURE_COMPAT_RO_VERITY			(1ULL << 2)
 
+/*
+ * Put all block group items into a dedicated block group tree, greatly
+ * reducing mount time for large fs.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 5)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
-- 
2.37.0



* Re: [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling
  2022-07-20  5:06 ` [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling Qu Wenruo
@ 2022-07-20 10:20   ` Nikolay Borisov
  0 siblings, 0 replies; 11+ messages in thread
From: Nikolay Borisov @ 2022-07-20 10:20 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: stable



On 20.07.22 г. 8:06 ч., Qu Wenruo wrote:
> Currently there are two corner cases not handling compat RO flags
> correctly:
> 
> - Remount
>    We can still mount the fs RO with compat RO flags, then remount it RW.
>    We should not allow any write into a fs with unsupported RO flags.
> 
> - Still try to search block group items
>    In fact, behavior/on-disk format change to extent tree should not
>    need a full incompat flag.
> 
>    And since we can ensure fs with unsupported RO flags never got any
>    writes (with above case fixed), then we can even skip block group
>    items search at mount time.
> 
> This patch will enhance the unsupported RO compat flags by:
> 
> - Reject RW remount if there is unsupported RO compat flags
> 
> - Go dummy block group items directly for unsupported RO compat flags
>    In fact, only changes to chunk/subvolume/root/csum trees should go
>    incompat flags.
> 
> The latter part should allow future change to extent tree to be compat
> RO flags.
> 
> Thus this patch also needs to be backported to all stable trees.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-20  5:06 [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
                   ` (2 preceding siblings ...)
  2022-07-20  5:07 ` [PATCH v2 3/3] btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 Qu Wenruo
@ 2022-07-26 17:59 ` David Sterba
  2022-07-26 21:47   ` Qu Wenruo
  3 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2022-07-26 17:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Wed, Jul 20, 2022 at 01:06:58PM +0800, Qu Wenruo wrote:
> Qu Wenruo (3):
>   btrfs: enhance unsupported compat RO flags handling
>   btrfs: don't save block group root into super block
>   btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2

It's a short series and I don't see any new code to use the separate tree
for bg items, so it's on top of the extent tree v2, right?

From the last time we were experimenting with the block group tree, I
was trying to avoid a new tree but there were problems. So, I think we
can go with the separate tree that you suggest. We have reports about
slow mount and people use large filesystems, so this is justified.

Will it be possible to convert existing filesystem to use the bg tree?
I'm not sure about a remount; that would need a new option just for a
single use. We could possibly use the sysfs interface to trigger it, or
leave it as an offline change by btrfstune.


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-26 17:59 ` [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 David Sterba
@ 2022-07-26 21:47   ` Qu Wenruo
  2022-07-26 21:52     ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2022-07-26 21:47 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/7/27 01:59, David Sterba wrote:
> On Wed, Jul 20, 2022 at 01:06:58PM +0800, Qu Wenruo wrote:
>> Qu Wenruo (3):
>>    btrfs: enhance unsupported compat RO flags handling
>>    btrfs: don't save block group root into super block
>>    btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2
>
> It's short series and I don't see any new code to use the separate tree
> for bg items, so it's on top of the extent tree v2, right?

Yes, it's based on the extent tree v2 preparation code that is already
in the mainline code.

>
>  From the last time we were experimenting with the block group tree, I
> was trying to avoid a new tree but there were problems. So, I think we
> can go with the separate tree that you suggest. We have reports about
> slow mount and people use large filesystems, so this is justified.
>
> Will it be possible to convert existing filesystem to use the bg tree?

Yes, that's fully planned. As with the old bg tree code, a btrfs-progs
conversion tool will be provided (mostly in btrfstune).

Thanks,
Qu

> I'm not sure about a remount, that would need a new option and for
> single use. We could possibly use the sysfs interface to trigger it, or
> leave it to offline change by btrfstune.


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-26 21:47   ` Qu Wenruo
@ 2022-07-26 21:52     ` David Sterba
  2022-07-26 22:09       ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2022-07-26 21:52 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Wed, Jul 27, 2022 at 05:47:25AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/7/27 01:59, David Sterba wrote:
> > On Wed, Jul 20, 2022 at 01:06:58PM +0800, Qu Wenruo wrote:
> >> Qu Wenruo (3):
> >>    btrfs: enhance unsupported compat RO flags handling
> >>    btrfs: don't save block group root into super block
> >>    btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2
> >
> > It's short series and I don't see any new code to use the separate tree
> > for bg items, so it's on top of the extent tree v2, right?
> 
> Yes, it's based on extent tree v2 prepare code that is already in the
> mainline code.
> 
> >
> >  From the last time we were experimenting with the block group tree, I
> > was trying to avoid a new tree but there were problems. So, I think we
> > can go with the separate tree that you suggest. We have reports about
> > slow mount and people use large filesystems, so this is justified.
> >
> > Will it be possible to convert existing filesystem to use the bg tree?
> 
> Yes, that's completely planned as the old bg tree code, btrfs-progs
> convert tool will be provided (mostly in btrfstune).

Ok, good. I'm wondering whether we should go for an online conversion
too, because on a many-TB filesystem it could take a long time, but the
benefit is not having to unmount to do the conversion.

We could copy what the free space conversion does on remount, for bg
tree implemented as "set some flag via sysfs" and then trigger a remount
that does all the work. It should be less or comparable work to free
space tree conversion; it's basically copying the block group items to
the new tree and deleting them from the extent tree.


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-26 21:52     ` David Sterba
@ 2022-07-26 22:09       ` Qu Wenruo
  2022-07-28 10:02         ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2022-07-26 22:09 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/7/27 05:52, David Sterba wrote:
> On Wed, Jul 27, 2022 at 05:47:25AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2022/7/27 01:59, David Sterba wrote:
>>> On Wed, Jul 20, 2022 at 01:06:58PM +0800, Qu Wenruo wrote:
>>>> Qu Wenruo (3):
>>>>     btrfs: enhance unsupported compat RO flags handling
>>>>     btrfs: don't save block group root into super block
>>>>     btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2
>>>
>>> It's short series and I don't see any new code to use the separate tree
>>> for bg items, so it's on top of the extent tree v2, right?
>>
>> Yes, it's based on extent tree v2 prepare code that is already in the
>> mainline code.
>>
>>>
>>>   From the last time we were experimenting with the block group tree, I
>>> was trying to avoid a new tree but there were problems. So, I think we
>>> can go with the separate tree that you suggest. We have reports about
>>> slow mount and people use large filesystems, so this is justified.
>>>
>>> Will it be possible to convert existing filesystem to use the bg tree?
>>
>> Yes, that's completely planned as the old bg tree code, btrfs-progs
>> convert tool will be provided (mostly in btrfstune).
>
> Ok, good. I'm thinking if we should go for an online conversion too or
> not, because on a many-TB filesystem it would possibly take a long time
> but the benefit is not to unmount and do the conversion.

In my previous tests, even for a TB level (used space) fs, it only took
seconds to do the conversion (although on SSD).

For HDD systems, the conversion would be as slow as the mount time.
Most of the time would be spent just searching the block group items;
writing them into the bg tree would be super fast though.

Currently I'm working on a multi-transaction solution in btrfstune to be
extra safe on the conversion.
(The previous code does the conversion in one transaction, which may or
may not handle thousands of bg items).
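
To illustrate the multi-transaction idea only: the sketch below moves
items in bounded batches, with each batch standing in for one committed
transaction. It is not the btrfstune code. struct bg_item,
convert_one_batch() and convert_all() are hypothetical, and the deletion
of the old items from the extent tree is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bg_item {
	uint64_t start;
	uint64_t length;
};

/*
 * Copy up to `batch` items from src[*done..total) into dst, as one
 * simulated transaction. Returns the number of items moved.
 */
size_t convert_one_batch(const struct bg_item *src, size_t total,
			 struct bg_item *dst, size_t *done, size_t batch)
{
	size_t moved = 0;

	assert(*done <= total);
	while (*done < total && moved < batch) {
		dst[*done] = src[*done];
		(*done)++;
		moved++;
	}
	return moved;
}

/* Run batches until everything is converted; returns the batch count. */
size_t convert_all(const struct bg_item *src, size_t total,
		   struct bg_item *dst, size_t batch)
{
	size_t done = 0, transactions = 0;

	assert(batch > 0);
	while (done < total) {
		convert_one_batch(src, total, dst, &done, batch);
		transactions++;
	}
	return transactions;
}
```

Bounding the batch size keeps each transaction small regardless of how
many block group items the fs has, at the cost of needing a way to
resume from a partially converted state.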

>
> We could copy what the free space conversion does on remount, for bg
> tree implemented as "set some flag via sysfs" and then trigger remount
> that does all the work. It should be less or comparable work to free
> space tree conversion, it's basically copying the block group items to
> the new tree and deleting from extent tree.

I'd rather not do any conversion in the kernel, even though it may not
be that complex.

Shouldn't we keep the kernel code small and put the conversion entirely
into progs?

Thanks,
Qu


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-26 22:09       ` Qu Wenruo
@ 2022-07-28 10:02         ` Qu Wenruo
  2022-08-03 19:10           ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2022-07-28 10:02 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2022/7/27 06:09, Qu Wenruo wrote:
>
>
> On 2022/7/27 05:52, David Sterba wrote:
>> On Wed, Jul 27, 2022 at 05:47:25AM +0800, Qu Wenruo wrote:
>>>
>>>
>>> On 2022/7/27 01:59, David Sterba wrote:
>>>> On Wed, Jul 20, 2022 at 01:06:58PM +0800, Qu Wenruo wrote:
>>>>> Qu Wenruo (3):
>>>>>     btrfs: enhance unsupported compat RO flags handling
>>>>>     btrfs: don't save block group root into super block
>>>>>     btrfs: separate BLOCK_GROUP_TREE compat RO flag from
>>>>> EXTENT_TREE_V2
>>>>
>>>> It's short series and I don't see any new code to use the separate tree
>>>> for bg items, so it's on top of the extent tree v2, right?
>>>
>>> Yes, it's based on extent tree v2 prepare code that is already in the
>>> mainline code.
>>>
>>>>
>>>>   From the last time we were experimenting with the block group tree, I
>>>> was trying to avoid a new tree but there were problems. So, I think we
>>>> can go with the separate tree that you suggest. We have reports about
>>>> slow mount and people use large filesystems, so this is justified.
>>>>
>>>> Will it be possible to convert existing filesystem to use the bg tree?
>>>
>>> Yes, that's completely planned as the old bg tree code, btrfs-progs
>>> convert tool will be provided (mostly in btrfstune).
>>
>> Ok, good. I'm thinking if we should go for an online conversion too or
>> not, because on a many-TB filesystem it would possibly take a long time
>> but the benefit is not to unmount and do the conversion.
>
> For my previous tests, even TB level (used space) fs, it only takes
> seconds to do the convert (although on SSD).
>
> For HDD systems, it would be as slow as the mount time for the convert.
> Most of time spent would be just searching the block group items,
> writing them into bg tree would be super fast though.

Apart from the incoming multi-transaction bg tree conversion tool
(bidirectional), would you mind if I update the kernel series to address
one of Josef's concerns?

To reduce the test matrix, I'd like to make bg tree rely on the free
space tree and no holes features.

Although those features have no linkage to each other, such an
artificial requirement should greatly reduce our test combinations.

Thanks,
Qu
>
> Currently I'm working on a multi-transaction solution in btrfstune to be
> extra safe on the convert.
> (Previous code is one transaction to do the convert, which may or may
> not handle thousands of bg items).
>
>>
>> We could copy what the free space conversion does on remount, for bg
>> tree implemented as "set some flag via sysfs" and then trigger remount
>> that does all the work. It should be less or comparable work to free
>> space tree conversion, it's basically copying the block group items to
>> the new tree and deleting from extent tree.
>
> I tend not to do any convert in kernel even it may not be that complex.
>
> Shouldn't we keep the kernel code small and put the convert thing all
> into progs?
>
> Thanks,
> Qu


* Re: [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2
  2022-07-28 10:02         ` Qu Wenruo
@ 2022-08-03 19:10           ` David Sterba
  0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2022-08-03 19:10 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Thu, Jul 28, 2022 at 06:02:44PM +0800, Qu Wenruo wrote:
> >> Ok, good. I'm thinking if we should go for an online conversion too or
> >> not, because on a many-TB filesystem it would possibly take a long time
> >> but the benefit is not to unmount and do the conversion.
> >
> > For my previous tests, even TB level (used space) fs, it only takes
> > seconds to do the convert (although on SSD).
> >
> > For HDD systems, it would be as slow as the mount time for the convert.
> > Most of time spent would be just searching the block group items,
> > writing them into bg tree would be super fast though.
> 
> Despite the incoming multi-transaction bg tree convert tool
> (bidirectional), mind me to update the kernel series to address one of
> the concern from Josef?
> 
> To reduce the test matrix, I'd like to make bg tree to rely on free
> space tree and no holes features.
> 
> Although those features have no linkage to each other, such artificial
> requirement should greatly reduce our test combinations.

That's a good idea. We add the features incrementally and we've probably
reached the point where we should have a basic set that everybody wants
and should use.

Features that are clear optimizations or improvements should be easy to
decide; the others depend e.g. on hardware (zoned) or are individual
features like quotas or raid1c34.

On the mkfs side it would require more checks so that selecting
block-group-tree while disabling free-space-tree can't be done, plus
additional checks in the kernel.
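
A minimal sketch of such a dependency check, assuming block-group-tree
ends up requiring both free-space-tree and no-holes as proposed earlier
in the thread. bg_tree_deps_ok() is a hypothetical helper, and the bit
positions are illustrative and should be checked against the uapi header
before reuse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions (assumptions for this sketch). */
#define COMPAT_RO_FREE_SPACE_TREE	(1ULL << 0)
#define COMPAT_RO_BLOCK_GROUP_TREE	(1ULL << 5)
#define INCOMPAT_NO_HOLES		(1ULL << 9)

/*
 * Return true if the feature combination is acceptable: block group
 * tree is only allowed together with free space tree and no holes.
 */
bool bg_tree_deps_ok(uint64_t compat_ro, uint64_t incompat)
{
	if (!(compat_ro & COMPAT_RO_BLOCK_GROUP_TREE))
		return true;	/* nothing to enforce without bg tree */

	return (compat_ro & COMPAT_RO_FREE_SPACE_TREE) &&
	       (incompat & INCOMPAT_NO_HOLES);
}
```

The same predicate could be applied both at mkfs time and at mount time,
so that an invalid combination is rejected early on either side.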


Thread overview: 11+ messages
2022-07-20  5:06 [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 Qu Wenruo
2022-07-20  5:06 ` [PATCH v2 1/3] btrfs: enhance unsupported compat RO flags handling Qu Wenruo
2022-07-20 10:20   ` Nikolay Borisov
2022-07-20  5:07 ` [PATCH v2 2/3] btrfs: don't save block group root into super block Qu Wenruo
2022-07-20  5:07 ` [PATCH v2 3/3] btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 Qu Wenruo
2022-07-26 17:59 ` [PATCH v2 0/3] btrfs: separate BLOCK_GROUP_TREE feature from extent-tree-v2 David Sterba
2022-07-26 21:47   ` Qu Wenruo
2022-07-26 21:52     ` David Sterba
2022-07-26 22:09       ` Qu Wenruo
2022-07-28 10:02         ` Qu Wenruo
2022-08-03 19:10           ` David Sterba
