Linux-BTRFS Archive on lore.kernel.org
* [PATCH RFC 0/7] 
@ 2019-11-04 12:03 Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 1/7] btrfs-progs: check/lowmem: Lookup block group item in a separate function Qu Wenruo
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

This patchset can be fetched from github:
https://github.com/adam900710/btrfs-progs/tree/skinny_bg_tree
Which is based on david/devel branch.
HEAD is:
bdb42fb63382e8aca6bd02fd04a28e415408d4ea (david/devel) btrfs-progs: tests: Test backup root retention logic

This patchset provides the needed user space infrastructure for
SKINNY_BG_TREE feature.

Since it's a new incompatible feature, unlike SKINNY_METADATA, btrfs-progs
is needed to convert an existing (unmounted) fs to the new format.
Alternatively, the feature can be enabled at mkfs time.

For the performance improvement, please check the kernel patchset cover
letter or the last patch.
(SPOILER ALERT: It's super-duper fast, even faster than regular bg tree)

The challenge here is that, even though we have some patches merged into
the devel branch, the changed definition of key->offset for the block group
item means we have to refactor more functions to implement SKINNY_BG_TREE.
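
To make the key->offset change concrete, here is a minimal sketch. It is
not code from this series: struct toy_key and the two helpers are
hypothetical stand-ins, and only the key type values (192 and 193) are
taken from the ctree.h hunk in patch 2.

```c
#include <stdint.h>

/* Key type values as defined in ctree.h by this series. */
#define BLOCK_GROUP_ITEM_KEY        192
#define SKINNY_BLOCK_GROUP_ITEM_KEY 193

/* Hypothetical stand-in for struct btrfs_key, for illustration only. */
struct toy_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Regular block group item: key.offset is the block group length. */
static struct toy_key regular_bg_key(uint64_t bytenr, uint64_t len)
{
	struct toy_key key = { bytenr, BLOCK_GROUP_ITEM_KEY, len };

	return key;
}

/* Skinny block group item: key.offset is the used bytes, no item data. */
static struct toy_key skinny_bg_key(uint64_t bytenr, uint64_t used)
{
	struct toy_key key = { bytenr, SKINNY_BLOCK_GROUP_ITEM_KEY, used };

	return key;
}
```

Any caller that used to read the block group length from key.offset
therefore has to get it from somewhere else for skinny items, which is
where the extra refactoring comes from.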

Qu Wenruo (7):
  btrfs-progs: check/lowmem: Lookup block group item in a separate
    function
  btrfs-progs: Enable read-write ability for 'skinny_bg_tree' feature
  btrfs-progs: mkfs: Introduce -O skinny-bg-tree
  btrfs-progs: dump-tree/dump-super: Introduce support for skinny bg
    tree
  btrfs-progs: Refactor btrfs_new_block_group_record() to accept
    parameters directly
  btrfs-progs: check: Introduce support for bg-tree feature
  btrfs-progs: btrfstune: Allow to enable bg-tree feature offline

 Documentation/btrfstune.asciidoc |   6 +
 btrfsck.h                        |   4 +-
 btrfstune.c                      |  45 +++++-
 check/common.h                   |   4 +-
 check/main.c                     |  63 ++++++--
 check/mode-lowmem.c              | 137 ++++++++++++----
 cmds/inspect-dump-super.c        |   3 +-
 cmds/inspect-dump-tree.c         |   5 +
 cmds/rescue-chunk-recover.c      |   6 +-
 common/fsfeatures.c              |   6 +
 ctree.h                          |  18 ++-
 disk-io.c                        |  21 ++-
 extent-tree.c                    | 269 +++++++++++++++++++++++++++++--
 mkfs/common.c                    |   5 +-
 mkfs/main.c                      |  25 +++
 print-tree.c                     |   4 +
 transaction.c                    |   1 +
 17 files changed, 549 insertions(+), 73 deletions(-)

-- 
2.23.0



* [PATCH RFC 1/7] btrfs-progs: check/lowmem: Lookup block group item in a separate function
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
@ 2019-11-04 12:03 ` Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 2/7] btrfs-progs: Enable read-write ability for 'skinny_bg_tree' feature Qu Wenruo
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

In check_chunk_item() we search extent tree for block group item.

Refactor this part into a separate function, find_block_group_item(),
so that later skinny-bg-tree feature can reuse it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/mode-lowmem.c | 74 ++++++++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 27 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index f53a0c39e86e..7ecf95ed0170 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4472,6 +4472,50 @@ next:
 	return 0;
 }
 
+/*
+ * Find the block group item with @bytenr, @len and @type
+ *
+ * Return 0 if found.
+ * Return -ENOENT if not found.
+ * Return <0 for fatal error.
+ */
+static int find_block_group_item(struct btrfs_fs_info *fs_info,
+				 struct btrfs_path *path, u64 bytenr, u64 len,
+				 u64 type)
+{
+	struct btrfs_block_group_item bgi;
+	struct btrfs_key key;
+	int ret;
+
+	key.objectid = bytenr;
+	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
+	key.offset = len;
+
+	ret = btrfs_search_slot(NULL, fs_info->extent_root, &key, path, 0, 0);
+	if (ret < 0)
+		return ret;
+	if (ret > 0) {
+		ret = -ENOENT;
+		error("chunk [%llu %llu) doesn't have related block group item",
+		      bytenr, bytenr + len);
+		goto out;
+	}
+	read_extent_buffer(path->nodes[0], &bgi,
+			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
+			sizeof(bgi));
+	if (btrfs_block_group_flags(&bgi) != type) {
+		error(
+"chunk [%llu %llu) type mismatch with block group, block group has 0x%llx chunk has %llx",
+			bytenr, bytenr + len, btrfs_block_group_flags(&bgi),
+			type);
+		ret = -EUCLEAN;
+	}
+
+out:
+	btrfs_release_path(path);
+	return ret;
+}
+
 /*
  * Check a chunk item.
  * Including checking all referred dev_extents and block group
@@ -4479,16 +4523,12 @@ next:
 static int check_chunk_item(struct btrfs_fs_info *fs_info,
 			    struct extent_buffer *eb, int slot)
 {
-	struct btrfs_root *extent_root = fs_info->extent_root;
 	struct btrfs_root *dev_root = fs_info->dev_root;
 	struct btrfs_path path;
 	struct btrfs_key chunk_key;
-	struct btrfs_key bg_key;
 	struct btrfs_key devext_key;
 	struct btrfs_chunk *chunk;
 	struct extent_buffer *leaf;
-	struct btrfs_block_group_item *bi;
-	struct btrfs_block_group_item bg_item;
 	struct btrfs_dev_extent *ptr;
 	u64 length;
 	u64 chunk_end;
@@ -4515,31 +4555,11 @@ static int check_chunk_item(struct btrfs_fs_info *fs_info,
 	}
 	type = btrfs_chunk_type(eb, chunk);
 
-	bg_key.objectid = chunk_key.offset;
-	bg_key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
-	bg_key.offset = length;
-
 	btrfs_init_path(&path);
-	ret = btrfs_search_slot(NULL, extent_root, &bg_key, &path, 0, 0);
-	if (ret) {
-		error(
-		"chunk[%llu %llu) did not find the related block group item",
-			chunk_key.offset, chunk_end);
+	ret = find_block_group_item(fs_info, &path, chunk_key.offset, length,
+				    type);
+	if (ret < 0)
 		err |= REFERENCER_MISSING;
-	} else{
-		leaf = path.nodes[0];
-		bi = btrfs_item_ptr(leaf, path.slots[0],
-				    struct btrfs_block_group_item);
-		read_extent_buffer(leaf, &bg_item, (unsigned long)bi,
-				   sizeof(bg_item));
-		if (btrfs_block_group_flags(&bg_item) != type) {
-			error(
-"chunk[%llu %llu) related block group item flags mismatch, wanted: %llu, have: %llu",
-				chunk_key.offset, chunk_end, type,
-				btrfs_block_group_flags(&bg_item));
-			err |= REFERENCER_MISSING;
-		}
-	}
 
 	num_stripes = btrfs_chunk_num_stripes(eb, chunk);
 	stripe_len = btrfs_stripe_length(fs_info, eb, chunk);
-- 
2.23.0



* [PATCH RFC 2/7] btrfs-progs: Enable read-write ability for 'skinny_bg_tree' feature
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 1/7] btrfs-progs: check/lowmem: Lookup block group item in a separate function Qu Wenruo
@ 2019-11-04 12:03 ` Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 3/7] btrfs-progs: mkfs: Introduce -O skinny-bg-tree Qu Wenruo
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

Allow btrfs-progs to open, read and write 'skinny_bg_tree' enabled fs.

The modification itself is not large, as block group items are only
used in 4 places:

1) open_ctree()
   We only need to populate fs_info->bg_root and read block group items
   from fs_info->bg_root.
   The obvious change is that we no longer need a btrfs_search_slot()
   call for each block group item; btrfs_next_item() is enough.

   This should hugely reduce open_ctree() execution duration.

2) btrfs_commit_transaction()
   We need to write dirty block group items back to bg_root.

   The modification here is, when we're converting to the skinny_bg_tree
   feature, to insert a new block group item if none exists in bg_root,
   and to delete the old one from the extent tree.

3) btrfs_make_block_group()
   For the skinny_bg_tree feature, we insert the key only, with
   key.offset == used.

   This modification needs extra handling for the conversion case, where
   block group items can be either in the extent tree or the bg tree.

4) free_block_group_item()
   Just delete the block group item in extent tree or bg tree.
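
Since key.offset now holds the used bytes, the exact key of a skinny
item is not known in advance. The patch handles this by searching for
(bytenr, SKINNY_BLOCK_GROUP_ITEM, (u64)-1) and then stepping back with
btrfs_previous_item(). A minimal sketch of that lookup pattern follows;
it works on a plain sorted array instead of a real btree, and toy_key,
toy_key_cmp() and toy_find_skinny_bg() are hypothetical names, not
btrfs-progs APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define SKINNY_BLOCK_GROUP_ITEM_KEY 193

/* Hypothetical stand-in for struct btrfs_key. */
struct toy_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Order keys by (objectid, type, offset), like btrfs does. */
static int toy_key_cmp(const struct toy_key *a, const struct toy_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Find the skinny block group item for @bytenr in a sorted key array.
 *
 * Return the array index if found.
 * Return -1 if not found, or if a key with offset == (u64)-1 exists
 * (the patch treats such a key as invalid).
 */
static long toy_find_skinny_bg(const struct toy_key *keys, size_t nr,
			       uint64_t bytenr)
{
	struct toy_key target = { bytenr, SKINNY_BLOCK_GROUP_ITEM_KEY,
				  UINT64_MAX };
	size_t lo = 0;
	size_t hi = nr;

	/* Lower bound: first slot whose key is >= target. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (toy_key_cmp(&keys[mid], &target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	/* An exact match on offset == (u64)-1 is not a valid item. */
	if (lo < nr && toy_key_cmp(&keys[lo], &target) == 0)
		return -1;
	/* Step back one slot, similar in spirit to btrfs_previous_item(). */
	if (lo == 0)
		return -1;
	lo--;
	if (keys[lo].objectid != bytenr ||
	    keys[lo].type != SKINNY_BLOCK_GROUP_ITEM_KEY)
		return -1;
	return (long)lo;
}
```

The same "search past the end, then step back" pattern is used in the
patch for both updating and deleting a skinny item.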

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 ctree.h       |  17 +++-
 disk-io.c     |  21 ++++-
 extent-tree.c | 230 ++++++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 251 insertions(+), 17 deletions(-)

diff --git a/ctree.h b/ctree.h
index ec57f113839f..a93e1d5d202d 100644
--- a/ctree.h
+++ b/ctree.h
@@ -89,6 +89,9 @@ struct btrfs_free_space_ctl;
 /* tracks free space in block groups. */
 #define BTRFS_FREE_SPACE_TREE_OBJECTID 10ULL
 
+/* store BLOCK_GROUP_ITEMS in a separate tree */
+#define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
+
 /* device stats in the device tree */
 #define BTRFS_DEV_STATS_OBJECTID 0ULL
 
@@ -492,6 +495,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA	(1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES		(1ULL << 9)
 #define BTRFS_FEATURE_INCOMPAT_METADATA_UUID    (1ULL << 10)
+#define BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE	(1ULL << 11)
 
 #define BTRFS_FEATURE_COMPAT_SUPP		0ULL
 
@@ -515,7 +519,8 @@ struct btrfs_super_block {
 	 BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS |		\
 	 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |	\
 	 BTRFS_FEATURE_INCOMPAT_NO_HOLES |		\
-	 BTRFS_FEATURE_INCOMPAT_METADATA_UUID)
+	 BTRFS_FEATURE_INCOMPAT_METADATA_UUID |		\
+	 BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE)
 
 /*
  * A leaf is full of items. offset and size tell us where to find
@@ -1125,6 +1130,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *quota_root;
 	struct btrfs_root *free_space_root;
 	struct btrfs_root *uuid_root;
+	struct btrfs_root *bg_root;
 
 	struct rb_root fs_root_tree;
 
@@ -1176,6 +1182,8 @@ struct btrfs_fs_info {
 	unsigned int avoid_meta_chunk_alloc:1;
 	unsigned int avoid_sys_chunk_alloc:1;
 	unsigned int finalize_on_close:1;
+	/* Converting from bg in extent tree to skinny bg tree */
+	unsigned int convert_to_bg_tree:1;
 
 	int transaction_aborted;
 
@@ -1332,6 +1340,13 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
  */
 #define BTRFS_BLOCK_GROUP_ITEM_KEY 192
 
+/*
+ * More optimized block group item, use key.objectid for block group bytenr,
+ * key.offset for used bytes.
+ * No item data needed.
+ */
+#define BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY 193
+
 /*
  * Every block group is represented in the free space tree by a free space info
  * item, which stores some accounting information. It is keyed on
diff --git a/disk-io.c b/disk-io.c
index a5b47b0ef16c..1cb62511f4ad 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -731,6 +731,8 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
 	if (location->objectid == BTRFS_FREE_SPACE_TREE_OBJECTID)
 		return fs_info->free_space_root ? fs_info->free_space_root :
 						ERR_PTR(-ENOENT);
+	if (location->objectid == BTRFS_BLOCK_GROUP_TREE_OBJECTID)
+		return fs_info->bg_root ? fs_info->bg_root : ERR_PTR(-ENOENT);
 
 	BUG_ON(location->objectid == BTRFS_TREE_RELOC_OBJECTID ||
 	       location->offset != (u64)-1);
@@ -783,6 +785,7 @@ struct btrfs_fs_info *btrfs_new_fs_info(int writable, u64 sb_bytenr)
 	fs_info->quota_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->free_space_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->uuid_root = calloc(1, sizeof(struct btrfs_root));
+	fs_info->bg_root = calloc(1, sizeof(struct btrfs_root));
 	fs_info->super_copy = calloc(1, BTRFS_SUPER_INFO_SIZE);
 
 	if (!fs_info->tree_root || !fs_info->extent_root ||
@@ -932,7 +935,6 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 		root_tree_bytenr = btrfs_backup_tree_root(backup);
 		generation = btrfs_backup_tree_root_gen(backup);
 	}
-
 	root->node = read_tree_block(fs_info, root_tree_bytenr, generation);
 	if (!extent_buffer_uptodate(root->node)) {
 		fprintf(stderr, "Couldn't read tree root\n");
@@ -945,6 +947,21 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 		return ret;
 	fs_info->extent_root->track_dirty = 1;
 
+	if (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE)) {
+		ret = setup_root_or_create_block(fs_info, flags,
+					fs_info->bg_root,
+					BTRFS_BLOCK_GROUP_TREE_OBJECTID, "bg");
+		if (ret < 0) {
+			error("Couldn't setup bg tree");
+			return ret;
+		}
+		fs_info->bg_root->track_dirty = 1;
+		fs_info->bg_root->ref_cows = 0;
+	} else {
+		free(fs_info->bg_root);
+		fs_info->bg_root = NULL;
+	}
+
 	ret = find_and_setup_root(root, fs_info, BTRFS_DEV_TREE_OBJECTID,
 				  fs_info->dev_root);
 	if (ret) {
@@ -1035,6 +1052,8 @@ void btrfs_release_all_roots(struct btrfs_fs_info *fs_info)
 		free_extent_buffer(fs_info->extent_root->node);
 	if (fs_info->tree_root)
 		free_extent_buffer(fs_info->tree_root->node);
+	if (fs_info->bg_root)
+		free_extent_buffer(fs_info->bg_root->node);
 	if (fs_info->log_root_tree)
 		free_extent_buffer(fs_info->log_root_tree->node);
 	if (fs_info->chunk_root)
diff --git a/extent-tree.c b/extent-tree.c
index d67e4098351f..7c68508de2ac 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -1524,6 +1524,67 @@ int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 	return __btrfs_mod_ref(trans, root, buf, record_parent, 0);
 }
 
+static int write_one_skinny_block_group(struct btrfs_trans_handle *trans,
+					struct btrfs_path *path,
+					struct btrfs_block_group_cache *cache)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	struct btrfs_root *bg_root = fs_info->bg_root;
+	struct btrfs_key key;
+	int ret;
+
+	ASSERT(bg_root && btrfs_fs_incompat(fs_info, SKINNY_BG_TREE));
+	key.objectid = cache->key.objectid;
+	key.type = BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY;
+	key.offset = (u64)-1;
+
+	ret = btrfs_search_slot(trans, bg_root, &key, path, 0, 1);
+	if (ret < 0)
+		return ret;
+	if (ret == 0) {
+		error("invalid skinny bg found, start=%llu", key.objectid);
+		ret = -EUCLEAN;
+		goto out;
+	}
+	ret = btrfs_previous_item(bg_root, path, key.objectid, key.type);
+	if (ret < 0)
+		goto out;
+	if (ret > 0 && fs_info->convert_to_bg_tree) {
+		btrfs_release_path(path);
+
+		/* We are doing convert, insert new one for it */
+		key.offset = cache->used;
+		ret = btrfs_insert_item(trans, bg_root, &key, NULL, 0);
+		if (ret < 0)
+			goto out;
+
+		/* Also delete the existing one in extent tree */
+		key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
+		key.offset = cache->key.offset;
+
+		ret = btrfs_search_slot(trans, fs_info->extent_root, &key,
+					path, -1, 1);
+		if (ret < 0)
+			goto out;
+		if (ret > 0) {
+			ret = 0;
+			goto out;
+		}
+		ret = btrfs_del_item(trans, fs_info->extent_root, path);
+		goto out;
+	}
+	if (ret > 0) {
+		ret = -ENOENT;
+		goto out;
+	}
+	key.offset = cache->used;
+	btrfs_set_item_key_safe(bg_root, path, &key);
+	btrfs_mark_buffer_dirty(path->nodes[0]);
+out:
+	btrfs_release_path(path);
+	return ret;
+}
+
 static int write_one_cache_group(struct btrfs_trans_handle *trans,
 				 struct btrfs_path *path,
 				 struct btrfs_block_group_cache *cache)
@@ -1534,6 +1595,9 @@ static int write_one_cache_group(struct btrfs_trans_handle *trans,
 	struct btrfs_block_group_item bgi;
 	struct extent_buffer *leaf;
 
+	if (btrfs_fs_incompat(trans->fs_info, SKINNY_BG_TREE))
+		return write_one_skinny_block_group(trans, path, cache);
+
 	ret = btrfs_search_slot(trans, extent_root, &cache->key, path, 0, 1);
 	if (ret < 0)
 		goto fail;
@@ -2665,32 +2729,63 @@ static int read_one_block_group(struct btrfs_fs_info *fs_info,
 	struct extent_buffer *leaf = path->nodes[0];
 	struct btrfs_space_info *space_info;
 	struct btrfs_block_group_cache *cache;
-	struct btrfs_block_group_item bgi;
 	struct btrfs_key key;
+	u64 bg_len;
+	u64 flags;
+	u64 used;
 	int slot = path->slots[0];
 	int bit = 0;
 	int ret;
 
 	btrfs_item_key_to_cpu(leaf, &key, slot);
-	ASSERT(key.type == BTRFS_BLOCK_GROUP_ITEM_KEY);
+	ASSERT((!btrfs_fs_incompat(fs_info, SKINNY_BG_TREE) &&
+		key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) ||
+	       (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE) &&
+		key.type == BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY));
 
 	/*
 	 * Skip 0 sized block group, don't insert them into block group cache
 	 * tree, as its length is 0, it won't get freed at close_ctree() time.
 	 */
-	if (key.offset == 0)
+	if (key.type == BTRFS_BLOCK_GROUP_ITEM_KEY && key.offset == 0)
 		return 0;
 
+	if (key.type == BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY) {
+		struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
+		struct map_lookup *map;
+		struct cache_extent *ce;
+
+		ce = search_cache_extent(&map_tree->cache_tree, key.objectid);
+		if (!ce || ce->start != key.objectid) {
+			error(
+		"invalid skinny block group %llu: no corresponding chunk",
+				key.objectid);
+			return -ENOENT;
+		}
+		bg_len = ce->size;
+		map = container_of(ce, struct map_lookup, ce);
+		flags = map->type;
+		used = key.offset;
+	} else {
+		struct btrfs_block_group_item bgi;
+
+		bg_len = key.offset;
+		read_extent_buffer(leaf, &bgi,
+				   btrfs_item_ptr_offset(leaf, slot),
+				   sizeof(bgi));
+		flags = btrfs_block_group_flags(&bgi);
+		used = btrfs_block_group_used(&bgi);
+	}
 	cache = kzalloc(sizeof(*cache), GFP_NOFS);
 	if (!cache)
 		return -ENOMEM;
-	read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot),
-			   sizeof(bgi));
-	memcpy(&cache->key, &key, sizeof(key));
 	cache->cached = 0;
 	cache->pinned = 0;
-	cache->flags = btrfs_block_group_flags(&bgi);
-	cache->used = btrfs_block_group_used(&bgi);
+	cache->flags = flags;
+	cache->used = used;
+	cache->key.objectid = key.objectid;
+	cache->key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
+	cache->key.offset = bg_len;
 	if (cache->flags & BTRFS_BLOCK_GROUP_DATA) {
 		bit = BLOCK_GROUP_DATA;
 	} else if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) {
@@ -2719,6 +2814,55 @@ static int read_one_block_group(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+static int read_skinny_bg_tree(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_root *bg_root = fs_info->bg_root;
+	struct btrfs_path path;
+	struct btrfs_key key;
+	int ret;
+
+	btrfs_init_path(&path);
+	key.objectid = 0;
+	key.offset = 0;
+	key.type = 0;
+
+	ret = btrfs_search_slot(NULL, bg_root, &key, &path, 0, 0);
+	if (ret < 0)
+		goto out;
+	if (ret == 0) {
+		error("invalid key found in skinny bg tree: (0, 0, 0)");
+		ret = -EUCLEAN;
+		goto out;
+	}
+
+	while (1) {
+		btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
+		if (key.type != BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY) {
+			error(
+		"invalid key found in skinny bg tree: (%llu, %u, %llu)",
+			      key.objectid, key.type, key.offset);
+			ret = -EUCLEAN;
+			goto out;
+		}
+		ret = read_one_block_group(fs_info, &path);
+		if (ret < 0) {
+			errno = -ret;
+			error("failed to read one block group: %m");
+			goto out;
+		}
+		ret = btrfs_next_item(bg_root, &path);
+		if (ret < 0)
+			goto out;
+		if (ret > 0) {
+			ret = 0;
+			goto out;
+		}
+	}
+out:
+	btrfs_release_path(&path);
+	return ret;
+}
+
 int btrfs_read_block_groups(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_path path;
@@ -2726,6 +2870,9 @@ int btrfs_read_block_groups(struct btrfs_fs_info *fs_info)
 	int ret;
 	struct btrfs_key key;
 
+	if (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE))
+		return read_skinny_bg_tree(fs_info);
+
 	root = fs_info->extent_root;
 	key.objectid = 0;
 	key.offset = 0;
@@ -2806,16 +2953,25 @@ int btrfs_make_block_group(struct btrfs_trans_handle *trans,
 	int ret;
 	struct btrfs_root *extent_root = fs_info->extent_root;
 	struct btrfs_block_group_cache *cache;
-	struct btrfs_block_group_item bgi;
 
 	cache = btrfs_add_block_group(fs_info, bytes_used, type, chunk_offset,
 				      size);
-	btrfs_set_block_group_used(&bgi, cache->used);
-	btrfs_set_block_group_flags(&bgi, cache->flags);
-	btrfs_set_block_group_chunk_objectid(&bgi,
-			BTRFS_FIRST_CHUNK_TREE_OBJECTID);
-	ret = btrfs_insert_item(trans, extent_root, &cache->key, &bgi,
+	if (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE)) {
+		struct btrfs_key key;
+
+		key.objectid = cache->key.objectid;
+		key.type = BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY;
+		key.offset = bytes_used;
+		ret = btrfs_insert_item(trans, fs_info->bg_root, &key, NULL, 0);
+	} else {
+		struct btrfs_block_group_item bgi;
+		btrfs_set_block_group_used(&bgi, cache->used);
+		btrfs_set_block_group_flags(&bgi, cache->flags);
+		btrfs_set_block_group_chunk_objectid(&bgi,
+				BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+		ret = btrfs_insert_item(trans, extent_root, &cache->key, &bgi,
 				sizeof(bgi));
+	}
 	BUG_ON(ret);
 
 	return 0;
@@ -2925,6 +3081,41 @@ int btrfs_update_block_group(struct btrfs_root *root,
 				  alloc, mark_free);
 }
 
+static int free_skinny_block_group_item(struct btrfs_trans_handle *trans,
+					struct btrfs_fs_info *fs_info,
+					u64 bytenr)
+{
+	struct btrfs_path path;
+	struct btrfs_key key;
+	struct btrfs_root *bg_root = fs_info->bg_root;
+	int ret;
+
+	btrfs_init_path(&path);
+	key.objectid = bytenr;
+	key.type = BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY;
+	key.offset = (u64)-1;
+
+	ret = btrfs_search_slot(trans, bg_root, &key, &path, -1, 1);
+	if (ret < 0)
+		return ret;
+	if (ret == 0) {
+		error("invalid skinny block group item found");
+		ret = -EUCLEAN;
+		goto out;
+	}
+	ret = btrfs_previous_item(bg_root, &path, key.objectid, key.type);
+	if (ret < 0)
+		goto out;
+	if (ret > 0) {
+		ret = -ENOENT;
+		goto out;
+	}
+	ret = btrfs_del_item(trans, bg_root, &path);
+out:
+	btrfs_release_path(&path);
+	return ret;
+}
+
 /*
  * Just remove a block group item in extent tree
  * Caller should ensure the block group is empty and all space is pinned.
@@ -2939,6 +3130,12 @@ static int free_block_group_item(struct btrfs_trans_handle *trans,
 	struct btrfs_root *root = fs_info->extent_root;
 	int ret = 0;
 
+	if (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE)) {
+		ret = free_skinny_block_group_item(trans, fs_info, bytenr);
+		if (!fs_info->convert_to_bg_tree)
+			return ret;
+	}
+
 	key.objectid = bytenr;
 	key.offset = len;
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
@@ -2949,7 +3146,10 @@ static int free_block_group_item(struct btrfs_trans_handle *trans,
 
 	ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
 	if (ret > 0) {
-		ret = -ENOENT;
+		if (fs_info->convert_to_bg_tree)
+			ret = 0;
+		else
+			ret = -ENOENT;
 		goto out;
 	}
 	if (ret < 0)
-- 
2.23.0



* [PATCH RFC 3/7] btrfs-progs: mkfs: Introduce -O skinny-bg-tree
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 1/7] btrfs-progs: check/lowmem: Lookup block group item in a separate function Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 2/7] btrfs-progs: Enable read-write ability for 'skinny_bg_tree' feature Qu Wenruo
@ 2019-11-04 12:03 ` Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 4/7] btrfs-progs: dump-tree/dump-super: Introduce support for skinny bg tree Qu Wenruo
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

This allows mkfs.btrfs to create a btrfs with the skinny-bg-tree feature.

This patch introduces a global function, btrfs_convert_to_skinny_bg_tree(),
in extent-tree.c to do the work.

The workflow is pretty simple:
- Create a new tree block for bg tree
- Set the SKINNY_BG_TREE feature for superblock
- Set the fs_info->convert_to_bg_tree flag
- Mark all block group items as dirty
- Commit transaction
  * With fs_info->convert_to_bg_tree set, we will write the new
    SKINNY_BLOCK_GROUP_ITEM into bg tree first, then delete the old
    BLOCK_GROUP_ITEM in extent tree.

btrfs_convert_to_skinny_bg_tree() will be used in mkfs after the basic fs
is created.
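
The workflow above can be sketched with a toy in-memory model. This is
only an illustration under simplifying assumptions: struct toy_fs,
struct toy_bg and both functions are hypothetical, the trees are reduced
to per-block-group flags, and only the incompat bit value (1ULL << 11)
comes from this series.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_BG			16
/* Same bit as BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE in this series. */
#define TOY_INCOMPAT_SKINNY_BG_TREE	(1ULL << 11)

/* Hypothetical block group: the two trees are modelled as flags. */
struct toy_bg {
	uint64_t bytenr;
	uint64_t used;
	int dirty;
	int in_extent_tree;
	int in_bg_tree;
};

/* Hypothetical fs_info with only the fields the workflow touches. */
struct toy_fs {
	uint64_t incompat_flags;
	int convert_to_bg_tree;
	struct toy_bg bgs[TOY_MAX_BG];
	size_t nr_bgs;
};

/* Set the feature and convert flags, then mark every block group dirty. */
static void toy_convert_to_skinny_bg_tree(struct toy_fs *fs)
{
	size_t i;

	fs->incompat_flags |= TOY_INCOMPAT_SKINNY_BG_TREE;
	fs->convert_to_bg_tree = 1;
	for (i = 0; i < fs->nr_bgs; i++)
		fs->bgs[i].dirty = 1;
}

/* Write back dirty block groups; move them to the bg tree when converting. */
static void toy_commit_transaction(struct toy_fs *fs)
{
	size_t i;

	for (i = 0; i < fs->nr_bgs; i++) {
		struct toy_bg *bg = &fs->bgs[i];

		if (!bg->dirty)
			continue;
		if (fs->convert_to_bg_tree && !bg->in_bg_tree) {
			/* Insert into bg tree, then drop the old item. */
			bg->in_bg_tree = 1;
			bg->in_extent_tree = 0;
		}
		bg->dirty = 0;
	}
	/* The convert flag only lives for one transaction. */
	fs->convert_to_bg_tree = 0;
}
```

The point of the sketch is that the conversion itself does no tree work;
it only sets flags and dirties block groups, and the normal commit path
does the actual move.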

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 common/fsfeatures.c |  6 ++++++
 ctree.h             |  1 +
 extent-tree.c       | 39 +++++++++++++++++++++++++++++++++++++++
 mkfs/common.c       |  5 ++++-
 mkfs/main.c         | 25 +++++++++++++++++++++++++
 transaction.c       |  1 +
 6 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/common/fsfeatures.c b/common/fsfeatures.c
index 50934bd161b0..087bab1310b7 100644
--- a/common/fsfeatures.c
+++ b/common/fsfeatures.c
@@ -86,6 +86,12 @@ static const struct btrfs_fs_feature {
 		VERSION_TO_STRING2(4,0),
 		NULL, 0,
 		"no explicit hole extents for files" },
+	{ "skinny-bg-tree", BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE,
+		"skinny_bg_tree",
+		VERSION_TO_STRING2(5, 6),
+		NULL, 0,
+		NULL, 0,
+		"store optimized block group items in dedicated tree" },
 	/* Keep this one last */
 	{ "list-all", BTRFS_FEATURE_LIST_ALL, NULL }
 };
diff --git a/ctree.h b/ctree.h
index a93e1d5d202d..631a1f9ce14a 100644
--- a/ctree.h
+++ b/ctree.h
@@ -2862,5 +2862,6 @@ int btrfs_read_file(struct btrfs_root *root, u64 ino, u64 start, int len,
 
 /* extent-tree.c */
 int btrfs_run_delayed_refs(struct btrfs_trans_handle *trans, unsigned long nr);
+int btrfs_convert_to_skinny_bg_tree(struct btrfs_trans_handle *trans);
 
 #endif
diff --git a/extent-tree.c b/extent-tree.c
index 7c68508de2ac..cf89a1be6ab5 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -1524,6 +1524,45 @@ int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 	return __btrfs_mod_ref(trans, root, buf, record_parent, 0);
 }
 
+int btrfs_convert_to_skinny_bg_tree(struct btrfs_trans_handle *trans)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	struct btrfs_block_group_cache *bg;
+	struct btrfs_root *bg_root;
+	u64 features = btrfs_super_incompat_flags(fs_info->super_copy);
+	int ret;
+
+	/* create bg tree first */
+	bg_root = btrfs_create_tree(trans, fs_info, BTRFS_BLOCK_GROUP_TREE_OBJECTID);
+	if (IS_ERR(bg_root)) {
+		ret = PTR_ERR(bg_root);
+		errno = -ret;
+		error("failed to create bg tree: %m");
+		return ret;
+	}
+	fs_info->bg_root = bg_root;
+	fs_info->bg_root->track_dirty = 1;
+	fs_info->bg_root->ref_cows = 0;
+	add_root_to_dirty_list(bg_root);
+
+	/* set BG_TREE feature and mark the fs into bg_tree convert status */
+	btrfs_set_super_incompat_flags(fs_info->super_copy,
+			features | BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE);
+	fs_info->convert_to_bg_tree = 1;
+
+	/*
+	 * Mark all block groups dirty so they will get converted to bg tree at
+	 * commit transaction time
+	 */
+	for (bg = btrfs_lookup_first_block_group(fs_info, 0); bg;
+	     bg = btrfs_lookup_first_block_group(fs_info,
+				bg->key.objectid + bg->key.offset))
+		set_extent_bits(&fs_info->block_group_cache, bg->key.objectid,
+				bg->key.objectid + bg->key.offset - 1,
+				BLOCK_GROUP_DIRTY);
+	return 0;
+}
+
 static int write_one_skinny_block_group(struct btrfs_trans_handle *trans,
 					struct btrfs_path *path,
 					struct btrfs_block_group_cache *cache)
diff --git a/mkfs/common.c b/mkfs/common.c
index 469b88d6a8d3..161c2aa5ca47 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -112,6 +112,9 @@ static int btrfs_create_tree_root(int fd, struct btrfs_mkfs_config *cfg,
 	return ret;
 }
 
+/* These features will not be set in the temporary fs */
+#define MASKED_FEATURES		(~(BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE))
+
 /*
  * @fs_uuid - if NULL, generates a UUID, returns back the new filesystem UUID
  *
@@ -205,7 +208,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	btrfs_set_super_csum_type(&super, cfg->csum_type);
 	btrfs_set_super_chunk_root_generation(&super, 1);
 	btrfs_set_super_cache_generation(&super, -1);
-	btrfs_set_super_incompat_flags(&super, cfg->features);
+	btrfs_set_super_incompat_flags(&super, cfg->features & MASKED_FEATURES);
 	if (cfg->label)
 		__strncpy_null(super.label, cfg->label, BTRFS_LABEL_SIZE - 1);
 
diff --git a/mkfs/main.c b/mkfs/main.c
index 1a4578412b41..9edeb82ee70c 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1344,6 +1344,31 @@ raid_groups:
 		goto out;
 	}
 
+	/*
+	 * The bg tree is converted after temp chunks are cleaned up,
+	 * otherwise temp chunks could end up in the bg tree.
+	 */
+	if (mkfs_cfg.features & BTRFS_FEATURE_INCOMPAT_SKINNY_BG_TREE) {
+		trans = btrfs_start_transaction(fs_info->tree_root, 1);
+		if (IS_ERR(trans)) {
+			error("failed to start transaction: %d", ret);
+			goto out;
+		}
+		ret = btrfs_convert_to_skinny_bg_tree(trans);
+		if (ret < 0) {
+			errno = -ret;
+			error(
+		"bg-tree feature will not be enabled, due to error: %m");
+			btrfs_abort_transaction(trans, ret);
+			goto out;
+		}
+		ret = btrfs_commit_transaction(trans, fs_info->tree_root);
+		if (ret < 0) {
+			error("failed to commit transaction: %d", ret);
+			goto out;
+		}
+	}
+
 	if (source_dir_set) {
 		ret = btrfs_mkfs_fill_dir(source_dir, root, verbose);
 		if (ret) {
diff --git a/transaction.c b/transaction.c
index 45bb9e1f9de6..5de967fb015f 100644
--- a/transaction.c
+++ b/transaction.c
@@ -225,6 +225,7 @@ commit_tree:
 	root->commit_root = NULL;
 	fs_info->running_transaction = NULL;
 	fs_info->last_trans_committed = transid;
+	fs_info->convert_to_bg_tree = 0;
 	list_for_each_entry(sinfo, &fs_info->space_info, list) {
 		if (sinfo->bytes_reserved) {
 			warning(
-- 
2.23.0



* [PATCH RFC 4/7] btrfs-progs: dump-tree/dump-super: Introduce support for skinny bg tree
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
                   ` (2 preceding siblings ...)
  2019-11-04 12:03 ` [PATCH RFC 3/7] btrfs-progs: mkfs: Introduce -O skinny-bg-tree Qu Wenruo
@ 2019-11-04 12:03 ` Qu Wenruo
  2019-11-04 12:03 ` [PATCH RFC 5/7] btrfs-progs: Refactor btrfs_new_block_group_record() to accept parameters directly Qu Wenruo
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

Just a new tree called BLOCK_GROUP_TREE.

The new key type (SKINNY_BLOCK_GROUP_ITEM) doesn't have any item data, thus
no need to add any extra output.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 cmds/inspect-dump-super.c | 3 ++-
 cmds/inspect-dump-tree.c  | 5 +++++
 print-tree.c              | 4 ++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/cmds/inspect-dump-super.c b/cmds/inspect-dump-super.c
index fc06488dde32..02afa96054cc 100644
--- a/cmds/inspect-dump-super.c
+++ b/cmds/inspect-dump-super.c
@@ -227,7 +227,8 @@ static struct readable_flag_entry incompat_flags_array[] = {
 	DEF_INCOMPAT_FLAG_ENTRY(RAID56),
 	DEF_INCOMPAT_FLAG_ENTRY(SKINNY_METADATA),
 	DEF_INCOMPAT_FLAG_ENTRY(NO_HOLES),
-	DEF_INCOMPAT_FLAG_ENTRY(METADATA_UUID)
+	DEF_INCOMPAT_FLAG_ENTRY(METADATA_UUID),
+	DEF_INCOMPAT_FLAG_ENTRY(SKINNY_BG_TREE)
 };
 static const int incompat_flags_num = sizeof(incompat_flags_array) /
 				      sizeof(struct readable_flag_entry);
diff --git a/cmds/inspect-dump-tree.c b/cmds/inspect-dump-tree.c
index e5efe2470111..002fd92fdc84 100644
--- a/cmds/inspect-dump-tree.c
+++ b/cmds/inspect-dump-tree.c
@@ -152,6 +152,7 @@ static u64 treeid_from_string(const char *str, const char **end)
 		{ "QUOTA", BTRFS_QUOTA_TREE_OBJECTID },
 		{ "UUID", BTRFS_UUID_TREE_OBJECTID },
 		{ "FREE_SPACE", BTRFS_FREE_SPACE_TREE_OBJECTID },
+		{ "BG", BTRFS_BLOCK_GROUP_TREE_OBJECTID},
 		{ "TREE_LOG_FIXUP", BTRFS_TREE_LOG_FIXUP_OBJECTID },
 		{ "TREE_LOG", BTRFS_TREE_LOG_OBJECTID },
 		{ "TREE_RELOC", BTRFS_TREE_RELOC_OBJECTID },
@@ -663,6 +664,10 @@ again:
 				if (!skip)
 					printf("free space");
 				break;
+			case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
+				if (!skip)
+					printf("block group");
+				break;
 			case BTRFS_MULTIPLE_OBJECTIDS:
 				if (!skip) {
 					printf("multiple");
diff --git a/print-tree.c b/print-tree.c
index f70ce6844a7e..87c1bf2f40f6 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -656,6 +656,7 @@ void print_key_type(FILE *stream, u64 objectid, u8 type)
 		[BTRFS_EXTENT_CSUM_KEY]		= "EXTENT_CSUM",
 		[BTRFS_EXTENT_DATA_KEY]		= "EXTENT_DATA",
 		[BTRFS_BLOCK_GROUP_ITEM_KEY]	= "BLOCK_GROUP_ITEM",
+		[BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY] = "SKINNY_BLOCK_GROUP_ITEM",
 		[BTRFS_FREE_SPACE_INFO_KEY]	= "FREE_SPACE_INFO",
 		[BTRFS_FREE_SPACE_EXTENT_KEY]	= "FREE_SPACE_EXTENT",
 		[BTRFS_FREE_SPACE_BITMAP_KEY]	= "FREE_SPACE_BITMAP",
@@ -775,6 +776,9 @@ void print_objectid(FILE *stream, u64 objectid, u8 type)
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 		fprintf(stream, "FREE_SPACE_TREE");
 		break;
+	case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
+		fprintf(stream, "BLOCK_GROUP_TREE");
+		break;
 	case BTRFS_MULTIPLE_OBJECTIDS:
 		fprintf(stream, "MULTIPLE");
 		break;
-- 
2.23.0


* [PATCH RFC 5/7] btrfs-progs: Refactor btrfs_new_block_group_record() to accept parameters directly
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
                   ` (3 preceding siblings ...)
  2019-11-04 12:03 ` [PATCH RFC 4/7] btrfs-progs: dump-tree/dump-super: Introduce support for skinny bg tree Qu Wenruo
@ 2019-11-04 12:03 ` Qu Wenruo
  2019-11-04 12:04 ` [PATCH RFC 6/7] btrfs-progs: check: Introduce support for bg-tree feature Qu Wenruo
  2019-11-04 12:04 ` [PATCH RFC 7/7] btrfs-progs: btrfstune: Allow to enable bg-tree feature offline Qu Wenruo
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:03 UTC (permalink / raw)
  To: linux-btrfs

Currently btrfs_new_block_group_record() extracts the block group start,
length and flags from the key and the block group item by itself.

This is not generic enough to handle the skinny-bg-tree feature, where
key->offset no longer holds the block group length.
So let btrfs_new_block_group_record() accept @bytenr, @len and @flags
directly, so that the later skinny-bg-tree support can reuse it in
original mode.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
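For illustration only (not part of the commit): the call shape after this
refactor can be sketched as below. This is a stand-alone example, not
btrfs-progs code. The struct names, new_bg_record() and both callers are
hypothetical stand-ins. The skinny caller assumes that length and flags
come from the chunk mapping, as process_skinny_bgi() does later in this
series.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct block_group_record. */
struct bg_record {
	uint64_t start;
	uint64_t size;
	uint64_t flags;
};

/* Stand-in for a regular BLOCK_GROUP_ITEM: the key plus the item body. */
struct regular_bgi {
	uint64_t key_objectid;	/* block group start */
	uint64_t key_offset;	/* block group length */
	uint64_t item_flags;	/* flags stored in the item */
};

/* Stand-in for one chunk mapping entry. */
struct chunk_map {
	uint64_t start;
	uint64_t size;
	uint64_t type;
};

/* The constructor only sees plain numbers, whatever their source was. */
static struct bg_record *new_bg_record(uint64_t bytenr, uint64_t len,
				       uint64_t flags)
{
	struct bg_record *rec = calloc(1, sizeof(*rec));

	if (!rec)
		return NULL;
	rec->start = bytenr;
	rec->size = len;
	rec->flags = flags;
	return rec;
}

/* Regular item: length from key offset, flags from the item body. */
static struct bg_record *record_from_regular(const struct regular_bgi *bgi)
{
	return new_bg_record(bgi->key_objectid, bgi->key_offset,
			     bgi->item_flags);
}

/* Skinny item: length and flags from the chunk map; NULL on mismatch. */
static struct bg_record *record_from_skinny(uint64_t key_objectid,
					    const struct chunk_map *map)
{
	if (!map || map->start != key_objectid)
		return NULL;
	return new_bg_record(key_objectid, map->size, map->type);
}
```

In this sketch both callers yield the same record for the same block
group, so code consuming the record should not need to know which item
type it came from.
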
 btrfsck.h                   |  4 ++--
 check/common.h              |  4 ++--
 check/main.c                | 25 +++++++++++++------------
 cmds/rescue-chunk-recover.c |  6 +++++-
 4 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/btrfsck.h b/btrfsck.h
index ac7f5d488b5a..be811a133687 100644
--- a/btrfsck.h
+++ b/btrfsck.h
@@ -189,8 +189,8 @@ struct chunk_record *btrfs_new_chunk_record(struct extent_buffer *leaf,
 					    struct btrfs_key *key,
 					    int slot);
 struct block_group_record *
-btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key,
-			     int slot);
+btrfs_new_block_group_record(struct extent_buffer *leaf, u64 bytenr, u64 len,
+			     u64 flags);
 struct device_extent_record *
 btrfs_new_device_extent_record(struct extent_buffer *leaf,
 			       struct btrfs_key *key, int slot);
diff --git a/check/common.h b/check/common.h
index 62cdc1d934c7..ff206f27c304 100644
--- a/check/common.h
+++ b/check/common.h
@@ -166,8 +166,8 @@ struct chunk_record *btrfs_new_chunk_record(struct extent_buffer *leaf,
 					    struct btrfs_key *key,
 					    int slot);
 struct block_group_record *
-btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key,
-			     int slot);
+btrfs_new_block_group_record(struct extent_buffer *leaf, u64 bytenr, u64 len,
+			     u64 flags);
 struct device_extent_record *
 btrfs_new_device_extent_record(struct extent_buffer *leaf,
 			       struct btrfs_key *key, int slot);
diff --git a/check/main.c b/check/main.c
index a0e5ac47c152..a1261ce0ebe7 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5193,10 +5193,9 @@ static int process_device_item(struct rb_root *dev_cache,
 }
 
 struct block_group_record *
-btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key,
-			     int slot)
+btrfs_new_block_group_record(struct extent_buffer *leaf, u64 bytenr, u64 len,
+			     u64 flags)
 {
-	struct btrfs_block_group_item *ptr;
 	struct block_group_record *rec;
 
 	rec = calloc(1, sizeof(*rec));
@@ -5205,17 +5204,15 @@ btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key,
 		exit(-1);
 	}
 
-	rec->cache.start = key->objectid;
-	rec->cache.size = key->offset;
+	rec->cache.start = bytenr;
+	rec->cache.size = len;
 
 	rec->generation = btrfs_header_generation(leaf);
 
-	rec->objectid = key->objectid;
-	rec->type = key->type;
-	rec->offset = key->offset;
-
-	ptr = btrfs_item_ptr(leaf, slot, struct btrfs_block_group_item);
-	rec->flags = btrfs_disk_block_group_flags(leaf, ptr);
+	rec->objectid = bytenr;
+	rec->type = BTRFS_BLOCK_GROUP_ITEM_KEY;
+	rec->offset = len;
+	rec->flags = flags;
 
 	INIT_LIST_HEAD(&rec->list);
 
@@ -5226,10 +5223,14 @@ static int process_block_group_item(struct block_group_tree *block_group_cache,
 				    struct btrfs_key *key,
 				    struct extent_buffer *eb, int slot)
 {
+	struct btrfs_block_group_item bgi;
 	struct block_group_record *rec;
 	int ret = 0;
 
-	rec = btrfs_new_block_group_record(eb, key, slot);
+	read_extent_buffer(eb, &bgi, btrfs_item_ptr_offset(eb, slot),
+			   sizeof(bgi));
+	rec = btrfs_new_block_group_record(eb, key->objectid, key->offset,
+					   btrfs_block_group_flags(&bgi));
 	ret = insert_block_group_record(block_group_cache, rec);
 	if (ret) {
 		fprintf(stderr, "Block Group[%llu, %llu] existed.\n",
diff --git a/cmds/rescue-chunk-recover.c b/cmds/rescue-chunk-recover.c
index 22d7a5959531..cd575668f89e 100644
--- a/cmds/rescue-chunk-recover.c
+++ b/cmds/rescue-chunk-recover.c
@@ -226,12 +226,16 @@ static int process_block_group_item(struct block_group_tree *bg_cache,
 				    struct extent_buffer *leaf,
 				    struct btrfs_key *key, int slot)
 {
+	struct btrfs_block_group_item bgi;
 	struct block_group_record *rec;
 	struct block_group_record *exist;
 	struct cache_extent *cache;
 	int ret = 0;
 
-	rec = btrfs_new_block_group_record(leaf, key, slot);
+	read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot),
+			   sizeof(bgi));
+	rec = btrfs_new_block_group_record(leaf, key->objectid, key->offset,
+					   btrfs_block_group_flags(&bgi));
 	if (!rec->cache.size)
 		goto free_out;
 again:
-- 
2.23.0


* [PATCH RFC 6/7] btrfs-progs: check: Introduce support for bg-tree feature
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
                   ` (4 preceding siblings ...)
  2019-11-04 12:03 ` [PATCH RFC 5/7] btrfs-progs: Refactor btrfs_new_block_group_record() to accept parameters directly Qu Wenruo
@ 2019-11-04 12:04 ` Qu Wenruo
  2019-11-04 12:04 ` [PATCH RFC 7/7] btrfs-progs: btrfstune: Allow to enable bg-tree feature offline Qu Wenruo
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:04 UTC (permalink / raw)
  To: linux-btrfs

Only minor modifications are needed.

- original mode:
  * a skinny block group item can only occur in the bg tree
  * check skinny block group items
    Introduce a new function, process_skinny_bgi(), for this check.
- lowmem mode:
  * search for skinny block group items in the bg tree if the
    SKINNY_BG_TREE feature is set
  * check skinny block group items
    This is done by reusing check_block_group_item().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
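For illustration only (not part of the commit): the lowmem lookup cannot
build an exact key for a skinny block group item, since key.offset holds
the used bytes rather than a known length. The find_block_group_item()
hunk therefore searches with offset = (u64)-1 and steps back one item.
A minimal stand-alone sketch of that pattern on a sorted array follows.
struct key, lower_bound(), find_skinny_bgi() and the numeric value of
SKINNY_BGI_TYPE are hypothetical stand-ins, not the btrfs-progs API.

```c
#include <stddef.h>
#include <stdint.h>

#define SKINNY_BGI_TYPE 193	/* placeholder value, assumed for the sketch */

struct key {
	uint64_t objectid;	/* block group start */
	uint8_t type;
	uint64_t offset;	/* used bytes for a skinny bg item */
};

/* Order keys by (objectid, type, offset), like btrfs keys. */
static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Index of the first item >= target, i.e. the slot reported on a miss. */
static size_t lower_bound(const struct key *items, size_t n,
			  const struct key *target)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&items[mid], target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Find the skinny bg item starting at @bytenr, or NULL if there is none. */
static const struct key *find_skinny_bgi(const struct key *items, size_t n,
					 uint64_t bytenr)
{
	struct key target = { bytenr, SKINNY_BGI_TYPE, UINT64_MAX };
	size_t slot = lower_bound(items, n, &target);

	/* An exact hit on offset == -1 is treated as invalid. */
	if (slot < n && key_cmp(&items[slot], &target) == 0)
		return NULL;
	/* Nothing before the slot: no item for this block group. */
	if (slot == 0)
		return NULL;
	/* The previous item must belong to the same block group. */
	if (items[slot - 1].objectid != bytenr ||
	    items[slot - 1].type != SKINNY_BGI_TYPE)
		return NULL;
	return &items[slot - 1];
}
```

The sketch only inspects the single item before the slot, which should be
enough here because the target key sorts after every possible item of the
same block group.
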
 check/main.c        | 38 +++++++++++++++++++++++++++
 check/mode-lowmem.c | 63 +++++++++++++++++++++++++++++++++++++++------
 2 files changed, 93 insertions(+), 8 deletions(-)

diff --git a/check/main.c b/check/main.c
index a1261ce0ebe7..066d9574a556 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5241,6 +5241,32 @@ static int process_block_group_item(struct block_group_tree *block_group_cache,
 	return ret;
 }
 
+static int process_skinny_bgi(struct block_group_tree *block_group_cache,
+			      struct btrfs_key *key, struct extent_buffer *eb)
+{
+	struct btrfs_mapping_tree *map_tree = &global_info->mapping_tree;
+	struct block_group_record *rec;
+	struct cache_extent *ce;
+	struct map_lookup *map;
+	int ret;
+
+	ce = search_cache_extent(&map_tree->cache_tree, key->objectid);
+	/* For a missing or mismatched chunk, we just skip this bgi */
+	if (!ce || ce->start != key->objectid)
+		return 0;
+
+	map = container_of(ce, struct map_lookup, ce);
+	rec = btrfs_new_block_group_record(eb, key->objectid, ce->size,
+					   map->type);
+	ret = insert_block_group_record(block_group_cache, rec);
+	if (ret) {
+		error("block group [%llu, %llu) existed.",
+			ce->start, ce->start + ce->size);
+		free(rec);
+	}
+	return ret;
+}
+
 struct device_extent_record *
 btrfs_new_device_extent_record(struct extent_buffer *leaf,
 			       struct btrfs_key *key, int slot)
@@ -6106,6 +6132,10 @@ static int check_type_with_root(u64 rootid, u8 key_type)
 		if (rootid != BTRFS_EXTENT_TREE_OBJECTID)
 			goto err;
 		break;
+	case BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY:
+		if (rootid != BTRFS_BLOCK_GROUP_TREE_OBJECTID)
+			goto err;
+		break;
 	case BTRFS_ROOT_ITEM_KEY:
 		if (rootid != BTRFS_ROOT_TREE_OBJECTID)
 			goto err;
@@ -6309,6 +6339,14 @@ static int run_next_block(struct btrfs_root *root,
 					&key, buf, i);
 				continue;
 			}
+			if (key.type == BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY) {
+				ret = process_skinny_bgi(block_group_cache,
+							 &key, buf);
+				/* -ENOMEM */
+				if (ret < 0)
+					goto out;
+				continue;
+			}
 			if (key.type == BTRFS_DEV_EXTENT_KEY) {
 				process_device_extent_item(dev_extent_cache,
 					&key, buf, i);
diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 7ecf95ed0170..26ae07ccb007 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -3536,16 +3536,39 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
 	u32 nodesize = btrfs_super_nodesize(fs_info->super_copy);
 	u64 flags;
 	u64 bg_flags;
+	u64 bg_len;
 	u64 used;
 	u64 total = 0;
 	int ret;
 	int err = 0;
 
 	btrfs_item_key_to_cpu(eb, &bg_key, slot);
-	bi = btrfs_item_ptr(eb, slot, struct btrfs_block_group_item);
-	read_extent_buffer(eb, &bg_item, (unsigned long)bi, sizeof(bg_item));
-	used = btrfs_block_group_used(&bg_item);
-	bg_flags = btrfs_block_group_flags(&bg_item);
+	if (bg_key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) {
+		bi = btrfs_item_ptr(eb, slot, struct btrfs_block_group_item);
+		read_extent_buffer(eb, &bg_item, (unsigned long)bi, sizeof(bg_item));
+		used = btrfs_block_group_used(&bg_item);
+		bg_flags = btrfs_block_group_flags(&bg_item);
+		bg_len = bg_key.offset;
+	} else {
+		struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
+		struct cache_extent *ce;
+		struct map_lookup *map;
+
+		ce = search_cache_extent(&map_tree->cache_tree,
+					 bg_key.objectid);
+		if (!ce || ce->start != bg_key.objectid) {
+			error(
+		"block group[%llu %llu] did not find the related chunk item",
+				bg_key.objectid, bg_key.offset);
+			err |= REFERENCER_MISSING;
+			return err;
+		} else {
+			map = container_of(ce, struct map_lookup, ce);
+			bg_flags = map->type;
+		}
+		used = bg_key.offset;
+		bg_len = ce->size;
+	}
 
 	chunk_key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
 	chunk_key.type = BTRFS_CHUNK_ITEM_KEY;
@@ -3563,10 +3586,10 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
 		chunk = btrfs_item_ptr(path.nodes[0], path.slots[0],
 					struct btrfs_chunk);
 		if (btrfs_chunk_length(path.nodes[0], chunk) !=
-						bg_key.offset) {
+						bg_len) {
 			error(
 	"block group[%llu %llu] related chunk item length does not match",
-				bg_key.objectid, bg_key.offset);
+				bg_key.objectid, bg_len);
 			err |= REFERENCER_MISMATCH;
 		}
 	}
@@ -3591,7 +3614,7 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
 			goto next;
 
 		btrfs_item_key_to_cpu(leaf, &extent_key, path.slots[0]);
-		if (extent_key.objectid >= bg_key.objectid + bg_key.offset)
+		if (extent_key.objectid >= bg_key.objectid + bg_len)
 			break;
 
 		if (extent_key.type != BTRFS_METADATA_ITEM_KEY &&
@@ -3638,7 +3661,7 @@ out:
 	if (total != used) {
 		error(
 		"block group[%llu %llu] used %llu but extent items used %llu",
-			bg_key.objectid, bg_key.offset, used, total);
+			bg_key.objectid, bg_len, used, total);
 		err |= BG_ACCOUNTING_ERROR;
 	}
 	return err;
@@ -4487,6 +4510,29 @@ static int find_block_group_item(struct btrfs_fs_info *fs_info,
 	struct btrfs_key key;
 	int ret;
 
+	if (btrfs_fs_incompat(fs_info, SKINNY_BG_TREE)) {
+		key.objectid = bytenr;
+		key.type = BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY;
+		key.offset = (u64)-1;
+
+		ret = btrfs_search_slot(NULL, fs_info->bg_root, &key, path, 0, 0);
+		if (ret < 0)
+			return ret;
+		if (ret == 0) {
+			ret = -EUCLEAN;
+			error("invalid skinny bg item found for chunk [%llu, %llu)",
+				bytenr, bytenr + len);
+			goto out;
+		}
+		ret = btrfs_previous_item(fs_info->bg_root, path, bytenr,
+					  BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY);
+		if (ret > 0) {
+			ret = -ENOENT;
+			error("can't find skinny bg item for chunk [%llu, %llu)",
+				bytenr, bytenr + len);
+		}
+		goto out;
+	}
 	key.objectid = bytenr;
 	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
 	key.offset = len;
@@ -4694,6 +4740,7 @@ again:
 			ret = repair_extent_data_item(root, path, nrefs, ret);
 		err |= ret;
 		break;
+	case BTRFS_SKINNY_BLOCK_GROUP_ITEM_KEY:
 	case BTRFS_BLOCK_GROUP_ITEM_KEY:
 		ret = check_block_group_item(fs_info, eb, slot);
 		if (repair &&
-- 
2.23.0


* [PATCH RFC 7/7] btrfs-progs: btrfstune: Allow to enable bg-tree feature offline
  2019-11-04 12:03 [PATCH RFC 0/7] Qu Wenruo
                   ` (5 preceding siblings ...)
  2019-11-04 12:04 ` [PATCH RFC 6/7] btrfs-progs: check: Introduce support for bg-tree feature Qu Wenruo
@ 2019-11-04 12:04 ` Qu Wenruo
  6 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2019-11-04 12:04 UTC (permalink / raw)
  To: linux-btrfs

Add a new option '-b' for btrfstune, to enable the skinny-bg-tree feature
for an unmounted fs.

This option converts all BLOCK_GROUP_ITEMs in the extent tree to the bg
tree, by reusing the existing btrfs_convert_to_skinny_bg_tree() function.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 Documentation/btrfstune.asciidoc |  6 +++++
 btrfstune.c                      | 45 ++++++++++++++++++++++++++++++--
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/Documentation/btrfstune.asciidoc b/Documentation/btrfstune.asciidoc
index 1d6bc98deed8..ccb89b44a701 100644
--- a/Documentation/btrfstune.asciidoc
+++ b/Documentation/btrfstune.asciidoc
@@ -26,6 +26,12 @@ means.  Please refer to the 'FILESYSTEM FEATURES' in `btrfs`(5).
 OPTIONS
 -------
 
+-b::
+(since kernel: 5.x)
++
+enable skinny-bg-tree feature (faster mount time for large fs), enabled by mkfs
+feature 'skinny-bg-tree'.
+
 -f::
 Allow dangerous changes, e.g. clear the seeding flag or change fsid. Make sure
 that you are aware of the dangers.
diff --git a/btrfstune.c b/btrfstune.c
index afa3aae35412..ba8b628a2f71 100644
--- a/btrfstune.c
+++ b/btrfstune.c
@@ -476,11 +476,40 @@ static void print_usage(void)
 	printf("\t-m          change fsid in metadata_uuid to a random UUID\n");
 	printf("\t            (incompat change, more lightweight than -u|-U)\n");
 	printf("\t-M UUID     change fsid in metadata_uuid to UUID\n");
+	printf("\t-b          enable skinny-bg-tree feature (mkfs: skinny-bg-tree)\n");
+	printf("\t            for faster mount time\n");
 	printf("  general:\n");
 	printf("\t-f          allow dangerous operations, make sure that you are aware of the dangers\n");
 	printf("\t--help      print this help\n");
 }
 
+static int convert_to_skinny_bg_tree(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	trans = btrfs_start_transaction(fs_info->tree_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		errno = -ret;
+		error("failed to start transaction: %m");
+		return ret;
+	}
+	ret = btrfs_convert_to_skinny_bg_tree(trans);
+	if (ret < 0) {
+		errno = -ret;
+		error("failed to convert: %m");
+		btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
+	ret = btrfs_commit_transaction(trans, fs_info->tree_root);
+	if (ret < 0) {
+		errno = -ret;
+		error("failed to commit transaction: %m");
+	}
+	return ret;
+}
+
 int BOX_MAIN(btrfstune)(int argc, char *argv[])
 {
 	struct btrfs_root *root;
@@ -491,6 +520,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 	u64 seeding_value = 0;
 	int random_fsid = 0;
 	int change_metadata_uuid = 0;
+	bool to_skinny_bg_tree = false;
 	char *new_fsid_str = NULL;
 	int ret;
 	u64 super_flags = 0;
@@ -501,7 +531,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 			{ "help", no_argument, NULL, GETOPT_VAL_HELP},
 			{ NULL, 0, NULL, 0 }
 		};
-		int c = getopt_long(argc, argv, "S:rxfuU:nmM:", long_options, NULL);
+		int c = getopt_long(argc, argv, "S:rxfuU:nmM:b", long_options, NULL);
 
 		if (c < 0)
 			break;
@@ -539,6 +569,9 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 			ctree_flags |= OPEN_CTREE_IGNORE_FSID_MISMATCH;
 			change_metadata_uuid = 1;
 			break;
+		case 'b':
+			to_skinny_bg_tree = true;
+			break;
 		case GETOPT_VAL_HELP:
 		default:
 			print_usage();
@@ -556,7 +589,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		return 1;
 	}
 	if (!super_flags && !seeding_flag && !(random_fsid || new_fsid_str) &&
-	    !change_metadata_uuid) {
+	    !change_metadata_uuid && !to_skinny_bg_tree) {
 		error("at least one option should be specified");
 		print_usage();
 		return 1;
@@ -602,6 +635,14 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		return 1;
 	}
 
+	if (to_skinny_bg_tree) {
+		ret = convert_to_skinny_bg_tree(root->fs_info);
+		if (ret < 0) {
+			errno = -ret;
+			error("failed to convert to bg-tree feature: %m");
+			goto out;
+		}
+	}
 	if (seeding_flag) {
 		if (btrfs_fs_incompat(root->fs_info, METADATA_UUID)) {
 			fprintf(stderr, "SEED flag cannot be changed on a metadata-uuid changed fs\n");
-- 
2.23.0

