All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite
@ 2017-12-20  4:57 Su Yue
  2017-12-20  4:57 ` [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item() Su Yue
                   ` (17 more replies)
  0 siblings, 18 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patchset can be fetched from my github:
https://github.com/Damenly/btrfs-progs/tree/lowmem_repair
based on kdave/devel.

Thanks to Qu Wenruo's ideas and suggestions first.
Any suggestions about names of functions and variables are welcome.

Patch[1-3] fix minor problems of lowmem repair.

Patch[4-8] introduce two ways to avoid extents overwrite:
1) Traverse trees and exclude all metadata blocks.
It's time-inefficient for big filesystems.
2) Mark all existed chunks full, allocate new chunk for CoW.
More efficient than method 1. However, it can't not handle situation
like no space.
Lowmem repair will try method 2 first and method 1 next now.

Patch[9-17] remove parameters in lowmem repair functions. They
will try to avoid extents overwrite if necessary and start 
transactions themselves.

Those patches are mainly for lowmem repair. Original mode is not
influenced.
Since btrfs-progs v4.14, lowmem mode changed to check all tree blocks
backrefs. There are many bugs about shared backref in lowmem mode now.
If lowmem check can not work as expected, lowmem repair are useless.
I will fix lowmem check in next.

---
Changlog:
v1->v2:
 - Let @err in check_btrfs_root() record err bits but excluded
   negative values.
 - Do not delete a line of code to release path after extent item'
   insertion in repair_extent_data_item().
 - Add patch[3].
 - Force CoW in new allocated chunk to avoid extents overwrite.
 - Remove parameters @trans in check_chunks_and_extents_v2() and
   related callees.
 - Repair functions for lowmem mode call try_avoid_extents_overwrite()
   and start transactions.

Su Yue (17):
  btrfs-progs: lowmem check: release path in repair_extent_data_item()
  btrfs-progs: lowmem check: record returned errors after
    walk_down_tree_v2()
  btrfs-progs: lowmem check: assign @parent early in
    repair_extent_data_item()
  btrfs-progs: lowmem check: exclude extents of metadata blocks
  btrfs-progs: lowmem check: introduce modify_block_groups_cache()
  btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite()
  btrfs-progs: lowmem check: exclude extents if init-extent-tree in
    lowmem
  btrfs-progs: lowmem check: start to remove parameters @trans in lowmem
  btrfs-progs: lowmem check: remove parameter @trans of
    delete_extent_item()
  btrfs-progs: lowmem check: remove parameter @trans of
    repair_chunk_item()
  btrfs-progs: lowmem check: remove parameter @trans of
    repair_extent_item()
  btrfs-progs: lowmem check: remove parameter @trans of
    check_leaf_items()
  btrfs-progs: lowmem check: remove parameter @trans of
    repair_tree_back_ref()
  btrfs-progs: lowmem check: remove parameter @trans of
    check_btrfs_root()
  btrfs-progs: lowmem check: introduce repair_block_accounting()
  btrfs-progs: lowmem check: end of removing parameters @trans in lowmem

 cmds-check.c | 620 +++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 515 insertions(+), 105 deletions(-)

-- 
2.15.1




^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2() Su Yue
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

In repair_extent_data_item(), path is not be released if some
errors occurs which causes extent buffer leak.

So release path in end of the function.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cmds-check.c b/cmds-check.c
index 7fc30da83ea1..309ac9553b3a 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11999,6 +11999,7 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 
 	err &= ~BACKREF_MISSING;
 out:
+	btrfs_release_path(&path);
 	if (ret)
 		error("can't repair root %llu extent data item[%llu %llu]",
 		      root->objectid, disk_bytenr, num_bytes);
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
  2017-12-20  4:57 ` [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-29 11:17   ` Nikolay Borisov
  2017-12-20  4:57 ` [PATCH v2 03/17] btrfs-progs: lowmem check: assign @parent early in repair_extent_data_item() Su Yue
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

In lowmem mode with '--repair', check_chunks_and_extents_v2()
will fix accounting in block groups and clear the error
bit BG_ACCOUNTING_ERROR.
However, return value of check_btrfs_root() is 0 either 1 instead of
error bits.

If extent tree is on error, lowmem repair always prints error and
returns nonzero value even the filesystem is fine after repair.

So let @err contains bits after walk_down_tree_v2().

Introduce FATAL_ERROR for lowmem mode to represents negative return
values since negative and positive can't not be mixed in bits operations.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 309ac9553b3a..ebede26cef01 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -134,6 +134,7 @@ struct data_backref {
 #define DIR_INDEX_MISMATCH      (1<<19) /* INODE_INDEX found but not match */
 #define DIR_COUNT_AGAIN         (1<<20) /* DIR isize should be recalculated */
 #define BG_ACCOUNTING_ERROR     (1<<21) /* Block group accounting error */
+#define FATAL_ERROR             (1<<22) /* fatal bit for errno */
 
 static inline struct data_backref* to_data_backref(struct extent_backref *back)
 {
@@ -6556,7 +6557,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
  *                otherwise means check fs tree(s) items relationship and
  *		  @root MUST be a fs tree root.
  * Returns 0      represents OK.
- * Returns not 0  represents error.
+ * Returns > 0    represents error bits.
  */
 static int check_btrfs_root(struct btrfs_trans_handle *trans,
 			    struct btrfs_root *root, unsigned int ext_ref,
@@ -6607,12 +6608,12 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
 	while (1) {
 		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
 					ext_ref, check_all);
-
-		err |= !!ret;
+		if (ret > 0)
+			err |= ret;
 
 		/* if ret is negative, walk shall stop */
 		if (ret < 0) {
-			ret = err;
+			ret = err | FATAL_ERROR;
 			break;
 		}
 
@@ -6636,12 +6637,12 @@ out:
  * @ext_ref:	the EXTENDED_IREF feature
  *
  * Return 0 if no error found.
- * Return <0 for error.
+ * Return not 0 for error.
  */
 static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
 {
 	reset_cached_block_groups(root->fs_info);
-	return check_btrfs_root(NULL, root, ext_ref, 0);
+	return !!check_btrfs_root(NULL, root, ext_ref, 0);
 }
 
 /*
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 03/17] btrfs-progs: lowmem check: assign @parent early in repair_extent_data_item()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
  2017-12-20  4:57 ` [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item() Su Yue
  2017-12-20  4:57 ` [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 04/17] btrfs-progs: lowmem check: exclude extents of metadata blocks Su Yue
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

The variable @eb is assigned to leaf in fs_tree before insertion of
backref. It will causes wrong parent of new inserted backref.

Set @parent in the begin solves the problem.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index ebede26cef01..1f06f0a0ea61 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11938,6 +11938,11 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 	extent_offset = btrfs_file_extent_offset(eb, fi);
 	offset = file_offset - extent_offset;
 
+	if (nrefs->full_backref[0])
+		parent = btrfs_header_bytenr(eb);
+	else
+		parent = 0;
+
 	/* now repair only adds backref */
 	if ((err & BACKREF_MISSING) == 0)
 		return err;
@@ -11979,11 +11984,6 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 		btrfs_release_path(&path);
 	}
 
-	if (nrefs->full_backref[0])
-		parent = btrfs_header_bytenr(eb);
-	else
-		parent = 0;
-
 	ret = btrfs_inc_extent_ref(trans, root, disk_bytenr, num_bytes, parent,
 				   root->objectid,
 		   parent ? BTRFS_FIRST_FREE_OBJECTID : fi_key.objectid,
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 04/17] btrfs-progs: lowmem check: exclude extents of metadata blocks
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (2 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 03/17] btrfs-progs: lowmem check: assign @parent early in repair_extent_data_item() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache() Su Yue
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
screws up extent allocator") removes pin_metadata_blocks() from
lowmem repair.
So we have to find another way to exclude extents which should be
occupied by tree blocks.

Modify pin_down_tree_blocks() only for code reuse.
So behavior of pin_metadata_blocks() which works with option
'init-extent-tree' is not influenced.

Introduce exclude_blocks_and_extent_items() to mark extents of all tree
blocks dirty in fs_info->excluded_extents.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 63 insertions(+), 10 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 1f06f0a0ea61..dd46569f3811 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13537,40 +13537,54 @@ init:
 	return 0;
 }
 
-static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
-				struct extent_buffer *eb, int tree_root)
+static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root,
+				int pin)
 {
 	struct extent_buffer *tmp;
 	struct btrfs_root_item *ri;
 	struct btrfs_key key;
+	struct extent_io_tree *tree;
 	u64 bytenr;
 	int level = btrfs_header_level(eb);
 	int nritems;
 	int ret;
 	int i;
+	u64 end = eb->start + eb->len;
 
+	if (pin)
+		tree = &fs_info->pinned_extents;
+	else
+		tree = fs_info->excluded_extents;
 	/*
-	 * If we have pinned this block before, don't pin it again.
+	 * If we have pinned/excluded this block before, don't do it again.
 	 * This can not only avoid forever loop with broken filesystem
 	 * but also give us some speedups.
 	 */
-	if (test_range_bit(&fs_info->pinned_extents, eb->start,
-			   eb->start + eb->len - 1, EXTENT_DIRTY, 0))
+	if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
 		return 0;
 
-	btrfs_pin_extent(fs_info, eb->start, eb->len);
+	if (pin)
+		btrfs_pin_extent(fs_info, eb->start, eb->len);
+	else
+		set_extent_dirty(tree, eb->start, end - 1);
 
 	nritems = btrfs_header_nritems(eb);
 	for (i = 0; i < nritems; i++) {
 		if (level == 0) {
+			bool is_extent_root;
 			btrfs_item_key_to_cpu(eb, &key, i);
 			if (key.type != BTRFS_ROOT_ITEM_KEY)
 				continue;
 			/* Skip the extent root and reloc roots */
-			if (key.objectid == BTRFS_EXTENT_TREE_OBJECTID ||
-			    key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
+			if (key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
 			    key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
 				continue;
+			is_extent_root =
+				key.objectid == BTRFS_EXTENT_TREE_OBJECTID;
+			/* If pin, skip the extent root */
+			if (pin && is_extent_root)
+				continue;
 			ri = btrfs_item_ptr(eb, i, struct btrfs_root_item);
 			bytenr = btrfs_disk_root_bytenr(eb, ri);
 
@@ -13585,7 +13599,7 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 				fprintf(stderr, "Error reading root block\n");
 				return -EIO;
 			}
-			ret = pin_down_tree_blocks(fs_info, tmp, 0);
+			ret = traverse_tree_blocks(fs_info, tmp, 0, pin);
 			free_extent_buffer(tmp);
 			if (ret)
 				return ret;
@@ -13604,7 +13618,8 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 				fprintf(stderr, "Error reading tree block\n");
 				return -EIO;
 			}
-			ret = pin_down_tree_blocks(fs_info, tmp, tree_root);
+			ret = traverse_tree_blocks(fs_info, tmp, tree_root,
+						   pin);
 			free_extent_buffer(tmp);
 			if (ret)
 				return ret;
@@ -13614,6 +13629,12 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root)
+{
+	return traverse_tree_blocks(fs_info, eb, tree_root, 1);
+}
+
 static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
 {
 	int ret;
@@ -13625,6 +13646,38 @@ static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
 	return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
 }
 
+static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root)
+{
+	return traverse_tree_blocks(fs_info, eb, tree_root, 0);
+}
+
+static int exclude_metadata_blocks(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+	struct extent_io_tree *excluded_extents;
+
+	excluded_extents = malloc(sizeof(*excluded_extents));
+	if (!excluded_extents)
+		return -ENOMEM;
+	extent_io_tree_init(excluded_extents);
+	fs_info->excluded_extents = excluded_extents;
+
+	ret = exclude_tree_blocks(fs_info, fs_info->chunk_root->node, 0);
+	if (ret)
+		return ret;
+	return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
+}
+
+static void cleanup_excluded_extents(struct btrfs_fs_info *fs_info)
+{
+	if (fs_info->excluded_extents) {
+		extent_io_tree_cleanup(fs_info->excluded_extents);
+		free(fs_info->excluded_extents);
+	}
+	fs_info->excluded_extents = NULL;
+}
+
 static int reset_block_groups(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_block_group_cache *cache;
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (3 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 04/17] btrfs-progs: lowmem check: exclude extents of metadata blocks Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  5:38   ` Qu Wenruo
  2017-12-20  4:57 ` [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() Su Yue
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Excluding or pining all metadata blocks is not time-efficient for large
storage filesystems.
Here is another way to mark all metadata block groups full and allocate
a new chunk for CoW. So new reservered extents never overwrite
extents.

Introduce modify_block_groups_cache() to modify all blocks groups
cache state and set all extents in block groups unfree in free space
cache.

Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index dd46569f3811..d98d4bda7357 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10828,6 +10828,89 @@ static void reset_cached_block_groups(struct btrfs_fs_info *fs_info)
 	}
 }
 
+/*
+ * Mark all extents unfree in the block group. And set @block_group->cached
+ * according to @cache.
+ */
+static int modify_block_group_cache(struct btrfs_fs_info *fs_info,
+		    struct btrfs_block_group_cache *block_group, int cache)
+{
+	struct extent_io_tree *free_space_cache = &fs_info->free_space_cache;
+	u64 start = block_group->key.objectid;
+	u64 end = start + block_group->key.offset;
+
+	if (cache && !block_group->cached) {
+		block_group->cached = 1;
+		clear_extent_dirty(free_space_cache, start, end - 1);
+	}
+
+	if (!cache && block_group->cached) {
+		block_group->cached = 0;
+		clear_extent_dirty(free_space_cache, start, end - 1);
+	}
+	return 0;
+}
+
+/*
+ * Modify block groups which have @flags unfree in free space cache.
+ *
+ * @cache: if 0, clear block groups cache state;
+ *         not 0, mark blocks groups cached.
+ */
+static int modify_block_groups_cache(struct btrfs_fs_info *fs_info, u64 flags,
+				     int cache)
+{
+	struct btrfs_root *root = fs_info->extent_root;
+	struct btrfs_key key;
+	struct btrfs_path path;
+	struct btrfs_block_group_cache *bg_cache;
+	struct btrfs_block_group_item *bi;
+	struct btrfs_block_group_item bg_item;
+	struct extent_buffer *eb;
+	int slot;
+	int ret;
+
+	key.objectid = 0;
+	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
+	key.offset = 0;
+
+	btrfs_init_path(&path);
+	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	if (ret < 0) {
+		error("fail to search block groups due to %s", strerror(-ret));
+		goto out;
+	}
+
+	while (1) {
+		eb = path.nodes[0];
+		slot = path.slots[0];
+		btrfs_item_key_to_cpu(eb, &key, slot);
+		bg_cache = btrfs_lookup_block_group(fs_info, key.objectid);
+		if (!bg_cache) {
+			ret = -ENOENT;
+			goto out;
+		}
+
+		bi = btrfs_item_ptr(eb, slot, struct btrfs_block_group_item);
+		read_extent_buffer(eb, &bg_item, (unsigned long)bi,
+				   sizeof(bg_item));
+		if (btrfs_block_group_flags(&bg_item) & flags)
+			modify_block_group_cache(fs_info, bg_cache, cache);
+
+		ret = btrfs_next_item(root, &path);
+		if (ret > 0) {
+			ret = 0;
+			goto out;
+		}
+		if (ret < 0)
+			goto out;
+	}
+
+out:
+	btrfs_release_path(&path);
+	return ret;
+}
+
 static int check_extent_refs(struct btrfs_root *root,
 			     struct cache_tree *extent_cache)
 {
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (4 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  5:41   ` Qu Wenruo
  2017-12-20  4:57 ` [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite() Su Yue
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Introduce create_chunk_and_block_block_group() to allocate new chunk
and corresponding block group.

The new function force_cow_in_new_chunk() first allocates new chunk
and records its start.
Then it modifies all metadata block groups cached and full.
Finally it marks the new block group uncached and unfree.
In the next CoW, extents states will be updated automatically by
cache_block_group().

Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index d98d4bda7357..311c8a9f45e8 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10911,6 +10911,86 @@ out:
 	return ret;
 }
 
+static int create_chunk_and_block_group(struct btrfs_fs_info *fs_info,
+					u64 flags, u64 *start, u64 *nbytes)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *root = fs_info->extent_root;
+	int ret;
+
+	if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
+		return -EINVAL;
+
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("error starting transaction %s", strerror(-ret));
+		return ret;
+	}
+	ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
+	if (ret) {
+		error("fail to allocate new chunk %s", strerror(-ret));
+		goto out;
+	}
+	ret = btrfs_make_block_group(trans, fs_info, 0, flags,
+			     BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
+	if (ret) {
+		error("fail to make block group for chunk %llu %llu %s",
+		      *start, *nbytes, strerror(-ret));
+		goto out;
+	}
+out:
+	btrfs_commit_transaction(trans, root);
+	return ret;
+}
+
+static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_block_group_cache *bg;
+	u64 start;
+	u64 nbytes;
+	u64 alloc_profile;
+	u64 flags;
+	int ret;
+
+	alloc_profile = (fs_info->avail_metadata_alloc_bits &
+			 fs_info->metadata_alloc_profile);
+	flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
+	if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
+		flags |= BTRFS_BLOCK_GROUP_DATA;
+
+	ret = create_chunk_and_block_group(fs_info, flags, &start, &nbytes);
+	if (!ret)
+		printf("Create new chunk %llu %llu\n", start, nbytes);
+	else
+		goto err;
+
+	flags = BTRFS_BLOCK_GROUP_METADATA;
+	/* Mark all metadata block groups cached and full in free space*/
+	ret = modify_block_groups_cache(fs_info, flags, 1);
+	if (ret)
+		goto clear_bg_cache;
+
+	bg = btrfs_lookup_block_group(fs_info, start);
+	if (!bg) {
+		ret = -ENOENT;
+		error("fail to look up block group %llu %llu", start, nbytes);
+		goto clear_bg_cache;
+	}
+
+	/* Clear block group cache just allocated */
+	ret = modify_block_group_cache(fs_info, bg, 0);
+	if (ret)
+		goto clear_bg_cache;
+
+	return 0;
+
+clear_bg_cache:
+	modify_block_groups_cache(fs_info, flags, 0);
+err:
+	return ret;
+}
+
 static int check_extent_refs(struct btrfs_root *root,
 			     struct cache_tree *extent_cache)
 {
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (5 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  5:46   ` Qu Wenruo
  2017-12-20  4:57 ` [PATCH v2 08/17] btrfs-progs: lowmem check: exclude extents if init-extent-tree in lowmem Su Yue
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Define a global enum extents_operation to record extents are pinned,
excluded or new chunk is allocated for extents.
Although global variable is not so graceful, it simplifies codes much.

New function try_avoid_extents_overwrite() will try to mark block
groups full and allocate a new chunk. If it failed because of no space
or wrong used bytes(fsck-tests/004), then try to exclude metadata
blocks.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 311c8a9f45e8..9042bab93785 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -84,6 +84,15 @@ enum btrfs_check_mode {
 
 static enum btrfs_check_mode check_mode = CHECK_MODE_DEFAULT;
 
+enum lowmem_extents_operation {
+	EXTENTS_PIN,
+	EXTENTS_EXCLUDE,
+	EXTENTS_MARK_BG_FULL,
+	EXTENTS_NONE
+};
+
+static enum lowmem_extents_operation extents_operation = EXTENTS_NONE;
+
 struct extent_backref {
 	struct rb_node node;
 	unsigned int is_data:1;
@@ -13517,6 +13526,98 @@ out:
 	return err;
 }
 
+static void cleanup_excluded_extents(struct btrfs_fs_info *fs_info);
+static int end_avoid_extents_overwrite(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+
+	switch (extents_operation) {
+	case EXTENTS_PIN:
+		ret = btrfs_finish_extent_commit(NULL, fs_info->extent_root,
+						 &fs_info->pinned_extents);
+		break;
+	case EXTENTS_EXCLUDE:
+		cleanup_excluded_extents(fs_info);
+		ret = 0;
+		break;
+	case EXTENTS_MARK_BG_FULL:
+		ret = modify_block_groups_cache(fs_info,
+						BTRFS_BLOCK_GROUP_METADATA, 0);
+		break;
+	case EXTENTS_NONE:
+		ret = 0;
+		break;
+	default:
+		ret = -EINVAL;
+	}
+
+	if (!ret)
+		extents_operation = EXTENTS_NONE;
+	return ret;
+}
+
+static int pin_metadata_blocks(struct btrfs_fs_info *fs_info);
+static int exclude_metadata_blocks(struct btrfs_fs_info *fs_info);
+/*
+ * NOTE: Do not call this function during transaction.
+ */
+static int avoid_extents_overwrite(struct btrfs_fs_info *fs_info,
+				   enum lowmem_extents_operation op)
+{
+	int ret;
+
+	if (op == extents_operation)
+		return 0;
+
+	switch (op) {
+	case EXTENTS_PIN:
+		ret = pin_metadata_blocks(fs_info);
+		break;
+	case EXTENTS_EXCLUDE:
+		ret = exclude_metadata_blocks(fs_info);
+		break;
+	case EXTENTS_MARK_BG_FULL:
+		ret = force_cow_in_new_chunk(fs_info);
+		break;
+	case EXTENTS_NONE:
+		ret = 0;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* extents_operation should be assigned anyway for latter clean up. */
+	extents_operation = op;
+
+	if (ret)
+		end_avoid_extents_overwrite(fs_info);
+	return ret;
+}
+
+static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+	int mixed = btrfs_fs_incompat(fs_info, MIXED_GROUPS);
+
+	if (extents_operation != EXTENTS_NONE)
+		return 0;
+	ret = avoid_extents_overwrite(fs_info, EXTENTS_MARK_BG_FULL);
+
+	/*
+	 * If there is no space left to allocate, try to exclude all metadata
+	 * blocks. Mix filesystem is unsupported.
+	 */
+	if (ret && ret == -ENOSPC && !mixed) {
+		printf(
+	"Try to exclude all metadata blcoks and extents, it may be slow\n");
+		ret = avoid_extents_overwrite(fs_info, EXTENTS_EXCLUDE);
+	}
+
+	if (ret)
+		error("failed to avoid extents overwrite %s", strerror(-ret));
+	return ret;
+}
+
 /*
  * Low memory usage version check_chunks_and_extents.
  */
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 08/17] btrfs-progs: lowmem check: exclude extents if init-extent-tree in lowmem
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (6 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans " Su Yue
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

If options '--init-extent-tree' and '--mode=lowmem' are both
input, all metadata blocks will be traversed twice.
First one is done by pin_metadata_blocks() in reinit_extent_tree().
Second one is in check_chunks_and_extents_v2().

Excluding instead of pining metadata blocks before reinit extent tree
in lowmem can save some time.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 9042bab93785..bc405d0f3fc0 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13699,6 +13699,11 @@ out:
 
 	/* if repair, update block accounting */
 	if (repair) {
+		/* Just do clean up, ignore return values here*/
+		end_avoid_extents_overwrite(fs_info);
+		modify_block_groups_cache(fs_info, BTRFS_BLOCK_GROUP_METADATA,
+					  0);
+
 		ret = btrfs_fix_block_accounting(trans, root);
 		if (ret)
 			err |= ret;
@@ -13710,7 +13715,6 @@ out:
 		btrfs_commit_transaction(trans, root->fs_info->extent_root);
 
 	btrfs_release_path(&path);
-
 	return err;
 }
 
@@ -14121,6 +14125,7 @@ static int reinit_extent_tree(struct btrfs_trans_handle *trans,
 {
 	u64 start = 0;
 	int ret;
+	int pin_extents = check_mode == CHECK_MODE_ORIGINAL;
 
 	/*
 	 * The only reason we don't do this is because right now we're just
@@ -14142,10 +14147,21 @@ static int reinit_extent_tree(struct btrfs_trans_handle *trans,
 	 * down the bytes that are in use so we don't overwrite any existing
 	 * metadata.
 	 */
-	ret = pin_metadata_blocks(fs_info);
-	if (ret) {
-		fprintf(stderr, "error pinning down used bytes\n");
-		return ret;
+again:
+	if (pin_extents) {
+		ret = pin_metadata_blocks(fs_info);
+		if (ret) {
+			fprintf(stderr, "error pinning down used bytes\n");
+			return ret;
+		}
+	} else {
+		ret = avoid_extents_overwrite(fs_info, EXTENTS_EXCLUDE);
+		if (ret) {
+			fprintf(stderr, "error excluding used bytes\n");
+			printf("try to pin down used bytes\n");
+			pin_extents = 1;
+			goto again;
+		}
 	}
 
 	/*
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans in lowmem
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (7 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 08/17] btrfs-progs: lowmem check: exclude extents if init-extent-tree in lowmem Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  5:51   ` Qu Wenruo
  2017-12-20  4:57 ` [PATCH v2 10/17] btrfs-progs: lowmem check: remove parameter @trans of delete_extent_item() Su Yue
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Since extents can be avoid overwrite by excluding or new chunk
allocation. It's unnessesary to do all repairs in one transaction.

This patch removes parameter @trans of repair_extent_data_item().
repair_extent_data_item() calls try_avoid_extents_overwrite()
and starts a transaction by itself.

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index bc405d0f3fc0..20bf3d230946 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -12064,18 +12064,19 @@ out:
 	return err;
 }
 
+static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info);
 /*
  * If @err contains BACKREF_MISSING then add extent of the
  * file_extent_data_item.
  *
  * Returns error bits after reapir.
  */
-static int repair_extent_data_item(struct btrfs_trans_handle *trans,
-				   struct btrfs_root *root,
+static int repair_extent_data_item(struct btrfs_root *root,
 				   struct btrfs_path *pathp,
 				   struct node_refs *nrefs,
 				   int err)
 {
+	struct btrfs_trans_handle *trans = NULL;
 	struct btrfs_file_extent_item *fi;
 	struct btrfs_key fi_key;
 	struct btrfs_key key;
@@ -12092,6 +12093,7 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 	u64 file_offset;
 	int generation;
 	int slot;
+	int need_insert = 0;
 	int ret = 0;
 
 	eb = pathp->nodes[0];
@@ -12130,9 +12132,20 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 		ret = -EIO;
 		goto out;
 	}
+	need_insert = ret;
 
+	ret = try_avoid_extents_overwrite(root->fs_info);
+	if (ret)
+		goto out;
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		trans = NULL;
+		error("fail to start transaction %s", strerror(-ret));
+		goto out;
+	}
 	/* insert an extent item */
-	if (ret > 0) {
+	if (need_insert) {
 		key.objectid = disk_bytenr;
 		key.type = BTRFS_EXTENT_ITEM_KEY;
 		key.offset = num_bytes;
@@ -12172,6 +12185,8 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 
 	err &= ~BACKREF_MISSING;
 out:
+	if (trans)
+		btrfs_commit_transaction(trans, root);
 	btrfs_release_path(&path);
 	if (ret)
 		error("can't repair root %llu extent data item[%llu %llu]",
@@ -13445,8 +13460,7 @@ again:
 	case BTRFS_EXTENT_DATA_KEY:
 		ret = check_extent_data_item(root, path, nrefs, account_bytes);
 		if (repair && ret)
-			ret = repair_extent_data_item(trans, root, path, nrefs,
-						      ret);
+			ret = repair_extent_data_item(root, path, nrefs, ret);
 		err |= ret;
 		break;
 	case BTRFS_BLOCK_GROUP_ITEM_KEY:
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 10/17] btrfs-progs: lowmem check: remove parameter @trans of delete_extent_item()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (8 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans " Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 11/17] btrfs-progs: lowmem check: remove parameter @trans of repair_chunk_item() Su Yue
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patch removes the parameter @trans of delete_extent_item().
It calls try_avoid_extents_overwrite() and starts a transaction by itself.

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 20bf3d230946..f1ab93550bc5 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13392,13 +13392,22 @@ out:
 	return err;
 }
 
-static int delete_extent_tree_item(struct btrfs_trans_handle *trans,
-				   struct btrfs_root *root,
+static int delete_extent_tree_item(struct btrfs_root *root,
 				   struct btrfs_path *path)
 {
 	struct btrfs_key key;
+	struct btrfs_trans_handle *trans;
 	int ret = 0;
 
+	ret = try_avoid_extents_overwrite(root->fs_info);
+	if (ret)
+		return ret;
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("fail to start transaction %s", strerror(-ret));
+		goto out;
+	}
 	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
 	btrfs_release_path(path);
 	ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
@@ -13416,6 +13425,7 @@ static int delete_extent_tree_item(struct btrfs_trans_handle *trans,
 	else
 		path->slots[0]--;
 out:
+	btrfs_commit_transaction(trans, root);
 	if (ret)
 		error("failed to delete root %llu item[%llu, %u, %llu]",
 		      root->objectid, key.objectid, key.type, key.offset);
@@ -13467,7 +13477,7 @@ again:
 		ret = check_block_group_item(fs_info, eb, slot);
 		if (repair &&
 		    ret & REFERENCER_MISSING)
-			ret = delete_extent_tree_item(trans, root, path);
+			ret = delete_extent_tree_item(root, path);
 		err |= ret;
 		break;
 	case BTRFS_DEV_ITEM_KEY:
@@ -13498,7 +13508,7 @@ again:
 					       key.objectid, -1);
 		if (repair &&
 		    ret & (REFERENCER_MISMATCH | REFERENCER_MISSING))
-			ret = delete_extent_tree_item(trans, root, path);
+			ret = delete_extent_tree_item(root, path);
 		err |= ret;
 		break;
 	case BTRFS_EXTENT_DATA_REF_KEY:
@@ -13511,7 +13521,7 @@ again:
 				btrfs_extent_data_ref_count(eb, dref));
 		if (repair &&
 		    ret & (REFERENCER_MISMATCH | REFERENCER_MISSING))
-			ret = delete_extent_tree_item(trans, root, path);
+			ret = delete_extent_tree_item(root, path);
 		err |= ret;
 		break;
 	case BTRFS_SHARED_BLOCK_REF_KEY:
@@ -13519,7 +13529,7 @@ again:
 						 key.objectid, -1);
 		if (repair &&
 		    ret & (REFERENCER_MISMATCH | REFERENCER_MISSING))
-			ret = delete_extent_tree_item(trans, root, path);
+			ret = delete_extent_tree_item(root, path);
 		err |= ret;
 		break;
 	case BTRFS_SHARED_DATA_REF_KEY:
@@ -13527,7 +13537,7 @@ again:
 						key.objectid);
 		if (repair &&
 		    ret & (REFERENCER_MISMATCH | REFERENCER_MISSING))
-			ret = delete_extent_tree_item(trans, root, path);
+			ret = delete_extent_tree_item(root, path);
 		err |= ret;
 		break;
 	default:
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 11/17] btrfs-progs: lowmem check: remove parameter @trans of repair_chunk_item()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (9 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 10/17] btrfs-progs: lowmem check: remove parameter @trans of delete_extent_item() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 12/17] btrfs-progs: lowmem check: remove parameter @trans of repair_extent_item() Su Yue
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patch removes parameter @trans of repair_chunk_item().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 48 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index f1ab93550bc5..5a6433623fb6 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13250,13 +13250,14 @@ out:
  *
  * Returns error after repair.
  */
-static int repair_chunk_item(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *chunk_root,
+static int repair_chunk_item(struct btrfs_root *chunk_root,
 			     struct btrfs_path *path, int err)
 {
 	struct btrfs_chunk *chunk;
 	struct btrfs_key chunk_key;
 	struct extent_buffer *eb = path->nodes[0];
+	struct btrfs_root *extent_root = chunk_root->fs_info->extent_root;
+	struct btrfs_trans_handle *trans;
 	u64 length;
 	int slot = path->slots[0];
 	u64 type;
@@ -13269,21 +13270,36 @@ static int repair_chunk_item(struct btrfs_trans_handle *trans,
 	type = btrfs_chunk_type(path->nodes[0], chunk);
 	length = btrfs_chunk_length(eb, chunk);
 
-	if (err & REFERENCER_MISSING) {
-		ret = btrfs_make_block_group(trans, chunk_root->fs_info, 0,
-		     type, chunk_key.objectid, chunk_key.offset, length);
-		if (ret) {
-			error("fail to add block group item[%llu %llu]",
-			      chunk_key.offset, length);
-			goto out;
-		} else {
-			err &= ~REFERENCER_MISSING;
-			printf("Added block group item[%llu %llu]\n",
-			       chunk_key.offset, length);
-		}
+	/* now repair only adds block group */
+	if ((err & REFERENCER_MISSING) == 0)
+		return err;
+
+	ret = try_avoid_extents_overwrite(chunk_root->fs_info);
+	if (ret)
+		return ret;
+
+	trans = btrfs_start_transaction(extent_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("fail to start transaction %s", strerror(-ret));
+		return ret;
 	}
 
-out:
+	ret = btrfs_make_block_group(trans, chunk_root->fs_info, 0, type,
+			     chunk_key.objectid, chunk_key.offset, length);
+	if (ret) {
+		error("fail to add block group item[%llu %llu]",
+		      chunk_key.offset, length);
+	} else {
+		err &= ~REFERENCER_MISSING;
+		printf("Added block group item[%llu %llu]\n", chunk_key.offset,
+		       length);
+	}
+
+	btrfs_commit_transaction(trans, extent_root);
+	if (ret)
+		error("fail to repair item(s) related to chunk item[%llu %llu]",
+		      chunk_key.objectid, chunk_key.offset);
 	return err;
 }
 
@@ -13487,7 +13503,7 @@ again:
 	case BTRFS_CHUNK_ITEM_KEY:
 		ret = check_chunk_item(fs_info, eb, slot);
 		if (repair && ret)
-			ret = repair_chunk_item(trans, root, path, ret);
+			ret = repair_chunk_item(root, path, ret);
 		err |= ret;
 		break;
 	case BTRFS_DEV_EXTENT_KEY:
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 12/17] btrfs-progs: lowmem check: remove parameter @trans of repair_extent_item()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (10 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 11/17] btrfs-progs: lowmem check: remove parameter @trans of repair_chunk_item() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 13/17] btrfs-progs: lowmem check: remove parameter @trans of check_leaf_items() Su Yue
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patch removes parameter @trans of repair_extent_item().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 54 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 5a6433623fb6..2c77afe25edc 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -12788,40 +12788,55 @@ out:
  *               means error after repair
  * Returns  0   nothing happened
  */
-static int repair_extent_item(struct btrfs_trans_handle *trans,
-		      struct btrfs_root *root, struct btrfs_path *path,
+static int repair_extent_item(struct btrfs_root *root, struct btrfs_path *path,
 		      u64 bytenr, u64 num_bytes, u64 parent, u64 root_objectid,
 		      u64 owner, u64 offset, int err)
 {
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *extent_root = root->fs_info->extent_root;
 	struct btrfs_key old_key;
 	int freed = 0;
 	int ret;
 
 	btrfs_item_key_to_cpu(path->nodes[0], &old_key, path->slots[0]);
 
-	if (err & (REFERENCER_MISSING | REFERENCER_MISMATCH)) {
-		/* delete the backref */
-		ret = btrfs_free_extent(trans, root->fs_info->fs_root, bytenr,
-			  num_bytes, parent, root_objectid, owner, offset);
-		if (!ret) {
-			freed = 1;
-			err &= ~REFERENCER_MISSING;
-			printf("Delete backref in extent [%llu %llu]\n",
-			       bytenr, num_bytes);
-		} else {
-			error("fail to delete backref in extent [%llu %llu]",
-			       bytenr, num_bytes);
-		}
+	if ((err & (REFERENCER_MISSING | REFERENCER_MISMATCH)) == 0)
+		return err;
+
+	ret = try_avoid_extents_overwrite(root->fs_info);
+	if (ret)
+		return err;
+
+	trans = btrfs_start_transaction(extent_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("fail to start transaction %s", strerror(-ret));
+		/* nothing happened */
+		ret = 0;
+		goto out;
 	}
+	/* delete the backref */
+	ret = btrfs_free_extent(trans, root->fs_info->fs_root, bytenr,
+			num_bytes, parent, root_objectid, owner, offset);
+	if (!ret) {
+		freed = 1;
+		err &= ~REFERENCER_MISSING;
+		printf("Delete backref in extent [%llu %llu]\n",
+		       bytenr, num_bytes);
+	} else {
+		error("fail to delete backref in extent [%llu %llu]",
+		      bytenr, num_bytes);
+	}
+	btrfs_commit_transaction(trans, extent_root);
 
 	/* btrfs_free_extent may delete the extent */
 	btrfs_release_path(path);
 	ret = btrfs_search_slot(NULL, root, &old_key, path, 0, 0);
-
 	if (ret)
 		ret = -ENOENT;
 	else if (freed)
 		ret = err;
+out:
 	return ret;
 }
 
@@ -12831,8 +12846,7 @@ static int repair_extent_item(struct btrfs_trans_handle *trans,
  *
  * Since we don't use extent_record anymore, introduce new error bit
  */
-static int check_extent_item(struct btrfs_trans_handle *trans,
-			     struct btrfs_fs_info *fs_info,
+static int check_extent_item(struct btrfs_fs_info *fs_info,
 			     struct btrfs_path *path)
 {
 	struct btrfs_extent_item *ei;
@@ -12963,7 +12977,7 @@ next:
 	}
 
 	if (err && repair) {
-		ret = repair_extent_item(trans, fs_info->extent_root, path,
+		ret = repair_extent_item(fs_info->extent_root, path,
 			 key.objectid, num_bytes, parent, root_objectid,
 			 owner, owner_offset, ret);
 		if (ret < 0)
@@ -13512,7 +13526,7 @@ again:
 		break;
 	case BTRFS_EXTENT_ITEM_KEY:
 	case BTRFS_METADATA_ITEM_KEY:
-		ret = check_extent_item(trans, fs_info, path);
+		ret = check_extent_item(fs_info, path);
 		err |= ret;
 		break;
 	case BTRFS_EXTENT_CSUM_KEY:
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 13/17] btrfs-progs: lowmem check: remove parameter @trans of check_leaf_items()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (11 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 12/17] btrfs-progs: lowmem check: remove parameter @trans of repair_extent_item() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 14/17] btrfs-progs: lowmem check: remove parameter @trans of repair_tree_back_ref() Su Yue
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patch removes parameter @trans of check_leaf_items().

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 2c77afe25edc..a601ffdc5cd9 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2680,8 +2680,7 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path,
 static int check_tree_block_ref(struct btrfs_root *root,
 				struct extent_buffer *eb, u64 bytenr,
 				int level, u64 owner, struct node_refs *nrefs);
-static int check_leaf_items(struct btrfs_trans_handle *trans,
-			    struct btrfs_root *root, struct btrfs_path *path,
+static int check_leaf_items(struct btrfs_root *root, struct btrfs_path *path,
 			    struct node_refs *nrefs, int account_bytes);
 
 /*
@@ -2770,8 +2769,8 @@ static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
 				ret = process_one_leaf_v2(root, path, nrefs,
 							  level, ext_ref);
 			else
-				ret = check_leaf_items(trans, root, path,
-					       nrefs, account_file_data);
+				ret = check_leaf_items(root, path, nrefs,
+						       account_file_data);
 			err |= ret;
 			break;
 		} else {
@@ -13468,8 +13467,7 @@ out:
 /*
  * Main entry function to check known items and update related accounting info
  */
-static int check_leaf_items(struct btrfs_trans_handle *trans,
-			    struct btrfs_root *root, struct btrfs_path *path,
+static int check_leaf_items(struct btrfs_root *root, struct btrfs_path *path,
 			    struct node_refs *nrefs, int account_bytes)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 14/17] btrfs-progs: lowmem check: remove parameter @trans of repair_tree_back_ref()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (12 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 13/17] btrfs-progs: lowmem check: remove parameter @trans of check_leaf_items() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 15/17] btrfs-progs: lowmem check: remove parameter @trans of check_btrfs_root() Su Yue
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

This patch removes parameter @trans of repair_tree_back_ref().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.

Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
This error will disappear after removing @trans finished.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index a601ffdc5cd9..90270f1f8f72 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2533,6 +2533,7 @@ static void account_bytes(struct btrfs_root *root, struct btrfs_path *path,
 	}
 }
 
+static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info);
 /*
  * This function only handles BACKREF_MISSING,
  * If corresponding extent item exists, increase the ref, else insert an extent
@@ -2540,11 +2541,11 @@ static void account_bytes(struct btrfs_root *root, struct btrfs_path *path,
  *
  * Returns error bits after repair.
  */
-static int repair_tree_block_ref(struct btrfs_trans_handle *trans,
-				 struct btrfs_root *root,
+static int repair_tree_block_ref(struct btrfs_root *root,
 				 struct extent_buffer *node,
 				 struct node_refs *nrefs, int level, int err)
 {
+	struct btrfs_trans_handle *trans = NULL;
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct btrfs_root *extent_root = fs_info->extent_root;
 	struct btrfs_path path;
@@ -2594,6 +2595,16 @@ static int repair_tree_block_ref(struct btrfs_trans_handle *trans,
 	if (nrefs->full_backref[level] != 0)
 		flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
 
+	ret = try_avoid_extents_overwrite(root->fs_info);
+	if (ret)
+		goto out;
+	trans = btrfs_start_transaction(extent_root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		trans = NULL;
+		error("fail to start transaction %s", strerror(-ret));
+		goto out;
+	}
 	/* insert an extent item */
 	if (insert_extent) {
 		struct btrfs_disk_key copy_key;
@@ -2660,6 +2671,8 @@ static int repair_tree_block_ref(struct btrfs_trans_handle *trans,
 
 	nrefs->refs[level]++;
 out:
+	if (trans)
+		btrfs_commit_transaction(trans, extent_root);
 	btrfs_release_path(&path);
 	if (ret) {
 		error(
@@ -2738,7 +2751,7 @@ static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
 			   btrfs_header_owner(cur), nrefs);
 
 			if (repair && ret)
-				ret = repair_tree_block_ref(trans, root,
+				ret = repair_tree_block_ref(root,
 				    path->nodes[*level], nrefs, *level, ret);
 			err |= ret;
 
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 15/17] btrfs-progs: lowmem check: remove parameter @trans of check_btrfs_root()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (13 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 14/17] btrfs-progs: lowmem check: remove parameter @trans of repair_tree_back_ref() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 16/17] btrfs-progs: lowmem check: introduce repair_block_accounting() Su Yue
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Remove parameters @trans of delete_extent_item() and
walk_down_tree_v2().

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 90270f1f8f72..81e125b9315f 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2705,8 +2705,7 @@ static int check_leaf_items(struct btrfs_root *root, struct btrfs_path *path,
  * Returns <0  Fatal error, must exit the whole check
  * Returns 0   No errors found
  */
-static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *root, struct btrfs_path *path,
+static int walk_down_tree_v2(struct btrfs_root *root, struct btrfs_path *path,
 			     int *level, struct node_refs *nrefs, int ext_ref,
 			     int check_all)
 
@@ -6580,8 +6579,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
  * Returns 0      represents OK.
  * Returns > 0    represents error bits.
  */
-static int check_btrfs_root(struct btrfs_trans_handle *trans,
-			    struct btrfs_root *root, unsigned int ext_ref,
+static int check_btrfs_root(struct btrfs_root *root, unsigned int ext_ref,
 			    int check_all)
 
 {
@@ -6627,8 +6625,8 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
 	}
 
 	while (1) {
-		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
-					ext_ref, check_all);
+		ret = walk_down_tree_v2(root, &path, &level, &nrefs, ext_ref,
+					check_all);
 		if (ret > 0)
 			err |= ret;
 
@@ -6663,7 +6661,7 @@ out:
 static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
 {
 	reset_cached_block_groups(root->fs_info);
-	return !!check_btrfs_root(NULL, root, ext_ref, 0);
+	return !!check_btrfs_root(root, ext_ref, 0);
 }
 
 /*
@@ -13709,11 +13707,11 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
 	}
 
 	root1 = root->fs_info->chunk_root;
-	ret = check_btrfs_root(trans, root1, 0, 1);
+	ret = check_btrfs_root(root1, 0, 1);
 	err |= ret;
 
 	root1 = root->fs_info->tree_root;
-	ret = check_btrfs_root(trans, root1, 0, 1);
+	ret = check_btrfs_root(root1, 0, 1);
 	err |= ret;
 
 	btrfs_init_path(&path);
@@ -13744,7 +13742,7 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
 			goto next;
 		}
 
-		ret = check_btrfs_root(trans, cur_root, 0, 1);
+		ret = check_btrfs_root(cur_root, 0, 1);
 		err |= ret;
 
 		if (key.objectid == BTRFS_TREE_RELOC_OBJECTID)
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 16/17] btrfs-progs: lowmem check: introduce repair_block_accounting()
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (14 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 15/17] btrfs-progs: lowmem check: remove parameter @trans of check_btrfs_root() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  4:57 ` [PATCH v2 17/17] btrfs-progs: lowmem check: end of removing parameters @trans in lowmem Su Yue
  2017-12-20  5:59 ` [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Qu Wenruo
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Introduce repair_block_accounting() which calls
btrfs_fix_block_accounting() to repair block group accouting.

Replace btrfs_fix_block_accounting() with the new function.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 81e125b9315f..b51fcfd8ffdb 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13681,6 +13681,30 @@ static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info)
 	return ret;
 }
 
+/*
+ * Wrapper function for btrfs_fix_block_accounting().
+ *
+ * Returns 0     on success.
+ * Returns != 0  on error.
+ */
+static int repair_block_accounting(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_trans_handle *trans = NULL;
+	struct btrfs_root *root = fs_info->extent_root;
+	int ret;
+
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		error("fail to start transaction %s", strerror(-ret));
+		return ret;
+	}
+
+	ret = btrfs_fix_block_accounting(trans, root);
+	btrfs_commit_transaction(trans, root);
+	return ret;
+}
+
 /*
  * Low memory usage version check_chunks_and_extents.
  */
@@ -13767,7 +13791,7 @@ out:
 		modify_block_groups_cache(fs_info, BTRFS_BLOCK_GROUP_METADATA,
 					  0);
 
-		ret = btrfs_fix_block_accounting(trans, root);
+		ret = repair_block_accounting(fs_info);
 		if (ret)
 			err |= ret;
 		else
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 17/17] btrfs-progs: lowmem check: end of removing parameters @trans in lowmem
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (15 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 16/17] btrfs-progs: lowmem check: introduce repair_block_accounting() Su Yue
@ 2017-12-20  4:57 ` Su Yue
  2017-12-20  5:59 ` [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Qu Wenruo
  17 siblings, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-20  4:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs, suy.fnst

Remove @trans in check_chunks_and_extents.

After this patch, Lowmem repair should work again.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index b51fcfd8ffdb..3f409c967f5e 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13710,7 +13710,6 @@ static int repair_block_accounting(struct btrfs_fs_info *fs_info)
  */
 static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
 {
-	struct btrfs_trans_handle *trans = NULL;
 	struct btrfs_path path;
 	struct btrfs_key old_key;
 	struct btrfs_key key;
@@ -13722,14 +13721,6 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
 
 	root = fs_info->fs_root;
 
-	if (repair) {
-		trans = btrfs_start_transaction(fs_info->extent_root, 1);
-		if (IS_ERR(trans)) {
-			error("failed to start transaction before check");
-			return PTR_ERR(trans);
-		}
-	}
-
 	root1 = root->fs_info->chunk_root;
 	ret = check_btrfs_root(root1, 0, 1);
 	err |= ret;
@@ -13798,9 +13789,6 @@ out:
 			err &= ~BG_ACCOUNTING_ERROR;
 	}
 
-	if (trans)
-		btrfs_commit_transaction(trans, root->fs_info->extent_root);
-
 	btrfs_release_path(&path);
 	return err;
 }
-- 
2.15.1




^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache()
  2017-12-20  4:57 ` [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache() Su Yue
@ 2017-12-20  5:38   ` Qu Wenruo
  0 siblings, 0 replies; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  5:38 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3899 bytes --]



On 2017年12月20日 12:57, Su Yue wrote:
> Excluding or pining all metadata blocks is not time-efficient for large
> storage filesystems.
> Here is another way to mark all metadata block groups full and allocate
> a new chunk for CoW. So new reservered extents never overwrite
> extents.
> 
> Introduce modify_block_groups_cache() to modify all blocks groups
> cache state and set all extents in block groups unfree in free space
> cache.
> 
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index dd46569f3811..d98d4bda7357 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -10828,6 +10828,89 @@ static void reset_cached_block_groups(struct btrfs_fs_info *fs_info)
>  	}
>  }
>  
> +/*
> + * Mark all extents unfree in the block group. And set @block_group->cached
> + * according to @cache.
> + */
> +static int modify_block_group_cache(struct btrfs_fs_info *fs_info,
> +		    struct btrfs_block_group_cache *block_group, int cache)
> +{
> +	struct extent_io_tree *free_space_cache = &fs_info->free_space_cache;
> +	u64 start = block_group->key.objectid;
> +	u64 end = start + block_group->key.offset;
> +
> +	if (cache && !block_group->cached) {
> +		block_group->cached = 1;
> +		clear_extent_dirty(free_space_cache, start, end - 1);
> +	}
> +
> +	if (!cache && block_group->cached) {
> +		block_group->cached = 0;
> +		clear_extent_dirty(free_space_cache, start, end - 1);
> +	}
> +	return 0;
> +}
> +
> +/*
> + * Modify block groups which have @flags unfree in free space cache.
> + *
> + * @cache: if 0, clear block groups cache state;
> + *         not 0, mark blocks groups cached.

The naming of the function and the @cache parameter is quite confusing.

And so is the later usage of the function.
It's used by both marking all block groups full and revert the full mark
as cleanup.


I would call the function mark_block_groups_full() to do the full
marking thing, without the extra @cache parameter.

And another function to clear_block_groups_full() to make them into back
to normal status.

Thanks,
Qu
> + */
> +static int modify_block_groups_cache(struct btrfs_fs_info *fs_info, u64 flags,
> +				     int cache)
> +{
> +	struct btrfs_root *root = fs_info->extent_root;
> +	struct btrfs_key key;
> +	struct btrfs_path path;
> +	struct btrfs_block_group_cache *bg_cache;
> +	struct btrfs_block_group_item *bi;
> +	struct btrfs_block_group_item bg_item;
> +	struct extent_buffer *eb;
> +	int slot;
> +	int ret;
> +
> +	key.objectid = 0;
> +	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> +	key.offset = 0;
> +
> +	btrfs_init_path(&path);
> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
> +	if (ret < 0) {
> +		error("fail to search block groups due to %s", strerror(-ret));
> +		goto out;
> +	}
> +
> +	while (1) {
> +		eb = path.nodes[0];
> +		slot = path.slots[0];
> +		btrfs_item_key_to_cpu(eb, &key, slot);
> +		bg_cache = btrfs_lookup_block_group(fs_info, key.objectid);
> +		if (!bg_cache) {
> +			ret = -ENOENT;
> +			goto out;
> +		}
> +
> +		bi = btrfs_item_ptr(eb, slot, struct btrfs_block_group_item);
> +		read_extent_buffer(eb, &bg_item, (unsigned long)bi,
> +				   sizeof(bg_item));
> +		if (btrfs_block_group_flags(&bg_item) & flags)
> +			modify_block_group_cache(fs_info, bg_cache, cache);
> +
> +		ret = btrfs_next_item(root, &path);
> +		if (ret > 0) {
> +			ret = 0;
> +			goto out;
> +		}
> +		if (ret < 0)
> +			goto out;
> +	}
> +
> +out:
> +	btrfs_release_path(&path);
> +	return ret;
> +}
> +
>  static int check_extent_refs(struct btrfs_root *root,
>  			     struct cache_tree *extent_cache)
>  {
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-20  4:57 ` [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() Su Yue
@ 2017-12-20  5:41   ` Qu Wenruo
  2017-12-20  8:21     ` Su Yue
  0 siblings, 1 reply; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  5:41 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3600 bytes --]



On 2017年12月20日 12:57, Su Yue wrote:
> Introduce create_chunk_and_block_block_group() to allocate new chunk
> and corresponding block group.
> 
> The new function force_cow_in_new_chunk() first allocates new chunk
> and records its start.
> Then it modifies all metadata block groups cached and full.
> Finally it marks the new block group uncached and unfree.
> In the next CoW, extents states will be updated automatically by
> cache_block_group().
> 
> Suggested-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index d98d4bda7357..311c8a9f45e8 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -10911,6 +10911,86 @@ out:
>  	return ret;
>  }
>  
> +static int create_chunk_and_block_group(struct btrfs_fs_info *fs_info,
> +					u64 flags, u64 *start, u64 *nbytes)
> +{
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_root *root = fs_info->extent_root;
> +	int ret;
> +
> +	if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
> +		return -EINVAL;
> +
> +	trans = btrfs_start_transaction(root, 1);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		error("error starting transaction %s", strerror(-ret));
> +		return ret;
> +	}
> +	ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
> +	if (ret) {
> +		error("fail to allocate new chunk %s", strerror(-ret));
> +		goto out;
> +	}
> +	ret = btrfs_make_block_group(trans, fs_info, 0, flags,
> +			     BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
> +	if (ret) {
> +		error("fail to make block group for chunk %llu %llu %s",
> +		      *start, *nbytes, strerror(-ret));
> +		goto out;
> +	}
> +out:
> +	btrfs_commit_transaction(trans, root);
> +	return ret;
> +}
> +
> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_block_group_cache *bg;
> +	u64 start;
> +	u64 nbytes;
> +	u64 alloc_profile;
> +	u64 flags;
> +	int ret;
> +
> +	alloc_profile = (fs_info->avail_metadata_alloc_bits &
> +			 fs_info->metadata_alloc_profile);
> +	flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
> +	if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
> +		flags |= BTRFS_BLOCK_GROUP_DATA;
> +
> +	ret = create_chunk_and_block_group(fs_info, flags, &start, &nbytes);

Why bother allocating a new chunk by yourself?

Just mark all block groups full and that's all.

Any later tree block allocation like btrfs_search_slot() with @cow = 1
will trigger chunk allocation automatically.

Thanks,
Qu

> +	if (!ret)
> +		printf("Create new chunk %llu %llu\n", start, nbytes);
> +	else
> +		goto err;
> +
> +	flags = BTRFS_BLOCK_GROUP_METADATA;
> +	/* Mark all metadata block groups cached and full in free space*/
> +	ret = modify_block_groups_cache(fs_info, flags, 1);
> +	if (ret)
> +		goto clear_bg_cache;
> +
> +	bg = btrfs_lookup_block_group(fs_info, start);
> +	if (!bg) {
> +		ret = -ENOENT;
> +		error("fail to look up block group %llu %llu", start, nbytes);
> +		goto clear_bg_cache;
> +	}
> +
> +	/* Clear block group cache just allocated */
> +	ret = modify_block_group_cache(fs_info, bg, 0);
> +	if (ret)
> +		goto clear_bg_cache;
> +
> +	return 0;
> +
> +clear_bg_cache:
> +	modify_block_groups_cache(fs_info, flags, 0);
> +err:
> +	return ret;
> +}
> +
>  static int check_extent_refs(struct btrfs_root *root,
>  			     struct cache_tree *extent_cache)
>  {
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite()
  2017-12-20  4:57 ` [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite() Su Yue
@ 2017-12-20  5:46   ` Qu Wenruo
  0 siblings, 0 replies; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  5:46 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 4155 bytes --]



On 2017年12月20日 12:57, Su Yue wrote:
> Define a global enum extents_operation to record extents are pinned,
> excluded or new chunk is allocated for extents.
> Although global variable is not so graceful, it simplifies codes much.
> 
> New function try_avoid_extents_overwrite() will try to mark block
> groups full and allocate a new chunk. If it failed because of no space
> or wrong used bytes(fsck-tests/004), then try to exclude metadata
> blocks.
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 101 insertions(+)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index 311c8a9f45e8..9042bab93785 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -84,6 +84,15 @@ enum btrfs_check_mode {
>  
>  static enum btrfs_check_mode check_mode = CHECK_MODE_DEFAULT;
>  
> +enum lowmem_extents_operation {
> +	EXTENTS_PIN,
> +	EXTENTS_EXCLUDE,
> +	EXTENTS_MARK_BG_FULL,
> +	EXTENTS_NONE
> +};
> +
> +static enum lowmem_extents_operation extents_operation = EXTENTS_NONE;
> +
>  struct extent_backref {
>  	struct rb_node node;
>  	unsigned int is_data:1;
> @@ -13517,6 +13526,98 @@ out:
>  	return err;
>  }
>  
> +static void cleanup_excluded_extents(struct btrfs_fs_info *fs_info);
> +static int end_avoid_extents_overwrite(struct btrfs_fs_info *fs_info)
> +{
> +	int ret;
> +
> +	switch (extents_operation) {
> +	case EXTENTS_PIN:
> +		ret = btrfs_finish_extent_commit(NULL, fs_info->extent_root,
> +						 &fs_info->pinned_extents);
> +		break;

Any particular reason to support pining down extents?

As I already mentioned in the bug fix patch, pin will cause extent to be
considered as free after transaction commitment.

Or it's only designed to support extent-tree rebuild?

At least I didn't see any where using EXTENT_PIN in the patchset.

Thanks,
Qu

> +	case EXTENTS_EXCLUDE:
> +		cleanup_excluded_extents(fs_info);
> +		ret = 0;
> +		break;
> +	case EXTENTS_MARK_BG_FULL:
> +		ret = modify_block_groups_cache(fs_info,
> +						BTRFS_BLOCK_GROUP_METADATA, 0);
> +		break;
> +	case EXTENTS_NONE:
> +		ret = 0;
> +		break;
> +	default:
> +		ret = -EINVAL;
> +	}
> +
> +	if (!ret)
> +		extents_operation = EXTENTS_NONE;
> +	return ret;
> +}
> +
> +static int pin_metadata_blocks(struct btrfs_fs_info *fs_info);
> +static int exclude_metadata_blocks(struct btrfs_fs_info *fs_info);
> +/*
> + * NOTE: Do not call this function during transaction.
> + */
> +static int avoid_extents_overwrite(struct btrfs_fs_info *fs_info,
> +				   enum lowmem_extents_operation op)
> +{
> +	int ret;
> +
> +	if (op == extents_operation)
> +		return 0;
> +
> +	switch (op) {
> +	case EXTENTS_PIN:
> +		ret = pin_metadata_blocks(fs_info);
> +		break;
> +	case EXTENTS_EXCLUDE:
> +		ret = exclude_metadata_blocks(fs_info);
> +		break;
> +	case EXTENTS_MARK_BG_FULL:
> +		ret = force_cow_in_new_chunk(fs_info);
> +		break;
> +	case EXTENTS_NONE:
> +		ret = 0;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	/* extents_operation should be assigned anyway for latter clean up. */
> +	extents_operation = op;
> +
> +	if (ret)
> +		end_avoid_extents_overwrite(fs_info);
> +	return ret;
> +}
> +
> +static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info)
> +{
> +	int ret;
> +	int mixed = btrfs_fs_incompat(fs_info, MIXED_GROUPS);
> +
> +	if (extents_operation != EXTENTS_NONE)
> +		return 0;
> +	ret = avoid_extents_overwrite(fs_info, EXTENTS_MARK_BG_FULL);
> +
> +	/*
> +	 * If there is no space left to allocate, try to exclude all metadata
> +	 * blocks. Mix filesystem is unsupported.
> +	 */
> +	if (ret && ret == -ENOSPC && !mixed) {
> +		printf(
> +	"Try to exclude all metadata blcoks and extents, it may be slow\n");
> +		ret = avoid_extents_overwrite(fs_info, EXTENTS_EXCLUDE);
> +	}
> +
> +	if (ret)
> +		error("failed to avoid extents overwrite %s", strerror(-ret));
> +	return ret;
> +}
> +
>  /*
>   * Low memory usage version check_chunks_and_extents.
>   */
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans in lowmem
  2017-12-20  4:57 ` [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans " Su Yue
@ 2017-12-20  5:51   ` Qu Wenruo
  0 siblings, 0 replies; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  5:51 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3370 bytes --]



On 2017年12月20日 12:57, Su Yue wrote:
> Since extents can be avoid overwrite by excluding or new chunk
> allocation. It's unnessesary to do all repairs in one transaction.
> 
> This patch removes parameter @trans of repair_extent_data_item().
> repair_extent_data_item() calls try_avoid_extents_overwrite()
> and starts a transaction by itself.
> 
> Note: This patch and next patches cause error in lowmem repair like:
> "Error: Commit_root already set when starting transaction".
> This error will disappear after removing @trans finished.
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index bc405d0f3fc0..20bf3d230946 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -12064,18 +12064,19 @@ out:
>  	return err;
>  }
>  
> +static int try_avoid_extents_overwrite(struct btrfs_fs_info *fs_info);
>  /*
>   * If @err contains BACKREF_MISSING then add extent of the
>   * file_extent_data_item.
>   *
>   * Returns error bits after reapir.
>   */
> -static int repair_extent_data_item(struct btrfs_trans_handle *trans,
> -				   struct btrfs_root *root,
> +static int repair_extent_data_item(struct btrfs_root *root,
>  				   struct btrfs_path *pathp,
>  				   struct node_refs *nrefs,
>  				   int err)
>  {
> +	struct btrfs_trans_handle *trans = NULL;
>  	struct btrfs_file_extent_item *fi;
>  	struct btrfs_key fi_key;
>  	struct btrfs_key key;
> @@ -12092,6 +12093,7 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
>  	u64 file_offset;
>  	int generation;
>  	int slot;
> +	int need_insert = 0;
>  	int ret = 0;
>  
>  	eb = pathp->nodes[0];
> @@ -12130,9 +12132,20 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
>  		ret = -EIO;
>  		goto out;
>  	}
> +	need_insert = ret;
>  
> +	ret = try_avoid_extents_overwrite(root->fs_info);

Even it's much cheaper to pin extent tree using chunk unit, I don't
think it's a good idea to do it every time you are going to fix some
extent tree problem.

Why not just do it at the beginning of extent tree check?

Thanks,
Qu

> +	if (ret)
> +		goto out;
> +	trans = btrfs_start_transaction(root, 1);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		trans = NULL;
> +		error("fail to start transaction %s", strerror(-ret));
> +		goto out;
> +	}
>  	/* insert an extent item */
> -	if (ret > 0) {
> +	if (need_insert) {
>  		key.objectid = disk_bytenr;
>  		key.type = BTRFS_EXTENT_ITEM_KEY;
>  		key.offset = num_bytes;
> @@ -12172,6 +12185,8 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
>  
>  	err &= ~BACKREF_MISSING;
>  out:
> +	if (trans)
> +		btrfs_commit_transaction(trans, root);
>  	btrfs_release_path(&path);
>  	if (ret)
>  		error("can't repair root %llu extent data item[%llu %llu]",
> @@ -13445,8 +13460,7 @@ again:
>  	case BTRFS_EXTENT_DATA_KEY:
>  		ret = check_extent_data_item(root, path, nrefs, account_bytes);
>  		if (repair && ret)
> -			ret = repair_extent_data_item(trans, root, path, nrefs,
> -						      ret);
> +			ret = repair_extent_data_item(root, path, nrefs, ret);
>  		err |= ret;
>  		break;
>  	case BTRFS_BLOCK_GROUP_ITEM_KEY:
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite
  2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
                   ` (16 preceding siblings ...)
  2017-12-20  4:57 ` [PATCH v2 17/17] btrfs-progs: lowmem check: end of removing parameters @trans in lowmem Su Yue
@ 2017-12-20  5:59 ` Qu Wenruo
  17 siblings, 0 replies; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  5:59 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3528 bytes --]



On 2017年12月20日 12:57, Su Yue wrote:
> This patchset can be fetched from my github:
> https://github.com/Damenly/btrfs-progs/tree/lowmem_repair
> based on kdave/devel.
> 
> Thanks to Qu Wenruo's ideas and suggestions first.
> Any suggestions about names of functions and variables are welcome.
> 
> Patch[1-3] fix minor problems of lowmem repair.

Mostly fine for patch 1-3.

So Reviewed-by: Qu Wenruo <wqu@suse.com> for those patches.

Although for 2nd patch, I'm not sure if the naming or if the >0 for
error is good.


> 
> Patch[4-8] introduce two ways to avoid extents overwrite:
> 1) Traverse trees and exclude all metadata blocks.
> It's time-inefficient for big filesystems.
> 2) Mark all existed chunks full, allocate new chunk for CoW.
> More efficient than method 1. However, it can't not handle situation
> like no space.
> Lowmem repair will try method 2 first and method 1 next now.

Commented in separate patches.

Thanks,
Qu

> 
> Patch[9-17] remove parameters in lowmem repair functions. They
> will try to avoid extents overwrite if necessary and start 
> transactions themselves.
> 
> Those patches are mainly for lowmem repair. Original mode is not
> influenced.
> Since btrfs-progs v4.14, lowmem mode changed to check all tree blocks
> backrefs. There are many bugs about shared backref in lowmem mode now.
> If lowmem check can not work as expected, lowmem repair are useless.
> I will fix lowmem check in next.
> 
> ---
> Changlog:
> v1->v2:
>  - Let @err in check_btrfs_root() record err bits but excluded
>    negative values.
>  - Do not delete a line of code to release path after extent item'
>    insertion in repair_extent_data_item().
>  - Add patch[3].
>  - Force CoW in new allocated chunk to avoid extents overwrite.
>  - Remove parameters @trans in check_chunks_and_extents_v2() and
>    related callees.
>  - Repair functions for lowmem mode call try_avoid_extents_overwrite()
>    and start transactions.
> 
> Su Yue (17):
>   btrfs-progs: lowmem check: release path in repair_extent_data_item()
>   btrfs-progs: lowmem check: record returned errors after
>     walk_down_tree_v2()
>   btrfs-progs: lowmem check: assign @parent early in
>     repair_extent_data_item()
>   btrfs-progs: lowmem check: exclude extents of metadata blocks
>   btrfs-progs: lowmem check: introduce modify_block_groups_cache()
>   btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
>   btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite()
>   btrfs-progs: lowmem check: exclude extents if init-extent-tree in
>     lowmem
>   btrfs-progs: lowmem check: start to remove parameters @trans in lowmem
>   btrfs-progs: lowmem check: remove parameter @trans of
>     delete_extent_item()
>   btrfs-progs: lowmem check: remove parameter @trans of
>     repair_chunk_item()
>   btrfs-progs: lowmem check: remove parameter @trans of
>     repair_extent_item()
>   btrfs-progs: lowmem check: remove parameter @trans of
>     check_leaf_items()
>   btrfs-progs: lowmem check: remove parameter @trans of
>     repair_tree_back_ref()
>   btrfs-progs: lowmem check: remove parameter @trans of
>     check_btrfs_root()
>   btrfs-progs: lowmem check: introduce repair_block_accounting()
>   btrfs-progs: lowmem check: end of removing parameters @trans in lowmem
> 
>  cmds-check.c | 620 +++++++++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 515 insertions(+), 105 deletions(-)
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-20  5:41   ` Qu Wenruo
@ 2017-12-20  8:21     ` Su Yue
  2017-12-20  8:37       ` Qu Wenruo
  0 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-20  8:21 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 12/20/2017 01:41 PM, Qu Wenruo wrote:
> 
> 
> On 2017年12月20日 12:57, Su Yue wrote:
>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>> and corresponding block group.
>>
>> The new function force_cow_in_new_chunk() first allocates new chunk
>> and records its start.
>> Then it modifies all metadata block groups cached and full.
>> Finally it marks the new block group uncached and unfree.
>> In the next CoW, extents states will be updated automatically by
>> cache_block_group().
>>
>> Suggested-by: Qu Wenruo <wqu@suse.com>
>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>> ---
>>   cmds-check.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 80 insertions(+)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index d98d4bda7357..311c8a9f45e8 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -10911,6 +10911,86 @@ out:
>>   	return ret;
>>   }
>>   
>> +static int create_chunk_and_block_group(struct btrfs_fs_info *fs_info,
>> +					u64 flags, u64 *start, u64 *nbytes)
>> +{
>> +	struct btrfs_trans_handle *trans;
>> +	struct btrfs_root *root = fs_info->extent_root;
>> +	int ret;
>> +
>> +	if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>> +		return -EINVAL;
>> +
>> +	trans = btrfs_start_transaction(root, 1);
>> +	if (IS_ERR(trans)) {
>> +		ret = PTR_ERR(trans);
>> +		error("error starting transaction %s", strerror(-ret));
>> +		return ret;
>> +	}
>> +	ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>> +	if (ret) {
>> +		error("fail to allocate new chunk %s", strerror(-ret));
>> +		goto out;
>> +	}
>> +	ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>> +			     BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>> +	if (ret) {
>> +		error("fail to make block group for chunk %llu %llu %s",
>> +		      *start, *nbytes, strerror(-ret));
>> +		goto out;
>> +	}
>> +out:
>> +	btrfs_commit_transaction(trans, root);
>> +	return ret;
>> +}
>> +
>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>> +{
>> +	struct btrfs_block_group_cache *bg;
>> +	u64 start;
>> +	u64 nbytes;
>> +	u64 alloc_profile;
>> +	u64 flags;
>> +	int ret;
>> +
>> +	alloc_profile = (fs_info->avail_metadata_alloc_bits &
>> +			 fs_info->metadata_alloc_profile);
>> +	flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>> +	if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>> +		flags |= BTRFS_BLOCK_GROUP_DATA;
>> +
>> +	ret = create_chunk_and_block_group(fs_info, flags, &start, &nbytes);
> 
> Why bother allocating a new chunk by yourself?
> 
> Just mark all block groups full and that's all.
> 
> Any later tree block allocation like btrfs_search_slot() with @cow = 1
> will trigger chunk allocation automatically.
> 

Tried to let it happen but BUG_ON with -ENOSPC.
Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called 
while doing CoW?
In progs, allocation of new chunk during CoW depends @root->ref_cows.
However, @root->ref_cows should be set only if @root is root of a fs
trees.

Thanks,
Su

> Thanks,
> Qu
> 
>> +	if (!ret)
>> +		printf("Create new chunk %llu %llu\n", start, nbytes);
>> +	else
>> +		goto err;
>> +
>> +	flags = BTRFS_BLOCK_GROUP_METADATA;
>> +	/* Mark all metadata block groups cached and full in free space*/
>> +	ret = modify_block_groups_cache(fs_info, flags, 1);
>> +	if (ret)
>> +		goto clear_bg_cache;
>> +
>> +	bg = btrfs_lookup_block_group(fs_info, start);
>> +	if (!bg) {
>> +		ret = -ENOENT;
>> +		error("fail to look up block group %llu %llu", start, nbytes);
>> +		goto clear_bg_cache;
>> +	}
>> +
>> +	/* Clear block group cache just allocated */
>> +	ret = modify_block_group_cache(fs_info, bg, 0);
>> +	if (ret)
>> +		goto clear_bg_cache;
>> +
>> +	return 0;
>> +
>> +clear_bg_cache:
>> +	modify_block_groups_cache(fs_info, flags, 0);
>> +err:
>> +	return ret;
>> +}
>> +
>>   static int check_extent_refs(struct btrfs_root *root,
>>   			     struct cache_tree *extent_cache)
>>   {
>>
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-20  8:21     ` Su Yue
@ 2017-12-20  8:37       ` Qu Wenruo
  2017-12-21  6:51         ` Qu Wenruo
  0 siblings, 1 reply; 36+ messages in thread
From: Qu Wenruo @ 2017-12-20  8:37 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 5176 bytes --]



On 2017年12月20日 16:21, Su Yue wrote:
> 
> 
> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年12月20日 12:57, Su Yue wrote:
>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>> and corresponding block group.
>>>
>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>> and records its start.
>>> Then it modifies all metadata block groups cached and full.
>>> Finally it marks the new block group uncached and unfree.
>>> In the next CoW, extents states will be updated automatically by
>>> cache_block_group().
>>>
>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>> ---
>>>   cmds-check.c | 80
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 80 insertions(+)
>>>
>>> diff --git a/cmds-check.c b/cmds-check.c
>>> index d98d4bda7357..311c8a9f45e8 100644
>>> --- a/cmds-check.c
>>> +++ b/cmds-check.c
>>> @@ -10911,6 +10911,86 @@ out:
>>>       return ret;
>>>   }
>>>   +static int create_chunk_and_block_group(struct btrfs_fs_info
>>> *fs_info,
>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>> +{
>>> +    struct btrfs_trans_handle *trans;
>>> +    struct btrfs_root *root = fs_info->extent_root;
>>> +    int ret;
>>> +
>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>> +        return -EINVAL;
>>> +
>>> +    trans = btrfs_start_transaction(root, 1);
>>> +    if (IS_ERR(trans)) {
>>> +        ret = PTR_ERR(trans);
>>> +        error("error starting transaction %s", strerror(-ret));
>>> +        return ret;
>>> +    }
>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>> +    if (ret) {
>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>> +        goto out;
>>> +    }
>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>> +    if (ret) {
>>> +        error("fail to make block group for chunk %llu %llu %s",
>>> +              *start, *nbytes, strerror(-ret));
>>> +        goto out;
>>> +    }
>>> +out:
>>> +    btrfs_commit_transaction(trans, root);
>>> +    return ret;
>>> +}
>>> +
>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>> +{
>>> +    struct btrfs_block_group_cache *bg;
>>> +    u64 start;
>>> +    u64 nbytes;
>>> +    u64 alloc_profile;
>>> +    u64 flags;
>>> +    int ret;
>>> +
>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>> +             fs_info->metadata_alloc_profile);
>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>> +
>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>> &nbytes);
>>
>> Why bother allocating a new chunk by yourself?
>>
>> Just mark all block groups full and that's all.
>>
>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>> will trigger chunk allocation automatically.
>>
> 
> Tried to let it happen but BUG_ON with -ENOSPC.

Then fix it.

It's not a normal behavior in this case.

Thanks,
Qu

> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
> while doing CoW?> In progs, allocation of new chunk during CoW depends @root->ref_cows.
> However, @root->ref_cows should be set only if @root is root of a fs
> trees.
> 
> Thanks,
> Su
> 
>> Thanks,
>> Qu
>>
>>> +    if (!ret)
>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>> +    else
>>> +        goto err;
>>> +
>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>> +    /* Mark all metadata block groups cached and full in free space*/
>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>> +    if (ret)
>>> +        goto clear_bg_cache;
>>> +
>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>> +    if (!bg) {
>>> +        ret = -ENOENT;
>>> +        error("fail to look up block group %llu %llu", start, nbytes);
>>> +        goto clear_bg_cache;
>>> +    }
>>> +
>>> +    /* Clear block group cache just allocated */
>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>> +    if (ret)
>>> +        goto clear_bg_cache;
>>> +
>>> +    return 0;
>>> +
>>> +clear_bg_cache:
>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>> +err:
>>> +    return ret;
>>> +}
>>> +
>>>   static int check_extent_refs(struct btrfs_root *root,
>>>                    struct cache_tree *extent_cache)
>>>   {
>>>
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-20  8:37       ` Qu Wenruo
@ 2017-12-21  6:51         ` Qu Wenruo
  2017-12-21  7:06           ` Su Yue
  2017-12-21  7:09           ` Su Yue
  0 siblings, 2 replies; 36+ messages in thread
From: Qu Wenruo @ 2017-12-21  6:51 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 5806 bytes --]



On 2017年12月20日 16:37, Qu Wenruo wrote:
> 
> 
> On 2017年12月20日 16:21, Su Yue wrote:
>>
>>
>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>>> and corresponding block group.
>>>>
>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>> and records its start.
>>>> Then it modifies all metadata block groups cached and full.
>>>> Finally it marks the new block group uncached and unfree.
>>>> In the next CoW, extents states will be updated automatically by
>>>> cache_block_group().
>>>>
>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>> ---
>>>>   cmds-check.c | 80
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 80 insertions(+)
>>>>
>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>> --- a/cmds-check.c
>>>> +++ b/cmds-check.c
>>>> @@ -10911,6 +10911,86 @@ out:
>>>>       return ret;
>>>>   }
>>>>   +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>> *fs_info,
>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>> +{
>>>> +    struct btrfs_trans_handle *trans;
>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>> +    int ret;
>>>> +
>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>> +        return -EINVAL;
>>>> +
>>>> +    trans = btrfs_start_transaction(root, 1);
>>>> +    if (IS_ERR(trans)) {
>>>> +        ret = PTR_ERR(trans);
>>>> +        error("error starting transaction %s", strerror(-ret));
>>>> +        return ret;
>>>> +    }
>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>> +    if (ret) {
>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>> +        goto out;
>>>> +    }
>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>>> +    if (ret) {
>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>> +              *start, *nbytes, strerror(-ret));
>>>> +        goto out;
>>>> +    }
>>>> +out:
>>>> +    btrfs_commit_transaction(trans, root);
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>> +{
>>>> +    struct btrfs_block_group_cache *bg;
>>>> +    u64 start;
>>>> +    u64 nbytes;
>>>> +    u64 alloc_profile;
>>>> +    u64 flags;
>>>> +    int ret;
>>>> +
>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>> +             fs_info->metadata_alloc_profile);
>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>> +
>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>> &nbytes);
>>>
>>> Why bother allocating a new chunk by yourself?
>>>
>>> Just mark all block groups full and that's all.
>>>
>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>>> will trigger chunk allocation automatically.
>>>
>>
>> Tried to let it happen but BUG_ON with -ENOSPC.
> 
> Then fix it.
> 
> It's not a normal behavior in this case.
> 
> Thanks,
> Qu

And I think the fix to allow btrfs_reserve_extent() to allocate new
chunk will solve a lot of strange BUG_ON().

it just occurred to me that, a lot of use cases relies on the assumption
that btrfs_reserve_extent() will try to allocate new chunks.

Especially for case like convert, certain btrfs check --repair, some
rescue tools.

So this would be a quite nice start point for such fix.

Thanks,
Qu

> 
>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
>> while doing CoW?> In progs, allocation of new chunk during CoW depends @root->ref_cows.
>> However, @root->ref_cows should be set only if @root is root of a fs
>> trees.
>>
>> Thanks,
>> Su
>>
>>> Thanks,
>>> Qu
>>>
>>>> +    if (!ret)
>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>> +    else
>>>> +        goto err;
>>>> +
>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>> +    /* Mark all metadata block groups cached and full in free space*/
>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>> +    if (ret)
>>>> +        goto clear_bg_cache;
>>>> +
>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>> +    if (!bg) {
>>>> +        ret = -ENOENT;
>>>> +        error("fail to look up block group %llu %llu", start, nbytes);
>>>> +        goto clear_bg_cache;
>>>> +    }
>>>> +
>>>> +    /* Clear block group cache just allocated */
>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>> +    if (ret)
>>>> +        goto clear_bg_cache;
>>>> +
>>>> +    return 0;
>>>> +
>>>> +clear_bg_cache:
>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>> +err:
>>>> +    return ret;
>>>> +}
>>>> +
>>>>   static int check_extent_refs(struct btrfs_root *root,
>>>>                    struct cache_tree *extent_cache)
>>>>   {
>>>>
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-21  6:51         ` Qu Wenruo
@ 2017-12-21  7:06           ` Su Yue
  2017-12-21  7:09           ` Su Yue
  1 sibling, 0 replies; 36+ messages in thread
From: Su Yue @ 2017-12-21  7:06 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 12/21/2017 02:51 PM, Qu Wenruo wrote:
> 
> 
> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>
>>
>> On 2017年12月20日 16:21, Su Yue wrote:
>>>
>>>
>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>>>> and corresponding block group.
>>>>>
>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>> and records its start.
>>>>> Then it modifies all metadata block groups cached and full.
>>>>> Finally it marks the new block group uncached and unfree.
>>>>> In the next CoW, extents states will be updated automatically by
>>>>> cache_block_group().
>>>>>
>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>> ---
>>>>>    cmds-check.c | 80
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    1 file changed, 80 insertions(+)
>>>>>
>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>> --- a/cmds-check.c
>>>>> +++ b/cmds-check.c
>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>        return ret;
>>>>>    }
>>>>>    +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>> *fs_info,
>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>> +{
>>>>> +    struct btrfs_trans_handle *trans;
>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>> +    int ret;
>>>>> +
>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>> +    if (IS_ERR(trans)) {
>>>>> +        ret = PTR_ERR(trans);
>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>> +        return ret;
>>>>> +    }
>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>> +    if (ret) {
>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>> +        goto out;
>>>>> +    }
>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>>>> +    if (ret) {
>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>> +              *start, *nbytes, strerror(-ret));
>>>>> +        goto out;
>>>>> +    }
>>>>> +out:
>>>>> +    btrfs_commit_transaction(trans, root);
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>> +{
>>>>> +    struct btrfs_block_group_cache *bg;
>>>>> +    u64 start;
>>>>> +    u64 nbytes;
>>>>> +    u64 alloc_profile;
>>>>> +    u64 flags;
>>>>> +    int ret;
>>>>> +
>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>> +             fs_info->metadata_alloc_profile);
>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>> +
>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>> &nbytes);
>>>>
>>>> Why bother allocating a new chunk by yourself?
>>>>
>>>> Just mark all block groups full and that's all.
>>>>
>>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>>>> will trigger chunk allocation automatically.
>>>>
>>>
>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>
>> Then fix it.
>>
>> It's not a normal behavior in this case.
>>
>> Thanks,
>> Qu
> 
> And I think the fix to allow btrfs_reserve_extent() to allocate new
> chunk will solve a lot of strange BUG_ON().
> 
> it just occurred to me that, a lot of use cases relies on the assumption
> that btrfs_reserve_extent() will try to allocate new chunks.
> Yes, it has many dependency so I considered to do chunk allocation
manually in the patchset. If fix is not good enough, many funtions
of btrfs-progs will behave abnormal.
Since you ask it, I will go to fix it.

Thanks,
Su
> Especially for case like convert, certain btrfs check --repair, some
> rescue tools.
> 
> So this would be a quite nice start point for such fix.
> 
> Thanks,
> Qu
> 
>>
>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
>>> while doing CoW?> In progs, allocation of new chunk during CoW depends @root->ref_cows.
>>> However, @root->ref_cows should be set only if @root is root of a fs
>>> trees.
>>>
>>> Thanks,
>>> Su
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> +    if (!ret)
>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>> +    else
>>>>> +        goto err;
>>>>> +
>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>> +    /* Mark all metadata block groups cached and full in free space*/
>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>> +    if (ret)
>>>>> +        goto clear_bg_cache;
>>>>> +
>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>> +    if (!bg) {
>>>>> +        ret = -ENOENT;
>>>>> +        error("fail to look up block group %llu %llu", start, nbytes);
>>>>> +        goto clear_bg_cache;
>>>>> +    }
>>>>> +
>>>>> +    /* Clear block group cache just allocated */
>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>> +    if (ret)
>>>>> +        goto clear_bg_cache;
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> +clear_bg_cache:
>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>> +err:
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>>    static int check_extent_refs(struct btrfs_root *root,
>>>>>                     struct cache_tree *extent_cache)
>>>>>    {
>>>>>
>>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-21  6:51         ` Qu Wenruo
  2017-12-21  7:06           ` Su Yue
@ 2017-12-21  7:09           ` Su Yue
  2017-12-21  7:12             ` Qu Wenruo
  1 sibling, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-21  7:09 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 12/21/2017 02:51 PM, Qu Wenruo wrote:
> 
> 
> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>
>>
>> On 2017年12月20日 16:21, Su Yue wrote:
>>>
>>>
>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>>>> and corresponding block group.
>>>>>
>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>> and records its start.
>>>>> Then it modifies all metadata block groups cached and full.
>>>>> Finally it marks the new block group uncached and unfree.
>>>>> In the next CoW, extents states will be updated automatically by
>>>>> cache_block_group().
>>>>>
>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>> ---
>>>>>    cmds-check.c | 80
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    1 file changed, 80 insertions(+)
>>>>>
>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>> --- a/cmds-check.c
>>>>> +++ b/cmds-check.c
>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>        return ret;
>>>>>    }
>>>>>    +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>> *fs_info,
>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>> +{
>>>>> +    struct btrfs_trans_handle *trans;
>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>> +    int ret;
>>>>> +
>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>> +    if (IS_ERR(trans)) {
>>>>> +        ret = PTR_ERR(trans);
>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>> +        return ret;
>>>>> +    }
>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>> +    if (ret) {
>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>> +        goto out;
>>>>> +    }
>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>>>> +    if (ret) {
>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>> +              *start, *nbytes, strerror(-ret));
>>>>> +        goto out;
>>>>> +    }
>>>>> +out:
>>>>> +    btrfs_commit_transaction(trans, root);
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>> +{
>>>>> +    struct btrfs_block_group_cache *bg;
>>>>> +    u64 start;
>>>>> +    u64 nbytes;
>>>>> +    u64 alloc_profile;
>>>>> +    u64 flags;
>>>>> +    int ret;
>>>>> +
>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>> +             fs_info->metadata_alloc_profile);
>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>> +
>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>> &nbytes);
>>>>
>>>> Why bother allocating a new chunk by yourself?
>>>>
>>>> Just mark all block groups full and that's all.
>>>>
>>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>>>> will trigger chunk allocation automatically.
>>>>
>>>
>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>
>> Then fix it.
>>
>> It's not a normal behavior in this case.
>>
>> Thanks,
>> Qu
> 
> And I think the fix to allow btrfs_reserve_extent() to allocate new
> chunk will solve a lot of strange BUG_ON().
> 
> it just occurred to me that, a lot of use cases relies on the assumption
> that btrfs_reserve_extent() will try to allocate new chunks.
> 
> Especially for case like convert, certain btrfs check --repair, some
> rescue tools.
> 
Sorry for the previous wrong format mail.

Yes, it has many dependency so I considered to do chunk allocation
manually in the patchset. If fix is not good enough, many funtions
of btrfs-progs will behave abnormal.
Since you ask it, I will go to fix it.

Thanks,
Su

> So this would be a quite nice start point for such fix.
> 
> Thanks,
> Qu
> 
>>
>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
>>> while doing CoW?> In progs, allocation of new chunk during CoW depends @root->ref_cows.
>>> However, @root->ref_cows should be set only if @root is root of a fs
>>> trees.
>>>
>>> Thanks,
>>> Su
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> +    if (!ret)
>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>> +    else
>>>>> +        goto err;
>>>>> +
>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>> +    /* Mark all metadata block groups cached and full in free space*/
>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>> +    if (ret)
>>>>> +        goto clear_bg_cache;
>>>>> +
>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>> +    if (!bg) {
>>>>> +        ret = -ENOENT;
>>>>> +        error("fail to look up block group %llu %llu", start, nbytes);
>>>>> +        goto clear_bg_cache;
>>>>> +    }
>>>>> +
>>>>> +    /* Clear block group cache just allocated */
>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>> +    if (ret)
>>>>> +        goto clear_bg_cache;
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> +clear_bg_cache:
>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>> +err:
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>>    static int check_extent_refs(struct btrfs_root *root,
>>>>>                     struct cache_tree *extent_cache)
>>>>>    {
>>>>>
>>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-21  7:09           ` Su Yue
@ 2017-12-21  7:12             ` Qu Wenruo
  2017-12-26  8:17               ` Su Yue
  0 siblings, 1 reply; 36+ messages in thread
From: Qu Wenruo @ 2017-12-21  7:12 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 7215 bytes --]



On 2017年12月21日 15:09, Su Yue wrote:
> 
> 
> On 12/21/2017 02:51 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月20日 16:21, Su Yue wrote:
>>>>
>>>>
>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>>>>> and corresponding block group.
>>>>>>
>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>>> and records its start.
>>>>>> Then it modifies all metadata block groups cached and full.
>>>>>> Finally it marks the new block group uncached and unfree.
>>>>>> In the next CoW, extents states will be updated automatically by
>>>>>> cache_block_group().
>>>>>>
>>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>> ---
>>>>>>    cmds-check.c | 80
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>    1 file changed, 80 insertions(+)
>>>>>>
>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>>> --- a/cmds-check.c
>>>>>> +++ b/cmds-check.c
>>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>>        return ret;
>>>>>>    }
>>>>>>    +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>>> *fs_info,
>>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>>> +{
>>>>>> +    struct btrfs_trans_handle *trans;
>>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>>> +        return -EINVAL;
>>>>>> +
>>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>>> +    if (IS_ERR(trans)) {
>>>>>> +        ret = PTR_ERR(trans);
>>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>>> +        return ret;
>>>>>> +    }
>>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>>> +    if (ret) {
>>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>>> +        goto out;
>>>>>> +    }
>>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>>>>> +    if (ret) {
>>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>>> +              *start, *nbytes, strerror(-ret));
>>>>>> +        goto out;
>>>>>> +    }
>>>>>> +out:
>>>>>> +    btrfs_commit_transaction(trans, root);
>>>>>> +    return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>>> +{
>>>>>> +    struct btrfs_block_group_cache *bg;
>>>>>> +    u64 start;
>>>>>> +    u64 nbytes;
>>>>>> +    u64 alloc_profile;
>>>>>> +    u64 flags;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>>> +             fs_info->metadata_alloc_profile);
>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>>> +
>>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>>> &nbytes);
>>>>>
>>>>> Why bother allocating a new chunk by yourself?
>>>>>
>>>>> Just mark all block groups full and that's all.
>>>>>
>>>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>>>>> will trigger chunk allocation automatically.
>>>>>
>>>>
>>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>>
>>> Then fix it.
>>>
>>> It's not a normal behavior in this case.
>>>
>>> Thanks,
>>> Qu
>>
>> And I think the fix to allow btrfs_reserve_extent() to allocate new
>> chunk will solve a lot of strange BUG_ON().
>>
>> it just occurred to me that, a lot of use cases relies on the assumption
>> that btrfs_reserve_extent() will try to allocate new chunks.
>>
>> Especially for case like convert, certain btrfs check --repair, some
>> rescue tools.
>>
> Sorry for the previous wrong format mail.
> 
> Yes, it has many dependency so I considered to do chunk allocation
> manually in the patchset. If fix is not good enough, many funtions
> of btrfs-progs will behave abnormal.
> Since you ask it, I will go to fix it.

Manually allocation in advance has its advantage, like we can determine
if there is enough space for new chunk instead of checking every return
value with ENOSPC.


However in current case, your metadata usage is limited to the new chunk
only.
If there extent tree has quite a lot of problem, and the chunk allocated
is small (if using single profile and small fs), it can easily hit
ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
write.

So here, we still need to allow btrfs allocate new meta chunk, even we
pre-allocate one meta chunk in advance.

Thanks,
Qu
> 
> Thanks,
> Su
> 
>> So this would be a quite nice start point for such fix.
>>
>> Thanks,
>> Qu
>>
>>>
>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
>>>> while doing CoW?> In progs, allocation of new chunk during CoW
>>>> depends @root->ref_cows.
>>>> However, @root->ref_cows should be set only if @root is root of a fs
>>>> trees.
>>>>
>>>> Thanks,
>>>> Su
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>> +    if (!ret)
>>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>>> +    else
>>>>>> +        goto err;
>>>>>> +
>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>>> +    /* Mark all metadata block groups cached and full in free
>>>>>> space*/
>>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>>> +    if (ret)
>>>>>> +        goto clear_bg_cache;
>>>>>> +
>>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>>> +    if (!bg) {
>>>>>> +        ret = -ENOENT;
>>>>>> +        error("fail to look up block group %llu %llu", start,
>>>>>> nbytes);
>>>>>> +        goto clear_bg_cache;
>>>>>> +    }
>>>>>> +
>>>>>> +    /* Clear block group cache just allocated */
>>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>>> +    if (ret)
>>>>>> +        goto clear_bg_cache;
>>>>>> +
>>>>>> +    return 0;
>>>>>> +
>>>>>> +clear_bg_cache:
>>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>>> +err:
>>>>>> +    return ret;
>>>>>> +}
>>>>>> +
>>>>>>    static int check_extent_refs(struct btrfs_root *root,
>>>>>>                     struct cache_tree *extent_cache)
>>>>>>    {
>>>>>>
>>>>>
>>>>
>>>>
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe
>>>> linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-21  7:12             ` Qu Wenruo
@ 2017-12-26  8:17               ` Su Yue
  2017-12-26 10:28                 ` Qu Wenruo
  0 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-26  8:17 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 12/21/2017 03:12 PM, Qu Wenruo wrote:
> 
> 
> On 2017年12月21日 15:09, Su Yue wrote:
>>
>>
>> On 12/21/2017 02:51 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月20日 16:21, Su Yue wrote:
>>>>>
>>>>>
>>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>>>
>>>>>>
>>>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>>>> Introduce create_chunk_and_block_block_group() to allocate new chunk
>>>>>>> and corresponding block group.
>>>>>>>
>>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>>>> and records its start.
>>>>>>> Then it modifies all metadata block groups cached and full.
>>>>>>> Finally it marks the new block group uncached and unfree.
>>>>>>> In the next CoW, extents states will be updated automatically by
>>>>>>> cache_block_group().
>>>>>>>
>>>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>>> ---
>>>>>>>     cmds-check.c | 80
>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>     1 file changed, 80 insertions(+)
>>>>>>>
>>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>>>> --- a/cmds-check.c
>>>>>>> +++ b/cmds-check.c
>>>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>>>         return ret;
>>>>>>>     }
>>>>>>>     +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>>>> *fs_info,
>>>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>>>> +{
>>>>>>> +    struct btrfs_trans_handle *trans;
>>>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>>>> +        return -EINVAL;
>>>>>>> +
>>>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>>>> +    if (IS_ERR(trans)) {
>>>>>>> +        ret = PTR_ERR(trans);
>>>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>>>> +        return ret;
>>>>>>> +    }
>>>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>>>> +    if (ret) {
>>>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>>>> +        goto out;
>>>>>>> +    }
>>>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start, *nbytes);
>>>>>>> +    if (ret) {
>>>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>>>> +              *start, *nbytes, strerror(-ret));
>>>>>>> +        goto out;
>>>>>>> +    }
>>>>>>> +out:
>>>>>>> +    btrfs_commit_transaction(trans, root);
>>>>>>> +    return ret;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>>>> +{
>>>>>>> +    struct btrfs_block_group_cache *bg;
>>>>>>> +    u64 start;
>>>>>>> +    u64 nbytes;
>>>>>>> +    u64 alloc_profile;
>>>>>>> +    u64 flags;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>>>> +             fs_info->metadata_alloc_profile);
>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>>>> +
>>>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>>>> &nbytes);
>>>>>>
>>>>>> Why bother allocating a new chunk by yourself?
>>>>>>
>>>>>> Just mark all block groups full and that's all.
>>>>>>
>>>>>> Any later tree block allocation like btrfs_search_slot() with @cow = 1
>>>>>> will trigger chunk allocation automatically.
>>>>>>
>>>>>
>>>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>>>
>>>> Then fix it.
>>>>
>>>> It's not a normal behavior in this case.
>>>>
>>>> Thanks,
>>>> Qu
>>>
>>> And I think the fix to allow btrfs_reserve_extent() to allocate new
>>> chunk will solve a lot of strange BUG_ON().
>>>
>>> it just occurred to me that, a lot of use cases relies on the assumption
>>> that btrfs_reserve_extent() will try to allocate new chunks.
>>>
>>> Especially for case like convert, certain btrfs check --repair, some
>>> rescue tools.
>>>
>> Sorry for the previous wrong format mail.
>>
>> Yes, it has many dependency so I considered to do chunk allocation
>> manually in the patchset. If fix is not good enough, many funtions
>> of btrfs-progs will behave abnormal.
>> Since you ask it, I will go to fix it.
> 
> Manually allocation in advance has its advantage, like we can determine
> if there is enough space for new chunk instead of checking every return
> value with ENOSPC.
> 
> 
> However in current case, your metadata usage is limited to the new chunk
> only.
> If there extent tree has quite a lot of problem, and the chunk allocated
> is small (if using single profile and small fs), it can easily hit
> ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
> write.
> 
SAD. After I tried to implement above nice idea, infinite recursive
brings me back to the reality.

Here is the reason why btrfs_reserve_extent can not allocate chunk
by itself if ENOSPC hints:

btrfs_cow_block
...
   btrfs_reserve_extent
     btrfs_alloc_chunk
       btrfs_alloc_dev_extent
         btrfs_insert_empty_item
           ...
            btrfs_cow_block

Thanks,
Su

> So here, we still need to allow btrfs allocate new meta chunk, even we
> pre-allocate one meta chunk in advance.
> 
> Thanks,
> Qu
>>
>> Thanks,
>> Su
>>
>>> So this would be a quite nice start point for such fix.
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is called
>>>>> while doing CoW?> In progs, allocation of new chunk during CoW
>>>>> depends @root->ref_cows.
>>>>> However, @root->ref_cows should be set only if @root is root of a fs
>>>>> trees.
>>>>>
>>>>> Thanks,
>>>>> Su
>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>> +    if (!ret)
>>>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>>>> +    else
>>>>>>> +        goto err;
>>>>>>> +
>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>>>> +    /* Mark all metadata block groups cached and full in free
>>>>>>> space*/
>>>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>>>> +    if (ret)
>>>>>>> +        goto clear_bg_cache;
>>>>>>> +
>>>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>>>> +    if (!bg) {
>>>>>>> +        ret = -ENOENT;
>>>>>>> +        error("fail to look up block group %llu %llu", start,
>>>>>>> nbytes);
>>>>>>> +        goto clear_bg_cache;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    /* Clear block group cache just allocated */
>>>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>>>> +    if (ret)
>>>>>>> +        goto clear_bg_cache;
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +
>>>>>>> +clear_bg_cache:
>>>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>>>> +err:
>>>>>>> +    return ret;
>>>>>>> +}
>>>>>>> +
>>>>>>>     static int check_extent_refs(struct btrfs_root *root,
>>>>>>>                      struct cache_tree *extent_cache)
>>>>>>>     {
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>> linux-btrfs" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>
>>
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-26  8:17               ` Su Yue
@ 2017-12-26 10:28                 ` Qu Wenruo
  2017-12-27  1:11                   ` Su Yue
  0 siblings, 1 reply; 36+ messages in thread
From: Qu Wenruo @ 2017-12-26 10:28 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 8672 bytes --]



On 2017年12月26日 16:17, Su Yue wrote:
> 
> 
> On 12/21/2017 03:12 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年12月21日 15:09, Su Yue wrote:
>>>
>>>
>>> On 12/21/2017 02:51 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2017年12月20日 16:21, Su Yue wrote:
>>>>>>
>>>>>>
>>>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>>>>> Introduce create_chunk_and_block_block_group() to allocate new
>>>>>>>> chunk
>>>>>>>> and corresponding block group.
>>>>>>>>
>>>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>>>>> and records its start.
>>>>>>>> Then it modifies all metadata block groups cached and full.
>>>>>>>> Finally it marks the new block group uncached and unfree.
>>>>>>>> In the next CoW, extents states will be updated automatically by
>>>>>>>> cache_block_group().
>>>>>>>>
>>>>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>>>> ---
>>>>>>>>     cmds-check.c | 80
>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>     1 file changed, 80 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>>>>> --- a/cmds-check.c
>>>>>>>> +++ b/cmds-check.c
>>>>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>>>>         return ret;
>>>>>>>>     }
>>>>>>>>     +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>>>>> *fs_info,
>>>>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>>>>> +{
>>>>>>>> +    struct btrfs_trans_handle *trans;
>>>>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>>>>> +    int ret;
>>>>>>>> +
>>>>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>>>>> +        return -EINVAL;
>>>>>>>> +
>>>>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>>>>> +    if (IS_ERR(trans)) {
>>>>>>>> +        ret = PTR_ERR(trans);
>>>>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>>>>> +        return ret;
>>>>>>>> +    }
>>>>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>>>>> +    if (ret) {
>>>>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>>>>> +        goto out;
>>>>>>>> +    }
>>>>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start,
>>>>>>>> *nbytes);
>>>>>>>> +    if (ret) {
>>>>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>>>>> +              *start, *nbytes, strerror(-ret));
>>>>>>>> +        goto out;
>>>>>>>> +    }
>>>>>>>> +out:
>>>>>>>> +    btrfs_commit_transaction(trans, root);
>>>>>>>> +    return ret;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>>>>> +{
>>>>>>>> +    struct btrfs_block_group_cache *bg;
>>>>>>>> +    u64 start;
>>>>>>>> +    u64 nbytes;
>>>>>>>> +    u64 alloc_profile;
>>>>>>>> +    u64 flags;
>>>>>>>> +    int ret;
>>>>>>>> +
>>>>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>>>>> +             fs_info->metadata_alloc_profile);
>>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>>>>> +
>>>>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>>>>> &nbytes);
>>>>>>>
>>>>>>> Why bother allocating a new chunk by yourself?
>>>>>>>
>>>>>>> Just mark all block groups full and that's all.
>>>>>>>
>>>>>>> Any later tree block allocation like btrfs_search_slot() with
>>>>>>> @cow = 1
>>>>>>> will trigger chunk allocation automatically.
>>>>>>>
>>>>>>
>>>>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>>>>
>>>>> Then fix it.
>>>>>
>>>>> It's not a normal behavior in this case.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>
>>>> And I think the fix to allow btrfs_reserve_extent() to allocate new
>>>> chunk will solve a lot of strange BUG_ON().
>>>>
>>>> it just occurred to me that, a lot of use cases relies on the
>>>> assumption
>>>> that btrfs_reserve_extent() will try to allocate new chunks.
>>>>
>>>> Especially for case like convert, certain btrfs check --repair, some
>>>> rescue tools.
>>>>
>>> Sorry for the previous wrong format mail.
>>>
>>> Yes, it has many dependency so I considered to do chunk allocation
>>> manually in the patchset. If fix is not good enough, many funtions
>>> of btrfs-progs will behave abnormal.
>>> Since you ask it, I will go to fix it.
>>
>> Manually allocation in advance has its advantage, like we can determine
>> if there is enough space for new chunk instead of checking every return
>> value with ENOSPC.
>>
>>
>> However in current case, your metadata usage is limited to the new chunk
>> only.
>> If there extent tree has quite a lot of problem, and the chunk allocated
>> is small (if using single profile and small fs), it can easily hit
>> ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
>> write.
>>
> SAD. After I tried to implement above nice idea, infinite recursive
> brings me back to the reality.
> 
> Here is the reason why btrfs_reserve_extent can not allocate chunk
> by itself if ENOSPC hints:
> 
> btrfs_cow_block
> ...
>   btrfs_reserve_extent
>     btrfs_alloc_chunk
>       btrfs_alloc_dev_extent
>         btrfs_insert_empty_item
>           ...
>            btrfs_cow_block

The order is just wrong.

We should first check for the new chunk space, not doing the insert
right now.

I'll take over the work if you're OK with it.

Thanks,
Qu

> 
> Thanks,
> Su
> 
>> So here, we still need to allow btrfs allocate new meta chunk, even we
>> pre-allocate one meta chunk in advance.
>>
>> Thanks,
>> Qu
>>>
>>> Thanks,
>>> Su
>>>
>>>> So this would be a quite nice start point for such fix.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is
>>>>>> called
>>>>>> while doing CoW?> In progs, allocation of new chunk during CoW
>>>>>> depends @root->ref_cows.
>>>>>> However, @root->ref_cows should be set only if @root is root of a fs
>>>>>> trees.
>>>>>>
>>>>>> Thanks,
>>>>>> Su
>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>> +    if (!ret)
>>>>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>>>>> +    else
>>>>>>>> +        goto err;
>>>>>>>> +
>>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>>>>> +    /* Mark all metadata block groups cached and full in free
>>>>>>>> space*/
>>>>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>>>>> +    if (ret)
>>>>>>>> +        goto clear_bg_cache;
>>>>>>>> +
>>>>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>>>>> +    if (!bg) {
>>>>>>>> +        ret = -ENOENT;
>>>>>>>> +        error("fail to look up block group %llu %llu", start,
>>>>>>>> nbytes);
>>>>>>>> +        goto clear_bg_cache;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    /* Clear block group cache just allocated */
>>>>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>>>>> +    if (ret)
>>>>>>>> +        goto clear_bg_cache;
>>>>>>>> +
>>>>>>>> +    return 0;
>>>>>>>> +
>>>>>>>> +clear_bg_cache:
>>>>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>>>>> +err:
>>>>>>>> +    return ret;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static int check_extent_refs(struct btrfs_root *root,
>>>>>>>>                      struct cache_tree *extent_cache)
>>>>>>>>     {
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>> linux-btrfs" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>
>>>
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-26 10:28                 ` Qu Wenruo
@ 2017-12-27  1:11                   ` Su Yue
  2018-01-09  7:44                     ` Qu Wenruo
  0 siblings, 1 reply; 36+ messages in thread
From: Su Yue @ 2017-12-27  1:11 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 12/26/2017 06:28 PM, Qu Wenruo wrote:
> 
> 
> On 2017年12月26日 16:17, Su Yue wrote:
>>
>>
>> On 12/21/2017 03:12 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年12月21日 15:09, Su Yue wrote:
>>>>
>>>>
>>>> On 12/21/2017 02:51 PM, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2017年12月20日 16:37, Qu Wenruo wrote:
>>>>>>
>>>>>>
>>>>>> On 2017年12月20日 16:21, Su Yue wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/20/2017 01:41 PM, Qu Wenruo wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017年12月20日 12:57, Su Yue wrote:
>>>>>>>>> Introduce create_chunk_and_block_block_group() to allocate new
>>>>>>>>> chunk
>>>>>>>>> and corresponding block group.
>>>>>>>>>
>>>>>>>>> The new function force_cow_in_new_chunk() first allocates new chunk
>>>>>>>>> and records its start.
>>>>>>>>> Then it modifies all metadata block groups cached and full.
>>>>>>>>> Finally it marks the new block group uncached and unfree.
>>>>>>>>> In the next CoW, extents states will be updated automatically by
>>>>>>>>> cache_block_group().
>>>>>>>>>
>>>>>>>>> Suggested-by: Qu Wenruo <wqu@suse.com>
>>>>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>>>>> ---
>>>>>>>>>      cmds-check.c | 80
>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>      1 file changed, 80 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>>>>> index d98d4bda7357..311c8a9f45e8 100644
>>>>>>>>> --- a/cmds-check.c
>>>>>>>>> +++ b/cmds-check.c
>>>>>>>>> @@ -10911,6 +10911,86 @@ out:
>>>>>>>>>          return ret;
>>>>>>>>>      }
>>>>>>>>>      +static int create_chunk_and_block_group(struct btrfs_fs_info
>>>>>>>>> *fs_info,
>>>>>>>>> +                    u64 flags, u64 *start, u64 *nbytes)
>>>>>>>>> +{
>>>>>>>>> +    struct btrfs_trans_handle *trans;
>>>>>>>>> +    struct btrfs_root *root = fs_info->extent_root;
>>>>>>>>> +    int ret;
>>>>>>>>> +
>>>>>>>>> +    if ((flags & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0)
>>>>>>>>> +        return -EINVAL;
>>>>>>>>> +
>>>>>>>>> +    trans = btrfs_start_transaction(root, 1);
>>>>>>>>> +    if (IS_ERR(trans)) {
>>>>>>>>> +        ret = PTR_ERR(trans);
>>>>>>>>> +        error("error starting transaction %s", strerror(-ret));
>>>>>>>>> +        return ret;
>>>>>>>>> +    }
>>>>>>>>> +    ret = btrfs_alloc_chunk(trans, fs_info, start, nbytes, flags);
>>>>>>>>> +    if (ret) {
>>>>>>>>> +        error("fail to allocate new chunk %s", strerror(-ret));
>>>>>>>>> +        goto out;
>>>>>>>>> +    }
>>>>>>>>> +    ret = btrfs_make_block_group(trans, fs_info, 0, flags,
>>>>>>>>> +                 BTRFS_FIRST_CHUNK_TREE_OBJECTID, *start,
>>>>>>>>> *nbytes);
>>>>>>>>> +    if (ret) {
>>>>>>>>> +        error("fail to make block group for chunk %llu %llu %s",
>>>>>>>>> +              *start, *nbytes, strerror(-ret));
>>>>>>>>> +        goto out;
>>>>>>>>> +    }
>>>>>>>>> +out:
>>>>>>>>> +    btrfs_commit_transaction(trans, root);
>>>>>>>>> +    return ret;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static int force_cow_in_new_chunk(struct btrfs_fs_info *fs_info)
>>>>>>>>> +{
>>>>>>>>> +    struct btrfs_block_group_cache *bg;
>>>>>>>>> +    u64 start;
>>>>>>>>> +    u64 nbytes;
>>>>>>>>> +    u64 alloc_profile;
>>>>>>>>> +    u64 flags;
>>>>>>>>> +    int ret;
>>>>>>>>> +
>>>>>>>>> +    alloc_profile = (fs_info->avail_metadata_alloc_bits &
>>>>>>>>> +             fs_info->metadata_alloc_profile);
>>>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA | alloc_profile;
>>>>>>>>> +    if (btrfs_fs_incompat(fs_info, MIXED_GROUPS))
>>>>>>>>> +        flags |= BTRFS_BLOCK_GROUP_DATA;
>>>>>>>>> +
>>>>>>>>> +    ret = create_chunk_and_block_group(fs_info, flags, &start,
>>>>>>>>> &nbytes);
>>>>>>>>
>>>>>>>> Why bother allocating a new chunk by yourself?
>>>>>>>>
>>>>>>>> Just mark all block groups full and that's all.
>>>>>>>>
>>>>>>>> Any later tree block allocation like btrfs_search_slot() with
>>>>>>>> @cow = 1
>>>>>>>> will trigger chunk allocation automatically.
>>>>>>>>
>>>>>>>
>>>>>>> Tried to let it happen but BUG_ON with -ENOSPC.
>>>>>>
>>>>>> Then fix it.
>>>>>>
>>>>>> It's not a normal behavior in this case.
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>
>>>>> And I think the fix to allow btrfs_reserve_extent() to allocate new
>>>>> chunk will solve a lot of strange BUG_ON().
>>>>>
>>>>> it just occurred to me that, a lot of use cases relies on the
>>>>> assumption
>>>>> that btrfs_reserve_extent() will try to allocate new chunks.
>>>>>
>>>>> Especially for case like convert, certain btrfs check --repair, some
>>>>> rescue tools.
>>>>>
>>>> Sorry for the previous wrong format mail.
>>>>
>>>> Yes, it has many dependency so I considered to do chunk allocation
>>>> manually in the patchset. If fix is not good enough, many funtions
>>>> of btrfs-progs will behave abnormal.
>>>> Since you ask it, I will go to fix it.
>>>
>>> Manually allocation in advance has its advantage, like we can determine
>>> if there is enough space for new chunk instead of checking every return
>>> value with ENOSPC.
>>>
>>>
>>> However in current case, your metadata usage is limited to the new chunk
>>> only.
>>> If there extent tree has quite a lot of problem, and the chunk allocated
>>> is small (if using single profile and small fs), it can easily hit
>>> ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
>>> write.
>>>
>> SAD. After I tried to implement above nice idea, infinite recursive
>> brings me back to the reality.
>>
>> Here is the reason why btrfs_reserve_extent can not allocate chunk
>> by itself if ENOSPC hints:
>>
>> btrfs_cow_block
>> ...
>>    btrfs_reserve_extent
>>      btrfs_alloc_chunk
>>        btrfs_alloc_dev_extent
>>          btrfs_insert_empty_item
>>            ...
>>             btrfs_cow_block
> 
> The order is just wrong.
> 
> We should first check for the new chunk space, not doing the insert
> right now.
> 
> I'll take over the work if you're OK with it.
> 
OK. Thanks a lot.

Thanks,
Su

> Thanks,
> Qu
> 
>>
>> Thanks,
>> Su
>>
>>> So here, we still need to allow btrfs allocate new meta chunk, even we
>>> pre-allocate one meta chunk in advance.
>>>
>>> Thanks,
>>> Qu
>>>>
>>>> Thanks,
>>>> Su
>>>>
>>>>> So this would be a quite nice start point for such fix.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>>> Do you mean do_chunk_alloc() in btrfs_reserve_extent() which is
>>>>>>> called
>>>>>>> while doing CoW?> In progs, allocation of new chunk during CoW
>>>>>>> depends @root->ref_cows.
>>>>>>> However, @root->ref_cows should be set only if @root is root of a fs
>>>>>>> trees.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Su
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qu
>>>>>>>>
>>>>>>>>> +    if (!ret)
>>>>>>>>> +        printf("Create new chunk %llu %llu\n", start, nbytes);
>>>>>>>>> +    else
>>>>>>>>> +        goto err;
>>>>>>>>> +
>>>>>>>>> +    flags = BTRFS_BLOCK_GROUP_METADATA;
>>>>>>>>> +    /* Mark all metadata block groups cached and full in free
>>>>>>>>> space*/
>>>>>>>>> +    ret = modify_block_groups_cache(fs_info, flags, 1);
>>>>>>>>> +    if (ret)
>>>>>>>>> +        goto clear_bg_cache;
>>>>>>>>> +
>>>>>>>>> +    bg = btrfs_lookup_block_group(fs_info, start);
>>>>>>>>> +    if (!bg) {
>>>>>>>>> +        ret = -ENOENT;
>>>>>>>>> +        error("fail to look up block group %llu %llu", start,
>>>>>>>>> nbytes);
>>>>>>>>> +        goto clear_bg_cache;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Clear block group cache just allocated */
>>>>>>>>> +    ret = modify_block_group_cache(fs_info, bg, 0);
>>>>>>>>> +    if (ret)
>>>>>>>>> +        goto clear_bg_cache;
>>>>>>>>> +
>>>>>>>>> +    return 0;
>>>>>>>>> +
>>>>>>>>> +clear_bg_cache:
>>>>>>>>> +    modify_block_groups_cache(fs_info, flags, 0);
>>>>>>>>> +err:
>>>>>>>>> +    return ret;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>      static int check_extent_refs(struct btrfs_root *root,
>>>>>>>>>                       struct cache_tree *extent_cache)
>>>>>>>>>      {
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>>>> linux-btrfs" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2()
  2017-12-20  4:57 ` [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2() Su Yue
@ 2017-12-29 11:17   ` Nikolay Borisov
  2018-01-02  1:44     ` Su Yue
  2018-01-02  1:50     ` Su Yue
  0 siblings, 2 replies; 36+ messages in thread
From: Nikolay Borisov @ 2017-12-29 11:17 UTC (permalink / raw)
  To: Su Yue, linux-btrfs; +Cc: quwenruo.btrfs



On 20.12.2017 06:57, Su Yue wrote:
> In lowmem mode with '--repair', check_chunks_and_extents_v2()
> will fix accounting in block groups and clear the error
> bit BG_ACCOUNTING_ERROR.
> However, return value of check_btrfs_root() is 0 either 1 instead of
> error bits.
> 
> If extent tree is on error, lowmem repair always prints error and
> returns nonzero value even the filesystem is fine after repair.
> 
> So let @err contains bits after walk_down_tree_v2().
> 
> Introduce FATAL_ERROR for lowmem mode to represents negative return
> values since negative and positive can't not be mixed in bits operations.
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index 309ac9553b3a..ebede26cef01 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -134,6 +134,7 @@ struct data_backref {
>  #define DIR_INDEX_MISMATCH      (1<<19) /* INODE_INDEX found but not match */
>  #define DIR_COUNT_AGAIN         (1<<20) /* DIR isize should be recalculated */
>  #define BG_ACCOUNTING_ERROR     (1<<21) /* Block group accounting error */
> +#define FATAL_ERROR             (1<<22) /* fatal bit for errno */
>  
>  static inline struct data_backref* to_data_backref(struct extent_backref *back)
>  {
> @@ -6556,7 +6557,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
>   *                otherwise means check fs tree(s) items relationship and
>   *		  @root MUST be a fs tree root.
>   * Returns 0      represents OK.
> - * Returns not 0  represents error.
> + * Returns > 0    represents error bits.
>   */

What about the code in 'if (!check_all)' branch, check_fs_first_inode
can return a negative value, hence check_btrfs_root can return a
negative value. A negative value can also be returned from
btrfs_search_slot.

Clearly this patch needs to be thought out better

>  static int check_btrfs_root(struct btrfs_trans_handle *trans,
>  			    struct btrfs_root *root, unsigned int ext_ref,
> @@ -6607,12 +6608,12 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
>  	while (1) {
>  		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
>  					ext_ref, check_all);
> -
> -		err |= !!ret;
> +		if (ret > 0)
> +			err |= ret;
>  
>  		/* if ret is negative, walk shall stop */
>  		if (ret < 0) {
> -			ret = err;
> +			ret = err | FATAL_ERROR;
>  			break;
>  		}
>  
> @@ -6636,12 +6637,12 @@ out:
>   * @ext_ref:	the EXTENDED_IREF feature
>   *
>   * Return 0 if no error found.
> - * Return <0 for error.
> + * Return not 0 for error.
>   */
>  static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
>  {
>  	reset_cached_block_groups(root->fs_info);
> -	return check_btrfs_root(NULL, root, ext_ref, 0);
> +	return !!check_btrfs_root(NULL, root, ext_ref, 0);
>  }

You make the function effectively boolean, make this explicit by
changing its return value to bool. Also the name and the boolean return
makes the function REALLY confusing. I.e when should we return true or
false? As it stands it return "false" on success and "true" otherwise,
this is a mess...


>  
>  /*
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2()
  2017-12-29 11:17   ` Nikolay Borisov
@ 2018-01-02  1:44     ` Su Yue
  2018-01-02  1:50     ` Su Yue
  1 sibling, 0 replies; 36+ messages in thread
From: Su Yue @ 2018-01-02  1:44 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: quwenruo.btrfs



On 12/29/2017 07:17 PM, Nikolay Borisov wrote:
> 
> 
> On 20.12.2017 06:57, Su Yue wrote:
>> In lowmem mode with '--repair', check_chunks_and_extents_v2()
>> will fix accounting in block groups and clear the error
>> bit BG_ACCOUNTING_ERROR.
>> However, return value of check_btrfs_root() is 0 either 1 instead of
>> error bits.
>>
>> If extent tree is on error, lowmem repair always prints error and
>> returns nonzero value even the filesystem is fine after repair.
>>
>> So let @err contains bits after walk_down_tree_v2().
>>
>> Introduce FATAL_ERROR for lowmem mode to represents negative return
>> values since negative and positive can't not be mixed in bits operations.
>>
>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>> ---
>>   cmds-check.c | 13 +++++++------
>>   1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index 309ac9553b3a..ebede26cef01 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -134,6 +134,7 @@ struct data_backref {
>>   #define DIR_INDEX_MISMATCH      (1<<19) /* INODE_INDEX found but not match */
>>   #define DIR_COUNT_AGAIN         (1<<20) /* DIR isize should be recalculated */
>>   #define BG_ACCOUNTING_ERROR     (1<<21) /* Block group accounting error */
>> +#define FATAL_ERROR             (1<<22) /* fatal bit for errno */
>>   
>>   static inline struct data_backref* to_data_backref(struct extent_backref *back)
>>   {
>> @@ -6556,7 +6557,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
>>    *                otherwise means check fs tree(s) items relationship and
>>    *		  @root MUST be a fs tree root.
>>    * Returns 0      represents OK.
>> - * Returns not 0  represents error.
>> + * Returns > 0    represents error bits.
>>    */
> 
> What about the code in 'if (!check_all)' branch, check_fs_first_inode
> can return a negative value, hence check_btrfs_root can return a
> negative value. A negative value can also be returned from
> btrfs_search_slot.
> 
> Clearly this patch needs to be thought out better
> 
>>   static int check_btrfs_root(struct btrfs_trans_handle *trans,
>>   			    struct btrfs_root *root, unsigned int ext_ref,
>> @@ -6607,12 +6608,12 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
>>   	while (1) {
>>   		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
>>   					ext_ref, check_all);
>> -
>> -		err |= !!ret;
>> +		if (ret > 0)
>> +			err |= ret;
>>   
>>   		/* if ret is negative, walk shall stop */
>>   		if (ret < 0) {
>> -			ret = err;
>> +			ret = err | FATAL_ERROR;
>>   			break;
>>   		}
>>   
>> @@ -6636,12 +6637,12 @@ out:
>>    * @ext_ref:	the EXTENDED_IREF feature
>>    *
>>    * Return 0 if no error found.
>> - * Return <0 for error.
>> + * Return not 0 for error.
>>    */
>>   static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
>>   {
>>   	reset_cached_block_groups(root->fs_info);
>> -	return check_btrfs_root(NULL, root, ext_ref, 0);
>> +	return !!check_btrfs_root(NULL, root, ext_ref, 0);
>>   }
> 
> You make the function effectively boolean, make this explicit by
> changing its return value to bool. Also the name and the boolean return
> makes the function REALLY confusing. I.e when should we return true or

In the past and present, check_fs_root_v2() always returns boolean.
So the old annotation "Return <0 for error." is wrong.

Here check_btrfs_root() returns error bits instead of boolean, so I
just make check_fs_root_v2() return boolean explictly.

> false? As it stands it return "false" on success and "true" otherwise,
> this is a mess...
> 
Although it returns 1 or 0, IMHO, let it return inteager
is good enough.

Thanks,
Su

> 
>>   
>>   /*
>>
> 
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2()
  2017-12-29 11:17   ` Nikolay Borisov
  2018-01-02  1:44     ` Su Yue
@ 2018-01-02  1:50     ` Su Yue
  1 sibling, 0 replies; 36+ messages in thread
From: Su Yue @ 2018-01-02  1:50 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: quwenruo.btrfs



On 12/29/2017 07:17 PM, Nikolay Borisov wrote:
> 
> 
> On 20.12.2017 06:57, Su Yue wrote:
>> In lowmem mode with '--repair', check_chunks_and_extents_v2()
>> will fix accounting in block groups and clear the error
>> bit BG_ACCOUNTING_ERROR.
>> However, return value of check_btrfs_root() is 0 either 1 instead of
>> error bits.
>>
>> If extent tree is on error, lowmem repair always prints error and
>> returns nonzero value even the filesystem is fine after repair.
>>
>> So let @err contains bits after walk_down_tree_v2().
>>
>> Introduce FATAL_ERROR for lowmem mode to represents negative return
>> values since negative and positive can't not be mixed in bits operations.
>>
>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>> ---
>>   cmds-check.c | 13 +++++++------
>>   1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index 309ac9553b3a..ebede26cef01 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -134,6 +134,7 @@ struct data_backref {
>>   #define DIR_INDEX_MISMATCH      (1<<19) /* INODE_INDEX found but not match */
>>   #define DIR_COUNT_AGAIN         (1<<20) /* DIR isize should be recalculated */
>>   #define BG_ACCOUNTING_ERROR     (1<<21) /* Block group accounting error */
>> +#define FATAL_ERROR             (1<<22) /* fatal bit for errno */
>>   
>>   static inline struct data_backref* to_data_backref(struct extent_backref *back)
>>   {
>> @@ -6556,7 +6557,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
>>    *                otherwise means check fs tree(s) items relationship and
>>    *		  @root MUST be a fs tree root.
>>    * Returns 0      represents OK.
>> - * Returns not 0  represents error.
>> + * Returns > 0    represents error bits.
>>    */
> 
> What about the code in 'if (!check_all)' branch, check_fs_first_inode
> can return a negative value, hence check_btrfs_root can return a
> negative value. A negative value can also be returned from
> btrfs_search_slot.
> 
> Clearly this patch needs to be thought out better
> 
OK, I will update it.
Thanks for review.
>>   static int check_btrfs_root(struct btrfs_trans_handle *trans,
>>   			    struct btrfs_root *root, unsigned int ext_ref,
>> @@ -6607,12 +6608,12 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
>>   	while (1) {
>>   		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
>>   					ext_ref, check_all);
>> -
>> -		err |= !!ret;
>> +		if (ret > 0)
>> +			err |= ret;
>>   
>>   		/* if ret is negative, walk shall stop */
>>   		if (ret < 0) {
>> -			ret = err;
>> +			ret = err | FATAL_ERROR;
>>   			break;
>>   		}
>>   
>> @@ -6636,12 +6637,12 @@ out:
>>    * @ext_ref:	the EXTENDED_IREF feature
>>    *
>>    * Return 0 if no error found.
>> - * Return <0 for error.
>> + * Return not 0 for error.
>>    */
>>   static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
>>   {
>>   	reset_cached_block_groups(root->fs_info);
>> -	return check_btrfs_root(NULL, root, ext_ref, 0);
>> +	return !!check_btrfs_root(NULL, root, ext_ref, 0);
>>   }
> 
> You make the function effectively boolean, make this explicit by
> changing its return value to bool. Also the name and the boolean return
> makes the function REALLY confusing. I.e when should we return true or
> false? As it stands it return "false" on success and "true" otherwise,
> this is a mess...
> 
> 
>>   
>>   /*
>>
> 
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk()
  2017-12-27  1:11                   ` Su Yue
@ 2018-01-09  7:44                     ` Qu Wenruo
  0 siblings, 0 replies; 36+ messages in thread
From: Qu Wenruo @ 2018-01-09  7:44 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2375 bytes --]



On 2017年12月27日 09:11, Su Yue wrote:
> 
> 
[snip]
>>>>
>>>> Manually allocation in advance has its advantage, like we can determine
>>>> if there is enough space for new chunk instead of checking every return
>>>> value with ENOSPC.
>>>>
>>>>
>>>> However in current case, your metadata usage is limited to the new
>>>> chunk
>>>> only.
>>>> If there extent tree has quite a lot of problem, and the chunk
>>>> allocated
>>>> is small (if using single profile and small fs), it can easily hit
>>>> ENOSPC again, since btrfs doesn't allocate new chunk for later metadata
>>>> write.
>>>>
>>> SAD. After I tried to implement above nice idea, infinite recursive
>>> brings me back to the reality.
>>>
>>> Here is the reason why btrfs_reserve_extent can not allocate chunk
>>> by itself if ENOSPC hints:
>>>
>>> btrfs_cow_block
>>> ...
>>>    btrfs_reserve_extent
>>>      btrfs_alloc_chunk
>>>        btrfs_alloc_dev_extent
>>>          btrfs_insert_empty_item
>>>            ...
>>>             btrfs_cow_block
>>
>> The order is just wrong.
>>
>> We should first check for the new chunk space, not doing the insert
>> right now.
>>
>> I'll take over the work if you're OK with it.
>>
> OK. Thanks a lot.
> 
> Thanks,
> Su

Well, even after my preparation patches, the situation is still much
complex than my expectation.

Yes, We could delay chunk/extent tree modification so newer chunk
allocation can be used for extent tree CoW.


But a lot of other infrastructure can't handle it well
The main problem is, we break a lot of ctree.c assumption.

For case like a normal tree CoW:

btrfs_search_slot(root=extent_root)
|- btrfs_cow_block()
   |- btrfs_alloc_tree_block()
      |- btrfs_reserve_extent()
         |- btrfs_alloc_chunk()
            |- insert block group item
               *CoW* extent tree

In above case, extent_root changed even before we cow the extent root.
And will cause later (b == root->node) check fail and trigger a SEGV.

To really address it, we may need a large code rework with dynamic chunk
allocation in mind, which seems to expensive for a rarely used case.

So in short, I'm totally wrong and too optimistic about the feature.
Please use the original way, which is to alloc a new meta chunk manually.

Thanks,
Qu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2018-01-09  7:44 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-20  4:57 [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Su Yue
2017-12-20  4:57 ` [PATCH v2 01/17] btrfs-progs: lowmem check: release path in repair_extent_data_item() Su Yue
2017-12-20  4:57 ` [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2() Su Yue
2017-12-29 11:17   ` Nikolay Borisov
2018-01-02  1:44     ` Su Yue
2018-01-02  1:50     ` Su Yue
2017-12-20  4:57 ` [PATCH v2 03/17] btrfs-progs: lowmem check: assign @parent early in repair_extent_data_item() Su Yue
2017-12-20  4:57 ` [PATCH v2 04/17] btrfs-progs: lowmem check: exclude extents of metadata blocks Su Yue
2017-12-20  4:57 ` [PATCH v2 05/17] btrfs-progs: lowmem check: introduce modify_block_groups_cache() Su Yue
2017-12-20  5:38   ` Qu Wenruo
2017-12-20  4:57 ` [PATCH v2 06/17] btrfs-progs: lowmem check: introduce force_cow_in_new_chunk() Su Yue
2017-12-20  5:41   ` Qu Wenruo
2017-12-20  8:21     ` Su Yue
2017-12-20  8:37       ` Qu Wenruo
2017-12-21  6:51         ` Qu Wenruo
2017-12-21  7:06           ` Su Yue
2017-12-21  7:09           ` Su Yue
2017-12-21  7:12             ` Qu Wenruo
2017-12-26  8:17               ` Su Yue
2017-12-26 10:28                 ` Qu Wenruo
2017-12-27  1:11                   ` Su Yue
2018-01-09  7:44                     ` Qu Wenruo
2017-12-20  4:57 ` [PATCH v2 07/17] btrfs-progs: lowmem check: introduce try_avoid_extents_overwrite() Su Yue
2017-12-20  5:46   ` Qu Wenruo
2017-12-20  4:57 ` [PATCH v2 08/17] btrfs-progs: lowmem check: exclude extents if init-extent-tree in lowmem Su Yue
2017-12-20  4:57 ` [PATCH v2 09/17] btrfs-progs: lowmem check: start to remove parameters @trans " Su Yue
2017-12-20  5:51   ` Qu Wenruo
2017-12-20  4:57 ` [PATCH v2 10/17] btrfs-progs: lowmem check: remove parameter @trans of delete_extent_item() Su Yue
2017-12-20  4:57 ` [PATCH v2 11/17] btrfs-progs: lowmem check: remove parameter @trans of repair_chunk_item() Su Yue
2017-12-20  4:57 ` [PATCH v2 12/17] btrfs-progs: lowmem check: remove parameter @trans of repair_extent_item() Su Yue
2017-12-20  4:57 ` [PATCH v2 13/17] btrfs-progs: lowmem check: remove parameter @trans of check_leaf_items() Su Yue
2017-12-20  4:57 ` [PATCH v2 14/17] btrfs-progs: lowmem check: remove parameter @trans of repair_tree_back_ref() Su Yue
2017-12-20  4:57 ` [PATCH v2 15/17] btrfs-progs: lowmem check: remove parameter @trans of check_btrfs_root() Su Yue
2017-12-20  4:57 ` [PATCH v2 16/17] btrfs-progs: lowmem check: introduce repair_block_accounting() Su Yue
2017-12-20  4:57 ` [PATCH v2 17/17] btrfs-progs: lowmem check: end of removing parameters @trans in lowmem Su Yue
2017-12-20  5:59 ` [PATCH v2 00/17] btrfs-progs: lowmem check: avoid extents overwrite Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.