All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs-progs: check: fix up bugs of lowmem mode
@ 2017-11-27  3:13 Su Yue
  2017-11-27  3:13 ` [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item() Su Yue
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Su Yue @ 2017-11-27  3:13 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs

The patchset can be fetched from my github:
https://github.com/Damenly/btrfs-progs/tree/lowmem
which is based on Qu's branch on
https://github.com/adam900710/btrfs-progs/tree/lowmem_fix

The first two patches fix minor bugs:
1) Extent buffer leaks while repairing data extent items in lowmem mode.
2) Repair in lowmem mode always reports error about block group accounting.

The third patch excludes extents of all treeblocks and extent items to
avoid wrong extent overwrite. Then repair in lowmem mode should work
again.

Su Yue (3):
  btrfs-progs: check: release path in repair_extent_data_item()
  btrfs-progs: check: record returned errors after walk_down_tree_v2()
  btrfs-progs: check: exclude extents of tree blocks and extent items

 cmds-check.c | 126 +++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 114 insertions(+), 12 deletions(-)

-- 
2.15.0




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item()
  2017-11-27  3:13 [PATCH 0/3] btrfs-progs: check: fix up bugs of lowmem mode Su Yue
@ 2017-11-27  3:13 ` Su Yue
  2017-11-27 10:21   ` Qu Wenruo
  2017-11-27  3:13 ` [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2() Su Yue
  2017-11-27  3:13 ` [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items Su Yue
  2 siblings, 1 reply; 12+ messages in thread
From: Su Yue @ 2017-11-27  3:13 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs

In repair_extent_data_item(), path is not be released if some
errors occurs which causes extent buffer leak.

So release path in end of the function.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index e746ee7b281d..3a72f8e3877d 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11974,7 +11974,6 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 		btrfs_mark_buffer_dirty(eb);
 		ret = btrfs_update_block_group(trans, extent_root, disk_bytenr,
 					       num_bytes, 1, 0);
-		btrfs_release_path(&path);
 	}
 
 	if (nrefs->full_backref[0])
@@ -11998,6 +11997,7 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
 
 	err &= ~BACKREF_MISSING;
 out:
+	btrfs_release_path(&path);
 	if (ret)
 		error("can't repair root %llu extent data item[%llu %llu]",
 		      root->objectid, disk_bytenr, num_bytes);
-- 
2.15.0




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2()
  2017-11-27  3:13 [PATCH 0/3] btrfs-progs: check: fix up bugs of lowmem mode Su Yue
  2017-11-27  3:13 ` [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item() Su Yue
@ 2017-11-27  3:13 ` Su Yue
  2017-11-27 10:32   ` Qu Wenruo
  2017-11-27  3:13 ` [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items Su Yue
  2 siblings, 1 reply; 12+ messages in thread
From: Su Yue @ 2017-11-27  3:13 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs

In lowmem mode with '--repair', check_chunks_and_extents_v2()
will fix accounting in block groups and clear the error
bit BG_ACCOUNTING_ERROR.
However, return value of check_btrfs_root() is 0 either 1 instead of
error bits.

If extent tree is on error, lowmem repair always prints error and
returns nonzero value even the filesystem is fine after repair.

So let @err contains bits after walk_down_tree_v2().

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 3a72f8e3877d..c7b570bef9c3 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -6607,7 +6607,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
 		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
 					ext_ref, check_all);
 
-		err |= !!ret;
+		err |= ret;
 
 		/* if ret is negative, walk shall stop */
 		if (ret < 0) {
-- 
2.15.0




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-27  3:13 [PATCH 0/3] btrfs-progs: check: fix up bugs of lowmem mode Su Yue
  2017-11-27  3:13 ` [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item() Su Yue
  2017-11-27  3:13 ` [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2() Su Yue
@ 2017-11-27  3:13 ` Su Yue
  2017-11-27 10:37   ` Qu Wenruo
  2 siblings, 1 reply; 12+ messages in thread
From: Su Yue @ 2017-11-27  3:13 UTC (permalink / raw)
  To: linux-btrfs; +Cc: quwenruo.btrfs

Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
screws up extent allocator") removes pin_metadata_blocks() from
lowmem repair.
So we have to find another way to exclude extents which should be
occupied by tree blocks.

Modify pin_down_tree_blocks() only for code reuse.
So behavior of pin_metadata_blocks() which works with option
'init-extent-tree' is not influenced.

Introduce exclude_blocks_and_extent_items() to mark extents of all tree
blocks dirty in fs_info->excluded_extents. It also calls
exclude_extent_items() to exclude extent_items in extent tree.

Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
---
 cmds-check.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 112 insertions(+), 10 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index c7b570bef9c3..f39285c5a3c2 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -13351,6 +13351,7 @@ out:
 	return err;
 }
 
+static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info);
 /*
  * Low memory usage version check_chunks_and_extents.
  */
@@ -13363,12 +13364,22 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
 	struct btrfs_root *root1;
 	struct btrfs_root *root;
 	struct btrfs_root *cur_root;
+	struct extent_io_tree excluded_extents;
 	int err = 0;
 	int ret;
 
 	root = fs_info->fs_root;
 
 	if (repair) {
+		extent_io_tree_init(&excluded_extents);
+		fs_info->excluded_extents = &excluded_extents;
+		ret = exclude_blocks_and_extent_items(fs_info);
+		if (ret) {
+			error("failed to exclude tree blocks and extent items");
+			return ret;
+		}
+		reset_cached_block_groups(fs_info);
+
 		trans = btrfs_start_transaction(fs_info->extent_root, 1);
 		if (IS_ERR(trans)) {
 			error("failed to start transaction before check");
@@ -13437,6 +13448,8 @@ out:
 			err |= ret;
 		else
 			err &= ~BG_ACCOUNTING_ERROR;
+		extent_io_tree_cleanup(&excluded_extents);
+		fs_info->excluded_extents = NULL;
 	}
 
 	if (trans)
@@ -13534,40 +13547,106 @@ init:
 	return 0;
 }
 
-static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
-				struct extent_buffer *eb, int tree_root)
+static int exclude_extent_items(struct btrfs_fs_info *fs_info,
+				struct extent_io_tree *tree)
+{
+	struct btrfs_root *root = fs_info->extent_root;
+	struct btrfs_key key;
+	struct btrfs_path path;
+	struct extent_buffer *eb;
+	int slot;
+	int ret;
+	u64 start;
+	u64 end;
+
+	ASSERT(tree);
+	btrfs_init_path(&path);
+	key.objectid = 0;
+	key.type = 0;
+	key.offset = 0;
+
+	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	while (1) {
+		eb = path.nodes[0];
+		slot = path.slots[0];
+		btrfs_item_key_to_cpu(eb, &key, slot);
+		if (key.type != BTRFS_EXTENT_ITEM_KEY &&
+		    key.type != BTRFS_METADATA_ITEM_KEY)
+			goto next;
+		start = key.objectid;
+		if (key.type == BTRFS_EXTENT_ITEM_KEY)
+			end = start + key.offset;
+		else
+			end = start + fs_info->nodesize;
+
+		set_extent_dirty(tree, start, end - 1);
+next:
+		ret = btrfs_next_item(root, &path);
+		if (ret > 0) {
+			ret = 0;
+			goto out;
+		}
+		if (ret < 0)
+			goto out;
+	}
+out:
+	if (ret)
+		error("failed to exclude extent items");
+	btrfs_release_path(&path);
+	return ret;
+}
+
+static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root,
+				int pin)
 {
 	struct extent_buffer *tmp;
 	struct btrfs_root_item *ri;
 	struct btrfs_key key;
+	struct extent_io_tree *tree;
 	u64 bytenr;
 	int level = btrfs_header_level(eb);
 	int nritems;
 	int ret;
 	int i;
+	u64 end = eb->start + eb->len;
 
+	if (pin)
+		tree = &fs_info->pinned_extents;
+	else
+		tree = fs_info->excluded_extents;
 	/*
-	 * If we have pinned this block before, don't pin it again.
+	 * If we have pinned/excluded this block before, don't do it again.
 	 * This can not only avoid forever loop with broken filesystem
 	 * but also give us some speedups.
 	 */
-	if (test_range_bit(&fs_info->pinned_extents, eb->start,
-			   eb->start + eb->len - 1, EXTENT_DIRTY, 0))
+	if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
 		return 0;
 
-	btrfs_pin_extent(fs_info, eb->start, eb->len);
+	if (pin)
+		btrfs_pin_extent(fs_info, eb->start, eb->len);
+	else
+		set_extent_dirty(tree, eb->start, end - 1);
 
 	nritems = btrfs_header_nritems(eb);
 	for (i = 0; i < nritems; i++) {
 		if (level == 0) {
+			bool is_extent_root;
 			btrfs_item_key_to_cpu(eb, &key, i);
 			if (key.type != BTRFS_ROOT_ITEM_KEY)
 				continue;
 			/* Skip the extent root and reloc roots */
-			if (key.objectid == BTRFS_EXTENT_TREE_OBJECTID ||
-			    key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
+			if (key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
 			    key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
 				continue;
+			is_extent_root =
+				key.objectid == BTRFS_EXTENT_TREE_OBJECTID;
+			/* If pin, skip the extent root */
+			if (pin && is_extent_root)
+				continue;
 			ri = btrfs_item_ptr(eb, i, struct btrfs_root_item);
 			bytenr = btrfs_disk_root_bytenr(eb, ri);
 
@@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 				fprintf(stderr, "Error reading root block\n");
 				return -EIO;
 			}
-			ret = pin_down_tree_blocks(fs_info, tmp, 0);
+			ret = traverse_tree_blocks(fs_info, tmp, 0, pin);
 			free_extent_buffer(tmp);
 			if (ret)
 				return ret;
@@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 				fprintf(stderr, "Error reading tree block\n");
 				return -EIO;
 			}
-			ret = pin_down_tree_blocks(fs_info, tmp, tree_root);
+			ret = traverse_tree_blocks(fs_info, tmp, tree_root,
+						   pin);
 			free_extent_buffer(tmp);
 			if (ret)
 				return ret;
@@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root)
+{
+	return traverse_tree_blocks(fs_info, eb, tree_root, 1);
+}
+
 static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
 {
 	int ret;
@@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
 	return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
 }
 
+static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
+				struct extent_buffer *eb, int tree_root)
+{
+	return traverse_tree_blocks(fs_info, fs_info->tree_root->node, 1, 0);
+}
+
+static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info)
+{
+	int ret;
+
+	ret = exclude_extent_items(fs_info, fs_info->excluded_extents);
+	if (ret)
+		return ret;
+	return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
+}
+
 static int reset_block_groups(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_block_group_cache *cache;
-- 
2.15.0




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item()
  2017-11-27  3:13 ` [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item() Su Yue
@ 2017-11-27 10:21   ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2017-11-27 10:21 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1789 bytes --]



On 2017年11月27日 11:13, Su Yue wrote:
> In repair_extent_data_item(), path is not be released if some
> errors occurs which causes extent buffer leak.
> 
> So release path in end of the function.
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index e746ee7b281d..3a72f8e3877d 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -11974,7 +11974,6 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
>  		btrfs_mark_buffer_dirty(eb);
>  		ret = btrfs_update_block_group(trans, extent_root, disk_bytenr,
>  					       num_bytes, 1, 0);
> -		btrfs_release_path(&path);

Here btrfs_release_path() should not be removed.

Lines below this, btrfs_inc_extent_ref() is called.
In which we will alloc a new path and modify extent tree.

Thanks to the fact that there is no tree locking implemented in
btrfs-progs, it won't cause dead lock, but it's a good habit to make
path allocation deadlock free.

So please keep the path release here.

(And if you find some code holding conflicting path on the same tree,
feel free to fix them, even it won't cause real problem in btrfs right now)

>  	}
>  
>  	if (nrefs->full_backref[0])
> @@ -11998,6 +11997,7 @@ static int repair_extent_data_item(struct btrfs_trans_handle *trans,
>  
>  	err &= ~BACKREF_MISSING;
>  out:
> +	btrfs_release_path(&path);

On the other hand, an empty path can be released as many times as you wish.
So double releasing the path in out branch is not a problem.

Thanks,
Q

>  	if (ret)
>  		error("can't repair root %llu extent data item[%llu %llu]",
>  		      root->objectid, disk_bytenr, num_bytes);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2()
  2017-11-27  3:13 ` [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2() Su Yue
@ 2017-11-27 10:32   ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2017-11-27 10:32 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1351 bytes --]



On 2017年11月27日 11:13, Su Yue wrote:
> In lowmem mode with '--repair', check_chunks_and_extents_v2()
> will fix accounting in block groups and clear the error
> bit BG_ACCOUNTING_ERROR.
> However, return value of check_btrfs_root() is 0 either 1 instead of
> error bits.
> 
> If extent tree is on error, lowmem repair always prints error and
> returns nonzero value even the filesystem is fine after repair.
> 
> So let @err contains bits after walk_down_tree_v2().
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index 3a72f8e3877d..c7b570bef9c3 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -6607,7 +6607,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
>  		ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
>  					ext_ref, check_all);
>  
> -		err |= !!ret;
> +		err |= ret;

Since ret from walk_down_tree_v2() can be plus, minus and 0, using bit
OR here is quite confusing.

For example, if ret is -1, it will makes all bits of err to 1.

Please don't mix bit operation with possible minus value.
(And feel free to fix similar problems)

Thanks,
Qu

>  
>  		/* if ret is negative, walk shall stop */
>  		if (ret < 0) {
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-27  3:13 ` [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items Su Yue
@ 2017-11-27 10:37   ` Qu Wenruo
  2017-11-28  2:38     ` Su Yue
  0 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2017-11-27 10:37 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 7977 bytes --]



On 2017年11月27日 11:13, Su Yue wrote:
> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
> screws up extent allocator") removes pin_metadata_blocks() from
> lowmem repair.
> So we have to find another way to exclude extents which should be
> occupied by tree blocks.

First thing first, any tree block which is referred by will not be freed.

Unless some special case, like extent tree initialization which clears
the whole extent tree so we can't check if one tree block should be
freed using extent tree, there is no need to explicitly pin existing
tree blocks.

Their creation/freeing is already handled well.

Please explain the reason why you need to pin extents first.

Thanks,
Qu

> 
> Modify pin_down_tree_blocks() only for code reuse.
> So behavior of pin_metadata_blocks() which works with option
> 'init-extent-tree' is not influenced.
> 
> Introduce exclude_blocks_and_extent_items() to mark extents of all tree
> blocks dirty in fs_info->excluded_extents. It also calls
> exclude_extent_items() to exclude extent_items in extent tree.
> 
> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
> ---
>  cmds-check.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 112 insertions(+), 10 deletions(-)
> 
> diff --git a/cmds-check.c b/cmds-check.c
> index c7b570bef9c3..f39285c5a3c2 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -13351,6 +13351,7 @@ out:
>  	return err;
>  }
>  
> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info);
>  /*
>   * Low memory usage version check_chunks_and_extents.
>   */
> @@ -13363,12 +13364,22 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>  	struct btrfs_root *root1;
>  	struct btrfs_root *root;
>  	struct btrfs_root *cur_root;
> +	struct extent_io_tree excluded_extents;
>  	int err = 0;
>  	int ret;
>  
>  	root = fs_info->fs_root;
>  
>  	if (repair) {
> +		extent_io_tree_init(&excluded_extents);
> +		fs_info->excluded_extents = &excluded_extents;
> +		ret = exclude_blocks_and_extent_items(fs_info);
> +		if (ret) {
> +			error("failed to exclude tree blocks and extent items");
> +			return ret;
> +		}
> +		reset_cached_block_groups(fs_info);
> +
>  		trans = btrfs_start_transaction(fs_info->extent_root, 1);
>  		if (IS_ERR(trans)) {
>  			error("failed to start transaction before check");
> @@ -13437,6 +13448,8 @@ out:
>  			err |= ret;
>  		else
>  			err &= ~BG_ACCOUNTING_ERROR;
> +		extent_io_tree_cleanup(&excluded_extents);
> +		fs_info->excluded_extents = NULL;
>  	}
>  
>  	if (trans)
> @@ -13534,40 +13547,106 @@ init:
>  	return 0;
>  }
>  
> -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
> -				struct extent_buffer *eb, int tree_root)
> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
> +				struct extent_io_tree *tree)
> +{
> +	struct btrfs_root *root = fs_info->extent_root;
> +	struct btrfs_key key;
> +	struct btrfs_path path;
> +	struct extent_buffer *eb;
> +	int slot;
> +	int ret;
> +	u64 start;
> +	u64 end;
> +
> +	ASSERT(tree);
> +	btrfs_init_path(&path);
> +	key.objectid = 0;
> +	key.type = 0;
> +	key.offset = 0;
> +
> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
> +	if (ret < 0)
> +		goto out;
> +
> +	while (1) {
> +		eb = path.nodes[0];
> +		slot = path.slots[0];
> +		btrfs_item_key_to_cpu(eb, &key, slot);
> +		if (key.type != BTRFS_EXTENT_ITEM_KEY &&
> +		    key.type != BTRFS_METADATA_ITEM_KEY)
> +			goto next;
> +		start = key.objectid;
> +		if (key.type == BTRFS_EXTENT_ITEM_KEY)
> +			end = start + key.offset;
> +		else
> +			end = start + fs_info->nodesize;
> +
> +		set_extent_dirty(tree, start, end - 1);
> +next:
> +		ret = btrfs_next_item(root, &path);
> +		if (ret > 0) {
> +			ret = 0;
> +			goto out;
> +		}
> +		if (ret < 0)
> +			goto out;
> +	}
> +out:
> +	if (ret)
> +		error("failed to exclude extent items");
> +	btrfs_release_path(&path);
> +	return ret;
> +}
> +
> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
> +				struct extent_buffer *eb, int tree_root,
> +				int pin)
>  {
>  	struct extent_buffer *tmp;
>  	struct btrfs_root_item *ri;
>  	struct btrfs_key key;
> +	struct extent_io_tree *tree;
>  	u64 bytenr;
>  	int level = btrfs_header_level(eb);
>  	int nritems;
>  	int ret;
>  	int i;
> +	u64 end = eb->start + eb->len;
>  
> +	if (pin)
> +		tree = &fs_info->pinned_extents;
> +	else
> +		tree = fs_info->excluded_extents;
>  	/*
> -	 * If we have pinned this block before, don't pin it again.
> +	 * If we have pinned/excluded this block before, don't do it again.
>  	 * This can not only avoid forever loop with broken filesystem
>  	 * but also give us some speedups.
>  	 */
> -	if (test_range_bit(&fs_info->pinned_extents, eb->start,
> -			   eb->start + eb->len - 1, EXTENT_DIRTY, 0))
> +	if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>  		return 0;
>  
> -	btrfs_pin_extent(fs_info, eb->start, eb->len);
> +	if (pin)
> +		btrfs_pin_extent(fs_info, eb->start, eb->len);
> +	else
> +		set_extent_dirty(tree, eb->start, end - 1);
>  
>  	nritems = btrfs_header_nritems(eb);
>  	for (i = 0; i < nritems; i++) {
>  		if (level == 0) {
> +			bool is_extent_root;
>  			btrfs_item_key_to_cpu(eb, &key, i);
>  			if (key.type != BTRFS_ROOT_ITEM_KEY)
>  				continue;
>  			/* Skip the extent root and reloc roots */
> -			if (key.objectid == BTRFS_EXTENT_TREE_OBJECTID ||
> -			    key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
> +			if (key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
>  			    key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
>  				continue;
> +			is_extent_root =
> +				key.objectid == BTRFS_EXTENT_TREE_OBJECTID;
> +			/* If pin, skip the extent root */
> +			if (pin && is_extent_root)
> +				continue;
>  			ri = btrfs_item_ptr(eb, i, struct btrfs_root_item);
>  			bytenr = btrfs_disk_root_bytenr(eb, ri);
>  
> @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>  				fprintf(stderr, "Error reading root block\n");
>  				return -EIO;
>  			}
> -			ret = pin_down_tree_blocks(fs_info, tmp, 0);
> +			ret = traverse_tree_blocks(fs_info, tmp, 0, pin);
>  			free_extent_buffer(tmp);
>  			if (ret)
>  				return ret;
> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>  				fprintf(stderr, "Error reading tree block\n");
>  				return -EIO;
>  			}
> -			ret = pin_down_tree_blocks(fs_info, tmp, tree_root);
> +			ret = traverse_tree_blocks(fs_info, tmp, tree_root,
> +						   pin);
>  			free_extent_buffer(tmp);
>  			if (ret)
>  				return ret;
> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>  	return 0;
>  }
>  
> +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
> +				struct extent_buffer *eb, int tree_root)
> +{
> +	return traverse_tree_blocks(fs_info, eb, tree_root, 1);
> +}
> +
>  static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>  {
>  	int ret;
> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>  	return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>  }
>  
> +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
> +				struct extent_buffer *eb, int tree_root)
> +{
> +	return traverse_tree_blocks(fs_info, fs_info->tree_root->node, 1, 0);
> +}
> +
> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info)
> +{
> +	int ret;
> +
> +	ret = exclude_extent_items(fs_info, fs_info->excluded_extents);
> +	if (ret)
> +		return ret;
> +	return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
> +}
> +
>  static int reset_block_groups(struct btrfs_fs_info *fs_info)
>  {
>  	struct btrfs_block_group_cache *cache;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-27 10:37   ` Qu Wenruo
@ 2017-11-28  2:38     ` Su Yue
  2017-11-28  4:05       ` Qu Wenruo
  0 siblings, 1 reply; 12+ messages in thread
From: Su Yue @ 2017-11-28  2:38 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Thanks for review.

On 11/27/2017 06:37 PM, Qu Wenruo wrote:
> 
> 
> On 2017年11月27日 11:13, Su Yue wrote:
>> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
>> screws up extent allocator") removes pin_metadata_blocks() from
>> lowmem repair.
>> So we have to find another way to exclude extents which should be
>> occupied by tree blocks.
> 
> First thing first, any tree block which is referred by will not be freed.
> 
> Unless some special case, like extent tree initialization which clears
> the whole extent tree so we can't check if one tree block should be
> freed using extent tree, there is no need to explicitly pin existing
> tree blocks.
> 
> Their creation/freeing is already handled well.
> 
If I understand the logic of extents correctly:

The extents in free space cache are handled in the way like
cache_block_group() in extent-tree.c.
It traverses all extents items then marks all holes free in free space
cache.

Yes, in a normal situation, extents are already handled well.
But original and lowmem check try to repair corrupted extent tree.

Suppose a situation:
1) Let an extent item of one tree block is corrupted or missed.
2) The correct extent in free space cache will be marked as free.
3) Next CoW may allocate the "used" extents from free space
    and insert an extent item.
4) Lowmem repair increases refs of the extent item and
    causes a wrong extent item.
OR
3) Lowmem repair inserts the extent item firstly.
4) CoW may allocate the "used" extents from free space.
    BUG_ON failure of inserting the extent item.

> Please explain the reason why you need to pin extents first.
> 
What I do in the patch is like origin mode's.
Pining extents first ensures every extents(corrupted and uncorrupted)
will not be rellocated.

Thanks,
Su

> Thanks,
> Qu
> 
>>
>> Modify pin_down_tree_blocks() only for code reuse.
>> So behavior of pin_metadata_blocks() which works with option
>> 'init-extent-tree' is not influenced.
>>
>> Introduce exclude_blocks_and_extent_items() to mark extents of all tree
>> blocks dirty in fs_info->excluded_extents. It also calls
>> exclude_extent_items() to exclude extent_items in extent tree.
>>
>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>> ---
>>   cmds-check.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>   1 file changed, 112 insertions(+), 10 deletions(-)
>>
>> diff --git a/cmds-check.c b/cmds-check.c
>> index c7b570bef9c3..f39285c5a3c2 100644
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -13351,6 +13351,7 @@ out:
>>   	return err;
>>   }
>>   
>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info);
>>   /*
>>    * Low memory usage version check_chunks_and_extents.
>>    */
>> @@ -13363,12 +13364,22 @@ static int check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>>   	struct btrfs_root *root1;
>>   	struct btrfs_root *root;
>>   	struct btrfs_root *cur_root;
>> +	struct extent_io_tree excluded_extents;
>>   	int err = 0;
>>   	int ret;
>>   
>>   	root = fs_info->fs_root;
>>   
>>   	if (repair) {
>> +		extent_io_tree_init(&excluded_extents);
>> +		fs_info->excluded_extents = &excluded_extents;
>> +		ret = exclude_blocks_and_extent_items(fs_info);
>> +		if (ret) {
>> +			error("failed to exclude tree blocks and extent items");
>> +			return ret;
>> +		}
>> +		reset_cached_block_groups(fs_info);
>> +
>>   		trans = btrfs_start_transaction(fs_info->extent_root, 1);
>>   		if (IS_ERR(trans)) {
>>   			error("failed to start transaction before check");
>> @@ -13437,6 +13448,8 @@ out:
>>   			err |= ret;
>>   		else
>>   			err &= ~BG_ACCOUNTING_ERROR;
>> +		extent_io_tree_cleanup(&excluded_extents);
>> +		fs_info->excluded_extents = NULL;
>>   	}
>>   
>>   	if (trans)
>> @@ -13534,40 +13547,106 @@ init:
>>   	return 0;
>>   }
>>   
>> -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>> -				struct extent_buffer *eb, int tree_root)
>> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
>> +				struct extent_io_tree *tree)
>> +{
>> +	struct btrfs_root *root = fs_info->extent_root;
>> +	struct btrfs_key key;
>> +	struct btrfs_path path;
>> +	struct extent_buffer *eb;
>> +	int slot;
>> +	int ret;
>> +	u64 start;
>> +	u64 end;
>> +
>> +	ASSERT(tree);
>> +	btrfs_init_path(&path);
>> +	key.objectid = 0;
>> +	key.type = 0;
>> +	key.offset = 0;
>> +
>> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
>> +	if (ret < 0)
>> +		goto out;
>> +
>> +	while (1) {
>> +		eb = path.nodes[0];
>> +		slot = path.slots[0];
>> +		btrfs_item_key_to_cpu(eb, &key, slot);
>> +		if (key.type != BTRFS_EXTENT_ITEM_KEY &&
>> +		    key.type != BTRFS_METADATA_ITEM_KEY)
>> +			goto next;
>> +		start = key.objectid;
>> +		if (key.type == BTRFS_EXTENT_ITEM_KEY)
>> +			end = start + key.offset;
>> +		else
>> +			end = start + fs_info->nodesize;
>> +
>> +		set_extent_dirty(tree, start, end - 1);
>> +next:
>> +		ret = btrfs_next_item(root, &path);
>> +		if (ret > 0) {
>> +			ret = 0;
>> +			goto out;
>> +		}
>> +		if (ret < 0)
>> +			goto out;
>> +	}
>> +out:
>> +	if (ret)
>> +		error("failed to exclude extent items");
>> +	btrfs_release_path(&path);
>> +	return ret;
>> +}
>> +
>> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
>> +				struct extent_buffer *eb, int tree_root,
>> +				int pin)
>>   {
>>   	struct extent_buffer *tmp;
>>   	struct btrfs_root_item *ri;
>>   	struct btrfs_key key;
>> +	struct extent_io_tree *tree;
>>   	u64 bytenr;
>>   	int level = btrfs_header_level(eb);
>>   	int nritems;
>>   	int ret;
>>   	int i;
>> +	u64 end = eb->start + eb->len;
>>   
>> +	if (pin)
>> +		tree = &fs_info->pinned_extents;
>> +	else
>> +		tree = fs_info->excluded_extents;
>>   	/*
>> -	 * If we have pinned this block before, don't pin it again.
>> +	 * If we have pinned/excluded this block before, don't do it again.
>>   	 * This can not only avoid forever loop with broken filesystem
>>   	 * but also give us some speedups.
>>   	 */
>> -	if (test_range_bit(&fs_info->pinned_extents, eb->start,
>> -			   eb->start + eb->len - 1, EXTENT_DIRTY, 0))
>> +	if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>>   		return 0;
>>   
>> -	btrfs_pin_extent(fs_info, eb->start, eb->len);
>> +	if (pin)
>> +		btrfs_pin_extent(fs_info, eb->start, eb->len);
>> +	else
>> +		set_extent_dirty(tree, eb->start, end - 1);
>>   
>>   	nritems = btrfs_header_nritems(eb);
>>   	for (i = 0; i < nritems; i++) {
>>   		if (level == 0) {
>> +			bool is_extent_root;
>>   			btrfs_item_key_to_cpu(eb, &key, i);
>>   			if (key.type != BTRFS_ROOT_ITEM_KEY)
>>   				continue;
>>   			/* Skip the extent root and reloc roots */
>> -			if (key.objectid == BTRFS_EXTENT_TREE_OBJECTID ||
>> -			    key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
>> +			if (key.objectid == BTRFS_TREE_RELOC_OBJECTID ||
>>   			    key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
>>   				continue;
>> +			is_extent_root =
>> +				key.objectid == BTRFS_EXTENT_TREE_OBJECTID;
>> +			/* If pin, skip the extent root */
>> +			if (pin && is_extent_root)
>> +				continue;
>>   			ri = btrfs_item_ptr(eb, i, struct btrfs_root_item);
>>   			bytenr = btrfs_disk_root_bytenr(eb, ri);
>>   
>> @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>   				fprintf(stderr, "Error reading root block\n");
>>   				return -EIO;
>>   			}
>> -			ret = pin_down_tree_blocks(fs_info, tmp, 0);
>> +			ret = traverse_tree_blocks(fs_info, tmp, 0, pin);
>>   			free_extent_buffer(tmp);
>>   			if (ret)
>>   				return ret;
>> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>   				fprintf(stderr, "Error reading tree block\n");
>>   				return -EIO;
>>   			}
>> -			ret = pin_down_tree_blocks(fs_info, tmp, tree_root);
>> +			ret = traverse_tree_blocks(fs_info, tmp, tree_root,
>> +						   pin);
>>   			free_extent_buffer(tmp);
>>   			if (ret)
>>   				return ret;
>> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>   	return 0;
>>   }
>>   
>> +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>> +				struct extent_buffer *eb, int tree_root)
>> +{
>> +	return traverse_tree_blocks(fs_info, eb, tree_root, 1);
>> +}
>> +
>>   static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>   {
>>   	int ret;
>> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>   	return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>   }
>>   
>> +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
>> +				struct extent_buffer *eb, int tree_root)
>> +{
>> +	return traverse_tree_blocks(fs_info, fs_info->tree_root->node, 1, 0);
>> +}
>> +
>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info *fs_info)
>> +{
>> +	int ret;
>> +
>> +	ret = exclude_extent_items(fs_info, fs_info->excluded_extents);
>> +	if (ret)
>> +		return ret;
>> +	return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>> +}
>> +
>>   static int reset_block_groups(struct btrfs_fs_info *fs_info)
>>   {
>>   	struct btrfs_block_group_cache *cache;
>>
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-28  2:38     ` Su Yue
@ 2017-11-28  4:05       ` Qu Wenruo
  2017-11-28  7:47         ` Su Yue
  0 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2017-11-28  4:05 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 12593 bytes --]



On 2017年11月28日 10:38, Su Yue wrote:
> Thanks for review.
> 
> On 11/27/2017 06:37 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年11月27日 11:13, Su Yue wrote:
>>> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
>>> screws up extent allocator") removes pin_metadata_blocks() from
>>> lowmem repair.
>>> So we have to find another way to exclude extents which should be
>>> occupied by tree blocks.
>>
>> First thing first, any tree block which is referred by will not be freed.
>>
>> Unless some special case, like extent tree initialization which clears
>> the whole extent tree so we can't check if one tree block should be
>> freed using extent tree, there is no need to explicitly pin existing
>> tree blocks.
>>
>> Their creation/freeing is already handled well.
>>
> If I understand the logic of extents correctly:
> 
> The extents in free space cache are handled in the way like
> cache_block_group() in extent-tree.c.
> It traverses all extents items then marks all holes free in free space
> cache.
> 
> Yes, in a normal situation, extents are already handled well.
> But original and lowmem check try to repair corrupted extent tree.
> 
> Suppose a situation:
> 1) Let an extent item of one tree block is corrupted or missed.
> 2) The correct extent in free space cache will be marked as free.
> 3) Next CoW may allocate the "used" extents from free space
>    and insert an extent item.
> 4) Lowmem repair increases refs of the extent item and
>    causes a wrong extent item.
> OR
> 3) Lowmem repair inserts the extent item firstly.
> 4) CoW may allocate the "used" extents from free space.
>    BUG_ON failure of inserting the extent item.

OK, this explains things, but still not optimal.

Such behavior should not happen if nothing is exposed.
Pinning down all tree blocks is never a light operation and should be
avoided if possible.

It would be nice to do it when starting a new transaction to modify the fs.

> 
>> Please explain the reason why you need to pin extents first.
>>
> What I do in the patch is like origin mode's.
> Pining extents first ensures every extents(corrupted and uncorrupted)
> will not be rellocated.

Only extent tree reinit will pin down all metadata block in original
mode while normal repair won't.

So you're still doing something which is not done in original mode,
which either needs to be explained in detail or fix original mode first.

> 
> Thanks,
> Su
> 
>> Thanks,
>> Qu
>>
>>>
>>> Modify pin_down_tree_blocks() only for code reuse.
>>> So behavior of pin_metadata_blocks() which works with option
>>> 'init-extent-tree' is not influenced.
>>>
>>> Introduce exclude_blocks_and_extent_items() to mark extents of all tree
>>> blocks dirty in fs_info->excluded_extents. It also calls
>>> exclude_extent_items() to exclude extent_items in extent tree.
>>>
>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>> ---
>>>   cmds-check.c | 122
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>   1 file changed, 112 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/cmds-check.c b/cmds-check.c
>>> index c7b570bef9c3..f39285c5a3c2 100644
>>> --- a/cmds-check.c
>>> +++ b/cmds-check.c
>>> @@ -13351,6 +13351,7 @@ out:
>>>       return err;
>>>   }
>>>   +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>> *fs_info);
>>>   /*
>>>    * Low memory usage version check_chunks_and_extents.
>>>    */
>>> @@ -13363,12 +13364,22 @@ static int
>>> check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>>>       struct btrfs_root *root1;
>>>       struct btrfs_root *root;
>>>       struct btrfs_root *cur_root;
>>> +    struct extent_io_tree excluded_extents;
>>>       int err =;
>>>       int ret;
>>>         root =s_info->fs_root;
>>>         if (repair) {
>>> +        extent_io_tree_init(&excluded_extents);
>>> +        fs_info->excluded_extents =excluded_extents;
>>> +        ret =xclude_blocks_and_extent_items(fs_info);
>>> +        if (ret) {
>>> +            error("failed to exclude tree blocks and extent items");
>>> +            return ret;
>>> +        }
>>> +        reset_cached_block_groups(fs_info);
>>> +
>>>           trans =trfs_start_transaction(fs_info->extent_root, 1);
>>>           if (IS_ERR(trans)) {
>>>               error("failed to start transaction before check");
>>> @@ -13437,6 +13448,8 @@ out:
>>>               err |=et;
>>>           else
>>>               err &=BG_ACCOUNTING_ERROR;
>>> +        extent_io_tree_cleanup(&excluded_extents);
>>> +        fs_info->excluded_extents =ULL;
>>>       }
>>>         if (trans)
>>> @@ -13534,40 +13547,106 @@ init:
>>>       return 0;
>>>   }
>>>   -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>> -                struct extent_buffer *eb, int tree_root)
>>> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
>>> +                struct extent_io_tree *tree)
>>> +{
>>> +    struct btrfs_root *root =s_info->extent_root;
>>> +    struct btrfs_key key;
>>> +    struct btrfs_path path;
>>> +    struct extent_buffer *eb;
>>> +    int slot;
>>> +    int ret;
>>> +    u64 start;
>>> +    u64 end;
>>> +
>>> +    ASSERT(tree);
>>> +    btrfs_init_path(&path);
>>> +    key.objectid =;
>>> +    key.type =;
>>> +    key.offset =;
>>> +
>>> +    ret =trfs_search_slot(NULL, root, &key, &path, 0, 0);
>>> +    if (ret < 0)
>>> +        goto out;
>>> +
>>> +    while (1) {
>>> +        eb =ath.nodes[0];
>>> +        slot =ath.slots[0];
>>> +        btrfs_item_key_to_cpu(eb, &key, slot);
>>> +        if (key.type !=TRFS_EXTENT_ITEM_KEY &&
>>> +            key.type !=TRFS_METADATA_ITEM_KEY)
>>> +            goto next;
>>> +        start =ey.objectid;
>>> +        if (key.type =BTRFS_EXTENT_ITEM_KEY)
>>> +            end =tart + key.offset;
>>> +        else
>>> +            end =tart + fs_info->nodesize;
>>> +
>>> +        set_extent_dirty(tree, start, end - 1);
>>> +next:
>>> +        ret =trfs_next_item(root, &path);
>>> +        if (ret > 0) {
>>> +            ret =;
>>> +            goto out;
>>> +        }
>>> +        if (ret < 0)
>>> +            goto out;
>>> +    }
>>> +out:
>>> +    if (ret)
>>> +        error("failed to exclude extent items");
>>> +    btrfs_release_path(&path);
>>> +    return ret;
>>> +}
>>> +
>>> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
>>> +                struct extent_buffer *eb, int tree_root,
>>> +                int pin)
>>>   {
>>>       struct extent_buffer *tmp;
>>>       struct btrfs_root_item *ri;
>>>       struct btrfs_key key;
>>> +    struct extent_io_tree *tree;
>>>       u64 bytenr;
>>>       int level =trfs_header_level(eb);
>>>       int nritems;
>>>       int ret;
>>>       int i;
>>> +    u64 end =b->start + eb->len;
>>>   +    if (pin)
>>> +        tree =fs_info->pinned_extents;
>>> +    else
>>> +        tree =s_info->excluded_extents;
>>>       /*
>>> -     * If we have pinned this block before, don't pin it again.
>>> +     * If we have pinned/excluded this block before, don't do it again.
>>>        * This can not only avoid forever loop with broken filesystem
>>>        * but also give us some speedups.
>>>        */
>>> -    if (test_range_bit(&fs_info->pinned_extents, eb->start,
>>> -               eb->start + eb->len - 1, EXTENT_DIRTY, 0))
>>> +    if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>>>           return 0;
>>>   -    btrfs_pin_extent(fs_info, eb->start, eb->len);
>>> +    if (pin)
>>> +        btrfs_pin_extent(fs_info, eb->start, eb->len);
>>> +    else
>>> +        set_extent_dirty(tree, eb->start, end - 1);
>>>         nritems =trfs_header_nritems(eb);
>>>       for (i =; i < nritems; i++) {
>>>           if (level =0) {
>>> +            bool is_extent_root;
>>>               btrfs_item_key_to_cpu(eb, &key, i);
>>>               if (key.type !=TRFS_ROOT_ITEM_KEY)
>>>                   continue;
>>>               /* Skip the extent root and reloc roots */
>>> -            if (key.objectid =BTRFS_EXTENT_TREE_OBJECTID ||
>>> -                key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>> +            if (key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>                   key.objectid =BTRFS_DATA_RELOC_TREE_OBJECTID)
>>>                   continue;
>>> +            is_extent_root >> +                key.objectid ==
>>> BTRFS_EXTENT_TREE_OBJECTID;
>>> +            /* If pin, skip the extent root */
>>> +            if (pin && is_extent_root)
>>> +                continue;
>>>               ri =trfs_item_ptr(eb, i, struct btrfs_root_item);
>>>               bytenr =trfs_disk_root_bytenr(eb, ri);
>>>   @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct
>>> btrfs_fs_info *fs_info,
>>>                   fprintf(stderr, "Error reading root block\n");
>>>                   return -EIO;
>>>               }
>>> -            ret =in_down_tree_blocks(fs_info, tmp, 0);
>>> +            ret =raverse_tree_blocks(fs_info, tmp, 0, pin);
>>>               free_extent_buffer(tmp);
>>>               if (ret)
>>>                   return ret;
>>> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct
>>> btrfs_fs_info *fs_info,
>>>                   fprintf(stderr, "Error reading tree block\n");
>>>                   return -EIO;
>>>               }
>>> -            ret =in_down_tree_blocks(fs_info, tmp, tree_root);
>>> +            ret =raverse_tree_blocks(fs_info, tmp, tree_root,
>>> +                           pin);
>>>               free_extent_buffer(tmp);
>>>               if (ret)
>>>                   return ret;
>>> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct
>>> btrfs_fs_info *fs_info,
>>>       return 0;
>>>   }
>>>   +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>> +                struct extent_buffer *eb, int tree_root)
>>> +{
>>> +    return traverse_tree_blocks(fs_info, eb, tree_root, 1);
>>> +}
>>> +
>>>   static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>>   {
>>>       int ret;
>>> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct
>>> btrfs_fs_info *fs_info)
>>>       return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>>   }
>>>   +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
>>> +                struct extent_buffer *eb, int tree_root)
>>> +{
>>> +    return traverse_tree_blocks(fs_info, fs_info->tree_root->node,
>>> 1, 0);
>>> +}
>>> +
>>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>> *fs_info)
>>> +{
>>> +    int ret;
>>> +
>>> +    ret =xclude_extent_items(fs_info, fs_info->excluded_extents);
>>> +    if (ret)
>>> +        return ret;
>>> +    return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>> +}
>>> +
>>>   static int reset_block_groups(struct btrfs_fs_info *fs_info)
>>>   {
>>>       struct btrfs_block_group_cache *cache;
>>>
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-28  4:05       ` Qu Wenruo
@ 2017-11-28  7:47         ` Su Yue
  2017-11-28  8:25           ` Qu Wenruo
  0 siblings, 1 reply; 12+ messages in thread
From: Su Yue @ 2017-11-28  7:47 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 11/28/2017 12:05 PM, Qu Wenruo wrote:
> 
> 
> On 2017年11月28日 10:38, Su Yue wrote:
>> Thanks for review.
>>
>> On 11/27/2017 06:37 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年11月27日 11:13, Su Yue wrote:
>>>> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
>>>> screws up extent allocator") removes pin_metadata_blocks() from
>>>> lowmem repair.
>>>> So we have to find another way to exclude extents which should be
>>>> occupied by tree blocks.
>>>
>>> First thing first, any tree block which is referred by will not be freed.
>>>
>>> Unless some special case, like extent tree initialization which clears
>>> the whole extent tree so we can't check if one tree block should be
>>> freed using extent tree, there is no need to explicitly pin existing
>>> tree blocks.
>>>
>>> Their creation/freeing is already handled well.
>>>
>> If I understand the logic of extents correctly:
>>
>> The extents in free space cache are handled in the way like
>> cache_block_group() in extent-tree.c.
>> It traverses all extents items then marks all holes free in free space
>> cache.
>>
>> Yes, in a normal situation, extents are already handled well.
>> But original and lowmem check try to repair corrupted extent tree.
>>
>> Suppose a situation:
>> 1) Let an extent item of one tree block is corrupted or missed.
>> 2) The correct extent in free space cache will be marked as free.
>> 3) Next CoW may allocate the "used" extents from free space
>>     and insert an extent item.
>> 4) Lowmem repair increases refs of the extent item and
>>     causes a wrong extent item.
>> OR
>> 3) Lowmem repair inserts the extent item firstly.
>> 4) CoW may allocate the "used" extents from free space.
>>     BUG_ON failure of inserting the extent item.
> 
> OK, this explains things, but still not optimal.
> 
> Such behavior should not happen if nothing is exposed.
> Pinning down all tree blocks is never a light operation and should be
> avoided if possible.
> 

Agreed.

> It would be nice to do it when starting a new transaction to modify the fs.
> 

Now, there is only one transaction while checking chunks and extents
because of previous wrong call of pin_metadata_blocks().
It's meaningless to put it at start of new transaction now.

I will split the transaction into many minors. Then lowmem repair will
speed up if no error in chunks and extents. But it is just a little
optimization.

>>
>>> Please explain the reason why you need to pin extents first.
>>>
>> What I do in the patch is like origin mode's.
>> Pining extents first ensures every extents(corrupted and uncorrupted)
>> will not be rellocated.
> 
> Only extent tree reinit will pin down all metadata block in original
> mode while normal repair won't.
> 
> So you're still doing something which is not done in original mode,
> which either needs to be explained in detail or fix original mode first.
> 

You are right. I did miss details of how original mode frees extent
records.

After come thinking, pining extents is still necessary in lowmem
repair.

Original mode has extent cache to record all extents and free normal
extents while traversing tree blocks.
After traversal, only corrupted extent records exist in the cache.
At this moment, it pins them and start transactions to repair.

Lowmem mode does not store anything like extent records. It begins
transactions(in further patch) in traversal. We can not guarantee
extents allocated from free space is occupied by one tree block which
has not been traversed.

Although pining extents is heavy and painfully work, it seems
inevitable. Otherwise, I have no better idea than pining all
extents.

Thanks,
Su

>>
>> Thanks,
>> Su
>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Modify pin_down_tree_blocks() only for code reuse.
>>>> So behavior of pin_metadata_blocks() which works with option
>>>> 'init-extent-tree' is not influenced.
>>>>
>>>> Introduce exclude_blocks_and_extent_items() to mark extents of all tree
>>>> blocks dirty in fs_info->excluded_extents. It also calls
>>>> exclude_extent_items() to exclude extent_items in extent tree.
>>>>
>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>> ---
>>>>    cmds-check.c | 122
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>>    1 file changed, 112 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>> index c7b570bef9c3..f39285c5a3c2 100644
>>>> --- a/cmds-check.c
>>>> +++ b/cmds-check.c
>>>> @@ -13351,6 +13351,7 @@ out:
>>>>        return err;
>>>>    }
>>>>    +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>> *fs_info);
>>>>    /*
>>>>     * Low memory usage version check_chunks_and_extents.
>>>>     */
>>>> @@ -13363,12 +13364,22 @@ static int
>>>> check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>>>>        struct btrfs_root *root1;
>>>>        struct btrfs_root *root;
>>>>        struct btrfs_root *cur_root;
>>>> +    struct extent_io_tree excluded_extents;
>>>>        int err =;
>>>>        int ret;
>>>>          root =s_info->fs_root;
>>>>          if (repair) {
>>>> +        extent_io_tree_init(&excluded_extents);
>>>> +        fs_info->excluded_extents =excluded_extents;
>>>> +        ret =xclude_blocks_and_extent_items(fs_info);
>>>> +        if (ret) {
>>>> +            error("failed to exclude tree blocks and extent items");
>>>> +            return ret;
>>>> +        }
>>>> +        reset_cached_block_groups(fs_info);
>>>> +
>>>>            trans =trfs_start_transaction(fs_info->extent_root, 1);
>>>>            if (IS_ERR(trans)) {
>>>>                error("failed to start transaction before check");
>>>> @@ -13437,6 +13448,8 @@ out:
>>>>                err |=et;
>>>>            else
>>>>                err &=BG_ACCOUNTING_ERROR;
>>>> +        extent_io_tree_cleanup(&excluded_extents);
>>>> +        fs_info->excluded_extents =ULL;
>>>>        }
>>>>          if (trans)
>>>> @@ -13534,40 +13547,106 @@ init:
>>>>        return 0;
>>>>    }
>>>>    -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>> -                struct extent_buffer *eb, int tree_root)
>>>> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
>>>> +                struct extent_io_tree *tree)
>>>> +{
>>>> +    struct btrfs_root *root =s_info->extent_root;
>>>> +    struct btrfs_key key;
>>>> +    struct btrfs_path path;
>>>> +    struct extent_buffer *eb;
>>>> +    int slot;
>>>> +    int ret;
>>>> +    u64 start;
>>>> +    u64 end;
>>>> +
>>>> +    ASSERT(tree);
>>>> +    btrfs_init_path(&path);
>>>> +    key.objectid =;
>>>> +    key.type =;
>>>> +    key.offset =;
>>>> +
>>>> +    ret =trfs_search_slot(NULL, root, &key, &path, 0, 0);
>>>> +    if (ret < 0)
>>>> +        goto out;
>>>> +
>>>> +    while (1) {
>>>> +        eb =ath.nodes[0];
>>>> +        slot =ath.slots[0];
>>>> +        btrfs_item_key_to_cpu(eb, &key, slot);
>>>> +        if (key.type !=TRFS_EXTENT_ITEM_KEY &&
>>>> +            key.type !=TRFS_METADATA_ITEM_KEY)
>>>> +            goto next;
>>>> +        start =ey.objectid;
>>>> +        if (key.type =BTRFS_EXTENT_ITEM_KEY)
>>>> +            end =tart + key.offset;
>>>> +        else
>>>> +            end =tart + fs_info->nodesize;
>>>> +
>>>> +        set_extent_dirty(tree, start, end - 1);
>>>> +next:
>>>> +        ret =trfs_next_item(root, &path);
>>>> +        if (ret > 0) {
>>>> +            ret =;
>>>> +            goto out;
>>>> +        }
>>>> +        if (ret < 0)
>>>> +            goto out;
>>>> +    }
>>>> +out:
>>>> +    if (ret)
>>>> +        error("failed to exclude extent items");
>>>> +    btrfs_release_path(&path);
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
>>>> +                struct extent_buffer *eb, int tree_root,
>>>> +                int pin)
>>>>    {
>>>>        struct extent_buffer *tmp;
>>>>        struct btrfs_root_item *ri;
>>>>        struct btrfs_key key;
>>>> +    struct extent_io_tree *tree;
>>>>        u64 bytenr;
>>>>        int level =trfs_header_level(eb);
>>>>        int nritems;
>>>>        int ret;
>>>>        int i;
>>>> +    u64 end =b->start + eb->len;
>>>>    +    if (pin)
>>>> +        tree =fs_info->pinned_extents;
>>>> +    else
>>>> +        tree =s_info->excluded_extents;
>>>>        /*
>>>> -     * If we have pinned this block before, don't pin it again.
>>>> +     * If we have pinned/excluded this block before, don't do it again.
>>>>         * This can not only avoid forever loop with broken filesystem
>>>>         * but also give us some speedups.
>>>>         */
>>>> -    if (test_range_bit(&fs_info->pinned_extents, eb->start,
>>>> -               eb->start + eb->len - 1, EXTENT_DIRTY, 0))
>>>> +    if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>>>>            return 0;
>>>>    -    btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>> +    if (pin)
>>>> +        btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>> +    else
>>>> +        set_extent_dirty(tree, eb->start, end - 1);
>>>>          nritems =trfs_header_nritems(eb);
>>>>        for (i =; i < nritems; i++) {
>>>>            if (level =0) {
>>>> +            bool is_extent_root;
>>>>                btrfs_item_key_to_cpu(eb, &key, i);
>>>>                if (key.type !=TRFS_ROOT_ITEM_KEY)
>>>>                    continue;
>>>>                /* Skip the extent root and reloc roots */
>>>> -            if (key.objectid =BTRFS_EXTENT_TREE_OBJECTID ||
>>>> -                key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>> +            if (key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>>                    key.objectid =BTRFS_DATA_RELOC_TREE_OBJECTID)
>>>>                    continue;
>>>> +            is_extent_root >> +                key.objectid ==
>>>> BTRFS_EXTENT_TREE_OBJECTID;
>>>> +            /* If pin, skip the extent root */
>>>> +            if (pin && is_extent_root)
>>>> +                continue;
>>>>                ri =trfs_item_ptr(eb, i, struct btrfs_root_item);
>>>>                bytenr =trfs_disk_root_bytenr(eb, ri);
>>>>    @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct
>>>> btrfs_fs_info *fs_info,
>>>>                    fprintf(stderr, "Error reading root block\n");
>>>>                    return -EIO;
>>>>                }
>>>> -            ret =in_down_tree_blocks(fs_info, tmp, 0);
>>>> +            ret =raverse_tree_blocks(fs_info, tmp, 0, pin);
>>>>                free_extent_buffer(tmp);
>>>>                if (ret)
>>>>                    return ret;
>>>> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct
>>>> btrfs_fs_info *fs_info,
>>>>                    fprintf(stderr, "Error reading tree block\n");
>>>>                    return -EIO;
>>>>                }
>>>> -            ret =in_down_tree_blocks(fs_info, tmp, tree_root);
>>>> +            ret =raverse_tree_blocks(fs_info, tmp, tree_root,
>>>> +                           pin);
>>>>                free_extent_buffer(tmp);
>>>>                if (ret)
>>>>                    return ret;
>>>> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct
>>>> btrfs_fs_info *fs_info,
>>>>        return 0;
>>>>    }
>>>>    +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>> +                struct extent_buffer *eb, int tree_root)
>>>> +{
>>>> +    return traverse_tree_blocks(fs_info, eb, tree_root, 1);
>>>> +}
>>>> +
>>>>    static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>>>    {
>>>>        int ret;
>>>> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct
>>>> btrfs_fs_info *fs_info)
>>>>        return pin_down_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>>>    }
>>>>    +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
>>>> +                struct extent_buffer *eb, int tree_root)
>>>> +{
>>>> +    return traverse_tree_blocks(fs_info, fs_info->tree_root->node,
>>>> 1, 0);
>>>> +}
>>>> +
>>>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>> *fs_info)
>>>> +{
>>>> +    int ret;
>>>> +
>>>> +    ret =xclude_extent_items(fs_info, fs_info->excluded_extents);
>>>> +    if (ret)
>>>> +        return ret;
>>>> +    return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>>> +}
>>>> +
>>>>    static int reset_block_groups(struct btrfs_fs_info *fs_info)
>>>>    {
>>>>        struct btrfs_block_group_cache *cache;
>>>>
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-28  7:47         ` Su Yue
@ 2017-11-28  8:25           ` Qu Wenruo
  2017-11-28  8:40             ` Su Yue
  0 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2017-11-28  8:25 UTC (permalink / raw)
  To: Su Yue, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 15823 bytes --]



On 2017年11月28日 15:47, Su Yue wrote:
> 
> 
> On 11/28/2017 12:05 PM, Qu Wenruo wrote:
>>
>>
>> On 2017年11月28日 10:38, Su Yue wrote:
>>> Thanks for review.
>>>
>>> On 11/27/2017 06:37 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2017年11月27日 11:13, Su Yue wrote:
>>>>> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
>>>>> screws up extent allocator") removes pin_metadata_blocks() from
>>>>> lowmem repair.
>>>>> So we have to find another way to exclude extents which should be
>>>>> occupied by tree blocks.
>>>>
>>>> First thing first, any tree block which is referred by will not be
>>>> freed.
>>>>
>>>> Unless some special case, like extent tree initialization which clears
>>>> the whole extent tree so we can't check if one tree block should be
>>>> freed using extent tree, there is no need to explicitly pin existing
>>>> tree blocks.
>>>>
>>>> Their creation/freeing is already handled well.
>>>>
>>> If I understand the logic of extents correctly:
>>>
>>> The extents in free space cache are handled in the way like
>>> cache_block_group() in extent-tree.c.
>>> It traverses all extents items then marks all holes free in free space
>>> cache.
>>>
>>> Yes, in a normal situation, extents are already handled well.
>>> But original and lowmem check try to repair corrupted extent tree.
>>>
>>> Suppose a situation:
>>> 1) Let an extent item of one tree block is corrupted or missed.
>>> 2) The correct extent in free space cache will be marked as free.
>>> 3) Next CoW may allocate the "used" extents from free space
>>>     and insert an extent item.
>>> 4) Lowmem repair increases refs of the extent item and
>>>     causes a wrong extent item.
>>> OR
>>> 3) Lowmem repair inserts the extent item firstly.
>>> 4) CoW may allocate the "used" extents from free space.
>>>     BUG_ON failure of inserting the extent item.
>>
>> OK, this explains things, but still not optimal.
>>
>> Such behavior should not happen if nothing is exposed.
>> Pinning down all tree blocks is never a light operation and should be
>> avoided if possible.
>>
> 
> Agreed.
> 
>> It would be nice to do it when starting a new transaction to modify
>> the fs.
>>
> 
> Now, there is only one transaction while checking chunks and extents
> because of previous wrong call of pin_metadata_blocks().
> It's meaningless to put it at start of new transaction now.

Because you start that transaction under all condition.

Normally, transaction should only be started if you're sure you will
modify the filesystem.

Start them unconditionally is not a preferred behavior.

> 
> I will split the transaction into many minors. Then lowmem repair will
> speed up if no error in chunks and extents. But it is just a little
> optimization.

Compared to iterating all tree blocks, it's a big optimization.

Just think of filesystem with several TB metadata.

> 
>>>
>>>> Please explain the reason why you need to pin extents first.
>>>>
>>> What I do in the patch is like origin mode's.
>>> Pining extents first ensures every extents(corrupted and uncorrupted)
>>> will not be rellocated.
>>
>> Only extent tree reinit will pin down all metadata block in original
>> mode while normal repair won't.
>>
>> So you're still doing something which is not done in original mode,
>> which either needs to be explained in detail or fix original mode first.
>>
> 
> You are right. I did miss details of how original mode frees extent
> records.
> 
> After come thinking, pining extents is still necessary in lowmem
> repair.
> 
> Original mode has extent cache to record all extents and free normal
> extents while traversing tree blocks.
> After traversal, only corrupted extent records exist in the cache.
> At this moment, it pins them and start transactions to repair.
> 
> Lowmem mode does not store anything like extent records. It begins
> transactions(in further patch) in traversal. We can not guarantee
> extents allocated from free space is occupied by one tree block which
> has not been traversed.
> 
> Although pining extents is heavy and painfully work, it seems
> inevitable. Otherwise, I have no better idea than pining all
> extents.

Firstly, only missing backref in extent tree will cause problem.

Even we have corrupted backref or extra backref which doesn't have any
tree block/data referring it, it won't cause a big problem.

So for lowmem mode, we can do it in a different way, just like seed
device, we allocate new meta chunks, and only use new meta chunks for
our tree operations.

This will avoid any extra tree iteration, and easier to implement AFAIK.
(Just mark all existing block groups full, to force chunk allocation)

Thanks,
Qu

> 
> Thanks,
> Su
> 
>>>
>>> Thanks,
>>> Su
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Modify pin_down_tree_blocks() only for code reuse.
>>>>> So behavior of pin_metadata_blocks() which works with option
>>>>> 'init-extent-tree' is not influenced.
>>>>>
>>>>> Introduce exclude_blocks_and_extent_items() to mark extents of all
>>>>> tree
>>>>> blocks dirty in fs_info->excluded_extents. It also calls
>>>>> exclude_extent_items() to exclude extent_items in extent tree.
>>>>>
>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>> ---
>>>>>    cmds-check.c | 122
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>>>    1 file changed, 112 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>> index c7b570bef9c3..f39285c5a3c2 100644
>>>>> --- a/cmds-check.c
>>>>> +++ b/cmds-check.c
>>>>> @@ -13351,6 +13351,7 @@ out:
>>>>>        return err;
>>>>>    }
>>>>>    +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>>> *fs_info);
>>>>>    /*
>>>>>     * Low memory usage version check_chunks_and_extents.
>>>>>     */
>>>>> @@ -13363,12 +13364,22 @@ static int
>>>>> check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>>>>>        struct btrfs_root *root1;
>>>>>        struct btrfs_root *root;
>>>>>        struct btrfs_root *cur_root;
>>>>> +    struct extent_io_tree excluded_extents;
>>>>>        int err =;
>>>>>        int ret;
>>>>>          root =s_info->fs_root;
>>>>>          if (repair) {
>>>>> +        extent_io_tree_init(&excluded_extents);
>>>>> +        fs_info->excluded_extents =excluded_extents;
>>>>> +        ret =xclude_blocks_and_extent_items(fs_info);
>>>>> +        if (ret) {
>>>>> +            error("failed to exclude tree blocks and extent items");
>>>>> +            return ret;
>>>>> +        }
>>>>> +        reset_cached_block_groups(fs_info);
>>>>> +
>>>>>            trans =trfs_start_transaction(fs_info->extent_root, 1);
>>>>>            if (IS_ERR(trans)) {
>>>>>                error("failed to start transaction before check");
>>>>> @@ -13437,6 +13448,8 @@ out:
>>>>>                err |=et;
>>>>>            else
>>>>>                err &=BG_ACCOUNTING_ERROR;
>>>>> +        extent_io_tree_cleanup(&excluded_extents);
>>>>> +        fs_info->excluded_extents =ULL;
>>>>>        }
>>>>>          if (trans)
>>>>> @@ -13534,40 +13547,106 @@ init:
>>>>>        return 0;
>>>>>    }
>>>>>    -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>> -                struct extent_buffer *eb, int tree_root)
>>>>> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
>>>>> +                struct extent_io_tree *tree)
>>>>> +{
>>>>> +    struct btrfs_root *root =s_info->extent_root;
>>>>> +    struct btrfs_key key;
>>>>> +    struct btrfs_path path;
>>>>> +    struct extent_buffer *eb;
>>>>> +    int slot;
>>>>> +    int ret;
>>>>> +    u64 start;
>>>>> +    u64 end;
>>>>> +
>>>>> +    ASSERT(tree);
>>>>> +    btrfs_init_path(&path);
>>>>> +    key.objectid =;
>>>>> +    key.type =;
>>>>> +    key.offset =;
>>>>> +
>>>>> +    ret =trfs_search_slot(NULL, root, &key, &path, 0, 0);
>>>>> +    if (ret < 0)
>>>>> +        goto out;
>>>>> +
>>>>> +    while (1) {
>>>>> +        eb =ath.nodes[0];
>>>>> +        slot =ath.slots[0];
>>>>> +        btrfs_item_key_to_cpu(eb, &key, slot);
>>>>> +        if (key.type !=TRFS_EXTENT_ITEM_KEY &&
>>>>> +            key.type !=TRFS_METADATA_ITEM_KEY)
>>>>> +            goto next;
>>>>> +        start =ey.objectid;
>>>>> +        if (key.type =BTRFS_EXTENT_ITEM_KEY)
>>>>> +            end =tart + key.offset;
>>>>> +        else
>>>>> +            end =tart + fs_info->nodesize;
>>>>> +
>>>>> +        set_extent_dirty(tree, start, end - 1);
>>>>> +next:
>>>>> +        ret =trfs_next_item(root, &path);
>>>>> +        if (ret > 0) {
>>>>> +            ret =;
>>>>> +            goto out;
>>>>> +        }
>>>>> +        if (ret < 0)
>>>>> +            goto out;
>>>>> +    }
>>>>> +out:
>>>>> +    if (ret)
>>>>> +        error("failed to exclude extent items");
>>>>> +    btrfs_release_path(&path);
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>> +                struct extent_buffer *eb, int tree_root,
>>>>> +                int pin)
>>>>>    {
>>>>>        struct extent_buffer *tmp;
>>>>>        struct btrfs_root_item *ri;
>>>>>        struct btrfs_key key;
>>>>> +    struct extent_io_tree *tree;
>>>>>        u64 bytenr;
>>>>>        int level =trfs_header_level(eb);
>>>>>        int nritems;
>>>>>        int ret;
>>>>>        int i;
>>>>> +    u64 end =b->start + eb->len;
>>>>>    +    if (pin)
>>>>> +        tree =fs_info->pinned_extents;
>>>>> +    else
>>>>> +        tree =s_info->excluded_extents;
>>>>>        /*
>>>>> -     * If we have pinned this block before, don't pin it again.
>>>>> +     * If we have pinned/excluded this block before, don't do it
>>>>> again.
>>>>>         * This can not only avoid forever loop with broken filesystem
>>>>>         * but also give us some speedups.
>>>>>         */
>>>>> -    if (test_range_bit(&fs_info->pinned_extents, eb->start,
>>>>> -               eb->start + eb->len - 1, EXTENT_DIRTY, 0))
>>>>> +    if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>>>>>            return 0;
>>>>>    -    btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>>> +    if (pin)
>>>>> +        btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>>> +    else
>>>>> +        set_extent_dirty(tree, eb->start, end - 1);
>>>>>          nritems =trfs_header_nritems(eb);
>>>>>        for (i =; i < nritems; i++) {
>>>>>            if (level =0) {
>>>>> +            bool is_extent_root;
>>>>>                btrfs_item_key_to_cpu(eb, &key, i);
>>>>>                if (key.type !=TRFS_ROOT_ITEM_KEY)
>>>>>                    continue;
>>>>>                /* Skip the extent root and reloc roots */
>>>>> -            if (key.objectid =BTRFS_EXTENT_TREE_OBJECTID ||
>>>>> -                key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>>> +            if (key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>>>                    key.objectid =BTRFS_DATA_RELOC_TREE_OBJECTID)
>>>>>                    continue;
>>>>> +            is_extent_root >> +                key.objectid ==
>>>>> BTRFS_EXTENT_TREE_OBJECTID;
>>>>> +            /* If pin, skip the extent root */
>>>>> +            if (pin && is_extent_root)
>>>>> +                continue;
>>>>>                ri =trfs_item_ptr(eb, i, struct btrfs_root_item);
>>>>>                bytenr =trfs_disk_root_bytenr(eb, ri);
>>>>>    @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct
>>>>> btrfs_fs_info *fs_info,
>>>>>                    fprintf(stderr, "Error reading root block\n");
>>>>>                    return -EIO;
>>>>>                }
>>>>> -            ret =in_down_tree_blocks(fs_info, tmp, 0);
>>>>> +            ret =raverse_tree_blocks(fs_info, tmp, 0, pin);
>>>>>                free_extent_buffer(tmp);
>>>>>                if (ret)
>>>>>                    return ret;
>>>>> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct
>>>>> btrfs_fs_info *fs_info,
>>>>>                    fprintf(stderr, "Error reading tree block\n");
>>>>>                    return -EIO;
>>>>>                }
>>>>> -            ret =in_down_tree_blocks(fs_info, tmp, tree_root);
>>>>> +            ret =raverse_tree_blocks(fs_info, tmp, tree_root,
>>>>> +                           pin);
>>>>>                free_extent_buffer(tmp);
>>>>>                if (ret)
>>>>>                    return ret;
>>>>> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct
>>>>> btrfs_fs_info *fs_info,
>>>>>        return 0;
>>>>>    }
>>>>>    +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>> +                struct extent_buffer *eb, int tree_root)
>>>>> +{
>>>>> +    return traverse_tree_blocks(fs_info, eb, tree_root, 1);
>>>>> +}
>>>>> +
>>>>>    static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>>>>    {
>>>>>        int ret;
>>>>> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct
>>>>> btrfs_fs_info *fs_info)
>>>>>        return pin_down_tree_blocks(fs_info,
>>>>> fs_info->tree_root->node, 1);
>>>>>    }
>>>>>    +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>> +                struct extent_buffer *eb, int tree_root)
>>>>> +{
>>>>> +    return traverse_tree_blocks(fs_info, fs_info->tree_root->node,
>>>>> 1, 0);
>>>>> +}
>>>>> +
>>>>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>>> *fs_info)
>>>>> +{
>>>>> +    int ret;
>>>>> +
>>>>> +    ret =xclude_extent_items(fs_info, fs_info->excluded_extents);
>>>>> +    if (ret)
>>>>> +        return ret;
>>>>> +    return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>>>> +}
>>>>> +
>>>>>    static int reset_block_groups(struct btrfs_fs_info *fs_info)
>>>>>    {
>>>>>        struct btrfs_block_group_cache *cache;
>>>>>
>>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items
  2017-11-28  8:25           ` Qu Wenruo
@ 2017-11-28  8:40             ` Su Yue
  0 siblings, 0 replies; 12+ messages in thread
From: Su Yue @ 2017-11-28  8:40 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 11/28/2017 04:25 PM, Qu Wenruo wrote:
> 
> 
> On 2017年11月28日 15:47, Su Yue wrote:
>>
>>
>> On 11/28/2017 12:05 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2017年11月28日 10:38, Su Yue wrote:
>>>> Thanks for review.
>>>>
>>>> On 11/27/2017 06:37 PM, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2017年11月27日 11:13, Su Yue wrote:
>>>>>> Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
>>>>>> screws up extent allocator") removes pin_metadata_blocks() from
>>>>>> lowmem repair.
>>>>>> So we have to find another way to exclude extents which should be
>>>>>> occupied by tree blocks.
>>>>>
>>>>> First thing first, any tree block which is referred by will not be
>>>>> freed.
>>>>>
>>>>> Unless some special case, like extent tree initialization which clears
>>>>> the whole extent tree so we can't check if one tree block should be
>>>>> freed using extent tree, there is no need to explicitly pin existing
>>>>> tree blocks.
>>>>>
>>>>> Their creation/freeing is already handled well.
>>>>>
>>>> If I understand the logic of extents correctly:
>>>>
>>>> The extents in free space cache are handled in the way like
>>>> cache_block_group() in extent-tree.c.
>>>> It traverses all extents items then marks all holes free in free space
>>>> cache.
>>>>
>>>> Yes, in a normal situation, extents are already handled well.
>>>> But original and lowmem check try to repair corrupted extent tree.
>>>>
>>>> Suppose a situation:
>>>> 1) Let an extent item of one tree block is corrupted or missed.
>>>> 2) The correct extent in free space cache will be marked as free.
>>>> 3) Next CoW may allocate the "used" extents from free space
>>>>      and insert an extent item.
>>>> 4) Lowmem repair increases refs of the extent item and
>>>>      causes a wrong extent item.
>>>> OR
>>>> 3) Lowmem repair inserts the extent item firstly.
>>>> 4) CoW may allocate the "used" extents from free space.
>>>>      BUG_ON failure of inserting the extent item.
>>>
>>> OK, this explains things, but still not optimal.
>>>
>>> Such behavior should not happen if nothing is exposed.
>>> Pinning down all tree blocks is never a light operation and should be
>>> avoided if possible.
>>>
>>
>> Agreed.
>>
>>> It would be nice to do it when starting a new transaction to modify
>>> the fs.
>>>
>>
>> Now, there is only one transaction while checking chunks and extents
>> because of previous wrong call of pin_metadata_blocks().
>> It's meaningless to put it at start of new transaction now.
> 
> Because you start that transaction under all condition.
> 
> Normally, transaction should only be started if you're sure you will
> modify the filesystem.
> 
> Start them unconditionally is not a preferred behavior.
> 
>>
>> I will split the transaction into many minors. Then lowmem repair will
>> speed up if no error in chunks and extents. But it is just a little
>> optimization.
> 
> Compared to iterating all tree blocks, it's a big optimization.
> 
> Just think of filesystem with several TB metadata.
> 
>>
>>>>
>>>>> Please explain the reason why you need to pin extents first.
>>>>>
>>>> What I do in the patch is like origin mode's.
>>>> Pining extents first ensures every extents(corrupted and uncorrupted)
>>>> will not be rellocated.
>>>
>>> Only extent tree reinit will pin down all metadata block in original
>>> mode while normal repair won't.
>>>
>>> So you're still doing something which is not done in original mode,
>>> which either needs to be explained in detail or fix original mode first.
>>>
>>
>> You are right. I did miss details of how original mode frees extent
>> records.
>>
>> After come thinking, pining extents is still necessary in lowmem
>> repair.
>>
>> Original mode has extent cache to record all extents and free normal
>> extents while traversing tree blocks.
>> After traversal, only corrupted extent records exist in the cache.
>> At this moment, it pins them and start transactions to repair.
>>
>> Lowmem mode does not store anything like extent records. It begins
>> transactions(in further patch) in traversal. We can not guarantee
>> extents allocated from free space is occupied by one tree block which
>> has not been traversed.
>>
>> Although pining extents is heavy and painfully work, it seems
>> inevitable. Otherwise, I have no better idea than pining all
>> extents.
> 
> Firstly, only missing backref in extent tree will cause problem.
> 
> Even we have corrupted backref or extra backref which doesn't have any
> tree block/data referring it, it won't cause a big problem.
> 
> So for lowmem mode, we can do it in a different way, just like seed
> device, we allocate new meta chunks, and only use new meta chunks for
> our tree operations.
> 
> This will avoid any extra tree iteration, and easier to implement AFAIK.
> (Just mark all existing block groups full, to force chunk allocation)
> 

Seems feasible! I will try to implement it.

Thanks,
Su

> Thanks,
> Qu
> 
>>
>> Thanks,
>> Su
>>
>>>>
>>>> Thanks,
>>>> Su
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>> Modify pin_down_tree_blocks() only for code reuse.
>>>>>> So behavior of pin_metadata_blocks() which works with option
>>>>>> 'init-extent-tree' is not influenced.
>>>>>>
>>>>>> Introduce exclude_blocks_and_extent_items() to mark extents of all
>>>>>> tree
>>>>>> blocks dirty in fs_info->excluded_extents. It also calls
>>>>>> exclude_extent_items() to exclude extent_items in extent tree.
>>>>>>
>>>>>> Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
>>>>>> ---
>>>>>>     cmds-check.c | 122
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>>>>     1 file changed, 112 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/cmds-check.c b/cmds-check.c
>>>>>> index c7b570bef9c3..f39285c5a3c2 100644
>>>>>> --- a/cmds-check.c
>>>>>> +++ b/cmds-check.c
>>>>>> @@ -13351,6 +13351,7 @@ out:
>>>>>>         return err;
>>>>>>     }
>>>>>>     +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>>>> *fs_info);
>>>>>>     /*
>>>>>>      * Low memory usage version check_chunks_and_extents.
>>>>>>      */
>>>>>> @@ -13363,12 +13364,22 @@ static int
>>>>>> check_chunks_and_extents_v2(struct btrfs_fs_info *fs_info)
>>>>>>         struct btrfs_root *root1;
>>>>>>         struct btrfs_root *root;
>>>>>>         struct btrfs_root *cur_root;
>>>>>> +    struct extent_io_tree excluded_extents;
>>>>>>         int err =;
>>>>>>         int ret;
>>>>>>           root =s_info->fs_root;
>>>>>>           if (repair) {
>>>>>> +        extent_io_tree_init(&excluded_extents);
>>>>>> +        fs_info->excluded_extents =excluded_extents;
>>>>>> +        ret =xclude_blocks_and_extent_items(fs_info);
>>>>>> +        if (ret) {
>>>>>> +            error("failed to exclude tree blocks and extent items");
>>>>>> +            return ret;
>>>>>> +        }
>>>>>> +        reset_cached_block_groups(fs_info);
>>>>>> +
>>>>>>             trans =trfs_start_transaction(fs_info->extent_root, 1);
>>>>>>             if (IS_ERR(trans)) {
>>>>>>                 error("failed to start transaction before check");
>>>>>> @@ -13437,6 +13448,8 @@ out:
>>>>>>                 err |=et;
>>>>>>             else
>>>>>>                 err &=BG_ACCOUNTING_ERROR;
>>>>>> +        extent_io_tree_cleanup(&excluded_extents);
>>>>>> +        fs_info->excluded_extents =ULL;
>>>>>>         }
>>>>>>           if (trans)
>>>>>> @@ -13534,40 +13547,106 @@ init:
>>>>>>         return 0;
>>>>>>     }
>>>>>>     -static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>>> -                struct extent_buffer *eb, int tree_root)
>>>>>> +static int exclude_extent_items(struct btrfs_fs_info *fs_info,
>>>>>> +                struct extent_io_tree *tree)
>>>>>> +{
>>>>>> +    struct btrfs_root *root =s_info->extent_root;
>>>>>> +    struct btrfs_key key;
>>>>>> +    struct btrfs_path path;
>>>>>> +    struct extent_buffer *eb;
>>>>>> +    int slot;
>>>>>> +    int ret;
>>>>>> +    u64 start;
>>>>>> +    u64 end;
>>>>>> +
>>>>>> +    ASSERT(tree);
>>>>>> +    btrfs_init_path(&path);
>>>>>> +    key.objectid =;
>>>>>> +    key.type =;
>>>>>> +    key.offset =;
>>>>>> +
>>>>>> +    ret =trfs_search_slot(NULL, root, &key, &path, 0, 0);
>>>>>> +    if (ret < 0)
>>>>>> +        goto out;
>>>>>> +
>>>>>> +    while (1) {
>>>>>> +        eb =ath.nodes[0];
>>>>>> +        slot =ath.slots[0];
>>>>>> +        btrfs_item_key_to_cpu(eb, &key, slot);
>>>>>> +        if (key.type !=TRFS_EXTENT_ITEM_KEY &&
>>>>>> +            key.type !=TRFS_METADATA_ITEM_KEY)
>>>>>> +            goto next;
>>>>>> +        start =ey.objectid;
>>>>>> +        if (key.type =BTRFS_EXTENT_ITEM_KEY)
>>>>>> +            end =tart + key.offset;
>>>>>> +        else
>>>>>> +            end =tart + fs_info->nodesize;
>>>>>> +
>>>>>> +        set_extent_dirty(tree, start, end - 1);
>>>>>> +next:
>>>>>> +        ret =trfs_next_item(root, &path);
>>>>>> +        if (ret > 0) {
>>>>>> +            ret =;
>>>>>> +            goto out;
>>>>>> +        }
>>>>>> +        if (ret < 0)
>>>>>> +            goto out;
>>>>>> +    }
>>>>>> +out:
>>>>>> +    if (ret)
>>>>>> +        error("failed to exclude extent items");
>>>>>> +    btrfs_release_path(&path);
>>>>>> +    return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static int traverse_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>>> +                struct extent_buffer *eb, int tree_root,
>>>>>> +                int pin)
>>>>>>     {
>>>>>>         struct extent_buffer *tmp;
>>>>>>         struct btrfs_root_item *ri;
>>>>>>         struct btrfs_key key;
>>>>>> +    struct extent_io_tree *tree;
>>>>>>         u64 bytenr;
>>>>>>         int level =trfs_header_level(eb);
>>>>>>         int nritems;
>>>>>>         int ret;
>>>>>>         int i;
>>>>>> +    u64 end =b->start + eb->len;
>>>>>>     +    if (pin)
>>>>>> +        tree =fs_info->pinned_extents;
>>>>>> +    else
>>>>>> +        tree =s_info->excluded_extents;
>>>>>>         /*
>>>>>> -     * If we have pinned this block before, don't pin it again.
>>>>>> +     * If we have pinned/excluded this block before, don't do it
>>>>>> again.
>>>>>>          * This can not only avoid forever loop with broken filesystem
>>>>>>          * but also give us some speedups.
>>>>>>          */
>>>>>> -    if (test_range_bit(&fs_info->pinned_extents, eb->start,
>>>>>> -               eb->start + eb->len - 1, EXTENT_DIRTY, 0))
>>>>>> +    if (test_range_bit(tree, eb->start, end - 1, EXTENT_DIRTY, 0))
>>>>>>             return 0;
>>>>>>     -    btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>>>> +    if (pin)
>>>>>> +        btrfs_pin_extent(fs_info, eb->start, eb->len);
>>>>>> +    else
>>>>>> +        set_extent_dirty(tree, eb->start, end - 1);
>>>>>>           nritems =trfs_header_nritems(eb);
>>>>>>         for (i =; i < nritems; i++) {
>>>>>>             if (level =0) {
>>>>>> +            bool is_extent_root;
>>>>>>                 btrfs_item_key_to_cpu(eb, &key, i);
>>>>>>                 if (key.type !=TRFS_ROOT_ITEM_KEY)
>>>>>>                     continue;
>>>>>>                 /* Skip the extent root and reloc roots */
>>>>>> -            if (key.objectid =BTRFS_EXTENT_TREE_OBJECTID ||
>>>>>> -                key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>>>> +            if (key.objectid =BTRFS_TREE_RELOC_OBJECTID ||
>>>>>>                     key.objectid =BTRFS_DATA_RELOC_TREE_OBJECTID)
>>>>>>                     continue;
>>>>>> +            is_extent_root >> +                key.objectid ==
>>>>>> BTRFS_EXTENT_TREE_OBJECTID;
>>>>>> +            /* If pin, skip the extent root */
>>>>>> +            if (pin && is_extent_root)
>>>>>> +                continue;
>>>>>>                 ri =trfs_item_ptr(eb, i, struct btrfs_root_item);
>>>>>>                 bytenr =trfs_disk_root_bytenr(eb, ri);
>>>>>>     @@ -13582,7 +13661,7 @@ static int pin_down_tree_blocks(struct
>>>>>> btrfs_fs_info *fs_info,
>>>>>>                     fprintf(stderr, "Error reading root block\n");
>>>>>>                     return -EIO;
>>>>>>                 }
>>>>>> -            ret =in_down_tree_blocks(fs_info, tmp, 0);
>>>>>> +            ret =raverse_tree_blocks(fs_info, tmp, 0, pin);
>>>>>>                 free_extent_buffer(tmp);
>>>>>>                 if (ret)
>>>>>>                     return ret;
>>>>>> @@ -13601,7 +13680,8 @@ static int pin_down_tree_blocks(struct
>>>>>> btrfs_fs_info *fs_info,
>>>>>>                     fprintf(stderr, "Error reading tree block\n");
>>>>>>                     return -EIO;
>>>>>>                 }
>>>>>> -            ret =in_down_tree_blocks(fs_info, tmp, tree_root);
>>>>>> +            ret =raverse_tree_blocks(fs_info, tmp, tree_root,
>>>>>> +                           pin);
>>>>>>                 free_extent_buffer(tmp);
>>>>>>                 if (ret)
>>>>>>                     return ret;
>>>>>> @@ -13611,6 +13691,12 @@ static int pin_down_tree_blocks(struct
>>>>>> btrfs_fs_info *fs_info,
>>>>>>         return 0;
>>>>>>     }
>>>>>>     +static int pin_down_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>>> +                struct extent_buffer *eb, int tree_root)
>>>>>> +{
>>>>>> +    return traverse_tree_blocks(fs_info, eb, tree_root, 1);
>>>>>> +}
>>>>>> +
>>>>>>     static int pin_metadata_blocks(struct btrfs_fs_info *fs_info)
>>>>>>     {
>>>>>>         int ret;
>>>>>> @@ -13622,6 +13708,22 @@ static int pin_metadata_blocks(struct
>>>>>> btrfs_fs_info *fs_info)
>>>>>>         return pin_down_tree_blocks(fs_info,
>>>>>> fs_info->tree_root->node, 1);
>>>>>>     }
>>>>>>     +static int exclude_tree_blocks(struct btrfs_fs_info *fs_info,
>>>>>> +                struct extent_buffer *eb, int tree_root)
>>>>>> +{
>>>>>> +    return traverse_tree_blocks(fs_info, fs_info->tree_root->node,
>>>>>> 1, 0);
>>>>>> +}
>>>>>> +
>>>>>> +static int exclude_blocks_and_extent_items(struct btrfs_fs_info
>>>>>> *fs_info)
>>>>>> +{
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    ret =xclude_extent_items(fs_info, fs_info->excluded_extents);
>>>>>> +    if (ret)
>>>>>> +        return ret;
>>>>>> +    return exclude_tree_blocks(fs_info, fs_info->tree_root->node, 1);
>>>>>> +}
>>>>>> +
>>>>>>     static int reset_block_groups(struct btrfs_fs_info *fs_info)
>>>>>>     {
>>>>>>         struct btrfs_block_group_cache *cache;
>>>>>>
>>>>>
>>>>
>>>>
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe
>>>> linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-11-28  8:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-27  3:13 [PATCH 0/3] btrfs-progs: check: fix up bugs of lowmem mode Su Yue
2017-11-27  3:13 ` [PATCH 1/3] btrfs-progs: check: release path in repair_extent_data_item() Su Yue
2017-11-27 10:21   ` Qu Wenruo
2017-11-27  3:13 ` [PATCH 2/3] btrfs-progs: check: record returned errors after walk_down_tree_v2() Su Yue
2017-11-27 10:32   ` Qu Wenruo
2017-11-27  3:13 ` [PATCH 3/3] btrfs-progs: check: exclude extents of tree blocks and extent items Su Yue
2017-11-27 10:37   ` Qu Wenruo
2017-11-28  2:38     ` Su Yue
2017-11-28  4:05       ` Qu Wenruo
2017-11-28  7:47         ` Su Yue
2017-11-28  8:25           ` Qu Wenruo
2017-11-28  8:40             ` Su Yue

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.