All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs: improve normal backref walking
@ 2020-02-07  9:38 ethanwu
  2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: ethanwu @ 2020-02-07  9:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

Btrfs has two types of data backref.
For BTRFS_EXTENT_DATA_REF_KEY type of backref, we don't have the
exact block number. Therefore, we need to call resolve_indirect_refs.
It uses btrfs_search_slot to locate the leaf block. Then
we need to walk through the leaves to search for the EXTENT_DATA items
that have disk bytenr matching the extent item(add_all_parents).

When resolving indirect refs, we could take entries that don't
belong to the backref entry we are searching for right now.
For that reason when searching backref entry, we always use total
refs of that EXTENT_ITEM rather than individual count.

For example:
item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize
  extent refs 24 gen 7302 flags DATA
  shared data backref parent 394985472 count 10 #1
  extent data backref root 257 objectid 260 offset 1048576 count 3 #2
  extent data backref root 256 objectid 260 offset 65536 count 6 #3
  extent data backref root 257 objectid 260 offset 65536 count 5 #4

For example, when searching backref entry #4, we'll use total_refs
24, a very loose loop ending condition, instead of total_refs = 5.

But using total_refs=24 is not accurate. Sometimes, we'll never find
all the refs from specific root.
As a result, the loop keeps on going until we reach the end of that inode.

The first 3 patches, handle 3 different types refs we might encounter.
These refs do not belong to the normal backref we are searching, and
hence need to be skipped.

The last patch changes the total_refs to correct number so that we could
end loop as soon as we find all the refs we want.

btrfs send uses backref to find possible clone sources, the following
is a simple test to compare the results with and without this patch

btrfs subvolume create /volume1/sub1
for i in `seq 1 163840`; do dd if=/dev/zero of=/volume1/sub1/file bs=64K count=1 seek=$((i-1)) conv=notrunc oflag=direct 2>/dev/null; done
btrfs subvolume snapshot /volume1/sub1 /volume1/sub2
for i in `seq 1 163840`; do dd if=/dev/zero of=/volume1/sub1/file bs=4K count=1 seek=$(((i-1)*16+10)) conv=notrunc oflag=direct 2>/dev/null; done
btrfs subvolume snapshot -r /volume1/sub1 /volume1/snap1
time btrfs send /volume1/snap1 | btrfs receive /volume2

without this patch
real 69m48.124s
user 0m50.199s
sys  70m15.600s

with this patch
real    1m59.683s
user    0m35.421s
sys     2m42.684s


ethanwu (4):
  btrfs: backref, only collect file extent items matching backref offset
  btrfs: backref, not adding refs from shared block when resolving
    normal backref
  btrfs: backref, only search backref entries from leaves of the same
    root
  btrfs: backref, use correct count to resolve normal data refs

 fs/btrfs/backref.c | 156 +++++++++++++++++++++++++++++----------------
 1 file changed, 100 insertions(+), 56 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
@ 2020-02-07  9:38 ` ethanwu
  2020-02-07 16:26   ` Josef Bacik
  2020-02-10 10:33   ` Johannes Thumshirn
  2020-02-07  9:38 ` [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref ethanwu
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: ethanwu @ 2020-02-07  9:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

When resolving one backref of type EXTENT_DATA_REF, we collect all
references that simply references the EXTENT_ITEM even though
their (file_pos- file_extent_item::offset) are not the same as the
btrfs_extent_data_ref::offset we are searching.

This patch add additional check so that we only collect references whose
(file_pos- file_extent_item::offset) == btrfs_extent_data_ref::offset.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 fs/btrfs/backref.c | 63 ++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index e5d85311d5d5..1572eab3cc06 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -347,33 +347,10 @@ static int add_prelim_ref(const struct btrfs_fs_info *fs_info,
 		return -ENOMEM;
 
 	ref->root_id = root_id;
-	if (key) {
+	if (key)
 		ref->key_for_search = *key;
-		/*
-		 * We can often find data backrefs with an offset that is too
-		 * large (>= LLONG_MAX, maximum allowed file offset) due to
-		 * underflows when subtracting a file's offset with the data
-		 * offset of its corresponding extent data item. This can
-		 * happen for example in the clone ioctl.
-		 * So if we detect such case we set the search key's offset to
-		 * zero to make sure we will find the matching file extent item
-		 * at add_all_parents(), otherwise we will miss it because the
-		 * offset taken form the backref is much larger then the offset
-		 * of the file extent item. This can make us scan a very large
-		 * number of file extent items, but at least it will not make
-		 * us miss any.
-		 * This is an ugly workaround for a behaviour that should have
-		 * never existed, but it does and a fix for the clone ioctl
-		 * would touch a lot of places, cause backwards incompatibility
-		 * and would not fix the problem for extents cloned with older
-		 * kernels.
-		 */
-		if (ref->key_for_search.type == BTRFS_EXTENT_DATA_KEY &&
-		    ref->key_for_search.offset >= LLONG_MAX)
-			ref->key_for_search.offset = 0;
-	} else {
+	else
 		memset(&ref->key_for_search, 0, sizeof(ref->key_for_search));
-	}
 
 	ref->inode_list = NULL;
 	ref->level = level;
@@ -424,6 +401,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 	u64 disk_byte;
 	u64 wanted_disk_byte = ref->wanted_disk_byte;
 	u64 count = 0;
+	u64 data_offset;
 
 	if (level != 0) {
 		eb = path->nodes[level];
@@ -457,11 +435,15 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 
 		fi = btrfs_item_ptr(eb, slot, struct btrfs_file_extent_item);
 		disk_byte = btrfs_file_extent_disk_bytenr(eb, fi);
+		data_offset = btrfs_file_extent_offset(eb, fi);
 
 		if (disk_byte == wanted_disk_byte) {
 			eie = NULL;
 			old = NULL;
-			count++;
+			if (ref->key_for_search.offset == key.offset - data_offset)
+				count++;
+			else
+				goto next;
 			if (extent_item_pos) {
 				ret = check_extent_in_eb(&key, eb, fi,
 						*extent_item_pos,
@@ -513,6 +495,7 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 	int root_level;
 	int level = ref->level;
 	int index;
+	struct btrfs_key search_key = ref->key_for_search;
 
 	root_key.objectid = ref->root_id;
 	root_key.type = BTRFS_ROOT_ITEM_KEY;
@@ -545,13 +528,33 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 		goto out;
 	}
 
+	/*
+	 * We can often find data backrefs with an offset that is too
+	 * large (>= LLONG_MAX, maximum allowed file offset) due to
+	 * underflows when subtracting a file's offset with the data
+	 * offset of its corresponding extent data item. This can
+	 * happen for example in the clone ioctl.
+	 * So if we detect such case we set the search key's offset to
+	 * zero to make sure we will find the matching file extent item
+	 * at add_all_parents(), otherwise we will miss it because the
+	 * offset taken form the backref is much larger then the offset
+	 * of the file extent item. This can make us scan a very large
+	 * number of file extent items, but at least it will not make
+	 * us miss any.
+	 * This is an ugly workaround for a behaviour that should have
+	 * never existed, but it does and a fix for the clone ioctl
+	 * would touch a lot of places, cause backwards incompatibility
+	 * and would not fix the problem for extents cloned with older
+	 * kernels.
+	 */
+	if (search_key.type == BTRFS_EXTENT_DATA_KEY &&
+	    search_key.offset >= LLONG_MAX)
+		search_key.offset = 0;
 	path->lowest_level = level;
 	if (time_seq == SEQ_LAST)
-		ret = btrfs_search_slot(NULL, root, &ref->key_for_search, path,
-					0, 0);
+		ret = btrfs_search_slot(NULL, root, &search_key, path, 0, 0);
 	else
-		ret = btrfs_search_old_slot(root, &ref->key_for_search, path,
-					    time_seq);
+		ret = btrfs_search_old_slot(root, &search_key, path, time_seq);
 
 	/* root node has been locked, we can release @subvol_srcu safely here */
 	srcu_read_unlock(&fs_info->subvol_srcu, index);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref
  2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
  2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
@ 2020-02-07  9:38 ` ethanwu
  2020-02-07 16:35   ` Josef Bacik
  2020-02-10 10:51   ` Johannes Thumshirn
  2020-02-07  9:38 ` [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root ethanwu
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: ethanwu @ 2020-02-07  9:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

All references from the block of SHARED_DATA_REF belong to that
shared block backref.

For example,
item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize 95
    extent refs 24 gen 7302 flags DATA
    extent data backref root 257 objectid 260 offset 65536 count 5
    extent data backref root 258 objectid 265 offset 0 count 9
    shared data backref parent 394985472 count 10

block 394985472 might be leaf from root 257, and the item obejctid and
(file_pos - file_extent_item::offset) in that leaf just happends to be
260 and 65536 which is equal to the first extent data backref entry.

Before this patch, when we resolving backref:
root 257 objectid 260 offset 65536, we will add those refs in block
394985472 and wrongly treat those as the refs we want.

Fix this by checking if the leaf we are processing is shared data backref,
if so, just skip this leaf.

shared data refs added into preftrees.direct have all entry value = 0
(root_id = 0, key = NULL, level = 0) except parent entry.
Other refs from indirect tree will have key value and root id != 0, and
these value won't be changed when their parent is resolved and added to
preftrees.direct. Therefore, we could reuse the preftrees.direct
and search ref with all values = 0 except parent is set to avoid
getting those resolved refs block.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 fs/btrfs/backref.c | 61 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 1572eab3cc06..b4b68af48726 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -386,8 +386,34 @@ static int add_indirect_ref(const struct btrfs_fs_info *fs_info,
 			      wanted_disk_byte, count, sc, gfp_mask);
 }
 
+static int is_shared_data_backref(struct preftrees *preftrees, u64 bytenr)
+{
+	struct rb_node **p = &preftrees->direct.root.rb_root.rb_node;
+	struct rb_node *parent = NULL;
+	struct prelim_ref *ref = NULL;
+	struct prelim_ref target = {0};
+	int result;
+
+	target.parent = bytenr;
+
+	while (*p) {
+		parent = *p;
+		ref = rb_entry(parent, struct prelim_ref, rbnode);
+		result = prelim_ref_compare(ref, &target);
+
+		if (result < 0)
+			p = &(*p)->rb_left;
+		else if (result > 0)
+			p = &(*p)->rb_right;
+		else
+			return 1;
+	}
+	return 0;
+}
+
 static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
-			   struct ulist *parents, struct prelim_ref *ref,
+			   struct ulist *parents,
+			   struct preftrees *preftrees, struct prelim_ref *ref,
 			   int level, u64 time_seq, const u64 *extent_item_pos,
 			   u64 total_refs, bool ignore_offset)
 {
@@ -412,11 +438,16 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 	}
 
 	/*
-	 * We normally enter this function with the path already pointing to
-	 * the first item to check. But sometimes, we may enter it with
-	 * slot==nritems. In that case, go to the next leaf before we continue.
+	 * 1. We normally enter this function with the path already pointing to
+	 *    the first item to check. But sometimes, we may enter it with
+	 *    slot==nritems.
+	 * 2. We are searching for normal backref but bytenr of this leaf
+	 *    matches shared data backref
+	 * For these cases, go to the next leaf before we continue.
 	 */
-	if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
+	eb = path->nodes[0];
+	if (path->slots[0] >= btrfs_header_nritems(eb) ||
+		is_shared_data_backref(preftrees, eb->start)) {
 		if (time_seq == SEQ_LAST)
 			ret = btrfs_next_leaf(root, path);
 		else
@@ -433,6 +464,17 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 		    key.type != BTRFS_EXTENT_DATA_KEY)
 			break;
 
+		/*
+		 * We are searching for normal backref but bytenr of this
+		 * leaf matches shared data backref.
+		 */
+		if (slot == 0 && is_shared_data_backref(preftrees, eb->start)) {
+			if (time_seq == SEQ_LAST)
+				ret = btrfs_next_leaf(root, path);
+			else
+				ret = btrfs_next_old_leaf(root, path, time_seq);
+			continue;
+		}
 		fi = btrfs_item_ptr(eb, slot, struct btrfs_file_extent_item);
 		disk_byte = btrfs_file_extent_disk_bytenr(eb, fi);
 		data_offset = btrfs_file_extent_offset(eb, fi);
@@ -484,6 +526,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
  */
 static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 				struct btrfs_path *path, u64 time_seq,
+				struct preftrees *preftrees,
 				struct prelim_ref *ref, struct ulist *parents,
 				const u64 *extent_item_pos, u64 total_refs,
 				bool ignore_offset)
@@ -577,8 +620,8 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 		eb = path->nodes[level];
 	}
 
-	ret = add_all_parents(root, path, parents, ref, level, time_seq,
-			      extent_item_pos, total_refs, ignore_offset);
+	ret = add_all_parents(root, path, parents, preftrees, ref, level,
+			      time_seq, extent_item_pos, total_refs, ignore_offset);
 out:
 	path->lowest_level = 0;
 	btrfs_release_path(path);
@@ -656,8 +699,8 @@ static int resolve_indirect_refs(struct btrfs_fs_info *fs_info,
 			ret = BACKREF_FOUND_SHARED;
 			goto out;
 		}
-		err = resolve_indirect_ref(fs_info, path, time_seq, ref,
-					   parents, extent_item_pos,
+		err = resolve_indirect_ref(fs_info, path, time_seq, preftrees,
+					   ref, parents, extent_item_pos,
 					   total_refs, ignore_offset);
 		/*
 		 * we can only tolerate ENOENT,otherwise,we should catch error
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root
  2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
  2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
  2020-02-07  9:38 ` [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref ethanwu
@ 2020-02-07  9:38 ` ethanwu
  2020-02-07 16:37   ` Josef Bacik
  2020-02-10 10:54   ` Johannes Thumshirn
  2020-02-07  9:38 ` [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs ethanwu
  2020-02-20 16:41 ` [PATCH 0/4] btrfs: improve normal backref walking David Sterba
  4 siblings, 2 replies; 25+ messages in thread
From: ethanwu @ 2020-02-07  9:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

We could have some nodes/leaves in subvolume whose owner are not the
that subvolume. In this way, when we resolve normal backrefs of that
subvolume, we should avoid collecting those references from these blocks.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 fs/btrfs/backref.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index b4b68af48726..7e2e647ec846 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -443,11 +443,13 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 	 *    slot==nritems.
 	 * 2. We are searching for normal backref but bytenr of this leaf
 	 *    matches shared data backref
+	 * 3. The leaf owner is not equal to the root we are searching
 	 * For these cases, go to the next leaf before we continue.
 	 */
 	eb = path->nodes[0];
 	if (path->slots[0] >= btrfs_header_nritems(eb) ||
-		is_shared_data_backref(preftrees, eb->start)) {
+		is_shared_data_backref(preftrees, eb->start) ||
+		ref->root_id != btrfs_header_owner(eb)) {
 		if (time_seq == SEQ_LAST)
 			ret = btrfs_next_leaf(root, path);
 		else
@@ -466,9 +468,12 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 
 		/*
 		 * We are searching for normal backref but bytenr of this
-		 * leaf matches shared data backref.
+		 * leaf matches shared data backref. OR
+		 * The leaf owner is not equal to the root we are searching
 		 */
-		if (slot == 0 && is_shared_data_backref(preftrees, eb->start)) {
+		if (slot == 0 &&
+		    (is_shared_data_backref(preftrees, eb->start) ||
+		     ref->root_id != btrfs_header_owner(eb))) {
 			if (time_seq == SEQ_LAST)
 				ret = btrfs_next_leaf(root, path);
 			else
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs
  2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
                   ` (2 preceding siblings ...)
  2020-02-07  9:38 ` [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root ethanwu
@ 2020-02-07  9:38 ` ethanwu
  2020-02-07 16:39   ` Josef Bacik
  2020-02-10 10:55   ` Johannes Thumshirn
  2020-02-20 16:41 ` [PATCH 0/4] btrfs: improve normal backref walking David Sterba
  4 siblings, 2 replies; 25+ messages in thread
From: ethanwu @ 2020-02-07  9:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

With the following patches:
btrfs: backref, only collect file extent items matching backref offset
btrfs: backref, not adding refs from shared block when resolving normal backref
btrfs: backref, only search backref entries from leaves of the same root

we only collect the normal data refs we want, so the imprecise
upper bound total_refs of that EXTENT_ITEM could now be changed
to the count of the normal backref entry we want to search.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 fs/btrfs/backref.c | 29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 7e2e647ec846..46cd6ed1a016 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -415,7 +415,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 			   struct ulist *parents,
 			   struct preftrees *preftrees, struct prelim_ref *ref,
 			   int level, u64 time_seq, const u64 *extent_item_pos,
-			   u64 total_refs, bool ignore_offset)
+			   bool ignore_offset)
 {
 	int ret = 0;
 	int slot;
@@ -456,7 +456,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 			ret = btrfs_next_old_leaf(root, path, time_seq);
 	}
 
-	while (!ret && count < total_refs) {
+	while (!ret && count < ref->count) {
 		eb = path->nodes[0];
 		slot = path->slots[0];
 
@@ -533,8 +533,7 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 				struct btrfs_path *path, u64 time_seq,
 				struct preftrees *preftrees,
 				struct prelim_ref *ref, struct ulist *parents,
-				const u64 *extent_item_pos, u64 total_refs,
-				bool ignore_offset)
+				const u64 *extent_item_pos, bool ignore_offset)
 {
 	struct btrfs_root *root;
 	struct btrfs_key root_key;
@@ -626,7 +625,7 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 	}
 
 	ret = add_all_parents(root, path, parents, preftrees, ref, level,
-			      time_seq, extent_item_pos, total_refs, ignore_offset);
+			      time_seq, extent_item_pos, ignore_offset);
 out:
 	path->lowest_level = 0;
 	btrfs_release_path(path);
@@ -660,7 +659,7 @@ unode_aux_to_inode_list(struct ulist_node *node)
 static int resolve_indirect_refs(struct btrfs_fs_info *fs_info,
 				 struct btrfs_path *path, u64 time_seq,
 				 struct preftrees *preftrees,
-				 const u64 *extent_item_pos, u64 total_refs,
+				 const u64 *extent_item_pos,
 				 struct share_check *sc, bool ignore_offset)
 {
 	int err;
@@ -706,7 +705,7 @@ static int resolve_indirect_refs(struct btrfs_fs_info *fs_info,
 		}
 		err = resolve_indirect_ref(fs_info, path, time_seq, preftrees,
 					   ref, parents, extent_item_pos,
-					   total_refs, ignore_offset);
+					   ignore_offset);
 		/*
 		 * we can only tolerate ENOENT,otherwise,we should catch error
 		 * and return directly.
@@ -809,8 +808,7 @@ static int add_missing_keys(struct btrfs_fs_info *fs_info,
  */
 static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
 			    struct btrfs_delayed_ref_head *head, u64 seq,
-			    struct preftrees *preftrees, u64 *total_refs,
-			    struct share_check *sc)
+			    struct preftrees *preftrees, struct share_check *sc)
 {
 	struct btrfs_delayed_ref_node *node;
 	struct btrfs_delayed_extent_op *extent_op = head->extent_op;
@@ -844,7 +842,6 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
 		default:
 			BUG();
 		}
-		*total_refs += count;
 		switch (node->type) {
 		case BTRFS_TREE_BLOCK_REF_KEY: {
 			/* NORMAL INDIRECT METADATA backref */
@@ -927,7 +924,7 @@ static int add_delayed_refs(const struct btrfs_fs_info *fs_info,
 static int add_inline_refs(const struct btrfs_fs_info *fs_info,
 			   struct btrfs_path *path, u64 bytenr,
 			   int *info_level, struct preftrees *preftrees,
-			   u64 *total_refs, struct share_check *sc)
+			   struct share_check *sc)
 {
 	int ret = 0;
 	int slot;
@@ -951,7 +948,6 @@ static int add_inline_refs(const struct btrfs_fs_info *fs_info,
 
 	ei = btrfs_item_ptr(leaf, slot, struct btrfs_extent_item);
 	flags = btrfs_extent_flags(leaf, ei);
-	*total_refs += btrfs_extent_refs(leaf, ei);
 	btrfs_item_key_to_cpu(leaf, &found_key, slot);
 
 	ptr = (unsigned long)(ei + 1);
@@ -1176,8 +1172,6 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
 	struct prelim_ref *ref;
 	struct rb_node *node;
 	struct extent_inode_elem *eie = NULL;
-	/* total of both direct AND indirect refs! */
-	u64 total_refs = 0;
 	struct preftrees preftrees = {
 		.direct = PREFTREE_INIT,
 		.indirect = PREFTREE_INIT,
@@ -1246,7 +1240,7 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
 			}
 			spin_unlock(&delayed_refs->lock);
 			ret = add_delayed_refs(fs_info, head, time_seq,
-					       &preftrees, &total_refs, sc);
+					       &preftrees, sc);
 			mutex_unlock(&head->mutex);
 			if (ret)
 				goto out;
@@ -1267,8 +1261,7 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
 		    (key.type == BTRFS_EXTENT_ITEM_KEY ||
 		     key.type == BTRFS_METADATA_ITEM_KEY)) {
 			ret = add_inline_refs(fs_info, path, bytenr,
-					      &info_level, &preftrees,
-					      &total_refs, sc);
+					      &info_level, &preftrees, sc);
 			if (ret)
 				goto out;
 			ret = add_keyed_refs(fs_info, path, bytenr, info_level,
@@ -1287,7 +1280,7 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
 	WARN_ON(!RB_EMPTY_ROOT(&preftrees.indirect_missing_keys.root.rb_root));
 
 	ret = resolve_indirect_refs(fs_info, path, time_seq, &preftrees,
-				    extent_item_pos, total_refs, sc, ignore_offset);
+				    extent_item_pos, sc, ignore_offset);
 	if (ret)
 		goto out;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
@ 2020-02-07 16:26   ` Josef Bacik
  2020-02-10  9:12     ` ethanwu
  2020-02-10 10:33   ` Johannes Thumshirn
  1 sibling, 1 reply; 25+ messages in thread
From: Josef Bacik @ 2020-02-07 16:26 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

On 2/7/20 4:38 AM, ethanwu wrote:
> When resolving one backref of type EXTENT_DATA_REF, we collect all
> references that simply references the EXTENT_ITEM even though
> their (file_pos- file_extent_item::offset) are not the same as the
> btrfs_extent_data_ref::offset we are searching.
> 
> This patch add additional check so that we only collect references whose
> (file_pos- file_extent_item::offset) == btrfs_extent_data_ref::offset.
> 
> Signed-off-by: ethanwu <ethanwu@synology.com>

I just want to make sure that btrfs/097 passes still right?  That's what the 
key_for_search thing was about, so I want to make sure we're not regressing.  I 
assume you've run xfstests but I just want to make doubly sure we're good here. 
If you did then you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref
  2020-02-07  9:38 ` [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref ethanwu
@ 2020-02-07 16:35   ` Josef Bacik
  2020-02-10 10:51   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Josef Bacik @ 2020-02-07 16:35 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

On 2/7/20 4:38 AM, ethanwu wrote:
> All references from the block of SHARED_DATA_REF belong to that
> shared block backref.
> 
> For example,
> item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize 95
>      extent refs 24 gen 7302 flags DATA
>      extent data backref root 257 objectid 260 offset 65536 count 5
>      extent data backref root 258 objectid 265 offset 0 count 9
>      shared data backref parent 394985472 count 10
> 
> block 394985472 might be leaf from root 257, and the item obejctid and
> (file_pos - file_extent_item::offset) in that leaf just happends to be
> 260 and 65536 which is equal to the first extent data backref entry.
> 
> Before this patch, when we resolving backref:
> root 257 objectid 260 offset 65536, we will add those refs in block
> 394985472 and wrongly treat those as the refs we want.
> 
> Fix this by checking if the leaf we are processing is shared data backref,
> if so, just skip this leaf.
> 
> shared data refs added into preftrees.direct have all entry value = 0
> (root_id = 0, key = NULL, level = 0) except parent entry.
> Other refs from indirect tree will have key value and root id != 0, and
> these value won't be changed when their parent is resolved and added to
> preftrees.direct. Therefore, we could reuse the preftrees.direct
> and search ref with all values = 0 except parent is set to avoid
> getting those resolved refs block.
> 
> Signed-off-by: ethanwu <ethanwu@synology.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root
  2020-02-07  9:38 ` [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root ethanwu
@ 2020-02-07 16:37   ` Josef Bacik
  2020-02-10 10:54   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Josef Bacik @ 2020-02-07 16:37 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

On 2/7/20 4:38 AM, ethanwu wrote:
> We could have some nodes/leaves in subvolume whose owner are not the
> that subvolume. In this way, when we resolve normal backrefs of that
> subvolume, we should avoid collecting those references from these blocks.
> 
> Signed-off-by: ethanwu <ethanwu@synology.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs
  2020-02-07  9:38 ` [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs ethanwu
@ 2020-02-07 16:39   ` Josef Bacik
  2020-02-10 10:55   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Josef Bacik @ 2020-02-07 16:39 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

On 2/7/20 4:38 AM, ethanwu wrote:
> With the following patches:
> btrfs: backref, only collect file extent items matching backref offset
> btrfs: backref, not adding refs from shared block when resolving normal backref
> btrfs: backref, only search backref entries from leaves of the same root
> 
> we only collect the normal data refs we want, so the imprecise
> upper bound total_refs of that EXTENT_ITEM could now be changed
> to the count of the normal backref entry we want to search.
> 
> Signed-off-by: ethanwu <ethanwu@synology.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-07 16:26   ` Josef Bacik
@ 2020-02-10  9:12     ` ethanwu
  2020-02-10 16:29       ` David Sterba
  0 siblings, 1 reply; 25+ messages in thread
From: ethanwu @ 2020-02-10  9:12 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

Josef Bacik 於 2020-02-08 00:26 寫到:
> On 2/7/20 4:38 AM, ethanwu wrote:
>> When resolving one backref of type EXTENT_DATA_REF, we collect all
>> references that simply references the EXTENT_ITEM even though
>> their (file_pos- file_extent_item::offset) are not the same as the
>> btrfs_extent_data_ref::offset we are searching.
>> 
>> This patch add additional check so that we only collect references 
>> whose
>> (file_pos- file_extent_item::offset) == btrfs_extent_data_ref::offset.
>> 
>> Signed-off-by: ethanwu <ethanwu@synology.com>
> 
> I just want to make sure that btrfs/097 passes still right?  That's
> what the key_for_search thing was about, so I want to make sure we're
> not regressing.  I assume you've run xfstests but I just want to make
> doubly sure we're good here. If you did then you can add
> 
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> 
> Thanks,
> 
> Josef

Thanks for reviewing.

I've run the btrfs part of xfstests, 097 passed.
Failed at following tests:
074 (failed 2 out of 5 runs),
139, 153, 154,
197, 198(Patches related to these 2 tests seem to be not merged yet?)
201, 202

My kernel environment is 5.5-rc5, and this branch doesn't contain
fixes for tests 201 and 202.
All these failing tests also failed at the same version without my 
patch.

Thanks,
ethanwu


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
  2020-02-07 16:26   ` Josef Bacik
@ 2020-02-10 10:33   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2020-02-10 10:33 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref
  2020-02-07  9:38 ` [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref ethanwu
  2020-02-07 16:35   ` Josef Bacik
@ 2020-02-10 10:51   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2020-02-10 10:51 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root
  2020-02-07  9:38 ` [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root ethanwu
  2020-02-07 16:37   ` Josef Bacik
@ 2020-02-10 10:54   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2020-02-10 10:54 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs
  2020-02-07  9:38 ` [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs ethanwu
  2020-02-07 16:39   ` Josef Bacik
@ 2020-02-10 10:55   ` Johannes Thumshirn
  1 sibling, 0 replies; 25+ messages in thread
From: Johannes Thumshirn @ 2020-02-10 10:55 UTC (permalink / raw)
  To: ethanwu, linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-10  9:12     ` ethanwu
@ 2020-02-10 16:29       ` David Sterba
  2020-02-11  4:03         ` ethanwu
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2020-02-10 16:29 UTC (permalink / raw)
  To: ethanwu; +Cc: Josef Bacik, linux-btrfs

On Mon, Feb 10, 2020 at 05:12:48PM +0800, ethanwu wrote:
> Josef Bacik 於 2020-02-08 00:26 寫到:
> > On 2/7/20 4:38 AM, ethanwu wrote:
> >> When resolving one backref of type EXTENT_DATA_REF, we collect all
> >> references that simply references the EXTENT_ITEM even though
> >> their (file_pos- file_extent_item::offset) are not the same as the
> >> btrfs_extent_data_ref::offset we are searching.
> >> 
> >> This patch add additional check so that we only collect references 
> >> whose
> >> (file_pos- file_extent_item::offset) == btrfs_extent_data_ref::offset.
> >> 
> >> Signed-off-by: ethanwu <ethanwu@synology.com>
> > 
> > I just want to make sure that btrfs/097 passes still right?  That's
> > what the key_for_search thing was about, so I want to make sure we're
> > not regressing.  I assume you've run xfstests but I just want to make
> > doubly sure we're good here. If you did then you can add
> > 
> > Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> > 
> > Thanks,
> > 
> > Josef
> 
> Thanks for reviewing.
> 
> I've run the btrfs part of xfstests, 097 passed.
> Failed at following tests:
> 074 (failed 2 out of 5 runs),
> 139, 153, 154,
> 197, 198(Patches related to these 2 tests seem to be not merged yet?)
> 201, 202
> 
> My kernel environment is 5.5-rc5, and this branch doesn't contain
> fixes for tests 201 and 202.
> All these failing tests also failed at the same version without my 
> patch.

I tested the patchset in my environment and besides the above known
and unrelated failures, there's one that seems to be new and possibly
related to the patches:

btrfs/125               [18:18:14][ 5937.333021] run fstests btrfs/125 at 2020-02-07 18:18:14
[ 5937.737705] BTRFS info (device vda): disk space caching is enabled
[ 5937.741359] BTRFS info (device vda): has skinny extents
[ 5938.318881] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5 devid 1 transid 5 /dev/vdb scanned by mkfs.btrfs (21913)
[ 5938.323180] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5 devid 2 transid 5 /dev/vdc scanned by mkfs.btrfs (21913)
[ 5938.327229] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5 devid 3 transid 5 /dev/vdd scanned by mkfs.btrfs (21913)
[ 5938.344608] BTRFS info (device vdb): disk space caching is enabled
[ 5938.347892] BTRFS info (device vdb): has skinny extents
[ 5938.350941] BTRFS info (device vdb): flagging fs with big metadata feature
[ 5938.360083] BTRFS info (device vdb): checking UUID tree
[ 5938.470343] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5 devid 2 transid 7 /dev/vdc scanned by mount (21955)
[ 5938.476444] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5 devid 1 transid 7 /dev/vdb scanned by mount (21955)
[ 5938.480289] BTRFS info (device vdb): allowing degraded mounts
[ 5938.483738] BTRFS info (device vdb): disk space caching is enabled
[ 5938.487557] BTRFS info (device vdb): has skinny extents
[ 5938.491416] BTRFS warning (device vdb): devid 3 uuid f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
[ 5938.493879] BTRFS warning (device vdb): devid 3 uuid f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
[ 5939.233288] BTRFS: device fsid 265be525-bf76-447b-8db6-d69b0d3885d1 devid 1 transid 250 /dev/vda scanned by btrfs (21983)
[ 5939.250044] BTRFS info (device vdb): disk space caching is enabled
[ 5939.253525] BTRFS info (device vdb): has skinny extents
[ 5949.283384] BTRFS info (device vdb): balance: start -d -m -s
[ 5949.288563] BTRFS info (device vdb): relocating block group 217710592 flags data|raid5
[ 5949.322231] BTRFS error (device vdb): bad tree block start, want 39862272 have 30949376
[ 5949.328136] repair_io_failure: 22 callbacks suppressed
[ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0 off 39862272 (dev /dev/vdd sector 19488)
[ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0 off 39866368 (dev /dev/vdd sector 19496)
[ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0 off 39870464 (dev /dev/vdd sector 19504)
[ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0 off 39874560 (dev /dev/vdd sector 19512)
[ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2228224 csum 0x5f6faf4265e0e04dc797f32ab085653d60954cfd976b257c83e7cd825ae7c98e expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.414764] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2555904 csum 0xde6a48c4c66a765d0142c27fee1ec429055152fe3f10d70e4ef59a9d7a071bdc expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.414774] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2621440 csum 0x47800732ac4471f4aced9c4fe35cb6046c401792a99daa02ccbc35e0b4632496 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.414946] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2637824 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415061] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2641920 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415136] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2646016 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415214] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2650112 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415260] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2654208 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415304] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2658304 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.415348] BTRFS warning (device vdb): csum failed root -9 ino 257 off 2662400 csum 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78 expected csum 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 mirror 1
[ 5949.419530] BTRFS info (device vdb): read error corrected: ino 257 off 2621440 (dev /dev/vdd sector 195712)
[ 5949.420414] BTRFS info (device vdb): read error corrected: ino 257 off 2641920 (dev /dev/vdd sector 195752)
[ 5949.420528] BTRFS info (device vdb): read error corrected: ino 257 off 2637824 (dev /dev/vdd sector 195744)
[ 5949.420651] BTRFS info (device vdb): read error corrected: ino 257 off 2650112 (dev /dev/vdd sector 195768)
[ 5949.420771] BTRFS info (device vdb): read error corrected: ino 257 off 2654208 (dev /dev/vdd sector 195776)
[ 5949.420886] BTRFS info (device vdb): read error corrected: ino 257 off 2662400 (dev /dev/vdd sector 195792)
[ 5949.527064] BTRFS error (device vdb): bad tree block start, want 39059456 have 30539776
[ 5949.527461] BTRFS error (device vdb): bad tree block start, want 39075840 have 30556160
[ 5949.527646] BTRFS error (device vdb): bad tree block start, want 39092224 have 0
[ 5949.527664] BTRFS error (device vdb): bad tree block start, want 39108608 have 30588928
[ 5949.546199] BTRFS error (device vdb): bad tree block start, want 39075840 have 30556160
[ 5949.579222] BTRFS error (device vdb): bad tree block start, want 39092224 have 0
[ 5949.589051] BTRFS error (device vdb): bad tree block start, want 39108608 have 30588928
[ 5949.828796] BTRFS error (device vdb): bad tree block start, want 39387136 have 30670848
[ 5949.828804] BTRFS error (device vdb): bad tree block start, want 39403520 have 0
[ 5950.430515] BTRFS info (device vdb): balance: ended with status: -5
[ 5950.450348] btrfs (22010) used greatest stack depth: 10304 bytes left
[failed, exit status 1][ 5950.461088] BTRFS info (device vdb): clearing incompat feature flag for RAID56 (0x80)
 [18:18:27]- output mismatch (see /tmp/fstests/results//btrfs/125.out.bad)
    --- tests/btrfs/125.out     2018-04-12 16:57:00.616225550 +0000
    +++ /tmp/fstests/results//btrfs/125.out.bad 2020-02-07 18:18:27.496000000 +0000
    @@ -3,5 +3,5 @@
     Write data with degraded mount
     
     Mount normal and balance
    -
    -Mount degraded but with other dev
    +failed: '/sbin/btrfs balance start /tmp/scratch'
    +(see /tmp/fstests/results//btrfs/125.full for details)
    ...
    (Run 'diff -u /tmp/fstests/tests/btrfs/125.out /tmp/fstests/results//btrfs/125.out.bad'  to see the entire diff)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-10 16:29       ` David Sterba
@ 2020-02-11  4:03         ` ethanwu
  2020-02-11  4:33           ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: ethanwu @ 2020-02-11  4:03 UTC (permalink / raw)
  To: dsterba, ethanwu, Josef Bacik, linux-btrfs

David Sterba 於 2020-02-11 00:29 寫到:
> On Mon, Feb 10, 2020 at 05:12:48PM +0800, ethanwu wrote:
>> Josef Bacik 於 2020-02-08 00:26 寫到:
>> > On 2/7/20 4:38 AM, ethanwu wrote:
>> >> When resolving one backref of type EXTENT_DATA_REF, we collect all
>> >> references that simply references the EXTENT_ITEM even though
>> >> their (file_pos- file_extent_item::offset) are not the same as the
>> >> btrfs_extent_data_ref::offset we are searching.
>> >>
>> >> This patch add additional check so that we only collect references
>> >> whose
>> >> (file_pos- file_extent_item::offset) == btrfs_extent_data_ref::offset.
>> >>
>> >> Signed-off-by: ethanwu <ethanwu@synology.com>
>> >
>> > I just want to make sure that btrfs/097 passes still right?  That's
>> > what the key_for_search thing was about, so I want to make sure we're
>> > not regressing.  I assume you've run xfstests but I just want to make
>> > doubly sure we're good here. If you did then you can add
>> >
>> > Reviewed-by: Josef Bacik <josef@toxicpanda.com>
>> >
>> > Thanks,
>> >
>> > Josef
>> 
>> Thanks for reviewing.
>> 
>> I've run the btrfs part of xfstests, 097 passed.
>> Failed at following tests:
>> 074 (failed 2 out of 5 runs),
>> 139, 153, 154,
>> 197, 198(Patches related to these 2 tests seem to be not merged yet?)
>> 201, 202
>> 
>> My kernel environment is 5.5-rc5, and this branch doesn't contain
>> fixes for tests 201 and 202.
>> All these failing tests also failed at the same version without my
>> patch.
> 
> I tested the patchset in my environment and besides the above known
> and unrelated failures, there's one that seems to be new and possibly
> related to the patches:
> 
> btrfs/125               [18:18:14][ 5937.333021] run fstests btrfs/125
> at 2020-02-07 18:18:14
> [ 5937.737705] BTRFS info (device vda): disk space caching is enabled
> [ 5937.741359] BTRFS info (device vda): has skinny extents
> [ 5938.318881] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
> devid 1 transid 5 /dev/vdb scanned by mkfs.btrfs (21913)
> [ 5938.323180] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
> devid 2 transid 5 /dev/vdc scanned by mkfs.btrfs (21913)
> [ 5938.327229] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
> devid 3 transid 5 /dev/vdd scanned by mkfs.btrfs (21913)
> [ 5938.344608] BTRFS info (device vdb): disk space caching is enabled
> [ 5938.347892] BTRFS info (device vdb): has skinny extents
> [ 5938.350941] BTRFS info (device vdb): flagging fs with big metadata 
> feature
> [ 5938.360083] BTRFS info (device vdb): checking UUID tree
> [ 5938.470343] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
> devid 2 transid 7 /dev/vdc scanned by mount (21955)
> [ 5938.476444] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
> devid 1 transid 7 /dev/vdb scanned by mount (21955)
> [ 5938.480289] BTRFS info (device vdb): allowing degraded mounts
> [ 5938.483738] BTRFS info (device vdb): disk space caching is enabled
> [ 5938.487557] BTRFS info (device vdb): has skinny extents
> [ 5938.491416] BTRFS warning (device vdb): devid 3 uuid
> f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
> [ 5938.493879] BTRFS warning (device vdb): devid 3 uuid
> f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
> [ 5939.233288] BTRFS: device fsid 265be525-bf76-447b-8db6-d69b0d3885d1
> devid 1 transid 250 /dev/vda scanned by btrfs (21983)
> [ 5939.250044] BTRFS info (device vdb): disk space caching is enabled
> [ 5939.253525] BTRFS info (device vdb): has skinny extents
> [ 5949.283384] BTRFS info (device vdb): balance: start -d -m -s
> [ 5949.288563] BTRFS info (device vdb): relocating block group
> 217710592 flags data|raid5
> [ 5949.322231] BTRFS error (device vdb): bad tree block start, want
> 39862272 have 30949376
> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
> off 39862272 (dev /dev/vdd sector 19488)
> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
> off 39866368 (dev /dev/vdd sector 19496)
> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
> off 39870464 (dev /dev/vdd sector 19504)
> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
> off 39874560 (dev /dev/vdd sector 19512)
> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2228224 csum
> 0x5f6faf4265e0e04dc797f32ab085653d60954cfd976b257c83e7cd825ae7c98e
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.414764] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2555904 csum
> 0xde6a48c4c66a765d0142c27fee1ec429055152fe3f10d70e4ef59a9d7a071bdc
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.414774] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2621440 csum
> 0x47800732ac4471f4aced9c4fe35cb6046c401792a99daa02ccbc35e0b4632496
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.414946] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2637824 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415061] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2641920 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415136] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2646016 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415214] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2650112 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415260] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2654208 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415304] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2658304 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.415348] BTRFS warning (device vdb): csum failed root -9 ino 257
> off 2662400 csum
> 0x769bd186841c10e5b1106b55986206c0e87fc05a7f565fdee01b5abcaff6ae78
> expected csum
> 0xad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
> mirror 1
> [ 5949.419530] BTRFS info (device vdb): read error corrected: ino 257
> off 2621440 (dev /dev/vdd sector 195712)
> [ 5949.420414] BTRFS info (device vdb): read error corrected: ino 257
> off 2641920 (dev /dev/vdd sector 195752)
> [ 5949.420528] BTRFS info (device vdb): read error corrected: ino 257
> off 2637824 (dev /dev/vdd sector 195744)
> [ 5949.420651] BTRFS info (device vdb): read error corrected: ino 257
> off 2650112 (dev /dev/vdd sector 195768)
> [ 5949.420771] BTRFS info (device vdb): read error corrected: ino 257
> off 2654208 (dev /dev/vdd sector 195776)
> [ 5949.420886] BTRFS info (device vdb): read error corrected: ino 257
> off 2662400 (dev /dev/vdd sector 195792)
> [ 5949.527064] BTRFS error (device vdb): bad tree block start, want
> 39059456 have 30539776
> [ 5949.527461] BTRFS error (device vdb): bad tree block start, want
> 39075840 have 30556160
> [ 5949.527646] BTRFS error (device vdb): bad tree block start, want
> 39092224 have 0
> [ 5949.527664] BTRFS error (device vdb): bad tree block start, want
> 39108608 have 30588928
> [ 5949.546199] BTRFS error (device vdb): bad tree block start, want
> 39075840 have 30556160
> [ 5949.579222] BTRFS error (device vdb): bad tree block start, want
> 39092224 have 0
> [ 5949.589051] BTRFS error (device vdb): bad tree block start, want
> 39108608 have 30588928
> [ 5949.828796] BTRFS error (device vdb): bad tree block start, want
> 39387136 have 30670848
> [ 5949.828804] BTRFS error (device vdb): bad tree block start, want
> 39403520 have 0
> [ 5950.430515] BTRFS info (device vdb): balance: ended with status: -5
> [ 5950.450348] btrfs (22010) used greatest stack depth: 10304 bytes 
> left
> [failed, exit status 1][ 5950.461088] BTRFS info (device vdb):
> clearing incompat feature flag for RAID56 (0x80)
>  [18:18:27]- output mismatch (see 
> /tmp/fstests/results//btrfs/125.out.bad)
>     --- tests/btrfs/125.out     2018-04-12 16:57:00.616225550 +0000
>     +++ /tmp/fstests/results//btrfs/125.out.bad 2020-02-07
> 18:18:27.496000000 +0000
>     @@ -3,5 +3,5 @@
>      Write data with degraded mount
> 
>      Mount normal and balance
>     -
>     -Mount degraded but with other dev
>     +failed: '/sbin/btrfs balance start /tmp/scratch'
>     +(see /tmp/fstests/results//btrfs/125.full for details)
>     ...
>     (Run 'diff -u /tmp/fstests/tests/btrfs/125.out
> /tmp/fstests/results//btrfs/125.out.bad'  to see the entire diff)

Hi,
I've rebased my kernel environment to the latest for-next branch,
xfstests is updated to latest as well.
Test 125 still passes many times without even one failure.

here's my local.config

export TEST_DEV=/dev/sdc
export TEST_DIR=/mnt/test
export SCRATCH_MNT=/mnt/scratch
export FSTYP=btrfs
export SCRATCH_DEV_POOL="/dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh"

each device has 80GB capacity.

Besides, LOGWRITES_DEV is not set and CONFIG_FAULT_INJECTION_DEBUG_FS
is turned off, but both seems to be unrelated to 125.

thanks,
ethanwu





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-11  4:03         ` ethanwu
@ 2020-02-11  4:33           ` Qu Wenruo
  2020-02-11 18:21             ` David Sterba
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2020-02-11  4:33 UTC (permalink / raw)
  To: ethanwu, dsterba, Josef Bacik, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 5427 bytes --]



On 2020/2/11 下午12:03, ethanwu wrote:
> David Sterba 於 2020-02-11 00:29 寫到:
>> On Mon, Feb 10, 2020 at 05:12:48PM +0800, ethanwu wrote:
>>> Josef Bacik 於 2020-02-08 00:26 寫到:
>>> > On 2/7/20 4:38 AM, ethanwu wrote:
>>> >> When resolving one backref of type EXTENT_DATA_REF, we collect all
>>> >> references that simply references the EXTENT_ITEM even though
>>> >> their (file_pos- file_extent_item::offset) are not the same as the
>>> >> btrfs_extent_data_ref::offset we are searching.
>>> >>
>>> >> This patch add additional check so that we only collect references
>>> >> whose
>>> >> (file_pos- file_extent_item::offset) ==
>>> btrfs_extent_data_ref::offset.
>>> >>
>>> >> Signed-off-by: ethanwu <ethanwu@synology.com>
>>> >
>>> > I just want to make sure that btrfs/097 passes still right?  That's
>>> > what the key_for_search thing was about, so I want to make sure we're
>>> > not regressing.  I assume you've run xfstests but I just want to make
>>> > doubly sure we're good here. If you did then you can add
>>> >
>>> > Reviewed-by: Josef Bacik <josef@toxicpanda.com>
>>> >
>>> > Thanks,
>>> >
>>> > Josef
>>>
>>> Thanks for reviewing.
>>>
>>> I've run the btrfs part of xfstests, 097 passed.
>>> Failed at following tests:
>>> 074 (failed 2 out of 5 runs),
>>> 139, 153, 154,
>>> 197, 198(Patches related to these 2 tests seem to be not merged yet?)
>>> 201, 202
>>>
>>> My kernel environment is 5.5-rc5, and this branch doesn't contain
>>> fixes for tests 201 and 202.
>>> All these failing tests also failed at the same version without my
>>> patch.
>>
>> I tested the patchset in my environment and besides the above known
>> and unrelated failures, there's one that seems to be new and possibly
>> related to the patches:
>>
>> btrfs/125               [18:18:14][ 5937.333021] run fstests btrfs/125
>> at 2020-02-07 18:18:14
>> [ 5937.737705] BTRFS info (device vda): disk space caching is enabled
>> [ 5937.741359] BTRFS info (device vda): has skinny extents
>> [ 5938.318881] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
>> devid 1 transid 5 /dev/vdb scanned by mkfs.btrfs (21913)
>> [ 5938.323180] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
>> devid 2 transid 5 /dev/vdc scanned by mkfs.btrfs (21913)
>> [ 5938.327229] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
>> devid 3 transid 5 /dev/vdd scanned by mkfs.btrfs (21913)
>> [ 5938.344608] BTRFS info (device vdb): disk space caching is enabled
>> [ 5938.347892] BTRFS info (device vdb): has skinny extents
>> [ 5938.350941] BTRFS info (device vdb): flagging fs with big metadata
>> feature
>> [ 5938.360083] BTRFS info (device vdb): checking UUID tree
>> [ 5938.470343] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
>> devid 2 transid 7 /dev/vdc scanned by mount (21955)
>> [ 5938.476444] BTRFS: device fsid e34ea734-aeef-484b-8a5b-d061e3bad8c5
>> devid 1 transid 7 /dev/vdb scanned by mount (21955)
>> [ 5938.480289] BTRFS info (device vdb): allowing degraded mounts
>> [ 5938.483738] BTRFS info (device vdb): disk space caching is enabled
>> [ 5938.487557] BTRFS info (device vdb): has skinny extents
>> [ 5938.491416] BTRFS warning (device vdb): devid 3 uuid
>> f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
>> [ 5938.493879] BTRFS warning (device vdb): devid 3 uuid
>> f86704f4-689c-4677-b5f2-5380cf6be2d3 is missing
>> [ 5939.233288] BTRFS: device fsid 265be525-bf76-447b-8db6-d69b0d3885d1
>> devid 1 transid 250 /dev/vda scanned by btrfs (21983)
>> [ 5939.250044] BTRFS info (device vdb): disk space caching is enabled
>> [ 5939.253525] BTRFS info (device vdb): has skinny extents
>> [ 5949.283384] BTRFS info (device vdb): balance: start -d -m -s
>> [ 5949.288563] BTRFS info (device vdb): relocating block group
>> 217710592 flags data|raid5
>> [ 5949.322231] BTRFS error (device vdb): bad tree block start, want
>> 39862272 have 30949376
>> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
>> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
>> off 39862272 (dev /dev/vdd sector 19488)
>> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
>> off 39866368 (dev /dev/vdd sector 19496)
>> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
>> off 39870464 (dev /dev/vdd sector 19504)
>> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
>> off 39874560 (dev /dev/vdd sector 19512)
>> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257
>> off 2228224 csum

This looks like an existing bug, IIRC Zygo reported it before.

Btrfs balance just randomly failed at data reloc tree.

Thus I don't believe it's related to Ethan's patches.

Thanks,
Qu

> 
> Hi,
> I've rebased my kernel environment to the latest for-next branch,
> xfstests is updated to latest as well.
> Test 125 still passes many times without even one failure.
> 
> here's my local.config
> 
> export TEST_DEV=/dev/sdc
> export TEST_DIR=/mnt/test
> export SCRATCH_MNT=/mnt/scratch
> export FSTYP=btrfs
> export SCRATCH_DEV_POOL="/dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh"
> 
> each device has 80GB capacity.
> 
> Besides, LOGWRITES_DEV is not set and CONFIG_FAULT_INJECTION_DEBUG_FS
> is turned off, but both seems to be unrelated to 125.
> 
> thanks,
> ethanwu
> 
> 
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-11  4:33           ` Qu Wenruo
@ 2020-02-11 18:21             ` David Sterba
  2020-02-12 11:32               ` ethanwu
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2020-02-11 18:21 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: ethanwu, dsterba, Josef Bacik, linux-btrfs

On Tue, Feb 11, 2020 at 12:33:48PM +0800, Qu Wenruo wrote:
> >> 39862272 have 30949376
> >> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
> >> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
> >> off 39862272 (dev /dev/vdd sector 19488)
> >> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
> >> off 39866368 (dev /dev/vdd sector 19496)
> >> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
> >> off 39870464 (dev /dev/vdd sector 19504)
> >> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
> >> off 39874560 (dev /dev/vdd sector 19512)
> >> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257
> >> off 2228224 csum
> 
> This looks like an existing bug, IIRC Zygo reported it before.
> 
> Btrfs balance just randomly failed at data reloc tree.
> 
> Thus I don't believe it's related to Ethan's patches.

Ok, than the patches make it more likely to happen, which could mean
that faster backref processing hits some race window. As there could be
more we should first fix the bug you say Zygo reported.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-11 18:21             ` David Sterba
@ 2020-02-12 11:32               ` ethanwu
  2020-02-12 12:03                 ` Filipe Manana
  2020-02-12 12:11                 ` Qu Wenruo
  0 siblings, 2 replies; 25+ messages in thread
From: ethanwu @ 2020-02-12 11:32 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, ethanwu, Josef Bacik, linux-btrfs

David Sterba 於 2020-02-12 02:21 寫到:
> On Tue, Feb 11, 2020 at 12:33:48PM +0800, Qu Wenruo wrote:
>> >> 39862272 have 30949376
>> >> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
>> >> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
>> >> off 39862272 (dev /dev/vdd sector 19488)
>> >> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
>> >> off 39866368 (dev /dev/vdd sector 19496)
>> >> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
>> >> off 39870464 (dev /dev/vdd sector 19504)
>> >> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
>> >> off 39874560 (dev /dev/vdd sector 19512)
>> >> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257
>> >> off 2228224 csum
>> 
>> This looks like an existing bug, IIRC Zygo reported it before.
>> 
>> Btrfs balance just randomly failed at data reloc tree.
>> 
>> Thus I don't believe it's related to Ethan's patches.
> 
> Ok, than the patches make it more likely to happen, which could mean
> that faster backref processing hits some race window. As there could be
> more we should first fix the bug you say Zygo reported.

I added a log to check if find_parent_nodes is ever called under
test btrfs/125. It turns out that btrfs/125 doesn't pass through the
function. What my patches do is all under find_parent_nodes.
Therefore, I don't think my patch would make btrfs/125 more likely
to happen, at least it doesn't change the behavior of functions
btrfs/125 run through.

Is it easy to reproduce in your test environment?

Thanks,
ethanwu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-12 11:32               ` ethanwu
@ 2020-02-12 12:03                 ` Filipe Manana
  2020-02-12 12:11                 ` Qu Wenruo
  1 sibling, 0 replies; 25+ messages in thread
From: Filipe Manana @ 2020-02-12 12:03 UTC (permalink / raw)
  To: ethanwu; +Cc: dsterba, Qu Wenruo, Josef Bacik, linux-btrfs

On Wed, Feb 12, 2020 at 11:34 AM ethanwu <ethanwu@synology.com> wrote:
>
> David Sterba 於 2020-02-12 02:21 寫到:
> > On Tue, Feb 11, 2020 at 12:33:48PM +0800, Qu Wenruo wrote:
> >> >> 39862272 have 30949376
> >> >> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
> >> >> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
> >> >> off 39862272 (dev /dev/vdd sector 19488)
> >> >> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
> >> >> off 39866368 (dev /dev/vdd sector 19496)
> >> >> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
> >> >> off 39870464 (dev /dev/vdd sector 19504)
> >> >> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
> >> >> off 39874560 (dev /dev/vdd sector 19512)
> >> >> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino 257
> >> >> off 2228224 csum
> >>
> >> This looks like an existing bug, IIRC Zygo reported it before.
> >>
> >> Btrfs balance just randomly failed at data reloc tree.
> >>
> >> Thus I don't believe it's related to Ethan's patches.
> >
> > Ok, than the patches make it more likely to happen, which could mean
> > that faster backref processing hits some race window. As there could be
> > more we should first fix the bug you say Zygo reported.
>
> I added a log to check if find_parent_nodes is ever called under
> test btrfs/125. It turns out that btrfs/125 doesn't pass through the
> function. What my patches do is all under find_parent_nodes.
> Therefore, I don't think my patch would make btrfs/125 more likely
> to happen, at least it doesn't change the behavior of functions
> btrfs/125 run through.

Yep, test btrfs/125 doesn't trigger backreference walking.
Backreference walking is used by fiemap, send, the logical ino ioctls,
scrub (in error paths to get a path for an inode), and qgroups. The
test doesn't use any of these features.

I've seen that test fail for the same reason once as well, so it's
definitely not caused by your patches.

Thanks.

>
> Is it easy to reproduce in your test environment?
>
> Thanks,
> ethanwu



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-12 11:32               ` ethanwu
  2020-02-12 12:03                 ` Filipe Manana
@ 2020-02-12 12:11                 ` Qu Wenruo
  2020-02-12 14:57                   ` David Sterba
  1 sibling, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2020-02-12 12:11 UTC (permalink / raw)
  To: ethanwu, dsterba, Josef Bacik, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2108 bytes --]



On 2020/2/12 下午7:32, ethanwu wrote:
> David Sterba 於 2020-02-12 02:21 寫到:
>> On Tue, Feb 11, 2020 at 12:33:48PM +0800, Qu Wenruo wrote:
>>> >> 39862272 have 30949376
>>> >> [ 5949.328136] repair_io_failure: 22 callbacks suppressed
>>> >> [ 5949.328139] BTRFS info (device vdb): read error corrected: ino 0
>>> >> off 39862272 (dev /dev/vdd sector 19488)
>>> >> [ 5949.333447] BTRFS info (device vdb): read error corrected: ino 0
>>> >> off 39866368 (dev /dev/vdd sector 19496)
>>> >> [ 5949.336875] BTRFS info (device vdb): read error corrected: ino 0
>>> >> off 39870464 (dev /dev/vdd sector 19504)
>>> >> [ 5949.340325] BTRFS info (device vdb): read error corrected: ino 0
>>> >> off 39874560 (dev /dev/vdd sector 19512)
>>> >> [ 5949.409934] BTRFS warning (device vdb): csum failed root -9 ino
>>> 257
>>> >> off 2228224 csum
>>>
>>> This looks like an existing bug, IIRC Zygo reported it before.
>>>
>>> Btrfs balance just randomly failed at data reloc tree.
>>>
>>> Thus I don't believe it's related to Ethan's patches.
>>
>> Ok, than the patches make it more likely to happen, which could mean
>> that faster backref processing hits some race window. As there could be
>> more we should first fix the bug you say Zygo reported.
> 
> I added a log to check if find_parent_nodes is ever called under
> test btrfs/125. It turns out that btrfs/125 doesn't pass through the
> function. What my patches do is all under find_parent_nodes.

Balance goes through its own backref cache, thus it doesn't utilize the
path you're modifying.

So don't worry your patches look pretty good.

Furthermore, this csum mismatch is not related to backref walk, but the
data csum and the data in data reloc tree, which are all created by balance.

So there is really no reason to block such good optimization.

Thanks,
Qu
> Therefore, I don't think my patch would make btrfs/125 more likely
> to happen, at least it doesn't change the behavior of functions
> btrfs/125 run through.
> 
> Is it easy to reproduce in your test environment?>
> Thanks,
> ethanwu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-12 12:11                 ` Qu Wenruo
@ 2020-02-12 14:57                   ` David Sterba
  2020-02-13  0:59                     ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2020-02-12 14:57 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: ethanwu, dsterba, Josef Bacik, linux-btrfs

On Wed, Feb 12, 2020 at 08:11:56PM +0800, Qu Wenruo wrote:
> >>>
> >>> This looks like an existing bug, IIRC Zygo reported it before.
> >>>
> >>> Btrfs balance just randomly failed at data reloc tree.
> >>>
> >>> Thus I don't believe it's related to Ethan's patches.
> >>
> >> Ok, than the patches make it more likely to happen, which could mean
> >> that faster backref processing hits some race window. As there could be
> >> more we should first fix the bug you say Zygo reported.
> > 
> > I added a log to check if find_parent_nodes is ever called under
> > test btrfs/125. It turns out that btrfs/125 doesn't pass through the
> > function. What my patches do is all under find_parent_nodes.
> 
> Balance goes through its own backref cache, thus it doesn't utilize the
> path you're modifying.
> 
> So don't worry your patches look pretty good.
> 
> Furthermore, this csum mismatch is not related to backref walk, but the
> data csum and the data in data reloc tree, which are all created by balance.
> 
> So there is really no reason to block such good optimization.

I don't mean to block the patchset but when I test patchsets from 5
people and tests start to fail I need to know what's the cause and if
there's a fix in sight. So far the test failed 2 out of 2 (once the
branch itself and then with for-next), I can do more rounds but at this
point it's too reliable to reproduce so there is some connection.

Sometimes it looks like I blame the messenger and complaining under
patches that don't cause the bugs, but often I don't have anyting better
than new warnings between 2 test rounds. Once we have more eyes on the
problem we'll narrow it down and find the root cause.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-12 14:57                   ` David Sterba
@ 2020-02-13  0:59                     ` Qu Wenruo
  2020-02-18 16:54                       ` David Sterba
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2020-02-13  0:59 UTC (permalink / raw)
  To: dsterba, ethanwu, Josef Bacik, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1992 bytes --]



On 2020/2/12 下午10:57, David Sterba wrote:
> On Wed, Feb 12, 2020 at 08:11:56PM +0800, Qu Wenruo wrote:
>>>>>
>>>>> This looks like an existing bug, IIRC Zygo reported it before.
>>>>>
>>>>> Btrfs balance just randomly failed at data reloc tree.
>>>>>
>>>>> Thus I don't believe it's related to Ethan's patches.
>>>>
>>>> Ok, than the patches make it more likely to happen, which could mean
>>>> that faster backref processing hits some race window. As there could be
>>>> more we should first fix the bug you say Zygo reported.
>>>
>>> I added a log to check if find_parent_nodes is ever called under
>>> test btrfs/125. It turns out that btrfs/125 doesn't pass through the
>>> function. What my patches do is all under find_parent_nodes.
>>
>> Balance goes through its own backref cache, thus it doesn't utilize the
>> path you're modifying.
>>
>> So don't worry your patches look pretty good.
>>
>> Furthermore, this csum mismatch is not related to backref walk, but the
>> data csum and the data in data reloc tree, which are all created by balance.
>>
>> So there is really no reason to block such good optimization.
> 
> I don't mean to block the patchset but when I test patchsets from 5
> people and tests start to fail I need to know what's the cause and if
> there's a fix in sight. So far the test failed 2 out of 2 (once the
> branch itself and then with for-next), I can do more rounds but at this
> point it's too reliable to reproduce so there is some connection.
> 
> Sometimes it looks like I blame the messenger and complaining under
> patches that don't cause the bugs, but often I don't have anyting better
> than new warnings between 2 test rounds. Once we have more eyes on the
> problem we'll narrow it down and find the root cause.
> 
BTW, from your initial report, the csum looks pretty long.

Are you testing with those new csum algos? And could that be the reason
why it's much easier to reproduce?

Thanks,
Qu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset
  2020-02-13  0:59                     ` Qu Wenruo
@ 2020-02-18 16:54                       ` David Sterba
  0 siblings, 0 replies; 25+ messages in thread
From: David Sterba @ 2020-02-18 16:54 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, ethanwu, Josef Bacik, linux-btrfs

On Thu, Feb 13, 2020 at 08:59:56AM +0800, Qu Wenruo wrote:
> BTW, from your initial report, the csum looks pretty long.
> 
> Are you testing with those new csum algos? And could that be the reason
> why it's much easier to reproduce?

I have various configurations of test setups, some VMs use SHA256 and it
could be the factor that makes the bug reproducible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/4] btrfs: improve normal backref walking
  2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
                   ` (3 preceding siblings ...)
  2020-02-07  9:38 ` [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs ethanwu
@ 2020-02-20 16:41 ` David Sterba
  4 siblings, 0 replies; 25+ messages in thread
From: David Sterba @ 2020-02-20 16:41 UTC (permalink / raw)
  To: ethanwu; +Cc: linux-btrfs

On Fri, Feb 07, 2020 at 05:38:14PM +0800, ethanwu wrote:
> Btrfs has two types of data backref.
> For BTRFS_EXTENT_DATA_REF_KEY type of backref, we don't have the
> exact block number. Therefore, we need to call resolve_indirect_refs.
> It uses btrfs_search_slot to locate the leaf block. Then
> we need to walk through the leaves to search for the EXTENT_DATA items
> that have disk bytenr matching the extent item(add_all_parents).
> 
> When resolving indirect refs, we could take entries that don't
> belong to the backref entry we are searching for right now.
> For that reason when searching backref entry, we always use total
> refs of that EXTENT_ITEM rather than individual count.
> 
> For example:
> item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize
>   extent refs 24 gen 7302 flags DATA
>   shared data backref parent 394985472 count 10 #1
>   extent data backref root 257 objectid 260 offset 1048576 count 3 #2
>   extent data backref root 256 objectid 260 offset 65536 count 6 #3
>   extent data backref root 257 objectid 260 offset 65536 count 5 #4
> 
> For example, when searching backref entry #4, we'll use total_refs
> 24, a very loose loop ending condition, instead of total_refs = 5.
> 
> But using total_refs=24 is not accurate. Sometimes, we'll never find
> all the refs from specific root.
> As a result, the loop keeps on going until we reach the end of that inode.
> 
> The first 3 patches, handle 3 different types refs we might encounter.
> These refs do not belong to the normal backref we are searching, and
> hence need to be skipped.
> 
> The last patch changes the total_refs to correct number so that we could
> end loop as soon as we find all the refs we want.
> 
> btrfs send uses backref to find possible clone sources, the following
> is a simple test to compare the results with and without this patch
> 
> btrfs subvolume create /volume1/sub1
> for i in `seq 1 163840`; do dd if=/dev/zero of=/volume1/sub1/file bs=64K count=1 seek=$((i-1)) conv=notrunc oflag=direct 2>/dev/null; done
> btrfs subvolume snapshot /volume1/sub1 /volume1/sub2
> for i in `seq 1 163840`; do dd if=/dev/zero of=/volume1/sub1/file bs=4K count=1 seek=$(((i-1)*16+10)) conv=notrunc oflag=direct 2>/dev/null; done
> btrfs subvolume snapshot -r /volume1/sub1 /volume1/snap1
> time btrfs send /volume1/snap1 | btrfs receive /volume2
> 
> without this patch
> real 69m48.124s
> user 0m50.199s
> sys  70m15.600s
> 
> with this patch
> real    1m59.683s
> user    0m35.421s
> sys     2m42.684s

This is too good to be left only in the cover letter and lost in the
mailinglist, so I copied that to the 4th patch that puts all the things
together and the explanation is very useful. Also the numbers show a
significant improvement.

I've moved the patchset from a topic branch to misc-next for wider
testing. The test failure I reported is not directly caused by these
changes but it's still something to watch for. The target merge is 5.7
so there's plenty of time to test, backreferences are the corner stone
of btrfs so this must work 100%.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-02-20 16:41 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-07  9:38 [PATCH 0/4] btrfs: improve normal backref walking ethanwu
2020-02-07  9:38 ` [PATCH 1/4] btrfs: backref, only collect file extent items matching backref offset ethanwu
2020-02-07 16:26   ` Josef Bacik
2020-02-10  9:12     ` ethanwu
2020-02-10 16:29       ` David Sterba
2020-02-11  4:03         ` ethanwu
2020-02-11  4:33           ` Qu Wenruo
2020-02-11 18:21             ` David Sterba
2020-02-12 11:32               ` ethanwu
2020-02-12 12:03                 ` Filipe Manana
2020-02-12 12:11                 ` Qu Wenruo
2020-02-12 14:57                   ` David Sterba
2020-02-13  0:59                     ` Qu Wenruo
2020-02-18 16:54                       ` David Sterba
2020-02-10 10:33   ` Johannes Thumshirn
2020-02-07  9:38 ` [PATCH 2/4] btrfs: backref, not adding refs from shared block when resolving normal backref ethanwu
2020-02-07 16:35   ` Josef Bacik
2020-02-10 10:51   ` Johannes Thumshirn
2020-02-07  9:38 ` [PATCH 3/4] btrfs: backref, only search backref entries from leaves of the same root ethanwu
2020-02-07 16:37   ` Josef Bacik
2020-02-10 10:54   ` Johannes Thumshirn
2020-02-07  9:38 ` [PATCH 4/4] btrfs: backref, use correct count to resolve normal data refs ethanwu
2020-02-07 16:39   ` Josef Bacik
2020-02-10 10:55   ` Johannes Thumshirn
2020-02-20 16:41 ` [PATCH 0/4] btrfs: improve normal backref walking David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.