Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/2] btrfs-progs: dump-tree: Support print trees being half dropped
@ 2019-08-14  1:04 Qu Wenruo
  2019-08-14  1:04 ` [PATCH 1/2] btrfs-progs: Introduce drop borderline for drop progress Qu Wenruo
  2019-08-14  1:04 ` [PATCH 2/2] btrfs-progs: print-tree: Skip dropped tree blocks properly Qu Wenruo
  0 siblings, 2 replies; 3+ messages in thread
From: Qu Wenruo @ 2019-08-14  1:04 UTC (permalink / raw)
  To: linux-btrfs

For half dropped subvolumes, dump-tree can sometimes lead to csum error
if the dropped tree blocks are also trimmed.

E.g
  node 33153024 level 1 items 61 free 60 generation 7 owner 256
  fs uuid 793daf87-b345-4896-a500-adf19baabd92
  chunk uuid 6fcfe63c-3a64-4e78-b095-5980186a0cc0
          key (256 INODE_ITEM 0) block 33173504 gen 7 <<< Dropped
          key (256 DIR_ITEM 266948847) block 30556160 gen 7
          key (256 DIR_ITEM 540403373) block 30658560 gen 7
  ...
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  bad tree block 33173504, bytenr mismatch, want=33173504, have=0
  failed to read 33173504 in tree 256


This patch will make btrfs ins dump-tree to detect dropped tree blocks
and output something like:

  node 33153024 level 1 items 61 free 60 generation 7 owner 256
  fs uuid 793daf87-b345-4896-a500-adf19baabd92
  chunk uuid 6fcfe63c-3a64-4e78-b095-5980186a0cc0
          key (256 INODE_ITEM 0) block 33173504 gen 7 =DROPPED=
          key (256 DIR_ITEM 266948847) block 30556160 gen 7 =DROPPED=
          key (256 DIR_ITEM 540403373) block 30658560 gen 7
  ...
  leaf 30658560 items 56 free space 172 generation 7 owner 256
  leaf 30658560 flags 0x1(WRITTEN) backref revision 1
  fs uuid 793daf87-b345-4896-a500-adf19baabd92
  chunk uuid 6fcfe63c-3a64-4e78-b095-5980186a0cc0
          item 0 key (256 DIR_ITEM 540403373) itemoff 3953 itemsize 42
                  location key (1045 INODE_ITEM 0) type FILE
                  transid 7 data_len 0 name_len 12
                  name: file_reg_394
  ...

And skip the dropped tree blocks completely.


Such problem also seems to affect original mode check, when dropped
subvolumes triggers false alert for missing backref.
But that problem needs more confirmation, and will be addressed in
another patchset.

Qu Wenruo (2):
  btrfs-progs: Introduce drop borderline for drop progress
  btrfs-progs: print-tree: Skip dropped tree blocks properly

 cmds/inspect-dump-tree.c | 33 ++++++++++++++------
 ctree.c                  | 31 +++++++++++++++++++
 ctree.h                  | 60 +++++++++++++++++++++++++++++++++++
 print-tree.c             | 67 ++++++++++++++++++++++++++++++++--------
 print-tree.h             |  4 ++-
 5 files changed, 171 insertions(+), 24 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] btrfs-progs: Introduce drop borderline for drop progress
  2019-08-14  1:04 [PATCH 0/2] btrfs-progs: dump-tree: Support print trees being half dropped Qu Wenruo
@ 2019-08-14  1:04 ` Qu Wenruo
  2019-08-14  1:04 ` [PATCH 2/2] btrfs-progs: print-tree: Skip dropped tree blocks properly Qu Wenruo
  1 sibling, 0 replies; 3+ messages in thread
From: Qu Wenruo @ 2019-08-14  1:04 UTC (permalink / raw)
  To: linux-btrfs

Btrfs kernel module drop its subvolume tree using post-order traversal,
this sometimes makes things harder to determine if we should print
one subtree.

For the full hassle it can lead to, please check the lengthy comment.

This patch will introduce a simple new facility, btrfs_drop_borderline,
to address the problem.

Currently the only user of such tree traversal is dump-tree.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 ctree.c | 31 +++++++++++++++++++++++++++++
 ctree.h | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/ctree.c b/ctree.c
index c97fbb40bb6e..59d0cb7ab89e 100644
--- a/ctree.c
+++ b/ctree.c
@@ -3233,3 +3233,34 @@ out:
 	btrfs_free_path(path);
 	return ret;
 }
+
+int btrfs_init_drop_borderline(struct btrfs_root *root,
+			       struct btrfs_drop_borderline *borderline)
+{
+	struct btrfs_path path;
+	struct btrfs_key drop_key;
+	int i;
+	int ret;
+
+	memset(borderline, 0, sizeof(*borderline));
+	btrfs_disk_key_to_cpu(&drop_key, &root->root_item.drop_progress);
+	if (is_empty_key(&drop_key) || root->root_item.drop_level == 0)
+		return 0;
+
+	btrfs_init_path(&path);
+	path.lowest_level = root->root_item.drop_level;
+
+	ret = btrfs_search_slot(NULL, root, &drop_key, &path, 0, 0);
+	if (ret < 0)
+		goto out;
+	ret = 0;
+
+	for (i = root->root_item.drop_level; i < BTRFS_MAX_LEVEL; i++) {
+		if (path.nodes[i])
+			btrfs_node_key_to_cpu(path.nodes[i],
+					&borderline->keys[i], path.slots[i]);
+	}
+out:
+	btrfs_release_path(&path);
+	return ret;
+}
diff --git a/ctree.h b/ctree.h
index 0d12563b7261..ed4bfdac77eb 100644
--- a/ctree.h
+++ b/ctree.h
@@ -1217,6 +1217,66 @@ struct btrfs_root {
 	struct rb_node rb_node;
 };
 
+/*
+ * drop progress indicator.
+ *
+ * Current kernel subvolume drop progress is post-order iteration of each
+ * level 1 node.
+ * We can't just easily compare a key of a node against root::drop_progress.
+ * E.g. The number is the first key of the tree block.
+ *     Level 3                 A (1)
+ *                        /         \
+ *     Level 2          B(1)         ...
+ *                  /   |   \
+ *     Level 1    C(1)  D(4)  E(7)
+ *               /|\   /|\   /|\
+ *     Level 0  F G H I J K L M N
+ *              1 2 3 4 5 6 7 8 9
+ * The number is the key of the leaf.
+ * The first key of B ~ D will be 1, 4, 7
+ *
+ * If we have drop progress of 5, tree blocks C F G H should haven been
+ * dropped. While A B D E J K L M N are still untouched.
+ *
+ * We can't simply compare the key of the a tree block against drop progress.
+ * Or D(4) will be considered already dropped.
+ * We can't simply compare the drop level, as it's always fixed to 1, while
+ * we have already dropped level 1 node C.
+ *
+ * What we need is a full path, at least a key for each level to mark the
+ * borderline. Here we don't use path is avoid holding tree blocks references.
+ *
+ * For above case, we need will create a borderline like:
+ * Level 4~7   0 (0 means not populated as no valid key is 0)
+ * Level 3     1
+ * Level 2     1
+ * Level 1     4
+ * (Level 0 makes no sense)
+ *
+ * Any tree block has smaller key than it's corresponding borderline key will
+ * be considered dropped.
+ *
+ * Currently, only btrfs check and print-tree need this feature.
+ * Kernel doesn't need to iterate a dropped tree, thus kernel has no need
+ * for such facility.
+ */
+struct btrfs_drop_borderline {
+       /*
+        * In theory, it can be BTRFS_MAX_LEVEL - 1, but that will be
+        * confusing against btrfs_path usage.
+        */
+       struct btrfs_key keys[BTRFS_MAX_LEVEL];
+};
+
+static inline bool is_empty_key(struct btrfs_key *key)
+{
+	if (key->objectid == 0 && key->type == 0 && key->offset == 0)
+		return true;
+	return false;
+}
+
+int btrfs_init_drop_borderline(struct btrfs_root *root,
+			       struct btrfs_drop_borderline *borderline);
 static inline u32 BTRFS_MAX_ITEM_SIZE(const struct btrfs_fs_info *info)
 {
 	return BTRFS_LEAF_DATA_SIZE(info) - sizeof(struct btrfs_item);
-- 
2.22.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 2/2] btrfs-progs: print-tree: Skip dropped tree blocks properly
  2019-08-14  1:04 [PATCH 0/2] btrfs-progs: dump-tree: Support print trees being half dropped Qu Wenruo
  2019-08-14  1:04 ` [PATCH 1/2] btrfs-progs: Introduce drop borderline for drop progress Qu Wenruo
@ 2019-08-14  1:04 ` Qu Wenruo
  1 sibling, 0 replies; 3+ messages in thread
From: Qu Wenruo @ 2019-08-14  1:04 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
For certain btrfs image, btrfs ins dump-tree can hit checksum error for
tree blocks:

  $ btrfs ins dump-tree -t 256 /dev/test/scratch1 > /dev/null
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  checksum verify failed on 33173504 found 295F0086 wanted 00000000
  bad tree block 33173504, bytenr mismatch, want=33173504, have=0

[CAUSE]
The fs is a completely valid fs, with its file tree 256 being half
dropped.

The fs is crafted by using dm-log-writes, with extra kernel hack to
commit transaction for each tree block dropped, and mounted with discard
option to discard dropped tree blocks.

The cause is pretty simple, btrfs ins dump-tree doesn't really check the
drop progress of a tree.
So when above condition is met, dump-tree will still try to print a tree
block which is already dropped.

[FIX]
Fix this problem by introducing @borderline parameter for
btrfs_print_tree() and its children functions.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 cmds/inspect-dump-tree.c | 33 ++++++++++++++------
 print-tree.c             | 67 ++++++++++++++++++++++++++++++++--------
 print-tree.h             |  4 ++-
 3 files changed, 80 insertions(+), 24 deletions(-)

diff --git a/cmds/inspect-dump-tree.c b/cmds/inspect-dump-tree.c
index e50130a4a161..c2145a2559bc 100644
--- a/cmds/inspect-dump-tree.c
+++ b/cmds/inspect-dump-tree.c
@@ -277,7 +277,7 @@ static int dump_print_tree_blocks(struct btrfs_fs_info *fs_info,
 			ret = -EIO;
 			goto next;
 		}
-		btrfs_print_tree(eb, follow, BTRFS_PRINT_TREE_DEFAULT);
+		btrfs_print_tree(eb, NULL, follow, BTRFS_PRINT_TREE_DEFAULT);
 		free_extent_buffer(eb);
 next:
 		remove_cache_extent(tree, ce);
@@ -486,20 +486,20 @@ static int cmd_inspect_dump_tree(const struct cmd_struct *cmd,
 		} else {
 			if (info->tree_root->node) {
 				printf("root tree\n");
-				btrfs_print_tree(info->tree_root->node, true,
-						 traverse);
+				btrfs_print_tree(info->tree_root->node, NULL,
+						 true, traverse);
 			}
 
 			if (info->chunk_root->node) {
 				printf("chunk tree\n");
-				btrfs_print_tree(info->chunk_root->node, true,
-						 traverse);
+				btrfs_print_tree(info->chunk_root->node, NULL,
+						 true, traverse);
 			}
 
 			if (info->log_root_tree) {
 				printf("log root tree\n");
 				btrfs_print_tree(info->log_root_tree->node,
-						 true, traverse);
+						 NULL, true, traverse);
 			}
 		}
 	}
@@ -519,7 +519,7 @@ again:
 			goto close_root;
 		}
 		printf("root tree\n");
-		btrfs_print_tree(info->tree_root->node, true, traverse);
+		btrfs_print_tree(info->tree_root->node, NULL, true, traverse);
 		goto close_root;
 	}
 
@@ -529,7 +529,7 @@ again:
 			goto close_root;
 		}
 		printf("chunk tree\n");
-		btrfs_print_tree(info->chunk_root->node, true, traverse);
+		btrfs_print_tree(info->chunk_root->node, NULL, true, traverse);
 		goto close_root;
 	}
 
@@ -539,7 +539,7 @@ again:
 			goto close_root;
 		}
 		printf("log root tree\n");
-		btrfs_print_tree(info->log_root_tree->node, true, traverse);
+		btrfs_print_tree(info->log_root_tree->node, NULL, true, traverse);
 		goto close_root;
 	}
 
@@ -566,12 +566,24 @@ again:
 		btrfs_item_key(leaf, &disk_key, path.slots[0]);
 		btrfs_disk_key_to_cpu(&found_key, &disk_key);
 		if (found_key.type == BTRFS_ROOT_ITEM_KEY) {
+			struct btrfs_root *root;
+			struct btrfs_key search_key;
+			struct btrfs_drop_borderline borderline;
 			unsigned long offset;
 			struct extent_buffer *buf;
 			int skip = extent_only | device_only | uuid_tree_only;
 
+			search_key.objectid = found_key.objectid;
+			search_key.type = found_key.type;
+			search_key.offset = -1;
 			offset = btrfs_item_ptr_offset(leaf, slot);
 			read_extent_buffer(leaf, &ri, offset, sizeof(ri));
+			root = btrfs_read_fs_root(info, &search_key);
+			if (!IS_ERR_OR_NULL(root)) {
+				btrfs_init_drop_borderline(root, &borderline);
+			} else {
+				memset(&borderline, 0, sizeof(borderline));
+			}
 			buf = read_tree_block(info, btrfs_root_bytenr(&ri), 0);
 			if (!extent_buffer_uptodate(buf))
 				goto next;
@@ -685,7 +697,8 @@ again:
 					       btrfs_header_level(buf));
 				} else {
 					printf(" \n");
-					btrfs_print_tree(buf, true, traverse);
+					btrfs_print_tree(buf, &borderline,
+							 true, traverse);
 				}
 			}
 			free_extent_buffer(buf);
diff --git a/print-tree.c b/print-tree.c
index b31e515f8989..105663a00710 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -1352,6 +1352,7 @@ void btrfs_print_leaf(struct extent_buffer *eb)
 
 /* Helper function to reach the leftmost tree block at @path->lowest_level */
 static int search_leftmost_tree_block(struct btrfs_fs_info *fs_info,
+				      struct btrfs_drop_borderline *borderline,
 				      struct btrfs_path *path, int root_level)
 {
 	int i;
@@ -1365,12 +1366,20 @@ static int search_leftmost_tree_block(struct btrfs_fs_info *fs_info,
 		free_extent_buffer(path->nodes[i]);
 	}
 
-	/* Reach the leftmost tree block by always reading out slot 0 */
+	/*
+	 * Reach the leftmost tree block by always reading out slot 0
+	 * or goes for the borderline
+	 */
 	for (i = root_level; i > path->lowest_level; i--) {
 		struct extent_buffer *eb;
 
-		path->slots[i] = 0;
-		eb = read_node_slot(fs_info, path->nodes[i], 0);
+		if (borderline)
+			btrfs_bin_search(path->nodes[i], &borderline->keys[i],
+					 i, &path->slots[i]);
+		else
+			path->slots[i] = 0;
+
+		eb = read_node_slot(fs_info, path->nodes[i], path->slots[i]);
 		if (!extent_buffer_uptodate(eb)) {
 			ret = -EIO;
 			goto out;
@@ -1381,7 +1390,8 @@ out:
 	return ret;
 }
 
-static void bfs_print_children(struct extent_buffer *root_eb)
+static void bfs_print_children(struct extent_buffer *root_eb,
+			       struct btrfs_drop_borderline *borderline)
 {
 	struct btrfs_fs_info *fs_info = root_eb->fs_info;
 	struct btrfs_path path;
@@ -1401,13 +1411,14 @@ static void bfs_print_children(struct extent_buffer *root_eb)
 		path.lowest_level = cur_level;
 
 		/* Use the leftmost tree block as a starting point */
-		ret = search_leftmost_tree_block(fs_info, &path, root_level);
+		ret = search_leftmost_tree_block(fs_info, borderline, &path,
+						 root_level);
 		if (ret < 0)
 			goto out;
 
 		/* Print all sibling tree blocks */
 		while (1) {
-			btrfs_print_tree(path.nodes[cur_level], 0,
+			btrfs_print_tree(path.nodes[cur_level], borderline, 0,
 					 BTRFS_PRINT_TREE_BFS);
 			ret = btrfs_next_sibling_tree_block(fs_info, &path);
 			if (ret < 0)
@@ -1423,7 +1434,8 @@ out:
 	return;
 }
 
-static void dfs_print_children(struct extent_buffer *root_eb)
+static void dfs_print_children(struct extent_buffer *root_eb,
+			       struct btrfs_drop_borderline *borderline)
 {
 	struct btrfs_fs_info *fs_info = root_eb->fs_info;
 	struct extent_buffer *next;
@@ -1432,6 +1444,15 @@ static void dfs_print_children(struct extent_buffer *root_eb)
 	int i;
 
 	for (i = 0; i < nr; i++) {
+		struct btrfs_key key;
+
+		if (borderline) {
+			btrfs_node_key_to_cpu(root_eb, &key, i);
+			/* We are at the left side, skip the slot */
+			if (btrfs_comp_cpu_keys(&key,
+					&borderline->keys[root_eb_level]) < 0)
+				continue;
+		}
 		next = read_tree_block(fs_info, btrfs_node_blockptr(root_eb, i),
 				btrfs_node_ptr_generation(root_eb, i));
 		if (!extent_buffer_uptodate(next)) {
@@ -1449,16 +1470,29 @@ static void dfs_print_children(struct extent_buffer *root_eb)
 			free_extent_buffer(next);
 			continue;
 		}
-		btrfs_print_tree(next, 1, BTRFS_PRINT_TREE_DFS);
+		btrfs_print_tree(next, borderline, 1, BTRFS_PRINT_TREE_DFS);
 		free_extent_buffer(next);
 	}
 }
 
-void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse)
+/*
+ * Print the tree block or the whole subtree at @eb.
+ *
+ * @borderline:	if given, this function will ensure no tree blocks at the
+ * 		left side of the borderline will be printed.
+ * 		(tree blocks at borderline will still be printed)
+ * @follow:	whether to print the whole subtree. Only makes sense for nodes.
+ * @traverse:	the order to iterate the subtree. Only makes sense when
+ * 		@follow is true
+ */
+void btrfs_print_tree(struct extent_buffer *eb,
+		      struct btrfs_drop_borderline *borderline,
+		      bool follow, int traverse)
 {
 	u32 i;
 	u32 nr;
 	u32 ptr_num;
+	int level;
 	struct btrfs_fs_info *fs_info = eb->fs_info;
 	struct btrfs_disk_key disk_key;
 	struct btrfs_key key;
@@ -1468,6 +1502,7 @@ void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse)
 	if (traverse != BTRFS_PRINT_TREE_DFS && traverse != BTRFS_PRINT_TREE_BFS)
 		traverse = BTRFS_PRINT_TREE_DEFAULT;
 
+	level = btrfs_header_level(eb);
 	nr = btrfs_header_nritems(eb);
 	if (btrfs_is_leaf(eb)) {
 		btrfs_print_leaf(eb);
@@ -1489,15 +1524,21 @@ void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse)
 	fflush(stdout);
 	ptr_num = BTRFS_NODEPTRS_PER_EXTENT_BUFFER(eb);
 	for (i = 0; i < nr && i < ptr_num; i++) {
+		const char *dropped_str = "";
+
 		u64 blocknr = btrfs_node_blockptr(eb, i);
 
 		btrfs_node_key(eb, &disk_key, i);
 		btrfs_disk_key_to_cpu(&key, &disk_key);
 		printf("\t");
 		btrfs_print_key(&disk_key);
-		printf(" block %llu gen %llu\n",
+		if (borderline && btrfs_comp_cpu_keys(&key,
+					&borderline->keys[level]) < 0)
+			dropped_str = " =DROPPED=";
+		printf(" block %llu gen %llu%s\n",
 		       (unsigned long long)blocknr,
-		       (unsigned long long)btrfs_node_ptr_generation(eb, i));
+		       (unsigned long long)btrfs_node_ptr_generation(eb, i),
+		       dropped_str);
 		fflush(stdout);
 	}
 	if (!follow)
@@ -1507,8 +1548,8 @@ void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse)
 		return;
 
 	if (traverse == BTRFS_PRINT_TREE_DFS)
-		dfs_print_children(eb);
+		dfs_print_children(eb, borderline);
 	else
-		bfs_print_children(eb);
+		bfs_print_children(eb, borderline);
 	return;
 }
diff --git a/print-tree.h b/print-tree.h
index d4721b60647f..5574d9102614 100644
--- a/print-tree.h
+++ b/print-tree.h
@@ -32,7 +32,9 @@ void btrfs_print_leaf(struct extent_buffer *l);
 #define BTRFS_PRINT_TREE_DFS		0
 #define BTRFS_PRINT_TREE_BFS		1
 #define BTRFS_PRINT_TREE_DEFAULT	BTRFS_PRINT_TREE_DFS
-void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse);
+void btrfs_print_tree(struct extent_buffer *eb,
+		      struct btrfs_drop_borderline *borderline,
+		      bool follow, int traverse);
 
 void btrfs_print_key(struct btrfs_disk_key *disk_key);
 void print_chunk_item(struct extent_buffer *eb, struct btrfs_chunk *chunk);
-- 
2.22.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, back to index

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-14  1:04 [PATCH 0/2] btrfs-progs: dump-tree: Support print trees being half dropped Qu Wenruo
2019-08-14  1:04 ` [PATCH 1/2] btrfs-progs: Introduce drop borderline for drop progress Qu Wenruo
2019-08-14  1:04 ` [PATCH 2/2] btrfs-progs: print-tree: Skip dropped tree blocks properly Qu Wenruo

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org linux-btrfs@archiver.kernel.org
	public-inbox-index linux-btrfs


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/ public-inbox