All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume
@ 2016-10-13 11:09 ethanwu
  2016-10-13 11:09 ` [PATCH 1/2] btrfs-progs: check: skip keys prior to drop key in deleted subvolume correctly ethanwu
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: ethanwu @ 2016-10-13 11:09 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

The patchsets fixes 2 problems in checking deleted subvolume.

First, there are cases missing in checking drop_key, this add up the missing cases,
so the skipping flow could work correctly.

Second, the drop key is only applied to subvolume tree. This should be applied to the
corresponding relocation tree of the deleted subvolume as well.

ethanwu (2):
  btrfs-progs: check: skip keys prior to drop key in deleted subvolume
    correctly
  btrfs-progs: check: add drop key to relocation tree

 cmds-check.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 62 insertions(+), 4 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] btrfs-progs: check: skip keys prior to drop key in deleted subvolume correctly
  2016-10-13 11:09 [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume ethanwu
@ 2016-10-13 11:09 ` ethanwu
  2016-10-13 11:09 ` [PATCH 2/2] btrfs-progs: check: add drop key to relocation tree ethanwu
  2016-11-15 16:00 ` [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: ethanwu @ 2016-10-13 11:09 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

In processing deleted subvolume, we would skip keys based on subvolume
drop key. The condition only checks the keys at same level. This is not
correct. Consider the following situation of one deleted subvolume.

This is a node of subvolume root:
node 13459456000 level 3 items 9 free 484 generation 12281 owner 869
	key (256 INODE_ITEM 0) block 29986832384 (1830251) gen 12999
	key (5828 DIR_ITEM 819422210) block 30048043008 (1833987) gen 12995
	key (17069 INODE_REF 16812) block 30057644032 (1834573) gen 12996
	key (28250 DIR_ITEM 3506778047) block 30063820800 (1834950) gen 12996
	key (39436 INODE_REF 39387) block 30077894656 (1835809) gen 12997
	key (50746 XATTR_ITEM 3572018377) block 29961109504 (1828681) gen 12998
	key (61962 DIR_ITEM 1803896258) block 29960192000 (1828625) gen 12998
	key (73287 INODE_ITEM 0) block 29988372480 (1830345) gen 12999
	key (84574 XATTR_ITEM 819422210) block 29986570240 (1830235) gen 12999

Suppose the drop key is (5828 DIR_ITEM 0) and drop level is 1.
btrfs check will iterate all nodes pointed by node 13459456000.
Now consider the node 29986832384 whose first key is (256 INODE_ITEM 0):
Since the last key of this node is prior to drop key, the node is freed if
no other snapshots point to it. Although the node is freed, node 29986832384
is still pointed by node 13459456000(not cowed) in subvolume deleted case.
There's a chance that node 29986832384 could be reused.
In the origin logic to check whether to skip the node or not,
it only checks the key at the same level. Therefore, in this case,
node(block) 29986832384, which has level 2, won't be skipped and will be
added to list to be processed. Later when reading the node 29986832384,
btrfs-check finds out that transid mismatches because node 29986832384 is
already reused.
The following error would happen:
parent transid verify failed on 29986832384 wanted 12999 found XXXXX

Fix this by skipping the drop key correctly. Here's the logic:

We refer to this_node and this_key as the node and the key we are processing
right now. next_key is the key next to this_key. In the above example, if
this_key is (61962 DIR_ITEM 1803896258), next_key is (73287 INODE_ITEM 0).

If the level of this_key is:
1. the same with drop_level of drop_key, simply compare the keys (the original
   logic).

2. less than the drop key, and value of this_key
2.1. this_key >= drop_key: this node needs to be processed.
2.2. this_key < drop_key:
    The first key of its ancestor node at drop_level meets the condition:
    ancestor_key <= this_key < drop_key which.
    According to condition 1, it will be skipped, so this should not happen.

3. greater than the drop key, and the value of this_key
3.1 this_key >= drop_key:
    All keys of descendants node pointed by this_key >= this_key >= drop_key.
    This node needs to be processed
3.2 this_key < drop_key:
    If this_key is not the last key in this node, check the next_key.
    If next_key <= drop_key, then this_key < next_key <= drop_key,
    we know the node containing this_key is already freed, so skip this_node.
    Otherwise, this_key < drop_key < next_key, we know that the drop_key at
    drop_level must be this_key's descendant, so this node needs to be
    processed.

By combinging the above conditions altogether, we only needs to skip
condition 1 and condition 3.2.
condition 3.1 implies next_key > this_key >= drop_key -> next_key > drop_key,
so we get the result conditions.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 cmds-check.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 670ccd1..bf6398d 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -6459,9 +6459,17 @@ static int run_next_block(struct btrfs_root *root,
 			size = root->nodesize;
 			btrfs_node_key_to_cpu(buf, &key, i);
 			if (ri != NULL) {
-				if ((level == ri->drop_level)
-				    && is_dropped_key(&key, &ri->drop_key)) {
-					continue;
+				if (ri->drop_level) {
+					if (level == ri->drop_level) {
+						if (is_dropped_key(&key, &ri->drop_key))
+							continue;
+					} else if (level > ri->drop_level && i+1 < nritems) {
+						struct btrfs_key next_key;
+
+						btrfs_node_key_to_cpu(buf, &next_key, i+1);
+						if (0 >= btrfs_comp_cpu_keys(&next_key, &ri->drop_key))
+							continue;
+					}
 				}
 			}
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] btrfs-progs: check: add drop key to relocation tree
  2016-10-13 11:09 [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume ethanwu
  2016-10-13 11:09 ` [PATCH 1/2] btrfs-progs: check: skip keys prior to drop key in deleted subvolume correctly ethanwu
@ 2016-10-13 11:09 ` ethanwu
  2016-11-15 16:00 ` [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: ethanwu @ 2016-10-13 11:09 UTC (permalink / raw)
  To: linux-btrfs; +Cc: ethanwu

If subvolume is deleted, we add drop_key into the corresponding
root item, so we know where to start processing the deleted subvolume.
However, we don't skip the keys prior to the drop_key in corresponding
relocation tree of the deleted subvolume. As a result, we might run
into block that is freed and being used again. This cause btrfs
check to report false alarm.

Fix this by adding drop_key for deleted subvolume to its corresponding
relocation tree.

Signed-off-by: ethanwu <ethanwu@synology.com>
---
 cmds-check.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index bf6398d..4d7cb68 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -92,6 +92,13 @@ struct extent_backref {
 	unsigned int broken:1;
 };
 
+static int add_root_item_to_list(struct list_head *head,
+				  u64 objectid, u64 bytenr, u64 last_snapshot,
+				  u8 level, u8 drop_level,
+				  int level_size, struct btrfs_key *drop_key);
+
+static void free_root_item_list(struct list_head *list);
+
 static inline struct extent_backref* to_extent_backref(struct list_head *entry)
 {
 	return list_entry(entry, struct extent_backref, list);
@@ -3761,6 +3768,10 @@ static int check_fs_roots(struct btrfs_root *root,
 	struct btrfs_root *tree_root = root->fs_info->tree_root;
 	int ret;
 	int err = 0;
+	struct list_head dropping_trees;
+	struct btrfs_disk_key *drop_progress;
+
+	INIT_LIST_HEAD(&dropping_trees);
 
 	if (ctx.progress_enabled) {
 		ctx.tp = TASK_FS_ROOTS;
@@ -3818,6 +3829,32 @@ again:
 				err = 1;
 				goto next;
 			}
+
+			drop_progress = &tmp_root->root_item.drop_progress;
+			if (btrfs_disk_key_objectid(drop_progress) != 0) {
+				struct btrfs_key drop_key;
+
+				btrfs_disk_key_to_cpu(&drop_key, drop_progress);
+				ret = add_root_item_to_list(&dropping_trees,
+						tmp_root->root_key.objectid,
+						0, 0, 0, tmp_root->root_item.drop_level,
+						0, &drop_key);
+				if (ret < 0) {
+					err = 1;
+					goto out;
+				}
+			} else if (key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
+				struct root_item_record *ri_rec;
+
+				list_for_each_entry(ri_rec, &dropping_trees, list) {
+					if (ri_rec->objectid == key.offset) {
+						btrfs_cpu_key_to_disk(drop_progress, &ri_rec->drop_key);
+						tmp_root->root_item.drop_level = ri_rec->drop_level;
+						break;
+					}
+				}
+			}
+
 			ret = check_fs_root(tmp_root, root_cache, &wc);
 			if (ret == -EAGAIN) {
 				free_root_recs_tree(root_cache);
@@ -3837,6 +3874,7 @@ next:
 		path.slots[0]++;
 	}
 out:
+	free_root_item_list(&dropping_trees);
 	btrfs_release_path(&path);
 	if (err)
 		free_extent_cache_tree(&wc.shared);
@@ -8536,6 +8574,9 @@ again:
 		if (found_key.type == BTRFS_ROOT_ITEM_KEY) {
 			unsigned long offset;
 			u64 last_snapshot;
+			struct root_item_record *ri_rec;
+			struct btrfs_key drop_key = {0};
+			u8 drop_level = 0;
 
 			offset = btrfs_item_ptr_offset(leaf, path.slots[0]);
 			read_extent_buffer(leaf, &ri, offset, sizeof(ri));
@@ -8543,11 +8584,20 @@ again:
 			if (btrfs_disk_key_objectid(&ri.drop_progress) == 0) {
 				level = btrfs_root_level(&ri);
 				level_size = root->nodesize;
+				if (found_key.objectid == BTRFS_TREE_RELOC_OBJECTID) {
+					list_for_each_entry(ri_rec, &dropping_trees, list) {
+						if (ri_rec->objectid == found_key.offset) {
+							drop_key = ri_rec->drop_key;
+							drop_level = ri_rec->drop_level;
+							break;
+						}
+					}
+				}
 				ret = add_root_item_to_list(&normal_trees,
 						found_key.objectid,
 						btrfs_root_bytenr(&ri),
 						last_snapshot, level,
-						0, level_size, NULL);
+						drop_level, level_size, &drop_key);
 				if (ret < 0)
 					goto out;
 			} else {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume
  2016-10-13 11:09 [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume ethanwu
  2016-10-13 11:09 ` [PATCH 1/2] btrfs-progs: check: skip keys prior to drop key in deleted subvolume correctly ethanwu
  2016-10-13 11:09 ` [PATCH 2/2] btrfs-progs: check: add drop key to relocation tree ethanwu
@ 2016-11-15 16:00 ` David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2016-11-15 16:00 UTC (permalink / raw)
  To: ethanwu; +Cc: linux-btrfs

Hi,

On Thu, Oct 13, 2016 at 07:09:47PM +0800, ethanwu wrote:
> The patchsets fixes 2 problems in checking deleted subvolume.
> 
> First, there are cases missing in checking drop_key, this add up the missing cases,
> so the skipping flow could work correctly.
> 
> Second, the drop key is only applied to subvolume tree. This should be applied to the
> corresponding relocation tree of the deleted subvolume as well.

the changelogs are very good, unfortunatelly I'm not very familiar with
the parts so I'd need a review from somebody else, otherwise the patches
will be on a slow merge path. Would be good if you could provide crafted
images where the bug exhibits. Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-15 16:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-13 11:09 [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume ethanwu
2016-10-13 11:09 ` [PATCH 1/2] btrfs-progs: check: skip keys prior to drop key in deleted subvolume correctly ethanwu
2016-10-13 11:09 ` [PATCH 2/2] btrfs-progs: check: add drop key to relocation tree ethanwu
2016-11-15 16:00 ` [PATCH 0/2] btrfs-progs: check: Handle drop_progress correctly for deleted subvolume David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.