linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in
@ 2019-09-05  7:57 Qu Wenruo
  2019-09-05  7:57 ` [PATCH v2 1/6] btrfs-progs: check: Export btrfs_type_to_imode Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

Before this patch, btrfs check can only repair bad free space cache
inode mode (as it was the first case detected by tree-checker and reported)

But Murphy is always right, what may happen will finally happen, we have
users reporting bad inode mode in subvolume trees.
According to the creation time, it looks like some older kernel around
2014 is causing the problem.

Although the reported get the fs fixed by removing the offending old
files, it's still a bad thing that "btrfs check" can't fix it.

This patch will bring the repair functionality to all inodes, along with
needed test image.

The core complexity is in how to determine the correct imode.
This patch will use the following methods to determine the correct
imode:
- INODE_REF
  Do a DIR_INDEX/ITEM search to find a valid filetype then convert it to
  imode. If it works, this should be the most reliable method.

- DIR_INDEX/DIR_ITEM belong to this inode
  Then this inode must be a directory.

- EXTENT_DATA
  This inode can be a regular file or soft link.
  We default to regular file so user can inspect the content to do
  further correction.

- rdev of INODE_ITEM
  If all above fails, and the INODE_ITEM has non-zero rdev, this inode
  must be either BLK or CHR. We default to BLK for this case.

- Error out if nothing matches
  This is to be 100% sure that we won't further corrupt the fs.

Changelog:
v2:
- Implement INODE_REF based imode lookup functionality
- Instead of defaulting to REG, error out if no imode can be found
  To avoid corrupting the fs.

Qu Wenruo (6):
  btrfs-progs: check: Export btrfs_type_to_imode
  btrfs-progs: check/common: Introduce a function to find imode using
    INODE_REF
  btrfs-progs: check/common: Make repair_imode_common() to handle inodes
    in subvolume trees
  btrfs-progs: check/lowmem: Repair bad imode early
  btrfs-progs: check/original: Fix inode mode in subvolume trees
  btrfs-progs: tests/fsck: Add new images for inode mode repair
    functionality

 check/main.c                                  |  50 ++--
 check/mode-common.c                           | 229 +++++++++++++++++-
 check/mode-common.h                           |  17 ++
 check/mode-lowmem.c                           |  39 +++
 .../039-bad-inode-mode/.lowmem_repairable     |   0
 .../bad_free_space_cache_imode.raw.xz}        | Bin
 .../bad_regular_file_imode.img.xz             | Bin 0 -> 2060 bytes
 7 files changed, 298 insertions(+), 37 deletions(-)
 create mode 100644 tests/fsck-tests/039-bad-inode-mode/.lowmem_repairable
 rename tests/fsck-tests/{039-bad-free-space-cache-inode-mode/test.raw.xz => 039-bad-inode-mode/bad_free_space_cache_imode.raw.xz} (100%)
 create mode 100644 tests/fsck-tests/039-bad-inode-mode/bad_regular_file_imode.img.xz

-- 
2.23.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 1/6] btrfs-progs: check: Export btrfs_type_to_imode
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
@ 2019-09-05  7:57 ` Qu Wenruo
  2019-09-05  7:57 ` [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

This function will be later used by common mode code, so export it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/main.c        | 15 ---------------
 check/mode-common.h | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/check/main.c b/check/main.c
index 2e16b4e6f05b..902279740589 100644
--- a/check/main.c
+++ b/check/main.c
@@ -2448,21 +2448,6 @@ out:
 	return ret;
 }
 
-static u32 btrfs_type_to_imode(u8 type)
-{
-	static u32 imode_by_btrfs_type[] = {
-		[BTRFS_FT_REG_FILE]	= S_IFREG,
-		[BTRFS_FT_DIR]		= S_IFDIR,
-		[BTRFS_FT_CHRDEV]	= S_IFCHR,
-		[BTRFS_FT_BLKDEV]	= S_IFBLK,
-		[BTRFS_FT_FIFO]		= S_IFIFO,
-		[BTRFS_FT_SOCK]		= S_IFSOCK,
-		[BTRFS_FT_SYMLINK]	= S_IFLNK,
-	};
-
-	return imode_by_btrfs_type[(type)];
-}
-
 static int repair_inode_no_item(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root,
 				struct btrfs_path *path,
diff --git a/check/mode-common.h b/check/mode-common.h
index 161b84a8deb0..6c8d6d7578a6 100644
--- a/check/mode-common.h
+++ b/check/mode-common.h
@@ -156,4 +156,19 @@ static inline bool is_valid_imode(u32 imode)
 }
 
 int recow_extent_buffer(struct btrfs_root *root, struct extent_buffer *eb);
+
+static inline u32 btrfs_type_to_imode(u8 type)
+{
+	static u32 imode_by_btrfs_type[] = {
+		[BTRFS_FT_REG_FILE]	= S_IFREG,
+		[BTRFS_FT_DIR]		= S_IFDIR,
+		[BTRFS_FT_CHRDEV]	= S_IFCHR,
+		[BTRFS_FT_BLKDEV]	= S_IFBLK,
+		[BTRFS_FT_FIFO]		= S_IFIFO,
+		[BTRFS_FT_SOCK]		= S_IFSOCK,
+		[BTRFS_FT_SYMLINK]	= S_IFLNK,
+	};
+
+	return imode_by_btrfs_type[(type)];
+}
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
  2019-09-05  7:57 ` [PATCH v2 1/6] btrfs-progs: check: Export btrfs_type_to_imode Qu Wenruo
@ 2019-09-05  7:57 ` Qu Wenruo
  2019-09-09 13:25   ` Nikolay Borisov
  2019-09-09 13:42   ` Nikolay Borisov
  2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

Introduce a function, find_file_type(), to find filetype using
INODE_REF.

This function will:
- Search DIR_INDEX first
  DIR_INDEX is easier since there is only one item in it.

- Valid the DIR_INDEX item
  If the DIR_INDEX is valid, use the filetype and call it a day.

- Search DIR_ITEM then

- Valide the DIR_ITEM
  If valid, call it a day. Or return -ENOENT;

This would be used as the primary method to determine the imode in later
imode repair code.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/mode-common.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/check/mode-common.c b/check/mode-common.c
index 195b6efaa7aa..c0ddc50a1dd0 100644
--- a/check/mode-common.c
+++ b/check/mode-common.c
@@ -16,6 +16,7 @@
 
 #include <time.h>
 #include "ctree.h"
+#include "hash.h"
 #include "common/internal.h"
 #include "common/messages.h"
 #include "transaction.h"
@@ -836,6 +837,104 @@ int reset_imode(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 	return ret;
 }
 
+static int find_file_type(struct btrfs_root *root, u64 ino, u64 dirid,
+			  u64 index, const char *name, u32 name_len,
+			  u32 *imode_ret)
+{
+	struct btrfs_path path;
+	struct btrfs_key location;
+	struct btrfs_key key;
+	struct btrfs_dir_item *di;
+	char namebuf[BTRFS_NAME_LEN] = {0};
+	unsigned long cur;
+	unsigned long end;
+	bool found = false;
+	u8 filetype;
+	u32 len;
+	int ret;
+
+	btrfs_init_path(&path);
+
+	/* Search DIR_INDEX first */
+	key.objectid = dirid;
+	key.offset = index;
+	key.type = BTRFS_DIR_INDEX_KEY;
+
+	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	if (ret > 0)
+		ret = -ENOENT;
+	if (ret < 0)
+		goto dir_item;
+	di = btrfs_item_ptr(path.nodes[0], path.slots[0],
+			       struct btrfs_dir_item);
+	btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
+
+	/* Various basic check */
+	if (location.objectid != ino || location.type != BTRFS_INODE_ITEM_KEY ||
+	    location.offset != 0)
+		goto dir_item;
+	filetype = btrfs_dir_type(path.nodes[0], di);
+	if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
+		goto dir_item;
+	len = min_t(u32, btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
+			 sizeof(*di), BTRFS_NAME_LEN);
+	len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
+	read_extent_buffer(path.nodes[0], namebuf, (unsigned long)(di + 1), len);
+	if (name_len != len || memcmp(namebuf, name, len))
+		goto dir_item;
+
+	/* Got a correct filetype */
+	found = true;
+	*imode_ret = btrfs_type_to_imode(filetype);
+	ret = 0;
+	goto out;
+
+dir_item:
+	btrfs_release_path(&path);
+	key.objectid = dirid;
+	key.offset = btrfs_name_hash(name, name_len);
+
+	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	if (ret > 0)
+		ret = -ENOENT;
+	if (ret < 0) {
+		btrfs_release_path(&path);
+		return ret;
+	}
+	cur = btrfs_item_ptr_offset(path.nodes[0], path.slots[0]);
+	end = cur + btrfs_item_size_nr(path.nodes[0], path.slots[0]);
+	while (cur < end) {
+		di = (struct btrfs_dir_item *)cur;
+		btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
+		/* Various basic check */
+		if (location.objectid != ino ||
+		    location.type != BTRFS_INODE_ITEM_KEY ||
+		    location.offset != 0)
+			goto next;
+		filetype = btrfs_dir_type(path.nodes[0], di);
+		if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
+			goto next;
+		len = min_t(u32, BTRFS_NAME_LEN,
+			    btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
+			    sizeof(*di));
+		len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
+		read_extent_buffer(path.nodes[0], namebuf,
+				   (unsigned long)(di + 1), len);
+		if (name_len != len || memcmp(namebuf, name, len))
+			goto next;
+		*imode_ret = btrfs_type_to_imode(filetype);
+		found = true;
+		goto out;
+next:
+		cur += btrfs_dir_name_len(path.nodes[0], di) + sizeof(*di);
+	}
+out:
+	btrfs_release_path(&path);
+	if (!found && !ret)
+		ret = -ENOENT;
+	return ret;
+}
+
 /*
  * Reset the inode mode of the inode specified by @path.
  *
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
  2019-09-05  7:57 ` [PATCH v2 1/6] btrfs-progs: check: Export btrfs_type_to_imode Qu Wenruo
  2019-09-05  7:57 ` [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF Qu Wenruo
@ 2019-09-05  7:57 ` Qu Wenruo
  2019-09-09 14:17   ` Nikolay Borisov
                     ` (2 more replies)
  2019-09-05  7:57 ` [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

Before this patch, repair_imode_common() can only handle two types of
inodes:
- Free space cache inodes
- ROOT DIR inodes

For inodes in subvolume trees, the core complexity is how to determine the
correct imode, thus it was not implemented.

However there are more reports of incorrect imode in subvolume trees, we
need to support such fix.

So this patch adds a new function, detect_imode(), to detect imode for
inodes in subvolume trees.

That function will determine imode by:
- Search for INODE_REF
  If we have INODE_REF, we will then try to find DIR_ITEM/DIR_INDEX.
  As long as one valid DIR_ITEM or DIR_INDEX can be found, we convert
  the BTRFS_FT_* to imode, then call it a day.
  This should be the most accurate way.

- Search for DIR_INDEX/DIR_ITEM
  If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
  just after the INODE_ITEM.
  If any can be found, it's definitely a directory.

- Search for EXTENT_DATA
  If EXTENT_DATA can be found, it's either REG or LNK.
  For this case, we default to REG, as user can inspect the file to
  determine if it's a file or just a path.

- Use rdev to detect BLK/CHR
  If all above fails, but INODE_ITEM has non-zero rdev, then it's either
  a BLK or CHR file. Then we default to BLK.

- Fail out if none of above methods succeeded
  No educated guess to make things worse.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 117 insertions(+), 13 deletions(-)

diff --git a/check/mode-common.c b/check/mode-common.c
index c0ddc50a1dd0..abea2ceda4c4 100644
--- a/check/mode-common.c
+++ b/check/mode-common.c
@@ -935,6 +935,113 @@ out:
 	return ret;
 }
 
+static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
+			u32 *imode_ret)
+{
+	struct btrfs_key key;
+	struct btrfs_inode_item iitem;
+	const u32 priv = 0700;
+	bool found = false;
+	u64 ino;
+	u32 imode;
+	int ret = 0;
+
+	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+	ino = key.objectid;
+	read_extent_buffer(path->nodes[0], &iitem,
+			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
+			sizeof(iitem));
+	/* root inode */
+	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
+		imode = S_IFDIR;
+		found = true;
+		goto out;
+	}
+
+	while (1) {
+		struct btrfs_inode_ref *iref;
+		struct extent_buffer *leaf;
+		unsigned long cur;
+		unsigned long end;
+		char namebuf[BTRFS_NAME_LEN] = {0};
+		u64 index;
+		u32 namelen;
+		int slot;
+
+		ret = btrfs_next_item(root, path);
+		if (ret > 0) {
+			/* falls back to rdev check */
+			ret = 0;
+			goto out;
+		}
+		if (ret < 0)
+			goto out;
+		leaf = path->nodes[0];
+		slot = path->slots[0];
+		btrfs_item_key_to_cpu(leaf, &key, slot);
+		if (key.objectid != ino)
+			goto out;
+
+		/*
+		 * We ignore some types to make life easier:
+		 * - XATTR
+		 *   Both REG and DIR can have xattr, so not useful
+		 */
+		switch (key.type) {
+		case BTRFS_INODE_REF_KEY:
+			/* The most accurate way to determine filetype */
+			cur = btrfs_item_ptr_offset(leaf, slot);
+			end = cur + btrfs_item_size_nr(leaf, slot);
+			while (cur < end) {
+				iref = (struct btrfs_inode_ref *)cur;
+				namelen = min_t(u32, end - cur - sizeof(&iref),
+					btrfs_inode_ref_name_len(leaf, iref));
+				index = btrfs_inode_ref_index(leaf, iref);
+				read_extent_buffer(leaf, namebuf,
+					(unsigned long)(iref + 1), namelen);
+				ret = find_file_type(root, ino, key.offset,
+						index, namebuf, namelen,
+						&imode);
+				if (ret == 0) {
+					found = true;
+					goto out;
+				}
+				cur += sizeof(*iref) + namelen;
+			}
+			break;
+		case BTRFS_DIR_ITEM_KEY:
+		case BTRFS_DIR_INDEX_KEY:
+			imode = S_IFDIR;
+			goto out;
+		case BTRFS_EXTENT_DATA_KEY:
+			/*
+			 * Both REG and LINK could have EXTENT_DATA.
+			 * We just fall back to REG as user can inspect the
+			 * content.
+			 */
+			imode = S_IFREG;
+			goto out;
+		}
+	}
+
+out:
+	/*
+	 * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
+	 * to BLK. But either way it doesn't really matter, as CHR/BLK on btrfs
+	 * should be pretty rare, and no real data will be lost.
+	 */
+	if (!found && btrfs_stack_inode_rdev(&iitem) != 0) {
+		imode = S_IFBLK;
+		found = true;
+	}
+
+	if (found)
+		*imode_ret = (imode | priv);
+	else
+		ret = -ENOENT;
+	return ret;
+}
+
 /*
  * Reset the inode mode of the inode specified by @path.
  *
@@ -951,22 +1058,19 @@ int repair_imode_common(struct btrfs_root *root, struct btrfs_path *path)
 	u32 imode;
 	int ret;
 
-	if (root->root_key.objectid != BTRFS_ROOT_TREE_OBJECTID) {
-		error(
-		"repair inode mode outside of root tree is not supported yet");
-		return -ENOTTY;
-	}
 	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
 	ASSERT(key.type == BTRFS_INODE_ITEM_KEY);
-	if (key.objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
-	    !is_fstree(key.objectid)) {
-		error("unsupported ino %llu", key.objectid);
-		return -ENOTTY;
+	if (root->objectid == BTRFS_ROOT_TREE_OBJECTID) {
+		/* In root tree we only have two possible imode */
+		if (key.objectid == BTRFS_ROOT_TREE_OBJECTID)
+			imode = S_IFDIR | 0755;
+		else
+			imode = S_IFREG | 0600;
+	} else {
+		ret = detect_imode(root, path, &imode);
+		if (ret < 0)
+			return ret;
 	}
-	if (key.objectid == BTRFS_ROOT_TREE_DIR_OBJECTID)
-		imode = 040755;
-	else
-		imode = 0100600;
 
 	trans = btrfs_start_transaction(root, 1);
 	if (IS_ERR(trans)) {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
                   ` (2 preceding siblings ...)
  2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
@ 2019-09-05  7:57 ` Qu Wenruo
  2019-09-09 14:55   ` Nikolay Borisov
  2019-09-05  7:57 ` [PATCH v2 5/6] btrfs-progs: check/original: Fix inode mode in subvolume trees Qu Wenruo
  2019-09-05  7:58 ` [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality Qu Wenruo
  5 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

For lowmem mode, if we hit a bad inode mode, normally it is reported
when we checking the DIR_INDEX/DIR_ITEM of the parent inode.

If we didn't repair at that timing, the error will be recorded even we
fixed it later.

So this patch will check for INODE_ITEM_MISMATCH error type, and if it's
really caused by invalid imode, repair it and clear the error.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/mode-lowmem.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 5f7f101daab1..5d0c520217fa 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -1550,6 +1550,35 @@ static int lowmem_delete_corrupted_dir_item(struct btrfs_root *root,
 	return ret;
 }
 
+static int try_repair_imode(struct btrfs_root *root, u64 ino)
+{
+	struct btrfs_inode_item *iitem;
+	struct btrfs_path path;
+	struct btrfs_key key;
+	int ret;
+
+	key.objectid = ino;
+	key.type = BTRFS_INODE_ITEM_KEY;
+	key.offset = 0;
+	btrfs_init_path(&path);
+
+	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
+	if (ret > 0)
+		ret = -ENOENT;
+	if (ret < 0)
+		goto out;
+	iitem = btrfs_item_ptr(path.nodes[0], path.slots[0],
+			       struct btrfs_inode_item);
+	if (!is_valid_imode(btrfs_inode_mode(path.nodes[0], iitem))) {
+		ret = repair_imode_common(root, &path);
+	} else {
+		ret = -ENOTTY;
+	}
+out:
+	btrfs_release_path(&path);
+	return ret;
+}
+
 /*
  * Call repair_inode_item_missing and repair_ternary_lowmem to repair
  *
@@ -1574,6 +1603,16 @@ static int repair_dir_item(struct btrfs_root *root, struct btrfs_key *di_key,
 			err &= ~(INODE_ITEM_MISMATCH | INODE_ITEM_MISSING);
 	}
 
+	if (err & INODE_ITEM_MISMATCH) {
+		/*
+		 * INODE_ITEM mismatch can be caused by bad imode,
+		 * so check if it's a bad imode, then repair if possible.
+		 */
+		ret = try_repair_imode(root, ino);
+		if (!ret)
+			err &= ~INODE_ITEM_MISMATCH;
+	}
+
 	if (err & ~(INODE_ITEM_MISMATCH | INODE_ITEM_MISSING)) {
 		ret = repair_ternary_lowmem(root, dirid, ino, index, namebuf,
 					    name_len, filetype, err);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 5/6] btrfs-progs: check/original: Fix inode mode in subvolume trees
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
                   ` (3 preceding siblings ...)
  2019-09-05  7:57 ` [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early Qu Wenruo
@ 2019-09-05  7:57 ` Qu Wenruo
  2019-09-05  7:58 ` [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality Qu Wenruo
  5 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:57 UTC (permalink / raw)
  To: linux-btrfs

To make original mode to repair imode error in subvolume trees, this
patch will do:
- Remove the show-stopper checks for root->objectid.
  Now repair_imode_original() will accept inodes in subvolume trees.

- Export detect_imode() for original mode
  Due to the call requirement, original mode must use an existing trans
  handler to do the repair, thus we need to re-implement most of the
  work done in repair_imode_common().

- Make repair_imode_original() to use detect_imode.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 check/main.c        | 35 ++++++++++++++++++++++++++---------
 check/mode-common.c |  4 ++--
 check/mode-common.h |  2 ++
 3 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/check/main.c b/check/main.c
index 902279740589..11d296b19ab9 100644
--- a/check/main.c
+++ b/check/main.c
@@ -2756,18 +2756,34 @@ static int repair_imode_original(struct btrfs_trans_handle *trans,
 				 struct btrfs_path *path,
 				 struct inode_record *rec)
 {
+	struct btrfs_key key;
 	int ret;
 	u32 imode;
 
-	if (root->root_key.objectid != BTRFS_ROOT_TREE_OBJECTID)
-		return -ENOTTY;
-	if (rec->ino != BTRFS_ROOT_TREE_DIR_OBJECTID || !is_fstree(rec->ino))
-		return -ENOTTY;
+	key.objectid = rec->ino;
+	key.type = BTRFS_INODE_ITEM_KEY;
+	key.offset = 0;
 
-	if (rec->ino == BTRFS_ROOT_TREE_DIR_OBJECTID)
-		imode = 040755;
-	else
-		imode = 0100600;
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret > 0)
+		ret = -ENOENT;
+	if (ret < 0)
+		return ret;
+
+	if (root->objectid == BTRFS_ROOT_TREE_OBJECTID) {
+		/* In root tree we only have two possible imode */
+		if (rec->ino == BTRFS_ROOT_TREE_OBJECTID)
+			imode = S_IFDIR | 0755;
+		else
+			imode = S_IFREG | 0600;
+	} else {
+		ret = detect_imode(root, path, &imode);
+		if (ret < 0) {
+			btrfs_release_path(path);
+			return ret;
+		}
+	}
+	btrfs_release_path(path);
 	ret = reset_imode(trans, root, path, rec->ino, imode);
 	if (ret < 0)
 		return ret;
@@ -2795,7 +2811,8 @@ static int try_repair_inode(struct btrfs_root *root, struct inode_record *rec)
 			     I_ERR_FILE_NBYTES_WRONG |
 			     I_ERR_INLINE_RAM_BYTES_WRONG |
 			     I_ERR_MISMATCH_DIR_HASH |
-			     I_ERR_UNALIGNED_EXTENT_REC)))
+			     I_ERR_UNALIGNED_EXTENT_REC |
+			     I_ERR_INVALID_IMODE)))
 		return rec->errors;
 
 	/*
diff --git a/check/mode-common.c b/check/mode-common.c
index abea2ceda4c4..d0a6917ea863 100644
--- a/check/mode-common.c
+++ b/check/mode-common.c
@@ -935,8 +935,8 @@ out:
 	return ret;
 }
 
-static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
-			u32 *imode_ret)
+int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
+		 u32 *imode_ret)
 {
 	struct btrfs_key key;
 	struct btrfs_inode_item iitem;
diff --git a/check/mode-common.h b/check/mode-common.h
index 6c8d6d7578a6..edf9257edaf0 100644
--- a/check/mode-common.h
+++ b/check/mode-common.h
@@ -126,6 +126,8 @@ int delete_corrupted_dir_item(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root,
 			      struct btrfs_key *di_key, char *namebuf,
 			      u32 namelen);
+int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
+		 u32 *imode_ret);
 int reset_imode(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		struct btrfs_path *path, u64 ino, u32 mode);
 int repair_imode_common(struct btrfs_root *root, struct btrfs_path *path);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality
  2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
                   ` (4 preceding siblings ...)
  2019-09-05  7:57 ` [PATCH v2 5/6] btrfs-progs: check/original: Fix inode mode in subvolume trees Qu Wenruo
@ 2019-09-05  7:58 ` Qu Wenruo
  2019-09-09 15:37   ` Nikolay Borisov
  5 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2019-09-05  7:58 UTC (permalink / raw)
  To: linux-btrfs

Add new test image for imode repair in subvolume trees.

Also rename the existing test case 039-bad-free-space-cache-inode-mode
to 039-bad-inode-mode, since now we can fix all bad imode.

And add the beacon file for lowmem test.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 .../039-bad-inode-mode/.lowmem_repairable        |   0
 .../bad_free_space_cache_imode.raw.xz}           | Bin
 .../bad_regular_file_imode.img.xz                | Bin 0 -> 2060 bytes
 3 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/fsck-tests/039-bad-inode-mode/.lowmem_repairable
 rename tests/fsck-tests/{039-bad-free-space-cache-inode-mode/test.raw.xz => 039-bad-inode-mode/bad_free_space_cache_imode.raw.xz} (100%)
 create mode 100644 tests/fsck-tests/039-bad-inode-mode/bad_regular_file_imode.img.xz

diff --git a/tests/fsck-tests/039-bad-inode-mode/.lowmem_repairable b/tests/fsck-tests/039-bad-inode-mode/.lowmem_repairable
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tests/fsck-tests/039-bad-free-space-cache-inode-mode/test.raw.xz b/tests/fsck-tests/039-bad-inode-mode/bad_free_space_cache_imode.raw.xz
similarity index 100%
rename from tests/fsck-tests/039-bad-free-space-cache-inode-mode/test.raw.xz
rename to tests/fsck-tests/039-bad-inode-mode/bad_free_space_cache_imode.raw.xz
diff --git a/tests/fsck-tests/039-bad-inode-mode/bad_regular_file_imode.img.xz b/tests/fsck-tests/039-bad-inode-mode/bad_regular_file_imode.img.xz
new file mode 100644
index 0000000000000000000000000000000000000000..bbdd17de1fb9e59d26d2b0502fff326f019be011
GIT binary patch
literal 2060
zcmV+n2=n*-H+ooF000E$*0e?f03iV!0000G&sfah5B~?rT>wRyj;C3^v%$$4d1oRm
zhA1@4!K86y=JF%i%5FwXCo1%)6gh&*5vGw=-Bj#y{x65M*-1+G4!lf`bW+BqyX<nI
zX>0#KDM;y(0T@FT;0J$&unGzx5VPj~*lxWv?pH4e3>)mmv4?lb0ocO~C8+S@`kgat
z91vZ;`v*9#EUF{=4NuC!E&oT`>E^~9dAS)tCUIor_@+y!<YSP>^G{j*i;LP1iJ3HZ
zhJHIrdnKRlC=rZGe3RyEt5&kZu;FB**-BspaqNBpGT}!X^+6~Y3bJ;(UPEFMhAq+w
zlILfo586KJT53zTWcBehL$SZAS8IYWDIoueT3wZWIGy!CsO@@ag2c833P~k7qvO{O
zF{y99tegz900OE+h4%!{IpqjVN4nb<;*|=aFX_DOnItK<l5wBY4?Jn+rX#hq6ACV^
zzy}R5JPb>mGi}Vh1wv5ddUBexpxV>&nQ!6;C5YM!ofWvH^HoQzwcz!%{r`zl%&Tr+
zI*Joz5WZ2XIS;$uV>0`ZQ0lI<G~+|%%h3CcjOP*Q)-!p1?s&Wc{(dM!(IC{UvLkGv
zxpV_Emu6~0v!GHHxr0ob;;LpM$JF?&H;&d|umK4k$~odM=`WX#_DOHaa{g&j&4fA#
zD}rHi)}Zk4?Wxd<L5#6<o{T6E0+>$%+Sfe0;%dehBQdp18OfsmF<4f3N=o_**GS2%
zxwKW$JcC0H97RoRkwZ4r5!NiI?O|Vwj)=U(o-$t#EbiwCO<?^1VkT}ECpy;+VD6fw
zgW_jd{(NT4^0ptVU?h}G^wjEWh(K+<q~wdk7t&KbCQNaXi+B8hpx>x~5%BX<dlv$K
z==)s!J7x&BAVR3^>&r8)W}5)Gk?jtGoiw%)fpQT?)=|fti!8$9S-$>BU}`H5*xycG
zV@sq5nuWMb8C9X(e(B4zIp5WtHmmmM^VPgzaKx1*z4$DXGVUbB&?h_{hmG35m(BhF
zIJHz)7dT%kyn!v5qmjYb;hd(*q&D@-IBBQ_%IM{U4P0K{ANn0KHB?Cun)R28fzw$i
z)Nv7&;F{omy*<$vpc}Mw$Zrw@oMe;aSl>PDFoTZZw8WA`hvuIu*2zTN+-oH&P}aA9
zE@J=fvLIYM{)|4h3|t(lFK4DdpE_h`L}g;tY<?v5z(K4eW(-65*QVVxC}PkSG+Q(>
zp@IvIq83TK7Cu%efxZc5-BF|+_H3^8^O=1kyx<Qn*KFm#e|<t;KG8yI4$l^zU*%Km
zxeIB3Gx5(@p>QXhJww)q0g_}55`xsdcWe1{(Iohyh@IN!!mtp5y(+Uwe%tM>13W@F
zMs|@j``W4CEJS)W0ylf&8d=JFw1E5D@+-7&lxdW(Cv3R}Mvu0X(~8Sj^<N9=Z0~~u
z>J$tG6lZJfB$wbXVN=HJjy#ch5-!2l<iD3*ZhI^@g_Zc`C&CJ$Z>8|LBft<ARU8)|
zz;y;ai}bD+G8d2|!fBi~O@>(pm;z*|J|=aH9O~1)c#G@N3hP(Y^g5t3<`r|2ij}IC
z29yEU`W12ApHuTY8CHL$QJ;ENl3iAvsMCO~tW}YVelOso6!VPND7pQWfWyd{>=>+Y
z{V}{KXqtMAfWaX7xgeW_4wFGgFl9uSVi2d8DjMve%9l;m^#&{=+U993Mg~fr_BktZ
z-2~61uNMNp5`dsSV>AB2lHM61C75J&sW*8zaj3LJ=z)T-mo)|ZJl%Y}qL;Bx?0(5f
zUMUX%MeZ1<Lq^Gtbf4}HmpH%p;II1_4aA<OC*jPRmE}St;Rt~w+Gv|uIgo$f$Lecc
z?nHVcG9gVAj>N6A(U)uvA$)U+iu`;38hXZ|D^!{(NKMDUK^Hg_4LRbCgD)0~^}jnU
zNkES-dD|N9WrZ~RyRmuEC$MbO7@u`o>X92HiH^PA*AIzl;WFPpGkhnuh$r7c7P*eO
z<yCbrKge#qAEl|xN2toOGPCOG0-A9D%g=P%y7kfy17gd-jZzxflTBq|Ctk?boWGkU
z0gN7_^L@PUdsKWd^=Zar#}KJ1n#3~|_Rt2O85ho2m`2lcLwi%IyC2^pBBP7vr&htq
z8<<MvK8tQw^^f-TXgqa}0batcF~ZuJNh|5PAVIvcETxbk*R`xK??-mxfPeJJCV|oP
z>8$;CRw@4nMlwN-^Vlq?Auh*t*(b9HZQl8OL#h&1@0x@5MW38n5DJ)e9F??$47by;
zpW_6%iNOK^u~92%wFLV5PIetrV23i^IDi4<ba%=gQ{NJ|6_lT^r{*~n7{&pz1G?>o
z$VbC7U_;WALf6$n`=3l>Jgm+dg^m-k2^k+=6-ZJdx`H6>bi||dv&=m%z&P>VqaKSz
zMqC{6maun1Jdy)}yZ(!I$};ywt@h4YfRm2$#9S~}4}`@UPVn)n;pZ$uI}rgWpR;|X
zEf><ioEs~-Tm;tsr`|^wKaGAtZ_&Tu+$|~$Ey2XtZBLI+ca?cHHSp!Lw{N|zFmLZC
z@vF0yGxls@bPTA#Qc~d`I1Xm@9|50%Zz(pP#wE}XKC03pXEh&LidD52YmYc*kk}~g
zKLtwh*04AM(45<!VI6K!sarjD0hoYABdHI}yj})3m1(_Yverl$<Fsw1a{)b`Dr8RK
qm|_3`0000$H3dx@SJO%W0p$;XAOHa7IhfM1#Ao{g000001X)^qgznh@

literal 0
HcmV?d00001

-- 
2.23.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-05  7:57 ` [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF Qu Wenruo
@ 2019-09-09 13:25   ` Nikolay Borisov
  2019-09-09 14:24     ` Qu Wenruo
  2019-09-09 13:42   ` Nikolay Borisov
  1 sibling, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 13:25 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
> Introduce a function, find_file_type(), to find filetype using
> INODE_REF.
> 
> This function will:
> - Search DIR_INDEX first
>   DIR_INDEX is easier since there is only one item in it.
> 
> - Valid the DIR_INDEX item
>   If the DIR_INDEX is valid, use the filetype and call it a day.
> 
> - Search DIR_ITEM then
> 
> - Valide the DIR_ITEM
>   If valid, call it a day. Or return -ENOENT;
> 
> This would be used as the primary method to determine the imode in later
> imode repair code.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  check/mode-common.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 99 insertions(+)
> 
> diff --git a/check/mode-common.c b/check/mode-common.c
> index 195b6efaa7aa..c0ddc50a1dd0 100644
> --- a/check/mode-common.c
> +++ b/check/mode-common.c
> @@ -16,6 +16,7 @@
>  
>  #include <time.h>
>  #include "ctree.h"
> +#include "hash.h"
>  #include "common/internal.h"
>  #include "common/messages.h"
>  #include "transaction.h"
> @@ -836,6 +837,104 @@ int reset_imode(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>  	return ret;
>  }
>  
> +static int find_file_type(struct btrfs_root *root, u64 ino, u64 dirid,
> +			  u64 index, const char *name, u32 name_len,
> +			  u32 *imode_ret)
> +{
> +	struct btrfs_path path;
> +	struct btrfs_key location;
> +	struct btrfs_key key;
> +	struct btrfs_dir_item *di;
> +	char namebuf[BTRFS_NAME_LEN] = {0};
> +	unsigned long cur;
> +	unsigned long end;
> +	bool found = false;
> +	u8 filetype;
> +	u32 len;
> +	int ret;
> +
> +	btrfs_init_path(&path);
> +
> +	/* Search DIR_INDEX first */
> +	key.objectid = dirid;
> +	key.offset = index;
> +	key.type = BTRFS_DIR_INDEX_KEY;
> +
> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
> +	if (ret > 0)
> +		ret = -ENOENT;

Even if it returns 1 meaning there is no DIR_INDEX item perhaps it still
makes sense to go to dir_item: label to search for DIR_ITEM, what if the
corruption has affected just the DIR_INDEX item?

> +	if (ret < 0)
> +		goto dir_item;
nit: Use elseif to make it more explicit it is a single construct.

> +	di = btrfs_item_ptr(path.nodes[0], path.slots[0],
> +			       struct btrfs_dir_item);
> +	btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
> +
> +	/* Various basic check */
> +	if (location.objectid != ino || location.type != BTRFS_INODE_ITEM_KEY ||
> +	    location.offset != 0)
> +		goto dir_item;
> +	filetype = btrfs_dir_type(path.nodes[0], di);
> +	if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
> +		goto dir_item;
> +	len = min_t(u32, btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
> +			 sizeof(*di), BTRFS_NAME_LEN);
> +	len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
> +	read_extent_buffer(path.nodes[0], namebuf, (unsigned long)(di + 1), len);
> +	if (name_len != len || memcmp(namebuf, name, len))
> +		goto dir_item;
> +
> +	/* Got a correct filetype */
> +	found = true;
> +	*imode_ret = btrfs_type_to_imode(filetype);
> +	ret = 0;
> +	goto out;
> +
> +dir_item:

This function is needlessly structured in an awkward way. E.g. there is
no dependence between "searching by DIR INDEX item" and "searching by
DIR_ITEM". I think the best way is to have 3 function:

Top level find_file_type which will call find_file_type_by_dir_index if
the reeturn value is negative -> call 2nd function
find_file_type_by_dir_item. This will result in 3 fairly short and easy
to read/parse functions.

> +	btrfs_release_path(&path);
> +	key.objectid = dirid;
> +	key.offset = btrfs_name_hash(name, name_len);
> +
> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
> +	if (ret > 0)
> +		ret = -ENOENT;
> +	if (ret < 0) {
> +		btrfs_release_path(&path);
> +		return ret;
> +	}

ditto about the else if construct

> +	cur = btrfs_item_ptr_offset(path.nodes[0], path.slots[0]);
> +	end = cur + btrfs_item_size_nr(path.nodes[0], path.slots[0]);
> +	while (cur < end) {

Just checking : this is needed in case we have items whose names create
a crc32 collision in the DIR_ITEM offset?

> +		di = (struct btrfs_dir_item *)cur;
> +		btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
> +		/* Various basic check */
> +		if (location.objectid != ino ||
> +		    location.type != BTRFS_INODE_ITEM_KEY ||
> +		    location.offset != 0)
> +			goto next;> +		filetype = btrfs_dir_type(path.nodes[0], di);
> +		if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
> +			goto next;
> +		len = min_t(u32, BTRFS_NAME_LEN,
> +			    btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
> +			    sizeof(*di));
> +		len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
> +		read_extent_buffer(path.nodes[0], namebuf,
> +				   (unsigned long)(di + 1), len);
> +		if (name_len != len || memcmp(namebuf, name, len))
> +			goto next;
> +		*imode_ret = btrfs_type_to_imode(filetype);
> +		found = true;
> +		goto out;
> +next:
> +		cur += btrfs_dir_name_len(path.nodes[0], di) + sizeof(*di);

If this line is moved right after assigning to 'di' instead of having a
'next' label you can simply use the 'continue' keyword

> +	}
> +out:
> +	btrfs_release_path(&path);
> +	if (!found && !ret)
> +		ret = -ENOENT;
> +	return ret;
> +}
> +
>  /*
>   * Reset the inode mode of the inode specified by @path.
>   *
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-05  7:57 ` [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF Qu Wenruo
  2019-09-09 13:25   ` Nikolay Borisov
@ 2019-09-09 13:42   ` Nikolay Borisov
  2019-09-09 14:26     ` Qu Wenruo
  1 sibling, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 13:42 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
> Introduce a function, find_file_type(), to find filetype using
> INODE_REF.

This is confusing, there is not a single reference to INODE_REF in the
code. I guess you must replace this with DIR_ITEM/DIR_INDEX ?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
@ 2019-09-09 14:17   ` Nikolay Borisov
  2019-09-09 14:27     ` Qu Wenruo
  2019-09-10  4:27   ` Su Yue
  2019-09-10 16:14   ` Nikolay Borisov
  2 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 14:17 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
> Before this patch, repair_imode_common() can only handle two types of
> inodes:
> - Free space cache inodes
> - ROOT DIR inodes
> 
> For inodes in subvolume trees, the core complexity is how to determine the
> correct imode, thus it was not implemented.
> 
> However there are more reports of incorrect imode in subvolume trees, we
> need to support such fix.
> 
> So this patch adds a new function, detect_imode(), to detect imode for
> inodes in subvolume trees.
> 
> That function will determine imode by:
> - Search for INODE_REF
>   If we have INODE_REF, we will then try to find DIR_ITEM/DIR_INDEX.
>   As long as one valid DIR_ITEM or DIR_INDEX can be found, we convert
>   the BTRFS_FT_* to imode, then call it a day.
>   This should be the most accurate way.
> 
> - Search for DIR_INDEX/DIR_ITEM
>   If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>   just after the INODE_ITEM.
>   If any can be found, it's definitely a directory.
> 
> - Search for EXTENT_DATA
>   If EXTENT_DATA can be found, it's either REG or LNK.
>   For this case, we default to REG, as user can inspect the file to
>   determine if it's a file or just a path.
> 
> - Use rdev to detect BLK/CHR
>   If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>   a BLK or CHR file. Then we default to BLK.
> 
> - Fail out if none of above methods succeeded
>   No educated guess to make things worse.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 117 insertions(+), 13 deletions(-)
> 
> diff --git a/check/mode-common.c b/check/mode-common.c
> index c0ddc50a1dd0..abea2ceda4c4 100644
> --- a/check/mode-common.c
> +++ b/check/mode-common.c
> @@ -935,6 +935,113 @@ out:
>  	return ret;
>  }
>  
> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
> +			u32 *imode_ret)

I think the imode is less than u32 so it should be possible to return it
directly from the function as a positive number and error as negative?

> +{
> +	struct btrfs_key key;
> +	struct btrfs_inode_item iitem;
> +	const u32 priv = 0700;

Having this in a variable doesn't bring more clarity, just use 0700
directly at the end of the function.

> +	bool found = false;
> +	u64 ino;
> +	u32 imode;
> +	int ret = 0;
> +
> +	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
> +	ino = key.objectid;
> +	read_extent_buffer(path->nodes[0], &iitem,
> +			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
> +			sizeof(iitem));
> +	/* root inode */
> +	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
> +		imode = S_IFDIR;
> +		found = true;
> +		goto out;
> +	}
> +
> +	while (1) {
> +		struct btrfs_inode_ref *iref;
> +		struct extent_buffer *leaf;
> +		unsigned long cur;
> +		unsigned long end;
> +		char namebuf[BTRFS_NAME_LEN] = {0};
> +		u64 index;
> +		u32 namelen;
> +		int slot;
> +
> +		ret = btrfs_next_item(root, path);
> +		if (ret > 0) {
> +			/* falls back to rdev check */
> +			ret = 0;
> +			goto out;
> +		}
> +		if (ret < 0)
> +			goto out;
> +		leaf = path->nodes[0];
> +		slot = path->slots[0];
> +		btrfs_item_key_to_cpu(leaf, &key, slot);
> +		if (key.objectid != ino)
> +			goto out;
> +
> +		/*
> +		 * We ignore some types to make life easier:
> +		 * - XATTR
> +		 *   Both REG and DIR can have xattr, so not useful
> +		 */
> +		switch (key.type) {
> +		case BTRFS_INODE_REF_KEY:
> +			/* The most accurate way to determine filetype */
> +			cur = btrfs_item_ptr_offset(leaf, slot);
> +			end = cur + btrfs_item_size_nr(leaf, slot);
> +			while (cur < end) {
> +				iref = (struct btrfs_inode_ref *)cur;
> +				namelen = min_t(u32, end - cur - sizeof(&iref),
> +					btrfs_inode_ref_name_len(leaf, iref));
> +				index = btrfs_inode_ref_index(leaf, iref);
> +				read_extent_buffer(leaf, namebuf,
> +					(unsigned long)(iref + 1), namelen);
> +				ret = find_file_type(root, ino, key.offset,
> +						index, namebuf, namelen,
> +						&imode);
> +				if (ret == 0) {
> +					found = true;
> +					goto out;
> +				}
> +				cur += sizeof(*iref) + namelen;
> +			}
> +			break;
> +		case BTRFS_DIR_ITEM_KEY:
> +		case BTRFS_DIR_INDEX_KEY:
> +			imode = S_IFDIR;
> +			goto out;
> +		case BTRFS_EXTENT_DATA_KEY:
> +			/*
> +			 * Both REG and LINK could have EXTENT_DATA.
> +			 * We just fall back to REG as user can inspect the
> +			 * content.
> +			 */
> +			imode = S_IFREG;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	/*
> +	 * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
> +	 * to BLK. But either way it doesn't really matter, as CHR/BLK on btrfs
> +	 * should be pretty rare, and no real data will be lost.
> +	 */
> +	if (!found && btrfs_stack_inode_rdev(&iitem) != 0) {
> +		imode = S_IFBLK;
> +		found = true;
> +	}
> +
> +	if (found)
> +		*imode_ret = (imode | priv);
> +	else
> +		ret = -ENOENT;
> +	return ret;
> +}
> +
>  /*
>   * Reset the inode mode of the inode specified by @path.
>   *
> @@ -951,22 +1058,19 @@ int repair_imode_common(struct btrfs_root *root, struct btrfs_path *path)
>  	u32 imode;
>  	int ret;
>  
> -	if (root->root_key.objectid != BTRFS_ROOT_TREE_OBJECTID) {
> -		error(
> -		"repair inode mode outside of root tree is not supported yet");
> -		return -ENOTTY;
> -	}
>  	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>  	ASSERT(key.type == BTRFS_INODE_ITEM_KEY);
> -	if (key.objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
> -	    !is_fstree(key.objectid)) {
> -		error("unsupported ino %llu", key.objectid);
> -		return -ENOTTY;
> +	if (root->objectid == BTRFS_ROOT_TREE_OBJECTID) {
> +		/* In root tree we only have two possible imode */
> +		if (key.objectid == BTRFS_ROOT_TREE_OBJECTID)
> +			imode = S_IFDIR | 0755;
> +		else
> +			imode = S_IFREG | 0600;
> +	} else {
> +		ret = detect_imode(root, path, &imode);
> +		if (ret < 0)
> +			return ret;
>  	}
> -	if (key.objectid == BTRFS_ROOT_TREE_DIR_OBJECTID)
> -		imode = 040755;
> -	else
> -		imode = 0100600;
>  
>  	trans = btrfs_start_transaction(root, 1);
>  	if (IS_ERR(trans)) {
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-09 13:25   ` Nikolay Borisov
@ 2019-09-09 14:24     ` Qu Wenruo
  2019-09-09 14:34       ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2019-09-09 14:24 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/9 下午9:25, Nikolay Borisov wrote:
>
>
> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>> Introduce a function, find_file_type(), to find filetype using
>> INODE_REF.
>>
>> This function will:
>> - Search DIR_INDEX first
>>   DIR_INDEX is easier since there is only one item in it.
>>
>> - Valid the DIR_INDEX item
>>   If the DIR_INDEX is valid, use the filetype and call it a day.
>>
>> - Search DIR_ITEM then
>>
>> - Valide the DIR_ITEM
>>   If valid, call it a day. Or return -ENOENT;
>>
>> This would be used as the primary method to determine the imode in later
>> imode repair code.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  check/mode-common.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 99 insertions(+)
>>
>> diff --git a/check/mode-common.c b/check/mode-common.c
>> index 195b6efaa7aa..c0ddc50a1dd0 100644
>> --- a/check/mode-common.c
>> +++ b/check/mode-common.c
>> @@ -16,6 +16,7 @@
>>
>>  #include <time.h>
>>  #include "ctree.h"
>> +#include "hash.h"
>>  #include "common/internal.h"
>>  #include "common/messages.h"
>>  #include "transaction.h"
>> @@ -836,6 +837,104 @@ int reset_imode(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>>  	return ret;
>>  }
>>
>> +static int find_file_type(struct btrfs_root *root, u64 ino, u64 dirid,
>> +			  u64 index, const char *name, u32 name_len,
>> +			  u32 *imode_ret)
>> +{
>> +	struct btrfs_path path;
>> +	struct btrfs_key location;
>> +	struct btrfs_key key;
>> +	struct btrfs_dir_item *di;
>> +	char namebuf[BTRFS_NAME_LEN] = {0};
>> +	unsigned long cur;
>> +	unsigned long end;
>> +	bool found = false;
>> +	u8 filetype;
>> +	u32 len;
>> +	int ret;
>> +
>> +	btrfs_init_path(&path);
>> +
>> +	/* Search DIR_INDEX first */
>> +	key.objectid = dirid;
>> +	key.offset = index;
>> +	key.type = BTRFS_DIR_INDEX_KEY;
>> +
>> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
>> +	if (ret > 0)
>> +		ret = -ENOENT;
>
> Even if it returns 1 meaning there is no DIR_INDEX item perhaps it still
> makes sense to go to dir_item: label to search for DIR_ITEM, what if the
> corruption has affected just the DIR_INDEX item?

I didn't get the point.

The next line is just going to do dir_item search, just as you mentioned.
>
>> +	if (ret < 0)
>> +		goto dir_item;
> nit: Use elseif to make it more explicit it is a single construct.

Not sure such usage is recommened, but I see a lot of usage like:
	ret = btrfs_search_slot();
	if (ret > 0)
		ret = -ENOENT;
	if (ret < 0)
		goto error;

So I just followed this practice.

>
>> +	di = btrfs_item_ptr(path.nodes[0], path.slots[0],
>> +			       struct btrfs_dir_item);
>> +	btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
>> +
>> +	/* Various basic check */
>> +	if (location.objectid != ino || location.type != BTRFS_INODE_ITEM_KEY ||
>> +	    location.offset != 0)
>> +		goto dir_item;
>> +	filetype = btrfs_dir_type(path.nodes[0], di);
>> +	if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
>> +		goto dir_item;
>> +	len = min_t(u32, btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
>> +			 sizeof(*di), BTRFS_NAME_LEN);
>> +	len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
>> +	read_extent_buffer(path.nodes[0], namebuf, (unsigned long)(di + 1), len);
>> +	if (name_len != len || memcmp(namebuf, name, len))
>> +		goto dir_item;
>> +
>> +	/* Got a correct filetype */
>> +	found = true;
>> +	*imode_ret = btrfs_type_to_imode(filetype);
>> +	ret = 0;
>> +	goto out;
>> +
>> +dir_item:
>
> This function is needlessly structured in an awkward way. E.g. there is
> no dependence between "searching by DIR INDEX item" and "searching by
> DIR_ITEM". I think the best way is to have 3 function:
>
> Top level find_file_type which will call find_file_type_by_dir_index if
> the reeturn value is negative -> call 2nd function
> find_file_type_by_dir_item. This will result in 3 fairly short and easy
> to read/parse functions.

I'm OK with that, which is also my original plan, but it's just too easy
to put them altogether into one function, thus I forgot that plan.

>
>> +	btrfs_release_path(&path);
>> +	key.objectid = dirid;
>> +	key.offset = btrfs_name_hash(name, name_len);
>> +
>> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
>> +	if (ret > 0)
>> +		ret = -ENOENT;
>> +	if (ret < 0) {
>> +		btrfs_release_path(&path);
>> +		return ret;
>> +	}
>
> ditto about the else if construct
>
>> +	cur = btrfs_item_ptr_offset(path.nodes[0], path.slots[0]);
>> +	end = cur + btrfs_item_size_nr(path.nodes[0], path.slots[0]);
>> +	while (cur < end) {
>
> Just checking : this is needed in case we have items whose names create
> a crc32 collision in the DIR_ITEM offset?

Yep.
For DIR_ITEM we can have collision while for DIR_INDEX it won't cause
collision.

>
>> +		di = (struct btrfs_dir_item *)cur;
>> +		btrfs_dir_item_key_to_cpu(path.nodes[0], di, &location);
>> +		/* Various basic check */
>> +		if (location.objectid != ino ||
>> +		    location.type != BTRFS_INODE_ITEM_KEY ||
>> +		    location.offset != 0)
>> +			goto next;> +		filetype = btrfs_dir_type(path.nodes[0], di);
>> +		if (filetype >= BTRFS_FT_MAX || filetype == BTRFS_FT_UNKNOWN)
>> +			goto next;
>> +		len = min_t(u32, BTRFS_NAME_LEN,
>> +			    btrfs_item_size_nr(path.nodes[0], path.slots[0]) -
>> +			    sizeof(*di));
>> +		len = min_t(u32, len, btrfs_dir_name_len(path.nodes[0], di));
>> +		read_extent_buffer(path.nodes[0], namebuf,
>> +				   (unsigned long)(di + 1), len);
>> +		if (name_len != len || memcmp(namebuf, name, len))
>> +			goto next;
>> +		*imode_ret = btrfs_type_to_imode(filetype);
>> +		found = true;
>> +		goto out;
>> +next:
>> +		cur += btrfs_dir_name_len(path.nodes[0], di) + sizeof(*di);
>
> If this line is moved right after assigning to 'di' instead of having a
> 'next' label you can simply use the 'continue' keyword

Right, saving a tag is never a bad idea.

Thanks,
Qu
>
>> +	}
>> +out:
>> +	btrfs_release_path(&path);
>> +	if (!found && !ret)
>> +		ret = -ENOENT;
>> +	return ret;
>> +}
>> +
>>  /*
>>   * Reset the inode mode of the inode specified by @path.
>>   *
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-09 13:42   ` Nikolay Borisov
@ 2019-09-09 14:26     ` Qu Wenruo
  2019-09-09 14:35       ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2019-09-09 14:26 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/9 下午9:42, Nikolay Borisov wrote:
>
>
> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>> Introduce a function, find_file_type(), to find filetype using
>> INODE_REF.
>
> This is confusing, there is not a single reference to INODE_REF in the
> code. I guess you must replace this with DIR_ITEM/DIR_INDEX ?

Don't forget how you get the @dirid @index,@name,@namelen from.

All these info are from INODE_REF item.

But I totally understand your concern, it's sometimes really easy to get
confused about 1) what we are searching for 2) what the search indexes
are from.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-09 14:17   ` Nikolay Borisov
@ 2019-09-09 14:27     ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-09 14:27 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/9 下午10:17, Nikolay Borisov wrote:
>
>
> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>> Before this patch, repair_imode_common() can only handle two types of
>> inodes:
>> - Free space cache inodes
>> - ROOT DIR inodes
>>
>> For inodes in subvolume trees, the core complexity is how to determine the
>> correct imode, thus it was not implemented.
>>
>> However there are more reports of incorrect imode in subvolume trees, we
>> need to support such fix.
>>
>> So this patch adds a new function, detect_imode(), to detect imode for
>> inodes in subvolume trees.
>>
>> That function will determine imode by:
>> - Search for INODE_REF
>>   If we have INODE_REF, we will then try to find DIR_ITEM/DIR_INDEX.
>>   As long as one valid DIR_ITEM or DIR_INDEX can be found, we convert
>>   the BTRFS_FT_* to imode, then call it a day.
>>   This should be the most accurate way.
>>
>> - Search for DIR_INDEX/DIR_ITEM
>>   If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>>   just after the INODE_ITEM.
>>   If any can be found, it's definitely a directory.
>>
>> - Search for EXTENT_DATA
>>   If EXTENT_DATA can be found, it's either REG or LNK.
>>   For this case, we default to REG, as user can inspect the file to
>>   determine if it's a file or just a path.
>>
>> - Use rdev to detect BLK/CHR
>>   If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>>   a BLK or CHR file. Then we default to BLK.
>>
>> - Fail out if none of above methods succeeded
>>   No educated guess to make things worse.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 117 insertions(+), 13 deletions(-)
>>
>> diff --git a/check/mode-common.c b/check/mode-common.c
>> index c0ddc50a1dd0..abea2ceda4c4 100644
>> --- a/check/mode-common.c
>> +++ b/check/mode-common.c
>> @@ -935,6 +935,113 @@ out:
>>  	return ret;
>>  }
>>
>> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
>> +			u32 *imode_ret)
>
> I think the imode is less than u32 so it should be possible to return it
> directly from the function as a positive number and error as negative?

It still doesn't look good enough to me.
Combining data value from error number is not a good idea to me, even
even the range doesn't overlap.

>
>> +{
>> +	struct btrfs_key key;
>> +	struct btrfs_inode_item iitem;
>> +	const u32 priv = 0700;
>
> Having this in a variable doesn't bring more clarity, just use 0700
> directly at the end of the function.

I'm OK with that.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-09 14:24     ` Qu Wenruo
@ 2019-09-09 14:34       ` Nikolay Borisov
  0 siblings, 0 replies; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 14:34 UTC (permalink / raw)
  To: Qu Wenruo, WenRuo Qu, linux-btrfs; +Cc: David Sterba



On 9.09.19 г. 17:24 ч., Qu Wenruo wrote:
> 
> 
> On 2019/9/9 下午9:25, Nikolay Borisov wrote:
>>
>>
>> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>>> Introduce a function, find_file_type(), to find filetype using
>>> INODE_REF.
>>>
>>> This function will:
>>> - Search DIR_INDEX first
>>>   DIR_INDEX is easier since there is only one item in it.
>>>
>>> - Valid the DIR_INDEX item
>>>   If the DIR_INDEX is valid, use the filetype and call it a day.
>>>
>>> - Search DIR_ITEM then
>>>
>>> - Valide the DIR_ITEM
>>>   If valid, call it a day. Or return -ENOENT;
>>>
>>> This would be used as the primary method to determine the imode in later
>>> imode repair code.
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>>  check/mode-common.c | 99 +++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 99 insertions(+)
>>>
>>> diff --git a/check/mode-common.c b/check/mode-common.c
>>> index 195b6efaa7aa..c0ddc50a1dd0 100644
>>> --- a/check/mode-common.c
>>> +++ b/check/mode-common.c
>>> @@ -16,6 +16,7 @@
>>>
>>>  #include <time.h>
>>>  #include "ctree.h"
>>> +#include "hash.h"
>>>  #include "common/internal.h"
>>>  #include "common/messages.h"
>>>  #include "transaction.h"
>>> @@ -836,6 +837,104 @@ int reset_imode(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>>>  	return ret;
>>>  }
>>>
>>> +static int find_file_type(struct btrfs_root *root, u64 ino, u64 dirid,
>>> +			  u64 index, const char *name, u32 name_len,
>>> +			  u32 *imode_ret)
>>> +{
>>> +	struct btrfs_path path;
>>> +	struct btrfs_key location;
>>> +	struct btrfs_key key;
>>> +	struct btrfs_dir_item *di;
>>> +	char namebuf[BTRFS_NAME_LEN] = {0};
>>> +	unsigned long cur;
>>> +	unsigned long end;
>>> +	bool found = false;
>>> +	u8 filetype;
>>> +	u32 len;
>>> +	int ret;
>>> +
>>> +	btrfs_init_path(&path);
>>> +
>>> +	/* Search DIR_INDEX first */
>>> +	key.objectid = dirid;
>>> +	key.offset = index;
>>> +	key.type = BTRFS_DIR_INDEX_KEY;
>>> +
>>> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
>>> +	if (ret > 0)
>>> +		ret = -ENOENT;
>>
>> Even if it returns 1 meaning there is no DIR_INDEX item perhaps it still
>> makes sense to go to dir_item: label to search for DIR_ITEM, what if the
>> corruption has affected just the DIR_INDEX item?
> 
> I didn't get the point.
> 
> The next line is just going to do dir_item search, just as you mentioned.

You are right, however, this is somewhat subtle and the fact I missed it
just proves the point. The following will be much better (and explicit):

if (ret)
   goto dir_item

In fact setting the -ENOENT on ret > 0 is only needed because of the
awkward way the return value of btrfs_search_slot is handled. So yeah,
I'm even more convinced that a simple if (ret) goto dir_item (or call a
function) is the way to go here.

>>
>>> +	if (ret < 0)
>>> +		goto dir_item;
>> nit: Use elseif to make it more explicit it is a single construct.
> 
> Not sure such usage is recommened, but I see a lot of usage like:
> 	ret = btrfs_search_slot();
> 	if (ret > 0)
> 		ret = -ENOENT;
> 	if (ret < 0)
> 		goto error;
> 
> So I just followed this practice.
> 

I guess this is debatable. You are right that most of the retval
handling is as you say but I think the more "purist" (so to say) way
should be an if {} else if {}. Because ultimately you are handling one
thing - the return value of btrfs_search_slot.

 David, what is your take on that ?

<snip>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF
  2019-09-09 14:26     ` Qu Wenruo
@ 2019-09-09 14:35       ` Nikolay Borisov
  0 siblings, 0 replies; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 14:35 UTC (permalink / raw)
  To: Qu Wenruo, WenRuo Qu, linux-btrfs



On 9.09.19 г. 17:26 ч., Qu Wenruo wrote:
> 
> 
> On 2019/9/9 下午9:42, Nikolay Borisov wrote:
>>
>>
>> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>>> Introduce a function, find_file_type(), to find filetype using
>>> INODE_REF.
>>
>> This is confusing, there is not a single reference to INODE_REF in the
>> code. I guess you must replace this with DIR_ITEM/DIR_INDEX ?
> 
> Don't forget how you get the @dirid @index,@name,@namelen from.
> 
> All these info are from INODE_REF item.
> 
> But I totally understand your concern, it's sometimes really easy to get
> confused about 1) what we are searching for 2) what the search indexes
> are from.

Yes but that is only apparent when one reads the next patch. When you
take this function in isolation it really gets input data and based on
that tries to search for relevant DIR_ITEM/DIR_INDEX. Think about
someone stumbling on this commit 6 months from now, without necessarily
having reviewed the whole series.

> 
> Thanks,
> Qu
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early
  2019-09-05  7:57 ` [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early Qu Wenruo
@ 2019-09-09 14:55   ` Nikolay Borisov
  2019-09-10  2:35     ` Qu Wenruo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 14:55 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
> For lowmem mode, if we hit a bad inode mode, normally it is reported
> when we checking the DIR_INDEX/DIR_ITEM of the parent inode.
> 
> If we didn't repair at that timing, the error will be recorded even we
> fixed it later.
> 
> So this patch will check for INODE_ITEM_MISMATCH error type, and if it's
> really caused by invalid imode, repair it and clear the error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  check/mode-lowmem.c | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 5f7f101daab1..5d0c520217fa 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -1550,6 +1550,35 @@ static int lowmem_delete_corrupted_dir_item(struct btrfs_root *root,
>  	return ret;
>  }
>  
> +static int try_repair_imode(struct btrfs_root *root, u64 ino)
> +{
> +	struct btrfs_inode_item *iitem;
> +	struct btrfs_path path;
> +	struct btrfs_key key;
> +	int ret;
> +
> +	key.objectid = ino;
> +	key.type = BTRFS_INODE_ITEM_KEY;
> +	key.offset = 0;
> +	btrfs_init_path(&path);
> +
> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
> +	if (ret > 0)
> +		ret = -ENOENT;
> +	if (ret < 0)
> +		goto out;
> +	iitem = btrfs_item_ptr(path.nodes[0], path.slots[0],
> +			       struct btrfs_inode_item);
> +	if (!is_valid_imode(btrfs_inode_mode(path.nodes[0], iitem))) {

INODE_ITEM_MISMATCH is only set if:

1. The first inode item is not a directory (it should always be)
2. There is a mismatch between the filetype in the inode item and in the
dir/index item pointing to it.

By using this check you could possibly miss case 1 above, if the first
inode has a valid type e.g. regular whereas it should really be a
directory.

I'm not entirely sure whether it makes sense to handle this situation,
since admittedly, it would require a perfect storm to get such a
corruption.


<snip>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality
  2019-09-05  7:58 ` [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality Qu Wenruo
@ 2019-09-09 15:37   ` Nikolay Borisov
  2019-09-09 23:22     ` Qu Wenruo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-09 15:37 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:58 ч., Qu Wenruo wrote:
> Add new test image for imode repair in subvolume trees.
> 
> Also rename the existing test case 039-bad-free-space-cache-inode-mode
> to 039-bad-inode-mode, since now we can fix all bad imode.
> 
> And add the beacon file for lowmem test.
> 

What kind of corruption does the image have? Ideally it should have
cases where both DIR_ITEM and DIR_INDEX is used. As well as DIR_ITEM
containing collisions.

<split>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality
  2019-09-09 15:37   ` Nikolay Borisov
@ 2019-09-09 23:22     ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-09 23:22 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/9 下午11:37, Nikolay Borisov wrote:
>
>
> On 5.09.19 г. 10:58 ч., Qu Wenruo wrote:
>> Add new test image for imode repair in subvolume trees.
>>
>> Also rename the existing test case 039-bad-free-space-cache-inode-mode
>> to 039-bad-inode-mode, since now we can fix all bad imode.
>>
>> And add the beacon file for lowmem test.
>>
>
> What kind of corruption does the image have?

Just bad imode, all 0.

> Ideally it should have
> cases where both DIR_ITEM and DIR_INDEX is used. As well as DIR_ITEM
> containing collisions.

Yes, that's the best case, but in that case, we will have more failure
modes, e.g. we need to repair that missing DIR_ITEM before we use
DIR_INDEX to regenerate the filetype.

So I'm just sticking to one test image for one corruption.

Thanks,
Qu

>
> <split>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early
  2019-09-09 14:55   ` Nikolay Borisov
@ 2019-09-10  2:35     ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-10  2:35 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/9/9 下午10:55, Nikolay Borisov wrote:
>
>
> On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
>> For lowmem mode, if we hit a bad inode mode, normally it is reported
>> when we checking the DIR_INDEX/DIR_ITEM of the parent inode.
>>
>> If we didn't repair at that timing, the error will be recorded even we
>> fixed it later.
>>
>> So this patch will check for INODE_ITEM_MISMATCH error type, and if it's
>> really caused by invalid imode, repair it and clear the error.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  check/mode-lowmem.c | 39 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 39 insertions(+)
>>
>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>> index 5f7f101daab1..5d0c520217fa 100644
>> --- a/check/mode-lowmem.c
>> +++ b/check/mode-lowmem.c
>> @@ -1550,6 +1550,35 @@ static int lowmem_delete_corrupted_dir_item(struct btrfs_root *root,
>>  	return ret;
>>  }
>>
>> +static int try_repair_imode(struct btrfs_root *root, u64 ino)
>> +{
>> +	struct btrfs_inode_item *iitem;
>> +	struct btrfs_path path;
>> +	struct btrfs_key key;
>> +	int ret;
>> +
>> +	key.objectid = ino;
>> +	key.type = BTRFS_INODE_ITEM_KEY;
>> +	key.offset = 0;
>> +	btrfs_init_path(&path);
>> +
>> +	ret = btrfs_search_slot(NULL, root, &key, &path, 0, 0);
>> +	if (ret > 0)
>> +		ret = -ENOENT;
>> +	if (ret < 0)
>> +		goto out;
>> +	iitem = btrfs_item_ptr(path.nodes[0], path.slots[0],
>> +			       struct btrfs_inode_item);
>> +	if (!is_valid_imode(btrfs_inode_mode(path.nodes[0], iitem))) {
>
> INODE_ITEM_MISMATCH is only set if:
>
> 1. The first inode item is not a directory (it should always be)
> 2. There is a mismatch between the filetype in the inode item and in the
> dir/index item pointing to it.
>
> By using this check you could possibly miss case 1 above, if the first
> inode has a valid type e.g. regular whereas it should really be a
> directory.

That's indeed a special rare case.

>
> I'm not entirely sure whether it makes sense to handle this situation,
> since admittedly, it would require a perfect storm to get such a
> corruption.

Let's wait to see if this could happen.
If it happened, we can easily modify the code to handle it anyway.

On the other hand, let's just focus on the real corruption where some
old 2014 kernel screwed up the imode of some inodes.

Thanks,
Qu

>
>
> <snip>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
  2019-09-09 14:17   ` Nikolay Borisov
@ 2019-09-10  4:27   ` Su Yue
  2019-09-10 16:14   ` Nikolay Borisov
  2 siblings, 0 replies; 24+ messages in thread
From: Su Yue @ 2019-09-10  4:27 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 2019/9/5 3:57 PM, Qu Wenruo wrote:
> Before this patch, repair_imode_common() can only handle two types of
> inodes:
> - Free space cache inodes
> - ROOT DIR inodes
>
> For inodes in subvolume trees, the core complexity is how to determine the
> correct imode, thus it was not implemented.
>
> However there are more reports of incorrect imode in subvolume trees, we
> need to support such fix.
>
> So this patch adds a new function, detect_imode(), to detect imode for
> inodes in subvolume trees.
>
> That function will determine imode by:
> - Search for INODE_REF
>    If we have INODE_REF, we will then try to find DIR_ITEM/DIR_INDEX.
>    As long as one valid DIR_ITEM or DIR_INDEX can be found, we convert
>    the BTRFS_FT_* to imode, then call it a day.
>    This should be the most accurate way.
>
> - Search for DIR_INDEX/DIR_ITEM
>    If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>    just after the INODE_ITEM.
>    If any can be found, it's definitely a directory.
>
> - Search for EXTENT_DATA
>    If EXTENT_DATA can be found, it's either REG or LNK.
>    For this case, we default to REG, as user can inspect the file to
>    determine if it's a file or just a path.
>
> - Use rdev to detect BLK/CHR
>    If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>    a BLK or CHR file. Then we default to BLK.
>
> - Fail out if none of above methods succeeded
>    No educated guess to make things worse.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 117 insertions(+), 13 deletions(-)
>
> diff --git a/check/mode-common.c b/check/mode-common.c
> index c0ddc50a1dd0..abea2ceda4c4 100644
> --- a/check/mode-common.c
> +++ b/check/mode-common.c
> @@ -935,6 +935,113 @@ out:
>   	return ret;
>   }
>
> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
> +			u32 *imode_ret)
> +{
> +	struct btrfs_key key;
> +	struct btrfs_inode_item iitem;
> +	const u32 priv = 0700;
> +	bool found = false;
> +	u64 ino;
> +	u32 imode;
> +	int ret = 0;
> +
> +	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
> +	ino = key.objectid;
> +	read_extent_buffer(path->nodes[0], &iitem,
> +			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
> +			sizeof(iitem));
> +	/* root inode */
> +	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
> +		imode = S_IFDIR;
> +		found = true;
> +		goto out;
> +	}
> +
> +	while (1) {
> +		struct btrfs_inode_ref *iref;
> +		struct extent_buffer *leaf;
> +		unsigned long cur;
> +		unsigned long end;
> +		char namebuf[BTRFS_NAME_LEN] = {0};
> +		u64 index;
> +		u32 namelen;
> +		int slot;
> +
> +		ret = btrfs_next_item(root, path);
> +		if (ret > 0) {
> +			/* falls back to rdev check */
> +			ret = 0;
> +			goto out;
> +		}
> +		if (ret < 0)
> +			goto out;
> +		leaf = path->nodes[0];
> +		slot = path->slots[0];
> +		btrfs_item_key_to_cpu(leaf, &key, slot);
> +		if (key.objectid != ino)
> +			goto out;
> +
> +		/*
> +		 * We ignore some types to make life easier:
> +		 * - XATTR
> +		 *   Both REG and DIR can have xattr, so not useful
> +		 */
> +		switch (key.type) {
> +		case BTRFS_INODE_REF_KEY:
> +			/* The most accurate way to determine filetype */
> +			cur = btrfs_item_ptr_offset(leaf, slot);
> +			end = cur + btrfs_item_size_nr(leaf, slot);
> +			while (cur < end) {
> +				iref = (struct btrfs_inode_ref *)cur;
> +				namelen = min_t(u32, end - cur - sizeof(&iref),
> +					btrfs_inode_ref_name_len(leaf, iref));
> +				index = btrfs_inode_ref_index(leaf, iref);
> +				read_extent_buffer(leaf, namebuf,
> +					(unsigned long)(iref + 1), namelen);
> +				ret = find_file_type(root, ino, key.offset,
> +						index, namebuf, namelen,
> +						&imode);
> +				if (ret == 0) {
> +					found = true;
> +					goto out;
> +				}
> +				cur += sizeof(*iref) + namelen;
> +			}
> +			break;
> +		case BTRFS_DIR_ITEM_KEY:
> +		case BTRFS_DIR_INDEX_KEY:
> +			imode = S_IFDIR;
> +			goto out;
> +		case BTRFS_EXTENT_DATA_KEY:
> +			/*
> +			 * Both REG and LINK could have EXTENT_DATA.
> +			 * We just fall back to REG as user can inspect the
> +			 * content.
> +			 */
> +			imode = S_IFREG;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	/*
> +	 * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
> +	 * to BLK. But either way it doesn't really matter, as CHR/BLK on btrfs
> +	 * should be pretty rare, and no real data will be lost.
> +	 */
> +	if (!found && btrfs_stack_inode_rdev(&iitem) != 0) {
> +		imode = S_IFBLK;
> +		found = true;
> +	}
> +
> +	if (found)
> +		*imode_ret = (imode | priv);

Should set @ret to 0 here?
In above fallback path, @ret may be negative.

--
Su
> +	else
> +		ret = -ENOENT;
> +	return ret;
> +}
> +
>   /*
>    * Reset the inode mode of the inode specified by @path.
>    *
> @@ -951,22 +1058,19 @@ int repair_imode_common(struct btrfs_root *root, struct btrfs_path *path)
>   	u32 imode;
>   	int ret;
>
> -	if (root->root_key.objectid != BTRFS_ROOT_TREE_OBJECTID) {
> -		error(
> -		"repair inode mode outside of root tree is not supported yet");
> -		return -ENOTTY;
> -	}
>   	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>   	ASSERT(key.type == BTRFS_INODE_ITEM_KEY);
> -	if (key.objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
> -	    !is_fstree(key.objectid)) {
> -		error("unsupported ino %llu", key.objectid);
> -		return -ENOTTY;
> +	if (root->objectid == BTRFS_ROOT_TREE_OBJECTID) {
> +		/* In root tree we only have two possible imode */
> +		if (key.objectid == BTRFS_ROOT_TREE_OBJECTID)
> +			imode = S_IFDIR | 0755;
> +		else
> +			imode = S_IFREG | 0600;
> +	} else {
> +		ret = detect_imode(root, path, &imode);
> +		if (ret < 0)
> +			return ret;
>   	}
> -	if (key.objectid == BTRFS_ROOT_TREE_DIR_OBJECTID)
> -		imode = 040755;
> -	else
> -		imode = 0100600;
>
>   	trans = btrfs_start_transaction(root, 1);
>   	if (IS_ERR(trans)) {
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
  2019-09-09 14:17   ` Nikolay Borisov
  2019-09-10  4:27   ` Su Yue
@ 2019-09-10 16:14   ` Nikolay Borisov
  2019-09-11  0:39     ` Qu Wenruo
  2 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-10 16:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 5.09.19 г. 10:57 ч., Qu Wenruo wrote:
> Before this patch, repair_imode_common() can only handle two types of
> inodes:
> - Free space cache inodes
> - ROOT DIR inodes
> 
> For inodes in subvolume trees, the core complexity is how to determine the
> correct imode, thus it was not implemented.
> 
> However there are more reports of incorrect imode in subvolume trees, we
> need to support such fix.
> 
> So this patch adds a new function, detect_imode(), to detect imode for
> inodes in subvolume trees.
> 
> That function will determine imode by:
> - Search for INODE_REF
>   If we have INODE_REF, we will then try to find DIR_ITEM/DIR_INDEX.
>   As long as one valid DIR_ITEM or DIR_INDEX can be found, we convert
>   the BTRFS_FT_* to imode, then call it a day.
>   This should be the most accurate way.
> 
> - Search for DIR_INDEX/DIR_ITEM
>   If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>   just after the INODE_ITEM.
>   If any can be found, it's definitely a directory.

This needs an explicit satement that it will only work for non-empty files and directories
> 
> - Search for EXTENT_DATA
>   If EXTENT_DATA can be found, it's either REG or LNK.
>   For this case, we default to REG, as user can inspect the file to
>   determine if it's a file or just a path.
> 
> - Use rdev to detect BLK/CHR
>   If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>   a BLK or CHR file. Then we default to BLK.
> 
> - Fail out if none of above methods succeeded
>   No educated guess to make things worse.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 117 insertions(+), 13 deletions(-)
> 
> diff --git a/check/mode-common.c b/check/mode-common.c
> index c0ddc50a1dd0..abea2ceda4c4 100644
> --- a/check/mode-common.c
> +++ b/check/mode-common.c
> @@ -935,6 +935,113 @@ out:
>  	return ret;
>  }
>  
> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
> +			u32 *imode_ret)
> +{
> +	struct btrfs_key key;
> +	struct btrfs_inode_item iitem;
> +	const u32 priv = 0700;
> +	bool found = false;
> +	u64 ino;
> +	u32 imode;
> +	int ret = 0;
> +
> +	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
> +	ino = key.objectid;
> +	read_extent_buffer(path->nodes[0], &iitem,
> +			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
> +			sizeof(iitem));
> +	/* root inode */
> +	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
> +		imode = S_IFDIR;
> +		found = true;
> +		goto out;
> +	}
> +
> +	while (1) {
> +		struct btrfs_inode_ref *iref;
> +		struct extent_buffer *leaf;
> +		unsigned long cur;
> +		unsigned long end;
> +		char namebuf[BTRFS_NAME_LEN] = {0};
> +		u64 index;
> +		u32 namelen;
> +		int slot;
> +
> +		ret = btrfs_next_item(root, path);
> +		if (ret > 0) {
> +			/* falls back to rdev check */
> +			ret = 0;
> +			goto out;
> +		}

In my testing if an inode is the last one in the leaf and it doesn't have 
an INODE_REF item then it won't be repaired. But e.g. it can have perfectly 
valid DIR_ITEM/DIR_INDEX entries which describe this inode as being a file. E.g. 

	item 2 key (256 DIR_ITEM 388586943) itemoff 16076 itemsize 35
		location key (260 INODE_ITEM 0) type FILE
		transid 7 data_len 0 name_len 5
		name: file3

	.....
	item 15 key (260 INODE_ITEM 0) itemoff 15184 itemsize 160
		generation 7 transid 7 size 0 nbytes 0
		block group 0 mode 26772225102 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
		atime 1568127261.284993602 (2019-09-10 14:54:21)
		ctime 1568127261.284993602 (2019-09-10 14:54:21)
		mtime 1568127261.284993602 (2019-09-10 14:54:21)
		otime 1568127261.284993602 (2019-09-10 14:54:21)

I have intentionally deleted INODE_REF too see what's happening. Is this intended?


> +		if (ret < 0)
> +			goto out;
> +		leaf = path->nodes[0];
> +		slot = path->slots[0];
> +		btrfs_item_key_to_cpu(leaf, &key, slot);
> +		if (key.objectid != ino)
> +			goto out;
> +
> +		/*
> +		 * We ignore some types to make life easier:
> +		 * - XATTR
> +		 *   Both REG and DIR can have xattr, so not useful
> +		 */
> +		switch (key.type) {
> +		case BTRFS_INODE_REF_KEY:
> +			/* The most accurate way to determine filetype */
> +			cur = btrfs_item_ptr_offset(leaf, slot);
> +			end = cur + btrfs_item_size_nr(leaf, slot);
> +			while (cur < end) {
> +				iref = (struct btrfs_inode_ref *)cur;
> +				namelen = min_t(u32, end - cur - sizeof(&iref),
> +					btrfs_inode_ref_name_len(leaf, iref));
> +				index = btrfs_inode_ref_index(leaf, iref);
> +				read_extent_buffer(leaf, namebuf,
> +					(unsigned long)(iref + 1), namelen);
> +				ret = find_file_type(root, ino, key.offset,
> +						index, namebuf, namelen,
> +						&imode);
> +				if (ret == 0) {
> +					found = true;
> +					goto out;
> +				}
> +				cur += sizeof(*iref) + namelen;
> +			}
> +			break;
> +		case BTRFS_DIR_ITEM_KEY:
> +		case BTRFS_DIR_INDEX_KEY:
> +			imode = S_IFDIR;

You should set found here. Otherwise this function returns ENOENT in those 2 branches. 
BTW in relation to out private conversation I found this while trying to create test 
cases which will trigger all branches. The fact it found bugs proves we should strive 
for as much testing coverage as possible. 

> +			goto out;
> +		case BTRFS_EXTENT_DATA_KEY:
> +			/*
> +			 * Both REG and LINK could have EXTENT_DATA.
> +			 * We just fall back to REG as user can inspect the
> +			 * content.
> +			 */
> +			imode = S_IFREG;
ditto

> +			goto out;
> +		}
> +	}
> +
> +out:
> +	/*
> +	 * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
> +	 * to BLK. But either way it doesn't really matter, as CHR/BLK on btrfs
> +	 * should be pretty rare, and no real data will be lost.
> +	 */
> +	if (!found && btrfs_stack_inode_rdev(&iitem) != 0) {
> +		imode = S_IFBLK;
> +		found = true;
> +	}
> +
> +	if (found)
> +		*imode_ret = (imode | priv);
> +	else
> +		ret = -ENOENT;
> +	return ret;
> +}
> +
>  /*
>   * Reset the inode mode of the inode specified by @path.

<snip>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-10 16:14   ` Nikolay Borisov
@ 2019-09-11  0:39     ` Qu Wenruo
  2019-09-11 12:27       ` Nikolay Borisov
  0 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2019-09-11  0:39 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs

[...]
>> - Search for DIR_INDEX/DIR_ITEM
>>   If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>>   just after the INODE_ITEM.
>>   If any can be found, it's definitely a directory.
>
> This needs an explicit satement that it will only work for non-empty files and directories

Indeed.

>>
>> - Search for EXTENT_DATA
>>   If EXTENT_DATA can be found, it's either REG or LNK.
>>   For this case, we default to REG, as user can inspect the file to
>>   determine if it's a file or just a path.
>>
>> - Use rdev to detect BLK/CHR
>>   If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>>   a BLK or CHR file. Then we default to BLK.
>>
>> - Fail out if none of above methods succeeded
>>   No educated guess to make things worse.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 117 insertions(+), 13 deletions(-)
>>
>> diff --git a/check/mode-common.c b/check/mode-common.c
>> index c0ddc50a1dd0..abea2ceda4c4 100644
>> --- a/check/mode-common.c
>> +++ b/check/mode-common.c
>> @@ -935,6 +935,113 @@ out:
>>  	return ret;
>>  }
>>
>> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
>> +			u32 *imode_ret)
>> +{
>> +	struct btrfs_key key;
>> +	struct btrfs_inode_item iitem;
>> +	const u32 priv = 0700;
>> +	bool found = false;
>> +	u64 ino;
>> +	u32 imode;
>> +	int ret = 0;
>> +
>> +	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>> +	ino = key.objectid;
>> +	read_extent_buffer(path->nodes[0], &iitem,
>> +			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
>> +			sizeof(iitem));
>> +	/* root inode */
>> +	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
>> +		imode = S_IFDIR;
>> +		found = true;
>> +		goto out;
>> +	}
>> +
>> +	while (1) {
>> +		struct btrfs_inode_ref *iref;
>> +		struct extent_buffer *leaf;
>> +		unsigned long cur;
>> +		unsigned long end;
>> +		char namebuf[BTRFS_NAME_LEN] = {0};
>> +		u64 index;
>> +		u32 namelen;
>> +		int slot;
>> +
>> +		ret = btrfs_next_item(root, path);
>> +		if (ret > 0) {
>> +			/* falls back to rdev check */
>> +			ret = 0;
>> +			goto out;
>> +		}
>
> In my testing if an inode is the last one in the leaf and it doesn't have
> an INODE_REF item then it won't be repaired. But e.g. it can have perfectly
> valid DIR_ITEM/DIR_INDEX entries which describe this inode as being a file. E.g.
>
> 	item 2 key (256 DIR_ITEM 388586943) itemoff 16076 itemsize 35
> 		location key (260 INODE_ITEM 0) type FILE
> 		transid 7 data_len 0 name_len 5
> 		name: file3
>
> 	.....
> 	item 15 key (260 INODE_ITEM 0) itemoff 15184 itemsize 160
> 		generation 7 transid 7 size 0 nbytes 0
> 		block group 0 mode 26772225102 links 1 uid 0 gid 0 rdev 0
> 		sequence 1 flags 0x0(none)
> 		atime 1568127261.284993602 (2019-09-10 14:54:21)
> 		ctime 1568127261.284993602 (2019-09-10 14:54:21)
> 		mtime 1568127261.284993602 (2019-09-10 14:54:21)
> 		otime 1568127261.284993602 (2019-09-10 14:54:21)
>
> I have intentionally deleted INODE_REF too see what's happening. Is this intended?

Yes, completely intended.

For this case, you need to iterate through the whole tree to locate the
DIR_INDEX to fix, which is not really possible with current code base,
which only search based on the INODE, not the DIR_INDEX/DIR_ITEM from
its parent dir.

Furthermore, didn't you mention that if we don't have confident about
the imode, then we should fail out instead of using REG as default?

>
>
>> +		if (ret < 0)
>> +			goto out;
>> +		leaf = path->nodes[0];
>> +		slot = path->slots[0];
>> +		btrfs_item_key_to_cpu(leaf, &key, slot);
>> +		if (key.objectid != ino)
>> +			goto out;
>> +
>> +		/*
>> +		 * We ignore some types to make life easier:
>> +		 * - XATTR
>> +		 *   Both REG and DIR can have xattr, so not useful
>> +		 */
>> +		switch (key.type) {
>> +		case BTRFS_INODE_REF_KEY:
>> +			/* The most accurate way to determine filetype */
>> +			cur = btrfs_item_ptr_offset(leaf, slot);
>> +			end = cur + btrfs_item_size_nr(leaf, slot);
>> +			while (cur < end) {
>> +				iref = (struct btrfs_inode_ref *)cur;
>> +				namelen = min_t(u32, end - cur - sizeof(&iref),
>> +					btrfs_inode_ref_name_len(leaf, iref));
>> +				index = btrfs_inode_ref_index(leaf, iref);
>> +				read_extent_buffer(leaf, namebuf,
>> +					(unsigned long)(iref + 1), namelen);
>> +				ret = find_file_type(root, ino, key.offset,
>> +						index, namebuf, namelen,
>> +						&imode);
>> +				if (ret == 0) {
>> +					found = true;
>> +					goto out;
>> +				}
>> +				cur += sizeof(*iref) + namelen;
>> +			}
>> +			break;
>> +		case BTRFS_DIR_ITEM_KEY:
>> +		case BTRFS_DIR_INDEX_KEY:
>> +			imode = S_IFDIR;
>
> You should set found here.

Yep, also found that and fixed in next version.

Thanks,
Qu

> Otherwise this function returns ENOENT in those 2 branches.
> BTW in relation to out private conversation I found this while trying to create test
> cases which will trigger all branches. The fact it found bugs proves we should strive
> for as much testing coverage as possible.
>
>> +			goto out;
>> +		case BTRFS_EXTENT_DATA_KEY:
>> +			/*
>> +			 * Both REG and LINK could have EXTENT_DATA.
>> +			 * We just fall back to REG as user can inspect the
>> +			 * content.
>> +			 */
>> +			imode = S_IFREG;
> ditto
>
>> +			goto out;
>> +		}
>> +	}
>> +
>> +out:
>> +	/*
>> +	 * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
>> +	 * to BLK. But either way it doesn't really matter, as CHR/BLK on btrfs
>> +	 * should be pretty rare, and no real data will be lost.
>> +	 */
>> +	if (!found && btrfs_stack_inode_rdev(&iitem) != 0) {
>> +		imode = S_IFBLK;
>> +		found = true;
>> +	}
>> +
>> +	if (found)
>> +		*imode_ret = (imode | priv);
>> +	else
>> +		ret = -ENOENT;
>> +	return ret;
>> +}
>> +
>>  /*
>>   * Reset the inode mode of the inode specified by @path.
>
> <snip>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-11  0:39     ` Qu Wenruo
@ 2019-09-11 12:27       ` Nikolay Borisov
  2019-09-11 12:44         ` Qu Wenruo
  0 siblings, 1 reply; 24+ messages in thread
From: Nikolay Borisov @ 2019-09-11 12:27 UTC (permalink / raw)
  To: Qu Wenruo, WenRuo Qu, linux-btrfs



On 11.09.19 г. 3:39 ч., Qu Wenruo wrote:
> [...]
>>> - Search for DIR_INDEX/DIR_ITEM
>>>   If above search fails, we falls back to locate the DIR_INDEX/DIR_ITEM
>>>   just after the INODE_ITEM.
>>>   If any can be found, it's definitely a directory.
>>
>> This needs an explicit satement that it will only work for non-empty files and directories
> 
> Indeed.
> 
>>>
>>> - Search for EXTENT_DATA
>>>   If EXTENT_DATA can be found, it's either REG or LNK.
>>>   For this case, we default to REG, as user can inspect the file to
>>>   determine if it's a file or just a path.
>>>
>>> - Use rdev to detect BLK/CHR
>>>   If all above fails, but INODE_ITEM has non-zero rdev, then it's either
>>>   a BLK or CHR file. Then we default to BLK.
>>>
>>> - Fail out if none of above methods succeeded
>>>   No educated guess to make things worse.
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>>  check/mode-common.c | 130 +++++++++++++++++++++++++++++++++++++++-----
>>>  1 file changed, 117 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/check/mode-common.c b/check/mode-common.c
>>> index c0ddc50a1dd0..abea2ceda4c4 100644
>>> --- a/check/mode-common.c
>>> +++ b/check/mode-common.c
>>> @@ -935,6 +935,113 @@ out:
>>>  	return ret;
>>>  }
>>>
>>> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
>>> +			u32 *imode_ret)
>>> +{
>>> +	struct btrfs_key key;
>>> +	struct btrfs_inode_item iitem;
>>> +	const u32 priv = 0700;
>>> +	bool found = false;
>>> +	u64 ino;
>>> +	u32 imode;
>>> +	int ret = 0;
>>> +
>>> +	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>>> +	ino = key.objectid;
>>> +	read_extent_buffer(path->nodes[0], &iitem,
>>> +			btrfs_item_ptr_offset(path->nodes[0], path->slots[0]),
>>> +			sizeof(iitem));
>>> +	/* root inode */
>>> +	if (ino == BTRFS_FIRST_FREE_OBJECTID) {
>>> +		imode = S_IFDIR;
>>> +		found = true;
>>> +		goto out;
>>> +	}
>>> +
>>> +	while (1) {
>>> +		struct btrfs_inode_ref *iref;
>>> +		struct extent_buffer *leaf;
>>> +		unsigned long cur;
>>> +		unsigned long end;
>>> +		char namebuf[BTRFS_NAME_LEN] = {0};
>>> +		u64 index;
>>> +		u32 namelen;
>>> +		int slot;
>>> +
>>> +		ret = btrfs_next_item(root, path);
>>> +		if (ret > 0) {
>>> +			/* falls back to rdev check */
>>> +			ret = 0;
>>> +			goto out;
>>> +		}
>>
>> In my testing if an inode is the last one in the leaf and it doesn't have
>> an INODE_REF item then it won't be repaired. But e.g. it can have perfectly
>> valid DIR_ITEM/DIR_INDEX entries which describe this inode as being a file. E.g.
>>
>> 	item 2 key (256 DIR_ITEM 388586943) itemoff 16076 itemsize 35
>> 		location key (260 INODE_ITEM 0) type FILE
>> 		transid 7 data_len 0 name_len 5
>> 		name: file3
>>
>> 	.....
>> 	item 15 key (260 INODE_ITEM 0) itemoff 15184 itemsize 160
>> 		generation 7 transid 7 size 0 nbytes 0
>> 		block group 0 mode 26772225102 links 1 uid 0 gid 0 rdev 0
>> 		sequence 1 flags 0x0(none)
>> 		atime 1568127261.284993602 (2019-09-10 14:54:21)
>> 		ctime 1568127261.284993602 (2019-09-10 14:54:21)
>> 		mtime 1568127261.284993602 (2019-09-10 14:54:21)
>> 		otime 1568127261.284993602 (2019-09-10 14:54:21)
>>
>> I have intentionally deleted INODE_REF too see what's happening. Is this intended?
> 
> Yes, completely intended.
> 
> For this case, you need to iterate through the whole tree to locate the
> DIR_INDEX to fix, which is not really possible with current code base,
> which only search based on the INODE, not the DIR_INDEX/DIR_ITEM from
> its parent dir.
> 
> Furthermore, didn't you mention that if we don't have confident about
> the imode, then we should fail out instead of using REG as default?

I did, what I supposed could happen here is if we can't find an
INODE_REF then do a search for DIR_INDEX/DIR_ITEM since those items also
contain the type of the inode they are pointing to. Fixing really boils
down to exploiting redundancy in the on-disk metadata to repair existing
items. Here comes the question, of course, what to do if we don't have
an INODE_REF and DIR_INDEX/DIR_ITEM are broken. I guess it's a judgement
call, currently you decided that inode_ref will be the source of
information. I'm fine with that I was merely pointing there is more we
can do. Of course we need to weigh the pros/cons between code complexity
and the returns we get.

Just that I will advise to make it explicit in the changelog that you
made a judgement call to utilize INODE_REF as the starting point of
doing imode repair but not necessarily the only one.


<snip>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees
  2019-09-11 12:27       ` Nikolay Borisov
@ 2019-09-11 12:44         ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-09-11 12:44 UTC (permalink / raw)
  To: Nikolay Borisov, WenRuo Qu, linux-btrfs

[...]
>>> I have intentionally deleted INODE_REF too see what's happening. Is this intended?
>>
>> Yes, completely intended.
>>
>> For this case, you need to iterate through the whole tree to locate the
>> DIR_INDEX to fix, which is not really possible with current code base,
>> which only search based on the INODE, not the DIR_INDEX/DIR_ITEM from
>> its parent dir.
>>
>> Furthermore, didn't you mention that if we don't have confident about
>> the imode, then we should fail out instead of using REG as default?
>
> I did, what I supposed could happen here is if we can't find an
> INODE_REF then do a search for DIR_INDEX/DIR_ITEM since those items also
> contain the type of the inode they are pointing to. Fixing really boils
> down to exploiting redundancy in the on-disk metadata to repair existing
> items. Here comes the question, of course, what to do if we don't have
> an INODE_REF and DIR_INDEX/DIR_ITEM are broken. I guess it's a judgement
> call, currently you decided that inode_ref will be the source of
> information. I'm fine with that I was merely pointing there is more we
> can do. Of course we need to weigh the pros/cons between code complexity
> and the returns we get.

BTW, the kinda of "inconsistency" between different judgement calls is
somewhat causing the problem you see in some more complex repair situation.

To me and the old Fujitsu guys, what we (at least used to) believe is,
if we have any chance to recover, then try our best. (just like what I
did when we can't find inode ref)

But if we have redundant info, like the INODE_REF/DIR_INDEX/DIR_ITEM, we
go democracy, two valid items then recovery, one valid item, then
discard it.

If we have a very consistent policy, like anything wrong the discard the
bad part even it will cause data loss.
The recovery will be much easier and consistent.
In this imode case, we don't "recover" the inode, but remove it
completely, and if the INODE_REF/DIR_INDEX/DIR_ITEM repair follows the
same standard, a lot of things will be much easier to do.
(Now I'm regretting the call I did for the INODE_REF/DIR_INDEX/DIR_ITEM
repair code)

But I guess that's the ultimate judgement call made way beyond I joined
the btrfs-progs development (see the extent repair code and bad key
order repair), thus not that easy to change without a large rework...

>
> Just that I will advise to make it explicit in the changelog that you
> made a judgement call to utilize INODE_REF as the starting point of
> doing imode repair but not necessarily the only one.

No problem, I'd add more comment and the disadvantage of the current
behavior.

Thanks,
Qu

>
>
> <snip>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-09-11 12:45 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-05  7:57 [PATCH v2 0/6] btrfs-progs: check: Repair invalid inode mode in Qu Wenruo
2019-09-05  7:57 ` [PATCH v2 1/6] btrfs-progs: check: Export btrfs_type_to_imode Qu Wenruo
2019-09-05  7:57 ` [PATCH v2 2/6] btrfs-progs: check/common: Introduce a function to find imode using INODE_REF Qu Wenruo
2019-09-09 13:25   ` Nikolay Borisov
2019-09-09 14:24     ` Qu Wenruo
2019-09-09 14:34       ` Nikolay Borisov
2019-09-09 13:42   ` Nikolay Borisov
2019-09-09 14:26     ` Qu Wenruo
2019-09-09 14:35       ` Nikolay Borisov
2019-09-05  7:57 ` [PATCH v2 3/6] btrfs-progs: check/common: Make repair_imode_common() to handle inodes in subvolume trees Qu Wenruo
2019-09-09 14:17   ` Nikolay Borisov
2019-09-09 14:27     ` Qu Wenruo
2019-09-10  4:27   ` Su Yue
2019-09-10 16:14   ` Nikolay Borisov
2019-09-11  0:39     ` Qu Wenruo
2019-09-11 12:27       ` Nikolay Borisov
2019-09-11 12:44         ` Qu Wenruo
2019-09-05  7:57 ` [PATCH v2 4/6] btrfs-progs: check/lowmem: Repair bad imode early Qu Wenruo
2019-09-09 14:55   ` Nikolay Borisov
2019-09-10  2:35     ` Qu Wenruo
2019-09-05  7:57 ` [PATCH v2 5/6] btrfs-progs: check/original: Fix inode mode in subvolume trees Qu Wenruo
2019-09-05  7:58 ` [PATCH v2 6/6] btrfs-progs: tests/fsck: Add new images for inode mode repair functionality Qu Wenruo
2019-09-09 15:37   ` Nikolay Borisov
2019-09-09 23:22     ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).